Presentation 18:30 - 18:45 (15 min)

Industry Talk #1: AI and Alternative Data: Expanding the Information Edge for Momentum Investing

Thursday, March 12, 2026
Paris, France

In this talk, I will discuss the opportunities of integrating alternative data into passive, rule-based quantitative portfolio strategies. Focusing on momentum strategies, I will illustrate how alternative data can improve signal quality and portfolio performance. I will close with a discussion of current limitations and challenges, and how AI may help overcome some of them through the generation of richer, more informative data.

Speaker

Hamza Bahaji

Head of Financial Engineering & Innovative Solutions @ Amundi

Summary

AI and Alternative Data: Expanding the Information Edge for Momentum Investing

Speaker: Hamza Bahaji, Head of Financial Engineering & Innovative Solutions, Amundi
Date: March 12, 2026
Event: Paris — Market Data x AI (Finteda / FactSet)


Team & Role

Hamza is part of a transverse team fully dedicated to passive investing and smart beta investing within Amundi. The team operates in two areas:

1. Partnership & Co-design — Working with clients to co-design dedicated, purely passive, quant, rule-based systematic investment solutions
2. Quant Engineering — Teaming up with business partners to help implement quant solutions, from model selection to implementation

The common denominator is data and analytics. The team includes a dedicated data engineer/strategist/scout role responsible for managing and monitoring data, and coordinating all internal and external data discussions.

Alternative Data: Definition & Motivation

Alternative data refers to all data outside the spectrum of financial and market data. Key characteristics:

- Less commonly used by market participants
- Harder to access with appropriate analytics and tools
- Often unrelated to financial data (except financial transcripts, analyst call minutes)
- Shorter history
- More challenging to use

Why alternative data? It tends to be more informative than financial and market data. "Informativeness" here means how much information a piece of data is likely to contain. This can be measured via entropy measures and other statistical methods, though these can become computationally intractable.
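
The entropy-based notion of informativeness mentioned above can be made concrete with a toy sketch. The function below is purely illustrative (binning choice, bit units, and the sample data are my assumptions, not Amundi's methodology): it estimates the Shannon entropy of a series by histogram binning, so a widely spread distribution scores higher than a concentrated one.

```python
import numpy as np

def shannon_entropy(series, bins=10):
    """Estimate Shannon entropy (in bits) of a 1-D series by histogram binning.

    A crude proxy for 'informativeness': a flat (high-entropy) distribution
    carries more information per observation than a concentrated one.
    """
    counts, _ = np.histogram(series, bins=bins)
    probs = counts / counts.sum()
    probs = probs[probs > 0]  # drop empty bins to avoid log(0)
    return float(-np.sum(probs * np.log2(probs)))

rng = np.random.default_rng(0)
uniform_like = rng.uniform(size=10_000)            # spread out -> high entropy
peaked = rng.normal(0.0, 0.05, size=10_000)        # concentrated -> lower entropy
print(shannon_entropy(uniform_like))               # close to log2(10) ≈ 3.32 bits
print(shannon_entropy(peaked))
```

As the talk notes, this histogram approach only captures one-dimensional distributional structure; joint-entropy estimates across many securities are where the computation becomes intractable.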

Case Study: Sentiment-Enhanced Momentum Strategy

Five to seven years ago, Amundi partnered with alternative data providers — emerging fintechs specialized in building datasets based on NLP (now upgraded to LLMs). These providers construct sentiment scores: a metric ranging from 0 (very pessimistic) to 100 (very optimistic) about a given security, with large coverage on equities.

The Strategy

Price momentum is a strategy that buys past winners and shorts past losers over a lookback window (typically 12 months). Momentum lacks a fundamental anchor (unlike value or size strategies based on fundamental data), making it prone to investor sentiment — arbitrage activity can be destabilizing.

The idea: Use alternative sentiment data to screen securities within a momentum strategy — overweight securities with positive sentiment, underweight those with negative sentiment.
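The screening idea can be sketched in a few lines. Everything below is a toy illustration, not Amundi's index methodology: the linear tilt around a neutral score of 50, the long-only framing, and the 20% winner cut-off are all assumptions made for the example.

```python
import numpy as np

def sentiment_tilted_weights(past_returns, sentiment, top_frac=0.2):
    """Toy long-only momentum screen with a sentiment tilt.

    past_returns : trailing (e.g. 12-month) returns per security
    sentiment    : scores on a 0-100 scale (0 = very pessimistic,
                   100 = very optimistic), as described in the talk
    top_frac     : fraction of the universe kept as 'winners'

    The linear tilt around a neutral score of 50 is an illustrative
    assumption, not the actual index methodology.
    """
    past_returns = np.asarray(past_returns, dtype=float)
    sentiment = np.asarray(sentiment, dtype=float)
    n = len(past_returns)
    k = max(1, int(n * top_frac))
    winners = np.argsort(past_returns)[-k:]   # best past performers
    raw = np.zeros(n)
    raw[winners] = 1.0
    tilt = sentiment / 50.0                   # >1 overweights, <1 underweights
    w = raw * tilt
    return w / w.sum()                        # renormalize to fully invested

returns = [0.30, 0.12, -0.05, 0.25, 0.02]
scores  = [80,   20,   55,    40,   60]
# Security 0 (high momentum, high sentiment) receives the largest weight;
# security 3 stays in the book but is underweighted by its weak sentiment.
print(sentiment_tilted_weights(returns, scores, top_frac=0.4))
```

A short-leg version would apply the inverse tilt to the bottom quantile, matching the talk's point that sentiment both helps the winners and trims the losers.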

Results

The strategy has been running as a dedicated index for almost five years. Performance breakdown by quantiles:

- Top quantile / quantile 5 (winners): Sentiment adjustment improves performance
- Bottom quantile / quantile 1 (losers): Sentiment adjustment reduces drawdowns / negative performance
- Benefits are uniform across all time periods considered (1 day to 63 days)

Drawbacks

1. Time to market — By the time data flows from the provider to end users and into the strategy, it may be outdated. The value of the data lies in containing information not yet integrated into market prices.
2. Ownership — Reliance on an external data provider
3. Transparency — The data provider's model is effectively a black box

Monitoring

- Alpha is measured by regressing the sentiment-adjusted strategy against the naked momentum benchmark. When the alpha is no longer statistically significant, the added information is no longer relevant.
- Time-to-market assessment is partly judgmental: if data takes a couple of days to flow through, the information may already be priced in.
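
The statistical half of this monitoring can be sketched as a plain OLS regression of the adjusted strategy's returns on the naked benchmark, tracking the t-statistic of the intercept. This is a minimal numpy-only illustration; the return frequency, the synthetic data, and any significance threshold (e.g. |t| > 2) are assumptions for the example, not the production setup.

```python
import numpy as np

def alpha_tstat(adjusted, benchmark):
    """Regress the sentiment-adjusted strategy on the naked momentum
    benchmark and return (alpha, t-stat of alpha).

    When the alpha's t-stat is no longer significant, the added
    sentiment information is treated as stale.
    """
    y = np.asarray(adjusted, dtype=float)
    X = np.column_stack([np.ones(len(y)), np.asarray(benchmark, dtype=float)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof                 # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)        # OLS coefficient covariance
    return beta[0], beta[0] / np.sqrt(cov[0, 0])

# Synthetic example: one year of daily returns with a small embedded alpha.
rng = np.random.default_rng(1)
bench = rng.normal(0.0004, 0.01, 252)
adj = 0.0005 + bench + rng.normal(0.0, 0.002, 252)
a, t = alpha_tstat(adj, bench)
print(f"alpha = {a:.5f} per day, t = {t:.2f}")
```

Run on a rolling window, a decaying t-statistic is the signal that the information is already priced in, consistent with the judgmental time-to-market check above.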

Next Steps: Internal AI for Sentiment Analysis

Rather than relying solely on external providers, Amundi is planning to use its internal AI engine to:

1. Extract investor sentiment from financial analyst reports (Amundi has access to a large corpus as a major asset manager)
2. Assess informativeness — What added value can be obtained vs. external providers?
3. Expand sources — The wider the sources, the better the sentiment assessment

Key challenge — Prompt engineering sensitivity: What exact instructions to give agents to produce consistent, robust measures of optimism/pessimism in financial markets. This is why Amundi plans to rely on internal experts for this work.


Q&A:

Q: How do you measure time to market and know when data is already priced in?
A: Two approaches: (1) Statistical — regress the adjusted strategy against the naked momentum benchmark; when alpha is no longer significant, the information is stale. (2) Judgmental — if the data pipeline takes days, the information may already be integrated into prices.

Q (moderator): Can you elaborate on the AI aspects?
A: Hamza deferred to a colleague (Francois, also present at the event) for a deeper discussion on that topic.

Q: Do you have a solution for forward-looking bias in LLM-based scoring?
A: Amundi doesn't directly use LLMs for scoring — they use the outcomes of the data provider's LLM models. Since the provider runs the model, the forward-looking bias question falls on the provider's side, not Amundi's.

Full Transcript

Industry Talk #1: AI and Alternative Data: Expanding the Information Edge for Momentum Investing
Speaker: Hamza Bahaji, Head of Financial Engineering & Innovative Solutions, Amundi
Event: Paris — Market Data x AI (Finteda / FactSet), March 12, 2026


[TALK]

Hamza Bahaji: Rule-based strategies. So, as I said previously, my team is a kind of transverse team which is fully dedicated to passive investing and smart beta investing within Amundi. We have mainly two operating areas. The first one is what we call partnership, and through this area we aim to work with clients in order to co-design with them dedicated investment solutions — purely passive, quant, rule-based, systematic. The second operating area is the traditional job of a financial engineer or quant, which consists in teaming up with several business partners and helping them actually implement, choose, and decide on quant solutions. So it ranges from the selection of quant models to the very end of the spectrum, which is the implementation of the model.

And the common denominator of those areas is obviously data and analytics. So as a financial engineer, we are the door opener for dealing with and addressing all the issues related to the data and the choice of analytics. This is the reason why we have a specific position in the team — you can call it whatever you want: data strategist, engineer, scout — and the very specific role of this data engineer is actually to manage the data, to monitor the data, but also to coordinate all the dialogues and discussions that we have on this matter internally and externally.

We use a bunch of data sources. We cover mainly several segments of data, including financial and market data. And we have started, let's say five or seven years ago, actually looking at alternative data. So what do we mean by alternative data? It's a term that was coined by the industry in order to refer to all data outside of the spectrum of financial and market data.

This specific data has some common features. For instance, market participants use it less commonly. This kind of data tends to be hard to access with appropriate analytics and tools — and we'll talk about that later on. It's often unrelated to financial data, as I said previously, except for some transcripts of financial text and the very specific case of financial analyst call minutes. It still has a shorter history, obviously, and of course it's more challenging to use.

So why did we start to look at alternative data, in the perspective of integration into our systematic solutions? It's because — at least this is what we think — it tends to be more informative than financial and market data. What I mean by informativeness here is how much information is included in a piece of data, how much information any piece of data is likely to include. And there are several mathematical and statistical methods to gauge the informativeness of data. The very standard one is entropy measures. But entropy measures, while they allow you to assess distributional characteristics of the data and from that assess informativeness — this approach tends to become very quickly computationally intractable. There are other alternatives, but let me leave it here because it's not the aim of the speech.

So as I said before, seven years ago we started some discussions where we partnered with alternative data providers — kind of emerging fintechs specialized in building datasets based on NLP. It was the very beginning. That was the genesis — actually using NLP approaches, semantic analysis, in order to construct and build sentiment scores. So you could think about it as a kind of putative metric, a score that ranges on a scale of 0 to 100 — very pessimistic at 0 and very optimistic at 100 — about a specific given security. They had mainly large coverage on equities. And at that time they used NLP, and they have of course upgraded their approach — they are currently using LLM models in order to build those scores.

And the idea was applying this NLP-based sentiment data in order to improve the properties of a systematic price momentum strategy. So just for those of you who are not familiar with momentum strategies or price momentum strategies — it's a kind of price or market-price-based strategy that consists in buying past winners and shorting (for those who are entitled to short securities) past losers. Usually you look over the past 12-month time window and take the best performers, buy the best performers, and short the worst.

Momentum is actually a typical case of trading or investment strategies that lack a fundamental anchor. By contrast, for example, value or size strategies are based on fundamental data. And that makes it a strategy that is prone to investor sentiment — the arbitrage activity tends to be a kind of destabilizing factor. This makes the strategy very sensitive to investor sentiment.

And the idea here was to use this alternative data on investor sentiment in order to screen out those securities in the momentum strategy that enjoy a kind of positive sentiment from investors. Intuitively, you can think about it as a kind of screening filter that allows you to overweight securities that have positive sentiment in the momentum strategy, and underweight those with negative sentiment. And one would expect to have better performance of this strategy after adjusting with alternative data.

So the strategy is wrapped in a dedicated index that we've been running for almost five years now. Here you have a kind of performance proof over several time periods, ranging from one day to 63 days. And how should we read it? The bar chart on the top is the performance of the momentum strategy — this is the naked one, without any adjustment with alternative data. And the bottom one is the momentum strategy coupled with the adjustment using the sentiment score.

So here we have a breakdown of the investment universe in quantiles. The fifth quantile, number five, is the winners — those should obviously be the securities that a long-only momentum investor should buy. And quantile number one, the bottom quantile, is the losers — the securities that momentum investors should short, or underweight in the portfolio if they are long-only investors. And what you can see is that the adjustment with the sentiment metric allows us to improve the performance of the winners exposure — the performance of the portfolio that is expected to be exposed to winners — and reduces the drawdowns or negative performance of the portion that is shorted or exposed to losers. And this is exactly what we're trying to capture. The benefits are quite uniform whatever the time period considered.

Now, there are some natural drawbacks, and that would lead me to say a couple of words about AI and what we are planning to do. Natural drawbacks with this approach: first of all, time to market. Since we're relying on a data provider, by the time the data is completely channeled to the end users and integrated in this strategy, the data might become outdated. Because the very idea here is to use information that is not already integrated in market prices, so that time to market might sometimes be long. Second thing is ownership — the usual suspect here. And the last one is about transparency: like any kind of data, especially when it's a black box, it might be subject to issues. It's a matter of transparency.

So what should be the solution here? Even if the data provider is improving his models — and he migrated actually to a more sophisticated approach based on LLMs instead of NLP — the idea here, and this is what we are thinking about and planning to work on with our data experts at the firm, is to use the internal AI engine of Amundi. As a first step, as an experimental step, to gauge investor sentiment information from financial analyst reports, because Amundi is a big asset manager — we have access to this set of information — and see what is the gain in terms of informativeness, what is the added value that we can obtain.

Because obviously there are challenges, there is a lot of work to do, especially in prompt engineering. This is why we are planning to rely on our internal experts, because it's a very sensitive matter. What are exactly the instructions that you are going to give to your AI agents in order to assess this market sentiment? How are you going to tell the AI to produce a kind of consistent, robust, and more relevant measure of optimism and pessimism in financial markets? And a second challenge is that the wider the sources of information, the better the assessment of the sentiment analysis. So the idea here is to check to what extent we can have access to other sources of information.

So that's it for me. I hope it was quite informative on our experience — the concrete integration of alternative data in our investment solutions. Thank you.


[Q&A]

Moderator: Yeah, of course, we can take one or two questions.

Audience Member: I have one question about the time to market. How do you measure the time to market, and how do you understand that this data is already integrated in the market price?

Hamza Bahaji: So the way we monitor — two things actually. The performance of the strategy, and the benchmark of the strategy is the momentum strategy, the naked one without any adjustments. And we just — stated in very simple words — we apply a kind of regression of the adjusted one over the naked one. And this is the alpha — the added value. We verify, we check whether the added value in performance is statistically significant or not. When we start seeing that the added value is no longer statistically significant, that means that the information that you have added as adjustment is no longer relevant. So that's the first thing.

And the second thing is — I mean, it's judgmental. There is no AI, no sophistication in it. When you see that the data provider's data, or that the channel of the data, would take a couple of days, that means that maybe the information is already integrated in market prices. I don't know. I have no scientific answer to this question.

Moderator: All right, good, thank you. One last — anybody? Okay, otherwise I can ask you a question on the AI, because I was wondering if you can maybe elaborate slightly on that.

Hamza Bahaji: Ah, so that's — you should ask the question to the experts actually, and potentially the end users. Francois is here, I think — a colleague actually. He's here and you can have a discussion with him.

Moderator: Okay, then last one.

Audience Member: Do you have a specific pipeline or a solution to the forward-looking bias that you can potentially have in the LLM training? When using an LLM for classification or scoring, if you try to backtest and that news has been in the LLM training set —

Hamza Bahaji: We're not directly using the LLM actually. We're using the outcomes of the LLM models. This is why I said that one of the drawbacks is transparency, because we use the data that is provided. The model is built and run by a data provider, see what I mean? And since it's the data provider who's actually running this — I cannot directly address that. That's not on our side of the analysis.

Moderator: Thanks a lot, Hamza.

Hamza Bahaji: Thank you very much.