Category

Company Thoughts

Stay connected with milestones, announcements, and our regular newsletters + blogs

Newsletter

Aug 15, 2025

Catch & Signal Release: Hooking Alpha, Releasing Noise

The Kickoff

August is summer break for most, but if you’re like us, the work hasn’t slowed. Margin–cost spreads in APAC and North America are moving in opposite directions, new datasets are adding less obvious angles, and our conversation with Dr. Paul Bilokon underscored the value of infrastructure that delivers results in live conditions. Welcome to our summer edition, written for quants at their desk or their deck chair.

The Compass

Here's a rundown of what you can find in this edition:

  • A postcard with our most recent insights

  • Hot takes from our chat with Dr. Paul Bilokon

  • APAC Margins Take an Extended Holiday from Costs

  • If you're fishing for alpha, this dataset might help

  • All-you-can-read (and listen) buffet

  • Something chuckle-worthy to add to the holiday mood

Postcard from HQ

It’s been a month of preparation and building momentum. Much of our work has been behind the scenes, setting up the pieces that will come together over the next couple of weeks. On the product side, we’ve expanded our feature library by another 110,000 features, giving quants more signals to test across diverse market conditions. The UX/UI refresh we began last month is progressing well, now extending to other digital touchpoints to create a more cohesive experience across everything we do and aligning with our longer-term vision.

We’ve also been preparing for our mid-September event in New York with Rebellion and Databento, as a warm-up to the CFEM X Rebellion conference on September 19. Alongside that, September will see the launch of new and expanded products, with details to follow soon. And in the background, Quanted Roundups are preparing for a new era, with major changes in store for how we curate and share research with our audience.

The pieces have been coming together all summer, and September will show the results. We're looking forward to sharing more on this soon.

Expert Hot Takes

We recently had the chance to catch up with Dr. Paul Bilokon, someone whose name will be familiar to many in the quant world. He’s the founder and CEO of Thalesians, which he began while still working in industry and has since grown into an international community for collaboration and research at the intersection of AI, quantitative finance, and cybernetics, with a presence in London, New York, Paris, Frankfurt, and beyond. He is also a visiting professor at Imperial College London and a Research Staff Member at the university’s Centre for Cryptocurrency Research and Engineering, where he focuses on DeFi and blockchain, exploring cryptographic algorithms and inefficiencies in digital asset markets.

Before turning his focus fully to academia and research, Paul spent over a decade on the sell side, building trading systems and leading quant teams across Morgan Stanley, Lehman Brothers, Nomura, Citi, and Deutsche Bank, where he played a key role in developing electronic credit trading. Recognised as Quant of the Year in 2023, Paul has built a remarkable career on bridging academic depth with real-world application.

In our conversation, Paul shares how his experience on trading desks shaped his thinking, what excites him about the future of AI in finance, and why practical results still matter most in both research and application.

Having built algorithmic trading systems across FX and credit at institutions like Citi and Deutsche, what stands out to you as the most defining shift in how quant strategies are built and deployed since you entered the field?

I would love to say that there has been a shift towards slick, reliable deployment infrastructures, but this isn’t universally the case: many organisations (I won’t name them) remain pretty bad at infrastructure, making the same mistakes as those mentioned by Fred Brooks in The Mythical Man-Month. The successful ones, though, have learned the importance of infrastructure and that it pays to invest in frameworks just as much as it pays to invest in alpha. Such frameworks are well engineered, avoid spurious complexity and hide inevitable complexity; they are easy to extend (including when markets undergo transformative change) and, in the words of my former boss Martin Zinkin, “are easy to use correctly and difficult to use incorrectly.” Another boss of mine (I won’t name him as he likes to keep a low profile) points out the importance of adhering to Uncle Bob’s SOLID principles - many organisations have learned this lesson the hard way, although it’s always preferable to learn from the mistakes of others. Agile techniques are now universally accepted…

What principles or technical habits from your time on trading desks have stayed with you as you moved into research leadership, teaching, and advisory work? 

I haven’t really moved anywhere in the sense that I continue to trade, where appropriate to lead, teach, write, and advise. Let me perhaps highlight one of the lessons from trading desks that is particularly useful in all kinds of academic work: it’s knowing what works, what doesn’t work, and where to look for stuff that does work - and keeping things simple and results-oriented. When you own the PnL number, either on your own or jointly, you are naturally motivated by results, rather than by intellectual beauty, etc. So you get stuff done. This is something that was hammered into my head early on, since the days I was a mere Analyst. When you bring this to the academe, while keeping the intellectual rigour, the result is the underrated practically useful research. I’m not necessarily saying that all research is practically useful, but it’s a good feeling when some of your research finds significant applications.

Having worked at the intersection of quant teams, infrastructure, and AI, where do you see the greatest room for improvement in how firms move from research to live deployment?

Statistical rigor and attention to the (usually significant) possibility of overfitting come to mind. People are now acutely aware of the various cognitive biases and logical fallacies that lead to adverse results, and they compensate for that, which is good to see. Your infrastructure should make it difficult to mistranslate what you have done in research when you go to production, so this step should be smooth. Some refer to this as the live-backtest invariance, and some frameworks support it. I do find that putting quants under the pressures of Scrum and Kanban is sometimes productive, sometimes less so. Much of research work is nonlinear and involves leaps of intuition and nonlinear working habits (such as working in bursts or seeking inspiration from a walk in the park). Quants are geniuses, and they should be respected as such. Quite often we try to fit them into a conveyor belt. I would say that the system should indeed have conveyor belts here and there, but it has been a mistake to let go of the legendary Google do-what-you-like Fridays. They aren’t a gimmick, they are genuinely useful in the quant world.
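For readers who want a concrete picture of the live-backtest invariance Paul mentions, here is a minimal sketch in Python. The class and function names are our own illustrations rather than any particular framework's API: the point is simply that one pure signal function is shared by the backtest loop and the live path, so research code cannot quietly drift from production code.

```python
# Minimal sketch of live-backtest invariance: the same signal code runs in
# research and in production, so nothing is "mistranslated" on the way live.
# All names here are illustrative, not from any particular framework.
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Bar:
    close: float


def momentum_signal(history: Sequence[Bar], lookback: int = 20) -> float:
    """Pure function of past bars only, shared by backtest and live paths."""
    if len(history) <= lookback:
        return 0.0
    past, latest = history[-lookback - 1].close, history[-1].close
    return (latest - past) / past


def run_backtest(bars: Sequence[Bar],
                 signal: Callable[[Sequence[Bar]], float]) -> list:
    # The backtest feeds the signal exactly what live trading would see:
    # the bars available up to each point in time, never the full series.
    return [signal(bars[: i + 1]) for i in range(len(bars))]


def on_live_bar(history: list, new_bar: Bar,
                signal: Callable[[Sequence[Bar]], float]) -> float:
    history.append(new_bar)
    return signal(history)  # identical code path to the backtest


if __name__ == "__main__":
    bars = [Bar(close=100 + i * 0.5) for i in range(30)]
    print(run_backtest(bars, momentum_signal)[-1])
    print(on_live_bar(bars[:-1], bars[-1], momentum_signal))
```

Because both paths call `momentum_signal` on exactly the bars available at decision time, any change made in research is automatically the production behaviour as well, which is the invariance being described.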

As ML/AI becomes more integrated into quant workflows, what shifts do you expect in how predictive models are designed, interpreted, or monitored?

First and foremost we are talking about more automation, which is often a good thing, except where it’s not. The idea that unskilled people can operate AI-driven systems is an illusion. In object-oriented programming we talk about encapsulation as a means of hiding complexity. But I don’t know a single object-oriented programmer who never had to break encapsulation to check what’s inside. The greatest risk here is the Newtonian vulgar Mechanick: a person who thoughtlessly feeds data into AI models and then uncritically processes the results. This is also known as a Chinese room and considered harmful. I’m an expert precisely because I value my agency, not because I’m given a particular set of tools.

At Thalesians, you’ve supported firms working with high-frequency and complex datasets. What recurring challenges do you see in how teams handle data for research and signal generation?

One of the challenges that we see in this space is siloing. Domain knowledge, the technical expertise needed for high-frequency data handling, and mathematical versatility often don’t co-exist; they are often relegated to particular silos. Managers should understand these things intimately and not at the level of buzzwords. Thalesians Ltd. often act as translators, as we speak all these languages.

Anything else you’d like to highlight for those who want to learn more, before we wrap up?

This is a great opportunity. First, I would like to invite the readers to join our MSc in Mathematics and Finance at Imperial College London, to the best of my knowledge the best such programme available anywhere in the world. If you dare of course. This is Imperial College London, and you should be pretty damn good. (If you are reading this, chances are that you are.)

If full-time education is not your thing at this stage of your life, I would like to invite you to the evening courses that Thalesians run with WBS: the Quantitative Developer Certificate and the Machine Learning Certificate in Finance.

There are a few things that I would like to highlight on my SSRN page: particularly the tail-aware approach to resource allocation developed in collaboration with Valeriya Varlashova; what I call deep econometrics; what I call topological number theory. Of course, your feedback is always welcome and actively encouraged.

And on a completely different but personally important note, the archive of Professor Paul Gordon James—long thought lost—has now been unearthed and made available for the first time. Born in Bath, Somerset, on July 4, 1870, Prof. James pursued his education at Christ Church, Oxford, before continuing his postgraduate studies at the Royal College of Science (now part of Imperial College London) under the guidance of Prof. Reginald Dixon.

The collection of his papers, chronicled in The Night’s Cipher, provides remarkable insights into 19th-century history, the nature of consciousness, and artificial intelligence—along with shocking revelations that may finally expose the true identity of Jack the Ripper. Equally controversial are the records concerning Prof. Dixon, a man whose actions remain the subject of fierce debate.

A collective of scholars, recognising the profound historical and philosophical implications of these writings, has taken it upon themselves to preserve and publish them. With the support of Atmosphere Press, Prof. James’s long-hidden work is now available to the public, allowing readers to explore the mysteries he left behind and determine the truth for themselves. You can learn more about this here.

Summer Figures

APAC Margin and Cost Trends at Multi-Year Extremes

July’s electronics manufacturing survey shows a sharp regional profitability spread. The APAC profit-margin diffusion index prints at 125 versus 88 in North America, a 37-point gap. Six-month expectations keep the spread wide at 34 points (129 versus 95), suggesting the divergence is expected to persist.

The underlying cost structure partly explains the gap. Fourteen percent of APAC firms report falling material costs compared with zero in North America. Labor cost relief is also unique to APAC, with 14 percent expecting declines, again compared with zero in North America. The APAC material cost index stands at 107 now, with forward expectations at 129 — a 22-point rise — indicating expected cost increases rather than declines. North America moves from 134 to 126, an 8-point drop, but from a much higher starting level, leaving net input-cost breadth still materially above APAC in current conditions.

From a modelling perspective, the joint margin–cost picture is notable. In APAC, the positive margin momentum in current conditions is paired with lower current cost breadth than North America, though forward cost expectations in APAC turn higher. North America’s setup shows contracting margins in current breadth terms with elevated cost levels, a combination that in past cycles has correlated with softer earnings trends - though the survey itself does not test that link.

For systematic portfolios, the survey’s orders and shipments data show APAC with a +22 gap in orders and +8 in shipments (future vs current) versus North America’s +10 and +13. Any reference to a “cost–margin composite” or percentile rank, as well as backtest hit-rates for long APAC / short North America configurations, comes from external modelling and is not part of the survey’s published results.

If APAC’s current-cost advantage continues alongside stronger margin breadth, while North America remains cost-pressured in current conditions, the setup could align with sustained cross-regional return differentials into the next reporting cycle - provided forward cost expectations in APAC soften rather than follow the current projected rise.

Source: Global Electronics Association: July 2025 Global Sentiment Report
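The headline spreads above reduce to simple differences of the quoted diffusion indices. A quick sketch, using only the figures cited in this section, in case you want to drop them into your own notebook:

```python
# Recompute the headline spreads from the survey figures quoted above.
current = {"APAC": {"margin": 125, "material_cost": 107},
           "NA":   {"margin": 88,  "material_cost": 134}}
future = {"APAC": {"margin": 129, "material_cost": 129},
          "NA":   {"margin": 95,  "material_cost": 126}}

margin_spread_now = current["APAC"]["margin"] - current["NA"]["margin"]  # 37 points
margin_spread_fwd = future["APAC"]["margin"] - future["NA"]["margin"]    # 34 points
apac_cost_move = future["APAC"]["material_cost"] - current["APAC"]["material_cost"]  # +22
na_cost_move = future["NA"]["material_cost"] - current["NA"]["material_cost"]        # -8

print(margin_spread_now, margin_spread_fwd, apac_cost_move, na_cost_move)
```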

Data Worth Your Downtime 

To some of us, nothing says summer like a fishing trip, but to others the real catch of fishing is the available datasets. Global Fishing Watch offers AIS-based vessel activity, port visits, loitering, encounters, and SAR detections, all delivered through APIs with near real-time refresh. This allows for building signals for seafood supply, identifying illicit transshipment risk, modeling port congestion, and nowcasting coastal macro indicators. They have API packages in both Python and R, allowing incorporation into factor models, ESG screens, and macro frameworks with globally consistent, time-stamped coverage. It’s undoubtedly a maritime catch worth reeling in for its niche investment potential.

See more here
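For the curious, here is a rough idea of how such data might be pulled into a research workflow. Treat it as a hedged sketch only: the endpoint path, parameter names, and response fields below are assumptions for illustration, so check the Global Fishing Watch API documentation and their official Python and R packages for the real interface.

```python
# Rough sketch of pulling Global Fishing Watch events into pandas for signal
# research. The endpoint path, parameter names, and response shape below are
# assumptions for illustration only; consult the official API documentation
# and the Python/R client packages for the real interface.
import os

import pandas as pd
import requests

API_TOKEN = os.environ["GFW_API_TOKEN"]  # issued when you register with GFW
BASE_URL = "https://gateway.api.globalfishingwatch.org/v3"  # assumed base URL
PORT_VISITS_DATASET = "REPLACE_WITH_PORT_VISITS_DATASET_ID"  # see the docs


def fetch_port_visits(start: str, end: str) -> pd.DataFrame:
    """Fetch port-visit events for a date range (hypothetical parameters)."""
    resp = requests.get(
        f"{BASE_URL}/events",
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        params={"datasets": PORT_VISITS_DATASET, "start-date": start,
                "end-date": end, "limit": 1000},
        timeout=30,
    )
    resp.raise_for_status()
    return pd.json_normalize(resp.json().get("entries", []))


# Example: weekly port-visit counts as a crude port-congestion proxy.
visits = fetch_port_visits("2025-06-01", "2025-07-31")
if not visits.empty and "start" in visits.columns:
    weekly = (visits.assign(start=pd.to_datetime(visits["start"]))
                    .set_index("start").resample("W").size())
    print(weekly.tail())
```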

On The Lounger

We know you're probably still thinking about work anyway. Here's some stuff to keep your mind busy when you're supposed to be doing nothing but can't quite turn off the mental models:

📚 FIASCO by Frank Partnoy

📚 Stabilizing an Unstable Economy by Hyman Minsky

📚 The Art of Doing Science and Engineering: Learning to Learn by Richard W. Hamming

📚 Where Are the Customers' Yachts? by Fred Schwed

📚 The Hedge Fund Investing Chartbook: Quantitative Perspectives on the Modern Hedge Fund Investing Experience

📄 Super upside factor by Daniel Shin Un Kang

📄 The behavioural biases of fund managers by Joachim Klement

📄 Probability vs. Likelihood: The Most Misunderstood Duo in Data Science by Unicorn Day

📄 Diving into std::function by Ng Song Guan

📄 The Limits of Out-of-Sample Testing by Nam Nguyen Ph.D.

🎧 Quant Trading: How Hedge Funds Use Data | Marco Aboav, Etna Research by George Aliferis, Investology

🎧 The Psychology of Human Misjudgment by We Study Billionaires

🎧 Laurens Bensdorp - Running 55+ Systematic Trading Strategies Simultaneously by Chat with Traders

🎧 Searching for Signals: BlackRock’s Raffaele Savi on the Future of Systematic Investing by Goldman Sachs Exchanges

🎧 Vinesh Jha: The craft of mining alternative data by The Curious Quant


Finance Fun Corner

——

Disclaimer: This newsletter is for educational purposes only and does not constitute financial advice. Any trading strategy discussed is hypothetical, and past performance is not indicative of future results. Before making any investment decisions, please conduct thorough research and consult with a qualified financial professional. Remember, all investments carry risk.

Black and white photo of the beach on a cloudy day and a silhouette of someone fishing and two surfers in the water.


Newsletter

Jul 8, 2025

Feedback Loops: From Signals to Strategy

The Kickoff

July’s brought more than heat. We’ve been looking at how funds respond when signals blur, how markets react more sharply to policy news, and how ideas like AI agents get tested in real hedge fund settings. Some teams are adjusting. Others are sticking to what they know. This edition is about pressure and response. Because in 2025 we've learnt that no one gets to wait for certainty.

The Compass

Here's a rundown of what you can find in this edition:

  • Catching you up on what’s been happening on our side

  • Newest partner addition to the Quanted data lake

  • Insights from our chat with Niall Hurley 

  • The data behind the rising sensitivity of equities to policy and economic news

  • Some events this Q that will keep you informed and well socialised

  • How to build AI Agents for Hedge Fund Settings

  • Some insights we think you should hear from other established funds squinting at the signals.

  • Did you know this about stocks & treasury bills?

Insider Info

A bit of a reset month for us, as startup cycles go. We’ve been focused on laying the groundwork for the next phase of product and commercial growth. On the product side, we’re continuing to expand coverage, with another 1.3 million features added this month to help surface performance drivers across a wider range of market conditions. 

We also kicked off a UI refresh based on early adopter feedback, aimed at making data discovery and validation easier to navigate and more intuitive for everyone using the platform.

On the community side, we joined Databento’s London events and had the chance to catch up with teams from Man, Thalesians, and Sparta. David, our Co-Founder & CTO, made his first guest appearance on the New Barbarians podcast, where he shared a bit about his background and how we’re thinking about the data problems quant teams are trying to solve.

And in case you're based in New York: our CEO has officially relocated back to the city. If you're around and up for a coffee, feel free to reach out.

More soon.

On the Radar

One new data partner to welcome this month, as we focus on getting recent additions fully onboarded and integrated into the system. Each one adds to the growing pool of features quants can test, validate, and integrate into their strategies. A warm welcome to the partner below: 

Symbol Master

Offers validated U.S. options reference data, corporate action adjustments, and intraday symbology updates to support quants running systematic strategies that depend on accurate instrument mapping, reliable security masters, and low error tolerance across research and production environments.

The Tradewinds

Expert Exchange

We recently spoke with Niall Hurley about the evolving role of data in asset management, capital markets, and corporate strategy. With 24 years of experience spanning the sell side, the buy side, and the data vendor world, Niall brings a unique perspective on how data informs investment decisions and supports commercial growth.

His career includes roles in equity derivatives at Goldman Sachs and Deutsche Bank, portfolio management at RAB Capital, asset allocation at Goodbody, and M&A at Circle K Europe. He later led Eagle Alpha, one of the earliest alternative data firms, serving as CEO and Director for 7 years, where he worked closely with asset managers and data providers to shape how alternative data is sourced, evaluated, and applied. Today, Niall advises data vendors and corporates on how to assess data, create value more effectively, and uncover new opportunities.

Reflecting on your journey from managing portfolios to leading an alt data company and advising data businesses, how has the role of data in investment and capital markets evolved since you started - and what’s been the most memorable turning point in your career so far?

The biggest evolution has been the growth of the availability of datasets. This has allowed data-driven insights in addition to company and economic tracking in the last 10 years that was simply not possible 20 years ago.

The most memorable turning point was learning these use cases and applications of data sources in 2017 and 2018. You cannot unlearn them! I now listen in on any conversation or exercise as it relates to deal origination, company due diligence, business or economic forecasting, completely different compared to 10 years ago. Facts beat opinions. The availability of facts, via data, has exploded.

What mindsets or workflows from your hedge fund, allocator, and industry M&A roles proved most valuable when you transitioned into leading a data solution provider and advising data businesses?

The most important skills transfer was understanding companies and industries and the types of KPIs and measurements of businesses that are required by a private or public markets analyst.

Secondary to that, it was understanding the internals of an asset manager. I covered asset managers for derivatives, worked in a multi-strategy and allocated to managers. Whenever I spend time with an asset manager, I try to consider their entire organisation, different skill sets and where the data flows from and to both in terms of central functions and decentralised strategies and teams.

Having worked on both the buy side and with data vendors, where do you see the greatest room for innovation in how firms handle data infrastructure?

It would be wrong not to mention AI. To date, generative AI and LLMs have been mainly utilised outside of production environments and away from live portfolios and trading algorithms, but that is now starting to evolve based on my recent conversations.

In many ways, nothing has changed prior to the “GPT era” - the asset management firms that continue to invest in data infrastructure, talent, and innovation are correlating with those with superior fund performance and asset growth.

Likewise, winning data vendors continue to invest in infrastructure to deliver high-quality and timely data. Their ability to add an analytics or insights layer to their raw data has declined in cost.

As alternative data becomes more embedded in investment workflows, where do you see the biggest opportunities to improve how teams extract and iterate on predictive signals at scale?

I still believe the market’s approach to backtesting data and evaluating data combinations to arrive at a signal is highly inefficient. When I worked in derivative markets, I saw decisions made with complex derivatives and hundreds of millions or billions of portfolio exposure in a fraction of the time it takes to alpha test a $100k dataset. Firms spend millions on sell-side research without alpha testing it. We know there is no alpha in sell-side research. There are a lot of contradictions; I guess every industry has these dynamics.

Compliance needs to be standardised and centralised; too much time is lost there. Data cleansing, wrangling, and mapping should see a structural improvement and collapse in time allocation thanks to new technologies. If we can do backtesting, blending, and alpha testing faster, the velocity of the ecosystem can increase in a non-linear and positive way - that is good for everyone.

Looking ahead, what kinds of data-intensive challenges are you most focused on solving now through your advisory work, whether with funds, vendors, or corporates?

Generally, it is helping funds that are focused on alpha, and winning assets on that basis, understand that if you are working with the same data types and processes in 10 years that you are working with today, your investors may allocate elsewhere. For vendors, there are a high number of sub-$5mn businesses trying to work out how they can become $20mn businesses or more, and brainstorming with CEOs and Founders to solve that. For companies, I still believe there is a lot of “big data” sitting in “small companies”, and they have no idea of its value.

They are the main things I think about every morning – that will keep me busy for a long time, and it never feels like work helping to solve those challenges. Data markets are always changing.

Anything else you’d like to highlight for readers to keep in mind going forward?

For my Advisory work, I send out a newsletter, direct to email, for select individuals when I have something important to say. I prefer to send it directly to people I know personally from my time in the industry. For example, this month I have taken an interest in groups like Perplexity, increasing their presence with their finance offering as they secure more data access. But also, I see a real risk of many “AI” apps failing as their data inputs are not differentiated from the incumbents. We saw one of those private equity apps that support origination / due diligence exit the market this month. I see a risk that we have overallocated to “AI” apps.

Numbers & Narratives

Macro Surprise Risk Has Doubled Since 2020

 

BlackRock’s latest regression work draws a clear line in the sand: the post-2020 market regime exhibits double the equity sensitivity to macro and policy surprises compared to the pre-2020 baseline. Their quant team regressed weekly equity index returns on the Citi Economic Surprise Index and the Trade Policy Uncertainty Index (z-scored), and found that the aggregate regression coefficients—a proxy for short-term macro beta—have surged to 2.0, up from a long-run average closer to 1.0 between 2004 and 2019.

This implies a structural shift in the return-generating process. Short-term data surprises and geopolitical signals now exert twice the force on equity prices as they did during the last cycle. With inflation anchors unmoored and fiscal discipline fading, the equity market is effectively operating without long-duration macro gravity.

 

Why this matters for quants:

  • Signal horizon compression: Traditional models assuming slow diffusion of macro information may underreact. Short-term macro forecast accuracy is now more alpha-relevant than ever.

  • Conditional vol scaling: Systems using fixed beta assumptions will underprice response amplitude. Macro-news-aware vol adjustment becomes table stakes.

  • Feature recalibration: Pre-2020 macro-beta priors may be invalid. Factor timing models need to upweight surprise risk and regime-aware features (e.g., conditional dispersion, policy tone).

  • Stress path modeling: With a 2× jump in sensitivity, tail events from unanticipated data (e.g., non-farm payrolls, inflation beats) are more potent. Impact magnitudes have changed even when probabilities haven’t.

  • Model explainability: For machine learning-driven equity models, the sharp rise in macro sensitivity demands clearer mapping between input variables and macro regimes for interpretability.

This reflects a change in transmission mechanics rather than a simple shift in volatility. The equity market is increasingly priced like a derivative on macro surprise itself. Quants who are not tracking this evolving beta risk may find their edge structurally diluted.

Source: BlackRock’s Midyear Global Investment Outlook Report

Link to BlackRock Midyear Global Investment Outlook

Time Markers

It’s somehow already Q3 and the calendar is filling up quickly. Especially later in the quarter, there’s a strong lineup of quant and data events to keep an eye on:

📆 ARPM Quant Bootcamp 2025, 7-10 July, New York | A four-day program bringing together quants, portfolio managers, and risk professionals to explore asset allocation, derivatives, and advanced quantitative methods.

📆 Eagle Alpha, 17 September, New York | A one-day event focused on how institutional investors source, evaluate, and apply alternative datasets.

📆 Data & AI Happy Hour Mixer, 17 September, New York | A chilled rooftop gathering for data and AI professionals ahead of the Databricks World Tour.

📆 Neudata, 18 September, London | A full-day event connecting data buyers and vendors to explore developments in traditional and market data.

📆 Cornell Financial Engineering 2025, 19 September, New York | A one-day conference uniting academics and practitioners to discuss AI, machine learning, and data in financial markets.

📆  Battle of the Quants, 23 September, London | A one-day event bringing together quants, allocators, and data providers to discuss AI and systematic investing.

📆 SIPUGday 2025, 23-24 September, Zurich | A two-day event uniting banks, data vendors, and fintechs to discuss innovation in market data and infrastructure.

📆 Big Data LDN 2025, 24-25 September, London | A two-day expo where data teams across sectors gather to explore tools and strategies in data management, analytics, and AI.

Navigational Nudges

If you’ve studied robotics, you know it teaches a harsh but valuable lesson: if a control loop is even slightly unstable, the arm slams into the workbench. Apply the same intolerance for wobble when you let a language model design trading signals. An AI agent can prototype hundreds of alphas overnight, but without hard-edged constraints it will happily learn patterns that exist only on your hard drive.

The danger isn’t that the model writes bad code. It’s that it writes seductive code. Backtests soar, Sharpe ratios gleam, and only later do you notice the subtle look-ahead, the synthetic mean-reversion baked into trade-price bars, or the hidden parameter explosion that made everything fit.

Why this matters

Quant desks already battle regime shifts and crowding. Layering a hyper-creative agent on top multiplies the ways a pipeline can hallucinate edge. Unless you engineer guard-rails as rigorously as you would for a safety-critical robot, you swap research velocity for capital erosion.

These are the tips I’d give if you're building an AI agent that generates and tests trading signals:

  1. Treat raw data like sensor feeds
    Build OHLC bars from bid-ask mid-prices, not last trades, and store opening and closing spreads. That removes fake mean-reversion and lets you debit realistic costs.

  2. Constrain the agent’s degrees of freedom
    Whitelist a compact set of inputs such as mid-price, VWAP, and basic volume. Limit it to a vetted set of transforms. No ad-hoc functions, no peeking at future books. Fewer joints mean fewer failure modes.

  3. Decouple imagination from evaluation
    Stage 1: the model drafts economic hypotheses. Stage 2: a separate test harness converts the formulas into code, charges realistic trading fees, and walks a rolling train/test split. Keep the fox out of the hen-house.

  4. Penalise complexity early
    Count operators or tree depth. If a feature exceeds the limit, force a rewrite. In robotics we call this weight-budgeting. Lighter parts mean fewer surprises.

  5. Track decay like component fatigue
    Log every alpha, its live PnL, and break-point tests. Retire signals whose correlations slip or whose hit-rate drifts below spec. Maintenance is better than post-crash autopsy.

  6. Correct for multiple testing
    Each strategy tested on the same dataset increases your chance of discovering false positives. Keep a running count of trials, apply corrections for multiple testing, and discount performance metrics accordingly. This protects your process from data-mining bias and ensures that the signals you promote are statistically credible. A minimal sketch of one such correction follows this list.
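
As a companion to point 6, here is a minimal sketch of one possible correction: each trial’s Sharpe ratio is converted to a p-value under a zero-Sharpe null (an i.i.d., roughly normal approximation) and a Benjamini-Hochberg adjustment is applied across trials. The function name and thresholds are assumptions for illustration, not a prescription.

```python
import numpy as np
from scipy.stats import norm

def surviving_strategies(annual_sharpes, years_of_data, alpha=0.05):
    """Benjamini-Hochberg correction across backtested strategies.

    annual_sharpes : annualised Sharpe ratio of each trial
    years_of_data  : length of the backtest in years (same for all trials here)
    Returns a boolean array: True where a trial survives the correction.
    """
    sr = np.asarray(annual_sharpes, dtype=float)
    t_stats = sr * np.sqrt(years_of_data)      # t-stat under the null SR = 0
    pvals = norm.sf(t_stats)                   # one-sided p-values
    order = np.argsort(pvals)
    m = len(pvals)
    thresholds = alpha * np.arange(1, m + 1) / m
    passed = pvals[order] <= thresholds        # BH step-up condition
    k = passed.nonzero()[0].max() + 1 if passed.any() else 0
    survivors = np.zeros(m, dtype=bool)
    survivors[order[:k]] = True                # keep only the k smallest p-values
    return survivors

# e.g. 40 trials on 5 years of data: most apparent "discoveries" do not survive
rng = np.random.default_rng(0)
print(surviving_strategies(rng.normal(0.2, 0.5, size=40), years_of_data=5))
```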

AI can speed up signal generation, but judgment and process determine whether those signals hold up. Treat it like you would a junior quant: give it structure, review its output, and never skip validation. The value lies not in automation itself, but in the rigour you apply when filtering ideas and deciding what makes it into production. Without that discipline, faster research just means faster failure.

The Knowledge Buffet

📝 Systematic Strategies and Quant Trading 2025  📝

by HedgeNordic

The report pulls together a series of manager writeups on how different systematic funds are adapting to today's harder-to-read markets. It's not trying to make a single argument or push a trend. Instead, you get a mix: some focus on execution and trade design, others on regime detection, signal fragility, or capacity constraints. A few make the case for sticking with simple models, others are exploring more adaptive frameworks. It's worth reading if you're interested in how different teams are handling the same pressures, without assuming there's one right answer.

The Closing Bell

Did you know?

Only 42% of U.S. stocks have outperformed one-month Treasury bills over their entire lifetime.


Newsletter

Jun 4, 2025

Partial Moments & Complete Recovery Odds

The Kickoff

June’s here, and with it, a reminder that recovery isn’t always linear, whether you’re talking models, markets, or mindset. We’ve been spending time in the in-betweens: between regimes, between theory and practice, between what the data says and what it means. Not everything resolves neatly. But that’s often where the best questions live. 

The Compass

Here's a rundown of what you can find in this edition:

  • Some updates for anyone wondering what we’ve been up to.

  • What we learned from our chat with Fred Viole, founder of OVVO Labs

  • What the data says about US stock drawdowns and recovery odds

  • What we’re watching in the markets right now

  • The do’s and don’ts of choosing a time-series CV method

  • Some insights we think you should hear from Mark Fleming-Williams on data sourcing. 

  • Your daily dose of humour - because you deserve it.

Insider Info

A milestone month across funding, product, and team. Our most recent fundraise is now officially out in the wild, with coverage from Tech.eu. The round is a foundational step in backing the technical buildout needed to bring faster, more robust data validation to quant finance.

On the product side, we’ve now surpassed 3.2 million features in our system. We’ve also spent most of the month refining the product, where we have:

  • introduced aggregated reports that summarise results across multiple tests, helping users make quicker and more confident dataset decisions.

  • added features that capture clustered signals describing current market states, giving users more context around model performance.

  • expanded user controls, letting quants customise filters and preferences to surface data that aligns with their strategy or domain focus.

As part of our ongoing effort to give young talent a tangible entry point into quantitative finance, we welcomed Alperen Öztürk this month as our new product intern. Our product internships offer hands-on experience and the chance to work closely with our CTO and senior team. 

We’re also actively growing the team. Roles across data science, full stack development, product, and GTM frequently pop up, so keep an eye on our job ads page if you or someone in your network is exploring new opportunities. Plenty more in the works :)

On the Radar

We’ve welcomed new data partners this month, each enriching the pool of features quants have at their fingertips, all ready to unpack, test, and integrate into their strategies. A warm welcome to the partners below:

Context Analytics

Provides structured, machine-readable data from social media, corporate filings, and earnings call transcripts, enabling quants to integrate real-time sentiment and thematic signals into predictive models for alpha generation and risk assessment. 

Paragon Intel

Provides analytics on 2,000 company c-suites, linking executive ability to future company performance. Leverages proprietary interviews with former colleagues, predictive ratings, and AI analysis of earnings call Q&A to produce consistent, predictive signals.

The Tradewinds

Expert Exchange

We recently sat down with Fred Viole, Founder of OVVO Labs and creator of the NNS statistical framework, to explore his nonlinear approach to quantitative finance. With a career spanning decades as a trader, researcher, and portfolio manager, including time at Morgan Stanley and TGAM, Fred brings a distinctive perspective that bridges behavioural finance, numerical analysis, and machine learning.

Fred is also the co-author of Nonlinear Nonparametric Statistics and Physics Envy, two works that rethink risk and utility through a more flexible and data-driven lens. Alongside his research, he has developed open-source tools like the NNS and meboot R packages, which allow quants to model uncertainty and asymmetry without relying on restrictive assumptions. These methods now power a range of applications, from macro forecasting to option pricing and portfolio optimisation.

In our conversation, Fred shares the ideas behind partial moments, the need to move beyond symmetric risk metrics, and how OVVO Labs is translating nonlinear statistics into real-world applications for quants and investors alike. 

You’ve traded markets since the 1990s. What’s the biggest change you’ve noticed in how quants approach statistical modeling and risk since then?

My passion for markets started early, shaped by my father’s NYSE seat and our Augusts spent at Monmouth Park and Saratoga racetracks, watching his horses run while learning probability through betting and absorbing trading anecdotes. By the time I left Morgan Stanley in 1999 to run a day trading office, handling 20% of daily volume in stocks like INFY, SDLI, and NTAP with sub-minute holds felt like high-frequency trading, until decimalization drove us into sub-second rebate trading.

The biggest shift in quantitative finance since then has been the relentless push to ultra-high frequencies, where technological edge, latency arbitrage, co-location, and fast execution often overshadow statistical modeling. While high-frequency trading leans on infrastructure, longer-term stat-arb has pivoted from classical statistical methods, which struggle with tail risks and nonlinearities, to machine learning (ML) techniques that promise to capture complex market dynamics.

But ML’s sophistication masks a paradox. While it detects nonlinear patterns, its foundations of covariance matrices and correlation assumptions are inherited from classical statistics. My work with partial moments addresses this: tools like CLPM (Co-Lower Partial Moments) and CUPM (Co-Upper Partial Moments) quantify how assets move together in crashes and rallies separately, without assuming linearity or normality. ML’s black-box models, by contrast, often obscure these dynamics, risking overfitting or missing tail events, a flaw reminiscent of 2008’s models, which collapsed under the weight of their own assumptions. 

The result? My framework bridges ML’s flexibility with classical rigor. By replacing correlation matrices with nonparametric partial moments, we gain robustness, nonlinear insights and interpretability, like upgrading from a blurry satellite image to a high-resolution MRI of market risks.

What single skill or mindset shift made the most difference when transitioning successfully from discretionary trading to fully automated systems? 

In the early 2000s, trading spot FX with grueling hours pushed me to automate my process. The pivotal mindset shift came from embracing Mandelbrot’s fractals and self-similarity, realizing all time frames were equally valid for trading setups. By mathematically modeling my discretionary approach, I built a system trading FX, commodities, and equities, netting positions across independently traded time frames. This produced asymmetric, positively skewed returns, often wrong on small exposures (one contract or 100 shares) but highly profitable when all time frames aligned with full allocations, a dynamic I later captured with partial moments in my NNS R package.

This shift solved my position sizing problem, which I prioritize above exits and then entries, and codified adding to winning positions, a key trading edge. It required abandoning my fixation on high win rates, accepting frequent small losses for outsized gains, a principle later reflected in my upper partial moment (UPM) to lower partial moment (LPM) ratio.

Can you walk us through the moment you first realised variance wasn’t telling the full story, and how that led you to partial moments?

In the late 2000s, a hiring manager at a quant fund told me my trading system’s Sharpe ratio was too low, despite its highly asymmetrical risk-reward profile and positively skewed returns. Frustrated, I consulted my professor, who pointed me to David Nawrocki, and during our first meeting, he sketched partial moment formulas on a blackboard (a true a-ha moment for me!). It clicked that variance treated gains and losses symmetrically, double-counting observations as both risk and reward in most performance metrics, which misaligned with my trading intuition from years at Morgan Stanley and running a day trading office. This led me to develop the upper partial moment (UPM) to lower partial moment (LPM) ratio as a Sharpe replacement, capturing upside potential and downside risk separately in a nonparametric way.

The enthusiasm for the UPM/LPM ratio spurred years of research into utility theory to provide a theoretical backbone, resulting in several published papers on a full partial moments utility function. Any and all evaluation of returns inherently maps to a utility function, an inconvenient truth for many quants. I reached out to Harry Markowitz, whose early utility work resonated with my portfolio goals, sparking a multi-year correspondence. He endorsed my framework, writing letters of recommendation and acknowledging that my partial moments approach constitutes a general portfolio theory, with mean-variance as a subset.

This work, leveraging the partitioning of variance and covariance, eventually refactored traditional statistical tools (pretty much anything with a σ in it) into their partial moments equivalents, leading to the NNS (Nonlinear Nonparametric Statistics) R package. Today, NNS lets quants replace assumptions-heavy models with flexible, asymmetry-aware tools, a direct outcome of that initial frustration with variance’s blind spots.
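
For readers new to the notation, here is a minimal sketch of degree-1 partial moments and the UPM/LPM ratio described above, written from the standard definitions rather than taken from the NNS package; the function names and the zero target are illustrative.

```python
import numpy as np

def upm(returns, target=0.0, degree=1):
    """Upper partial moment: mean of (r - target)^degree over the upside."""
    r = np.asarray(returns, dtype=float)
    return np.mean(np.maximum(r - target, 0.0) ** degree)

def lpm(returns, target=0.0, degree=1):
    """Lower partial moment: mean of (target - r)^degree over the downside."""
    r = np.asarray(returns, dtype=float)
    return np.mean(np.maximum(target - r, 0.0) ** degree)

def upm_lpm_ratio(returns, target=0.0, degree=1):
    """Reward/risk ratio that scores upside and downside separately."""
    downside = lpm(returns, target, degree)
    return np.inf if downside == 0 else upm(returns, target, degree) / downside

# a positively skewed return stream: frequent small losses, occasional large gains
rng = np.random.default_rng(1)
skewed = np.concatenate([rng.normal(-0.002, 0.005, 900), rng.normal(0.05, 0.01, 100)])
print(round(upm_lpm_ratio(skewed), 3))
```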

How is the wider adoption of nonlinear statistical modelling changing the way quants design strategies, test robustness, and iterate on their models as market conditions evolve?

Nonlinear statistical modeling, like my partial moments framework, is transforming quant strategies by prioritizing the asymmetry between gains and losses, moving beyond linear correlations and Gaussian assumptions to capture complex market dynamics. Despite this progress, many quants still rely on theoretically flawed shortcuts like CVaR, which Harry Markowitz rejected for assuming a linear utility function for losses beyond a threshold, contradicting decades of behavioral finance research. My NNS package addresses this with nonparametric partial moment matrices (CLPM, CUPM, DLPM, DUPM), which reveal nonlinear co-movements missed by traditional metrics. For instance, my stress-testing method isolates CLPM quadrants to preserve dependence structures in extreme scenarios, outperforming standard Monte Carlo simulations.

Robustness testing has evolved significantly with my Maximum Entropy Bootstrap, originally inspired by my co-author Hrishikesh Vinod, who worked under Tukey at Bell Labs and encouraged me to program NNS in R. This bootstrap generates synthetic data with controlled correlations and dependencies, ensuring strategies hold up across diverse market conditions. If your data is nonstationary and complex (e.g., financial time series with regime shifts), empirical distribution assumptions are typically preferred because they prioritize flexibility and fidelity to the data’s true behavior.
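
As a rough companion to the co-movement matrices mentioned above, this is a minimal sketch of degree-1 co-partial moments between two return series. The quadrant labels follow one common convention and, like the targets and function name, are assumptions for illustration rather than the NNS implementation.

```python
import numpy as np

def co_partial_moments(x, y, tx=0.0, ty=0.0):
    """Degree-1 co-partial moments of two return series around targets tx, ty.

    CLPM: both series below target (joint downside)
    CUPM: both series above target (joint upside)
    DLPM / DUPM: the two mixed quadrants (label orientation assumed here)
    The quadrants decompose the shifted cross-moment:
      E[(x - tx)(y - ty)] = CUPM + CLPM - DLPM - DUPM
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_up, x_dn = np.maximum(x - tx, 0.0), np.maximum(tx - x, 0.0)
    y_up, y_dn = np.maximum(y - ty, 0.0), np.maximum(ty - y, 0.0)
    return {
        "CLPM": np.mean(x_dn * y_dn),
        "CUPM": np.mean(x_up * y_up),
        "DLPM": np.mean(x_dn * y_up),
        "DUPM": np.mean(x_up * y_dn),
    }
```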

As market structure evolves, where do you think nonlinear tools will add the most value over the next decade?

Over the next decade, nonlinear tools like partial moments will add the most value in personalized portfolio management and real-time risk assessment. As markets become more fragmented with alternative assets and high-frequency data, traditional models struggle to capture nonlinear dependencies and tail risks. My partial moment framework, embedded in tools like the OVVO Labs portfolio tool, allows investors to customize portfolios by specifying risk profiles (e.g., loss aversion to risk-seeking), directly integrating utility preferences into covariance matrices. This is critical as retail and institutional investors demand strategies tailored to their unique risk tolerances, especially in volatile environments. Not everyone should have the same portfolio!

Additionally, nonlinear tools will shine in stress testing and macro forecasting. My stress-testing approach and my MacroNow tool demonstrate how nonparametric methods can model extreme scenarios and predict macroeconomic variables (e.g., GDP, CPI) with high accuracy. As market structures incorporate AI-driven trading and complex derivatives, nonlinear tools will provide the flexibility to adapt to new data regimes, ensuring quants can manage risks and seize opportunities in real time.

What is the next major project or initiative you’re working on at OVVO Labs, and how do you see it improving the quant domain?

At OVVO Labs, my next major initiative is to integrate a more conversational interface for the end user, while also offering more customization and API access for more experienced quants. This platform will leverage partial moments to offer quants and retail investors a seamless way to construct utility-driven strategies, stress-test portfolios, and forecast economic indicators, all while incorporating nonlinear dependence measures and dynamic regression techniques from NNS.

This project will improve the quant domain by democratizing advanced nonlinear tools, making them as intuitive as mean-variance models but far more robust. By bridging R and Python ecosystems and enhancing our GPT tool, we’ll empower quants to rapidly prototype and deploy strategies that adapt to market shifts, from high-frequency trading to long-term asset allocation. The goal is to move the industry toward empirical, utility-centric modeling, reducing reliance on outdated assumptions and enabling better decision-making in complex markets.

Anything else you'd like to highlight for those looking to deepen their statistical toolkit?

I’m excited to promote the NNS R package, a game-changer for statistical analysis across finance, economics, and beyond. These robust statistics provide the basis for nonlinear analysis while retaining linear equivalences. NNS offers: Numerical integration, Numerical differentiation, Clustering, Correlation, Dependence, Causal analysis, ANOVA, Regression, Classification, Seasonality, Autoregressive modeling, Normalization, Stochastic dominance and Advanced Monte Carlo sampling, covering roughly 90% of applied statistics. Its open-source nature on GitHub makes it accessible for quants, researchers, and students to explore data-driven insights without rigid assumptions, as seen in applications like portfolio optimization and macro forecasting.

All of the material including presentation, slides and an AI overview of the NNS package can be accessed here. Also, all of the apps have introductory versions where users can get accustomed to the format and output for macroeconomic forecasting, options pricing and portfolio construction via our main page.

If you're curious to learn more about Fred’s fascinating work on Partial Moments Theory & its applications, Fred's created a LinkedIn group where he shares technical insights and ongoing discussions. Feel free to join here

Numbers & Narratives

Drawdown Gravity: Base-Rate Lessons from 6,500 U.S. Stocks

Morgan Stanley just released a sweeping analysis of 40 years of U.S. equity drawdowns, tracking over 6,500 names across their full boom–bust–recovery arcs. The topline stat is brutal: the median maximum drawdown is –85%, and more than half of all stocks never reclaim their prior high. Even the top performers, those with the best total shareholder returns, endured drawdowns >–70% along the way. 

Their recovery table is where it gets even more interesting. Past a –75% drop, the odds of ever getting back to par fall off sharply. Breach –90% and you're down to coin-flip territory. Below –95%, just 1 in 6 names ever recover, and the average time to breakeven stretches to 8 years. Rebounds can be violent, sure, but they rarely retrace the full path. Deep drawdowns mechanically produce large % bounces, but they do not imply true recovery.
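
To make the recovery arithmetic concrete, here is a minimal sketch that computes the maximum drawdown of a single name and, where it exists, the number of bars from the trough back to the prior peak level. It assumes a pandas Series of adjusted prices; the function name is illustrative.

```python
import pandas as pd

def max_drawdown_and_recovery(prices: pd.Series):
    """Return (max drawdown as a negative fraction, bars from trough to recovery or None)."""
    running_peak = prices.cummax()
    drawdown = prices / running_peak - 1.0            # <= 0 everywhere
    trough = drawdown.idxmin()                        # date of the deepest drawdown
    peak_level = running_peak.loc[trough]             # high-water mark to reclaim
    after_trough = prices.loc[trough:]
    recovered = after_trough[after_trough >= peak_level]
    bars_to_recover = None if recovered.empty else len(after_trough.loc[:recovered.index[0]]) - 1
    return drawdown.loc[trough], bars_to_recover
```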

What this means for quants:

  • Tail-aware position sizing: If your models cap downside at –50%, you're underestimating exposure. Add tail priors beyond –75%, where the drawdown distribution changes shape sharply.

  • Drawdown gating for signals: Post-collapse reversal signals (value, momentum, etc.) need contextual features. Look for signs of business inflection, such as FCF turning, insider buys, or spread compression.

  • Hold cost assumptions: In the deep buckets, time-to-par often exceeds 5 years. That has material implications for borrow cost, capital lockup, and short-side carry in low-liquidity names.

  • Feature engineering: Treat drawdown depth as a modifier. A 5Y CAGR post –50% drawdown is not the same as post –90%. The conditional distribution is fat-tailed and regime-shifting.

  • Scenario stress tests: Do not assume mean reversion. Model drawdown breakpoints explicitly. Once a name breaches –80%, median recovery trajectories flatten fast.

  • Portfolio heuristics: If your weighting relies on mean reversion or volatility compression, consider overlaying recovery probabilities to avoid structural losers that only look optically cheap.

The data challenges the assumption that all drawdowns are temporary. In many cases, they reflect permanent changes in return expectations, business quality, or capital efficiency. Quants who treat large drawdowns as structural breaks rather than noise will be better equipped to size risk, gate signals, and allocate capital with more precision.

Link to the report
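These base rates are cheap to reproduce on your own universe. Below is a minimal Python sketch, assuming a pandas DataFrame of adjusted closes with a DatetimeIndex and one column per ticker (the names and bucket edges are ours, not Morgan Stanley's), that computes maximum drawdown depth, whether the prior high was ever reclaimed, and the days it took; bucketing the output gives a recovery table in the same spirit as the report's.

```python
import pandas as pd

def drawdown_stats(prices: pd.Series) -> dict:
    """Max drawdown depth, whether the prior high was reclaimed, and days to reclaim it.

    Assumes a DatetimeIndex and at least two valid observations.
    """
    running_peak = prices.cummax()
    drawdown = prices / running_peak - 1.0          # 0 at new highs, negative below them
    trough_date = drawdown.idxmin()
    peak_level = running_peak.loc[trough_date]      # the high in force at the trough
    recovered = prices.loc[trough_date:]
    recovered = recovered[recovered >= peak_level]
    return {
        "max_drawdown": float(drawdown.min()),
        "recovered": not recovered.empty,
        "days_to_recovery": (recovered.index[0] - trough_date).days if not recovered.empty else None,
    }

def recovery_base_rates(prices: pd.DataFrame) -> pd.DataFrame:
    """Share of names that reclaimed their prior high, bucketed by drawdown depth."""
    rows = [{"ticker": t, **drawdown_stats(prices[t].dropna())}
            for t in prices.columns if prices[t].notna().sum() > 1]
    stats = pd.DataFrame(rows)
    stats["bucket"] = pd.cut(stats["max_drawdown"],
                             bins=[-1.0, -0.95, -0.90, -0.75, -0.50, 0.0])
    return (stats.groupby("bucket", observed=True)["recovered"]
                 .mean()
                 .to_frame("recovery_rate"))
```

Feeding the per-name output back in as features (drawdown depth, recovery flag, days to par) is one direct way to act on the "treat depth as a modifier" point above.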

Market Pulse

Hedge funds posted strong gains in May, with systematic equity strategies up 4.2%, lifted by the sharp reversal in tech and AI-linked stocks following a de-escalation of tariff threats. Goldman Sachs noted the fastest pace of hedge fund equity buying since November 2024, concentrated in semiconductors and AI infrastructure, but this flow was unusually one-sided, suggesting not conviction but positioning risk if the macro regime turns. That fragility is precisely what firms like Picton Mahoney are positioning against; they’ve been buying volatility outright, arguing that the tariff “pause” is superficial and that policy risk remains deeply underpriced. Steve Diggle, too, sees echoes of 2008, pointing not to housing this time, but to opaque leverage in private credit and structurally overvalued equity markets, especially in the U.S., where he warns few investors are properly hedged. That concern is echoed institutionally: the Fed stayed on hold, warning that persistent fiscal imbalances and rising Treasury yields could weaken the foundations of the U.S.'s safe-haven role over time, a risk amplified by Moody’s recent downgrade of U.S. sovereign credit from Aaa to Aa1. While equities soared, the rally was narrow, factor spreads widened, and dispersion surged, leaving a market primed for relative value, long-volatility, and cross-asset macro strategies. Taken together, this is a market that rewards tactical aggression but punishes complacency—an environment where quant managers must read not just the signals, but the mispricings in how others are reacting to them.

Navigational Nudges

Cross-validation that ignores the structure of financial data rarely produces models that hold up in live trading. Autocorrelation, overlapping labels, and regime shifts make naïve splits risky by design. In practice, most overfitting in quant strategies originates not in the model architecture, but in the way it was validated. 

Here’s how to choose a split that actually simulates out-of-sample performance: 

  1. Walk Forward with Gap

    Useful for: Short-half-life alphas and data sets with long history.

    Train on observations up to time T, skip a gap at least as long as the label horizon (rule of thumb: gap ≥ horizon, often 1 to 2 times the look-ahead window), test on (T + g, T + g + Δ], then roll. Always use full trading days or months, never partial periods. A sketch of this split, together with the purged k-fold below, follows the list.

  2. Purged k-Fold with Embargo (López de Prado 2018)

    Useful for: Limited history or overlapping labels in either time or cross section.

    Purge any training row whose outcome window intersects the test fold, then place an embargo immediately after the test block. Apply the purge across assets that share the same timestamp to stop cross-sectional leakage. If data are scarce, switch to a blocked or stationary bootstrap to keep dependence intact.


  3. Combinatorial Purged CV (CPCV)

    Useful for: Final-stage robustness checks on high-stakes strategies.

    Evaluate every viable train-test split under the same purging rules, then measure overfitting with the Probability of Backtest Overfitting (PBO) and the Deflated Sharpe. Combinations scale roughly as O(n²); budget compute or down-sample folds before running the full grid.


  4. Nested Time-Series CV

    Useful for: Hyper-parameter-hungry models such as boosted trees or deep nets.

    Wrap tuning inside an inner walk-forward loop and measure generalisation on an outer holdout. Keep every step of preprocessing, including feature scaling, inside the loop to avoid look-ahead bias.
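To make items 1 and 2 concrete, here's a minimal sketch of both splitters over integer positions of a single time-ordered sample. The purge shown is the simple "outcome window overlaps the test block" rule rather than López de Prado's full implementation, the cross-asset purge is not shown, and every parameter name is illustrative.

```python
import numpy as np

def walk_forward_with_gap(n_samples, train_size, test_size, gap):
    """Yield (train_idx, test_idx): train up to T, skip a gap, test (T + g, T + g + Δ]."""
    start = 0
    while start + train_size + gap + test_size <= n_samples:
        train_end = start + train_size
        test_start = train_end + gap
        yield (np.arange(start, train_end),
               np.arange(test_start, test_start + test_size))
        start += test_size                      # roll forward by one test block

def purged_kfold_with_embargo(n_samples, n_folds, label_horizon, embargo):
    """k contiguous test folds; purge training rows whose outcome window touches
    the test fold, then embargo the block immediately after it."""
    positions = np.arange(n_samples)
    for test_idx in np.array_split(positions, n_folds):
        test_start, test_end = test_idx[0], test_idx[-1]
        # Purge: row i (label built from data up to i + label_horizon) leaks into the
        # test fold whenever that window overlaps [test_start, test_end].
        overlaps = (positions <= test_end) & (positions + label_horizon >= test_start)
        train_mask = ~overlaps
        # Embargo: also drop the rows immediately after the test block.
        train_mask[test_end + 1: test_end + 1 + embargo] = False
        yield positions[train_mask], test_idx

# Example: 1,000 daily observations, 5-day labels, 10-day embargo.
splits = list(purged_kfold_with_embargo(1_000, n_folds=5, label_horizon=5, embargo=10))
```

And for item 4, a bare-bones nested scheme using scikit-learn, where the estimator and parameter grid are placeholders and X, y are assumed to be time-ordered arrays:

```python
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def nested_walk_forward_score(X, y, outer_splits):
    """Tune on an inner walk-forward split, score generalisation on the outer test block."""
    scores = []
    for train_idx, test_idx in outer_splits:
        model = GridSearchCV(
            # Scaling stays inside the pipeline, so it is refit within each inner fold.
            make_pipeline(StandardScaler(), GradientBoostingRegressor()),
            param_grid={"gradientboostingregressor__max_depth": [2, 3, 4]},
            cv=TimeSeriesSplit(n_splits=3),     # inner walk-forward loop
        )
        model.fit(X[train_idx], y[train_idx])
        scores.append(model.score(X[test_idx], y[test_idx]))
    return scores
```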

Pick the simplest scheme that respects causality, then pressure test it with a stricter one. The market will always exploit the fold you didn’t test, and most models don’t fail because the signal was absent, they fail because the wrong validation was trusted. Nail that part and everything else gets a lot easier.

The Knowledge Buffet

🎙️ Trading Insights: All About Alternative Data 🎙️
by JP Morgan’s Making Sense

In this episode, Mark Fleming-Williams, Head of Data Sourcing at CFM and guest on the podcast, offers one of the most refreshingly honest accounts we've heard of what it really takes to get alternative data into production at a quant fund. From point-in-time structure to the true cost of trialing, it’s a sharp reminder of how tough the process is and a great validation of why we do what we do at Quanted. Well worth a listen.

The Closing Bell

——

Disclaimer: This newsletter is for educational purposes only and does not constitute financial advice. Any trading strategy discussed is hypothetical, and past performance is not indicative of future results. Before making any investment decisions, please conduct thorough research and consult with a qualified financial professional. Remember, all investments carry risk.

A black and white photo of a winding mountain road with light trails at nightfall

Newsletter

Apr 4, 2025

When Metadata, Modeling Habits & Macro Collide

The Kickoff

April’s here, and so are the regime shifts. As tax deadlines near and trade policy takes centre stage, quants are navigating a market where assumptions age quickly. In this edition, we look at what’s shifting, what’s scaling, and where models may need rethinking as Q2 begins.

The Compass

Here's a rundown of what you can find in this edition:

  • Some updates for anyone wondering what we’ve been up to.

  • Newest partner additions to the Quanted data lake.

  • What we learned from Francesco Cricchio, Ph.D., CEO of Brain.

  • A closer look at what the data says about using websites for stock selection

  • Market insights and conferences

  • How to model the market after Liberation Day.

  • A useful recommendation for any quant who's running on modeling habits.

  • A shocking fact we bet you didn't know about Medallion’s Sharpe ratio.

Insider Trading

Big strides this month on both the team and product side.

We’re excited to welcome Rahul Mahtani as our new data scientist. He joins from Bloomberg, where he built ESG and factor signals, portfolio optimisers, and market timing tools. With prior experience at Scientific Beta designing volatility tools and EU climate indices, he brings deep expertise in investment signals and model transparency.

On the product side, we’ve now rolled out the full suite of explainability tools we previewed in January: SHAP, ICE plots, feature interaction heatmaps, cumulative feature impact tracking, dataset usage insights—and new this month—causal graphs to interpret relationships between features and targets. We also crossed 3 million features in the system, marking a new scale milestone. More to come as we keep building—now with more hands on deck and even more momentum than ever before.

On the Radar

We’ve added BIScience, Brain, Flywheel, JETNET, and Kayrros to our data partner lake—expanding our alternative data coverage across advertising, sentiment, retail, aviation, and environmental intelligence. Each adds to the breadth of features we surface—highlighting what’s most uniquely relevant to your predictive model.

BIScience
Provides digital and behavioral intelligence with global coverage, offering insights into web traffic, advertising spend, and consumer engagement across platforms.

Brain
Delivers proprietary indicators derived from financial texts—such as news, filings, and earnings calls—by combining machine learning and NLP to transform unstructured text into quant-ready signals.

Flywheel
Provides investors with access to a vast eCommerce price & promo dataset that tracks changes in consumer product pricing, discounting, and availability across 650+ tickers and thousands of private comps over the last 7 years.

JETNET
Provides aviation intelligence with detailed data on aircraft ownership, usage, and transactions—delivering a comprehensive view of global aviation markets.

Kayrros
Delivers environmental intelligence using satellite and sensor data to measure industrial activity, emissions, and commodity flows with high spatial and temporal resolution.

The Tradewinds

Expert Exchange

We sat down with Francesco Cricchio, Ph.D., CEO at Brain, where he’s spent the last eight years applying advanced statistical methods and AI to quantitative investing. With a PhD in Computational Physics and years of experience leading R&D projects across the biomedical, energy, and financial sectors, Francesco has built a career solving complex problems — from quantum-level simulations in solid-state physics to machine learning applications in finance.

In our conversation, he shared how a scientific mindset shapes his work in Finance today, the innovations driving his team’s research, and the importance of rigorous validation and curiosity in building alternative datasets and quantitative investment strategies.

Over your 8+ years leading Brain, how have you seen the alternative data industry evolve? What’s been the most defining or memorable moment for you?

When we started Brain in 2016, alternative data was still a niche concept — an experimental approach explored by only a few technologically advanced investment firms. Over the years, we’ve seen it transition from the periphery to the core of many investment strategies. A defining moment for us came in 2018, when a major U.S. investment fund became a client and integrated one of our Natural Language Processing–based datasets — specifically focused on sentiment analysis from news — into one of their systematic strategies.

Looking back, what 3 key skills from your physics-to-finance journey have proven most valuable in your career?

During my research in Physics, I focused on predicting material properties using the fundamental equations of Quantum Physics through computer simulations. It’s a truly fascinating process — being able to model reality starting from first principles and, through simulations, generate real, testable predictions.

Looking back, three key skills from that experience have proven especially valuable in my work within quantitative finance and data science:

  1. Analytical rigor: Physics taught me how to model complex systems and break them down into solvable problems. This structured approach is crucial when tackling the challenges of quantitative finance.

  2. Strong statistical foundations: The ability to separate signal from noise and rigorously validate assumptions has been essential for building robust and reliable indicators in quantitative research. A constant focus is on mitigating overfitting risk to ensure more stable and consistent out-of-sample performance in predictive models.

  3. Curiosity-driven innovation: Physics trains you to ask the right questions, not just solve equations. That mindset has been instrumental in helping Brain innovate and lead in the application of AI to financial markets.

Could you provide a couple of technical insights or innovations in quantitative data that you believe would be particularly interesting to investment professionals?

One example is the use of transfer learning with Large Language Models (LLMs), where models trained on general financial text can be fine-tuned for specific, domain-focused tasks. We successfully applied this approach in one of the datasets we launched last year: Brain News Topics Analysis with LLM. This dataset leverages a proprietary LLM framework to monitor selected financial topics and their sentiment within the news flow for individual stocks.

Before the advent of LLMs, building NLP models required creating custom architectures and training them from scratch using large labeled datasets. The introduction of LLMs has significantly simplified and accelerated this process. Moreover, with additional customization, LLMs can now be used to understand complex patterns in financial text — a capability that proved crucial when we developed a sentiment indicator for commodities. Given the inherent complexity of commodity-related news, this approach has been especially effective in identifying key factors driving supply and demand, as well as assessing the sentiment associated with those shifts.
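Brain's LLM framework is proprietary, so purely as an illustration of the transfer-learning pattern Francesco describes, here's a minimal sketch that applies a publicly available financial-language model off the shelf. The Hugging Face transformers library, the ProsusAI/finbert checkpoint, and the sample headlines are assumptions of the example, not part of Brain's stack.

```python
from transformers import pipeline

# A domain-adapted model (FinBERT) already carries general financial-language
# knowledge; applying it, or further fine-tuning it on a narrower task, is the
# transfer-learning step described above.
classifier = pipeline("text-classification", model="ProsusAI/finbert")

headlines = [
    "Company X raises full-year guidance on stronger demand",        # illustrative text
    "Regulator opens probe into Company Y's accounting practices",   # illustrative text
]
for h in headlines:
    result = classifier(h)[0]   # e.g. {'label': 'positive', 'score': 0.93}
    print(h, "->", result["label"], round(result["score"], 2))
```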

As more data providers offer production-ready signals rather than raw inputs, how do you see the relationship between investment teams and data providers changing?

The relationship is becoming more collaborative and strategic. Investment teams no longer want just a data dump — they’re looking for interpretable, backtested, and explainable signals that can integrate into their decision-making frameworks. Providers like us are becoming more like R&D partners, co-developing solutions with clients and offering transparency into how signals are derived.

In line with this direction, we are also investing in the development of validation use cases through our proprietary validation platform. This tool enables clients to assess the effectiveness of our datasets by performing statistical analyses such as quintile analysis and Spearman rank correlation. Beyond supporting our clients, the platform has also proven valuable for other alternative data providers, offering an independent and rigorous framework to validate the performance and relevance of their own datasets.
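For readers who want to run these checks on their own data, here is a minimal sketch of quintile analysis and per-date Spearman rank correlation (rank IC). It is our own illustrative example on synthetic data, not Brain's validation platform; the column names signal and fwd_return are assumptions.

```python
# Minimal sketch of two standard signal checks: quintile analysis and per-date
# Spearman rank correlation (rank IC). Synthetic data, illustrative only.
import numpy as np
import pandas as pd
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
dates = pd.date_range("2024-01-01", periods=12, freq="MS")
df = pd.DataFrame({
    "date": np.repeat(dates, 100),          # 100 names per month
    "signal": rng.normal(size=1200),
})
df["fwd_return"] = 0.02 * df["signal"] + rng.normal(scale=0.05, size=1200)  # toy returns

# Quintile analysis: average forward return per signal quintile, formed per date
df["quintile"] = df.groupby("date")["signal"].transform(
    lambda s: pd.qcut(s, 5, labels=False) + 1)
quintile_returns = df.groupby("quintile")["fwd_return"].mean()
print(quintile_returns)  # a monotone pattern suggests the signal ranks returns

# Spearman rank correlation (rank IC) per date, then averaged across dates
ic_by_date = df.groupby("date").apply(
    lambda g: spearmanr(g["signal"], g["fwd_return"])[0])
print("mean rank IC:", ic_by_date.mean())
```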

Aside from AI, what other long-term trends do you see shaping quant research over the next 5, 10, and 25 years?

Of course, it’s very difficult to make specific predictions — especially over long horizons. That said, in the short to mid-term, I foresee a growing adoption of customized Large Language Models (LLMs) tailored to specific financial tasks. These will enable more targeted insights and greater automation in both research and strategy development.

Over a longer time horizon, I believe we’ll see a convergence between human and machine decision-making — not just in terms of speed and accuracy, but in the ability to understand causality and intent in financial markets. This would represent a major shift from traditional statistical models to systems capable of interpreting the underlying drivers of market behavior.

Another significant challenge — and opportunity — lies in gaining a deeper understanding of market microstructure. It remains one of the most intricate and active areas of research, but I believe that in the long run, advancements in this field will dramatically improve how strategies are developed and executed, particularly in high-frequency and intraday trading.

What is the next major project or initiative you’re working on, and how do you see it improving the alternative data domain? 

Our current focus — building on several datasets we've had in live production for years — is to add an additional layer of intelligence by combining multiple data sources into more robust signals. One of our key innovations in this area is the development of multi-source ensemble indicators, which aggregate sentiment metrics from sources such as news, earnings calls, and company filings into a single, unified sentiment signal. These combined indicators often outperform those based on individual sources and significantly reduce noise.

This philosophy is embodied in our recently launched dataset, Brain Combined Sentiment. The dataset provides aggregated sentiment metrics for U.S. stocks, derived from multiple textual financial sources — including news articles, earnings calls, and regulatory filings. These metrics are generated using various Brain datasets and consolidated into a single, user-friendly file, making it especially valuable for investors conducting sentiment-based analysis.
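As a rough illustration of the ensemble idea, the sketch below standardises several per-source sentiment scores cross-sectionally and averages them into one signal. It is a generic construction on synthetic data, not the methodology behind Brain Combined Sentiment; the source column names and equal weights are assumptions.

```python
# Minimal sketch of combining sentiment from several sources into one ensemble
# signal. Illustrative only; not the Brain Combined Sentiment methodology.
import numpy as np
import pandas as pd

sources = ["news_sent", "earnings_call_sent", "filings_sent"]  # assumed names
rng = np.random.default_rng(1)
dates = pd.date_range("2024-01-01", periods=6, freq="MS")
df = pd.DataFrame({
    "date": np.repeat(dates, 50),
    "ticker": np.tile([f"TICK{i:03d}" for i in range(50)], 6),
    "news_sent": rng.normal(size=300),
    "earnings_call_sent": rng.normal(size=300),
    "filings_sent": rng.normal(size=300),
})

# 1) Standardise each source cross-sectionally per date so scales are comparable
zscores = df.groupby("date")[sources].transform(lambda s: (s - s.mean()) / s.std())

# 2) Average across sources; equal weights here, though weights could instead
#    reflect each source's historical rank IC
df["combined_sent"] = zscores.mean(axis=1)

# Averaging several noisy reads of the same underlying sentiment tends to
# reduce noise relative to any single source
print(df[["date", "ticker", "combined_sent"]].head())
```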

Finally, a project we are currently developing involves the creation of mid-frequency systematic strategies. This initiative leverages proprietary optimization techniques and robust validation frameworks specifically tailored for parametric strategies within a machine learning context.

Looking ahead, is there anything else you'd like to highlight or leave on our readers’ radar?

We’re continuing to build new datasets that leverage the latest advancements in AI and statistical methods. All our recent developments are published on our website, braincompany.co, and on our LinkedIn page.

Numbers & Narratives

Wolfe Research's quants recently ran a study extracting 64 features from corporate websites—spanning 85 million pages—to build a stock selection model that delivered a Sharpe ratio of 2.1 and 16% CAGR since 2020. These features included structural metadata (e.g., MIME type counts, sitemap depth) and thematic divergence using QesBERT, their finance-specific NLP model.

This comes as web scraping enters a new growth phase. Lowenstein Sandler’s 2024 survey found that adoption jumped 20 percentage points—the largest gain across all tracked alternative data sources—with synthetic data also entering the landscape at scale. 

A few things stood out:

  • Sitemap depth ≠ investor depth: Bigger websites might just reflect broader business lines, more detailed investor relations (IR), or legal disclosure obligations. Without knowing why a site expanded, it’s easy to overfit noise disguised as structural complexity.

  • Topic divergence has legs: The real signal lies in thematic content. Firms diverging from sector peers in what they talk about may be early in narrative cycles—especially in AI or clean energy, where perception moves flows as much as fundamentals.

  • This isn’t just alpha—it’s alt IR: The takeaway isn’t that websites matter. It’s that investor messaging is now machine-readable. That reframes how we model soft signals.

For quants, this hints at a broader shift: public-facing digital communication is becoming a reliable, model-ready input. IR messaging, once designed purely for human consumption, is now being systematically parsed—creating a new layer of insight few are modeling. In other words, the signal landscape is changing. And the universe of machine-readable edge is still expanding.
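QesBERT and Wolfe's feature set are proprietary, but the topic-divergence idea is straightforward to prototype: embed each firm's public text and score its distance from a sector centroid. The sketch below uses the open sentence-transformers library and toy website snippets purely as a stand-in for the real pipeline.

```python
# Minimal sketch of "topic divergence": how far a firm's public text sits from
# its sector peers in embedding space. Open model as a stand-in for QesBERT.
import numpy as np
from sentence_transformers import SentenceTransformer

# Toy corporate-website snippets; in practice these would be scraped pages
sector_docs = {
    "PEER_A": "We manufacture industrial pumps and provide maintenance services.",
    "PEER_B": "Our compressors and valves serve energy and utility customers.",
    "TARGET": "We are building an AI platform for predictive maintenance.",
}

model = SentenceTransformer("all-MiniLM-L6-v2")
emb = {t: model.encode(doc, normalize_embeddings=True)
       for t, doc in sector_docs.items()}

# Sector centroid from peers only, then cosine distance of the target firm
centroid = np.mean([emb["PEER_A"], emb["PEER_B"]], axis=0)
centroid /= np.linalg.norm(centroid)
divergence = 1.0 - float(np.dot(emb["TARGET"], centroid))
print(f"thematic divergence vs sector: {divergence:.3f}")  # higher = further from peers
```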

  1. FT's article on Wolfe

  2. Lowenstein Sandler's Alt Data Report

Market Pulse

We’re adding something new this quarter. With so much happening across quant and data, a few standout conferences felt worth highlighting.

📆 Quaint Quant Conference; 18 April, Dallas | An event where buy-side, sell-side, and academic quants come together to share ideas and collaborate meaningfully. 

📆 Point72 Stanford PhD Coffee Chats and Networking Dinner; 24 April, Stanford | Point72 will be hosting coffee chats and a networking dinner for PhDs.

📆 Hedgeweek Future of the Funds EU; 29 April, London | A summit open to senior execs at fund managers and investors to discuss the future of fund structures, strategies, and investor expectations.

📆 Battle of the Quants; 6 May, New York | A one-day event bringing together quants, allocators, and data providers to discuss AI and systematic investing.

📆 Morningstar Investment Conference; 7 May, London | Morningstar’s flagship UK event covering portfolios, research, and the investor landscape.

📆 Neudata's New York Summer Data Summit; 8 May, New York | A full-day alt data event with panels, vendor meetings, and trend insights for investors.

📆 BattleFin's Discovery Day; 10 June, New York | A matchmaking event connecting data vendors with institutional buyers through curated one-on-one meetings.

Navigational Nudges

Quant strategies often account for macro risk, but political shocks are rarely treated as core modelling inputs. In a recent Barron’s interview, Cliff Asness explained why that approach needs to change when policy shifts start influencing fundamentals directly—through tariffs, regulation, or capital controls. 

He described how AQR evaluated tariff exposure across its portfolios by analysing supply chains and revenue by geography. That kind of work becomes essential when political risk transitions from a volatility driver to a capital flow and cash flow driver. Trump’s “Liberation Day” speech marks exactly that kind of transition. The introduction of sweeping new tariffs has altered expectations around inflation, corporate margins, and global trade flows—conditions quant models rely on to stay stable.

Why does this matter? Markets repriced quickly after the announcement: implied vol rose in trade-exposed sectors, EM FX weakened, and cross-asset dispersion increased. These are clear indicators of a regime shift. Models built under assumptions of tariff-free globalisation now face higher estimation error. Signal degradation, autocorrelation breakdowns, and unstable factor exposures are already visible in the post-speech data. 

What should you be doing now?

1. Quantify Trade Exposure at the Factor and Position Level
Use firm-level revenue segmentation, global value chain linkages, and import-cost pass-through data to surface tariff sensitivity. Incorporate this into both factor risk models and position sizing rules.

2. Test for Structural Signal Decay Post-Announcement
Run rolling window regressions and Chow tests to identify breakpoints in factor performance or autocorrelation. Validate whether alpha signals maintain stability across the new policy regime.

3. Decompose Trend Attribution by Policy Regime
Segment trend strategy PnL around political events and measure conditional Sharpe. Use macro-filtered overlays to isolate persistent price moves from reactive noise.

4. Run Event Studies to Quantify Short-Term Sensitivity to Policy Shocks
Use cumulative abnormal return (CAR) analysis across pre-, intra-, and post-event windows. Calibrate against benchmark exposures estimated prior to the shock (e.g., beta coefficients). Apply statistical tests to detect shifts in market-implied expectations of the strategy’s value or future cash flows.
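To make step 4 concrete, here is a minimal event-study sketch: fit a market-model beta on a pre-event window, accumulate abnormal returns through the event, and compare the CAR to estimation-window residual volatility. The return series, window lengths, and event date below are toy assumptions, not a prescription.

```python
# Minimal event-study sketch: pre-event market-model beta, then cumulative
# abnormal returns (CAR) over the event window. Synthetic data, illustrative only.
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
dates = pd.bdate_range("2025-01-02", periods=90)
market_ret = pd.Series(rng.normal(0, 0.01, len(dates)), index=dates)
strategy_ret = 0.8 * market_ret + rng.normal(0, 0.005, len(dates))
event_date = dates[60]  # e.g., the tariff announcement

# 1) Estimation window: fit alpha/beta on data strictly before the event
est = slice(None, event_date - pd.Timedelta(days=1))
beta, alpha = np.polyfit(market_ret.loc[est], strategy_ret.loc[est], 1)

# 2) Event window: abnormal return = actual minus market-model expectation
window = strategy_ret.loc[event_date:].iloc[:10]
expected = alpha + beta * market_ret.loc[window.index]
abnormal = window - expected
car = abnormal.cumsum()

# 3) Simple t-statistic of the CAR against estimation-window residual volatility
resid_sd = (strategy_ret.loc[est] - (alpha + beta * market_ret.loc[est])).std()
t_stat = car.iloc[-1] / (resid_sd * np.sqrt(len(abnormal)))
print(f"CAR over event window: {car.iloc[-1]:.4f}, t-stat: {t_stat:.2f}")
```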

There’s a difference between volatility and change. Quants know how to handle the former. The challenge now is recognising the latter.

The Knowledge Buffet

📖  Modeling Mindsets: The Many Cultures Of Learning From Data 📖

by Christoph Molnar 

Most quants have a default way of thinking—Bayesian, frequentist, machine learning—and tend to stick with what’s familiar or what’s worked in the past. Modeling Mindsets is a useful reminder that a lot of that is habit more than preference. It doesn’t argue for one approach over another, but it does help surface the assumptions that shape how problems are framed, tools are chosen, and results are interpreted. Especially worth reading if a model’s ever failed quietly and the why wasn’t obvious.

The Closing Bell

Did you know?

The Medallion Fund’s Sharpe ratio is so high that standard significance tests add little: its returns are so consistent that the usual question of whether performance can be distinguished from luck barely applies.

Black and white photo of a water splash collision

