Tomorrow’s Titans 2021: Julia Bonafede, Rosetta Analytics

Deep reinforcement learning and next generation alpha generation

Hamlin Lovell
Originally published on 17 May 2021
Julia Bonafede, Co-Founder, Head of Investments, Rosetta Analytics, Minden, Nevada

Bonafede co-founded women-led firm Rosetta with Angelo Calvello to disrupt active asset management. Since 2015, Rosetta has rolled out four live investment strategies based on applying advanced AI and deep reinforcement learning (DRL) to liquid markets. “DRL is such an early-stage technology that investors are only just starting to see the benefits of it. In other industries such as robotics and healthcare it is already widely used,” says Bonafede.

It is the culmination of the accelerating evolution of quantitative investing that Bonafede witnessed over 24 years at investment consultant and outsourced CIO Wilshire Consulting, which during her leadership had over $1 trillion of assets under advice. “I saw quant develop through two lenses: I evaluated many asset managers and used multi-factor risk models to monitor manager styles. The earliest quant approach was indexing, followed by style investing with tilts similar to Fama-French factors such as size, value and growth, which grew into iterations of over 100 factors including currency, country, industry and fundamentals. Then came smart beta, with a more rules-based approach, and more active systematic bets. Now AI lets the models derive relationships directly from the data, rather than from screened behavior identified in academic research.”

Rosetta believes that its technology has more accurate predictive power than traditional quant, but also recognizes that it is harder to explain. “Many investors’ comfort zone is a 50-year-old linear regression equation. The name Rosetta is a start towards explaining the approach, since a Rosetta stone translates pictures or images into languages – and neural networks can turn all sorts of data into actionable signals,” says Bonafede. 

We define the data that feeds into the models and define the output as the optimal allocation of risk capital. What we do not define is the relationships between the data and the signals.

Julia Bonafede, Co-Founder, Head of Investments, Rosetta Analytics

DRL can however cause confusion because so many asset managers are talking about machine learning, though they may not be applying it holistically to investment signals, if at all, according to Calvello, who previously co-founded Blue Diamond Asset Management AG and Impact Investment Partners AG: “It takes a certain type of talent to create DRL models and most asset managers do not hire people with the right background. It is hard to do this, and many asset managers will not take the risk. Why would they invest the time and money when they already have good profit margins?” Bonafede has also heard scepticism first hand: “Many managers tell us they are not using deep learning and DRL, and this even includes some managers who have published on it academically”.

Managers may be using machine learning for one or more of operational routines, trade execution, portfolio construction and signal generation, but it may only be a fraction of a process, and they may be diluting it by prescriptively defining the framework in which the AI is used.

Letting the data speak

Rosetta is 100% AI in terms of building autonomous algorithms that let the data speak, but the approach is not entirely “unsupervised”. “Our end-to-end learning models ingest carefully assembled data sets and use powerful deep reinforcement learning to create investment signals and allocate between assets. The signals are robust and persist through changing market cycles. By employing deep reinforcement learning our models successfully allocate risk to achieve optimal market exposure to maximize return. Risk of loss is a first order consideration rather than a second order constraint,” says Bonafede. For instance, Rosetta exercises judgment to select a variety of technical/price/volume data; fundamental economic data such as bond yields or stock- and sector-level data; and alternative data, with inputs that vary between models and markets. “We define the data that feeds into the models and define the output as the optimal allocation of risk capital. What we do not define is the relationships between the data and the signals,” says Bonafede. These relationships may not make any intuitive economic sense, fit into other established frameworks such as behavioral finance, or map onto a factor zoo determined by humans. “If you need to establish the relationships as priors, the model will only find those. A neural network determines the optimal system, which could be based on millions of parameters, identifying which relationships matter, adapting and learning. It could discover totally new relationships and more often than not does. It is designed to capture non-linear relationships which minimize errors and expand to capture returns more robustly,” says Bonafede.
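Rosetta does not publish its model architecture, but the distinction Bonafede draws (fix the inputs and the objective, let the model discover the relationships) can be sketched in a few lines of Python. The example below is purely illustrative: the data are synthetic, the policy is a single tanh layer, and a crude random-search optimiser stands in for deep reinforcement learning.

```python
# A minimal, illustrative sketch (not Rosetta's models): learn a mapping from
# input data to an allocation of risk capital without prescribing the
# relationships in advance. Synthetic data and a simple random-search optimiser
# stand in for a full deep reinforcement learning setup.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic inputs (stand-ins for price, volume and macro data) and next-period returns.
T, n_features = 1_000, 8
X = rng.normal(size=(T, n_features))
returns = 0.05 * X[:, 0] - 0.03 * X[:, 3] + 0.01 * rng.normal(size=T)  # hidden relationship

def allocate(params: np.ndarray, features: np.ndarray) -> np.ndarray:
    """Policy: map each period's features to a position in [-1, 1]."""
    w, b = params[:-1], params[-1]
    return np.tanh(features @ w + b)

def episode_reward(params: np.ndarray) -> float:
    """Reward: cumulative P&L earned by the policy over the whole history."""
    return float((allocate(params, X) * returns).sum())

# The optimiser, not the researcher, decides which input relationships matter.
best = rng.normal(size=n_features + 1)
best_reward = episode_reward(best)
for _ in range(2_000):
    candidate = best + 0.1 * rng.normal(size=n_features + 1)
    reward = episode_reward(candidate)
    if reward > best_reward:
        best, best_reward = candidate, reward

print("learned weights:", np.round(best[:-1], 2))   # should emphasise features 0 and 3
print("cumulative reward:", round(best_reward, 2))
```

In this toy setting the researcher only chooses the inputs and the reward; which features end up driving the allocation is left entirely to the optimisation, which is the spirit of the end-to-end approach Bonafede describes.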

Adaptive models relishing volatility

“It also adapts faster to changing environments,” says Calvello. For instance, Rosetta’s models performed very well in March and April 2020. “After the equity market peak on February 23rd, the models took another two weeks to understand the market before generating more active trading decisions. We were then up over 22% in March 2020 and over 12% in April 2020 with active trading almost every day,” says Bonafede. 

The Covid crisis shows that Rosetta’s models have performed well in higher VIX volatility index regimes. “This behavior was not by design but is rather part of the systems’ risk management properties. Their sizing varies as conviction in signals gets stronger or weaker. The DRL models make decisions to maximize reward, and penalties for an incorrect decision are weighted many magnitudes more heavily than reward,” says Bonafede. This borrows from autonomous driving, which some of Rosetta’s scientists have worked on, where there are huge penalties for errors.
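That weighting scheme can be illustrated with a toy reward function in which losses are penalised far more heavily than equivalent gains are rewarded; the 10x weight below is a hypothetical choice, not a Rosetta parameter.

```python
# Illustrative sketch of an asymmetric reward: losing decisions are penalised far
# more heavily than winning decisions are rewarded.
def asymmetric_reward(pnl: float, loss_weight: float = 10.0) -> float:
    """Return the shaped reward for one period's profit and loss."""
    return pnl if pnl >= 0 else loss_weight * pnl

# A small gain and an equal-sized loss no longer cancel out:
print(asymmetric_reward(0.01))    # prints  0.01
print(asymmetric_reward(-0.01))   # prints -0.10
```

A policy trained against such a reward learns to cut exposure when conviction is weak, which is consistent with the sizing behaviour Bonafede describes.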

In contrast, some other strategies dubbed “machine learning” had their worst ever performance in March 2020, illustrating how similar branding labels can belie very different approaches: “Our sense is that some other managers are using a weaker version of machine learning, or are using it to augment other more traditional quantitative techniques. Part of the problem is that traditional quantitative techniques can also come under the machine learning umbrella, which now includes anything rules-based outside passive investing,” says Calvello.

Uncorrelated and differentiated

Rosetta is doing something different even within the narrow space of funds that claim to be using machine learning and is also distinguished from other alternative strategies. Rosetta’s returns generally do not have consistent correlation to other hedge fund strategies or traditional asset classes. Patterns of correlation fluctuate over time and vary with market regimes. For example, the average correlation of Rosetta’s RL One S&P 500 long/short strategy has been near zero, but it has ranged between about +100% and -89%. Rosetta is also somewhat longer term than some quantitative strategies. Though AI is sometimes associated with high frequency trading and split-second trade execution, Rosetta’s strategies are lower turnover.
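One simple way to see this kind of regime-dependent behaviour is to compute a rolling correlation between a strategy's returns and the index it trades; the sketch below uses synthetic monthly data, not Rosetta's track record.

```python
# Illustrative only: how a strategy's correlation to the S&P 500 can drift across
# regimes, measured with a 12-month rolling window on synthetic monthly returns.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
months = pd.date_range("2017-01-31", periods=60, freq="M")
index_ret = pd.Series(rng.normal(0.01, 0.04, 60), index=months)

# Strategy returns whose link to the index flips sign halfway through the sample.
beta = np.where(np.arange(60) < 30, 0.8, -0.8)
strategy_ret = pd.Series(beta * index_ret.values + rng.normal(0, 0.03, 60), index=months)

rolling_corr = strategy_ret.rolling(12).corr(index_ret)   # 12-month rolling correlation
print(rolling_corr.describe()[["min", "max"]])            # average near zero, wide range
```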

Model R&D and evolution

Rosetta took over four years to develop its latest models. Rosetta’s first live strategies in 2017 were two first-generation deep learning models producing directional signals. Rosetta defines deep learning as a machine learning algorithm that uses deep neural networks. The seed investor, a US endowment, wanted them traded on a binary basis, with the first deep learning model either 100% long or 100% short the S&P 500, and the second either 100% long the S&P 500 or in cash.

In May 2020, Rosetta rolled out its next generation of live strategies, adding reinforcement learning, which created totally different models that also allow for variable position sizing. “Deep learning is great at detecting relationships or clustering, but not at allocating or optimizing. To draw analogies with autonomous driving, you might use deep learning to identify an object in front of you, but would need reinforcement learning to slow down, speed up, or turn right,” points out Bonafede. Rosetta defines DRL as algorithms that learn to maximize reward through trial and error, and map situations to actions. These systems can also cope with larger volumes of more complex data, including non-traditional data.
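The contrast between the first-generation binary models and the DRL models' variable sizing can be sketched as the difference between thresholding a directional forecast and letting a policy scale exposure with conviction; the snippet below is hypothetical, not Rosetta's code.

```python
# Illustrative contrast: a binary directional model versus variable position sizing.
import numpy as np

def binary_position(direction_score: float) -> float:
    """First-generation style: fully long or fully short, nothing in between."""
    return 1.0 if direction_score >= 0 else -1.0

def sized_position(direction_score: float, conviction: float = 1.0) -> float:
    """DRL-style: exposure scales continuously with the strength of the signal."""
    return float(np.tanh(conviction * direction_score))   # position in (-1, 1)

for score in (-2.0, -0.2, 0.1, 1.5):
    print(score, binary_position(score), round(sized_position(score), 2))
```

The binary version commits full risk capital regardless of how strong the signal is; the sized version takes smaller positions when the signal is weak, which is where an allocation-and-optimisation layer such as reinforcement learning earns its keep.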

Rosetta’s RL One Strategy, applying DRL to trading the S&P 500, has made 18.27% between May 2020 and April 2021, with a standard deviation of 12%. Its DRL strategy trading ICE European Union Allowance (EUA) futures contracts and related instruments has made 28.26% over the same period with volatility of 19%, less than half that of the underlying market.

Backtesting and training

“So far, the live performance has surpassed the back test, but we are aware of the potential for alpha decay,” says Bonafede. The term “back-testing” more naturally describes a static model than an inherently adaptive one. Rosetta uses decades of what it calls “training data”, with out-of-sample models applied to unseen data, and various techniques used to generalise the model, and determine the level of learning out of sample. “Where data histories are shorter, for newer markets, techniques such as transfer learning can be used to augment gaps in data. We did actually have enough data for EU carbon allowances, which are heavily influenced by regulation, and have quite different liquidity and data inputs compared with our S&P 500 model,” says Bonafede. This relatively new asset class has been of interest to the duo for some time: Calvello founded the Journal of Environmental Investing in 2009, long before ESG investing became a mainstream focus.
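Rosetta does not publish its training procedure, but a generic walk-forward evaluation of the kind used to test an adaptive model on unseen data might look like the sketch below; the window lengths are hypothetical.

```python
# Illustrative sketch of walk-forward ("out-of-sample") evaluation: train on an
# expanding window of history, then score on the unseen period that follows.
import numpy as np

def walk_forward_splits(n_obs: int, initial_train: int, test_size: int):
    """Yield (train_indices, test_indices) pairs that never look ahead."""
    start = initial_train
    while start + test_size <= n_obs:
        yield np.arange(0, start), np.arange(start, start + test_size)
        start += test_size

n_obs = 1_000  # e.g. daily observations
for train_idx, test_idx in walk_forward_splits(n_obs, initial_train=500, test_size=125):
    # fit the model on train_idx, then evaluate it on test_idx (pseudo-step)
    print(f"train {train_idx[0]}-{train_idx[-1]}, test {test_idx[0]}-{test_idx[-1]}")
```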

Institutional infrastructure 

As well as honing and refining its models, Rosetta has invested in building a scalable, institutional quality operational and trading infrastructure, and team. “This includes computer power that adds up to 43,000 man-hours or 48 years of compute time. Data terabytes are not quantified because our edge comes from algorithms not volumes of data,” says Calvello. “We are six people but have the infrastructure of a 100-person organization. We started with a blank slate: we needed to hire special talent (and compete with entities such as Google, FB, NASA, NYU, etc., for this talent); build proprietary IP from the ground up; build an industrial-quality infrastructure that would support rapid experimentation and efficient cycling; build an institutional-quality operational structure (because we are a fiduciary); build a trading operation; register as a CTA; vet and hire vendors (fund admin, FCMs, etc.),” says Bonafede. 

The team includes four experienced and talented machine learning scientists and engineers who have all worked outside the investment and finance industry, solving machine learning problems on a vastly different scale. From the start, Rosetta has also had an experienced advisory board of six allocators, asset managers and academics, who are used episodically as a sounding board for ideas. They include Erik Valtonen, former CIO of Swedish public pension fund AP3. 

Growth strategy

“We were very unusual for a startup in that we started de novo and raised money from an asset owner, not from venture capitalists,” says Calvello. The firm received initial operating capital from Verger Capital Management (an OCIO made up of the former Wake Forest University endowment management team). “We are talking to other potential strategic partners,” says Bonafede.

Rosetta manages $19 million as of May 2021 and discounted fees are available for founders’ share classes. “We have created much more nimble strategies that are scalable in the deepest markets. I think we focus on what we have,” says Bonafede. 

Rosetta advisory board member, Dr Elisabetta Basilico, sums up the challenge of growing a disruptive and highly innovative strategy: “There is a trade-off between explainability and accuracy. The big question facing investors and advisors is whether to invest in AI-based strategies because they are likely to be the most accurate but the least explainable”.