Alternative FX Data

Hamlin Lovell

Head of information services Masami Johnstone outlines how CLS data can evolve the FX market. “There is a strong desire by market participants to make the FX market more efficient and to address opacity issues,” Johnstone says, citing the fact that market fragmentation among trading venues, Electronic Communications Networks (ECNs) and banks makes it difficult to achieve a holistic understanding of market dynamics.

Johnstone joined CLS in September 2019 and her ambitious plans include leveraging CLS’s unparalleled coverage of currency market volume data, which can be used for developing investment strategies, monitoring best execution/transaction cost, enhancing risk management, and fine-tuning algorithmic trading.

“CLS shares a similar profile to a stock exchange for various aspects of its operations, including rigorous data governance and risk management,” says Johnstone, whose previous role was Head of Buy Side Sales at Euronext.


“A key difference is that CLS is not a trading venue; therefore, we don’t provide tick-by-tick trading data. Our data is derived from settlement data, which means that the design of datasets needs to be different. Our goal is to provide market participants with greater transparency of the FX market combined with comprehensive data analytics.”

CLS is owned by 79 banks and can be described as a utility working for the mutual benefit of market participants. CLS Bank International is recognised as a systemically important financial market infrastructure (FMI). CLS began providing data products derived from its settlement data in 2016, but there is actually very little overlap between settlement clients and data buyers.

The FX market presents special data challenges. It is distinguished from other financial markets by its huge size – $6.6trn per day in April 2019 according to the latest triennial BIS Survey. Currencies still have far more over-the-counter (OTC) trading than other assets, with 30% of FX trades resulting from voice trading, and the majority of FX volume is between dealers. “We understand only about 20% of FX trading is via electronic algorithms, versus 80% in the equity markets, but equity markets had very little electronic trading 15 years ago,” Johnstone says. “We expect that greater transparency will lead to more automated trading in FX.”

And so, the CLS offering is buzzing with initiatives as a strong pipeline of research and analytics is being developed in collaboration with clients. In parallel, CLS continues its support of industry best practice initiatives, such as the FX Global Code, an initiative to promote the highest ethical standards and best practices for the FX market among all market participants.

Alternative FX data

CLS plays a vital role in illuminating the currency markets because its settlement system captures over $1.55trn a day, amounting to more than 50% of FX market volume for CLS-settled currencies. In comparison, volume on the primary markets (Thomson Reuters/Refinitiv and EBS/CME) is $420bn, futures $100bn, investment banks $80bn and FX dealers $60bn, according to CLS (Fig.1). CLS data is provided daily, whereas regional central bank data tends to be reported every six months, and the BIS survey is only triennial. CLS data is derived from the banks that submit payment instructions for settlement immediately after execution, from circa 25,000 participants and 500,000 trades per day. The CLS FX Spot Volume report currently provides executed intraday FX trades for 18 currencies and 33 currency pairs, and both figures are expected to grow.

The data fields collected by CLS include price, volume, and counterparties, for spot, forward and swap trades. The data is based only on executed trades, including rescinded trades. This feeds into the development of TWAP and VWAP measures, as well as spot volume forecasting products, though the data is not real-time like trading venue data. CLS data is collected at five-minute intervals. “There is a known slight time lag between trade execution and settlement, which means the data is adjusted to capture the accurate market colour,” explains Johnstone.
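
To make the two benchmark measures concrete, the following is a minimal sketch of how a TWAP and a VWAP could be computed from five-minute buckets of price and volume. The `Bucket` layout, the function names and the sample numbers are illustrative assumptions, not the CLS schema or methodology.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    """One five-minute observation (hypothetical layout, not the CLS schema)."""
    price: float   # representative traded price for the interval
    volume: float  # traded volume for the interval


def twap(buckets):
    """Time-weighted average price: every interval counts equally."""
    if not buckets:
        raise ValueError("need at least one bucket")
    return sum(b.price for b in buckets) / len(buckets)


def vwap(buckets):
    """Volume-weighted average price: busier intervals count for more."""
    total_volume = sum(b.volume for b in buckets)
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    return sum(b.price * b.volume for b in buckets) / total_volume


# Made-up sample: most of the volume trades in the last interval,
# so the VWAP sits above the TWAP.
sample = [Bucket(1.1000, 100.0), Bucket(1.1010, 100.0), Bucket(1.1020, 300.0)]
print(round(twap(sample), 5), round(vwap(sample), 5))
```

The two measures coincide when volume is spread evenly across intervals and diverge when trading is concentrated in a few of them, which is why a dataset with broader volume coverage can yield a different VWAP from one built on a single venue.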

Another distinguishing characteristic of CLS data is that it contains a large amount of voice data. Other data providers may exclude voice data due to the large dispersion in the data, seen particularly around 1600 GMT. “CLS data can therefore shed light on outliers that are not visible elsewhere,” Johnstone says, adding that discussions are underway about innovations to measure dispersion and gauge the impact of voice trading on the FX market.

CLS analysis will also generate value from connections to other alternative data. “From early 2020, we plan to use an external machine learning tool to examine the relationship between CLS datasets and other datasets, to see if combined datasets can provide additional market insights,” she says.



Investment strategies

In addition to a relative lack of transparency, another distinguishing characteristic of the FX market is that substantial volume comes from other market participants, such as central banks, governments, multinational corporations, retail investors and tourists. This creates potential for investors to profit from being on the other side of trades that can become one-way bets, perhaps most famously when George Soros made over £1bn out of shorting the pound when it exited the Exchange Rate Mechanism (ERM).

Johnstone respects the proprietary nature and confidentiality of hedge fund strategies, specifying that CLS is not trying to reverse-engineer or compete with them. “We do not know exactly how clients are using the data, but believe they are using it to test existing strategies. We do not sell trading signals or algorithms, but portfolio managers can use the data to build their own trading strategies,” she says. In any event, the hedge fund industry is diverse across all asset classes, including FX. “Different client groups could have completely different use cases.”

Systematic and technical hedge fund strategies that use CLS data might include momentum, mean reversion and pattern recognition, and these approaches could be combined with volume-based signals. For instance, the paper Foreign Exchange Volume [1] identifies a relatively simple volume-based intraday reversal strategy as consistently very profitable. “A daily cross-sectional reversal investment strategy (that buys losing currencies and sells winners from the previous 24 hours) generates an annualised return of over 17.6% and a Sharpe ratio of 1.7… we show that the returns remain sizeable after accounting for transactions costs: an investor could expect to receive an annualised return of up to 10% in the cross-sectional strategy and a Sharpe ratio of around one. The correlation of the strategies with other popular currency investment strategies… are also low, and thus we find large incremental diversification gains when the strategies are added to an existing currency portfolio,” [2] states the academic research, published in April 2019 (first published in 2017) by Antonio Gargano and Steven J Riddiough of the University of Melbourne and Lucio Sarno of Cass Business School and CEPR.

Reversal investment is found to be uncorrelated with other strategies such as currency carry and momentum, and therefore offers investors a potential new source of diversification. The system works because larger volumes are more likely to flow from well-informed traders: price moves backed by large volumes are more likely to persist, whereas price moves on thin volumes are more likely to reverse. Notably, the system should be profitable through most of the trading day, but not during the hours of least liquidity; therefore, the value of yesterday’s volume data is partly conditional on a clear view of today’s intraday liquidity.
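
The intuition above can be illustrated with a deliberately simplified sketch. It is not the construction used in the cited paper: it assumes that "thin volume" means yesterday's volume was below a currency's own trailing average, and it takes reversal positions only in those currencies. The function name, inputs and sample figures are all hypothetical.

```python
def reversal_weights(prev_returns, prev_volume, avg_volume):
    """Dollar-neutral weights: long yesterday's losers, short yesterday's
    winners, using only currencies whose volume was below their own average.

    All three arguments are dicts keyed by currency (illustrative inputs).
    """
    # Keep only moves that happened on thin volume, the ones expected to reverse.
    thin = [c for c in prev_returns if prev_volume[c] < avg_volume[c]]
    weights = {c: 0.0 for c in prev_returns}
    if len(thin) < 2:
        return weights  # nothing to rank cross-sectionally

    # Demean across the eligible currencies, then bet against the move.
    mean_ret = sum(prev_returns[c] for c in thin) / len(thin)
    raw = {c: -(prev_returns[c] - mean_ret) for c in thin}

    # Scale so that absolute weights sum to one.
    gross = sum(abs(w) for w in raw.values())
    if gross > 0:
        for c, w in raw.items():
            weights[c] = w / gross
    return weights


# Made-up numbers: GBP moved on heavy volume, so it is left alone.
weights = reversal_weights(
    prev_returns={"EUR": 0.01, "JPY": -0.01, "GBP": 0.02},
    prev_volume={"EUR": 80.0, "JPY": 90.0, "GBP": 150.0},
    avg_volume={"EUR": 100.0, "JPY": 100.0, "GBP": 100.0},
)
print(weights)
```

In this toy example the strategy shorts EUR (yesterday's thin-volume winner), buys JPY (the thin-volume loser) and takes no position in GBP, whose move was backed by above-average volume.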


A further refinement to the strategy is filtering the volume data according to counterparty type, as flow data combined with volume data can be a very powerful combination, generating consistent outperformance, based on back-testing. “We may release more refined combinations of FX volume and flow data,” says Johnstone. The CLS FX Order Flow report breaks down directional volume by counterparty type, including banks, non-bank financial institutions, funds, and corporates, and indicates in which direction they are trading. CLS’s rigorous data governance policy to ensure clients’ confidentiality incorporates strict data aggregation and anonymisation rules and a robust statistical method to determine if the liquidity dispersion within the observation bucket is sufficient to safeguard confidentiality.
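
As a rough illustration of aggregation with a confidentiality safeguard, the sketch below nets signed volume per hour and counterparty type, and suppresses any cell with too few distinct participants. This is a simple count threshold standing in for the statistical dispersion test the article mentions, whose details are not given; the threshold value, field layout and names are assumptions.

```python
from collections import defaultdict

MIN_PARTICIPANTS = 3  # hypothetical threshold, not a CLS parameter


def aggregate_flow(trades, min_participants=MIN_PARTICIPANTS):
    """Net signed volume per (hour, counterparty type).

    `trades` is an iterable of (hour, counterparty_type, participant_id,
    signed_volume) tuples, where positive volume means buying.
    Cells with too few distinct participants are published as None.
    """
    net = defaultdict(float)
    members = defaultdict(set)
    for hour, cpty_type, participant_id, signed_volume in trades:
        key = (hour, cpty_type)
        net[key] += signed_volume
        members[key].add(participant_id)
    return {
        key: (net[key] if len(members[key]) >= min_participants else None)
        for key in net
    }


# Made-up trades: three funds are active in hour 9, but only one corporate.
flow = aggregate_flow([
    (9, "fund", "f1", 5.0),
    (9, "fund", "f2", -2.0),
    (9, "fund", "f3", 4.0),
    (9, "corporate", "c1", -3.0),
])
print(flow)
```

Here the fund cell is published as a net figure, while the corporate cell is withheld because a single participant's activity would otherwise be identifiable.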

Counterparties can also be divided based on their observed behaviour rather than their self-identified category. CLS represented 5,000 counterparties as a directed network and categorised them by “coreness,” defined as linkages with others in the market, to infer whether counterparties are likely to be sell side or buy side.
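
One standard way to quantify how deeply a node is embedded in a network is the k-core decomposition, sketched below. The article does not say which coreness measure CLS used, so this is an assumption; the sketch also ignores edge direction for simplicity, and the counterparty names are invented.

```python
from collections import defaultdict


def coreness(edges):
    """Core number of each node, ignoring edge direction.

    A node has core number k if it belongs to a subgraph in which every
    node has at least k neighbours, but to no such subgraph for k + 1.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbours[a].add(b)
            neighbours[b].add(a)

    degree = {node: len(ns) for node, ns in neighbours.items()}
    remaining = set(neighbours)
    core = {}
    k = 0
    while remaining:
        # Peel off the node with the fewest remaining neighbours.
        node = min(remaining, key=lambda n: degree[n])
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for other in neighbours[node]:
            if other in remaining:
                degree[other] -= 1
    return core


# Made-up network: three dealers all trade with each other,
# and two funds each trade with a single dealer.
core = coreness([
    ("DealerA", "DealerB"), ("DealerB", "DealerC"), ("DealerC", "DealerA"),
    ("Fund1", "DealerA"), ("Fund2", "DealerB"),
])
print(core)
```

In this toy network the densely interlinked dealers receive a higher core number than the funds on the periphery, which is the kind of separation that could be used to infer sell-side versus buy-side behaviour.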

The data can also be used as an input for fundamental or discretionary strategies. “Clients may look at each currency pair in the context of economic cycles to identify patterns and calculate the predictive power of signals,” Johnstone says, and historical data can reveal insights from past patterns informing the development of more sophisticated signals that evolve with the markets. “Our data is not suitable for high frequency trading but should be attractive for longer trading horizons depending on a client’s strategy,” she says. “Time horizons are typically three to five days for some hedge fund strategies, shorter term for market makers and longer term for traditional asset managers, which may benefit from alerts and indicators.”


Pre-trade analysis

Regardless of investment strategy or other objectives for trading currency, FX forecast data can help market participants determine the optimal time to execute, a particularly important factor for large trades. “Forecasts of volume on an hourly basis are generated over a five-day period to provide insights into the volume curve. Our FX Forecast dataset incorporates scheduled macroeconomic events, and therefore, is designed to provide volume profile predictions regardless of specific events that often influence currency movement,” says Johnstone. In other words, the CLS FX Forecast dataset can be used as a risk management tool to help reduce slippage and market impact. “Traditional investors and corporates can also heed volume as a risk factor and try to avoid executing large trades when volume is likely to be low. Periods of low volume are often associated with wider bid/ask spreads and therefore, higher execution costs.”

“We are not just forecasting based on history. We use more holistic approaches to identify pre-trade signals and how they correlate with CLS trading data. We map out a volume curve based on machine learning that takes account of seasonality, macroeconomic releases and decisions such as central bank rate decisions, as well as currency calendars (Fig.2). The robustness of volume forecasts is achieved by deploying an ensemble of machine learning algorithms,” she explains.
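
The idea of an ensemble-based volume curve can be sketched in miniature. The example below averages two very simple forecasters (a per-hour historical mean and a repeat of the most recent day) and then picks the hour with the highest expected volume. These stand in for the machine learning models mentioned above, whose details are not public; every name and number is an illustrative assumption, and macroeconomic event inputs are omitted.

```python
from statistics import mean


def hourly_mean(history):
    """Forecast each hour as the mean of that hour across past days."""
    hours = len(history[0])
    return [mean(day[h] for day in history) for h in range(hours)]


def seasonal_naive(history):
    """Forecast each hour as the value seen in the most recent day."""
    return list(history[-1])


def ensemble_forecast(history, models):
    """Average the hour-by-hour forecasts of several models."""
    forecasts = [model(history) for model in models]
    return [mean(values) for values in zip(*forecasts)]


def busiest_hour(forecast):
    """Index of the hour with the highest forecast volume."""
    return max(range(len(forecast)), key=forecast.__getitem__)


# Made-up history: three days of volume over four hourly slots.
history = [
    [10.0, 40.0, 30.0, 20.0],
    [14.0, 44.0, 34.0, 24.0],
    [12.0, 48.0, 32.0, 16.0],
]
forecast = ensemble_forecast(history, [hourly_mean, seasonal_naive])
print(forecast, busiest_hour(forecast))
```

A trader with a large order might use such a curve to schedule execution in the hours where volume is expected to be deepest, in line with the aim of reducing slippage and market impact.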

Post-trade analytics

Johnstone says CLS developed the FX pricing dataset to provide market participants with a tool to examine their execution performance based on VWAP/TWAP. However, there could be scope to widen its function to transaction cost analysis (TCA). Previously, Johnstone led TCA consulting in Europe at ITG (Virtu). “People do not yet recognize CLS as a provider of post-trade analytics or TCA, but we look forward to continuing to grow in this space and may partner with existing TCA providers or trading venues or both,” she says. “We are open to suggestions.”
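
Examining execution performance against a VWAP or TWAP benchmark commonly comes down to a signed slippage figure in basis points. The sketch below shows one conventional way to compute it; the function name, sign convention and sample prices are assumptions for illustration rather than a description of any CLS product.

```python
def slippage_bps(exec_price, benchmark_price, side):
    """Execution cost versus a benchmark, in basis points.

    Positive means the trade was worse than the benchmark:
    a buy above it, or a sell below it.
    """
    if side not in ("buy", "sell"):
        raise ValueError("side must be 'buy' or 'sell'")
    if benchmark_price <= 0:
        raise ValueError("benchmark price must be positive")
    sign = 1.0 if side == "buy" else -1.0
    return sign * (exec_price - benchmark_price) / benchmark_price * 10_000


# Made-up prices: buying above the benchmark is a cost,
# selling at that same price is a gain.
buy_cost = slippage_bps(1.1016, 1.1014, "buy")
sell_cost = slippage_bps(1.1016, 1.1014, "sell")
print(round(buy_cost, 2), round(sell_cost, 2))
```

Because the result depends directly on the benchmark, a VWAP built on broader volume coverage can change the measured cost of the same trade.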

At ITG, Johnstone’s focus was primarily TCA for equities and ETFs, which involve different types of data. The CLS FX Spot Pricing dataset provides its own VWAP and TWAP measures, which may differ from those based on other datasets. “Our mission is to enrich and broaden clients’ understanding of the marketplace based on the largest volume coverage. Our data includes large volumes of data such as voice trades not captured elsewhere.” CLS can “improve dataset coverage and transparency to serve the FX trading community,” and although CLS is not itself providing execution algorithms or any other execution trades, Johnstone says comprehensive data is a crucial component in order to measure the success of algo-based trading strategies. “We are starting to see more sophisticated execution algorithms, but nobody knows if one algorithm performs better than another as there is not a comprehensive view of the market. Automation to capture bigger datasets could move algos into the mainstream, though the large role of opaque, voice OTC trading means this may take longer for FX than it did for equities.”

CLS’s clients include many of the largest investment banks and other institutions that are very focused on TCA. “We are actively building market insights, including post-trade analytics for banks, who have been struggling to build comprehensive data visualization tools,” Johnstone says.

Data enhancements

Market participants can now access the data directly from CLS and have been accessing the data since 2016 via Quandl, which is now part of Nasdaq. “Some hedge funds already using Quandl for other datasets may find it more convenient to get the CLS data from them as well,” says Johnstone, who is marketing CLS’s alternative FX data globally and has just returned from a trip to her native Japan. Data coverage may be further expanded – for instance, forward and swap flow data is already being rolled out to complement the existing spot data offering.

“Clients are requesting expanded coverage of currencies, and CLS is happy to help with this, subject to a rigorous process that ensures the number of observations is sufficient to safeguard client confidentiality,” she explains.

Frequency is also being assessed. “There could be potential to increase frequency from five minutes, to perhaps one minute or 30 seconds, but we have to balance out issues around unmatched and matched trades. Increasing frequency would be partly a trade-off against lower accuracy,” she points out.


CLS has an Information Services team of 19, including data scientists, data analysts, and individuals skilled at data mining and quant research who are evaluating partnerships for trade analytics integration. Johnstone is keen to see more female quants and data scientists in the industry. She believes diversity will bring a positive impact to the world of data science and machine learning, where out-of-the-box idea generation is important.

“The top priority in 2020 will be to make CLS data analytics into more accessible and digestible market insights,” she says. “Some hedge funds love the raw data, but other market participants need a helping hand to interpret big data with alerts and indicators. We will work on this based on feedback from market participants. The FX data could also be combined with data from other asset classes and/or alternative data, such as sentiment data.”

Many of these projects are at an early stage; however, the vast repository of CLS data holds great promise for new datasets and analytics that not only empower client growth, but are essential components to evolve the maturity and transparency of the FX market.

This article is not intended for trading purposes and should not be construed to include, or be used as, investment advice.


1. Gargano, Antonio et al., Foreign Exchange Volume (April 17, 2019). Available at SSRN.
2. Gargano, Antonio et al., Foreign Exchange Volume (April 17, 2019), page 4. Available at SSRN.