Predicting the Future

Machine learning allows forecasting with big data

EXTRACTS FROM THE TOP TRADERS UNPLUGGED PODCAST, EPISODES #49 AND #50
Originally published in the January 2015 issue

Niels Kaastrup-Larsen in his podcast, Top Traders Unplugged, interviews some of the most successful hedge fund managers in the world. In the following extracts he speaks with Dave Sanderson, president, CEO and one of the founders of KFL Capital Management, a machine learning firm that trades the futures markets. The full interview can be found on the Top Traders Unplugged website.

Niels Kaastrup-Larsen: If someone came to you and said that they had found a way to predict the future with more than 50% accuracy, would you believe them? Would you be intrigued to learn more? What would it take for you to be convinced that this could indeed be done? Big questions in a world of big data, but in reality the answer should be, “tell me more.” Just because no other investment firm has claimed victory when it comes to predictive power using machine intelligence to successfully trade the financial markets, we can’t really exclude the possibility of it having been done by a team of scientists in Ontario, Canada. Dave Sanderson, thank you so much for being with us today – I really appreciate your time.

Dave Sanderson: My pleasure, Niels.

NKL: Now Dave, you and your partners have come up with a very different way to trade the market compared to my previous guests. It’s intriguing; it’s cutting edge, and I think the audience today will really have their eyes opened as to the possibilities we have today when applying the latest technology to financial data. How did the firm start?

DS: Dr Gary Li is really the person on whose shoulders we are standing. Dr Li was a student who had been supervised by Dr Andrew Wong, who is a shareholder of our company and one of the founders. Dr Wong was the founder of the Pattern Analysis and Machine Intelligence Lab at the University of Waterloo, Ontario, and he was trained at Carnegie Mellon. He has spent his lifetime in data science – since before data science ever became a degree; before big data was ever a term, Dr Wong spent his lifetime in the predictive modeling space. He will tell you to this day that Gary Li is the brightest student he has ever supervised – and he has supervised, I think, 160 Ph.D. students at the University of Waterloo.

So he tapped Gary on the shoulder and said, “Come and work on the financial data set.” Gary was very resistant at the time because his experience in predictive modeling was that the financial data set was untouchable. Certainly Gary Li and Dr Wong are very, very cognisant of what’s happening around the world at all the great institutions in the category of predictive modelling, and lots of progress was being made on scientific data sets. In fact, Gary had commercialised the algorithm in the oil sands. In the oil sands in Canada, one of the things you can do with predictive technology is maximise the recovery out of the bitumen in the tar sands. In your models are things like flow rate and temperature and all the sensors that you place around the plant.

So Gary’s resistance came from the idea that he was in the prime of his career and he certainly didn’t want to waste a number of years trying to do something when there was no evidence that it could be done. It took a little convincing to get Gary on the project. He agreed to do it on a part-time basis originally until some progress could be made. Dr Wong had some Ph.D. students that had to give up on the thesis because it was too challenging to make predictions on financial data.

So Gary Li started and he made a little bit of progress, a little bit of setback, and really, two and a half years go by and the better part of the starting $2 million gets spent, and there is no progress to show for it. Around Christmas of 2011, Gary comes forward and says, “It’s too hard, it can’t be done.” We encouraged him to keep going. Gary actually wanted to keep going. He knew it wasn’t a contribution to the academic file on predictive modeling if he just says it’s too hard. If he comes back and proves the thesis that you can’t do it, now that’s a contribution to science.

So that’s what he actually started out to do in late 2011. He started out to prove that you can’t do it, and the two things that he focused on were these two characteristics of financial data that he believed were the reason that it cannot be done and it hasn’t been done: number one, the correlations among variables in financial data change over time. So if you think about that tar sands example, if flow rate is column A in your data set, and temperature is column B, when you look at the relationship between flow rate and temperature, it’s going to be the same on Friday afternoon as it is on Monday morning, as it is three months down the road. But in the financial data set, if column A is the S&P 500 price, and column B is the price of gold, you can imagine that there is sometimes a relationship, strong negative on Friday afternoon; and sometimes on Monday morning there’s no apparent correlation; and three months down the road it can be positively correlated. So that’s a very unique attribute of financial data sets. The second attribute is that when you see a relationship in science, perhaps in the genome, you can call everyone over and point to it and nothing happens to it. When you see, or when everybody sees a relationship in the trading world, the world of financial data, it gets arbed out. People trade that relationship away.
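As a rough illustration of the first point, the sketch below computes a rolling correlation between two return series. The numbers are invented for the example (they are not S&P 500 or gold data), and the function names `pearson` and `rolling_correlation` are hypothetical. The series are built so that the relationship flips from strongly negative in the early windows to strongly positive in the late ones.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def rolling_correlation(xs, ys, window):
    """Correlation over each consecutive window of the two series."""
    return [
        pearson(xs[i:i + window], ys[i:i + window])
        for i in range(len(xs) - window + 1)
    ]

# Hypothetical daily returns (percent) for two assets, made up for illustration.
# In the first five days B mirrors A; in the last five days B moves with A.
asset_a = [1.0, -0.5, 0.8, -1.2, 0.3, 0.9, -0.4, 0.6, -0.7, 1.1]
asset_b = [-1.0, 0.5, -0.8, 1.2, -0.3, 0.9, -0.4, 0.6, -0.7, 1.1]

corr = rolling_correlation(asset_a, asset_b, window=5)
print(round(corr[0], 2), round(corr[-1], 2))
```

A model fitted only on the early windows would learn the opposite of what holds in the late ones, which is the difficulty described above.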

Those two things – the fact that correlations change, and the fact that patterns can appear and disappear – make that data set profoundly more difficult than other data sets when you’re trying to use historical data to make predictions about the future. So Gary started, and it was his proof, he thought, that he could prove that those two things made the data set untouchable. But to his dismay, I suppose, or to his surprise, in the spring of 2012 he actually gets a model that works on the S&P 500. He picked the S&P 500 because, in his view, that’s the most efficient market in the world, if there is such a thing.

He saw that his model could achieve 52.8% accuracy across about a year and a half of daily price data at that time. So he was interested in that, but he quite frankly thought that after he tested it he would find some look-ahead bias in the data, or that he would find another reason why it wasn’t true. He expanded the back-test, the baseline test or out-of-sample test, from a year and a half to five years, and the 52.8% remained. Then he expanded it across not just the S&P 500 but 14 other assets and, lo and behold, it remained.

So now he was becoming more convinced but still, in his words, unless he could prove it from mathematical first principles he wasn’t going to believe it. It’s like the physicist that says, “Sure, it works in reality, but can you prove it in theory?” So that was Gary’s view of the world. He literally sat down with pen and paper and built the mathematical proof. The mathematical proof indicated to him that something was wrong. But what was wrong was that it shouldn’t produce 52.8% accuracy; it actually should produce 54%.

So he went back to the models, and he optimised the three parameters that he felt were contributing to overcoming those two characteristics that we talked about. When he did that the back-test was consistent with his mathematical proof, and ever since we’ve been both telling the world and trying to validate to ourselves that we have an edge, and the edge is 54%. So that was May of 2012 when Gary comes forward and we call a meeting of the partnership, and Gary shares his findings. What I’ve learned about scientists is that their tonality doesn’t change, so they can say “You’re out of business” and “Eureka!” with the same tonality. Quite frankly, we missed the headline in that meeting because the headline was, “Eureka! We’ve done it.”

We weren’t as excited as he expected and hoped us to be, but shortly thereafter we called him back and said, OK, if you believe, then let’s continue. We raise some more money. We hook the model up by way of an API (application programming interface), with no-touch trading, to Interactive Brokers. So over the course of 2012 we refined the infrastructure such that the data feed would come in consistently, and we could treat the live data symmetrically with the historical data. Anyway, we get to January of 2013 and Gary has tightened up the back-test and we say, “Nobody in our industry, Gary, is going to believe the back-test.” He was a little bit incredulous at that, because he had used such scientific discipline to produce it, but he took our word for the reaction of the industry and he asked us what we wanted. The answer was, well, let’s get 1,000 trades. If we can get a 1,000-trade sample and you can show this 54% edge, then we will launch a fund.

So my partners and I put $100,000 into that Interactive Brokers account and the machine in Kitchener-Waterloo, Ontario, started shooting trades into that account. Over the course of 11 months, over 2013, by November, we had 1,000 trades. The accuracy number was 54.02% which, I don’t know about you Niels, but I have never seen a back-test become reality with such consistency. As we move forward through this whole discussion, I will share with you that we’re closing the first year of our partner fund, which is the publicly distributed fund, and we now have, in addition to those 1,000 trades, 1,700 more trades, and the accuracy is 54.04%. So it’s amazing, it’s just fabulous – but I’m jumping ahead.

NKL: Now, before we jump into the organisation and how you’ve set that up, I wanted to ask a couple of broader questions. You come from this unique world of machine learning as part of the investment process. And, in fact, you could say that you’re replacing the human brain when it comes to making forecasts. Tell me why you think this is different, and why it’s important to understand the difference compared to what we normally refer to as systematic trading. What’s the distinction?

DS: It’s a very timely question. Systematic traders are waiting for prices and reacting to those prices. So you can imagine, as you well know, the trend followers of the world are waiting for trends to occur. And sometimes you get on those prices and it turns out not to be a trend, so you get off. But you’re really reacting to prices – and that’s the key verb: reacting. What makes us different is that we are predicting where those prices are going over the next few hours. Our average hold period is 30 hours.

So, really that’s the difference. And these terms, “machine learning”, “predictive modeling”, and “big data,” they all get kind of lumped in together and they can be very confusing. And I’m only starting to get clearer and clearer on it. But what I would say is this: the concept of big data is that data is now everywhere. There’s a deluge of it. Now what does that mean?

Well, it means a couple of things. It means that even things like Twitter feeds are becoming zeros and ones – meaning the Twitter feed can be interpreted, stored, and accessed at a reasonably low cost. So there’s this expansion of the kinds of data available for interpretation. And then, secondly, there are bigger and bigger computers that are able to crunch more and more numbers. So to me those are the two parts of big data.

It doesn’t really apply to us, and I say that because we’re using a data source that’s been around forever. All we use is price data. And so that data has been flashing on people’s screens for years, for decades, and it’s being stored and cataloged, and it’s fairly accessible. So really, there’s nothing new in terms of our data set that makes us different. What I think makes us different is the number of people who’ve tried to do this: to just take in historical data and make a prediction about the future movement of an asset. There are very many, but from our review of literature and anybody who’s talking, we can’t find anyone who’s done it robustly or consistently.

NKL: Interesting. The other thing I picked up on is that you see yourself being a little bit outside of the CTA space, despite the fact that you actually trade futures like most CTAs and you use models, and you’re in fact registered as a CTA. Is the label you wear, is that important or not?

DS: Right, that’s a great question. When Gary and the team developed the predictive technology, the question was, what are we going to use it on? And there are a number of reasons why the futures market makes a lot of sense to point this technology to. Obviously it’s incredibly liquid. There is an immense amount of price data available; it’s electronically traded, there’s embedded, costless leverage in it. And for all of these reasons, it made it the place to point the technology. But really the underlying technology is predictive modeling, it’s machine learning. So we could point it to any data set, and we could point it, for instance, at cash equities. But when Gary asked for the first list of assets to trade, it made so much sense to say we’ll trade the S&P, the Dow, and the Nasdaq, and then trade some other asset classes, like metals, agriculture, or interest rate products. And if we can prove the technology across all those asset classes, then we will validate the technology.

So that’s how we began. And then when we changed from a prop account only into a fund, we, as you say, had to register as a CTA. We’re happy to be registered as a CTA. There’s absolutely nothing wrong with it; it just gets challenging when we try to differentiate ourselves from all of the other CTAs and all of the systematic traders, or anyone using a computer-based model to trade markets.

NKL: Do you think that machine learning is superior as a method to analyze and trade the markets? And if so, why? Clearly the traditional way has worked for decades for a number of firms.

DS: Yes. So, I would never use the word superior, because I think that there are some amazing traders out there. And I think it’s a wonderful skill set to have, and it’s very unique. The great traders have created enormous economic value. So I think that all we’re saying is that we have “a way” to create some economic value, and that’s all we’re saying. So there’s no monopoly on how to do it, and there’s no sense of superiority, or if there is, it’s by mistake. And that’s the first point I want to make.

I certainly tried my own hand at trading and I’m keenly aware of how difficult it is. I’m also keenly aware that human beings have biases, and that we’re not set up to trade the markets well. While trading discretionarily, my partners and I came up with some core beliefs, and one of the first lines in the core beliefs was that fundamental and technical analysis, and other forms of hard work, will just exacerbate human tendencies that are harmful to trading. So that’s my own personal view, but there are certainly wonderful traders out there who create great value.

I think that once we became familiar with what the machine can do, it’s hard for us to trade any other way. There’s a way of describing it that I’ve just recently come on to that I think helps, and maybe will help you understand why we’re so enamoured with the machine as opposed to the ways we used to trade. And that is the metaphor of the roulette wheel. If you think about a roulette wheel in a casino and the expected value of a bet by a patron, the house advantage is approximately 5.26%. It depends on where you play, and how many spaces are on the table. But let’s just use that 5.26%. If you look at our system, over 2,700 trades, the expected value of the bet, the house advantage, is 13%. So, if you had a roulette wheel, where you could spin it, and you had a 13% house advantage, it would be really difficult to go back to any other way of trading. Because all that you need is a high frequency of spins, and you’re assured of where you’re going to end up.
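For readers who want to check the roulette figure, here is a minimal sketch of the house edge on an even-money bet such as red or black. It assumes the standard layouts: 38 pockets on an American wheel and 37 on a European one, with 18 winning pockets in each case. The function name `even_money_house_edge` is hypothetical. The 13% quoted for the trading system is taken as stated in the interview and is not derived here.

```python
from fractions import Fraction

def even_money_house_edge(pockets, winning_pockets=18):
    """House edge on an even-money bet (e.g. red/black) as an exact fraction."""
    p_win = Fraction(winning_pockets, pockets)
    player_ev = p_win * 1 + (1 - p_win) * (-1)  # win one unit or lose one unit
    return -player_ev

american = even_money_house_edge(38)  # pockets 0, 00 and 1-36
european = even_money_house_edge(37)  # pockets 0 and 1-36
print(f"{float(american):.2%}", f"{float(european):.2%}")
```

The American wheel gives the 5.26% mentioned above, and the single-zero wheel gives a smaller edge, which is why the figure depends on how many spaces are on the table.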

NKL: Interesting. Now the next topic I wanted to spend just a little bit of time on is something I think is quite important actually, and certainly to investors. Because that’s really the starting point, and that’s the track record. People look at it to get a feel for the manager, and to gather a level of interest so to speak. But here’s my intellectual challenge when I think about what you do. Someone who starts to do trend following is obviously not going to change completely if they still call themselves a trend follower 20 years later. So there is some kind of consistency in the way they approach the markets. But in your case we’re talking about predictions being made by a machine. I think it can be more challenging to get the comfort of consistency. Because a decision at eight o’clock in the morning is based on one thing, but the same decision made at two o’clock the next day is going to be slightly different. So how do you best explain to an investor how they should read your track record and the likelihood of it being able to be repeated?

DS: I like the way you framed that question because you sort of framed it as a challenge to us, when we’ve been thinking of it as a unique advantage. So let me explain that to you. I talk about having, in our view, shown predictive power over 2,700 trades. By that I mean, if you try to get 54% heads in flipping a metaphorical coin over 2,700 coin flips, there are just a whole bunch of zeros before the first integer. If you don’t have predictive power, it’s almost impossible to do that by luck.
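The coin-flip metaphor can be checked directly. The sketch below computes the exact probability that a fair coin produces at least 54% heads in 2,700 flips. It takes the metaphor at face value: each flip is independent and even-money, which is a simplification, since the firm’s 54% is described elsewhere in the interview as a blend of trade accuracy and win multiplier. The function name `prob_at_least` is hypothetical.

```python
from math import comb

def prob_at_least(n, k_min):
    """P(X >= k_min) for X ~ Binomial(n, 0.5), computed with exact integers."""
    favourable = sum(comb(n, k) for k in range(k_min, n + 1))
    return favourable / 2 ** n

n_flips = 2700
k_min = (54 * n_flips + 99) // 100  # smallest head count that is at least 54%
p_luck = prob_at_least(n_flips, k_min)
print(k_min, f"{p_luck:.2e}")
```

Under these assumptions the probability comes out on the order of 10^-5, so a fair coin would very rarely produce that record by chance.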

So our view is, you know, please agree with us that we’ve shown predictive power to date. If somebody will come that far, then the question becomes: are you going to be able to sustain that power? So, we look at the consistency you’ve talked about, we look at it as an advantage of ours. We retrain the model every single day. So the model is using more and more information as its training data the more we trade. When we first launched our live trading account, in January of 2013, we only had training data from 1996 to January of 2013. But next month we’re going to have training data from 1996 to February 2015. So as your training data set expands, your algorithm, your predictive model, has a chance to learn more relationships and more correlations. And as it learns, it may even get better.
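The daily retraining described here resembles what is often called an expanding-window, or walk-forward, scheme: each day the model is refitted on all data up to, but not including, that day. The sketch below shows only the data bookkeeping. The “model” is a deliberately trivial stand-in (a majority vote on past signs), the returns are invented, and the names `expanding_window_splits` and `fit_majority_sign` are hypothetical. Nothing here reflects how Krystal itself is trained.

```python
def expanding_window_splits(n_days, first_trade_day):
    """Yield (train_indices, trade_day) pairs; training data grows by one day each step."""
    for day in range(first_trade_day, n_days):
        yield range(0, day), day

def fit_majority_sign(returns):
    """Stand-in 'model': predict +1 if past returns were mostly positive, else -1."""
    ups = sum(1 for r in returns if r > 0)
    return 1 if ups * 2 >= len(returns) else -1

# Hypothetical daily returns, invented for the example.
returns = [0.4, -0.2, 0.3, 0.1, -0.5, 0.2, 0.6, -0.1]

predictions = []
for train_idx, day in expanding_window_splits(len(returns), first_trade_day=4):
    # Retrain on everything seen so far, never on the day being predicted.
    model_output = fit_majority_sign([returns[i] for i in train_idx])
    predictions.append((day, len(train_idx), model_output))

print(predictions)
```

The second element of each tuple is the training-set size, which grows by one with every trading day.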

NKL: So the 54% could essentially get higher?

DS: It could, but we think there’s a limit on where it can go. And by that I mean this: what we haven’t talked about so far is that feel of what we’re doing in the marketplace at 8am – what are we really picking up? And one of the things Gary concluded early on was that there weren’t going to be any obvious patterns in financial data. The more obvious they get, the more they’re going to be traded away by the hundreds of thousands of smart people trading these markets. So we’re only going to find those patterns that are very, very subtle. They’re subtle, but they’re non-random. So what we find is on any given trade, any given prediction, we have an enormously high confidence – 99.99% statistical confidence – that the pattern we’re seeing is going to repeat itself a slight majority of times.

So we have a very high confidence of a very slight repetitive pattern. And so when we find patterns it’s because we’re very sure it’s going to repeat itself 54% or 55% of the time. And that’s a kind of under-the-radar assessment that’s going on at every trade, and that’s part of the reason why we feel it’s sustainable – that there’s this evolution going on, there’s this retraining going on. They call it evolutionary computing. And so, as opposed to being a static model that we know has worked in our back-test period and has worked for two years, but committing not to changing it going forward at the risk of style drift or whatever, we’re not saying that. We’re saying we are going to evolve. We’re going to evolve with the changing correlations in the marketplace, with the changing participants in the marketplace. But the one thing that won’t change is that we always find subtle but non-random patterns.

NKL: The environment that Krystal, the programme, operates in, what’s the optimal environment, if you can talk about that? Or can you foresee an environment in which it becomes really difficult for Krystal? Because it may be an environment that it hasn’t really been exposed to before. Do you know where I’m going with this?

DS: Yes, definitely. So there’s a lot in there. Let me pick up on the optimal environment, and also some of the challenges. So let’s talk about the times when Krystal hasn’t performed as well, because that’s helpful for everybody. When we talk about 54%, it’s really only meaningful because it’s made up of two things. It’s made up of trade accuracy, meaning simply, how many trades out of 100 are we winning any money on? And then secondly, the win multiplier. So how much money do we win when we win versus how much we lose when we lose? Those are the two parts that get amalgamated into the 54%. Because of course, 54 by itself is meaningless. There are lots of great trend followers that win only 30% of the time, but they win so many multiples when they win that it makes a very profitable business.

When you look at those two elements for us, to break it down, we only win about 50% of our trades – it’s just slightly over 50%. But when we win, we consistently win 1.13 times what we lose. Why do I raise that in the context of an optimal environment? Because we feel like volatility is good for the P&L. I need to disclaim immediately that there isn’t enough evidence yet in our trading to statistically conclude this. I’m suggesting that volatility is a good thing, but I want to say first that there isn’t a statistical significance to this. The fundamental reason that would be the case is this: if you think about a low-volatility time when you’re winning half your trades, and you win $1.13 when you win, and you lose $1.00 when you lose, you have that same relationship of 1.13. But then if there comes a very volatile time, and when you win, you win, say, $13 versus when you lose, you lose $10, you still have the same win multiplier 1.13. But your net win on an average trade is now $3 instead of 30 cents. So there’s some intuitive feeling that volatility will be a good thing.
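The intuition can be written as a per-trade expectancy: win rate times average win, minus loss rate times average loss. The sketch below holds the win rate at exactly 50% and the win multiplier at exactly 1.13, then scales the average loss from $1 to $10 to stand in for a more volatile period. These dollar amounts are assumptions chosen so the multiplier stays at 1.13 (a $10 loss pairs with an $11.30 win); they are not the figures quoted in the interview, and the function name `expectancy` is hypothetical.

```python
def expectancy(win_rate, avg_win, avg_loss):
    """Average profit per trade: p * avg_win - (1 - p) * avg_loss."""
    return win_rate * avg_win - (1 - win_rate) * avg_loss

WIN_RATE = 0.5      # "about 50%" in the interview, rounded for simplicity
MULTIPLIER = 1.13   # average win divided by average loss, as stated

quiet = expectancy(WIN_RATE, MULTIPLIER * 1.0, 1.0)        # average loss of $1
volatile = expectancy(WIN_RATE, MULTIPLIER * 10.0, 10.0)   # average loss of $10

print(round(quiet, 3), round(volatile, 3))
```

With the multiplier and win rate unchanged, the expected profit per trade scales one-for-one with the size of the moves, which is the sense in which volatility could help the P&L.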

The second piece is that as you look back at our out-of-sample test, it did include 2008 and 2009. And those were fantastic years for the model. Now let’s look at the challenging years. The biggest challenging moment we had in our out-of-sample test was actually 2010 – not something I expected going into this. In fact, I was encouraging the science team not to back test through 2008 and 2009 because it was so abnormal. Well, they don’t look at the world that way, and if it works it works. So we tested through that period. What we found in late 2009 and 2010 was that was the least amount of predictive power in the model. And so we’ve speculated as to why that’s the case. And I think it goes something like this: there was a fundamental shift in the markets.

If you think about March 2009, when the equity markets bottomed and the change of players in the financial markets at that time: we had the United States government participating in a way that it had never done before. So short selling was banned. And all these fundamental things changed. Then when you think of the predictor, looking back over that time frame, it’s looking for relationships that have now disappeared. So it took a while: the decline was three months, and the rebound was seven months. In terms of being underwater, there was a 10-month period while Krystal got its correlation feet underneath it again. But I actually quite like it because it’s as if there was a 100-year flood-type test, and it only took that long for it to find those relationships again and begin to make a fair bit of money.

NKL: Let’s move on to the heart of the strategy, namely the programme, or Krystal, itself. When you go out, and you talk to potential investors, how do you explain Krystal?

DS: First of all, we declare that it’s in many ways not explainable. And I want to be very careful of the way I say that. People say, “Are you a black box?” And we say, “Well, it’s very difficult to answer that question.” If you mean that we’re 100% reliant on our model to make a trade, then you’re right. If you mean we have no idea why a trade gets made, well now I need to be more particular with you, so let me just take you through this: at eight o’clock in the morning (because we’ve already used this example) we trade the S&P 500 and the two other equity indices we trade. It may be helpful to talk about what’s going on at that moment.

What’s going on at that moment is the predictor, in terms of input data, has the last three sessions of the S&P 500. And it has the last three sessions in terms of price movement of 49 other independent variables. And it looks across all those variables, and it’s asking a question: have I ever seen this before? It then turns around to the historical data set and goes backwards through all of what we call mathematical space, back to 1996, and it looks for subspaces where those relationships have shown themselves before. And if it finds a significant number of them, it says, okay, what happens next? And if what happens next is the S&P 500 goes up, then we go long. So that’s the process that’s happening twice a day in every asset. And that goes part of the way to explaining to somebody what we do and how we do it.
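To make the “have I ever seen this before?” idea concrete, here is a toy sketch that uses a plain nearest-neighbour lookup. This is a swapped-in textbook technique, not KFL’s algorithm: the interview does not disclose how subspaces are found or how significance is tested. The sketch takes the latest three sessions across all variables, finds the most similar three-session windows in the history, and votes on what the target did next. The data, the two-variable setup, and the names `window_distance` and `predict_direction` are all hypothetical.

```python
import math

def window_distance(a, b):
    """Euclidean distance between two windows (lists of rows of returns)."""
    return math.sqrt(sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)))

def predict_direction(history, window=3, k=3, target=0):
    """Vote on the target's next move using the k past windows most like the latest one.

    history: list of rows, one per session; each row holds returns for every variable.
    Returns +1 (go long), -1 (go short) or 0 (no majority).
    """
    latest = history[-window:]
    candidates = []
    # Every past window that already has a known "what happened next" row.
    for start in range(len(history) - window):
        past = history[start:start + window]
        next_move = history[start + window][target]
        candidates.append((window_distance(latest, past), next_move))
    nearest = sorted(candidates, key=lambda c: c[0])[:k]
    vote = sum(1 if move > 0 else -1 for _, move in nearest)
    return (vote > 0) - (vote < 0)

# Invented returns for (target, one other variable), repeating every four sessions.
cycle = [(1.0, 0.5), (-1.0, 0.2), (-0.5, -0.3), (2.0, 0.1)]
history = cycle * 3 + cycle[:3]   # ends on the third row of the cycle

print(predict_direction(history))
```

In this contrived history the latest window has appeared three times before, and each time the target rose in the next session, so the sketch votes long. Real price data would offer only approximate matches, which is where a significance test would have to come in.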

But let me just finish the thought about black boxes, because it means that when we go long in the S&P 500, we can tell you with great specificity, if we tear back the model findings. We can tell you how many times we found those subspaces where this event occurred and why we think it’s statistically significant. So we can tell you any day why we’re going long with the S&P 500. But we can’t tell you about the causality underneath that. We can’t tell you why oil, gold, and the 10-year note doing this seems to have this effect on the S&P 500. I can’t do that. I know if you turn on CNN somebody’s going to proffer some answer to why those things are happening, but we declare agnosticity there. We just have no idea what the underlying causality is, but we can tell you why we think the pattern is significant.

NKL: If I’m summarising correctly, you trade 12 different markets and you have 49 independent targets that you check every time, of which I believe 11 of them are the other markets that you’re trading. Can you give examples of what other targets, the non-market targets are?

DS: Yes, absolutely. We have a broad representation of the financial markets in that list of independent variables. You can imagine: you put some currencies in there; you put some single stocks in there, some ETFs, some indices, just a smattering of products, all of which have data back to 1996. What we have found is that pouring more independent variables in doesn’t help. Once you have a broad representation of what’s happening we’re able to pick up these subtle patterns.

NKL: Okay, and the 12 markets that you trade today, do they represent both financial markets and commodity markets?

DS: Yes. We trade the S&P, the Dow, and the Nasdaq, the 10-year note, gold, copper, silver, three soys, corn, and oil.

NKL: Okay, so quite a diverse portfolio actually?

DS: Once again when Gary asked me originally, the real question was, “We want to validate this technology, how do we do it?” And the answer was, “Let’s do 1,000 trades,” or thousands of trades, and “Let’s do it across as many sectors as are reasonably accessible.”

NKL: Do you get 54% pretty much on all these different markets? Despite the fact that they are quite different?

DS: Yeah, very, very close. Over a large sample size there are no really dramatic outliers. I think our lowest is 47%, in terms of what we call the coin-toss ratio. That’s the combination of both win/loss and win multiplier. And the highest is 64%. But again, that’s live trading over a reasonably short period of time. But it’s amazing how that tightens up; the more trades you have, the tighter the dispersion is between the coin-toss ratios of all assets.
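The tightening with sample size is what basic sampling theory would predict for a simple win/loss proportion. The sketch below computes the standard error of an observed proportion for a few sample sizes. It treats each trade as an independent yes/no outcome with a 54% success rate, which is an assumption made for illustration: the coin-toss ratio in the interview also folds in the win multiplier, so it is not strictly a binomial proportion. The function name `proportion_std_error` is hypothetical.

```python
import math

def proportion_std_error(p, n):
    """Standard error of an observed proportion from n independent trials."""
    return math.sqrt(p * (1 - p) / n)

for n in (100, 1000, 2700):
    se = proportion_std_error(0.54, n)
    print(n, f"{se:.3f}")
```

The standard error shrinks with the square root of the number of trades, so per-market figures based on a few hundred trades can be expected to scatter more widely than the pooled figure.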

NKL: From all you’ve seen in terms of Krystal’s ability, what kind of volatility do you expect it to produce and what kind of drawdowns would you expect from Krystal?

DS: So we have a consistent relationship among those things. Given that we feel we can deliver this 54%, the question for each client becomes: what do they want to spend in terms of drawdown, or what do they want to spend in terms of standard deviation? What do they want to achieve in terms of return? So the relationship that we have found is that there’s 1% of annual standard deviation per 1% max drawdown, per 2% net annual returns. It’s a 1/1/2 relationship – if we can continue to deliver that. It’s been slightly better than that in live trading. Our net return at the end of the first year is 30%, and the max drawdown on a monthly basis is only 5%, but peak to trough it’s 9%. So the 9% to 30%, that’s more than we are promising, but we expect that it will level out to a 1/1/2 scenario going forward.

NKL: So with that, just to clarify, the 1/1/2 means a 15% annual standard deviation should produce a 30% return with a 15% max drawdown? Of course, if you can deliver that you probably won’t find a shortage of investors who want to part with their money, so send them to Ontario.

DS: The question is when, when are they going to believe?

NKL: So is there anything that keeps you awake that you think Krystal might not be able to handle so well?

DS: What is on my mind these days are business execution issues, as opposed to technology issues. I’m excited to see what the technology will do during times that we think are several standard deviations, if there is such a thing. So we have looked back at some anomalous times and, once again, I have to declare that there’s not a statistical significance to this narrative I’m about to give you. For example, during the Boston bombing, gold went down $60 and all of the mayhem of that week. We actually made 5% that week. We didn’t make money in gold. We actually lost a little bit of money in gold. We looked at the Flash Crash; we looked at many days in 2008 and 2009. It certainly cannot be the case; you can never say in this business that I’m worried about nothing. I suppose I’m as worried as I am excited about the next really crazy event.

Oil’s gone through some really interesting times recently. We’ll often be short oil; we’ll often be long oil during these times. When you break it down into a micro component like that, where it’s two sessions a day, you get a very different result. So just picking up on that Boston bombing example, I was asking the question, what happened when gold went down $60 overnight, and I thought to myself, gee, is that what happened? The truth is that’s not what happened. It went down $70 over the course of four days. So there are eight trading sessions in those four days and eight different predictions from our model during those days. What gives me comfort is the fact that we’re agnostic as to direction. So in a crazy period of time, we’re going to be long some assets, short some assets, and we’re only going to be holding, or making predictions on average every 12 hours. So, again, I’m not suggesting at all that there’s nothing in terms of market behaviour that won’t have me turning on the machine saying, gee I wonder what happened? But if I were allocating money to strategies, I would say I’m more comfortable trading a multi-asset, multi-directional, short time frame strategy than I am, say, building a large position in one particular asset.

NKL: Let’s shift gears a bit and move on to another very important topic which is research. In a sense, you are very research-focused. In fact, research was the only thing you did for a while before you started trading. I wanted to ask you what kind of research are you doing today? You did the initial research to see if you could prove that it couldn’t be done, but you realised it could be done, so what kind of research do you do today for Krystal to evolve?

DS: Right, well that’s a great area for us and what I love to say at the beginning of discussing this is that research is non-linear. By that I mean, our team can go a very long time without expressing any modifications to Krystal.

NKL: Sorry to interrupt, but Krystal is learning by itself. Is Krystal doing the research for you?

DS: Well I wish I could look at my budget and declare that, but I can’t. So yes, there’s evolutionary computing going on. Really I’m not so sure it’s doing research as much as it’s paying attention to the fact that the correlations that we talked about change over time. So it’s finding these new correlations. In a way it’s really not doing anything dramatically different, it’s just reacting to different relationships in its data set. But thankfully it’s doing that on an ongoing basis, so that we don’t try to find one inefficiency in the market and just build a static model and let that trade until that inefficiency is gone and therefore the P&L goes away from that strategy. So I can’t go that far with you, but what research does go on?

One of the things that we’re close to now is coming up with a second version of Krystal. Right now, we’ve moved it along to the point where our accuracy is 53%. So it’s not as good as the one that’s in the market today, but the interesting attribute about this model is that it is not very correlated with the model we have in the market now. So it may be that if we launch a 54% with a 53% we can have a risk-return result that’s even better than just the 54% alone. So that’s the kind of thing that’s going on.

Gary Li is the driver of that research. He would say we’ve accomplished 10% of what we’re capable of over a lifetime. So that’s the kind of rigour, or pressure, he puts on himself to come up with new ideas. He’s always reading. Certainly deep knowledge and deep learning is something that a lot of academic folks are talking about these days, and how that applies to financial data, if it does at all. Certainly the folks that are working on Google Mind have some interesting ways of attacking this problem.

If you think about one of the challenges of that technology, it’s very easy for a three-year-old to recognise a cat in an image; it’s very difficult for a computer to recognise a cat in an image. So Google is making some progress there. Actually some of the people that were hired to do that were out of the University of Toronto, and familiar to our folks. So he’s watching all the time what’s happening in various pods of excellence and applying it to what we do, seeing if there’s any way we can keep improving our technology.

Top Traders Unplugged is a podcast created for the investor, trader or research analyst. As in the 'Market Wizards' books, each week in Top Traders Unplugged Niels Kaastrup-Larsen talks to a current successful hedge fund manager or commodity trading adviser who shares his or her experiences, successes, and failures. www.toptradersunplugged.com