Shares with shared code – part 2
Mix real-world data with our own custom algorithms to make enough money to buy that volcano you’ve always wanted.
Why do this?
- Work with real-world data
- Prove that you’re smarter than a City whizz-kid
- Laugh at fund managers and their obscene money-for-nothing charges
Settle down everyone – we’re going to write code to buy and sell shares from a portfolio to ensure that we don’t lose money and, hopefully, that we make some money. For simplicity, we’ll begin by looking at a portfolio of shares that’s identical to those in the FTSE 100, and so the only choices are when to buy or sell shares.
The FTSE 100 is a stockmarket index that is constructed by taking a weighted average of the value of shares for the 100 leading companies registered on the London Stock Exchange. At the start of 1984, when the FTSE 100 began, it was given the value 1000 and by the end of 2014, due to the changes in share values, the FTSE 100 had risen to about 6,500. So if we bought shares in 1984, in proportion to how the FTSE 100 is weighted, and kept our portfolio mirroring the composition of companies represented in the FTSE 100, then if we sold them 30 years later in 2014, we should see a return of about 650% (or a factor of 6.5, if you prefer) on our initial investment. Once you correct this for inflation – the fact that prices have risen over the 30 years – it is still a respectable 300%.
But you’d not get such an impressive return for all periods of the FTSE 100’s history. In fact, if you bought at the 2000 peak, which coincidentally was also about 6500, then selling in 2014 would just return the money you put in, and after accounting for inflation you’d be worse off.
Next, let’s consider how a simple sell high and buy low scheme might improve things. If we bought £1,000 of FTSE 100 shares in 1984, then sold at the year 2000 peak to receive £6,500, and invested this all again in 2003 when the FTSE 100 was worth only 3,500, then we’d get (6,500/3,500)*£6,500=£12,071 in 2014, ie a return of 1,200%.
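The arithmetic above is easy to check in a few lines of Java. This is only a sketch: the class and method names are our own invention, and the index levels are the rounded figures quoted in the text rather than exact market data.

```java
// A sketch of the arithmetic above; the index levels are the rounded
// figures quoted in the text, not exact market data.
class IndexReturns {

    // Value of a holding that tracks the index between two index levels.
    static double grow(double money, double indexAtBuy, double indexAtSell) {
        return money * indexAtSell / indexAtBuy;
    }

    public static void main(String[] args) {
        double start = 1000.0;                      // pounds invested in 1984

        // Buy and hold: 1984 (index 1,000) to 2014 (index about 6,500).
        double buyAndHold = grow(start, 1000, 6500);

        // Sell at the 2000 peak, sit in cash, buy back in 2003, sell in 2014.
        double atPeak = grow(start, 1000, 6500);    // cash after the 2000 sale
        double timed = grow(atPeak, 3500, 6500);    // re-invested 2003 to 2014

        System.out.printf("Buy and hold: %.0f%n", buyAndHold);
        System.out.printf("Sell high, buy low: %.0f%n", timed);
    }
}
```

Running it prints 6500 for buy and hold and 12071 for the timed scheme, matching the figures in the text.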
Clearly, it is possible to devise an algorithm that can profit from the peaks and troughs of share prices. In fact, we can improve things even more if we sell at the 2007 peak and buy in the trough of 2008/9. But, there’s a snag, which might well have occurred to you. It’s easy to see peaks and troughs in a graph of historical data, but there’s no way to know whether you have reached a peak (or trough) at the time unless, of course, you can predict the future. That said, it is still possible to devise sell-high, buy-low algorithms that do not rely on knowing you’re at a peak or in a trough.
Aim high, start low
Let’s now construct our first, very simple algorithm to decide when to buy and sell shares for a single company. It makes one of two decisions: either to spend all available money buying shares, or to sell all the shares we currently hold. This is of course a terrible strategy, and similar to what a desperate drunk might employ in the wee hours at a casino.
The code, written in Java, is:
public class SingleInvestment {
    double money, sellThreshold = 2, buyThreshold = -0.05;
    TimeSeries timeSeries;

    void invest(double investment) {
        // Spend as much as possible on whole shares; keep the remainder as cash
        money = investment - timeSeries.purchaseShares(investment);
        do {
            sellShares();
            buyShares();
        } while (timeSeries.next());
        double rawProfit = money + timeSeries.getFinalValue() - investment;
    }
}
The money variable will record what’s not invested in shares, and the Threshold variables are parameters, of which, more later. The TimeSeries class handles all aspects of the time series, including iterating through time and purchasing and selling shares, and is designed to keep our investment code clean and readable. In fact, we needn’t concern ourselves with how the TimeSeries class is implemented.
The invest() method takes the amount to be invested and on the very first line splurges it to purchase as many shares as possible with the call to purchaseShares(). Only a whole number of shares can be bought, so it returns the actual amount spent, which we use to update the money variable. Next, we start the loop which calls sellShares() then buyShares() repeatedly until timeSeries.next() returns false, telling us we’ve reached the end of the data. The sellShares() and buyShares() methods are as follows:
void sellShares() {
    if (timeSeries.getPrice() > sellThreshold * timeSeries.getPriceAtLastPurchase()) {
        money += timeSeries.sellShares(timeSeries.getSharesHeld());
    }
}

void buyShares() {
    double delta = timeSeries.getDelta(5);
    if (timeSeries.getSharesHeld() == 0 && delta < buyThreshold * timeSeries.getPrice()) {
        money -= timeSeries.purchaseShares(money);
    }
}
The sellShares() method checks the current price of the shares we’re holding, and if the price has risen to more than sellThreshold times the price we paid for them, then it sells all shares by calling timeSeries.sellShares(), which returns the amount from the sale so we can add it to the money variable.
The buyShares() method is similar, except its condition for buying is twofold. First, we must currently hold no shares; second, delta – the difference between the price now and the price five time-steps ago – must be less than buyThreshold times the current share price. Since buyThreshold is negative, this means the price has to have fallen by more than 5% of its current value over those five time-steps. Applied to monthly FTSE 100 index data from its beginning to January 2015, this algorithm generates £10,982 on an investment of £1,000, ie a profit of £9,982. Since £1,000 in 1984 equates to about £3,000 today when adjusted for inflation, this is a very respectable return, and not too far behind the return achieved with prescient knowledge of peaks and troughs.
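If you’d like to try the scheme without the real TimeSeries class, here is a self-contained sketch. ArrayTimeSeries is a hypothetical stand-in we’ve made up for illustration: it mimics the method names used above, holds prices in an array, and treats delta as zero until five time-steps of history exist. The prices are invented, so the result says nothing about the FTSE 100.

```java
// Hypothetical stand-in for the article's TimeSeries class: prices are held
// in an array, and only whole shares can be bought.
class ArrayTimeSeries {
    private final double[] prices;
    private int t = 0;                 // current time-step
    private long sharesHeld = 0;
    private double priceAtLastPurchase = 0;

    ArrayTimeSeries(double[] prices) { this.prices = prices.clone(); }

    boolean next() {                   // advance one step; false at the end
        if (t + 1 >= prices.length) return false;
        t++;
        return true;
    }
    double getPrice() { return prices[t]; }
    long getSharesHeld() { return sharesHeld; }
    double getPriceAtLastPurchase() { return priceAtLastPurchase; }

    // Price now minus the price `steps` time-steps ago (0 if too early).
    double getDelta(int steps) {
        return t >= steps ? prices[t] - prices[t - steps] : 0.0;
    }
    // Buys as many whole shares as `money` allows; returns the amount spent.
    double purchaseShares(double money) {
        long n = (long) Math.floor(money / getPrice());
        if (n > 0) { sharesHeld += n; priceAtLastPurchase = getPrice(); }
        return n * getPrice();
    }
    // Sells `n` shares at the current price; returns the proceeds.
    double sellShares(long n) {
        n = Math.min(n, sharesHeld);
        sharesHeld -= n;
        return n * getPrice();
    }
    double getFinalValue() { return sharesHeld * getPrice(); }
}

class SingleInvestmentSketch {
    // Runs the buy-all / sell-all rule and returns the raw profit.
    static double invest(double[] prices, double investment,
                         double sellThreshold, double buyThreshold) {
        ArrayTimeSeries ts = new ArrayTimeSeries(prices);
        double money = investment - ts.purchaseShares(investment);
        do {
            // Sell everything once the price exceeds sellThreshold times what we paid.
            if (ts.getSharesHeld() > 0
                    && ts.getPrice() > sellThreshold * ts.getPriceAtLastPurchase()) {
                money += ts.sellShares(ts.getSharesHeld());
            }
            // Buy back in after a large enough fall over five time-steps.
            if (ts.getSharesHeld() == 0
                    && ts.getDelta(5) < buyThreshold * ts.getPrice()) {
                money -= ts.purchaseShares(money);
            }
        } while (ts.next());
        return money + ts.getFinalValue() - investment;
    }

    public static void main(String[] args) {
        // Made-up prices: a rise, a fall, then a recovery.
        double[] prices = {10, 12, 15, 21, 20, 18, 16, 14, 12, 10, 13, 16, 20, 25};
        System.out.println(invest(prices, 1000, 2, -0.05));
    }
}
```

On these invented prices the sketch sells at 21, buys back at 14 and ends with a raw profit of 2750.0.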
But there’s still a snag with this scheme: how should the parameter values be decided, and how sensitive is the profit to their values? The answer to the latter question is quite sensitive: if we change sellThreshold to 1.5, then we no longer make a profit, but a loss of £2,105, and if we change it to 2.5, we make a below-inflation profit of £379. The values of 2 and -0.05 came from trial and error, guided by a thermally efficient, wet neural network, ie my brain. Also, remember that these values may work well on the history of the FTSE 100, but there’s no guarantee that these values will work on future data.
Multiple eggs – one basket
Now let’s move up a gear and work with a number of different shares rather than having a portfolio which is mirroring the FTSE 100. For simplicity, we’ll restrict ourselves to owning only one of these shares at a time: BP.L, ITV.L, LLOY.L, MKS.L, NG.L and TSCO.L. If you want to know more about these companies, you can search for them on finance.yahoo.com or www.londonstockexchange.com, but leave off the .L suffix if you use the latter, because it just means the shares are listed on the London Stock Exchange.
The code used is not too different from before:
private static final int NO_MORE_DATA = 0, NO_NEW_DATA = 1, NEW_DATA = 2;
ArrayList<StockTimeSeries> seriesList = new ArrayList<>();
ArrayList<StockTimeSeries> availableList = new ArrayList<>();
...
public void invest(double investment) {
    money = investment;
    Calendar cal = Calendar.getInstance();
    cal.setTime(seriesList.get(0).getFirstDate());
    index = -1;  // a field, so buyShares() can update it; -1 means no shares held
    status = updateAvailableList(cal.getTime());
    do {
        if (index == -1) {
            buyShares();
        } else if (sellShares()) {
            index = -1;
        }
        // Step forward one day at a time until some series has data
        do {
            cal.add(Calendar.DAY_OF_MONTH, 1);
            status = updateAvailableList(cal.getTime());
        } while (status == NO_NEW_DATA);
    } while (status != NO_MORE_DATA);
}
The main difference is that we are not just looping through one time series, but looking at a number of series that are in a List called seriesList. The code that prepares this isn’t shown here, but it ensures that the series with index 0 has the earliest start date, and this is the date that we put into the Calendar object. We start with index=-1 which means that we currently hold no shares. Next we call updateAvailableList() with the starting date, and it will put all time series objects with data for that day into the availableList, and return a status of NEW_DATA, which is one of three constants defined in the class.
The loop then starts, and since index is -1 (we have no shares) it attempts to buy some. In later iterations, after we own some shares, an attempt is made to sell shares instead. Then we start another loop that will repeatedly increment the date by one day and then call updateAvailableList() for as long as it returns a status of NO_NEW_DATA. This winds us past weekends and bank holidays on which the stock exchange is closed and the availableList is empty. The outer loop carries on as long as status has not been set to NO_MORE_DATA; once it has, we’ve reached the end of all the data and we’re done.
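The updateAvailableList() method isn’t shown in this article, so here is one guess at how such a method could behave. Everything in this sketch is an assumption made for illustration: each series is represented as a TreeMap from date to price rather than a StockTimeSeries, java.time.LocalDate replaces Calendar, and the prices and dates are invented.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

class AvailableListSketch {
    static final int NO_MORE_DATA = 0, NO_NEW_DATA = 1, NEW_DATA = 2;

    // Each series maps a trading date to a price (hypothetical representation).
    final List<TreeMap<LocalDate, Double>> seriesList = new ArrayList<>();
    final List<TreeMap<LocalDate, Double>> availableList = new ArrayList<>();

    // Refills availableList with every series that has a price on `date`.
    int updateAvailableList(LocalDate date) {
        availableList.clear();
        boolean anyLater = false;
        for (TreeMap<LocalDate, Double> series : seriesList) {
            if (series.containsKey(date)) availableList.add(series);
            if (series.higherKey(date) != null) anyLater = true;
        }
        if (!availableList.isEmpty()) return NEW_DATA;
        return anyLater ? NO_NEW_DATA : NO_MORE_DATA;
    }

    // Walks forward day by day and returns the dates on which trading could happen.
    List<LocalDate> tradingDays(LocalDate start) {
        List<LocalDate> visited = new ArrayList<>();
        LocalDate date = start;
        int status = updateAvailableList(date);
        do {
            if (status == NEW_DATA) visited.add(date);   // buy/sell decisions would go here
            do {                                          // skip weekends and holidays
                date = date.plusDays(1);
                status = updateAvailableList(date);
            } while (status == NO_NEW_DATA);
        } while (status != NO_MORE_DATA);
        return visited;
    }

    // Two invented series with a weekend gap between Friday and Monday.
    static List<LocalDate> demo() {
        AvailableListSketch s = new AvailableListSketch();
        TreeMap<LocalDate, Double> a = new TreeMap<>();
        a.put(LocalDate.of(2015, 1, 2), 410.0);   // Friday
        a.put(LocalDate.of(2015, 1, 5), 405.0);   // Monday
        TreeMap<LocalDate, Double> b = new TreeMap<>();
        b.put(LocalDate.of(2015, 1, 5), 190.0);
        b.put(LocalDate.of(2015, 1, 6), 192.0);
        s.seriesList.add(a);
        s.seriesList.add(b);
        return s.tradingDays(LocalDate.of(2015, 1, 2));
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The weekend of 3 and 4 January is skipped because no series has data on those days, and the walk stops once no series has any later date.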
The sellShares() method is nearly identical to before, except that it now returns true if it sells shares, but false otherwise. The buyShares() method is a little bit more involved than before:
void buyShares() {
    for (int i = 0; i < availableList.size(); i++) {
        StockTimeSeries timeSeries = availableList.get(i);
        double delta = timeSeries.getDelta(5);
        if (delta < buyThreshold * timeSeries.getPrice()) {
            money -= timeSeries.purchaseShares(money);
            index = i;
            return;
        }
    }
}
It loops through all series in the availableList and looks for a drop in share price of sufficient size in the same way as before. As soon as it finds one share with such a drop, it will purchase as many shares as it can and return.
The output from using this class, investing £1,000 initially, with sellThreshold=1.5 and buyThreshold=-0.05, is:
1988-08-16,1000,0,3 BUY: BP.L.csv
1990-08-17,-,1522,3 SELL: BP.L.csv
1990-08-21,1522,0,3 BUY: MKS.L.csv
1992-04-13,-,2315,3 SELL: MKS.L.csv
1992-04-30,2315,0,3 BUY: TSCO.L.csv
1995-07-07,-,3507,3 SELL: TSCO.L.csv
1995-09-22,3507,0,3 BUY: TSCO.L.csv
1997-05-16,-,5274,5 SELL: TSCO.L.csv
1997-05-20,5274,0,5 BUY: MKS.L.csv
2006-03-10,-,8051,6 SELL: MKS.L.csv
2006-03-31,8051,0,6 BUY: ITV.L.csv
2013-07-11,-,12150,6 SELL: ITV.L.csv
2013-08-15,12150,0,6 BUY: MKS.L.csv
Final money=0
Final value of held shares=13628
Raw profit=12628
Each line of the output records a transaction: the date of transaction, value of shares held, money held (after this transaction), size of the availableList, and a short text description. You can see here that this scheme is our most successful yet, bringing a greater than 12-fold return, before inflation.
How our companies have been doing
Here’s a plot of prices for all shares we’re interested in, and the value of the FTSE 100, which uses a different scale to the right. You can download the data from Yahoo Finance. For example, on the command line you can fetch all available data for the share BP.L with:
wget "http://ichart.finance.yahoo.com/table.csv?s=BP.L" -O BP.L.csv
Or you could put that same URL into your web browser. We’ll be working with the Adj.Close column of the data.
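Once you have a CSV file, pulling out one column takes only a few lines. The sketch below makes several assumptions that you should check against your own download: the header spelling ("Adj Close" here), the column order and the newest-first row order are guesses, and the sample rows are invented rather than real BP.L prices.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AdjCloseReader {

    // Returns the values in the named column, oldest first.
    static List<Double> readColumn(List<String> lines, String column) {
        String[] header = lines.get(0).split(",");
        int col = -1;
        for (int i = 0; i < header.length; i++) {
            if (header[i].trim().equals(column)) col = i;
        }
        if (col < 0) throw new IllegalArgumentException("No column " + column);

        List<Double> values = new ArrayList<>();
        // The sample rows are assumed to be newest first, so read them backwards.
        for (int i = lines.size() - 1; i >= 1; i--) {
            String line = lines.get(i).trim();
            if (line.isEmpty()) continue;
            values.add(Double.parseDouble(line.split(",")[col]));
        }
        return values;
    }

    // Invented sample data standing in for a downloaded file.
    static List<String> sample() {
        return Arrays.asList(
            "Date,Open,High,Low,Close,Volume,Adj Close",
            "2015-01-06,402.0,405.5,398.0,400.0,1000000,400.0",
            "2015-01-05,410.0,412.0,401.0,403.5,1200000,403.5",
            "2015-01-02,415.0,418.0,409.0,411.25,900000,411.25");
    }

    public static void main(String[] args) {
        System.out.println(readColumn(sample(), "Adj Close"));
    }
}
```

To read a real file instead of the sample, you could pass in the result of Files.readAllLines(Paths.get("BP.L.csv")) from java.nio.file.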
A proper portfolio investment
Let’s now implement a true portfolio of shares where we can hold shares of several companies at once. We will decide when to buy and sell shares exactly as above, but a new decision has to be made: how many shares should we buy or sell? Adhering to the KISS principle (Keep It Simple Stupid!), let’s plump for creating two new parameters: sellFraction and buyFraction. The first means that when we decide to sell a particular share, we sell the number of those shares held times sellFraction. Similarly, for buyFraction, once a decision is made to buy shares, the amount of money to be spent is set equal to buyFraction times the amount of money we currently hold.
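As a quick illustration of what the two fractions mean in practice, here is a small sketch. The helper names are made up, and rounding down to whole shares on both sides is our assumption; the text only says that purchases are limited to whole shares.

```java
class FractionSizing {

    // Number of shares to sell: the holding times sellFraction, rounded down.
    static long sharesToSell(long sharesHeld, double sellFraction) {
        return (long) Math.floor(sharesHeld * sellFraction);
    }

    // Whole shares affordable with buyFraction of the cash in hand.
    static long sharesToBuy(double money, double buyFraction, double price) {
        return (long) Math.floor(buyFraction * money / price);
    }

    public static void main(String[] args) {
        // Invented figures: 151 shares held, 1,000 pounds in cash, price 3.20.
        System.out.println(sharesToSell(151, 0.5));       // half of 151, rounded down
        System.out.println(sharesToBuy(1000, 0.5, 3.2));  // 500 pounds' worth at 3.20
    }
}
```

With both fractions below 1, a single transaction never empties the holding or the cash balance completely.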
The invest() method hardly changes, and in fact ends up becoming simpler because we do away with the index variable, and if-else statements testing it inside the main do-while loop are replaced with:
buyShares();
sellShares();
The buyShares() code hardly changes: we only need to remove the index=i line and in the line above replace money with buyFraction*money. The sellShares() method needs to change a bit so that, like in buyShares(), it loops through all available shares, and takes notice of the new parameter, sellFraction:
void sellShares() {
    for (int i = 0; i < availableList.size(); i++) {
        StockTimeSeries timeSeries = availableList.get(i);
        if (timeSeries.hasShares() && timeSeries.getPrice() > sellThreshold * timeSeries.getPriceAtLastPurchase()) {
            money += timeSeries.sellShares(sellFraction);
            return;
        }
    }
}
Notice that, keeping to our KISS principle, as soon as we find shares that meet the criterion, we buy or sell then return; at most one buy and one sell transaction can take place at each time-step.
Running this scheme, with sellThreshold=1.5, buyThreshold=-0.05, buyFraction=0.5 and sellFraction=0.5, gives a return of £12,140 on an investment of £1,000, ie a profit of about £11,140, roughly £1,500 less than the previous scheme. Although this return is lower, our investment is safer in that with the previous scheme we could have lost everything if the single company we’d invested in went bust.
The graphs of this scheme show some odd features. The most striking is that there were no transactions at all between March 2007 and June 2013. During most of this time, the UK, along with many other countries, was either in recession or enduring a feeble recovery. At first it’s surprising that the algorithm didn’t buy shares during the start of the recession in 2008, because that was a time when prices were falling. However, on closer inspection you can see that our money was already fully invested when the recession started, so we rode out those years with an unchanging portfolio. It wasn’t until 2013 that any of the shares rose in price enough to trigger a sale. This is a great example of why investing in shares is regarded as a long-term investment, and not a get-rich-quick scheme.
Bank share prices suffered the most pronounced fall in the 2008 recession, so it’s particularly pertinent to see how those of LLOY.L – the Lloyds bank – fared at this time. They were bought and sold by our scheme up until 2007, but not after that date. In fact, the cunning little algorithm divested itself of all those bank shares before the financial crisis of 2008 started. And a good thing too: LLOY.L shares plummeted from around £300 per share to under £30 during the recession.
Dark arts of parameters
In building models, whether they are to predict time series, or model the climate, or plan a space mission, it’s next to impossible to avoid using parameters. Some parameters, such as physical constants like the one that describes the strength of gravity, can be measured objectively, but others, like the sell and buy thresholds we’ve used, need to be set empirically – that is, by looking at real data. And this causes a problem: we want the parameters to work not just on the data we have, but on any data we throw at the model. This issue extends beyond numerical modelling. For example, it crops up in learning a foreign language. At school my French teacher taught us “J’ai treize ans et j’habite à Édimbourg”. If I learned that parrot-fashion, without understanding, it’d soon be useless to me as I wouldn’t be 13 and living in Edinburgh for my entire life (as it happens I was 12 and lived in Glasgow!).
The point is that we must not choose parameters so that they only work well on one set of data. In fact, it’s always possible to keep adding parameters to a model so it can describe a given set of data perfectly, but when it does, it will almost certainly fail when given fresh data. There are various methods available to set parameters to avoid this pitfall, but they mostly boil down to a simple idea: reserve one set of data for choosing the parameters – the training set – and another set of data for checking that the model generalises well – the test set. If you think about it, this is exactly how school learning proceeds – you are taught on one set of examples, but will be tested on an unknown set of examples in the exam.
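To make the training set and test set idea concrete, here is a rough sketch that picks sellThreshold using only the first part of a price series and then checks it on the remainder. It is not the code used for the results in this article: the trading rule is a simplified version that allows fractional shares, the class and method names are invented, and the prices are made up.

```java
import java.util.Arrays;

class TrainTestSketch {

    // Simplified buy-all / sell-all rule (fractional shares allowed, for brevity).
    // Starts fully invested with one unit of money; returns the final wealth.
    static double growth(double[] p, double sellThreshold, double buyThreshold) {
        double cash = 0, shares = 1.0 / p[0], paid = p[0];
        for (int t = 1; t < p.length; t++) {
            if (shares > 0 && p[t] > sellThreshold * paid) {
                cash += shares * p[t];
                shares = 0;
            }
            if (shares == 0 && t >= 5 && p[t] - p[t - 5] < buyThreshold * p[t]) {
                shares = cash / p[t];
                cash = 0;
                paid = p[t];
            }
        }
        return cash + shares * p[p.length - 1];
    }

    // Picks the candidate sellThreshold that does best on the training prices only.
    static double chooseSellThreshold(double[] train, double[] candidates,
                                      double buyThreshold) {
        double best = candidates[0], bestGrowth = Double.NEGATIVE_INFINITY;
        for (double c : candidates) {
            double g = growth(train, c, buyThreshold);
            if (g > bestGrowth) { bestGrowth = g; best = c; }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up prices; the first two thirds are for training, the rest for testing.
        double[] prices = {10, 11, 12, 13, 14, 15, 16, 10, 9, 11, 13, 15,
                           17, 12, 10, 11, 13, 15, 16, 18, 13, 12, 14, 17};
        int split = 16;
        double[] train = Arrays.copyOfRange(prices, 0, split);
        double[] test = Arrays.copyOfRange(prices, split, prices.length);

        double chosen = chooseSellThreshold(train, new double[]{1.2, 1.5, 2.0}, -0.05);
        System.out.println("Chosen sellThreshold: " + chosen);
        System.out.println("Growth on training set: " + growth(train, chosen, -0.05));
        System.out.println("Growth on test set: " + growth(test, chosen, -0.05));
    }
}
```

If the growth on the test set is much worse than on the training set, that is a sign the chosen value has been fitted to quirks of the training data rather than to anything that generalises.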
With great power comes great responsibility
If you were to take £1,000 and invest it in a savings account in 1988, you’d need an annual interest rate of 10% to give returns comparable with those we’ve seen with these simple schemes. This may lead you to think that the author is very rich, only troubled by the choice of which volcano to hollow out for a secret base, or what colour boots his hordes of minions should wear. In truth though, I have not invested heavily in shares for two reasons. Firstly, because, like most people, I don’t have enough money to risk losing it and so when I do have some money to spare, I place it in safe savings accounts with modest interest rates.
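The 10% figure can be sanity-checked with the compound interest formula. In this sketch the 26-year period (1988 to about 2014) is our assumption, and the final values are the ones reported earlier for the two multi-share schemes.

```java
class EquivalentRate {

    // Annual compound rate that turns `start` into `end` over `years` years.
    static double annualRate(double start, double end, double years) {
        return Math.pow(end / start, 1.0 / years) - 1.0;
    }

    public static void main(String[] args) {
        double years = 26;   // assumed holding period, 1988 to about 2014
        System.out.printf("One share at a time: %.1f%%%n",
                100 * annualRate(1000, 13628, years));
        System.out.printf("Portfolio scheme: %.1f%%%n",
                100 * annualRate(1000, 12140, years));
    }
}
```

Both rates come out a little above 10% a year under these assumptions.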
Secondly, simply speculating on the stock market raises ethical concerns. Gambling on companies you know nothing about can, in concert with a mass of similarly ignorant or short-sighted investors, cause bubbles and busts, of which two are present in the data we’ve looked at – the dot-com bubble in 2000 and the Great Recession starting in 2008.
If you do wish to invest, you could do worse than follow the example of John Maynard Keynes, who advocated taking an interest in the companies and investing in those that would not just be profitable, but productive to the real economy too. That human touch, together with algorithms like those we’ve begun to develop here, can make for investments that are both profitable and ethical.
There’s plenty of scope for extending these basic algorithms. You can use the algorithms here as a starting point and improve on them. Or, if writing raw code isn’t your thing, you can get GNU Octave from your distro’s repositories and with its TSA time series toolbox you can do some serious statistical analysis. You can use the venerable Gnuplot as a graphing engine for GNU Octave. If you’d like to try something a little bit different, you could try using the FANN library – Fast Artificial Neural Networks – which has bindings for many programming languages. With it you can get your computer to learn to predict time series, but be careful – you really don’t want to be both out-thought and out-earned by a machine, do you?
How our portfolio evolved
The graphs are stacked, which means that the top of each column is the total number of shares held, with the height of the colour rectangles showing the proportion of each share held. For example, from 2007 to 2013, no shares were bought or sold and the number of shares remained constant at just below 70,000. Most of those shares were in ITV.L, with some in BP.L and TSCO.L.
Of more relevance to most people will be the amount of money that our shares are worth. The value of shares held takes on a spiky appearance, reflecting the volatility of share prices. The amount of cash money held is shown too, though it is almost always small in proportion to the total value of shares held.