"All life is a experiment the more experiments you make the better" Ralph Waldo Emerson
So I am working on a Stock AI while blogging about what I am learning as I go. This post acts more as a mission statement for a set of experiments I am building to test a set of AIs which I hope have something more like momentum or memory. I currently have something like 32 tests set up that I need to get through.
What would memory mean?
Something that I have been thinking about is what memory would mean within an AI and how it could be implemented. At present, if you show a neural network input A, there is no impact of input A when the network is later exposed to input B; each forward propagation is distinct. I believe time series data such as stocks would benefit from the stock price of 10 days ago having some form of lasting impact within the AI's memory.
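A tiny sketch of what I mean (plain NumPy, random weights, purely illustrative):

```python
import numpy as np

# A plain feed-forward network: each call depends only on the current input,
# so showing it input A leaves no trace when it is later shown input B.
def forward(x, w1, w2):
    hidden = np.tanh(w1 @ x)
    return np.tanh(w2 @ hidden)

rng = np.random.default_rng(0)
w1, w2 = rng.normal(size=(4, 3)), rng.normal(size=(1, 4))
a, b = rng.normal(size=3), rng.normal(size=3)

out_b_alone = forward(b, w1, w2)          # input B on its own
forward(a, w1, w2)                        # show the network input A first...
out_b_after_a = forward(b, w1, w2)        # ...then B again afterwards
print(np.allclose(out_b_alone, out_b_after_a))  # True: no memory of A
```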
Naturally I am not 100% sure what this would look like, which is why I have been creating some tests. I have of course read about the long short-term memory (LSTM) networks used by Google in text analytics, and I have read about MIT's liquid artificial intelligence work, though I recall reading about similar technology developed in the 1980s.
For me it became about wanting to see if I could find something that would make my work different, and whether I could substantiate it in some way.
Preface
So this is a continuation of my blog about the stock trading AI I am building. In developing it I started running it for long periods and gathering data; data such as the plots I showed of price falls and rises and the AI output. Something that has been bothering me, though, is that while there is a natural fitness score for a stock market AI in the form of the almighty dollar, I am kind of limited in what I can do to improve the algorithm.
I hadn't stopped and thought about it, but I can take an AI and work out how effective it is; when that AI is produced by a neuroevolutionary genetic algorithm that runs a tournament to find the best solution, though, I am limited in how I might approach making an improvement.
If I change something, run it, and it produces an AI that performs well, does that mean it was a good improvement? Or was it just lucky? Was it trading Amazon instead of penny stocks? Did it get a screwed-up setup? So I thought: I am evolving a population of AIs and choosing the best, so maybe I need to turn to statistics and analyse the whole population. To do that I need data, and I need to work out what would be a meaningful sample size.
To calculate a good sample size:
- Choose the required confidence level as a %
- Choose the margin of error as %
- Choose the proportion of the total population you're trying to judge (%)
- If required, specify the population size
The sample size (n) is calculated according to the formula: n = z² * p * (1 - p) / e²
Where: z = 2.576 for a 99% confidence level, p = the proportion (expressed as a decimal), and e = the margin of error.
So, punching in the numbers: I think I need a 99% confidence level, since with a neuroevolutionary algorithm that has access to the majority of the New York Stock Exchange's history there might be a high level of variety. I am happy with a 5% margin of error and, taking the proportion as a conservative 50%, I get 664 as the required sample size. I round that up to 700: roughly 700 AIs probably need to be tested within any domain before I start making decisions.
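A quick sketch of that calculation, just to sanity-check the 664 figure (the function name is only illustrative):

```python
import math

def sample_size(z: float, proportion: float, margin_of_error: float) -> int:
    """Sample size formula n = z^2 * p * (1 - p) / e^2, rounded up."""
    n = (z ** 2) * proportion * (1 - proportion) / (margin_of_error ** 2)
    return math.ceil(n)

# 99% confidence (z = 2.576), 50% proportion, 5% margin of error
print(sample_size(2.576, 0.5, 0.05))  # -> 664
```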
These figures are obviously guesstimates of how many tests I should have before I start drawing conclusions, and that is a lot of tests. Luckily, I have a neuroevolutionary algorithm that automates production and testing from end to end.
Though I am still a bit worried by this as an approach, as an evolutionary algorithm is by necessity an exponential process: if, in a session, an algorithm receives, say, the Amazon stock at or close to the beginning of its evolution process and then just evolves and improves on that, is that a fair test?
I am uncertain how to test evolutionary algorithms properly. I might need to think about constraining the test, but any constraint is also a limitation on the direction an AI might evolve in, and could end up focusing it on the specific stock or circumstance you then test against. So I accept I am sort of ignoring this complexity for now: because I am trying to build a better evolutionary algorithm, it doesn't seem possible to set up controls for the evolution without missing the point.
Problem Statement
So, having set up the code to start saving data from AI runs, I started building charts and statistics about my main program.
After 463 tests of my initial algorithm, which is called version 74, I found the following.
Profits for version 74 averaged $19.64. Accuracy, meaning trades where a profit was made, averaged 16.38 positive trades per simulation, of which 4.26 were shorts (sold high, bought low). It averaged a maximum of 102.90 days holding a given position and made 49.36 trades on average.
You can see version 74 as a box-and-whisker chart below. While performance averages around $20 per AI, it is actually dragged down by outliers showing losses of -$1,500 and worse.
In reality you would be unlikely to choose an AI that made such mistakes: back-testing would demonstrate that it was an unsuitable tool, and an owner would have to keep funnelling cash into an AI like that to let it keep making those mistakes. Furthermore, the recorded data doesn't distinguish between an initial training run, where maybe the AI had a bad few days because of the system death of a previous pattern that had worked very well, and the same AI having been fully trained.
So even though it would seem a serious problem, you can also see that I have 4 versions that already seem to have this behaviour engineered out.

I began to develop a hypothesis that this might not be problematic behaviour when looking at the number of times an AI traded (shown below). These are box-and-whisker charts of trades made in a simulation, showing the distribution for each version.

You can see from the chart below, which shows a count of profitable sales (what I called accuracy), that version 74 is making lots of positive actions, more so than the other versions.

I have an antithesis to this hypothesis: that the AI is already trading effectively, and the outliers that lose lots of money are not just obvious outliers but can also be explained by the fact that there are different methods of making money on stocks. One is that you predict the market, trade frequently and take risks; having taken risks, you staircase your investment up into a larger pot, though you are always at risk of losing it all.
If you look at stocks long enough you realise nearly all healthy stocks trend up with inflation on the macro scale of, say, 10 years. Buying stocks and holding them for long enough is the simplest method of being successful. Though that leaves an unfortunate problem: at exactly what point do you decide to sell? If your aim is to make money, why not wait another 10 years, then another 10? And while stocks trend up, what about the risk of a market collapse? And when you finally sell, you may see the stock go up in price anyway, causing seller's remorse.
So there seems to be a tension here: an AI that makes multiple profitable sales is also likely one that risks getting it all wrong, versus the AI that buys in and holds a stock for a long period.
Stock traders have often been described as bullish or bearish, and in any trade there are really only two approaches: that of the bull and that of the bear.
The chart below shows that the AIs that risk lower average profits may also be the desirable ones, and there is something worth watching out for: the AI with memory that I am trying to build (as I go on to in my hypothesis) may not actually be the AI most suitable to the trading environment. So it is worth considering that profitability might not be the trait to select for... though accuracy is a poor replacement when the market moves and the AI loses the principal...
Given that many unforeseen major market moves are caused by the newsroom rather than the market data, throughout all this I am also considering an inverse hypothesis: that version 74 is actually the best version, in that it is doing what is intended given only market data. The ideal would be to find a version that trades more, with higher accuracy and higher profits.

Hypothesis
What I have found, but not fully substantiated, is that the AIs that lose money actually get into a habit of trading, and trading successfully, possibly even daily but potentially only for a dollar at a time. There is then a system death: the market conditions change, yet they continue making the same mistake, which, because of past successes, is heavily reinforced in the AI's behaviour, and this causes the AI to lose its principal (the initial money put up to buy the stock). Because all the tests are zero-centred and include commission and fees in the calculations, the AI loses, and loses big.
The problem is that undoing all that reinforcement not only requires lots of mistakes, trying the same wrong thing multiple times, but even then there is no guarantee that new, better behaviours will emerge; the AI could just (to use some unscientific terminology) lose the plot and start buying and selling the stock, attempting actions without finding a new pattern that works for it.
Currently the theory behind the AI is that it concentrates a whole bunch of data points into a single decision point, as a reinforcement learning task. If it changes state, i.e. buys or sells, it is always based on the data available at that point in time. The AI having no memory, my view is that the machine doesn't really have any continuity of decision making.
My view was that if the AI had memory, then when the market changed it would adapt faster, because it would recognise the change from markers in memory and, rather than repeating the same actions over and over, would over time start adapting to new behaviours. Or at least that is what I hoped...
I then came up with a series of test cases that implement some form of memory within the AI. My inspiration here was synapses in biological organisms. AIs have neurones but seem to lack synapses: we model the decision-making inputs and outputs of neurones, but we don't have a concept in AI for the effect of the propagation of energy along synaptic pathways.
My theory is that in biological organisms short-term memory persists because, when a neurone fires in your head, the energy travels along the synapse, and while we tend to think of it in terms of electrical current, biological things are, well, wet and not timed perfectly. I am no neuroscientist, but my reading suggests that after a neurone fires it spends some time resting and reabsorbing energy, which I understand to come from the synaptic pathways.
My reading implied there are time delays in neurone firing, delays while resting before firing again, that the firing of a neurone affects the synapse, and that the absorption and resting of a neurone might mean it reabsorbs energy from the synapse. I rethought why we have activation functions in neural networks as a form of energy loss, and reasoned that neurones in the brain must have forms of momentum, connecting what they can output now to what they were doing previously. I began running tests with neurones that have persistence states, or memories.
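As an illustration of the general idea, rather than any of the exact prototypes, here is a sketch of a neurone carrying a persistence state between forward passes; the blend factor and the tanh activation are assumptions:

```python
import numpy as np

class MemoryNeurone:
    """A toy neurone that blends its previous output into the current one.

    This is only a sketch of the general idea; the decay factor and the
    tanh activation are assumptions, not the exact scheme used in the
    prototypes described below.
    """

    def __init__(self, n_inputs: int, decay: float = 0.5, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.weights = rng.normal(size=n_inputs)
        self.decay = decay        # how much of the previous output persists
        self.state = 0.0          # the "memory" carried between forward passes

    def forward(self, x: np.ndarray) -> float:
        out = np.tanh(self.weights @ x)
        # Blend the current output with the persisted state (momentum-like memory).
        out = (1 - self.decay) * out + self.decay * self.state
        self.state = out
        return float(out)

neurone = MemoryNeurone(n_inputs=3)
print(neurone.forward(np.array([0.1, 0.2, 0.3])))   # first call
print(neurone.forward(np.array([0.1, 0.2, 0.3])))   # same input, different output: the state persists
```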
I still have a long way to go to get this information together and format it. As you can see, I have more than one working prototype of these memory/momentum AIs, and with the data I have compiled so far, the idea has worked.
The various prototypes I have are all forms of momentum AI. I hope one day they might be useful for high-frequency trading or quants.
Conclusion
It does seem to work: a number of the prototypes work. They have both traded well enough and increased profitability. Though I have yet to find a variant with the magic combination of increasing both the number of trades and accuracy, I am uncertain whether that is the thing I ought to be focusing on.
The thing I have learned from this is to create structured experiments, change individual things, and see the effects. Interestingly, my explanation that it relates to buying and holding does not seem to be universal.
It is also strange that individual changes can affect the AI's behaviour in such profound ways.
Current Data
Below is the behaviour of the current prototypes. I have a lot to get through to finish checking all the improvements.
| Version | Sample size | Avg profit ($) | Avg accuracy (profitable trades) | Avg shorts | Avg max days held | Avg trades |
|---------|-------------|----------------|----------------------------------|------------|-------------------|------------|
| 74      | 513         | 20.58          | 14.88                            | 3.89       | 98.24             | 44.95      |
| 75      | 436         | 46.98          | 2.38                             | 0.75       | 67.47             | 6.81       |
| 76      | 417         | 58.07          | 1.49                             | 0.47       | 117.13            | 3.60       |
| 77      | 784         | 42.49          | 0.94                             | 0.30       | 112.99            | 2.37       |
| 78      | 495         | 58.70          | 2.81                             | 0.85       | 59.83             | 7.64       |
| 79      | 29          | -37.45         | 70.00                            | 22.10      | 98.90             | 198.79     |
Versions
Version 74 is a standard neural network.
Versions 75 and 77: the neural network has each neurone's output tied to a proportion of its previous output and its current output.
Version 76: the same proportional tie between previous and current values, but applied before the activation function.
Version 78: outputs are summed on the neurone before the activation function is applied.
Version 79: the MTEM basis of the AI is a network of neural networks. When I coded this it was possible for closed loops to form between two networks, and I noted this seemed to be an improvement. With this version and several others the intent is to test what is working here and what isn't. This still works as a form of memory, as the states of the networks are passed cyclically between each other.
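To make the differences between versions 75 to 78 concrete, here is a rough sketch of the three blending schemes as I understand them; the 50/50 proportion and the tanh activation are assumptions for illustration, not the values actually used:

```python
import numpy as np

def step(weights, x, prev, mode, blend=0.5):
    """One forward step of a single neurone under the different memory schemes.

    mode "after"  ~ versions 75/77: blend previous and current output after activation.
    mode "before" ~ version 76:     blend the previous value in before activation.
    mode "sum"    ~ version 78:     add the previous output onto the pre-activation sum.
    """
    raw = weights @ x
    if mode == "after":
        out = (1 - blend) * np.tanh(raw) + blend * prev
    elif mode == "before":
        out = np.tanh((1 - blend) * raw + blend * prev)
    elif mode == "sum":
        out = np.tanh(raw + prev)
    else:
        raise ValueError(mode)
    return out

rng = np.random.default_rng(1)
w, x = rng.normal(size=3), rng.normal(size=3)
for mode in ("after", "before", "sum"):
    state = 0.0
    for _ in range(3):                 # feed the same input three times
        state = step(w, x, state, mode)
    print(mode, float(state))          # each scheme settles differently as the state persists
```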