“From nothing else but the brain come joys, delights, laughter, and sports, and sorrows, griefs, despondency, and lamentations.” – Hippocrates
This is a blog of me trying to do strange things with neural networks. In the last week I got my hands on the maths for a human neurone (the squid neurone I had been playing with previously, plus a Ca ion channel) and combined it with my earlier experiments in training that squid neurone, which had measured its performance against random chance. Unsurprisingly the human neurone won that fight.
These neurones differ from normal artificial neural network neurones in that they are differential equations. In the artificial neural network literature they fall under the category of spiking neurones: they have a temporal ebb and flow to them, spiking initially and then falling off in activity. I would explain them simply as having a sense of time and rhythm rather than directly mapping inputs to outputs. There is a theory that more information is stored in the temporal element of when a neurone spikes than in how much it spikes, and according to this theory they should perform better than regular AI. They are essentially a more intense simulation of the neurones in our heads than present AI, and while I am ignoring wider aspects of temperature, glial cells and other parts of simulating a human brain, the intent is to have a crack at simulating a brain itself.
Yes, I am trying to simulate a brain. Sounds shocking, but I have most of the basics down: a piece of software whose error reduces as it gets bigger, and I'm running tests until I get it all down.
I have been interested in the differences between artificial neural network activation functions and the behaviours of a simulated human neurone for some time, and have my own software written for the purpose of simulating neurones. This software simulates a human neurone at the individual neurone scale rather than as blocks of linear algebra.
In theory, if I had a map of the human brain and a big enough computer, I'd start having a go at simulating the whole thing. I'd probably fail, but worth a shot, right?
Intent of the project
The neurones themselves take up 240 bytes (I wrote it in C++) with an additional 2 bytes per dendrite simulated, so assuming the brain is 16 billion neurones you're looking at 3.84 terabytes. Each of these neurones can have up to 15 thousand connections, which works out at roughly 2.4e14 connections, or about 4.8e14 bytes (480 terabytes). That is a lot more bytes than I have, and more than I could have even if I spent 10k on a server. So simulating a whole human brain is too much for me right now (also, I am quite a cheapskate and would have waited until I could simulate it for less; £100 seems reasonable).
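As a sanity check, here is a quick sketch of that arithmetic in C++ (the per-neurone and per-dendrite sizes are from my own structs; the biological counts are the rough figures quoted above):

```cpp
#include <cstdio>

// Back-of-envelope check of the storage numbers above. The byte sizes come
// from my neurone structs; 16e9 neurones and up to 15k connections each are
// the rough biological figures quoted in the text.
int main() {
    const double neurones        = 16e9;   // neurones in the brain (rough)
    const double synapsesPer     = 15e3;   // connections per neurone (upper end)
    const double bytesPerNeurone = 240.0;  // size of one neurone struct
    const double bytesPerSynapse = 2.0;    // per-dendrite bookkeeping

    double neuroneBytes = neurones * bytesPerNeurone;
    double synapseBytes = neurones * synapsesPer * bytesPerSynapse;

    std::printf("neurone storage: %.2f TB\n", neuroneBytes / 1e12);  // ~3.84 TB
    std::printf("synapse storage: %.2f TB\n", synapseBytes / 1e12);  // ~480 TB
}
```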
Though a supercomputer has between 200 and 500 terabytes of RAM, so it's not outside the realm of possibility. Regardless, what matters is that I can really only investigate this thing at its smallest scale.
So without a supercomputer I cannot do the whole thing currently, but I can start running projects in a few different areas to see how I can improve both the closeness of the simulation to reality and what I can learn about building a usable piece of compute out of these weird spiking neurones.
So the purpose of this article is a statement of intent: having got a method, to see how far you can take simulating a brain on a home PC, and, with respect to the transformer model and current AI, to discuss methods for working towards full true AI. Because, at the expense of being glib, I am grateful ChatGPT can do my homework, but I resent that it's not a brain.
So this article is intended as a gap analysis of the current technology and how we could progress to true AI.
Current Technology
What I keep finding time and time again when researching this is that we have artificial neural networks, and we have differential equations. The human brain by both definitions appears to be a neural network composed solidly of differential equations, at a scale, as already discussed, high enough to make a supercomputer blush.
We cannot find the source of memory in a brain, and I have not found anything that safely explains how the brain remembers, but anecdotally it is evidently better at it than ANNs. Artificial neural networks often have a dedicated long short-term memory (LSTM) system, and they often also use recurrent connections between layers. Criticisms of these often cite the exploding and vanishing gradient problems.
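As a toy illustration of that criticism (a generic sketch, not tied to any particular network): backpropagating through a recurrent connection multiplies the gradient by roughly the same factor at every time step, so over many steps it shrinks or blows up geometrically.

```cpp
#include <cstdio>
#include <initializer_list>

// Toy illustration of vanishing/exploding gradients in a recurrent loop:
// the gradient is scaled by roughly the same factor at each step back in
// time, so it decays or grows geometrically with sequence length.
int main() {
    for (double w : {0.9, 1.1}) {
        double grad = 1.0;
        for (int t = 0; t < 100; ++t) grad *= w;  // 100 steps back in time
        std::printf("recurrent factor %.1f -> gradient after 100 steps: %g\n",
                    w, grad);
    }
    // 0.9^100 ~ 2.7e-5 (vanished), 1.1^100 ~ 1.4e4 (exploded)
}
```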
Look at the transformer, at ChatGPT: there are no differential equations there. The interesting thing about ANNs and AI all over the place is that once the language is taken away, it's just a lot of linear algebra, and it's questionable whether the main nodes in the graph representing that ANN really relate back to how the brain works at all. Now, I really like ANNs, and transformer improvement has been really interesting, but it's really clear there are differences in how our brain works.
Our brains seem vastly more complicated just at the neurone level.
So let us put aside the success of the transformer and, as a hypothesis, just assume it's never going to get all the way to AGI. Proponents of the transformer would point at its impressive scaling and successes there, but putting current AI success aside and assuming the critical path to AGI was simulating the human neurone more closely, how would we approach this?
The Neurone Itself
The neurone itself is a differential-equation-based system. In really simple terms, what I mean is something like the below, where its output after a single instance of stimulus proceeds over time and has a "wobble" to it. This has the benefit of not re-firing if receiving further stimulus, and this benefits the brain because it is then fairly hard for the brain to become over-excited.
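To make that concrete, here is a minimal sketch of this kind of neurone using the FitzHugh-Nagumo equations, a standard two-variable simplification of the Hodgkin-Huxley squid model. To be clear, this is an illustration rather than my actual simulator, which among other things adds the Ca ion channel mentioned earlier:

```cpp
#include <cstdio>

// Minimal sketch of a "differential equation neurone": the FitzHugh-Nagumo
// model, a two-variable simplification of the Hodgkin-Huxley (squid) system.
struct FhnNeurone {
    double v = -1.2;  // membrane potential (dimensionless units)
    double w = -0.6;  // slow recovery variable (gating-like)

    // One forward-Euler integration step with external stimulus current I.
    void step(double I, double dt) {
        double dv = v - (v * v * v) / 3.0 - w + I;
        double dw = 0.08 * (v + 0.7 - 0.8 * w);
        v += dt * dv;
        w += dt * dw;
    }
};

int main() {
    FhnNeurone n;
    for (int t = 0; t < 1000; ++t) {
        // A brief stimulus pulse; the neurone spikes once, then the slow
        // variable w drags v back towards rest -- the spike-then-fall-off
        // "wobble", with a recovery window that resists immediate re-firing.
        double I = (t >= 100 && t < 160) ? 0.8 : 0.0;
        n.step(I, 0.05);
        if (t % 25 == 0) std::printf("t=%3d v=%+.3f w=%+.3f\n", t, n.v, n.w);
    }
}
```

Plotting v over time shows the single spike followed by the damped return to rest, which is the behaviour I mean by a sense of time and rhythm.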
The neurones I have built are software based and are the product of a lot of A/B testing. The problem is we don't have a reliable training method for a simulated human neurone, as it's hard to tell how it learns: it seems to be in reaction to chemical changes, with multiple mechanisms of change, some applying changes to learn against stimulus over minutes, some hours, some days.
It's not just gradient descent, or any one method, so this has been a laborious trial-and-error process for me, looking at what ways of modifying a brain simulation help it match and learn from stimulus (a toy sketch of the multi-timescale idea follows below).
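Here is the kind of thing I mean, as a purely hypothetical sketch (not my actual training code): a single synaptic weight with two plasticity traces decaying on very different timescales.

```cpp
#include <cstdio>

// Hypothetical illustration only: a synaptic weight with two plasticity
// traces that decay on very different timescales (seconds vs. hours),
// loosely mirroring the idea of several learning mechanisms acting over
// minutes, hours, and days.
struct Synapse {
    double base = 0.5;  // long-term baseline weight
    double fast = 0.0;  // short-term trace, ~10 second time constant
    double slow = 0.0;  // slow trace, ~10 hour time constant

    double weight() const { return base + fast + slow; }

    // Apply a learning signal, then let both traces decay by dt seconds.
    void update(double signal, double dtSeconds) {
        fast += 0.10 * signal;                 // rapid reaction to stimulus
        slow += 0.001 * fast;                  // fast trace slowly consolidates
        fast -= (dtSeconds / 10.0) * fast;     // ~10 s decay
        slow -= (dtSeconds / 36000.0) * slow;  // ~10 h decay
    }
};

int main() {
    Synapse s;
    for (int t = 0; t < 60; ++t) s.update(t < 5 ? 1.0 : 0.0, 1.0);
    // After a minute the fast trace has all but vanished, while the slow
    // trace retains a small consolidated change.
    std::printf("fast=%.5f slow=%.5f weight=%.5f\n", s.fast, s.slow, s.weight());
}
```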
This multi-timescale learning should then work to ensure the "brain" never bottlenecks around a certain point. It seems advantageous over the ANN to have this behaviour.

My current tests seem to indicate that the human neurone models are really malleable and can (as in the graph below) keep on expanding laterally (i.e. not stacked, but taking part in the same calculation).
This is using the same learning mechanism I built for the squid neurones, and which I'm still working on. The fact it scales so well has been the first indicator to me that I might be working on something interesting.
Doing this test on the earlier models often led to what I describe as "bottlenecking", where the addition of more neurones no longer gives an increase in performance. This model so far really does outperform on that metric. It's also a pointer that our neurones might do something special that ANN neurones do not. Proponents of the transformer model would probably argue that it manages this by the addition of more self-attention heads, and remains scalable past what I'm able to show here, which is really tiny by comparison; but these neurones are trying to learn English natively and directly, with no tokenisation, and still show an error reduction, which is a task a transformer cannot do.
So on the point of architecture, I'm still searching around in the dark for the brain simulation's upper ends and how to architect it. It is probably worth saying that the transformer and attention mechanism are much better at doing what they do, and at scaling, than mine is.

So my first point of learning is going to be trying different neurones, configurations and variations on the current training regime. I am running this as a set test, with the Bayesian Turing test I created all those years ago being used to detect improvements in performance.
The Brain
The brain itself is very interesting. The below comes from the Nilearn Python library and represents the connections between different sub-networks in the brain. On a whim I hooked up a simulation based on its highest granularity and it scored really well. I can only say so anecdotally, because I have yet to quantify this, but it does seem to perform better when modelled as a fractal system.
There are currently no cell-level models of the human brain. We are at the level of modelling sub-networks in the brain system, by observing which parts of the brain seem to light up at the same time. In the future, cell-by-cell mapping of individual sub-networks is going to start, but anecdotally I've read that at an individual level we become more individual and random past the point of the sub-networks we have identified so far.
The models of the brain and its neural connections we have are called connectomes (which I think is cute). I have to do a deep dive on this at some point; basically, I really have not looked at even a fraction of it. It's interesting to think: are we the connections in our brain and how it networks, or are we the weights in those connections? Even when we start getting these connectome maps of the brain, we probably won't know what the key things to pull out are.
This does seem to follow what the research into the brain says: it is a complex fractal model, and the fact of two given neurones not being connected does seem to matter. An artificial neural network like ChatGPT has attention heads and a layered encoder as input for the query, with a decoder to extrapolate output; by comparison, it is hard to model the complex fractal topography of something like our brain, which does not seem to have "layers" per se in the way an ANN does. In a sense, everything that fires together wires together in a brain, irrespective of its location, and that makes a problem of sorting out an artificial simulation of one.

Following from the above being true: the brain is a fractal system that manages to represent inputs and outputs at multiple levels of granularity, due to its fractal nature. The problem is that gaining an understanding of that fractal nature is likely to be quite hard to quantify.
For example, below are two graphs of similar shape in how I connect the neurones in the simulation together, with a small difference made by adding or omitting gaps in the connection formula; one has a decidedly better jump in quality than the other.
It might be a bonkers idea, but potentially the shape of the brain really does constrain and change the maths that happens inside it. It sort of creates a problem: are we our neurones, with the strength of connection between individual neurones being the main feature that determines performance, or is it the overall structure? You could make an argument that structure is more a genetic factor and strength of connection more environmental, but when you realise the whole of this thing evolved over millions of years, and can probably change the strength of a connection almost in real time while also rewiring itself (given what we are just discovering about neuroplasticity), you realise these mathematical simulations are amazing but also highly limited.
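To make the "gaps in the connection formula" idea concrete, here is a hypothetical sketch (not the formula from my simulator) of two wiring rules of near-identical shape, one of which omits every other link:

```cpp
#include <cstdio>
#include <vector>

// Hypothetical sketch: two wiring rules on a ring of neurones. Rule A
// connects each neurone to every neighbour within `reach`; rule B uses the
// same reach but a stride of 2, leaving gaps -- a sparser, longer-range
// pattern produced by an almost identical formula.
std::vector<std::vector<int>> wire(int n, int reach, int stride) {
    std::vector<std::vector<int>> adj(n);
    for (int i = 0; i < n; ++i)
        for (int d = 1; d <= reach; d += stride) {  // stride 2 leaves gaps
            adj[i].push_back((i + d) % n);          // forward neighbour
            adj[i].push_back((i - d + n) % n);      // backward neighbour
        }
    return adj;
}

int main() {
    auto denseRing  = wire(1000, 8, 1);  // rule A: no gaps, 16 links each
    auto gappedRing = wire(1000, 8, 2);  // rule B: every other link omitted
    std::printf("A: %zu links/neurone, B: %zu links/neurone\n",
                denseRing[0].size(), gappedRing[0].size());
}
```

The point is simply that two formulas this close together can produce measurably different networks, which is why the topography question seems worth chasing.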


Summary
So in all likelihood I'm not going to build Skynet (a boy can only dream). But I feel I can have a play at building a more in-depth model of a brain than current ANN technology, and just for fun I'm going to have a go at doing that.
I have identified a few areas I'm working on to explore building out this simulation.
One is the neurones themselves. Not really knowing how to train them, I'm going to run long-term A/B testing between different training methods on their differential equations, looking for reductions in error and any spontaneous evidence of "grokking" and the formation of words. (Don't worry, it hasn't done this yet, but the code would alert me if it did.)
Secondly, I'm going to deep-dive on connectomes and see if the shape of the brain really does affect its performance; I need to try this both with artificial shapes and with the network topography described in connectomes. What I've seen so far says yes, it does seem to matter a lot, and it's not enough to stack them like we do in artificial neural networks.
Finally, I may need to find a new phrase rather than calling them brain simulations; it's really clear there is a gap between ANNs and our full brain, and someday someone might get annoyed. Plus I'm not sure I want to call them BS... I'll have to think on that one...
References
Vaswani et al., "Attention Is All You Need", arXiv:1706.03762
Speeding up transformer training by increasing model size