"Life is not determined by consciousness but consciousness by life" - Karl Marx
So I started a series called Hello World a year ago on this blog, and at the time I knew an "OK" amount about neural networks but not about self attention or transformers. I therefore went my own way in building my own model, focused heavily on recurrent neural networks, and may have kind of jumped the shark a bit by thinking in terms of consciousness.
At the time this felt somewhat justified, as I can see several parallels between recurrent neural networks and how we talk about thought, such as rivers of thought etc.
It was about this time that this happened, so I thought it best to keep my stupid ideas to myself; but in the intervening time ChatGPT went viral and I started getting better at building my models. So I thought I would return to this crazy idea of, gulp, AI consciousness...
Google engineer put on leave after saying AI chatbot has become sentient | Google | The Guardian
This is not to say this is a definitive guide to the subject; it is more that 1) I spent a lot of time learning about recurrent neural networks and got pretty good at building them, and 2) we don't have any idea what consciousness is...
Because of point 2 you cannot really say anything definitively, but you can propose an idea.
So I am categorically not saying "hey, I built a conscious AI"; it is merely that, upon self reflection, I thought a blog post on this topic would be useful as a conclusion to this series on this blog. But also, being an empiricist, I think the question of minds in the age of AI might be best approached by going and building the mind, then critiquing it and seeing if it works... If it doesn't work, well, that's one idea less to explore. Compare that to philosophers prior to computers: if they had an idea about the mind and they were wrong, there was very little you could do to argue with them.
It sounds arrogant to say I am going to try and build a model of artificial consciousness, but I think this is less arrogant than me quoting dead philosophers at you and insisting you take them seriously. What is more pompous: quoting Nietzsche and claiming to have some insight into the nature of reason itself, or trying to build a machine that does the same and claiming you might have an idea how a mind works because you built one?
I think if someone thinks I am a bit crazy, hey, we can sit down and discuss what you would change, and then we can rerun the test and see if the AI performs better on an agreed-upon test. That way we should eventually either get conscious machines or agree it's impossible and stop.
In Earlier Episodes
First in the series, I tried to get memory in a neural network to work. This version of the network did not, in the end, share any relationship with the final network.
Hello World | A Logic Called Joe (webador.co.uk)
I developed a test for intelligence. I called it the Bayesian Turing test; it tests the entropy of the AI saying the things it says. In even plainer terms, a score of 1 means there was a 1/10 chance that the AI would match what you are saying; a score of 6 means 1/1,000,000.
At the time I thought a score of 100 or so was really good.
https://alogiccalledjoe.webador.co.uk/798739_hello-world-measures-of-language-skill
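To make the scoring concrete, here is a minimal sketch of how I read that description: the score is the summed negative log10 probability of the AI producing the words that matched, so 1 point is a 1-in-10 chance and 6 points a 1-in-a-million. The function name and the example probabilities here are illustrative, not the actual test code.

```python
import numpy as np

def bayesian_turing_score(match_probs):
    """Sum of -log10(p) over the words where the AI matched the target text.

    A score of 1 means there was a 1/10 chance of the AI saying that,
    a score of 6 means a 1/1,000,000 chance, and so on.
    """
    match_probs = np.asarray(match_probs, dtype=float)
    return float(np.sum(-np.log10(match_probs)))

# Three matched words the AI gave probabilities 0.1, 0.01 and 0.001 to:
print(bayesian_turing_score([0.1, 0.01, 0.001]))  # -> 6.0
```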
I then ran the test.
Hello World - A AI Speaks | A Logic Called Joe (webador.co.uk)
I then used the test to rule out some network designs, such as the blobs.
Hello World - Blob Intelligence | A Logic Called Joe (webador.co.uk)
I then sort of had a conclusion but deleted it because it was very trippy and artsy, and I had some numbers showing the AI could in fact learn English, and that was kind of like consciousness (!?!). I did not like that it was only a measure of the AI; I felt like I had not really "proven" anything, nor could I say "well, I think consciousness is X". I had, and still hold, the belief that consciousness is a wave that oscillates across neurones, which individually change it, but which across the whole network is more than the sum of its parts, being a wave or oscillation that picks up all the inputs across the network and is greater than a simple neural network.
Not a Transformer
So I am not 100% sure on this, but I do not think I am using a transformer. It does not use self attention; maybe that would improve its performance.
It is recurrent and therefore its inputs are a set of outputs it is capable of passing back to itself. I like this design; it feels very much like an Ouroboros, the snake eating itself.
There is lots of regularisation, so much so that I no longer know what specific type of ridge regression it is.
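As a rough illustration of what I mean by the Ouroboros loop and the ridge-style shrinkage, here is a toy sketch with made-up sizes and penalty strength; it is not the actual network.

```python
import numpy as np

rng = np.random.default_rng(0)
n_state, n_out = 64, 16          # toy sizes, not the real network's
lam, lr = 1e-3, 1e-2             # made-up ridge penalty and learning rate

W = rng.normal(scale=0.1, size=(n_state, n_state + n_out))
state, output = np.zeros(n_state), np.zeros(n_out)

def step(W, state, output):
    # The previous output is appended to the input, so what the network
    # "said" last time flows straight back in: the snake eating its tail.
    x = np.concatenate([state, output])
    state = np.tanh(W @ x)
    return state, state[:n_out]   # part of the new state is read off as output

for _ in range(100):
    state, output = step(W, state, output)
    # Ridge-style regularisation: shrink the weights a little every step,
    # on top of whatever error-driven update the real network applies.
    W -= lr * lam * W
```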
I tried connecting the network at random and I tried to teach myself graph theory. The below is what happens when you give a neurone a random location in an X and Y grid and connect it to its closest neighbour.

Though if you repeat the test and wire them up completely at random, with no sense of space, they also do not perform any better; in fact they perform worse. So, two conclusions about architecting big neurone networks: they A) do not appear to be connected entirely at random, but B) do appear to have a space metric.
This makes sense: a neurone is more likely to be connected to neurones next to it than to ones really far from it, but the wiring schematic is also not random. The schema I ended up using is not random and takes into account a certain space, so signals propagate forward through the network and across it. I hoped this fractal nature would let it have signals that were recurrent and would cycle back into the network but could be kept from affecting the output, and therefore you could have an AI thinking about what would happen next without having to output it (i.e. say it).
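Below is a small sketch of the two wiring experiments as described: neurones dropped at random X and Y locations and joined to their closest neighbour, versus pairing them up with no sense of space at all. It illustrates the idea, not the schema I ended up with.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Scheme A: give every neurone a random (x, y) location and connect it
# to its nearest spatial neighbour.
coords = rng.random((n, 2))
dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
np.fill_diagonal(dists, np.inf)           # a neurone is not its own neighbour
nearest_neighbour = dists.argmin(axis=1)  # index of the closest neurone

# Scheme B: wire completely at random, with no sense of space at all.
random_partner = rng.integers(0, n, size=n)
```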
I also would not train it in wake mode with inputs from live data; it would just train by moving these signals, which were not connected to its output, around its head. The words it was trying to guess were from a book, and I tested it on when its complete word output was the same as in the book.
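Continuing the toy sketch above (it reuses step(), W, state and output from that block), this is roughly what I mean by sleep-mode training: nothing comes in from outside, the internal signals just keep circulating, and the check is whether the whole word the network emits lines up with the next word of the book. The little book and the crude argmax "decoder" are stand-ins, not the real setup.

```python
# Continuing the toy sketch above: no external input, just internal signals,
# scored on whole-word matches against a book.
book = ["it", "was", "a", "dark", "and", "stormy", "night"]
vocab = sorted(set(book))

matches = 0
for target_word in book:
    state, output = step(W, state, output)   # nothing comes in from outside
    guess = vocab[int(np.argmax(output[:len(vocab)]))]
    matches += (guess == target_word)
print(f"{matches}/{len(book)} whole-word matches against the book")
```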
I would use my Bayesian Turing test approach and a measure of error.
So in summary you would get a fractal AI that "speaks" in English, has no ears to hear, and is measured on the unlikelihood of it saying the things it said while coinciding with the book it was targeted against. And I spent the next year getting the damn thing to work... I ran 480 versions... which included 23,444 different networks of various sizes and traits... but below are the results...
The Test
So that got about 138369.14915966179, which would mean the likelihood of it saying the things it said was 1/(10^138369.14915966179), which, as you can guess, fills out a whole page. It actually no longer improves if I add actual inputs, and it is improving both on its test and showing a reduction in its error over time, being both more accurate in terms of words that line up with the text and more accurate by being closer to "right".
The reason some of the numbers look close to zero is that they might be around 9k, while at the 140k values the network starts counting the page numbers.
I still have a few tests left: A) making sure it is really scalable, as you can see that is not a massive number of neurones, and a few more alternatives to try to get the numbers higher. When I am done I want to try and get it to write a book, which might not happen. Just because I can detect English in this AI, it doesn't mean the words in between the points where it lines up are particularly good, and that is where I am at: it probably needs both size and more data.
There are about 30 million entropy points in the book (so the max score of this test is 30 million) and it got 130-140k, but from this we can estimate how large a network would have to be to absorb all that data (in reality that would be impossible). Originally I estimated you would need a trillion neurones; using the numbers you can see in the appendix, and with these improvements, I got it down to 7,500 neurones that would in theory just absorb all the information in that book.
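For what it is worth, here is the kind of back-of-the-envelope extrapolation I mean, using the fitted line from the appendix (score ≈ coef · X + intercept). How the two swept settings map onto an actual neurone count is an assumption here, so this will not reproduce the 7,500 figure exactly, though it lands in the same ballpark.

```python
import numpy as np

# Fitted line from the appendix: score ~ coef . [setting_1, setting_2] + intercept
coef = np.array([3000.11619518, 1079.36811155])
intercept = 14680.177344236683

target = 30_000_000   # total entropy points in the book (the maximum score)

# If both size settings are scaled up together, how far does the straight
# line say you need to go before the predicted score covers the whole book?
steps_needed = (target - intercept) / coef.sum()
print(f"~{steps_needed:,.0f} size steps on the fitted line")  # roughly 7,350
```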

Conclusion
So to me that feels like I can say this is in the ballpark of what a conscious AI would look like, and a decent working prototype for saying I think consciousness is like a wave transmitted back into a network of neurones. Maybe not definitively, but you can see why I feel that I can sort of say recurrence in a neural network is sort of like consciousness.
- Both have a flow: we talk about a river of consciousness, and the AI output flows back in as its input.
- Both the AI and a person learn English by exposure; it only gets "exposed", but it ends up "thinking" in English.
- Learning is based on error propagation and sleep-mode learning. In essence, the simple way to put it would be that the AI is "dreaming" in English.
- I feel like when someone expresses a theory about the mind I don't like, I can suggest they read this blog post and ask that they build something similar :-).
Conclusions about the AI itself.
- It is very small in many ways, so the insistence on big and massive language models might be wrong; in fact it gets a massive jump in intelligence at the start, but thereafter improvement slows. I would argue there is little "proof" that big massive networks are always the way to go. It is likely they are better at generalising, but this shows AI can be beautiful.
- Big leaps in intelligence were made when I found and learned new maths principles.
- I tried a hell of a lot of bad maths principles; I don't think the conventional wisdom on AI is as "good" as people say. I think a lot of the ideas out there are probably right for them, or maybe for niche uses, which might indicate that future AI developments will see AI become more niche and tailored to specific uses.
- It didn't need self attention.
- I am still trying to think about applications for my sleeping AI (anyone want to see what it does predicting stocks?).
- It did not need data while being trained in a sleep state.
- I did not mention anything about specific long-term memory and I did not need to use it.
To return to the above quote, "Life is not determined by consciousness but consciousness by life" - Karl Marx
I suggest that this test is in keeping with that quote: that merely pushing the error of the network up and down and recycling it back to itself worked at all suggests something about the mind; that it spawned words at all should be reflected upon.
If the AI is at all conscious, or even in the ballpark (or even in the car park of the ballpark), then how we think might be determined by the sum of all the feedback, good and bad, in a life. If your mind is recurrent like this AI is, then your thoughts feed back into you, meaning you also might be a strange oscillating wave summing up the ups and downs of a long life and trying to predict them.
Questions about modern philosophy
- I think in this network the output re-enters and echoes in the network, being changed, and it is the change to this signal over time that matters. Well, are you the weights and strengths of the individual neurones that change that signal, or the signal taken as a whole itself? I would suggest both, which would preclude being overtly Gnostic about existence. The signal goes with the body, but you cannot pull out one single neurone and call it the whole of the wave happening in the head.
- There is an outside world to the AI and that determines its inner reality, but the AI's own mind does not have any real connection to the text world it inhabits. The real world might exist, but we might be completely disconnected from reality, only getting limited signals back from the "real world".
- It nonetheless predicts its outer world and starts to match up with it, and its inner world is determined by outside signals.
- Maybe imagination or thought is just this: a big brain trying to create a simulation of an outside world, which gives an evolutionary advantage in knowing how the outer world works. This just needs to be an approximation, or a story, that lets the brain build a response to outside reality.
- This test shows a network which looks like it can explain that outside reality fairly and surprisingly well, which would give us some comfort that big-brained apes like ourselves might have a pretty good grasp on reality and it not all just be a simulation.
- Alternatively it could all be a simulation, but I prefer the above: the world is real, we are just peeking at it through the tools evolution gives us. But yes, maybe that wall only renders in the simulation when I look at it...
- It didn't need self attention; maybe the mind is not that egotistical after all...
- You can build a mind in Python and dare someone to tell you you're wrong... I look forward to hearing about what you might write...
I like that as a set of philosophical ideas. I feel some peace with the idea that I might be a wave in an electrical system while also being part of it. If you were a materialist you would emphasise the role of the body, i.e. the weights and neurones in that network, in sustaining that wave; if you were overtly spiritualist you might question whether the wave is inseparable from the body, and you might call it a soul.
I just like that I can pose these questions by writing code in Python.
Now that the above is over and I have tried 20k+ neural networks, I am going to make an economic game about the Fermi paradox for a bit...
Appendix
Below is a data dump of the test data showing its regression and the increase in efficiency with size. Further reading for any article referenced comes afterwards.
version: 472
Highest Score: 138369.14915966179
depth inputs error
0 0 0 4.878868
1 1 0 2.968600
2 0 1 2476.708017
3 1 1 1669.874351
4 2 0 7.141319
Total no of training examples (m) = 189
X = [3. 1.] , y = 9566.563794833983
X = [3. 2.] , y = 65205.09182250079
X = [2. 3.] , y = 82409.43959997174
X = [3. 3.] , y = 113968.47477559047
X = [4. 1.] , y = 72606.3668652366
X = [5. 1.] , y = 91626.38583435582
X = [4. 2.] , y = 3144.848556758294
X = [5. 2.] , y = 116014.73588987859
X = [4. 3.] , y = 114424.94937608176
X = [5. 3.] , y = 3310.8715097990034
X = [2. 4.] , y = 42968.434814248256
X = [3. 4.] , y = 118814.85912514957
X = [4. 4.] , y = 106869.08655737233
X = [5. 4.] , y = 89091.02331575434
X = [2. 5.] , y = 16437.067424994115
X = [3. 5.] , y = 12947.182314001251
X = [4. 5.] , y = 108457.22531486444
X = [5. 5.] , y = 5366.389020306608
X = [7. 1.] , y = 34094.528558420236
X = [6. 2.] , y = 88281.44648343184
X = [7. 2.] , y = 87139.56096052323
X = [6. 3.] , y = 9000.725925026805
X = [7. 3.] , y = 8898.119041392081
X = [7. 4.] , y = 132388.59581942507
X = [6. 5.] , y = 9124.041409719344
X = [7. 5.] , y = 15350.715354659427
X = [0. 6.] , y = 3182.0766901304514
X = [2. 6.] , y = 29349.063997536217
X = [4. 6.] , y = 93031.53622286531
X = [5. 6.] , y = 135979.8832556416
X = [6. 6.] , y = 3082.3121320583277
X = [7. 6.] , y = 117198.53911816051
X = [2. 7.] , y = 107020.53394200612
X = [3. 7.] , y = 10867.729272997292
X = [4. 7.] , y = 51788.355486996385
X = [5. 7.] , y = 35394.95617048071
X = [7. 7.] , y = 129339.08916418768
X = [9. 2.] , y = 112045.5676508617
X = [8. 3.] , y = 56288.022391019236
X = [9. 3.] , y = 13736.379661959445
X = [8. 4.] , y = 131511.76123903572
X = [9. 4.] , y = 89655.00660408262
X = [8. 5.] , y = 138193.97187169918
X = [9. 5.] , y = 128869.5575223692
X = [8. 6.] , y = 49590.78985388671
X = [9. 6.] , y = 8213.74302490367
X = [8. 7.] , y = 87673.95728305289
X = [9. 7.] , y = 132927.56219593727
X = [4. 8.] , y = 9216.725592882356
X = [6. 8.] , y = 111207.4098643263
X = [7. 8.] , y = 5795.357072089425
X = [9. 8.] , y = 3040.632863578431
X = [2. 9.] , y = 8999.374385255443
X = [3. 9.] , y = 5417.4384116706715
X = [4. 9.] , y = 138123.5629335798
X = [5. 9.] , y = 83962.54401976183
X = [6. 9.] , y = 45516.044447590095
X = [7. 9.] , y = 17208.35148041292
X = [9. 9.] , y = 14744.162435669286
X = [11. 1.] , y = 114971.22357695828
X = [10. 2.] , y = 118990.03661449268
X = [11. 2.] , y = 8890.699133070908
X = [11. 3.] , y = 79122.22495996919
X = [10. 4.] , y = 130408.15553648185
X = [10. 5.] , y = 9280.632959781049
X = [10. 6.] , y = 25397.848861917137
X = [11. 6.] , y = 105480.83261263272
X = [10. 7.] , y = 10802.01133361254
X = [11. 7.] , y = 13019.00781979702
X = [10. 8.] , y = 33692.94117779898
X = [11. 8.] , y = 125824.55555691766
X = [10. 9.] , y = 131485.9446656902
X = [ 2. 10.] , y = 131560.92036773593
X = [ 3. 10.] , y = 119453.11399189023
X = [ 5. 10.] , y = 47512.640727567654
X = [ 7. 10.] , y = 88644.38108138795
X = [ 8. 10.] , y = 30246.964221699887
X = [ 9. 10.] , y = 14013.08030862231
X = [10. 10.] , y = 88169.10760896635
X = [11. 10.] , y = 94316.69177493246
X = [ 2. 11.] , y = 85473.16304701436
X = [ 6. 11.] , y = 123511.94874658658
X = [10. 11.] , y = 7964.701156701175
X = [11. 11.] , y = 121849.65438792405
X = [12. 1.] , y = 116279.18559986212
X = [13. 1.] , y = 86141.64521766554
X = [12. 2.] , y = 80313.97012410958
X = [13. 2.] , y = 36150.39172677101
X = [12. 3.] , y = 78658.19598601444
X = [13. 3.] , y = 136734.89831694006
X = [12. 6.] , y = 59197.85025980792
X = [12. 7.] , y = 118680.77783900888
X = [12. 8.] , y = 22199.980039172216
X = [13. 8.] , y = 132583.97176881862
X = [12. 9.] , y = 41156.91431704723
X = [13. 9.] , y = 8863.441298183152
X = [12. 10.] , y = 13626.336180715169
X = [13. 10.] , y = 128388.27187253747
X = [13. 11.] , y = 29632.15235107567
X = [ 0. 12.] , y = 3056.305119019205
X = [ 2. 12.] , y = 40108.854818769025
X = [ 3. 12.] , y = 72717.08977594094
X = [ 4. 12.] , y = 8985.173305230319
X = [ 5. 12.] , y = 111484.6510209142
X = [ 6. 12.] , y = 3608.1292211077425
X = [ 7. 12.] , y = 123626.77104264348
X = [ 8. 12.] , y = 92189.3708715345
X = [ 9. 12.] , y = 64925.533904480064
X = [10. 12.] , y = 135928.3381624451
X = [11. 12.] , y = 36467.14452979713
X = [12. 12.] , y = 6191.93915042447
X = [13. 12.] , y = 138369.14915966179
X = [ 4. 13.] , y = 102300.9740080176
coef= [3000.11619518 1079.36811155]
intercept= 14680.177344236683
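For anyone wanting to reproduce the coef= and intercept= lines in these dumps, my assumption is that they come from an ordinary least-squares fit of the score against the two swept settings, something like the snippet below (only the first few rows are shown).

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# A guess at how the coef=/intercept= lines are produced: a plain linear fit
# of the highest score against the two size settings in X (first rows only).
X = np.array([[3, 1], [3, 2], [2, 3], [3, 3], [4, 1]])
y = np.array([9566.56, 65205.09, 82409.44, 113968.47, 72606.37])

reg = LinearRegression().fit(X, y)
print("coef=", reg.coef_)
print("intercept=", reg.intercept_)
```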
hard error
Total no of training examples (m) = 189
X = [0. 0.] , y = 2635722.8167143273
X = [1. 0.] , y = 2639210.1615216983
X = [0. 1.] , y = 1202699.9264540959
X = [1. 1.] , y = 1261636.6080598165
X = [2. 0.] , y = 2640453.0907667084
X = [3. 0.] , y = 1173052.6959436608
X = [2. 1.] , y = 1089646.1506698916
X = [3. 1.] , y = 1538060.5596073207
X = [0. 2.] , y = 1314364.867192729
X = [1. 2.] , y = 1198141.7140967904
X = [2. 2.] , y = 1095392.2654636782
X = [3. 2.] , y = 1458352.858116253
X = [0. 3.] , y = 1381897.906228549
X = [1. 3.] , y = 1122929.1902553525
X = [2. 3.] , y = 1440945.3343051982
X = [3. 3.] , y = 1250282.8796052039
X = [4. 0.] , y = 2618420.559134831
X = [5. 0.] , y = 1154057.4604568828
X = [4. 1.] , y = 1452146.1309685244
X = [5. 1.] , y = 1423620.5484085556
X = [4. 2.] , y = 1141403.8506548605
X = [5. 2.] , y = 1252059.4775305407
X = [4. 3.] , y = 1372523.086760373
X = [5. 3.] , y = 1115130.064020463
X = [0. 4.] , y = 1216082.3503326897
X = [1. 4.] , y = 1097456.9256860658
X = [2. 4.] , y = 1193976.2196609825
X = [3. 4.] , y = 1360382.1464357108
X = [4. 4.] , y = 1387918.7840113414
X = [5. 4.] , y = 1429277.191842464
X = [0. 5.] , y = 1095684.4862529237
X = [1. 5.] , y = 1077735.134004862
X = [2. 5.] , y = 1160677.6085049782
X = [3. 5.] , y = 1154077.4939615023
X = [4. 5.] , y = 1384767.7134532288
X = [5. 5.] , y = 1127173.5611937519
X = [6. 0.] , y = 1147838.304821147
X = [7. 0.] , y = 1124570.8524489175
X = [6. 1.] , y = 1235828.9665541034
X = [7. 1.] , y = 1487429.8114059572
X = [6. 2.] , y = 1430708.067700147
X = [7. 2.] , y = 1229770.462452774
X = [6. 3.] , y = 1556236.8771872704
X = [7. 3.] , y = 1576434.2782247204
X = [6. 4.] , y = 1103153.7451969294
X = [7. 4.] , y = 1271136.167250442
X = [6. 5.] , y = 1550273.72351167
X = [7. 5.] , y = 1158959.5667930476
X = [0. 6.] , y = 1154151.0653839966
X = [1. 6.] , y = 1104294.0880938787
X = [2. 6.] , y = 1179653.4082363173
X = [3. 6.] , y = 1380804.4651211137
X = [4. 6.] , y = 1234256.7887773744
X = [5. 6.] , y = 1297665.8260364668
X = [6. 6.] , y = 1135184.7459107884
X = [7. 6.] , y = 1253127.3135980114
X = [0. 7.] , y = 1188905.8890278346
X = [7. 6.] , y = 1293990.5805188157
X = [1. 7.] , y = 1281223.0144374212
X = [1. 7.] , y = 1129553.4620209131
X = [2. 7.] , y = 1387725.7516956942
X = [3. 7.] , y = 1148376.276996805
X = [4. 7.] , y = 1202376.738080193
X = [5. 7.] , y = 1486123.104787519
X = [6. 7.] , y = 1073514.2184994367
X = [7. 7.] , y = 1324599.9760769857
X = [8. 0.] , y = 2640541.4474168113
X = [9. 0.] , y = 1131224.212631044
X = [8. 1.] , y = 1184124.1320524681
X = [9. 1.] , y = 1080277.376525605
X = [8. 2.] , y = 1062624.5802637094
X = [9. 2.] , y = 1248701.2687316302
X = [8. 3.] , y = 1466332.6095307115
X = [9. 3.] , y = 1155804.2297979721
X = [8. 4.] , y = 1313977.9004910118
X = [9. 4.] , y = 1428147.1027221808
X = [8. 5.] , y = 1284438.151000022
X = [9. 5.] , y = 1327093.2360181948
X = [8. 6.] , y = 1200501.712227144
X = [9. 6.] , y = 1139909.6862911743
X = [8. 7.] , y = 1230371.6352943105
X = [9. 7.] , y = 1308013.3738745793
X = [0. 8.] , y = 1229530.725915853
X = [1. 8.] , y = 1242417.332115103
X = [2. 8.] , y = 1062502.652770672
X = [3. 8.] , y = 1109061.4349668382
X = [4. 8.] , y = 1545874.007029652
X = [5. 8.] , y = 1080372.431807124
X = [6. 8.] , y = 1379168.5679798
X = [7. 8.] , y = 1129698.9231006254
X = [8. 8.] , y = 1314654.6875808425
X = [9. 8.] , y = 1168619.466967822
X = [0. 9.] , y = 1108181.2717245582
X = [1. 9.] , y = 1320372.0421037232
X = [2. 9.] , y = 1556709.5660412996
X = [3. 9.] , y = 1128209.2241582563
X = [4. 9.] , y = 1281406.009183678
X = [5. 9.] , y = 1438495.3550056836
X = [6. 9.] , y = 1476386.9344535088
X = [7. 9.] , y = 1162244.3098392885
X = [8. 9.] , y = 1080411.76128465
X = [9. 9.] , y = 1158028.7412961128
X = [10. 0.] , y = 2637069.166885816
X = [11. 0.] , y = 1169827.9342722648
X = [10. 1.] , y = 1306589.6313930573
X = [11. 1.] , y = 1371274.935933917
X = [10. 2.] , y = 1255222.396491431
X = [11. 2.] , y = 1578595.4071948666
X = [10. 3.] , y = 1252345.9120413673
X = [11. 3.] , y = 1223161.930161239
X = [10. 4.] , y = 1318303.4944759398
X = [11. 4.] , y = 1109078.0122514833
X = [10. 5.] , y = 1143657.1979010927
X = [11. 5.] , y = 1303367.2789655162
X = [10. 6.] , y = 1174706.188940723
X = [11. 6.] , y = 1390464.039623091
X = [10. 7.] , y = 1148256.2136104843
X = [11. 7.] , y = 1521510.1091040906
X = [10. 8.] , y = 1487969.5336728063
X = [11. 8.] , y = 1338983.240811119
X = [10. 9.] , y = 1314061.5457461267
X = [11. 9.] , y = 1069032.8973190428
X = [ 0. 10.] , y = 1062741.5818170873
X = [ 1. 10.] , y = 1312311.3658996108
X = [ 2. 10.] , y = 1269544.888038308
X = [ 3. 10.] , y = 1358427.5184805843
X = [ 4. 10.] , y = 1113217.357924923
X = [ 5. 10.] , y = 1198301.8903467888
X = [ 6. 10.] , y = 1094053.2557445224
X = [ 7. 10.] , y = 1429936.9744139577
X = [ 8. 10.] , y = 1180613.0438385375
X = [ 9. 10.] , y = 1156497.6091449412
X = [10. 10.] , y = 1430994.7429615366
X = [11. 10.] , y = 1415623.783242841
X = [ 0. 11.] , y = 1097944.422068236
X = [ 1. 11.] , y = 1124588.6475595478
X = [ 2. 11.] , y = 1228394.5346998533
X = [ 3. 11.] , y = 1208327.3545572015
X = [ 4. 11.] , y = 1169555.465186814
X = [ 5. 11.] , y = 1241824.3993182774
X = [ 6. 11.] , y = 1345176.0471537833
X = [ 7. 11.] , y = 1073333.3992190103
X = [ 8. 11.] , y = 1308482.3117476346
X = [ 9. 11.] , y = 1114793.9631760975
X = [10. 11.] , y = 1138774.2045903485
X = [11. 11.] , y = 1258038.7487528066
X = [12. 0.] , y = 1109256.883283398
X = [13. 0.] , y = 2614629.8368673916
X = [12. 1.] , y = 1368153.609829761
X = [13. 1.] , y = 1228977.5174255578
X = [12. 2.] , y = 1224179.6833911974
X = [13. 2.] , y = 1485380.6728413736
X = [12. 3.] , y = 1445816.1716288896
X = [13. 3.] , y = 1287774.9318226085
X = [12. 4.] , y = 1084129.5978588758
X = [13. 4.] , y = 1089591.0663341947
X = [12. 5.] , y = 1189946.4189258905
X = [13. 5.] , y = 1124698.127611977
X = [12. 6.] , y = 1208242.3795791594
X = [13. 6.] , y = 1128289.234842109
X = [12. 7.] , y = 1360875.087987771
X = [13. 7.] , y = 1245684.8642205908
X = [12. 8.] , y = 1170607.9822807817
X = [13. 8.] , y = 1271652.642813368
X = [12. 9.] , y = 1192187.869562545
X = [13. 9.] , y = 1579873.671793603
X = [12. 10.] , y = 1155587.516359128
X = [13. 10.] , y = 1265570.8555182249
X = [12. 11.] , y = 1104922.4823168821
X = [13. 11.] , y = 1179934.5532606845
X = [ 0. 12.] , y = 1166017.1561054387
X = [ 1. 12.] , y = 1063940.187303341
X = [ 2. 12.] , y = 1481550.5164028911
X = [ 3. 12.] , y = 1452057.9771865313
X = [ 4. 12.] , y = 1556428.746825093
X = [ 5. 12.] , y = 1378776.138035412
X = [ 6. 12.] , y = 1117771.2887416582
X = [ 7. 12.] , y = 1344594.87832022
X = [ 8. 12.] , y = 1421991.1729128284
X = [ 9. 12.] , y = 1212600.4285643483
X = [10. 12.] , y = 1297350.757972783
X = [11. 12.] , y = 1187573.4897871967
X = [12. 12.] , y = 1131901.61680325
X = [13. 12.] , y = 1285678.4254394094
X = [ 0. 13.] , y = 1366968.6043541634
X = [ 1. 13.] , y = 1272000.916059303
X = [ 2. 13.] , y = 1090552.4418671946
X = [ 3. 13.] , y = 1092181.795181028
X = [ 4. 13.] , y = 1395873.3036393106
coef= [ -2121.89112905 -23068.88648205]
intercept= 1462303.661173793
Further Reading
The big white paper on the transformer architecture. When I say it is not a transformer because it does not use self attention, this is what I mean.
What I mean by wake or sleep mode