
Download your free copy of the latest Financial Technologist magazine here.
Neural Networks and Artificial Intelligence
I’m co-leading the FIX Protocol AI Working Group with Rebecca Healy for the FIX Protocol Organisation, but this is not my first encounter with machine learning. In the 1980s, I completed a university degree that included a module on machine learning. This was just after the first AI Winter, and we were using early machine-learning techniques, including backpropagation, to train a neural network to recognise handwritten letters and numbers.
Then, as now, training a neural network involved three important steps:
1. Identifying the problem you are trying to solve, in this case recognising handwritten characters.
2. Selecting a model to train; our tutors chose an early neural network trained with backpropagation, the late-70s sensation.
3. Obtaining the data to train the model on, made much easier in this case as it was all provided by my university professor, drawn from a forerunner dataset like the NIST SD-1 database.[1]
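The backpropagation technique mentioned above can be sketched minimally. This is an illustrative toy, not the original coursework: the XOR task stands in for character recognition, and the network size, learning rate, and epoch count are assumptions chosen for brevity.

```python
import math
import random

random.seed(0)

# Step 1: the question -- here, a toy stand-in task (learning XOR).
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
Y = [0.0, 1.0, 1.0, 0.0]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Step 2: the model -- a tiny 2-2-1 network trained with backpropagation.
H = 2
w1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(H)]
b1 = [0.0] * H
w2 = [random.uniform(-1, 1) for _ in range(H)]
b2 = 0.0

def forward(x1, x2):
    h = [sigmoid(w1[j][0] * x1 + w1[j][1] * x2 + b1[j]) for j in range(H)]
    o = sigmoid(sum(w2[j] * h[j] for j in range(H)) + b2)
    return h, o

def total_error():
    return sum((forward(x1, x2)[1] - y) ** 2 for (x1, x2), y in zip(X, Y))

# Step 3: the data -- hard-coded here; in practice, the curated dataset.
before = total_error()
lr = 0.5
for _ in range(5000):
    for (x1, x2), y in zip(X, Y):
        h, o = forward(x1, x2)
        d_o = (o - y) * o * (1 - o)                # error signal at the output
        for j in range(H):
            d_h = d_o * w2[j] * h[j] * (1 - h[j])  # error propagated backwards
            w2[j] -= lr * d_o * h[j]
            w1[j][0] -= lr * d_h * x1
            w1[j][1] -= lr * d_h * x2
            b1[j] -= lr * d_h
        b2 -= lr * d_o

after = total_error()
print(before, after)  # training should reduce the squared error
```

The same loop, scaled up by many orders of magnitude in parameters, data and compute, is recognisably the ancestor of today's model training.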
While my tutors made the three steps easy for us undergraduate students, they left me with a conceptual framework I still use today.
Question, Model, Data
- Question – what question am I asking?
- Model – which techniques am I trying to use to train my model to answer my question?
- Data – what data am I training my model on?
After my undergraduate degree, while neural networks endured a couple of AI winters[2], Moore's law continued to increase the power of computers, the speed of networks, and the bandwidth and speed of processing-unit interconnects so dramatically that a modern AI computer is at least a trillion times faster than the 6502 on which I trained a network to recognise handwritten characters.
This increase in performance, combined with numerous advances in the types of models available to train, has led us to ChatGPT et al. and to the point where we are debating, as a society, how close we are to Artificial General Intelligence and what its consequences are.
This debate is happening while we deploy the latest hardware, models and techniques to address a wide range of problems, but one thing remains unchanged: this whole pyramid, with the question on top and the model in the middle, rests on a foundation of data.
Data as the Foundation of Machine Learning
Whilst all three layers of the pyramid are equally important, the importance of correctly curated data is often overlooked. In much the same way the images in the handwritten-character dataset were anti-aliased to greyscale to facilitate processing, modern datasets need to be carefully curated. For example, if your dataset consists of prices and orders, are you sure that the time source for the prices and the time source for the orders use the same clock, synchronised so that you can be certain of the order of events?
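The clock question above can be made concrete with a small sketch. The event structure and the sample timestamps below are illustrative assumptions: the point is that interleaving two feeds by timestamp is only meaningful if both feeds stamp events from clocks synchronised to a common reference.

```python
import heapq
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class Event:
    ts: datetime   # must come from a clock synchronised across both feeds
    source: str    # "price" or "order"
    detail: str

def merge_streams(prices, orders):
    """Merge two time-sorted feeds into one global sequence of events."""
    return list(heapq.merge(prices, orders, key=lambda e: e.ts))

def utc(s):
    return datetime.fromisoformat(s).replace(tzinfo=timezone.utc)

# Hypothetical sample data: one price tick and one order, 50 microseconds apart
prices = [Event(utc("2024-01-02T09:30:00.000100"), "price", "EURUSD 1.0950")]
orders = [Event(utc("2024-01-02T09:30:00.000050"), "order", "BUY 1M EURUSD")]

merged = merge_streams(prices, orders)
print([e.source for e in merged])  # ['order', 'price'] -- but this ordering
# is only trustworthy if the two clocks agree to microsecond precision
```

If the price feed's clock drifted by even a tenth of a millisecond relative to the order gateway's clock, the merged sequence would silently tell the wrong story, and any model trained on it would learn a causality that never happened.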
This is why the FIX Protocol AI Working Group and Rapid Addition are both focusing on data, the AI Working Group to encourage more standardisation and accuracy in datasets and Rapid Addition to enable our clients to curate their data to make model training easier and faster.
[1] https://en.wikipedia.org/wiki/MNIST_database
[2] https://en.wikipedia.org/wiki/AI_winter
By Kevin Houstoun, Executive Chairman at Rapid Addition