Perceptron: Understanding AI
Understanding “AI” today is hard; it has become a buzzword, blended into terms like “Apple Intelligence” and “Augmented Intelligence”, and suffixed to nearly every product name you can think of. The terms and concepts only grow more tangled as the years pass.
Let’s take a step back and revisit AI fundamentals from its inception.
Travel back in time
To truly understand AI, we need to travel back in time to the late 1950s, a period often referred to as the “First Golden Age of AI”.
Machines can learn
The term “artificial intelligence”, or “AI”, was coined in 1956, establishing AI as a field of study within computer science. Initially, the work was theoretical and experimental rather than aimed at solving real-world practical problems.
Just two years later, in 1958, the world witnessed the first practical implementation of artificial intelligence: Frank Rosenblatt’s Perceptron.
The Perceptron demonstrated a machine’s ability to “learn from data” and perform practical tasks like simple image recognition.
It is important to highlight the “learn from data” aspect, as it is the foundational concept of Machine Learning, or ML. The Perceptron is one of the first machine learning algorithms: the machine is trained to learn from data, improving its accuracy over time based on feedback.
The Perceptron is trained on labeled data, which is why we call this “supervised learning”; the feedback comes from the error between the predicted output and the actual label.
Once the machine has been trained on a sufficiently large dataset through many feedback cycles and weight adjustments, the result is referred to as a Machine Learning model, or ML model. The ML model is then deployed to recognize new, unseen data, a process known as “inference”.
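To make this concrete, here is a minimal sketch of the Perceptron learning rule in Python. This is my own illustrative toy example (not Rosenblatt’s original implementation): it trains a single artificial neuron on labeled examples of the logical AND function, then runs inference on the same inputs.

```python
# Minimal Perceptron sketch (illustrative, not Rosenblatt's original code).
# A single artificial neuron learns the logical AND function.

def step(z):
    """Step activation: fire (1) only if the weighted sum crosses the threshold."""
    return 1 if z > 0 else 0

def predict(weights, bias, inputs):
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)

# Labeled training data: inputs -> expected output (supervised learning)
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias, lr = [0.0, 0.0], 0.0, 0.1

# Training: after each prediction, use the error as feedback to adjust weights
for epoch in range(20):
    for inputs, label in data:
        error = label - predict(weights, bias, inputs)
        weights = [w + lr * error * x for w, x in zip(weights, inputs)]
        bias += lr * error

# Inference: the trained model classifies the inputs
for inputs, label in data:
    print(inputs, "->", predict(weights, bias, inputs), "expected", label)
```

Each training pass compares the predicted output with the label and nudges the weights in the direction that reduces the error, which is exactly the feedback loop described above.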
The Perceptron was a foundational moment in AI. Before it, machine “intelligence” was based on predefined, hard-coded rules, which limited how systems could handle new, unseen data.
Biological vs Artificial Neuron
The Perceptron was designed to mimic the biological neuron of the human brain. A human brain has billions of neurons that form a complex network called a “neural network”; the Perceptron is one of the world’s first artificial neurons and artificial neural networks.
A biological neuron has dendrites (source of input), a soma (information processor), a single axon (source of output), and synapses (connections between neurons).
An artificial neuron shares similar characteristics with its biological counterpart:
- Inputs / Outputs: Biological neurons have dendrites (inputs) and an axon (output), whereas artificial neurons receive inputs through input nodes and produce outputs.
- Learning Mechanism: Biological neurons learn from experience through a process known as synaptic plasticity, whereas artificial neurons learn by adjusting the weights associated with each input node.
- Activation: Both biological and artificial neurons use an activation mechanism to decide whether to fire and produce output; biological neurons fire when a certain threshold is reached, while artificial neurons use activation functions (the Perceptron uses a step function), as the formula below shows.
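Written out, the Perceptron’s artificial neuron computes a weighted sum of its inputs and fires only when that sum crosses a threshold (shown here in the common modern formulation with a bias term $b$):

$$
y = \begin{cases} 1 & \text{if } \sum_i w_i x_i + b > 0 \\ 0 & \text{otherwise} \end{cases}
$$

The weights $w_i$ play the role of synaptic strengths: training nudges them up or down until the neuron fires for the right inputs.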
Limitation
The Perceptron’s artificial neural network consists of an input layer and an output layer.
- The input layer consists of multiple input nodes that receive binary input data.
- The output layer consists of a single neuron that produces a single binary output (0 or 1).
It was soon realized that the single-layer Perceptron was limited in its ability to learn complex patterns: it can only learn linearly separable data, a limitation famously analyzed by Minsky and Papert in 1969. It was capable of simple binary classification tasks such as image recognition (dog or cat) or spam detection (spam or not spam), but it cannot learn even the XOR function.
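To see this limitation in action, here is the same kind of toy sketch applied to XOR. No matter how many epochs we train for, a single perceptron cannot get all four cases right, because no single straight line separates XOR’s two classes:

```python
# Illustrative sketch: the perceptron learning rule applied to XOR.
# XOR is not linearly separable, so training never converges.

def step(z):
    return 1 if z > 0 else 0

def predict(weights, bias, inputs):
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)

xor_data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

weights, bias, lr = [0.0, 0.0], 0.0, 0.1
for epoch in range(1000):  # no amount of extra epochs will help
    for inputs, label in xor_data:
        error = label - predict(weights, bias, inputs)
        weights = [w + lr * error * x for w, x in zip(weights, inputs)]
        bias += lr * error

# At least one of the four cases is always misclassified
print([predict(weights, bias, i) == label for i, label in xor_data])
```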
Due to this limitation, AI research and funding declined; this period is remembered as the “AI winter”.
Breakthrough
In the mid-1980s, a significant breakthrough in AI came with the backpropagation algorithm for training the multi-layer perceptron (MLP).
Backpropagation enabled multi-layered artificial neural networks to learn from the errors in their output predictions by propagating those errors backward through the network. This process adjusts the weighted connections between neurons, allowing the network to minimize the difference between predicted and actual outputs over time, thereby significantly improving prediction accuracy.
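Here is a toy sketch of that idea (again my own illustrative example, not a production implementation): a tiny 2-3-1 multi-layer network learns XOR, the very task the single perceptron failed at. A sigmoid activation replaces the step function because backpropagation needs a differentiable activation; the seed and layer sizes here are just one workable choice, and a network this small can occasionally get stuck with an unlucky initialization.

```python
import math, random

# Toy multi-layer perceptron (2 inputs, 3 hidden neurons, 1 output)
# trained with backpropagation to learn XOR.

random.seed(42)
HIDDEN, lr = 3, 0.5

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Randomly initialized weights and biases for both layers
w_h = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(HIDDEN)]
b_h = [random.uniform(-1, 1) for _ in range(HIDDEN)]
w_o = [random.uniform(-1, 1) for _ in range(HIDDEN)]
b_o = random.uniform(-1, 1)

data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def forward(x):
    h = [sigmoid(w[0] * x[0] + w[1] * x[1] + b) for w, b in zip(w_h, b_h)]
    y = sigmoid(sum(wo * hj for wo, hj in zip(w_o, h)) + b_o)
    return h, y

for epoch in range(10000):
    for x, target in data:
        h, y = forward(x)
        # Backward pass: push the output error back through the network
        d_y = (y - target) * y * (1 - y)
        d_h = [d_y * w_o[j] * h[j] * (1 - h[j]) for j in range(HIDDEN)]
        # Gradient-descent updates on the weighted connections
        for j in range(HIDDEN):
            w_o[j] -= lr * d_y * h[j]
            w_h[j][0] -= lr * d_h[j] * x[0]
            w_h[j][1] -= lr * d_h[j] * x[1]
            b_h[j] -= lr * d_h[j]
        b_o -= lr * d_y

for x, target in data:
    _, y = forward(x)
    print(x, "->", round(y, 2), "expected", target)
```

The hidden layer is what makes the difference: it lets the network carve the input space with more than one line, which is exactly what XOR requires.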
Backpropagation made it possible to train deep neural networks with three or more hidden layers, leading to the term “deep learning”, which was popularized around 2006.
At the time, using CPUs for deep learning was inefficient due to their limited number of cores, which restricted the parallel processing needed to train deep neural networks.
That changed in 2006 when Nvidia released the GeForce 8800 GTX, revolutionizing parallel processing with 128 CUDA cores, while CPUs of the time had only 1 to 4 cores (Core 2 Duo, Core 2 Quad).
In 2012, AlexNet changed the game in deep learning by demonstrating the practicality of using GPUs to train large-scale deep neural networks. AlexNet’s deep neural network consisted of 8 layers, 650,000 neurons, and 60 million parameters (weighted connections between neurons).
For comparison, GPT-4 is reported to have around 120 layers and over 1 trillion parameters.
Advances in deep learning algorithms and powerful GPUs opened the door to significant progress in Natural Language Processing (NLP), Generative AI (GenAI), Computer Vision, and many other fields.
In 2024, Geoffrey Hinton, along with John Hopfield, was awarded the Nobel Prize in Physics for foundational work on artificial neural networks, including backpropagation.
Going further
This article covered the foundations of AI from its inception; hopefully enough to unravel some of the AI jargon and complexity for us mere mortals. For me, it was great fun to learn, explore, and understand these foundational concepts on the way to a better understanding of modern AI.
As I continue to explore and go deeper down the rabbit hole, I will share more articles like this one to simplify AI concepts and keep our curious minds rolling.
As we learn more, the neurons in our brains fire, forming new connections and reinforcing existing ones in our biological neural networks.
Keep those neurons firing! — Puru