AI Primer

What is Artificial Intelligence? Broadly, it means a machine that has been trained to recognise patterns in data, and that can then use those patterns to make a prediction or to generate a new data set or image, given enough data to draw its intelligence from.
So the question then is: how does a machine produce a prediction or a new data set? Well, this all comes down to its training.

Let's take a step back and think about the practical, real-world uses of AI; from the output of each one you can infer where it gets its data from. You may not have realised it, but anyone who has ordered items online (particularly from Amazon) or engaged with a chatbot has already seen the results of AI at work.
Here are some other uses of AI that affect your daily life, perhaps without you realising it:
  • If there is a strange spend on your credit card, it gets flagged to the card provider, who usually calls you (anomaly detection)
  • Manufacturers use AI to predict weather patterns months in advance so they can ramp production of seasonal products up or down (supervised learning)
  • The section at the bottom of an order page that says "if you like this product, may I suggest these products or accessories" (recommenders)
  • Engaging with a chatbot or ChatGPT (LLM)


All of these examples have one thing in common: the results are derived from a large data set. They differ in how the training is applied to that data set. AI itself is a broad category, so let us break it down into simpler parts to explain the different types of AI.

Machine Learning



This type of AI involves taking a large data set and trying to predict some result from it. There are two basic categories here: Supervised and Unsupervised.

Supervised



This is where we provide a data set with a known outcome, i.e. given a set of variables x1, x2, x3 we know that Y always happens. We can train a model on those variables to learn a mapping (x1, x2, x3) -> Y, and then when we provide a new set of x1, x2, x3 values we get a probability that Y will occur. A simple example: when a patient has symptoms x1, x2, x3, then the patient has disease Y. Now a new patient comes along with some but not all of the symptoms; do they have the disease? In the real world it is probably x1..xn symptoms, and they would all be converted to integers or floats so the machine can make a prediction.
There are many types of supervised learning algorithms, which are really just pattern recognition algorithms mapping a set of variables to a result, such as:
  • Linear Regression
  • Logistic Regression
  • Naive Bayes
  • K Nearest Neighbours
  • etc…
Each algorithm has its pros and cons and is suited to a different task; a sketch using one of them follows below.
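
To make this concrete, here is a minimal sketch of supervised classification using scikit-learn's logistic regression. The symptom data is invented purely for illustration: each row is a patient with 0/1 flags for x1, x2, x3, and y records whether that patient had disease Y.

```python
# A minimal supervised classification sketch (assumes scikit-learn).
# The symptom data is invented for illustration only.
from sklearn.linear_model import LogisticRegression

# Each row is a patient: 0/1 flags for symptoms x1, x2, x3.
X = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 0],
    [0, 0, 1],
    [0, 0, 0],
]
# 1 = patient had disease Y, 0 = did not.
y = [1, 1, 1, 0, 0, 0]

model = LogisticRegression()
model.fit(X, y)

# A new patient with some but not all of the symptoms.
new_patient = [[1, 0, 1]]
print(model.predict(new_patient))        # predicted class: 0 or 1
print(model.predict_proba(new_patient))  # [P(no disease), P(disease)]
```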

Supervised problems fall into two further subcategories.

Classification


The output is a discrete class or label: given the mapping above, the patient either has the disease or does not.

Regression


The output is a continuous real-world value, such as a price in dollars or a weight.
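
The same setup works for regression; here is a minimal sketch, again with scikit-learn, where the predicted output is a dollar value rather than a class. The order sizes and prices are invented for illustration.

```python
# A minimal regression sketch (assumes scikit-learn); the order
# sizes and dollar amounts below are invented for illustration.
from sklearn.linear_model import LinearRegression

X = [[1], [2], [3], [4], [5]]        # e.g. number of items in an order
y = [10.0, 19.5, 30.2, 41.0, 50.5]   # total price paid, in dollars

model = LinearRegression()
model.fit(X, y)

# Predict a continuous dollar value for a new six-item order.
print(model.predict([[6]]))
```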



Unsupervised


Unsupervised learning is about finding patterns in a data set where we have a set of variables x1 to xn but no known outcome to learn from.
A good example is how a credit card company detects anomalies in your credit card spend. It takes your normal spending habits, classifies them and computes a number that acts as a baseline. When a new transaction comes in, it recalculates that number with the new transaction included; if the number changes significantly, the transaction can be queried to see whether it is valid.
A good algorithm for something like this is K-Means clustering: if the new transaction sits far from every cluster, or would create a new focal point in the set, it is an anomaly. There are many more unsupervised algorithms, however. A sketch of the K-Means idea follows below.
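
Here is a hedged sketch of that K-Means idea using scikit-learn. The transaction features (amount, hour of day) and the distance threshold are assumptions made up for illustration; a real card provider would use far richer features.

```python
# K-Means anomaly detection sketch (assumes scikit-learn and NumPy).
# The features (amount, hour of day) and the threshold are invented.
import numpy as np
from sklearn.cluster import KMeans

# Baseline of "normal" transactions: modest amounts, daytime hours.
normal = np.array([
    [12.0, 9], [15.5, 12], [8.0, 13], [22.0, 18],
    [10.0, 10], [18.0, 17], [14.0, 11], [9.5, 14],
])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(normal)

def is_anomaly(txn, threshold=50.0):
    # Distance from the new transaction to its nearest cluster
    # centre; beyond the (assumed) threshold, flag it for review.
    distance = kmeans.transform(np.array([txn])).min()
    return distance > threshold

print(is_anomaly([14.0, 12]))  # typical spend    -> False
print(is_anomaly([950.0, 3]))  # big 3 a.m. spend -> True
```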

Practical Steps to ML



When we talk of AI and ML (Machine Learning) we need to consider various factors that can skew our results, such as:
  • Bias - is there a bias in the x1 to xn variables we are using for the learning?
  • Size - is the data set large enough to cover all the cases we need to make accurate predictions?
  • Dimensionality - does the data have so many dimensions that they confuse the learning? A vector with fewer dimensions and less variance is often easier to learn from.

If you want to try some basic learning there are plenty of libraries you can use, such as Spark MLlib or scikit-learn.
For recommenders you can use k-nearest neighbours or clustering algorithms; a sketch follows below.
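
As a taste of the recommender idea, here is a minimal "customers also bought" sketch using scikit-learn's k-nearest neighbours. The product feature matrix is invented: each row describes a product by a few made-up attributes.

```python
# A k-nearest-neighbours recommender sketch (assumes scikit-learn).
# The products and their feature values are invented for illustration.
import numpy as np
from sklearn.neighbors import NearestNeighbors

products = ["laptop", "laptop bag", "mouse", "blender", "toaster"]
features = np.array([
    [0.9, 0.8, 0.7],   # laptop
    [0.3, 0.8, 0.6],   # laptop bag
    [0.2, 0.7, 0.8],   # mouse
    [0.4, 0.1, 0.5],   # blender
    [0.3, 0.1, 0.6],   # toaster
])

nn = NearestNeighbors(n_neighbors=3).fit(features)

# Find the items closest to the laptop (it returns itself first,
# so skip index 0) and suggest the rest as accessories.
_, idx = nn.kneighbors(features[0].reshape(1, -1))
print([products[i] for i in idx[0][1:]])  # ['laptop bag', 'mouse']
```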

LLM (Large Language Model)



A newer development in the world of AI is the LLM, which learns patterns, in a supervised and semi-supervised way, from a huge amount of text (think the text of the entire web) and then uses what it has learned to generate new content; this is generative AI, whereas the ML models discussed above are generally considered predictive AI. The best-known example is ChatGPT, which uses a neural network to generate answers to the questions posed to it.
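
To give a taste of what "generative" means in code, here is a minimal sketch using the Hugging Face transformers library (assumed installed) with gpt2, a small open model. It is nowhere near the scale of ChatGPT, but the pattern of prompt in, generated continuation out, is the same.

```python
# Minimal text-generation sketch (assumes the transformers library
# is installed; gpt2 is a small demonstration model, not ChatGPT).
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator("Artificial intelligence is", max_new_tokens=30)
print(result[0]["generated_text"])
```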

Again, all the caveats mentioned previously apply, but I would suggest the biggest are:
  • Bias - if your data set is the web, is everything in the training data accurate and unbiased?
  • Usage in a corporate setting - if you want to use an LLM on corporate data, how do you keep that data separate from the rest of the web and from other companies? You don't want data leaks. Will there be many LLMs, one for each company?

Since LLMs are fairly new, this whole field is changing at a rapid pace, so I won't provide much more information on them here for now.

