Machine learning isn't just neural networks.
It's also k-means, Principal Component Analysis, Support Vector Machines, Naive Bayes, Decision Trees, Random Forests, Markov Models, ….
And there are Learning Classifier Systems (LCSs). Many articles don't even mention them.
LCSs are systems that create and improve IF … THEN … rules for a task.
Let's explore how they work.
Note for the LCS experts out there: in this post I'm solely talking about Michigan-Style LCSs. I may cover other LCS variants in future articles.
Assume for a moment we have a working LCS, trained to tell us whether a number is even, and we want it to give us a result for the number 5. What happens?
First, we convert the 5 into a format that our LCS understands. For simplicity, let's use binary: 101.

The LCS holds a set of IF … THEN … rules, called a population [P], which it learned in the training phase. It now checks which IF … conditions match 101. The rules that match our input form the match set [M].

Note: despite their name, Learning Classifier Systems are not restricted to classification problems. We can also use them for regression, function approximation, behavior modeling, and much more!
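The matching step can be sketched in a few lines of JavaScript. The function name and the tiny hand-written population are illustrative only, not from a real LCS framework:

```javascript
// A condition matches an input if every position is equal or a '#' wildcard.
function matches(condition, input) {
  if (condition.length !== input.length) return false;
  return [...condition].every((c, i) => c === '#' || c === input[i]);
}

// Encode the number 5 as binary: "101".
const input = (5).toString(2);

// A tiny, hand-written population [P] of IF … THEN … rules.
const population = [
  { condition: '1#1', action: 'odd' },
  { condition: '##0', action: 'even' },
  { condition: '0##', action: 'even' },
];

// The match set [M]: all rules whose condition matches the input.
const matchSet = population.filter((rule) => matches(rule.condition, input));
```

Here only the first rule matches "101", so the match set contains a single rule.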
The environment is the entirety of all states (condition + result) the LCS can encounter.
As with all machine learning algorithms, it is crucial to develop a suitable model of the environment:
A rule could look like 001#1#0 ➡️ even for binary inputs, with the # being a wildcard. Or you have distinct descriptions, e.g. for your customers when building a recommendation system: you could use gender, age, shopping frequency, and category of the last bought item. A rule for this set could look like [#, 20-40, 2-5 per month, smartphones] ➡️ smartphone accessories.

Or think of modeling movement: are the eight directions N, NE, E, SE, S, SW, W, NW enough, or do you need finer granularity? This step needs experience and trial-and-error.

The rule set of an LCS is called its population. Every rule only covers a small part of the problem space.
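The recommendation rule above can be sketched with mixed condition types. The representation here (a '#' wildcard, [lo, hi] numeric ranges, exact string matches) is an assumption for illustration:

```javascript
// '#' matches anything, [lo, hi] matches a numeric range,
// and a plain string must match exactly.
function fieldMatches(condition, value) {
  if (condition === '#') return true;
  if (Array.isArray(condition)) return value >= condition[0] && value <= condition[1];
  return condition === value;
}

function ruleMatches(rule, customer) {
  return rule.condition.every((c, i) => fieldMatches(c, customer[i]));
}

// [gender, age, purchases per month, last category] ➡️ recommendation
const rule = {
  condition: ['#', [20, 40], [2, 5], 'smartphones'],
  action: 'smartphone accessories',
};

// A customer: female, 31 years old, 3 purchases per month, last bought a smartphone.
const customer = ['f', 31, 3, 'smartphones'];
const recommended = ruleMatches(rule, customer); // the rule fires for this customer
```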
Each rule has several properties: a condition (the IF part), an action (the THEN part), and bookkeeping values such as accuracy, fitness, experience, and numerosity.
Side note on accuracy: from an end-user perspective, we are only interested in the accuracy of the whole rule set, not individual rules. Individual rules have their accuracy and fitness parameters so the LCS can learn and improve its rule set.
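A single rule with its bookkeeping parameters might look like this. The field names are assumptions for this sketch, not a fixed LCS standard:

```javascript
// Illustrative shape of a single rule.
function createRule(condition, action) {
  return {
    condition,     // the IF part, e.g. '1#1' ('#' = wildcard)
    action,        // the THEN part, e.g. 'odd'
    accuracy: 0,   // fraction of correct predictions so far
    fitness: 0,    // quality measure the genetic algorithm selects on
    experience: 0, // how often this rule appeared in a match set
    numerosity: 1, // how many identical virtual copies this rule represents
  };
}

const rule = createRule('1#1', 'odd');
```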
To train an LCS, we need several systems working together: a matching component, a covering mechanism that creates new rules when nothing matches, a credit-assignment mechanism that updates rule accuracy and fitness, and a genetic algorithm that discovers better rules.
The training stops when a termination condition is met, e.g. reaching a maximum number of iterations.
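Putting the pieces together, a minimal training loop for the even/odd example might look like this. It is a runnable sketch under simplifying assumptions: covering generalizes randomly, credit assignment is a plain correct/experience counter, and the genetic algorithm and rule deletion are left out:

```javascript
// Minimal training-loop sketch for the even/odd task. A real LCS adds a
// genetic algorithm, subsumption, and rule deletion; helper names are ours.

function matches(condition, input) {
  return [...condition].every((c, i) => c === '#' || c === input[i]);
}

// Covering: when no rule matches, create one from the current input,
// randomly generalizing positions to the '#' wildcard.
function cover(input, action) {
  const condition = [...input]
    .map((bit) => (Math.random() < 0.5 ? '#' : bit))
    .join('');
  return { condition, action, correct: 0, experience: 0 };
}

function train(iterations) {
  const population = [];
  for (let i = 0; i < iterations; i++) {
    // 1. Draw a training instance from the environment (numbers 0-7).
    const n = Math.floor(Math.random() * 8);
    const input = n.toString(2).padStart(3, '0');
    const expected = n % 2 === 0 ? 'even' : 'odd';

    // 2. Build the match set [M]; cover if it is empty.
    let matchSet = population.filter((r) => matches(r.condition, input));
    if (matchSet.length === 0) {
      const rule = cover(input, expected);
      population.push(rule);
      matchSet = [rule];
    }

    // 3. Credit assignment: update each matching rule's statistics.
    for (const rule of matchSet) {
      rule.experience += 1;
      if (rule.action === expected) rule.correct += 1;
    }
  }
  return population;
}

const population = train(500);
```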
The rule set still contains some bad and inexperienced rules (e.g. rules that were only recently created). We now apply rule compaction to remove them.
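Rule compaction can be as simple as filtering on experience and accuracy. The thresholds below (at least 5 uses, at least 90% accuracy) are arbitrary assumptions for this sketch:

```javascript
// Drop rules that are inexperienced or inaccurate.
function compact(population, minExperience = 5, minAccuracy = 0.9) {
  return population.filter(
    (rule) =>
      rule.experience >= minExperience &&
      rule.correct / rule.experience >= minAccuracy
  );
}

const population = [
  { condition: '##0', action: 'even', correct: 40, experience: 40 },
  { condition: '1#1', action: 'odd',  correct: 18, experience: 20 },
  { condition: '###', action: 'even', correct: 11, experience: 20 }, // inaccurate
  { condition: '0#1', action: 'odd',  correct: 2,  experience: 2  }, // inexperienced
];

const compacted = compact(population); // only the first two rules survive
```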
An LCS is simple. It's a set of IF … THEN … rules put under pressure by a genetic algorithm. Nothing more.
The resulting rule set is much easier to understand than the weighted graph of a neural network.
Of course, you can make LCS as complex as you want. But the core idea is beautifully simple.
In the upcoming articles in this series, we will explore the applications of LCSs, look at the available frameworks, and implement our own LCS with Node.js.