How should one think about a machine learning program? A machine learning program is still a program, a piece of software: we give it certain inputs and expect certain outputs in return. The objective of this section is to explain the main machine learning tasks in layman's terms.
One of the main distinguishing aspects of machine learning software is that its decision making resembles that of a human mind, i.e. it makes decisions based on experience. For the software, that experience is provided in advance through previously recorded information, i.e. data.
Machine learning task - Supervised learning
One kind of decision the software can make is to recall previous experiences and use them to make a new decision. The new decision could be to pick a single outcome from several options that appeared in the past, i.e. in the historical data. This is called classification. Just as every human mind processes information differently, there are multiple approaches to defining a classification algorithm. These algorithms were created by different mathematicians and statisticians at different points in time over the last century.
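As a rough illustration of the idea, here is a minimal sketch of one simple classification approach (nearest neighbour): the new input gets the label of the most similar past example. The animals, measurements and the `classify` function are made up for this example and are not taken from any particular library.

```python
import math

# Hypothetical past experiences: (height_cm, weight_kg) -> known outcome.
# All numbers and labels are invented for illustration.
history = [
    ((20.0, 4.0), "cat"),
    ((25.0, 5.0), "cat"),
    ((60.0, 30.0), "dog"),
    ((70.0, 35.0), "dog"),
]

def classify(inputs, history):
    """Return the label of the past example closest to `inputs`."""
    _nearest_inputs, nearest_label = min(
        history, key=lambda example: math.dist(example[0], inputs)
    )
    return nearest_label

print(classify((22.0, 4.5), history))   # cat
print(classify((65.0, 32.0), history))  # dog
```

The program never contains a rule such as "small animals are cats"; the decision comes entirely from the recorded history.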
Machine learning software can be trained on many kinds of data: a structured table of numbers, images, audio, video or text. The processing differs depending on how the computer reads each kind of information.
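Whatever the kind of data, the computer ultimately reads it as numbers. The sketch below shows two simplified, assumed encodings: text as word counts over a small fixed vocabulary, and a tiny grayscale image as a flat row of brightness values. The function names and the vocabulary are hypothetical, and real systems use far richer encodings.

```python
from collections import Counter

def text_to_numbers(text, vocabulary):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]

def image_to_numbers(pixels):
    """Flatten a grid of brightness values (0-255) into one row of numbers."""
    return [value for row in pixels for value in row]

vocabulary = ["good", "bad", "movie"]
print(text_to_numbers("Good movie really good", vocabulary))  # [2, 0, 1]
print(image_to_numbers([[0, 128, 255], [255, 128, 0]]))  # [0, 128, 255, 255, 128, 0]
```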
Machine learning task - Unsupervised learning
There are other kinds of decision making as well. The software can identify similar-looking pieces of information, i.e. similar rows of data, and group them together. Such an algorithm can be used to divide an entire dataset into the groups that are present in it. This is called clustering. Again, the approach differs from one clustering algorithm to another, as these too were created by different mathematicians and statisticians.
To cluster a dataset, the software is not taught what has to be identified (one out of multiple available options), as it is in classification. It merely reads the raw collection of data and groups similar-looking things together.
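To make this concrete, here is a minimal sketch of one well-known clustering approach, k-means, written in plain Python. The points are invented, and starting from the first `k` points as group centres is a simplifying assumption made here so the result is repeatable; practical implementations usually choose the starting centres more carefully.

```python
import math

def k_means(points, k, iterations=10):
    """Group `points` into `k` clusters; returns one group index per point."""
    # Assumption: use the first k points as the initial group centres.
    centres = [points[i] for i in range(k)]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Step 1: put each point in the group with the nearest centre.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centres[c])) for p in points
        ]
        # Step 2: move each centre to the average of its group's points.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centres[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return assignment

# Invented rows of data; note that no labels are supplied anywhere.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 0.5), (9.5, 8.0)]
print(k_means(points, k=2))  # [0, 1, 0, 1, 0, 1]
```

The output only says which rows belong together (group 0 or group 1); it is up to a person to decide what each group means.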
Conclusion
I introduced just two kinds of algorithm above, but machine learning algorithms are commonly grouped into two broad task categories:
- Supervised learning algorithms and
- Unsupervised learning algorithms
Classification comes under supervised learning. The reason is that, for the task of selecting one out of multiple choices, the historical data must state each combination clearly, i.e. all the inputs and the output choice for that particular instance of data. To ensure this, it is better for a human to oversee the creation of the dataset. This task of tagging each input with its relevant output is called labelling in machine learning terms.
Clustering comes under unsupervised learning, simply because there is no need for labelling. It reads the entire input data and groups similar-looking things together.
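The difference between the two categories shows up in the shape of the data itself. In this small sketch, with invented field names and values, a labelled dataset pairs every set of inputs with an output chosen by a person, while the unlabelled version is the same inputs with the outputs removed. `possible_outputs` is a hypothetical helper, not part of any library.

```python
# Labelled rows: a person has tagged each set of inputs with the correct output.
labelled = [
    ({"height_cm": 20.0, "weight_kg": 4.0}, "cat"),
    ({"height_cm": 60.0, "weight_kg": 30.0}, "dog"),
]

# Unlabelled rows: the same inputs with the outputs removed.
unlabelled = [inputs for inputs, _label in labelled]

def possible_outputs(rows):
    """The set of choices a classifier trained on `rows` can pick from."""
    return {label for _inputs, label in rows}

print(sorted(possible_outputs(labelled)))  # ['cat', 'dog']
print(unlabelled[0])  # {'height_cm': 20.0, 'weight_kg': 4.0}
```

A supervised algorithm would be given `labelled`; an unsupervised one would only ever see `unlabelled`.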