Introduction to Machine Learning

The term “machine learning” may sound like a robot-like machine learning something. In practice, machine learning means feeding a large amount of training data into a computer so that it can learn the patterns in the data and build a model. The resulting model can then classify, cluster, or predict new (test) data based on what it learned from the training data.

There are three kinds of machine learning: supervised learning, unsupervised learning, and reinforcement learning. This article discusses supervised and unsupervised learning only. Supervised learning works with labeled training data: it learns how the labels relate to the variables, then classifies or predicts new data from those variables. Supervised learning covers classification and regression. If the label is categorical, the task is called classification. If the label is a continuous number, it is called regression.

Now, let’s imagine that machine learning is a kid, and we want to teach the kid how to identify animals. First, we show the kid ten different pictures of monkeys and tell him that those are monkeys. The kid will learn to recognize monkeys by their similarities, such as brown color, two arms, living in the forest, and so on. Next, we show the kid ten pictures of birds and tell him that those are birds. Again, the kid will learn about birds from the ten pictures: the birds differ from each other, but they share characteristics such as wings, no arms, and colorful feathers. Now we can test whether the kid can identify an eleventh monkey and an eleventh bird.
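The monkey and bird lesson can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three numeric features (has wings, number of arms, colorfulness) and all the values are made up, and the k-nearest-neighbors vote stands in for whatever the kid actually does in his head.

```python
import math
from collections import Counter

# Hypothetical labeled training data (all values invented for illustration):
# (has_wings, number_of_arms, colorfulness from 0 to 1) -> label
TRAINING = [
    ((0, 2, 0.2), "monkey"),
    ((0, 2, 0.3), "monkey"),
    ((0, 2, 0.1), "monkey"),
    ((1, 0, 0.9), "bird"),
    ((1, 0, 0.7), "bird"),
    ((1, 0, 0.8), "bird"),
]

def classify(sample, training=TRAINING, k=3):
    """Return the majority label among the k nearest labeled examples."""
    # Sort the labeled examples by Euclidean distance to the new sample.
    nearest = sorted(training, key=lambda pair: math.dist(pair[0], sample))[:k]
    # Let the k nearest examples vote on the label.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# "Test the kid" with animals that were not in the training pictures.
print(classify((1, 0, 0.6)))   # winged, no arms, fairly colorful
print(classify((0, 2, 0.25)))  # no wings, two arms, dull color
```

Because the labels ("monkey", "bird") are categories, this is a classification task.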

Examples of supervised learning are:

predicting population growth from predictors such as current population size, number of females, and population age (regression),
predicting economic growth from economic parameters such as income, population size, and living expenses (regression),
classifying land cover into vegetation, soil, water body, and agriculture according to spectral reflectance (classification), and
classifying customers as satisfied, neutral, or dissatisfied according to their survey responses (classification).
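For the regression examples, a minimal sketch is a straight line fitted by ordinary least squares. The data below are invented, and using a single predictor (a year index) is a simplifying assumption; a real population model would use several predictors like the ones listed above.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data: year index -> population in millions.
years = [0, 1, 2, 3, 4]
population = [10.0, 10.5, 11.0, 11.5, 12.0]

slope, intercept = fit_line(years, population)

# Predict the continuous label for a year that was not in the training data.
predicted = slope * 5 + intercept
print(slope, intercept, predicted)
```

The label here is a continuous number, which is what makes this regression rather than classification.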

Unsupervised learning finds similar patterns in the variables in order to group an unlabeled dataset into clusters. Grouping a large number of data points this way simplifies the analysis. Unsupervised learning also covers dimensionality reduction, which simplifies a dataset with many dimensions (variables) by finding which variables are highly correlated with one another.
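The idea of spotting highly correlated variables can be sketched with a plain Pearson correlation check. This is not a full dimensionality reduction method such as PCA, only a simplified first step, and the variable names, values, and the 0.9 threshold are assumptions made for illustration.

```python
import math
from itertools import combinations

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def redundant_pairs(data, threshold=0.9):
    """Return pairs of variables whose absolute correlation reaches the threshold."""
    return [
        (name_a, name_b)
        for name_a, name_b in combinations(data, 2)
        if abs(pearson(data[name_a], data[name_b])) >= threshold
    ]

# Hypothetical dataset: two columns carry almost the same information.
data = {
    "height_cm": [150, 160, 170, 180, 190],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "age": [40, 22, 35, 28, 31],
}

print(redundant_pairs(data))
```

A pair flagged this way is a candidate for being merged or reduced to a single variable.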

Now, let’s again imagine that machine learning is a kid. Give the kid 20 pictures of fruit, but do not tell him the fruits’ names. This is what makes unsupervised learning different from supervised learning. Let the kid work out by himself how to categorize the fruits according to their similarities.

The kid might first separate the fruits that are green and not round from the other fruits. Next, the remaining fruits are separated again according to their color.
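The fruit sorting can be sketched as a small k-means clustering. Everything specific here is an assumption for illustration: the two features (greenness and roundness, both from 0 to 1), the values, the choice of two clusters, and the hand-picked starting centroids used to keep the result reproducible.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Group points around centroids; returns (labels, final centroids)."""
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(values) / len(members) for values in zip(*members))
                )
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # assignments are stable, so stop early
        centroids = new_centroids
    return labels, centroids

# Hypothetical unlabeled fruit pictures: (greenness, roundness).
fruits = [
    (0.9, 0.2), (0.8, 0.3), (0.85, 0.25),   # green, not round
    (0.1, 0.9), (0.2, 0.95), (0.15, 0.85),  # not green, round
]

# Starting centroids chosen by hand (first and last fruit).
labels, centers = kmeans(fruits, [fruits[0], fruits[-1]])
print(labels)
```

Note that no fruit names are passed in. The clusters come out only as numbers (0 and 1), and giving them meaningful names is left to whoever inspects the result.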

Examples of unsupervised learning are:

customer segmentation according to customer behavior,
clustering a water quality dataset according to its parameters (ion content), and
grouping stocks according to their price over time.

Unlike supervised learning, unsupervised learning is applied to a dataset whose labels or cluster names are not yet known. Whereas supervised learning classifies customers into three predefined classes, “satisfied”, “neutral”, and “dissatisfied”, unsupervised learning divides customers into a number of groups that is not decided in advance, and the groups do not have labels yet.

Examples of machine learning methods are shown below. We will discuss these methods in other articles.

Method	Learning Type	Task
K Nearest Neighbors (kNN)	Supervised Learning	Classification and Regression
Decision Tree/Classification and Regression Tree (CART)	Supervised Learning	Classification and Regression
Random Forest	Supervised Learning	Classification and Regression
Support Vector Machine	Supervised Learning	Classification and Regression
Gradient Boosting	Supervised Learning	Classification and Regression
Naïve Bayes	Supervised Learning	Classification
Linear Regression	Supervised Learning	Regression
Logistic Regression	Supervised Learning	Classification
K-means	Unsupervised Learning	Clustering
Hierarchical Clustering	Unsupervised Learning	Clustering
Principal Component Analysis (PCA)	Unsupervised Learning	Dimensionality Reduction
t-SNE	Unsupervised Learning	Dimensionality Reduction
Non-Negative Matrix Factorization (NMF)	Unsupervised Learning	Dimensionality Reduction
Exploratory Factor Analysis (EFA)	Unsupervised Learning	Dimensionality Reduction
and others.

Published by RendyK
