Machine Learning is a set of computational techniques that allows us to extract patterns from data. In other words, these algorithms can learn the information contained in datasets like tabular data, text, images, or even videos. However, there is no universal algorithm capable of learning every pattern from every kind of data. Each machine learning algorithm solves a specific type of learning problem. We call machine learning tasks the particular learning problems solved by machine learning algorithms.
Machine Learning techniques have many real-world applications and solve a significant number of business problems. They are applicable in a wide range of science and business domains, including medicine, new drug discovery, agriculture, marketing, cybersecurity, and so many more. However, we do not need to develop a new algorithm for each new machine learning application. Instead, we can re-state our domain-specific business problem as a well-known machine learning task and then follow a process to select the most appropriate algorithm to solve the task. We need to understand the information we want to learn from our available datasets and select the proper algorithms to apply to our data.
In the applied machine learning field, the ability to choose the right machine learning task for a given domain-specific problem is even more important than implementing machine learning algorithms from scratch. That’s because there’s a significant number of libraries out there that already implement these algorithms. On the other hand, choosing the right machine learning task is an entirely human-dependent activity making understanding machine learning tasks an essential skill for machine learning practitioners.
There are three main groups of machine learning tasks: Supervised Learning, Unsupervised Learning, and Reinforcement Learning. In the next sections, we are going to understand how they work and some of their applications.
Supervised Learning aims to develop a model capable of predicting a label from a data instance. Thus, we need to provide in the training dataset the correct label of each data instance. There are two types of categories of supervised learning tasks: regression and classification.
The regression task takes an instance of our data as input and predicts a real number as output. In other words, we take inputs like images, texts, or database records and indicate a quantity related to this data. For example, we could create a regression model to predict a car’s price based on its picture. Another possible application would be to predict the number of likes of a social media post based on its content. Whenever we want to estimate a quantity based on a piece of information, we can create a regression model for our problem.
Our training dataset must contain examples of the inputs and their expected output when creating a regression model. In the car’s price prediction use case, we would need many pictures of cars associated with their respective prices. After applying a machine learning algorithm to our training dataset, we would have a regression model capable of estimating a car’s price based on its picture.
The classification task is similar to the regression task. However, instead of predicting a real value from a data instance, we estimate a discrete class or category. Practical examples of classification tasks are spam detection, fraud detection in financial transactions, cancer diagnosis from x-ray images, and many more. In our car example, instead of predicting a car’s price from its picture, we could estimate a category like sport, sedan, or miniwagon. The training dataset for classification must contain examples of the data instances we want to classify and their respective classes or categories.
Introduction to Linear Regression With One Variable:
Create your first machine learning model in 5 minutes with Google Colab: