Exploratory Data Analysis for Machine Learning

Exploratory data analysis is one of the most challenging tasks when building a machine learning model, especially for beginners.

A consequence of the No Free Lunch theorem is that no single model performs well on every dataset. In other words, there’s no silver-bullet machine learning algorithm.

The practical consequence is that we need to make a lot of human decisions when building our model: which algorithm to use, which features to keep and which to discard, whether to apply normalization or regularization, and which hyperparameters to tune.

And because the space of decisions is so vast, relying on simple trial and error is a shot in the dark. We need to base our decisions on evidence that an action could actually benefit our model.

So the best way to make better decisions when building a model is to understand our dataset. That’s why a thorough exploratory data analysis is an essential step in building a machine learning model.
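To make "understanding the dataset" a little more concrete, here is a minimal first-pass sketch using only the Python standard library. The toy dataset, the column names (`area`, `rooms`, `price`), and the helper functions `summarize` and `pearson` are all hypothetical and chosen only for illustration; in practice a library such as pandas would likely do this work.

```python
import math
import statistics

# Hypothetical toy dataset: each row is one sample; None marks a missing value.
rows = [
    {"area": 50.0, "rooms": 2, "price": 150.0},
    {"area": 80.0, "rooms": 3, "price": 240.0},
    {"area": 65.0, "rooms": None, "price": 200.0},
    {"area": 120.0, "rooms": 4, "price": 330.0},
    {"area": 95.0, "rooms": 3, "price": 290.0},
]


def summarize(rows, column):
    """Return the missing count plus basic statistics for one numeric column."""
    values = [r[column] for r in rows if r[column] is not None]
    return {
        "missing": len(rows) - len(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


def pearson(rows, x, y):
    """Pearson correlation between two columns, skipping rows with missing values."""
    pairs = [(r[x], r[y]) for r in rows if r[x] is not None and r[y] is not None]
    xs, ys = zip(*pairs)
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((a - mx) * (b - my) for a, b in pairs)
    sx = math.sqrt(sum((a - mx) ** 2 for a in xs))
    sy = math.sqrt(sum((b - my) ** 2 for b in ys))
    return cov / (sx * sy)


# Per-column overview: scale differences hint at normalization,
# missing values hint at imputation or dropping the feature.
summary = {col: summarize(rows, col) for col in ("area", "rooms", "price")}

# Correlation with the (assumed) target column hints at which features matter.
corr_with_target = {col: pearson(rows, col, "price") for col in ("area", "rooms")}

for col, stats in summary.items():
    print(col, stats)
print(corr_with_target)
```

Even a summary this small can inform the decisions listed above: a column with many missing values is a candidate for imputation or removal, very different value ranges suggest normalization, and a weak correlation with the target is one (imperfect) signal that a feature may not be worth keeping.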
