Building a Machine Learning Model: Collecting Data

Data is the main ingredient of Machine Learning models. A real Machine Learning project starts with the team assessing whether the necessary data is available.

If it is not (which would be no surprise), the team needs to develop a strategy to collect and store the necessary data before training any Machine Learning models.

Common sources of data include:

  • Databases
  • Computer files
  • Websites
  • REST APIs
  • Sensor data
  • Physical files
  • Satellite images
  • User interaction in apps
  • Videos
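As a rough sketch of how two of these sources can feed the same dataset, the Python example below reads rows from CSV text (standing in for a computer file) and from an in-memory SQLite database (standing in for a real database). The column names, the `people` table, and the sample values are all hypothetical and exist only for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for the contents of a real CSV file.
CSV_TEXT = "height_cm,weight_kg\n170,65\n182,80\n"


def rows_from_csv(text):
    """Read numeric rows from CSV text (an open file object works the same way)."""
    reader = csv.DictReader(io.StringIO(text))
    return [(float(r["height_cm"]), float(r["weight_kg"])) for r in reader]


def rows_from_database(connection):
    """Read numeric rows from a table named `people` (a hypothetical name)."""
    cursor = connection.execute(
        "SELECT height_cm, weight_kg FROM people ORDER BY rowid"
    )
    return [(float(h), float(w)) for h, w in cursor.fetchall()]


# An in-memory SQLite database stands in for a real database server.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE people (height_cm REAL, weight_kg REAL)")
connection.executemany(
    "INSERT INTO people VALUES (?, ?)", [(160.0, 55.0), (175.0, 72.0)]
)

# Both sources end up in the same shape: a list of numeric tuples.
dataset = rows_from_csv(CSV_TEXT) + rows_from_database(connection)
print(dataset)
```

Once every source is reduced to the same row format, the rest of the pipeline does not need to know where each row came from.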

Most Machine Learning algorithms are agnostic to the type of data ingested: most of them receive matrices and numeric arrays as parameters. From the learning algorithm's perspective, it does not matter whether a feature came from ebooks or images.
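To make this concrete, here is a minimal Python sketch of how a text excerpt and a tiny grayscale image can both end up as plain numeric arrays. The word-count and pixel-scaling encodings, the vocabulary, and the sample inputs are assumptions chosen for illustration; real projects typically use richer feature extraction.

```python
def text_to_vector(text, vocabulary):
    """Count how often each vocabulary word appears in the text."""
    words = text.lower().split()
    return [float(words.count(term)) for term in vocabulary]


def image_to_vector(pixels):
    """Flatten a 2D grid of grayscale pixels (0-255) into values in [0, 1]."""
    return [value / 255 for row in pixels for value in row]


# Hypothetical inputs: a short ebook excerpt and a 2x2 grayscale image.
vocabulary = ["data", "model", "sensor"]
ebook_excerpt = "Data feeds the model and more data improves the model"
tiny_image = [[0, 255], [51, 102]]

text_features = text_to_vector(ebook_excerpt, vocabulary)
image_features = image_to_vector(tiny_image)

# Both results are flat lists of floats, which is all a learning algorithm sees.
print(text_features)
print(image_features)
```

After this step, the two feature vectors differ only in length and values, not in kind.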

Thus, in a Machine Learning project, a good part of the effort goes into setting the correct strategy to collect, store, clean, and preprocess the data so that it is useful for training Machine Learning models.
