Data is the main ingredient of Machine Learning models. An actual Machine Learning project starts with the team assessing if the necessary data is available.
If it is not (which would be no surprise), the team needs to develop a strategy to collect and store that data before training any Machine Learning models.
Common sources of data include:
- Computer files
- REST APIs
- Sensor data
- Physical files
- Satellite images
- User interaction in apps
Most Machine Learning algorithms are agnostic to the type of data they ingest: they receive matrices and numeric arrays as parameters. From the learning algorithm's perspective, it does not matter whether a feature came from ebooks or images.
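As a minimal sketch of this idea, the snippet below turns a piece of text and a tiny grayscale image into numeric arrays. The helper names (`text_to_features`, `image_to_features`), the vocabulary, and the sample values are all hypothetical and chosen only for illustration; real projects typically rely on more elaborate feature extraction.

```python
import numpy as np


def text_to_features(text, vocabulary):
    """Count how often each vocabulary word appears in the text (bag of words)."""
    words = text.lower().split()
    return np.array([words.count(term) for term in vocabulary], dtype=float)


def image_to_features(image):
    """Flatten a 2-D grayscale image into a 1-D vector scaled to [0, 1]."""
    return np.asarray(image, dtype=float).ravel() / 255.0


# Hypothetical inputs: a short ebook excerpt and a 2x2 grayscale image.
vocabulary = ["data", "model", "sensor"]
ebook_excerpt = "Data feeds the model and more data improves the model"
image = np.array([[0, 255], [128, 64]], dtype=np.uint8)

text_features = text_to_features(ebook_excerpt, vocabulary)
image_features = image_to_features(image)

print(text_features)         # one count per vocabulary word
print(image_features.shape)  # one value per pixel
```

Both sources end up as plain numeric vectors, which is all the learning algorithm sees.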
Thus, in a Machine Learning project, a good part of the effort goes into setting the correct strategy to collect, store, clean, and preprocess the data so that it is suitable for training Machine Learning models.
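To give a rough sense of what cleaning and preprocessing can look like in practice, here is a small sketch that fills missing values with the column mean and then standardizes each column. The function name `clean_and_scale` and the sample readings are assumptions made for this example; mean imputation and standardization are just one common choice among many, not a prescribed recipe.

```python
import numpy as np


def clean_and_scale(raw):
    """Fill missing values with column means, then standardize each column."""
    X = np.asarray(raw, dtype=float)

    # Clean: replace NaN entries with the mean of the observed values in that column.
    col_means = np.nanmean(X, axis=0)
    X = np.where(np.isnan(X), col_means, X)

    # Preprocess: rescale each column to zero mean and unit variance.
    std = X.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns unscaled to avoid division by zero
    return (X - X.mean(axis=0)) / std


# Hypothetical sensor readings with one missing value.
raw_readings = [
    [1.0, 10.0],
    [np.nan, 20.0],
    [3.0, 30.0],
]
prepared = clean_and_scale(raw_readings)
print(prepared)
```

The resulting array has no missing entries and comparable scales across columns, which is the kind of input most learning algorithms expect.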