Choosing a Project
Focus on Problems First, Then Data
Rather than starting with a dataset and trying to find something interesting to do with it, identify a meaningful problem that could benefit from machine learning. This approach tends to lead to more impactful and realistic projects.
Look for Human Pain Points
Consider processes or tasks that are:
Time-consuming for humans
Tedious or repetitive
Prone to human error
These often make excellent candidates for machine learning solutions.
Project Sources
Bring Your Own Project
If you already have a problem in mind or access to interesting data:
Present your idea in the course or course chat
If you have doubts about the project, formulate specific questions and ask them in the course chat
Partner with Organizations
Local companies and academic departments often have real-world problems waiting for ML solutions:
Reach out to businesses or research groups in your area
Inquire about data-intensive problems they're facing
We can help you define project parameters with your partner
Public Datasets
Public datasets are convenient, but they sometimes come with limitations:
Projects tend to focus more on model optimization than real-world implementation
You miss valuable experience in data preparation and feature engineering
The problem may be artificially clean compared to real-world scenarios
However, if you choose this route, consider adding complexity by:
Combining multiple datasets
Creating your own validation methodology
Adding constraints that reflect real-world conditions
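As a rough illustration of the first two ideas, the sketch below joins two toy sources on a shared key and then validates on the most recent records instead of a random sample. Everything in it is an assumption made for the example: the sales and weather records, the field names, and the helper functions `combine` and `chronological_split` are hypothetical and do not come from any particular dataset or library.

```python
from datetime import date

# Hypothetical toy data: daily sales and daily temperatures from two sources.
sales = [
    {"day": date(2024, 1, 1), "units": 12},
    {"day": date(2024, 1, 2), "units": 15},
    {"day": date(2024, 1, 3), "units": 9},
    {"day": date(2024, 1, 4), "units": 20},
    {"day": date(2024, 1, 5), "units": 17},
]
weather = {
    date(2024, 1, 1): 3.5,
    date(2024, 1, 2): 5.0,
    date(2024, 1, 4): -1.0,  # note: no reading for January 3
    date(2024, 1, 5): 2.0,
}


def combine(sales_rows, temp_by_day):
    """Inner-join the two sources on the shared 'day' key.

    Rows without a matching temperature are dropped, which is one of the
    data-preparation decisions a real project has to make explicitly.
    """
    return [
        {**row, "temp_c": temp_by_day[row["day"]]}
        for row in sales_rows
        if row["day"] in temp_by_day
    ]


def chronological_split(rows, train_fraction=0.75):
    """Train on the earliest rows and validate on the latest ones.

    This mimics deployment, where a model only ever sees the past.
    """
    ordered = sorted(rows, key=lambda r: r["day"])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


combined = combine(sales, weather)
train, valid = chronological_split(combined)
print(len(combined), len(train), len(valid))
```

A chronological split is only one possible validation scheme. Depending on the problem, grouping by customer, site, or device may reflect real-world conditions better.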
Data Resources
If you're looking for public datasets, here are some valuable repositories:
Examples of Past Projects
For inspiration, you can also review the projects nominated for the 2025 VDE Machine Learning Prize, presented in the document below.
Non-Disclosure Agreement (NDA)
If you need an NDA for data you are getting from an organization or partner, you can use the following: