Week 3 - Intro Kaggle competition - EDA and baseline models with PyTorch

Learning and testing - a.k.a. don't do Bullshit Machine Learning

Course session


  • Introduction

  • Titanic

  • Paddy

  • Exploratory Data Analysis (EDA) for Paddy Disease Classification

Solutions to the MLP exercise

Participants present their solutions to the MLP exercise from Coursera



PyTorch 303 (Lab 03)



Work through the Colab notebook above (PyTorch303) on your own, and try to understand and reproduce each step.

Do Week 3 of the Coursera Course

Please register at kaggle.com and join the competition. Go through the Exploratory Data Analysis notebook from the session, then train a logistic regression as a baseline model!

The main objective of this Kaggle competition is to develop a machine or deep learning-based model to classify the given paddy leaf images accurately. A training dataset of 10,407 (75%) labeled images across ten classes (nine disease categories and normal leaf) is provided. Moreover, the competition host also provides additional metadata for each image, such as the paddy variety and age. Your task is to classify each paddy image in the given test dataset of 3,469 (25%) images into one of the nine disease categories or a normal leaf.

This is where we will be heading in the next session, trying out different tools and techniques.

EDA Notebook

Logistic regression (try it on your own first, but if you're stuck, look at the notebook below):
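If the term is unfamiliar: a multinomial logistic regression in PyTorch is essentially a single linear layer trained with cross-entropy loss. The following is only a minimal sketch of that idea, not the course notebook. The image size (32x32), the learning rate, and the random placeholder tensors are assumptions made for illustration; in practice the batch would come from a `DataLoader` over the competition images.

```python
import torch
from torch import nn

torch.manual_seed(0)

NUM_CLASSES = 10                      # nine disease categories + normal leaf
IMG_SIZE = 32                         # assumption: images downscaled to 32x32 RGB
NUM_FEATURES = 3 * IMG_SIZE * IMG_SIZE

# Multinomial logistic regression: flatten the image, apply one linear layer.
# The softmax is folded into CrossEntropyLoss, so the model outputs raw logits.
model = nn.Sequential(nn.Flatten(), nn.Linear(NUM_FEATURES, NUM_CLASSES))
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Placeholder batch standing in for normalized images and integer class labels
images = torch.randn(64, 3, IMG_SIZE, IMG_SIZE)
labels = torch.randint(0, NUM_CLASSES, (64,))

losses = []
for step in range(20):
    optimizer.zero_grad()
    logits = model(images)            # shape: (batch, NUM_CLASSES)
    loss = loss_fn(logits, labels)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

predictions = logits.argmax(dim=1)    # predicted class index per image
```

A baseline like this is mainly useful as a reference score: any later model (MLP, CNN) should beat it on held-out validation data, otherwise something is wrong in the pipeline.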


Build an MLP in PyTorch Lightning for the Paddy challenge on Kaggle
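As a rough starting point, the network itself might look like the sketch below. It is written in plain PyTorch so that it runs without extra dependencies; in PyTorch Lightning the same layers would live inside a `LightningModule`, with the loss computed in `training_step` and the optimizer returned from `configure_optimizers`. The class name `PaddyMLP`, the 64x64 input size, the hidden width, and the dropout rate are all illustrative assumptions.

```python
import torch
from torch import nn


class PaddyMLP(nn.Module):
    """Hypothetical MLP for ten-class paddy leaf classification."""

    def __init__(self, in_features=3 * 64 * 64, hidden=256, num_classes=10, dropout=0.2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),                      # (B, 3, 64, 64) -> (B, 12288)
            nn.Linear(in_features, hidden),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(hidden, hidden // 2),
            nn.ReLU(),
            nn.Linear(hidden // 2, num_classes),  # raw logits, one per class
        )

    def forward(self, x):
        return self.net(x)


mlp = PaddyMLP()
batch = torch.rand(8, 3, 64, 64)   # placeholder batch, assumption: 64x64 RGB
mlp_logits = mlp(batch)
```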


Do your own EDA on the Paddy challenge and/or look at EDA notebooks from other competitors. Produce a final, presentable EDA notebook.
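A few first questions for such an EDA could be: how balanced are the classes, and how do the metadata (variety, age) relate to the label? The sketch below uses pandas on a tiny made-up table. The column names `image_id`, `label`, `variety`, and `age`, as well as all values, are assumptions for illustration; with the real data the frame would be loaded from the competition's metadata file instead.

```python
import pandas as pd

# Toy stand-in for the training metadata (all values are made up)
train_df = pd.DataFrame({
    "image_id": ["1.jpg", "2.jpg", "3.jpg", "4.jpg", "5.jpg", "6.jpg"],
    "label":    ["normal", "blast", "blast", "hispa", "normal", "blast"],
    "variety":  ["variety_a", "variety_a", "variety_b", "variety_a", "variety_b", "variety_a"],
    "age":      [45, 60, 70, 50, 45, 65],
})

# Class balance: absolute counts and share of the dataset per label
class_counts = train_df["label"].value_counts()
class_share = (class_counts / len(train_df)).round(3)

# Age statistics per label
age_by_label = train_df.groupby("label")["age"].agg(["min", "median", "max"])

# How often each variety occurs within each label
variety_by_label = pd.crosstab(train_df["label"], train_df["variety"])

print(class_counts)
print(age_by_label)
print(variety_by_label)
```

Strong class imbalance found here is worth noting in the notebook, since it affects both how the validation split should be drawn and how accuracy should be interpreted.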

Transfer the CNN from the Coursera assignment to our Kaggle competition
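The main changes when moving a CNN to this competition are usually the input (three-channel RGB images of a different size) and the output layer (ten classes). The sketch below is a generic small CNN used as a stand-in, not the network from the Coursera assignment; the class name `SmallCNN`, the layer sizes, and the 96x96 input are assumptions. The adaptive pooling layer makes the classifier independent of the exact image resolution.

```python
import torch
from torch import nn


class SmallCNN(nn.Module):
    """Generic stand-in CNN with a ten-class output head."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),   # RGB input
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),                      # -> (B, 32, 1, 1) for any input size
        )
        self.classifier = nn.Linear(32, num_classes)      # replace the original output layer here

    def forward(self, x):
        x = self.features(x)
        return self.classifier(torch.flatten(x, 1))


cnn = SmallCNN()
cnn_logits = cnn(torch.rand(4, 3, 96, 96))   # placeholder batch, assumption: 96x96 RGB
```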

Familiarize yourself with these PyTorch tutorials:
