Week 3 - Versioning with Git and Data Preparation (Part 1)

This week we will...

cover the following topics:

  • Talk about the tasks from last week

  • Statistical Significance

  • Introduction to Version Control with Git

  • Introduction to Data Preparation

Learning Resources

Until next week you should...

Watch this video (3 minutes) and designate a person in the team who will create a team repository as shown there.
  • 251106_Intro to git and data preparation.pdf (PDF, 1MB)

    Week 7 - Overfitting and Regularization

    This week we will...

    cover the following topics:

    • Important terms in machine learning

    • Overfitting and regularization

    • Model quality criteria

    • Introduction to neural nets

    Learning Resources

    • Graphical tool for the definition and estimation of neural networks for different example datasets

    • Example of the effect of overfitting and regularization

    Until next week you should...

    • 251204_Overfitting and Model Evaluation.pdf (PDF, 4MB)
    • Neural networks intuition
    • TensorFlow implementation

    Week 10 - Project Presentation

    For the presentation, generate predictions for the Kaggle competition test dataset using your best model and upload them there!

    Presentation (PowerPoint, Keynote or similar)

    Prepare an 8 to 10-minute presentation including:

    • Your team members’ names on the title slide

    • List and brief description of self-created variables

    • Bar charts with confidence intervals for two self-created variables

    • Linear model optimization: model equation and adjusted R²

    • Type of missing value imputation used

    • Neural network optimization:

      • Source code defining the neural network

      • Loss function plots for training and validation sets

      • MAPE scores for the overall validation set and each product group

    • Highlight “Worst Fail” and “Best Improvement” cases

    Each team member should have a part in the presentation!

    Document your work in the project repository, completing the README files as specified.

    One team member must upload the main README to the EduHub platform as described here.
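
The MAPE scores requested for the presentation could be computed along the following lines. This is only a minimal sketch in plain Python: the function names `mape` and `mape_by_group`, the example numbers, and the choice to skip rows with an actual value of zero are assumptions made for illustration, not part of the course material.

```python
from collections import defaultdict


def mape(actual, predicted):
    """Mean absolute percentage error in percent; rows with actual == 0 are skipped."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when all actual values are zero")
    return 100.0 * sum(abs((a - p) / a) for a, p in pairs) / len(pairs)


def mape_by_group(groups, actual, predicted):
    """Return a dict mapping each group label to the MAPE of its rows."""
    buckets = defaultdict(lambda: ([], []))
    for g, a, p in zip(groups, actual, predicted):
        buckets[g][0].append(a)
        buckets[g][1].append(p)
    return {g: mape(a, p) for g, (a, p) in buckets.items()}


# Hypothetical validation rows: product group, actual value, model prediction
groups = [1, 1, 2, 2]
actual = [100.0, 200.0, 50.0, 80.0]
predicted = [110.0, 180.0, 50.0, 100.0]

overall = mape(actual, predicted)
per_group = mape_by_group(groups, actual, predicted)

print(f"Overall MAPE: {overall:.2f}%")
for group, score in sorted(per_group.items()):
    print(f"Product group {group}: {score:.2f}%")
```

Reporting the per-group values next to the overall score makes it easier to spot product groups where the model performs much worse than average.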

    Week 1 - Introduction to Data Science

    This week you will...

    get an introduction to the following topics:

    • What is data science?

    • R vs. Python vs. SPSS vs. ...

    • Jupyter Notebooks

    Learning Resources

    Until next week you should...

    watch the first four chapters of this video on functions (12 minutes)

    • 251023_Introduction.pptx (PPTX, 26MB)
    • Markdown Guide

    Week 6 - Baseline Models and Linear Regression

    This week we will...

    • get to know the importance of baseline models

    • learn about naïve forecasting

    • see how linear regressions are defined

    • understand the role of cost functions

    • get an introduction to optimization functions

    Learning Resources

    • DataCamp Tutorial on Linear Regression

    Until next week you should...

    Document the linear regression calculations in the “Baseline Model” directory of your team repository.

    • 251127_Intro to Linear Regression.pdf (PDF, 4MB)
    • IntroMLandLinReg.ipynb (527KB)
    • The problem of overfitting

    Week 2 - Data Import and Visualization

    This week we will...

    cover the following topics:

    • VS Code and GitHub Codespaces

    • AI-assisted programming

    • Representation of different data structures

    • Reading data from external sources

    • Chart and scale types

    Learning Resources

    • Get Started with GitHub Copilot in VS Code

    • Overview on GitHub Copilot in VS Code

    • Optional local installation of Python and VS Code

    • for the graphical representation of data

    Until next week you should...

    • 251030_Import and Graphical Representation of Data.pdf (PDF, 2MB)

    Week 4 - Versioning with Git and Data Preparation (Part 2)

    This week we will...

    cover the following topics:

    • Versioning with Git in a Team

    • Important Issues to Consider for Feature Engineering

    • Introduction to Analyzing Time Series Data

    Learning Resources

    Until next week you should...

    (You need to create a free account with DeepLearning.AI.)

    • Examples
    • https://github.com/opencampus-sh/einfuehrung-in-data-science-und-ml

  • Additional downloaded or self-created data (e.g., holiday lists)

  • Code to merge all data into one dataset

  • Code to create new variables or prepare existing variables for prediction
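
As a rough illustration of the merge and variable-creation steps listed above, the following sketch uses pandas. All table contents, column names (`date`, `product_group`, `sales`, `holiday_name`) and derived variables are hypothetical and would need to be adapted to the actual project data.

```python
import pandas as pd

# Hypothetical core dataset: one row per date and product group
sales = pd.DataFrame({
    "date": pd.to_datetime(["2024-12-24", "2024-12-25", "2024-12-27"]),
    "product_group": [1, 1, 1],
    "sales": [120.0, 90.0, 150.0],
})

# Hypothetical self-created data: a list of public holidays
holidays = pd.DataFrame({
    "date": pd.to_datetime(["2024-12-25", "2024-12-26"]),
    "holiday_name": ["Christmas Day", "Boxing Day"],
})

# A left join keeps every sales row, including dates without a holiday entry
merged = sales.merge(holidays, on="date", how="left")

# New variables derived from existing columns
merged["is_holiday"] = merged["holiday_name"].notna().astype(int)
merged["weekday"] = merged["date"].dt.dayofweek  # Monday = 0, Sunday = 6
merged["is_weekend"] = (merged["weekday"] >= 5).astype(int)

print(merged)
```

A left join is used here so that no sales rows are lost when the additional data does not cover every date; whether that is the right choice depends on the datasets being combined.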

    • 251112_Intro to git and data preparation (Part 2).pdf (PDF, 2MB)
    • Moving Averages
    • Percent Change
    • Segmentation

    Preparation

    Before the first course session, you should...

    • Introduction to Python

    Week 8 - Neural Nets

    This week we will...

    • learn about different libraries for implementing neural nets

    • review example notebooks for the data preparation and training of a neural net using Pandas and TensorFlow

    • get to know additional types of layers in neural nets

    Learning Resources

    • Additional introduction video (12 minutes) on neural nets

    • Example data preparation notebook for a neural net

    • Example notebook for training a neural net

    Until next week you should...

    examine all your model variables for the existence of missing and implausible values.
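
One possible way to run such a check with pandas is sketched below. The column names, example values, and plausible ranges are assumptions chosen for illustration; suitable ranges depend on domain knowledge about each variable.

```python
import pandas as pd

# Hypothetical model variables; column names and values are assumptions
df = pd.DataFrame({
    "temperature": [12.5, None, 48.0, -3.0],  # degrees Celsius
    "wind_speed": [10.0, 15.0, -4.0, None],   # cannot be negative
    "sales": [120.0, 95.0, 130.0, 110.0],
})

# Count missing values per column
missing_counts = df.isna().sum()

# Plausible range per variable (assumed here, inclusive on both ends)
plausible_ranges = {
    "temperature": (-30.0, 45.0),
    "wind_speed": (0.0, 200.0),
    "sales": (0.0, float("inf")),
}

# Count non-missing values that fall outside the plausible range
implausible_counts = {}
for column, (low, high) in plausible_ranges.items():
    values = df[column].dropna()
    implausible_counts[column] = int((~values.between(low, high)).sum())

print(missing_counts)
print(implausible_counts)
```

Missing values are dropped before the range check so that the two counts do not overlap.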
    • 251211_Neural Nets.pdf (PDF, 3MB)

    Introduction to Data Science and Machine Learning

    Week 5 - Time Series Analyses and Introduction to Machine Learning

    This week we will...

    • learn about different patterns in time series

    • walk through the general procedure for training machine learning algorithms

    • get to know how to test predictions on Kaggle

    • get an impression of current developments in AI

    Learning Resources

    • Example notebook for graphical analyses of time series

    Until next week you should...

    • 251120_Time Series Analysis and Current AI Developments.pdf (PDF, 7MB)
    • Linear Regression
    • Multiple Linear Regression

    Conditions for Receiving a Certificate or ECTS

    All participants are expected to pursue a certificate of achievement or ECTS credits, that is, to fulfill the following conditions to complete the course:

    Online Attendance:

    If you attend via Zoom, please make sure to use your full name, which should be the same name that you used to register at edu.opencampus.sh. Otherwise your attendance will not be recorded!

    Online attendance is only accredited if you have the camera on, are participating with a laptop or desktop computer, and are in a sufficiently quiet location to participate in the group discussions.

    Week 9 - Missing Values

    This week we will...

    • learn how to use dropout layers

    • get to know different ways to handle missing values

    Learning Resources

    • Example notebook for handling missing values

    • Chapter 1 of this course at DataCamp

    • Hugging Face course on the Transformers library

    Until next week you should...

    • 251218_Missing Values.pdf (PDF, 3MB)