Introduction to Data Science and Machine Learning

Week 1 - Introduction to Data Science

This week you will...

get an introduction to the following topics:

  • What is data science?

  • R vs. Python vs. SPSS vs. ...

  • Jupyter Notebooks

Learning Resources

Until next week you should...

Preparation

Before the first course session, you should ...

Week 3 - Versioning with Git and Data Preparation (Part 1)

This week we will...

cover the following topics:

  • Talk about the tasks from last week

Week 2 - Data Import and Visualization

This week we will...

cover the following topics:

  • VS Code and GitHub Codespaces

this link
Introduction to Python
on functions (12 minutes)
  • 251023_Introduction.pptx (26MB)
    Markdown Guide
    this video
    Statistical Significance
  • Introduction to Version Control with Git

  • Introduction to Data Preparation

  • Learning Resources

    Until next week you should...

    251106_Intro to git and data preparation.pdf (PDF, 1MB)
    AI-assisted programming
  • Representation of different data structures

  • Reading data from external sources

  • Chart and scale types

  • Learning Resources

    • Get Started with GitHub Copilot in VS Code

    • Overview on GitHub Copilot in VS Code

    • Optional local installation of Python and VS Code

    • Examples for the graphical representation of data

    Until next week you should...

    251030_Import and Graphical Representation of Data.pdf (PDF, 2MB)

    Week 9 - Missing Values

    This week we will...

    • learn how to use dropout layers

    • get to know different ways to handle missing values

    Learning Resources

    • for handling missing values

    • Chapter 1 of the course at DataCamp

    • on the Transformers library

    Until next week you should...

    Week 4 - Versioning with Git and Data Preparation (Part 2)

    This week we will...

    cover the following topics:

    • Versioning with Git in a Team

    • Important Issues to Consider for Feature Engineering

    • Introduction to Analyzing Time Series Data

    Learning Resources

    Until next week you should...

    (You need to create a free account with DeepLearning.AI.)

    Week 5 - Time Series Analyses and Introduction to Machine Learning

    This week we will...

    • learn about different patterns in time series

    • walk through the general procedure for training machine learning algorithms

    • get to know how to test predictions on Kaggle

    • get an impression of current developments in AI

    Learning Resources

    • for graphical analyses of time series

    Until next week you should...

    Conditions for Receiving a Certificate or ECTS

    All participants are expected to pursue a certificate of achievement or ECTS credits, that is, to fulfill the following conditions to complete the course:

    Online Attendance:

    If you attend via Zoom, please make sure to use your full name, which should be the same name you used to register at edu.opencampus.sh. Otherwise your attendance will not be recorded!

    Online attendance is only accredited if you have the camera on, are participating with a laptop or desktop computer, and are in a sufficiently quiet location to participate in the group discussions.

    Week 6 - Baseline Models and Linear Regression

    This week we will...

    • get to know about the importance of baseline models

    • learn about naïve forecasting

    Week 10 - Project Presentation

    For the presentation, generate predictions for the Kaggle competition test dataset using your best model and upload them there!

    Presentation (PowerPoint, Keynote or similar)

    Prepare an 8- to 10-minute presentation including:

  • Your team members’ names on the title slide

  • List and brief description of self-created variables

  • Bar charts with confidence intervals for two self-created variables

  • Linear model optimization: model equation and adjusted R²

  • Type of missing value imputation used

  • Neural network optimization:

    • Source code defining the neural network

    • Loss function plots for training and validation sets

    • MAPE scores for the overall validation set and each product group

  • Highlight “Worst Fail” and “Best Improvement” cases

  • Each team member should have a part in the presentation!

    Document your work in the project repository, completing the README files as specified.

    One team member must upload the main README to the EduHub platform as described here.
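
    Two of the metrics requested above, the adjusted R² of the linear model and the MAPE for the overall validation set and each product group, can be computed along the following lines. This is only a minimal sketch, not the course's reference solution: the column names `product_group`, `sales` and `prediction`, the sample values, and the helper names `adjusted_r2` and `mape` are assumptions made for illustration.

```python
import pandas as pd


def adjusted_r2(r2: float, n_obs: int, n_predictors: int) -> float:
    """Adjusted R² = 1 - (1 - R²) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


def mape(actual: pd.Series, predicted: pd.Series) -> float:
    """Mean absolute percentage error in percent.

    Assumes that no actual value is zero.
    """
    return float(((actual - predicted).abs() / actual.abs()).mean() * 100)


# Hypothetical validation data; replace with your own validation set.
validation = pd.DataFrame(
    {
        "product_group": [1, 1, 2, 2],
        "sales": [100.0, 200.0, 50.0, 80.0],
        "prediction": [110.0, 180.0, 45.0, 100.0],
    }
)

# MAPE over the whole validation set
overall_mape = mape(validation["sales"], validation["prediction"])

# MAPE separately for each product group
mape_by_group = {
    group: mape(rows["sales"], rows["prediction"])
    for group, rows in validation.groupby("product_group")
}

print(f"Overall MAPE: {overall_mape:.2f}%")
for group, score in mape_by_group.items():
    print(f"Product group {group}: {score:.2f}%")
```

    If you fit the linear model with a library that already reports the adjusted R², you can take the value from there instead of computing it by hand.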

    this video
    https://github.com/opencampus-sh/einfuehrung-in-data-science-und-ml
  • 250619_Missing Values.pdf (PDF, 3MB)
    Example notebook
    this
    Hugging Face course
    this video
    251120_Time Series Analysis and Current AI Developments.pdf (PDF, 7MB)
    Example notebook
    Linear Regression
    Multiple Linear Regression
    here

  • see how linear regressions are defined

  • understand the role of cost functions

  • get an introduction to optimization functions

  • Learning Resources

    • DataCamp Tutorial on Linear Regression

    Until next week you should...

    251127_Intro to Linear Regression.pdf (PDF, 4MB)
    IntroMLandLinReg.ipynb (527KB)

    Code to merge all data into one dataset

  • Code to create new variables or prepare existing variables for prediction

  • 251112_Intro to git and data preparation (Part 2).pdf (PDF, 2MB)

    Week 7 - Overfitting and Regularization

    This week we will...

    cover the following topics:

    • Important terms in machine learning

    • Overfitting and regularization

    • Model quality criteria

    • Introduction to neural nets

    Learning Resources

    • for the definition and estimation of neural networks for different example datasets

    • of the effect of overfitting and regularization

    Until next week you should...

    Week 8 - Neural Nets

    This week we will...

    • learn about different libraries for implementing neural nets

    • review example notebooks for the data preparation and training of neural nets using Pandas and TensorFlow

    • get to know additional types of layers in neural nets

    Learning Resources

    • Additional (12 minutes) on neural nets

    • for a neural net

    • for training a neural net

    Until next week you should...

    Percent Change
    Segmentation
    251204_Overfitting and Model Evaluation.pdf (PDF, 4MB)
    Graphical tool
    Example
    Neural networks intuition
    TensorFlow implementation
  • 251211_Neural Nets.pdf (PDF, 3MB)
    introduction video
    Example data preparation notebook
    Example notebook
    this video
    this video
    this course