opencampus.sh Machine Learning Program

Selecting the Optimizer

Last updated 4 years ago

The blog post "An overview of gradient descent optimization algorithms" by Sebastian Ruder gives you a detailed introduction to the existing optimizers and their specifics.

The blog post pasted below was published in The Batch on September 9, 2020. It gives you a quick overview of the currently most important optimizers and their empirical differences, based on a paper by Robin Schmidt and colleagues at the University of Tübingen.

Optimizer Shootout

Everyone has a favorite optimization method, but it’s not always clear which one works best in a given situation. New research aims to establish a set of benchmarks.

What’s new: Robin Schmidt and colleagues at University of Tübingen evaluated 14 popular optimizers using the Deep Optimization Benchmark Suite some of them introduced last year.

Key insight: Choosing an optimizer is something of a dark art. Testing the most popular ones in several common tasks is a first step toward setting baselines for comparison.

How it works: The authors evaluated methods including AMSGrad, AdaGrad, Adam (see Andrew’s video on the topic), RMSProp (video), and stochastic gradient descent. Their selection was based on the number of mentions a given optimizer received in the abstracts of arXiv.org preprints.

  • The authors tested each optimization method on eight deep learning problems consisting of a dataset (image or text), standard architecture, and loss function. The problems include both generative and classification tasks.

  • They used the initial hyperparameter values proposed by each optimizer’s original authors. They also searched 25 and 50 random values to probe each one’s robustness.

  • They applied four different learning rate schedules including constant value, smooth decay, cyclical values, and a trapezoidal method (in which the learning rate increased linearly at the beginning, maintained its value, and decreased linearly at the very end).

  • Each experiment was performed using 10 different initializations in case a given initialization degraded performance.
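To make the four learning rate schedules mentioned above more concrete, here is a minimal sketch in plain Python. The article does not give the exact formulas used in the paper, so the concrete shapes are assumptions for illustration: cosine annealing stands in for "smooth decay", a triangular wave stands in for "cyclical values", and the function names and parameters (`cycle_len`, `min_factor`, `warmup_steps`, `cooldown_steps`) are hypothetical.

```python
import math


def constant(step, total_steps, base_lr):
    """Constant schedule: the learning rate never changes."""
    return base_lr


def smooth_decay(step, total_steps, base_lr):
    """Smooth decay, here assumed to be cosine annealing from base_lr to 0."""
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * step / total_steps))


def cyclical(step, total_steps, base_lr, cycle_len=20, min_factor=0.1):
    """Cyclical values, here assumed to be a triangular wave between
    min_factor * base_lr and base_lr with a period of cycle_len steps."""
    position = (step % cycle_len) / cycle_len       # position in the cycle, in [0, 1)
    triangle = 1.0 - abs(2.0 * position - 1.0)      # rises 0 -> 1, then falls 1 -> 0
    return base_lr * (min_factor + (1.0 - min_factor) * triangle)


def trapezoidal(step, total_steps, base_lr, warmup_steps=10, cooldown_steps=10):
    """Trapezoidal schedule: linear increase at the beginning, constant
    plateau, linear decrease at the very end."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    if step >= total_steps - cooldown_steps:
        return base_lr * (total_steps - step) / cooldown_steps
    return base_lr


if __name__ == "__main__":
    total = 100
    for name, schedule in [("constant", constant), ("smooth decay", smooth_decay),
                           ("cyclical", cyclical), ("trapezoidal", trapezoidal)]:
        values = [round(schedule(s, total, 0.1), 4) for s in (0, 5, 50, 95, 99)]
        print(f"{name:>12}: {values}")
```

In a real training loop you would usually take such schedules from your framework instead of writing them yourself. The functions here only show how the learning rate changes as a function of the training step.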

Results: No particular method yielded the best performance in all problems, but several popular ones worked well on the majority of problems. (These included Adam, giving weight to the common advice to use it as a default choice.) No particular hyperparameter search or learning rate schedule proved universally superior, but hyperparameter search raised median performance among all optimizers on every task.

Why it matters: Optimizers are so numerous that it’s impossible to compare them all, and differences among models and datasets are bound to introduce confounding variables. Rather than relying on a few personal favorites, machine learning engineers can use this work to get an objective read on the options.

We’re thinking: That’s 14 optimizers down and hundreds to go! The code is open source, so in time we may get to the rest.
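The idea of a random hyperparameter search with a fixed budget (25 or 50 trials, as in the article) can be sketched on a toy problem. Everything below is an illustrative assumption and not the setup of the paper: the loss is a small ill-conditioned quadratic, only the learning rate is tuned, it is sampled log-uniformly between 1e-4 and 1, and the helper names (`run_sgd`, `run_adam`, `random_search`) are hypothetical. The Adam update uses the default values from the original Adam paper (beta1 = 0.9, beta2 = 0.999, eps = 1e-8).

```python
import math
import random


def loss_and_grad(w):
    """Toy problem: f(w) = 0.5 * (w0^2 + 25 * w1^2), an ill-conditioned quadratic."""
    scales = (1.0, 25.0)
    loss = 0.5 * sum(s * x * x for s, x in zip(scales, w))
    grad = [s * x for s, x in zip(scales, w)]
    return loss, grad


def run_sgd(lr, steps=100):
    """Plain gradient descent; returns the final loss."""
    w = [1.0, 1.0]
    for _ in range(steps):
        _, g = loss_and_grad(w)
        w = [x - lr * gi for x, gi in zip(w, g)]
    return loss_and_grad(w)[0]


def run_adam(lr, steps=100, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam with bias-corrected moment estimates; returns the final loss."""
    w, m, v = [1.0, 1.0], [0.0, 0.0], [0.0, 0.0]
    for t in range(1, steps + 1):
        _, g = loss_and_grad(w)
        m = [beta1 * mi + (1 - beta1) * gi for mi, gi in zip(m, g)]
        v = [beta2 * vi + (1 - beta2) * gi * gi for vi, gi in zip(v, g)]
        m_hat = [mi / (1 - beta1 ** t) for mi in m]
        v_hat = [vi / (1 - beta2 ** t) for vi in v]
        w = [x - lr * mh / (math.sqrt(vh) + eps)
             for x, mh, vh in zip(w, m_hat, v_hat)]
    return loss_and_grad(w)[0]


def random_search(run, budget, rng, low=1e-4, high=1.0):
    """Try `budget` log-uniformly sampled learning rates and keep the best one."""
    best_lr, best_loss = None, float("inf")
    for _ in range(budget):
        lr = 10 ** rng.uniform(math.log10(low), math.log10(high))
        final_loss = run(lr)
        if final_loss < best_loss:  # diverged runs (inf or nan) never win
            best_lr, best_loss = lr, final_loss
    return best_lr, best_loss


if __name__ == "__main__":
    for budget in (25, 50):
        for name, run in [("SGD", run_sgd), ("Adam", run_adam)]:
            lr, loss = random_search(run, budget, random.Random(0))
            print(f"budget={budget:2d} {name:>4}: best lr={lr:.4g}, final loss={loss:.3g}")
```

Because both budgets start from the same random seed, the 50-trial search contains all learning rates of the 25-trial search, so its best result can never be worse. A toy comparison like this says nothing about which optimizer is better in general; it only shows the mechanics of tuning each optimizer with the same budget before comparing them.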

An overview of gradient descent optimization algorithms, Sebastian Ruder

Figure: 14 most popular optimizers according to arXiv mentions