Representation Learning MSc course Summer Semester 2023

Summary

This course is tailored to MSc students of the AI and Data Science Master's program at Heinrich Heine University Düsseldorf.

We provide all the course materials, including lectures, slides, and exercise classes.

YouTube playlist of the lecture videos

Week 1 - Introduction to Representation Learning

Exercise

Image autoencoders: learning to use and evaluate the intermediate learned representations.
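
Below is a minimal sketch of the kind of convolutional autoencoder this exercise works with, assuming PyTorch; the architecture, input size, and latent dimension are illustrative choices, not the exercise's reference implementation.

```python
import torch
import torch.nn as nn

class ConvAutoencoder(nn.Module):
    """Minimal convolutional autoencoder; the bottleneck output z is the
    intermediate representation to be used and evaluated downstream."""
    def __init__(self, latent_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1),   # 32x32 -> 16x16
            nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1),  # 16x16 -> 8x8
            nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 8 * 8, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 64 * 8 * 8),
            nn.Unflatten(1, (64, 8, 8)),
            nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1),  # 8x8 -> 16x16
            nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),   # 16x16 -> 32x32
            nn.Sigmoid(),
        )

    def forward(self, x):
        z = self.encoder(x)               # learned representation
        return self.decoder(z), z

model = ConvAutoencoder()
x = torch.randn(8, 3, 32, 32)             # dummy batch of 32x32 RGB images
recon, z = model(x)
loss = nn.functional.mse_loss(recon, x)   # reconstruction objective
```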

Week 2 - Overview of visual self-supervised learning methods

  • Lecture || Slides
  • Self-supervised learning vs. transfer learning; pretext vs. downstream tasks
  • Pretext tasks covered: colorization, jigsaw puzzles, image inpainting, Shuffle and Learn (videos), classifying corrupted images, rotation prediction
  • Semi-supervised learning: consistency loss
  • A short introduction to the contrastive loss (InfoNCE)

Exercise

In this exercise, we will train a ResNet18 on the task of rotation prediction. Rotation prediction provides a simple yet effective way to learn rich representations from unlabeled image data. The basic idea is that the network is trained to predict the orientation of a given image after it has been rotated by a certain angle (e.g., 0°, 90°, 180°, or 270°).
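
A minimal PyTorch sketch of this setup follows; the rotate_batch helper and the hyperparameters are illustrative assumptions, not the exercise's reference code.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

def rotate_batch(x):
    """Create the four rotated copies of each image plus rotation labels."""
    rotations = [torch.rot90(x, k, dims=(2, 3)) for k in range(4)]  # 0/90/180/270 deg
    labels = torch.arange(4).repeat_interleave(x.size(0))           # one label per copy
    return torch.cat(rotations), labels

model = resnet18(num_classes=4)            # 4-way rotation classifier
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

x = torch.randn(16, 3, 32, 32)             # dummy unlabeled batch
inputs, targets = rotate_batch(x)
logits = model(inputs)
loss = criterion(logits, targets)
loss.backward()
optimizer.step()
```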

Week 3 - BERT: Learning Natural Language Representations

  • Lecture || Slides
  • Natural Language Processing (NLP) basics
  • RNN, self-attention, and Transformer recap
  • Language pretext tasks
  • Pretext tasks for representation learning in NLP. An in-depth look into BERT.

Exercise

In this exercise, you will train a small BERT model on the IMDB dataset (https://huggingface.co/datasets/imdb). You will then use the model to classify the sentiment of movie reviews and the sentiment of sentences from the Stanford Sentiment Treebank (SST2, https://huggingface.co/datasets/sst2).
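
A hedged sketch of such a fine-tuning pipeline with the Hugging Face datasets and transformers libraries is given below; the checkpoint, sequence length, and training arguments are illustrative choices, and the exercise's actual setup may differ.

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Load IMDB and tokenize; "bert-base-uncased" is an illustrative checkpoint.
dataset = load_dataset("imdb")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=256)

dataset = dataset.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)     # binary sentiment labels

args = TrainingArguments(output_dir="imdb-bert",
                         per_device_train_batch_size=16,
                         num_train_epochs=1)
trainer = Trainer(model=model, args=args,
                  train_dataset=dataset["train"],
                  eval_dataset=dataset["test"],
                  tokenizer=tokenizer)      # tokenizer enables dynamic padding
trainer.train()
```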

Week 4 - Contrastive Learning, SimCLR and mutual information-based proof

Exercise

Build and train a SimCLR ResNet18 on CIFAR-10.
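
The core of SimCLR is the NT-Xent (InfoNCE-style) loss computed over two augmented views of each image. A minimal PyTorch sketch follows; batch size, embedding dimension, and temperature are illustrative.

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.5):
    """NT-Xent loss. z1, z2: projector outputs for two views, shape (N, D).
    The positive for sample i is its other view; all remaining 2N - 2
    samples in the batch act as negatives."""
    z = F.normalize(torch.cat([z1, z2]), dim=1)      # L2-normalize embeddings
    sim = z @ z.t() / temperature                    # scaled cosine similarities
    n = z1.size(0)
    sim.masked_fill_(torch.eye(2 * n, dtype=torch.bool), float("-inf"))  # drop self-pairs
    targets = torch.cat([torch.arange(n) + n, torch.arange(n)])          # index of positive
    return F.cross_entropy(sim, targets)

z1, z2 = torch.randn(32, 128), torch.randn(32, 128)  # dummy projector outputs
loss = nt_xent(z1, z2)
```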

Week 5 - Understanding contrastive learning, MoCo, and image clustering

  • Lecture || Slides || MoCo implementation
  • Contrastive learning, L2 normalization, properties of the contrastive loss
  • Momentum encoder (MoCo); issues and concerns regarding batch normalization
  • Multi-view contrastive learning
  • Deep image clustering: task definition and challenges, K-means and SCAN, PMI and TEMI

Exercise

Use a pretrained MoCo ResNet50 for image clustering.
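
A minimal sketch of such a pipeline, assuming the MoCo weights have already been loaded into a torchvision ResNet50 (the weight loading, batch handling, and cluster count here are assumptions):

```python
import torch
from torchvision.models import resnet50
from sklearn.cluster import KMeans

backbone = resnet50()                 # stand-in; load the MoCo checkpoint here
backbone.fc = torch.nn.Identity()     # drop the classifier, keep 2048-d features
backbone.eval()

images = torch.randn(100, 3, 224, 224)  # dummy batch; use a real DataLoader in practice
with torch.no_grad():
    feats = backbone(images).numpy()

kmeans = KMeans(n_clusters=10, n_init=10).fit(feats)
cluster_ids = kmeans.labels_          # compare with ground truth via e.g. NMI or accuracy
```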

Week 6 - Vision Transformers and Knowledge Distillation

  • Lecture || Slides
  • The Transformer encoder and the Vision Transformer (ViT)
  • ViTs vs. CNNs: receptive fields and inductive biases
  • Knowledge distillation and the mysteries of model ensembles
  • Knowledge distillation in ViTs and masked image modeling

Exercise

Knowledge distillation on CIFAR-100 with Vision Transformers.
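
For reference, here is a minimal sketch of the classic soft-target distillation loss (Hinton et al.) that such an exercise typically builds on; the temperature and mixing weight are illustrative, and the teacher and student networks are left abstract.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """KL divergence between temperature-softened teacher and student
    distributions, mixed with the usual hard-label cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * T * T                                   # rescale gradients by T^2
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

student_logits = torch.randn(8, 100, requires_grad=True)  # CIFAR-100: 100 classes
teacher_logits = torch.randn(8, 100)                      # from a frozen teacher
labels = torch.randint(0, 100, (8,))
loss = distillation_loss(student_logits, teacher_logits, labels)
```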

Week 7 - Self-supervised learning without negative samples (BYOL, DINO)

  • Lecture || Slides
  • A small review of self-supervised methods
  • A small review of knowledge distillation
  • Self-Supervised Learning & knowledge distillation
  • An in-depth look into DINO

Exercise (2-week assignment)

In this exercise, you will implement and train a DINO model on a medical dataset: PathMNIST from MedMNIST, which consists of low-resolution images of various colon pathologies.
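
To illustrate the core mechanics, here is a minimal sketch of the DINO objective and the EMA teacher update; the temperatures, output dimension, and momentum are illustrative, and multi-crop augmentation and the running center update are omitted.

```python
import torch
import torch.nn.functional as F

def dino_loss(student_out, teacher_out, center, tps=0.1, tpt=0.04):
    """Cross-entropy between the centered, sharpened teacher distribution
    and the student distribution; no gradients flow to the teacher."""
    t = F.softmax((teacher_out - center) / tpt, dim=1).detach()  # center + sharpen
    s = F.log_softmax(student_out / tps, dim=1)
    return -(t * s).sum(dim=1).mean()

@torch.no_grad()
def ema_update(teacher, student, m=0.996):
    """Teacher weights are an exponential moving average of the student's."""
    for pt, ps in zip(teacher.parameters(), student.parameters()):
        pt.mul_(m).add_(ps, alpha=1 - m)

student_out = torch.randn(16, 256, requires_grad=True)  # projector outputs
teacher_out = torch.randn(16, 256)
center = torch.zeros(256)            # in DINO, an EMA of teacher outputs
loss = dino_loss(student_out, teacher_out, center)
```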

Week 8 - Mask-based visual representation learning: MAE, BEiT, iBOT, DINOv2

Week 9 - Multimodal representation learning, robustness, and visual anomaly detection

  • Lecture || Slides
  • Defining Robustness and Types of Robustness
  • Zero-shot learning
  • Contrastive Language Image Pretraining (CLIP)
  • Image captioning
  • Few-shot learning
  • Visual anomaly detection: task definition
  • Anomaly detection scores
  • Anomaly detection metrics: AUROC

Exercise (2-week assignment)

Use a CLIP-pretrained model for out-of-distribution detection.
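
One common score is the maximum softmax probability over in-distribution text prompts. The sketch below uses OpenAI's clip package with CIFAR-10 class names as an illustrative in-distribution set; the exercise's actual model, prompts, and scoring rule may differ.

```python
import torch
import clip  # OpenAI CLIP: pip install git+https://github.com/openai/CLIP.git

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

class_names = ["airplane", "automobile", "bird", "cat", "deer",
               "dog", "frog", "horse", "ship", "truck"]
text = clip.tokenize([f"a photo of a {c}" for c in class_names]).to(device)

@torch.no_grad()
def ood_score(pil_image):
    """Max softmax probability over in-distribution prompts;
    low values suggest the image is out-of-distribution."""
    image_feat = model.encode_image(preprocess(pil_image).unsqueeze(0).to(device))
    text_feat = model.encode_text(text)
    image_feat = image_feat / image_feat.norm(dim=-1, keepdim=True)
    text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)
    logits = 100.0 * image_feat @ text_feat.t()        # CLIP's logit scale
    return logits.softmax(dim=-1).max().item()
```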

Week 10 - Emerging properties of the learned representations and scaling laws

  • Lecture || Slides
  • Investigating CLIP models and scaling laws
  • What determines the success of CLIP?
  • How does CLIP scale to larger datasets and models?
  • OpenCLIP: Scaling laws of CLIP models and connection to NLP scaling laws
  • Robustness of CLIP models against image manipulations
  • Learned representations of supervised models: CNNs vs. Vision Transformers (ViTs), the texture-shape bias
  • Robustness and generalization of supervised-pretrained CNNs vs. ViTs
  • Scaling (Supervised) Vision Transformers
  • Properties of ViT pretrained models

Week 11 - Investigating the self-supervised learned representations

  • Lecture || Slides
  • Limitations of existing vision language models
  • Self-supervised vs. supervised learned feature representations
  • What do vision transformers (ViTs) learn “on their own”?
  • MoCo v3 and DINO: https://arxiv.org/abs/2104.14294
  • Self-supervised learning in medical imaging
  • Investigating self-supervised pre-training objectives

Exercise

No exercise takes place this week.

Week 12 - Representation Learning in Proteins

Exercise

Use a pretrained protein language model.
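
As a minimal sketch, the snippet below pulls a pretrained ESM-2 model via torch.hub and extracts a per-protein embedding; the specific checkpoint, the toy sequence, and the mean-pooling step are illustrative assumptions.

```python
import torch

# Load a pretrained ESM-2 protein language model (illustrative checkpoint).
model, alphabet = torch.hub.load("facebookresearch/esm:main",
                                 "esm2_t33_650M_UR50D")
batch_converter = alphabet.get_batch_converter()
model.eval()

data = [("protein1", "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ")]  # toy sequence
labels, strs, tokens = batch_converter(data)

with torch.no_grad():
    out = model(tokens, repr_layers=[33])                    # final-layer states
# Drop BOS/EOS tokens and mean-pool over residues.
embedding = out["representations"][33][0, 1:-1].mean(dim=0)
```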

Week 13 - AlphaFold2

Exercise

Experiment with an AlphaFold2 notebook.

Additional info

Feel free to open issues regarding errors in the exercises or missing information, and we will try to get back to you.

Important: Solutions to the exercises are not provided, but you can cross-check your results against the expected results in the notebooks.
