Practice · Foundational papers

Implement classic papers from scratch

Pick any paper below and PaperNova generates a guided workbook: an outline, exercise-by-exercise explanations from beginner to advanced, and a downloadable Jupyter notebook you can run locally.

LLM Fine-tuning
2021
INTERMEDIATE · Featured

LoRA: Low-Rank Adaptation of Large Language Models

Edward J. Hu, Yelong Shen, Phillip Wallis, +5 more

LoRA. Inject low-rank matrices into frozen pretrained weights for cheap, effective fine-tuning — the backbone of most open-source LLM adaptation today.

Start workbook →
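The core mechanism fits in a few lines. A minimal numpy sketch of the idea, not the paper's code; shapes, the rank r, and the alpha scale below are illustrative:

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16, r=4):
    """y = x W^T + (alpha/r) * x A^T B^T: frozen W plus a trainable low-rank update."""
    return x @ W.T + (x @ A.T) @ B.T * (alpha / r)

rng = np.random.default_rng(0)
d_out, d_in, r = 8, 6, 4
W = rng.normal(size=(d_out, d_in))       # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01    # trainable, small random init
B = np.zeros((d_out, r))                 # trainable, zero init: adapter starts as a no-op
x = rng.normal(size=(3, d_in))
y = lora_forward(x, W, A, B)
```

Because B starts at zero, the adapted model initially reproduces the frozen model exactly; only A and B receive gradients during fine-tuning.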
Multimodal
2021
INTERMEDIATE · Featured

Learning Transferable Visual Models From Natural Language Supervision

Alec Radford, Jong Wook Kim, Chris Hallacy, +1 more

CLIP. Contrastive pretraining on 400M image-text pairs produces a zero-shot classifier that rivals supervised ImageNet models — a cornerstone of multimodal learning.

Start workbook →
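The training objective is a symmetric contrastive loss over matched image/text pairs. A small numpy sketch under assumed toy shapes (the real model uses learned encoders and a learned temperature):

```python
import numpy as np

def clip_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric cross-entropy over cosine-similarity logits; matched pairs on the diagonal."""
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature          # (N, N)
    labels = np.arange(len(logits))

    def xent(l):
        # row-wise softmax cross-entropy with the diagonal as the target class
        l = l - l.max(axis=1, keepdims=True)
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    return (xent(logits) + xent(logits.T)) / 2  # image-to-text + text-to-image

rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 8))
loss_matched = clip_loss(emb, emb)                        # perfectly aligned pairs
loss_shuffled = clip_loss(emb, np.roll(emb, 1, axis=0))   # deliberately mismatched
```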
Generative
2021
ADVANCED · Featured

Diffusion Models Beat GANs on Image Synthesis

Prafulla Dhariwal, Alex Nichol

Diffusion models surpass GANs on high-fidelity image synthesis — the bridge to Stable Diffusion, DALL·E and the modern image generation era.

Start workbook →
Large Language Models
2020
ADVANCED · Featured

Language Models are Few-Shot Learners

Tom B. Brown, Benjamin Mann, Nick Ryder, +3 more

GPT-3. Shows that scale alone unlocks in-context learning — a 175B parameter LM can tackle new tasks from a handful of examples in the prompt.

Start workbook →
Computer Vision
2020
INTERMEDIATE · Featured

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, +3 more

Vision Transformer (ViT). Applies a pure Transformer to image patches and matches CNNs on ImageNet at scale — the paper that unified vision and language architectures.

Start workbook →
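The title is literal: the first step is cutting an image into 16x16 patches and flattening each into a token. A numpy sketch of that patchify step only (the Transformer on top is omitted):

```python
import numpy as np

def patchify(img, patch=16):
    """Split an (H, W, C) image into a sequence of flattened patch tokens."""
    H, W, C = img.shape
    assert H % patch == 0 and W % patch == 0
    img = img.reshape(H // patch, patch, W // patch, patch, C)
    img = img.transpose(0, 2, 1, 3, 4)          # (nH, nW, patch, patch, C)
    return img.reshape(-1, patch * patch * C)   # (num_patches, patch_dim)

img = np.arange(224 * 224 * 3, dtype=np.float32).reshape(224, 224, 3)
tokens = patchify(img)   # 14 x 14 = 196 tokens, each of dimension 16*16*3 = 768
```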
NLP
2018
INTERMEDIATE · Featured

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Jacob Devlin, Ming-Wei Chang, Kenton Lee, +1 more

Bidirectional masked-language modelling that reshaped NLP benchmarks and set the pretraining-then-finetuning pattern for years to come.

Start workbook →
NLP
2017
INTERMEDIATE · Featured

Attention Is All You Need

Ashish Vaswani, Noam Shazeer, Niki Parmar, +5 more

The foundational Transformer paper. Introduces multi-head self-attention and dispenses with recurrence and convolutions — the blueprint behind every modern large language model.

Start workbook →
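At the heart of the paper is scaled dot-product attention. A single-head numpy sketch (the paper adds multiple heads, projections, and masking on top of this):

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V: each query mixes values by key similarity."""
    d_k = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))   # 4 queries
K = rng.normal(size=(6, 8))   # 6 keys
V = rng.normal(size=(6, 8))   # 6 values
out, w = attention(Q, K, V)
```

The sqrt(d_k) scaling keeps the logits in a range where softmax gradients stay healthy as dimensionality grows.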
Computer Vision
2015
BEGINNER · Featured

Deep Residual Learning for Image Recognition

Kaiming He, Xiangyu Zhang, Shaoqing Ren, +1 more

ResNet. Residual connections made it possible to train networks with hundreds of layers and became standard plumbing for nearly every deep architecture since.

Start workbook →
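The key structural idea is y = x + F(x). A numpy sketch using dense layers for brevity (the paper uses convolutions with batch norm):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0)

def residual_block(x, W1, W2):
    """Basic residual block: y = relu(x + F(x)), with F a small two-layer network."""
    return relu(x + relu(x @ W1) @ W2)

d = 16
x = np.random.default_rng(0).random(size=(2, d))   # non-negative toy activations
# With F's weights at zero the block passes x straight through: depth comes for free,
# which is what makes stacks of hundreds of layers trainable.
y = residual_block(x, np.zeros((d, d)), np.zeros((d, d)))
```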
Computer Vision
2012
BEGINNER · Featured

ImageNet Classification with Deep Convolutional Neural Networks

Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton

AlexNet. The paper that kicked off the modern deep-learning era in computer vision by winning ImageNet 2012 with a convolutional neural network trained on GPUs.

Start workbook →
Large Language Models
2023
INTERMEDIATE

LLaMA: Open and Efficient Foundation Language Models

Hugo Touvron, Thibaut Lavril, Gautier Izacard, +1 more

LLaMA. Open-weights foundation model family that matched or beat GPT-3 at a fraction of the parameters and catalysed the open-source LLM ecosystem.

Start workbook →
Speech & Audio
2022
INTERMEDIATE

Robust Speech Recognition via Large-Scale Weak Supervision

Alec Radford, Jong Wook Kim, Tao Xu, +3 more

Whisper. An encoder-decoder Transformer trained on 680k hours of multilingual weakly-supervised audio — near-human speech recognition, zero-shot.

Start workbook →
Large Language Models
2022
BEGINNER

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

Jason Wei, Xuezhi Wang, Dale Schuurmans, +3 more

Chain-of-Thought. A few worked-example prompts dramatically improve LLM reasoning on arithmetic, commonsense and symbolic tasks — zero training required.

Start workbook →
Self-Supervised Learning
2021
INTERMEDIATE

Masked Autoencoders Are Scalable Vision Learners

Kaiming He, Xinlei Chen, Saining Xie, +3 more

MAE. Mask 75% of image patches and reconstruct them — a BERT-style objective that yields strong, scalable vision representations.

Start workbook →
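The masking step itself is simple. A numpy sketch of random 75% patch masking (the encoder/decoder that consume these indices are omitted):

```python
import numpy as np

def random_mask(tokens, mask_ratio=0.75, rng=None):
    """Keep a random subset of patch tokens; return kept tokens and both index sets."""
    rng = rng or np.random.default_rng()
    n = len(tokens)
    n_keep = int(n * (1 - mask_ratio))
    perm = rng.permutation(n)
    keep, masked = perm[:n_keep], perm[n_keep:]
    return tokens[keep], keep, masked

tokens = np.random.default_rng(0).normal(size=(196, 768))   # e.g. ViT-style patch tokens
visible, keep_idx, mask_idx = random_mask(tokens, rng=np.random.default_rng(1))
```

The encoder sees only the 25% of visible tokens, which is where most of MAE's training speedup comes from.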
Self-Supervised Learning
2020
INTERMEDIATE

A Simple Framework for Contrastive Learning of Visual Representations

Ting Chen, Simon Kornblith, Mohammad Norouzi, +1 more

SimCLR. A clean, effective contrastive framework that learns visual representations without labels — closing much of the gap with supervised pretraining.

Start workbook →
Generative
2020
ADVANCED

Denoising Diffusion Probabilistic Models

Jonathan Ho, Ajay Jain, Pieter Abbeel

DDPM. The paper that made diffusion models practical — a simple denoising objective that scales to photorealistic generation.

Start workbook →
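The closed-form forward process is a good first exercise. A numpy sketch of sampling x_t directly from x_0 under the paper's linear beta schedule:

```python
import numpy as np

def forward_diffuse(x0, t, betas, eps):
    """Sample x_t ~ q(x_t | x_0) in closed form:
       x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    alpha_bar = np.cumprod(1.0 - betas)[t]
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps

T = 1000
betas = np.linspace(1e-4, 0.02, T)   # linear schedule from the paper
rng = np.random.default_rng(0)
x0 = rng.normal(size=(8,))
eps = rng.normal(size=(8,))
x_late = forward_diffuse(x0, T - 1, betas, eps)   # by the last step, almost pure noise
```

The trained network then learns to predict eps from x_t and t, which is the "simple denoising objective" in question.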
Reinforcement Learning
2017
INTERMEDIATE

Proximal Policy Optimization Algorithms

John Schulman, Filip Wolski, Prafulla Dhariwal, +2 more

PPO. A clipped-surrogate policy-gradient method that balances stability and simplicity — the default RL algorithm behind RLHF and most modern agents.

Start workbook →
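The clipped surrogate is the paper's central equation. A numpy sketch of the objective alone (advantage estimation and the policy network are omitted):

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, eps=0.2):
    """min(r * A, clip(r, 1-eps, 1+eps) * A), averaged over samples."""
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1 - eps, 1 + eps) * advantage
    return np.minimum(unclipped, clipped).mean()

ratio = np.array([0.5, 1.0, 1.5, 3.0])   # pi_new(a|s) / pi_old(a|s) per sample
adv = np.ones(4)
obj = ppo_clip_objective(ratio, adv)
# Ratios outside [0.8, 1.2] contribute at most the clipped value,
# so a single wildly off-policy sample cannot dominate the update.
```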
Graph Learning
2016
INTERMEDIATE

Semi-Supervised Classification with Graph Convolutional Networks

Thomas N. Kipf, Max Welling

GCN. A first-order approximation of spectral graph convolutions that made graph neural networks simple, fast, and widely applicable.

Start workbook →
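One propagation layer is all there is to it. A numpy sketch of the layer rule relu(D^-1/2 (A + I) D^-1/2 X W) on a toy graph:

```python
import numpy as np

def gcn_layer(A, X, W):
    """One GCN layer: symmetric-normalised neighbourhood averaging, then a linear map."""
    A_hat = A + np.eye(len(A))                 # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W, 0)

# Toy 4-node path graph: 0-1-2-3
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.eye(4)                                  # one-hot node features
W = np.random.default_rng(0).normal(size=(4, 2))
H = gcn_layer(A, X, W)                         # each node now mixes its neighbourhood
```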
Classical ML
2016
BEGINNER

XGBoost: A Scalable Tree Boosting System

Tianqi Chen, Carlos Guestrin

XGBoost. An engineering-heavy gradient boosted trees framework that dominated Kaggle and production ML for years with regularised learning objectives.

Start workbook →
Computer Vision
2016
INTERMEDIATE

You Only Look Once: Unified, Real-Time Object Detection

Joseph Redmon, Santosh Divvala, Ross Girshick, +1 more

YOLO. Single-shot object detection that frames detection as a regression problem — fast, end-to-end, and a backbone of real-time vision systems ever since.

Start workbook →
Reinforcement Learning
2016
ADVANCED

Mastering the Game of Go with Deep Neural Networks and Tree Search

David Silver, Aja Huang, Chris J. Maddison, +1 more

AlphaGo. Combines deep policy/value networks with Monte Carlo tree search to beat the world champion — a landmark demonstration of RL at scale.

Start workbook →
Speech & Audio
2016
ADVANCED

WaveNet: A Generative Model for Raw Audio

Aäron van den Oord, Sander Dieleman, Heiga Zen, +1 more

WaveNet. An autoregressive convolutional model over raw audio samples that lifted text-to-speech naturalness to near-human levels.

Start workbook →
Training & Optimization
2015
BEGINNER

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Sergey Ioffe, Christian Szegedy

Batch Normalization. Normalising activations per-batch stabilises and speeds up training — a now-standard building block in deep networks.

Start workbook →
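The training-time computation is a few lines. A numpy sketch of per-feature batch normalisation (the running statistics used at inference are omitted):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalise each feature over the batch, then apply a learned scale and shift."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

x = np.random.default_rng(0).normal(loc=5.0, scale=3.0, size=(32, 4))
y = batch_norm(x, gamma=np.ones(4), beta=np.zeros(4))
# Each feature now has ~zero mean and ~unit variance over the batch.
```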
Computer Vision
2015
INTERMEDIATE

U-Net: Convolutional Networks for Biomedical Image Segmentation

Olaf Ronneberger, Philipp Fischer, Thomas Brox

U-Net. Encoder-decoder with skip connections that became the default architecture for medical imaging and any dense-prediction task on small datasets.

Start workbook →
Optimization
2014
BEGINNER

Adam: A Method for Stochastic Optimization

Diederik P. Kingma, Jimmy Ba

Adam. Adaptive first-order optimizer that became the default for almost every deep learning codebase — a must-implement from scratch.

Start workbook →
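The update rule is short enough to write from memory. A numpy sketch of one Adam step, driven on a toy quadratic (hyperparameters are the paper's defaults except the learning rate):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: bias-corrected first and second moment estimates."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad**2
    m_hat = m / (1 - b1**t)          # bias correction for zero-initialised m
    v_hat = v / (1 - b2**t)          # bias correction for zero-initialised v
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Minimise f(x) = x^2 starting from x = 3 (gradient is 2x)
theta = np.array([3.0]); m = np.zeros(1); v = np.zeros(1)
for t in range(1, 5001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t, lr=0.01)
```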
Regularization
2014
BEGINNER

Dropout: A Simple Way to Prevent Neural Networks from Overfitting

Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, +2 more

Dropout. Randomly zeroing units during training as an implicit ensemble — one of the simplest and most effective regularizers in deep learning.

Start workbook →
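The modern "inverted dropout" variant rescales at train time so inference needs no change. A numpy sketch:

```python
import numpy as np

def dropout(x, p=0.5, training=True, rng=None):
    """Inverted dropout: zero units with probability p, rescale survivors by 1/(1-p)."""
    if not training or p == 0:
        return x
    rng = rng or np.random.default_rng()
    mask = rng.random(x.shape) >= p
    return x * mask / (1 - p)

x = np.ones(10000)
y = dropout(x, p=0.5, rng=np.random.default_rng(0))
# The 1/(1-p) rescaling preserves the expected activation, so
# test-time forward passes use the network as-is.
```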
Generative
2014
INTERMEDIATE

Generative Adversarial Networks

Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, +5 more

The original GAN paper. A generator and a discriminator locked in a minimax game — an elegant framing that opened an entire subfield of generative modelling.

Start workbook →
Generative
2013
INTERMEDIATE

Auto-Encoding Variational Bayes

Diederik P. Kingma, Max Welling

VAE. The variational autoencoder — a principled probabilistic generative model with a learned latent space and the reparameterisation trick.

Start workbook →
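The two pieces worth implementing first are the reparameterisation trick and the closed-form KL term. A numpy sketch (the encoder/decoder networks are omitted):

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Sample z ~ N(mu, sigma^2) as z = mu + sigma * eps with eps ~ N(0, I),
    so the sample stays differentiable w.r.t. mu and log_var."""
    eps = rng.normal(size=mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL(N(mu, sigma^2) || N(0, I)): the VAE's latent regulariser."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

mu = np.zeros(4); log_var = np.zeros(4)   # posterior already standard normal
z = reparameterize(mu, log_var, np.random.default_rng(0))
```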
Reinforcement Learning
2013
INTERMEDIATE

Playing Atari with Deep Reinforcement Learning

Volodymyr Mnih, Koray Kavukcuoglu, David Silver, +2 more

DQN. Deep Q-Networks learn to play Atari games from raw pixels, kickstarting the deep reinforcement learning era.

Start workbook →
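The learning signal is the one-step TD target. A numpy sketch of just the target computation (replay buffer, target network, and the Q-network itself are omitted):

```python
import numpy as np

def dqn_target(reward, next_q, done, gamma=0.99):
    """TD target: r + gamma * max_a' Q(s', a'), with no future term at episode end."""
    return reward + gamma * (1 - done) * next_q.max(axis=-1)

next_q = np.array([[1.0, 3.0, 2.0],     # Q(s', .) for two transitions,
                   [0.5, 0.2, 0.1]])    # as produced by the target network
reward = np.array([1.0, 2.0])
done = np.array([0.0, 1.0])             # the second transition ends the episode
y = dqn_target(reward, next_q, done)
```

The Q-network is then regressed toward y, which is how raw pixels plus a scalar reward become a playable policy.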
NLP
2013
BEGINNER

Efficient Estimation of Word Representations in Vector Space

Tomas Mikolov, Kai Chen, Greg Corrado, +1 more

Word2Vec. Skip-gram and CBOW turned words into dense vectors whose geometry encodes meaning — the bridge between symbolic text and modern deep learning.

Start workbook →
Sequence Models
1997
INTERMEDIATE

Long Short-Term Memory

Sepp Hochreiter, Jürgen Schmidhuber

LSTM. The gated recurrent architecture that tamed the vanishing-gradient problem and powered sequence modelling for two decades.

Start workbook →
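One step of the cell shows all three gates at work. A numpy sketch with the four gate pre-activations packed into one matrix, a common implementation convenience rather than the paper's original notation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell(x, h, c, W, b):
    """One LSTM step: input (i), forget (f), and output (o) gates guard the cell
    state c, giving gradients a mostly additive path through time."""
    z = W @ np.concatenate([x, h]) + b
    i, f, o, g = np.split(z, 4)                     # gate pre-activations
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h_new = sigmoid(o) * np.tanh(c_new)
    return h_new, c_new

d_in, d_h = 3, 5
rng = np.random.default_rng(0)
W = rng.normal(size=(4 * d_h, d_in + d_h)) * 0.1
b = np.zeros(4 * d_h)
h = np.zeros(d_h); c = np.zeros(d_h)
for x in rng.normal(size=(10, d_in)):               # run a short sequence
    h, c = lstm_cell(x, h, c, W, b)
```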

Why implement classic papers?

Reading a paper and implementing it are two very different skills. PaperNova's workbook tool bridges that gap: Gemini turns the paper into a sequence of small, self-contained exercises — from a warm-up reimplementation of the core idea up to advanced extensions — then assembles them into a Jupyter notebook you can run, edit and extend.

Prefer to work from your own paper? Upload a PDF and get the same guided workbook tailored to it.