Your Research Library

6 documents across 3 collections

Attention Is All You Need
Vaswani et al. · 2017

Introduces the Transformer, an architecture based solely on attention mechanisms, dispensing with recurrence and convolutions entirely.

Transformer · Self-Attention · Multi-Head Attention
15 pages
NLP Research 2024

BERT: Pre-training of Deep Bidirectional Transformers
Devlin et al. · 2018

Proposes a bidirectional transformer pre-training approach that achieves state-of-the-art results on 11 NLP tasks.

Pre-training · Bidirectional · Fine-tuning
16 pages
NLP Research 2024

An Image is Worth 16x16 Words
Dosovitskiy et al. · 2020

Applies the transformer architecture directly to image patches for image classification at scale.

ViT · Image Patches · Vision Transformer
21 pages
Computer Vision Papers

Proximal Policy Optimization Algorithms
Schulman et al. · 2017

Presents PPO, a family of policy gradient methods for reinforcement learning that balances simplicity and performance.

PPO · Policy Gradient · Clipping
12 pages
Reinforcement Learning

GPT-4 Technical Report
OpenAI · 2023

Describes GPT-4, a large-scale multimodal model capable of processing image and text inputs.

LLM · Multimodal · RLHF
98 pages
NLP Research 2024

Denoising Diffusion Probabilistic Models
Ho et al. · 2020

Introduces DDPM, connecting diffusion models with denoising score matching for high-quality image generation.

Diffusion · Score Matching · Generative Models
24 pages
Computer Vision Papers