Attention Is All You Need
Vaswani et al. · 2017
Introduces the Transformer, an architecture based solely on attention mechanisms, dispensing with recurrence and convolutions entirely.
Transformer · Self-Attention · Multi-Head Attention
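A minimal NumPy sketch of the scaled dot-product attention the paper is built on; the shapes, weight matrices, and variable names here are illustrative, not the paper's exact configuration.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # (seq, seq) similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ V                                # weighted sum of values

# Toy example: 4 tokens, model dimension 8 (dimensions chosen for illustration).
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
Wq, Wk, Wv = (rng.standard_normal((8, 8)) for _ in range(3))
out = scaled_dot_product_attention(x @ Wq, x @ Wk, x @ Wv)
print(out.shape)  # (4, 8)
```

Multi-head attention runs several such projections in parallel and concatenates the heads before a final linear layer.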
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Devlin et al. · 2018
Proposes a bidirectional transformer pre-training approach that achieves state-of-the-art results on eleven NLP tasks.
Pre-training · Bidirectional · Fine-tuning
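A sketch of BERT's masked-language-model corruption step (roughly 15% of positions are selected; of those, 80% become [MASK], 10% a random token, 10% stay unchanged). The token id for [MASK] and the -100 ignore label are illustrative assumptions, not fixed by the paper.

```python
import numpy as np

MASK_ID = 103  # hypothetical [MASK] id; real ids depend on the vocabulary

def mask_tokens(token_ids, vocab_size, rng, mask_prob=0.15):
    """BERT-style masked-LM corruption over an integer id array."""
    token_ids = token_ids.copy()
    labels = np.full_like(token_ids, -100)       # -100 = position not scored
    picked = rng.random(token_ids.shape) < mask_prob
    labels[picked] = token_ids[picked]           # model must recover originals
    roll = rng.random(token_ids.shape)
    token_ids[picked & (roll < 0.8)] = MASK_ID            # 80% -> [MASK]
    rand = picked & (roll >= 0.8) & (roll < 0.9)          # 10% -> random token
    token_ids[rand] = rng.integers(0, vocab_size, size=rand.sum())
    return token_ids, labels                     # remaining 10% left unchanged

rng = np.random.default_rng(0)
ids, labels = mask_tokens(rng.integers(5, 1000, size=16), vocab_size=1000, rng=rng)
```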
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Dosovitskiy et al. · 2020
Applies the transformer architecture directly to sequences of image patches for image classification at scale.
ViT · Image Patches · Vision Transformer
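A sketch of the patch-extraction step that turns an image into the "16x16 words" a ViT embeds as tokens; the 224x224 input size is an illustrative assumption.

```python
import numpy as np

def image_to_patches(img, patch=16):
    """Split an (H, W, C) image into flattened non-overlapping patches."""
    H, W, C = img.shape
    assert H % patch == 0 and W % patch == 0
    img = img.reshape(H // patch, patch, W // patch, patch, C)
    img = img.transpose(0, 2, 1, 3, 4)            # (nH, nW, patch, patch, C)
    return img.reshape(-1, patch * patch * C)     # (num_patches, patch*patch*C)

img = np.random.default_rng(0).random((224, 224, 3))
patches = image_to_patches(img)
print(patches.shape)  # (196, 768): 14x14 patches, each 16*16*3 values
# A learned linear projection then maps each patch to the model dimension.
```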
Proximal Policy Optimization Algorithms
Schulman et al. · 2017
Presents PPO, a family of policy gradient methods for RL that balances simplicity and performance.
PPO · Policy Gradient · Clipping
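A minimal sketch of PPO's clipped surrogate objective, L = E[min(r·A, clip(r, 1-eps, 1+eps)·A)] with probability ratio r = pi_new(a|s) / pi_old(a|s); the toy batch values are illustrative, not from any real rollout.

```python
import numpy as np

def ppo_clip_loss(logp_new, logp_old, advantages, eps=0.2):
    """Negated PPO clipped surrogate, so gradient descent maximizes it."""
    ratio = np.exp(logp_new - logp_old)               # probability ratio r
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1 - eps, 1 + eps) * advantages
    return -np.mean(np.minimum(unclipped, clipped))   # loss to minimize

# Toy batch of 32 transitions.
rng = np.random.default_rng(0)
loss = ppo_clip_loss(rng.normal(size=32), rng.normal(size=32), rng.normal(size=32))
print(loss)
```

The clip keeps the updated policy from moving too far from the one that collected the data, which is the paper's stated trade-off between simplicity and stability.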
GPT-4 Technical Report
OpenAI · 2023
Describes GPT-4, a large-scale multimodal model that accepts image and text inputs and produces text outputs.
LLM · Multimodal · RLHF