Collections

Collections including paper arxiv:2005.14165

Most influential papers in AI

Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 41
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 14
Universal Language Model Fine-tuning for Text Classification

Paper • 1801.06146 • Published Jan 18, 2018 • 6
Language Models are Few-Shot Learners

Paper • 2005.14165 • Published May 28, 2020 • 11

SciLitLLM: How to Adapt LLMs for Scientific Literature Understanding

Paper • 2408.15545 • Published 23 days ago • 32
Controllable Text Generation for Large Language Models: A Survey

Paper • 2408.12599 • Published 28 days ago • 61
To Code, or Not To Code? Exploring Impact of Code in Pre-training

Paper • 2408.10914 • Published about 1 month ago • 39
Automated Design of Agentic Systems

Paper • 2408.08435 • Published Aug 15 • 37

LLM Fundamental papers

Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 41
Language Models are Few-Shot Learners

Paper • 2005.14165 • Published May 28, 2020 • 11
GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints

Paper • 2305.13245 • Published May 22, 2023 • 5
Llama 2: Open Foundation and Fine-Tuned Chat Models

Paper • 2307.09288 • Published Jul 18, 2023 • 239

Language model papers

RoFormer: Enhanced Transformer with Rotary Position Embedding

Paper • 2104.09864 • Published Apr 20, 2021 • 9
Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 41
LoRA: Low-Rank Adaptation of Large Language Models

Paper • 2106.09685 • Published Jun 17, 2021 • 29
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness

Paper • 2205.14135 • Published May 27, 2022 • 9

Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 41
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 14
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

Paper • 1910.01108 • Published Oct 2, 2019 • 14
Language Models are Few-Shot Learners

Paper • 2005.14165 • Published May 28, 2020 • 11

LLM foundations

Mixture-of-Depths: Dynamically allocating compute in transformer-based language models

Paper • 2404.02258 • Published Apr 2 • 103
Textbooks Are All You Need

Paper • 2306.11644 • Published Jun 20, 2023 • 142
Jamba: A Hybrid Transformer-Mamba Language Model

Paper • 2403.19887 • Published Mar 28 • 103
Large Language Models Struggle to Learn Long-Tail Knowledge

Paper • 2211.08411 • Published Nov 15, 2022 • 3

Fundamentals LLM

Long-form factuality in large language models

Paper • 2403.18802 • Published Mar 27 • 23
Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 41
Language Models are Few-Shot Learners

Paper • 2005.14165 • Published May 28, 2020 • 11
A Survey of GPT-3 Family Large Language Models Including ChatGPT and GPT-4

Paper • 2310.12321 • Published Oct 4, 2023 • 1

Must Reads On Language Model

Dive into the world of generative AI with some prominent language model papers that unlock the secrets of natural language processing.

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 14
RoBERTa: A Robustly Optimized BERT Pretraining Approach

Paper • 1907.11692 • Published Jul 26, 2019 • 7
Language Models are Few-Shot Learners

Paper • 2005.14165 • Published May 28, 2020 • 11
OPT: Open Pre-trained Transformer Language Models

Paper • 2205.01068 • Published May 2, 2022 • 2

Lost in the Middle: How Language Models Use Long Contexts

Paper • 2307.03172 • Published Jul 6, 2023 • 35
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 14
Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 41
Llama 2: Open Foundation and Fine-Tuned Chat Models

Paper • 2307.09288 • Published Jul 18, 2023 • 239

Most influential papers

Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 41
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 14
Universal Language Model Fine-tuning for Text Classification

Paper • 1801.06146 • Published Jan 18, 2018 • 6
Language Models are Few-Shot Learners

Paper • 2005.14165 • Published May 28, 2020 • 11
