Synthetic Data for Large Language Models

non-profit

AI & ML interests

Exploring the diversity in synthetic data for pretraining large (and smol) language models.

Models

None public yet

Datasets

None public yet