New @datologyai research: 35× cheaper per correct answer through pretraining data curation alone. No parameter reduction or decoding tricks required. 🧵
A million correct answers cost ~$1.34 from our curated 4B, and ~$47 from verbose Qwen3.5-4B. Same answers.
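A quick sanity check on the headline number, using only the two approximate figures quoted above. The variable names are illustrative and not from the research itself.

```python
# Approximate figures quoted in the post: USD per one million correct answers.
curated_4b_cost = 1.34   # curated 4B model
qwen_4b_cost = 47.0      # Qwen3.5-4B

# How many times cheaper the curated model is per correct answer.
ratio = qwen_4b_cost / curated_4b_cost
print(f"about {ratio:.0f}x cheaper per correct answer")
```

Because both inputs are rounded, the ratio is only approximate, but it agrees with the 35× claim.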
What if you could induce models to be more concise via pretraining data curation?




