Advances in artificial intelligence, particularly in large language models (LLMs), have been driven by the "scaling law" paradigm: performance improves predictably with more data, more computation, and larger models.
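As a concrete illustration of this paradigm, one widely cited empirical form is the compute-optimal scaling law of Hoffmann et al. (2022), which models pretraining loss $L$ as a function of the parameter count $N$ and the number of training tokens $D$:

$$L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}},$$

where $E$, $A$, $B$, $\alpha$, and $\beta$ are empirically fitted constants. The exact values depend on the model family and dataset; the equation is shown here only to make the qualitative claim precise, namely that loss falls as a sum of power laws in model size and data volume.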