Links for 2025-02-05
AI:
1. A step towards robust jailbreak defenses: “After thousands of hours of red teaming, not one participant found a reliable jailbreak that extracted detailed information across a set of 10 harmful questions.” https://www.anthropic.com/research/constitutional-classifiers
2. Google presents Scaling Embedding Layers in Language Models: outperforms a 1.9B-parameter baseline across diverse corpora while using only half the inference-time FLOPs. https://arxiv.org/abs/2502.01637
3. Improving Transformer World Models for Data-Efficient RL: superhuman performance on the challenging Craftax-classic benchmark, an open-world 2D survival game. https://arxiv.org/abs/2502.01591
4. Process Reinforcement through Implicit Rewards: PRIME achieves a 15.1% average improvement over the SFT model across several key reasoning benchmarks. https://arxiv.org/abs/2502.01456
5. SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training https://arxiv.org/abs/2501.17161
6. First large-scale, verifiable code training dataset (ACECODE-89K) with automatically generated test cases. This is a major step towards enabling more effective RL for code generation. https://arxiv.org/abs/2502.01718
7. Reinforcement Learning for Long-Horizon Interactive LLM Agents: “Our analysis reveals a variety of behavioral patterns that emerge in the course of training…” https://arxiv.org/abs/2502.01600
8. Natural Language Reinforcement Learning: enables training chain-of-thought language policies and a language value function (a “generative value”) solely from environment feedback, without labels from humans or stronger models. https://github.com/waterhorse1/Natural-language-RL
9. Chain-of-Associated-Thoughts (CoAT) is a new framework that enhances LLMs' reasoning abilities by combining Monte Carlo Tree Search with dynamic knowledge integration. https://arxiv.org/abs/2502.02390
10. Language Models Use Trigonometry to Do Addition (a toy sketch of the idea appears just after this list) https://www.lesswrong.com/posts/E7z89FKLsHk5DkmDL/language-models-use-trigonometry-to-do-addition-1
11. S1: The $6 R1 Competitor? https://timkellogg.me/blog/2025/02/03/s1
12. Sam Altman says the leap from GPT-4 to GPT-5 will be as big as the one from GPT-3 to GPT-4, and that the plan is to integrate the GPT and o series of models into a single model that can do everything. https://youtu.be/qyTOVq31JIE?si=TzSFM3W45hPCSXZ1&t=741
13. Sam Altman: “Finally, for the first time, I think the models that are on the near-term horizon, the models that will release in the coming months, are over the threshold of being good enough to really address these problems, and now people just have to go build the solutions.” https://www.youtube.com/live/8vHr_8k8IbM?si=HUFjdfZvkPG921Te&t=3446
14. Decoding can be just as good as regular pointwise heads for regression, but you also get density estimation for free (a minimal sketch appears just after this list). https://arxiv.org/abs/2501.19383
15. Google DeepMind released a book on scaling language models on TPUs. https://jax-ml.github.io/scaling-book/index
16. Time to democratize humanoid robots! ToddlerBot, a low-cost ($6K), open-source humanoid for robotics and AI research. https://toddlerbot.github.io/
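A toy sketch of the idea behind item 10, in Python written for this post (it is not the paper's actual probe of model internals; the modulus N and the decoding step are choices made here purely for illustration): representing a number as an angle on a circle turns addition into rotation, so the sum can be read back off the circle.

import numpy as np

# Represent an integer a as a point on the unit circle at angle 2*pi*a/N,
# so adding two numbers becomes adding their angles (i.e. addition mod N).
N = 100  # work modulo 100, e.g. the last two digits of a sum

def embed(a: int) -> complex:
    return np.exp(2j * np.pi * a / N)

def add_via_rotation(a: int, b: int) -> int:
    z = embed(a) * embed(b)               # angle(z) = 2*pi*(a+b)/N
    angle = np.angle(z) % (2 * np.pi)     # map back into [0, 2*pi)
    return int(round(angle * N / (2 * np.pi))) % N

# Sanity check: the rotation trick reproduces ordinary modular addition.
for a, b in [(3, 4), (57, 68), (99, 1)]:
    assert add_via_rotation(a, b) == (a + b) % N
print(add_via_rotation(57, 68))  # 25, i.e. 125 mod 100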
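And a minimal sketch of the claim in item 14, under the assumption that “decoding” here amounts to predicting a distribution over discretized target values rather than a single scalar (the bin layout and the stand-in logits below are invented for illustration): a softmax over bins is already a density estimate, and its mean recovers the usual point prediction.

import numpy as np

# Regression by decoding over discrete bins: instead of one scalar, the model
# emits logits over B bins covering the target range. Softmax over the bins
# gives a density estimate; its mean gives the point prediction.
B = 50
edges = np.linspace(0.0, 10.0, B + 1)         # target range [0, 10] split into bins
centers = 0.5 * (edges[:-1] + edges[1:])

# Stand-in for a model's logits on one input (a bump near a true value of 6.3).
rng = np.random.default_rng(0)
logits = -0.5 * ((centers - 6.3) / 0.8) ** 2 + 0.05 * rng.normal(size=B)

probs = np.exp(logits - logits.max())
probs /= probs.sum()                          # softmax: histogram density over bins

point_estimate = float(probs @ centers)       # expected value = the regression output
lo, hi = centers[np.searchsorted(probs.cumsum(), [0.05, 0.95])]  # crude 90% interval
print(point_estimate, (lo, hi))               # point prediction plus an uncertainty band

In this framing, training is just cross-entropy against the bin that contains the true target, while a pointwise head would instead regress a single scalar with, say, squared error.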
AI politics:
1. How to Rapidly Build Gigawatt-Scale AI Clusters in the United States https://ifp.org/special-compute-zones/
2. Palantir CTO Shyam Sankar says the US is in a winner-take-all AI arms race and war with China, and that DeepSeek has made it clear that "the time to mobilize has come" https://www.youtube.com/live/MW0zvoEMdRA?si=mmuKS3myNpSeSufO&t=2025
3. 93% of IT Leaders Plan to Deploy AI Agents by 2026 https://www.zdnet.com/article/93-of-it-leaders-will-implement-ai-agents-in-the-next-two-years/
Science:
1. Scientists ‘mimic real biological processes’ using synthetic neurons https://news.northwestern.edu/stories/2025/01/scientists-mimic-real-biological-processes-using-synthetic-neurons
2. Necessity of complex numbers https://www.youtube.com/watch?v=f079K1f2WQk
3. The chance of asteroid 2024 YR4 hitting our planet in 2032 is now 1.5%, or about 1 in 67. https://x.com/Astro_Jonny/status/1886742128199336362