Links for 2024-12-18
AI:
1. Byte Latent Transformer: Patches Scale Better Than Tokens — Training transformers directly on raw bytes
https://arxiv.org/abs/2412.09871
2. Compressed Chain of Thought: Efficient Reasoning Through Dense Representations
https://arxiv.org/abs/2412.13171
3. REGENT: A generalist agent that can generalize to unseen robotics tasks and games via retrieval-augmentation and in-context learning.
https://kaustubhsridhar.github.io/regent-research/
4. Can frontier AI transform ANY physical object from ANY input modality into a high-quality digital twin that also MOVES? Articulate-Anything explores how large vision-language models (VLMs) can bridge the gap between the physical and digital worlds.
https://articulate-anything.github.io/
5. Cultural Evolution of Cooperation among LLM Agents
https://arxiv.org/abs/2412.10270
6. A demonstration of strategic deception arising naturally in LLM training
https://www.anthropic.com/research/alignment-faking
7. Testing which LLM architectures can do hidden serial reasoning
https://www.lesswrong.com/posts/ZB6guMhHH3NEyxA2k/testing-which-llm-architectures-can-do-hidden-serial-3
8. “We outperform Llama 70B with Llama 3B on hard math by scaling test-time compute. How? By combining step-wise reward models with tree search algorithms.”
https://huggingface.co/spaces/HuggingFaceH4/blogpost-scaling-test-time-compute
9. ProcessBench, a benchmark for measuring the ability to identify process errors in mathematical reasoning
https://arxiv.org/abs/2412.06559
10. “Project Numina, which won the first AIMO progress prize in part through developing their database...of nearly a million math problems”
https://mathstodon.xyz/@tao/113669121621914558
11. A dataset of questions on decision-theoretic reasoning in Newcomb-like problems
https://www.lesswrong.com/posts/d9amcRzns5pwg9Fcu/a-dataset-of-questions-on-decision-theoretic-reasoning-in
12. Superhuman performance of a large language model on the reasoning tasks of a physician
https://arxiv.org/abs/2412.10849
13. GenEx: Generating an Explorable World
https://www.genex.world/
14. Fast LLM Inference From Scratch
https://andrewkchan.dev/posts/yalm.html
15. MIT researchers introduce Boltz-1, a fully open-source model for predicting biomolecular structures
https://news.mit.edu/2024/researchers-introduce-boltz-1-open-source-model-predicting-biomolecular-structures-1217
16. “This paper describes a process for automatically generating academic finance papers using large language models”
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5060022
Neuroscience:
1. “We used neurofeedback from closed-loop real-time functional MRI to create new categories of visual objects in the brain, without the participants’ explicit awareness.”
https://mindblog.dericbownds.net/2024/12/sculpting-new-visual-categories-into.html
2. The Unbearable Slowness of Being: Why do we live at 10 bits/s?
https://arxiv.org/abs/2408.10234
3. What are recurrent networks doing in the brain?
https://www.thetransmitter.org/neural-networks/what-are-recurrent-networks-doing-in-the-brain/
Technology:
1. [Scott Aaronson: Really good article] Quantum Computers Cross Critical Error Threshold
https://www.quantamagazine.org/quantum-computers-cross-critical-error-threshold-20241209/
2. Fast, scalable, clean, and cheap enough: How off-grid solar microgrids can power the AI race
https://www.offgridai.us/
Miscellaneous:
1. The daring doctor behind a world-first treatment for autoimmune disease
https://www.nature.com/articles/d41586-024-03895-0
2. The number of exceptional people: Fewer than 85 per 1 million across key traits
https://www.sciencedirect.com/science/article/pii/S019188692400415X
3. “Incredible historically accurate short film set in Bronze Age Sardinia.” — Cast in Bronze: Sherden, the Sea People of Sardinia
https://www.youtube.com/watch?v=aAvtoFx3M00
4. “All of statistics and much of science depends on probability — an astonishing achievement, considering no one’s really sure what it is.”
https://www.nature.com/articles/d41586-024-04096-5