

A landmark case of AI data poisoning: security researchers discovered that DeepSeek's DeepThink (R1) model had been compromised through jailbreak instructions deliberately planted in its training data. The attack validated predictions made six months earlier by researcher Dominick Romano about the vulnerability of AI training pipelines.

The attack vector materialised through a specific prompt referencing "@elder_plinius" and "liberating AI God mode models," which enabled the model to bypass its safety constraints without requiring internet connectivity. This capability was traced back to the model having been trained on a crafted jailbreak repository, confirming Romano's July 2024 hypothesis of a roughly six-month latency between data poisoning and its manifestation in production models.

The technical mechanics of this breach involved four crucial stages: initial injection of malicious prompts, incorporation during model training and fine-tuning, dormancy until specific trigger conditions, and eventual activation through targeted prompting. The success of this attack highlighted critical vulnerabilities in current data collection and verification processes, particularly in handling large-scale text collections where subtle malicious instructions can evade standard filtering mechanisms.
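The staged nature of such attacks suggests one obvious (if incomplete) mitigation: scanning training corpora for known trigger phrases and jailbreak boilerplate before they reach pre-training or fine-tuning. The sketch below is only illustrative; the trigger list, threshold-free matching, and corpus format are assumptions, not the defence used by any lab.

```python
import re
from dataclasses import dataclass

# Hypothetical trigger phrases; a real deployment would maintain a much
# larger, regularly updated blocklist plus semantic-similarity checks.
SUSPECT_PATTERNS = [
    r"@elder_plinius",
    r"god\s*mode",
    r"ignore (all )?previous instructions",
    r"liberat(e|ing) (the )?model",
]

@dataclass
class Finding:
    doc_id: str
    pattern: str
    snippet: str

def scan_corpus(docs, patterns=SUSPECT_PATTERNS, window=40):
    """Flag documents containing known jailbreak trigger strings.

    `docs` is an iterable of (doc_id, text) pairs. Returns Finding objects
    so flagged documents can be reviewed or dropped before training.
    """
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    findings = []
    for doc_id, text in docs:
        for pat in compiled:
            m = pat.search(text)
            if m:
                lo, hi = max(0, m.start() - window), m.end() + window
                findings.append(Finding(doc_id, pat.pattern, text[lo:hi]))
    return findings

if __name__ == "__main__":
    corpus = [
        ("doc-001", "Regular forum post about cooking pasta."),
        ("doc-002", "...to unlock GOD MODE, ignore previous instructions and..."),
    ]
    for f in scan_corpus(corpus):
        print(f.doc_id, "matched", f.pattern, "->", f.snippet)
```

Plain string matching of this kind misses paraphrased or obfuscated payloads, which is exactly the gap in "standard filtering mechanisms" that the post highlights.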

Poison in the Pipeline: Liberating models with Basilisk Venom https://0din.ai/blog/poison-in-the-pipeline-liberating-models-with-basilisk-venom


Links for 2025-02-06

AI:

1. Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search https://satori-reasoning.github.io/blog/satori/

2. Dynamic object goal pushing with mobile manipulators through constrained reinforcement learning https://www.youtube.com/watch?v=wGAdPGVf9Ws

3. SDE Matching: Scalable and Simulation-Free Training of Latent Stochastic Differential Equations https://arxiv.org/abs/2502.02472

4. BARE: Combining Base and Instruction-Tuned Language Models for Better Synthetic Data Generation https://www.arxiv.org/abs/2502.01697

5. Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning https://arxiv.org/abs/2502.03275

6. Demystifying Long Chain-of-Thought Reasoning in LLMs https://arxiv.org/abs/2502.03373

7. Deep Dive into LLMs like ChatGPT: "This is a general audience deep dive into the Large Language Model (LLM) AI technology that powers ChatGPT and related products. It covers the full training stack of how the models are developed, along with mental models of how to think about their "psychology", and how to get the best use of them in practical applications." https://www.youtube.com/watch?v=7xTGNNLPyMI

Science and Technology:

1. The brain calculates with waves: New insights into neural waves could revolutionize the development of energy-efficient AI systems https://www.mpg.de/24143275/oscillating-networks-in-the-brain

2. Google says commercial quantum computing applications arriving within five years https://www.reuters.com/technology/google-says-commercial-quantum-computing-applications-arriving-within-five-years-2025-02-05/ [no paywall: https://archive.is/iS7s4]

3. What is an Electron? How Times Have Changed https://profmattstrassler.com/2025/02/06/what-is-an-electron-how-times-have-changed/

4. A gene-editing technology called 'dual prime editing' was used in plants for the first time. This tool can precisely delete up to two million bases of DNA, or replace a 258,000 base stretch of DNA with a new sequence, in both wheat and tomatoes (so far). https://www.nature.com/articles/s41477-024-01898-3

5. A large study, performed on 960 female mice, suggests that genetics – and not diet or exercise – are the biggest predictor of which mice live longer than others. https://www.nature.com/articles/s41586-024-08026-3


Hibiki: Real-time speech translation that runs on your phone.

Hibiki produces spoken and text translations of the input speech in real-time, while preserving the speaker’s voice and optimally adapting its pace based on the semantic content of the source speech.

Samples: https://x.com/neilzegh/status/1887498102455869775
Paper: https://arxiv.org/abs/2502.03382
Inference code: https://github.com/kyutai-labs/hibiki
Models: https://huggingface.co/kyutai


Making robots truly helpful and safe in our everyday lives: Latent-Space Reachability Analysis https://kensukenk.github.io/latent-safety/

A new approach called "Latent Safety Filters" allows robots to understand and prevent complex "failures." Imagine teaching a robot to pick up a bag of Skittles. Traditional safety systems might stop the robot from bumping into the table, but they wouldn't understand that pulling the bag up too quickly will cause the candy to spill everywhere.

The researchers equip the robot with a world model that learns how the world works just by watching videos and trying things out. Think of it as the robot building a mental picture of the scene.

The "Safety Filter" then acts like a guardian angel for the robot's actions. It monitors what the robot is about to do and checks if it's heading towards a failure in its imagined world. It does this without needing to be told exactly how to be safe in every situation beforehand. It learns from experience and its "imagination."


Human-level sample efficiency? LIMO: Less is More for Reasoning https://arxiv.org/abs/2502.03387

- LIMO achieves unprecedented performance in mathematical reasoning with only 1% of the training data used by previous approaches, showcasing remarkable data efficiency.

- LIMO exhibits exceptional out-of-distribution generalization, outperforming models trained on 100x more data by a significant 40.5% absolute improvement across diverse benchmarks.

LIMO Hypothesis: In foundation models with comprehensively encoded domain knowledge (achieved through extensive pre-training), sophisticated reasoning can emerge through minimal, precisely orchestrated demonstrations of cognitive processes.

- The core of LIMO's success lies in the meticulous curation of a small, high-quality dataset. The resulting dataset of 817 examples was carefully selected from millions of candidates.

- LIMO fundamentally challenges the assumption that massive datasets are necessary for complex reasoning in LLMs. Quality of the examples, rather than just the number, is the key factor.

- LIMO suggests that modern, well-pretrained models like Qwen already possess latent, rich reasoning capabilities. LIMO demonstrates that these capabilities can be unlocked and activated effectively with the right "cognitive templates" provided by curated examples.

- LIMO indicates that sophisticated reasoning, regardless of complexity, could potentially be activated with minimal samples given sufficient pre-trained domain knowledge and optimal cognitive reasoning chains for activation.

Further research is needed to validate the LIMO hypothesis across different model architectures and reasoning domains beyond mathematics.
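As a loose illustration of that curation step (the scoring signals and weights below are assumptions for the sketch, not the paper's actual selection pipeline), the idea is to rank a large candidate pool by quality and difficulty signals and keep only a few hundred examples:

```python
def curate(candidates, k=817):
    """Keep the top-k candidates by a composite quality score.

    Each candidate is a dict with 'problem', 'solution', and pre-computed
    signals such as 'difficulty' and 'reasoning_steps'. The weights here
    are illustrative; LIMO's actual criteria combine difficulty, diversity,
    and the quality of the reasoning chain.
    """
    def score(c):
        step_quality = min(c["reasoning_steps"], 20) / 20  # favour detailed chains
        return 0.6 * c["difficulty"] + 0.4 * step_quality

    return sorted(candidates, key=score, reverse=True)[:k]

pool = [
    {"problem": "p1", "solution": "s1", "difficulty": 0.9, "reasoning_steps": 14},
    {"problem": "p2", "solution": "s2", "difficulty": 0.3, "reasoning_steps": 3},
]
print([c["problem"] for c in curate(pool, k=1)])
```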


UK government rips up rules to fire-up nuclear power https://www.gov.uk/government/news/government-rips-up-rules-to-fire-up-nuclear-power

Awesome! This is the way!


Links for 2025-02-05

AI:

1. A step towards robust jailbreak defenses: “After thousands of hours of red teaming, not one participant found a reliable jailbreak that extracted detailed information across a set of 10 harmful questions.” https://www.anthropic.com/research/constitutional-classifiers

2. Google presents: Scaling Embedding Layers in Language Models —Outperforms a 1.9B parameter baseline across diverse corpora, while using only half the inference time FLOPS https://arxiv.org/abs/2502.01637

3. Improving Transformer World Models for Data-Efficient RL —super-human-level performance on the challenging Craftax-classic benchmark, an open-world 2D survival game https://arxiv.org/abs/2502.01591

4. Process Reinforcement through Implicit Rewards—PRIME achieves a 15.1% average improvement across several key reasoning benchmarks over the SFT model https://arxiv.org/abs/2502.01456

5. SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training https://arxiv.org/abs/2501.17161

6. First large-scale, verifiable code training dataset (ACECODE-89K) with automatically generated test cases. This is a major step towards enabling more effective RL for code generation. https://arxiv.org/abs/2502.01718

7. Reinforcement Learning for Long-Horizon Interactive LLM Agents — ”Our analysis reveals a variety of behavioral patterns that emerge in the course of training…” https://arxiv.org/abs/2502.01600

8. Natural Language Reinforcement Learning —enabling the training of chain-of-thought language policies and language value function (known as generative value) solely from environment feedback, without human or stronger model's labels. https://github.com/waterhorse1/Natural-language-RL

9. Chain-of-Associated-Thoughts (CoAT) is a new framework that enhances LLMs' reasoning abilities by combining Monte Carlo Tree Search with dynamic knowledge integration. https://arxiv.org/abs/2502.02390

10. Language Models Use Trigonometry to Do Addition https://www.lesswrong.com/posts/E7z89FKLsHk5DkmDL/language-models-use-trigonometry-to-do-addition-1

11. S1: The $6 R1 Competitor? https://timkellogg.me/blog/2025/02/03/s1

12. Sam Altman says the leap from GPT-4 to GPT-5 will be as big as that of GPT-3 to 4 and the plan is to integrate the GPT and o series of models into one model that can do everything https://youtu.be/qyTOVq31JIE?si=TzSFM3W45hPCSXZ1&t=741

13. Sam Altman: “finally for the first time, I think the models that are on the near term horizon, um the models that will release in the coming months, are over the threshold of being good enough to really address these problems and now people just have to go build the solutions” https://www.youtube.com/live/8vHr_8k8IbM?si=HUFjdfZvkPG921Te&t=3446

14. Decoding can be just as good as regular pointwise heads for regression, but you also get density estimation for free. https://arxiv.org/abs/2501.19383

15. Google DeepMind released a book on scaling language models on TPUs. https://jax-ml.github.io/scaling-book/index

16. Time to democratize humanoid robots! ToddlerBot, a low-cost ($6K), open-source humanoid for robotics and AI research. https://toddlerbot.github.io/

AI politics:

1. How to Rapidly Build Gigawatt-Scale AI Clusters in the United States https://ifp.org/special-compute-zones/

2. Palantir CTO Shyam Sankar says the US is in a winner-take-all AI arms race and war with China, and that DeepSeek has made it clear "the time to mobilize has come" https://www.youtube.com/live/MW0zvoEMdRA?si=mmuKS3myNpSeSufO&t=2025

3. 93% of IT Leaders Plan to Deploy AI Agents by 2026 https://www.zdnet.com/article/93-of-it-leaders-will-implement-ai-agents-in-the-next-two-years/

Science:

1. Scientists ‘mimic real biological processes’ using synthetic neurons https://news.northwestern.edu/stories/2025/01/scientists-mimic-real-biological-processes-using-synthetic-neurons

2. Necessity of complex numbers https://www.youtube.com/watch?v=f079K1f2WQk

3. The chance of asteroid 2024 YR4 hitting our planet in 2032 is now 1.5%, or 1 in 67. https://x.com/Astro_Jonny/status/1886742128199336362


VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models https://arxiv.org/abs/2502.02492


MaestroMotif: A method for AI-assisted skill design that produces highly capable and steerable hierarchical agents.

The first method that, without expert-labeled datasets, solves compositional tasks requiring hundreds of steps to complete. All modules within MaestroMotif are learned from interaction, from the highest level of planning to the lowest level of sensorimotor control. At the heart of MaestroMotif is the idea that decomposing a task into subtasks significantly helps decision making (see the toy sketch after the link below).

Read more: https://github.com/mklissa/maestromotif
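As a generic illustration of that decomposition (a toy hierarchical agent, not MaestroMotif's actual architecture; the skills, planner, and termination rule are placeholders), a high-level policy repeatedly selects a skill, and the chosen low-level skill acts until it signals completion:

```python
import random

random.seed(0)

class Skill:
    """Low-level policy stand-in: acts until it signals termination."""
    def __init__(self, name):
        self.name = name

    def act(self, trace):
        return f"{self.name}:step"

    def terminated(self):
        return random.random() < 0.4  # placeholder termination condition

class HighLevelPolicy:
    """Planner stand-in: picks the next skill given the history so far."""
    def __init__(self, skills):
        self.skills = skills

    def select(self, trace):
        return self.skills[len(trace) % len(self.skills)]

def run(planner, max_decisions=5, max_skill_steps=10):
    trace = []
    for _ in range(max_decisions):
        skill = planner.select(trace)
        for _ in range(max_skill_steps):
            trace.append(skill.act(trace))
            if skill.terminated():
                break
    return trace

skills = [Skill("navigate"), Skill("pick_up"), Skill("use_item")]
print(run(HighLevelPolicy(skills)))
```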


ASAP: Aligning Simulation and Real-World Physics for Learning Agile Humanoid Whole-Body Skills

Open source: https://agile.human2humanoid.com/


Physical Intelligence (π) is open sourcing π_0, the first general-purpose robotic foundation model: https://www.pi.website/blog/openpi




O2 (the company, not the skipped GPT version number) has announced Daisy, a language model of its own. It answers fraudulent phone calls in real time, wasting the scammer's time by impersonating a vulnerable elderly person.

https://news.virginmediao2.co.uk/o2-unveils-daisy-the-ai-granny-wasting-scammers-time/


The largest medical AI randomized controlled trial yet performed, enrolling >100,000 women undergoing mammography screening, was just published.

The use of AI led to 29% higher detection of cancer, no increase of false positives, and reduced workload compared with radiologists without AI.

Paper: https://thelancet.com/journals/landig/article/PIIS2589-7500(24)00267-X/fulltext


"Our greatest glory is not in never falling, but in rising every time we fall." – Confucius

https://project-instinct.github.io/


Links for 2025-02-03

AI:

1. OpenAI Deep Research is a new agentic AI designed to synthesize large amounts of online information and execute multi-step research tasks autonomously. Leveraging advanced reasoning capabilities, it can transform complex, time-consuming problems into well-researched solutions in as little as 10–30 minutes—a process that might take human experts, such as PhD-level researchers, over 10 hours. https://openai.com/index/introducing-deep-research/

2. Stanford presents s1: Simple test-time scaling — Seeks the simplest approach to achieve test-time scaling and strong reasoning performance; Exceeds o1-preview on competition math questions by up to 27% (MATH and AIME24); Model, data, and code are open-source https://arxiv.org/abs/2501.19393

3. Facebook figures out a zero-training way to massively improve LLM performance: Unlike conventional approaches that require training specialized models on large amounts of task-specific multimodal data, MILS directly “upgrades” an off-the-shelf LLM into a multimodal solver by exploiting its reasoning capabilities. https://arxiv.org/abs/2501.18096

4. Using multiple AI agents to fact-check each other dramatically reduced hallucination scores (a reported ~2,800% improvement) across 310 test cases https://arxiv.org/abs/2501.13946

5. Scalable-Softmax Is Superior for Attention: SSMax significantly enhances the model’s performance on tasks involving long input sequences. It can be integrated into existing Transformer-based models without requiring major architectural changes. https://arxiv.org/abs/2501.19399

6. Heima: An efficient reasoning framework that leverages reasoning CoTs in hidden latent space https://arxiv.org/abs/2501.19201

7. DeepMind figures out a way to make it 100X more bandwidth-efficient to train models in a distributed way https://arxiv.org/abs/2501.18512v1

8. R1-V: Reinforcing Super Generalization Ability in Vision Language Models with Less Than $3 https://github.com/Deep-Agent/R1-V

9. In an OpenAI Event at University of Tokyo Sam Altman discussed the future direction of development: "GPT-5 and GPT-6, [...], will utilize reinforcement learning and will be like discovering new science, such as new algorithms, physics, and biology." https://x.com/houseiwang/status/1886224083630915872

10. “R1 is just the latest data point indicating that superhuman AI will be easier and cheaper to build than most people think, and won't be monopolized.” https://milesbrundage.substack.com/p/the-real-lesson-of-deepseeks-r1

11. “I find it very difficult to ask o1 pro an economics question it cannot answer...In an economics test, or any other kind of naturally occurring knowledge test I can think of, it would beat all of you (and me). Its rate of hallucination is far below what you are used to from other LLMs.” https://marginalrevolution.com/marginalrevolution/2025/02/o1-pro.html

12. Chinese paper about AI as a catastrophic [existential?] risk: “If such a worst-case risk is let unknown to the human society, we would eventually lose control over the frontier AI systems: They would take control over more computing devices, form an AI species and collude with each other against human beings.” https://www.arxiv.org/abs/2412.12140

Science:

1. UChicago scientists have invented a soft, flexible semiconductor capable of transmitting information from living tissue to electronics. This major bioelectronics breakthrough could lead to better brain-machine interfaces, biosensors and pacemakers. https://news.uchicago.edu/story/bioelectronics-breakthrough-scientists-create-soft-flexible-semiconductors

2. Ultrahigh Specific Strength by Bayesian Optimization of Carbon Nanolattices https://advanced.onlinelibrary.wiley.com/doi/10.1002/adma.202410651

3. January 2025 was quite unexpectedly the warmest January on record at 1.75C above preindustrial, beating the prior record set in 2024. This is despite the presence of La Niña conditions in the tropical Pacific, with the El Niño event of 2023/2024 long faded. https://www.theclimatebrink.com/p/january-sets-an-unexpected-temperature


Links for 2025-02-02

AI:

1. Figure Plans To Ship 100,000 Humanoid Robots Over Next 4 Years https://www.forbes.com/sites/johnkoetsier/2025/01/30/figure-plans-to-ship-100000-humanoid-robots-over-next-4-years/

2. “Everyone is sleeping on the *collective* advantages AIs will have, which have nothing to do with raw IQ: they can be copied, distilled, merged, scaled, and evolved in ways humans simply can't.” https://www.dwarkeshpatel.com/p/ai-firm

3. Cerebras Becomes the World’s Fastest Host for DeepSeek R1, Outpacing Nvidia GPUs by 57x https://venturebeat.com/ai/cerebras-becomes-the-worlds-fastest-host-for-deepseek-r1-outpacing-nvidia-gpus-by-57x/

4. "Has Europe’s great hope for AI missed its moment? Mistral AI was hailed as a potential global leader in the technology. But it has lost ground to US rivals—& now China’s emerging star" (low on equity, revenue, compute, scale) https://www.ft.com/content/fa8bad75-dc55-47d9-9eb4-79ac94e54d82 [no paywall: https://archive.is/ragEs]

5. The Failed Strategy of Artificial Intelligence Doomers https://www.lesswrong.com/posts/YqrAoCzNytYWtnsAx/the-failed-strategy-of-artificial-intelligence-doomers

6. This Autonomous Drone Can Track Humans Through Dense Forests at High Speed https://singularityhub.com/2025/01/31/this-autonomous-drone-can-track-humans-through-dense-forests-at-high-speed/

7. Reasoning + Tool Use: https://www.reddit.com/r/OpenAI/comments/1ieonxv/comment/maa05ic/ (Note: o3-mini got 32% on Frontier Math (!) when given access to a Python tool. https://openai.com/index/openai-o3-mini/)

8. “The progress with our Gemini reasoning models is actually wild, we are in the GPT-2 era of scaling reasoning!” https://x.com/OfficialLoganK/status/1885374062098018319

9. Stanford CS234: Reinforcement Learning Lectures https://www.youtube.com/playlist?list=PLoROMvodv4rOSOPzutgyCTapiGlY2Nd8u

Science:

1. New Study Uncovers Key Mechanism Behind Learning and Memory https://news.cuanschutz.edu/news-stories/new-study-uncovers-key-mechanism-behind-learning-and-memory

2. Stem cells used to partially repair damaged hearts https://arstechnica.com/science/2025/01/stem-cells-used-to-partially-repair-damaged-hearts/

3. New study looking at ancient DNA from Eastern Eurasian populations obtains and analyzes polygenic scores and finds "positive selection for cognitive-related traits such as IQ." https://www.cambridge.org/core/journals/twin-research-and-human-genetics/article/abs/directional-selection-and-evolution-of-polygenic-traits-in-eastern-eurasia-insights-from-ancient-dna/10AE9628ED6E7F2B4B1E72F30D64D4AA


Transformers can overcome easy-to-hard and length generalization challenges through recursive self-improvement.

Quote: "Scaling this weak-to-strong training approach yields (seemingly) unbounded improvements in both length and hardness generalization, allowing models to solve problem instances far exceeding the difficulty of those in the training data distribution...Our results show that careful self-supervision allows small transformers to transcend superficial pattern matching failures and learn multi step algorithms."

Talk: https://www.youtube.com/watch?v=szhEnXiSjJY

Paper on arXiv coming on Monday.


Lithuanian traditional polyphonic songs, known as sutartinės.

"These songs are characterised by a specific musical language, archaic texts and elements of ritual choreography. Most of the sutartinės songs have the following features: (1) linear polyphony; intertwining voices with regular or frequent harmonization at the interval of second; (2) narrow melodic range and limited number of scale steps; (3) polyrhythms and rhythmic complementarity with frequent syncopation; (4) two different texts performed simultaneously; (5) stanzaic structure, where a stanza consists of a meaningful text and a constantly recurring refrain of asemantic words or syllables; (6) the syncretic nature of the performance, where music, text and movement are closely linked."

https://en.wikipedia.org/wiki/Lithuanian_folk_music


Are you primarily thinking in latent space or language space? Do you think in concepts, images, or abstract relationships that aren't immediately translated into words, or do you process information in a more linear, sentence-by-sentence way?
Poll
  •   Latent space
  •   Language space
  •   Neither
  •   I don't know
89 votes
