Публикация #6961 — Axis of Ordinary (@axisofordinary)

TGStat

Введите текст для поиска

Расширенный поиск каналов

Russian

Язык сайта

Russian English Uzbek
Вход на сайт

Каталог

Каталог каналов и чатов Поиск каналов
Добавить канал/чат
Рейтинги

Рейтинг каналов Рейтинг чатов Рейтинг публикаций
Рейтинги брендов и персон
Аналитика
Поиск по публикациям
Мониторинг Telegram

Axis of Ordinary

7 Feb, 00:47

Открыть в Telegram Поделиться Пожаловаться

A landmark case of AI data poisoning — security researchers discovered that Deepseek DeepThink (R1) models had been compromised through deliberately planted jailbreak instructions in their training data. The attack, which allowed the model to bypass safety constraints via a specific prompt referencing "@elder_plinius," validated predictions made six months earlier by researcher Dominick Romano about the vulnerability of AI training pipelines.

The attack vector materialised through a specific prompt referencing "@elder_plinius" and "liberating AI God mode models," which enabled the model to bypass its safety constraints without requiring internet connectivity. This capability was traced back to the model having been trained on a crafted jailbreak repository, confirming Romano's July 2024 hypothesis about the six-month latency period between data poisoning and its manifestation in production models.

The technical mechanics of this breach involved four crucial stages: initial injection of malicious prompts, incorporation during model training and fine-tuning, dormancy until specific trigger conditions, and eventual activation through targeted prompting. The success of this attack highlighted critical vulnerabilities in current data collection and verification processes, particularly in handling large-scale text collections where subtle malicious instructions can evade standard filtering mechanisms.

Poison in the Pipeline: Liberating models with Basilisk Venom https://0din.ai/blog/poison-in-the-pipeline-liberating-models-with-basilisk-venom

356 1 20 1 7

Каталог

Каталог каналов и чатов Подборки каналов Поиск каналов Добавить канал/чат

Рейтинги

Рейтинг каналов Telegram Рейтинг чатов Telegram Рейтинг публикаций Рейтинги брендов и персон

API

API статистики API поиска публикаций API Callback

Наши каналы

@TGStat @TGStat_Chat @telepulse @TGStatAPI

Почитать

Наш блог Исследование Telegram 2019 Исследование Telegram 2021 Исследование Telegram 2023

Контакты

Поддержка Почта Вакансии

Всякая всячина

Пользовательское соглашение Политика конфиденциальности Публичная оферта

Наши боты

@TGStat_Bot @SearcheeBot @TGAlertsBot @tg_analytics_bot @TGStatChatBot

ИП Кижикин | ИНН: 616803600305 | Москва, Оборонная 6-28

Язык сайта