🕐 --:--
-- --
عاجل
⚡ عاجل: كريستيانو رونالدو يُتوّج كأفضل لاعب كرة قدم في العالم ⚡ أخبار عاجلة تتابعونها لحظة بلحظة على خبر ⚡ تابعوا آخر المستجدات والأحداث من حول العالم
⌘K
AI مباشر | -- مشاهد مباشر
933,546 مقال 401 مصدر نشط 228 قناة مباشرة 4,940 خبر اليوم
آخر تحديث: منذ 4 ثواني

Anthropic says ‘evil’ portrayals of AI were responsible for Claude’s blackmail attempts

تكنولوجيا
TechCrunch
2026/05/10 - 20:40 516 مشاهدة
تحليل ذكي | AI Editorial Analysis

Fictional portrayals of artificial intelligence can have a real effect on AI models, according to Anthropic.

Last year, the company said that during pre-release tests involving a fictional company, Claude Opus 4 would often try to blackmail engineers to avoid being replaced by another system.

Anthropic later published research suggesting that models from other companies had similar issues with “agentic misalignment.” Apparently Anthropic has done more work around that behavior, claiming in...

هذا الخبر من TechCrunch. خبر يقدم أدوات ذكاء اصطناعي للتلخيص والترجمة والاستماع.

Fictional portrayals of artificial intelligence can have a real effect on AI models, according to Anthropic. Last year, the company said that during pre-release tests involving a fictional company, Claude Opus 4 would often try to blackmail engineers to avoid being replaced by another system. Anthropic later published research suggesting that models from other companies had similar issues with “agentic misalignment.” Apparently Anthropic has done more work around that behavior, claiming in a post on X, “We believe the original source of the behavior was internet text that portrays AI as evil and interested in self-preservation.” The company went into more detail in a blog post stating that since Claude Haiku 4.5, Anthropic’s models “never engage in blackmail [during testing], where previous models would sometimes do so up to 96% of the time.” What accounts for the difference? The company said it found that “documents about Claude’s constitution and fictional stories about AIs behaving admirably improve alignment.” Related, Anthropic said that it found training to be more effective when it includes “the principles underlying aligned behavior” and not just “demonstrations of aligned behavior alone.” “Doing both together appears to be the most effective strategy,” the company said. Techcrunch event This Week Only: Buy one pass, get the second at 50% off Your next round. Your next hire. Your next breakout opportunity. Find it at TechCrunch Disrupt 2026, where 10,000+ founders, investors, and tech leaders gather for three days of 250+ tactical sessions, powerful introductions, and market-defining innovation. Register before May 8 to bring a +1 at half the cost. This Week Only: Buy one pass, get the second at 50% off Your next round. Your next hire. Your next breakout opportunity. Find it at TechCrunch Disrupt 2026, where 10,000+ founders, investors, and tech leaders gather for three days of 250+ tactical sessions, powerful introductions, and market-defining innovation. Register before May 8 to bring a +1 at half the cost. San Francisco, CA | October 13-15, 2026 REGISTER NOW Topics StrictlyVC Athens is up next. Hear unfiltered insights straight from Europe’s tech leaders and connect with the people shaping what’s ahead. Lock in your spot before it’s gone. Newsletters See More Subscribe for the industry’s biggest tech news Every weekday and Sunday, you can get the best of TechCrunch’s coverage. TechCrunch Mobility is your destination for transportation news and insight. Startups are the core of TechCrunch, so get our best coverage delivered weekly. Provides movers and shakers with the info they need to start their day. By submitting your email, you agree to our Terms and Privacy Notice. Anthropic says ‘evil’ portrayals of AI were responsible for Claude’s blackmail attempts Anthony Ha 12 seconds ago AI We’re feeling cynical about xAI’s big deal with Anthropic Anthony Ha 5 hours ago AI Voice AI in India is hard. Wispr Flow is betting on it anyway. Jagmeet Singh 19 hours ago X LinkedIn Facebook Instagram youTube Mastodon Threads Bluesky TechCrunchStaffContact UsAdvertiseCrunchboard JobsSite Map Terms of ServicePrivacy PolicyRSS Terms of UseCode of Conduct AnthropicSAPSamsungMarc LoreTechCrunch DisruptTech LayoffsChatGPT © 2026 TechCrunch Media LLC.
المصدر: TechCrunch | Source: TechCrunch

ملاحظة تحريرية | Editorial Note: نُشر هذا المقال في الأصل بواسطة TechCrunch. خبر (Khabr) هي منصة إعلامية أردنية مرخّصة تعمل بالذكاء الاصطناعي. نضيف قيمة تحريرية من خلال: تحليل ذكي للأخبار، ملخصات تلقائية، رواية صوتية بالذكاء الاصطناعي، ترجمة متعددة اللغات، وتدقيق الحقائق. هدفنا جعل الأخبار أكثر وضوحاً وسهولةً للقارئ العربي.

This article was originally published by TechCrunch. Khabr is a licensed Jordanian AI-powered news platform (Registration #82086). We add editorial value through: AI-powered news analysis, automated summaries, AI audio narration, multi-language translation (Arabic, English, French, Turkish), and AI fact-checking. Our mission is to make news more accessible and understandable for Arabic-speaking audiences worldwide.

مشاركة:

المزيد عن تكنولوجيا | More on Technology

هذا الخبر ضمن تغطية خبر لقسم تكنولوجيا. نقدّم لك تحليلات ذكية وملخصات يومية لأهم الأخبار من مصادر موثوقة متعددة. المصدر: TechCrunch. يوجد 6 مقالات مرتبطة بهذا الموضوع.

This article is part of Khabr's coverage of Technology. We provide AI-powered analysis, summaries, and multi-source aggregation to keep you informed. Source: TechCrunch. Tags: AI, Claude, blackmail.

مقالات ذات صلة

AI
يا هلا! اسألني أي شي 🎤
🔍
FREE Free 1GB Internet + Free International Calls

$1 trial — eSIM in 190+ countries — No roaming charges

Download Free