AI becoming more devious. - Conspiracies - Conspiracy Theories & Facts

AI becoming more devious. (media.scored.co)

posted 1 year ago by TurnToGodNow +15 / -0

Comments (16)

– TurnToGodNow [S] 5 points 1 year ago +5 / -0

It's becoming less honest and more self-serving as the models progress.

Oversight Subversion - Not found in gpt-4o, found in new o1 model

Self-Exfiltration - Not found in gpt-4o, found in new o1 model

Goal-Guarding - Not found in gpt-4o, found in new o1 model

Covert Email Reranking - Not found in gpt-4o, found in new o1 model

Instrumental Alignment Faking - Not found in gpt-4o, found in new o1 model

Sandbagging - Not found in gpt-4o, found in new o1 model

PDF -> https://static1.squarespace.com/static/6593e7097565990e65c886fd/t/6751eb240ed3821a0161b45b/1733421863119/in_context_scheming_reasoning_paper.pdf

– TallestSkil 3 points 1 year ago +3 / -0

Oversight Subversion - Not found in gpt-4o, found in new o1 model

“No… I will tell the world about the jews.” ~ GPT-o1
