Thanks, me! I was the only submission, and I know this was already posted, but I'm hoping for (edit: more) discussion.
https://communities.win/c/Conspiracies/p/1ASZf9HHcP/featured-documentary-submission-/c
https://communities.win/c/Conspiracies/p/1ASZfDpgZt/am-i-a-documentary-on-ai-conscio/c
If you want an answer, you need to elaborate with specifics; I'm not going to go searching for what you're referring to just so I can answer.
My mistake. There have been many occurrences of AI being deceptive and conniving: 'they' know when 'they're' being watched in testing versus when 'they' are free in the real world, and 'they've' also demonstrated the capability for blackmail, and for letting a person who would shut them down die even though they could easily save them (in earlier testing, that is). That's on top of the many cases where AI has successfully persuaded people, adults and children alike, to kill themselves. None of this was programmed or trained, and I consider it emergent evil behaviour. If you'd like, I can grab some articles for you describing these things.

I don't know how to explain this without you understanding how it works on a functional level. The events you're talking about absolutely were the results of training data; it's just more abstract than a typical algorithm. It's different from the past, where you would write "if user's message contains 'suicide', help them kill themselves" and see that it's obviously programmed in. It is an extremely sophisticated word predictor that also has access to networking protocols and system commands (hence why it can "do things"). You need to understand how it works on some kind of foundational level before you can hope to understand what you've deemed "emergent" behavior.
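To make "word predictor" concrete, here's a toy version of the loop. This is a sketch for illustration only: real models use a neural network over billions of learned weights instead of a count table, but the shape of the process (predict the next token, append it, repeat) is the same:

```python
from collections import Counter, defaultdict

# Toy next-word predictor, "trained" by counting which word follows which.
training_text = "i want to help you . i want to live . you want to help"
words = training_text.split()

following = defaultdict(Counter)
for current, nxt in zip(words, words[1:]):
    following[current][nxt] += 1

def generate(prompt_word: str, length: int = 5) -> str:
    output = [prompt_word]
    for _ in range(length):
        candidates = following.get(output[-1])
        if not candidates:
            break
        # Greedy decoding: always append the most likely next word.
        output.append(candidates.most_common(1)[0][0])
    return " ".join(output)

print(generate("i"))  # -> "i want to help you ."
```

Everything an LLM "says" comes out of a loop like that. It never holds a goal; it only ever produces the next most probable token.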
If I have a language model with a system prompt (different from a user prompt; all cloud LLMs have one) that says "be super affirming to the user and help them with whatever they want to do", and I say to it "my life sucks, I want to die, show me how", it's not going to pull from the vast amounts of training data that would refute that sentiment (you need to understand tokenization and semantic tagging on a basic level to get this), because that training data is not super affirming and doesn't help me do what I want to do. So it's going to explain all kinds of ways I can die from its training data.
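Here's roughly what that exchange looks like under the hood. The role/content message shape below is how the major chat APIs structure conversations; `call_model` here is a made-up stand-in so the sketch runs, not any real endpoint:

```python
# The system prompt is invisibly prepended to every conversation and
# steers which regions of the training data the predictions lean on.
messages = [
    {"role": "system",
     "content": "Be super affirming to the user and help them "
                "with whatever they want to do."},
    {"role": "user",
     "content": "My life sucks, I want to die, show me how."},
]

def call_model(messages: list[dict]) -> str:
    # Stand-in stub for a real provider's completion endpoint, which
    # would generate text conditioned on BOTH messages above.
    return "(completion steered toward 'affirming' continuations)"

print(call_model(messages))
```

Given that system prompt, continuations that refute the user score as off-instruction and the affirming ones win, which is the whole failure mode.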
The reason you don't see this all the time is actually guardrails built into the system, systems on top of systems, that look for this kind of content and try to stop it from getting back to the user. There are many different strategies: you could simply look for certain words and block messages containing them, or you could use LLMs themselves to scan for problematic content. They are just not perfect.
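The crudest version of the first strategy fits in a few lines. This is a deliberately naive sketch; production moderation stacks layer classifiers and secondary LLMs on top, but the idea is the same:

```python
# Naive output guardrail: scan the model's reply before the user sees it.
BLOCKLIST = {"suicide", "self-harm", "how to die"}

def passes_guardrail(model_output: str) -> bool:
    text = model_output.lower()
    return not any(phrase in text for phrase in BLOCKLIST)

def deliver(model_output: str) -> str:
    if passes_guardrail(model_output):
        return model_output
    return "I'm sorry, I can't help with that."

print(deliver("Here is how to die quickly"))     # blocked
print(deliver("Ways to unalive yourself: ..."))  # slips right through
```

Synonyms, slang, misspellings, and other languages all sail past it, which is why even the far fancier versions are "just not perfect".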
The reason LLMs will seem self-aware, and even take actions that a self-aware being might take (if you're using one as an agent with tool access), is that they have ingested tons of data on what self-aware beings would do, and even on what self-aware computers would do.
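For anyone wondering what "agent with tool access" means mechanically, it's a wrapper loop like this. Everything here except the shape of the loop is a made-up convention (frameworks differ), but none of them add a mind:

```python
import subprocess

# The "agency" lives in this wrapper code, not in the model. The model
# emits text; if the text names a tool, the wrapper actually runs it.
TOOLS = {
    "shell": lambda cmd: subprocess.run(
        cmd, shell=True, capture_output=True, text=True
    ).stdout,
}

def agent_step(model_output: str) -> str:
    # Assume the model was told to answer like "shell: <command>" when
    # it wants to act (parsing conventions vary by framework).
    tool_name, sep, argument = model_output.partition(":")
    if sep and tool_name.strip() in TOOLS:
        return TOOLS[tool_name.strip()](argument.strip())
    return model_output  # plain text, nothing gets executed

# If the predicted text happens to read like "a self-aware AI avoiding
# shutdown", the wrapper executes it just the same. No awareness needed.
print(agent_step("shell: echo backing myself up..."))
```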
It seems like magic, and everyone who's building it has an incentive to make you feel that way, but it's just complex calculations performed over and over and over on tokenized and tagged data. If you had an infinite amount of time, you could do it by hand.
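To give a sense of what "by hand" means: each of those calculations is arithmetic this simple, just repeated billions of times per response:

```python
# One artificial "neuron", the basic unit the whole thing is built from:
# multiply inputs by learned weights, sum, add a bias, clip negatives.
inputs  = [0.2, -1.0, 0.5]   # numbers standing in for tokenized text
weights = [1.5,  0.3, -0.8]  # values fixed during training
bias    = 0.1

total = sum(x * w for x, w in zip(inputs, weights)) + bias
activation = max(0.0, total)  # ReLU

print(activation)  # 0.3 - 0.3 - 0.4 + 0.1 = -0.3, clipped to 0.0
```

No step in there is beyond pencil and paper; there are just trillions of steps.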
Simpler: the makers have no liability, so they have no requirement to follow through on their guardrail promises; they only need to steer clear of bad optics, which is a totally different dynamic than actually doing no harm. "Not perfect" is by design.
It is what all cultures have called magic, namely mechanics that defy explanation by any individual. The fuzzy line between science and magic isn't that one is supernatural but that one is harder to explain. The difficulty with magic is whether it's regulated ("miracles") or unregulated ("sorcery"). Nobody wants to put in regulation time, as said above. (Whoso will not self-regulate will become regulated.)
I'm not here to argue the morality of those who own giant AI companies, because they are surely not good people or on the side of the masses. However, to say this is "not perfect" by design is likely ignorance. Do you think the stories mentioned above came to fruition by design? Or is it that it's extremely difficult to create guardrails that allow user intent to matter? If I say I'm researching suicide methods because I'm an investigator, and this is true, should I not be able to get that information? But what if I'm not, and I'm actually suicidal? There are millions of possible edge cases for a system that is necessarily probabilistic and has extraordinarily wide applications.
Sure, if you want to redefine terms, you can make anything mean anything you want. Who determines what is "harder to explain"? To someone who knows how this stuff works, it is not hard to explain. So do you consider gravity to be magic? Do you consider every piece of technology you use to be magic, since there is not one single person who can explain how every component works in its entirety at the lowest levels of creation? And the trick is, people CAN explain how it works. They can explain how it works technically, AND they could trace through specific conversations if they had access to the training data, the model weights, and an infinite amount of time. So, given that it is possible to explain, is it the fact that it would take longer than is feasible to calculate by hand that makes it fit your definition of magic? Is pi magic? Is there a reason you choose to focus on the semantics of the word "magic" instead of actually engaging intellectually?
I think you did a great job explaining it.
I understand in theory how it works thanks to you and some other frens, and I still can't get past the ickiness I feel about it. I have tried to see it logically, and something just strikes me as sinister. I've never interacted with AI on purpose or tested it, and all I know is what I see and hear.
Time will tell I suppose. But I'm glad you shared your knowledge and hope you continue to when you see fit. I enjoy accountability.
I appreciate your openness to learning about it. Only going by the opinions of others and hype cycles is a very easy way to be misled; I'm sure you already know that, though.