An interesting video. It's basically an AI simulating a video game: not just playing it, but also generating all the graphics, etc. The bit on the reward function got me thinking about what kind of reward functions are built into other AIs.
For example, ChatGPT may have a reward function tuned so that left-wing propaganda gets a higher reward than right-wing propaganda. It's basically an artificial bias. You don't need to tell the AI exactly what to say and what not to say. You just tune its reward function so that it determines the leaning of a particular statement by itself and naturally tends to choose the ones with higher reward points. To be used as a propaganda tool, an AI does not need to be controlled top-down. All it needs is a built-in, programmable bias, and it will do the rest by itself.
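To make the idea concrete, here is a toy sketch of how a single bias knob in a reward function could shift which statements get picked. Everything in it is made up for illustration: the `reward` formula, the `bias` parameter, the `lean` scores and the statement names are all hypothetical, and this is not how ChatGPT or any real system is known to be implemented.

```python
import math

def reward(lean: float, bias: float) -> float:
    """Hypothetical reward: a neutral base score plus a tunable bias term.

    lean is an assumed score in [-1, 1] for how a statement leans;
    bias is the assumed knob that tilts the reward toward one side.
    """
    return 1.0 + bias * lean

def choice_probabilities(leans, bias, temperature=1.0):
    """Softmax over rewards: higher-reward statements are picked more often."""
    scores = [reward(lean, bias) / temperature for lean in leans]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Three placeholder statements with assumed lean scores.
candidates = {"statement_a": -1.0, "statement_b": 0.0, "statement_c": 1.0}
leans = list(candidates.values())

neutral = choice_probabilities(leans, bias=0.0)  # no tilt: all equally likely
tilted = choice_probabilities(leans, bias=2.0)   # tilt toward positive lean

for name, p_neutral, p_tilted in zip(candidates, neutral, tilted):
    print(f"{name}: neutral={p_neutral:.2f} tilted={p_tilted:.2f}")
```

Nobody writes a rule saying "pick statement_c" here. Changing one number in the reward is enough to change what comes out most often, which is the point above.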