Code for extraction
https://github.com/AsuharietYgvar/AppleNeuralHash2ONNX
Working collision
https://github.com/AsuharietYgvar/AppleNeuralHash2ONNX/issues/1
Apple's claim
https://www.apple.com/child-safety/
Apple’s method of detecting known CSAM is designed with user privacy in mind. Instead of scanning images in the cloud, the system performs on-device matching using a database of known CSAM image hashes provided by NCMEC and other child safety organizations. Apple further transforms this database into an unreadable set of hashes that is securely stored on users’ devices.
And not only that, according to this reddit post:
Believe it or not, [the NeuralHash algorithm for on-device CSAM detection] already exists as early as iOS 14.3, hidden under obfuscated class names.
The hashes are fingerprints of some subset of image features, not of the entire bitmap, so a simple crop, a resize, or a single flipped bit does not drastically change the hash, whereas a cryptographic hash like SHA-256 or MD5 is designed to change completely under any such change.
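To make that contrast concrete, here is a minimal, self-contained sketch. It is not Apple's NeuralHash: the "image" is a tiny synthetic 4x4 grayscale array and the perceptual hash is a toy average hash, used only to show that a one-pixel tweak leaves the perceptual hash unchanged while the SHA-256 digest changes completely.

# Toy contrast between a perceptual hash and a cryptographic hash.
# Nothing here is Apple's NeuralHash; it only illustrates the behavior
# described above.
import hashlib

def average_hash(pixels):
    # Toy perceptual hash: one bit per pixel, set if the pixel is
    # brighter than the image's mean brightness.
    mean = sum(pixels) / len(pixels)
    return ''.join('1' if p > mean else '0' for p in pixels)

def hamming(a, b):
    # Number of differing bits between two bit strings.
    return sum(x != y for x, y in zip(a, b))

# A tiny 4x4 "image" flattened to 16 grayscale values.
original = [10, 12, 200, 198, 11, 13, 201, 199,
            9, 14, 202, 197, 12, 11, 203, 196]
# The same image with a single pixel nudged by one brightness level.
tweaked = list(original)
tweaked[0] += 1

# The perceptual hash does not change at all for this tweak...
print(hamming(average_hash(original), average_hash(tweaked)))  # 0

# ...while the cryptographic digests share nothing recognisable.
print(hashlib.sha256(bytes(original)).hexdigest())
print(hashlib.sha256(bytes(tweaked)).hexdigest())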
A hash table is a set of buckets/bins, each holding a list of items that share the same hash, yes. That is just using a hash function to make lookups faster, and it is not relevant here.
What Apple are doing is hashing the images you have on your phone and comparing each hash against a set of hashes, created by the National Center for Missing and Exploited Children (NCMEC), of known Child Porn images.
If you have an image with a matching hash, a thumbnail of it is sent to Apple to be reviewed manually. If the thumbnail looks suspicious, they will inform Law Enforcement and then you will get a visit from them.
Apple are also planning to scan uploads to iCloud with the same system.
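As a rough picture of that matching step, here is a hedged sketch. The function name perceptual_hash, the hash values, and the plain set lookup are all placeholders: Apple's real design never exposes the raw NCMEC hash list on the device and wraps the comparison in cryptographic blinding, none of which is shown here.

# Toy version of the matching step described above. Placeholder names
# throughout; not Apple's actual protocol.
import hashlib
from pathlib import Path

KNOWN_HASHES = {
    "59a34eabe31910abfb06f308",  # made-up example values
    "2d7f0e11c5a8b4d9e0f16273",
}

def perceptual_hash(image_path: Path) -> str:
    # Stand-in only: a real implementation would decode the image and
    # run a perceptual model such as NeuralHash; hashing the raw file
    # bytes like this is NOT robust to crops or re-encoding.
    return hashlib.sha256(image_path.read_bytes()).hexdigest()[:24]

def scan_library(photo_dir: Path) -> list[Path]:
    # Return photos whose fingerprint matches a known hash.
    matches = []
    for path in sorted(photo_dir.glob("*.jpg")):
        if perceptual_hash(path) in KNOWN_HASHES:
            # In Apple's design a match produces an encrypted "safety
            # voucher" rather than an immediate report; collecting the
            # path here is purely illustrative.
            matches.append(path)
    return matches

if __name__ == "__main__":
    for hit in scan_library(Path.home() / "Pictures"):
        print(f"{hit} -> would be queued for manual review")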
Now that someone can run the hashing externally from Apple, new scenarios can arise (see the sketch after this list):
Child Porn distributors can hash their images in advance to see which ones would be caught.
A Denial of Service attack on Apple could be mounted by creating many hash-colliding images, each of which would need to be manually reviewed.
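To show what "running the hashing externally" looks like, here is a hedged sketch loosely following nnhash.py from the AppleNeuralHash2ONNX repo linked above. The file names, the 128-byte seed header, the 360x360 input size, and the [-1, 1] normalization are assumptions taken from that repo and may differ in your copy; this is a sketch, not a verified reimplementation.

# Compute a NeuralHash-style fingerprint with the extracted ONNX model.
# Requires the converted model and the seed file obtained per the repo's
# instructions; paths below are assumed names.
import sys

import numpy as np
import onnxruntime
from PIL import Image

MODEL_PATH = "model.onnx"                   # converted NeuralHash model
SEED_PATH = "neuralhash_128x96_seed1.dat"   # projection seed from the OS

def neural_hash(image_path: str) -> str:
    session = onnxruntime.InferenceSession(MODEL_PATH)

    # The seed file is assumed to hold a 96x128 float32 matrix after a
    # 128-byte header; it projects the model's 128-dim output to 96 bits.
    seed = np.frombuffer(open(SEED_PATH, "rb").read()[128:], dtype=np.float32)
    seed = seed.reshape(96, 128)

    # Preprocess: 360x360 RGB, scaled to [-1, 1], NCHW layout.
    img = Image.open(image_path).convert("RGB").resize((360, 360))
    arr = np.asarray(img).astype(np.float32) / 255.0 * 2.0 - 1.0
    arr = arr.transpose(2, 0, 1)[np.newaxis, :]

    out = session.run(None, {session.get_inputs()[0].name: arr})[0].flatten()

    # The sign of each projected component gives one hash bit;
    # 96 bits -> 24 hex characters.
    bits = "".join("1" if v >= 0 else "0" for v in seed.dot(out))
    return f"{int(bits, 2):024x}"

if __name__ == "__main__":
    print(neural_hash(sys.argv[1]))

With something like this, a distributor could pre-compute hashes against a leaked or reconstructed hash list, and collision tooling (as in the linked issue) can target the same 96-bit output.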