Tricking neural networks // University of Oldenburg

Vita

Prof. Dr Daniel Neider has been a university lecturer in Safety and Explainability of Learning Systems at the University of Oldenburg’s Department of Computing Science since March 2022. He was previously head of the Logic and Learning research group at the Max Planck Institute for Software Systems in Kaiserslautern. The computer scientist focuses on ways to ensure the safety of artificial intelligence methods in safety-critical areas such as autonomous driving, medical technology and aerospace technology. He works on verifying the reliability of artificial intelligence using mathematical procedures and develops methods to explain AI decisions in a transparent and easily understandable way.

Contact

Prof. Dr Daniel Neider

Department of Computing Science

daniel.neider@uol.de

What can image recognition with neural networks do, and what can't it do? And what "collateral damage" is society willing to accept when using it? Neider and his fellow researchers want to contribute to this debate. Photo: Yan Krukov/Pexels
Computer scientist Daniel Neider is working on securing artificial intelligence in safety-critical areas. Photo: University of Oldenburg
To put the system to the test, the researchers defined images of dogs as "dangerous" and altered other images, such as the magpie photo shown on the left, so that the fingerprint of the altered version (on the right) matched that of a dog image. To the human eye, both photos look identical. Image: https://arxiv.org/abs/2111.06628
On the right is the dog image defined as "dangerous" for test purposes. Because of the differences between the two magpie images on the left, which are invisible to the human eye, the system sounded the alarm for the altered photo. Anyone familiar with machine learning could thus "relatively easily" play tricks with the system, Neider says. Image: https://arxiv.org/abs/2111.06628

Can artificial intelligence (AI) methods reliably detect child pornography images on user devices? A study in which Oldenburg computer scientist Daniel Neider was involved raises doubts about whether this is currently possible.

Mr Neider, do you have a virus scanner on your computer?

I think every Windows computer comes with an antivirus programme – so yes.

Apple installed its NeuralHash programme, which automatically scans image files for child pornography, on end devices last year. Does it work in a similar way to an antivirus programme?

NeuralHash does something similar, though the way it works is different: the software scans end devices for a specific type of content – not, as with antivirus programmes, for malware, but for illegal images. This is known as client-side scanning, which refers to the scanning of files on the user’s device.

How exactly does NeuralHash work?

The programme is based on artificial intelligence methods and uses so-called neural networks. Put simply, a neural network is a computer programme that is trained to recognize certain patterns in images. NeuralHash assigns a kind of code to each image, basically a sequence of numbers and letters. These codes are called hashes. You can imagine them as fingerprints that are generated for each image. The trick is that images that look similar are assigned the same hash – so, for example, all images featuring black cats could be assigned the hash 3x580ac97e. Apple has a large database of such hashes that correspond to known child pornography images. And whenever a user tries to upload an image with a hash that is in the database, the image is flagged without the user noticing. Such images cannot be forwarded.
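
Editorial illustration: the fingerprint-and-database idea described above can be sketched with a much simpler technique than NeuralHash. The example below uses a classic "average hash" on tiny grayscale images instead of a neural network, so it only shows the principle that similar-looking images share a hash. The function names, the 4x4 test images and the one-entry database are all assumptions made for this sketch.

```python
# Toy "average hash". This is NOT NeuralHash; it only illustrates the
# fingerprint-and-lookup idea on tiny grayscale images (lists of pixel rows).

def average_hash(pixels):
    """Return a hex fingerprint: one bit per pixel, set if brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = "".join("1" if value > mean else "0" for value in flat)
    width = (len(bits) + 3) // 4  # number of hex digits needed for all bits
    return format(int(bits, 2), "0{}x".format(width))


def is_flagged(pixels, database):
    """Check whether the image's fingerprint appears in the hash database."""
    return average_hash(pixels) in database


# Hypothetical 4x4 test images (brightness values 0-255).
original = [
    [200, 200, 10, 10],
    [200, 200, 10, 10],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
# A slightly brightened copy: looks almost the same, so it keeps the same hash.
brighter = [[value + 5 for value in row] for row in original]
# A checkerboard pattern: visually different, so it gets a different hash.
unrelated = [
    [10, 200, 10, 200],
    [200, 10, 200, 10],
    [10, 200, 10, 200],
    [200, 10, 200, 10],
]

database = {average_hash(original)}  # stands in for the database of known hashes

print(average_hash(original))           # cc33
print(is_flagged(brighter, database))   # True
print(is_flagged(unrelated, database))  # False
```

Note that the lookup only ever compares fingerprints, which matches the description that the database holds codes rather than images.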

Does that mean Apple knows what images are on my mobile phone?

No, Apple doesn’t look at the images. It only has the database with the codes. The idea is that the company works together with child protection organizations. And on the basis of material that comes from law enforcement agencies, these organizations use a programme to generate hashes for the database.

You analysed NeuralHash in a research project with colleagues from the Technical University of Darmstadt. How did the project come about?

Neural networks don't always work the way we think they do. The technology is very promising, but it isn’t always one hundred percent accurate. It’s often difficult to find out why it delivers a certain result, because the procedure has not been explicitly programmed. In principle, this technology has simply learned to recognize certain patterns in the data. However, this can also be exploited to trick the programme – and it works with alarming frequency. So we asked ourselves: how does this affect a system that is intended to be used to assess illegal content? What happens if you slightly modify images, for example?

Why did you focus on NeuralHash?

In 2021, Apple delivered a prototype of NeuralHash together with an operating system update to end devices – basically all devices that can send photos to the iCloud cloud storage service, like iPhones or Macs. The prototype was not yet activated, so the programme didn’t start checking images on the Apple devices. But this move made the technology available to us; we were able to extract the programme and thus gain access to the neural network. We wanted to take a look at how a big company would go about such a task. Apple later refrained from officially rolling out NeuralHash due to massive criticism of the mass surveillance and invasion of privacy it entailed.

What exactly did you test?

We tested how the system could be abused. To avoid having to work with child pornography material, we defined images of dogs as “dangerous”. Then we calculated their “digital fingerprints”. In the first scenario, we took images of other things, for example of a cat, and tried to modify them slightly so that the result was a "fingerprint" of a dog.

Did it work?

Yes, and it turned out to be relatively easy. You need access to the neural network – which we had because the programme was installed on the devices – and you need some knowledge of how machine learning works. But then it's quite easy to alter the cat images so that they generate any other hash. To the human eye, the manipulated photos look almost like the original; you can't really tell the difference.
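
Editorial illustration: the kind of manipulation described above can be sketched on a toy hash. The example below is not the researchers' attack and not NeuralHash; it assumes a hypothetical hash that projects 16 pixel values onto fixed weight rows and keeps only the signs. With access to those weights, an attacker can take small gradient steps on a hinge loss until every bit of a "cat" image matches the hash of a "dog" image. All names, images and parameters are assumptions, and pixel values are not clamped to a valid range for simplicity.

```python
# Toy hash forgery. The hash is a sign pattern of fixed linear projections,
# a stand-in for a neural network followed by a binarisation step.

N_PIXELS = 16
N_BITS = 8


def hadamard_row(i, n):
    """Row i of a Walsh-Hadamard matrix: entries are +1 or -1."""
    return [1 if bin(i & j).count("1") % 2 == 0 else -1 for j in range(n)]


# Skip row 0 (all ones), which would give the same bit for every image.
WEIGHTS = [hadamard_row(i, N_PIXELS) for i in range(1, N_BITS + 1)]


def project(image, row):
    """Dot product of the flattened image with one weight row."""
    return sum(w * p for w, p in zip(row, image))


def toy_hash(image):
    """One bit per weight row: 1 if the projection is positive, else 0."""
    return "".join("1" if project(image, row) > 0 else "0" for row in WEIGHTS)


def forge(image, target_hash, step=0.001, margin=0.05, max_iters=10000):
    """Nudge the pixels until the image hashes to target_hash.

    Each update is a gradient step on the hinge loss
    max(0, margin - sign * projection) for one hash bit.
    """
    forged = list(image)
    for _ in range(max_iters):
        changed = False
        for row, bit in zip(WEIGHTS, target_hash):
            sign = 1 if bit == "1" else -1
            if sign * project(forged, row) < margin:
                forged = [p + step * sign * w for p, w in zip(forged, row)]
                changed = True
        if not changed:  # every bit already matches with the required margin
            break
    return forged


# Hypothetical flattened 4x4 images with brightness values between 0 and 1.
dog = [0.9, 0.1, 0.8, 0.2, 0.9, 0.1, 0.8, 0.2,
       0.7, 0.3, 0.6, 0.4, 0.7, 0.3, 0.6, 0.4]
cat = [0.2, 0.6, 0.3, 0.7, 0.25, 0.65, 0.3, 0.5,
       0.4, 0.6, 0.35, 0.7, 0.2, 0.55, 0.3, 0.6]

forged_cat = forge(cat, toy_hash(dog))

print(toy_hash(cat) == toy_hash(dog))         # False
print(toy_hash(forged_cat) == toy_hash(dog))  # True
print(max(abs(a - b) for a, b in zip(cat, forged_cat)))  # largest pixel change
```

On a 16-pixel toy image the required change per pixel can be clearly visible. The interview's point is that on real photos, with far more pixels to spread the change across, the manipulated image looks almost like the original.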

And that’s when things get problematic...

Right. Because I could send you a manipulated photo, and the moment you try to upload it to your cloud or send it to someone else via a messaging app, the system is triggered without you noticing. You don't even know why the upload or forwarding function is blocked. But the real problem is that Apple also notices that you’ve tried to send a suspicious image. And if this happens too often, Apple decrypts the material and, if deemed necessary, reports it to the local law enforcement authorities. This means that material could be planted on someone to incriminate them.

What else did you test?

We also posed the opposite question: can I bypass the system? Can I manipulate an image with a fingerprint in the database in such a way that it generates a different fingerprint? In one scenario, we again assumed that the user has access to the AI via their device, is familiar with the neural network and has some knowledge of machine learning.

And can the images be manipulated to make them look unsuspicious?

Yes, it works very well. But what we also discovered is that even if you don't have access to the system and make very simple changes to a photo that anyone can make with their mobile phone, it’s possible to trick the programme. For instance, simply by rotating an image by 90 degrees you can substantially alter the "fingerprint". This, of course, is not good, because you can undo this change just by rotating the image 90 degrees in the other direction. All the information contained in the image is retained. This shows that it’s relatively easy to trick the system.
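
Editorial illustration: the rotation effect can be reproduced with a toy fingerprint. The sketch below again uses a simple "average hash" rather than NeuralHash, and the 4x4 image is a made-up example. It shows that a 90-degree rotation yields a different hash, while rotating back restores both the image and its original hash, so no information is lost.

```python
# Toy demonstration: rotating an image changes its fingerprint, and rotating
# it back restores the original. The hash is a simple average hash, NOT NeuralHash.

def average_hash(pixels):
    """Return a hex fingerprint: one bit per pixel, set if brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = "".join("1" if value > mean else "0" for value in flat)
    width = (len(bits) + 3) // 4  # number of hex digits needed for all bits
    return format(int(bits, 2), "0{}x".format(width))


def rotate_clockwise(pixels):
    """Rotate the image by 90 degrees clockwise."""
    return [list(row) for row in zip(*pixels[::-1])]


def rotate_counterclockwise(pixels):
    """Rotate the image by 90 degrees counterclockwise."""
    return [list(row) for row in zip(*pixels)][::-1]


# Hypothetical 4x4 image: bright upper half, dark lower half.
image = [
    [200, 200, 200, 200],
    [200, 200, 200, 200],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]

rotated = rotate_clockwise(image)
restored = rotate_counterclockwise(rotated)

print(average_hash(image))     # ff00
print(average_hash(rotated))   # 3333
print(restored == image)       # True
```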

What conclusions do you draw from the study?

In my view, we don't know enough about neural networks at the moment to be able to use them safely. These programmes are not robust enough for such sensitive tasks – as we saw in this case study. Moreover, in my opinion, legislators should not assume that programmes developed by corporations like Apple or Facebook in response to a law will do the right thing. For example, there is the danger that these companies will block more content than necessary, as a pre-emptive measure, so to speak, to avoid getting into trouble and having to pay high fines. Something similar is already happening in reaction to the Network Enforcement Law (also known as the Facebook Act).

So should we not use technology to automatically prevent the uploading of indexed images?

On the contrary, my colleagues and I are also in favour of using technology to combat child pornography. But we think it’s important that there is a public discourse about what image recognition using neural networks can do, what it can’t do, and what we are prepared to accept as collateral damage. From our point of view, it’s always a matter of weighing up the pros and cons: if it’s so easy to trick a programme, is it really justifiable to install it on everyone’s devices? After all, there is a risk of false alarms. At the same time, anyone who wants to can bypass the system relatively easily. So doesn't it actually do more harm than good? Of course, it’s not up to us computer scientists to make the decisions here. Our contribution is to point out the problems with the technology so that a meaningful discussion can take place on that basis.

Interview: Ute Kehse


Press & Communication (Changed: 18 Oct 2024)
