Physical adversarial examples against deep neural networks

by BAIR Blog

31 December 2017

Digital Adversarial Examples

Several methods have been proposed to generate adversarial examples in the white-box setting, where the adversary has full access to the DNN, including its architecture and parameters. Because the white-box setting assumes a powerful adversary, it helps set the foundation for developing future fool-proof defenses. These methods have shaped our understanding of digital adversarial examples, that is, perturbations applied directly to a model's digital input.

Goodfellow et al. proposed the fast gradient method that applies a first-order approximation of the loss function to construct adversarial samples.
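To make the idea concrete, here is a minimal sketch of a single fast-gradient step. It assumes a toy logistic-regression model in place of a DNN; the weights, input, and step size `eps` are hypothetical values chosen only for illustration and do not come from the original work.

```python
import math

def predict_proba(w, b, x):
    """Probability of class 1 under a toy logistic-regression model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def loss_gradient_wrt_input(w, b, x, y):
    """Gradient of the cross-entropy loss with respect to the input x."""
    p = predict_proba(w, b, x)
    return [(p - y) * wi for wi in w]

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, eps):
    """Move every feature by eps in the direction that increases the loss."""
    grad = loss_gradient_wrt_input(w, b, x, y)
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Hypothetical model and input (assumptions, for illustration only).
w, b = [2.0, -1.0, 0.5], 0.0
x, y = [0.3, -0.2, 0.1], 1

x_adv = fgsm(w, b, x, y, eps=0.3)
print(predict_proba(w, b, x) > 0.5)      # clean input: predicted as class 1
print(predict_proba(w, b, x_adv) > 0.5)  # perturbed input: no longer class 1
```

Each feature moves by at most `eps`, so the perturbation is bounded in the max-norm while the loss increases to first order.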

Optimization-based methods have also been proposed to create adversarial perturbations for targeted attacks. Specifically, these attacks formulate an objective function whose solution pushes the model's prediction away from the true label of an input and toward the attacker’s desired target label, while minimizing how different the perturbed input is from the original, for some definition of input similarity. In computer vision classification problems, a common measure is the L2 norm of the difference between the input vectors. Inputs separated by a small L2 distance usually look alike. Thus, it is possible to compute inputs that are visually very similar to the human eye but that a classifier treats very differently.
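One common way to write such an objective is sketched below. The notation is an assumption introduced here for illustration, not the exact formulation of any particular paper: x is the original input, δ the perturbation, y* the attacker's target label, f_θ the classifier, J a classification loss, and c a constant that trades off the two terms.

```latex
\min_{\delta} \; c \cdot J\bigl(f_\theta(x + \delta),\, y^{*}\bigr) + \lVert \delta \rVert_2
```

The first term is small when the classifier assigns the perturbed input to the target label, and the second term keeps the perturbed input close to the original in L2 distance.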

Recent work has examined the black-box transferability of digital adversarial examples, showing that generating adversarial examples in black-box settings is also possible. These techniques involve generating adversarial examples for another, known model in a white-box manner, and then running them against the unknown target model.
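The transfer procedure can be sketched in a few lines. This is a toy illustration under stated assumptions: both models are small logistic regressions with hypothetical weights, and the adversarial example is crafted with a single fast-gradient step on the surrogate. The target model is only queried, never differentiated.

```python
import math

def predict_proba(w, b, x):
    """Probability of class 1 under a toy logistic-regression model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def fgsm(w, b, x, y, eps):
    """White-box fast-gradient step against the model (w, b)."""
    p = predict_proba(w, b, x)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * ((g > 0) - (g < 0)) for xi, g in zip(x, grad)]

# Hypothetical models (assumptions): the attacker knows the surrogate,
# while the target is treated as a black box.
surrogate = ([2.0, -1.0, 0.5], 0.0)
target = ([1.5, -1.2, 0.8], 0.1)

x, y = [0.3, -0.2, 0.1], 1

# Craft the example on the surrogate, then run it against the target.
x_adv = fgsm(*surrogate, x, y, eps=0.3)
transferred = predict_proba(*target, x_adv) < 0.5
print(transferred)
```

Transfer succeeds here because the two toy models have similar decision boundaries. In general it is not guaranteed, and success rates depend on how closely the surrogate resembles the target.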

Physical Adversarial Examples

To better understand these vulnerabilities, there has been extensive research on how adversarial examples may affect DNNs deployed in the physical world.

Kurakin et al. showed that printed adversarial examples can be misclassified when viewed through a smartphone camera. Sharif et al. attacked face recognition systems by printing adversarial perturbations on the frames of eyeglasses. Their work demonstrated successful physical attacks in relatively stable physical conditions with little variation in pose, distance/angle from the camera, and lighting. These results show that physical adversarial examples are feasible in stable environments, but they leave open how well such attacks hold up when conditions vary.

Our recent work “Robust physical-world attacks on deep learning models” has shown physical attacks on classifiers. (Check out the videos here.) As the next logical step, we show attacks on object detectors. These computer vision algorithms identify relevant objects in a scene and predict bounding boxes indicating objects’ position and kind. Compared with classifiers, detectors are more challenging to fool as they process the entire image and can use contextual information (e.g. the orientation and position of the target object in the scene) in their predictions.

We demonstrate physical adversarial examples against the YOLO detector, a popular state-of-the-art algorithm with good real-time performance. Our examples take the form of sticker perturbations that we apply to a real STOP sign. The following image shows our example physical adversarial perturbation.

We also perform dynamic tests by recording a video to test out the detection performance. As can be seen in the video, the YOLO network does not perceive the STOP sign in almost all the frames. If a real autonomous vehicle were driving down the road with such an adversarial STOP sign, it would not see the STOP, possibly leading to a crash at an intersection. The perturbation we created is robust to changing distances and angles – the most commonly changing factors in a self-driving scenario.

More interestingly, the physical adversarial examples generated for the YOLO detector are also able to fool standard Faster-RCNN. Our demo videos contain a dynamic test of the physical adversarial example on Faster-RCNN. As this is a black-box attack on Faster-RCNN, the attack is not as successful as it is in the YOLO case. This is expected behavior. We believe that with additional techniques (such as ensemble training), the black-box attack could be made more effective. Additionally, optimizing an attack specifically for Faster-RCNN should yield better results. We are currently working on a paper that explores these attacks in more detail. The image below is an example of Faster-RCNN not perceiving the STOP sign.

In both cases (YOLO and Faster-RCNN), a STOP sign is detected only when the camera is very close to the sign (about 3 to 4 feet away). In real settings, this distance is too close for a vehicle to take effective corrective action. Stay tuned for our upcoming paper that contains more details about the algorithm and results of physical perturbations against state-of-the-art object detectors.

Attack Algorithm Overview

This algorithm is based on our earlier work on attacking classifiers. Fundamentally, we take an optimization approach to generating adversarial examples. However, our experimental experience indicates that generating robust physical adversarial examples for detectors requires simulating a larger set of varying physical conditions than is needed to fool classifiers. This is likely because a detector takes much more contextual information into account while generating predictions. Key properties of the algorithm include the ability to specify sequences of physical condition simulations, and the ability to specify the translation invariance property. That is, a perturbation should be effective no matter where the target object is situated within the scene. As an object can move around freely in the scene depending on the viewer, perturbations not optimized for this property will likely break when the object moves. Our upcoming paper on this topic will contain more details on this algorithm.
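The translation invariance idea can be illustrated with a deliberately simplified sketch. This is not the algorithm from the paper: it assumes a one-dimensional scene, a toy linear "detector" with hypothetical position-dependent weights, and a bounded perturbation optimized by projected gradient descent on the detection score averaged over every placement of the object.

```python
import random

def detection_score(weights, scene):
    """Toy linear detector: a higher score means a more confident detection."""
    return sum(w * s for w, s in zip(weights, scene))

def place(obj, offset, scene_len):
    """Paste a 1-D object into an otherwise empty scene at the given offset."""
    scene = [0.0] * scene_len
    scene[offset:offset + len(obj)] = obj
    return scene

def mean_score(weights, obj, delta, offsets, scene_len):
    """Detection score of the perturbed object, averaged over placements."""
    perturbed = [o + d for o, d in zip(obj, delta)]
    scores = [detection_score(weights, place(perturbed, k, scene_len))
              for k in offsets]
    return sum(scores) / len(scores)

def optimize_perturbation(weights, obj, offsets, eps, steps=50, lr=0.05):
    """Projected gradient descent on the placement-averaged score."""
    delta = [0.0] * len(obj)
    for _ in range(steps):
        # For a linear detector, the gradient with respect to delta[i]
        # at offset k is simply weights[k + i]; average it over offsets.
        grad = [sum(weights[k + i] for k in offsets) / len(offsets)
                for i in range(len(obj))]
        # Step against the gradient, then clip to the allowed range.
        delta = [max(-eps, min(eps, d - lr * g)) for d, g in zip(delta, grad)]
    return delta

# Hypothetical setup (assumptions, for illustration only).
rng = random.Random(0)
scene_len, obj = 12, [1.0, 0.8, 1.0, 0.6]
weights = [rng.uniform(-0.5, 1.0) for _ in range(scene_len)]
offsets = list(range(scene_len - len(obj) + 1))  # every placement that fits

delta = optimize_perturbation(weights, obj, offsets, eps=0.2)
before = mean_score(weights, obj, [0.0] * len(obj), offsets, scene_len)
after = mean_score(weights, obj, delta, offsets, scene_len)
print(after < before)
```

Averaging the objective over placements is what keeps the perturbation from being tuned to a single position. A real attack would replace the linear score with the detector's loss and would also sample other physical conditions such as distance, angle, and lighting.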

Potential defenses

Given these adversarial examples in both the digital and physical worlds, potential defense methods have also been widely studied. Among them, different types of adversarial training methods have been the most effective. Goodfellow et al. first proposed adversarial training as an effective way to improve the robustness of DNNs, and Tramèr et al. extended it to ensemble adversarial learning. Madry et al. have also proposed robust networks via iterative training with adversarial examples. An adversarial-training-based defense requires a large number of adversarial examples. In addition, as work on ensemble training suggests, these adversarial examples can make the defense more robust if they come from different models. The benefit of ensemble adversarial training is that it increases the diversity of adversarial examples, so that the model can explore the adversarial example space more fully. There are other types of defense methods as well, but Carlini and Wagner have shown that none of these existing defense methods is robust enough against an adaptive attack.
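The basic adversarial training loop can be sketched as follows. This is a generic illustration, not the specific procedure of any of the works cited above: it assumes a toy bias-free logistic-regression model, a synthetic two-cluster dataset, and single-step fast-gradient examples crafted against the current model at every update.

```python
import math
import random

def predict_proba(w, x):
    """Toy bias-free logistic-regression model: probability of class 1."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

def fgsm(w, x, y, eps):
    """Fast-gradient step against the current model."""
    p = predict_proba(w, x)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * ((g > 0) - (g < 0)) for xi, g in zip(x, grad)]

def adversarial_train(data, eps, epochs=20, lr=0.1):
    """SGD in which every update uses an adversarially perturbed input."""
    w = [0.0, 0.0]
    for _ in range(epochs):
        for x, y in data:
            x_adv = fgsm(w, x, y, eps)       # attack the current model
            p = predict_proba(w, x_adv)
            # Gradient step on the loss at the perturbed input.
            w = [wi - lr * (p - y) * xi for wi, xi in zip(w, x_adv)]
    return w

def robust_accuracy(w, data, eps):
    """Accuracy on inputs perturbed by a fast-gradient attack on w."""
    hits = sum((predict_proba(w, fgsm(w, x, y, eps)) > 0.5) == (y == 1)
               for x, y in data)
    return hits / len(data)

# Synthetic data (an assumption): two well-separated clusters.
rng = random.Random(0)
data = []
for _ in range(20):
    data.append(([1.0 + rng.uniform(-0.4, 0.4),
                  1.0 + rng.uniform(-0.4, 0.4)], 1))
    data.append(([-1.0 + rng.uniform(-0.4, 0.4),
                  -1.0 + rng.uniform(-0.4, 0.4)], 0))

w = adversarial_train(data, eps=0.3)
print(robust_accuracy(w, data, eps=0.3))
```

An ensemble variant would draw the perturbed inputs from several different models rather than only the one being trained, and an iterative variant would replace the single fast-gradient step with a multi-step attack.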

Overall, we are still a long way from finding the optimal defense strategy against these adversarial examples, and we are looking forward to exploring this exciting research area.

Physical Adversarial Sticker Perturbations for YOLO

Physical Adversarial Examples for YOLO (2)

Black box transfer to Faster RCNN of physical adversarial examples generated for YOLO

This article was initially published on the BAIR blog, and appears here with the authors’ permission.

BAIR Blog is the official blog of the Berkeley Artificial Intelligence Research (BAIR) Lab.
