About

This website contains results from black-box attacks on models hosted by Clarifai. Details of these attacks can be found in our recent paper [‘Exploring the Space of Black-box Attacks on Deep Neural Networks’][arxiv-link]. The models we attack are the NSFW model and the Content Moderation model. Since the input images to these models can be disturbing and/or offensive, we display both the benign and adversarial images on this website. For any questions, please email Arjun Nitin Bhagoji (abhagoji@princeton.edu).

[arxiv-link]: