QUANTIFYING THE EXTENT TO WHICH POPULAR PRE-TRAINED CONVOLUTIONAL NEURAL NETWORKS IMPLICITLY LEARN HIGH-LEVEL PROTECTED ATTRIBUTES

Report ID: TR-002-18
Author: Roberts, Claudia
Date: 2018-04-20
Pages: 41
Download Formats: PDF
Abstract:

In the widely used computer-vision technique of “transfer learning,” researchers adapt publicly released models pre-trained on millions of images to new tasks, for example, determining whether a person talking in a video is telling the truth. But could the resulting classifier be biased? Has the pre-trained neural network learned high-level features that correspond to protected attributes such as race, gender, religion, or disability status? Understanding the high-level features encoded in deep neural network representations is pivotal to understanding the kinds of biases that transfer learning may introduce across a broad range of applications. In this paper, we quantify the extent to which three popular pre-trained convolutional neural networks implicitly learn and encode age, gender, and race information during the transfer learning process. Results indicate that these readily used pre-trained models encode information that can be used to infer protected attributes such as race, gender, or age with very high accuracy, even when only very limited labeled data is available.
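The kind of measurement the abstract describes can be sketched as a linear "probe": take fixed feature vectors from a pre-trained CNN's penultimate layer and train a simple classifier to predict a protected attribute from them; if the probe succeeds, the attribute is encoded in the representation. The sketch below is illustrative only and is not the report's actual experimental setup: random vectors with an injected label-correlated direction stand in for real CNN features, and all sizes and names are assumptions.

```python
# Minimal sketch of attribute probing, assuming 512-d embeddings and a
# binary protected attribute. Synthetic vectors stand in for features
# extracted from a real pre-trained CNN.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

n_samples, feat_dim = 1000, 512          # hypothetical embedding size
labels = rng.integers(0, 2, n_samples)   # binary protected attribute

# Stand-in for CNN features: Gaussian noise plus a label-correlated
# direction, mimicking attribute information "leaking" into embeddings.
direction = rng.normal(size=feat_dim)
features = rng.normal(size=(n_samples, feat_dim)) + np.outer(labels, direction)

# Train the linear probe on a subset and evaluate on held-out samples.
X_tr, X_te, y_tr, y_te = train_test_split(
    features, labels, test_size=0.3, random_state=0
)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(f"probe accuracy: {probe.score(X_te, y_te):.2f}")
```

Held-out probe accuracy well above chance (0.5 here) would indicate that the representation encodes the attribute, which is the sense in which the report quantifies implicit learning of age, gender, and race.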