A database of more than 10,000 human images for evaluating biases in artificial intelligence (AI) models for human-centric computer vision is presented in Nature this week. The Fair Human-Centric Image Benchmark (FHIBE), developed by Sony AI, is an ethically sourced, consent-based dataset that can be used to evaluate AI models on human-centric computer vision tasks, making it possible to identify and correct biases and stereotypes.
Computer vision covers a range of applications, from autonomous vehicles to facial recognition technology. Many AI models used in computer vision were developed using flawed datasets that may have been collected without consent, often obtained through large-scale scraping of images from the web. AI models have also been known to reflect biases that may perpetuate sexist, racist, or other stereotypes.
Alice Xiang and colleagues present an image dataset that implements best practices across several dimensions, including consent, diversity, and privacy. FHIBE includes 10,318 images of 1,981 people from 81 distinct countries or regions. The database includes comprehensive annotations of demographic and physical attributes, including age, pronoun category, ancestry, and hair and skin color.
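As a rough illustration of how such annotations support bias audits, the sketch below disaggregates a model's accuracy by one annotated attribute and compares the resulting subgroup scores. The record layout, field names (such as pronoun_category), and predictions are hypothetical placeholders and do not reflect the actual FHIBE schema or any Sony AI tooling.

```python
from collections import defaultdict

# Hypothetical annotated records: each image carries a demographic
# attribute (here, pronoun category) and a ground-truth label.
records = [
    {"image_id": "img_001", "pronoun_category": "she/her",   "label": "person"},
    {"image_id": "img_002", "pronoun_category": "he/him",    "label": "person"},
    {"image_id": "img_003", "pronoun_category": "they/them", "label": "person"},
]

# Hypothetical model predictions keyed by image_id.
predictions = {"img_001": "person", "img_002": "person", "img_003": "object"}

# Tally correct predictions per subgroup to expose performance gaps.
correct = defaultdict(int)
total = defaultdict(int)
for rec in records:
    group = rec["pronoun_category"]
    total[group] += 1
    if predictions.get(rec["image_id"]) == rec["label"]:
        correct[group] += 1

# Report accuracy disaggregated by the annotated attribute.
for group in sorted(total):
    accuracy = correct[group] / total[group]
    print(f"{group}: accuracy = {accuracy:.2f} (n = {total[group]})")
```

Comparing such per-group metrics, rather than a single aggregate score, is what consent-based attribute annotations like those in FHIBE are intended to enable.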