Families in the Wild

Families in the Wild (FIW): Large-Scale Kinship Image Database and Benchmarks

Joseph P. Robinson, Ming Shao, Yue Wu, and Yun Fu
Department of ECE, College of Engineering
College of Computer and Information Science
Northeastern University, Boston, MA, USA


We present Families in the Wild (FIW), the largest kinship recognition dataset to date. Motivated by the lack of a single, unified dataset for kinship recognition, we aim to provide one that captures the interest of the research community. With only a small team, we collected, organized, and labeled over 10,000 family photos of 1,000 families using an annotation tool designed to mark complex hierarchical relationships and local label information quickly and efficiently. We include several benchmarks for two image-based tasks: kinship verification and family recognition. As baselines, we evaluate several visual features and metric learning methods. We also demonstrate that a pre-trained Convolutional Neural Network (CNN) used as an off-the-shelf feature extractor outperforms the other feature types. Results were boosted further by fine-tuning two deep CNNs on FIW data: (1) for kinship verification, a triplet loss function was learned on top of a network with pre-trained weights; (2) for family recognition, a family-specific softmax classifier was added to the network.
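To make the triplet-loss idea for kinship verification concrete, here is a minimal sketch in plain Python. It assumes a hinge-style loss on squared Euclidean distances between L2-normalized features; the margin of 0.2, the function names, and the toy vectors are illustrative assumptions, not values or code from the FIW benchmarks.

```python
import math

def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def l2_normalize(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss.

    The anchor should be closer to the positive (a relative) than to the
    negative (a non-relative) by at least `margin`. The margin value is an
    assumption chosen for illustration.
    """
    a, p, n = (l2_normalize(v) for v in (anchor, positive, negative))
    return max(0.0, squared_distance(a, p) - squared_distance(a, n) + margin)

# Toy 2-D "face features" (made-up numbers, for illustration only).
anchor = [1.0, 0.0]
relative = [1.0, 0.0]
stranger = [0.0, 1.0]

easy = triplet_loss(anchor, relative, stranger)  # constraint already satisfied
hard = triplet_loss(anchor, stranger, relative)  # constraint violated
```

In this toy case the first triplet contributes no loss, while the second is penalized because the non-relative sits closer to the anchor than the relative does. In a fine-tuning setup the same quantity would be computed on CNN features and minimized by gradient descent.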


Sample faces for the 11 relationship types of FIW. Parent-child (top row): Father-Daughter (F-D), Father-Son (F-S), Mother-Daughter (M-D), Mother-Son (M-S). Grandparent-grandchild (middle row): same labeling convention as above. Siblings (bottom row): Sister-Brother (SIBS), Brother-Brother (B-B), Sister-Sister (S-S).


Method to construct FIW. Data Collection: a list of candidate families (each with a unique FID) and photos (each with a unique PID) is collected. Data Annotation: a labeling tool streamlines the process of marking the complex hierarchical structure of the 1,000 family trees of FIW. Data Parsing: the two sets of labels generated by the tool are post-processed to partition the data for kinship verification and family recognition.
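The parsing step can be pictured with a small, hypothetical data layout. The sketch below assumes that one set of labels records which family members appear in each photo (keyed by PID) and the other records how members of a family (keyed by FID) are related. The `Family` class, its field names, and the ID formats are assumptions made for illustration and do not describe the actual FIW file format.

```python
from collections import defaultdict
from dataclasses import dataclass, field

@dataclass
class Family:
    """Hypothetical record for one family in the dataset."""
    fid: str                                        # unique family ID
    photos: dict = field(default_factory=dict)      # PID -> member IDs in that photo
    relations: dict = field(default_factory=dict)   # (member_a, member_b) -> pair type

def pairs_by_type(families):
    """Group related member pairs by relationship type (for kinship verification)."""
    grouped = defaultdict(list)
    for fam in families:
        for (a, b), rel in fam.relations.items():
            grouped[rel].append((fam.fid, a, b))
    return dict(grouped)

def photos_by_family(families):
    """Map each FID to its sorted PIDs (the labels used for family recognition)."""
    return {fam.fid: sorted(fam.photos) for fam in families}

# Two made-up families, for illustration only.
toy = [
    Family(
        "F0001",
        {"P00001": [1, 2], "P00002": [1, 3]},
        {(1, 2): "F-D", (1, 3): "F-S", (2, 3): "SIBS"},
    ),
    Family("F0002", {"P00003": [1, 2]}, {(1, 2): "M-S"}),
]

verification_pairs = pairs_by_type(toy)
recognition_labels = photos_by_family(toy)
```

Here `verification_pairs` holds the positive pairs per relationship type, and `recognition_labels` holds the photo-to-family assignment, mirroring the two tasks the parsed data is partitioned for.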


Comparison of FIW with related datasets.


Pair counts for FIW and related datasets.


Relationship-specific ROC curves depicting the performance of each method.
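For readers unfamiliar with how such curves are obtained, the following sketch computes ROC points and the area under the curve from similarity scores and ground-truth kin/non-kin labels. The scores and labels are made up, the function names are illustrative, and ties between scores are not given special treatment; it is not the evaluation code behind the figure.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points from sweeping a threshold down the ranked scores.

    `labels` holds 1 for kin pairs and 0 for non-kin pairs. Tied scores are
    processed one at a time, which is a simplification.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for _, label in ranked:
        if label:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )

# Made-up similarity scores for six face pairs (1 = kin, 0 = non-kin).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
area = auc(points)
```

Running the same computation separately on the pairs of each relationship type (F-D, F-S, and so on) yields one curve per type.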


Verification accuracy scores (%) for the 5-fold experiment on FIW. No family overlap between folds.
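The "no family overlap" constraint means folds are assigned per family rather than per pair, so that no family contributes pairs to both training and test data. A minimal sketch of one way to do this is shown below; the round-robin assignment, the seed, and the toy pair tuples are assumptions for illustration, not the split procedure used for the reported scores.

```python
import random

def family_disjoint_folds(fids, k=5, seed=0):
    """Assign each unique family ID to one of k folds (round-robin after a shuffle)."""
    unique = sorted(set(fids))
    random.Random(seed).shuffle(unique)
    return {fid: i % k for i, fid in enumerate(unique)}

def split_pairs(pairs, fold_of, test_fold):
    """Split pairs into train/test by the fold of their family (first tuple element)."""
    train = [p for p in pairs if fold_of[p[0]] != test_fold]
    test = [p for p in pairs if fold_of[p[0]] == test_fold]
    return train, test

# Ten made-up families with three pairs each, for illustration only.
pairs = [(f"F{n:04d}", "a", "b") for n in range(1, 11) for _ in range(3)]

fold_of = family_disjoint_folds([p[0] for p in pairs])
train, test = split_pairs(pairs, fold_of, test_fold=0)
```

Because every pair follows its family into a single fold, a model evaluated on the held-out fold never sees faces from the test families during training.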

Dataset and Code

1. FIW Dataset: [download] (available soon)

2. Code: [download] (available soon)


Families in the Wild (FIW): Large-Scale Kinship Image Database and Benchmarks. ACM MM 2016 [pdf] [poster] [BibTeX]