Datasets – Augmented Cognition Lab

Open-source datasets collected at ACLab

License: “By downloading or using any of the datasets provided by the ACLab, you are agreeing to the “Non-commercial Purposes” condition. “Non-commercial Purposes” means research, teaching, scientific publication and personal experimentation. Non-commercial Purposes include use of the Dataset to perform benchmarking for purposes of academic or applied research publication. Non-commercial Purposes does not include purposes primarily intended for or directed towards commercial advantage or monetary compensation, or purposes intended for or directed towards litigation, licensing, or enforcement, even in part. These datasets are provided as-is, are experimental in nature, and not intended for use by, with, or for the diagnosis of human subjects for incorporation into a product.”

Datasets

AIR-125: A manually annotated infant respiration dataset, consisting of 125 videos of 8 infants, drawn from baby monitors and YouTube.
Non-nutritive sucking in-the-wild dataset (NNS) – A public in-the-wild NNS dataset consisting of 10 naturalistic infant video clips annotated for NNS activity.
SPAC-Animals – This dataset contains synthetic images of two species, zebra and rhino, generated by SPAC-Net.
Infant Annotated Faces (InfAnFace) – This dataset consists of 410 images of infant faces with labels for 68 facial landmark locations and various pose attributes.
Synthetic and Real Infant Pose (SyRIP) – An infant pose dataset with diverse and fully-annotated real infant images and generated synthetic infant images.
Simultaneously-collected multimodal Lying Pose (SLP) – The first-ever large-scale dataset of in-bed poses. SLP is pronounced “SLEEP”.
ScanAva+: A recent update to ScanAva that adds subjects, for a total of 44 scans.
ScanAva – A large synthetic human pose dataset, called Scanned Avatar (ScanAva), built from 3D scans of 7 individuals using the augmentation approach presented in our ECCV 2018 workshop paper “A Semi-Supervised Data Augmentation Approach using 3D Graphical Engines.”
The Emotional Voices Database – This dataset was built for emotional speech synthesis. The transcripts are based on the CMU Arctic database. The database includes recordings from four speakers (2 male and 2 female). The emotional styles are neutral, sleepiness, anger, disgust and amused. Each audio file is recorded in 16-bit .wav format.
Mannequin RGB in-bed dataset (High-res) – This in-bed pose dataset was collected with a regular webcam in a simulated hospital room in the College of Health Science at Northeastern University.
Mannequin RGB in-bed dataset (Low-res) – This in-bed pose dataset was collected with a regular webcam in a simulated hospital room in the College of Health Science at Northeastern University. The images have been downsampled to work with our In-Bed-Posture-Estimation code.
Mannequin IRS in-bed dataset – This in-bed pose dataset was collected with our infrared selective (IRS) system in a simulated hospital room in the College of Health Science at Northeastern University. The raw pose dataset is provided with labels, and the images keep the color and resolution they had at collection.
Preprocessed multiple direction dataset – We also provide a preprocessed version of the IRS images with multiple lying directions. All images are scaled and converted to 3-channel grayscale data, which can be fed directly to our In-Bed-Pose-Estimation code.

ToolBoxes & Apps

AI Human Co-Labeling Toolbox (AH-CoLT) – The goal of AH-CoLT is to provide an efficient, augmentative annotation tool that facilitates creating large labeled visual datasets. The toolbox offers a semi-automatic ground-truth generation framework for unlabeled images and videos. AH-CoLT enables accurate ground-truth labeling by incorporating the outputs of state-of-the-art AI recognizers into a time-efficient human review-and-revise process.
Biosignal-Specific Processing (Bio-SP) Tool – This MathWorks toolbox provides a biosignal-specific processing pipeline for analyzing physiological signals in a modular fashion, based on state-of-the-art studies reported in the scientific literature. The accompanying paper, “A Biosignal-Specific Processing Tool for Machine Learning and Pattern Recognition”, was published at the IEEE-NIH 2017 Special Topics Conference on Healthcare Innovations and Point-of-Care Technologies (HI-POCT 2017).
ACLab Video and Motion Collector – For the vision-inertial data fusion evaluation in our IEEE ION paper “First-Person Indoor Navigation via Vision-Inertial Data Fusion”, we developed an iPhone application that collects video and IMU data synchronously with an adjustable recording frequency.
Kinect V2 Recorder – Microsoft Windows RGBD data recorder for Kinect V2. It supports an adaptive frame rate, lossy compression of RGB data, and lossless compression of depth data. Recorded data can be accessed with the included Python script or the reader software below.
Kinect V2 Reader – Microsoft Windows RGBD data playback for Kinect V2. It can play back and use Kinect RGBD data recorded by the recorder above.