Sixth International Workshop on Knowledge
Discovery from Sensor Data (Sensor-KDD '12)


To be held in conjunction with

KDD '12
August 12, 2012
Beijing, China.



Invited Speakers

In addition to the oral presentation of accepted papers, there will be four invited speakers:

Dr. Ashok N. Srivastava
Intelligent Data Understanding Group Lead
NASA Ames Research Center
Moffett Field, CA 94035


Dr. Ian Davidson
Department of Computer Science, University of California, Davis


Dr. Hillol Kargupta
Department of Computer Science and Electrical Engineering,
University of Maryland, Baltimore County


Dr. Ralf Birken
Department of Civil and Environmental Engineering
Northeastern University
Boston, MA

Invited Speaker Bio-Sketches and Abstracts:

Dr. Ashok N. Srivastava is the Project Manager for the System-Wide Safety and Assurance Technologies Project at NASA. He was formerly the Principal Investigator for the Integrated Vehicle Health Management research project at NASA. His current research focuses on the development of data mining algorithms for anomaly detection in massive data streams, kernel methods in machine learning, and text mining algorithms. Dr. Srivastava is also the leader of the Intelligent Data Understanding group at NASA Ames Research Center. The group performs research and development of advanced machine learning and data mining algorithms in support of NASA missions. He performs data mining research in a number of areas in aviation safety and in application domains such as earth sciences, to study global climate processes, and astrophysics, to help characterize the large-scale structure of the universe.

Greener Aviation with Virtual Sensors: A Case Study


The environmental impact of aviation is enormous given the fact that in the US alone there are nearly 6 million flights per year of commercial aircraft. This situation has driven numerous policy and procedural measures to help develop environmentally friendly technologies which are safe and affordable and reduce the environmental impact of aviation. However, many of these technologies require significant initial investment in newer aircraft fleets and modifications to existing regulations, both of which are long and costly enterprises. We demonstrate the use of an anomaly detection method based on Virtual Sensors to help detect overconsumption of fuel in aircraft. The method relies only on the data recorded during flight by most existing commercial aircraft, which significantly reduces the cost and complexity of implementing it. The Virtual Sensors developed here are ensemble-learning regression models for detecting the overconsumption of fuel based on instantaneous measurements of the aircraft state. This approach requires no additional information about standard operating procedures or other encoded domain knowledge. We present experimental results on three data sets and compare five different Virtual Sensors algorithms. The first two data sets are publicly available and consist of a simulated data set from a flight simulator and a real-world turbine disk. These sets contain seeded faults, meaning that the faults have been deliberately injected into the system. We show the ability to detect anomalies with high accuracy on these data sets. The third data set is from a real-world fleet of 330 jet aircraft, where we show the ability to detect fuel overconsumption which can have a significant environmental and economic impact. To the best of our knowledge, this is the first study of its kind in the aviation domain.
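The Virtual Sensor idea in the abstract above can be illustrated with a minimal sketch. It is not the speaker's implementation: the bagged one-variable linear regressors, the synthetic throttle and fuel-flow data, the three-standard-deviation threshold, and all function names are assumptions chosen for illustration. An ensemble regression model predicts fuel flow from an instantaneous measurement of the aircraft state, and a sample is flagged when the measured value exceeds the prediction by more than the threshold.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares fit of y = a + b * x; returns (a, b)."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_virtual_sensor(xs, ys, n_models=25, seed=0):
    """Bagging: fit one line per bootstrap resample of the training data."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    while len(models) < n_models:
        idx = [rng.randrange(n) for _ in range(n)]
        if len({xs[i] for i in idx}) < 2:
            continue  # skip degenerate resamples with a single x value
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def predict(models, x):
    """Ensemble prediction: the mean of the member predictions."""
    return statistics.fmean(a + b * x for a, b in models)


def flag_overconsumption(models, xs, ys, threshold):
    """Indices where the measurement exceeds the prediction by more than threshold."""
    return [i for i, (x, y) in enumerate(zip(xs, ys))
            if y - predict(models, x) > threshold]


# Hypothetical training data: fuel flow grows linearly with a throttle setting.
rng = random.Random(42)
train_x = [float(x) for x in range(50)]
train_y = [2.0 + 0.5 * x + rng.uniform(-0.1, 0.1) for x in train_x]
sensor = fit_virtual_sensor(train_x, train_y)

# Threshold: three standard deviations of the training residuals (an assumption).
residuals = [y - predict(sensor, x) for x, y in zip(train_x, train_y)]
threshold = 3 * statistics.stdev(residuals)

# New flight samples; the sample at index 3 burns 5 units more than expected.
test_x = [5.0, 10.0, 20.0, 30.0, 40.0]
test_y = [2.0 + 0.5 * x for x in test_x]
test_y[3] += 5.0
flagged = flag_overconsumption(sensor, test_x, test_y, threshold)
print(flagged)
```

Only positive residuals are flagged, since the quantity of interest is overconsumption; a real system would use many state variables and a stronger regressor per ensemble member.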

Dr. Hillol Kargupta is a Professor of Computer Science at the University of Maryland, Baltimore County. He is also a co-founder of AGNIK, a vehicle performance data analytics company for mobile, distributed, and embedded environments. He received his Ph.D. in Computer Science from the University of Illinois at Urbana-Champaign in 1996. His research interests include mobile and distributed data mining. Dr. Kargupta is an IEEE Fellow. He won the IBM Innovation Award in 2008 and a National Science Foundation CAREER award in 2001 for his research on ubiquitous and distributed data mining. He and his team received the 2010 Frost and Sullivan Enabling Technology of the Year Award for the MineFleet vehicle performance data mining product and the IEEE Top-10 Data Mining Case Studies Award. His other awards include the best paper award for the 2003 IEEE International Conference on Data Mining for a paper on privacy-preserving data mining, the 2000 TRW Foundation Award, and the 1997 Los Alamos Award for Outstanding Technical Achievement. His dissertation earned him the 1996 Society for Industrial and Applied Mathematics annual best student paper prize. He has published more than one hundred peer-reviewed articles. His research has been funded by the US National Science Foundation, US Air Force, Department of Homeland Security, NASA and various other organizations. He has co-edited several books. He has served as an associate editor of the IEEE Transactions on Knowledge and Data Engineering, IEEE Transactions on Systems, Man, and Cybernetics, Part B, and Statistical Analysis and Data Mining Journal. He has been Program Co-Chair of the 2009 IEEE International Data Mining Conference, General Chair of the 2007 NSF Next Generation Data Mining Symposium, Program Co-Chair of the 2005 SIAM Data Mining Conference and Associate General Chair of the 2003 ACM SIGKDD Conference, among others.


Next Generation of Machine-to-Machine Environments, and Distributed Data Mining


The next generation of Machine-to-Machine (M2M) networks will be dealing with billions of devices connected over wireless networks. Most large wireless network carriers of the world are now gearing up with major investments in the M2M world. This talk will focus on data analytics in M2M and use in-vehicle platforms for illustration. Modern vehicles are embedded with a variety of sensors monitoring different functional components of the car and the driver's behavior. With vehicles getting connected over wide-area wireless networks, much of this vehicle diagnostic data, along with location and accelerometer information, is now accessible to a wider audience through wireless aftermarket devices. This data offers a rich source of information about vehicle and driver performance. Once it is combined with other contextual data about the car, environment, location, and the driver, it can offer exciting possibilities. Distributed data mining technology powered by onboard analysis of data is changing the face of such vehicle telematics applications for the consumer market, insurance industry, car repair chains and car OEMs. This talk will offer an overview of the market and emerging product types, and identify some of the core technical challenges. It will describe how advanced data analysis has helped create innovative new products and made them commercially successful. The talk will offer a perspective on the algorithmic issues and describe their practical significance. It will end with remarks on how the next generation of data mining researchers can play an important role in shaping this area.


Dr. Ian Davidson is an Associate Professor in the Department of Computer Science at the University of California at Davis. His research interests include constrained clustering, behavioral analysis using network analysis, tensor decomposition, and semi-supervised and unsupervised models of transfer learning, among others. He has published numerous articles in peer-reviewed conferences, journals and books. His research has been funded by NSF (CAREER Award), Office of Naval Research, DoD and Google (Research Award). Dr. Davidson has served in various capacities on the editorial boards of IEEE Transactions of Knowledge Discovery and Data Mining, Knowledge Discovery and Data Mining, and Knowledge and Information Systems, and on the program committees of SDM 2012, ICDM 2011 and KDD 2012.


New Approaches for Analyzing Raw fMRI Data


fMRI data is arguably one of the most complex forms of sensor data. Typically there may be many readings per second over a spatial region consisting of close to one quarter million zones. With the advent of cheaper scanners there now exists the possibility to collect multiple readings from the same people over time and multiple readings from a population of individuals. This offers a whole range of challenges and opportunities. These include how to fuse this data together and, importantly, how to discover knowledge beyond simple labels, anomalies and clusters without heavy pre-processing, a challenge common to many fields in sensor analysis. Furthermore, with such complex and plentiful data comes the reality that there may be many explanations of the data, and finding explanations that are plausible and usable requires injecting domain expertise. In this talk we will provide a high-level overview of these problems and our initial progress at addressing them.



Dr. Ralf Birken is a Research Assistant Professor of Civil and Environmental Engineering at Northeastern University, Boston. He received his Ph.D. in Geophysical and Geological Engineering from The University of Arizona in 1997 and an MS in Geophysics from the University of Cologne, Germany in 1992. He has over 15 years of research and engineering experience in industry and academia in near-surface geophysical mapping, designing new measurement systems and sensor technology. His current research focuses on the integration of multi-channel multi-domain sensor technology into geophysical subsurface imaging systems. Dr. Birken is an expert in applied electromagnetic geophysics and is also interested in the efficient, large-scale, and accurate mapping of subsurface utilities by combining modern positioning systems with geophysical array technology. Dr. Birken is actively involved with the VOTERS (Versatile Onboard Traffic-Embedded Roaming Sensors) project as project manager and lead scientist. The VOTERS project provides a framework to shift from periodical localized inspections to continuous network-wide health monitoring of roadways and bridge decks. Research focuses on the development of a cost-effective, lightweight package of advanced radar, acoustic, and optical sensor technology that is compatible with this framework. VOTERS’ technology, once installed beneath a fleet vehicle, can monitor road conditions at both the surface and subsurface levels. At the same time, the hazardous, congestion-prone work zones that are typically set up to gather these critical inspection data sets are eliminated, as the traffic-embedded vehicle roams through daily traffic going about its normal business.

VOTERS: Design of a Mobile Multi-Modal Multi-Sensor System


The VOTERS (Versatile Onboard Traffic-Embedded Roaming Sensors) project provides a framework to complement periodical localized inspections of roadways and bridge decks with continuous network-wide health monitoring. Utilizing traffic-embedded Vehicles Of Opportunity (VOOs) roaming through daily traffic eliminates the hazardous, congestion-prone work zones that are typically set up to gather these critical inspection data sets. It also provides maintenance decision makers and researchers with a temporal and spatial data set not available in roadway and bridge deck inspection today.
Research focuses on the development of a cost-effective, lightweight package of multi-modal sensor systems compatible with this framework. At the same time an innovative software infrastructure is created that collects, processes, and evaluates these large time-lapse multi-modal data streams with the purpose of detecting anomalies without having to discard any data unseen. Part of the overall VOTERS system is a VOTERS control center which is in constant wireless communication with the VOOs equipped with the autonomous VOTERS sensing system.
VOTERS’ technology, once installed beneath multiple VOOs, can frequently inspect road conditions at both the surface and subsurface levels using advanced ground penetrating radar, acoustic, and optical sensors, and a compatible controller technology. Each VOO requires an on-board controller that manages the individual sensor systems, synchronizes data streams, and registers all data streams in time and space, as well as an access point to the control center.
The control center manages multiple VOOs and the data for further analysis, visualization, and decision making. The two most distinctive features of this data set are the network-wide coverage and the constant repeats, which create a time-lapse data set that allows for the monitoring of the deterioration process at unprecedented time intervals, thereby providing experimental results that can be used in life-cycle cost analysis models.
Various software infrastructure design and implementation strategies that are compatible with the requirements given by the VOTERS system and framework will be explored. A hierarchical multi-tier architecture leads to the distribution of responsibilities throughout the system. System communication is a key feature that ties the various subcomponents of the system together. Sensor fusion aspects will be discussed considering the need for accurate spatial and temporal registration of all sensor data streams. A multi-level processing strategy is desired as the amount of data collected exceeds the communication bandwidth. The multi-domain and heterogeneous Operating System environment requires special attention.
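The temporal and spatial registration mentioned in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the VOTERS software: the data layout (time-stamped position fixes and time-stamped sensor samples), the use of linear interpolation between fixes, and the function names are all hypothetical.

```python
from bisect import bisect_right


def interpolate_position(fixes, t):
    """Position at time t by linear interpolation between neighboring fixes.

    fixes: list of (time_s, x_m, y_m) tuples sorted by strictly increasing time.
    """
    times = [f[0] for f in fixes]
    if not times[0] <= t <= times[-1]:
        raise ValueError("sample time lies outside the positioning record")
    i = bisect_right(times, t)
    if i == len(times):
        i -= 1  # t coincides with the last fix
    t0, x0, y0 = fixes[i - 1]
    t1, x1, y1 = fixes[i]
    w = (t - t0) / (t1 - t0)
    return x0 + w * (x1 - x0), y0 + w * (y1 - y0)


def register(samples, fixes):
    """Attach an interpolated (x, y) position to each (time_s, value) sample."""
    return [(t, *interpolate_position(fixes, t), value) for t, value in samples]


# Hypothetical data: 1 Hz position fixes and irregularly timed sensor samples.
fixes = [(0.0, 0.0, 0.0), (1.0, 10.0, 0.0), (2.0, 10.0, 20.0)]
samples = [(0.5, "a"), (1.25, "b"), (2.0, "c")]
registered = register(samples, fixes)
for row in registered:
    print(row)
```

Running every sensor stream through the same positioning record places all samples in one shared time and space frame, which is the precondition for fusing them.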



SensorKDD Cup
KDD 2012
Datastreams Mining in Wikipedia

© 2012 | KDD 2012 - ACM Workshop on Knowledge Discovery from Sensor Data