Challenges and Approaches in Connected Vehicles Data Wrangling Ford Motor Company

SAE Technical Papers (1906-current) Available online

Format:: Conference/Event
Author/Creator:: Raman, Raman, author.
Contributor:: Narsude, Mayur; Padmanaban, Damodharan
Conference Name:: WCX 17: SAE World Congress Experience (2017-04-04 : Detroit, Michigan, United States)
Language:: English
Physical Description:: 1 online resource
Place of Publication:: Warrendale, PA: SAE International, 2017
Summary:: Abstract: This manuscript compares window-based data imputation approaches for data collected from connected vehicles during actual driving scenarios using on-board data acquisition devices. Three distinct window-based approaches were used for cleansing and imputing the missing values in different CAN-bus (Controller Area Network) signals. The window lengths used for imputation in the three approaches were: 1) the entire time-course for each vehicle ID, 2) a day, and 3) a trip (defined as the duration between the vehicle's ignition turning ON and OFF). An algorithm for identifying ignition ON and OFF events is also presented, since this signal was not explicitly captured during the data acquisition phase. As a case study, these imputation techniques were applied to the data from a driver behavior classification experiment. Forty-four connected vehicles provided data on various signals, namely engine speed, vehicle speed, engine torque, brake, clutch, accelerator pedal, and gear. Distribution plots for all variables looked similar when the three methods were compared; in particular, the shapes of the histograms were the same for all methods. However, the dataset was around 37% larger for both the vehicle ID-wise and day-wise imputed datasets than for the trip-wise imputed dataset. K-Means clustering did not show significant differences between the vehicle ID-wise and day-wise imputed datasets, but around 16% of vehicles were assigned to different clusters when trip-wise imputed data was used. The trip window was judged superior to the other two window sizes because it provides a means to remove noisy records from the connected vehicle data, thus increasing the robustness of any analytical model built on top of it, in keeping with the garbage-in, garbage-out rule. Given the scale of the data, big data tools such as Hive and Spark were used on the Hadoop platform to process and impute the dataset.
Notes:: Vendor supplied data
Publisher Number:: 2017-01-0069
Access Restriction:: Restricted for use by site license
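
The window comparison described in the Summary can be illustrated with a minimal Python sketch. It is not the paper's method: the record layout, the `GAP_SECONDS` threshold, the gap-based stand-in for ignition ON/OFF detection, and the forward-fill imputation rule are all assumptions made for illustration only, since the record does not describe the actual algorithms.

```python
from itertools import groupby

# Hypothetical threshold: a silence longer than this is treated as ignition OFF.
GAP_SECONDS = 300

def assign_trips(records, gap=GAP_SECONDS):
    """Tag each record with a per-vehicle trip index based on time gaps."""
    out = []
    ordered = sorted(records, key=lambda r: (r["vehicle_id"], r["t"]))
    for _, rows in groupby(ordered, key=lambda r: r["vehicle_id"]):
        trip, prev_t = 0, None
        for r in rows:
            # A large gap since the previous record starts a new trip.
            if prev_t is not None and r["t"] - prev_t > gap:
                trip += 1
            out.append({**r, "trip": trip})
            prev_t = r["t"]
    return out

def impute(records, signal, window_key):
    """Forward-fill `signal` inside each window.

    Rows with no earlier value in the same window are dropped (assumed rule).
    """
    out = []
    ordered = sorted(records, key=lambda r: (window_key(r), r["t"]))
    for _, rows in groupby(ordered, key=window_key):
        last = None  # the fill value never crosses a window boundary
        for r in rows:
            value = r[signal] if r[signal] is not None else last
            if value is None:
                continue
            out.append({**r, signal: value})
            last = value
    return out

# Made-up sample: one vehicle, two bursts of records separated by a long gap.
records = [
    {"vehicle_id": "A", "t": 0, "engine_speed": 800.0},
    {"vehicle_id": "A", "t": 1, "engine_speed": None},
    {"vehicle_id": "A", "t": 1000, "engine_speed": None},
    {"vehicle_id": "A", "t": 1001, "engine_speed": 900.0},
]

tagged = assign_trips(records)
by_vehicle = impute(tagged, "engine_speed", lambda r: r["vehicle_id"])
by_trip = impute(tagged, "engine_speed", lambda r: (r["vehicle_id"], r["trip"]))
```

In this sketch the vehicle-wise window carries a value from the first trip into the start of the second, while the trip-wise window drops that record instead. This is one way a trip window could yield a smaller, less noisy dataset, in line with the size difference the Summary reports.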
