ISEF | Projects Database | Finalist Abstract

Novel Method to Efficiently Analyze Videos Using Spatio-temporal Analysis, Advanced Deep Learning in the Cloud

Booth Id:
ROBO005

Category:
Robotics and Intelligent Machines

Year:
2020

Finalist Names:
Yalamanchili, Laya (School: L C Anderson High School)

Abstract:
According to IDC, the world’s data (the DataSphere) will grow to a mind-boggling 175 zettabytes (175 billion terabytes) by 2025, and video is expected to account for more than 40% of it. With this exponential growth, the ability to automatically recognize objects and their behavior is essential for detecting unusual patterns. Finding such abnormalities in videos benefits applications such as security surveillance (detecting thefts, robberies, and abandoned luggage/packages in public areas), health care (senior fall detection and irregular behavior of patients or medical personnel), quality control of industrial assembly processes, and more. Traditional supervised learning methods for detecting abnormal events require labeling both normal and abnormal behavior, and it is impractical to generate a sufficient volume of labeled data for all possible spatio-temporal variations. As part of this project, I explored unsupervised and semi-supervised methods using a combination of deep neural networks such as CNNs, RNNs, and LSTMs. I also compared machine learning frameworks such as Theano, MXNet, Keras, and TensorFlow on different CPU and GPU machine configurations in the cloud. Finally, I experimented with various video processing and training parameters such as image size, stride length, epochs, and padding. By optimizing these variables, an accuracy rate of 90.3% was achieved across multiple scenarios, with results produced in under 10 seconds. I developed a mechanism to provide near real-time alerts through SMS and email when an abnormal incident occurs or abnormal behavior persists past a pre-set time threshold. The system requires minimal training data, can be set up quickly, and can scale to monitor multiple areas concurrently.
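The abstract does not publish the project's model or alerting code, but the overall pattern it describes (learn a model of "normal" footage, score new frames by how poorly the model reproduces them, and alert on spikes or persistent deviations) can be sketched. The sketch below is a minimal stand-in, not the finalist's implementation: it uses a linear PCA reconstruction in place of the CNN/LSTM networks, and the function names, thresholds, and the "severe spike vs. persistent behavior" rule are illustrative assumptions.

```python
import numpy as np

# Hypothetical sketch of reconstruction-error anomaly detection.
# A PCA subspace stands in for the deep autoencoder; real video frames
# would first be reduced to feature vectors by a CNN/LSTM model.

def fit_normal_model(frames, k=4):
    """Learn a low-rank basis from feature vectors of *normal* frames only."""
    mean = frames.mean(axis=0)
    # Top-k principal directions of the centered normal data.
    _, _, vt = np.linalg.svd(frames - mean, full_matrices=False)
    return mean, vt[:k]

def anomaly_score(frame, mean, basis):
    """Reconstruction error: large when a frame leaves the normal subspace."""
    centered = frame - mean
    recon = basis.T @ (basis @ centered)
    return float(np.linalg.norm(centered - recon))

def alert_stream(scores, threshold, persist_frames):
    """Yield (frame_index, kind) alerts: a sudden severe incident, or an
    abnormal score that persists past a pre-set frame-count threshold."""
    run = 0
    for t, s in enumerate(scores):
        run = run + 1 if s > threshold else 0
        if run == 1 and s > threshold * 2:   # sudden severe incident
            yield (t, "incident")
        elif run >= persist_frames:          # persistent abnormal behavior
            yield (t, "persistent")
            run = 0                          # re-arm after alerting
```

In a deployed system, each yielded alert would be forwarded to an SMS/email gateway; here the generator simply emits tagged frame indices so the scoring and persistence logic can be tested in isolation.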