Robotics and Intelligent Machines
Mukherjee, Anwesha (School: Westview High School)
People with autism spectrum disorder (ASD) have difficulty recognizing and comprehending emotions from others' facial expressions. This inhibits their development of empathy and affects their social interactions. A software package that recognizes emotions in real time would aid people with ASD. This research tested a new approach in which a computer vision (CV) function splits video streams into frame-by-frame images and passes those images to a machine learning (ML) function through a Linux pipe framework to detect emotions. First, a CV function was created to take either BSON or FFmpeg-encoded video input, split it into frames, and output both the frame-by-frame images and the original video input. Then a convolutional neural network (CNN) was designed to recognize facial emotion from both the frame-by-frame images and streaming video. The CNN had two convolution layers, a two-route convolution system with max pooling, a third universal convolution layer with a new Leaky ReLU activation function, global average pooling, and a SoftMax affine layer. It was trained with square grayscale images to avoid bias and confounding variables, and it was tested with live video feed, playback, and frame-by-frame images. The results were compared across the input types. The model achieved an average accuracy of 78% for static images, 62% for playback, and 59% for the live video feed. The research also showed 15% lower accuracy for females and 33% lower accuracy for people of color, which was attributed to a lack of training data in these categories.
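The final stages of the network described above (Leaky ReLU activation, global average pooling, and a SoftMax affine layer) can be illustrated with a minimal pure-Python sketch. This is not the project's implementation: the function names, the slope `alpha=0.01`, and the toy weights are assumptions made for illustration, and the standard Leaky ReLU is used here because the abstract does not specify how the new variant differs from it.

```python
import math


def leaky_relu(x, alpha=0.01):
    """Standard Leaky ReLU: keep positive values, scale negatives by alpha.

    alpha=0.01 is an assumed default, not a value taken from the project.
    """
    return x if x > 0 else alpha * x


def global_average_pool(feature_maps):
    """Reduce each 2-D feature map (a list of rows) to its mean value."""
    pooled = []
    for fmap in feature_maps:
        values = [v for row in fmap for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


def affine(vector, weights, biases):
    """Compute W @ v + b, giving one raw score per emotion class."""
    return [
        sum(w * v for w, v in zip(row, vector)) + b
        for row, b in zip(weights, biases)
    ]


def softmax(scores):
    """Numerically stable softmax: turn raw scores into probabilities."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(feature_maps, weights, biases, alpha=0.01):
    """Apply Leaky ReLU, global average pooling, then the SoftMax affine layer."""
    activated = [
        [[leaky_relu(v, alpha) for v in row] for row in fmap]
        for fmap in feature_maps
    ]
    pooled = global_average_pool(activated)
    return softmax(affine(pooled, weights, biases))


# Toy input: two 2x2 feature maps and a hypothetical two-class output layer.
maps = [[[1.0, -1.0], [2.0, -2.0]], [[4.0, 4.0], [4.0, 4.0]]]
weights = [[1.0, 0.0], [0.0, 1.0]]
biases = [0.0, 0.0]
probabilities = classify(maps, weights, biases, alpha=0.5)
```

Global average pooling leaves one number per feature map, so the affine layer that follows needs only one weight per map per class, which keeps the classifier head small regardless of the input image size.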