Tags:
This study describes the preliminary development of a facial emotion recognition (FER) evaluation system using RGB-D imagery captured with a mobile device. The study outlines a control group with non-occluded faces and a set of participants wearing a head-mounted display (HMD) to represent an occluded facial type. We explore an architecture for developing a FER system suited to occluded facial analysis. This paper describes the methodology, experimental design, and future work that will be undertaken to deliver such a system.
Imagine being able to accurately determine a person’s emotional state from images taken on a consumer device such as an iPhone or iPad. Capturing the face through the device’s colour (RGB) and depth (RGB-D) sensors would allow detection of micro-expressions: small changes in facial muscle movement that only a depth camera can pick up. We test this against a standard emotion model covering happy, sad, angry, confused and a range of other emotional states. To make sure we are detecting the correct emotion, we use a collection of visual and audio stimuli designed to evoke common reactions from people and compare the model’s classification against the expected response.
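As a rough illustration only, the sketch below shows one way RGB and depth input could be fused for emotion classification in Python/PyTorch. The network layout, tensor sizes and the EMOTIONS label list are assumptions made for illustration; this is not the architecture developed in the paper.

# Minimal sketch: a small CNN that fuses RGB and depth into a 4-channel
# input and predicts one of several basic emotion classes.
# The architecture, labels and tensor shapes are illustrative assumptions,
# not the network described in the paper.
import torch
import torch.nn as nn

EMOTIONS = ["happy", "sad", "angry", "confused", "neutral"]  # assumed label set

class RgbdEmotionNet(nn.Module):
    def __init__(self, num_classes: int = len(EMOTIONS)):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=3, padding=1),   # 3 RGB channels + 1 depth channel
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),                       # global pooling -> (N, 64, 1, 1)
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, rgb: torch.Tensor, depth: torch.Tensor) -> torch.Tensor:
        # rgb: (N, 3, H, W), depth: (N, 1, H, W), concatenated in an early-fusion style
        x = torch.cat([rgb, depth], dim=1)
        x = self.features(x).flatten(1)
        return self.classifier(x)

if __name__ == "__main__":
    model = RgbdEmotionNet()
    rgb = torch.rand(1, 3, 128, 128)    # placeholder RGB face crop
    depth = torch.rand(1, 1, 128, 128)  # placeholder depth map of the same crop
    scores = model(rgb, depth)
    print(EMOTIONS[scores.argmax(dim=1).item()])

Early fusion (stacking depth as a fourth channel) is just one option; depth could equally be processed in a separate branch and merged later.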
This technology can be utilised in video calls in either personal or professional settings, informing end users of other participants’ emotional states during the call. In a personal setting it could detect whether someone is upset by the topic of conversation; in a business meeting it could be used to objectively determine whether a participant is disengaged or bored with the current topic. This in turn could be used to improve the inclusiveness of the call or steer its direction.
Alternatively, this technology can be used in medical settings to determine whether a patient on a telemedicine call is in pain or upset. Picture a world where you could express some of your ailments to the doctor without saying a word. Likewise, it can be used in rehabilitative applications to provide dynamic feedback during therapy.
From our work to date we seek to develop a remote analysis tool that can be used in the user’s own home and to expand the potential use cases of this technology. In future work we would seek to alter the weightings for specific facial feature identification in order to bring recognition accuracy for occluded faces closer to that of full facial feature detection. We could also alter the weighting to reflect other forms of facial occlusion, such as wearing a face mask or having heavy facial hair; a simple sketch of this kind of region weighting is given below. Furthermore, as smart augmented reality (AR) glasses become more commonplace, we would be interested in performing a similar study to determine occlusion around AR headsets.
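To illustrate the kind of weighting adjustment described above, a minimal sketch follows. The region names, weight values and the weighted_emotion_score helper are hypothetical placeholders; they only show how occluded regions could be down-weighted so that visible features carry more of the decision.

# Minimal sketch of re-weighting facial regions when part of the face is occluded.
# Region names, weights and scores are hypothetical and illustrative only.
from typing import Dict

def weighted_emotion_score(region_scores: Dict[str, float],
                           region_weights: Dict[str, float]) -> float:
    """Combine per-region emotion scores using occlusion-aware weights."""
    total_weight = sum(region_weights.get(r, 0.0) for r in region_scores)
    if total_weight == 0.0:
        return 0.0
    return sum(score * region_weights.get(region, 0.0)
               for region, score in region_scores.items()) / total_weight

# Example: an HMD occludes the eye region, so its weight is reduced and the
# mouth/jaw regions contribute more to the combined score.
scores = {"eyes": 0.2, "brows": 0.1, "mouth": 0.8, "jaw": 0.7}       # hypothetical per-region scores
hmd_weights = {"eyes": 0.1, "brows": 0.1, "mouth": 1.0, "jaw": 1.0}  # hypothetical occlusion weights
print(round(weighted_emotion_score(scores, hmd_weights), 3))

The same weighting scheme could in principle be reused for a face mask (down-weighting the mouth and jaw) or heavy facial hair, with different weight values.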
As mentioned above, the video conferencing and medical sectors can greatly benefit from this technology. This research was initially carried out in a COVID-19 environment where mask wearing was the norm. Now, as VR and AR headsets become more widespread, the ability to determine emotional state when the face is partially occluded means the research still applies to those who wear smart devices. This could help those in the sports and rehabilitation sectors to gauge training satisfaction, or those with disabilities, such as blindness, to better interpret their environment, which can be a life-changing tool for people with accessibility needs.
Publication Title: Facial Emotion recognition analysis using deep learning through RGB-D imagery of VR participants through partially occluded facial types
Author(s): Ian Mills, Frances Cleary
Publication Date: 2022-03
Name of Journal: 2022 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW)
Link to publication (if available): https://doi.org/10.1109/vrw55335.2022.00282