COMPUTERIZED EMOTIONS CLASSIFICATION SYSTEM BASED ON FACIAL EXPRESSION RECOGNITION USING IMAGE PROCESSING AND ARTIFICIAL INTELLIGENCE TECHNIQUES
Date: 2025-06-16
Degree: Doctoral Thesis
Programme: Doctor of Information Systems
Author: Qian Cheng
Supervisor: Alexandre Lobo, University of Saint Joseph, Macau
Abstract:
Facial expression recognition is a key topic in computer vision and plays a crucial role in non-verbal communication. With the rapid development of artificial intelligence, significant progress has been made in recognition accuracy and generalization. Traditional methods often suffer from low precision and poor generalization, whereas deep learning models have substantially advanced the field; however, deploying large, complex models across different platforms remains challenging because of their high computational demands and framework dependencies. Developing an efficient, accurate, and lightweight real-time facial expression recognition system is therefore critical. This study focuses on building such a system with an emphasis on cross-platform deployment, integrating a range of deep learning and optimization techniques to demonstrate flexibility across platforms.

The study first evaluates the performance of 10 advanced CNN models (VGG16, VGG19, ResNet50, etc.) on the facial expression dataset FER2013. YOLOv8 combined with ResNet50 achieved 70.56% accuracy on FER2013, outperforming YOLOv8 alone by 2.1%. A multi-module fusion model (MobileFaceNet, IR50, HyViT, and SE modules) achieved accuracies of 92.58% and 74.8% on the RAF-DB and FER2013 datasets, respectively, and showed superior performance in ablation experiments. Given the significant impact of data quality on model performance, the FER2013 dataset was cleaned, yielding a 3.25% accuracy improvement for the YOLOv8 + ResNet50 model, which reached 73.81%. The higher-resolution RAF-DB dataset, which contains fewer labeling errors, likewise supported strong performance, with the fusion model reaching 92.56% accuracy.

A multi-purpose facial expression recognition system, VISTA, was developed in Python with PyQt5. The system supports multiple data formats and provides real-time emotional feedback, enhancing its usability for both research and practical applications. The fusion model was further quantized with the OpenVINO toolkit, reducing its model size by 75% while maintaining 91.17% accuracy and improving inference speed. XGrad-CAM was employed to enhance model interpretability, revealing that the YOLOv8 + ResNet50 combination captured facial features more effectively. Finally, the high-performance model was successfully deployed on Intel CPUs, NVIDIA GPUs, and an embedded device (Raspberry Pi 4B), demonstrating the portability and flexibility of the VISTA system across platforms. This research offers promising solutions for human-computer interaction, affective computing, and real-time emotion analysis, with notable advances in real-time performance, accuracy, and cross-platform deployment. It contributes to the development of facial expression recognition technology and lays the foundation for its wider application in fields such as smart healthcare, business analytics, education, and mental health.
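To make the two-stage approach referred to above more concrete, a minimal illustrative sketch follows; it is not the implementation used in this thesis. It assumes one plausible reading of "YOLOv8 combined with ResNet50", namely a face-detection-then-classification cascade, and relies on the ultralytics YOLOv8 Python API together with a torchvision ResNet50 fitted with a seven-class head for the FER2013 emotions. The weight files and input image named here are hypothetical placeholders.

```python
# Hypothetical sketch: YOLOv8 face detection followed by ResNet50 expression
# classification. Weight file names and the input image are placeholders.
import cv2
import torch
import torch.nn as nn
from torchvision import models, transforms
from ultralytics import YOLO

# Standard FER2013 class order (0-6).
EMOTIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]

# Stage 1: face detector (assumes a YOLOv8 model trained to detect faces).
detector = YOLO("yolov8n-face.pt")  # placeholder weights

# Stage 2: ResNet50 classifier with a 7-class head fine-tuned on FER2013.
classifier = models.resnet50(weights=None)
classifier.fc = nn.Linear(classifier.fc.in_features, len(EMOTIONS))
classifier.load_state_dict(torch.load("resnet50_fer2013.pt", map_location="cpu"))  # placeholder
classifier.eval()

preprocess = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

def classify_faces(image_bgr):
    """Detect faces with YOLOv8, then classify each face crop with ResNet50."""
    results = detector(image_bgr, verbose=False)
    predictions = []
    for box in results[0].boxes.xyxy.cpu().numpy():
        x1, y1, x2, y2 = box.astype(int)
        face_rgb = cv2.cvtColor(image_bgr[y1:y2, x1:x2], cv2.COLOR_BGR2RGB)
        with torch.no_grad():
            logits = classifier(preprocess(face_rgb).unsqueeze(0))
        predictions.append(((x1, y1, x2, y2), EMOTIONS[int(logits.argmax())]))
    return predictions

if __name__ == "__main__":
    frame = cv2.imread("example.jpg")  # placeholder input image
    print(classify_faces(frame))
```

In a real-time setting the same function would simply be called on successive frames from a webcam capture loop, which is consistent with the real-time feedback described for the VISTA system.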
Keywords: Facial Expression Recognition, Deep Learning, Transfer Learning, Intelligent System, Internet of Things.
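For the quantization and deployment step summarised in the abstract, the outline below shows a minimal post-training INT8 quantization sketch using OpenVINO's NNCF API; it is not the thesis's actual pipeline, and the IR file names, calibration data, and class count are assumptions made purely for illustration.

```python
# Hypothetical sketch: post-training INT8 quantization of an exported fusion
# model with OpenVINO/NNCF, then CPU inference. Paths and data are placeholders.
import numpy as np
import nncf
import openvino as ov

core = ov.Core()
fp32_model = core.read_model("fusion_model_fp32.xml")  # placeholder OpenVINO IR

# Small calibration set of preprocessed face crops (1, 3, 224, 224); random
# data stands in for real samples here.
calibration_images = [np.random.rand(1, 3, 224, 224).astype(np.float32) for _ in range(100)]
calibration_dataset = nncf.Dataset(calibration_images, lambda x: x)

# Post-training quantization maps FP32 weights/activations to INT8, which is
# where a roughly 4x (about 75%) reduction in model size comes from.
int8_model = nncf.quantize(fp32_model, calibration_dataset)
ov.save_model(int8_model, "fusion_model_int8.xml")

# Inference with the quantized model; the device string can be changed to
# target other supported hardware.
compiled = core.compile_model(int8_model, device_name="CPU")
output_layer = compiled.output(0)
probs = compiled(calibration_images[0])[output_layer]
print("Predicted class:", int(np.argmax(probs)))
```

The same saved INT8 IR file can be loaded unchanged on different OpenVINO-supported targets, which is one way a single quantized model could serve the Intel CPU and embedded deployments mentioned above.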