In addition, unlike electrical brain activity (EEG) or voice data, the face can, to a certain extent, directly reveal the severity of a person’s condition. “But we cannot violate patients’ privacy, and also, collecting and combining data from several sources is more promising for further use,” says the professor at KTU Faculty of Informatics (IF).
Maskeliūnas emphasises that the EEG dataset used in the study was obtained from the Multimodal Open Dataset for Mental Disorder Analysis (MODMA), because the KTU research group works in computer science rather than medical science.
MODMA EEG data was collected and recorded for five minutes while participants were awake, at rest, and with their eyes closed. In the audio part of the experiment, the patients participated in a question-and-answer session and several activities focused on reading and describing pictures to capture their natural language and cognitive state.
AI will need to learn how to justify the diagnosis
The collected EEG and audio signals were transformed into spectrograms, allowing the data to be visualised. Special noise filters and pre-processing methods were applied to reduce noise and make the data comparable, and a modified DenseNet-121 deep-learning model was used to identify signs of depression in the images. Each image reflected signal changes over time. The EEG images showed waveforms of brain activity, and the audio images showed frequency and intensity distributions.
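As an illustration of the transformation step, here is a minimal Python sketch of how a one-dimensional signal can be turned into a spectrogram, that is, a picture of how its frequency content changes over time. It is not the research group's code. The function name `spectrogram`, the window and hop sizes, the 250 Hz sampling rate and the synthetic 10 Hz test signal are all assumptions chosen for this example, and the noise filtering mentioned above is left out.

```python
import numpy as np


def spectrogram(signal, fs, window_size=256, hop=128):
    """Hann-windowed short-time Fourier transform of a 1-D signal.

    Returns (freqs, times, power_db), where power_db has shape
    (number of frequency bins, number of time frames).
    All parameter defaults are illustrative assumptions.
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(window_size)

    # Slice the signal into overlapping frames and apply the window.
    n_frames = 1 + (len(signal) - window_size) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + window_size] * window for i in range(n_frames)]
    )

    # Power spectrum of each frame, converted to decibels.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    power_db = 10.0 * np.log10(power + 1e-12)  # small offset avoids log(0)

    freqs = np.fft.rfftfreq(window_size, d=1.0 / fs)
    times = (np.arange(n_frames) * hop + window_size / 2) / fs
    return freqs, times, power_db.T


# Synthetic stand-in for a recording: a 10 Hz sine wave, 5 seconds at 250 Hz.
fs = 250
t = np.arange(5 * fs) / fs
x = np.sin(2 * np.pi * 10 * t)

freqs, times, S = spectrogram(x, fs)
peak_hz = freqs[np.argmax(S.mean(axis=1))]
print(S.shape, round(float(peak_hz), 2))
```

The resulting two-dimensional array can be saved or plotted as an image, which is the kind of input an image classifier such as DenseNet-121 expects.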
The model included a custom classification layer trained to split the data into two classes: healthy or depressed people. The classification results were evaluated first, and then the accuracy of the application was assessed.
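How such a two-class result can be scored is shown in the short Python sketch below. It is only an illustration: the function name `binary_metrics`, the label coding (1 for depressed, 0 for healthy) and the eight example labels are made up for this example and are not data or results from the study.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for a two-class problem.

    Labels are assumed to be 1 (depressed) and 0 (healthy).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # depressed, detected
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy, cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # depressed, missed

    def ratio(num, den):
        # Avoid division by zero when a class is absent.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


# Hypothetical labels for eight participants (not taken from MODMA).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy shows separately how many depressed participants were missed and how many healthy ones were wrongly flagged, which a single accuracy figure hides.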
In the future, this AI model could speed up the diagnosis of depression, or even make it remote, and reduce the risk of subjective evaluations. This requires further clinical trials and improvements to the programme. However, Maskeliūnas adds that the latter aspect of the research might raise some challenges.
“The main problem with these studies is the lack of data because people tend to remain private about their mental health matters,” he says.
The professor of the KTU Department of Multimedia Engineering also notes that the algorithm must be improved so that it is not only accurate but also tells the medical professional what led to the diagnostic result. “The algorithm still has to learn how to explain the diagnosis in a comprehensible way,” says Maskeliūnas.
According to the KTU professor, similar requirements are becoming common because of the growing demand for AI solutions that directly affect people in areas such as healthcare, finance, and the legal system.
This is why explainable artificial intelligence (XAI), which aims to explain to the user why the model makes certain decisions and to increase their trust in the AI, is now gaining momentum.
The article “Multimodal Fusion of EEG and Audio Spectrogram for Major Depressive Disorder Recognition Using Modified DenseNet121” was published in the journal Brain Sciences.