Abstract: In this paper, we propose a method to improve the accuracy of speech emotion recognition (SER) by using a vision transformer (ViT) to attend to the correlation of frequency (y-axis) with time ...
This is the first step of a bigger audio-visual project. For now, I have added a simple Streamlit app to this repository that previews your audio tracks from a specific folder and converts them into their Mel ...
Abstract: Schizophrenia is an example of a rare mental disorder that is challenging to diagnose using conventional methods. Deep learning methods have been extensively employed to aid in the diagnosis ...