Human speech conveys much more than semantic content (the meaning of the words). Tone of voice, intonation, and changes in rhythm can all express the speaker's deliberate or spontaneous emotions, mood, or state of health, in addition to the message itself. This non-verbal (paralinguistic) information is produced by the human voice and speech production mechanism (changes in vocal-fold tension, rhythm, and loudness, closed articulation, etc.), and it can likewise be recognized from the speech signal. Many machine learning methods have already been tried for automatic emotion recognition from voice, and the emergence of new deep learning techniques continually enables new approaches. In this topic, the students' task is to evaluate machine learning methods (including deep learning) that perform emotion recognition from a speech-signal input. Such methods can also be applied in human-machine interfaces and in customer-service automation. For more information, please contact us in the Informatics building, room B 156, at sztaho.david@vik.bme.hu or on Microsoft Teams (Sztahó Dávid, sztaho.david@vik.bme.hu).
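To give a feel for the kind of pipeline this topic involves, the minimal sketch below classifies "utterances" by simple prosodic cues (frame energy and zero-crossing rate) with a nearest-centroid rule. Everything in it is an assumption for illustration: the signals are synthetic sine-plus-noise tones, the two toy "emotion" classes (high-arousal vs. low-arousal) are invented, and real work on this topic would use recorded speech, richer features (e.g. MFCCs), and deep learning models.

```python
import numpy as np

def extract_features(signal, frame=400, hop=160):
    """Utterance-level features from frame energy and zero-crossing rate."""
    feats = []
    for start in range(0, len(signal) - frame, hop):
        f = signal[start:start + frame]
        energy = np.log(np.sum(f ** 2) + 1e-10)          # log frame energy (loudness cue)
        zcr = np.mean(np.abs(np.diff(np.sign(f)))) / 2   # zero-crossing rate (rough pitch cue)
        feats.append((energy, zcr))
    feats = np.array(feats)
    # Summarize the utterance by the mean and std of the frame-level features.
    return np.concatenate([feats.mean(axis=0), feats.std(axis=0)])

def make_utterance(f0, amp, rng, sr=16000, dur=0.5):
    """Hypothetical synthetic 'utterance': a sine at pitch f0 plus noise."""
    t = np.arange(int(sr * dur)) / sr
    return amp * np.sin(2 * np.pi * f0 * t) + 0.05 * rng.standard_normal(len(t))

rng = np.random.default_rng(0)
X, y = [], []
for _ in range(20):
    # Toy class 1: "high arousal" (louder, higher pitch); class 0: "low arousal".
    X.append(extract_features(make_utterance(220 + 20 * rng.random(), 1.0, rng))); y.append(1)
    X.append(extract_features(make_utterance(110 + 20 * rng.random(), 0.3, rng))); y.append(0)
X, y = np.array(X), np.array(y)

# Nearest-centroid classifier: assign each utterance to the closer class mean.
c0, c1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
pred = (np.linalg.norm(X - c1, axis=1) < np.linalg.norm(X - c0, axis=1)).astype(int)
acc = (pred == y).mean()
print(f"training accuracy: {acc:.2f}")
```

On these cleanly separated synthetic classes the centroid rule classifies essentially every utterance correctly; the point of the sketch is only the feature-extraction-then-classification structure that the actual student project would fill in with real data and stronger models.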