change log
project main page on GitHub
further samples
languages config
emotions config
last update: 2017/7/14

Emofilt is an open-source program that simulates emotional arousal in synthesized speech. It is based on MBROLA, a synthesis engine that is free for non-commercial use.
It is NOT a complete text-to-speech system; it acts as a transformer between the phonemisation component and the speech-generation component. You therefore need to have MBROLA installed in order to use Emofilt.
Originally developed at the Technical University of Berlin in 1998, it was later revived as an open-source project on SourceForge and completely rewritten in the Java programming language.
Emofilt's language-dependent modules are controlled by external XML files, and it is as multilingual as MBROLA, which currently supports 35 languages.

There is an article that describes emofilt. There is another paper that describes the storytagger interface.

You're welcome to use Emofilt in your projects. If you do, please let me know so that your project can be included in the reference list.


  • Graded emotions: apply 40% happiness!
  • Pluggable architecture: implement your own processing step based on the Emofilt framework.
  • Custom syllables: define your own syllables by adding ": -" and ": - focus" markers to your input pho-file.
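
One way to picture the "graded emotions" feature is as an interpolation between "no change" and the full-strength setting of each modification. The Python sketch below is only an illustration under that assumption: the linear formula, the parameter names (`pitch_level`, `speech_rate`) and the example values are hypothetical and are not taken from Emofilt's Java code or its emotion-description files.

```python
def graded_factor(full_factor: float, grade: float) -> float:
    """Interpolate linearly between no change (1.0) and the full-strength factor.

    `grade` is the emotion strength as a fraction, e.g. 0.40 for "40% happiness".
    Linear interpolation is an assumption made for this sketch.
    """
    if not 0.0 <= grade <= 1.0:
        raise ValueError("grade must be between 0 and 1")
    return 1.0 + grade * (full_factor - 1.0)


# Hypothetical full-strength settings for "happiness" (illustrative values only).
happiness = {"pitch_level": 1.30, "speech_rate": 1.20}

# Apply the emotion at 40% strength: each factor moves 40% of the way from 1.0.
graded = {name: graded_factor(value, 0.40) for name, value in happiness.items()}
```

With these example values, a full-strength pitch factor of 1.30 becomes roughly 1.12 at 40% strength, and a grade of 0 leaves the neutral input unchanged.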

fig 1: screenshot of EmofiltDeveloper

fig 2: emofilt acts as a transformer between the natural language processing component (NLP) and the digital speech processing component (DSP), i.e. MBROLA.

fig 3: processing steps

fig 4: GUI of the storytagger interface



Audio samples (five mp3 files per voice):

  • English male voice (en1)
  • French male voice (fr1)
  • German male voice (de6) (neutral prosody txt2pho)
  • Greek male voice (gr2)
  • Hungarian male voice (hu1)
  • Italian male voice (it3)
  • Dutch male voice (it3)
  • Turkish male voice (tr1)

The emotional simulation is achieved by manipulating the following aspects of the speech signal, within the limits imposed by the diphone-concatenation approach:
  • Pitch changes
  • Duration changes
  • Voice quality (simulation of jitter and support for databases with multiple voice qualities)
  • Articulation (replacement of central / decentral vowels with their antagonists)
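
The pitch and duration manipulations listed above can be illustrated on MBROLA's pho format, in which each line holds a phoneme, its duration in milliseconds, and optional pairs of position (percent of the duration) and pitch (Hz); lines starting with ";" are comments. The Python sketch below is a rough illustration, not Emofilt's actual algorithm: the function name `transform_pho_line`, its parameters, and the use of a random perturbation as a stand-in for jitter are all assumptions.

```python
import random


def transform_pho_line(line, pitch_factor=1.0, duration_factor=1.0,
                       jitter=0.0, rng=None):
    """Scale the duration and pitch targets of a single pho line.

    `jitter` is a relative random pitch perturbation (0.05 = up to +/-5%);
    this is a simplified stand-in, not Emofilt's jitter simulation.
    """
    stripped = line.strip()
    if not stripped or stripped.startswith(";"):
        return line  # leave comments and blank lines untouched
    rng = rng or random
    fields = stripped.split()
    phoneme, duration, targets = fields[0], float(fields[1]), fields[2:]
    out = [phoneme, str(round(duration * duration_factor))]
    # Pitch targets come as alternating (position, Hz) values.
    for position, hz in zip(targets[0::2], targets[1::2]):
        new_hz = float(hz) * pitch_factor
        if jitter:
            new_hz *= 1.0 + rng.uniform(-jitter, jitter)
        out += [position, str(round(new_hz))]
    return " ".join(out)


neutral = ["; greeting", "_ 100", "a 120 0 110 50 130"]
# Raise the pitch by 20% and halve all durations (illustrative values).
excited = [transform_pho_line(l, pitch_factor=1.2, duration_factor=0.5)
           for l in neutral]
```

The sketch assumes whitespace-separated pitch targets. Articulation changes would additionally replace phoneme symbols and are not covered here.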

Emofilt consists of three main interfaces:

  • Emofilt-Developer: a graphical editor for emotion-description XML files with visual and acoustic feedback (see fig 1).
  • Storytagger Interface: a graphical editor for tagging a sequence of sentences with emotions and having Emofilt read it aloud (see fig 4).
  • Emofilt itself, taking the emotion-description files as input to act as a filter in the MBROLA framework.
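
The filter role described above can be sketched as a chain of processing steps that sits between the phonemisation output and MBROLA. The Python code below is only a conceptual illustration: the step `lengthen_pauses`, the helpers `run_steps` and `mbrola_command`, and the file names are hypothetical and do not reflect Emofilt's Java classes or its command-line options. The only external fact assumed is MBROLA's usual invocation with a voice database, a pho-file and an output file.

```python
import shlex
from typing import Callable, List, Sequence

# A processing step maps a list of pho lines to a new list of pho lines.
Step = Callable[[List[str]], List[str]]


def lengthen_pauses(lines: List[str]) -> List[str]:
    """Hypothetical step: double the duration of every silence ("_") segment."""
    out = []
    for line in lines:
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "_":
            fields[1] = str(round(float(fields[1]) * 2))
            line = " ".join(fields)
        out.append(line)
    return out


def run_steps(lines: List[str], steps: Sequence[Step]) -> List[str]:
    """Apply each step in order, feeding the output of one into the next."""
    for step in steps:
        lines = step(lines)
    return lines


def mbrola_command(voice_db: str, pho_path: str, wav_path: str) -> str:
    """Build the shell command that hands the filtered pho-file to MBROLA."""
    parts = ["mbrola", voice_db, pho_path, wav_path]
    return " ".join(shlex.quote(part) for part in parts)


neutral = ["_ 100", "h 60", "a 120 0 110 100 130", "_ 200"]
emotional = run_steps(neutral, [lengthen_pauses])
command = mbrola_command("de6", "emotional.pho", "emotional.wav")
```

Because every step has the same signature, further steps can be appended to the list without changing `run_steps`, which is the idea behind a pluggable design.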

author's contact

email: felixbur@gmx.de
web: Felix Burkhardt