5.1.23. Slovakia

FLaReNet Summary

There are not many activities in Slovakia focused on language resources at the moment. One of them is the text collection Slovak National Corpus. Another is the building of speech databases for speech recognition, carried out at the Slovak Academy of Sciences - Dept of Speech analysis/synthesis. Some of the databases were built in cooperation with universities: SpeechDat-E Slovak; MobilDat Sk, built in cooperation with the universities of Bratislava, Kosice and Zilina; the Parliament speech database; and the Legal speech database (under construction).

The Dept of Speech analysis/synthesis at the Slovak Academy of Sciences has also built speech databases for speech synthesis. All the synthesis speech databases are in Slovak, except for one bilingual speaker recorded in both Slovak and Serviko Romani [the language mostly used by the Slovak Romanies (Gypsies)].

Contact Point Input

National/Regional contact: Milan Rusko, Slovak Ac. Sciences - Dept Speech analysis/synthesis.

Programs

There are not many activities in Slovakia focused on language resources at the moment.

  1. One of them is the text collection Slovak National Corpus.
      The person responsible for Computer-Aided tools for corpus building, mathematical linguistics issues etc. is Mr. Radovan Garabík (radog@juls.savba.sk).

  2. Another activity is focused on building speech databases for speech recognition purposes.
      This is being done at the Slovak Academy of Sciences - Dept of Speech analysis/synthesis.
      Some of the databases were built in cooperation with universities:
        - SpeechDat-E Slovak (1000 speakers) for fixed telephone network;
        - MobilDat Sk for mobile telephone speech (1100 speakers) - in cooperation with technical universities of Bratislava, Kosice and Zilina;
        - Parliament speech database (130 hours);
        - Legal speech database (under construction).

  3. The Dept of Speech analysis/synthesis at the Slovak Academy of Sciences has built speech databases for speech synthesis for our own purposes
      (several speakers, several hours each).
      All the synthesis speech databases are in Slovak, except for one bilingual speaker recorded in both Slovak and Serviko Romani
      [the language mostly used by the Slovak Romanies (Gypsies)].
        - Natural (non-acted) expressive speech database (under construction).

  4. Text collections are also owned by press-monitoring companies, such as SITA, Slovakia online, Newton technologies and others.

Another activity closely related to speech and to sharing resources is COST 2102, Cross-Modal Analysis of Verbal and Non-verbal Communication, which is at a very preliminary stage now (see Annex 4).