The new Matxa model is now available to test and run on Hugging Face, the open-source AI platform
The Aina project for artificial intelligence and language technologies is celebrating Sant Jordi by publishing Matxa, the first speech synthesis model to cover the main dialectal variants of Catalan. It is the first technological solution published as an open model that offers text-to-speech (TTS) conversion in Central, North-Western, Balearic and Valencian Catalan. The Aina project is promoted and financed by the Generalitat de Catalunya.
Anyone can access the model on Hugging Face, the AI community for open-source resources, where it can be tested and run. The technology, developed by the Language Technologies Unit of the Barcelona Supercomputing Center – Centro Nacional de Supercomputación (BSC-CNS), was trained on several datasets, including Festcat, OpenSLR69 and the recently created La FresCat, which includes recordings in four dialectal variants from 8 different speakers.
Matxa is a step forward in performance and quality, as it preserves the naturalness and characteristics of the voices used to train it. It combines the Matcha-TTS and Vocos neural network architectures, which stand out for their novelty and very low execution times. The dialect system was configured and trained on the new MareNostrum 5 supercomputer and on FinisTerrae III at the Centro de Supercomputación de Galicia (CESGA).
You can try out how Matxa works through the public demo.
The new La FresCat dataset is a pioneering development in the field of digital resources in Catalan, as it includes 8 speakers with different characteristics: two voices for each of the main dialects. The dataset will be made public in the coming weeks and will be available for anyone to download and use. According to Baybars Külebi, a BSC researcher specializing in speech, it is “an innovative resource that makes available to everyone digital resources that take into account the plurality of the Catalan language”.
The development of speech synthesis technologies opens the door to a wide range of possible applications. In fact, the Aina project, through the BSC, is already working with companies and institutions to offer specific solutions using the artificial intelligence tools developed at the center.