Two Poles have created the world’s best text-to-voice technology. They intend to build a range of products on it and turn the company into a unicorn.
Mateusz Staniszewski and Piotr Dąbkowski founded Eleven Labs in January 2022. Its solution generates a synthetic voice from text, or clones a voice from a supplied sound sample. Because the technology could transform the entertainment sector, from audiobooks to film and gaming, investors quickly put in $2M and, two years later, as much as $80M.
– We can build the world’s best company developing voice technologies with artificial intelligence. Our goal is that in the future all content can be accessed with the highest sound quality, in any language and voice – said Mateusz Staniszewski.
– For many years, we met every six months to work on various technological projects, mainly for fun and intellectual training. Eventually, we began to think about a technology that could analyze speech for sentiment and emotion. That was when the idea behind Eleven Labs was born – said Piotr Dąbkowski.
Text to voice: history of success
Several circumstances favored this. First, Dąbkowski had been conducting machine learning research for several years. Second, the AI landscape had recently changed so much that development was no longer reserved for large companies. Third, they quickly understood where their technology could be used, for example in dubbing English-language films. It turned out that while work on synthetic text and video was already quite advanced, voice was still at a very early stage of development. They quickly identified which components were available for testing and how to prototype such a solution. They began by collecting an extensive dataset and then training the algorithms so that they would learn to convert text into voice while also taking into account the context of the analyzed content.
First investors
The results were striking. In January 2022, they established the company, and six months later they found their first investors. The British fund Concept Ventures, the Czech fund Credo Ventures, and several groups of business angels invested $2M in the company. In their opinion, the company has created the world’s best “text to voice” technology. It generates long-form audio from text. Thanks to it, it will be possible to watch films with Tom Hanks speaking Polish or listen to audiobooks read in English by Polish actors. Every online content creator will be able to publish their materials in any language.
Rapid development
In January 2023, the company unveiled its first product, which attracted more than 1 million users to its platform in just two months. In January 2024, it raised $80 million in funding from Andreessen Horowitz and entrepreneurs Nat Friedman and Daniel Gross, with participation from Sequoia Capital and SV Angel. As a result, within two years of its founding, Eleven Labs became a unicorn with a valuation exceeding $1 billion. Within a year, Eleven Labs’ users generated audio content with a total duration of more than 100 years. The organization has grown from 5 to 40 people, and its technology is used by employees of 41% of Fortune 500 companies. Its main goal is to change how people interact with content by breaking down language and communication barriers.
Initially, the company focused on providing a solution for independent authors operating in Polish and English. It was primarily for authors of books and newsletters. Then, they entered the media industry, enabling news outlets to broadcast their content in audio form.
They also plan to create an automatic dubbing solution, starting with emotionally subdued documentaries. However, they would like the first Hollywood production to use their solution as early as 2024. If they build a set of products that solve voice problems, they can become a billion-dollar independent business. If not, their startup could be acquired by Google, Amazon or OpenAI for its technology.