Speech Recognition: Poles Set the Pace
November 3, 2014   

Tomasz Szwelnik, CEO of Voicelab, a Polish company that is carrying out a project focusing on speech recognition technology, talks to Karolina Olszewska.

The full name of the project is “Continuous speech recognition, background noise reduction and voice biometrics technology” and it is financed by the National Center for Research and Development (NCBiR) under its Go_Global.pl program.

In which fields is your speech recognition technology being used?

It will soon come to be used by all banks. Our flagship project is called Voicebanking and involves innovative voice services for banks. The technology we have developed makes it possible for customers to use their voice to make a transfer in a mobile banking application, check their account balance or carry out other banking operations. It’s enough to say, for instance, “I want to make a transfer,” “How much money do I have in my account?” or “How much did I spend last week?”

Speech recognition technology provides quick access to information without doing anything else with the app. That’s much more comfortable than the traditional point-and-click kind of experience. We’ve already introduced this solution in two banks in Poland, and we’re negotiating with others. These are the first innovative implementations of this kind in Europe.

Other implementations in this area include voice biometrics, which is also used in banking. This makes things much easier for customers because they do not have to go through the traditional authorization procedure and be kept waiting on a hotline for a long time. A voice password is enough. Thanks to voice biometrics methods, the application will correctly recognize the voice features of a registered customer and make it possible for them to perform banking operations, for example in mobile banking. Such a system is safer than traditional methods, which are often based on identity card data that are easy to intercept. It’s more difficult to counterfeit biometric features, which are inseparable from the structure of the human vocal tract. With a voice sample provided, the system instantly recognizes the speaker and the user can immediately use the banking application.

The system can also be used in customer service hotlines. You can ask to be put through to a selected department or carry out the requested operation without pressing any keys on your phone. This is a great convenience, especially for those who have trouble using their phone keyboard. Our solution is targeted at large companies, corporations that have many customers and want to make this process automatic. This will make it easier for the customer to access information, even on a round-the-clock basis.

What’s the main focus of your project?

We focus on speech recognition technology, background noise reduction and voice biometrics. We’ve developed our own speech recognition system that stands a chance of competing with rival systems developed by some major corporations. We are very advanced in this area and want to make a technological breakthrough internationally. That we have potential to be successful is shown by the awards we have won: a gold medal at the INPEX trade fair, the largest invention exhibition in the United States, a silver medal at an innovation show in Belgium, and a bronze medal in Taiwan for a solution we have submitted for patenting.

We are also conducting our own research and developing voice biometrics technology. Thanks to cooperation with the National Center for Research and Development, we are carrying out a large project focused on state security and defense. It is called “Developing an IT system for voice identification of emergency callers.” This technology will make it possible to identify people suspected of misleading emergency services, for example those responsible for repeated bomb hoaxes. Our speech recognition system will recognize them: it will determine their psychological profile, age, sex, and whether they are under the influence of intoxicants. The project gives us a chance for broader cooperation and for developing our technology. We’re carrying it out in a consortium led by the AGH University of Science and Technology in Cracow. The project is supervised by Prof. Mariusz Ziółko from that university. It began in 2013, and we expect to complete it in two years’ time. The total cost of the project is zl.1,549,000.

And our innovative background noise reduction system called Noisebusters is a godsend for those who must communicate where there is noise, static or other interference. We recently received a gold medal for this technology at the INPEX innovation show in the U.S.

The list of potential users of your technology is much longer…

Defense, medicine, justice administration, media, telecommunications—these are just a few of them. In the case of medicine, the issue also involves converting speech into text. The system is for transcribing verbal descriptions of various results of examinations carried out by doctors. It recognizes the voice of the doctors as they convey their diagnosis and recommendations for medical procedures. The doctor will send the recording to a transcription center that will then convert it into text and send it to the appropriate medical services to carry out. Medical centers abroad are already doing that. Doctors don’t have to fill out tons of forms thanks to such technology. In Poland, this will be an innovation.

What about justice administration, media and marketing? How can your technology be used in these areas?

For judges this is a great help that makes their work easier. They dictate plenty of rulings, verdicts and substantiations to their stenographers. Testimonies are also written by hand. Law firms are inundated by recordings from court trials and hearings that need to be listened to. Making these activities automatic will be a veritable revolution for lawyers. We are planning such implementations. But there’s a problem with the quality of these recordings from courtroom hearings. Most often there’s a lot of interference that makes speech recognition difficult. Specialist directional microphones are needed for such recordings. We provide the equipment needed.

When it comes to the media, our system makes it possible to transcribe a recorded TV or radio program. Another aspect is the monitoring of marketing campaigns. A company conducting such a campaign usually uses a keyword or product name. When people call to inquire about a specific product, it is possible to measure the number of such calls and thus evaluate the effectiveness of the campaign. Such analysis is provided by our software for recognizing keywords in phone calls. It is also useful in call centers. The system detects words that should or mustn’t be used in a phone call. By doing so, the system can check if a customer wants to complain about a product, withdraw from a contract or perhaps is threatening the company with legal action. This makes it possible to detect such words at an early stage and redirect the call to a more experienced consultant.
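The keyword-based call analysis described above can be illustrated with a minimal sketch. It assumes the calls have already been converted to text by a speech recognizer, and it is not Voicelab's actual software: the keyword lists, the function names `tokenize` and `analyze_calls`, and the sample calls are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical keyword lists; the article does not specify any.
CAMPAIGN_KEYWORDS = {"voicebanking"}
ESCALATION_KEYWORDS = {"complaint", "cancel", "lawyer"}


def tokenize(transcript):
    """Lowercase a transcript and split it into simple word tokens."""
    return re.findall(r"[a-z']+", transcript.lower())


def analyze_calls(transcripts):
    """Count calls that mention each campaign keyword and list the
    indices of calls that contain at least one escalation keyword."""
    campaign_calls = Counter()
    escalate = []
    for idx, text in enumerate(transcripts):
        words = set(tokenize(text))
        # Each call is counted at most once per campaign keyword.
        for keyword in CAMPAIGN_KEYWORDS & words:
            campaign_calls[keyword] += 1
        # Any escalation keyword flags the call for a senior consultant.
        if ESCALATION_KEYWORDS & words:
            escalate.append(idx)
    return campaign_calls, escalate


# Made-up example transcripts.
calls = [
    "I saw the ad for Voicebanking, how does it work?",
    "I want to cancel my contract and file a complaint.",
    "What are your opening hours?",
]
counts, escalate = analyze_calls(calls)
print(counts["voicebanking"], escalate)
```

In this sketch the number of calls mentioning a campaign keyword stands in for campaign effectiveness, and the flagged indices stand in for calls that would be redirected to a more experienced consultant. A production system would work on live audio and would likely match phrases rather than single words.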

Your company is a cross between a regular business and a highly innovative research laboratory. You work with universities, public institutions and industry. Voicelab works along commercial lines like any other business, but has a scientific advisory board made up of world-class experts in the field. Tell us some more about the company.

We graduated from the Gdańsk University of Technology and specialize in sound engineering. That’s where our interest in speech processing and speech recognition technology comes from. In 2009, in the wake of the Innovator competition held by the Foundation for Polish Science, we founded a company for research and development of speech recognition as well as background noise reduction technology. I founded the company together with Marcin Kuropatwiński. We developed methods that had potential for going commercial. We needed money for the development of our company and so we took part in an EU project financed from funds managed by the Polish Agency for Enterprise Development (PARP). The project helped us develop our products.

Thanks to the funds we have obtained we can be innovative and flexible in our implementations. Our speech recognition technology has also gotten the thumbs-up from a group of internationally acclaimed experts who have joined our scientific board. They include Prof. Bastiaan Kleijn, founder of Global IP Solutions, currently with the School of Engineering and Computer Science in Victoria, New Zealand, and Prof. Reinhold Haeb-Umbach, known for his pioneering work on speech recognition at Philips Laboratories and currently head of the Multimedia Group at the University of Paderborn in Germany. We are also supported by scientists from the Gdańsk University of Technology: Prof. Maciej Niedźwiecki from the Department of Automatic Control, and Jan Daciuk, Ph.D., from the Department of Intelligent Interactive Systems.

You’ve decided to compete in an industry dominated by global giants. What made you certain you would be able to carve out a piece of the pie for yourselves on this difficult market?

Speech recognition is one of the most advanced technologies in the world and is full of challenges. People have been interested in speech recognition since the 1960s. It all started with research conducted at Bell Labs, IBM, Microsoft and other research institutes abroad. One of the biggest challenges is recognizing human speech and making the computer understand it. That’s a challenge we took up.

Earlier this year Marcin Kuropatwiński presented our technology at the International Conference on Acoustics, Speech and Signal Processing (ICASSP) [in Florence, Italy]. He was one of six Poles there. The competition was huge—there were around 1,000 researchers from America, and about 400 from Germany. We’re proud that a small Polish company has managed to develop world-class technology. We have a chance to put it on international markets thanks to the Go_Global program. Funds from this program, to the tune of around zl.200,000, will enable us to go to the U.S. and showcase our speech recognition, voice biometrics and background noise reduction technology there. We received a recommendation from an international organization called Plug’n’Play Tech Center, which is a partner of the National Center for Research and Development and which invited us to Silicon Valley. We want to go global. We plan to use more languages in our products: German, English, Spanish and Russian. We are capable of competing with large corporations in such an advanced area of technology. But this is not only about competition. Nowadays, not every company wants to use services provided by global corporations. We approach our customers in an individual and flexible way. There is room for everyone on the market, especially as our technology can be used in mobile devices such as mobile phones, laptops, tablets, without the need to go online. It can also find application in devices such as voice-operated coffee machines, washing machines and fridges. And toys and games are yet another example. We don’t yet fully know in which new products this technology with a future could be used. That’s precisely why we’re going to Silicon Valley: to look for new market niches, attract new partners, find a place for ourselves on the international market, and come back to Poland full of new ideas and with new contacts.
© The Warsaw Voice 2010-2018