Address
304 North Cardinal St.
Dorchester Center, MA 02124
Work Hours
Monday to Friday: 7AM - 7PM
Weekend: 10AM - 5PM
Tobias Martens and I had a great chat about voice AI. Tobias is the enthusiastic founder of Whoelse AI. He unravels his inspiring journey in the realm of voice AI. From addressing the problem of explaining technology to diverse demographics to reshaping the way we interact with online services through his standardized voice AI solutions, Martens’ insights promise a captivating read into the future of artificial intelligence.
Get insights into AI in business strategy as Tobias explains Whoelse AI’s contributions to DIN standards and ISO protocols. You will also learn how the growth of AI influenced his venture, which aims to simplify online services and standardize voice AI interactions.
I’ve worked in the technology business for public and private organizations for the last ten years, including the European Commission, the German Institute for Standardization (DIN), and some corporate consulting companies.
Whoelse AI was created because we discovered that over the previous eight years, customers of all ages had difficulties remembering how online services operate. The idea came to me when I explained how different internet services worked to my nephew and grandmother at the same time.
And it occurred to me at that moment that rather than trying to explain that Airbnb was for apartment sharing, Tinder was for dating, and Ticketmaster was for event tickets, it could be easier to explain using Whoelse AI. The aim was to establish a single brand that could describe any type of Internet business, a concept created in answer to the issue of explaining technology from a pedagogical and social standpoint.
It turned out to be a valid question about AI interoperability because numerous groups are now working on a standard to use it on voice AI assistants.
Everyone is familiar with Amazon’s Alexa and Apple’s Siri. However, there are over 1000 voice AI technologies on the market. And it makes you wonder: What type of wake word should these voice assistants respond to?
So, if you say, “Alexa, hello,” you know that it is an Amazon device: this is the Amazon standard. More standards are on the way from bodies such as the World Wide Web Consortium (W3C) and the Internet Protocol Standardization Association, and there’s even an initiative called Voice Network. They’re working on a register for voice Internet services. The analogy is the same: a more straightforward name for voice and a simpler name for Internet services.
We believe this is now becoming relevant because voice AI assistants need a wake word. And the question will be: with what kind of wake word can you navigate a voice assistant most easily? Because voice assistants work without any screen, you have to explore what they can do using only your own voice.
That’s why we believe that this was the moment for us to build up Whoelse AI. Over the last couple of years, the topic has become more relevant, so we decided to do it now.
Image courtesy of Whoelse AI
We launched the project in 2018. We first introduced it to DIN (the German representative at ISO). The ISO standard regime is unique because it is usually a requirement in public procurement processes: if the European Commission is writing government contracts in Europe, it must use an ISO standard before it can use any kind of proprietary standard. Because we have been contributing our research to the ISO standard, we could gradually ensure the efficiency of our own business.
We’re all working together to write the technical specifications for voice AI collectivity. In addition, we aim to create a human language protocol. The question is whether the protocol for human language can be implemented using the current standards. As a solution, we developed a standardized framework for encoding any language.
And, in this field of competing standards, we are not seeking to produce yet another standard or the greatest standard currently available. We are trying to provide the most basic standard feasible, and not for any technological component, only for linguistic structure. So, for example, we’re collaborating with the World Wide Web Consortium (W3C) on technical parameters for voice AI collectivity, and we’d like to develop a protocol for human language.
Because every language is slightly different on the inside, we develop an encoding of every language in a standardized manner. And we don’t believe that the German, French, and Italian firms will agree on a common denotation. That creates a problem for the companies attempting to use AI for their products. To use AI, they have to either use an already existing company, like Amazon, or build their own system, which can be a lot more expensive.
We serve as a bridge between ARM and IBM Watson. Alongside the DIN Standard that we created, we also offer the protocol used with it. The problem is that sophisticated models must be trained and developed, and the larger a language model becomes, the less accurate it is. I have maybe 100 distinct intents that I can put into one instance of IBM Watson. So, for example, I can use up to 80 extra intents on my hardware platform, like “Hey, find me a cinema playing ‘XYZ’ movie.” IBM Watson can then ask the cinema ticket Whoelse AI, and that cinema ticket Whoelse AI can be set up by provider “XYZ,” which just uses any other system. It will work like an automated phone line that you can use to request information or do tasks.
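The forwarding idea described above can be illustrated with a minimal Python sketch. It is not the DIN specification or an IBM Watson API: `ServiceRegistry`, `handle`, the substring matching, and the sample phrases are all hypothetical names and simplifications chosen for illustration. The sketch assumes a small set of local intents is checked first, and anything else is forwarded to a provider registered under a spoken service name.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

# Hypothetical type: a provider's handler takes the utterance and returns a reply.
Handler = Callable[[str], str]


@dataclass
class ServiceRegistry:
    """Maps a spoken service name (e.g. "cinema ticket") to a provider's handler."""
    handlers: Dict[str, Handler] = field(default_factory=dict)

    def register(self, service_name: str, handler: Handler) -> None:
        # Names are stored in lowercase so matching is case-insensitive.
        self.handlers[service_name.lower()] = handler

    def forward(self, utterance: str) -> Optional[str]:
        text = utterance.lower()
        # Try longer names first so "cinema ticket" wins over a shorter "cinema".
        for name in sorted(self.handlers, key=len, reverse=True):
            if name in text:
                return self.handlers[name](utterance)
        return None  # no registered service was mentioned


def handle(utterance: str, local_intents: Dict[str, str],
           registry: ServiceRegistry) -> str:
    """Answer from the small local intent set, otherwise forward the request."""
    text = utterance.lower()
    for phrase, reply in local_intents.items():
        if phrase in text:
            return reply
    forwarded = registry.forward(utterance)
    if forwarded is not None:
        return forwarded
    return "Sorry, no service is registered for that."


# Usage: provider "XYZ" registers a handler under an assumed service name.
registry = ServiceRegistry()
registry.register("cinema ticket", lambda u: "Provider XYZ received: " + u)
local = {"what time": "Local assistant: telling the time."}

print(handle("What time is it?", local, registry))
print(handle("Cinema ticket Whoelse AI, find me a cinema playing XYZ", local, registry))
```

The design choice here is that the assistant itself stays small: it only needs to recognize the service name and pass the request on, while the provider behind that name can use any system it likes.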
We’re always thinking about the voice assistant becoming one globally intelligent assistant that knows everything about my schedule, my bank account, and so on: a one-size-fits-all approach to artificial intelligence assistants.
My vision is that there will be various wake words with varying capabilities. The initial function is just forwarding user requests. In the long term, it will also be an interconnection between these various AIs, but for now, it’s simply a forwarded request that the platform then performs. And we showed it in the DIN Standard, which we created using Google Dialogflow, Slope, Nuance, or IBM Watson.