The moment that started everything
Tobias Martens spent a decade working at the intersection of technology and public institutions: the European Commission, the German Institute for Standardization (DIN), and corporate consulting. That background gave him a frame most AI founders don't have: he'd watched how standards get made, how they get adopted, and how their absence creates invisible friction across entire industries.
The founding insight of Whoelse AI came from a personal moment: trying to explain how different internet services worked to his nephew and grandmother at the same time. AirBnB for apartments. Tinder for dating. Ticketmaster for events. Each required a different mental model, a different vocabulary, a different way of navigating.
It occurred to him that the problem wasn't the services themselves. It was that every service required its own conceptual framework before you could use it. And the same fragmentation that made internet services hard to explain was showing up in voice AI, in a more consequential way: with over a thousand voice AI technologies on the market, each using a different wake word, a different intent format, and a different API, nobody can build on top of any of them with confidence.
Why 1,000 voice AI platforms is a problem, not a success
When Tobias and I spoke, there were over a thousand voice AI technologies on the market. Most people were aware of two or three. That gap isn't a marketing problem: it's a structural problem. Organizations trying to build voice-first experiences face a fragmented ecosystem where every vendor speaks a different language, maintains different standards, and can be deprecated or acquired without warning.
The consequence: organizations either standardize on one platform and accept the dependency risk, or they maintain multiple integrations and absorb the ongoing complexity cost. Neither option is good. And neither solves the underlying problem, which is that there's no shared protocol for what voice AI systems are supposed to do or how they're supposed to talk to each other.
Martens' framing is that this is the same problem the internet solved with TCP/IP, that email solved with SMTP, that telephony solved with signaling protocols. Every time a communication technology matures, it goes through a period of fragmentation, followed by convergence around a shared standard. Voice AI is in the fragmentation phase. The work Whoelse AI is doing is about accelerating the convergence.
Standards as competitive strategy, not just compliance
One of the more unusual aspects of Martens' approach is treating standards contribution as a business strategy rather than a technical obligation. Whoelse AI has contributed to DIN Standards (the German representation of ISO), the World Wide Web Consortium, and the Voice Network initiative, working groups that are writing the technical specifications for how voice AI systems should communicate.
The strategic logic is elegant: in European government procurement, ISO compliance is frequently a contractual requirement. By contributing to the development of the relevant standards rather than waiting to comply with them, Whoelse AI shaped the environment in which it would compete. When the standard is adopted, they're already aligned, and they have the expertise and documentation to help others achieve compliance.
This is a longer game than most AI startups play. But for organizations building for regulated or government-adjacent markets, it's a model worth understanding. The standards governing how AI is deployed in public contexts are being written now. The organizations that participate have influence over what those standards require.
The architecture of interoperability
Whoelse AI's technical approach to the fragmentation problem focuses on the linguistic structure layer rather than the full technology stack. Rather than trying to get different platforms to adopt a common API, the team focused on encoding language intent in a standardized way that could be interpreted by multiple underlying systems.
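To make the idea of a standardized intent encoding concrete, here is a minimal sketch of what such an envelope might look like. The field names and values are illustrative assumptions, not taken from any published DIN or W3C specification:

```python
from dataclasses import dataclass, field

# Hypothetical standardized intent envelope. Any platform that can parse an
# utterance into this shape becomes interchangeable at the routing layer.
@dataclass
class Intent:
    domain: str                                 # e.g. "travel", "weather"
    action: str                                 # e.g. "book", "query"
    slots: dict = field(default_factory=dict)   # extracted entities
    locale: str = "en-US"
    confidence: float = 1.0                     # parser's certainty, 0..1

# Example: the same utterance, regardless of which vendor parsed it,
# reduces to one shared structure.
utterance = "Book me a room in Berlin for Friday"
intent = Intent(
    domain="travel",
    action="book",
    slots={"city": "Berlin", "date": "Friday", "item": "room"},
    confidence=0.92,
)
print(intent.domain, intent.action, intent.slots["city"])
```

The point of the sketch is the separation of concerns: vendors compete on how well they fill this structure in, while everything downstream depends only on the structure itself.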
The practical result is a bridge layer, built on DIN Standard protocols and connecting platforms like ARM and IBM Watson, that can accept a user request, parse its intent in a standardized format, and route that intent to whichever underlying AI system is best suited to handle it. The user doesn't know which system responded. They just get an answer.
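The routing step can be sketched in a few lines of Python. This is a toy illustration with invented handler names, not the bridge's actual implementation or the real systems it connects:

```python
# Hypothetical routing layer: given an intent (a plain dict here), dispatch
# to whichever specialized backend handles its domain.
def answer_weather(intent):
    return f"Weather lookup for {intent['slots'].get('city', 'your area')}"

def answer_general(intent):
    return "General assistant response"

# Registry mapping domains to specialized handlers; unknown domains fall
# back to a general-purpose system.
REGISTRY = {"weather": answer_weather}

def route(intent):
    handler = REGISTRY.get(intent["domain"], answer_general)
    return handler(intent)

# The caller never learns which backend produced the answer.
reply = route({"domain": "weather", "slots": {"city": "Berlin"}})
print(reply)
```

The design choice worth noting is that new backends join by registering a domain, not by agreeing on an API with every other backend, which is the property that makes the hub-and-spoke model scale.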
Martens describes the long-term vision as a network of specialized AI assistants, each expert in a different domain, connected by a shared protocol for passing requests and returning results. Not one AI that knows everything, but many AIs that each know their domain deeply, able to communicate with each other through a shared language. That's the infrastructure model for AI at scale.
- Background spans 10 years across the European Commission, DIN (German ISO representation), and corporate technology consulting before founding Whoelse AI.
- Contributing author to DIN Standards, ISO protocols, and the W3C Voice Network initiative, working groups writing the technical specs for voice AI interoperability.
- ISO compliance is a contractual requirement in many European government procurement processes, making standards contribution a direct business development strategy.
- Technical architecture: a bridge layer between ARM and IBM Watson, using DIN Standard protocols to route user intent across different AI platforms.
- Chose to work on the "most basic standard feasible": linguistic structure encoding, rather than competing on features or attempting a comprehensive platform standard.