Microsoft has taken a big step toward AI independence by unveiling three in-house models designed to compete with rivals OpenAI and Google.

The new models – MAI-Transcribe-1, MAI-Voice-1 and MAI-Image-2 – are now available via Microsoft Foundry and the MAI Playground, targeting key enterprise use cases across speech recognition, voice generation and image creation.

The move marks Microsoft’s first public release of proprietary AI models built outside its long-standing partnership with OpenAI.

While that relationship remains intact, the launch signals a parallel strategy aimed at reducing reliance on external providers.

MAI-Transcribe-1 delivers speech-to-text across 25 languages and is reportedly up to 2.5 times faster than Microsoft’s previous Azure-based offering. MAI-Voice-1 can generate 60 seconds of natural-sounding audio in just one second and supports custom voice creation from short audio samples. Meanwhile, MAI-Image-2 has already ranked among the top performers on image-generation benchmarks and is being rolled out across products including Bing and PowerPoint.
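To put those speed claims in concrete terms, here is a minimal back-of-the-envelope sketch. It assumes the quoted figures are best-case numbers that scale linearly, and it reads "2.5 times faster" as dividing processing time by 2.5. The function names and the 10-minute baseline job are hypothetical illustrations, not Microsoft benchmarks.

```python
def real_time_factor(audio_seconds: float, compute_seconds: float) -> float:
    """Seconds of audio produced per second of compute."""
    return audio_seconds / compute_seconds


def sped_up_duration(baseline_seconds: float, speedup: float) -> float:
    """Processing time after a given speedup, assuming time = baseline / speedup."""
    return baseline_seconds / speedup


# MAI-Voice-1: 60 seconds of audio generated in 1 second (figure from the article).
voice_rtf = real_time_factor(audio_seconds=60.0, compute_seconds=1.0)

# Assumption: throughput scales linearly, so an hour of narration
# would need 3600 / voice_rtf seconds of generation time.
one_hour_narration_seconds = 3600.0 / voice_rtf

# MAI-Transcribe-1: "up to 2.5 times faster" (figure from the article).
# Hypothetical baseline: a transcription job that previously took 10 minutes.
transcribe_seconds = sped_up_duration(baseline_seconds=600.0, speedup=2.5)

print(f"Voice real-time factor: {voice_rtf:.0f}x")
print(f"One hour of narration: about {one_hour_narration_seconds:.0f} s")
print(f"Hypothetical 10-minute job: about {transcribe_seconds:.0f} s")
```

Under those assumptions, an hour of narration would take about a minute to generate, and a transcription job that once took 10 minutes would finish in about four.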

Microsoft claims the models were developed by small teams of fewer than 10 engineers, using significantly less compute than competing systems.

Pricing is also a key part of the strategy. Microsoft has priced the MAI family below comparable offerings from Google and Amazon.

Microsoft is facing growing investor pressure to demonstrate returns on its massive AI infrastructure spending, following its weakest quarterly performance since 2008.

Until late 2025, Microsoft was contractually restricted from developing its own frontier AI under its agreement with OpenAI. A renegotiation lifted those limits, enabling the company to accelerate internal model development.

A full-scale large language model is also planned.