G2 takes pride in showing unbiased reviews on user satisfaction in our ratings and reports. We do not allow paid placements in any of our ratings, rankings, or reports.
Google Cloud’s Speech API processes more than 1 billion voice minutes per month with close to human levels of understanding for many commonly spoken languages. Powered by the best of Google's AI resea
Enterprise Voice AI platform designed for developers building voice-first products using speech-to-text, text-to-speech, or speech-to-speech APIs. Over 200,000 developers build with Deepgram's voice-n
Deepgram is a speech-to-text service that provides transcription, sentiment analysis, and other features for audio processing. Reviewers appreciate Deepgram's high accuracy in transcription, real-time processing capabilities, extensive language support, and user-friendly API, which integrates easily with other tools and services. Users mentioned issues with Deepgram's pricing structure, limited language support, and the need for improvements in speaker diarization and handling of heavy accents or noisy audio.
Krisp is a voice productivity and real-time AI communication platform that helps teams, contact centers, and developers deliver clearer conversations through real-time noise suppression, accent conver
Krisp is a noise cancellation and transcription software that aims to improve the clarity of audio during calls and transcribe meeting notes. Reviewers appreciate Krisp's effective noise cancellation, automatic recording and transcription features, and its ability to integrate with various platforms, enhancing productivity and meeting efficiency. Reviewers noted issues with Krisp's transcription accuracy for certain languages, occasional software glitches, and the lack of customization options for summaries and action items.
Azure AI Speech is a comprehensive suite of AI-powered speech services designed to enhance applications with advanced voice capabilities. It offers developers tools to integrate features such as speec
Azure AI Speech is a speech recognition and synthesis tool that supports multiple languages and offers features such as sentiment analysis and language translation. Users like the high accuracy of Azure AI Speech, its multilingual support, and its seamless integration with other Microsoft tools and services, which simplifies deployment and enhances daily activities. Users experienced issues with Azure AI Speech's accuracy when dealing with quick speaker changes or low-quality audio, and found the setup and configuration process complex, the pricing structure complicated, and the official documentation lacking in simplicity and robustness.
Speechmatics: Best-in-Market Speech-to-Text & Voice AI for Enterprises. Speechmatics delivers industry-leading Speech-to-Text and Voice AI solutions, designed for enterprises that demand best-in
Speechmatics is a transcription technology that provides speech-to-text services, speaker identification, and language recognition. Users frequently mention the high accuracy of transcriptions, the speed of the service, the ability to recognize multiple languages, and the responsive support staff. Users reported limitations in the free trial plan, a lack of support for diverse local languages, deletion of transcription jobs after 7 days, and the need to combine Speechmatics technology with other capabilities for specialized use cases.
Mihup Interaction Analytics analyses 100% of customer conversations, uncovering their voice while revealing sales, service, and renewal opportunities for contact center teams to capitalise on. Its AI
Mihup is a platform that analyzes conversations and detects emotions and key topics, turning voice and text interactions into actionable intelligence and providing services such as live alerts during calls, compliance monitoring, detection of sentiment shifts, and agent guidance. Users like Mihup's accuracy and clarity in speech analytics, its seamless multilingual voice recognition, its ability to integrate with existing call systems and CRM tools, and the proactive and knowledgeable customer support team. Reviewers mentioned that the user interface could be improved, that the initial configuration for large datasets can be time-consuming, and that the platform lacks transparency in pricing and other details.
Digital evidence has grown 10–100x in the last decade — body-worn cameras on every officer, dash cams on every car, smartphones and doorbells recording every incident, and hours of 911, jail calls, an
Rev is a transcription service that converts audio from meetings, interviews, and webinars into text, allowing users to avoid manual typing and re-listening to recordings. Users frequently mention the speed and accuracy of Rev's transcriptions, its ease of use, and its ability to save them significant time in their workflows. Reviewers noted that Rev struggles with understanding dialects and accents, leading to inaccuracies in the transcriptions, and some users found the user interface slightly complicated.
Otter.ai is the leading AI Meeting Assistant that helps sales, marketing, product, finance, operations, design, customer success, customer support and cross-functional teams automatically record, trans
Otter.ai is a transcription and note-taking tool that automatically joins meetings, records audio, and provides transcriptions and summaries. Reviewers frequently mention the tool's accuracy in transcribing conversations, its ability to provide clear notes and summaries, and its seamless integration with platforms like Zoom and Google Meet. Users reported issues with transcription accuracy for non-English languages and regional accents, difficulties in speaker identification, and limitations in the free plan.
Founded in 2017 and headquartered in San Francisco, AssemblyAI is a Speech AI platform serving over 200,000 developers worldwide. AssemblyAI specializes in providing speech recognition and understandi
AssemblyAI - Speech to Text API is a tool used to convert recorded audio and video files into written transcripts, often used for transcribing therapy sessions, call center recordings, and long-form audio files. Reviewers frequently cite high transcription accuracy, language and speaker detection, support for multiple languages, and ease of integration and setup as key benefits. Reviewers mentioned issues with the cost of processing large amounts of audio, limited configurability around diarization, the need for more language support in the latest model, and the desire for improved speaker differentiation and transcription speed.
Jasper is an open-source platform for developing always-on, voice-controlled applications.