Alternatives to Txtplay
Compare Txtplay alternatives for your business or organization using the curated list below. SourceForge ranks the best alternatives to Txtplay in 2025. Compare features, ratings, user reviews, pricing, and more from Txtplay competitors and alternatives in order to make an informed decision for your business.
    1
    
Google Cloud Speech-to-Text
Google
Google Cloud’s Speech API processes more than 1 billion voice minutes per month with close to human levels of understanding for many commonly spoken languages. Powered by the best of Google's AI research and technology, Google Cloud's Speech-to-Text API helps you accurately transcribe speech into text in 73 languages and 137 different local variants. Leverage Google’s most advanced deep learning neural network algorithms for automatic speech recognition (ASR) and deploy ASR wherever you need it, whether in the cloud with the API, on-premises with Speech-to-Text On-Prem, or locally on any device with Speech On-Device.
    2
    
Speechmatics
Speechmatics
Best-in-Market Speech-to-Text & Voice AI for Enterprises. Speechmatics delivers industry-leading Speech-to-Text and Voice AI for enterprises needing unrivaled accuracy, security, and flexibility. Our enterprise-grade APIs provide real-time and batch transcription with exceptional precision across the widest range of languages, dialects, and accents. Powered by Foundational Speech Technology, Speechmatics supports mission-critical voice applications in media, contact centers, finance, healthcare, and more. With on-prem, cloud, and hybrid deployment, businesses maintain full control over data security while unlocking voice insights. Trusted by global leaders, Speechmatics is the top choice for best-in-class transcription and voice intelligence. 🔹 Unmatched Accuracy – Superior transcription across languages & accents 🔹 Flexible Deployment – Cloud, on-prem, and hybrid 🔹 Enterprise-Grade Security – Full data control 🔹 Real-Time & Batch Processing – Scalable transcription
Starting Price: $0 per month
    3
    
Twilio Voice
Twilio
Create a scalable voice experience with the API that connects millions globally. With Twilio Voice, you can build unique phone call experiences with one API, to create, receive, control and monitor calls with just a few lines of code. Create an engaging voice experience that you can quickly scale and modify with a wide array of customization options and resources, like our Voice SDK. Then, add on features like Interactive Voice Response (IVR), recording transcriptions, and speech recognition to create an experience that your customers will appreciate. Whether you're looking to set up global conferencing or alerts & notifications, Twilio has the support you need for building with Voice. Find docs, code samples, helper libraries, and developer tools such as Twilio Runtime and our visual workflow builder, Studio.
Starting Price: $0.0085 per min
    4
    
Rev
Rev
Rev provides premium on-demand, manual and automated transcription, closed caption, and foreign subtitling services. With 170,000+ customers, Rev's clients span from global enterprises to freelance journalists. Rev processes more audio and video than any other provider and has the ability to scale to fit any customer's needs. Pricing is simple, starting at just $0.25 per audio/video minute for automated speech-to-text services and $1.25/min for manual with 99% accuracy. Rev also offers Rev.ai, a speech recognition engine that's available to companies that want it.
Starting Price: $1.25 per minute
    5
    
Otter.ai
Otter.ai
Otter is where conversations live. Generate rich notes for meetings, interviews, lectures, and other important voice conversations with Otter, your AI-powered assistant. Teams big and small trust Otter to transcribe their important conversations. Our shiny new release, Otter 2.0, adds more functionality to improve collaboration and productivity. The Teams plan includes capabilities designed especially for small and medium businesses and teams in larger enterprises. Record and review in real time. Search, play, edit, organize, and share your conversations from any device. Record conversations using Otter on your phone or web browser. Import or sync recordings from other services. Integrate with Zoom. Get real-time streaming transcripts and, within minutes, rich, searchable notes with text, audio, images, speaker ID, and key phrases. Share or export voice notes to inform others and get on the same page.
Starting Price: $8.33 per month
    6
    
Amazon Transcribe
Amazon
Amazon Transcribe makes it easy for developers to add speech-to-text capabilities to their applications. Audio data is virtually impossible for computers to search and analyze. Therefore, recorded speech needs to be converted to text before it can be used in applications. Historically, customers had to work with transcription providers that required them to sign expensive contracts and were hard to integrate into their technology stacks. Many of these providers use outdated technology that does not adapt well to different scenarios, like the low-fidelity phone audio common in contact centers, which results in poor accuracy. Amazon Transcribe uses a deep learning process called automatic speech recognition (ASR) to convert speech to text quickly and accurately. Amazon Transcribe can be used to transcribe customer service calls, automate subtitling, and generate metadata for media assets to create a fully searchable archive.
Starting Price: $0.00013
    7
    
Maestra
Maestra
Automatic Transcripts, Subtitles and Voiceovers. In just minutes. Highly accurate speech-to-text software with a built-in advanced text editor. Translate into English, French, Spanish, German and 80+ languages. Save time and money with Maestra’s automatic audio-to-text transcription software. Transcribe audio files to text automatically within seconds. No credit card required for the first 15 minutes. Creating subtitles for video with online automatic subtitling software can save you a considerable amount of time. You'll be able to auto-generate subtitles for videos in just a few minutes. You can also translate your subtitles automatically into 80+ languages. With the Maestra video dubber you can automatically add voiceovers to your videos in foreign languages using artificial intelligence and computer-generated voices.
Starting Price: $6/hour
    8
    
Transkriptor
Transkriptor
Automatically transcribe audio, and turn your audio or video to text. Upload your file and convert your audio to text with Transkriptor. Transkriptor’s powerful artificial intelligence generates online transcriptions within a few minutes. Transkriptor is used by many professionals and students. Transkriptor is the best assistant for interview transcription, lecture transcription and video transcription. Transkriptor creates editable TXT, Word or SRT files. You can download your transcriptions within seconds or you can use Transkriptor’s online editor for easy and quick editing. Sign up today and be more productive in school, work, and life. Even though Transkriptor is one of the most powerful artificial intelligence solutions, it is extremely easy to use. Transkriptor is an online speech-to-text converter that requires no installation. Simply upload your file and start.
Starting Price: $9.99 per month
    9
    
Trance
Digital Nirvana
Digital Nirvana’s pioneering and advanced speech-to-text engines enable content creators to generate highly accurate audio and video content transcripts. The powerful Trance UI allows users to easily navigate, edit and export caption files in all industry-recognized formats. Built-in AI along with custom preset capabilities ensure caption conformance with style guidelines from various delivery platforms. Trance is designed to use machine learning capabilities to enhance the process of generating transcripts, closed captions, and subtitles for media content. Further, Trance also boasts an industry-first tool: Natural Language Processing capabilities. Our NLP technology enables transcript splitting based on grammar rules and styles for individual streaming platforms. Auto-generate captions to conform with multiple style guidelines and file types, all in the shortest time frame possible.
    10
    
spotl
spotl
Whatever the format of your video, your subtitles are optimally placed on the screen, without any intervention required from you. The subtitles generated by spotl are optimized to comply with the constraints of professional subtitling. We also provide you with all the tools you need to work as a team and to verify and validate your content. With its artificial intelligence, SPOTL automatically generates your multilingual subtitles in record time and at a very attractive price. SPOTL's exclusive innovation, post-editing, allows you to have your content corrected by certified professionals. With spotl, your subtitles automatically adapt to the format of your video and are customizable.
    11
    
Temi
Temi
Upload any audio or video file. We accept all file types. Review your transcript with timestamps and speakers. Save & export your transcript as MS Word, PDF, SRT, VTT and more. Transcript quality depends on audio quality. Record clear audio to get accurate transcripts. Temi's free transcription editor lets you edit your transcripts online in minutes. Built by our machine learning and speech recognition experts. Quickly clean up the provided transcript. Adjust the playback speed and skip around easily. Temi knows the timing of every word. Add any timestamps. We mark the change of every speaker and label them. Download your transcript as text (MS Word, PDF) or closed caption files (SRT, VTT).
Starting Price: $0.25 per audio minute
    12
    
SpokenData
ReplayWell
Let the automatic speech-to-text technology transcribe your data. Or transcribe your data yourself or buy a professional transcript. Use our online time-synchronous editor to browse your data and transcripts. Download transcripts in many formats. Manage your team of transcribers using tags and categories. Help them with transcription using automatic voice-to-text technology. Integrate SpokenData into your application via our REST API. We adapt the voice-to-text to your data domain to maximize transcript accuracy and lower your labor costs. Enable speech technologies in your applications by integrating SpokenData using our REST API. We are ready to process huge amounts of your data. You get an API fitting your needs. Just contact our support team. We customize the voice-to-text to your data and purpose to maximize transcript accuracy. Suitable for: web/mobile app developers, media monitoring agencies, audio/video archive businesses.
    13
    
Azure Video Indexer
Microsoft
Azure Video Indexer is a video analytics service that uses AI to extract actionable insights from stored videos. Enhance ad insertion, digital asset management, and media libraries by analyzing audio and video content, with no machine learning expertise necessary. Enhance your search experiences by using video indexing within the metadata to automatically extract data from your content. Multichannel analysis provides information to perform a more effective search across your media archive and within each file. Search by person, project, visual text, spoken word, entity, topic, and more. Apply the extracted metadata to improve the user experience. Use speech transcription and translation to easily add closed captioning in multiple languages. Fine-tune recommendation algorithms based on objects and people that appear in a video, and automatically create clips from sections featuring a particular person.
    14
    
VideoTranslator
VideoTranslator
We look at the number of languages which you can use with your content. Remember, each language is potentially a new market, and care needs to be taken to properly target your preferred leads. There are two kinds of transcription, listed below. In both cases, speech is involved, hence these are referred to as transcription AIs. If you’re planning to post your video to social media, it’s important to make sure your video meets social-channel-specific formatting requirements. Not doing this can affect your users' experience, from video that looks distorted, to unreadable captioning, to content that simply does not play. The simple tips and tricks below will make your content convert faster!
Starting Price: $10 per 1,000 credits
    15
    
CaptionHub
Neon Creative Technology
The combination of integrated AI text-to-speech and our own Natural Captions engine gives you perfectly formatted captions, in much the same way as a skilled human subtitler would, but it takes seconds, not days. Our automated transcription delivers text that’s almost perfect. All that’s left for you to do is finesse it from your browser, using smart notifications and validated workflows to collaborate seamlessly with your team and/or agencies when you need to. Perfect subtitles, faster. Machine translation can translate subtitles into 103 languages, in one simple step. Then assign linguists to finesse the translations, and split up videos for shared workloads. Don’t have your own linguists? We can hook you up with our translation partners. No more manual downloading and uploading of videos and subtitle files. Publish your subtitles from CaptionHub with a single click, using our highly secure video platform integrations.
    16
    
Gladia
Gladia
Gladia is an advanced audio transcription and intelligence platform delivered via a unified API that supports both asynchronous (pre-recorded) and real-time streaming transcription, enabling developers to convert speech to text in over 100 languages with features like word-level timestamps, language detection, code-switching, speaker diarization, translation, summarization, custom vocabulary, and entity extraction. Its real-time engine achieves latencies under 300 ms while maintaining high accuracy, and it offers “partials” (intermediate transcripts) to improve responsiveness in live settings. The platform’s asynchronous API is powered by a proprietary Whisper-Zero model optimized for enterprise audio, and it lets clients apply add-ons such as enhanced punctuation, name consistency, custom metadata tagging, and export to subtitle formats (SRT, VTT).
Starting Price: Free
    17
    
Airgram
Airgram Inc.
Airgram is the best meeting productivity tool you’ll ever need in this hybrid work era. Whether it’s the pre-meeting preparations, collaboration on the notes during meetings, or the post-meeting management of the notes, Airgram is here to help teams get the most out of every meeting. Key Features:
- Record and transcribe Zoom, Google Meet, or Microsoft Teams meetings with speaker identification in real time.
- Collaborate on meeting minutes, and assign action items with due dates.
- Share meeting notes to Slack, or export transcripts to Notion, Microsoft Word, and Google Docs to keep everyone posted.
- Review meetings with HD video recordings and timestamped notes. Skim for crucial information via AI-based entity extraction.
- Create clips from unstructured text to turn your meetings into key highlights.
- Manage shared recordings, transcripts, and meeting notes together with team members in the workspace.
Starting Price: $0
    18
    
Azure AI Speech
Microsoft
Build voice-enabled apps confidently and quickly with the Speech SDK. Transcribe speech to text with high accuracy, produce natural-sounding text-to-speech voices, translate spoken audio, and use speaker recognition during conversations. Create custom models tailored to your app with Speech Studio. Get state-of-the-art speech to text, lifelike text to speech, and award-winning speaker recognition. Your data stays yours; your speech input is not logged during processing. Create custom voices, add specific words to your base vocabulary, or build your own models. Run Speech anywhere, in the cloud or at the edge in containers. Quickly and accurately transcribe audio in more than 92 languages and variants. Gain customer insights with call center transcription, improve experiences with voice-enabled assistants, capture key discussions in meetings and more. Use text to speech to create apps and services that speak conversationally, choosing from more than 215 voices and 60 languages.
    19
    
Verbit
Verbit Software
Create Impact with Transcription & Captioning. Our customers are offered the leading interactive solution based on the combination of technology and a human touch. Flexible transcription and captioning tailored to diverse customers and industries.
Court Reporting & Depositions: Real-time, customized transcription. Read-backs, text search and in-audio search. Rough draft within one hour. Proofed transcripts within three business days.
Education & Disability Needs: Accuracy that meets ADA guidelines. Integration with web conferencing and LMS platforms. 24-hour booking and 12-hour cancellation. Interactive transcripts for note taking, search and sharing.
Distance Learning & eLearning: 99% accurate transcription and captioning. Integration with LMS, web conferencing and media hosting platforms. REST API that fits workflows. HIPAA, SOC 2, HECVAT, VPAT, GDPR compliance.
Media Production: 99% accuracy that meets FCC and ADA guidelines.
    20
    
Transcribe
Wreally
Transcribe saves thousands of hours every month in transcription time for journalists, lawyers, podcasters, students and professional transcriptionists all over the world. Increase your productivity & save mountains of time when converting your interviews, audio notes, lectures, speeches, podcasts and any recorded speech to text. Put on your headphones, load your audio, slow it down and speak out what you hear. It's that simple. Our dictation engine will convert your speech to text on the fly. This is way faster than typing. We support English, Spanish, French, Hindi and almost all other European & Asian languages.
    21
    
Ebby.co
Ebby
Automated Transcription & Subtitling Platform for audio and video that saves you time & money. Pay-as-you-go plans starting at $6/hr (no monthly subscription). Transcribe in 100+ languages and dialects. Leverage our feature-rich Online Editor to review, edit and refine your transcripts. Share, collaborate and export transcripts to various formats. Create a free account and try us out now.
Starting Price: 10¢ per minute
    22
    
Dragon Legal
Nuance Communications
Dragon Legal is specialized speech recognition software tailored for legal professionals, offering a legal-specific language model trained on over 400 million words from legal documents. This enables attorneys and legal practitioners to dictate contracts, briefs, and legal citations with up to 99% accuracy, three times faster than typing. The software supports the creation of custom voice commands to automate repetitive tasks and allows for the transcription of pre-recorded audio files, enhancing workflow efficiency. Optimized for Windows 11 and compatible with Windows 10, Dragon Legal v16 also provides accessibility features such as "play that back" audio of dictated text and sophisticated macro commands, accommodating legal professionals with physical or cognitive disabilities. Additionally, it offers integration with Dragon Anywhere Mobile, a cloud-based dictation solution for iOS and Android devices, ensuring productivity on the go.
Starting Price: $799 one-time payment
    23
    
GoVivace
GoVivace
Our automatic speech recognition engine supports several English accents and can be localized to any language. Also, the ASR engine supports standard telephony as well as web and mobile applications. Capable of acting on voice commands given to electronic devices such as computers, tablets, smartphones or telephones with the aid of a microphone, GoVivace’s Automatic Speech Recognition Engine finds use in diverse applications. This automatic speech recognition engine compares the spoken input with a number of pre-specified possibilities and converts speech to text. The entire set of pre-specified possibilities constitutes the application’s grammar, which powers the interface between the dialogue-speaker and the back-end processing. GoVivace’s patented Automatic Speech Recognition solution needs only very simple grammar for its processing. It can also support very large grammars for complex tasks.
    24
    
Dragon Professional
Nuance Communications
Dragon Professional is speech recognition software that enables professionals to create high-quality documentation more efficiently by converting speech into text with up to 99% accuracy. Optimized for Windows 11 and compatible with Windows 10, it serves individuals and groups across various industries, including financial services, education, and healthcare. The software allows users to dictate documents three times faster than typing, supports the transcription of pre-recorded audio files, and offers customization options such as creating custom words and commands to streamline repetitive tasks. Additionally, Dragon Professional v16 includes access to Dragon Anywhere Mobile, a cloud-based dictation solution for iOS and Android devices, ensuring productivity on the go.
Starting Price: $699 one-time payment
    25
    
AWS Elemental MediaConvert
Amazon
The service combines advanced video and audio capabilities with a simple web services interface and pay-as-you-go pricing. With AWS Elemental MediaConvert, you can focus on delivering compelling media experiences without having to worry about the complexity of building and operating your own video processing infrastructure. AWS Elemental MediaConvert lets you use a wide range of internet and professional media formats to produce high-quality video outputs that look great on any device. With support for ultra-high definition resolutions, high dynamic range video, graphic overlays, advanced audio features, content protection, and closed captioning, AWS Elemental MediaConvert offers a full set of tools to deliver high-quality viewing experiences. AWS Elemental MediaConvert does not require any setup, management, or maintenance of underlying infrastructure. Process video files and clips to prepare on-demand content for distribution or archiving.
    26
    
Zubtitle
Zubtitle
Create awesome videos for social media in minutes. Create great-looking videos with our online video editor. Zubtitle's simple, yet powerful tools will help you edit faster and transform your videos into eye-catching content for social media. Grab your audience's attention with a headline that teases your content with our built-in Text Editor. Our auto-subtitle engine helps you easily add and edit the text and timing of your subtitles. Reach a wider audience with Zubtitle. Our all-inclusive video repurposing tool allows you to optimize your video for any social platform with just a few clicks. Use our quick tools to crop and change your video’s aspect ratio to match any social platform. Highlight the most attention-grabbing portion of your video with our powerful trimming tool. Stand out from other creators by incorporating your unique branding in your videos. Express your creativity and make your content instantly recognizable to build a loyal fan base.
Starting Price: $8 per month
    27
    
INVOX Medical
Vócali
The most intuitive voice dictation program on the market. Convenient and instant audio-to-text transcription. The program has a clear and simple design, which guarantees comfortable, fast and precise operation. INVOX Medical has specific dictionaries and is adapted to many medical specialties. INVOX Medical accurately recognizes a wide variety of medical terminology. INVOX Medical is the voice recognition software already trusted by thousands of medical professionals around the world. It's accurate, easy, and incredibly intuitive. In a few minutes you will be dictating your medical reports with complete accuracy. In addition, it has an unbeatable price. INVOX Medical uses the latest artificial intelligence technology to help you dictate your medical reports with maximum precision, allowing you to work up to three times faster. The system allows you to add terms to the dictionary, replace words and modify their pronunciation at any time.
Starting Price: $35 per month
    28
    
Sembly
Sembly
Sembly is a SaaS solution that enables managers and teams to record, transcribe and generate smart meeting summaries with meeting minutes. Works with Zoom, Google Meet, Microsoft Teams, and others. Sembly is available in English across Web, iOS & Android mobile apps. The smartest AI meeting assistant that helps you easily review & share meeting takeaways, meeting records and transcriptions. Turns your meetings into searchable text, highlights key discussion moments, creates notes and summaries. Use Sembly Team to unlock powerful AI analytics to help you and your team achieve more, while attending less! Sembly automatically syncs to your calendar to join and record all your scheduled meetings on all major conferencing platforms. This reduces the need to take notes on-call. You can review what was said, search through all your meetings, and share key items with your team members or friends.
Starting Price: $10 per month
    29
    
Inkr
Inkr
Inkr is an AI-powered transcription and note-taking platform that converts audio and video into accurate, structured content in seconds, requiring no account to start. It offers real-time “Live Transcription” to capture speech as it happens, ensuring accessibility and instant transcript generation, and “Inkr Note,” which uses AI templates for meetings, lectures, and interviews to auto-generate polished, organized notes or enhance your own text using transcript context. The “Ask Inkr” feature lets you query your transcript with natural-language questions to pinpoint key information without scrolling, while “Edit History” tracks every change and enables version rollback to streamline collaboration. Inkr supports multiple file formats and bulk uploads, delivering searchable, timestamped transcripts alongside customizable templates and smart summaries, all accessible through a clean, intuitive interface that turns spoken words into clear, actionable content.
Starting Price: $5.38 per month
    30
    
SpeechText.AI
SpeechText.AI
Transcribe audio and video into text. Get accurate transcriptions of podcasts with domain-specific speech recognition. SpeechText.AI is a powerful artificial intelligence software for speech to text conversion and audio transcription. Upload audio or video files: AI transcription software supports various file formats and transcribes from speech to text in any language. Select domain: Select industry domain and audio type from predefined categories to improve the recognition accuracy of domain-specific words. Transcribe: Our speech transcription engine uses state-of-the-art deep neural network models to convert from audio to text with close to human accuracy. Edit & Export: Search, modify and verify audio transcriptions using interactive editing tools. Export your content in different formats. Why SpeechText.AI? Set of amazing features to help you transcribe audio and video in seconds. Speech recognition. Powerful speech-to-text tech.
Starting Price: $19 one-time payment
    31
    
Closed Caption Creator
Closed Caption Creator
Automatically generate subtitles, closed captioning, and transcripts in over 25 different languages. Closed Caption Creator is used by content creators all over the world to create subtitles, closed captioning, transcripts, and audio descriptions for video. Do you need to create your own subtitles, or closed captioning? Closed Caption Creator is a great choice. We offer an all-in-one solution that allows you to create, edit, and deliver subtitles, closed captioning, audio descriptions, and transcripts in whatever format you may require. Automatically translate your subtitles to over 50 different languages using Closed Caption Creator. Powered by DeepL, and ModernMT. Speed up your subtitle workflow using custom keyboard shortcuts. Control playback, and insert new events without touching the mouse.
Starting Price: $20 per month
    32
    
Whisper
OpenAI
We’ve trained and are open-sourcing a neural net called Whisper that approaches human-level robustness and accuracy in English speech recognition. Whisper is an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data collected from the web. We show that the use of such a large and diverse dataset leads to improved robustness to accents, background noise, and technical language. Moreover, it enables transcription in multiple languages, as well as translation from those languages into English. We are open-sourcing models and inference code to serve as a foundation for building useful applications and for further research on robust speech processing. The Whisper architecture is a simple end-to-end approach, implemented as an encoder-decoder Transformer. Input audio is split into 30-second chunks, converted into a log-Mel spectrogram, and then passed into an encoder.
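As a rough illustration of the chunking step mentioned in the Whisper description, the Python sketch below splits a sequence of audio samples into fixed 30-second chunks. The 16 kHz sample rate, the zero padding of the final chunk, and the helper name split_into_chunks are assumptions made for this example and are not taken from the listing; the log-Mel spectrogram and encoder stages are left out.

```python
SAMPLE_RATE = 16_000   # assumed sample rate in Hz (not stated in the listing)
CHUNK_SECONDS = 30     # chunk length given in the description


def split_into_chunks(samples, sample_rate=SAMPLE_RATE, chunk_seconds=CHUNK_SECONDS):
    """Split a sequence of samples into equal-length chunks.

    Hypothetical helper: the last chunk is zero-padded so every chunk
    has exactly sample_rate * chunk_seconds samples.
    """
    chunk_len = sample_rate * chunk_seconds
    chunks = []
    for start in range(0, len(samples), chunk_len):
        chunk = list(samples[start:start + chunk_len])
        # Pad the trailing chunk with silence up to the full chunk length.
        chunk.extend([0.0] * (chunk_len - len(chunk)))
        chunks.append(chunk)
    return chunks


if __name__ == "__main__":
    # 70 seconds of dummy audio yields three 30-second chunks.
    audio = [0.1] * (SAMPLE_RATE * 70)
    parts = split_into_chunks(audio)
    print(len(parts), [len(p) for p in parts])
```

Each chunk would then be converted to a spectrogram before being passed to the encoder, which is the part this sketch does not attempt.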
    33
    
Deepgram
Deepgram
Deploy accurate speech recognition at scale while continuously improving model performance by labeling data and training from a single console. We deliver state-of-the-art speech recognition and understanding at scale. We do it by providing cutting-edge model training and data-labeling alongside flexible deployment options. Our platform recognizes multiple languages, accents, and words, dynamically tuning to the needs of your business with every training session. The fastest, most accurate, most reliable, most scalable speech transcription, with understanding, rebuilt just for enterprise. We’ve reinvented ASR with 100% deep learning that allows companies to continuously improve accuracy. Stop waiting for the big tech players to improve their software and forcing your developers to manually boost accuracy with keywords in every API call. Start training your speech model and reaping the benefits in weeks, not months or years.
Starting Price: $0
    34
    
Trint
Trint
Introducing the easiest way to record, transcribe and share right from your phone! Trint’s mobile app lets you capture the moments that matter, anywhere, anytime. Wired: “Amazing!” Google: “Rocket-fueling innovation!” We understand work doesn’t always happen in an office, so we built the mobile app to give you all the power of Trint’s AI transcription on the go. Record live interviews and import files from your phone directly without any clunky equipment. It’s all in the app! Record live conversations. Import audio files into Trint from your other apps. Share transcripts and set editing permissions in-app. Intuitive player to easily follow Trint transcripts. All files are saved to your device or to the cloud, so you never have to worry about losing a file. Download audio to your device. Drop markers from your Apple Watch while you record. Capture in 28 languages, right from your phone, including English, Spanish, French, Mandarin Chinese, Hindi, etc.
    35
    
HappyScribe
HappyScribe
State-of-the-art A.I. working side by side with the best language professionals. Made for transcribers and subtitlers, our interactive editors will ease the way you interact with your transcripts and subtitles. Interactive editors, endless possibilities. Collaborate with all your stakeholders by sharing your transcripts and subtitles in view-only or edit mode, no matter where they are in the world. Export in all formats that you can think of. Our platform prepares files that are ready for any kind of platform. Upload files of any size and length. Our software supports them all. Automatically translate your transcriptions and subtitles into the most common languages. Import any public links and synchronize HappyScribe with your current workflow. Create spaces to share your files with the rest of your team. Seamlessly integrate with your favorite applications: Zapier, YouTube, and more. All files are protected and remain private. Your subtitles are protected.
Starting Price: $9 per month
    36
    
VideoToWords.ai
VideoToWords.ai
VideoToWords.ai is an AI‑powered transcription tool that converts audio and video into text with 99.9% accuracy, supporting more than 98 languages and speaker recognition. Users can upload files up to ten hours in length (MP3, WAV, MP4, AVI, MPEG, M4A, and more) directly in the browser, and transcription begins automatically. It provides ultra‑fast, GPU‑accelerated processing, AI‑generated summaries for quick insights, and an intuitive online editor for reviewing and optimizing transcripts. Completed text can be exported in TXT, DOCX, PDF, SRT, or VTT formats for easy sharing, subtitle creation, or further editing. Built on industry‑leading speech and video recognition models, VideoToWords.ai ensures ironclad data security and privacy, handling meeting recordings, lectures, interviews, podcasts, and marketing content seamlessly, with extended file support, customizable export options, and global language coverage. Starting Price: Free - 
    37
    
Recordly
Recordly
Your all-in-one audio/video intelligence platform. Experience the award-winning, world's first unified audio & video intelligence solutions. Effortlessly capture and analyze spoken content in real time. Transform your voice into actionable insights. Convert audio and video recordings into accurate text with ease. Enhance accessibility and documentation. Break language barriers with instant translations. Connect globally with multilingual support. Uncover hidden patterns and insights from your audio and video data. Empower your decisions with detailed analysis. Live events and/or pre-recorded content produce full transcripts, time-coded caption files, intuitive human editors, AI insights, and more. High-quality transcription and translation AI+human workflow to get to 100% quality. Our advanced AI not only transcribes with remarkable accuracy and speed but also understands context and nuances in over 100 languages. It's not just about converting speech to text. - 
    38
    
SpeechPulse
AV BEAM
SpeechPulse uses your computer’s microphone for real-time speech recognition. It can type into your favorite apps, including text editors, web browsers, and office applications. SpeechPulse works fully offline and doesn’t require any internet connectivity. It supports speech recognition in multiple languages, including English, French, Spanish, Italian, German, Japanese, Chinese, and Russian (a total of 100 languages). SpeechPulse supports both auto punctuation and manual punctuation for the English language. It supports auto punctuation for all other languages. SpeechPulse can also generate subtitles for your audio and video files with accurate timestamps. It supports SRT and VTT subtitle formats. You can also customize the width of a subtitle line to include only a limited number of characters. SpeechPulse has a one-time payment. You can pay for the product once and use it forever. Starting Price: $59.95/one-time payment - 
    39
    
Cockatoo
Cockatoo
Convert audio or video files to text transcripts using Cockatoo. Cockatoo is the fastest and most accurate speech-to-text app ever, boasting up to 99% accuracy, surpassing human performance with the power of machine learning. Cockatoo can transcribe 1 hour of audio in just 2-3 minutes, which is 30x faster than doing it manually and quicker than the competition. We support transcription in dozens of languages and dialects from around the world. Cockatoo is your all-in-one file-to-text converter. Upload audio or video in any format and receive a text transcript within seconds. We offer pricing plans tailored to fit any budget, making AI transcription accessible to all. Download transcripts in formats such as srt, docx, pdf, or txt, choosing the one that suits your needs and sharing your transcriptions effortlessly. There's no need to deal with separating audio from video; we handle it all for you. Simply drag and drop your files, and it's that easy. Starting Price: $15 per month - 
    40
    
Sonix
Sonix
Sonix’s in-browser editor allows you to search, play, edit, organize, and share your transcripts from anywhere on any device. Perfect for meetings, lectures, interviews, films... any kind of audio or video, really. Translate your transcripts in minutes with Sonix's advanced automated translation engine. Increase global reach with over 30 languages. Make your videos accessible, searchable, and more engaging. Automated but flexible enough so you can customize and fine-tune to perfection. Share video clips in seconds or publish full transcripts with subtitles using the Sonix media player. Great for internal use or web publishing to drive more traffic to your website. Comprehensive multi-user permissions allow you to grant collaborators access to upload, comment, edit and restrict access to files or folders. Search for words, phrases, and themes across all your transcripts. Stay organized with multi-folder nesting. Starting Price: $5 one-time payment - 
    41
    
Google Recorder
Google
Instantly transform audio into text so that you can search, edit, and share your recordings. It’s fast, it’s easy, and it even works offline. From speech, music, applause, laughter, and more, search all your recordings to find the moments you remember. When you edit your transcript, your audio automatically changes too. Save the parts you need, snip the bits you don’t. Share full searchable recordings on the web. Share short video clips of your audio on social media. 4-hour lecture? No problem. Recorder tags your transcripts with summary keywords so you can quickly navigate to find what you need. Recorder automatically tags speech, music, and sounds around you so you can search for them later. Now you don’t need internet to save important moments. Recorder works offline, so you can record anywhere. Edit your audio by simply editing text. The smartest Recorder yet, bringing the power of search to audio. - 
    42
    
Just Press Record
Just Press Record
Just Press Record is the award-winning mobile audio recorder that brings one-tap recording, transcription and iCloud syncing to all your devices. Turn your voice recordings into text which you can tweak right inside the app and fine-tune your audio by cutting out the parts you don’t need. Life is full of moments we would rather not forget, like your child’s first words, an important meeting or a great idea. Capture and sync these moments effortlessly on Mac, iPad, iPhone and, for ultimate convenience, Apple Watch! A record button everywhere, ready to go when you need it. Unlimited recording time, background recording and pause / resume make it the perfect recorder. Make professional quality recordings up to 96kHz / 24-bit with external microphones connected via the Lightning Port, in M4A, WAV or AIF files. Turn speech into editable, searchable text with support for over 30 languages, independent of your device’s language setting! You can even add punctuation! - 
    43
    
Azure Speech to Text
Microsoft
Quickly and accurately transcribe audio to text in more than 85 languages and variants. Customize models to enhance accuracy for domain-specific terminology. Get more value from spoken audio by enabling search or analytics on transcribed text or facilitating action, all in your preferred programming language. Get accurate audio to text transcriptions with state-of-the-art speech recognition. Add specific words to your base vocabulary or build your own speech-to-text models. Run Speech to Text anywhere, in the cloud or at the edge in containers. Access the same robust technology that powers speech recognition across Microsoft products. Convert audio to text from a range of sources, including microphones, audio files, and blob storage. Use speaker diarisation to determine who said what and when. Get readable transcripts with automatic formatting and punctuation. Tailor your speech models to understand organization- and industry-specific terminology. Starting Price: $1 per audio hour - 
    44
    
Rev.ai
Rev.ai
Rev.ai was built by leading speech recognition experts from millions of hours of accurate human-transcribed content. We began in 2011 with Rev.com, providing human transcription services. We are now the world's largest transcription vendor, with over 35,000 contractors who transcribe millions of minutes of audio each month. In 2017 we launched Temi, an automated speech-to-text transcription and editing service. Temi has already transcribed 20 million minutes of content and was named the best transcription service by Wirecutter. Today our best-in-class speech engine is available to everyone as Rev.ai. We're helping companies get the most out of their audio and video content by making it searchable and accessible. - 
    45
    
EoleCC
Videomenthe
EoleCC is a collaborative web-based subtitling solution that combines automated tools and human review for a fast and professional result. How does it work? 🔼 Upload your video or audio (a podcast, for example). 💬 Automatic transcription and translation by artificial intelligence in 120 languages. There is a large choice of artificial intelligence tools for translation, and a monitoring view shows the details of each step of the workflow. 👥 Collaborative editing and validation with your team (manager, user, and reviewer roles), by yourself, or by our translators. 🎞 Subtitle embedding: subtitles are automatically embedded in the video, according to the selected graphic charter. You can create your own subtitle style by customizing it. ▶ Share the video and subtitle file (.srt): upload, or post on Twitter, YouTube, or Dropbox. Discover the EoleCC lite version, a 30-minute pack at €19 excluding tax (per month, without commitment) with a choice of 5 languages and review by you. Starting Price: €19/month/user - 
    46
    
CaptioningStar
CaptioningStar
Open captions are the timed-text description of the spoken audio and background sounds, displayed on the screen. Unlike closed captions, open captions cannot be turned off because they are burned into the video. We, at CaptioningStar, offer open captions that are FCC, CVAA, and ADA compliant for videos of all genres. We enjoy captioning your videos with our highly professional captioners and proficient translators. Roll-up captions scroll at either the top or bottom of the screen, making way for the next set of text without disturbing the background content of the video. With exact time codes, captions sync perfectly with each frame. Pop-on captions are preferred by people with hearing impairment. With pop-on captions, only one to three lines of text appear on the screen for about 3-6 seconds before being replaced by the next caption. Starting Price: $1 per transcription - 
    47
    
Soundwise.ai
Soundwise.ai
SoundWise.ai is a browser-based transcription tool that lets users convert audio and video files into text for free forever, with no registration required, unlimited usage, and strong privacy safeguards. It supports 90+ languages and formats, including MP3, WAV, MP4, MOV, M4A, FLAC, AAC, MKV, etc. Users can drag-and-drop or upload files (or record voice directly) to get transcripts, with timestamps and speaker detection. There are additional modes, such as converting video into a PDF with a transcript and summary (called “video to PDF”), and “MP3 to text” tools. Accuracy is claimed to reach up to ~99.8% under good conditions. All processing is done in the browser (locally), meaning your audio/video data is not sent off to servers, enhancing user privacy. The interface is minimal, fast, and usable on both desktop and mobile browsers. Starting Price: $10 per month - 
    48
    
VITAC
VITAC
We ensure your content reaches as many people as possible. Whether you’re creating a program, hosting a conference, or planning an event, adding captions and/or audio description provides accessibility and engagement for all. At VITAC, we’re here to help ensure that the process goes smoothly. We’re the largest and most trusted captioning company in the United States, and our investment in our people, process, and technology is unmatched in the industry. Captioning is the art of transcribing the audio portion of a video, program, or event into text and displaying that text on a screen. These captions can appear on a television, movie, computer, or mobile device. Captions are largely used by members of the deaf and hard-of-hearing community, and captioning a program, conference, or event goes a long way towards making it inclusive and accessible to all. - 
    49
    
CaptionSync
Automatic Sync Technologies
Captioning Kaltura MediaSpace content is easy with CaptionSync’s Kaltura MediaSpace integration. Our integration allows you to easily add closed captioning to Kaltura videos and lecture captures. With our integration, you can simplify your captioning workflow and make the content accessible to a broader audience. Closed captioning is not only a legal mandate for public video content, it also gives a broader audience the opportunity to engage with your Kaltura content. Whether you are a Disability Resources manager, an instructor, a student, or a teacher, captions can help you disseminate and promote the content located in Kaltura and ensure the deaf and hard of hearing can be equally active participants in the communication flow. With CaptionSync’s Kaltura MediaSpace integration, you can quickly and easily caption Kaltura online media and make it accessible. - 
    50
    
TurboScribe
TurboScribe
Convert audio and video to accurate text in seconds with our GPU-powered transcription engine. Upload files in all common formats, including YouTube and more. TurboScribe is powered by Whisper, the most accurate and powerful AI speech-to-text transcription technology in the world. Translate transcripts or subtitles to 134+ languages. Transcribe speech in any language directly to English. Your data is private and only you have access. Files and transcripts are always stored encrypted. TurboScribe supports the vast majority of common audio and video formats, including MP3, M4A, MP4, MOV, AAC, WAV, OGG, and more. While clean and clear audio produces the best results, TurboScribe generally does well with accents, background noise, and lower audio quality. Starting Price: $10 per month