Translate audio from a video automatically: Here's how

You likely work for a company that wants to make its video content accessible to a broader audience. Whether it's internal training videos, product tutorials for customers, or marketing videos, you're probably considering translating subtitles and adding multilingual dubbing.

Here's the good news: translating audio from a video is no longer the challenge it once was. It can be done without adding native speakers to your team or hiring translation agencies. Here's how.

Why is video translation important for an international audience?

Audiences are becoming more global than ever. When it comes to reaching a global audience and engaging employees or customers, adding audio content in more languages is a surefire way for companies to increase engagement, subscribers and revenues.

Just look at how heavily Netflix has invested in non-English content. According to their Director of Globalization, Kathy Rokni, they doubled their subscriber base in two years after adding 27 new languages.

Recently, Peloton added dubbed classes to its product in German and French. According to their FAQ, they'll expand their dubbed offering with additional languages over time.

And the benefits are clear:

  1. Subtitles and captions help increase video views by 80%

  2. Translated content gets 12x the engagement rate of non-localized content

  3. Companies can see a 50% boost in productivity from learning delivered in the learner's native language, while it can also drive cost savings by 50-70%

The entertainment industry isn't the only one that benefits from subtitling and dubbing. Brands looking to expand into new countries, retain existing customers, or increase employee satisfaction with multilingual training videos can benefit just as much.

How to translate audio from a video

Want to start translating audio from a video? Here's a simple five-step process:

1. Transcribe the original audio

The first step to translate an audio file is to carefully transcribe the original audio of your video. This can be done manually or through automatic video transcription tools:

  1. Manual transcription involves human transcribers listening to audio/video recordings and accurately typing the spoken words. Human transcribers cost $1-1.50 per minute.

  2. On the other hand, automated transcription involves using transcription tools to automatically convert audio to text. And we must admit, it's much faster and cheaper (between $0.10-0.20 per minute of audio) than us humans. Plus, with so many big tech companies focused on automatic transcription, it'll continue getting better and cheaper.
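
To get a feel for the difference, here is a rough Python sketch that estimates the cost of transcribing a batch of video using the per-minute ranges quoted above. The function name and the 600-minute example are illustrative assumptions, and real pricing varies by vendor.

```python
# Per-minute price ranges (USD) quoted in this article; real vendor pricing varies.
MANUAL_RATE = (1.00, 1.50)
AUTOMATIC_RATE = (0.10, 0.20)


def estimate_cost(minutes, rate_range):
    """Return the (low, high) cost in USD for the given minutes of audio."""
    low, high = rate_range
    return (round(minutes * low, 2), round(minutes * high, 2))


if __name__ == "__main__":
    minutes = 600  # hypothetical library: 10 hours of training video
    print("Manual:   ", estimate_cost(minutes, MANUAL_RATE))
    print("Automatic:", estimate_cost(minutes, AUTOMATIC_RATE))
```

For a library of this size, the gap between the two ranges is large enough that many teams use automatic transcription first and spend the human budget on review.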

A quick word on automatic video transcription

Automatic transcription uses speech recognition to convert spoken words into written text. Even if it isn't yet perfect, computer models now decode speech almost as well as humans (with the right type of audio).

Is automatic transcription good enough for you? The answer is generally yes, but the easiest way to decide is to sign up for free and give it a try.

2. Review and edit your subtitles

Once you have the transcription, you can review the source text to ensure the transcription tool did its job correctly.

How accurate does the transcription need to be? It depends on what you're trying to achieve, but generally you'll need a full transcript that is near word-perfect.

Note: While most automatic transcription tools have an accuracy of 90%+ (Smartcat's AI OCR transcription is 95% accurate), there may be times when the system does not recognize a proper name. Or, if the speaker has an accent, your tool may not be able to pick up on each inflection. This is why double-checking the source text is important.
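
If you want to put a number on accuracy during review, one common approach is to hand-correct a short sample and compare it with the automatic transcript using word error rate (WER). Accuracy is then roughly 1 minus WER. The sketch below is a minimal illustration of that metric. This article does not say how the 90% and 95% figures were measured, so treat the link to those numbers as an assumption.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical sample: the tool misspells a proper name.
    corrected = "please welcome barbara from the translation team"
    automatic = "please welcome barbra from the translation team"
    print(f"Approximate accuracy: {1 - word_error_rate(corrected, automatic):.0%}")
```

A misheard proper name counts as a single word error here, which is why a transcript can score well and still need a human pass.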

3. Translate your subtitles

With a polished transcription, you can now translate the text into your target language(s). This can be done using:

  1. Professional linguists for their translation services

  2. AI translation tools

  3. A combination of both

Which of these options should you pick? Again, it depends on your content and goals. If you have high-priority content filled with cultural references and humour that is highly context-specific, it's unlikely AI tools alone will suffice. Media localization is an exercise in technique and creativity, and you'll want professionals to ensure your distinct style and voice come across just right.

There are some particular considerations to take into account when subtitling, because every subtitle has to fit into limited screen space and time:

What is the reading speed?

Reading speed is how long you expect someone to take to read the subtitle. It differs between audiences (adults and kids, for example), so you should have a clear idea of who your expected audience is. A rough estimate is to give 2 seconds per line of text to create an easy reading speed.
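
The 2-seconds-per-line rule of thumb is easy to turn into a quick check. The Python sketch below is a minimal illustration; the function names are made up for this example, and you may want a higher seconds-per-line value for younger audiences.

```python
SECONDS_PER_LINE = 2.0  # the rule of thumb from this article


def min_display_seconds(subtitle_text, seconds_per_line=SECONDS_PER_LINE):
    """Minimum on-screen time for a subtitle, based on its non-empty line count."""
    lines = [line for line in subtitle_text.splitlines() if line.strip()]
    return len(lines) * seconds_per_line


def is_readable(subtitle_text, start, end, seconds_per_line=SECONDS_PER_LINE):
    """True if the cue stays on screen at least as long as the rule of thumb asks."""
    return (end - start) >= min_display_seconds(subtitle_text, seconds_per_line)


if __name__ == "__main__":
    cue = "Welcome to the onboarding course.\nLet's get started."
    print(min_display_seconds(cue))     # two lines of text
    print(is_readable(cue, 10.0, 13.0))  # on screen for 3 seconds
```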

What is the timing to audio?

If a subtitle is not timed to within a few frames of the audio, the audience may feel distracted or irritated that something is lingering on screen for too long or that they've missed something.
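
As a rough illustration of "within a few frames", the Python sketch below converts the gap between a subtitle's start time and the moment speech begins into frames. The 25 fps frame rate and the 3-frame tolerance are assumptions chosen for this example, and the speech start time is assumed to come from your transcription tool's timestamps.

```python
def offset_in_frames(subtitle_start, speech_start, fps=25.0):
    """Signed offset in frames; positive means the subtitle appears late."""
    return round((subtitle_start - speech_start) * fps)


def is_in_sync(subtitle_start, speech_start, fps=25.0, tolerance_frames=3):
    """True if the subtitle starts within the tolerance of the speech onset."""
    return abs(offset_in_frames(subtitle_start, speech_start, fps)) <= tolerance_frames


if __name__ == "__main__":
    # Times are in seconds from the start of the video.
    print(offset_in_frames(12.08, 12.00))
    print(is_in_sync(12.08, 12.00))
    print(is_in_sync(12.50, 12.00))
```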

Accuracy

You want to stay true to the source and provide the most accurate translation possible. But part of effective translation is balancing accuracy against the reading speed and timing constraints above, and adapting smartly to ensure a seamless experience.

4. Review your translations

After translation, it's crucial to review the content for accuracy, cultural appropriateness, length, spacing, and timing. Here are some useful tips to keep in mind when reviewing translations:

Length and spacing

Try to stick to one line if possible, with a maximum of two lines on the screen at a time.
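
A simple way to check the two-line limit is to wrap each subtitle to a maximum line width and count the lines. The sketch below uses Python's standard textwrap module. The 42-character width is an assumption borrowed from common subtitling conventions, not a figure from this article.

```python
import textwrap

MAX_LINES = 2            # from the guideline above
MAX_CHARS_PER_LINE = 42  # assumption: a common subtitling convention


def layout_subtitle(text, width=MAX_CHARS_PER_LINE, max_lines=MAX_LINES):
    """Wrap text into lines; fits is False when the text should be split into more cues."""
    lines = textwrap.wrap(text, width=width)
    return lines, len(lines) <= max_lines


if __name__ == "__main__":
    lines, fits = layout_subtitle(
        "This sentence is deliberately far too long to fit on two short "
        "subtitle lines without being split into separate cues."
    )
    print(lines)
    print("Fits on screen:", fits)
```

Translated text is often longer than the source, so it is worth running this kind of check on every target language, not just the original.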

Timing

It's also essential to ensure timing is synced correctly. Your subtitle should appear when the speaker starts talking and stop appearing soon after the speaker stops.

Quality Assurance (QA)

Test your subtitles and video before publishing. It's also a good idea to work with local stakeholders or partners to review them. They can check if the line can be read in the given time and double-check the meaning.
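
Before handing subtitles to human reviewers, a small automated pre-check can catch mechanical timing mistakes. The Python sketch below is one possible version. It assumes your subtitles are available as a list of (start, end, text) tuples in seconds, which is a made-up structure for this example and not a specific tool's export format.

```python
def find_timing_problems(cues):
    """Check cues given as (start, end, text) tuples in playback order.

    Returns a list of (index, problem) pairs.
    """
    problems = []
    for i, (start, end, text) in enumerate(cues):
        if end <= start:
            problems.append((i, "ends before it starts"))
        if not text.strip():
            problems.append((i, "empty text"))
        if i > 0 and start < cues[i - 1][1]:
            problems.append((i, "overlaps previous cue"))
    return problems


if __name__ == "__main__":
    cues = [
        (0.0, 2.0, "Hello"),
        (1.5, 4.0, "This cue starts too early"),
        (5.0, 4.5, "This cue has swapped times"),
    ]
    for index, problem in find_timing_problems(cues):
        print(f"Cue {index}: {problem}")
```

Checks like this do not replace a native-speaker review. They only clear away mechanical errors so reviewers can focus on meaning and tone.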

5. Create the translated audio

Until now, companies and creators have faced significant cost and time barriers breaking into the global scene. While generative AI voice technologies can help quickly render voices in another language, they don't ensure that the generated audio conveys the original message properly across cultures.

Purely automated systems lack the sophistication to deliver the authenticity and emotional impact that brands need to communicate their vision and intent. In addition, the latest examples of generative AI voice technologies are not ready to solve this problem, as they are typically limited to one person speaking and no audio content other than the voice.

The final step is to create new audio in the target language. This can be done through:

  • Voice actors recording the translated script

  • Text-to-speech technology

  • AI dubbing for a natural-sounding result

Streamlining the process with Smartcat Subtitle Editor

Smartcat’s new AI-powered subtitling and multilingual dubbing tool is a first-of-its-kind tool that automatically transcribes an audio or video source, translates it into the target language, and allows for editing of the translation.

Many Fortune 1000 companies use Smartcat’s subtitle editor to build faster, more streamlined workflows. Using Smartcat, they simply upload their video, choose the language (with over 280 languages supported), transcribe and translate using AI, review and edit, and add dubbing to their multilingual videos.

Here’s a simplified overview of how it works:

  1. Upload your video to the Smartcat platform

  2. Let the AI automatically transcribe the audio

  3. Review the generated transcript for accuracy

  4. Select the language and use AI translation to translate the text

  5. Review and edit the translated transcript as needed

  6. Generate an AI-dubbed version with natural-sounding voices

  7. Download the final multilingual video

After the AI translation step, make any necessary adjustments to the target-language text and timecodes, as you did for the source text segments. It’s important to understand that, unlike the source text segments, the translated segments will not always match the precise timecodes.

However, you can still make sure they are positioned correctly, in line with the original text structure and the context of the video, with the help of the video preview.

5 best practices to follow when translating audio

  1. Automate, automate, automate: You might have hundreds or thousands of videos to translate. Translating them into even one other language, let alone five or 15, can be quite the process. This is why you need to focus on automation wherever possible, using automatic transcription and dubbing tools.

  2. Start with high-quality source material: Aside from the technical aspects like ensuring the original video has clear audio and well-articulated speech, ensure your original content in the source language is translation-friendly. This will make the transcription and translation process much more accurate and speed up time to market. Plus, you'll be able to easily repurpose content across all your languages.

  3. Establish a quality pillar: A quality assurance process and review of your translated text is particularly important. Define language-specific style guides to ensure accuracy and consistency, and be rigorous about enforcing them.

  4. Pay attention to reading speed, timing to audio, and accuracy: In video content, timing is critical. Ensure the translated audio aligns well with on-screen visuals and text overlays.

  5. Test your translated videos and review performance: The only way to know if your multilingual videos are working as they should is to conduct regular performance analysis. To get feedback, use native speakers or quality assurance testers who have a deep understanding of your target audience.

AI audio translator case study

70% faster subtitle transcription and translation with Smith+Nephew

Smith+Nephew is a multinational medical equipment manufacturer that provides its global employees with e-learning training videos in seven languages. Implementing Smartcat's Subtitle Editor and AI dubbing helped the team to:

  • Reduce file preparation time: With the ability to automatically add subtitles from video voice-overs, Smith+Nephew were able to slash their file preparation time for subtitle transcription by 70%.

  • Collaborate effectively with overall context and coherence: With all work taking place in the same Smartcat workspace, the translation team can collaborate seamlessly in real time.

  • Increase the accuracy of subtitle transcription and translation: The team found that they can now prevent mistakes by editing the video subtitle source-language text before translation. Plus, they can make changes quickly and directly rather than involving their graphic design team.

"When Smartcat released a new product feature to automatically extract subtitles from video voice overs, offering the possibility to quickly edit the source text for a perfect automated translation process and post-editing, our file preparation time was instantly slashed by 70%," says Barbara Fedorowicz, translation department manager at Smith+Nephew.

Today, the audio translation operations at Smith+Nephew are managed in Smartcat's AI video translation tool. It is a central hub for project managers, translators, and everyone involved in the process. It includes automatic video transcriptions, AI-powered translation, a subtitling editor, and natural-sounding AI dubbing with support for 100+ languages.

The results:

  • 70% decrease in workload

  • 15-20 new video training courses created per language every quarter

  • Noticeable cost savings compared to third-party providers

Read the full case study.

Want to know more?

Here's what you can do next:

1. Sign up for Smartcat and engage worldwide audiences to retain and attract customers. By translating your audio, you'll be able to reach international audiences and increase engagement in a big way.

2. Sign up for Smartcat's 14-day free trial today and start translating your audio.
