A new AI translation system for headphones

by SkillAiNest

The system consists of two AI models, the first of which divides the space surrounding the person wearing the headphones into smaller regions and uses a neural network to search for potential speakers and pinpoint their direction.
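As a rough illustration of that first stage, the sketch below divides the space around the wearer into fixed angular regions and asks a speaker-presence model about each one. The region count, the `speaker_probability` scorer, and the energy-threshold stand-in for the neural network are all assumptions made for this example, not details of the actual system.

```python
# A minimal sketch of the first stage: split the space around the wearer
# into angular regions and score each region for an active speaker.
# The neural network is replaced by a crude placeholder.
import numpy as np

NUM_REGIONS = 12                       # assumption: 12 regions of 30 degrees each
REGION_WIDTH = 360 / NUM_REGIONS

def speaker_probability(region_audio: np.ndarray) -> float:
    """Placeholder for the neural network that scores whether a region
    contains an active speaker. Here: a simple energy threshold."""
    energy = float(np.mean(region_audio ** 2))
    return min(1.0, energy * 10.0)

def find_speakers(per_region_audio: list, threshold: float = 0.5):
    """Return the center angle (degrees) of every region whose score
    exceeds the detection threshold."""
    directions = []
    for i, region_audio in enumerate(per_region_audio):
        if speaker_probability(region_audio) > threshold:
            directions.append(i * REGION_WIDTH + REGION_WIDTH / 2)
    return directions

# Toy usage: fake audio for each region, with one loud (speaking) region.
rng = np.random.default_rng(0)
regions = [rng.normal(0, 0.05, 16000) for _ in range(NUM_REGIONS)]
regions[3] = rng.normal(0, 0.6, 16000)   # pretend a speaker is at ~105 degrees
print(find_speakers(regions))             # -> [105.0]
```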

The second model then translates the speakers' words from French, German, or Spanish into English text, using publicly available data sets. The same model extracts the unique characteristics and emotional tone of each speaker's voice, such as pitch and amplitude, and applies those features to the text, essentially producing a "cloned" voice. This means that when the translated version of a speaker's words reaches the headphone wearer a few seconds later, it sounds as though it is coming from the speaker's direction and in a voice much like the speaker's own, rather than from a robotic-sounding computer.
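The second stage can be pictured as the pipeline sketched below, with every model call replaced by a stand-in: the `VoiceProfile` fields, the function names, and the dummy outputs are assumptions made for illustration, not the researchers' actual interfaces.

```python
# A schematic sketch of the second stage: translate a detected speaker's
# words into English text, extract their vocal characteristics, and
# re-synthesize the translation as a "cloned" voice rendered from the
# speaker's direction. All model calls are stand-ins.
from dataclasses import dataclass

@dataclass
class VoiceProfile:
    pitch_hz: float          # e.g. median pitch of the original speaker
    loudness_db: float       # rough amplitude level to preserve

def translate_to_english(audio, source_language: str) -> str:
    """Stand-in for the speech-to-text translation model
    (French, German, or Spanish -> English)."""
    return "hello, nice to meet you"   # dummy output

def extract_voice_profile(audio) -> VoiceProfile:
    """Stand-in for extracting the speaker's unique vocal features."""
    return VoiceProfile(pitch_hz=180.0, loudness_db=-20.0)

def synthesize(text: str, profile: VoiceProfile, direction_deg: float) -> str:
    """Stand-in for text-to-speech in the cloned voice, spatialized so the
    output appears to come from `direction_deg`."""
    return f"[{direction_deg:.0f} deg | {profile.pitch_hz:.0f} Hz] {text}"

def translate_speaker(audio, source_language: str, direction_deg: float) -> str:
    text = translate_to_english(audio, source_language)
    profile = extract_voice_profile(audio)
    return synthesize(text, profile, direction_deg)

print(translate_speaker(audio=None, source_language="fr", direction_deg=105.0))
```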

Given that separating out human voices is hard enough for an AI system on its own, being able to build that capability into a real-time translation system, map the distance between the wearer and each speaker, and achieve workable latency on a real device is impressive, says a postdoctoral researcher at Carnegie Mellon University's Language Technologies Institute who was not involved in the project.

“Real-time speech-to-speech translation is incredibly difficult,” he says. “Their results are very good in the limited testing settings. But for a real product, one would need much more training data, including noise and real-world recordings from the headset, rather than relying on synthetic data.”

Gollakota’s team is now focusing on reducing the amount of time it takes for the AI translation to kick in after a speaker says something, which would allow for more natural-sounding conversations between people who speak different languages. “We really want to reduce this delay to significantly less than a second, so that you can still have a conversation,” says Gollakota.

This remains a major challenge, because the speed at which an AI system can translate one language into another depends on the structure of the languages involved. Of the three languages the system was trained on, it was quickest at translating French into English, followed by Spanish and then German, reflecting the fact that German, unlike the other languages, tends not to place a sentence’s verbs and much of its meaning at the beginning, says a researcher at Johannes Gutenberg University.

Reducing the delay could make the translations less accurate, he warns: “The longer you wait [before translating], the more context you have, and the better the translation will be. It’s a balancing act.”
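That balancing act can be illustrated with the toy streaming policy below, which simply waits to accumulate a chunk of speech before translating it: larger chunks give the translator more context (useful for languages like German, where much of the meaning arrives late in the sentence) but delay the moment the listener hears anything. The chunk sizes and the notion of “context per call” are illustrative assumptions, not measurements from the system.

```python
# A toy illustration of the latency/context trade-off: a streaming
# translator that waits for `chunk_seconds` of speech before translating.
def simulate_streaming_translation(utterance_seconds: float, chunk_seconds: float):
    """Return (latency of first output, number of chunks) for a toy
    wait-then-translate streaming policy."""
    first_output_latency = chunk_seconds            # must wait for one full chunk
    num_chunks = max(1, round(utterance_seconds / chunk_seconds))
    return first_output_latency, num_chunks

for chunk in (0.5, 1.0, 2.0, 4.0):
    latency, chunks = simulate_streaming_translation(utterance_seconds=4.0,
                                                     chunk_seconds=chunk)
    print(f"chunk={chunk:>3}s -> first words after {latency}s, "
          f"{chunks} chunk(s) of context per translation call")
```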
