News: Facebook Files Patent for Machine Learning Dialect Identification


As we pointed out in a previous article on Facebook’s artificial intelligence, the social media giant is having a difficult time translating user-generated content. People tend to post on social media the way they talk, and no machine translation software has been able to adequately understand and translate that informal language, prompting Facebook to look for a better solution.

On August 28th, Facebook filed a patent application for “Machine Learning Dialect Identification” with the U.S. Patent and Trademark Office. The system it describes creates classifiers for language dialects. These classifiers categorize how different words are used, and as the system recognizes those usage patterns, it builds a dialect-specific language module that allows it to translate slang and colloquialisms more accurately.

Previously we discussed how Arabic, specifically, presented problems due to the many dialects spoken across the Arab world, and also the poetic nature of the language.

Stepconference.com, a tech and interactive group in the Middle East and North Africa (MENA) region, says the current Facebook translation button for Arabic cannot translate any Arabic dialects other than Modern Standard Arabic (MSA).

The patent application notes that traditional speech recognition and machine translation systems for Arabic focus on MSA and don’t account for other Arabic dialects, which differ from MSA syntactically, morphologically, lexically and phonologically. As a result, the patent author writes, these systems cannot adequately recognize or translate content items to or from non-MSA dialects.

One way to better translate Arabic dialects is to identify the Arab country in which the comment or web entry was posted, and link the post to the dialect spoken there. Alternatively, an online article or post can be identified as a specific dialect based on user interaction with the content. For example, if an article is rated by users who are known to use a particular Arabic dialect, the module can determine that the article is written in that dialect.
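The two signals described above can be sketched in a few lines of Python. This is only an illustration, not the method claimed in the patent: the country-to-dialect table, the function names, the 60% agreement threshold, and the choice to prefer the interaction signal over the country signal are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical and heavily simplified country-to-dialect table (illustrative only;
# real dialect boundaries do not follow national borders this neatly).
COUNTRY_DIALECT = {
    "EG": "Egyptian",
    "LB": "Levantine",
    "SA": "Gulf",
    "MA": "Maghrebi",
}


def dialect_from_country(country_code):
    """Signal 1: map the country a post was made in to a likely dialect."""
    return COUNTRY_DIALECT.get(country_code)


def dialect_from_interactions(user_dialects, min_share=0.6):
    """Signal 2: infer the dialect from the users who rated the content.

    user_dialects holds the known dialect of each interacting user,
    or None when a user's dialect is unknown. min_share is an assumed
    threshold: the top dialect must account for at least this share.
    """
    known = [d for d in user_dialects if d is not None]
    if not known:
        return None
    dialect, count = Counter(known).most_common(1)[0]
    return dialect if count / len(known) >= min_share else None


def guess_dialect(country_code, user_dialects):
    """Prefer the interaction signal; fall back to the posting country."""
    return dialect_from_interactions(user_dialects) or dialect_from_country(country_code)
```

For instance, `guess_dialect("EG", ["Levantine", "Levantine", "Gulf"])` labels the post Levantine because two of the three raters use that dialect, while the same call with an empty list of raters falls back to the posting country.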

Facebook is also hoping to use crowdsourcing to augment the training data set. The system will send content items and their classification results to users, who can confirm whether a classification is correct or rank it on accuracy.
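As a rough sketch of how such user feedback might feed back into training data, the example below tallies yes/no confirmations and keeps only the items that enough users agreed on. The function names, the minimum of three votes, and the 70% agreement threshold are assumptions chosen for illustration; the patent application as described here does not specify them.

```python
def review_classification(responses, min_votes=3, min_agreement=0.7):
    """Summarize user feedback on one dialect classification.

    responses is a list of booleans, True meaning the user confirmed
    the classification. Both thresholds are assumed values.
    """
    if len(responses) < min_votes:
        return "pending"  # not enough feedback yet
    agreement = sum(responses) / len(responses)
    return "confirmed" if agreement >= min_agreement else "rejected"


def augment_training_set(training_set, text, predicted_dialect, responses):
    """Add (text, dialect) to the training set only if users confirmed it."""
    if review_classification(responses) == "confirmed":
        training_set.append((text, predicted_dialect))
    return training_set
```

A post classified as Egyptian and confirmed by three of three reviewers would be added to the training set; one with only two responses would stay pending until more feedback arrives.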

As companies expand globally, leveraging international social media is vital but will only be useful if the content is accurately localized. Let’s see if Facebook’s new patent gets this right.

Further GPI Resources on Global SEO Topics

Globalization Partners International (GPI) frequently assists customers with multilingual website design, development and deployment, and has developed a suite of globalization tools to help you achieve your multilingual website localization project goals. You can explore them under the Translation tools and Portals section of our website.

For more information or help with your next website translation project, please do not hesitate to contact us by e-mail at info@globalizationpartners.com, by phone at (866) 272-5874, or by requesting a free web translation quote.
