Language Support in Social Media Management

For all content sources other than X (Twitter), the Brandwatch suite uses the Compact Language Detector (CLD) system of language detection. The CLD powers the language detection feature in Google Chrome and Translate. It has become an industry standard with support for 83 languages and improved accuracy. In this article, learn about language detection in Social Media Management, what features are available, and which languages are supported.

Note:

Language is detected by the Compact Language Detector (CLD) system for all content sources other than X (Twitter). For posts (tweets), language is detected via X (Twitter)'s own language identification system.


Available language features

Engage and Listen both offer language detection features to help you manage your customer interactions and get the most out of your mentions analysis.

Engage

  • Automatically detects language in both public and private, incoming and sent messages. You will have the option to correct the detected language manually if needed, or click Translate to open Google Translate and translate the message.

    Note:

    Each message receives only one language classification. If multiple languages are mixed in a single message, the model decides which language is dominant and assigns that language as the classification. If no language can be detected from the message content (e.g. it contains only emojis, mentions, or many unclear or mixed words and characters), the model assigns the classification “No language detected.”

  • Allows you to create Engage feeds with a Message Language filter, to monitor messages in one or multiple languages. You can also create a feed for “No language detected” so that your team does not miss messages whose language could not be detected.
  • Allows you to create automation rules with a Detected Language trigger to automatically label or assign incoming messages based on language. For example, the trigger rule could be the inclusion or exclusion of one or more languages, such as “assign all Dutch messages to team X” or “assign all messages that are not German to team Y.” You can also create automations based on “No language detected,” allowing for a safety net option to catch any messages with an undetected language.

    Note:

    Automation rules run only on newly ingested content. For example, if a user manually corrects the language of a message, no automation rule is triggered by the newly assigned language classification. Language detection also takes longer than other triggers, so rules based on detected language always run after all other active automation rules have completed.
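
The inclusion, exclusion, and “No language detected” triggers described above can be pictured with a minimal Python sketch. This is an illustration only, not how Engage implements automations: the `Rule` class, the `route` function, the team names, and the first-match-wins ordering are all hypothetical assumptions made for the example.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional

# Classification the article says is assigned when detection fails.
NO_LANGUAGE = "No language detected"


@dataclass(frozen=True)
class Rule:
    """Hypothetical stand-in for a rule with a Detected Language trigger."""
    team: str
    languages: FrozenSet[str]  # classifications the trigger refers to
    exclude: bool = False      # True: match messages NOT in `languages`

    def matches(self, detected: str) -> bool:
        hit = detected in self.languages
        return not hit if self.exclude else hit


def route(detected: str, rules: List[Rule]) -> Optional[str]:
    """Return the team of the first matching rule (assumed ordering), else None."""
    for rule in rules:
        if rule.matches(detected):
            return rule.team
    return None


rules = [
    # "assign all Dutch messages to team X"
    Rule(team="team X", languages=frozenset({"nl"})),
    # safety net for messages with no detected language
    Rule(team="review queue", languages=frozenset({NO_LANGUAGE})),
    # "assign all messages that are not German to team Y"
    Rule(team="team Y", languages=frozenset({"de"}), exclude=True),
]

for detected in ("nl", NO_LANGUAGE, "fr", "de"):
    print(detected, "->", route(detected, rules))
```

In this sketch the safety-net rule is placed before the exclusion rule so that undetected messages are not swept up by “not German.” Whether Engage treats “No language detected” this way in exclusion triggers is not stated in this article, so verify the behavior in your own account before relying on it.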

Listen

  • Automatically detects language in Listen search mentions.
  • Offers sentiment analysis and topic extraction for select languages (see below).

Supported languages

The Brandwatch suite offers language detection in Engage and Listen for many languages as well as sentiment analysis and topic extraction in Listen for select languages.

The following languages are currently fully supported:

Dutch Italian Spanish Chinese Arabic
English Polish Turkish Hindi Farsi/Persian
French Portuguese Danish Indonesian Hebrew
German Romanian Norwegian Japanese Thai
Greek Russian Swedish Korean Tagalog

 

The following table shows a complete list of all supported languages, the appropriate language code, and the level of support each language currently receives. We also offer a downloadable PDF of all language codes.

Continent Language Code Language Detection Sentiment Analysis Topic Extraction
Europe Albanian sq X X  
  Armenian hy X    
  Belarusian be X    
  Bosnian bs X X  
  Bulgarian bg X X  
  Catalan, Valencian ca X    
  Cherokee chr X    
  Croatian hr X X  
  Czech cs X X  
  Dutch nl X X X
  English en X X X
  Estonian  et X    
  French fr X X X
  Georgian ka X    
  German de X X X
  Greek el X X X
  Hawaiian haw X    
  Hungarian hu X X  
  Irish ga X    
  Italian it X X X
  Latvian lv X    
  Lithuanian lt X    
  Luxembourgish, Letzeburgesch lb X    
  Macedonian mk X    
  Maltese mt X    
  Montenegrin sr-ME X    
  Polish  pl X X X
  Portuguese pt X X X
  Romanian  ro X X X
  Russian ru X X X
  Scots sco X    
  Serbian sr X X  
  Slovak sk X X  
  Slovenian sl X X  
  Spanish es X X X
  Turkish tr X X X
  Ukrainian uk X    
  Welsh cy X    
           
Scandinavian Danish da X X X
  Faroese fo X    
  Finnish fi X    
  Icelandic is X    
  Norwegian no X X X
  Norwegian Nynorsk nn X    
  Swedish sv X X X
           
African Afrikaans af X    
  Akan ak X    
  Ga gaa X    
  Ganda lg X    
  Igbo ig X    
  Krio kri X    
  Lozi loz X    
  Luba-Kasai lua X    
  Luo luo X    
  Mauritian mfe X    
  Northern Sotho nso X    
  Seychellois crs X    
  Somali so X    
  Sundanese su    
  Swahili sw    
  Tumbuka tum X    
  Zulu zu X    
           
Asian Azerbaijani az X    
  Bengali bn X  
  Bihari languages bh X    
  Burmese my X    
  Cebuano ceb X    
  Chinese zh X X X
  Chinese Traditional zh-Hant X X  
  Gujarati gu X    
  Hindi hi X X X
  Hmong hmn X    
  Indonesian id X X X
  Japanese ja X X X
  Kapampangan pam X    
  Khasi kha X    
  Korean ko X X X
  Kurdish ku X    
  Limbu lif X    
  Malay ms X X  
  Malayalam ml X    
  Mixed Hindi English   X X  
  Mongolian mn X    
  Nepali ne X    
  Newar new X    
  Rajasthani raj X    
  Sanskrit sa X    
  Sinhala si X    
  Tagalog tl X X X
  Tamil ta X    
  Thai th X X X
  Urdu ur X    
  Vietnamese vi X X  
  Waray war X    
  Zhuang za X    
           
Australasian Fijian fj X    
  Javanese jv X    
  Samoan sm X    
  Tonga to X    
           
Middle Eastern Arabic ar X X X
  Farsi/Persian fa X X X
  Hebrew he X X X
  Syriac syr X    
  Yiddish yi X    
           
Other Abkhazian ab X    
  Afar aa X    
  Amharic am X    
  Assamese as X    
  Aymara ay X    
  Bashkir ba X    
  Basque eu X    
  Bislama bi X    
  Breton br    
  Central Khmer km X    
  Chewa ny X    
  Corsican co X    
  Divehi dv X    
  Dzongkha dz X    
  Esperanto eo X    
  Ewe ee X    
  Gaelic gd X    
  Galician gl X    
  Guarani gn X    
  Haitian ht X    
  Hausa ha X    
  Interlingua ia X    
  Interlingue, Occidental ie X    
  Inuktitut iu X    
  Inupiaq ik X    
  Kalaallisut kl X    
  Kannada kn X    
  Kashmiri ks X    
  Kazakh kk X    
  Kinyarwanda rw X    
  Kyrgyz ky X    
  Lao lo X    
  Latin la X    
  Lingala ln X    
  Malagasy mg X    
  Manx gv X    
  Marathi mr X    
  Nauru na X    
  Occitan oc X    
  Oriya or X    
  Oromo om X    
  Ossetic os X    
  Panjabi pa X    
  Pashto ps X    
  Quechua qu X    
  Romansh rm X    
  Rundi rn X    
  Sango sg X    
  Shona sn X    
  Sindhi sd X    
  South Ndebele nr X    
  Southern Sotho st X    
  Swati ss X    
  Tajik tg X    
  Tatar tt X    
  Telugu te X  
  Tibetan bo X    
  Tigrinya ti X    
  Tsonga ts X    
  Tswana tn X    
  Turkmen tk X    
  Twi tw X    
  Uighur ug X    
  Uzbek uz X    
  Venda ve X    
  Volapük vo X    
  Western Frisian fy X    
  Wolof wo X    
  Xhosa xh X    
  Yoruba yo X    
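
If you need to check support levels programmatically, for example before building a report that depends on sentiment analysis, the table can be mirrored as a small lookup. The following Python sketch is an assumption-laden illustration rather than a Brandwatch API: the `Support` flags, the `SUPPORT` dictionary, and the `supports` helper are hypothetical names, and only a handful of rows from the table are copied in.

```python
from enum import Flag, auto


class Support(Flag):
    """Hypothetical flags mirroring the three feature columns of the table."""
    DETECTION = auto()
    SENTIMENT = auto()
    TOPICS = auto()


FULL = Support.DETECTION | Support.SENTIMENT | Support.TOPICS

# A few rows copied from the table above; extend as needed.
SUPPORT = {
    "nl": FULL,                                    # Dutch
    "en": FULL,                                    # English
    "hu": Support.DETECTION | Support.SENTIMENT,   # Hungarian
    "vi": Support.DETECTION | Support.SENTIMENT,   # Vietnamese
    "fi": Support.DETECTION,                       # Finnish
}


def supports(code: str, feature: Support) -> bool:
    """Return True if `feature` is listed for `code`; unknown codes give False."""
    return feature in SUPPORT.get(code, Support(0))


print(supports("nl", Support.TOPICS))     # Dutch has topic extraction
print(supports("hu", Support.TOPICS))     # Hungarian does not
print(supports("fi", Support.SENTIMENT))  # Finnish is detection only
```

Keeping the three features as combinable flags means a fully supported language is simply the union of all three, which matches how the table marks each column independently.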

 
