For all content sources other than X (Twitter), the Brandwatch suite detects language with the Compact Language Detector (CLD). The CLD also powers the language detection feature in Google Chrome and Google Translate, and has become an industry standard with support for 83 languages. In this article, learn about language detection in Social Media Management, what features are available, and which languages are supported.
Note:
Language is detected by the Compact Language Detector (CLD) system for all content sources other than X (Twitter). For posts (tweets), language is detected via X (Twitter)'s own language identification system.
Available language features
Engage and Listen both offer language detection features to help you manage your customer interactions and get the most out of your mentions analysis.
Engage
- Automatically detects language in public and private messages, both incoming and sent. You can correct the detected language manually if needed, or click Translate to open Google Translate and translate the message.
Note:
Each message receives only one language classification. If multiple languages are mixed in a single message, the model decides which language is dominant and assigns that classification. If no language can be detected from the message content (for example, when it consists of emojis, mentions, or many unclear or mixed words and characters), the model assigns a classification of “No language detected.”
- Allows you to create Engage feeds with a Message Language filter, to monitor messages in one or multiple languages. You can also create a feed with “No language detected” to make sure no undetected language messages are missed by your team.
- Allows you to create automation rules with a Detected Language trigger to automatically label or assign incoming messages based on language. For example, the trigger rule could be the inclusion or exclusion of one or more languages, such as “assign all Dutch messages to team X” or “assign all messages that are not German to team Y.” You can also create automations based on “No language detected,” allowing for a safety net option to catch any messages with an undetected language.
Note:
Automation rules only run for newly ingested content. For example, if a user manually corrects the language of a message, no automation rule is triggered based on the newly assigned language classification of that message. Language detection also takes longer than other triggers, so language detection rules always run after all other active automation rules have completed.
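The inclusion and exclusion behavior of a Detected Language trigger can be illustrated with a small Python sketch. This is not Brandwatch code or a Brandwatch API: `LanguageRule`, `route`, and the team names are hypothetical, and the first-match ordering is an assumption made only for this example.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional

NO_LANGUAGE = "No language detected"


@dataclass(frozen=True)
class LanguageRule:
    """Hypothetical stand-in for an automation rule with a Detected Language trigger."""
    assign_to: str
    languages: FrozenSet[str]
    exclude: bool = False  # True means "messages that are NOT in these languages"

    def matches(self, detected: str) -> bool:
        in_set = detected in self.languages
        return not in_set if self.exclude else in_set


def route(detected: str, rules: List[LanguageRule]) -> Optional[str]:
    """Return the assignee of the first matching rule, or None if nothing matches."""
    for rule in rules:
        if rule.matches(detected):
            return rule.assign_to
    return None


rules = [
    LanguageRule("team X", frozenset({"nl"})),                # all Dutch messages
    LanguageRule("safety net", frozenset({NO_LANGUAGE})),     # undetected language
    LanguageRule("team Y", frozenset({"de"}), exclude=True),  # everything not German
]

print(route("nl", rules))         # team X
print(route(NO_LANGUAGE, rules))  # safety net
print(route("fr", rules))         # team Y
print(route("de", rules))         # None
```

In this sketch, an exclusion rule such as “not German” would also match “No language detected,” which is why the safety net rule is listed before it. Whether the product behaves the same way is not stated in this article.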
Listen
- Automatically detects language in Listen search mentions.
- Offers sentiment analysis and topic extraction for select languages (see below).
Supported languages
The Brandwatch suite offers language detection in Engage and Listen for many languages as well as sentiment analysis and topic extraction in Listen for select languages.
The following languages are currently fully supported:
Dutch | Italian | Spanish | Chinese | Arabic |
English | Polish | Turkish | Hindi | Farsi/Persian |
French | Portuguese | Danish | Indonesian | Hebrew |
German | Romanian | Norwegian | Japanese | Thai |
Greek | Russian | Swedish | Korean | Tagalog |
The following table shows a complete list of all supported languages, the appropriate language code, and the level of support each language currently receives. We also offer a downloadable PDF of all language codes.
Continent | Language | Code | Language Detection | Sentiment Analysis | Topic Extraction |
Europe | Albanian | sq | X | X | |
Armenian | hy | X | |||
Belarusian | be | X | |||
Bosnian | bs | X | X | ||
Bulgarian | bg | X | X | ||
Catalan, Valencian | ca | X | |||
Cherokee | chr | X | |||
Croatian | hr | X | X | ||
Czech | cs | X | X | ||
Dutch | nl | X | X | X | |
English | en | X | X | X | |
Estonian | et | X | |||
French | fr | X | X | X | |
Georgian | ka | X | |||
German | de | X | X | X | |
Greek | el | X | X | X | |
Hawaiian | haw | X | |||
Hungarian | hu | X | X | ||
Irish | ga | X | |||
Italian | it | X | X | X | |
Latvian | lv | X | |||
Lithuanian | lt | X | |||
Luxembourgish, Letzeburgesch | lb | X | |||
Macedonian | mk | X | |||
Maltese | mt | X | |||
Montenegrin | sr-ME | X | |||
Polish | pl | X | X | X | |
Portuguese | pt | X | X | X | |
Romanian | ro | X | X | X | |
Russian | ru | X | X | X | |
Scots | sco | X | |||
Serbian | sr | X | X | ||
Slovak | sk | X | X | ||
Slovenian | sl | X | X | ||
Spanish | es | X | X | X | |
Turkish | tr | X | X | X | |
Ukrainian | uk | X | |||
Welsh | cy | X | |||
Scandinavian | Danish | da | X | X | X |
Faroese | fo | X | |||
Finnish | fi | X | |||
Icelandic | is | X | |||
Norwegian | no | X | X | X | |
Norwegian Nynorsk | nn | X | |||
Swedish | sv | X | X | X | |
African | Afrikaans | af | X | ||
Akan | ak | X | |||
Ga | gaa | X | |||
Ganda | lg | X | |||
Igbo | ig | X | |||
Krio | kri | X | |||
Lozi | loz | X | |||
Luba-Kasai | lua | X | |||
Luo | luo | X | |||
Mauritian | mfe | X | |||
Northern Sotho | nso | X | |||
Seychellois | crs | X | |||
Somali | so | X | |||
Sundanese | su | X | |||
Swahili | sw | X | |||
Tumbuka | tum | X | |||
Zulu | zu | X | |||
Asian | Azerbaijani | az | X | ||
Bengali | bn | X | X | ||
Bihari languages | bh | X | |||
Burmese | my | X | |||
Cebuano | ceb | X | |||
Chinese | zh | X | X | X | |
Chinese Traditional | zh-Hant | X | X | ||
Gujarati | gu | X | |||
Hindi | hi | X | X | X | |
Hmong | hmn | X | |||
Indonesian | id | X | X | X | |
Japanese | ja | X | X | X | |
Kapampangan | pam | X | |||
Khasi | kha | X | |||
Korean | ko | X | X | X | |
Kurdish | ku | X | |||
Limbu | lif | X | |||
Malay | ms | X | X | ||
Malayalam | ml | X | |||
Mixed Hindi English | X | X | |||
Mongolian | mn | X | |||
Nepali | ne | X | |||
Newar | new | X | |||
Rajasthani | raj | X | |||
Sanskrit | sa | X | |||
Sinhala | si | X | |||
Tagalog | tl | X | X | X | |
Tamil | ta | X | |||
Thai | th | X | X | X | |
Urdu | ur | X | |||
Vietnamese | vi | X | X | ||
Waray | war | X | |||
Zhuang | za | X | |||
Australasian | Fijian | fj | X | ||
Javanese | jv | X | |||
Samoan | sm | X | |||
Tonga | to | X | |||
Middle Eastern | Arabic | ar | X | X | X |
Farsi/Persian | fa | X | X | X | |
Hebrew | he | X | X | X | |
Syriac | syr | X | |||
Yiddish | yi | X | |||
Other | Abkhazian | ab | X | ||
Afar | aa | X | |||
Amharic | am | X | |||
Assamese | as | X | |||
Aymara | ay | X | |||
Bashkir | ba | X | |||
Basque | eu | X | |||
Bislama | bi | X | |||
Breton | br | X | |||
Central Khmer | km | X | |||
Chewa | ny | X | |||
Corsican | co | X | |||
Divehi | dv | X | |||
Dzongkha | dz | X | |||
Esperanto | eo | X | |||
Ewe | ee | X | |||
Gaelic | gd | X | |||
Galician | gl | X | |||
Guarani | gn | X | |||
Haitian | ht | X | |||
Hausa | ha | X | |||
Interlingua | ia | X | |||
Interlingue, Occidental | ie | X | |||
Inuktitut | iu | X | |||
Inupiaq | ik | X | |||
Kalaallisut | kl | X | |||
Kannada | kn | X | |||
Kashmiri | ks | X | |||
Kazakh | kk | X | |||
Kinyarwanda | rw | X | |||
Kyrgyz | ky | X | |||
Lao | lo | X | |||
Latin | la | X | |||
Lingala | ln | X | |||
Malagasy | mg | X | |||
Manx | gv | X | |||
Marathi | mr | X | |||
Nauru | na | X | |||
Occitan | oc | X | |||
Oriya | or | X | |||
Oromo | om | X | |||
Ossetic | os | X | |||
Panjabi | pa | X | |||
Pashto | ps | X | |||
Quechua | qu | X | |||
Romansh | rm | X | |||
Rundi | rn | X | |||
Sango | sg | X | |||
Shona | sn | X | |||
Sindhi | sd | X | |||
South Ndebele | nr | X | |||
Southern Sotho | st | X | |||
Swati | ss | X | |||
Tajik | tg | X | |||
Tatar | tt | X | |||
Telugu | te | X | X | ||
Tibetan | bo | X | |||
Tigrinya | ti | X | |||
Tsonga | ts | X | |||
Tswana | tn | X | |||
Turkmen | tk | X | |||
Twi | tw | X | |||
Uighur | ug | X | |||
Uzbek | uz | X | |||
Venda | ve | X | |||
Volapük | vo | X | |||
Western Frisian | fy | X | |||
Wolof | wo | X | |||
Xhosa | xh | X | |||
Yoruba | yo | X |
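If you work with language codes outside the platform, for example when planning which markets can use sentiment analysis, the table can be treated as a lookup from code to support level. The Python sketch below is only an illustration: `Support`, `SUPPORT`, and `features_for` are hypothetical names, and the dictionary holds just four rows copied from the table rather than the full list.

```python
from typing import Dict, List, NamedTuple


class Support(NamedTuple):
    detection: bool
    sentiment: bool
    topics: bool


# Excerpt of the table above, keyed by language code.
SUPPORT: Dict[str, Support] = {
    "nl": Support(True, True, True),    # Dutch
    "sq": Support(True, True, False),   # Albanian
    "vi": Support(True, True, False),   # Vietnamese
    "hy": Support(True, False, False),  # Armenian
}


def features_for(code: str) -> List[str]:
    """List the features available for a language code in this excerpt."""
    support = SUPPORT.get(code)
    if support is None:
        return []  # not in this excerpt; says nothing about actual support
    return [name for name, enabled in zip(Support._fields, support) if enabled]


print(features_for("nl"))  # ['detection', 'sentiment', 'topics']
print(features_for("hy"))  # ['detection']
```

An empty list here only means the code is missing from the excerpt, so extend `SUPPORT` from the table before relying on the result.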