For all content sources other than Twitter, Listen detects language with the Compact Language Detector (CLD). CLD powers the language detection feature in Google Chrome and Translate, and it has become an industry standard, with support for 83 languages and improved accuracy. For Tweets, Listen uses Twitter's own language identification.
Support varies by language, from a basic level to full support, which includes sentiment analysis and topic extraction. The following languages currently receive full support:
Dutch | Italian | Spanish | Chinese | Arabic |
English | Polish | Turkish | Hindi | Farsi/Persian |
French | Portuguese | Danish | Indonesian | Hebrew |
German | Romanian | Norwegian | Japanese | Thai |
Greek | Russian | Swedish | Korean | Tagalog |
List of Supported Languages
The following table lists every supported language, its language code, and the level of support it currently receives.
Continent | Language | Code | Basic Support | Sentiment Analysis | Topic Extraction |
Europe | Albanian | sq | X | X | |
Armenian | hy | X | |||
Belarusian | be | X | |||
Bosnian | bs | X | X | ||
Bulgarian | bg | X | X | ||
Catalan, Valencian | ca | X | |||
Cherokee | chr | X | |||
Croatian | hr | X | X | ||
Czech | cs | X | X | ||
Dutch | nl | X | X | X | |
English | en | X | X | X | |
Estonian | et | X | |||
French | fr | X | X | X | |
Georgian | ka | X | |||
German | de | X | X | X | |
Greek | el | X | X | X | |
Hawaiian | haw | X | |||
Hungarian | hu | X | X | ||
Irish | ga | X | |||
Italian | it | X | X | X | |
Latvian | lv | X | |||
Lithuanian | lt | X | |||
Luxembourgish, Letzeburgesch | lb | X | |||
Macedonian | mk | X | |||
Maltese | mt | X | |||
Montenegrin | sr-ME | X | |||
Polish | pl | X | X | X | |
Portuguese | pt | X | X | X | |
Romanian | ro | X | X | X | |
Russian | ru | X | X | X | |
Scots | sco | X | |||
Serbian | sr | X | X | ||
Slovak | sk | X | X | ||
Slovenian | sl | X | X | ||
Spanish | es | X | X | X | |
Turkish | tr | X | X | X | |
Ukrainian | uk | X | |||
Welsh | cy | X | |||
Scandinavian | Danish | da | X | X | X |
Faroese | fo | X | |||
Finnish | fi | X | |||
Icelandic | is | X | |||
Norwegian | no | X | X | X | |
Norwegian Nynorsk | nn | X | |||
Swedish | sv | X | X | X | |
African | Afrikaans | af | X | ||
Akan | ak | X | |||
Ga | gaa | X | |||
Ganda | lg | X | |||
Igbo | ig | X | |||
Krio | kri | X | |||
Lozi | loz | X | |||
Luba-Kasai | lua | X | |||
Luo | luo | X | |||
Mauritian | mfe | X | |||
Northern Sotho | nso | X | |||
Seychellois | crs | X | |||
Somali | so | X | |||
Sundanese | su | X | |||
Swahili | sw | X | |||
Tumbuka | tum | X | |||
Zulu | zu | X | |||
Asian | Azerbaijani | az | X | ||
Bengali | bn | X | X | ||
Bihari languages | bh | X | |||
Burmese | my | X | |||
Cebuano | ceb | X | |||
Chinese | zh | X | X | X | |
Chinese Traditional | zh-Hant | X | X | ||
Gujarati | gu | X | |||
Hindi | hi | X | X | X | |
Hmong | hmn | X | |||
Indonesian | id | X | X | X | |
Japanese | ja | X | X | X | |
Kapampangan | pam | X | |||
Khasi | kha | X | |||
Korean | ko | X | X | X | |
Kurdish | ku | X | |||
Limbu | lif | X | |||
Malay | ms | X | X | ||
Malayalam | ml | X | |||
Mixed Hindi English | X | X | |||
Mongolian | mn | X | |||
Nepali | ne | X | |||
Newar | new | X | |||
Rajasthani | raj | X | |||
Sanskrit | sa | X | |||
Sinhala | si | X | |||
Tamil | ta | X | |||
Thai | th | X | X | X | |
Urdu | ur | X | |||
Vietnamese | vi | X | X | ||
Waray | war | X | |||
Zhuang | za | X | |||
Australasian | Fijian | fj | X | ||
Javanese | jv | X | |||
Maori | mi | X | |||
Samoan | sm | X | |||
Tagalog | tl | X | X | X | |
Tonga | to | X | |||
Middle Eastern | Arabic | ar | X | X | X |
Farsi/Persian | fa | X | X | X | |
Hebrew | he | X | X | X | |
Syriac | syr | X | |||
Yiddish | yi | X | |||
Other | Abkhazian | ab | X | ||
Afar | aa | X | |||
Amharic | am | X | |||
Assamese | as | X | |||
Aymara | ay | X | |||
Bashkir | ba | X | |||
Basque | eu | X | |||
Bislama | bi | X | |||
Breton | br | X | |||
Central Khmer | km | X | |||
Chewa | ny | X | |||
Corsican | co | X | |||
Divehi | dv | X | |||
Dzongkha | dz | X | |||
Esperanto | eo | X | |||
Ewe | ee | X | |||
Gaelic | gd | X | |||
Galician | gl | X | |||
Guarani | gn | X | |||
Haitian | ht | X | |||
Hausa | ha | X | |||
Interlingua | ia | X | |||
Interlingue, Occidental | ie | X | |||
Inuktitut | iu | X | |||
Inupiaq | ik | X | |||
Kalaallisut | kl | X | |||
Kannada | kn | X | |||
Kashmiri | ks | X | |||
Kazakh | kk | X | |||
Kinyarwanda | rw | X | |||
Kyrgyz | ky | X | |||
Lao | lo | X | |||
Latin | la | X | |||
Lingala | ln | X | |||
Malagasy | mg | X | |||
Manx | gv | X | |||
Marathi | mr | X | |||
Nauru | na | X | |||
Occitan | oc | X | |||
Oriya | or | X | |||
Oromo | om | X | |||
Ossetic | os | X | |||
Panjabi | pa | X | |||
Pashto | ps | X | |||
Quechua | qu | X | |||
Romansh | rm | X | |||
Rundi | rn | X | |||
Sango | sg | X | |||
Shona | sn | X | |||
Sindhi | sd | X | |||
South Ndebele | nr | X | |||
Southern Sotho | st | X | |||
Swati | ss | X | |||
Tajik | tg | X | |||
Tatar | tt | X | |||
Telugu | te | X | X | ||
Tibetan | bo | X | |||
Tigrinya | ti | X | |||
Tsonga | ts | X | |||
Tswana | tn | X | |||
Turkmen | tk | X | |||
Twi | tw | X | |||
Uighur | ug | X | |||
Uzbek | uz | X | |||
Venda | ve | X | |||
Volapük | vo | X | |||
Western Frisian | fy | X | |||
Wolof | wo | X | |||
Xhosa | xh | X | |||
Yoruba | yo | X |
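The three support columns in the table can be read as a simple lookup from language code to support level. The following is a minimal sketch of that idea, not part of Listen itself: the `Support` class, the `SUPPORT` dictionary, and the `support_level` function are hypothetical names chosen for illustration, and the dictionary holds only a handful of rows copied from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Support:
    """One table row: which features a language receives (hypothetical type)."""
    basic: bool
    sentiment: bool
    topics: bool


# Small excerpt transcribed from the table above; not the complete list.
SUPPORT = {
    "en": Support(basic=True, sentiment=True, topics=True),        # English
    "tl": Support(basic=True, sentiment=True, topics=True),        # Tagalog
    "bs": Support(basic=True, sentiment=True, topics=False),       # Bosnian
    "zh-Hant": Support(basic=True, sentiment=True, topics=False),  # Chinese Traditional
    "cy": Support(basic=True, sentiment=False, topics=False),      # Welsh
}


def support_level(code: str) -> str:
    """Return 'full', 'sentiment', 'basic', or 'unknown' for a language code."""
    row = SUPPORT.get(code)
    if row is None:
        return "unknown"    # code is not in this excerpt
    if row.sentiment and row.topics:
        return "full"       # sentiment analysis and topic extraction
    if row.sentiment:
        return "sentiment"  # basic support plus sentiment analysis
    return "basic"          # basic support only


if __name__ == "__main__":
    for code in ("en", "bs", "cy", "xx"):
        print(code, support_level(code))
```

The lookup is an exact match on the code as written in the table, so a code such as `zh-Hant` must be passed with the same capitalization.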
Check out the link below to download a copy of the language codes.