N-Gram Word Frequency Counter for Various Languages  

  Introduction

This app displays the frequency of each word n-gram (n = 1, 2, 3, ...) that appears in a given text, according to the user's selection.
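As a rough illustration of what counting word n-grams involves, here is a minimal Python sketch. It is not the app's own code: the function name `ngram_counts` is hypothetical, and it assumes words are separated by whitespace and that case can be ignored.

```python
from collections import Counter

def ngram_counts(text, n):
    """Count word n-grams in whitespace-delimited text (hypothetical helper)."""
    words = text.lower().split()
    # Zip n shifted copies of the word list to form consecutive n-word windows.
    grams = zip(*(words[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)

counts = ngram_counts("the cat saw the cat run", 2)
print(counts.most_common(1))  # [('the cat', 2)]
```

With n = 1 the same function gives plain word frequencies.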

Spaces serve as word delimiters in most languages. In languages such as Chinese, Japanese, and Thai, which do not put spaces between words, counting word frequencies is much harder. For such languages, this app uses foreign-language-to-English dictionaries to identify words.
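The page does not say how the dictionaries are applied. One common approach, shown here only as an assumption, is greedy longest-match segmentation: at each position, take the longest string that is a dictionary headword, and fall back to a single character when nothing matches. The function name `segment` and the tiny headword set are hypothetical.

```python
def segment(text, dictionary):
    """Greedy longest-match segmentation against a set of headwords.

    Characters that start no known headword become one-character tokens.
    """
    max_len = max([1, *map(len, dictionary)])
    tokens, i = [], 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens

# Hypothetical headwords taken from a Chinese-to-English dictionary.
headwords = {"我", "爱", "北京", "天安门"}
print(segment("我爱北京天安门", headwords))  # ['我', '爱', '北京', '天安门']
```

Greedy matching is simple but can split ambiguous strings incorrectly, so a real tool may need a more careful strategy.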
  Choose the main language of your text and click the [Start] button:
English (default) sample  العربية (Arabic) sample  ဗမာစကား (Burmese) sample  中文 (简体 Simplified Chinese) sample 
中文 (繁體 Traditional Chinese) sample  čeština (Czech) sample  dansk (Danish) sample  Farsi (Persian) sample 
français (French) sample  Deutsch (German) sample  हिन्दी (Hindi) sample  bahasa Indonesia (Indonesian) sample 
italiano (Italian) sample  日本語 (Japanese) sample  한국어 (Korean) sample  polski (Polish) sample 
português (Portuguese) sample  Русский язык (Russian) sample  español (Spanish) sample  Kiswahili (Swahili) sample 
Tagalog (Philippines) sample  ภาษาไทย (Thai) sample  Türkçe (Turkish) sample  tiếng việt (Vietnamese) sample 

  The most distinctive feature of this app
When a user wants to know how many times the word ‘USA’ appears in a text, most word frequency tools count only that exact form and omit variants such as ‘US’, ‘U.S.’, ‘U.S.A.’, ‘America’, ‘United States of America’, etc. It would be useful to see the sum of the frequencies of all variants as well as the frequency of each variant separately.

The most distinctive feature of this app is its ability to automatically generate the variants of a representative word (here, ‘USA’). Other examples of representative words are ‘book’ and ‘study’. The noun ‘book’ may have ‘Book’, ‘book's’, ‘books’, etc. as its variants, and the verb ‘study’ may have ‘studies’, ‘studied’, ‘studying’, etc. In Korean, particles such as -이, -가, -을, -를, -에게 (-i, -ka, -ul, -lul, -ey.key) can be attached to a basic or dictionary word such as 미국 (mi.kuk ‘America’), so word+particle strings such as 미국이 (mi.kuk.i), 미국을 (mi.kuk.ul), and 미국에게 (mi.kuk.ey.key) can all be regarded as variants of 미국 (mi.kuk). This app displays the frequency of each representative word, the frequency of each of its variants, and their combined total.
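The kind of report described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the app's implementation: the variant table `VARIANTS` is written by hand here (the app is described as deriving variants automatically), and the name `grouped_frequencies` is hypothetical.

```python
from collections import Counter

# Hypothetical, hand-written variant table for one representative word.
VARIANTS = {
    "study": ["study", "studies", "studied", "studying"],
}

def grouped_frequencies(words, variants):
    """Report per-variant counts and their sum for each representative word."""
    counts = Counter(words)
    report = {}
    for head, forms in variants.items():
        # Keep only the variants that actually occur in the text.
        per_form = {form: counts[form] for form in forms if counts[form]}
        report[head] = {"total": sum(per_form.values()), "variants": per_form}
    return report

words = "she studies while he studied and they study".split()
print(grouped_frequencies(words, VARIANTS))
# {'study': {'total': 3, 'variants': {'study': 1, 'studies': 1, 'studied': 1}}}
```

The same structure would apply to Korean word+particle strings if each string were listed under its dictionary form.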