Large number of words

большое количество слов

большого количества слов

A large number of words borrowed from Sanskrit or formed on its basis are used in Hindi.

В хинди используется большое количество слов, заимствованных из санскрита или образованных на его основе.

Fortunately, this can be avoided, since there are quite a large number of words that can be identified using search engine optimization.

К счастью этого можно избежать, так как существует достаточно большое количество слов, которые могут быть выявлены с помощью поисковой оптимизации.

But having a large number of words in your arsenal and a complete lack of knowledge about how to use them together will, after all, force you to take up English grammar.

Но наличие в вашем арсенале большого количества слов и полное отсутствие знаний о том, как их вместе употребить, все-таки, вынудит вас заняться обучением грамматики английского языка.

Hackers are smart guys, and once they learned that developers were storing hashed passwords, they pre-computed the hashes of a large number of words (from a popular word list or a dictionary).

Хакеры — умные ребята, и как только они узнали, что разработчики хранят хешированные пароли, они предварительно вычислили хеш большого количества слов (из списка популярных слов или словарных слов).

Tamed parrots can memorize and recite a large number of words and sounds, as they have a good memory.

Прирученный попугай может запоминать и произносить большое количество слов и звуков, так как обладают хорошей памятью.

Probably most people have heard the claim that Eskimo languages have an unusually large number of words for «snow».

Умные люди говорят о том, что в эскимосских языках имеется необычайно большое количество слов для обозначения снега.

A popular belief exists that the Inuit have an unusually large number of words for snow.

Широко распространено утверждение о том, что в эскимосских языках имеется необычайно большое количество слов для обозначения снега.

A large number of words came from the languages of the Indian tribes that neighbored the settlers.

Большое количество слов переходит из языков индейских племен, с которыми соседствовали поселенцы.

The Chinese language itself has a large number of words with the same pronunciation but completely different meanings.

Сам китайский язык имеет большое количество слов с тем же произношением, но совершенно разные значения.

A large number of words keeps individual language units from carrying excessive weight in the evaluation of kinship, as happens when their number is small.

Большое количество слов позволяет избежать чрезмерной значимости отдельных языковых единиц для оценки родства, когда их количество мало.

The official language of the country is Spanish, but here it includes a large number of words borrowed from Africans and Indians.

Официальным языком в стране является испанский, однако здесь он включает в себя большое количество слов, заимствованных от африканцев и индейцев.

The situation would be even more complicated if not for the fact that a large number of words are formed from one another and are therefore much easier to remember.

Ситуация ещё более усложнялась бы, если бы не тот факт, что большое количество слов образуются друг от друга, и потому запомнить их намного легче.

The dictionary is based on common vocabulary, including colloquial words, and on a large number of words and terms from different spheres (science, technology, etc.).

Основу словаря составляет общеупотребительная лексика, включая разговорную, а также большое количество слов и терминов из различных сфер деятельности (наука, техника и т.п).

Children aged 9-10 months understand a large number of words, distinguish the intonation with which they are spoken to, and react to the word «no».

Дети в возрасте 9-10 месяцев понимают большое количество слов, различают интонацию, с которой с ними говорят, реагируют на слово «нельзя».

A large number of words of this group appeared in American English in the 1960s during the Vietnam War and were then actively used by other governments.

Большое количество слов этой группы появилась в американском варианте английского языка в 60-х годах во время войны с Вьетнамом и потом было активно использовано другими правительствами.

Like Belarusian, the Ukrainian language contains a large number of words borrowed from Polish, but it has fewer borrowings from Church Slavonic than Russian does.

«Как и белорусский, украинский язык содержит большое количество слов, заимствованных из польского, но у него меньше заимствований из церковнославянского языка, чем у русского языка», — говорится в статье.

Later came his legendary albums Nebraska and Born in the U.S.A. The latter deserves a fairly large number of words.

В дальнейшем выходят его легендарные альбомы Nebraska и Born in the U.S.A. Последний заслуживает достаточно большого количества слов.

Nomina trivialia, which later became used as specific epithets in the binomial names of living organisms) is still used (the previously used long names consisting of a large number of words, gave a description of the species, but were not strictly formalized).

nomina trivialia, которые позже стали использоваться в качестве видовых эпитетов в биноминальных названиях живых организмов) используется до сих пор (применявшиеся ранее длинные названия, состоящие из большого количества слов, давали описание видов, но не были строго формализованы).

In Japanese, there is a large number of words of non-Japanese origin.

В словарном составе японского языка имеется большое количество слов иностранного (неяпонского) происхождения.


This is a list of dictionaries considered authoritative or complete by approximate number of total words, or headwords, included. These figures do not take account of entries with senses for different word classes (such as noun and adjective) and homographs. Although it is possible to count the number of entries in a dictionary, it is not possible to count the number of words in a language.[1][2] In compiling a dictionary, a lexicographer decides whether the evidence of use is sufficient to justify an entry in the dictionary. This decision is not the same as determining whether the word exists.

The green background means a given dictionary is the largest in a given language.

Language Approx. no. of headwords Approx. no. of definitions Dictionary Notes Korean 1,149,538 우리말샘 (Woori Mal Saem, 2017) Online open dictionary including dialects of South and North Korea.[3] Tamil 922,398 Sorkuvai An online open dictionary run by the Tamil Nadu government.[4] Portuguese 818,000 Aulete Digital Online dictionary including expressions.[5] Finnish 800,000 RedFox Pro Online dictionary. The free version has over 300,000 Finnish words and the Pro version has over 800,000 Finnish words. The dictionary has agglomerated other dictionaries, such as technical ones,[6] and the largest set comes from Wordnet.[7] Note that even this dictionary essentially doesn’t include inflections. Kurdish 744,139 Authority of Kurdish Language Dictionary, Kurdish Language Unit Dictionary It contains 744,139 key words from a few Kurdish dialects, but in this census, the Kurdish dialects, terms and buildings in Kurdish were not counted, and in all the dialects Kurdish contains a total of 1.2 million words containing 1.6 million words with all conventions and phrases. Southern Kurdish dialects not examined in (Rojhilat Kurdish Dialects): (Leki, Bayrayi, Fili, Jarossi (Bijari), Kermanshahi, Kulayi, Kerd Ali, Malkshahi, Sanyabi, Kalhori (Kalhuri), Zangana, (Lori), Bashoori Kurdish dialects, Kurdish dialects in Rojava, Bakurian dialects.[8] Swedish 600,000 Svenska Akademiens ordbok, Swedish Academy After having completed letters A through T SAOB included 470,000 words, but 600,000 words when the alphabet was completed in 2017. Svenska Akademiens ordlista, which includes only commonly used words, currently includes ~126,000 words after having added 13,500 and removed 9,000 in its latest edition, SAOL 14, plus an additional 200,000 still encountered words in earlier editions.[9][10] English 711,378 1,402,895 English Wiktionary Contains 711,378 gloss entries and 1,402,895 total definitions.[11][12] Korean 511,282 Standard Korean Language Dictionary[13] Contains 511,282 entries. 
Italian 500,000 Grande Dizionario Hoepli Italiano[14][15] The number of «sayable and writable» word-forms is estimated at over 2 million[16] Japanese 500,000 Nihon Kokugo Daijiten [17] Lithuanian 500,000 Lietuvių kalbos žodynas (Academic Dictionary of Lithuanian) 22,000 pages in 20 volumes with quotations from all kinds of writing and dialect records between 1547 and 2001. Accessible online at[18] English 470,000 Webster’s Third New International Dictionary and Addenda Section Contains 470,000 entries[19] French 408,078 French Wiktionary Contains more than 408,000 lemma, associated to more than 636,500 definitions and 1,880,500 inflections, distributed other 1,924,200 entries.[20] Serbo-Croatian 400,000 Rječnik hrvatskoga ili srpskoga jezika Published from 1880 to 1976 in 97 fascicles collected into 23 volumes under the auspices of the Yugoslav Academy of Sciences and Arts, estimated at a minimum of 400,000 words by Dragica Malić.[21][22] Includes only words found in the Shtokavian dialect; words from Chakavian and Kajkavian dialects are excluded. 
Dutch 400,000 Woordenboek der Nederlandsche Taal The 43 volumes of the WNT (including three supplements) consist of 49,255 pages, describing Dutch words from 1500 to 1976.[23] Chinese 378,103 Hanyu Da Cidian The 3rd edition of the digital version contains 18,014 single-character words, 336,706 compound words, 23,383 idioms (chengyu), 504,040 definitions, and 861,956 examples.[24] English 350,000 The American Heritage Dictionary of the English Language, Third Edition In the introduction to the 4th and 5th editions, it is mentioned that more than 10,000 words have been added, thus the total for the 5th edition will be more than 370,000 words.[25][failed verification] Finnish 350,000 Suomen murteiden sanakirja (in progress) Suomen murteiden sanakirja (SMS) will include 350,000 words from different dialects, with well-documented definitions, based on the archives (Suomen murteiden sana-arkisto) of 400,000 words, with over 8 million definitions.[26][27] Persian 343,466 Dehkhoda Dictionary, 1998, ISBN 9789640396025 The original series initially consisted of 3 million records (Persian: فیش (French: fiche) or برگه «barge») (up to 100 meanings/records for each word or proper noun) until Dehkhoda’s death in March 1956, and currently contains 343,466 entries that, according to the latest digital release of the dictionary by Tehran University Press (version 3.0) are based on an ever-growing library of over 2300 volumes in lexicology and various other scientific fields.[28][29][circular reference] Norwegian 330,000 Norsk Ordbok The finished dictionary has about 330,000 headwords, whereas the corpus it’s built upon contains about 500,000 words in total.[30] German 330,000 Deutsches Wörterbuch 330,000 words in use since the mid-fifteenth century.[31] Turkish 316,000 Ötüken Türkçe Sözlük[32] Turkish dictionary (modern and Ottoman Turkish), includes 316,000 entries.[33] Dutch 300,000 Etymologisch woordenboek van het Nederlands [34] Norwegian 300,000 Tanums store 
rettskrivningsordbok (10. utgave) A dictionary of orthography.[30] Ukrainian 300,000 Український лексикон кінця XVIII — початку XXI ст.: словник-індекс: у 3-х томах Лексичний словник.[35] Gujarati 281,377 Bhagavadgomandal 2.81 lakh words and their meanings in 9 volumes. Also serves as an encyclopedia with almost 8.22 lakh words.[36] English 273,000 600,000 Oxford English Dictionary, Second Edition Oxford Dictionary has 273,000 headwords; 171,476 of them being in current use, 47,156 being obsolete words and around 9,500 derivative words included as subentries. The dictionary contains 157,000 combinations and derivatives in bold type, and 169,000 phrases and combinations in bold italic type, making a total of over 600,000 word-forms.[37][38]
There is one count that puts the English vocabulary at about 1 million words — but that count presumably includes words such as Latin species names, prefixed and suffixed words, scientific terminology, jargon, foreign words of extremely limited English use and technical acronyms.[39][40][41] Urdu 264,000 Urdu Lughat[42] [43] Ukrainian 253,000 Великий орфографічний словник сучасної української лексики A dictionary of orthography. Contains 253,000 entries (253,000 words).[44][45] Czech 250,000 Příruční slovník jazyka českého [cs] Nine volumes of this dictionary were printed in years 1935–1957. They contain about 250,000 words, their meanings and example usage from literature. The dictionary is available online.[46][47] Serbo-Croatian 241,000 Dictionary of Serbo-Croatian Literary and Vernacular Language This dictionary is incomplete. So far, 20 volumes of the planned 40 have been published. These 20 volumes contain 241.000 headwords. When complete, this Dictionary will have around 500.000 headwords.[48] Portuguese 228,000 382,000 Houaiss Dictionary of the Portuguese Language 228,000 entries and 382,000 meanings.[49] Belarusian 223,000 Вялікі слоўнік беларускай мовы: арфаграфія, акцэнтуацыя, парадыгматыка [50] Russian 220,000 250,000 Толковый словарь живого великорусского языка The 3rd edition by Baudouin de Courtenay contains about 250,000 entries (220,000 words and 30,000 proverbs)[51][52] Finnish 201,000 Nykysuomen sanakirja, 1961 Nykysuomen sanakirja can be translated to The Dictionary of Modern Finnish or The Dictionary of Contemporary Finnish, but the language can be quite dated; the dictionary only reflects the language as it was no later than 1961. Even though it has been published again, it has not been updated. The dictionary contains over 201,000 headwords in six volumes.[53] For modern language, The New Dictionary of Modern Finnish is more relevant. 
German 200,000 Großes Wörterbuch der deutschen Sprache Dictionary by the Berlin-Brandenburg Academy of Sciences and Humanities of over 200,000 contemporary words.[54] Norwegian (bokmål) 200,000 Det Norske Akademis ordbok The Norwegian Academy Dictionary contains more than 200,000 entries and more than 300,000 literary quotes. Furthermore, it contains fixed expressions and pronunciation.The dictionary is free and edited daily. Danish 200,000 Ordbog over det danske Sprog Dictionary maintained by the Society for Danish Language and Literature [da]. Covers Danish language use 1700–1950.[55] The society also maintains a sister dictionary, Den Danske Ordbog [da] covering language use since 1950. Slovak 200,000 Slovník slovenského jazyka z r. 1959 – 1968, Slovník súčasného slovenského jazyka A – G, H – L, M – N z r. 2006, 2011, 2015 Here is the information about the number of words in Slovak written by Jazykovedný ústav Ľ. Štúra SAV. Tibetan 195,919 Rangjung Yeshe Dharma Dictionary Considering the large number of Buddhist terminology, colloquial expressions and modern literary Tibetan neologisms not included in this dictionary, the actual total number is probably about twice the number of terms included on this website (195,919), perhaps 375–400,000 Tibetan words in total.


Hindi 183,175 Hindi Wiktionary A free dictionary that gives everyone the right to edit.[57] Romanian 180,000 dexonline Online dictionary. Project of digitisation of 67 general, specialty and archaic dictionaries. Launched in 2001. As of 2013, it contained over 180,000 unique words and 576,000 definitions. Kazakh 166,000 15 томдық «Қазақ тілінің түсіндірме сөздігі» Explanatory dictionary of the Kazakh language[58] English 155,327 207,016 WordNet, 3.1 As of November 2012 WordNet’s latest Online-version is 3.1. The database contains 155,327 words organized in 175,979 synsets for a total of 207,016 word-sense pairs.[59] Belarusian 150,000 Слоўнік беларускай мовы [60] Icelandic 150,000 Orðabók Blöndals[61] The dictionary contains 150,000 headwords in 17 volumes.[62] Russian 150,000 Большой академический словарь русского языка Great Academy Dictionary of Russian language[63] Swiss German 150,000 Schweizerisches Idiotikon[64] The dictionary contains 150,000 words from the late Middle Ages to today.[65] German 148,000 Duden – Die deutsche Rechtschreibung The most influential dictionary in Germany, a dictionary of orthography.[66] Polish 140,000 Wielki słownik ortograficzny PWN Big orthography dictionary PWN contains new words, proper nouns and latest spelling changes. German 141.154 German Wiktionary Contains 141.154 german gloss entries[67][68] French 135,000 Trésor de la Langue Française informatisé ATILF[69] (Analyse et Traitement Informatique de la Langue Française – Computer Processing and Analysis of the French Language)
135,000 (Larousse Dictionnaire de français, published by Editions Larousse)[70][71] Ukrainian 134,058 Словник української мови (The Dictionary of the Ukrainian language) The dictionary was finished in late 1970s — early 1980s[72][73] Dutch 134,000 Woordenlijst Nederlandse Taal (Het Groene Boekje) [74] Russian 130,000 Большой толковый словарь русского языка Great Dictionary of Russian language[75] Swedish 130,000 Rikstermbanken[76] Sweden’s national term bank.[77] Indonesian 127,036 Kamus Besar Bahasa Indonesia, 5th edition, 2016 Swedish 126,000 Svenska Akademiens ordlista (SAOL)[78] Normative Swedish language spelling dictionary, includes around 120,000 headwords.[79] Eastern Armenian 125,000 Ժամանակակից հայոց լեզվի բացատրական բառարան Žamanakakic’ hayoc’ lezvi bac’atrakan baṙaran[80] Tamil 124,405 University of Madras Tamil Lexicon The dictionary includes 124,405 separate entries.[81] Malaysian 120,000 Kamus Dewan Perdana, 1st Edition, 2020 Arabic[notes 1] 120,000 Tāj al-ʿArūs min Jawāhir al-Qāmūs The dictionary includes 120,000 entries filling 40 volumes, whereby one entry comprises dozens of words.[84] Frisian 120,000 Het Wurdboek fan de Fryske taal[85] Dictionary of New Frisian (Nieuwfries) from 1800 to 1975[86] Finland Swedish 120,000 Ordbok över Finlands svenska folkmål[87] (in progress) The dictionary includes around 120,000 headwords.[88] Bulgarian 119,200 Dictionary of the Bulgarian Language (monolingual academic explanatory dictionary), (Многотомен) Речник на българския език in Bulgarian, in 15+ volumes This dictionary covers vocabulary from the last 150 years of Bulgarian and is compiled and edited by linguistics (primarily native lexicographers and lexicologists) from The Institute for the Bulgarian Language (part of the Bulgarian Academy of Sciences). It includes basic, commonly used, literary, colloquial, dialectical, archaic and obsolete Bulgarian words, as well as some specialized terminology. 
The latest volume (15th) published in 2015 ends with headwords beginning with the (Bulgarian Cyrillic) letter Р.[89] French 116,000 Le Dictionnaire universel francophone (DUF)[90] Dictionary published by Hachette[91] Turkish 114,767 Güncel Türkçe Sözlük Online dictionary of the Turkish Language Association[92] Belarusian 112,462 Skarnik As of August 2019. Belarusian-Russian online dictionary contains 112,462 words.[93] Slovene 110,180 Slovar slovenskega knjižnega jezika, Second edition, 2014 The official dictionary of modern Slovene is Slovar slovenskega knjižnega jezika (SSKJ; Standard Slovene Dictionary). It was published in five volumes by Državna Založba Slovenije between 1970 and 1991 and contains more than 100,000 entries and subentries with accentuation, part-of-speech labels, common collocations, and various qualifiers. In the 1990s, an electronic version of the dictionary was published and it is available online.[94] Finnish 102,174 Kielitoimiston sanakirja, 2018 Online dictionary. 
Institute for the Languages of Finland (governmental institute) has selected the core vocabulary, and many headwords are not included.[95] Afrikaans 100,000 Handwoordeboek van die Afrikaanse Taal (HAT), 2015 New 6th edition contains 3228 new keywords and 5365 meaning.[96] French 100,000 Le Grand Robert, 2019 Contains 100,000 words and 350,000 definitions.[97] German 100,000 Österreichisches Wörterbuch, 2018 Official dictionary of the German language in the Republic of Austria.[98] Polish 100,000 Słownik języka polskiego PWN Polish dictionary of PWN contains about 100,000 articles and 145,000 definitions.[99] Russian 100,000 Орфографический словарь русского языка[100] Normative Russian dictionary,[101] the dictionary includes around 100,000 words.[102] Turkish 100,000 Misalli Büyük Türkçe Sözlük[103] Historical Turkish dictionary (modern and Ottoman Turkish), includes 100,000 entries,[104] 35,000 idioms based on 1000 literary works by 400 writers[103] Spanish 93,000 Diccionario de la lengua española de la Real Academia Española, 23rd edition, 2014 [105] Soranî 92,000 فەرهەنگی زانستگای کوردستان Contains 92,000 keywords from Soranî dialect.[106] Spanish 90,000 Diccionario de uso del español [es], 2007 Contains 90,000 keywords and 190,000 meaning. 
Dutch 90,000 Van Dale, 14th edition, 2005 [107] Catalan 88,500 172,000 Gran Diccionari de la llengua catalana (Great Dictionary of the Catalan language, includes the definitions in the Diccionari de la llengua catalana) Contains 88,500 headwords and 172,000 definitions.[108] Chinese 85,568 Zhonghua Zihai The largest character dictionary covering all varieties of Chinese, a compilation of Chinese characters in use over three millennia of written history.[109][110][111] Arabic 83,015 المعجم المعاصر2019 The first version (2019) contains approximately 83,015 entries.[112][113] Arabic 80,000 Lisan Al-Arab The dictionary includes around 80,000 entries.[84] French 80,000 Dictionnaire de la langue française[114] Four-volume dictionary[115] of the French language by Émile Littré[116] Uzbek 80,000 Oʻzbek tilining izohli lugʻati (Annotated Dictionary of the Uzbek Language) The largest Uzbek language dictionary, made of five volumes and including around 80,000 entries.[117] Middle Dutch 75,000 Het Middelnederlandsch Woordenboek [nl][118] Historical dictionary of Middle Dutch: 1250-1550[119] Romanian 67,000 Dicționarul explicativ al limbii române (Published by the Romanian Academy) Tamazight 65,716 Amawal Ameqran, Abdelhafed Idres. 
2017 Swedish 65,000 Svensk ordbok utgiven av Svenska Akademien (Svensk ordbok, SO)[120] The dictionary includes around 65,000 headwords.[120] Swedish 64,000 Ordbok öfver svenska språket (Dalin Ordbok)[121] The dictionary includes around 64,000 headwords.[122] Arabic 60,000 Al-Qamus al-Muhit wa al-Qabus al-Wasit[notes 2] The dictionary includes around 60,000 entries.[123] Dutch 60,000 Groot Woordenboek Afrikaans en Nederlands[124] Dutch-Afrikaans dictionary[125] Turkish 60,000 Osmanlıca-Türkçe Ansiklopedik Lûgat[126] Ottoman Turkish dictionary, includes 60,000 entries.[127] Galician 59,999 Dicionario da Real Academia Galega (Dictionary of the Royal Galician Academy) [128] Western Armenian 56,000 Հայոց լեզուի նոր բառարան Hayoc’ lezowi nor baṙaran[129] Tatar 56,000 Татарско-русский словарь Ш.Н. Асылгараева, Ф.А. Ганиева, М.З. Закиева, К.М. Миннуллина, Д.Б. Рамазанова Tatar-Russian dictionary of Sh.N. Asylgaraev, F.A. Ganiev, M.Z. Zakiyev, K.M. Minnullin, D.B. Ramazanova[130] French 55,000 Dictionnaire de l’Académie française (DAF) Normative French dictionary,[131] once complete, it will contain 55,000 words[132] Dutch 52,000 Woordenlijst Nederlandse Taal[133] Normative Dutch dictionary,[134] the dictionary includes around 52,000 entries and around 134.000 derivative words.[135] Dutch 50,000 Het Groene woordenboek — Handwoordenboek Nederlands [136] Turkmen 50,000 Türkmen diliniň düşündirişli sözlügi Turkmen Explanatory Dictionary[137] Azerbaijani 44,750 Azərbaycan dilinin izahlı lüğəti Azerbaijani Explanatory Dictionary[138] Syriac 43,030 Sureth dictionary Published by the Association Assyrophile de France, it features Assyrian Neo-Aramaic, Turoyo and Chaldean Neo-Aramaic words of all dialects.[139] Icelandic 43,000 560,000 Orðabók Háskólans 43,000 basic words and 519,000 compound words of which more than half are attested only once or don’t get into print (“instant combinations”)[140] Thai 40,840 พจนานุกรม ฉบับราชบัณฑิตยสถาน พ.ศ. 
๒๕๕๔ Bashkir 40,000 Башкирско-русский словарь Ураксин З.Г. Bashkir-Russian dictionary Uraksin Z. G.[141] Chuvash 40,000 Чувашско-русский словарь Скворцова М. И. Chuvash-Russian dictionary Skvortsova M. I.[142] Dargwa 40,000 Даргинско-русский словарь Юсупова Х. А. Dargwa-Russian dictionary of Yusupov H. A[143] Riksmål 40,000 Riksmålsordlisten[144] Normative dictionary of the non-official Norwegian written language called Riksmål.[145] Arabic 40,000 Taj al-Lugha wa Sihah al-Arabiyya[notes 3] The dictionary includes around 40,000 entries.[123] Classical Latin 39,589 Oxford Latin Dictionary Includes 39,589 Classical Latin entries, including borrowings from Greek, Gaulish, other Italic dialects, Sanskrit, and others. There are about: 10,000 adjectives, 2,123 adverbs, 46 conjunctions, 77 interjections, 17,450 nouns, 26 particles, 39 prepositions, 17 pronouns, and 5,986 verbs. The remaining entries are references to other entries (such as alternate spellings or archaic versions), prefixes, suffixes, and terms left untranslated by the original editors.[146] Avar 36,000 Аварско-русский словарь Гимбатова. М. М.[147] Avar-Russian dictionary of M. M. Gimbatov[147] Venetian 36,000 Dizionario della lingua veneta[148] Dictionary of Venetian Language[149] of Gianfranco Cavallin[150] Turkish 32,021 Nişanyan Sözlük[151] Turkish etymological dictionary, includes 32,021 entries.[151] Lezgi 28,000 Лезгинско-русский словарь: Б. Б Талибов, М. М. Гаджиев[152] Lezgi-Russian dictionary: B. B Talibov, M. M. Gadzhiev[152] Middle Dutch 25,000 Het Vroegmiddelnederlands Woordenboek[153] Historical dictionary of Middle Dutch: 1200-1300[154] Old Swedish 22,894 Ordbok öfver svenska medeltidsspråket[155] The dictionary includes around 22,894 headwords in 3 volumes[156] and with supplement in 2 volumes (21,495 headwords) the dictionary includes 44,389 headwords.[157] Chechen 20,000 Чеченско-русский словарь. Алироев, И.А.; Хамидова, З.Х.; Алексеев, М.Е.[158] Chechen-Russian dictionary. 
I.A Aliroev., Z.Kh. Khamidova., M.E. Alekseev.,[158] Kabardian 20,000 Кабардинско-русский словарь. М. Л. Апажев, Н. А. Багов[159] Kabardian-Russian dictionary. M. L. Apazhev, N. A. Bagov[159] Quechua 20,000 Diccionario Quechua-Español Lira Jorge Quechua-Spanish dictionary Lira Jorge[160] Swedish 20,000 Svensk etymologisk ordbok[161] The dictionary includes around 20,000 headwords.[162] Esperanto 16,780 Plena Ilustrita Vortaro de Esperanto (Complete Illustrated Dictionary of Esperanto) 46,890 lexical units[163] Ingush 11,142 Ингушско-русский словарь. М. С. Мургустов.[164] Ingush-Russian dictionary by M.S. Murgustov.[164] Nahuatl 10,500 Tlahtolxitlauhcayotl: Chicontepec, Veracruz Huasteca Nahuatl monolingual dictionary with 10 500 entries of which 360 are loanwords, co-authored by John Joseph Sullivan[165] Russian 10,000 Словарь ударений русского языка[166] Normative dictionary of stresses[167] Old Dutch 9,000 Het Oudnederlands Woordenboek[168] [154] Flemish-only words 1,000 Het Gele Boekje[169] [170] Toki Pona 120 Toki Pona: The Language of Good[171] A later dictionary added 17 words, bring the total number of official words to 137.[172] [173]

large number of — перевод на русский

But the systems of mass production that had been developed in America were only profitable if they made large numbers of the same objects.

Но системы массового производства, созданые в Америке, были прибыльными только если они производили большое количество одинаковых товаров.

Throughout Germany, a large number of firms are making seemingly innocent component parts, which are then dispatched to great central factories where they’re assembled very rapidly, into fighter and bomber aircraft.

По всей Германии большое количество заводов изготавливают, казалось бы, обычные запасные части, которые потом переправляют на центральные заводы...

Killing large numbers of people simply ’cause they don’t look like you, they don’t talk like you and they don’t have the same kind of hats you do!

Убить большое количество людей просто потому, что они не выглядят как вы, не говорят как вы, и у них нет таких же шляп, как у вас.

That’s the question! How are we gonna get large numbers of people to commit suicide at a time and place of our choosing?

Как сделать, чтобы большое количество людей совершили самоубийство в нужное время в нужном нам месте?

Punish them severely enough that they can essentially intimidate a large number of other people.

Наказать их достаточно строго, чтобы это существенно запугало большое количество остальных.

Or you could use two flat mirrors and a candle flame you would see a large number of images each the reflection of another image.

Если поставить два плоских зеркала и пламя свечи между ними, то вы увидите огромное количество отражений, каждое из которых в свою очередь является отражением другого.

You, um… you remember we told you earlier about the large number of dogs that watch us here on A Bit of Fry and Laurie?

Вы… мы уже ранее говорили Вам, что нашу программу, смотрит огромное количество домашних питомцев.

A large number of Greek ships were destroyed… in the bombing of the nearby Port of Piraeus.

Огромное количество греческих кораблей было разрушено недалеко от порта Пирес.

I have been looking through a large number of reference books, and I can’t find a single mention of it.

Я пересмотрел огромное количество справочников, и не мог найти хоть малейшее упоминание об этом.

An enormous amount of energy will be drained from wherever you arrive and a large number of people will be killed.

Гигантское количество энергии будет поглощено из всего, что будет находится в точке вашего прибытия. И огромное количество людей погибнет.

There are a large number of Alices, of Trudys,

Здесь много Алис, Труди,

Get rich, screw a large number of women.

— Я разбогател и получу много женщин.

Does that mean that L has begun to suspect the police? He must need quite a large number of people to investigate them.

что L начал подозревать полицию? их должно быть много.

Large numbers of bodies are still not cleared from the original outbreak of infection.

После вспышки эпидемии осталось много тел, их предстоит убрать.

Man, we got a large number of clovers on our lawn.

Мужик, у нас много клевера на лужайке.

What we are looking at now is a mission where a large number of the Watutsi people took shelter.

Сейчас перед нами миссионерская резиденция где большое число людей батутси нашли убежище.

Parish shipped a large number of electronic components from China to New York.

Пэришу доставили морем большое число деталей электроники из Китая в Нью-Йорк.

Though a large number of men have been dispatched north.

Однако большое число людей отправлено на север.

The largest number of malnourished people ever in the history of the Earth.

Это самое большое число за всю историю Земли.

From a very large number of computer runs making various assumptions, adopting various maxima and minima, there is in fact a general forecast of a breakdown of world society in the first decades of the next century.

С помощью большого числа компьютеров, способных делать различные предположения, принимая разнообразные минимумы и максимумы, нами был фактически спрогнозирован распад мирового сообщества в первой половине следующего века.

We do have a large number of ships passing through.

Через станцию проходит множество кораблей.

An examination of the body showed a large number of bites and wounds.

Осмотр тела показал множество ранений.

They report large numbers of Odin’s men patrolling just beyond the border.

Да, сир. У границы замечено множество патрулей Одина.

She’s already undergone a large number of medical procedures.

Она и так пережила множество медицинских процедур.

All I’ll say is, he gets a large number of letters, written in a particular female hand.

Вот что я скажу, он получает множество писем, написанных особенным женским почерком.

The Linguistic Society of America mentions 6,909 distinct languages in the world. Have you ever wondered which language has the most words in it out of all these languages?

This is a difficult question to answer, even for linguists. When you start learning a new language, you may well wonder how many words it has.

Although the question seems simple, it hides several complexities. By the end of this article, you should have an answer to the question “Which language has the most words?”

How many words a language has is still controversial, partly because dictionaries do not include every word.

However, many linguists consider certain languages to be among the richest. Here are a few of them:

1. English – Does It Have The Most Words?

According to some linguists, English is one of the largest languages in terms of word count. Only a limited number of dictionary entries support this assertion.

The Oxford English Dictionary contains more than 200,000 words: 171,476 words in current use and 47,156 obsolete words, or 218,632 in total.

Many countries speak English as a second language, and English has absorbed a great many words from other languages.

Having incorporated these words into its own vocabulary, it gives speakers a broader range of words to choose from.

Much of this borrowing began with the Viking invasions of England. Later, the Normans, colonization, and waves of exploration added still more to the English lexicon.

As a result, English dictionaries gained many foreign terms. New words are added to the language every day, and English has become the global lingua franca.

2. Korean – Is It The Richest Language?

Korean is at the very top of the list. The largest dictionary of the Korean language contains 1,100,373 terms, according to a government-approved official source.

This is the most extensive collection of words in any language’s dictionary. It is roughly double the 505,000 words found in the most extensive English dictionary.

That is a huge number. However, it includes both South Korean and North Korean vocabulary.

It presumably also covers the dialects of both regions, which helps explain the size of the total.

3. Arabic – Does It Have The Most Words?

Its strong impact on other languages is undeniable. But is Arabic the richest language in terms of words? Its intricate word-formation system makes it a strong candidate.

Arabic words are built from roots of three, four, or five letters, and each root can generate a wide range of meanings.

If you counted each variant as a different word, you would end up with a very large number of words. However, native speakers do not use all of them.

Arabic has hundreds of terms for camel. It also has many words for love, including distinct words for each stage of love. This demonstrates the language’s versatility and depth.

Many people may not know that a large number of English words originate from Arabic.

These words often passed through Spanish, Latin, Italian, German, Greek, and other languages before making their way into English.

4. German – Extensive Vocabulary:

If established compound words are counted, German contains over 300,000 words. Because compounding is productive, German speakers can theoretically coin as many terms as they want.

German has some of the longest words in existence, because it joins the parts of a compound into a single, very long word.

For instance, the German word “Unabhängigkeitserklärung” means “declaration of independence.” Do you think it is a single word?

Given the compounding possibilities, German could soon surpass English as new, legitimate German “words” are continually added. German speakers accept such words without hesitation.
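To make the compounding idea concrete, here is a minimal Python sketch. It assumes that “Unabhängigkeitserklärung” splits into the nouns Unabhängigkeit (independence) and Erklärung (declaration) joined by a linking “s”. The helper name `compound` is made up for this illustration, and real German chooses linking elements word by word, so treat this as a toy model rather than a rule.

```python
def compound(first: str, *rest: str, link: str = "") -> str:
    """Join nouns into one German-style compound (illustrative helper, not a library API)."""
    # Only the first letter of the whole compound stays capitalized;
    # the same linking element is placed before every later part.
    return first + "".join(link + part.lower() for part in rest)


word = compound("Unabhängigkeit", "Erklärung", link="s")
print(word, len(word))
```

Adding more nouns to the call simply makes the result longer, which is why German compounds have no fixed upper limit.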

5. Japanese – With Massive Word Count:

Another popular contender is Japanese, which contains a massive 500,000 words. About half of them, however, are written in Kanji (Chinese characters).

A daily newspaper can be written with roughly 1,200 Kanji.

This gives you an idea of how massive the Japanese word count is. Even though it borrows heavily from Chinese, it is still one of the languages with the most words.

6. Finnish – Has The Most Words:

Finnish is an agglutinative language, which means it can form new words by combining stems with prefixes and suffixes. In principle, it can produce an unlimited number of words. You will not, however, find such an extensive vocabulary in any dictionary.

Are Dictionaries A Reliable Source To Estimate Which Language Has The Most Words?

Based on the facts, the answer is no. Have you ever heard a language expert say that English contains the most words?

Some do make this claim, but verifying it is nearly impossible. Moreover, dictionaries do not contain all the words of a language; most exclude slang, for example.

Dictionaries are therefore not a reliable way to estimate which language has the most words.

Many linguists say that English has the most words. But what about Korean? Its largest dictionary lists 1,100,373 words.

That is about twice the number of words in English, so the claim breaks down here.

The Duden contains almost 145,000 German words. The Diccionario de la Real Academia Española documents almost 88,000 words.

In a nutshell, dictionary entries alone cannot tell us which language has the most words, although they do allow rough comparisons among some European languages.

Facts: Dictionaries Cannot Estimate Which Language Has The Most Words

Several examples show that dictionaries are not a definitive measure, and that estimating which language has the most words this way does not work.

Steven Frank, the author of The Pen Commandments, says that English comprises 500,000 words.

The Oxford English Dictionary gives a different figure: more than 200,000 words, comprising 171,476 words in current use and 47,156 obsolete words.

Other examples are:

According to Steven Frank, the German language has around 135,000 words, while the word count in authoritative dictionaries is 330,000.

Steven Frank claims that the French language has fewer than 100,000 words. The Littré dictionary, on the other hand, contains 132,000 active terms.

According to the Larousse dictionary, however, French has 59,000 words.

These dictionaries are limited to the vocabulary of a single country, even though French is spoken well beyond France’s borders.

Bottom Line:

By now, you should have an idea of which language has the most words. Many linguists say that English has the most words of any language on the planet. However, proving this is not easy.

It is difficult to estimate the total number of words in languages with non-alphabetic writing systems, such as Chinese.

One approach to determining which language has the most words is to compare dictionaries, but it mainly works for Europe’s major languages.

Furthermore, dictionaries cannot include every term in a language. In many cases, they omit compound words.

So, in the end, it really doesn’t matter. Whichever language you learn, you’ll be dealing with a large vocabulary.

Sylvia Simpson

I’ve always loved learning and teaching languages. I started my career as a teacher in Madrid, Spain, where I taught business professionals. I then moved to Brussels, Belgium, where I worked with international affairs students and interns who were working with the European Union.

The longest word in any given language depends on the word formation rules of each specific language, and on the types of words allowed for consideration.

Agglutinative languages allow for the creation of long words via compounding. Words consisting of hundreds, or even thousands of characters have been coined. Even non-agglutinative languages may allow word formation of theoretically limitless length in certain contexts. An example common to many languages is the term for a very remote ancestor, «great-great-…..-grandfather», where the prefix «great-» may be repeated any number of times. The examples of «longest words» within the «Agglutinative languages» section may be nowhere near the longest possible words in those languages; they are instead popular examples of text-heavy words.
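The unbounded «great-» construction can be illustrated with a minimal Python sketch. The function name `remote_ancestor` and its parameters are hypothetical and serve only to show that no finite upper bound on word length exists once a prefix may repeat freely.

```python
def remote_ancestor(generations: int, base: str = "grandfather") -> str:
    """Return the term for an ancestor with `generations` repetitions of «great-»."""
    if generations < 0:
        raise ValueError("generations must be non-negative")
    # Each extra generation prepends one more "great-" (6 characters).
    return "great-" * generations + base


print(remote_ancestor(2))           # great-great-grandfather
print(len(remote_ancestor(1000)))   # 6 * 1000 + 11 characters
```

For any proposed «longest» form, calling the function with one more generation yields a longer one.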

Systematic names of chemical compounds can run to hundreds of thousands of characters in length. The rules of creation of such names are commonly defined by international bodies, therefore they formally belong to many languages. The longest recognized systematic name is for the protein titin, at 189,819 letters.[1] While lexicographers regard generic names of chemical compounds as verbal formulae rather than words,[2] for its sheer length the systematic name for titin is often included in longest-word lists.

Longest word candidates may be judged by their acceptance in major dictionaries such as the Oxford English Dictionary or in record-keeping publications like Guinness World Records, and by the frequency of their use in ordinary language.

Agglutinative languages


The longest Basque toponym is Azpilicuetagaraicosaroyarenberecolarrea (39), which means «The lower field of the sheepfold (located in) the height of Azpilicueta».[3]


Since Esperanto allows word compounding, there are no limits on how long a word can theoretically become. An example is the 37-letter oranĝ-kanton-pafil-limig-aktivul-malamanto, meaning «Orange County gun control activist hater». Such clusters are not considered good style (the 8-word alternative oranĝkantona malamanto de aktivuloj por limigo de pafiloj is more standard), but they are permissible under the rules of Esperanto grammar.[4] Hyphens are optional in Esperanto compounds,[5] so oranĝkantonpafillimigaktivulmalamanto is also technically a valid spelling.

The longest Esperanto roots officially recognized by the Akademio de Esperanto are 13 letters long, shown here with the added substantive «-o» ending:

  • administracio (administration),
  • aŭtobiografio (autobiography),
  • diskriminacio (discrimination),
  • konservatorio (conservatory),
  • paleontologio (palaeontology),
  • paralelogramo (parallelogram), and
  • trigonometrio (trigonometry).[6]

The longest word found in the dictionary Plena Ilustrita Vortaro as of its 2020 edition is the 24-letter proper noun Meklenburgio-Antaŭpomerio (the German state Mecklenburg-Vorpommern), followed by the 21-letter word proviantadministracio (rations administration).

As of March 2022 the longest word found in the Tekstaro de Esperanto text corpus is the 66-letter word unue-volapukista-poste-esperantista-poste-idista-poste-denove-esperantista, meaning «first-volapukist-then-esperantist-then-idist-then-again-esperantist», which was used in a review published in Monato in 1997 to describe František Lorenz.[7] However, this word does not follow normal Esperanto word formation rules. Other long words found in Tekstaro de Esperanto that do follow regular word formation include:

  • sescent-kvindek-mil-kvadratkilometra (consisting of 650 000 square kilometers), 33 letters, used in an Esperanto version of a 2011 article by Marc Lavergne in Le Monde diplomatique,
  • tragedio-komedio-historio-pastoraloj (tragical-comical-historical-pastorals), 33 letters, used in L. L. Zamenhof’s translation of Hamlet,
  • Nord-Atlantik-Traktad-Organizo (North Atlantic Treaty Organization), 27 letters, more commonly translated with two words: Nord-Atlantika Traktat-Organiz(aĵ)o.


Long Estonian words include:

  • Sünnipäevanädalalõpupeopärastlõunaväsimatus, meaning «untiredness of a birthday week graduation party», which is 43 letters long.[citation needed]
  • The 31-letter word uusaastaöövastuvõtuhommikuidüll, meaning «morning idyll after the new year».[8]
  • The 25-letter word põllumajandusministeerium, meaning «Ministry of Agriculture».[citation needed]
  • The word kuulilennuteetunneliluuk, meaning «the hatch a bullet flies out of when exiting a tunnel», is 24 letters long and a palindrome. It could be one of the longest palindromes.[citation needed]


Examples of long words that have been in everyday use in the Finnish language are kolmivaihekilowattituntimittari which means «three-phase kilowatt hour meter» (31 letters), liikekannallepanotarkastuskierros («mobilization inspection round», 33 letters),[9] peruspalveluliikelaitoskuntayhtymä («a public utility of a municipal federation for provision of basic services», 34 letters),[10] and lentokonesuihkuturbiinimoottoriapumekaanikkoaliupseerioppilas «airplane jet turbine engine auxiliary mechanic non-commissioned officer student» (61 letters), an actual military term, although one which has been deprecated. The longest military term in current use is vastatykistömaalinosoitustutkakalustojärjestelmäinsinöörierikoisupseeri «counter-artillery targeting radar systems engineer specialist officer» with 71 characters, with 2 more if grammatically incorrect extra hyphens added for readability are counted.[citation needed] If conjugated forms are allowed, even longer real words can be made. Allowing derivatives and clitics allows the already lengthy word to grow even longer, although the usability of the word starts to degrade. Because Finnish uses free forming of composite words, new words can even be formed during a conversation. One can add nouns after each other without breaking grammar rules.

If one allows artificial constructs as well as using clitics and conjugated forms, one can create even longer words: such as kumarreksituteskenteleentuvaisehkollaismaisekkuudellisennesk-
(102 letters), which was created by Artturi Kannisto.[11]

The longest non-compound (a single stem with prefixes and suffixes) Finnish word recognised by the Guinness Book of Records is epäjärjestelmällistyttämättömyydellänsäkäänköhänkään (see also Agglutination#Extremes), based on the stem järki (reason, sanity), and it means: «I wonder if – even with his/her quality of not having been made unsystematized».

Äteritsiputeritsipuolilautatsijänkä and a defunct bar named after it, Äteritsiputeritsipuolilautatsi-baari, are the longest place names in use.


Eltöredezettségmentesítőtleníttethetetlenségtelenítőtlenkedhetnétek, with 67 letters is the longest word in the Hungarian language and approximately means «you could defragmentation defragmenting impenetrability defragmentation». It is already morphed, since Hungarian is an agglutinative language.

Hungarian has many reasons for writing words together, but a few rules limit word length to keep text readable.

Compound words of up to six syllables can be written as one word. Compounds of more than six syllables in total have to be separated by a hyphen. If more than two words are already joined with a hyphen and another is to be added, a new hyphen is used to attach it (as in C-vitamin-adagolás, meaning «Vitamin C rationing»). If two long words would have to be joined, it is advisable to write them separately (possible: békeszerződéstervezet-kidolgozás, meaning «peace agreement plan elaboration», but preferred: a békeszerződés tervezetének kidolgozása, meaning «the elaboration of the plan of the peace agreement»).

The longest dictionary form word is the word megszentségtelenített, with 21 characters (although it ultimately derives from the word szent meaning: «saint» or «sacred»), and it means «desecrated» or «profaned».[13]


There is some disagreement about what is the longest word in the Korean language, which arises from misunderstanding of the Korean language.

The longest word appearing in the Standard Korean Dictionary published by the National Institute of the Korean Language is 청자 양인각 연당초상감 모란 문은구 대접 (靑瓷陽印刻蓮唐草象嵌牡丹文銀釦대접); Revised Romanization: cheongjayang-in-gakyeondangchosang-gammoranmuneun-gudaejeop, which is a kind of ceramic bowl from the Goryeo dynasty; that word is 17 syllable blocks long, and contains a total of 46 hangul letters.[14][15] However, to call this a word would be incorrect. It simply consists of many words which act as adjectives for the one word 대접.

The word 니코틴아마이드 아데닌 다이뉴클레오타이드 (nikotin-amaideu adenin dainyukeulle-otaideu), a phonetic transcription of «nicotinamide adenine dinucleotide», has a larger number of syllable blocks (19) but a smaller number of letters (41), but does not qualify as a single word due to the spaces.

In proper nouns, many Korean monarchs have overly long posthumous names built from many different Sino-Korean nouns describing their positive characteristics, for example Sunjo of Joseon, whose full posthumous name is the 77-syllable-block 순조 선각 연덕현도 경인순희 체성응명흠광석경계천배극융원돈휴의행소윤희화준렬대중지정 홍훈철 모건시태형창 운홍기고명박후강건수정계통수력 공유범문안무정영경 성효대왕 (sunjoseongag-yeondeoghyeondogyeong-insunhuicheseong-eungmyeongheumgwangseoggyeong-gyecheonbaegeug-yung-wondonhyuuihaengsoyunhuihwa-junlyeoldaejungjijeonghonghuncheolmogeonsitaehy-eongchang-unhong-gigomyeongbaghugang-geonsujeong-gyetongsulyeoggong-yubeommun-anmujeong-yeong-gyeongseonghyodaewang).[citation needed] This is simply the Hanja (Hanzi) phrase 純祖先覺淵德顯道景仁純禧體聖凝命欽光錫慶繼天配極隆元敦休懿行昭倫熙化峻烈大中至正洪勳哲謨乾始泰亨昌運弘基高明博厚剛健粹精啓統垂曆建功裕範文安武靖英敬成孝肅皇帝 transliterated into Hangul. It is not a single word and does not qualify as a lexical entry.


A popular example of the longest suffixed word in Mongolian is «Цахилгаанжуулалтыхантайгаа» (tsakhilgaanjuulaltykhantaigaa) which is 26 letters long. Here is a table showing, with translations, which suffixes are added.[citation needed]

Word Translation
Цахилгаан electricity (power)
Цахилгаанжуул electrify
Цахилгаанжуулалт electrification
Цахилгаанжуулалтын electrifications
Цахилгаанжуулалтыхан electricians
Цахилгаанжуулалтыхантай with electricians
Цахилгаанжуулалтыхантайгаа do (action) with electricians


The longest word in the Ojibwe language is miinibaashkiminasiganibiitoosijiganibadagwiingweshiganibakwezhigan (66 letters), meaning «blueberry pie». This literally translates to «blueberry cooked to jellied preserve that lies in layers in which the face is covered in bread».[16]


Tagalog can make long words by adding on affixes, suffixes, and other root words with a connector.

The longest published word in the language is pinakanakakapagngitngitngitngitang-pagsisinungasinungalingan, with 59 letters. This compound word means «to keep making up a lie that causes the most extreme anger while pretending you are not.»[17]


Turkish, as an agglutinative language, carries the potential for words of arbitrary length.

Muvaffakiyetsizleştiricileştiriveremeyebileceklerimizdenmişsinizcesine, at 70 letters, has been cited as the longest Turkish word. It was used in a contrived story designed to use this word.[18][19] The word means «As if you would be from those we can not easily/quickly make a maker of unsuccessful ones» and its usage was illustrated as follows:

Kötü amaçların güdüldüğü bir öğretmen okulundayız. Yetiştirilen öğretmenlere öğrencileri nasıl muvaffakiyetsizleştirecekleri öğretiliyor. Yani öğretmenler birer muvaffakiyetsizleştirici olarak yetiştiriliyorlar. Fakat öğretmenlerden biri muvaffakiyetsizleştirici olmayı, yani muvaffakiyetsizleştiricileştirilmeyi reddediyor, bu konuda ileri geri konuşuyor. Bütün öğretmenleri kolayca muvaffakiyetsizleştiricileştiriverebileceğini sanan okul müdürü bu duruma sinirleniyor, ve söz konusu öğretmeni makamına çağırıp ona diyor ki: Muvaffakiyetsizleştiricileştiriveremeyebileceklerimizdenmişsinizcesine laflar ediyormuşsunuz ha?
We are in a teachers’ training school that has evil purposes. The teachers who are being educated in that school are being taught how to make unsuccessful ones from students. So, one by one, teachers are being educated as makers of unsuccessful ones. However, one of those teachers refuses to be maker of unsuccessful ones, in other words, to be made a maker of unsuccessful ones; he talks about and criticizes the school’s stand on the issue. The headmaster who thinks every teacher can be made easily/quickly into a maker of unsuccessful ones gets angry. He invites the teacher to his room and says «You are talking as if you were one of those we can not easily/quickly turn into a maker of unsuccessful ones, huh?»

Other well-known very long Turkish words are:[20]

  • Çekoslovakyalılaştıramadıklarımızdanmışsınızcasına means «As if you are one of those people whom we could not turn into a Czechoslovakian».
  • Afyonkarahisarlılaştırabildiklerimizdenmişsinizcesine means «As if you are one of those we were able to make into someone from Afyonkarahisar». (Afyonkarahisar is a city in Turkey.)

Word formation

Turkish English
Muvaffak Successful
Muvaffakiyet Success
Muvaffakiyetsiz Unsuccessful («without success»)
Muvaffakiyetsizleş(-mek) (To) become unsuccessful
Muvaffakiyetsizleştir(-mek) (To) make one unsuccessful
Muvaffakiyetsizleştirici Maker of unsuccessful ones
Muvaffakiyetsizleştiricileş(-mek) (To) become a maker of unsuccessful ones
Muvaffakiyetsizleştiricileştir(-mek) (To) make one a maker of unsuccessful ones
Muvaffakiyetsizleştiricileştiriver(-) (To) easily/quickly make one a maker of unsuccessful ones
Muvaffakiyetsizleştiricileştiriverebil(-mek) (To) be able to make one easily/quickly a maker of unsuccessful ones
Muvaffakiyetsizleştiricileştiriveremeyebil(-mek) To be able to not make one easily/quickly a maker of unsuccessful ones
Muvaffakiyetsizleştiricileştiriveremeyebilecek One who is not able to make one easily/quickly a maker of unsuccessful ones
Muvaffakiyetsizleştiricileştiriveremeyebilecekler Those who are not able to make one easily/quickly a maker of unsuccessful ones
Muvaffakiyetsizleştiricileştiriveremeyebileceklerimiz Those whom we cannot make easily/quickly a maker of unsuccessful ones
Muvaffakiyetsizleştiricileştiriveremeyebileceklerimizden From those we can not easily/quickly make a maker of unsuccessful ones
Muvaffakiyetsizleştiricileştiriveremeyebileceklerimizdenmiş (Would be) from those we can not easily/quickly make a maker of unsuccessful ones
Muvaffakiyetsizleştiricileştiriveremeyebileceklerimizdenmişsiniz You would be from those we can not easily/quickly make a maker of unsuccessful ones
Muvaffakiyetsizleştiricileştiriveremeyebileceklerimizdenmişsinizcesine As if you would be from those we can not easily/quickly make a maker of unsuccessful ones
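
The step-by-step build-up in the table above can be reproduced with a short Python sketch. The segmentation into pieces is read off the successive rows of the table and is an assumption made for illustration, not an authoritative morphological analysis; in particular, the negated ability form is appended here in one step («emeyebil»), whereas the table also lists the positive form «-iverebil» as an intermediate row. The name `build_forms` is hypothetical.

```python
import unicodedata

# Pieces appended at each step, as read off the table (illustrative segmentation).
STEPS = [
    "Muvaffak", "iyet", "siz", "leş", "tir", "ici", "leş", "tir", "iver",
    "emeyebil", "ecek", "ler", "imiz", "den", "miş", "siniz", "cesine",
]


def build_forms(steps):
    """Return every intermediate word obtained by appending the pieces in order."""
    forms = []
    word = ""
    for piece in steps:
        word += piece
        # NFC keeps letters such as «ş» as single code points so len() counts letters.
        forms.append(unicodedata.normalize("NFC", word))
    return forms


forms = build_forms(STEPS)
for form in forms:
    print(len(form), form)
```

The last line printed corresponds to the full word, and its length can be compared with the 70 letters cited above.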

Non-agglutinative languages


Afrikaans, as a daughter language of Dutch, is capable of forming compounds of potentially limitless length in the same way as Dutch. According to the Total Book of South African Records, the longest word in the language is[21] Tweedehandsemotorverkoopsmannevakbondstakingsvergaderingsameroeperstoespraakskrywerspersverklaringuitreikingsmediakonferensieaankondiging (136 letters), which means «issuable media conference’s announcement at a press release regarding the convener’s speech at a secondhand car dealership union’s strike meeting». This word, however, is contrived to be long and does not occur in everyday speech or writing.


Currently, the longest word in Arabic is the 15-letter-long word أَفَإِستَسقَينَاكُمُوها,[22] which means «Did we ask you to let us drink it?» However, according to some online sources, the 16-letter-long word أَفَإِستَسقَينَاكُمُوهما, meaning «Did we ask you to let us drink both of them?», is the longest word in Arabic. Official sources supporting this claim cannot be found.


The Bulgarian online etymological dictionary claims that the longest word in Bulgarian is the 39-letter-long непротивоконституционствувателствувайте (neprotivokonstitutsionstvuvatelstvuvayte), introduced in the Constitution of Bulgaria of 1947 (Dimitrov Constitution).[23] The word means «do not perform actions against the constitution» (addressed to more than one person).


The longest word in Catalan is considered to be Anticonstitucionalment, an adverb meaning «[done in a way that is] against the constitution». However, the scientific word Psiconeuroimmunoendocrinologia, related to endocrinology, has been proposed by the University of Barcelona as the true longest word.[24]


The longest known word in Croatian is prijestolonasljednikovičičinima,[25] meaning «to those who belong to the throne successor’s little wife.» The 31-letter word is the dative case of prijestolonasljednikovičica «the throne successor’s little wife» which is the diminutive of prijestolonasljednikovica «the throne successor’s wife.»


Traditionally, the word nejneobhospodařovávatelnější («of the least cultivable», 28 letters) is considered as the longest Czech word, but there are some longer artificial words. Most of them are compound adjectives in dative, instrumental or other grammatical case and derived from the iterative or frequentative verbal form or the ability adjective form (like «-able»).

  • Nejnezdevětadevadesáteroroznásobovávatelnějšími (47; Instrumental case of the ones least multipliable by a group of ninety-nine on a regular basis)
  • Nejnezdevětadevadesáteroroznásobovávatelnější (Those who are the least multipliable by a group of ninety-nine on a regular basis)
  • Nejzdevětadevadesáteroroznásobovávatelnější (Those who are the most multipliable by a group of ninety-nine on a regular basis)
  • Zdevětadevadesáteroroznásobovávatelnější (Those who are more multipliable by a group of ninety-nine on a regular basis)
  • Zdevětadevadesáteroroznásobovávatelní (Those who are multipliable by a group of ninety-nine on a regular basis)
  • Zdevětadevadesáteroroznásobovávat (Alternative of «multiply out by a group of ninety-nine on a regular basis»)
  • Zdevětadevadesáteroroznásobovat (Multiply out by a group of ninety-nine on a regular basis — continuous grammatical aspect)
  • Zdevětadevadesáteroznásobovat (Multiply by ninety-nine on a regular basis – continuous grammatical aspect)
  • Zdevětadevadesáteroznásobit (Multiply by a group of ninety-nine once)
  • Zdevětadevadesáteronásobit (Multiply by a group of ninety-nine)
  • Devětadevadesátero (A group of ninety-nine)
  • Devětadevadesát (Inverse of devadesát devět = ninety-nine)


Danish, like many Germanic languages, is capable of compounding words to create ad hoc compounds of potentially limitless length. Nevertheless, the constructed word speciallægepraksisplanlægningsstabiliseringsperiode – which means «a period of stabilising the planning of a specialist doctor’s practice» – was cited in 1993 by the Danish version of the Guinness Book of World Records as the longest word in the Danish language at 51 letters long. However, a Google search finds no text that actually uses this word, except in discussions of the longest Danish word.


Dutch, like many Germanic languages, is capable of forming compounds of potentially limitless length. The 53-letter word Kinder­carnavals­optocht­voorbereidings­werkzaamheden­plan, meaning «preparation activities plan for a children’s carnival procession», was cited by the 1996 Guinness Book of World Records as the longest Dutch word.[26]

The longest word in the authoritative Van Dale Dutch dictionary (2009 edition) in plural form is meervoudige­persoonlijkheids­stoornissen;[27] 38 letters long, meaning «multiple personality disorders». The entry in the dictionary however is in the singular, counting 35 letters.

The free OpenTaal dictionary,[28] which has been certified by the Dutch Language Union (the official Dutch language institute) and is included in many open-source applications, contains the following longest words, the first two of which are 40 letters long:

  • vervoerders­aansprakelijkheids­verzekering, «carriers’ liability insurance»;
  • bestuurders­aansprakelijkheids­verzekering, «drivers’ liability insurance»;
  • overeenstemmings­beoordelings­procedures, «conformity assessment procedures» (38 letters)

The word often said to be the longest in Dutch – probably because of its funny meaning and alliteration – which has also appeared in print, is Hottentotten­soldaten­tenten­tentoonstellings­bouwterrein («construction ground for the Hottentot soldiers’ tents exhibition»); counting 53 letters.


The 45-letter word pneumono­ultra­microscopic­silico­volcano­coni­osis is the longest English word that appears in a major dictionary.[29][30] Originally coined to become a candidate for the longest word in English, the term eventually developed some independent use in medicine.[31] It is referred to as «P45» by researchers.[32]

The 30-letter word pseudopseudohypoparathyroidism refers to an inherited disorder,[33] named for its similarity to pseudohypoparathyroidism in presentation, which is in turn named for its similarity to hypoparathyroidism. This is the longest word that was not contrived with the sole intention of becoming the longest word.[34]

Flocci­nauci­nihili­pili­fication, at 29 letters and meaning the act of estimating something as being worth so little as to be practically valueless, or the habit of doing so, is the longest non-technical, coined word in Oxford Dictionaries of the English language.[29]

Anti­dis­establishment­arian­ism, at 28 letters, is the longest non-coined, non-systematic English word in Oxford Dictionaries.[29] It refers to a 19th-century political movement that opposed the disestablishment of the Church of England as the state church of England.



In German, whole numbers (smaller than 1 million) can be expressed as single words, which makes sieben­hundert­sieben­und­siebzig­tausend­sieben­hundert­sieben­und­siebzig (777,777) a 65 letter word. In combination with -malig or, as an inflected noun, (des …) -maligen, all numbers can be written as one word. A 79 letter word, Donau­dampf­schiffahrts­elektrizitäten­haupt­betriebs­werk­bau­unter­beamten­gesellschaft, was named the longest published word in the German language by the 1972 Guinness Book of World Records, but longer words are possible. The word was the name of a prewar Viennese club for subordinate officials of the headquarters of the electrical division of the company named the Donau­dampf­schiffahrts­gesellschaft, «Danube steam boat operation company».
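
The rule that whole numbers below one million are written as a single word can be sketched in Python. This is a minimal illustration assuming standard German spelling of numerals («ein» inside compounds, «eins» only in final position, units before tens joined by «und»); the function names are hypothetical and the sketch covers only 1 to 999,999.

```python
ONES = ["", "ein", "zwei", "drei", "vier", "fünf", "sechs", "sieben", "acht", "neun"]
TEENS = ["zehn", "elf", "zwölf", "dreizehn", "vierzehn", "fünfzehn",
         "sechzehn", "siebzehn", "achtzehn", "neunzehn"]
TENS = ["", "", "zwanzig", "dreißig", "vierzig", "fünfzig",
        "sechzig", "siebzig", "achtzig", "neunzig"]


def below_hundred(n: int, final: bool) -> str:
    """Spell 1..99; a bare 1 is «eins» only at the very end of the number."""
    if n < 10:
        return "eins" if (n == 1 and final) else ONES[n]
    if n < 20:
        return TEENS[n - 10]
    tens, ones = divmod(n, 10)
    # German puts the unit first: 77 -> sieben + und + siebzig.
    return (ONES[ones] + "und" if ones else "") + TENS[tens]


def below_thousand(n: int, final: bool) -> str:
    """Spell 1..999 as one word."""
    hundreds, rest = divmod(n, 100)
    word = ONES[hundreds] + "hundert" if hundreds else ""
    if rest:
        word += below_hundred(rest, final)
    return word


def german_number_word(n: int) -> str:
    """Spell a whole number from 1 to 999,999 as a single German word."""
    if not 0 < n < 1_000_000:
        raise ValueError("only 1..999,999 are supported in this sketch")
    thousands, rest = divmod(n, 1000)
    word = below_thousand(thousands, final=False) + "tausend" if thousands else ""
    if rest:
        word += below_thousand(rest, final=True)
    return word


word = german_number_word(777_777)
print(len(word), word)
```

Calling `german_number_word(777_777)` produces the word quoted above, so its letter count can be checked directly with `len`.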

The longest word that was not created artificially as a longest-word record seems to be Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz at 63 letters. The word means «law delegating beef label monitoring». In 2013 it was removed from the books because European Union regulations had changed and that particular law had become obsolete, leading to news reports that Germany «had lost its longest word».[35]

In December 2016 the 51-letter word Bundes­präsidenten­stichwahl­wiederholungs­verschiebung («deferral of the second iteration of the federal presidential run-off election») was elected the Austrian Word of the Year 2016.[36] The jury called it a «descriptive word» which «in terms of its content as well as its length, is a symbol and an ironic form of commentary for the political events of this year, characterized by the very long campaign for the presidential election, the challenges of the voting process, and its reiteration.»[36][37]


In his comedy Assemblywomen (c. 392 BC), Aristophanes coined the 182-letter word λοπαδο­τεμαχο­σελαχο­γαλεο­κρανιο­λειψανο­δριμ­υπο­τριμματο­σιλφιο­καραβο­μελιτο­κατακεχυ­μενο­κιχλ­επι­κοσσυφο­φαττο­περιστερ­αλεκτρυον­οπτο­κεφαλλιο­κιγκλο­πελειο­λαγῳο­σιραιο­βαφη­τραγανο­πτερύγων (Lopado­temacho­selacho­galeo­kranio­leipsano­drim­hypo­trimmato­silphio­karabo­melito­katakechy­meno­kichl­epi­kossypho­phatto­perister­alektryon­opte­kephallio­kigklo­peleio­lagoio­siraio­baphe­tragano­pterygon), a fictional food dish consisting of a combination of fish and other meat. The word is cited as the longest ancient Greek word ever written.[38]
A modern Greek word of 22 letters is ηλεκτροεγκεφαλογράφημα (ilektroenkefalográfima) (gen. ηλεκτροεγκεφαλογραφήματος (ilektroenkefalografímatos), 25 letters) meaning «electroencephalogram».


The longest Hebrew word is the 19-letter (including vowels) וכשלאנציקלופדיותינו (u’chshelentsiklopediotenu),[39] which means «And when to our encyclopedias…». The Hebrew word אנציקלופדיה (encyclopedia) is of European origin.

The longest word in Hebrew that does not originate from another language is וכשלהתמרמרויותינו (u’chshelehitmarmeruyotenu), which roughly means «And when, to our resentments/grievances».

The 11-letter (including vowels) וְהָאֲחַשְׁדַּרְפְּנִים (veha’aḥashdarpením) is the longest word to appear in the Hebrew Bible. Its meaning is «And the satraps». It also does not originate from Hebrew.[citation needed]

Other very long Hebrew words include:

  • וכשבהשתעשעויותיהם (u’chshebehishta’ashuateyhem) meaning: «And when they were having fun» or «And while in their playfulness».


Hindi has a finite list of compound words which are based on established grammatical rules of the language. The word commonly cited as the longest in Hindi is लौहपथगामिनीसूचकदर्शकहरितताम्रलौहपट्टिका (lauhpathagāminīsūchakdarshkaharitatāmralauhpaṭṭikā), which consists of 24 consonants and 10 vowel diacritics, making up a total of 34 characters. The word literally means «a green railway warning signboard made of copper-iron». Its plural would be लौहपथगामिनीसूचकदर्शकहरितताम्रलौहपट्टिकाएँ (lauhpathagāminīsūchakdarshkaharitatāmralauhpaṭṭikāẽ), which has an additional vowel and a diacritic. It is a neologism and not in common use.[40]

A much shorter word borrowed from Sanskrit, which is in common use and is also often cited as the longest word, is किंकर्तव्यविमूढ़ (kinkartavyavimūṛh). It consists of 8 consonants and 5 vowel diacritics, making up a total of 13 characters. The word literally means «confused about what to do», that is, bewildered or flabbergasted.


Icelandic has the ability to form compounds of arbitrary length by stringing together genitives (eignarfallssamsetning), so no single words of maximal length exist in the language. However, vaðlaheiðarvegavinnuverkfærageymsluskúr and vaðlaheiðarvegavinnuverkfærageymsluskúraútidyralyklakippuhringur are sometimes cited as particularly long words;[41] the latter has 64 letters and means «a keychain ring for the outdoor key of road workers shed in a moor called Vaðlaheiði».

Analysis of a corpus of contemporary Icelandic texts by Uwe Quasthoff, Sabine Fiedler and Erla Hallsteinsdóttir identified Alþjóðaflutningaverkamannasambandsins («of the International Transport Workers’ Federation»; 37 letters) and Norðvestur-Atlantshafsfiskveiðistofnunarinnar («of the Northwest Atlantic Fisheries’ Organization»; 45 letters) as the longest unhyphenated and hyphenated words.[42]

The longest word occurring at least twice in the University of Leipzig isl-is_web_2015 corpus is Auðmannastjórnvaldaembættisstjórnmálaverkalýðsverðlausraverðbréfaábyrgðarlausrakvótaræningjaaftaníossaspilling (110 letters).[43]


Indonesian is an Austronesian language. According to the Kamus Besar Bahasa Indonesia, its longest words are mempertanggungjawabkan (22 letters), meaning «to take responsibility», and heksakosioiheksekontaheksafobia (31 letters), meaning «hexacosioihexecontahexaphobia».[44]


The longest non-compound word in Irish is grianghrafadóireacht, a 20-letter-long word meaning «photography».[45]


The longest word in Italian is traditionally precipitevolissimevolmente, a 26-letter adverb.[46] It is formed by successively adding suffixes to the original root:

  1. precipitevole: «hasty»;
  2. precipitevolissimo: «very hasty»;
  3. precipitevolissimevole: «[of someone/something] that acts very hastily», (not grammatically correct[citation needed]);
  4. precipitevolissimevolmente: «in a way like someone/something that acts very hastily» (not grammatically correct, but nowadays part of the language).

The word is never used in everyday language, only in jokes. Nevertheless, it is an official part of the Italian language; it was coined in 1677 by the poet Francesco Moneti:

perché alla terra alfin torna repente / precipitevolissimevolmente

— Francesco Moneti, Cortona Convertita, canto III, LXV

The word technically violates Italian grammar rules, the correct form being precipitevolissimamente, which is three letters and one syllable shorter. The poet coined the new word to have 11 syllables in the second verse.

Other words can be created with a similar (and grammatically correct) mechanism starting from a longer root, winding up with a longer word. Some examples are:

  • sovramagnificentissimamente (cited by Dante Alighieri in De vulgari eloquentia), 27 letters, «in a way that is more than magnificent by far» (archaic);[47]
  • incontrovertibilissimamente, 27 letters, «in a way that is very difficult to falsify»;
  • particolareggiatissimamente, 27 letters, «in an extremely detailed way»;
  • anticostituzionalissimamente, 28 letters, «in a way that strongly violates the constitution».

The longest accepted neologism is psiconeuroendocrinoimmunologia (30 letters).[citation needed]

Other long words are:

  • nonilfenossipolietilenossietonolo (33 letters — chemical)
  • pentagonododecaedrotetraedrico (30 letters — 3D geometric figure)
  • esofagodermatodigiunoplastica (29 letters — surgery)
  • elettroencefalograficamente (27 letters — medical adverb: electroencephalographically)
  • diclorodifeniltricloroetano (27 letters — chemical: DDT)


Láadan is not agglutinating, as there is no mechanism to combine arbitrary words into one without intervening grammatical mechanisms (such as a relativizer); however, there are a number of affixes that further elucidate the contextual meaning of a word. These are ignored when determining the longest words in the language. The primary reference for vocabulary is the 3rd edition of the official dictionary and grammar.

  • oshetham éelenethilethu, 22 letters not counting the space, or 17 phonemes (since for example ée is a toneme of e, and th is a separate sound from *t or *h separately—the asterisks indicate that neither sound exists in Láadan) — a set phrase for a wreath of grapevine, a common symbol of the language[48]
  • shineshidethóo, 14 letters or 10 phonemes — an invited guest[49]


The longest attested word in Classical Latin is subductisupercilicarptor, which was coined by the obscure poet Laevius in the 1st century. In Medieval Latin, the longest known word is honorificabilitudinitas, which was first attested in a treatise written by the 8th-century grammarian Peter of Pisa. Both words can be lengthened further by putting them in the dative plural, which gives subductisupercilicarptoribus and honorificabilitudinitatibus respectively.[citation needed]


The longest Lithuanian word is 40 letters long:

  • nebeprisikiškiakopūstlapiaujančiuosiuose — «in those, of masculine gender, who aren’t gathering enough wood sorrel’s leaves by themselves anymore.» — the plural locative case of past iterative active participle of verb kiškiakopūstlapiauti meaning «to pick wood-sorrels’ leaves» (leaves of edible forest plant with sour taste, word by word translation «rabbit cabbage»). The word is attributed to software developer / writer Andrius Stašauskas.[50][unreliable source?][51][unreliable source?]


The Māori-language 85-letter place name Taumata­whakatangihanga­koauau­o­tamatea­turi­pukaka­piki­maunga­horo­nuku­pokai­whenua­ki­tana­tahu is the longest place name in English-speaking countries and second longest in the world, according to Wises New Zealand Guide and The New Zealand Herald.[52]


Very long Polish words can be created as adjectives from numerals and nouns. For example, Dziewięćsetdziewięćdziesięciodziewięcionarodowościowego, 54 letters, is the genitive singular form of an adjective meaning roughly «of nine-hundred and ninety-nine nationalities». Similar words are rather artificial compounds, constructed within allowed grammar rules, but are seldom used in spoken language, although they are not nonsense words.[citation needed] It is possible to make even longer words in this way, for example:

Dziewięćsetdziewięćdziesiątdziewięćmiliardówdziewięćsetdziewięćdziesiątdziewięćmilionów-dziewięćsetdziewięćdziesiątdziewięćtysięcydziewięćsetdziewięćdziesięciodziewięcioletniego (176 letters, meaning «of 999,999,999,999 years old»).

One of the longest common words is the 31-letter dziewięćdziesięciokilkuletniemu, the dative singular form of «ninety-and-some years old one». Other known long words are konstantynopolitańczykowianeczka[citation needed] (32 letters), «a daughter of a man who lives in Constantinople», and pięćdziesięciogroszówka (23 letters), «a 50 groszy coin».[53]


The longest Romanian word is pneumonoultramicroscopicsilicovolcaniconioză, with 44 letters,[54] but the longest one admitted by the Dicționarul explicativ al limbii române («Explanatory Dictionary of the Romanian Language», DEX) is electroglotospectrografie, with 25 letters.[55][56]


Most likely one of the longest Russian words is a chemical term, тетра­гидро­пиранил­цикло­пентил­тетра­гидро­пиридо­пириди­новая (tetra­gidro­piranil­ciklo­pentil­tetra­gidro­pirido­piridi­novaya), which contains 55 letters. It was used in Russian patent RU2285004C2 (granted and published in 2006). This word is an adjective that can describe e.g. a chemical formula. As a noun, it is without the last 4 letters.

Another one is превысоко­много­рассмотрительствующий (prevysoko­mnogo­rassmotritel’stvuyushchiy), which contains 35 letters. It is an adjective in the bureaucratic language of the 19th century «meaning a very polite form of addressing clerks, something like Your Excellency, Your Highness, Your Majesty all together» (Guinness World Records 2003[citation needed]). Its dative singular form, превысоко­много­рассмотрительствующему (prevysoko­mnogo­rassmotritel’stvuyushchemu, with 36 letters) can be an example of excessively official vocabulary of the 19th century.

Numeral compounds can be long as well, such as Тысячево­сьмисот­восьми­десяти­девяти­микро­метровый (Tysyachevo­s’misot­vos’mi­desyati­devyati­mikro­metrovyy), which is an adjective containing 46 letters, meaning «1889-micrometers long».[57]


Sanskrit allows word compounding of arbitrary length. Nouns and verbs can be expressed in a sentence.[citation needed]

The longest sentence ever used in Sanskrit literature is (in Devanagari):


In IAST transliteration:


from the Varadāmbikā Pariṇaya Campū by Tirumalāmbā,[58] composed of 195 Sanskrit letters (428 letters in the roman transliteration, dashes excluded), thus making it the longest word ever to appear in worldwide literature.[59][60]

Each hyphen separates the individual words that make up this compound.

The approximate meaning of this word is:

«In it, the distress, caused by thirst, to travellers, was alleviated by clusters of rays of the bright eyes of the girls; the rays that were shaming the currents of light, sweet and cold water charged with the strong fragrance of cardamom, clove, saffron, camphor and musk and flowing out of the pitchers (held in) the lotus-like hands of maidens (seated in) the beautiful water-sheds, made of the thick roots of vetiver mixed with marjoram, (and built near) the foot, covered with heaps of couch-like soft sand, of the clusters of newly sprouting mango trees, which constantly darkened the intermediate space of the quarters, and which looked all the more charming on account of the trickling drops of the floral juice, which thus caused the delusion of a row of thick rainy clouds, densely filled with abundant nectar.»


Traditionally, the word najneobhospodarovávateľnejšieho («of the least cultivable», 31 letters) is considered the longest Slovak word, but there are some longer artificial words. Most of them are compound adjectives in the dative, instrumental or another grammatical case, derived from the iterative or frequentative verbal form or from the ability adjective form (like -able).[61][62]

Artificial words, lexically valid but never used in language:

  • znajneprekryštalizovávateľnejšievajúcimi, 40 letters, «through the least crystallised ones»
  • znajnepreinternacionalizovateľnejšievať, 39 letters
  • najnezrevolucionalizovateľnejšiemu, 34 letters [63]
  • najnerozkrasokorčuľovateľnejšieho, 33 letters

Artificial words using Slovak towns or places, lexically valid but never used in language:

  • znajneprehornádskodružstevnianskovávateľnejšievajúcimi, 54 letters
  • znajneprechminianskojakubovianskovávateľnejšievajúcimi, 54 letters


  • deväťstodeväťdesiatdeväťtisícštyristodeväťdesiatdeväť, 53 letters, «999499» [64]
  • sedemstodeväťdesiatsedemtisícsedemstodeväťdesiatsedem, 53 letters, «797797» [65]


The longest word in Spanish is esternocleidomastoideitis (inflammation of the sternocleidomastoid muscle, 25 letters).[66] Runners-up are electroencefalografistas (specialists who take electrical scans of the brain, i.e. electroencephalographists; 24 letters) and anticonstitucionalmente ([proceeding in a manner that is] contrary to the constitution; 23 letters).

The word anticonstitucionalmente is usually considered the longest word in general use. This word can be made even longer by the addition of the absolute superlative suffix, rendering anticonstitucionalísimamente (i.e.: «very strongly against the constitution»). Some dictionaries (but not the RAE dictionary[67]) removed its root word (anticonstitucional) in 2005, causing comments about it not «being a valid word anymore» and suggesting the use of inconstitucional as a replacement.[citation needed]


Realisationsvinstbeskattning (28 letters) is the longest word in Svenska Akademiens Ordlista. It means «capital gains taxation», and is usually shortened to Reavinstskatt (same meaning).
However, Swedish grammar makes it possible to create arbitrarily long words. One such word is Spårvagnsaktiebolagsskensmutsskjutarefackföreningspersonalbeklädnadsmagasinsförrådsförvaltarens (94 letters), which means «[belonging to] the manager of the depot for the supply of uniforms to the personnel of the track cleaners’ union of the tramway company».[68]

Toki Pona

kijetesantakalu in the Toki Pona writing system sitelen pona

The longest word in Toki Pona is kijetesantakalu (15 letters), which was proposed in 2009 as an April Fools’ joke by the language’s creator Sonja Lang as a word for any animal of the Procyonidae family, which includes raccoons and related species.[69] The word has since entered into common use, and it has become common to define kijetesantakalu more broadly as any animal from the Musteloidea superfamily.[70] In 2019 James Flear designed a glyph for kijetesantakalu in Toki Pona’s sitelen pona writing system, which has become a popular icon within the Toki Pona community.[71]

As a minimalistic isolating constructed language, most words in Toki Pona are much shorter, the median being 4 letters. The longest words featured in the 2014 book Toki Pona: The Language of Good, Lang’s first official Toki Pona publication, are the 7-letter words kepeken («to use, by means of») and sitelen («symbol, picture»). The list of proposed country names in the same book also mentions ma Papuwanijukini («Papua New Guinea»), which includes a 14-letter proper adjective.[72]


Vietnamese is an isolating language, which naturally limits the length of a morpheme. The longest, at seven letters, is nghiêng, which means «inclined» or «to lean».[73] This is the longest word that can be written without a space. However, not all words in Vietnamese are single morphemes. Indeed, nghiêng can be reduplicated as nghiêng nghiêng.

The written language abounds with compound words in which each constituent word is delimited by spaces, just like any freestanding word. Moreover, the grammar lacks inflection to mark parts of speech, and prepositions are often optional. Therefore, the boundary between a word and a phrase is poorly defined.[74] Examples of this ambiguity include:

  • Chủ nghĩa phân biệt chủng tộc («racism»), which is composed of the words chủ nghĩa («ideology»), phân biệt («discriminate»), and chủng tộc («race»)
  • Cơm gà xào sả ớt, which literally describes a dish of grilled chicken sauteed with lemongrass and peppers on rice
  • Ông bà anh chị em, a polite pronoun composed of five kinship terms

Unlike locally coined compound words, compound words in Sino-Vietnamese vocabulary are less ambiguous, because of the use of premodifiers (as in English) as opposed to the native postmodifiers. Long Sino-Vietnamese words include bách khoa toàn thư («encyclopedia») and thủy động lực học («hydrodynamics»).

Loanwords and pronunciation respellings from other languages can also result in long words. For example, «consortium» is côngxoocxiom (12 letters), and «Indonesia» may be left as-is or spelled In-đô-nê-xi-a (13 counting hyphens).[75] The Encyclopedic Dictionary of Vietnam systematically respells foreign names, introducing long names into an official Vietnamese lexicon:

  • Kômixacjepxkaia («Komissarzhevskaya», 15 letters)[76]
  • Rôjơđextơvenxki («Rozhdestvensky», 15 letters)[77]
  • Mêtơrôpôliten Ôpêra («Metropolitan Opera», 18 letters)[78]

Long initialisms in Vietnamese include:

  • CHXHCNVN (Cộng hòa Xã hội chủ nghĩa Việt Nam, «Socialist Republic of Vietnam», 8 characters)
  • MTDTGPMNVN (Mặt trận Dân tộc Giải phóng miền Nam Việt Nam, «Viet Cong», 10 characters)

In modern Vietnamese, compound words can be identified fairly easily within title cased text: a morpheme that begins with a capital letter followed by one or more morphemes that begin with a lowercase letter. For example, xã hội chủ nghĩa («socialism») is capitalized as one component within Cộng hòa Xã hội chủ nghĩa Việt Nam.
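The capitalization rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name is hypothetical, the input is assumed to be title-cased text with space-separated morphemes, and punctuation is not handled.

```python
from typing import List


def split_title_cased_compounds(text: str) -> List[str]:
    """Group space-separated morphemes into compounds.

    A new group starts at every morpheme that begins with a capital letter;
    following lowercase-initial morphemes are attached to that group.
    (Hypothetical helper written for illustration only.)
    """
    groups: List[List[str]] = []
    for token in text.split():
        if token[0].isupper() or not groups:
            groups.append([token])
        else:
            groups[-1].append(token)
    return [" ".join(group) for group in groups]


print(split_title_cased_compounds("Cộng hòa Xã hội chủ nghĩa Việt Nam"))
# ['Cộng hòa', 'Xã hội chủ nghĩa', 'Việt', 'Nam']
```

The sketch recovers Xã hội chủ nghĩa as one component. It splits Việt Nam in two because both morphemes of the proper name are capitalized, so the heuristic only applies to compounds written with a single leading capital.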


Llanfair­pwllgwyngyll­gogery­chwyrn­drobwll­llan­tysilio­gogo­goch, a railway station on the island of Anglesey in Wales, is the longest place name in the Welsh language. At 51 letters in the Welsh alphabet (the digraphs ll and ch are each collated as single letters) the name can be translated as «St Mary’s church in the hollow of the white hazel near to the rapid whirlpool and the church of St Tysilio of the red cave». However, it was artificially contrived in the 1860s as a publicity stunt, to give the station the longest name of any railway station in the United Kingdom.
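The difference between the character count and the Welsh letter count can be checked with a small Python sketch. The function name is hypothetical, and only the two digraphs named above (ll and ch) are collapsed. The digraph ng is deliberately left out, because a simple left-to-right match would wrongly merge the separate n and g in «gwyngyll».

```python
# Sketch: count letters when certain digraphs are collated as single letters.
# Only 'll' and 'ch' are treated as digraphs here, as stated in the text;
# the function name and default digraph set are illustrative assumptions.
WELSH_DIGRAPHS = ("ll", "ch")


def welsh_letter_count(word: str, digraphs=WELSH_DIGRAPHS) -> int:
    # Keep alphabetic characters only, dropping hyphens and soft hyphens.
    text = "".join(ch for ch in word.lower() if ch.isalpha())
    count = 0
    i = 0
    while i < len(text):
        # Consume two characters when they form a digraph, otherwise one.
        i += 2 if text[i:i + 2] in digraphs else 1
        count += 1
    return count


name = "Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch"
print(len(name), welsh_letter_count(name))  # 58 51
```

The name has 58 characters but 51 Welsh letters, since it contains five occurrences of ll and two of ch.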

Long words are comparatively rare in Welsh. Candidates for long words other than proper nouns include the following (the digraph dd is also treated as a single letter, as is ng in many instances including in the last word below):

  • gwrthddatgysylltiadaeth (antidisestablishmentarianism)
  • microgyfrifiaduron (microcomputers)
  • gwrthgyfansoddiaethwyr (anticonstitutionalists)
  • lled-ddargludyddion (semiconductors)
  • tra-arglwyddiaethasant (they tyrannised)
  • cyfrwngddarostynedigaeth (intercession)[79] (-au can be added to form the plural, and the word can be further lengthened slightly by initial mutation: fy nghyfrwngddarostynedigaethau, «my intercessions»)

See also

  • Morphology (linguistics)
  • Longest English sentence
  • Coxeter group — mathematical concept whose entities are sometimes called words


  1. ^ McCulloch S. «Longest word in English». Sarah Archived from the original on 14 January 2010. Retrieved 12 October 2016.
  2. ^ Oxford Word and Language Service team. «Ask the experts — What is the longest English word?». Oxford University Press. Archived from the original on 13 September 2008. Retrieved 13 January 2008.
  3. ^ (in Basque) Iñaki Arranz, Hitza azti, Alberdania, 2006, 283 pages. (Zein da euskal hitzik luzeena?)
  4. ^ Jordan, David K. (1 July 1999). «Chapter 4 (Part 1): Nouns». Being colloquial in Esperanto: a reference guide. Esperanto League for North Amer. ISBN 9780939785049. The last, «silly» line is the same as the «wrong» one, but it is technically possible because it is a single noun.
  5. ^ Wennergren, Bertilo. «PMEG – Precizigaj antaŭelementoj – Kombinoj el kombinoj». Plena Manlibro de Esperanta Gramatiko. Retrieved 7 March 2022.
  6. ^ «Akademia Vortaro». Akademio de Esperanto. Archived from the original on 24 July 2011. Retrieved 30 November 2009.
  7. ^ Gonçalo Neves (1997). «Bontone pri la bretona». Monato. Retrieved 7 March 2022.
  8. ^ «Estonian / Lingvopedia ::». Retrieved 20 April 2020.
  9. ^ Appears on page 97 in Laaksonen, Lasse: Viina, hermot ja rangaistukset — sotilasjohdon henkilökohtaiset ongelmat 1918-1945. Docendo, Helsinki 2017.
  10. ^ «Suupohjan peruspalveluliikelaitoskuntayhtymä – LLKY».
  11. ^ Karilas, Yrjö: Antero Vipunen, arvoitusten ja ongelmien, leikkien ja pelien sekä eri harrastelualojen pikkujättiläinen, p. 226, 20th edition. WSOY 2003. ISBN 9510121770
  12. ^ 139. point [] in the Hungarian Academy of Sciences: Rules of Hungarian Orthography
  13. ^ See at the end of the entry megszentségtelenít in a monolingual dictionary of Hungarian
  14. ^ «청자양인각연당초상감모란문은구대접». Naver Dictionary. Retrieved 6 August 2015.
  15. ^ «독일에서 가장 긴 단어 사라진다» [Longest word in Germany disappears]. JoongAng Ilbo. 4 June 2013. Retrieved 6 August 2015.[permanent dead link]
  16. ^ «Grammar Pro» Archived 13 December 2017 at the Wayback Machine, a page of the collaborative Anishinaabe language revitalization effort
  17. ^ «PUTANGINA». TAGALOG LANG. 30 December 2015. Retrieved 25 April 2018.
  18. ^ «Yeni Mesaj Internet Sitesi». Archived from the original on 18 July 2011.
  19. ^ «Papatyam Forum». Archived from the original on 27 July 2011.
  20. ^ «Çekoslavakyalılaştıramadıklarımızdan mısınız? TDK’ye Göre Doğru Yazılışı — Çekoslavakyalılaştıramadıklarımızdan mısınız? Doğru Yazımı Nasıldır?». 23 December 2016.
  21. ^ Rosenthal, Eric (1982). Total Book of South African records. Delta Books. p. 61. ISBN 0-908387-19-9.
  22. ^ الكاش, علي (9 January 2021). الصوفية والصفوية، خصائص وأهداف مشتركة (First ed.). البُرهان. p. 195.
  23. ^ «непротивоконституционствувателствувайте». Retrieved 28 October 2013.
  24. ^ «Psiconeuroimmunoendocrinologia: la paraula més llarga de la UB? – Vocabulària». (in Catalan). Retrieved 18 November 2017.
  25. ^ Jeste li znali da najdulja hrvatska riječ ima 31 slovo?, Dalmacija News, 22 February 2014.
  26. ^ «A Collection of Word Oddities and Trivia». Archived from the original on 27 April 2009. Retrieved 7 March 2009.
  27. ^ «Wat is het langste woord in het Nederlands».
  28. ^ «Welkom bij OpenTaal».
  29. ^ a b c «What is the longest English word?» (
  30. ^ «pneumono­ultra­microscopic­silico­volcano­coni­osis definition». Retrieved 7 March 2009.
  31. ^ «PNEUMONO­ULTRA­MICROSCOPIC­SILICO­VOLCANO­CONI­OSIS». Archived from the original on 8 June 2009. Retrieved 7 March 2009.
  32. ^ «BBC – h2g2 – Pneumono­ultra­microscopic­silico­volcano­coni­osis – The Longest Word». BBC. Retrieved 7 March 2009.
  33. ^ «Pseudopseudohypoparathyroidism | Genetic and Rare Diseases Information Center (GARD) – an NCATS Program». Retrieved 31 January 2017.
  34. ^ «What is the longest English word?». AskOxford. Archived from the original on 22 October 2008. Retrieved 22 August 2010.
  35. ^ «Law change spells end for Germany’s longest word». Associated Press. 4 June 2013.
  36. ^ a b Austria chooses its Word of the Year, The Local, 9 Dec. 2016.
  37. ^ Presseerklärung der Jury zur Wahl des Österreichischen Worts des Jahres, Forschungsstelle Österreichisches Deutsch, 9 Dec. 2016
  38. ^ De Luca, Kenneth M. (2005). Aristophanes’ male and female revolutions : a reading of Aristophanes’ Knights and Assemblywomen. Lanham, MD: Lexington Books. p. 124. ISBN 978-0-7391-0833-8.
  39. ^ «Longest word in hebrew | Hebrew language | Preply». Retrieved 27 May 2020.
  40. ^ «हिंदी भाषा का अब तक निर्मित किया गया सबसे बड़ा शब्द है?». Upto Cricket (in Hindi). Retrieved 28 February 2021.
  41. ^ Helgason, Haukur Már. «Hvernig hljóðar lengsta orð í heimi á íslensku?». Vísindavefurinn. University of Iceland. Retrieved 28 December 2013.
  42. ^ Quasthoff, Uwe; Fiedler, Sabine; Hallsteinsdóttir, Erla, eds. (14 May 2012). Frequency Dictionary Icelandic / Íslensk tíðniorðabók. Leipziger Universitätsverlag. ISBN 978-3-86583-656-4. OCLC 808247819.
  43. ^[dead link]
  44. ^ «3 kata terpanjang dalam KBBI».
  45. ^ «Foclóir Gaeilge–Béarla (Ó Dónaill): grianghrafadóireacht». Retrieved 13 January 2022.
  46. ^ Crusca, Accademia Della (1829). «Dizionario della lingua italiana …»
  47. ^ «Dante: De Vulgari Eloquentia II». Retrieved 22 July 2016.
  48. ^ «Láadan-to-English».
  49. ^ «Láadan to English – Sh». 25 October 2015.
  50. ^ «A Collection of Word Oddities and Trivia».
  51. ^ «Loooooooong words». Archived from the original on 9 May 2016. Retrieved 20 April 2016.
  52. ^ NZPA (11 August 2003). «Nasa turns to Kiwi when it needs expert space advice». New Zealand Herald. Retrieved 28 March 2011. Three years ago, Mr Coleman, a website designer, posted a message on an internet bulletin board about Taumata­whakatangihanga­koauau­o­tamatea­turi­pukaka­piki­maunga­horo­nuku­pokai­whenua­ki­tana­tahu in southern Hawkes Bay. It is the second-longest place name in the world, according to Wises New Zealand Guide.
  53. ^ «pięćdziesięciogroszówka — Słownik SJP».
  54. ^ Bălhuc, Paul (15 January 2017). «Câte litere are cel mai lung cuvânt din limba română și care este singurul termen ce conține toate vocalele». Adevărul (in Romanian).
  55. ^ «Electroglotospectrografie». Dicționarul explicativ al limbii române (in Romanian). Retrieved 10 February 2021.
  56. ^ «Curiozități lingvistice: cele mai lungi cuvinte din limba română». Dicț (in Romanian). Retrieved 10 February 2021.
  57. ^ «Слитное и раздельное написание имён числительных — Агентство переводов Lingvotech».
  58. ^ «Ἡλληνιστεύκοντος». 13 March 2010.
  59. ^ McFarlan, Donald; McWhirter, Norris (1991). Guinness Book of World Records, 1991. ISBN 9780553289541.
  60. ^ «Guinness World Records – Longest word». Retrieved 23 September 2017.
  61. ^ «Aké je najdlhšie slovo v slovenčine?».
  62. ^[bare URL PDF]
  63. ^ «Viete, ktoré slovo slovenského jazyka je najdlhšie?». 23 November 2021.
  64. ^ «Promo/NocVyskumnikov2011/Kviz».
  65. ^ «Najdlhšie slová v slovenčine, aké dokážeme povedať a vysloviť». 8 December 2016.
  66. ^ Roldán Calzado, Juan Luis (2 October 2008). «La palabra más larga». Me la juego a letras (in Spanish). Lulu Press. p. 34. ISBN 978-1-4092-2893-6. Retrieved 15 March 2017 – via Google Books.
  67. ^ «Anticonstitucional | Diccionario de la lengua española».
  68. ^ The Guinness Book of Records 1985. Guinness Books. 1985. p. 89. ISBN 0-85112-419-4.
  69. ^ Sonja Lang. «New official word / Nova oficiala vorto». Retrieved 7 March 2022.
  70. ^ Sonja Lang (2021). Toki Pona Dictionary. ISBN 978-0-9782923-6-2.
  71. ^ «toki pona | toki! After a lot of demand for a sitelen pona glyph for the extinct words «apeja» and «kijetesantakalu» (believe it or not), I’ve d…». 13 July 2019. Retrieved 24 October 2022.
  72. ^ Sonja Lang (2014). Toki Pona: The Language of Good. ISBN 978-0-9782923-0-0.
  73. ^ Phan Ngọc Linh; Phạm Thịnh. ««Lộ» sai sót mới tại CK Đường lên đỉnh Olympia 2012?». Dân Trí. Retrieved 18 October 2013.
  74. ^ Barnes, Leslie (2014). Vietnam and the Colonial Condition of French Literature. University of Nebraska Press. p. 125. ISBN 978-0-8032-66759 – via Google Books. The formal characteristics of Vietnamese compounds are not completely clear, however, and because no obvious graphic boundaries exist to demarcate one word from another, the distinction between word and phrase is often very difficult to discern.
  75. ^ «Thông tin cơ bản về các nước, khu vực và quan hệ với Việt Nam» [Basic information on countries, regions, and relations with Vietnam] (in Vietnamese). Vietnam Ministry of Foreign Affairs.
  76. ^ «Kômixacjepxkaia V. F.». Encyclopedic Dictionary of Vietnam (in Vietnamese). 2005.
  77. ^ «Rôjơđextơvenxki G. N.». Encyclopedic Dictionary of Vietnam (in Vietnamese). 2005.
  78. ^ «Mêtơrôpôliten Ôpêra». Encyclopedic Dictionary of Vietnam (in Vietnamese). 2005.
  79. ^ «LISTSERV 15.5 – WELSH-L Archives».
  • 1
    large number

    Персональный Сократ > large number

  • 2
    large number

    English-Russian base dictionary > large number

  • 3
    large number

    English-Russian big medical dictionary > large number

  • 4
    large number

    Большой англо-русский и русско-английский словарь > large number

  • 5
    large number

    Универсальный англо-русский словарь > large number

  • 6
    large number of radial points in the grid

    Универсальный англо-русский словарь > large number of radial points in the grid

  • 7
    a large number of

    English-Russian base dictionary > a large number of

  • 8
    a large number

    English-Russian grammar dictionary > a large number

  • 9
    a large number of

    Персональный Сократ > a large number of

  • 10
    a large number of

    Большой англо-русский и русско-английский словарь > a large number of

  • 11
    a large number

    Универсальный англо-русский словарь > a large number

  • 12
    a large number of

    Англо-русский универсальный дополнительный практический переводческий словарь И. Мостицкого > a large number of

  • 13
    a large number, a large amount, a great deal

    English-Russian grammar dictionary > a large number, a large amount, a great deal

  • 14
    a large number of people

    Универсальный англо-русский словарь > a large number of people

  • 15
    for a large number of customers

    Универсальный англо-русский словарь > for a large number of customers

  • 16
    he attracted a large number of followers

    Универсальный англо-русский словарь > he attracted a large number of followers

  • 17
    if we could have at our disposal a large number of precise observations we

    Универсальный англо-русский словарь > if we could have at our disposal a large number of precise observations we

  • 18
    legion (A very large number, multitude)

    Универсальный англо-русский словарь > legion (A very large number, multitude)

  • 19
    span a large number of tracks

    Универсальный англо-русский словарь > span a large number of tracks

  • 20
    this work has given impetus to a large number of investigations

    Универсальный англо-русский словарь > this work has given impetus to a large number of investigations


  • Следующая →
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7

См. также в других словарях:

  • large number — index plurality, quantity Burton s Legal Thesaurus. William C. Burton. 2006 …   Law dictionary

  • Large Number Hypothesis — Die Large Number Hypothesis (engl., LNH, deutsch Hypothese der großen Zahlen) ist eine Vermutung in der theoretischen Physik. Sie wurde 1937 von Paul Dirac erhoben und beschäftigt sich mit der seltsamen Häufung von absoluten Verhältnissen in der… …   Deutsch Wikipedia

  • large number — noun a large indefinite number (Freq. 18) a battalion of ants a multitude of TV antennas a plurality of religions • Syn: ↑battalion, ↑multitude, ↑plurality, ↑pack …   Useful english dictionary

  • Large format lens — Large format lenses are photographic optics that provide an image circle large enough to cover large format film or plates. Large format lenses are typically used in large format cameras and view cameras.Photographic optics generally project a… …   Wikipedia

  • large — [ lardʒ ] adjective *** bigger than usual in size: The house had an exceptionally large yard. Large crowds gather each year in St. Peter's Square to see the Pope. A large man with a long ginger beard stood in the doorway. a. used in clothing… …   Usage of the words and phrases in modern English

  • large-scale — large ,scale adjective only before noun * 1. ) involving a large number of people or things, or happening over a large area: The gang is believed to be involved in large scale international drug trafficking. We need to protect the village from… …   Usage of the words and phrases in modern English

  • large attendance — large number of persons present …   English contemporary dictionary

  • Large numbers — This article is about large numbers in the sense of numbers that are significantly larger than those ordinarily used in everyday life, for instance in simple counting or in monetary transactions. The term typically refers to large positive… …   Wikipedia

  • Large deviations theory — In Probability Theory, the Large Deviations Theory concerns the asymptotic behaviour of remote tails of sequences of probability distributions. Some basic ideas of the theory can be tracked back to Laplace and Cramér, although a clear unified… …   Wikipedia

  • number — number1 W1S1 [ˈnʌmbə US bər] n 1 (number) 2 (phone) 3 (in a set/list) 4 (for recognizing somebody/something) 5 (amount) 6 numbers 7 (music) 8 (magazine) 9 have somebody's number 10 black/elegant etc …   Dictionary of contemporary English

  • large — [lɑː(r)dʒ] ♦ larger, largest 1) ADJ GRADED A large thing or person is greater in size than usual or average. The Pike lives mainly in large rivers and lakes… In the largest room about a dozen children and seven adults are sitting on the… …   English dictionary

Which Language Is Richest in Words?

Have you heard language experts say that English has more words than other languages? The claim is made but it’s practically impossible to verify.

Steven Frank, the author of The Pen Commandments, claims that English has 500,000 words, with German having about 135,000 and French fewer than 100,000.

But wait…

A blog post for The Economist agrees that English is rich in vocabulary, but argues that comparisons with other languages can’t be made, for several reasons.

The simplest problem in comparing the size of different languages is inflection.

Do we count “run”, “runs” and “ran” as three separate words? Another problem is multiple meanings. Do we count “run” the verb and “run” the noun as one word or two? What about “run” as in the long run of a play on Broadway? According to a recent NPR article, “run” has at least 645 different meanings!

When counting a language’s words, do we count compounds? Is “every day” one word or two? Are the names of new chemical compounds words? Answering the question, “What is the richest language?” becomes more and more complicated.

Estoy, Estás, Está—One Word or Three?

Some languages inflect much more than English. The Spanish verb “estar” has dozens of forms—estoy, estás, está, “I am,” “you are,” “he is” and so on.

Does that make Spanish richer in word count?

Some languages inflect much less (Chinese is famously ending-free). So, whether we count inflected forms will have a huge influence on final counts.
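To make the effect of that choice concrete, here is a minimal sketch that counts the same handful of tokens two ways: once treating every inflected form as its own word, and once after merging forms into a base form. The `LEMMAS` table and both function names are hypothetical and written by hand for this illustration; they do not come from any real dictionary or library.

```python
# Toy illustration: the same tokens give different "word counts"
# depending on whether inflected forms are merged.

# Hypothetical, hand-written table mapping inflected forms to a base form.
LEMMAS = {
    "runs": "run",
    "ran": "run",
    "estoy": "estar",
    "estás": "estar",
    "está": "estar",
}


def count_forms(tokens):
    """Count every distinct surface form as its own word."""
    return len(set(tokens))


def count_lemmas(tokens):
    """Count distinct words after mapping each form to its base form."""
    return len({LEMMAS.get(token, token) for token in tokens})


tokens = ["run", "runs", "ran", "estoy", "estás", "está"]
print(count_forms(tokens))   # 6
print(count_lemmas(tokens))  # 2
```

Six forms collapse into two entries once inflection is ignored, which is why the counting rule matters more than the raw total.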

Moreover, many languages habitually build long words from short ones.

German is the obvious example; it is a trifle to coin a new compound word for a new situation. For example, is the German Unabhängigkeitserklärung (declaration of independence) one word?

Given the possibilities for compounds, German would quickly outstrip English, with the constant addition of new legitimate German “words”, which Germans would accept without blinking.

A Sentence that Translates as One Word

The Turkish language is similar in this way.

Turkish not only crams words together but does so in ways that make whole, meaningful sentences.

“Were you one of those people whom we could not make into a Czechoslovak?” translates as one word in Turkish.

It is written without spaces, pronounced in one breath, can’t be interrupted with digressions, and so forth.

Counting the Words in the Dictionary

Another way of measuring the vocabulary in a language and comparing counts is by counting the number of words listed in a standard authoritative dictionary in that language.

From a list on Wikipedia, here’s one such comparison. It lists dictionaries considered authoritative or complete, along with the approximate number of total words or headwords each includes.

These figures do not include entries with senses for different word classes (such as noun and adjective) and homographs.

Wikipedia says it’s possible to count the number of entries in a dictionary, but it’s not possible to count the number of words in a language:

Language    Words in the Dictionary
Korean      1,100,373
Japanese    500,000
Italian     260,000
English     171,476
Russian     150,000
Spanish     93,000
Chinese     85,568

Which language has the most words? Maybe it’s English.

The Oxford Dictionary says it’s quite probable that English has more words than most comparable world languages. The reason is historical.

English was originally a Germanic language, related to Dutch and German. English shares much of its grammar and basic vocabulary with those languages.

After the Norman Conquest in 1066, English was hugely influenced by Norman French, which became the language of the ruling class for a considerable period, and by Latin, which was the language of scholarship and of the Church.

Very large numbers of French and Latin words entered the language. According to Oxford, this melding of languages means English has a much larger vocabulary than either the Germanic languages or the members of the Romance language family.

English builds its vocabulary through a willingness to accept foreign words. And because English became an international language, it has absorbed vocabulary from a large number of other sources.

So, which language is richest in words?

Let us ask a different and, we think, more important question:

Does it really matter?

Whatever languages you translate or interpret in—Chinese, Japanese, Russian, sign language, or others—you are bound to have a rich body of words to work with.

But if you want to dig deeper into the subject, check out Part 2 on the Arabic language.

About Interpreters and Translators, Inc.

iTi’s dedicated and experienced team offers a wide range of multilingual solutions for domestic and global corporations in a variety of industries. Do you require translation services to enhance your global marketing and sales initiatives or interpreter services to communicate across languages? We specialize in custom language solutions and work with over 250 languages so regardless of the barrier you face, we will work together in synergy to bridge the gap to ensure success.

Oxford Dictionaries


I have around 5000 files and I need to find words in each of them from a list of 10000 words. My current code uses a (very) long regex to do it, but it’s very slow.

import re

wordlist = [...list of around 10000 english words...]
filelist = [...list of around 5000 filenames...]
# One big alternation that matches any word in the list, ignoring case.
wordlistre = re.compile('|'.join(wordlist), re.IGNORECASE)
discovered = []

for x in filelist:
    with open(x, 'r') as f:
        found = wordlistre.findall(f.read())
    if found:
        # Keep every file that had matches instead of overwriting the result.
        discovered.append([x, found])

This checks files at a rate of around 5 files per second, which is a lot faster than doing it manually, but it’s still very slow. Is there a better way to do this?

asked Apr 15, 2015 at 6:02

Daffy

If you have access to grep on a command line, you can try the following:

grep -i -f wordlist.txt -r DIRECTORY_OF_FILES

You’ll need to create a file wordlist.txt of all the words (one word per line).

Any lines in any of your files that match any of your words will be printed to STDOUT in the following format:

<path/to/file>:<matching line>
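If the word list and the results both live in Python, the two ends of this command can be sketched as follows: one helper writes the list in the one-word-per-line layout, and another groups grep's output lines by file. Both function names are hypothetical, and the parser assumes file paths contain no ':' character, which is a simplification.

```python
from collections import defaultdict


def write_wordlist(words, path):
    """Write one word per line, the layout grep -f reads patterns from."""
    with open(path, "w", encoding="utf-8") as f:
        for word in words:
            f.write(word + "\n")


def group_grep_output(lines):
    """Group '<path>:<matching line>' records by file path.

    Assumption: file paths contain no ':' character, so the first
    colon on each line separates the path from the matching text.
    """
    matches = defaultdict(list)
    for line in lines:
        path, sep, text = line.rstrip("\n").partition(":")
        if sep:  # ignore lines that have no path prefix
            matches[path].append(text)
    return dict(matches)


sample = [
    "docs/a.txt:the quick brown fox",
    "docs/a.txt:a lazy dog",
    "docs/b.txt:fox: again",
]
print(group_grep_output(sample))
```

The sample lines are made up; in practice they would come from reading grep's standard output.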

answered Apr 15, 2015 at 6:19

Sam Choukri


Without more info on your data, a couple of thoughts are to use dictionaries instead of lists, and to reduce the data needed for searching/sorting. Also consider using re.split if your delimiters are not as clean as below:

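wordlist = 'this|is|it|what|is|it'.split('|')
d_wordlist = {}

# Group the words by first letter so each lookup checks a small set.
for word in wordlist:
    first_letter = word[0]
    d_wordlist.setdefault(first_letter, set()).add(word)

filelist = [...list of around 5000 filenames...]
discovered = {}

for x in filelist:
    with open(x, 'r') as f:
        # Split on whitespace; switch to re.split for messier delimiters.
        for word in f.read().split():
            first_letter = word[0]
            if word in d_wordlist.get(first_letter, ()):
                # Collect the matching words per file.
                discovered.setdefault(x, set()).add(word)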

answered Apr 15, 2015 at 8:18

Patrick Stewart


The Aho-Corasick algorithm was devised for precisely this usage, and implemented as fgrep in Unix. With POSIX, the command grep -F is defined to perform this function.

It differs from regular grep in that it only uses fixed strings (not regular expressions) and is optimized for searching for a large number of strings.
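For readers who want to stay in Python, the idea behind Aho-Corasick can be sketched in a few dozen lines: build a trie of the words, add failure links, then scan the text once. This is a minimal illustration, not the implementation grep uses. The names `build_automaton` and `find_words` are made up for this example, and the sketch matches substrings case-sensitively rather than whole words.

```python
from collections import deque


def build_automaton(words):
    """Build the trie, failure links and output sets for the given words."""
    goto = [{}]    # goto[state][char] -> next state
    fail = [0]     # fail[state] -> state of the longest proper suffix
    out = [set()]  # out[state] -> words that end at this state

    # Phase 1: insert every word into the trie.
    for word in words:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].add(word)

    # Phase 2: breadth-first pass, so shallower states are done first.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit the words that end at the suffix state.
            out[nxt] |= out[fail[nxt]]
    return goto, fail, out


def find_words(text, automaton):
    """Return the set of words that occur anywhere in text."""
    goto, fail, out = automaton
    found = set()
    state = 0
    for ch in text:
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        found |= out[state]
    return found


automaton = build_automaton(["he", "she", "his", "hers"])
print(sorted(find_words("ushers", automaton)))  # ['he', 'hers', 'she']
```

The automaton is built once and reused for every file, so each file is scanned in a single pass regardless of how many words are in the list.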

To run grep -F on a large number of files, specify the precise files on the command line, or pass them through xargs:

xargs -a filelist.txt grep -F -f wordlist.txt

The function of xargs is to fill up the command line with as many files as possible, and run grep as many times as necessary:

grep -F -f wordlist.txt (files 1 through 2,500 maybe)
grep -F -f wordlist.txt (files 2,501 through 5,000)

The precise number of files per invocation depends on the length of the individual file names, and the size of the ARG_MAX constant on your system.
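The batching idea can be sketched in Python as a generator that packs file names into groups under a length budget. This is a simplified model, not how xargs is implemented: the real tool also accounts for the environment and the fixed part of the command. The function name `batch_by_length` and the `limit` parameter are hypothetical.

```python
def batch_by_length(names, limit):
    """Yield lists of names whose combined length stays within limit.

    Each name is charged len(name) + 1 to allow for a separator.
    A single name longer than the limit still gets a batch of its own.
    """
    batch, used = [], 0
    for name in names:
        cost = len(name) + 1
        if batch and used + cost > limit:
            yield batch
            batch, used = [], 0
        batch.append(name)
        used += cost
    if batch:
        yield batch


files = ["a.txt", "bb.txt", "ccc.txt", "d.txt"]
for batch in batch_by_length(files, limit=14):
    print(batch)
```

With a budget of 14 characters, the four sample names split into two batches of two; with a realistic budget, each batch would hold thousands of names and would be handed to one grep invocation.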

answered Apr 15, 2015 at 8:34

tripleee

