Problem Description:
I am working on a problem, which is to write a program to find the longest word made of other words in a list of words.
EXAMPLE
Input: test, tester, testertest, testing, testingtester
Output: testingtester
I searched and found the following solution. My question is about step 2: why should we break each word in all possible ways? Why not use each word directly as a whole? Any insight would be appreciated.
The solution below does the following:
1. Sort the array by size, putting the longest word at the front.
2. For each word, split it in all possible ways. That is, for “test”, split it into {“t”, “est”}, {“te”, “st”} and {“tes”, “t”}.
3. Then, for each pairing, check if the first half and the second both exist elsewhere in the array.
4. “Short circuit” by returning the first string we find that fits condition #3.
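For reference, the four steps above can be sketched in Python roughly as follows. This is only an illustration of the described procedure, not the original solution's code; the function name `longest_two_word_compound` is made up for this sketch, and a set is assumed for the "exists elsewhere in the array" check.

```python
def longest_two_word_compound(words):
    """Return the longest word that is exactly two list words joined, else None."""
    word_set = set(words)
    # Step 1: longest first, so the first hit is the answer (the "short circuit").
    for word in sorted(words, key=len, reverse=True):
        # Step 2: every cut point yields one (left, right) pairing.
        for cut in range(1, len(word)):
            left, right = word[:cut], word[cut:]
            # Step 3: both halves must be words from the list.
            if left in word_set and right in word_set:
                return word  # Step 4
    return None


words = ["test", "tester", "testertest", "testing", "testingtester"]
print(longest_two_word_compound(words))
```

Because each word is only ever cut into two pieces, this sketch cannot recognise a word built from three or more shorter words.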
Solution – 1
Answering your question indirectly, I believe the following is an efficient way to solve this problem using tries.
Build a trie from all of the words in your list.
Sort the words so that the longest word comes first.
Now, for each word W, start at the top of the trie and begin following the word down the tree one letter at a time using letters from the word you are testing.
Each time a word ends, recursively re-enter the trie from the top making a note that you have “branched”. If you run out of letters at the end of the word and have branched, you’ve found a compound word and, because the words were sorted, this is the longest compound word.
If the letters stop matching at any point, or you run out and are not at the end of the word, just back track to wherever it was that you branched and keep plugging along.
I’m afraid I don’t know Java that well, so I’m unable to provide you sample code in that language. I have, however, written out a solution in Python (using a trie implementation from this answer). Hopefully it is clear to you:
```python
#!/usr/bin/env python3

# End-of-word symbol
_end = '_end_'

# Make a trie out of nested HashMap, UnorderedMap, dict structures
def MakeTrie(words):
    root = dict()
    for word in words:
        current_dict = root
        for letter in word:
            current_dict = current_dict.setdefault(letter, {})
        current_dict[_end] = _end
    return root

def LongestCompoundWord(original_trie, trie, word, level=0):
    # Ran out of letters part-way through a trie word: dead end
    if not word:
        return False
    first_letter = word[0]
    if first_letter not in trie:
        return False
    if len(word) == 1 and _end in trie[first_letter]:
        return level > 0
    # A word ends here: try re-entering the trie from the top ("branching")
    if _end in trie[first_letter] and LongestCompoundWord(original_trie, original_trie, word[1:], level + 1):
        return True
    # Otherwise keep following the current path down the trie
    return LongestCompoundWord(original_trie, trie[first_letter], word[1:], level)

# Words that were in your question
words = ['test', 'testing', 'tester', 'teste', 'testingtester', 'testingtestm', 'testtest', 'testingtest']
trie = MakeTrie(words)

# Sort words in order of decreasing length
words = sorted(words, key=len, reverse=True)
for word in words:
    if LongestCompoundWord(trie, trie, word):
        print("Longest compound word was '{0:}'".format(word))
        break
```
With the above in mind, the answer to your original question becomes clearer: we do not know ahead of time which combination of prefix words will take us successfully through the tree. Therefore, we need to be prepared to check all possible combinations of prefix words.
Since the algorithm you found does not have an efficient way of knowing which prefixes of a word are themselves words, it splits the word at every possible point to ensure that all prefixes are generated.
Solution – 2
I think you are simply confused about which words get split.
After sorting, you consider the words one after the other, by decreasing length. Let us call the word you are currently trying to decompose a “candidate”.
If the candidate is made of other words, it certainly starts with a word, so you compare every prefix of the candidate against all the words.
During the comparison step, you compare a candidate prefix to the whole words, not to split words.
By the way, the given solution will not work for compounds of three or more words. The fix is as follows:
- try every prefix of the candidate and compare it to all words
- in case of a match, repeat the search with the suffix.
Example:
testingtester gives the prefixes t, te, tes, test, testi, testin, testing, testingt, testingte, testingtes, testingtest and testingteste. Among these, test and testing are words. Then you need to try the corresponding suffixes ingtester and tester.
ingtester gives i, in, ing, ingt, ingte, ingtes, ingtest and ingteste, none of which are words.
tester is a word and you are done.
```
IsComposite(InitialCandidate, Candidate):
    For all Prefixes of Candidate (including Candidate itself):
        if Prefix is in Words:
            Suffix = Candidate - Prefix
            if Suffix == "":
                return Candidate != InitialCandidate
            else if IsComposite(InitialCandidate, Suffix):
                return True
    return False

For all Candidate words by decreasing size:
    if IsComposite(Candidate, Candidate):
        print Candidate
        break
```
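A runnable Python sketch of the same idea might look like the following. It assumes the words are held in a set, and it uses a boolean flag `is_whole` (my own choice, in place of comparing against the initial candidate) to reject the trivial "decomposition" of a word into itself. The names `is_composite` and `longest_composite` are illustrative only.

```python
def is_composite(candidate, words, is_whole=True):
    """True if candidate is a concatenation of two or more words from `words`."""
    for end in range(1, len(candidate) + 1):
        prefix, suffix = candidate[:end], candidate[end:]
        if prefix not in words:
            continue
        if suffix == "":
            # The whole remaining text is a word. That only counts if
            # something was already split off the original candidate.
            return not is_whole
        if is_composite(suffix, words, is_whole=False):
            return True
    return False


def longest_composite(words):
    """Try candidates by decreasing length and return the first composite one."""
    word_set = set(words)
    for candidate in sorted(word_set, key=len, reverse=True):
        if is_composite(candidate, word_set):
            return candidate
    return None


print(longest_composite(["test", "tester", "testertest", "testing", "testingtester"]))
```

Unlike the pair-splitting approach, this also handles words made of three or more parts, because a failed prefix does not end the search: the loop simply moves on to the next prefix.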
Solution – 3
I would probably use recursion here. Start with the longest word and find words it starts with. For any such word remove it from the original word and continue with the remaining part in the same manner.
Pseudo code:
```
function iscomposed(originalword, wordpart)
  for word in allwords
    if word <> originalword
      if wordpart = word
        return yes
      elseif wordpart starts with word
        if iscomposed(originalword, wordpart - word)
          return yes
        endif
      endif
    endif
  next
  return no
end

main
  sort allwords by length descending
  for word in allwords
    if iscomposed(word, word) return word
  next
end
```
Example:
words: abcdef abcde abc cde ab
Passes:
(The leading number is the recursion depth.)
1. abcdef starts with abcde. rest = f.
   2. No word that f starts with is found.
1. abcdef starts with abc. rest = def.
   2. No word that def starts with is found.
1. abcdef starts with ab. rest = cdef.
   2. cdef starts with cde. rest = f.
      3. No word that f starts with is found.
1. abcde starts with abc. rest = cde.
   2. cde itself is found.
abcde is a composed word.
Solution – 4
Richard’s answer will work well in many cases, but it can take exponential time: this will happen if there are many segments of the string W, each of which can be decomposed in multiple different ways. For example, suppose W is abcabcabcd, and the other words are ab, c, a and bc. Then the first 3 letters of W can be decomposed either as ab|c or as a|bc … and so can the next 3 letters, and the next 3, for 2^3 = 8 possible decompositions of the first 9 letters overall:
a|bc|a|bc|a|bc
a|bc|a|bc|ab|c
a|bc|ab|c|a|bc
a|bc|ab|c|ab|c
ab|c|a|bc|a|bc
ab|c|a|bc|ab|c
ab|c|ab|c|a|bc
ab|c|ab|c|ab|c
All of these partial decompositions necessarily fail in the end, since there is no other word in the input that contains W’s final letter d, but his algorithm will explore them all before discovering this. In general, a word consisting of n copies of abc followed by a single d will take O(n*2^n) time.
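To make the blow-up concrete, here is a small Python sketch that counts the recursive calls made by a plain, unmemoised prefix search on this input. It is a simplified stand-in that looks prefixes up in a set rather than walking Richard's trie, and the name `count_calls` is made up for the illustration.

```python
def count_calls(word, words):
    """Count recursive calls made by a plain (unmemoised) prefix search."""
    calls = 0

    def search(start):
        nonlocal calls
        calls += 1
        if start == len(word):
            return True
        # Try every word that matches at this position, then recurse on the rest.
        for end in range(start + 1, len(word) + 1):
            if word[start:end] in words and search(end):
                return True
        return False

    search(0)
    return calls


others = {"ab", "c", "a", "bc"}
for n in (1, 2, 3, 10):
    print(n, count_calls("abc" * n + "d", others))
```

For this input the count roughly doubles with every extra copy of abc (5, 13, 29 calls for n = 1, 2, 3), which is the exponential growth described above.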
We can improve this to O(n^2) worst-case time (at the cost of O(n) space) by recording extra information about the decomposability of suffixes of W as we go along — that is, suffixes of W that we have already discovered we can or cannot match to word sequences. This type of algorithm is called dynamic programming.
The condition we need for some word W to be decomposable is exactly that W begins with some word X from the set of other words, and the suffix of W beginning at position |X|+1 is decomposable. (I’m using 1-based indices here, and I’ll denote a substring of a string S beginning at position i and ending at position j by S[i..j].)
Whenever we discover that the suffix of the current word W beginning at some position i is or is not decomposable, we can record this fact and make use of it later to save time. For example, after testing the first 4 decompositions in the 8 listed earlier, we know that the suffix of W beginning at position 4 (i.e., abcabcd) is not decomposable. Then when we try the 5th decomposition, i.e., the first one starting with ab, we first ask the question: Is the rest of W, i.e. the suffix of W beginning at position 3, decomposable? We don’t know yet, so we try adding c to get ab|c, and then we ask: Is the rest of W, i.e. the suffix of W beginning at position 4, decomposable? And we find that it has already been found not to be, so we can immediately conclude that no decomposition of W beginning with ab|c is possible either, instead of having to grind through all 4 possibilities.
Assuming for the moment that the current word W is fixed, what we want to build is a function f(i) that determines whether the suffix of W beginning at position i is decomposable. Pseudo-code for this could look like:
```
- Build a trie the same way as Richard's solution does.
- Initialise the array KnownDecomposable[] to |W| DUNNO values.

f(i):
    - If i == |W|+1 then return TRUE.  (The empty suffix means we're finished.)
    - If KnownDecomposable[i] is TRUE or FALSE, then immediately return it.
    - MAIN BODY BEGINS HERE
    - Walk through Richard's trie from the root, following characters in the
      suffix W[i..|W|].  Whenever we find a trie node at some depth j that
      marks the end of a word in the set (other than W itself):
        - Call f(i+j) to determine whether the rest of W can be decomposed.
        - If it can (i.e. if f(i+j) == TRUE):
            - Set KnownDecomposable[i] = TRUE.
            - Return TRUE.
    - If we make it to this point, then we have considered all other
      words that form a prefix of W[i..|W|], and found that none of
      them yield a suffix that can be decomposed.
    - Set KnownDecomposable[i] = FALSE.
    - Return FALSE.
```
Calling f(1) then tells us whether W is decomposable.
By the time a call to f(i) returns, KnownDecomposable[i] has been set to a non-DUNNO value (TRUE or FALSE). The main body of the function is only run if KnownDecomposable[i] is DUNNO. Together these facts imply that the main body of the function will run at most once for each distinct value i that the function can be called with. Writing n = |W|, there are at most n+1 such values, which is O(n), and outside of recursive calls, a call to f(i) takes at most O(n) time to walk through Richard’s trie, so overall the time complexity is bounded by O(n^2).
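A possible Python rendering of f(i) is sketched below. It is one way to realise the pseudo-code, under a few assumptions of mine: indices are 0-based rather than 1-based, `None` plays the role of DUNNO, the trie is the nested-dict kind used in Richard's answer, and the names `make_trie` and `is_decomposable` are invented for the sketch.

```python
END = "_end_"  # end-of-word marker key, as in the nested-dict trie


def make_trie(words):
    """Build a nested-dict trie; a node containing END marks a complete word."""
    root = {}
    for word in words:
        node = root
        for letter in word:
            node = node.setdefault(letter, {})
        node[END] = True
    return root


def is_decomposable(word, trie):
    """Memoised check: can `word` be split into two or more words in the trie?"""
    n = len(word)
    known = [None] * (n + 1)  # None = DUNNO
    known[n] = True           # the empty suffix means we are finished

    def f(i):
        if known[i] is not None:
            return known[i]
        node = trie
        result = False
        for j in range(i, n):
            node = node.get(word[j])
            if node is None:
                break  # no trie word continues along this path
            # A word ends at position j. Skip the match that is W itself.
            if END in node and not (i == 0 and j == n - 1) and f(j + 1):
                result = True
                break
        known[i] = result
        return result

    return f(0)


trie = make_trie(["ab", "c", "a", "bc", "abcabcabcd"])
print(is_decomposable("abcabcabcd", trie))
```

Each position is resolved once and then answered from `known`, so the n-copies-of-abc input no longer causes exponential work. Note that the recursion can go as deep as the word is long, so very long words may need an iterative version or a raised recursion limit.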
Solution – 5
To find the longest word using recursion:
```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class FindLongestWord {

    public static void main(String[] args) {
        List<String> input = new ArrayList<>(
                Arrays.asList("cat", "banana", "rat", "dog", "nana", "walk", "walker", "dogcatwalker"));
        // Longest words first, so the first hit is the longest compound word
        List<String> sortedList = input.stream()
                .sorted(Comparator.comparing(String::length).reversed())
                .collect(Collectors.toList());
        boolean isWordFound = false;
        for (String word : sortedList) {
            // A word must not be used to build itself
            input.remove(word);
            if (findPrefix(input, word)) {
                System.out.println("Longest word is : " + word);
                isWordFound = true;
                break;
            }
        }
        if (!isWordFound)
            System.out.println("Longest word not found");
    }

    // True if word can be consumed completely by repeatedly stripping list words off its front
    public static boolean findPrefix(List<String> input, String word) {
        if (word.isEmpty())
            return true;
        for (String candidate : input) {
            // Strip only the leading occurrence; String.replace would remove every occurrence
            if (word.startsWith(candidate) && findPrefix(input, word.substring(candidate.length())))
                return true;
        }
        return false;
    }
}
```
Longest-Word-Made-of-Other-Words
Longest Word Made of Other Words using a trie
Problem Statement
Write a program that reads a file containing a sorted list of words (one word per line, no spaces, all lower case), then reports:
- the 1st longest word in the file that can be constructed by concatenating copies of shorter words also found in the file;
- the 2nd longest such word;
- the total count of words in the list that can be constructed from other words in the list.
To run it, import the folder into Eclipse and run the code.
A few points I would like to mention regarding the approach I took to solve this:
- I have used a Trie data structure to store the words after reading them from the file.
- I also maintain a HashMap as part of the Trie (populated during construction of the Trie), which stores each parsed word as the key and the list of prefixes that word has in the Trie as the value.
- A Queue is maintained to store each parsed word together with its suffix (created by removing from the word a prefix that the HashMap holds for it). If the HashMap holds multiple prefixes for the same word, a corresponding suffix is stored in the Queue for each of them.
- Now we pop elements out of the Queue and check whether the suffix is present in the Trie. If it is present, we consider the word a compound word. If not, we check again whether a prefix can be found, and if there is one we insert the word and the new suffix into the Queue.
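The queue-based approach described in these points could be sketched in Python as below. This is a loose illustration rather than the repository's code: a plain set and the helper `proper_prefix_words` stand in for the Trie and its prefix HashMap, and both function names are hypothetical.

```python
from collections import deque


def proper_prefix_words(text, word_set):
    """Words from word_set that are proper prefixes of text (stand-in for a trie walk)."""
    return [text[:k] for k in range(1, len(text)) if text[:k] in word_set]


def find_compounds(words):
    """Return all compound words, longest first (ties broken alphabetically)."""
    word_set = set(words)
    queue = deque()
    # Seed the queue with one (word, suffix) pair per prefix word.
    for word in word_set:
        for prefix in proper_prefix_words(word, word_set):
            queue.append((word, word[len(prefix):]))

    compounds = set()
    while queue:
        word, suffix = queue.popleft()
        if word in compounds:
            continue  # already proven; ignore its remaining pairs
        if suffix in word_set:
            compounds.add(word)
            continue
        # Otherwise peel another prefix word off the suffix and re-queue.
        for prefix in proper_prefix_words(suffix, word_set):
            queue.append((word, suffix[len(prefix):]))

    return sorted(compounds, key=lambda w: (-len(w), w))


ranked = find_compounds(["cat", "cats", "catsdogcats", "dog", "dogcatsdog",
                         "hippopotamuses", "rat", "ratcatdogcat"])
print(ranked[0], ranked[1], len(ranked))
```

The first two entries of the returned list give the longest and second-longest compound words, and its length gives the total count. Without memoisation the queue can grow quickly on inputs with many overlapping prefixes.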
The first thing that strikes me is that you’ve put everything within the Program class, which forces your methods to be static, and any program where everything is static is a program I want to rewrite.
Let’s start with FindLongestWords(string[]):
```csharp
if (listOfWords == null) throw new ArgumentException("listOfWords");
```
This guard clause is throwing the wrong exception; it should be an ArgumentNullException. Now the goal of a guard clause is to fail fast. Interestingly, if you leave it out, the next line would throw an ArgumentNullException all by itself, merely by passing it to the OrderByDescending extension method:
```csharp
var sortedWords = listOfWords.OrderByDescending(word => word.Length).ToList();
```
I see you’re using var, and I like that. So sortedWords is a List<string>. I think the method could happily take any IEnumerable<string> instead of an array.
Now the next thing I see is a lie: dict would be an almost-acceptable name for any IDictionary, but it’s a HashSet<T>…
```csharp
var dict = new HashSet<String>(sortedWords);
```
And here we are. You’re using System.Linq, so this loop could very well be rewritten with a much shorter Linq-expression:
```csharp
foreach (var word in sortedWords)
{
    if (isMadeOfWords(word, dict))
    {
        return word;
    }
}
return null;
```
Turns into this one-liner (notice dict is gone!):
```csharp
return sortedWords.FirstOrDefault(word => isMadeOfWords(word, sortedWords));
```
isMadeOfWords should be named IsMadeOfWords, and can be happy with some ICollection:
```csharp
private static bool IsMadeOfWords(string word, ICollection<string> dict)
{
    if (String.IsNullOrEmpty(word)) return false;
    if (word.Length == 1)
    {
        return dict.Contains(word);
    }
    foreach (var pair in generatePairs(word).Where(pair => dict.Contains(pair.Item1)))
    {
        return dict.Contains(pair.Item2) || IsMadeOfWords(pair.Item2, dict);
    }
    return false;
}
```
I find that’s more readable than the more concise form that ReSharper suggests:
```csharp
if (String.IsNullOrEmpty(word)) return false;
return word.Length == 1
    ? dict.Contains(word)
    : generatePairs(word).Where(pair => dict.Contains(pair.Item1))
                         .Select(pair => dict.Contains(pair.Item2) || IsMadeOfWords(pair.Item2, dict))
                         .FirstOrDefault();
```
And then generatePairs should be renamed to GeneratePairs and can return IEnumerable<Tuple<string, string>> instead of List<Tuple<string, string>>. The only thing that itches here is the absence of var in for(int i = 1... If you’re going to use var, might as well go all the way!
I’d put that logic in its own class, and then… looks good!
The longest word in any given language depends on the word formation rules of each specific language, and on the types of words allowed for consideration.
Agglutinative languages allow for the creation of long words via compounding. Words consisting of hundreds, or even thousands of characters have been coined. Even non-agglutinative languages may allow word formation of theoretically limitless length in certain contexts. An example common to many languages is the term for a very remote ancestor, «great-great-…..-grandfather», where the prefix «great-» may be repeated any number of times. The examples of «longest words» within the «Agglutinative languages» section may be nowhere near the longest possible word in said language, but are instead popular examples of text-heavy words.
Systematic names of chemical compounds can run to hundreds of thousands of characters in length. The rules of creation of such names are commonly defined by international bodies, therefore they formally belong to many languages. The longest recognized systematic name is for the protein titin, at 189,819 letters.[1] While lexicographers regard generic names of chemical compounds as verbal formulae rather than words,[2] for its sheer length the systematic name for titin is often included in longest-word lists.
Longest word candidates may be judged by their acceptance in major dictionaries such as the Oxford English Dictionary or in record-keeping publications like Guinness World Records, and by the frequency of their use in ordinary language.
Agglutinative languages
Basque
The longest Basque toponym is Azpilicuetagaraicosaroyarenberecolarrea (40) which means «The lower field of the sheepfold (located in) the height of Azpilicueta».[3]
Esperanto
Since Esperanto allows word compounding, there are no limits on how long a word can theoretically become. An example is the 39-letter oranĝ-kanton-pafil-limig-aktivul-malamanto, meaning «Orange County gun control activist hater». Such clusters are not considered good style (the 8-word alternative oranĝkantona malamanto de aktivuloj por limigo de pafiloj is more standard), but they are permissible under the rules of Esperanto grammar.[4] Hyphens are optional in Esperanto compounds,[5] so oranĝkantonpafillimigaktivulmalamanto is also technically a valid spelling.
The longest Esperanto roots officially recognized by the Akademio de Esperanto are 13 letters long, shown here with the added substantive «-o» ending:
- administracio (administration),
- aŭtobiografio (autobiography),
- diskriminacio (discrimination),
- konservatorio (conservatory),
- paleontologio (palaeontology),
- paralelogramo (parallelogram), and
- trigonometrio (trigonometry).[6]
The longest word found in the dictionary Plena Ilustrita Vortaro as of its 2020 edition is the 24-letter proper noun Meklenburgio-Antaŭpomerio (the German state Mecklenburg-Vorpommern), followed by the 21-letter word proviantadministracio (rations administration).
As of March 2022 the longest word found in the Tekstaro de Esperanto text corpus is the 66-letter word unue-volapukista-poste-esperantista-poste-idista-poste-denove-esperantista, meaning «first-volapukist-then-esperantist-then-idist-then-again-esperantist», which was used in a review published in Monato in 1997 to describe František Lorenz.[7] However, this word does not follow normal Esperanto word formation rules. Other long words found in Tekstaro de Esperanto that do follow regular word formation include:
- sescent-kvindek-mil-kvadratkilometra (consisting of 650 000 square kilometers), 33 letters, used in an Esperanto version of a 2011 article by Marc Lavergne in Le Monde diplomatique,
- tragedio-komedio-historio-pastoraloj (tragical-comical-historical-pastorals), 33 letters, used in L. L. Zamenhof’s translation of Hamlet,
- Nord-Atlantik-Traktad-Organizo (North Atlantic Treaty Organization), 27 letters, more commonly translated with two words: Nord-Atlantika Traktat-Organiz(aĵ)o.
Estonian
- Sünnipäevanädalalõpupeopärastlõunaväsimatus, meaning «untiredness of a birthday week graduation party», is 46 letters long.[citation needed]
- The 31-letter word uusaastaöövastuvõtuhommikuidüll means «morning idyll after the new year».[8]
- There is also the 25-letter word põllumajandusministeerium, which means «Ministry of Agriculture».[citation needed]
- The word kuulilennuteetunneliluuk meaning «the hatch a bullet flies out of when exiting a tunnel» is 24 letters long and a palindrome. It could be one of the longest palindromes.[citation needed]
Finnish
Examples of long words that have been in everyday use in the Finnish language are kolmivaihekilowattituntimittari which means «three-phase kilowatt hour meter» (31 letters), liikekannallepanotarkastuskierros («mobilization inspection round», 33 letters),[9] peruspalveluliikelaitoskuntayhtymä («a public utility of a municipal federation for provision of basic services», 34 letters),[10] and lentokonesuihkuturbiinimoottoriapumekaanikkoaliupseerioppilas «airplane jet turbine engine auxiliary mechanic non-commissioned officer student» (61 letters), an actual military term, although one which has been deprecated. The longest military term in current use is vastatykistömaalinosoitustutkakalustojärjestelmäinsinöörierikoisupseeri «counter-artillery targeting radar systems engineer specialist officer» with 71 characters, with 2 more if grammatically incorrect extra hyphens added for readability are counted.[citation needed] If conjugated forms are allowed, even longer real words can be made. Allowing derivatives and clitics allows the already lengthy word to grow even longer, although the usability of the word starts to degrade. Because Finnish uses free forming of composite words, new words can even be formed during a conversation. One can add nouns after each other without breaking grammar rules.
If one allows artificial constructs as well as using clitics and conjugated forms, one can create even longer words, such as kumarreksituteskenteleentuvaisehkollaismaisekkuudellisenneskenteluttelemattomammuuksissansakaankopahan (102 letters), which was created by Artturi Kannisto.[11]
The longest non-compound (a single stem with prefixes and suffixes) Finnish word recognised by the Guinness Book of Records is epäjärjestelmällistyttämättömyydellänsäkäänköhänkään (see also Agglutination#Extremes), based on the stem järki (reason, sanity), and it means: «I wonder if – even with his/her quality of not having been made unsystematized».
Äteritsiputeritsipuolilautatsijänkä and a defunct bar named after it, Äteritsiputeritsipuolilautatsi-baari, are the longest place names in use.
Hungarian
Eltöredezettségmentesítőtleníttethetetlenségtelenítőtlenkedhetnétek, with 67 letters is the longest word in the Hungarian language and approximately means «you could defragmentation defragmenting impenetrability defragmentation». It is already morphed, since Hungarian is an agglutinative language.
The Hungarian language has many causes for writing words together, but there are a few rules for avoiding undisciplined length, resulting in unreadability.
Words with fewer than six syllables can be written as one word. Agglutinated words have to be separated by one hyphen if they are more than six syllables altogether. If there are more than two words that are already written with a hyphen and we want to add some more, we have to use a new hyphen to add it (like C-vitamin-adagolás, meaning «Vitamin C rationing»). If two long words would have to be joined, it is advised to write them separately (possible: békeszerződéstervezet-kidolgozás, meaning «peace agreement plan elaboration», but the advised form is rather a békeszerződés tervezetének kidolgozása, meaning «the elaboration of the plan of the peace agreement»).[12]
The longest dictionary form word is the word megszentségtelenített, with 21 characters (although it ultimately derives from the word szent meaning: «saint» or «sacred»), and it means «desecrated» or «profaned».[13]
Korean
There is some disagreement about what is the longest word in the Korean language, which arises from misunderstanding of the Korean language.
The longest word appearing in the Standard Korean Dictionary published by the National Institute of the Korean Language is 청자 양인각 연당초상감 모란 문은구 대접 (靑瓷陽印刻蓮唐草象嵌牡丹文銀釦대접); Revised Romanization: cheongjayang-in-gakyeondangchosang-gammoranmuneun-gudaejeop, which is a kind of ceramic bowl from the Goryeo dynasty; that word is 17 syllable blocks long, and contains a total of 46 hangul letters.[14][15] However, to call this a word would be incorrect. It simply consists of many words which act as adjectives for the one word 대접.
The word 니코틴아마이드 아데닌 다이뉴클레오타이드 (nikotin-amaideu adenin dainyukeulle-otaideu), a phonetic transcription of «nicotinamide adenine dinucleotide», has a larger number of syllable blocks (19) but a smaller number of letters (41), but does not qualify as a single word due to the spaces.
In proper nouns, many Korean monarchs have overly long posthumous names built from many different Sino-Korean nouns describing their positive characteristics, for example Sunjo of Joseon, whose full posthumous name is the 77-syllable-block 순조 선각 연덕현도 경인순희 체성응명흠광석경계천배극융원돈휴의행소윤희화준렬대중지정 홍훈철 모건시태형창 운홍기고명박후강건수정계통수력 공유범문안무정영경 성효대왕 (sunjoseongag-yeondeoghyeondogyeong-insunhuicheseong-eungmyeongheumgwangseoggyeong-gyecheonbaegeug-yung-wondonhyuuihaengsoyunhuihwa-junlyeoldaejungjijeonghonghuncheolmogeonsitaehy-eongchang-unhong-gigomyeongbaghugang-geonsujeong-gyetongsulyeoggong-yubeommun-anmujeong-yeong-gyeongseonghyodaewang).[citation needed] This is simply writing the phrase in Hanja (Hanzi) 純祖先覺淵德顯道景仁純禧體聖凝命欽光錫慶繼天配極隆元敦休懿行昭倫熙化峻烈大中至正洪勳哲謨乾始泰亨昌運弘基高明博厚剛健粹精啓統垂曆建功裕範文安武靖英敬成孝肅皇帝, being transliterate in Hangul. It is not a single word and does not qualify as a lexical entry.
Mongolian
A popular example of the longest suffixed word in Mongolian is «Цахилгаанжуулалтыхантайгаа» (tsakhilgaanjuulaltykhantaigaa) which is 26 letters long. Here is a table showing, with translations, which suffixes are added.[citation needed]
Word | Translation |
---|---|
Цахилгаан | electricity (power) |
Цахилгаанжуул | electrify |
Цахилгаанжуулалт | electrification |
Цахилгаанжуулалтын | electrifications |
Цахилгаанжуулалтыхан | electricians |
Цахилгаанжуулалтыхантай | with electricians |
Цахилгаанжуулалтыхантайгаа | do (action) with electricians |
Ojibwe
The longest word in the Ojibwe language is miinibaashkiminasiganibiitoosijiganibadagwiingweshiganibakwezhigan (66 letters), meaning «blueberry pie». This literally translates to «blueberry cooked to jellied preserve that lies in layers in which the face is covered in bread».[16]
Tagalog
Tagalog can make long words by adding on affixes, suffixes, and other root words with a connector.
The longest published word in the language is pinakanakakapagngitngitngitngitang-pagsisinungasinungalingan, with 59 letters. This compound word means «to keep making up a lie that causes the most extreme anger while pretending you are not.»[17]
Turkish
Turkish, as an agglutinative language, carries the potential for words of arbitrary length.
Muvaffakiyetsizleştiricileştiriveremeyebileceklerimizdenmişsinizcesine, at 70 letters, has been cited as the longest Turkish word. It was used in a contrived story designed to use this word.[18][19] The word means «As if you would be from those we can not easily/quickly make a maker of unsuccessful ones» and its usage was illustrated as follows:
Kötü amaçların güdüldüğü bir öğretmen okulundayız. Yetiştirilen öğretmenlere öğrencileri nasıl muvaffakiyetsizleştirecekleri öğretiliyor. Yani öğretmenler birer muvaffakiyetsizleştirici olarak yetiştiriliyorlar. Fakat öğretmenlerden biri muvaffakiyetsizleştirici olmayı, yani muvaffakiyetsizleştiricileştirilmeyi reddediyor, bu konuda ileri geri konuşuyor. Bütün öğretmenleri kolayca muvaffakiyetsizleştiricileştiriverebileceğini sanan okul müdürü bu duruma sinirleniyor, ve söz konusu öğretmeni makamına çağırıp ona diyor ki: Muvaffakiyetsizleştiricileştiriveremeyebileceklerimizdenmişsinizcesine laflar ediyormuşsunuz ha?
We are in a teachers’ training school that has evil purposes. The teachers who are being educated in that school are being taught how to make unsuccessful ones from students. So, one by one, teachers are being educated as makers of unsuccessful ones. However, one of those teachers refuses to be maker of unsuccessful ones, in other words, to be made a maker of unsuccessful ones; he talks about and criticizes the school’s stand on the issue. The headmaster who thinks every teacher can be made easily/quickly into a maker of unsuccessful ones gets angry. He invites the teacher to his room and says «You are talking as if you were one of those we can not easily/quickly turn into a maker of unsuccessful ones, huh?»
Other well-known very long Turkish words are:[20]
- Çekoslovakyalılaştıramadıklarımızdanmışsınızcasına means «As if you are one of those people whom we could not turn into a Czechoslovakian».
- Afyonkarahisarlılaştırabildiklerimizdenmişsinizcesine means «As if you are one of the people whom we were able to make resemble those from Afyonkarahisar». (Afyonkarahisar is a city in Turkey.)
Word formation
Turkish | English |
---|---|
Muvaffak | Successful |
Muvaffakiyet | Success |
Muvaffakiyetsiz | Unsuccessful (‘without success’) |
Muvaffakiyetsizleş(-mek) | (To) become unsuccessful |
Muvaffakiyetsizleştir(-mek) | (To) make one unsuccessful |
Muvaffakiyetsizleştirici | Maker of unsuccessful ones |
Muvaffakiyetsizleştiricileş(-mek) | (To) become a maker of unsuccessful ones |
Muvaffakiyetsizleştiricileştir(-mek) | (To) make one a maker of unsuccessful ones |
Muvaffakiyetsizleştiricileştiriver(-) | (To) easily/quickly make one a maker of unsuccessful ones |
Muvaffakiyetsizleştiricileştiriverebil(-mek) | (To) be able to make one easily/quickly a maker of unsuccessful ones |
Muvaffakiyetsizleştiricileştiriveremeyebil(-mek) | To be able to not make one easily/quickly a maker of unsuccessful ones |
Muvaffakiyetsizleştiricileştiriveremeyebilecek | One who is not able to make one easily/quickly a maker of unsuccessful ones |
Muvaffakiyetsizleştiricileştiriveremeyebilecekler | Those who are not able to make one easily/quickly a maker of unsuccessful ones |
Muvaffakiyetsizleştiricileştiriveremeyebileceklerimiz | Those whom we cannot make easily/quickly a maker of unsuccessful ones |
Muvaffakiyetsizleştiricileştiriveremeyebileceklerimizden | From those we can not easily/quickly make a maker of unsuccessful ones |
Muvaffakiyetsizleştiricileştiriveremeyebileceklerimizdenmiş | (Would be) from those we can not easily/quickly make a maker of unsuccessful ones |
Muvaffakiyetsizleştiricileştiriveremeyebileceklerimizdenmişsiniz | You would be from those we can not easily/quickly make a maker of unsuccessful ones |
Muvaffakiyetsizleştiricileştiriveremeyebileceklerimizdenmişsinizcesine | As if you would be from those we can not easily/quickly make a maker of unsuccessful ones |
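The table can be read as a chain of string extensions. As a rough illustration, the Python sketch below rebuilds the 70-letter word from the pieces by which consecutive rows differ. The list `increments` is only the difference between neighbouring table rows and is an assumption of this example, not an authoritative morphological segmentation; the affirmative row ending in -iverebil is skipped because the final word continues from the negative form -iveremeyebil.

```python
import unicodedata
from itertools import accumulate

# Pieces by which consecutive table rows differ (illustrative split,
# not an authoritative morphological analysis).
increments = [
    "Muvaffak", "iyet", "siz", "leş", "tir", "ici", "leş", "tir",
    "iver", "emeyebil", "ecek", "ler", "imiz", "den", "miş",
    "siniz", "cesine",
]

# Normalize to NFC so that each ş is one code point, then build the
# cumulative forms; each one corresponds to a row of the table.
pieces = [unicodedata.normalize("NFC", p) for p in increments]
stems = list(accumulate(pieces))

for stem in stems:
    print(f"{len(stem):2d}  {stem}")
```

The last line printed is the complete word; counting code points after NFC normalization matches the letter count used in the text only because every letter of this word is a single precomposed character.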
Non-agglutinative languages
Afrikaans
Afrikaans, as a daughter language of Dutch, can form compounds of potentially limitless length in the same way Dutch does. According to the Total Book of South African Records, the longest word in the language is Tweedehandsemotorverkoopsmannevakbondstakingsvergaderingsameroeperstoespraakskrywerspersverklaringuitreikingsmediakonferensieaankondiging (136 letters),[21] which means «issuable media conference’s announcement at a press release regarding the convener’s speech at a secondhand car dealership union’s strike meeting». This word, however, is contrived to be long and does not occur in everyday speech or writing.
Arabic
Currently, the longest word in Arabic is the 15-letter-long word أَفَإِستَسقَينَاكُمُوها,[22] which means «Did we ask you to let us drink it?» However, according to some online sources, the longest word in Arabic is the 16-letter-long word أَفَإِستَسقَينَاكُمُوهما, meaning «Did we ask you to let us drink both of them?» No official sources supporting the latter claim can be found.
Bulgarian
The Bulgarian online etymological dictionary gives the longest word in Bulgarian as the 39-letter-long непротивоконституционствувателствувайте (neprotivokonstitutsionstvuvatelstvuvayte), introduced in the Constitution of Bulgaria of 1947 (Dimitrov Constitution).[23] The word means «do not perform actions against the constitution» (addressed to more than one person).
Catalan
The longest word in Catalan is considered to be Anticonstitucionalment, an adverb meaning «[done in a way that is] against the constitution». However, the scientific word Psiconeuroimmunoendocrinologia, related to endocrinology, has been proposed by the University of Barcelona as the true longest word.[24]
Croatian
The longest known word in Croatian is prijestolonasljednikovičičinima,[25] meaning «to those who belong to the throne successor’s little wife.» The 31-letter word is the dative case of prijestolonasljednikovičica «the throne successor’s little wife» which is the diminutive of prijestolonasljednikovica «the throne successor’s wife.»
Czech
Traditionally, the word nejneobhospodařovávatelnější («of the least cultivable», 28 letters) is considered the longest Czech word, but there are some longer artificial words. Most of them are compound adjectives in the dative, instrumental or another grammatical case, derived from the iterative or frequentative verbal form or the ability adjective form (like «-able»).
- Nejnezdevětadevadesáteroroznásobovávatelnějšími (47; Instrumental case of the ones least multipliable by a group of ninety-nine on a regular basis)
- Nejnezdevětadevadesáteroroznásobovávatelnější (Those who are the least multipliable by a group of ninety-nine on a regular basis)
- Nejzdevětadevadesáteroroznásobovávatelnější (Those who are the most multipliable by a group of ninety-nine on a regular basis)
- Zdevětadevadesáteroroznásobovávatelnější (Those who are more multipliable by a group of ninety-nine on a regular basis)
- Zdevětadevadesáteroroznásobovávatelní (Those who are multipliable by a group of ninety-nine on a regular basis)
- Zdevětadevadesáteroroznásobovávat (Alternative of «multiply out by a group of ninety-nine on a regular basis»)
- Zdevětadevadesáteroroznásobovat (Multiply out by a group of ninety-nine on a regular basis — continuous grammatical aspect)
- Zdevětadevadesáteroznásobovat (Multiply by ninety-nine on a regular basis – continuous grammatical aspect)
- Zdevětadevadesáteroznásobit (Multiply by a group of ninety-nine once)
- Zdevětadevadesáteronásobit (Multiply by a group of ninety-nine)
- Devětadevadesátero (A group of ninety-nine)
- Devětadevadesát (Inverse of devadesát devět = ninety-nine)
Danish
Danish, like many Germanic languages, is capable of compounding words to create ad hoc compounds of potentially limitless length. Nevertheless, the constructed word speciallægepraksisplanlægningsstabiliseringsperiode – which means «a period of stabilising the planning of a specialist doctor’s practice» – was cited in 1993 by the Danish version of the Guinness Book of World Records as the longest word in the Danish language at 51 letters long. It is, however, not possible (using Google) to find a text that actually uses this word, except in the context of discussing the longest Danish word.
Dutch
Dutch, like many Germanic languages, is capable of forming compounds of potentially limitless length. The 53-letter word Kindercarnavalsoptochtvoorbereidingswerkzaamhedenplan, meaning «preparation activities plan for a children’s carnival procession», was cited by the 1996 Guinness Book of World Records as the longest Dutch word.[26]
The longest word in the authoritative Van Dale Dutch dictionary (2009 edition) in plural form is meervoudigepersoonlijkheidsstoornissen;[27] 38 letters long, meaning «multiple personality disorders». The entry in the dictionary however is in the singular, counting 35 letters.
The free OpenTaal dictionary,[28] which has been certified by the Dutch Language Union (the official Dutch language institute) and is included in many open-source applications, contains the following longest words, which are up to 40 letters long:
- vervoerdersaansprakelijkheidsverzekering, «carriers’ liability insurance»;
- bestuurdersaansprakelijkheidsverzekering, «drivers’ liability insurance»;
- overeenstemmingsbeoordelingsprocedures, «conformity assessment procedures» (38 letters)
The word often said to be the longest in Dutch – probably because of its funny meaning and alliteration – which has also appeared in print, is Hottentottensoldatententententoonstellingsbouwterrein («construction ground for the Hottentot soldiers’ tents exhibition»); counting 53 letters.
English
The 45-letter word pneumonoultramicroscopicsilicovolcanoconiosis is the longest English word that appears in a major dictionary.[29][30] Originally coined to become a candidate for the longest word in English, the term eventually developed some independent use in medicine.[31] It is referred to as «P45» by researchers.[32]
The 30-letter word pseudopseudohypoparathyroidism refers to an inherited disorder,[33] named for its similarity to pseudohypoparathyroidism in presentation, which is in turn named for its similarity to hypoparathyroidism. This is the longest word that was not contrived with the sole intention of becoming the longest word.[34]
Floccinaucinihilipilification, at 29 letters and meaning the act of estimating something as being worth so little as to be practically valueless, or the habit of doing so, is the longest non-technical, coined word in Oxford Dictionaries of the English language.[29]
Antidisestablishmentarianism, at 28 letters, is the longest non-coined, non-systematic English word in Oxford Dictionaries.[29] It refers to a 19th-century political movement that opposed the disestablishment of the Church of England as the state church of England.
French
German
In German, whole numbers (smaller than 1 million) can be expressed as single words, which makes siebenhundertsiebenundsiebzigtausendsiebenhundertsiebenundsiebzig (777,777) a 65-letter word. In combination with -malig or, as an inflected noun, (des …) -maligen, all numbers can be written as one word. A 79-letter word, Donaudampfschiffahrtselektrizitätenhauptbetriebswerkbauunterbeamtengesellschaft, was named the longest published word in the German language by the 1972 Guinness Book of World Records, but longer words are possible. The word was the name of a prewar Viennese club for subordinate officials of the headquarters of the electrical division of the company named the Donaudampfschiffahrtsgesellschaft, «Danube steam boat operation company».
The longest word that was not created artificially as a longest-word record seems to be Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz at 63 letters. The word means «law delegating beef label monitoring», but in 2013 it was removed from the books because European Union regulations had changed and that particular law had become obsolete, leading to news reports that Germany «had lost its longest word».[35]
In December 2016 the 51-letter word Bundespräsidentenstichwahlwiederholungsverschiebung («deferral of the second iteration of the federal presidential run-off election») was elected the Austrian Word of the Year 2016.[36] The jury called it a «descriptive word» which «in terms of its content as well as its length, is a symbol and an ironic form of commentary for the political events of this year, characterized by the very long campaign for the presidential election, the challenges of the voting process, and its reiteration.»[36][37]
Greek
In his comedy Assemblywomen (c. 392 BC), Aristophanes coined the 182-letter word λοπαδοτεμαχοσελαχογαλεοκρανιολειψανοδριμυποτριμματοσιλφιοκαραβομελιτοκατακεχυμενοκιχλεπικοσσυφοφαττοπεριστεραλεκτρυονοπτοκεφαλλιοκιγκλοπελειολαγῳοσιραιοβαφητραγανοπτερύγων (Lopadotemachoselachogaleokranioleipsanodrimhypotrimmatosilphiokarabomelitokatakechymenokichlepikossyphophattoperisteralektryonoptekephalliokigklopeleiolagoiosiraiobaphetraganopterygon), a fictional food dish consisting of a combination of fish and other meat. The word is cited as the longest ancient Greek word ever written.[38]
A modern Greek word of 22 letters is ηλεκτροεγκεφαλογράφημα (ilektroenkefalográfima) (gen. ηλεκτροεγκεφαλογραφήματος (ilektroenkefalografímatos), 25 letters) meaning «electroencephalogram».
Hebrew
The longest Hebrew word is the 19-letter-long (including vowels) וכשלאנציקלופדיותינו (u’chshelentsiklopediotenu),[39] which means «And when to our encyclopedias…» The Hebrew word אנציקלופדיה (encyclopedia) is of European origin.
The longest word in Hebrew that doesn’t originate from another language is וכשלהתמרמרויותינו (u’chshelehitmarmeruyotenu), which roughly means «And when, to our resentments/grievances».
The 11-letter-long (including vowels) וְהָאֲחַשְׁדַּרְפְּנִים (veha’aḥashdarpením) is the longest word to appear in the Hebrew Bible. Its meaning is «And the satraps». It also does not originate from Hebrew.[citation needed]
Other very long Hebrew words include:
- וכשבהשתעשעויותיהם (u’chshebehishta’ashuateyhem) meaning: «And when they were having fun» or «And while in their playfulness».
Hindi
Hindi has a finite list of compound words which are based on established grammatical rules of the language. The word commonly cited as the longest in Hindi is लौहपथगामिनीसूचकदर्शकहरितताम्रलौहपट्टिका (lauhpathagāminīsūchakdarshkaharitatāmralauhpaṭṭikā), which consists of 24 consonants and 10 vowel diacritics, making up a total of 34 characters. The word literally means «a green railway warning signboard made of copper-iron». Its plural would be लौहपथगामिनीसूचकदर्शकहरितताम्रलौहपट्टिकाएँ (lauhpathagāminīsūchakdarshkaharitatāmralauhpaṭṭikāẽ), which has an additional vowel and a diacritic. It is a neologism and not in common use.[40]
A much shorter word borrowed from Sanskrit, which is in common use and is also often cited as the longest word, is किंकर्तव्यविमूढ़ (kinkartavyavimūṛh). It consists of 8 consonants and 5 vowel diacritics, making up a total of 13 characters. The word literally means «confused about what to do», meaning to be bewildered or flabbergasted.
Icelandic
Icelandic has the ability to form compounds of arbitrary length by stringing together genitives (eignarfallssamsetning), so no single words of maximal length exist in the language. However, vaðlaheiðarvegavinnuverkfærageymsluskúr and vaðlaheiðarvegavinnuverkfærageymsluskúraútidyralyklakippuhringur are sometimes cited as particularly long words;[41] the latter has 64 letters and means «a keychain ring for the outdoor key of a road workers’ shed on a moor called Vaðlaheiði».
Analysis of a corpus of contemporary Icelandic texts by Uwe Quasthoff, Sabine Fiedler and Erla Hallsteinsdóttir identified Alþjóðaflutningaverkamannasambandsins («of the International Transport Workers’ Federation»; 37 letters) and Norðvestur-Atlantshafsfiskveiðistofnunarinnar («of the Northwest Atlantic Fisheries’ Organization»; 45 letters) as the longest unhyphenated and hyphenated words.[42]
The longest word occurring at least twice in the University of Leipzig isl-is_web_2015 corpus is Auðmannastjórnvaldaembættisstjórnmálaverkalýðsverðlausraverðbréfaábyrgðarlausrakvóta-ræningjaaftaníossaspilling (110 letters).[43]
Indonesian
Indonesian is an Austronesian language. According to the Kamus Besar Bahasa Indonesia, the longest words in the language are mempertanggungjawabkan (22 letters), meaning «take responsibility» in English, and heksakosioiheksekontaheksafobia (31 letters), meaning «hexacosioihexecontahexaphobia» in English.[44]
Irish
The longest non-compound word in Irish is grianghrafadóireacht, a 20-letter-long word meaning «photography».[45]
Italian
The longest word in Italian is traditionally precipitevolissimevolmente, which is a 26-letter-long adverb.[46] It is formed by subsequent addition of postfixes to the original root:
- precipitevole: «hasty»;
- precipitevolissimo: «very hasty»;
- precipitevolissimevole: «[of someone/something] that acts very hastily», (not grammatically correct[citation needed]);
- precipitevolissimevolmente: «in a way like someone/something that acts very hastily» (not grammatically correct, but nowadays part of the language).
The word is not used in everyday language, only in jokes. Nevertheless, it is an official part of the Italian language; it was coined in 1677 by the poet Francesco Moneti:
perché alla terra alfin torna repente / precipitevolissimevolmente
— Francesco Moneti, Cortona Convertita, canto III, LXV
The word technically violates Italian grammar rules, the correct form being precipitevolissimamente, which is three letters and one syllable shorter. The poet coined the new word to have 11 syllables in the second verse.
Other words can be created with a similar (and grammatically correct) mechanism starting from a longer root, winding up with a longer word. Some examples are:
- sovramagnificentissimamente (cited by Dante Alighieri in De vulgari eloquentia), 27 letters, «in a way that is more than magnificent by far» (archaic);[47]
- incontrovertibilissimamente, 27 letters, «in a way that is very difficult to falsify»;
- particolareggiatissimamente, 27 letters, «in an extremely detailed way»;
- anticostituzionalissimamente, 28 letters, «in a way that strongly violates the constitution».
The longest accepted neologism is psiconeuroendocrinoimmunologia (30 letters).[citation needed]
Other long words are:
- nonilfenossipolietilenossietonolo (33 letters — chemical)
- pentagonododecaedrotetraedrico (30 letters — 3D geometric figure)
- esofagodermatodigiunoplastica (29 letters — surgery)
- elettroencefalograficamente (27 letters — medical adverb: electroencephalographically)
- diclorodifeniltricloroetano (27 letters — chemical: DDT)
Láadan
Láadan is not agglutinating, as there is no mechanism to combine arbitrary words into one without intermediating grammatical mechanisms (such as a relativizer); however, there are a number of affixes that further elucidate the contextual meaning of a word. These are ignored when determining the longest words in the language. The primary reference for vocabulary is the 3rd edition of the official dictionary and grammar.
- oshetham éelenethilethu, 22 letters not counting the space, or 17 phonemes (since for example ée is a toneme of e, and th is a separate sound from *t or *h separately—the asterisks indicate that neither sound exists in Láadan) — a set phrase for a wreath of grapevine, a common symbol of the language[48]
- shineshidethóo, 14 letters or 10 phonemes — an invited guest[49]
Latin
The longest attested word in Classical Latin is subductisupercilicarptor, which was coined by the obscure poet Laevius in the 1st century. In Medieval Latin, the longest known word is honorificabilitudinitas, which was first attested in a treatise written by the 8th-century grammarian Peter of Pisa. The words can be lengthened further by putting them in the dative plural, which yields subductisupercilicarptoribus and honorificabilitudinitatibus respectively.[citation needed]
Lithuanian
The longest Lithuanian word is 40 letters long:
- nebeprisikiškiakopūstlapiaujančiuosiuose — «in those, of masculine gender, who aren’t gathering enough wood sorrel’s leaves by themselves anymore.» — the plural locative case of past iterative active participle of verb kiškiakopūstlapiauti meaning «to pick wood-sorrels’ leaves» (leaves of edible forest plant with sour taste, word by word translation «rabbit cabbage»). The word is attributed to software developer / writer Andrius Stašauskas.[50][unreliable source?][51][unreliable source?]
Māori
The Māori-language 85-letter place name Taumatawhakatangihangakoauauotamateaturipukakapikimaungahoronukupokaiwhenuakitanatahu is the longest place name in English-speaking countries and second longest in the world, according to Wises New Zealand Guide and The New Zealand Herald.[52]
Polish
Very long Polish words can be created as adjectives from numerals and nouns. For example, Dziewięćsetdziewięćdziesięciodziewięcionarodowościowego, 54 letters, is the genitive singular form of an adjective meaning roughly «of nine-hundred and ninety-nine nationalities». Similar words are rather artificial compounds, constructed within allowed grammar rules, but are seldom used in spoken language, although they are not nonsense words.[citation needed] It is possible to make even longer words in this way, for example:
Dziewięćsetdziewięćdziesiątdziewięćmiliardówdziewięćsetdziewięćdziesiątdziewięćmilionów-dziewięćsetdziewięćdziesiątdziewięćtysięcydziewięćsetdziewięćdziesięciodziewięcioletniego (176 letters, meaning «of 999,999,999,999 years old»).
One of the longest common words is the 31-letter dziewięćdziesięciokilkuletniemu – the dative singular form of «ninety-and-some years old one». Other known long words are konstantynopolitańczykowianeczka[citation needed] (32 letters), «a daughter of a man who lives in Constantinople», and pięćdziesięciogroszówka (23 letters), «a 50 groszy coin».[53]
Romanian
The longest Romanian word is pneumonoultramicroscopicsilicovolcaniconioză, with 44 letters,[54] but the longest one admitted by the Dicționarul explicativ al limbii române («Explanatory Dictionary of the Romanian Language», DEX) is electroglotospectrografie, with 25 letters.[55][56]
Russian
Most likely one of the longest Russian words is a chemical term, тетрагидропиранилциклопентилтетрагидропиридопиридиновая (tetragidropiranilciklopentiltetragidropiridopiridinovaya), which contains 55 letters. It was used in Russian patent RU2285004C2 (granted and published in 2006). This word is an adjective that can describe e.g. a chemical formula. As a noun, it is without the last 4 letters.
Another one is превысокомногорассмотрительствующий (prevysokomnogorassmotritel’stvuyushchiy), which contains 35 letters. It is an adjective in the bureaucratic language of the 19th century «meaning a very polite form of addressing clerks, something like Your Excellency, Your Highness, Your Majesty all together» (Guinness World Records 2003[citation needed]). Its dative singular form, превысокомногорассмотрительствующему (prevysokomnogorassmotritel’stvuyushchemu, with 36 letters) can be an example of excessively official vocabulary of the 19th century.
Numeral compounds can be long as well, such as Тысячевосьмисотвосьмидесятидевятимикрометровый (Tysyachevos’misotvos’midesyatidevyatimikrometrovyy), which is an adjective containing 46 letters, meaning «1889-micrometers long».[57]
Sanskrit
Sanskrit allows word compounding of arbitrary length. Nouns and verbs can be expressed in a sentence.[citation needed]
The longest sentence ever used in Sanskrit literature is (in Devanagari):
- निरन्तरान्धकारितदिगन्तरकन्दलदमन्दसुधारसबिन्दुसान्द्रतरघनाघनवृन्द-सन्देहकरस्यन्दमानमकरन्दबिन्दुबन्धुरतरमाकन्दतरुकुलतल्पकल्पमृ-दुलसिकताजालजटिलमूलतलमरुवकमिलदलघुलघुलयकलितरमणीय-पानीयशालिकाबालिकाकरारविन्दगलन्तिकागलदेलालवङ्गपाटलघनसा-रकस्तूरिकातिसौरभमेदुरलघुतरमधुरशीतलतरसलिलधारानिराकरिष्णुत-दीयविमलविलोचनमयूखरेखापसारितपिपासायासपथिकलोकान्
In IAST transliteration:
- nirantarāndhakārita-digantara-kandaladamanda-sudhārasa-bindu-sāndratara-ghanāghana-vṛnda-sandehakara-syandamāna-makaranda-bindu-bandhuratara-mākanda-taru-kula-talpa-kalpa-mṛdula-sikatā-jāla-jaṭila-mūla-tala-maruvaka-miladalaghu-laghu-laya-kalita-ramaṇīya-pānīya-śālikā-bālikā-karāra-vinda-galantikā-galadelā-lavaṅga-pāṭala-ghanasāra-kastūrikātisaurabha-medura-laghutara-madhura-śītalatara-saliladhārā-nirākariṣṇu-tadīya-vimala-vilocana-mayūkha-rekhāpasārita-pipāsāyāsa-pathika-lokān
from the Varadāmbikā Pariṇaya Campū by Tirumalāmbā,[58] composed of 195 Sanskrit letters (428 letters in the roman transliteration, dashes excluded), thus making it the longest word ever to appear in worldwide literature.[59][60]
The hyphens separate the individual words of which this compound is composed.
The approximate meaning of this word is:
- «In it, the distress, caused by thirst, to travellers, was alleviated by clusters of rays of the bright eyes of the girls; the rays that were shaming the currents of light, sweet and cold water charged with the strong fragrance of cardamom, clove, saffron, camphor and musk and flowing out of the pitchers (held in) the lotus-like hands of maidens (seated in) the beautiful water-sheds, made of the thick roots of vetiver mixed with marjoram, (and built near) the foot, covered with heaps of couch-like soft sand, of the clusters of newly sprouting mango trees, which constantly darkened the intermediate space of the quarters, and which looked all the more charming on account of the trickling drops of the floral juice, which thus caused the delusion of a row of thick rainy clouds, densely filled with abundant nectar.»
Slovak
Traditionally, the word najneobhospodarovávateľnejšieho («of the least cultivable», 31 letters) is considered the longest Slovak word, but there are some longer artificial words. Most of them are compound adjectives in the dative, instrumental or another grammatical case, derived from the iterative or frequentative verbal form or the ability adjective form (like -able).[61][62]
Artificial words, lexically valid but never used in language:
- znajneprekryštalizovávateľnejšievajúcimi, 40 letters, «through the least crystallised ones»
- znajnepreinternacionalizovateľnejšievať, 39 letters
- najnezrevolucionalizovateľnejšiemu, 34 letters[63]
- najnerozkrasokorčuľovateľnejšieho, 33 letters
Artificial words using Slovak towns or places, lexically valid but never used in language:
- znajneprehornádskodružstevnianskovávateľnejšievajúcimi, 54 letters
- znajneprechminianskojakubovianskovávateľnejšievajúcimi, 54 letters
Numerals:
- deväťstodeväťdesiatdeväťtisícštyristodeväťdesiatdeväť, 53 letters, «999499»[64]
- sedemstodeväťdesiatsedemtisícsedemstodeväťdesiatsedem, 53 letters, «797797»[65]
Spanish
The longest word in Spanish is esternocleidomastoideitis (inflammation of the sternocleidomastoid muscle, 25 letters).[66] Runners-up are anticonstitucionalmente ([proceeding in a manner that is] contrary to the constitution), with 23 letters, and electroencefalografistas (specialists that do electrical scans on brains (electroencephalographists)), with 24 letters.
The word anticonstitucionalmente is usually considered the longest word in general use. This word can be made even longer by the addition of the absolute superlative suffix, rendering anticonstitucionalísimamente (i.e.: «very strongly against the constitution»). Some dictionaries (but not the RAE dictionary[67]) removed its root word (anticonstitucional) in 2005, causing comments about it not «being a valid word anymore» and suggesting the use of inconstitucional as a replacement.[citation needed]
Swedish
Realisationsvinstbeskattning (28 letters) is the longest word in Svenska Akademiens Ordlista. It means «capital gains taxation», and is usually shortened to Reavinstskatt (same meaning).
However, Swedish grammar makes it possible to create arbitrarily long words. One such word is Spårvagnsaktiebolagsskensmutsskjutarefackföreningspersonalbeklädnadsmagasinsförråd-sförvaltarens (94 letters) which means: «[belonging to] The manager of the depot for the supply of uniforms to the personnel of the track cleaners’ union of the tramway company».[68]
Toki Pona
kijetesantakalu in the Toki Pona writing system sitelen pona
The longest word in Toki Pona is kijetesantakalu (15 letters), which was proposed in 2009 as an April Fools’ joke by the language’s creator Sonja Lang as a word for any animal of the Procyonidae family, which includes raccoons and related species.[69] The word has since entered into common use, and it has become common to define kijetesantakalu more broadly as any animal from the Musteloidea superfamily.[70] In 2019 James Flear designed a glyph for kijetesantakalu in Toki Pona’s sitelen pona writing system, which has become a popular icon within the Toki Pona community.[71]
Because Toki Pona is a minimalistic isolating constructed language, most of its words are much shorter, the median being 4 letters. The longest words featured in the 2014 book Toki Pona: The Language of Good, Lang’s first official Toki Pona publication, are the 7-letter words kepeken («to use, by means of») and sitelen («symbol, picture»). The list of proposed country names in the same book also mentions ma Papuwanijukini («Papua New Guinea»), which includes a 14-letter proper adjective.[72]
Vietnamese
Vietnamese is an isolating language, which naturally limits the length of a morpheme. The longest, at seven letters, is nghiêng, which means «inclined» or «to lean».[73] This is the longest word that can be written without a space. However, not all words in Vietnamese are single morphemes. Indeed, nghiêng can be reduplicated as nghiêng nghiêng.
The written language abounds with compound words in which each constituent word is delimited by spaces, just like any freestanding word. Moreover, the grammar lacks inflection to mark parts of speech, and prepositions are often optional. Therefore, the boundary between a word and a phrase is poorly defined.[74] Examples of this ambiguity include:
- Chủ nghĩa phân biệt chủng tộc («racism»), which is composed of the words chủ nghĩa («ideology»), phân biệt («discriminate»), and chủng tộc («race»)
- Cơm gà xào sả ớt, which literally describes a dish of grilled chicken sauteed with lemongrass and peppers on rice
- Ông bà anh chị em, a polite pronoun composed of five kinship terms
Unlike locally coined compound words, compound words in Sino-Vietnamese vocabulary are less ambiguous, because of the use of premodifiers (as in English) as opposed to the native postmodifiers. Long Sino-Vietnamese words include bách khoa toàn thư («encyclopedia») and thủy động lực học («hydrodynamics»).
Loanwords and pronunciation respellings from other languages can also result in long words. For example, «consortium» is côngxoocxiom (12 letters), and «Indonesia» may be left as-is or spelled In-đô-nê-xi-a (13 counting hyphens).[75] The Encyclopedic Dictionary of Vietnam systematically respells foreign names, introducing long names into an official Vietnamese lexicon:
- Kômixacjepxkaia («Komissarzhevskaya», 15 letters)[76]
- Rôjơđextơvenxki («Rozhdestvensky», 15 letters)[77]
- Mêtơrôpôliten Ôpêra («Metropolitan Opera», 18 letters)[78]
Long initialisms in Vietnamese include:
- CHXHCNVN (Cộng hòa Xã hội chủ nghĩa Việt Nam, «Socialist Republic of Vietnam», 8 characters)
- MTDTGPMNVN (Mặt trận Dân tộc Giải phóng miền Nam Việt Nam, «Viet Cong», 10 characters)
In modern Vietnamese, compound words can be identified fairly easily within title cased text: a morpheme that begins with a capital letter followed by one or more morphemes that begin with a lowercase letter. For example, xã hội chủ nghĩa («socialism») is capitalized as one component within Cộng hòa Xã hội chủ nghĩa Việt Nam.
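The capitalization rule just described can be sketched in a few lines of Python. The function name `group_title_case` is hypothetical, and the sketch assumes the input is already title-cased in the way the paragraph describes and that morphemes are separated by whitespace; it is an illustration of the rule, not a general Vietnamese word segmenter.

```python
def group_title_case(text):
    """Group morphemes of title-cased text: each capitalized morpheme
    opens a new group, and lowercase morphemes join the current group."""
    groups = []
    for morpheme in text.split():
        if groups and not morpheme[0].isupper():
            groups[-1].append(morpheme)   # continue the current compound
        else:
            groups.append([morpheme])     # a capital letter starts a new group
    return [" ".join(group) for group in groups]


name = "Cộng hòa Xã hội chủ nghĩa Việt Nam"
for group in group_title_case(name):
    kind = "compound" if " " in group else "single morpheme"
    print(f"{group}: {kind}")
```

On the example from the text, this groups xã hội chủ nghĩa as a single four-morpheme component, while Việt and Nam each stand alone because both are capitalized.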
Welsh
Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch, a railway station on the island of Anglesey in Wales, is the longest place name in the Welsh language. At 51 letters in the Welsh alphabet (the digraphs ll and ch are each collated as single letters) the name can be translated as «St Mary’s church in the hollow of the white hazel near to the rapid whirlpool and the church of St Tysilio of the red cave». However, it was artificially contrived in the 1860s as a publicity stunt, to give the station the longest name of any railway station in the United Kingdom.
Long words are comparatively rare in Welsh. Candidates for long words other than proper nouns include the following (the digraph dd is also treated as a single letter, as is ng in many instances including in the last word below):
- gwrthddatgysylltiadaeth (antidisestablishmentarianism)
- microgyfrifiaduron (microcomputers)
- gwrthgyfansoddiaethwyr (anticonstitutionalists)
- lled-ddargludyddion (semiconductors)
- tra-arglwyddiaethasant (they tyrannised)
- cyfrwngddarostynedigaeth (intercession)[79] (-au can be added to form the plural, and the word can be further lengthened slightly by initial mutation: fy nghyfrwngddarostynedigaethau, «my intercessions»)
See also
- Morphology (linguistics)
- Longest English sentence
- Coxeter group — mathematical concept whose entities are sometimes called words
References
- ^ McCulloch S. «Longest word in English». Sarah McCulloch.com. Archived from the original on 14 January 2010. Retrieved 12 October 2016.
- ^ Oxford Word and Language Service team. «Ask the experts — What is the longest English word?». AskOxford.com / Oxford University Press. Archived from the original on 13 September 2008. Retrieved 13 January 2008.
- ^ (in Basque) Iñaki Arranz, Hitza azti, Alberdania, 2006, 283 pages. (Zein da euskal hitzik luzeena?)
- ^ Jordan, David K. (1 July 1999). «Chapter 4 (Part 1): Nouns». Being colloquial in Esperanto: a reference guide. Esperanto League for North Amer. ISBN 9780939785049.
The last, «silly» line is the same as the «wrong» one, but it is technically possible because it is a single noun.
- ^ Wennergren, Bertilo. «PMEG – Precizigaj antaŭelementoj – Kombinoj el kombinoj». Plena Manlibro de Esperanta Gramatiko. Retrieved 7 March 2022.
- ^ «Akademia Vortaro». Akademio de Esperanto. Archived from the original on 24 July 2011. Retrieved 30 November 2009.
- ^ Gonçalo Neves (1997). «Bontone pri la bretona». Monato. Retrieved 7 March 2022.
- ^ «Estonian / Lingvopedia :: lingvo.info». lingvo.info. Retrieved 20 April 2020.
- ^ Appears on page 97 in Laaksonen, Lasse: Viina, hermot ja rangaistukset — sotilasjohdon henkilökohtaiset ongelmat 1918-1945. Docendo, Helsinki 2017.
- ^ «Suupohjan peruspalveluliikelaitoskuntayhtymä – LLKY». llky.fi.
- ^ Karilas, Yrjö: Antero Vipunen, arvoitusten ja ongelmien, leikkien ja pelien sekä eri harrastelualojen pikkujättiläinen, p. 226, 20th edition. WSOY 2003. ISBN 9510121770
- ^ 139. point [https://helyesiras.mta.hu/helyesiras/default/akh12] in the Hungarian Academy of Sciences: Rules of Hungarian Orthography
- ^ See the end of the entry megszentségtelenít in a monolingual dictionary of Hungarian
- ^ «청자양인각연당초상감모란문은구대접». Naver Dictionary. Retrieved 6 August 2015.
- ^ «독일에서 가장 긴 단어 사라진다» [Longest word in Germany disappears]. JoongAng Ilbo. 4 June 2013. Retrieved 6 August 2015.
- ^ «Grammar Pro» Archived 13 December 2017 at the Wayback Machine, a page of the collaborative Anishinaabe language revitalization effort
- ^ «PUTANGINA». TAGALOG LANG. 30 December 2015. Retrieved 25 April 2018.
- ^ «Yeni Mesaj Internet Sitesi». www.yenimesaj.com.tr. Archived from the original on 18 July 2011.
- ^ «Papatyam Forum». www.papatyam.org. Archived from the original on 27 July 2011.
- ^ «Çekoslavakyalılaştıramadıklarımızdan mısınız? TDK’ye Göre Doğru Yazılışı — Çekoslavakyalılaştıramadıklarımızdan mısınız? Doğru Yazımı Nasıldır?». nasil.yazilir.com. 23 December 2016.
- ^ Rosenthal, Eric (1982). Total Book of South African records. Delta Books. p. 61. ISBN 0-908387-19-9.
- ^ الكاش, علي (9 January 2021). الصوفية والصفوية، خصائص وأهداف مشتركة (First ed.). البُرهان. p. 195.
- ^ «непротивоконституционствувателствувайте». rechnik.info. Retrieved 28 October 2013.
- ^ «Psiconeuroimmunoendocrinologia: la paraula més llarga de la UB? – Vocabulària». www.ub.edu (in Catalan). Retrieved 18 November 2017.
- ^ Jeste li znali da najdulja hrvatska riječ ima 31 slovo?, Dalmacija News, 22 February 2014.
- ^ «A Collection of Word Oddities and Trivia». francesfarmersrevenge.com. Archived from the original on 27 April 2009. Retrieved 7 March 2009.
- ^ «Wat is het langste woord in het Nederlands». levenslangleren.be.
- ^ «Welkom bij OpenTaal». opentaal.org.
- ^ a b c «What is the longest English word?» (oxforddictionaries.com)
- ^ «pneumonoultramicroscopicsilicovolcanoconiosis definition». reference.com. Retrieved 7 March 2009.
- ^ «PNEUMONOULTRAMICROSCOPICSILICOVOLCANOCONIOSIS». pathology.med.ohio-state.edu. Archived from the original on 8 June 2009. Retrieved 7 March 2009.
- ^ «BBC – h2g2 – Pneumonoultramicroscopicsilicovolcanoconiosis – The Longest Word». BBC. Retrieved 7 March 2009.
- ^ «Pseudopseudohypoparathyroidism | Genetic and Rare Diseases Information Center (GARD) – an NCATS Program». rarediseases.info.nih.gov. Retrieved 31 January 2017.
- ^ «What is the longest English word?». AskOxford. Archived from the original on 22 October 2008. Retrieved 22 August 2010.
- ^ «Law change spells end for Germany’s longest word». salon.com. Associated Press. 4 June 2013.
- ^ a b Austria chooses its Word of the Year, The Local, 9 Dec. 2016.
- ^ Presseerklärung der Jury zur Wahl des Österreichischen Worts des Jahres, Forschungsstelle Österreichisches Deutsch, 9 Dec. 2016
- ^ De Luca, Kenneth M. (2005). Aristophanes’ male and female revolutions : a reading of Aristophanes’ Knights and Assemblywomen. Lanham, MD: Lexington Books. p. 124. ISBN 978-0-7391-0833-8.
- ^ «Longest word in hebrew | Hebrew language | Preply». preply.com. Retrieved 27 May 2020.
- ^ «हिंदी भाषा का अब तक निर्मित किया गया सबसे बड़ा शब्द है?». Upto Cricket (in Hindi). Retrieved 28 February 2021.
- ^ Helgason, Haukur Már. «Hvernig hljóðar lengsta orð í heimi á íslensku?». Vísindavefurinn. University of Iceland. Retrieved 28 December 2013.
- ^ Quasthoff, Uwe; Fiedler, Sabine; Hallsteinsdóttir, Erla, eds. (14 May 2012). Frequency Dictionary Icelandic / Íslensk tíðniorðabók. Leipziger Universitätsverlag. ISBN 978-3-86583-656-4. OCLC 808247819.
- ^ http://cls.corpora.uni-leipzig.de/de/isl-is_web_2015/3.5.6_Longest%20Words.html
- ^ «3 kata terpanjang dalam KBBI».
- ^ «Foclóir Gaeilge–Béarla (Ó Dónaill): grianghrafadóireacht». www.teanglann.ie. Retrieved 13 January 2022.
- ^ Crusca, Accademia Della (1829). «Dizionario della lingua italiana …»
- ^ «Dante: De Vulgari Eloquentia II». Retrieved 22 July 2016.
- ^ «Láadan-to-English». laadanlanguage.org.
- ^ «Láadan to English – Sh». laadanlanguage.org. 25 October 2015.
- ^ «A Collection of Word Oddities and Trivia». jeff560.tripod.com.
- ^ «Loooooooong words». Archived from the original on 9 May 2016. Retrieved 20 April 2016.
- ^ NZPA (11 August 2003). «Nasa turns to Kiwi when it needs expert space advice». New Zealand Herald. Retrieved 28 March 2011.
Three years ago, Mr Coleman, a website designer, posted a message on an internet bulletin board about Taumatawhakatangihangakoauauotamateaturipukakapikimaungahoronukupokaiwhenuakitanatahu in southern Hawkes Bay. It is the second-longest place name in the world, according to Wises New Zealand Guide.
- ^ «pięćdziesięciogroszówka — Słownik SJP». sjp.pl.
- ^ Bălhuc, Paul (15 January 2017). «Câte litere are cel mai lung cuvânt din limba română și care este singurul termen ce conține toate vocalele». Adevărul (in Romanian).
- ^ «Electroglotospectrografie». Dicționarul explicativ al limbii române (in Romanian). Retrieved 10 February 2021.
- ^ «Curiozități lingvistice: cele mai lungi cuvinte din limba română». Dicție.ro (in Romanian). Retrieved 10 February 2021.
- ^ «Слитное и раздельное написание имён числительных — Агентство переводов Lingvotech». lingvotech.com.
- ^ «Ἡλληνιστεύκοντος». hellenisteukontos.blogspot.in. 13 March 2010.
- ^ McFarlan, Donald; McWhirter, Norris (1991). Guinness Book of World Records, 1991. ISBN 9780553289541.
- ^ «Guinness World Records – Longest word». Retrieved 23 September 2017.
- ^ «Aké je najdlhšie slovo v slovenčine?». sme.sk.
- ^ http://www.juls.savba.sk/ediela/ks/2012/4/ks2012-4.pdf
- ^ «Viete, ktoré slovo slovenského jazyka je najdlhšie?». 23 November 2021.
- ^ «Promo/NocVyskumnikov2011/Kviz».
- ^ «Najdlhšie slová v slovenčine, aké dokážeme povedať a vysloviť». 8 December 2016.
- ^ Roldán Calzado, Juan Luis (2 October 2008). «La palabra más larga». Me la juego a letras (in Spanish). Lulu Press. p. 34. ISBN 978-1-4092-2893-6. Retrieved 15 March 2017 – via Google Books.
- ^ «Anticonstitucional | Diccionario de la lengua española».
- ^ The Guinness Book of Records 1985. Guinness Books. 1985. p. 89. ISBN 0-85112-419-4.
- ^ Sonja Lang. «New official word / Nova oficiala vorto». Retrieved 7 March 2022.
- ^ Sonja Lang (2021). Toki Pona Dictionary. ISBN 978-0-9782923-6-2.
- ^ «toki pona | toki! After a lot of demand for a sitelen pona glyph for the extinct words «apeja» and «kijetesantakalu» (believe it or not), I’ve d…». www.facebook.com. 13 July 2019. Retrieved 24 October 2022.
- ^ Sonja Lang (2014). Toki Pona: The Language of Good. ISBN 978-0-9782923-0-0.
- ^ Phan Ngọc Linh; Phạm Thịnh. ««Lộ» sai sót mới tại CK Đường lên đỉnh Olympia 2012?». Dân Trí. Retrieved 18 October 2013.
- ^ Barnes, Leslie (2014). Vietnam and the Colonial Condition of French Literature. University of Nebraska Press. p. 125. ISBN 978-0-8032-6675-9 – via Google Books.
The formal characteristics of Vietnamese compounds are not completely clear, however, and because no obvious graphic boundaries exist to demarcate one word from another, the distinction between word and phrase is often very difficult to discern.
- ^ «Thông tin cơ bản về các nước, khu vực và quan hệ với Việt Nam» [Basic information on countries, regions, and relations with Vietnam] (in Vietnamese). Vietnam Ministry of Foreign Affairs.
- ^ «Kômixacjepxkaia V. F.». Encyclopedic Dictionary of Vietnam (in Vietnamese). 2005.
- ^ «Rôjơđextơvenxki G. N.». Encyclopedic Dictionary of Vietnam (in Vietnamese). 2005.
- ^ «Mêtơrôpôliten Ôpêra». Encyclopedic Dictionary of Vietnam (in Vietnamese). 2005.
- ^ «LISTSERV 15.5 – WELSH-L Archives». heanet.ie.