All English word database

List Of English Words

A text file containing over 466k English words.

While searching for a list of English words (for an auto-complete tutorial),
I found https://stackoverflow.com/questions/2213607/how-to-get-english-language-word-database, which refers to https://www.infochimps.com/datasets/word-list-350000-simple-english-words-excel-readable (archived).

No idea why infochimps put the word list inside an Excel (.xls) file.

I pulled the words out into a simple newline-delimited text file,
which is more useful when building apps or importing into databases.

Copyright still belongs to them.

Files you may be interested in:

  • words.txt contains all words.
  • words_alpha.txt contains only [[:alpha:]] words (words made up of letters only, with no numbers or symbols). If you want a quick solution, choose this.
  • words_dictionary.json contains all the words from words_alpha.txt in JSON format.
    If you are using Python, you can load this file and use it as a dictionary for faster lookups. Every word is assigned the value 1 in the dictionary.

See read_english_dictionary.py for example usage.
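A minimal sketch of the idea behind the files listed above, not the contents of read_english_dictionary.py itself. The function names are hypothetical, and two points are assumptions: that "alpha" means ASCII letters only, and that the words in words_dictionary.json are stored in lowercase.

```python
import json
import re

# Assumption: "alpha" means ASCII letters only, like [[:alpha:]] in the C locale.
ALPHA_ONLY = re.compile(r"[A-Za-z]+")


def build_dictionary(words):
    """Map every letters-only word to 1, the shape described for words_dictionary.json."""
    stripped = (line.strip() for line in words)
    return {w: 1 for w in stripped if ALPHA_ONLY.fullmatch(w)}


def load_dictionary(path="words_dictionary.json"):
    """Load the word -> 1 mapping from the JSON file."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def is_english_word(word, dictionary):
    """Membership test; assumes the dictionary keys are lowercase."""
    return word.lower() in dictionary


if __name__ == "__main__":
    # Stand-in data so the sketch runs without the real files.
    dictionary = build_dictionary(["apple", "3-D", "zebra", "2nd"])
    print(sorted(dictionary))                    # ['apple', 'zebra']
    print(is_english_word("Apple", dictionary))  # True
    print(is_english_word("3-D", dictionary))    # False
```

A dict (or a set) gives average constant-time lookups, which is why loading the JSON file is faster than scanning the text file for each query.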

I need a database of every single valid word in English. I checked the /usr/share/dict/words file, and it contains fewer than 100k words. Wikipedia says English has 475k words. Where do I get the complete list (American spelling)?

Also, is there a single website that gives out words for other languages too, including Asian and European ones?

Edit: Forgot to add, I do not need names etc., just valid English words.

pigrammer

asked Feb 6, 2010 at 15:31

The WordNet database might be helpful. I once worked on a Firefox add-on that dealt with words and all kinds of associations between them, from simple to complicated. It looks like WordNet would be very useful to you.

Here it is in MySQL format. And this one (web-archived link) uses WordNet v3.0 data rather than the older WordNet 2.0 data.

answered Feb 6, 2010 at 16:09

user266803

You can find what you need on infochimps.org.

They have a list of 350,000 simple (i.e. non-compound) words available for free download.

Word List — 350,000+ Simple English Words

Regarding other languages, you might want to poke around on Wiktionary. Here is a link to all the database backups. The information isn't organized as nicely, but if they have a language, you can download the data in SQL format.

answered Feb 6, 2010 at 15:35

danben

I do not see http://wordlist.sourceforge.net/ mentioned here, but that is where I would start if I were looking for something like this (and I was, when I stumbled over this question).

If you cannot find what you want there, and what you want is a list of English words, then you should probably spend some extra time describing how to recognize what it is that you want.

answered Feb 6, 2012 at 14:25

rdm

There's no such thing as a "complete" list. Different people have different ways of measuring: for example, they might include slang, neologisms, multi-word phrases, offensive terms, foreign words, verb conjugations, and so on. Some people have even counted a million words! So you'll have to decide what you want in a word list.

answered Feb 6, 2010 at 16:21

JW.

You could check the *spell en-GB dictionary used by Mozilla, OpenOffice, and plenty of other software.

answered Feb 6, 2010 at 17:20

mloskot

You didn't say what you need this list for. If a blacklist for password checks is enough, cracklib might be good for you. It contains over 1.5M words.
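As a rough illustration of the blacklist idea, here is a sketch that checks a candidate password against a word list. It does not use cracklib's own API; a plain Python set stands in for the dictionary, and the function name and sample words are hypothetical.

```python
def is_blacklisted(password, wordlist):
    """Return True if the lowercased password appears in the word list."""
    return password.lower() in wordlist


if __name__ == "__main__":
    # Hypothetical sample; a real check would load a large word list from disk.
    blacklist = {"password", "letmein", "dragon"}
    print(is_blacklisted("Dragon", blacklist))    # True
    print(is_blacklisted("x9$Tq!vL", blacklist))  # False
```

An exact-match lookup like this is only the simplest form of such a check; it does not catch variations such as appended digits.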

answered Feb 6, 2010 at 15:45

Benjamin Bannier

English word database in a simple format

I generated this as a by-product of some other work; perhaps someone will find it useful.

Odd-numbered lines are English words and even-numbered lines are their Russian translations, about 100,000 entries in total. It is a text file compressed as a ZIP archive, in the Russian single-byte Windows encoding (Windows-1251).

Download the *.txt database: more than 100,000 English words with translations (613 KB)

08.06.2009, 22:55 [29162 views]
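A sketch of how a file in that layout could be read, assuming the archive holds a single text member with English words on odd-numbered lines and Russian translations on even-numbered lines. The function names, and any archive or member names passed in, are hypothetical; the actual file names in the download are not known here.

```python
import io
import zipfile


def pairs_from_lines(lines):
    """Pair each odd-numbered line (English) with the even-numbered line after it (Russian)."""
    cleaned = [line.rstrip("\r\n") for line in lines]
    return list(zip(cleaned[0::2], cleaned[1::2]))


def read_pairs(zip_path, member):
    """Decode a Windows-1251 text member of a ZIP archive into (english, russian) pairs."""
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding="cp1251")
            return pairs_from_lines(text)


if __name__ == "__main__":
    # In-memory stand-in for the decoded file contents.
    sample = ["cat\n", "кошка\n", "dog\n", "собака\n"]
    print(pairs_from_lines(sample))
```

Decoding with `cp1251` matters: reading the file as UTF-8 would either fail or produce garbled Russian text.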



Download English word lists free of charge

a selection of word lists sorted by frequency

All word lists were generated from a huge multi-billion-word sample of language called a corpus, which ensures that all topics and text types are covered and that the word list reflects how words are used by real users. The word lists include the most common and frequently used words; the most frequently used nouns, verbs, adjectives and prepositions; and some additional word lists.

How are different forms of the same word counted?

All word lists are lemmatized: different forms of the same word are counted together, e.g. goes, went, gone, going and go are counted together and listed as go. This is generally more practical. However, sometimes non-lemmatized word lists, which list each word form separately, are needed. Sketch Engine can generate both types of word lists.
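The difference between the two kinds of list can be sketched in a few lines. This is an illustration only, not how Sketch Engine works internally; the `LEMMAS` table is a hypothetical hand-written stand-in for a real lemmatizer.

```python
from collections import Counter

# Hypothetical lemma table; a real corpus tool would use a full lemmatizer.
LEMMAS = {"goes": "go", "went": "go", "gone": "go", "going": "go"}


def count_forms(tokens):
    """Non-lemmatized count: every word form is counted separately."""
    return Counter(t.lower() for t in tokens)


def count_lemmas(tokens, lemmas=LEMMAS):
    """Lemmatized count: forms of the same word are counted under their lemma."""
    return Counter(lemmas.get(t.lower(), t.lower()) for t in tokens)


if __name__ == "__main__":
    tokens = ["go", "goes", "went", "gone", "going", "cat"]
    print(count_forms(tokens)["go"])   # 1
    print(count_lemmas(tokens)["go"])  # 5
```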

I want longer word lists!

Longer English word lists of the most frequent and common words can be generated with Sketch Engine. There is no limit for word lists generated from user corpora; however, there is a limit of 1,000 items for word lists generated from preloaded corpora. The user can produce any number of word lists. Advanced filtering criteria using regex can be applied so that the word list contains exactly what the user needs. Register for a free trial account with Sketch Engine to generate longer word lists in English.
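To give a feel for regex filtering of a word list, here is a small sketch using Python's `re` module. It is not Sketch Engine's interface, and its regex dialect may differ; the function name and sample words are made up for illustration.

```python
import re


def filter_words(words, pattern):
    """Keep the words that fully match the regular expression, ignoring case."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [w for w in words if regex.fullmatch(w)]


if __name__ == "__main__":
    words = ["gnome", "gnat", "sign", "Gnu", "ignore"]
    # Words that start with "gn".
    print(filter_words(words, r"gn.*"))  # ['gnome', 'gnat', 'Gnu']
```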

A list of all words in English

Unlimited word lists from preloaded corpora, e.g. a list of all words in English, can be generated from our multi-billion word English corpora.

Please refer to this page to get information about English word lists for commercial use.
