Word lists and history

A word list (or lexicon) is a list of words from a language’s vocabulary, generally sorted by frequency of occurrence within a given text corpus (either by levels or as a ranked list), and serves the purpose of vocabulary acquisition. A lexicon sorted by frequency «provides a rational basis for making sure that learners get the best return for their vocabulary learning effort» (Nation 1997), but is mainly intended for course writers, not directly for learners. Frequency lists are also made for lexicographical purposes, serving as a checklist to ensure that common words are not left out. Some major pitfalls are the corpus content, the corpus register, and the definition of «word». While word counting is at least a thousand years old, and gigantic analyses were still done by hand in the mid-20th century, electronic natural language processing of large corpora such as movie subtitles (the SUBTLEX megastudy) has accelerated the research field.

In computational linguistics, a frequency list is a sorted list of words (word types) together with their frequency, where frequency here usually means the number of occurrences in a given corpus, from which the rank can be derived as the position in the list.

Type              Occurrences  Rank
the               3,789,654    1st
he                2,098,762    2nd
[…]
king              57,897       1,356th
boy               56,975       1,357th
[…]
stringyfy         5            34,589th
[…]
transducionalify  1            123,567th
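
As an illustration of how such a table can be produced, here is a minimal Python sketch. It assumes a naive tokenizer (lowercased runs of letters and apostrophes) and a toy corpus; the function name frequency_list is hypothetical and not taken from any of the cited studies.

```python
import re
from collections import Counter

def frequency_list(text):
    """Return (type, occurrences, rank) rows sorted by descending frequency."""
    # Naive tokenization (an assumption): lowercase runs of letters/apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    # Sort by descending count, then alphabetically so ties are deterministic.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    # The rank is simply the 1-based position in the sorted list.
    return [(word, n, rank) for rank, (word, n) in enumerate(ordered, start=1)]

# Toy corpus, for illustration only.
corpus = "the king and the boy saw the king"
for word, n, rank in frequency_list(corpus):
    print(f"{word:<6}{n:>3}{rank:>4}")
```

Ties are broken alphabetically here only to make the output reproducible; real frequency lists may assign tied items the same rank.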

Methodology

Factors

Nation (1997) noted how much computing capabilities have helped, making corpus analysis much easier. He cited several key issues that influence the construction of frequency lists:

  • corpus representativeness
  • word frequency and range
  • treatment of word families
  • treatment of idioms and fixed expressions
  • range of information
  • various other criteria

Corpora

Traditional written corpus

Most currently available studies are based on written text corpora, which are more readily available and easier to process.

SUBTLEX movement

However, New et al. 2007 proposed tapping into the large number of subtitles available online to analyse large amounts of speech. Brysbaert & New 2009 made a long critical evaluation of the traditional textual analysis approach, and supported a move toward analysing speech and the film subtitles available online. This has been followed by a handful of follow-up studies,[1] providing valuable frequency counts for various languages. Within five years, the SUBTLEX movement completed full studies for French (New et al. 2007), American English (Brysbaert & New 2009; Brysbaert, New & Keuleers 2012), Dutch (Keuleers & New 2010), Chinese (Cai & Brysbaert 2010), Spanish (Cuetos et al. 2011), Greek (Dimitropoulou et al. 2010), Vietnamese (Pham, Bolger & Baayen 2011) and Brazilian Portuguese (Tang 2012). Later studies cover Albanian (Avdyli & Cuetos 2013), Polish (Mandera et al. 2014), European Portuguese (Soares et al. 2015) and Catalan (2019[2]). SUBTLEX-IT (2015) provides raw data only.[1]

Lexical unit

In any case, the basic «word» unit must be defined. For Latin scripts, a word is usually a sequence of one or more characters delimited by spaces or punctuation. But exceptions can arise, such as English «can’t», French «aujourd’hui», or idioms. It may also be preferable to group the words of a word family under its base word. Thus, possible, impossible and possibility are words of the same word family, represented by the base form *possib*. For statistical purposes, the occurrences of all these words are summed under the base form *possib*, allowing both the concept and the individual forms to be ranked. Other languages may present specific difficulties. Such is the case of Chinese, which does not use spaces between words, and where a given string of several characters can be interpreted either as a sequence of single-character words or as one multi-character word.
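
The two decisions described above (what counts as one token, and whether to sum a word family under its base form) can be sketched in Python as follows. This is only an illustration under stated assumptions: the regular expression is one possible way to keep «can’t» and «aujourd’hui» whole, and the FAMILY table is a hypothetical hand-made mapping, not a real lexical resource.

```python
import re
from collections import Counter

# Keep an internal apostrophe (straight or curly) inside a token,
# so "can’t" and "aujourd’hui" are each counted as one word.
TOKEN = re.compile(r"\w+(?:['’]\w+)*")

# Hypothetical word-family table: maps word forms to a base form.
FAMILY = {
    "possible": "possib",
    "impossible": "possib",
    "possibility": "possib",
}

def tokenize(text):
    """Split lowercased text into word tokens."""
    return TOKEN.findall(text.lower())

def family_counts(text):
    """Count occurrences, summing family members under their base form."""
    return Counter(FAMILY.get(token, token) for token in tokenize(text))

print(tokenize("I can’t come aujourd’hui"))
print(family_counts("Possible, impossible: a possibility?"))
```

Words missing from the table are counted under their own form, so the sketch degrades to a plain word count when no family information is available.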

Statistics

It seems that Zipf’s law holds for frequency lists drawn from longer texts of any natural language. Frequency lists are a useful tool when building an electronic dictionary, which is a prerequisite for a wide range of applications in computational linguistics.
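
Zipf’s law states that the frequency of an item is roughly inversely proportional to its rank, so a plot of log frequency against log rank is close to a straight line with slope near -1. As a rough sketch of how this could be checked on a frequency list, the Python function below fits that slope by ordinary least squares. The function name zipf_exponent and the synthetic data are assumptions made for illustration, not part of any cited study.

```python
import math

def zipf_exponent(frequencies):
    """Estimate s in frequency ~ C / rank**s by least squares on log-log data."""
    freqs = sorted(frequencies, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    # The fitted slope is negative; the exponent is its absolute size.
    return -covariance / variance

# Synthetic counts that follow Zipf's law exactly (frequency = 1000 / rank).
ideal = [1000 / rank for rank in range(1, 51)]
print(round(zipf_exponent(ideal), 3))
```

On real corpora the estimate depends on the range of ranks included, so a value near 1 is only indicative.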

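German linguists define the Häufigkeitsklasse (frequency class) $N$ of an item in the list using the base 2 logarithm of the ratio between its frequency and the frequency of the most frequent item. The most common item belongs to frequency class 0 (zero) and any item that is approximately half as frequent belongs in class 1. Measured against the most frequent item of the example list above (the, 3,789,654 occurrences), a misspelled word such as outragious with 76 occurrences has a ratio of 76/3789654 and belongs in class 16.

$$N = \left\lfloor 0.5 - \log_2\left(\frac{\text{frequency of this item}}{\text{frequency of the most common item}}\right)\right\rfloor$$

where $\lfloor \ldots \rfloor$ is the floor function.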
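
The frequency class can be computed in a few lines of Python. This sketch assumes the usual definition N = floor(0.5 - log2(f / f_max)), which matches the statements above that the most common item is in class 0 and an item half as frequent is in class 1; the function name frequency_class is hypothetical.

```python
import math

def frequency_class(freq, max_freq):
    """Häufigkeitsklasse of an item, given the count of the most frequent item.

    Assumed definition: floor(0.5 - log2(freq / max_freq)).
    """
    return math.floor(0.5 - math.log2(freq / max_freq))

MOST_FREQUENT = 3_789_654  # count of "the" in the example list

print(frequency_class(MOST_FREQUENT, MOST_FREQUENT))       # most common item
print(frequency_class(MOST_FREQUENT // 2, MOST_FREQUENT))  # half as frequent
print(frequency_class(76, MOST_FREQUENT))                  # "outragious"
```

The three calls correspond to the cases discussed in the text: the most frequent item, an item half as frequent, and the 76-occurrence misspelling.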

Frequency lists, together with semantic networks, are used to identify the least common, specialized terms to be replaced by their hypernyms in a process of semantic compression.

Pedagogy

These lists are not intended to be given directly to students, but rather to serve as a guideline for teachers and textbook authors (Nation 1997). Paul Nation’s summary of modern language teaching encourages teachers first to «move from high frequency vocabulary and special purposes [thematic] vocabulary to low frequency vocabulary, then to teach learners strategies to sustain autonomous vocabulary expansion» (Nation 2006).

Effects of word frequency

Word frequency is known to have various effects (Brysbaert et al. 2011; Rudell 1993). Memorization is positively affected by higher word frequency, likely because the learner is exposed to the word more often (Laufer 1997). Lexical access is also positively influenced by high word frequency, a phenomenon called the word frequency effect (Segui et al.). The effect of word frequency is related to the effect of age of acquisition, the age at which the word was learned.

Languages

Below is a review of available resources.

English

Word counting dates back to Hellenistic times. Thorndike & Lorge, assisted by their colleagues, counted 18,000,000 running words to provide the first large-scale frequency list in 1944, before modern computers made such projects far easier (Nation 1997).

Traditional lists

These all suffer from their age. In particular, words relating to technology are missing: «blog», which in 2014 ranked #7665 in frequency[3] in the Corpus of Contemporary American English,[4] was first attested in 1999,[5][6][7] and does not appear in any of these three lists.

The Teachers Word Book of 30,000 words (Thorndike and Lorge, 1944)

The TWB contains 30,000 lemmas or ~13,000 word families (Goulden, Nation and Read, 1990). A corpus of 18 million written words was hand analysed. The size of its source corpus increased its usefulness, but its age, and language changes, have reduced its applicability (Nation 1997).

The General Service List (West, 1953)

The GSL contains 2,000 headwords divided into two sets of 1,000 words. A corpus of 5 million written words was analyzed in the 1940s. The rate of occurrence (%) of the different meanings and parts of speech of each headword is provided. Various criteria, other than frequency and range, were carefully applied to the corpus. Thus, despite its age, some errors, and its corpus being entirely written text, it is still an excellent database of word frequency, frequency of meanings, and reduction of noise (Nation 1997). This list was updated in 2013 by Dr. Charles Browne, Dr. Brent Culligan and Joseph Phillips as the New General Service List.

The American Heritage Word Frequency Book (Carroll, Davies and Richman, 1971)

A corpus of 5 million running words, from written texts used in United States schools (various grades, various subject areas). Its value lies in its focus on school teaching materials, and in its tagging of words by frequency in each school grade and in each subject area (Nation 1997).

The Brown (Francis and Kucera, 1982), LOB and related corpora

These contain 1 million words each from written corpora representing different dialects of English. These sources are used to produce frequency lists (Nation 1997).

French

Traditional datasets

A review has been made by New & Pallier.
An attempt was made in the 1950s–60s with the Français fondamental. It includes the F.F.1 list with 1,500 high-frequency words, completed by a later F.F.2 list with 1,700 mid-frequency words, and the most used syntax rules.[8] It is claimed that 70 grammatical words constitute 50% of communicative sentences,[9] while 3,680 words provide about 95–98% coverage.[10] A list of 3,000 frequent words is available.[11]

The French Ministry of Education also provides a ranked list of the 1,500 most frequent word families, compiled by the lexicologist Étienne Brunet.[12] Jean Baudot made a study on the model of the American Brown study, entitled «Fréquences d’utilisation des mots en français écrit contemporain».[13]

More recently, the Lexique3 project provides 142,000 French words, with orthography, phonetics, syllabification, part of speech, gender, number of occurrences in the source corpus, frequency rank, associated lexemes, etc., available under the open license CC-by-sa-4.0.[14]

Subtlex

Lexique3 is an ongoing study from which the Subtlex movement cited above originated. New et al. 2007 made a completely new count based on online film subtitles.

Spanish

There have been several studies of Spanish word frequency (Cuetos et al. 2011).[15]

Chinese

Chinese corpora have long been studied from the perspective of frequency lists. The historical way to learn Chinese vocabulary is based on character frequency (Allanic 2003). American sinologist John DeFrancis mentioned its importance for learning and teaching Chinese as a foreign language in Why Johnny Can’t Read Chinese (DeFrancis 1966). As frequency toolkits, Da (Da 1998) and the Taiwanese Ministry of Education (TME 1997) provided large databases with frequency ranks for characters and words. The HSK list of 8,848 high and medium frequency words in the People’s Republic of China, and the Republic of China (Taiwan)’s TOP list of about 8,600 common traditional Chinese words, are two other lists of common Chinese words and characters. Following the SUBTLEX movement, Cai & Brysbaert 2010 made a rich study of Chinese word and character frequencies.

Other

Lists of the most frequently used words in different languages, based on Wikipedia or combined corpora, are also available.[16]

See also

  • Letter frequency
  • Most common words in English
  • Long tail
  • Google Ngram Viewer – shows changes in word/phrase frequency (and relative frequency) over time

Notes

  1. ^ a b «Crr » Subtitle Word Frequencies».
  2. ^ Boada, Roger; Guasch, Marc; Haro, Juan; Demestre, Josep; Ferré, Pilar (1 February 2020). «SUBTLEX-CAT: Subtitle word frequencies and contextual diversity for Catalan». Behavior Research Methods. 52 (1): 360–375. doi:10.3758/s13428-019-01233-1. ISSN 1554-3528. PMID 30895456. S2CID 84843788.
  3. ^ «Words and phrases: Frequency, genres, collocates, concordances, synonyms, and WordNet».
  4. ^ «Corpus of Contemporary American English (COCA)».
  5. ^ «It’s the links, stupid». The Economist. 20 April 2006. Retrieved 2008-06-05.
  6. ^ Merholz, Peter (1999). «Peterme.com». Internet Archive. Archived from the original on 1999-10-13. Retrieved 2008-06-05.
  7. ^ Kottke, Jason (26 August 2003). «kottke.org». Retrieved 2008-06-05.
  8. ^ «Le français fondamental». Archived from the original on 2010-07-04.
  9. ^ Ouzoulias, André (2004), Comprendre et aider les enfants en difficulté scolaire: Le Vocabulaire fondamental, 70 mots essentiels (PDF), Retz — Citing V.A.C Henmon
  10. ^ «Generalities».
  11. ^ «PDF 3000 French words».
  12. ^ «Maitrise de la langue à l’école: Vocabulaire». Ministère de l’éducation nationale.
  13. ^ Baudot, J. (1992), Fréquences d’utilisation des mots en français écrit contemporain, Presses de L’Université, ISBN 978-2-7606-1563-2
  14. ^ «Lexique».
  15. ^ «Spanish word frequency lists». Vocabularywiki.pbworks.com.
  16. ^ Most frequently used words in different languages, ezglot

References

Theoretical concepts

  • Nation, P. (1997), «Vocabulary size, text coverage, and word lists», in Schmitt; McCarthy (eds.), Vocabulary: Description, Acquisition and Pedagogy, Cambridge: Cambridge University Press, pp. 6–19, ISBN 978-0-521-58551-4
  • Laufer, B. (1997), «What’s in a word that makes it hard or easy? Some intralexical factors that affect the learning of words.», Vocabulary: Description, Acquisition and Pedagogy, Cambridge: Cambridge University Press, pp. 140–155, ISBN 9780521585514
  • Nation, P. (2006), «Language Education — Vocabulary», Encyclopedia of Language & Linguistics, Oxford: 494–499, doi:10.1016/B0-08-044854-2/00678-7, ISBN 9780080448541.
  • Brysbaert, Marc; Buchmeier, Matthias; Conrad, Markus; Jacobs, Arthur M.; Bölte, Jens; Böhl, Andrea (2011). «The word frequency effect: a review of recent developments and implications for the choice of frequency estimates in German». Experimental Psychology. 58 (5): 412–424. doi:10.1027/1618-3169/a000123. PMID 21768069.
  • Rudell, A.P. (1993), «Frequency of word usage and perceived word difficulty : Ratings of Kucera and Francis words», Most, vol. 25, pp. 455–463
  • Segui, J.; Mehler, Jacques; Frauenfelder, Uli; Morton, John (1982), «The word frequency effect and lexical access», Neuropsychologia, 20 (6): 615–627, doi:10.1016/0028-3932(82)90061-6, PMID 7162585, S2CID 39694258
  • Meier, Helmut (1967), Deutsche Sprachstatistik, Hildesheim: Olms (frequency list of German words)
  • DeFrancis, John (1966), Why Johnny can’t read Chinese (PDF)
  • Allanic, Bernard (2003), The corpus of characters and their pedagogical aspect in ancient and contemporary China (fr: Les corpus de caractères et leur dimension pédagogique dans la Chine ancienne et contemporaine) (These de doctorat), Paris: INALCO

Written texts-based databases

  • Da, Jun (1998), Jun Da: Chinese text computing, retrieved 2010-08-21.
  • Taiwan Ministry of Education (1997), 八十六年常用語詞調查報告書, retrieved 2010-08-21.
  • New, Boris; Pallier, Christophe, Manuel de Lexique 3 (in French) (3.01 ed.).
  • Gimenes, Manuel; New, Boris (2016), «Worldlex: Twitter and blog word frequencies for 66 languages», Behavior Research Methods, 48 (3): 963–972, doi:10.3758/s13428-015-0621-0, ISSN 1554-3528, PMID 26170053.

SUBTLEX movement

  • New, B.; Brysbaert, M.; Veronis, J.; Pallier, C. (2007). «SUBTLEX-FR: The use of film subtitles to estimate word frequencies» (PDF). Applied Psycholinguistics. 28 (4): 661. doi:10.1017/s014271640707035x. hdl:1854/LU-599589. S2CID 145366468. Archived from the original (PDF) on 2016-10-24.
  • Brysbaert, Marc; New, Boris (2009), «Moving beyond Kucera and Francis: a critical evaluation of current word frequency norms and the introduction of a new and improved word frequency measure for American English» (PDF), Behavior Research Methods, 41 (4): 977–990, doi:10.3758/brm.41.4.977, PMID 19897807, S2CID 4792474
  • Keuleers, E.; Brysbaert, M.; New, B. (2010), «SUBTLEX-NL: A new measure for Dutch word frequency based on film subtitles», Behavior Research Methods, 42 (3): 643–650, doi:10.3758/brm.42.3.643, PMID 20805586
  • Cai, Q.; Brysbaert, M. (2010), «SUBTLEX-CH: Chinese Word and Character Frequencies Based on Film Subtitles», PLOS ONE, 5 (6): 8, Bibcode:2010PLoSO…510729C, doi:10.1371/journal.pone.0010729, PMC 2880003, PMID 20532192
  • Cuetos, F.; Glez-nosti, Maria; Barbón, Analía; Brysbaert, Marc (2011), «SUBTLEX-ESP : Spanish word frequencies based on film subtitles» (PDF), Psicológica, 32: 133–143
  • Dimitropoulou, M.; Duñabeitia, Jon Andoni; Avilés, Alberto; Corral, José; Carreiras, Manuel (2010), «SUBTLEX-GR: Subtitle-Based Word Frequencies as the Best Estimate of Reading Behavior: The Case of Greek», Frontiers in Psychology, 1 (December): 12, doi:10.3389/fpsyg.2010.00218, PMC 3153823, PMID 21833273
  • Pham, H.; Bolger, P.; Baayen, R.H. (2011), «SUBTLEX-VIE : A Measure for Vietnamese Word and Character Frequencies on Film Subtitles», ACOL
  • Brysbaert, M.; New, Boris; Keuleers, E. (2012), «SUBTLEX-US: Adding Part of Speech Information to the SUBTLEXus Word Frequencies» (PDF), Behavior Research Methods: 1–22
  • Mandera, P.; Keuleers, E.; Wodniecka, Z.; Brysbaert, M. (2014). «Subtlex-pl: subtitle-based word frequency estimates for Polish» (PDF). Behav Res Methods. 47 (2): 471–483. doi:10.3758/s13428-014-0489-4. PMID 24942246. S2CID 2334688.
  • Tang, K. (2012), «A 61 million word corpus of Brazilian Portuguese film subtitles as a resource for linguistic research», UCL Work Pap Linguist (24): 208–214
  • Avdyli, Rrezarta; Cuetos, Fernando (June 2013), «SUBTLEX- AL: Albanian word frequencies based on film subtitles», ILIRIA International Review, 3 (1): 285–292, doi:10.21113/iir.v3i1.112, ISSN 2365-8592
  • Soares, Ana Paula; Machado, João; Costa, Ana; Iriarte, Álvaro; Simões, Alberto; de Almeida, José João; Comesaña, Montserrat; Perea, Manuel (April 2015), «On the advantages of word frequency and contextual diversity measures extracted from subtitles: The case of Portuguese», The Quarterly Journal of Experimental Psychology, 68 (4): 680–696, doi:10.1080/17470218.2014.964271, PMID 25263599, S2CID 5376519

Full list of words from this list:

  1. absolution

    the act of being formally forgiven

  2. anarchy

    a state of lawlessness and disorder

  3. anthropology

    science of the origins and social relationships of humans

  4. age

    how long something has existed

  5. agriculture

    the practice of cultivating the land or raising stock

  6. archaeology

    the branch of anthropology that studies prehistoric people

  7. architecture

    the discipline dealing with the design of fine buildings

  8. archive

    a depository containing historical records and documents

  9. artifact

    a man-made object

  10. artisan

    a skilled worker who practices some trade or handicraft

  11. autobiography

    a book or account of your own life

  12. barter

    exchange goods without involving money

  13. boycott

    refusal to have commercial dealings with some organization

  14. census

    a periodic count of the population

  15. century

    a period of 100 years

  16. city state

    a state consisting of a sovereign city

  17. civilization

    a society in an advanced state of social development

  18. clergy

    the entire class of religious officials

  19. colony

    a group of organisms of the same type living together

  20. confrontation

    discord resulting from a clash of ideas or opinions

  21. constitution

    the act of forming or establishing something

  22. culture

    all the knowledge and values shared by a society

  23. chronological

    relating to or arranged according to the order of time

  24. curator

    the custodian of a collection, as a museum or library

  25. data

    a collection of facts from which conclusions may be drawn

  26. deity

    a supernatural being worshipped as controlling the world

  27. deism

    the belief in God on the basis of reason alone

  28. democracy

    the orientation of those who favor government by the people

  29. demographic

    a statistic characterizing human populations

  30. dictator

    a ruler who is unconstrained by law

  31. diplomacy

    negotiation between nations

  32. document

    a representation of a person’s thinking with symbolic marks

  33. documentary

    a film presenting the facts about a person or event

  34. domino effect

    the consequence of one event setting off a chain of similar events (like a falling domino causing a whole row of upended dominos to fall)

  35. domestic

    of or relating to the home

  36. dynasty

    a sequence of powerful leaders in the same family

  37. economics

    science dealing with the circulation of goods and services

  38. empire

    the domain ruled by a single authoritative sovereign

  39. enlightenment

    education that results in the spread of knowledge

  40. entrepreneur

    someone who organizes a business venture

  41. epoch

    a period marked by distinctive character

  42. era

    a period marked by distinctive character

  43. exile

    the act of expelling a person from their native land

  44. export

    sell or transfer abroad

  45. fossil

    the remains of a plant or animal from a past geological age

  46. heresy

    a belief that rejects the orthodox tenets of a religion

  47. hierarchy

    a series of ordered groupings within a system

  48. impeach

    bring an accusation against

  49. immigration

    movement of people into a country or area

  50. inflation

    a general increase in prices for goods and services

  51. initiative

    readiness to embark on bold new ventures

  52. import

    bring in from abroad

  53. irrigation

    the act of supplying dry land with water by artificial means

  54. isolationist

    of or relating to isolationism

  55. intolerable

    incapable of being put up with

  56. Judaical

    of or relating to or characteristic of the Jews or their culture or religion

  57. jury system

    a legal system for determining the facts at issue in a law suit

  58. legislative

    relating to a lawmaking assembly

  59. literacy

    the ability to read and write

  60. mass medium

    a technology that publicly transmits to a large audience

  61. malnutrition

    a state of poor nourishment

  62. migration

    the movement of persons from one locality to another

  63. millennium

    a span of 1000 years

  64. monotheistic

    believing that there is only one god

  65. monarchy

    autocracy governed by a ruler who usually inherits authority

  66. myth

    a traditional story serving to explain a world view

  67. nationalism

    the doctrine that your country’s interests are superior

  68. neglect

    leave undone or leave out

  69. neutrality

    nonparticipation in a dispute or war

  70. nomad

    a member of a people who have no permanent home

  71. paleontology

    the earth science that studies fossil organisms

  72. pardon

    accept an excuse for

  73. patriarch

    the male head of family or tribe

  74. perspective

    a way of regarding situations or topics

  75. poll

    the counting of votes (as in an election)

  76. prehistoric

    belonging to or existing before recorded times

  77. primary

    of first rank or importance or value

  78. propaganda

    information that is spread to promote some cause

  79. province

    the territory in an administrative district of a nation

  80. ratification

    making something valid by formally confirming it

  81. reformation

    improvement in the condition of institutions or practices

  82. refugee

    an exile who flees for safety

  83. republic

    a form of government whose head of state is not a monarch

  84. research

    a seeking for knowledge

  85. revolution

    a single complete turn

  86. rural

    living in or characteristic of farming or country life

  87. schism

    division of a group into opposing factions

  88. scribe

    someone employed to make written copies of documents

  89. secular

    someone who is not a clergyman or a professional person

  90. secondary

    being of second rank or importance or value

  91. sectionalism

    excessive devotion to the interests of a particular region

  92. segregation

    the act of keeping apart

  93. social contract

    an agreement that results in the organization of society

  94. socialism

    a political theory advocating state ownership of industry

  95. statistics

    a branch of mathematics concerned with quantitative data

  96. suffrage

    a legal right to vote

  97. tariff

    a government tax on imports or exports

  98. technology

    the practical application of science to commerce or industry

  99. theocracy

    a political unit governed by a deity

  100. totalitarianism

    a form of government in which the ruler is unconstrained

  101. tribe

    a group of people with shared ancestry and customs

  102. tribune

    an ancient Roman official elected by the plebeians

  103. tyranny

    government in which the ruler is an absolute dictator

  104. urban

    relating to a city or densely populated area

  105. veto

    a vote that blocks a decision

  106. Zealot

    a member of an ancient Jewish sect in Judea in the first century who fought to the death against the Romans and who killed or persecuted Jews who collaborated with the Romans

Created on September 21, 2011

history words

History, like many disciplines, has its own distinctive styles of writing. While learning to think about the past, history students must also learn to write history in a clear and convincing manner. Those who are already strong writers will relish this challenge but others may find it confronting.

This page contains several lists of ‘history words’ to provide you with a head start in writing history. You will encounter many of these words when reading history while others are useful descriptive words you can use in your own writing. These lists are not comprehensive or exhaustive but may prove useful for inexperienced writers.

If you are new to history, or have difficulty finding the right words, save or print off these lists and keep them to hand. If you would like to suggest words for these lists, please make contact with your ideas.

Sections or groups in society

academia: People who work in schools and universities, teaching or undertaking research
agrarian: People involved in producing crops and livestock through farming
aristocracy: People who possess noble titles and privileges, often with wealth and power
artisans: People involved in the manufacture or repair of items, such as mechanics
bourgeoisie: People who own capital, such as land, factories and raw materials
capitalist: As for bourgeoisie (above), people who own capital and the means of production
clergy: People ordained by the church to carry out its functions, such as priests, monks and nuns
commercial: People involved in trade, such as importing and exporting, buying and selling
economic: People, institutions and activities that produce society’s wants and needs
establishment: The political, social and economic elites who wield power in a society
gender: Refers to the rights, roles and conditions of men and women in a society
industrial: The mass production of wants and needs, particularly on a large scale
intelligentsia: People who develop ideas, theories and policies in a society
middle class: The social classes who own some property and enjoy safe and stable standards of living
military: A state’s defence forces, such as the army, navy and air force
monarchy: The institution of hereditary royalty, led by a king, queen or emperor
nobility: People who possess noble titles, either from birth, royal grant or venality
peasantry: People who work the land, usually as tenant farmers and often in impoverished conditions
philosophes: Intellectuals and writers who engage in critical study of society, beliefs and ideas
political: The people, bodies and processes that govern and make decisions in a society
proletariat: People who work for wages in a society, particularly in the industrial sector
provincial: The areas of a nation outside major cities, such as lesser towns, rural areas or colonies
upper class: The upper levels of a society, such as royalty, aristocracy and the very wealthy
urban: The people, actions and conditions in large cities
village: A small agricultural community, usually in a rural area
working class: The lower levels of society, whose members must work to survive

Political systems

absolutism: Any political system where the ruler or government wields absolute power
anarchism: A political system that seeks to abolish the state and create a communal society
autocracy: A system where political power is concentrated in the hands of a single person
capitalism: An economic system where most companies, land and other resources are privately owned
colonialism: A system of claiming, settling, ruling and maintaining one or more colonies (see imperialism)
communism: A political-economic system with no state, minimal class differences and economic equality
constitutional monarchy: A political system with a monarch whose power is limited and shared with the people
democracy: A political system where the government or parts of it are selected by the people
divine right: A form of political authority where power is said to be ordained by God
fascism: A political system marked by authoritarian rule, nationalism, state and military power
feudalism: A medieval socio-political system with a hierarchy of kings, lords, knights and vassals
imperialism: A system where a powerful state conquers territories (colonies) for its own gain
Marxism: A system or world view based on material factors, inequalities of wealth and class struggle
mercantilism: An economic system designed to increase national power by increasing wealth and trade
militarism: A system where military needs are prioritised and the military exerts political influence
nationalism: An ideology urging loyalty to one’s own country; to put your country first
popular sovereignty: A form of political authority where power is derived from the consent of the people
socialism: A system where the government rules in the interests of the workers or common people
syndicalism: A form of socialism where the workers collectively control their factories or workplaces
theocracy: A system where government and laws are determined by religious leaders and teachings
totalitarianism: A political system where the power of the state often overrides the rights of individuals
welfare state: A system that provides necessities of life to the homeless, unemployed, sick or elderly

Political concepts

assembly: A body of people, elected or appointed to form government or make decisions
autocracy: A form of government where one person is responsible for decision making
constitution: A document defining systems of government and the limits of government power
democracy: A political system where government is formed by popular elections
divine right: The idea that governments and autocrats derive their power and authority from God
elections: The process of voting to select others, usually to form a representative government
executive: The branch of government responsible for leadership and day to day decision making
government: A system responsible for leadership, making decisions and making laws in a society
ideology: A system of ideas and beliefs that shapes one’s views about politics and government
legislature: An assembly that exists to pass new laws or review, amend or abolish existing laws
parliament: An elected legislature from which an executive government is also formed
participation: The involvement of ordinary people in selecting government and in political discourse
popular sovereignty: The idea that governments derive their power and authority from the consent of the people
representation: A political concept where some individuals act, speak or make decisions on behalf of others
sovereignty: The supreme authority of a government, the basis for its power and autonomy
state: ‘The state’ describes an organised society and the political system that governs it

Economic concepts

capital: The resources needed to produce things, such as land, raw materials and equipment
commerce: The business of buying and selling, particularly on a large scale
debt: Money owed to another party, usually because it has been previously borrowed
deficit: The shortfall that exists when spending is greater than income
exports: Resources or goods sold and shipped to another country, which boosts national income
finance: Describes the sections of an economy concerned with managing money, such as banking
imports: Resources or goods bought and shipped in from another country, depleting national income
industry: The production of raw materials and manufactured goods within an economy
inflation: An increase in prices for goods and services, reducing the purchasing power of money
labour: The people who provide work to enable production or delivery of services; the workers
laissez-faire: French for “let it be”; an economy free of trade regulations, tariffs or costs
manufacturing: The process of making or producing goods, particularly on a large scale
production: The process of making things, particularly things that have additional value
profit: Financial reward obtained from business or investment, where income exceeds costs
revenue: Money received for normal activities, such as sales (business) or taxation (government)
taxation: Money collected from individuals and groups by the government to fund the state
trade: The buying or selling of goods, usually in exchange for money

Words for describing historical cause

agitated, aroused, awakened, brought about, catalyst,
developed, deteriorated, encouraged, exacerbated, fuelled,
generated, incited, inflamed, instigated, kindled,
led to, long term, motivated, popularised, propagandised,
prompted, promoted, protested, provoked, radicalised,
reformed, rocked, roused, set off, short term,
solicited, sparked, spurred, stimulated, stirred up,
transformed, triggered, urged, whipped up, worsened

Words for describing historical effect or consequence

boosted, catastrophic, consolidated, crippled, decimated,
demoralised, depleted, disastrous, disbanded, disoriented,
dispersed, dissolved, divided, drained, elevated,
emboldened, enriched, exhausted, fatigued, hardened,
heartened, improved, inspired, mobilised, prospered,
punished, restored, sapped, scattered, separated,
stimulated, strained, strengthened, stretched, unified,
united, unsettled, uplifted, upset, wearied

Words for describing historical continuity

blocked, calmed, censored, clamped down, concealed,
conservative, contained, curbed, deterred, dispersed,
froze, halted, held back, limited, mollified,
pacified, oppressed, overpowered, prohibited, quashed,
quelled, reactionary, regressed, repressed, resisted,
restored, restrained, restricted, smothered, stabilised,
stemmed, stunted, subdued, suppressed, wound back

Words for describing historical significance

adverse, calamitous, catastrophic, destabilising, destructive,
devastating, dire, disastrous, essential, expedient,
far-reaching, far-sighted, fateful, forerunner, ground-breaking,
healing, important, innovative, meaningful, modernising,
negative, ominous, opportune, profound, pivotal,
positive, revolutionary, ruinous, serious, shaking,
shattering, significant, spearheading, timely, trail-blazing,
transforming, tumultuous, unsettling, uprooting, vital

Words for evaluating historical sources

balanced, baseless, biased, convincing, credible,
deceptive, dishonest, distorted, doubtful, dubious,
emotive, exaggerated, fallacious, far-fetched, flawed,
honest, imbalanced, impossible, inflated, limited,
misleading, one-sided, overwrought, persuasive, phoney,
plausible, propagandist, realistic, reasonable, selective,
sensationalist, skewed, sound, spurious, unrealistic,
unreliable, untenable, useful, valid, vivid

Command words for history tasks and activities

analyse: Examine and discuss the important structure or parts of something
annotate: Record written questions, comments or explanations on a document or visual source
annotated bibliography: A list of books that contains a note about the content and usefulness of each book
argue: Present a case, to express and explain a particular reason or theory
brainstorm: Gather and record thoughts and ideas spontaneously, without sorting or evaluating them
cite: Refer to an authority or trusted source, as evidence of your information or idea
compare: Examine two or more propositions and identify and discuss similarities between them
concept map: A visual chart or diagram, using shapes and lines to organise and connect topics or ideas
conclusion: The last paragraph in sustained writing, it restates the contention and ‘rounds off’ the text
contrast: Examine two or more propositions and identify and discuss differences between them
critically analyse: Analyse something and offer views and judgements about the merit or value of its parts
define: Provide precise meanings and explanations about something
describe: Provide a detailed and graphic account of something
discuss: Provide a balanced commentary about something, mentioning arguments for and against
evaluate: Analyse something and form final conclusions about its value, credibility or merit
explain: Provide a clear, straightforward and detailed account of something
historiographical activity: A task requiring discussion of historians and their interpretations of a particular topic
interpret: Examine something to extract its meaning and express it in your own words
introduction: The first paragraph in sustained writing, offering a contention and an outline of the text
issue: A topic or question that is open to discussion, debate or dispute
justify: Provide clear reasons, grounds and evidence for a particular argument or conclusion
outline: Provide a basic overview of something, describing only its main features
paraphrase: To describe someone else’s words, statement or meaning, in your own words
review: Read or examine something and offer your own thoughts and judgements about it
signpost: Use phrases and sentences outlining the direction or structure your writing will take
summarise: Briefly describe the main points or attributes of something, without going into much detail

Citation information
Title: “History words”
Authors: Jennifer Llewellyn, Steve Thompson
Publisher: Alpha History
URL: https://alphahistory.com/history-words/
Date published: June 3, 2018
Date updated: December 24, 2022
Date accessed: April 12, 2023
Copyright: Content on this page may not be republished without our express permission. For more information on usage, please refer to our Terms of Use.

Lexical lists are compilations of cuneiform signs and word readings written on clay tablets throughout Mesopotamia. From the late 4th millennium BCE up to the 1st century CE, scribal communities copied, modified, and passed on these cuneiform lexical lists and preserved them as knowledge for a variety of purposes. Just as people today pass on and embrace the knowledge of scientific discoveries, lexical lists were the knowledge and intellectual material of the day when cuneiform writing emerged in the 4th millennium BCE. Including unpublished lexical lists, over 15,000 tablets exist. For the duration of the cuneiform lexical tradition, the meaning, purpose, and significance of these lists were in flux and development.

Neo-Assyrian Cuneiform Lexical List

The Trustees of the British Museum (Copyright)

Description of Lexical Lists

In the simplest form, lexical lists may be divided into two categories: sign lists and word lists. The first primarily presents an inventory of signs along with their proper use. The second organizes cuneiform by semantics, which is the branch of linguistics and logic concerned with meaning, and is typically written in a thematic organization. Of course, some contain elements of both sign lists and word lists, which indicates that we must permit a certain amount of fluidity when attempting to define lexical lists. Over time and with greater cultural interaction, they were expanded to two columns, and sometimes three, in different languages in order to operate as transmitters of language for future generations. Although this description makes lexical lists seem mundane and pointless, they can in reality be used to understand historical developments and reconstruct the cultural landscape and ideas of the ancient Near East.

Brief History of Lexical Lists

In c. 3200 BCE, archaic writing of cuneiform was developed. During this period, the technology of writing was novel. Niek Veldhuis comments on the historical significance of archaic lexical lists: «The invention of a writing system is to be seen in the context of the development of standardized mass production and organized labor» (27). Consequently, a new class of society emerged, namely the scribal class, and lexical lists became a tool for constructing social identity within early scribal communities.

Moving into the 3rd millennium BCE, cuneiform lexical lists spread unevenly, which prevents strong conclusions from being made. Up to the Old Akkadian and Ur III periods (c. 2230 – 2004 BCE), lexical lists were primarily based in single locations, though not spread across Mesopotamia. In the Old Akkadian and Ur III periods, «the lexical material is reduced to a trickle» (Veldhuis, 142). Thus, for the duration of the 3rd millennium BCE, we only have evidence that lexical lists were primarily tools of authority, power, and leadership, not teaching within scribal communities. Importantly, in both the archaic lexical lists and those within the third millennium, there is great conservativeness, with many of the same texts being copied and written, with minor adjustments.

At the dawning of the second millennium, the Old Babylonian period (ca. 2000 – 1600 BCE), traditional texts from the archaic period and third millennium began to dwindle and new word lists and sign lists began to emerge. This period is extremely important in reconstructing the development of scribal practices and lexical lists because we see the establishment of an Old Babylonian scribal curriculum. Many of the texts from the archaic period became «teaching texts that introduced pupils to the invented tradition of a glorious Sumerian past» (Veldhuis, 218). Additionally, the new lexical lists, such as grammatical lists, found association with divinatory and mathematical literature rather than the scribal school. Third, we see the emergence of lexical lists oriented towards speculative philology, or the isolation of Sumerian symbols to translate them into Akkadian. This third category for usage of lexical lists is important because it marks the foundation of the social class of scholars. All in all, the developments during this period fit within the broader societal changes, namely the emergence of Babylonian elites.

Transitioning into the International period (c. 1600 – 1000 BCE), the late 2nd millennium, also known as the Late Bronze Age, Middle Babylonian, Kassite, Amarna, or Middle Assyrian periods, «saw an unprecedented spread of cuneiform writing and Babylonian written culture over the entire ancient Near East» (Veldhuis, 226). Reception of lexical lists during the period varied because of different attitudes towards cuneiform and the lexical tradition. During the International period, lexical lists began to splinter into various traditions, meaning that one could place two copies of the same lexical list side by side and find variations. Most significant in terms of reception of lexical lists is that of Assyria, which reacted with conservatism and embraced its Babylonian cultural heritage.

Overlapping with the International period, the early history of Assur, the heart of ancient Assyria, treated the Babylonian cultural heritage like holy writ, thereby redefining the character of scribal practice. With the acquisition and high value of this intellectual tradition, lexical lists became the literary technology in the Middle Assyrian period, which justified and cemented Assyria's place in a respected and ancient tradition. Lexical lists became less fluid during this period and were objectified, frozen in time as a sort of canon. They were considered so because, to a certain extent, lexical lists symbolized primordial knowledge and «came to play a role in the management of power and legitimation of a world empire» (Veldhuis, 391).

Babylonian Cuneiform Lexical List

The Trustees of the British Museum (Copyright)

Finally, in the Neo- and Late Babylonian period, scholarship, and thereby lexical lists, became the property and responsibility of temples and the elite families in charge of them. Many of these late lexical lists include dedication prayers, which indicates that writing and education were closely associated with temples and political leadership. Additionally, unlike in the Old Babylonian period, lexical traditions ceased to be the primary focus of scholarship; rather, they became integral to furthering other areas of scholarship, such as celestial sciences and horoscopy.

Unfortunately, many lexical lists which possibly existed in the 1st century CE are now absent because the scribes chose to write with a different cultural medium brought by Hellenization, namely writing on parchment or other surfaces rather than clay tablets.

Final Remarks

From the archaic period up to the 1st century CE, roughly 3,300 years, the tradition of lexical lists developed into a source of knowledge and a political legitimation tool. Yet, during this long period of time, lexical lists maintained an important position within the cultural landscape because they represented the increasingly valuable technology of writing, a technology which eventually became associated with primordial knowledge. Through a lengthy reception history, many of the lexical lists from the archaic period were still utilized in the 1st century CE, a remarkable time for any literature to be remembered and well-received. In a world that takes writing and reading for granted, though, we would do well to remember that scribal practice, writing, and reading are all technologies and potential mediums for social, political, and religious change.

This article has been reviewed for accuracy, reliability and adherence to academic standards prior to publication.

There are many word lists for general and academic English study. This page describes
the most important ones, first giving an
overview of the different types of word list, then presenting a
more detailed summary of individual lists.
The summary contains links to other pages on the site which have more detail of each list and (often) a complete copy of the list itself.

There is a companion page in this section which gives
information on why word lists are important (and tips on how to use them).

Types of word list

Word lists can be divided into two main types, namely
general word lists and
academic word lists, although, as will be explained below, academic lists can be sub-divided into
general academic lists and
field-specific (i.e. subject-specific) academic lists, giving three types in all.
A second way to classify word lists is into those which contain only single words (the majority of the lists are of this type) and
multi-word lists. A final way to classify lists is written vs. spoken. Most of the lists that exist are
for written English, though many of the multi-word lists include both a spoken and written component.

General word lists (single words)

Interest in word lists began with studies of core or general vocabulary, that is, words having high frequency across a wide range of
texts. The first general word list to have important use in language study was the
General Service List (GSL), created by Michael West in 1953.
This list has been used to design EFL materials and courses, and, despite its age, it is probably still the most widely used list of general vocabulary.
Originally consisting of 2000 words (called headwords) and their corresponding word families, it was revised in 1995 by Bauman and Culligan,
with an increase in the number of headwords from 2000 to 2284.

One criticism of the GSL is its inclusion of too many low frequency words, some of which are a product of its age (e.g. shilling, headdress, cart, servant) while
excluding more recent vocabulary (e.g. computer, television, Internet). A second criticism is that it uses word families. The assumption behind the use of
word families is that once one word is known, other members of the family can be easily recognised; however, this may not always be the case. Examples of
distantly related word family pairs in the GSL are: please/unpleasantly, part/particle and value/invaluable. Additionally, some word
forms are used more frequently than others, and the inclusion of less frequent forms adds an unnecessary burden to the learning load of students.

These criticisms have led to the creation of two updated versions of the list, both devised in 2013, both called the New General Service List.
Both lists use inflected forms and variant spellings (called lemmas), rather than extended word families.
The first, abbreviated to
NGSL, was developed by Browne, Culligan and Phillips. It is a list of 2801 words which give over 90% coverage.
It was generated from a corpus of 273 million words, 100 times larger than that used for the GSL.
The second list, abbreviated to
new-GSL, was devised by Brezina and Gablasova from a corpus of over 12 billion words.
It consists of 2494 words and gives around 80% coverage.
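
Coverage figures such as those quoted above are the share of running words (tokens) in a corpus that belong to the list. The following is a minimal sketch of that calculation, assuming a simple lowercase regex tokeniser and a toy word list invented for illustration; the function name `coverage` and the sample data are hypothetical, and real studies work with lemmatised corpora of millions of words.

```python
import re


def coverage(text, word_list):
    """Return the share of tokens in `text` that appear in `word_list`."""
    # Very rough tokeniser (an assumption): runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    listed = {w.lower() for w in word_list}
    hits = sum(1 for t in tokens if t in listed)
    return hits / len(tokens)


# Hypothetical toy data, not taken from the NGSL or new-GSL.
sample = "The cat sat on the mat and the dog sat too"
toy_list = ["the", "and", "on", "sat"]
print(f"{coverage(sample, toy_list):.0%}")  # 7 of 11 tokens are on the list
```

Because coverage counts tokens rather than distinct words, a short list of very frequent words can cover a large share of a text.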

General word lists (multi-word)

The above are all single word lists. There are several multi-word lists for general vocabulary, such as the
First 100 Spoken Collocations (First 100) by Shin and Nation (2008), and the Phrasal Expressions List (PHRASE List) by Martinez and Schmitt (2012). However,
since none of these is used as a basis for academic word lists, in contrast to the general lists given above, they are not explained here in detail.

Academic word lists (single words)

Researchers have long been interested in defining and isolating academic vocabulary, and there have been many attempts to devise
lists which are of general use to students of academic English.

The first widely used academic word list was the
University Word List (UWL),
created in 1984 by Xue and Nation. It comprises 836 word families, divided into levels based on frequency.
It excludes words from the GSL, and gives 8.5% coverage of academic texts. It was developed by combining four existing lists.

A major update to the UWL came in 2000, when Averil Coxhead, of Victoria University of Wellington, devised the
Academic Word List (AWL). This list
has been hugely influential and is perhaps the most widely known and used academic word list. Like the UWL, it comprises word families and is
divided into levels based on frequency. It gives similar coverage, around 10% of texts; however, it does so using far fewer word families, 570 in total.
Like the UWL, it excludes words from the GSL. It was devised in a more systematic way, using a corpus of texts from a range of academic disciplines.

Although the AWL is still widely used, it has received criticism in a number of areas. One criticism is that it is based on the
GSL, which is a very old list, dating from 1953. A second criticism is that, like the GSL, it uses word families, with the same problems as mentioned for
the GSL above.

In response to these criticisms, other academic word lists have been created. One of these is the
Academic Keyword List (AKL), developed by Paquot in 2010. This
consists of 930 words which appear more frequently in academic texts than non-academic ones, a tendency called keyness,
which leads to the name of the list.

A second list is the
New Academic Word List (NAWL) by
Browne, Culligan and Phillips. This list responds to the criticisms of the AWL by using lemmas rather than word families, and by basing itself on a more
updated general service list, the
NGSL, created by the authors at the same time, in 2013.

A third updated list is the
Academic Vocabulary List (AVL), developed by Gardner and Davies in 2013. This list, which is also lemma-based,
selects academic words by considering their ratio in academic versus non-academic texts, with words needing to occur 1.5 times as often in the
academic texts as in non-academic ones. This is similar to the approach used to devise the AKL (above), and in contrast to lists like the AWL and NAWL which
exclude an existing general service list. In addition, the authors considered the range of words in the academic disciplines used in their corpus,
the dispersion, and discipline measure, which required that words could not occur more than three times the expected frequency in any of
the disciplines. This approach has been influential in the development of other,
field-specific lists, as well as some
technical lists, as explained below.
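
Two of the AVL criteria described above, the 1.5 ratio and the three-times discipline measure, can be sketched as simple filters. This is only an illustration under stated assumptions, not the authors' procedure: the function name, the per-million input figures and the example words' numbers are hypothetical, "expected frequency" is approximated here as the mean rate across equally sized discipline sub-corpora, and the range and dispersion checks are omitted because their thresholds are not given above.

```python
def passes_avl_style_filters(acad_per_million, non_acad_per_million,
                             per_discipline_per_million,
                             min_ratio=1.5, max_discipline_factor=3.0):
    """Apply a ratio check and a discipline check to one word.

    All frequencies are normalised rates (occurrences per million words).
    """
    rates = list(per_discipline_per_million.values())
    if acad_per_million <= 0 or not rates:
        return False

    # Ratio: the word must be at least `min_ratio` times as frequent in
    # academic texts as in non-academic ones. A word that never occurs in
    # the non-academic corpus passes this check trivially.
    if non_acad_per_million > 0:
        if acad_per_million / non_acad_per_million < min_ratio:
            return False

    # Discipline measure: no single discipline may exceed
    # `max_discipline_factor` times the expected rate (assumed to be the
    # mean across disciplines).
    expected = sum(rates) / len(rates)
    return all(r <= max_discipline_factor * expected for r in rates)


# Hypothetical figures for illustration only.
evenly_spread = {"law": 280, "history": 320, "business": 300,
                 "science": 310, "medicine": 290}
one_field_only = {"law": 0, "history": 0, "business": 0,
                  "science": 400, "medicine": 50}

print(passes_avl_style_filters(300, 60, evenly_spread))   # kept
print(passes_avl_style_filters(90, 10, one_field_only))   # too field-specific
print(passes_avl_style_filters(500, 900, evenly_spread))  # fails the ratio
```

The second example shows why the discipline measure matters: a word can be far more common in academic than non-academic text and still be rejected as technical because one field accounts for most of its occurrences.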

There are several lists specifically for academic spoken English (as distinct from the spoken components of the multi-word lists, below).
These include the English Spoken Academic Wordlist, devised by Nesi in 2002,
the Academic Spoken Word List (ASWL), devised by Dang et al. in 2017, and
the Vocabulary for Academic Lecture Listening word list (VALL), devised by Thompson in 2015.

Academic word lists (multi-word)

Focusing exclusively on single words can lead learners to overlook valuable multi-word constructions which are commonly used in academic English.
For example, while use of the word thing is generally considered to be poor
academic style, it occurs in several phrases used by expert writers, such as
the same thing as and other things being equal.

Several multi-word lists have been developed for academic English. One is the
Academic Formulas List (AFL), devised by Simpson-Vlach and Ellis in 2009. This list
gives the most common formulaic sequences in academic English, i.e. recurring word sequences three to five words long.
There are three separate lists: one for formulas that are common in both academic spoken and written English (the ‘core’ AFL),
one for spoken English, and one for written English.

Another multi-word list is the
Academic Collocation List (ACL), developed by Ackermann and Chen in 2013. The ACL
contains 2469 of the most frequent and useful collocations which occur in written academic English.

A third list is the
Discourse Connectors List (DCL), devised by Rezvani Kalajahi, Neufeld and Abdullah in 2017. This list
classifies and describes 632 discourse connectors, ranking them by frequency in three different registers (academic, non-academic and spoken).

More recently, there is the
Academic idioms list, developed by Miller in 2019. This gives 170 idioms which are common in spoken academic
English, and 38 which are frequently used in written academic English.

Field-specific academic word lists (single words)

Academic word lists such as the AWL are designed to be used by students of all disciplines. Researchers have found, however, that the AWL and other lists
provide varied coverage in different subject areas. For example, the AWL provides 12.0% coverage of the Commerce sub-corpus used to derive the list,
but only 9.1% for the Science sub-corpus (with only 6.2% for Biology).
Additionally, words in the AWL (and similar lists) occur with different frequencies in different disciplines.
For example, words such as legal, policy, income, finance and legislate,
which all fall in the first (most frequent) sublist of the AWL, may be common in Business or Finance,
but are very infrequent in disciplines such as Chemistry.
Words also have different collocations and meanings across different subject areas. Examples are base, which has a special meaning in Chemistry,
and bug, which has a different meaning in Computer Science than in general English.

Researchers have therefore become increasingly interested in field-specific (i.e. subject-specific) academic lists, in disciplines ranging
from science to business to medicine. These are generally not
technical word lists, since they are intended to comprise academic (sub-technical) vocabulary.
However, not all of them set out to exclude technical words (some actually set out to include them), and even for those that do,
the line between academic and technical words is often blurred.

Broadly speaking, there are three approaches used by researchers when devising field-specific academic lists.

The first of these is to use the GSL and AWL as a starting point, and to devise a third list which supplements the other two. These lists
exclude GSL and AWL words, and, since they are based on word family lists, also comprise word families.
These lists usually replace the ‘A’ of ‘AWL’ with a subject specific letter.
Examples are the
SWL (Science Word List), the
BWL#1 (Business Word List #1), the
Pharmacology Word List and the
CSWL (Computer Science Word List).
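
In outline, this first approach is a set difference: count words in a field-specific corpus, then drop anything already on the general and academic lists. The sketch below assumes hypothetical counts and small stand-in sets in place of the real GSL and AWL; the function name and the `min_count` threshold are illustrative, and it compares single word forms only, whereas the published lists work with whole word families.

```python
def supplementary_candidates(field_counts, general_list, academic_list,
                             min_count=1):
    """Return field-corpus words not on either existing list,
    most frequent first (ties broken alphabetically)."""
    known = ({w.lower() for w in general_list}
             | {w.lower() for w in academic_list})
    candidates = {
        word: count
        for word, count in field_counts.items()
        if word.lower() not in known and count >= min_count
    }
    return sorted(candidates, key=lambda w: (-candidates[w], w))


# Hypothetical toy data: stand-ins for the GSL and AWL, not the real lists.
toy_general = {"the", "cell"}
toy_academic = {"analyse"}
field_counts = {"the": 950, "cell": 200, "analyse": 120,
                "enzyme": 80, "catalyst": 45}

print(supplementary_candidates(field_counts, toy_general, toy_academic))
```

Raising `min_count` would restrict the result to words frequent enough in the field corpus to be worth teaching.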

The second approach is to assume that learners are already familiar with general vocabulary and to devise a second list which replaces
other academic lists such as the AWL or NAWL for specific subject areas. As such, these lists exclude the GSL (or NGSL), but do not
exclude any other lists such as the AWL.
These lists usually add the subject letter before ‘AWL’ to derive their name.
Examples are the
MAWL (Medical Academic Word List) and the
NAWL (Nursing Academic Word List), both of which exclude the GSL and are word family lists (like the GSL), and the
EAWL (Economics Academic Word List), which excludes the NGSL and is a lemma-based list (like the NGSL).

The third approach is to devise a single, completely independent list, which includes words based on ratio, dispersion, and other measures, in a similar
way to the AVL. These lists, which are usually lemma-based, tend to use ‘AVL’ in their name, preceded by an abbreviation for the subject. Examples are the
MAVL (Medical Academic Vocabulary List) and the
CSAVL (Computer Science Academic Vocabulary List). The
Chemistry Academic Word List (CAWL) broadly uses
the same approach, but it is based on word families, predates the creation of the AVL, and does not follow the same naming pattern.

There are two further lists which deserve mention here. Both have been developed using the same principles as the lists above; however, they
are intended for school-age rather than university students.

The first is the
Middle School Vocabulary Lists (MSVL). These are a series of five lists developed in 2015 by Greene and
Coxhead, along similar lines to Coxhead’s earlier AWL, i.e. by excluding the GSL and working with word families. However, these lists are
intended not for students at or preparing for university but for middle school students, and cover technical rather than purely academic vocabulary.
The lists cover the following subjects: English, Health, Mathematics, Science, and Social Science/History.

The second is the
Secondary School Vocabulary Lists (SVL). Developed in 2018 by Green and Lambert, the SVL are a series of lists of
discipline-specific words for secondary school education, covering eight core subjects: Biology, Chemistry, Economics, English, Geology, History,
Mathematics, and Physics. The lists were devised using methods similar to those used to create the
AVL and the
MAVL, which are lemma-based lists which consider measures such as range and dispersion along with word frequency.
The lists also include word family versions, as well as collocation lists. The SVL are designed to help students in secondary schools improve their
disciplinary literacy.

There are at least two field-specific academic lists of spoken English, both devised by Dang in 2018. They are the Hard Science Spoken Word
List (HSWL), and the Soft Science Spoken Word List (SSWL).

Technical word lists (multi-word)

There have been some attempts to create discipline-specific multi-word lists, using principles employed in the creation of academic lists.
One is the Computer Science Multi-Word List (CSMWL), created by Minshall at the same time as the
Computer Science Word List (CSWL). However, it comprises only 23 items.

Another example is the
Secondary Phrase Lists (SPL), developed in 2018 by Green and Lambert, who also developed the SVL (above).
This is a series of lists, for the same eight subjects as covered by the SVL, presenting noun-noun, adjective-noun, noun-verb, verb-noun and verb-adverb
collocations.

A third, more recent example is the
Nursing Collocation List (NCL), developed in 2020 by Mandić and
Dankić. It comprises 488 collocations which occur frequently in nursing journal articles.

Summary

The following image, and table below, provide an overview of the major word lists. Spoken word lists are only included in the table
(in italics). All word lists (except spoken ones) are explained in more detail later. Note: there is a higher resolution copy of the following image in the
infographics section.

[Image: overview of the major word lists]

General (single word)

• GSL (General Service List) 1953
• NGSL (New General Service List) 2013
• new-GSL (new-General Service List) 2013

General (multi-word)

These exist, but none are used as a basis for academic lists.

Academic (single word)

• UWL (University Word List) 1984
• AWL (Academic Word List) 2000
• AKL (Academic Keyword List) 2010
• NAWL (New Academic Word List) 2013
• AVL (Academic Vocabulary List) 2013

• English Spoken Academic Wordlist 2002
• ASWL (Academic Spoken Word List) 2017
• VALL (Vocabulary for Academic Lecture Listening word list) 2015

Academic (multi-word)

• AFL (Academic Formulas List) 2009
• ACL (Academic Collocation List) 2013
• DCL (Discourse Connectors List) 2017
• Academic idioms list 2019

Field-specific/technical (single word)

• SWL (Science Word List) 2007
• BWL#1 (Business Word List #1) 2007
• PWL (Pharmacology Word List) 2007
• MAWL (Medical Academic Word List) 2008
• AgroCorpus List 2009
• BEL (Basic Engineering List) 2009
• BWL#2 (Business Word List #2) 2011
• CSWL (Computer Science Word List) 2013
• CAWL (Chemistry Academic Word List) 2013
• MAVL (Medical Academic Vocabulary List) 2015
• NAWL (Nursing Academic Word List) 2015
• EAWL (Environmental Academic Word List) 2015
• EAWL (Economics Academic Word List) 2019
• CSAVL (Computer Science Academic Vocabulary List) 2021

• MSVL (Middle School Vocabulary Lists) 2015
• SVL (Secondary School Vocabulary Lists) 2018

• HSWL (Hard Science Spoken Word List) 2018
• SSWL (Soft Science Spoken Word List) 2018

Field-specific/technical (multi-word)

• CSMWL (Computer Science Multi-Word List) 2013
• SPL (Secondary Phrase Lists) 2018
• NCL (Nursing Collocation List) 2020

References

Granger, S., and Larsson, T. (2021), ‘Is core vocabulary a friend or foe of academic writing? Single-word vs multi-word uses of THING’, Journal of English for Academic Purposes, 52 (2021) 100999.

Hyland, K. and Tse, P. (2007). ‘Is There an “Academic Vocabulary”?’, TESOL QUARTERLY, Vol. 41, No. 2, June 2007.

Radmila Palinkašević, M.A. (2017), ‘Specialized Word Lists — Survey of the Literature — Research Perspective’, Research in Pedagogy, Vol. 7, Issue 2 (2017), pp. 221-238.

Therova, D. (2020), ‘Review of Academic Word Lists’, The Electronic Journal for English as a Second Language, Volume 24, Number 1.

Detailed summary of individual lists

Below is more detail about the lists above. The lists are sorted into the following categories:

  • General (core) vocabulary single word lists (3 lists)
  • Academic single word lists: general purpose (5 lists)
  • Academic single word lists: field-specific (14 lists)
  • Technical single word lists (2 lists)
  • Academic multi-word lists (4 lists)
  • Technical multi-word lists (3 lists)

General (core) vocabulary single word lists

The following gives a more detailed summary of the general word lists mentioned on this page.

General Service List (GSL)

Author: West (1953)

Size: 2284 word families

Originally a list of the 2000 most frequent word families in English, covering around 80% of various types of texts. Further divided into the 1K
(first 1000 words) and 2K (second 1000). Used as the basis for many graded readers and other ESL/EFL materials. The list was revised in 1995 by
Bauman and Culligan, and their revision, which is the version most commonly used, contains 2284 words.

Examples: the, be, of, and, a, to, in, he, have, it

New General Service List (NGSL)

Author: Browne, Culligan and Phillips (2013)

Size: 2801 words

The New General Service List (NGSL), an update of the GSL, is a list of 2801 words which comprise the most important high-frequency words
in English, giving the highest possible coverage with the fewest possible words. It should not be confused with the new-GSL (below),
which was also developed in 2013. The NGSL gives over 90% coverage of the corpus used. It was generated from a corpus of 273 million words,
100 times larger than that used for the GSL. It groups headwords with their inflected forms rather than whole word families. It has been used
as the basis for other lists, e.g. the NAWL, but has yet to have the same influence as the GSL.

Examples: the, be, and, of, to, a, in, have, it, you
New-General Service List (new-GSL) Author: Brezina and Gablasova (2013)

Size: 2494 words

The new-General Service List (new-GSL), an update of the GSL, is a list of 2494 words drawn from four different corpora with a
total size of 12 billion words. It should not be confused with the NGSL (above), which was also developed in 2013. The new-GSL gives around 80%
coverage of the corpora used, similar to the GSL, though with fewer words overall: 2494 compared to approximately 4100 for the GSL.
The 2494 words comprise a core list of 2122 words, which had a similar rank in all four corpora, plus 378 words which were common in the
two more recent corpora. Like the NGSL, it uses lemmas, i.e. headwords with their inflected forms, not word families. It does not (yet) appear
to have been used as the basis for other lists, and is yet to have the same influence as the GSL.

Examples: the, be, of, and, a, in, to, have, that, to
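
The coverage figures quoted for these lists (around 80% for the GSL, for example) are the share of running words in a text or corpus that belong to the list. The following is a minimal sketch of that calculation, not the procedure used by any of the authors above. The tokeniser, the function name coverage and the ten-word sample list are illustrative assumptions, and real studies usually match on lemmas or word families rather than exact word forms.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (a simplification)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def coverage(text, word_list):
    """Return the fraction of running words (tokens) in text found in word_list."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    listed = set(word_list)
    hits = sum(1 for token in tokens if token in listed)
    return hits / len(tokens)

# Hypothetical mini list: the ten example words given for the GSL above.
sample_list = ["the", "be", "of", "and", "a", "to", "in", "he", "have", "it"]
sample_text = "He wants to see the king, and the boy saw it all."
print(f"Coverage: {coverage(sample_text, sample_list):.0%}")
```

Because matching here is on exact forms, "wants" and "saw" count as uncovered; a lemma-based or family-based list would need each token mapped to its headword first.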

Academic single word lists: general purpose

The following are the general academic word lists mentioned earlier.

Word list About
University Word List (UWL) Author: Xue and Nation (1984)

Size: 836 word families

One of the first widely used academic word lists, the UWL contains 836 word families divided into levels based on frequency.
It excludes words from the GSL, and gives coverage of 8.5% of academic texts.
Now largely replaced by the AWL.

Examples: alternative, analyze, approach, arbitrary, assess, assign, assume, compensate, complex, comply
Academic Word List (AWL) Author: Coxhead (2000)

Size: 570 word families

Perhaps the most widely known and used academic word list, the AWL is a list of 570 word families that are not included in the GSL but which appear
frequently in academic texts, across a range of disciplines. Divided into 10 sublists based on frequency. It was designed to be an improvement on the
UWL, and covers around 10% of words in academic texts: a similar amount to the UWL, but
using far fewer word families.

Examples: analyse, approach, area, assess, assume, authority, available, benefit, concept, consist
Academic Keyword List (AKL) Author: Paquot (2010)

Size: 930 words

The Academic Keyword List (AKL) consists of 930 words which appear more frequently in academic texts than non-academic ones.
This tendency is called keyness, which leads to the name of the list, since it identifies keywords in academic (vs. non-academic)
texts (the AVL, below, uses a similar principle to select words). As such,
the AKL does not exclude words from the
GSL. 49.6% of words in the AKL appear in the GSL, 38.7% in the AWL,
while 11.7% appear in neither list.

Example words: ability, absence, account, achievement, act, accept, account (for), absolute, above, according to
New Academic Word List (NAWL) Author: Browne, Culligan and Phillips (2013)

Size: 963 words

The New Academic Word List (NAWL) is a list of words that frequently appear in academic texts, but which are not contained in the
New General Service List (NGSL) (by the same authors).
The NGSL and NAWL in combination give 92% coverage of words (86% for the NGSL and 6% for the NAWL). The NAWL differs from the
AWL in that it is more up-to-date, using the NGSL rather than the much older
GSL as a basis. Additionally, it uses only inflected forms or variant
spellings of words, rather than whole word families, meaning that although it has more headwords than the AWL (963 compared to 570),
it has fewer word forms overall (2604 compared to 3112).

Example words: repertoire, obtain, distribution, parameter, aspect, dynamic, impact, domain, publish, denote.
Academic Vocabulary List (AVL) Author: Gardner and Davies (2013)

Size: 3015 words

The AVL is a list of 3015 academic words derived from the Corpus of Contemporary American English (COCA). The list excludes general
high-frequency words as well as subject-specific (technical) words, though not by directly excluding any existing list. Key features
of the list are ratio (words needed to occur 1.5 times as often in academic texts as in non-academic ones), range (words needed to
occur frequently in at least seven of nine academic disciplines), dispersion (words needed to be evenly dispersed among the disciplines)
and discipline measure (words could not occur more than three times the expected frequency in any of the disciplines). Like the
NAWL and in contrast to the
AWL, the AVL is based on words and inflected forms, not word families.

Example words: study, group, system, social, provide, however, research, level, result, include.
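
The ratio, range and discipline-measure criteria described for the AVL can be sketched as a simple filter over per-million frequencies. This is an illustration under stated assumptions rather than Gardner and Davies's actual procedure: the function name and parameters are hypothetical, the min_share cut-off used to decide whether a word occurs "frequently" in a discipline is an assumed value, and the dispersion criterion is left out.

```python
def passes_avl_style_filters(academic_pm, general_pm, discipline_pm,
                             min_ratio=1.5, min_range=7,
                             min_share=0.2, max_share=3.0):
    """Apply ratio, range and discipline-measure checks to one word.

    academic_pm   -- frequency per million words in the academic corpus
    general_pm    -- frequency per million words in the non-academic corpus
    discipline_pm -- per-million frequencies, one per academic discipline
    min_share     -- ASSUMED cut-off for "occurs frequently" in a discipline,
                     as a fraction of the expected (overall academic) rate
    """
    # Ratio: at least 1.5 times as frequent in academic as in non-academic text.
    if general_pm > 0 and academic_pm / general_pm < min_ratio:
        return False
    # Range: frequent enough in at least seven of the nine disciplines.
    frequent_in = sum(1 for pm in discipline_pm if pm >= min_share * academic_pm)
    if frequent_in < min_range:
        return False
    # Discipline measure: no discipline above three times the expected rate.
    if any(pm > max_share * academic_pm for pm in discipline_pm):
        return False
    return True

# Invented figures for three hypothetical words.
evenly_spread = [90, 110, 100, 95, 105, 80, 120, 100, 100]
one_discipline = [880, 5, 5, 2, 2, 2, 2, 1, 1]
print(passes_avl_style_filters(100, 40, evenly_spread))   # academic and well spread
print(passes_avl_style_filters(100, 10, one_discipline))  # technical: fails range
print(passes_avl_style_filters(100, 90, evenly_spread))   # general: fails ratio
```

Working with per-million rates means the expected rate in each discipline is simply the overall academic rate, which keeps the range and discipline-measure checks independent of sub-corpus size.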

Academic single word lists: field-specific

The following are the field-specific lists mentioned earlier.

Word list About
Science Word List (SWL) Author: Coxhead and Hirsh (2007)

Size: 318 word families

The Science Word List (SWL) provides a list of 318 word families which do not occur in the
GSL or
AWL but which occur with reasonable frequency and range in written science texts.
The authors found that the GSL and AWL in combination give only 80% coverage of science texts, compared to 86.7% for Arts, 88.8% for Commerce and
88.5% for Law. The 318 word families in the SWL make up for this shortfall, and provide an extra coverage of 3.79% of the science corpus used to
derive the list. In comparison, the SWL gives only 0.61% coverage of an Arts corpus, 0.54% for Commerce and 0.34% for Law, demonstrating that it
is a true science list. The SWL is divided into sublists based on frequency, in a similar way to the AWL. It contains 6 sublists, with the first
5 each containing 60 word families, and the last containing 18.

Example words: cell, species, acid, muscle, protein, molecule, nutrient, dense, laboratory, ion.
Business Word List #1 (BWL#1) Author: Konstantakis (2007)

Size: 560 word families

This is the first of two lists called Business Word List (BWL); the
second is considered later.
To compile the list, the author used a corpus of 33 popular Business English course books published between 1986 and 1996.
The list consists of 560 word families, comprising 480 word families selected according to range (needed to occur in at least five of the text books),
supplemented by a further 80 word families selected for frequency (needed to appear at least 10 times). The list excludes
GSL and
AWL words, and therefore provides a third, more specialised and business-oriented list for
students. The BWL provided 2.79% coverage of the texts. A separate list of common abbreviations was compiled, which added a further 0.30% coverage.
These two lists, together with the GSL and AWL, provided 93.47% coverage, although the author noted that, if proper names and nationalities were included
(e.g. London, Mexican), the coverage reached 95.65%, which is above the 95% minimum comprehension threshold. The list is presented in alphabetical order,
without frequencies.

Example words: above-mentioned, accessories, acid, adverse, aerospace, after-sales, agenda, aggressive, aircraft, airline.
Pharmacology Word List (PWL) Author: Fraser (2007)

Size: 601 word families

The PWL is intended to provide a list of words which are common in the field of pharmacology, but which are not contained in the
GSL or AWL. The PWL gives around 13% coverage of pharmacology journal articles, and 15% coverage of pharmacology textbooks.

Example words: abbreviation, abnormality, abolish, absorb, abuse, accumbens, acetonitrile, acetate, acetylcholine, acid.
Medical Academic Word List (MAWL) Author: Wang, Liang, and Ge (2008)

Size: 623 word families

The Medical Academic Word List (MAWL) was developed from a study of a 1.09 million-word corpus of medical research articles from online resources.
It contains 623 word families, and has a coverage of 12.24% of words in the corpus. The MAWL was developed in a similar way to the
AWL (Academic Word List), by first eliminating words from the
GSL (General Service List). In addition, members of the word family needed to occur in at
least half of the 32 subject areas of the corpus, and occur at least 30 times in the corpus. It provides an alternative to the AWL for medical students.

Example words: cell, data, muscular, significant, clinic, analyze, respond, factor, method, protein.
AgroCorpus List Author: Martínez, Beck, and Panza (2009)

Size: 92 word families

The AgroCorpus List is a subset of the AWL, and consists of the word families that were found to be most frequent in
an 826,416-word corpus of agriculture research articles.

Example words: environmental, accumulation, region, variation, chemical.
Basic Engineering List (BEL) Author: Ward (2009)

Size: 299 words

The Basic Engineering List (BEL), developed from a corpus of 250,000 words from 25 engineering textbooks,
is intended to serve as a foundation for students in reading English language engineering textbooks.
The list is purposely short and non-technical in nature, and focuses on word types rather than lemmas or families in order to
encourage a focus on individual words.

Example words: system, calculate, value, flow, process, column, factors.
Business Word List #2 (BWL#2) Author: Hsu (2011)

Size: 426 word families

This is the second of two lists called Business Word List (BWL); the
first is considered above.
This BWL gives 426 word families which occur frequently in business texts, but which are not general words. This list
used a different approach to other specialist lists, by excluding the first 3000 word families from the BNC (British National Corpus),
rather than excluding other word lists. The author used a corpus which consisted of business research articles across 20 business subject
areas. The word families were chosen by range and frequency in the corpus and accounted for 5.66% of words. The words in the BWL are listed
according to which 1000 word section of the BNC they appear in (BNC 4th 1000, BNC 5th 1000, etc.), then by frequency in the business corpus.
Range (number of articles they occur in) is also given. As such, this BWL is more detailed than the first one.

Example words: asset, audit, statistic, review, transact, network, database, acquire, interact, construct
Computer Science Word List (CSWL) Author: Minshall (2013)

Size: 433 word families

This Computer Science Word List (CSWL) was designed for use by non-native English speakers studying computer science in UK universities.
It was developed from a corpus of 3.66 million words from journal articles and conference proceedings covering 10 sub-disciplines of
computer science as defined by the Association for Computing Machinery (ACM). In combination with the GSL and AWL, the CSWL
gave 95.11% coverage of the corpus.

Example words: accelerate, activate, acyclic, adversary, affine, afore, algebra, algorithm, align, alphabet.
Chemistry Academic Word List (CAWL) Author: Valipouri and Nassaji (2013)

Size: 1400 word families

The Chemistry Academic Word List (CAWL) was developed for EFL graduate Chemistry students.
It comprises word families which occur frequently in Chemistry research articles.
It includes both general and academic words, since many high frequency words have different meanings,
frequencies and collocations in specialist contexts. Of the 1400 word families in the CAWL, 683 are from the
GSL, 327 are from the
AWL, while the remaining 390 occur in neither list. In total, the CAWL gives
81.18% coverage of the CRAC (Chemistry Research Article Corpus) used to derive the list.

Example words: use, show, react, results, solve, spectrum, can, form, temperature, high.
Medical Academic Vocabulary List (MAVL) Author: Lei and Liu (2015)

Size: 819 words

The Medical Academic Vocabulary List (MAVL) was developed based on a study of a 2.7 million-word corpus of medical
academic English and a 3.5 million-word corpus of medical English textbooks. The coverage of the MAVL in the two corpora was 19.44% and
20.18% respectively. The MAVL can be contrasted with the earlier
Medical Academic Word List (MAWL), developed in 2008, in four ways.
First, unlike the MAWL, which used only medical academic English texts, the MAVL used medical academic English texts alongside
medical English textbooks to develop the list. Second, unlike the MAWL, the MAVL did not exclude high frequency (general) words.
Third, the MAVL is lemma-based not word family based. Fourth, it provides greater coverage, with the MAVL covering 19.44% of words in medical
academic English texts, compared to 10.52% for the MAWL, and 20.18% of words in medical English textbooks, in contrast to 12.97% for
the MAWL.

Example words: abdominal, ability, abnormal, abnormality, absence, absent, absolute, absorption, accord, accumulate.
Nursing Academic Word List (NAWL) Author: 2015

Size: 676 word families

The Nursing Academic Word List (NAWL) contains the most frequent nursing words in a one million word corpus (called the NRAC) consisting
of 252 English online nursing research articles. It is intended for graduate nursing students who need to read and publish nursing articles
in English. The NAWL covers 13.64% of the NRAC. Not to be confused with the
New Academic Word List (above), also abbreviated NAWL.
Environmental Academic Word List (EAWL) Author: Liu and Han (2015)

Size: 458 word families

Not to be confused with the
Economics Academic Word List, also abbreviated EAWL (below),
the Environmental Academic Word List (EAWL) is intended for environmental science learners. The list gives 15.43% coverage of the
862,242 word corpus used to derive the list, compared to 12.82% for the AWL.
Economics Academic Word List (EAWL) Author: O’Flynn (2020)

Size: 887 words

The Economics Academic Word List (EAWL) is a list of words which frequently appear in economics texts,
but which are not contained in the
New General Service List (NGSL). The 887 words of the EAWL are divided
into 9 sublists based on frequency. The EAWL, which, like the NGSL, is lemma-based, makes up around 5.5% of the words in university
economics texts in English, based on a corpus study of texts ranging from economics journal articles to economics dissertations.
Not to be confused with the
Environmental Academic Word List, also abbreviated EAWL (above).

Example words: administrative, aggregate, agriculture, allocation, aspect, audit, authority, best, better, calculation.
Computer Science Academic Vocabulary List (CSAVL) Author: Roesler (2021)

Size: 1606 words

The Computer Science Academic Vocabulary List (CSAVL) comprises two lists for use by Computer Science undergraduate students in the US.
The first list gives 904 words, while the second supplementary list, CSAVL-S, gives more technical words. Words were chosen by frequency,
range, dispersion and other criteria from a corpus of Computer Science textbooks and journal articles, and together give
19.90% coverage of a second corpus used to evaluate the list. This list, which is a stand-alone list, contrasts with the
CSWL, which is intended as a third, supplementary list to the GSL and AWL.

Example words: system, data, algorithm, such, base, node, model, case, program, information.
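
Most of the field-specific lists above share one selection pattern: exclude the words of a base list (such as the GSL), then keep the words that meet a range threshold (number of subject areas or texts in which they occur) and a frequency threshold (total occurrences). The sketch below illustrates that pattern on invented data. The function name, the toy sub-corpora and the thresholds are assumptions, and it counts word forms, whereas lists such as the MAWL count whole word families.

```python
from collections import Counter

def select_specialised_words(subcorpora, base_list, min_range, min_frequency):
    """Pick words outside base_list that meet range and frequency thresholds.

    subcorpora -- dict mapping a subject-area name to its list of tokens
    base_list  -- words to exclude (e.g. a general service list)
    """
    excluded = set(base_list)
    frequency = Counter()   # total occurrences across the whole corpus
    word_range = Counter()  # number of subject areas each word appears in
    for tokens in subcorpora.values():
        counts = Counter(t for t in tokens if t not in excluded)
        frequency.update(counts)
        word_range.update(counts.keys())
    selected = [w for w in frequency
                if word_range[w] >= min_range and frequency[w] >= min_frequency]
    # Most frequent first, ties broken alphabetically for a stable order.
    return sorted(selected, key=lambda w: (-frequency[w], w))

# Invented toy corpus with three subject areas.
toy_corpus = {
    "cardiology": ["the", "cell", "cell", "protein", "valve"],
    "oncology": ["the", "cell", "protein", "tumour", "tumour"],
    "neurology": ["the", "cell", "neuron", "neuron", "neuron"],
}
print(select_specialised_words(toy_corpus, ["the"], min_range=2, min_frequency=2))
```

Here "neuron" is frequent but confined to one subject area, so the range threshold removes it. This is the same reasoning that keeps narrowly technical words out of lists intended for a whole field.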

Technical single word lists

Two important technical lists, both for school-age students, were derived using methodology similar to that of the other lists
on this page.

Word list About
Middle School Vocabulary Lists (MSVL) Author: Greene and Coxhead (2015)

Size: 600-800 word families per subject

The Middle School Vocabulary Lists (MSVL) are a series of lists developed in 2015 by Greene and
Coxhead, covering English, Health, Mathematics, Science, and Social Studies/History.
The lists were developed from a corpus of 109 textbooks for grades 6-8 (11-14 years old). Like the
AWL, the MSVL excludes words from the
GSL and uses a word family approach. Text coverage of the lists is between 5.83% (Social Studies/History) and 10.17% (Science).

Example words [Health]: drug, physical, alcohol, stress, goal
Example words [Mathematics]: equate, graph, area, fraction, chapter.
Secondary School Vocabulary Lists (SVL) Author: Green and Lambert (2018)

Size: Varies, from 253 words (Mathematics) to 880 words (Biology)

The Secondary School Vocabulary Lists (SVL) are a series of lists of discipline-specific words for secondary school
education, covering eight core subjects: Biology, Chemistry, Economics, English, Geology, History, Mathematics, and Physics. The lists were
devised using methods similar to those used to create the AVL and the MAVL.
The SVL does not present a single list. Rather, it comprises three different types of word list for eight different subjects,
and therefore presents 24 lists in total. The three different list types are: lemma lists (sorted by frequency);
word family lists (also sorted by frequency, of all words in the family); and collocation lists (the most common 10 word associations for each).

Example words [Biology]: cell, blood, plant, enzyme, molecule.
Example words [Economics]: price, cost, demand, rate, firm.

Academic multi-word lists: general purpose

The following are the general academic multi-word lists mentioned earlier.

Word list About
Academic Formulas List (AFL) Author: Simpson-Vlach and Ellis (2009)

Size: 607 formulas

The Academic Formulas List (AFL) contains the most common formulaic sequences in academic English, i.e. recurring word sequences three to
five words long. There are three separate lists: one for formulas that are common in both academic spoken and academic written language
(the core AFL, 207 entries), one for formulas which are used frequently in academic spoken English (200 entries), and one for
those which are used frequently in academic written English (also 200 entries).

Examples [core]: in terms of, at the same time, from the point of view, in order to
Examples [spoken]: be able to, blah blah blah, this is the, you know what I mean
Examples [written]: on the other hand, due to the fact that, it should be noted
Academic Collocation List (ACL) Author: Ackermann and Chen (2013)

Size: 2469 collocations

The Academic Collocation List (ACL) contains 2469 of the most frequent and useful collocations which occur in written academic English.
It was developed using the Pearson International Corpus of Academic English (PICAE), with advice from English teaching experts to ensure
the collocations chosen would be useful to students of English. The ACL gives around 1.4% coverage of words in academic English, in contrast
to only 0.1% coverage for a general corpus.

Example collocations: cognitive ability, abstract concept, sexual abuse, (in) academic circles,
accept responsibility, allow access (to), brief account, great accuracy, achieve (a) goal,
acquire knowledge.
Discourse Connectors List (DCL) Author: Rezvani Kalajahi, Neufeld and Abdullah (2017)

Size: 632 discourse connectors

The Discourse Connectors List (DCL) classifies and describes 632 discourse connectors, ranking them by frequency in three
registers (academic, non-academic and spoken) in two corpora, namely the BNC (British National Corpus)
and COCA (Corpus of Contemporary American English).

Examples: and, or, as, if, when, also, however, after, even, because.
Academic Idioms List Author: Miller (2019)

Size: 170 idioms (spoken), 38 idioms (written)

The Academic Idioms List is derived from the British Academic Spoken English (BASE) corpus for spoken texts and the Oxford Corpus of
Academic English (OCAE) for written texts. Only idioms with a frequency of more than 1.2 per million words in the BASE corpus
were included. Taken together, the idioms in the list account for approximately 0.1% of words in academic English.

Examples [written]: on the other hand, in (the) light of, on the one hand, in the hands of, bear in mind
Examples [spoken]: the balance of power, at the end of the day, the bottom line, take on board, by and large
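
Multi-word lists such as the AFL start from recurring word sequences. As a rough illustration of that first step only, the sketch below counts contiguous sequences of three to five words and keeps those that recur. The function name and the min_count threshold are assumptions, and the published lists apply further statistical and pedagogical filters that are not reproduced here.

```python
from collections import Counter

def recurring_sequences(tokens, min_len=3, max_len=5, min_count=2):
    """Count contiguous word sequences of min_len..max_len words that recur."""
    counts = Counter()
    for n in range(min_len, max_len + 1):
        # Slide a window of n words across the token list.
        for start in range(len(tokens) - n + 1):
            counts[" ".join(tokens[start:start + n])] += 1
    return {seq: c for seq, c in counts.items() if c >= min_count}

# Invented sample sentence; only "in terms of" occurs more than once.
sample_tokens = "in terms of cost and in terms of time".split()
print(recurring_sequences(sample_tokens))
```

On a real corpus the raw counts would normally be converted to a rate per million words before any threshold is applied, so that corpora of different sizes can be compared.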

Academic multi-word lists: field-specific

There appear to be no field-specific, academic multi-word lists at present.

Technical multi-word lists

The following are technical multi-word lists.

Word list About
Computer Science Multi-Word List (CSMWL) Author: Minshall (2013)

Size: 23 collocations

The Computer Science Multi-Word List (CSMWL) was developed by Minshall at the same time as the
CSWL. It comprises only 23 items (listed in full below).

Complete list of CSMWL collocations: control flow graph, data flow, data mining, data set,
data structure, data transfer, lower bound, flash memory, execution time, garbage collection,
machine learning, operating system, polynomial time, response time, scratch pad, search engine,
social network, software development, software engineer, steady state, upper bound, user interface,
virtual machine.
Secondary Phrase Lists (SPL) Author: Green and Lambert (2018)

Size: Varies according to list

The Secondary Phrase Lists (SPL) were developed by Green and Lambert at the same time as the
SVL. They comprise collocations for the same eight subjects as covered
by the SVL.

Example collocations [Biology]: carbon dioxide, amino acids, water potential, blood cells
Example collocations [Economics]: demand curve, interest rate, supply curve, price level
Nursing Collocation List (NCL) Author: Mandić and Dankić (2020)

Size: 488 collocations

The Nursing Collocation List (NCL) is a list of 488 collocations which occur frequently in nursing journal articles. It was developed
using the nursing scientific article corpus (NSAC), which consisted of 1.1 million words drawn from 262 nursing articles, from ten
prominent nursing journals, all published in 2017 or 2018. The list includes only noun-adjective collocations (254, or 52.1% of the
total) and noun-noun collocations (234, or 47.9%), since these are the most common in nursing articles.

Example collocations: alcohol abuse, open access, action research, acute care, medication adherence,
chemotherapy administration, hospital admission, adverse effect, age group, significant amount.

Mossbourne Victoria Park Academy Lists

Miss Manktelow LST


Study the word list: history

for homework

escape

He tried to escape from the cell.

railroad

The slaves used the underground railroad.*

conductor

The orchestra watched the conductor carefully.

reward

There’s a big reward for his capture.

route

We drove home by the scenic route.

civil

You must speak to others in a civil manner.

punishment

The judge handed out a harsh punishment to the criminal.*

* This sentence was added by a Spellzone user.

