Video
Lexical Collocation
Linguist Alexander Piperski on paradoxical word combinations, the bag-of-words model, and the use of logarithms in linguistics
From Wikipedia, the free encyclopedia
John Rupert Firth OBE | |
---|---|
Born | June 17, 1890, Keighley, Yorkshire, UK |
Died | December 14, 1960 (aged 70), Lindfield, West Sussex, UK |
Academic background | |
Education | University of Leeds |
Academic work | |
Institutions | City of Leeds Training College; University of the Punjab; University College London; SOAS University of London |
John Rupert Firth (June 17, 1890 in Keighley, Yorkshire – December 14, 1960 in Lindfield, West Sussex), commonly known as J. R. Firth, was an English linguist and a leading figure in British linguistics during the 1950s.[1]
Education and career
Firth studied history at the University of Leeds, graduating with a BA in 1911 and an MA in 1913. He taught history at the City of Leeds Training College before World War I broke out. He was in the Indian Education Service during 1914–1918.[2] He was Professor of English at the University of the Punjab from 1919 to 1928. He then worked in the phonetics department of University College London before moving to the School of Oriental and African Studies (SOAS), where he became Professor of General Linguistics, a position he held until his retirement in 1956.[3]
In July 1941, before the outbreak of war with Japan, Firth attended a conference on the training of Japanese interpreters and translators and began to think of how crash courses might be devised. By the summer of 1942 he had devised a method of training people rapidly in how to eavesdrop on Japanese conversations (for example, between pilots and ground control) and to interpret what they heard. The first course began on 12 October 1942 and was for RAF personnel. He had used captured Japanese code books and other such material to draw up a list of essential military vocabulary and had arranged for two Japanese teachers at SOAS (one had been interned on the Isle of Man but had volunteered to teach, while the other was a Canadian-Japanese) to record sentences in which these words might be used. Trainees listened through headphones to recordings containing expressions such as ‘Bakugeki junbi taikei tsukure’ (Take up formation for bombing). At the end of each course he sent a report to Bletchley Park commenting on the abilities of each trainee. The trainees were mostly posted to India and played a vital role during the long Burma Campaign giving warning of bombing raids, and a few of them were undertaking similar duties on ships of the Royal Navy during the last year of the war. For his work during the war he was awarded an OBE in 1945.[4]
Contributions to linguistics
His work on prosody, which he emphasised at the expense of the phonemic principle, prefigured later work in autosegmental phonology. Firth is noted for drawing attention to the context-dependent nature of meaning with his notion of ‘context of situation’, and his work on collocational meaning is widely acknowledged in the field of distributional semantics. In particular, he is known for the famous quotation:
- You shall know a word by the company it keeps (Firth, J. R. 1957:11)[5]
Firth developed a particular view of linguistics that has given rise to the adjective ‘Firthian’. Central to this view is the idea of polysystematism. David Crystal describes this as:
- an approach to linguistic analysis based on the view that language patterns cannot be accounted for in terms of a single system of analytic principles and categories … but that different systems may need to be set up at different places within a given level of description.
His approach can be seen as continuing Malinowski’s anthropological semantics and as a precursor of semiotic anthropology.[6][7][8] Anthropological approaches to semantics are an alternative to the three major types of semantic approaches: linguistic semantics, logical semantics, and General semantics.[6] Other independent approaches to semantics are philosophical semantics and psychological semantics.[6]
His theory that «you shall know a word by the company it keeps» / «a word is characterized by the company it keeps»[9] inspired work on word embeddings[10] and has thus had a major impact on natural language processing. Many techniques have been designed to build dense vectors that represent word semantics based on a word’s neighbors (e.g. Word2vec, GloVe).
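The idea of representing a word by its neighbors can be illustrated with a minimal sketch. This is not Word2vec or GloVe, which learn dense vectors; it uses plain sparse co-occurrence counts as the simplest stand-in. The function names, the window size, and the toy corpus are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
import math

def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Toy corpus, invented for illustration only.
toy = [
    "the cat drinks milk".split(),
    "the dog drinks water".split(),
    "the cat chases the dog".split(),
    "stocks fell on monday".split(),
]
vecs = cooccurrence_vectors(toy)

# Words that keep similar company end up with similar vectors.
print(cosine(vecs["cat"], vecs["dog"]) > cosine(vecs["cat"], vecs["monday"]))
```

In this toy corpus, "cat" and "dog" share neighbors such as "the" and "drinks", while "cat" and "monday" share none, so the first similarity is higher. Learned embeddings compress the same kind of signal into far fewer dimensions.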
The ‘London School’
As a teacher in the University of London for more than 20 years, Firth influenced a generation of British linguists. The popularity of his ideas among contemporaries gave rise to what was known as the ‘London School’ of linguistics. Among Firth’s students, the so-called neo-Firthians were exemplified by Michael Halliday, who was Professor of General Linguistics in the University of London from 1965 until 1971.
Firth encouraged a number of his students, who later became well-known linguists, to carry out research on a number of African and Oriental languages. T. F. Mitchell worked on Arabic and Berber, Frank R. Palmer on Ethiopian languages, including Tigre, and Michael Halliday on Chinese. Some other students whose native tongues were not English also worked with him, and that enriched Firth’s theory of prosodic analysis. Among his influential students were Masud Husain Khan and the Arab linguists Ibrahim Anis, Tammam Hassan and Kamal Bashir. Firth gained many insights from the work his students did on Semitic and Oriental languages, and so he made a great departure from the linear analysis of phonology and morphology to a more syntagmatic and paradigmatic analysis, where it is important to distinguish between the two levels of phonematic units (equivalent to phone) and prosodies (equivalent to features like «nasalization», «velarization» etc.). Prosodic analysis paved the way to autosegmental phonology, though many linguists who do not have a good background in the history of phonology do not acknowledge this.[11]
Selected publications
- Speech. London: Ernest Benn, 1930.
- The Tongues of Men. London: Watts, 1937.
- Papers in Linguistics, 1934–1951. London: Oxford University Press, 1957.
- A synopsis of linguistic theory 1930-1955, in J. R. Firth, editor, Studies in Linguistic Analysis, Special volume of the Philological Society, chapter 1, pages 1–32, Oxford: Blackwell, 1957.
- Selected Papers of J. R. Firth, 1952-59, edited by F. R. Palmer. London: Longmans, 1968.
See also
- Phonaestheme
- Systemic linguistics
Notes
- ^ Kenneth Church (2011). «A Pendulum Swung too Far» (PDF). Linguistic Issues in Language Technology. 6 (4). Retrieved 4 June 2015.
- ^ «John Rupert Firth, Portraits of Linguists: A Biographical Source Book for the History of Western Linguistics, 1746-1963, V. 2». Open Indiana | Indiana University Press. Indiana University Press. Retrieved 17 September 2022.
- ^ John R. Firth. On Encyclopædia Britannica. Encyclopædia Britannica Online Academic Edition. Encyclopædia Britannica Inc., 2013
- ^ Peter Kornicki, Eavesdropping on the Emperor: Interrogators and Codebreakers in Britain’s War with Japan (London: Hurst & Co., 2021), pp. 18, 61-62, 64, 92, 146-148, 292
- ^ Firth, J. R. (1957). Studies in Linguistic Analysis (PDF). Wiley-Blackwell.
- ^ a b c Winfried Nöth (1995) Handbook of semiotics p.103
- ^ Edwin Ardener (editor) (1971) Social anthropology and language
- ^ Milton B. Singer (1984) Man’s glassy essence: explorations in semiotic anthropology
- ^ Firth, J. R. (1957). «A synopsis of linguistic theory, 1930-1955». Studies in Linguistic Analysis.
- ^ Jiao, Qilu; Zhang, Shunyao (March 2021). «A Brief Survey of Word Embedding and Its Recent Development». 2021 IEEE 5th Advanced Information Technology, Electronic and Automation Control Conference (IAEAC). 5: 1697–1701. doi:10.1109/IAEAC50856.2021.9390956. ISBN 978-1-7281-8028-1. S2CID 233196376.
- ^ O’Grady, Gerard (2013). Key Concepts in Phonetics and Phonology. Palgrave. p. 55. ISBN 978-0-230-27647-5.
Further reading
- Honeybone, Patrick (2005). «J. R. Firth» (PDF). In Chapman, Siobhan; Routledge, Christopher (eds.). Key Thinkers in Linguistics and the Philosophy of Language. Edinburgh University Press. pp. 80–86. ISBN 978-0-19-518767-0.
- Koerner, E.F.K. (2000). «J. R. Firth and the Cours de linguistique générale: A Historiographical Sketch». In Tomić, Olga Mišeska; Radovanović, Milorad (eds.). History and Perspectives of Language Study: Papers in Honor of Ranko Bugarski. Current Issues in Linguistic Theory. Vol. 186. Amsterdam: John Benjamins. doi:10.1075/cilt.186.09koe. ISBN 9781556199639.
- Koerner, E. F. K. (2004). «R. H. Robins, J. R. Firth, and Linguistic Historiography». Essays in the History of Linguistics. Studies in the History of the Language Sciences. Vol. 104. Amsterdam: John Benjamins. pp. 197–205. doi:10.1075/sihols.104. ISBN 9789027285379. [An earlier, shorter version was published as: Koerner, E. F. K. (2001). «R. H. Robins, J. R. Firth, and Linguistic Historiography». Henry Sweet Society for the History of Linguistic Ideas Bulletin. 36 (1): 5–11. doi:10.1080/02674971.2001.11745530. S2CID 163615138.]
- Plug, Leendert (2004). «The Early Career of J. R. Firth: Comments on Rebori (2002)». Historiographia Linguistica. 31 (2–3): 469–477. doi:10.1075/hl.31.2.15plu.
- Rebori, Victoria (2002). «The legacy of J. R. Firth: A report on recent research». Historiographia Linguistica. 29 (1–2): 165–190. doi:10.1075/hl.29.1.11reb.
Sunday, 20 Jul 2014
Firth once said “You shall know a word by the company it keeps”, and he seems to be right. The words that usually appear with another word tell a lot about it. They may explain what the word means, or what the word means for specific users.
Those lexical associations may help us learn how words behave, and they can disambiguate potentially ambiguous words. In the following, I’ll try to answer the question: “What is the difference between the two English adjectives powerful and strong?”
How?
- I scraped an American news website, and extracted the text from it. I ended up having 38 million words of text. I won’t reveal the website name for now. I don’t think it’s illegal, since I’m not distributing the text, but just in case.
- I used a POS tagger to tag the grammatical categories of words. This lets us know whether a word is a noun, adjective, verb, etc.
- For every word in this corpus, I extracted the preceding 5 words and the following 5 words.
- I calculated the pointwise mutual information (PMI) score between the focus word and every word in that 10-word window. PMI is known to disfavour frequent bigrams (two-word groups).
- I dumped this info in a sqlite database. It’s only one table, but the db enables us to easily get different rankings. The db has info on the frequency of the bigram, its PMI score, and a pre-computed PMI * frequency score. I have found the PMI * frequency score to yield the best results.
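The windowing and scoring steps above can be sketched roughly as follows. This is a minimal illustration and not the original pipeline: it skips scraping and POS tagging, assumes the text is already a flat list of tokens, and estimates PMI as log2(P(w, c) / (P(w) P(c))) from window pair counts, which is one common choice among several. The function names are hypothetical.

```python
import math
from collections import Counter

def collocation_scores(tokens, window=5):
    """Return {(focus, collocate): (freq, pmi, pmi * freq)} over a symmetric window."""
    pair_counts = Counter()
    for i, focus in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pair_counts[(focus, tokens[j])] += 1

    # Marginal counts are taken over the same set of (focus, collocate) pairs.
    total = sum(pair_counts.values())
    focus_counts = Counter()
    context_counts = Counter()
    for (focus, ctx), n in pair_counts.items():
        focus_counts[focus] += n
        context_counts[ctx] += n

    scores = {}
    for (focus, ctx), n in pair_counts.items():
        # PMI = log2( P(focus, ctx) / (P(focus) * P(ctx)) )
        pmi = math.log2(n * total / (focus_counts[focus] * context_counts[ctx]))
        scores[(focus, ctx)] = (n, pmi, pmi * n)
    return scores

def top_collocates(scores, focus, k=10):
    """Collocates of `focus`, ranked by the PMI * frequency score."""
    rows = [(ctx, s) for (f, ctx), s in scores.items() if f == focus]
    rows.sort(key=lambda r: r[1][2], reverse=True)
    return rows[:k]

# Tiny invented token stream, just to exercise the functions.
demo = collocation_scores(["a", "b", "a", "b"], window=1)
print(top_collocates(demo, "a"))
```

Multiplying PMI by frequency is a simple way to offset PMI's preference for rare pairs: a pair has to be both surprising and reasonably frequent to rank highly.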
To make things easier, I present the top 100 collocates of powerful and strong as word clouds. The font size shows how important a collocate is: the bigger the font, the more important the word. The words have been ranked using the PMI * frequency score.
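A ranking like this could be pulled from a single sqlite table with one query. The sketch below is only an assumption about how such a table might look: the table name, column names, and all numbers are invented for illustration and are not taken from the actual database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE collocations (
           focus TEXT, collocate TEXT,
           freq INTEGER, pmi REAL, pmi_freq REAL
       )"""
)

# Invented rows: (focus, collocate, frequency, PMI, PMI * frequency).
rows = [
    ("strong", "tea", 12, 4.0, 48.0),
    ("strong", "wind", 10, 3.0, 30.0),
    ("strong", "the", 50, 0.1, 5.0),
    ("powerful", "engine", 8, 5.0, 40.0),
]
conn.executemany("INSERT INTO collocations VALUES (?, ?, ?, ?, ?)", rows)

def rank_collocates(conn, focus, limit=100):
    """Top collocates of `focus`, ordered by the pre-computed PMI * frequency score."""
    cur = conn.execute(
        "SELECT collocate, pmi_freq FROM collocations "
        "WHERE focus = ? ORDER BY pmi_freq DESC LIMIT ?",
        (focus, limit),
    )
    return cur.fetchall()

print(rank_collocates(conn, "strong", 2))
```

Keeping frequency, PMI, and the combined score as separate columns means a different ranking only needs a different ORDER BY clause.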
First, here are the lexical associations of powerful:
The collocates of “powerful” in a 38 million word corpus of American English
And here is the same for the rival adjective: strong:
The collocates of “strong” in a 38 million word corpus of American English
The difference is now obvious. Isn’t it?
Quotes
- A western scholar must de-europeanize himself, and, in view of the most universal use of English, an Englishman must de-Anglicize himself as well.
- J. R. Firth (1956). «Descriptive linguistics and the study of English.» in: F.R. Palmer (ed.), Selected Papers of J.R. Firth, Indiana University Press, p. 96; As cited in: Angela Senis (2016)
- Collocations are actual words in habitual company. A word in a usual collocation stares you in the face just as it is. Colligations cannot be of words as such. Colligations of grammatical categories related in a grammatical structure do not necessarily follow word divisions or even sub-divisions of words.
- Firth (1962, p. 14), as cited in Wendy J. Anderson, A corpus linguistic analysis of phraseology and collocation in the register of current European Union administrative French. Diss. University of St Andrews, 2003.
Speech, 1930
J. R. Firth (1930). Speech. Benn’s Sixpenny Library. Reprinted in Peter Strevens (ed.) (1964). The Tongues of Men and Speech. London: Oxford University Press.
- The phonetic animal par excellence is man. All men are born with an infinite capacity for making noises and using them.
- 1964, p. 141; Chapter 1: The Origin of Speech
- In English we have noticed twenty-five consonant and about twenty vowel phonemes. Although individual pronunciations may differ, the phonemic habits of the same group or class will be similar. They will make similar use of heterophony. Words not phonetically separated — that is, homophones — may be separated by function or by experiential context
- p. 182
- It is not easy to determine what are the units of speech. Some would say speech sounds, others phonemes… The general opinion is, however, that words, not phones or phonemes or phoneme systems, are the units of speech.
- p. 182-183; As cited in: Angela Senis (2016: 293)
«The Technique of Semantics.» 1935
J. R. Firth (1935). «The Technique of Semantics.» Transactions of the Philological Society, 36-72; p. 37 (Reprinted in Firth (1957) Papers in Linguistics. London: Oxford University Press, 7-33).
- The complete meaning of a word is always contextual, and no study of meaning apart from context can be taken seriously.
- p. 37
- Research into the detailed contextual distribution of sociologically important words, what one might call focal or pivotal words, is only just beginning.
- 1957, p. 10
- The study of such words as work, labour, trade, employ, occupy, play, leisure, time, hours, means, self-respect, in all their derivatives and compounds in sociologically significant contexts during the last twenty years would be quite enlightening. So would the study of words particularly associated with the dress, occupations, and ambitions of women, or the language of advertising, especially of quackery, entertainments, food, drink, or of political movements and propaganda.
- 1957, p. 13
- Meaning… is to be regarded as a complex of contextual relations, and phonetics, grammar, lexicography, and semantics each handles its own components of the complex in its appropriate context.
- p. 54
The tongues of men. 1937
J. R. Firth (1937). The tongues of men. Oxford University Press, 1964 edn.
- Strictly speaking, the grammatical method of resolving a sentence into parts is nothing but a fanciful procedure; but it is the real fountain of all knowledge, since it led to the invention of writing.
- p. 15; As cited in: Angela Senis (2016), «The contribution of John Rupert Firth to the history of linguistics and the rejection of the phoneme theory.» Proceedings of ConSOLE XXIII 273.
- Speech is our most valuable instrument, because we can make it fit our common lives. We are not born to follow words. Words follow life. It has always been so from the very beginning. In fact, the human larynx and the shape of the passages above it have evolved in harmony with the lives our earliest ancestors lived, first in the trees, and then on the ground.
- p. 25 (1968; 24)
«A synopsis of linguistic theory 1930-1955.» 1957
John Rupert Firth (1957). «A synopsis of linguistic theory 1930-1955.» In Special Volume of the Philological Society. Oxford: Oxford University Press.
- You shall know a word by the company it keeps.
- Cited in: Kenneth Church (2011). «A Pendulum Swung too Far». Linguistic Issues in Language Technology 6 (4): 5.
- The various structures of sentences in any given language, comprising for example at least two nominal pieces and a verbal piece must be collated, and such categories as voice, mood, affirmative, negative, tense, aspect, gender, number, person and case, if found applicable and valid in descriptive statement, are to be abstracted from, and referred back to the sentence as a whole.
- p. 20
- There is always the danger that the use of traditional grammatical terms with reference to a wide variety of languages may be taken to imply a secret belief in universal grammar. Every analysis of a particular ‘language’ must of necessity determine the values of the ad hoc categories to which traditional names are given. What is here being sketched is a general linguistic theory applicable to particular linguistic descriptions, not a theory of universals for general linguistic description.
- p. 21; as cited in: Olivares, Beatriz Enriqueta Quiroz. The interpersonal and experiential grammar of Chilean Spanish: Towards a principled Systemic-Functional description based on axial argumentation. Diss. University of Sydney, 2013.
John Rupert Firth? Melanie Mitchell? Ludwig Wittgenstein? A. H. Schutz? Apocryphal?
Question for Quote Investigator: A dictionary defines the meaning of a word by using a sequence of other words. Occasionally, a definition employs a picture. Linguists and artificial intelligence researchers have suggested that the denotations and connotations of a word emerge via an examination of the words that commonly occur adjacent or nearby. This notion is reflected in the following adage:
You shall know a word by the company it keeps.
Would you please explore the provenance of this statement?
Reply from Quote Investigator: The earliest close match known to QI appeared in 1957 within an article by linguist John Rupert Firth titled “A Synopsis of Linguistic Theory” which was published by the Philological Society of London. Boldface added to excerpts by QI:[1] 1968, Selected Papers of J. R. Firth 1952-59 (John Rupert Firth), Edited by F. R. Palmer (Frank Robert Palmer), Chapter 11: A synopsis of linguistic theory, Reprinted from: Studies in linguistic …
As Wittgenstein says, ‘the meaning of words lies in their use.’ The day-to-day practice of playing language games recognizes customs and rules. It follows that a text in such established usage may contain sentences such as ‘Don’t be such an ass!’, ‘You silly ass!’, ‘What an ass he is!’ In these examples, the word ass is in familiar and habitual company, commonly collocated with you silly—, he is a silly—, don’t be such an—. You shall know a word by the company it keeps!
QI believes John Rupert Firth should receive credit for the expression under investigation.
Below are additional selected citations in chronological order.
Firth’s remark alluded to the following earlier saying: Tell me what company you keep and I will tell you what you are. A separate Quote Investigator article covers this adage.
An ancient precursor occurred in the works of Greek tragedian Euripides who died circa 406 B.C. A translation of Euripides by Morris Hickey Morgan appeared in “Bartlett’s Familiar Quotations”:[2] 1938, Familiar Quotations by John Bartlett, Eleventh Edition, Edited by Christopher Morley and Louella D. Everett, Entry: Euripides, Quote Page 968, Little, Brown and Company, Boston, Massachusetts. …
Every man is like the company he is wont to keep.
Phoenix. Fragment 809
Also, a Latin proverb is listed in “The Home Book of Quotations: Classical and Modern” edited by Burton Stevenson:[3] 1949, The Home Book of Quotations: Classical and Modern, Selected by Burton Stevenson, Sixth Edition, Topic: Companions, Quote Page 288, Dodd, Mead and Company, New York. (Verified with scans)
He is known by his companions. (Noscitur a sociis.)
UNKNOWN. A Latin proverb.
In 1845 prominent English legal thinker Herbert Broom published “A Selection of Legal Maxims” which included an entry for the Latin proverb above applied to the domain of judicial interpretation:[4] 1845, A Selection of Legal Maxims, Classified and Illustrated by Herbert Broom, Maxim: Noscitur a sociis, Quote Page 149, A. Maxwell & Son, London. (Google Books Full View)
NOSCITUR A SOCIIS. (3 T.R. 87.)—The meaning of a word may be known by reference to the neighbouring words
. . .
So, where the meaning of any particular word is doubtful or obscure, or where the particular expression when taken singly is inoperative, the intention of the party who has made use of it may frequently be ascertained and carried into effect by looking at the adjoining words, or at expressions occurring in other parts of the same instrument . . .
In 1957 John Rupert Firth crafted the saying about words as mentioned at the beginning of this article.
In 1958 A. H. Schutz published “Some Provençal Words Indicative of Knowledge” in “Speculum: A Journal of Mediaeval Studies”. Schutz discussed the difficulty of understanding words from previous centuries:[5] 1958 October, Speculum: A Journal of Mediaeval Studies, Volume 33, Number 4, Some Provençal Words Indicative of Knowledge by A. H. Schutz, Start Page 508, Quote Page 508, Published by The University …
We cannot count on the dictionaries for help, as their formal definitions allow for no overtones. It is difficult to be succinct in the rendition of an idea centuries old, conceived in an environment completely different from our own, and more difficult to replace that idea by a pat synonym. What we can do and expect to do here, since words do occur in context, is to delineate the scope within that context, to identify a word by the company it keeps.
If, as is frequently the case, our terms occur in well-worn “strings,” the recurrent associative groups allow us to define one item in terms of those to which it is frequently tied, with which it is sufficiently often associated to merit our attention.
In 1997 “The Oxford Dictionary of Literary Quotations” included this entry:[6] 1997, The Oxford Dictionary of Literary Quotations, Edited by Peter Kemp, Topic: Grammar and Usage, Quote Page 100, Oxford University Press, New York. (Verified with scans)
You shall know a word by the company it keeps.
J. R. Firth 1890-1960: ‘A Synopsis of Linguistic Theory’ (1957)
In 2019 Melanie Mitchell published “Artificial Intelligence: A Guide for Thinking Humans”. In the following excerpt, NLP refers to natural language processing performed by computers:[7] 2019, Artificial Intelligence: A Guide for Thinking Humans by Melanie Mitchell, Chapter 11: Words, and the Company They Keep, Section: The Semantic Space of Words, Quote Page 188, Picador Paperback: …
The NLP research community has proposed several methods for encoding words in a way that would capture such semantic relationships. All of these methods are based on the same idea, which was expressed beautifully by the linguist John Firth in 1957: “You shall know a word by the company it keeps.”
That is, the meaning of a word can be defined in terms of other words it tends to occur with, and the words that tend to occur with those words, and so on. Abhorred tends to occur in the same contexts as hated. Laughed tends to occur with the same words that humor finds in its company.
In 2022 “The Economist” published an article titled “Huge ‘foundation models’ are turbo-charging AI progress” which included the following passage:[8] Website: The Economist, Article title: Huge “foundation models” are turbo-charging AI progress, Date on website: June 11, 2022, Website description: General interest magazine based in London. …
Such models are trained using a technique called self-supervised learning, rather than with pre-labelled data sets. As they burrow through piles of text they hide specific words from themselves and then guess, on the basis of the surrounding text, what the hidden word should be. After a few billion guess-compare-improve-guess cycles this Mad-Libs approach gives new statistical power to an adage coined by J.R. Firth, a 20th-century linguist: “You shall know a word by the company it keeps.”
In conclusion, John Rupert Firth employed the quotation in 1957, and he should receive credit. The underlying notion has a long history. There is a thematically related Latin proverb “Noscitur a sociis” which is listed in “Broom’s Legal Maxims” of the 19th century. This maxim suggests that the judicial interpretation of a word should be guided by an examination of neighboring words.
(Special thanks to Robert Schwartz who notified QI of the Latin proverb which is used as a legal maxim: Noscitur a sociis.)
Update History: On September 22, 2022 citations for Euripides, the Latin proverb, and the legal maxim were added to this article.