This article shows you how to extract the meaningful bits of information from raw text and how to identify their roles. Let’s first look into why identifying roles is important.
Take 40% off Getting Started with Natural Language Processing by entering fcckochmar into the discount code box at checkout at manning.com.
Understanding word types
The first fact to notice is that there’s a conceptual difference between the bits of the expression like “[Harry] [met] [Sally]”: “Harry” and “Sally” both refer to people participating in the event, and “met” represents an action. When we humans read text like this, we subconsciously determine the roles each word or expression plays along those lines: to us, words like “Harry” and “Sally” can only represent participants of an action but can’t denote an action itself, and words like “met” can only denote an action. This helps us get at the essence of the message quickly: we read “Harry met Sally” and we understand [Harry: WHO] [met: DID_WHAT] [Sally: WHOM].
This recognition of word types has two major effects: the first effect is that the straightforward unambiguous use of words in their traditional functions helps us interpret the message. Funnily enough, this applies even when we don’t know the meaning of the words. Our expectations about how words are combined in sentences and what roles they play are strong, and when we don’t know what a word means such expectations readily suggest what it might mean: e.g., we might not be able to exactly pin it down, but we can still say that an unknown word means some sort of an object or some sort of an action. This “guessing game” is familiar to anyone who has ever tried learning a foreign language and had to interpret a few unknown words based on other, familiar words in the context. Even if you are a native speaker of English and never tried learning a different language, you can still try playing a guessing game, for example, with nonsensical poetry. Here’s an excerpt from “Jabberwocky”, a famous nonsensical poem by Lewis Carroll:[1]
Figure 1. An example of text where the word meaning can only be guessed
Some of the words here are familiar to anyone, but what do “Jabberwock”, “Bandersnatch” and “frumious” mean? It’s impossible to give a precise definition for any of them because these words don’t exist in English or any other language, and their meaning is anybody’s guess. One can say with high certainty that “Jabberwock” and “Bandersnatch” are some sort of creatures, and “frumious” is some sort of quality.[2] How do we make such guesses? You might notice that the context for these words gives us some clues: for example, we know what “beware” means. It’s an action, and as an action it requires some participants: one doesn’t normally “beware”, one needs to beware of someone or something. We expect to see this someone or something, and here comes “Jabberwock”. Another clue is given away by the word “the” which normally attaches itself to objects (like “the car”) or creatures (like “the dog”), and we arrive at an interpretation of “Jabberwock” and “Bandersnatch” being creatures. Finally, in “the frumious Bandersnatch” the only possible role for “frumious” is some quality because this is how it typically works in language: e.g. “the red car” or “the big dog”.
The second effect of our expectations about the roles that words play is that we tend to notice when these roles are ambiguous or somehow violated, because such violations create a discordance. This is why ambiguity in language is a rich source of jokes and puns, intentional or not. Here’s one expressed in a news headline:
Figure 2. An example of ambiguity in action
What is the first reading that you get? You wouldn’t be the only one if you read this as “Police help a dog to bite a victim”, but common sense suggests that the intended meaning is probably “Police help a victim with a dog bite (or, that was bitten by a dog)”. News headlines are rich in ambiguities like that because they use a specific format aimed at packing the maximum amount of information into the shortest possible expression. This sometimes comes at a price: both “Police help a dog to bite a victim” and “Police help a victim with a dog bite (that was bitten by a dog)” are clearer but longer than “Police help dog bite victim”, which a newspaper might prefer to use. This ambiguity isn’t necessarily intentional, but it’s easy to see how this can be used to make endless jokes.
What exactly causes confusion here? It’s clear that “police” denotes a participant in an event, and “help” denotes the action. “Dog” and “victim” also seem to unambiguously be participants of an action, but things are less clear with “bite”. “Bite” can denote an action as in “Dog bites a victim” or a result of an action as in “He has mosquito bites.” In both cases, what we read is a word “bites”, and it doesn’t give away any further clues as to what it means, but in “Dog bites a victim” it answers the question “What does the dog do?” and in “He has mosquito bites” it answers the question “What does he have?”. Now, when you see a headline like “Police help dog bite victim”, your brain doesn’t know straight away which path to follow:
- Path 1: “bite” is an action answering the question “what does one do?” → “Police help dog [bite: DO_WHAT] victim”
- Path 2: “bite” is the result of an action answering the question “what happened?” → “Police help dog [bite: WHAT] victim”.
Apart from the humorous effect of such confusions, ambiguity may also slow the information processing down and lead to misinterpretations. Try solving Exercise 1 to see how the same expression may lead to completely different readings.
Solution: These are quite well-known examples that are widely used in NLP courses to exemplify ambiguity in language and its effect on interpretation.
In (1), “I” certainly denotes a person, and “can” certainly denotes an action, but “can” as an action has two potential meanings: it can denote ability “I can” = “I am able to” or the action of putting something in cans.[3] “Fish” can denote an animal as in “freshwater fish” (or a product as in “fish and chips”), or it can denote an action as in “learn to fish”. In combination with the two meanings of “can” these can produce two completely different readings of the same sentence: either “I can fish” means “I am able / I know how to fish” or “I put fish in cans”.
In (2), “I” is a person and “saw” is an action, but “duck” may mean an animal or an action of ducking. In the first case, the sentence means that I saw a duck that belongs to her, and in the second it means that I witnessed how she ducked – once again, completely different meanings of what seems to be the same sentence!
Figure 3. Ambiguity might result in some serious misunderstanding[4]
So far, we’ve been using the terminology quite loosely: we’ve been defining words as denoting actions or people or qualities, but in fact there are more standard terms for that. The types of words defined by the different functions that words might fulfill are called parts of speech, and we distinguish between a number of such types:
- words that denote objects, animals, people, places and concepts are called nouns;
- words that denote states, actions and occurrences are called verbs;
- words that denote qualities of objects, animals, people, places and concepts are called adjectives;
- those for qualities of actions, states and occurrences are called adverbs.
Table 1 provides some examples and descriptions of different parts-of-speech:
Table 1. Examples of words of different parts-of-speech
| Part of speech | What it denotes | Examples |
| Nouns | Objects, people, animals, places, concepts, time references | car, Einstein, dog, Paris, calculation, Friday |
| Verbs | Actions, states, occurrences | meet, stay, become, happen |
| Adjectives | Qualities of objects, people, animals, places, concepts | red car, clever man, big dog, beautiful city, fast calculation |
| Adverbs | Qualities of actions, states, occurrences | meet recently, stay longer, just become, happen suddenly |
| Articles | Don’t have a precise meaning of their own, but show whether the noun they are attached to is identifiable in context (it is clear what / who the noun is referring to) or not (the noun hasn’t been mentioned before) | I saw a man = This man is mentioned for the first time (“a” is an indefinite article); The man is clever = This suggests that it should be clear from the context which particular man we are talking about (“the” is a definite article) |
| Prepositions | Don’t have a precise meaning of their own, but serve as a link between two words or groups of words: for example, linking a verb denoting action with nouns denoting participants, or a noun to its attributes | meet on Friday – links action to time; meet with administration – links action to participants; meet at house – links action to location; a man with a hat – links a noun to its attribute |
This isn’t a comprehensive account of all parts-of-speech in English, but with this brief guide you should be able to recognize the roles of the most frequent words in text and this suite of word types should provide you with the necessary basis for implementation of your own information extractor.
Why do we care about the identification of word types in the context of information extraction and other tasks? You’ve seen above that correct and straightforward identification of types helps information processing, whereas ambiguities lead to misunderstandings. This is precisely what happens with automated language processing: machines, like humans, can extract information from text better and more efficiently if they can recognize the roles played by different words, whereas misidentification of these roles may lead to mistakes of various kinds. For instance, having identified that “Jabberwock” is a noun and some sort of a creature, a machine might be able to answer a question like “Who is Jabberwock?” (e.g., “Someone / Something with jaws that bite and claws that catch”), but if a machine processed “I can fish” as “I know how to fish” it wouldn’t be able to answer the question “What did you put in cans?”
Luckily, there are NLP algorithms that can detect word types in text, and such algorithms are called part-of-speech taggers (or POS taggers). Figure 4 presents a mental model to help you put POS taggers into the context of other NLP techniques:
Figure 4. Mental Model that visualizes the flow of information between different NLP components
As POS tagging is an essential part of many tasks in language processing, all NLP toolkits contain a tagger and often you need to include it in your processing pipeline to get at the essence of the message. Let’s now look into how this works in practice.
Part-of-speech tagging with spaCy
I want to introduce spaCy[5] – a useful NLP library that you can add to your toolkit. There are a number of reasons to look into spaCy in this book:
- NLTK and spaCy have their complementary strengths, and it’s good to know how to use both;
- spaCy is an actively supported and fast-developing library that keeps up-to-date with the advances in NLP algorithms and models;
- A large community of people work with this library, and you can find code examples of various applications implemented with or for spaCy on their webpage,[6] as well as find answers to your questions on their github;
- spaCy is actively used in industry; and
- It includes a powerful set of tools particularly applicable to large-scale information extraction.
Unlike NLTK, which treats different components of language analysis as separate steps, spaCy builds an analysis pipeline from the beginning and applies this pipeline to text. Under the hood, the pipeline already includes a number of useful NLP tools which are run on input text without you needing to call on them separately. These tools include, among others, a tokenizer and a POS tagger. You apply the whole lot of tools with a single line of code calling on the spaCy processing pipeline, and then your program stores the result in a convenient format until you need it. This also ensures that the information is passed between the tools without you taking care of the input-output formats. Figure 5 visualizes spaCy’s NLP pipeline, which we’re going to discuss in more detail next:
Figure 5. spaCy’s processing pipeline with some intermediate results[7]
Machines, unlike humans, don’t treat input text as a sequence of sentences or words – for machines, text is a sequence of symbols. The first step that we applied before was splitting text into words – this step is performed by a tool called a tokenizer. A tokenizer takes raw text as an input and returns a list of words as an output. For example, if you pass it a sequence of symbols like “Harry, who Sally met”, it returns a list of tokens [“Harry”, “,”, “who”, …]. Next, we apply a stemmer that converts each word to some general form: this tool takes a word as an input and returns its stem as an output. For instance, a stemmer returns a generic, base form “meet” for both “meeting” and “meets”. A stemmer can be run on a list of words, where it treats each word separately and returns a list of corresponding stems. Other tools require an ordered sequence of words from the original text: for example, we’ve seen that it’s easier to figure out that Jabberwock is a noun if we know that it follows a word like “the”; order matters for POS tagging. This means that each of the three tools – tokenizer, stemmer, POS tagger – requires a different type of input and produces a different type of output, and in order to apply them in sequence we need to know how to represent information for each of them. This is what spaCy’s processing pipeline does for you: it runs a sequence of tools and connects their outputs together.
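To make the tokenization step a little more concrete, here is a minimal sketch of a toy tokenizer built on Python’s standard re module. It is only an illustration of the input and output types involved: the function name simple_tokenize is hypothetical, and spaCy’s own tokenizer is considerably more sophisticated than a single regular expression.

```python
import re


def simple_tokenize(text):
    """Toy tokenizer: split raw text into word and punctuation tokens."""
    # \w+ matches a run of letters or digits (a word);
    # [^\w\s] matches a single character that is neither a word
    # character nor whitespace (a punctuation mark).
    return re.findall(r"\w+|[^\w\s]", text)


tokens = simple_tokenize("Harry, who Sally met")
print(tokens)  # ['Harry', ',', 'who', 'Sally', 'met']
```

Note that the output is an ordered list, which is what tools such as a POS tagger need as their input.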
For information retrieval we opted for stemming, which converts different forms of a word to a common core. We said that it’s useful because it helps connect words together on a larger scale, but it also produces non-words: you won’t always be able to find stems of the words (e.g. something like “retriev”, the common stem of retrieval and retrieve) in a normal dictionary. An alternative to this tool is a lemmatizer, which aims at converting different forms of a word to its base form which can be found in a dictionary: for instance, it returns the lemma “retrieval”, which can be found in a dictionary. Such a base form is called a lemma. In its processing pipeline, spaCy uses a lemmatizer.
The starting point for spaCy’s processing pipeline is, as before, raw text: for example, “On Friday board members meet with senior managers to discuss future development of the company.” The processing pipeline applies tokenization to this text to extract individual words: [“On”, “Friday”, “board”, …]. The words are then passed to a POS tagger that assigns parts-of-speech (or POS) tags like [“ADP”, “PROPN”,[8] “NOUN”, …], to a lemmatizer that produces output like [“On”, “Friday”, …, “member”, …, “manager”, …], and to a bunch of other tools.
You may notice that the processing tools in Figure 5 are contained within a pipeline called nlp. As you’ll shortly see in the code, calling on the nlp pipeline makes the program first invoke all the pre-trained tools and then apply them to the input text. The output of all the steps gets stored in a “container” called Doc – it contains a sequence of tokens extracted from input text and processed with the tools. Here’s where spaCy implementation comes close to object-oriented programming: the tokens are represented as Token objects with a specific set of attributes. If you’ve done object-oriented programming before, you’ll hopefully see the connection soon. If not, here’s a brief explanation: imagine you want to describe a set of cars. All cars share the list of attributes they have: with respect to cars, you may want to talk about the car model, size, color, year of production, body style (e.g. saloon, convertible), type of engine, etc. At the same time, such attributes as wingspan or wing area won’t be applicable to cars – they rather relate to planes. You can define a class of objects called Car and require that each object car of this class should have the same information fields, for instance calling on car.model should return the name of the model of the car, for example car.model=“Volkswagen Beetle”, and car.production_year should return the year the car was made, for example car.production_year=“2003”, etc.
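The car analogy can be sketched in a few lines of Python. This is only an illustration of the idea of a class with a fixed set of attributes, not anything taken from spaCy: the color field and its value are made up for the example, and the production year is stored as an integer here rather than as text.

```python
from dataclasses import dataclass


@dataclass
class Car:
    """Every Car object carries the same set of information fields."""
    model: str
    production_year: int
    color: str  # illustrative extra attribute, not from the text above


car = Car(model="Volkswagen Beetle", production_year=2003, color="red")

print(car.model)            # Volkswagen Beetle
print(car.production_year)  # 2003

# Attributes that only make sense for planes are simply not defined on cars.
print(hasattr(car, "wingspan"))  # False
```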
This is the approach taken by spaCy to represent tokens in text: after tokenization, each token (word) is packed up in an object Token that has a number of attributes. For instance:
- token.text contains the original word itself;
- token.lemma_ stores the lemma (base form) of the word;[9]
- token.pos_ stores its part-of-speech tag;
- token.i stores the index position of the word in text;
- token.lower_ stores the lowercase form of the word;
and so on.
The nlp pipeline aims to fill in the information fields like lemma, pos and others with the values specific for each particular token. Because different tools within the pipeline provide different bits of information, the values for the attributes are added on the go. Figure 6 visualizes this process for the words “on” and “members” in the text “On Friday board members meet with senior managers to discuss future development of the company”:
Figure 6. Processing of words “On” and “members” within the nlp pipeline
Now, let’s see how this is implemented in Python code. Listing 1 provides you with an example.
Listing 1. Code exemplifying how to run spaCy’s processing pipeline
import spacy #A
nlp = spacy.load("en_core_web_sm") #B
doc = nlp("On Friday board members meet with senior managers " +
          "to discuss future development of the company.") #C
rows = []
rows.append(["Word", "Position", "Lowercase", "Lemma",
             "POS", "Alphanumeric", "Stopword"]) #D
for token in doc:
    rows.append([token.text, str(token.i), token.lower_, token.lemma_,
                 token.pos_, str(token.is_alpha), str(token.is_stop)]) #E
columns = zip(*rows) #F
column_widths = [max(len(item) for item in col) for col in columns] #G
for row in rows:
    print(''.join(' {:{width}} '.format(row[i], width=column_widths[i])
                  for i in range(0, len(row)))) #H
#A Start by importing spaCy library
#B spacy.load command initializes the nlp pipeline. The input to the command is a particular type of data (model) that the language tools were trained on. All models use the same naming conventions (en_core_web_), which means that it’s a set of tools trained on English Web data; the last bit denotes the size of data the model was trained on, where sm stands for ‘small’[10]
#C Provide the nlp pipeline with input text
#D Let’s print the output in a tabular format. For clarity, add a header to the printout
#E Add the attributes of each token in the processed text to the output for printing
#F Python’s zip function[11] allows you to reformat input from row-wise representation to column-wise
#G As each column contains strings of variable lengths, calculate the maximum length of strings in each column to allow enough space in the printout
#H Use format functionality to adjust the width of each column in each row as you print out the results[12]
Here’s the output that this code returns for some selected words from the input text:
Word     Position  Lowercase  Lemma    POS    Alphanumeric  Stopword
On       0         on         on       ADP    True          False
Friday   1         friday     friday   PROPN  True          False
...
members  3         members    member   NOUN   True          False
...
to       8         to         to       PART   True          True
discuss  9         discuss    discuss  VERB   True          False
...
.        15        .          .        PUNCT  False         False
This output tells you:
- The first item in each line is the original word from text – it’s returned by token.text;
- The second is the position in text, which starts as all other indexing in Python from zero – this is identified by token.i;
- The third item is the lowercase version of the original word. You may notice that it changes the forms of “On” and “Friday”. This is returned by token.lower_;
- The fourth item is the lemma of the word, which returns “member” for “members” and “manager” for “managers”. Lemma is identified by token.lemma_;
- The fifth item is the part-of-speech tag. Most of the tags should be familiar to you by now. The new tags in this piece of text are PART, which stands for “particle” and is assigned to particle “to” in “to discuss”, and PUNCT for punctuation marks. POS tags are returned by token.pos_;
- The sixth item is a True/False value returned by token.is_alpha, which checks whether a word contains alphabetic characters only. This attribute is False for punctuation marks and some other sequences that don’t consist of letters only, and it’s useful for identifying and filtering out punctuation marks and other non-words;
- Finally, the last, seventh item in the output is a True/False value returned by token.is_stop, which checks whether a word is in a stopwords list – a list of highly frequent words in language that you might want to filter out in many NLP applications, as they aren’t likely to be informative. For example, articles, prepositions and particles have their is_stop values set to True as you can see in the output above.
Solution: Despite the fact that a text like “Jabberwocky” contains non-English words, or possibly non-words at all, this Python code is able to tell that “Jabberwock” and “Bandersnatch” are some creatures that have specific names (it assigns a tag PROPN, proper noun to both of them), and that “frumious” is an adjective. How does it do that? Here’s a glimpse under the hood of a typical POS tagging algorithm (see Figure 7):
Figure 7. A glimpse under the hood of a typical POS tagging algorithm
We’ve said earlier that when we try to figure out what type of a word something like “Jabberwock” is we rely on the context. In particular, the previous words are important to take into account: if we see “the”, chances that the next word is a noun or an adjective are high, but a chance that we see a verb next is minimal – verbs shouldn’t follow articles in grammatically correct English. Technically, we rely on two types of intuitions: we use our expectations about what types of words typically follow other types of words, and we also rely on our knowledge that words like “fish” can be nouns or verbs but hardly anything else. We perform the task of word type identification in sequence. For instance, in the example from Figure 7, when the sentence begins, we already have certain expectations about what type of a word we may see first – quite often, it’s a noun or a pronoun (like “I”). Once we’ve established that it’s likely for a pronoun to start a sentence, we also rely on our intuitions about how likely it is that such a pronoun will be exactly “I”. Then we move on and expect to see a particular range of word types after a pronoun – almost certainly it should be a normal verb or a modal verb (as verbs denoting obligations like “should” and “must” or abilities like “can” and “may” are technically called). More rarely, it may be a noun (like “I, Jabberwock”), an adjective (“I, frumious Bandersnatch”), or some other part of speech. Once we’ve decided that it’s a verb, we assess how likely it is that this verb is “can”; if we’ve decided that it’s a modal verb, we assess how likely it is that this modal verb is “can”, etc. We proceed like that until we reach the end of the sentence, and this is where we assess which interpretation we find more likely. This is one possible step-wise explanation of how our brain processes information, on which part-of-speech tagging is based.
The POS tagging algorithm takes into account two types of expectations: an expectation that a certain type of a word (like modal verb) may follow a certain other type of a word (like pronoun), and an expectation that if it’s a modal verb such a verb may be “can”. These “expectations” are calculated using the data: for example, to find out how likely it is that a modal verb follows a pronoun, we calculate the proportion of times we see a modal verb following a pronoun in data among all the cases where we saw a pronoun. For instance, if we saw ten pronouns like “I” and “we” in data before, and five times out of those ten these pronouns were followed by a modal verb like “can” or “may” (as in “I can” and “we may”), what would the likelihood, or probability, of seeing a modal verb following a pronoun be? Figure 8 gives a hint on how probability can be estimated:
Figure 8. If modal verb follows pronoun 5 out of 10 times, the probability is 5/10
We can calculate it as:
Probability(modal verb follows pronoun) = 5 / 10
or in general case:
Probability(modal verb follows pronoun) = How_often(pronoun is followed by modal verb) / How_often(pronoun is followed by any type of word, modal verb or not)
To estimate how likely (or how probable) it is that the pronoun is “I”, we need to take the number of times we’ve seen a pronoun “I” and divide it by the number of times we’ve seen any pronouns in the data. If among those ten pronouns that we’ve seen in the data before seven were “I” and three were “we”, the probability of seeing a pronoun “I” is estimated as Figure 9 illustrates:
Figure 9. If 7 times out of 10 the pronoun is “I”, the probability of a word being “I” given that we know the POS of such a word is pronoun is 7/10
Probability(pronoun being “I”) = 7 / 10
or in general case:
Probability(pronoun being “I”) = How_often(we’ve seen a pronoun “I”) / How_often(we’ve seen any pronoun, “I” or other)
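Both kinds of estimates can be reproduced by simple counting. The sketch below uses a tiny hand-made set of tagged word pairs, invented purely so that the counts match the numbers in the text (ten pronouns, seven of them “I”, five of them followed by a modal verb). The tag labels PRON, MODAL and VERB and the function names are illustrative assumptions, not the output of any particular tagger.

```python
from collections import Counter

# Hypothetical data: ((word, tag), (next_word, next_tag)) pairs.
# Invented to match the example: 10 pronouns, 7 of which are "I",
# and 5 of which are followed by a modal verb.
tagged_pairs = (
    [(("I", "PRON"), ("can", "MODAL"))] * 3
    + [(("I", "PRON"), ("saw", "VERB"))] * 4
    + [(("we", "PRON"), ("may", "MODAL"))] * 2
    + [(("we", "PRON"), ("met", "VERB"))] * 1
)

tag_counts = Counter()         # how often each tag was seen
transition_counts = Counter()  # how often tag A was followed by tag B
word_counts = Counter()        # how often a tag was realized as a given word

for (word, tag), (_next_word, next_tag) in tagged_pairs:
    tag_counts[tag] += 1
    transition_counts[(tag, next_tag)] += 1
    word_counts[(tag, word)] += 1


def transition_probability(tag, next_tag):
    """Proportion of occurrences of `tag` that are followed by `next_tag`."""
    return transition_counts[(tag, next_tag)] / tag_counts[tag]


def word_probability(tag, word):
    """Proportion of occurrences of `tag` that are the given word."""
    return word_counts[(tag, word)] / tag_counts[tag]


print(transition_probability("PRON", "MODAL"))  # 0.5
print(word_probability("PRON", "I"))            # 0.7
```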
In the end, the algorithm goes through the sequence of tags and words one by one, and takes all the probabilities into account. Because the probability of each decision, each tag and each word is a separate component in the process, these individual probabilities are multiplied. To find out how probable it is that “I can fish” means “I am able / know how to fish”, the algorithm calculates:
Probability(“I can fish” is “pronoun modal_verb verb”) = probability(a pronoun starts a sentence) * probability(this pronoun is “I”) * probability(a pronoun is followed by a modal verb) * probability(this modal verb is “can”) * … * probability(a verb finishes a sentence)
This probability gets compared with the probabilities of all the alternative interpretations, like “I can fish” = “I put fish in cans”:
Probability(“I can fish” is “pronoun verb noun”) = probability(a pronoun starts a sentence) * probability(this pronoun is “I”) * probability(a pronoun is followed by a verb) * probability(this verb is “can”) * … * probability(a noun finishes a sentence)
In the end, the algorithm compares the calculated probabilities for the possible interpretations and chooses the one which is more likely, i.e. has higher probability.
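The comparison of the two readings of “I can fish” can be sketched as follows. All probability values in this example are made-up numbers chosen for illustration, not estimates from real data, and the tag labels and the <START> / <END> markers are assumptions of this sketch. It scores only the two candidate tag sequences discussed above by multiplying the individual probabilities.

```python
# Made-up probabilities of one tag following another.
transition = {
    ("<START>", "PRON"): 0.4,
    ("PRON", "MODAL"): 0.5,
    ("PRON", "VERB"): 0.4,
    ("MODAL", "VERB"): 0.8,
    ("VERB", "NOUN"): 0.3,
    ("VERB", "<END>"): 0.2,
    ("NOUN", "<END>"): 0.3,
}

# Made-up probabilities of a tag being realized as a particular word.
word_given_tag = {
    ("PRON", "I"): 0.7,
    ("MODAL", "can"): 0.3,
    ("VERB", "can"): 0.001,
    ("VERB", "fish"): 0.002,
    ("NOUN", "fish"): 0.005,
}


def sequence_probability(words, tags):
    """Multiply tag-to-tag and tag-to-word probabilities along the sentence."""
    probability = 1.0
    previous_tag = "<START>"
    for word, tag in zip(words, tags):
        probability *= transition.get((previous_tag, tag), 0.0)
        probability *= word_given_tag.get((tag, word), 0.0)
        previous_tag = tag
    # Account for how likely the last tag is to finish a sentence.
    probability *= transition.get((previous_tag, "<END>"), 0.0)
    return probability


words = ["I", "can", "fish"]
candidates = [
    ("PRON", "MODAL", "VERB"),  # "I am able / know how to fish"
    ("PRON", "VERB", "NOUN"),   # "I put fish in cans"
]

for tags in candidates:
    print(tags, sequence_probability(words, tags))

best = max(candidates, key=lambda tags: sequence_probability(words, tags))
print("Most likely reading:", best)
```

With these illustrative numbers the “pronoun, modal verb, verb” reading gets the higher score; different numbers could of course favor the other reading.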
That’s all for this article. We’re going to move onto syntactic parsing in part 2.
[2] A blend of “fuming” and “furious”, according to Lewis Carroll himself.
[3] Formally, when a word has several meanings this is called lexical ambiguity.
[5] To get more information on the library, check https://spacy.io. Installation instructions walk you through the installation process depending on the operating system you’re using: https://spacy.io/usage#quickstart.
[8] In the scheme used by spaCy, prepositions are referred to as “adposition” and use a tag ADP. Words like “Friday” or “Obama” are tagged with PROPN, which stands for “proper nouns” reserved for names of known individuals, places, time references, organizations, events and such. For more information on the tags, see documentation here: https://spacy.io/api/annotation.
[9] You may notice that some attributes are called on using an underscore, like token.lemma_. This is applicable when spaCy has two versions for the same attribute: for example, token.lemma returns an integer version of the lemma, which represents a unique identifier of the lemma in the vocabulary of all lemmas existing in English, and token.lemma_ returns a Unicode (plain text) version of the same thing – see the description of the attributes on https://spacy.io/api/token.
[10] Check out the different language models available for use with spaCy: https://spacy.io/models/en. Small model (en_core_web_sm) is suitable for most purposes and it’s more efficient to load and use, but larger models like en_core_web_md (medium) and en_core_web_lg (large) are more powerful and some NLP tasks require the use of such larger models.
In this article, we’ll learn two methods for getting the meaning of an English word using Python.
The first method is the PyDictionary library.
The second method is the dictionaryapi API.
Let’s get started.
Method #1: Get Word meaning using PyDictionary
As I said, PyDictionary is a library which means we need to install it.
PyDictionary installation
To install PyDictionary, choose your preferred command:
Via pip:
pip3 install PyDictionary
Via easy_install:
easy_install PyDictionary
How to use PyDictionary
To get the meaning of a word, we need to use the meaning() method.
Syntax:
meaning("word", disable_errors=boolean)
meaning() returns the result as a dict. If the word isn’t found, it prints an error and returns None.
In the following example, we’ll use the meaning() method to get the meaning of the word Code.
from PyDictionary import PyDictionary
# Call PyDictionary class
dc = PyDictionary()
# Get meaning of word "Code"
mn = dc.meaning("Code")
# Print Result
print(mn)
Output:
{'Noun': ['a set of rules or principles or laws (especially written ones', 'a coding system used for transmitting messages requiring brevity or secrecy', '(computer science'], 'Verb': ['attach a code to', 'convert ordinary language into code']}
Let’s try a word that does not exist to see what will happen.
# Get meaning of word "Codexx"
mn = dc.meaning("Codexx")
# Print Result
print(mn)
Output:
Error: The Following Error occured: list index out of range
None
If you want to hide the error, pass disable_errors=True as an argument:
# Get meaning of word "Codexx"
mn = dc.meaning("Codexx", disable_errors=True)
# Print Result
print(mn)
Output:
None
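Because the result is either a dict or None, it can help to guard against the missing case before indexing into it. The helper below is a small sketch of that idea; first_definition is a hypothetical name, and the sample dict is a shortened copy of the output shown above for “Code”, so the code runs without the library installed.

```python
def first_definition(meanings, part_of_speech):
    """Return the first definition for a part of speech, or None."""
    if not meanings:  # covers the None returned for unknown words
        return None
    definitions = meanings.get(part_of_speech, [])
    return definitions[0] if definitions else None


# Shortened sample shaped like the output shown above for "Code".
code_meanings = {
    "Noun": ["a set of rules or principles or laws (especially written ones"],
    "Verb": ["attach a code to", "convert ordinary language into code"],
}

print(first_definition(code_meanings, "Verb"))       # attach a code to
print(first_definition(code_meanings, "Adjective"))  # None
print(first_definition(None, "Noun"))                # None
```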
How to use PyDictionary with multiple words
If you have multiple words, you need to use the getMeanings() method.
Syntax:
PyDictionary("list_of_words").getMeanings()
Let’s see an example:
from PyDictionary import PyDictionary
# Words
my_words = ['cat', 'go', 'likewise', 'watch']
dc = PyDictionary(my_words)
# Get meanings of multiple words
res = dc.getMeanings()
# Print result
print(res)
Output:
{'cat': {'Noun': ['feline mammal usually having thick soft fur and no ability to roar: domestic cats; wildcats', 'an informal term for a youth or man', 'a spiteful woman gossip', 'the leaves of the shrub Catha edulis which are chewed like tobacco or used to make tea; has the effect of a euphoric stimulant', 'a whip with nine knotted cords', 'a large tracked vehicle that is propelled by two endless metal belts; frequently used for moving earth in construction and farm work', 'any of several large cats typically able to roar and living in the wild', 'a method of examining body organs by scanning them with X rays and using a computer to construct a series of cross-sectional scans along a single axis'], 'Verb': ["beat with a cat-o'-nine-tails", 'eject the contents of the stomach through the mouth']}, 'go': {'Noun': ['a time period for working (after which you will be relieved by someone else', 'street names for methylenedioxymethamphetamine', 'a usually brief attempt', "a board game for two players who place counters on a grid; the object is to surround and so capture the opponent's counters"], 'Verb': ['change location; move, travel, or proceed, also metaphorically', 'follow a procedure or take a course', 'move away from a place into another direction', 'enter or assume a certain state or condition', 'be awarded; be allotted', 'have a particular form', 'stretch out over a distance, space, time, or scope; run or extend between two points or beyond a certain point', 'follow a certain course', 'be abolished or discarded', 'be or continue to be in a certain condition', 'make a certain noise or sound', 'perform as expected when applied', 'to be spent or finished', 'progress by being changed', 'continue to live and avoid dying', 'pass, fare, or elapse; of a certain state of affairs or action', 'pass from physical life and lose all bodily attributes and functions necessary to sustain life', 'be in the right place or situation', 'be ranked or compare', 'begin or set in 
motion', "have a turn; make one's move in a game", 'be contained in', 'be sounded, played, or expressed', 'blend or harmonize', 'lead, extend, or afford access', 'be the right size or shape; fit correctly or as desired', "go through in search of something; search through someone's belongings in an unauthorized way", 'be spent', 'give support (to', 'stop operating or functioning'], 'Adjective': ['functioning correctly and ready for action']}, 'likewise': {'Adverb': ['in like or similar manner', 'in addition', 'equally']}, 'watch': {'Noun': ['a small portable timepiece', 'a period of time (4 or 2 hours', 'a purposeful surveillance to guard or observe', 'the period during which someone (especially a guard', 'a person employed to keep watch for some anticipated event', 'the rite of staying awake for devotional purposes (especially on the eve of a religious festival'], 'Verb': ['look attentively', 'follow with the eyes or the mind', 'see or watch', 'observe with attention', 'be vigilant, be on the lookout or be careful', 'observe or determine by looking', 'find out, learn, or determine with certainty, usually by making an inquiry or other effort']}}
Method #2: Get a word's meaning using the dictionaryapi API
dictionaryapi is a free API that provides the meanings, audio, synonyms, antonyms, and usage examples of English words.
The API returns its response as JSON, and we’ll use the requests module to make the request.
dictionaryapi API Usage
To get the meaning of a word, we’ll send a GET request to the https://api.dictionaryapi.dev/api/v2/entries/en/<word> URL.
In the following example, we’ll get information about the word javascript:
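import requests
# Word to look up
my_word = "javascript"
# Send a GET request to the dictionary endpoint (the timeout avoids waiting forever)
req = requests.get(f"https://api.dictionaryapi.dev/api/v2/entries/en/{my_word}", timeout=10)
# Raise an exception if the server returned an error status
req.raise_for_status()
# Print the raw JSON response
print(req.text)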
Output:
[{"word":"JavaScript","phonetic":"ˈdʒɑːvəˌskrɪpt","phonetics":[{"text":"ˈdʒɑːvəˌskrɪpt","audio":"//ssl.gstatic.com/dictionary/static/sounds/20200429/javascript--1_gb_1.mp3"}],"origin":"1990s: from Java2 + script1.","meanings":[{"partOfSpeech":"noun","definitions":[{"definition":"an object-oriented computer programming language commonly used to create interactive effects within web browsers.","synonyms":[],"antonyms":[]}]}]}]
As you can see, the response includes information such as:
- Audio URL
- Meanings
- Phonetic
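Since the response is JSON, individual fields can be pulled out with the standard json module. The following is a minimal sketch, assuming the response always has the shape shown in the output above; the helper name extract_definitions is hypothetical, and the sample payload is a trimmed copy of that output, so no network access is needed to try it.

```python
import json

# Trimmed copy of the response shown above (assumed to be representative)
sample_response = (
    '[{"word":"JavaScript","phonetic":"ˈdʒɑːvəˌskrɪpt",'
    '"meanings":[{"partOfSpeech":"noun","definitions":[{"definition":'
    '"an object-oriented computer programming language commonly used to '
    'create interactive effects within web browsers.",'
    '"synonyms":[],"antonyms":[]}]}]}]'
)


def extract_definitions(response_text):
    """Map each part of speech to its list of definition strings."""
    result = {}
    # The top level of the response is a list of entries
    for entry in json.loads(response_text):
        for meaning in entry.get("meanings", []):
            part_of_speech = meaning.get("partOfSpeech", "unknown")
            for item in meaning.get("definitions", []):
                result.setdefault(part_of_speech, []).append(item["definition"])
    return result


definitions = extract_definitions(sample_response)
print(definitions)
```

With a live request, the same helper could be called on req.text instead of the sample string.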
If you get the ModuleNotFoundError: No module named ‘requests’ error, install the module first (for example, with pip install requests).
Happy coding!
Answers to the state exam in lexicology
EXAMINATION CARD No. 1
1. Lexicology, its aims and significance
Lexicology is a branch of linguistics that deals with the systematic description and study of the vocabulary of a language with regard to its origin, development, meaning and current use. The term is composed of two words of Greek origin, lexis + logos: a word about words, or the science of the word. Lexicology is also concerned with morphemes, which make up words, and the study of the word implies reference to variable and fixed word-groups, because words are components of such groups. The semantic properties of words define the general rules of their combination. The general study of vocabulary, irrespective of the specific features of a particular language, is known as general lexicology. English lexicology is therefore called special lexicology, because it studies the peculiarities of the present-day English vocabulary.
Lexicology is inseparable from phonetics, grammar and linguostylistics. Phonetics also investigates vocabulary units, but from the point of view of their sounds. Grammar studies the grammatical peculiarities of words and the grammatical relations between them. Linguostylistics studies the nature, functioning and structure of stylistic devices and the styles of a language.
Language is a means of communication. Thus, the social essence is inherent in the language itself. The branch of linguistics which deals with relations between the language functions on the one hand and the facts of social life on the other hand is termed sociolinguistics.
Modern English lexicology investigates the problems of word structure and word formation, the classification of vocabulary units, the replenishment of the vocabulary, the relations between different lexical layers of the English vocabulary, and some others. Lexicology came into being to meet the demands of different branches of applied linguistics, namely lexicography, the science and art of compiling dictionaries. It is also important for foreign language teaching and literary criticism.
2. Referential approach to meaning
SEMASIOLOGY
There are different approaches to meaning and types of meaning
Meaning is the object of semasiological study -> semasiology is a branch of lexicology which is concerned with the study of the semantic structure of vocabulary units. The study of meaning is the basis of all linguistic investigations.
Russian linguists have also pointed to the complexity of the phenomenon of meaning (Потебня, Щерба, Смирницкий, Уфимцева and others).
There are 3 main types of definition of meaning:
(a) Analytical or referential definition
(b) Functional or contextual approach
(c) Operational or information-oriented definition of meaning
REFERENTIAL APPROACH
Within the referential approach, linguists attempt to establish the interdependence between words and the objects or phenomena they denote. The idea is illustrated by the so-called basic triangle:
Concept
Sound – form_ _ _ _ _ _ _ _ _ _ Referent
[kæt] (concrete object)
The diagram illustrates the correlation between the sound form of a word, the concrete object it denotes and the underlying concept. The dotted line suggests that there is no immediate relation between sound form and referent; the connection between them is conventional (established by human cognition).
However, the diagram fails to show what meaning really is: the concept, the referent, or the relationship between the name and the concept.
The merits: it links the notion of meaning to the process of giving names to objects, processes or phenomena. The drawbacks: it cannot be applied to sentences or to additional meanings that arise in conversation. It fails to account for polysemy and synonymy, and it operates with subjective and intangible mental processes, as neither the referent nor the concept belongs to linguistic data.
EXAMINATION CARD No. 2
1. Functional approach to meaning
FUNCTIONAL (CONTEXTUAL) APPROACH
The supporters of this approach define meaning as the use of a word in a language. They believe that meaning should be studied through contexts. If the distribution (the position of a linguistic unit relative to other linguistic units) of two words is different, we can conclude that their meanings are different too (Ex. He looked at me in surprise; He’s been looking for him for half an hour.)
However, it is hardly possible to collect all contexts for a reliable conclusion. In practice a scholar is guided by experience and intuition. On the whole, this approach may be called complementary to the referential definition and is applied mainly in structural linguistics.
2. Classification of morphemes
A morpheme is the smallest indivisible two-facet language unit which implies an association of a certain meaning with a certain sound form. Unlike words, morphemes cannot function independently (they occur in speech only as parts of words).
Classification of Morphemes
Within the English word stock, morphologically segmentable and non-segmentable words may be distinguished (soundless, rewrite: segmentable; book, car: non-segmentable).
Morphemic segmentability may be of three types:
a) Complete segmentability is characteristic of words with a transparent morphemic structure (morphemes can be easily isolated, e.g. heartless).
b) Conditional segmentability characterizes words whose segmentation into constituent morphemes is doubtful for semantic reasons (retain, detain, contain). Such components are called pseudo-morphemes.
c) Defective morphemic segmentability is the property of words whose component morphemes seldom or never occur in other words. Such morphemes are called unique morphemes (cran- in cranberry, -let in hamlet, 'a small village').
· Semantically, morphemes may be classified into: 1) root morphemes, or radicals (in remake, glassful, disorder the roots make, glass, order are understood as the lexical centres of the words) and 2) non-root morphemes, which include inflectional morphemes (they carry only grammatical meaning and are relevant only for the formation of word-forms) and affixational morphemes (relevant for building different types of stems).
· Structurally, morphemes fall into: free morphemes (a free morpheme coincides with the stem or a word-form, e.g. friend- in the noun friendship); bound morphemes (a bound morpheme occurs only as a constituent part of a word; affixes are bound, for they always make part of a word, e.g. the suffixes -ness, -ship, -ize in darkness, friendship, to activize, and the prefixes im-, dis-, de- in impolite, to disregard, to demobilize); and semi-free or semi-bound morphemes (they can function both as affixes and as free morphemes, e.g. well and half coincide with the stem in to sleep well, half an hour, but act as affixes in well-known, half-done).
EXAMINATION CARD No. 3
1. Types of meaning
The word «meaning» is not homogeneous. Its components are described as «types of meaning». The two main types of meaning are grammatical and lexical meaning.
The grammatical meaning is the component of meaning recurrent in identical sets of individual forms of words (e.g. reads, draws, writes – 3rd person singular; books, boys – plurality; boy’s, father’s – possessive case).
The lexical meaning is the meaning proper to the linguistic unit in all its forms and distribution (e.g. boy, boys, boy’s, boys’ – grammatical meaning and case are different but in all of them we find the semantic component «male child»).
Both grammatical meaning and lexical meaning make up the word meaning and neither of them can exist without the other.
There is also a third type: lexico-grammatical (part-of-speech) meaning. It is the common denominator of all the meanings of words belonging to a lexico-grammatical class (nouns, verbs, adjectives etc.): all nouns share the common meaning of thingness, while all verbs express a process or state.
Denotational meaning – component of the lexical meaning which makes communication possible. The second component of the lexical meaning is the connotational component – the emotive charge and the stylistic value of the word.
2. Syntactic structure and pattern of word-groups
The meaning of word groups can be defined as the combined lexical meaning of the component words but it is not a mere additive result of all the lexical meanings of components. The meaning of the word group itself dominates the meaning of the component members (Ex. an easy rule, an easy person).
The meaning of the word group is further complicated by the pattern of arrangement of its constituents (Ex. school grammar- grammar school).
That’s why we should bear in mind the existence of lexical and structural components of meaning in word groups, since these components are interdependent and inseparable. The syntactic structure (formula) implies the description of the order and arrangement of member-words as parts of speech («to write novels»: verb + noun; «clever at mathematics»: adjective + preposition + noun).
As a rule, the difference in the meaning of the head word is presupposed by the difference in the pattern of the word group in which the word is used (to get + noun = to get letters / presents; to get + to + noun = to get to town). If there are different patterns, there are different meanings. BUT: identity of patterns doesn’t imply identity of meanings.
Semantically, English word groups are analyzed into motivated word groups and non-motivated word groups. Word groups are lexically motivated if their meanings are deducible from the meanings of their components. The degree of motivation may be different.
A blind man — completely motivated
A blind print — the degree of motivation is lower
A blind alley (= the deadlock) — the degree of motivation is still less.
Non-motivated word-groups are usually described as phraseological units.
EXAMINATION CARD No. 4
1. Classification of phraseological units
The term «phraseological unit» was introduced by the Soviet linguist Виноградов and is generally accepted in this country. It is meant to avoid ambiguity with other terms generated by different approaches. Phraseological units may be partially motivated or non-motivated.
The first classification of phraseological units was advanced for the Russian language by a famous Russian linguist Виноградов. According to the degree of idiomaticity phraseological units can be classified into three big groups: phraseological collocations (сочетания), phraseological unities (единства) and phraseological fusions (сращения).
Phraseological collocations are not motivated but contain one component used in its direct meaning, while the other is used metaphorically (e.g. to break the news, to attain success).
Phraseological unities are completely motivated as their meaning is transparent though it is transferred (e.g. to show one’s teeth, the last drop, to bend the knee).
Phraseological fusions are completely non-motivated and stable (e.g. a mare’s nest – a muddle, confusion, nonsense; tit-for-tat – revenge; white elephant – expensive but useless).
But this classification doesn’t take into account the structural characteristic, besides it is rather subjective.
Prof. Смирницкий treats phraseological units as word equivalents and groups them into: (a) one-summit units => they have one meaningful component (to be tied, to make out); (b) multi-summit units => they have two or more meaningful components (black art, to fish in troubled waters).
Within each of these groups he classifies phraseological units according to the part of speech of the summit constituent. He also distinguishes proper phraseological units or units with non-figurative meaning and idioms that have transferred meaning based on metaphor (e.g. to fall in love; to wash one’s dirty linen in public).
This classification was criticized as inconsistent, because it contradicts the principle of idiomaticity advanced by the linguist himself. The inclusion of phrasal verbs into phraseology wasn’t supported by any convincing argument.
Prof. Амосова worked out the so-called contextual approach. She believes that free word groups make up a variable context. Phraseological units make up the so-called fixed context and are subdivided into phrases and idioms.
2. Procedure of morphemic analysis
Morphemic analysis deals with segmentable words. Its procedure aims to split a word into its constituent morphemes and helps to determine their number and type. It is called the method of immediate and ultimate constituents. This method is based on the binary principle, which allows us to break the morphemic structure of a word into two components at each stage. The analysis is completed when we arrive at constituents incapable of any further division. The classical example, from Leonard Bloomfield:
ungentlemanly
I. un-(IC/UC) +gentlemanly (IC) (uncertain, unhappy)
II. gentleman (IC) + -ly (IC/UC) (happily, certainly)
III. gentle (IC) +man (IC/UC) (sportsman, seaman)
IV. gent (IC/UC) + le (IC/UC) (gentile, genteel)
The aim of the analysis is to define the number and the type of morphemes.
As we break the word we obtain at any level only 2 immediate constituents, one of which is the stem of the given word. The morphemic analysis may be based either on the identification of affixational morphemes within a set of words, or root morphemes.
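The binary procedure for ungentlemanly can be sketched in a few lines of Python. This is only an illustration under a strong assumption: the cuts are supplied by hand in a lookup table (BINARY_CUTS, a hypothetical name) copied from stages I to IV, whereas a real analysis justifies each cut by comparison with other words.

```python
# Hand-written binary cuts copied from stages I-IV of the example.
# Assumption: each cut is given, not discovered by the program.
BINARY_CUTS = {
    "ungentlemanly": ("un", "gentlemanly"),
    "gentlemanly": ("gentleman", "ly"),
    "gentleman": ("gentle", "man"),
    "gentle": ("gent", "le"),
}


def stages(word, cuts=BINARY_CUTS):
    """List the successive binary cuts into immediate constituents (ICs)."""
    steps = []
    while word in cuts:
        left, right = cuts[word]
        steps.append((left, right))
        # Continue with whichever immediate constituent is still divisible
        word = left if left in cuts else right
    return steps


def ultimate_constituents(word, cuts=BINARY_CUTS):
    """Keep cutting in two until only indivisible constituents (UCs) remain."""
    if word not in cuts:
        return [word]  # no further division is possible
    left, right = cuts[word]
    return ultimate_constituents(left, cuts) + ultimate_constituents(right, cuts)


for number, (left, right) in enumerate(stages("ungentlemanly"), start=1):
    print(number, left, "+", right)
print(ultimate_constituents("ungentlemanly"))
```

Each stage yields exactly two immediate constituents, and the recursion stops at the five ultimate constituents un-, gent-, -le, -man, -ly.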
EXAMINATION CARD No. 5
1. Causes, nature and results of semantic change
The set of meanings a word possesses isn’t fixed. If approached diachronically, polysemy reflects the sources and types of semantic change. The causes of such changes may be either extra-linguistic, including historical and social factors, foreign influence and the need for a new name, or linguistic, which are due to the associations that words acquire in speech (e.g. «atom» is of Greek origin and is now used in physics; «to engage» in the meaning «to invite» appeared in English due to French influence => to engage for a dance; «to unleash war» goes back to the original «to unleash dogs»).
The nature of semantic change may be of two main types: 1) similarity of meaning (metaphor), which implies a hidden comparison (e.g. bitter style); 2) contiguity of meaning (metonymy), the process of associating two referents, one of which is part of the other or is closely connected with it. In other words, the two are close in type, space or function (e.g. «table» in the meaning of “food” as well as “furniture”).
Semantic change may bring about the following results: 1. narrowing of meaning (e.g. “success” was used to denote any kind of result, but today it is only “good results”);
2. widening of meaning (e.g. “ready” in Old English was derived from “ridan”, which became “ride”: ready for a ride; today it has many meanings);
3. degeneration of meaning: acquisition by a word of some derogatory or negative emotive charge (e.g. «villain» was borrowed from French in the meaning “farm servant”, but today it means “a wicked person”);
4. amelioration of meaning: acquisition by a word of some positive emotive charge (e.g. «kwen» in Old English meant «a woman», but in Modern English it is «queen»).
Results 3 and 4 illustrate a change in both the denotational and the connotational meaning, while results 1 and 2 involve a change in the denotational meaning.
The change of meaning can also be expressed through a change in the number and arrangement of word meanings without any other changes in the semantic structure of a word.
2. Productivity of word-formation means
According to Смирницкий, word-formation is the system of derivative types of words and the process of creating new words from the material available in the language. Words are formed after certain structural and semantic patterns. The main two types of word-formation are: word-derivation and word-composition (compounding).
The degree of productivity of word-formation and factors that favor it make an important aspect of synchronic description of every derivational pattern within the two types of word-formation. The two general restrictions imposed on the derivational patterns are: 1. the part of speech in which the pattern functions; 2. the meaning which is attached to it.
Three degrees of productivity are distinguished for derivational patterns and individual derivational affixes: highly productive, productive or semi-productive and non-productive.
Productivity of derivational patterns and affixes shouldn’t be identified with frequency of occurrence in speech (e.g. -er in worker and -ful in beautiful are both active suffixes because they are very frequently used; but -er is productive, as it is actively used to form new words, while -ful is non-productive, since no new words are built with it).
EXAMINATION CARD No. 6
1. Morphological, phonetical and semantic motivation
A new meaning of a word is always motivated. Motivation is the connection between the form of the word (i.e. its phonetic and morphological composition and structural pattern) and its meaning. Therefore a word may be motivated phonetically, morphologically and semantically.
Phonetically motivated words are not numerous. They imitate the sounds (e.g. crash, buzz, ring). Or sometimes they imitate quick movement (e.g. rain, swing).
Morphological motivation is expressed through the relationship of morphemes => all one-morpheme words aren’t motivated. The words like «matter» are called non-motivated or idiomatic while the words like «cranberry» are partially motivated because structurally they are transparent, but «cran» is devoid of lexical meaning; «berry» has its lexical meaning.
Semantic motivation is the relationship between the direct meaning of the word and other co-existing meanings or lexico-semantic variants within the semantic structure of a polysemantic word (e.g. «root»— «roots of evil» — motivated by its direct meaning, «the fruits of peace» — is the result).
Motivation is a historical category and it may fade or completely disappear in the course of years.
2. Classification of compounds
The meaning of a compound word is made up of two components: structural meaning of a compound and lexical meaning of its constituents.
Compound words can be classified according to different principles.
1. According to the relations between the ICs compound words fall into two classes: 1) coordinative compounds and 2) subordinative compounds.
In coordinative compounds the two ICs are semantically equally important. The coordinative compounds fall into three groups:
a) reduplicative compounds which are made up by the repetition of the same base, e.g. pooh-pooh (to dismiss with contempt), fifty-fifty;
b) compounds formed by joining phonically varied rhythmic twin forms, e.g. chit-chat, zig-zag (with the same initial consonants but different vowels); walkie-talkie (a portable two-way radio), clap-trap (nonsense) (with different initial consonants but the same vowels);
c) additive compounds which are built on stems of the independently functioning words of the same part of speech, e.g. actor-manager, queen-bee.
In subordinative compounds the components are neither structurally nor semantically equal in importance but are based on the domination of the head-member which is, as a rule, the second IC, e.g. stone-deaf, age-long. The second IC preconditions the part-of-speech meaning of the whole compound.
2. According to the part of speech they represent, compounds fall into:
1) compound nouns, e.g. sunbeam, maidservant;
2) compound adjectives, e.g. heart-free, far-reaching;
3) compound pronouns, e.g. somebody, nothing;
4) compound adverbs, e.g. nowhere, inside;
5) compound verbs, e.g. to offset, to bypass, to mass-produce.
From the diachronic point of view many compound verbs of the present-day language are treated not as compound verbs proper but as polymorphic verbs of secondary derivation. They are termed pseudo-compounds and are represented by two groups: a) verbs formed by means of conversion from the stems of compound nouns, e.g. to spotlight (from spotlight); b) verbs formed by back-derivation from the stems of compound nouns, e.g. to babysit (from baby-sitter).
However synchronically compound verbs correspond to the definition of a compound as a word consisting of two free stems and functioning in the sentence as a separate lexical unit. Thus, it seems logical to consider such words as compounds by right of their structure.
3. According to the means of composition compound words are classified into:
1) compounds composed without connecting elements, e.g. heartache, dog-house;
2) compounds composed with the help of a vowel or a consonant as a linking element, e.g. handicraft, speedometer, statesman;
3) compounds composed with the help of linking elements represented by preposition or conjunction stems, e.g. son-in-law, pepper-and-salt.
4. According to the type of bases that form compounds the following classes can be singled out:
1) compounds proper that are formed by joining together bases built on the stems or on the word-forms with or without a linking element, e.g. door-step, street-fighting;
2) derivational compounds that are formed by joining affixes to the bases built on the word-groups or by converting the bases built on the word-groups into other parts of speech, e.g. long-legged —> (long legs) + -ed; a turnkey —> (to turn key) + conversion. Thus, derivational compounds fall into two groups: a) derivational compounds mainly formed with the help of the suffixes -ed and -er applied to bases built, as a rule, on attributive phrases, e.g. narrow-minded, doll-faced, lefthander; b) derivational compounds formed by conversion applied to bases built, as a rule, on three types of phrases — verbal-adverbial phrases (a breakdown), verbal-nominal phrases (a kill-joy) and attributive phrases (a sweet-tooth).
EXAMINATION CARD No. 7
1. Diachronic and synchronic approaches to polysemy
Diachronically, polysemy is understood as the growth and development of the semantic structure of the word. Historically we differentiate between the primary and secondary meanings of words.
The relation between these meanings is not only one of order of appearance but also one of dependence => we can say that the secondary meaning is always the derived meaning (e.g. dog – 1. animal, 2. despicable person).
Synchronically it is possible to distinguish between the major meaning of a word and its minor meanings. However, it is often hard to grade the individual meanings of a word in order of their comparative value (e.g. to get the letter, «to receive the letter»; to get to London, «to arrive in London», minor).
The only more or less objective criterion in this case is the frequency of occurrence in speech (e.g. table – 1. furniture, 2. food). The semantic structure is never static and the primary meaning of a word may become synchronically one of the minor meanings and vice versa. Stylistic factors should always be taken into consideration.
Polysemy of words: «yellow» – sensational (Am., sl.)
The meaning which has the highest frequency is the one representative of the whole semantic structure of the word. The Russian equivalent of «a table» which first comes to mind when you hear this word is «стол» in the meaning «a piece of furniture». Words that correspond in their major meanings in two different languages are referred to as correlated words, though their semantic structures may be different.
Primary meaning — historically first.
Major meaning — the most frequently used meaning of the word synchronically.
2. Typical semantic relations between words in conversion pairs
We can single out the following typical semantic relation in conversion pairs:
1) Verbs converted from nouns (denominal verbs):
a) Actions characteristic of the subject (e.g. ape – to ape – imitate in a foolish way);
b) Instrumental use of the object (e.g. whip — to whip – strike with a whip);
c) Acquisition or addition of the objects (e.g. fish — to fish — to catch fish);
d) Deprivation of the object (e.g. dust — to dust – remove dust).
2) Nouns converted from verbs (deverbal nouns):
a) Instance of the action (e.g. to move — a move = change of position);
b) Agent of an action (e.g. to cheat — a cheat – a person who cheats);
c) Place of the action (e.g. to walk – a walk – a place for walking);
d) Object or result of the action (e.g. to find – a find – something found).
EXAMINATION CARD No. 8
1. Classification of homonyms
Homonyms are words that are identical in their sound-form or spelling but different in meaning and distribution.
1) Homonyms proper are words similar in their sound-form and graphic form but different in meaning (e.g. «a ball» – a round object for playing; «a ball» – a meeting for dances).
2) Homophones are words similar in their sound-form but different in spelling and meaning (e.g. «peace» – «piece», «sight» – «site»).
3) Homographs are words which have similar spelling but different sound-form and meaning (e.g. «a row» [rau] – «a quarrel»; «a row» [rəu] – «a number of persons or things in a more or less straight line»).
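The three classes above differ only in which of the two forms, sound-form and spelling, coincide, so the classification can be sketched as a small decision function. This is a minimal illustration, not a linguistic tool: the Word record and classify_pair are hypothetical names, and the transcriptions are rough ones supplied by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    spelling: str
    sound_form: str  # rough transcription, supplied by hand
    meaning: str


def classify_pair(a, b):
    """Classify two words by which of their forms coincide."""
    if a.meaning == b.meaning:
        return "not homonyms"  # homonyms must differ in meaning
    same_sound = a.sound_form == b.sound_form
    same_spelling = a.spelling == b.spelling
    if same_sound and same_spelling:
        return "homonyms proper"
    if same_sound:
        return "homophones"
    if same_spelling:
        return "homographs"
    return "not homonyms"


ball_toy = Word("ball", "bɔːl", "a round object for playing")
ball_dance = Word("ball", "bɔːl", "a meeting for dances")
peace = Word("peace", "piːs", "absence of war")
piece = Word("piece", "piːs", "a part of something")
row_quarrel = Word("row", "rau", "a quarrel")
row_line = Word("row", "rəu", "things in a more or less straight line")

print(classify_pair(ball_toy, ball_dance))
print(classify_pair(peace, piece))
print(classify_pair(row_quarrel, row_line))
```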
There is another classification by Смирницкий. According to the type of meaning in which homonyms differ, homonyms proper can be classified into:
I. Lexical homonyms — different in lexical meaning (e.g. «ball»);
II. Lexical-grammatical homonyms which differ in lexical-grammatical meanings (e.g. «a seal», the animal; «to seal», to close an envelope).
III. Grammatical homonyms which differ in grammatical meaning only (e.g. «used» — Past Indefinite, «used»- Past Participle; «pupils»- the meaning of plurality, «pupil’s»- the meaning of possessive case).
All cases of homonymy may be subdivided into full and partial homonymy. If words are identical in all their forms, they are full homonyms (e.g. «ball»-«ball»). But: «a seal» — «to seal» have only two homonymous forms, hence, they are partial homonyms.
2. Classification of prefixes
Prefixation is the formation of words with the help of prefixes. There are about 51 prefixes in the system of modern English word-formation.
1. According to the type they are distinguished into: a) prefixes that are not correlated with any independent word (un-, dis-), and b) prefixes that are correlated with functional words (e.g. out-, over-, under-).
There are about 25 convertive prefixes which can transfer words to a different part of speech (e.g. embronze).
Prefixes may be classified on different principles. Diachronically they may be divided into native and foreign origin, synchronically:
1. According to the class they preferably form: verbs (im, un), adjectives (un-, in-, il-, ir-) and nouns (non-, sub-, ex-).
2. According to the lexical-grammatical type of the base they are added to:
a). Deverbal — rewrite, overdo;
b). Denominal — unbutton, detrain, ex-president,
c). Deadjectival — uneasy, biannual.
It is of interest to note that the most productive prefixal pattern for adjectives is the one made up of the prefix un- and the base built either on adjectival stems or present and past participle, e.g. unknown, unsmiling, unseen etc.
3. According to their semantic structure prefixes may fall into monosemantic and polysemantic.
4. According to the generic-denotational meaning they are divided into different groups:
a). Negative prefixes: un-, dis-, non-, in-, a- (e.g. unemployment, non-scientific, incorrect, disloyal, amoral, asymmetry).
b). Reversative or privative prefixes: un-, de-, dis- (e.g. untie, unleash, decentralize, disconnect).
c). Pejorative prefixes: mis-, mal-, pseudo- (e.g. miscalculate, misinform, maltreat, pseudo-classicism).
d). Prefixes of time and order: fore-, pre-, post-, ex- (e.g. foretell, pre-war, post-war, ex-president).
e). Prefix of repetition re- (e.g. rebuild, rewrite).
f). Locative prefixes: super-, sub-, inter-, trans- (e.g. superstructure, subway, inter-continental, transatlantic).
5. According to their stylistic reference:
a). Neutral: un-, out-, over-, re-, under- (e.g. outnumber, unknown, unnatural, oversee, underestimate).
b). Stylistically marked: pseudo-, super-, ultra-, uni-, bi- (e.g. pseudo-classical, superstructure, ultra-violet, unilateral) they are bookish.
6. According to the degree of productivity: a). highly productive, b). productive, c). non-productive.
EXAMINATION CARD No. 9
1. Types of linguistic contexts
The term “context” denotes the minimal stretch of speech determining each individual meaning of the word. Contexts may be of two types: linguistic (verbal) and extra-linguistic (non-verbal).
Linguistic contexts may be subdivided into lexical and grammatical.
In lexical contexts, what matters most is the group of lexical items combined with the polysemantic word under consideration. For example, the adjective “heavy” used with words such as “load” or “table” means ‘of great weight’; with names of natural phenomena (“rain”, “storm”, “snow”, “wind”) it is understood as ‘abundant, striking, falling with force’; and with “industry”, “artillery”, “arms” it means ‘the larger kind of something’. The meaning at the level of lexical contexts is sometimes described as meaning by collocation.
In grammatical contexts it is the grammatical (syntactic) structure of the context that serves to determine the individual meanings of a polysemantic word. For example, the meaning ‘to force, to induce’ of the verb “to make” is found only in the syntactic structure “to make + pronoun + verb”, whereas the meaning ‘to become’ is found in the structure “to make + adjective + noun” (to make a good teacher, a good wife). Such meanings are sometimes described as grammatically bound meanings.
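Meaning by collocation can be sketched in a few lines of Python. The lookup table below is only an illustration built from the examples above; the names HEAVY_SENSES and sense_of_heavy are hypothetical, and a real system would need far more collocates than this toy table holds.

```python
# Toy collocation table: each meaning of "heavy" is paired with the nouns
# that select it (taken from the examples in the notes above).
HEAVY_SENSES = {
    "of great weight": {"load", "table"},
    "abundant, striking, falling with force": {"rain", "storm", "snow", "wind"},
    "the larger kind of something": {"industry", "artillery", "arms"},
}


def sense_of_heavy(noun):
    """Return the meaning of 'heavy' suggested by the noun it combines with."""
    noun = noun.lower()
    for meaning, collocates in HEAVY_SENSES.items():
        if noun in collocates:
            return meaning
    return None  # the collocate is not covered by this toy table


print(sense_of_heavy("rain"))  # abundant, striking, falling with force
```

The function returns None for any noun outside the table, which reflects the limits of a purely list-based approach rather than a claim that “heavy” has no meaning there.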
2. Classification of suffixes
Suffixation is the formation of words with the help of suffixes. Suffixes usually modify the lexical meaning of the base and transfer words to a different part of speech. There are suffixes, however, which do not shift words from one part of speech into another; a suffix of this kind usually transfers a word into a different semantic group, e.g. a concrete noun becomes an abstract one, as in child – childhood, friend – friendship, etc. Suffixes may be classified:
1. According to the part of speech they form
a). Noun-suffixes: -er, -dom, -ness, -ation (e.g. teacher, freedom, brightness, justification).
b). Adjective-suffixes: -able, -less, -ful, -ic, -ous (e.g. agreeable, careless, doubtful, poetic, courageous).
c). Verb-suffixes: -en, -fy, -ize (e.g. darken, satisfy, harmonize).
d). Adverb-suffixes: -ly, -ward (e.g. quickly, eastward).
2. According to the lexico-grammatical character of the base the suffixes are usually added to:
a). Deverbal suffixes (those added to the verbal base): -er, -ing, -ment, -able (e.g. speaker, reading, agreement, suitable).
b). Denominal suffixes (those added to the noun base): -less, -ish, -ful, -ist, -some (e.g. handless, childish, mouthful, troublesome).
c). Deadjectival suffixes (those affixed to the adjective base): -en, -ly, -ish, -ness (e.g. blacken, slowly, reddish, brightness).
3. According to the meaning expressed by suffixes:
a). The agent of an action: -er, -ant (e.g. baker, dancer, defendant).
b). Appurtenance: -an, -ian, -ese (e.g. Arabian, Elizabethan, Russian, Chinese, Japanese).
c). Collectivity: -age, -dom, -ery (-ry) (e.g. freightage, officialdom, peasantry).
d). Diminutiveness: -ie, -let, -ling (birdie, girlie, cloudlet, booklet, darling).
4. According to the degree of productivity:
a). Highly productive
b). Productive
c). Non-productive
5. According to the stylistic value:
a). Stylistically neutral: -able, -er, -ing.
b). Stylistically marked: -oid, -i/form, -aceous, -tron (e.g. asteroid).
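The first criterion above (the part of speech a suffix forms) lends itself to a small illustration. The sketch below assumes a table containing only the suffixes listed in these notes; SUFFIX_TO_POS and guess_pos_by_suffix are hypothetical names, and the check is purely formal, so it says nothing about whether the ending really is a suffix in a given word.

```python
# Suffixes grouped by the part of speech they form (from the notes above).
SUFFIX_TO_POS = {
    "noun": ["er", "dom", "ness", "ation"],
    "adjective": ["able", "less", "ful", "ic", "ous"],
    "verb": ["en", "fy", "ize"],
    "adverb": ["ly", "ward"],
}


def guess_pos_by_suffix(word):
    """Return (part of speech, suffix) for the longest matching suffix."""
    word = word.lower()
    best = (None, "")
    for pos, suffixes in SUFFIX_TO_POS.items():
        for suffix in suffixes:
            is_longer = len(suffix) > len(best[1])
            # Require some base to remain once the suffix is removed.
            if word.endswith(suffix) and len(word) > len(suffix) and is_longer:
                best = (pos, suffix)
    return best


for example in ["brightness", "careless", "darken", "quickly"]:
    print(example, guess_pos_by_suffix(example))
```

Because the match is on spelling alone, a simple word such as “table” is wrongly reported as an adjective in -able; separating true suffixes from accidental endings requires morphemic analysis.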
EXAMINATION CARD No. 10
1. Semantic equivalence and synonymy
The traditional initial category of words that can be singled out on the basis of proximity is synonyms. The degree of proximity varies from semantic equivalence to partial semantic similarity. Classes of full synonyms are very rare and are limited mainly to terms.
The greatest degree of similarity is found in words that are identical in the denotational aspect of meaning and differ in the connotational one (e.g. father – dad; imitate – monkey). Such synonyms are called stylistic synonyms. In the majority of cases, however, a change in the connotational aspect of meaning affects the denotational aspect in some way. Synonyms that differ in their denotational aspect are called ideographic synonyms (e.g. clever – bright, smell – odor, kill – murder, power – strength); they are the most common kind.
It is obvious that synonyms cannot be completely interchangeable in all contexts. Synonyms are words different in their sound-form but similar in their denotational aspect of meaning and interchangeable at least in some contexts.
Each synonymic group comprises a dominant element. This synonymic dominant is a general term which has no additional connotation (e.g. famous in the group famous, celebrated, distinguished; leave in the group leave, depart, quit, retire, clear out).
Synonymic dominants have high frequency of usage, vast combinability and lack connotation.
2. Derivational types of words
The basic units of the derivative structure of words are derivational bases, derivational affixes and derivational patterns.
The relations between words with a common root but of different derivative structure are known as derivative relations.
The derivational base is the part of the word which establishes connections with the lexical unit that motivates the derivative and defines its lexical meaning. It is to this part of the word that the rule of word-formation is applied. Structurally, derivational bases fall into three classes: 1. bases that coincide with morphological stems (beautiful, beautifully); 2. bases that coincide with word-forms (unknown), a class limited mainly to verbs; 3. bases that coincide with word-groups, which are mainly active in the class of adjectives and nouns (blue-eyed, easy-going).
According to their derivational structure words fall into simplexes (simple, non-derived words) and complexes (derivatives). Complexes are grouped into derivatives and compounds. Derivatives fall into affixational (suffixal and prefixal) types and conversions. Complexes constitute the largest class of words. Both the morphemic and the derivational structure of words are subject to various changes in the course of time.
EXAMINATION CARD No. 11
1. Semantic contrasts and antonymy
The semantic relations of opposition are the basis for grouping antonyms. The term «antonym» is of Greek origin and means “opposite name”. It is used to describe words that differ in sound-form and are characterised by different types of semantic contrast of denotational meaning and by interchangeability at least in some contexts.
Structurally, all antonyms can be subdivided into absolute (having different roots) and derivational (of the same root), (e.g. «right»- «wrong»; «to arrive»- «to leave» are absolute antonyms; but «to fit» — «to unfit» are derivational).
Semantically, all antonyms can be divided in at least 3 groups:
a) Contradictories. They express contradictory notions which are mutually opposed and deny each other. Their relations can be described by the formula «A versus NOT A»: alive vs. dead (not alive); patient vs. impatient (not patient). Contradictories may be polar or relative (to hate- to love [not to love doesn’t mean «hate»]).
b) Contraries are also mutually opposed, but they admit some possibility between themselves because they are gradable (e.g. cold – hot, warm; hot – cold, cool). This group also includes words opposed by the presence of such components of meaning as SEX and AGE (man -woman; man — boy etc.).
c) Incompatibles. The relations between them are not of contradiction but of exclusion. They exclude the possibility of other words from the same semantic set (e.g. «red» is not simply opposed to «white»; it excludes all other colours; the same is true of such words as «morning», «day», «night», etc.).
There is another type of opposition which is formed with reversive antonyms. They imply the denotation of the same referent, but viewed from different points (e.g. to buy – to sell, to give – to receive, to cause – to suffer).
A polysemantic word may have as many antonyms as it has meanings. However, not all words and meanings have antonyms (e.g. it is difficult to find an antonym for «a table» or «a book»).
Relations of antonymy are limited to a certain context, and they serve to differentiate the meanings of a polysemantic word (e.g. a slice of bread: «thick» vs. «thin», but a person: «fat» vs. «thin»).
2. Types of word segmentability
Within the English word stock, morphologically segmentable and non-segmentable words may be distinguished (soundless, rewrite are segmentable; book, car are non-segmentable).
Morphemic segmentability may be of three types: 1. complete, 2. conditional, 3. defective.
A). Complete segmentability is characteristic of words with a transparent morphemic structure. Their morphemes can be easily isolated and are called morphemes proper or full morphemes (e.g. senseless, endless, useless). The transparent morphemic structure is conditioned by the fact that their constituent morphemes recur with the same meaning in a number of other words.
B). Conditional segmentability characterizes words whose segmentation into constituent morphemes is doubtful for semantic reasons (e.g. retain, detain, contain). The sound-clusters «re-, de-, con-» seem to be easily isolated since they recur in other words, but they have nothing in common with the morphemes «re-, de-, con-» found in the words «rewrite», «decode», «condensation». These sound-clusters possess neither lexical meaning nor part-of-speech meaning, but they have differential and distributional meaning. Morphemes of this kind are called pseudo-morphemes (quasi-morphemes).
C). Defective morphemic segmentability is the property of words whose component morphemes seldom or never recur in other words. Such morphemes are called unique morphemes. A unique morpheme can be isolated and displays a more or less clear meaning which is upheld by the denotational meaning of the other morpheme of the word (cranberry, strawberry, hamlet).
EXAMINATION CARD No. 12
1. The main features of A.V.Koonin’s approach to phraseology
Phraseology is regarded as a self-contained branch of linguistics and not as a part of lexicology.
His classification is based on the combined structural-semantic principle and also considers the level of stability of phraseological units.
Koonin subdivides set expressions into phraseological units or idioms (e.g. red tape, mare’s nest, etc.), semi-idioms, and phraseomatic units (e.g. win a victory, launch a campaign, etc.).
Phraseological units are structurally separable language units with completely or partially transferred meanings (e.g. to kill two birds with one stone, to be in a brown study – to be in low spirits). Semi-idioms have both literal and transferred meanings. The first meaning is usually terminological or professional and the second one is transferred (e.g. to lay down one’s arms). Phraseomatic units have literal or phraseomatically bound meanings (e.g. to pay attention to smth; safe and sound).
Koonin assumes that all types of set expressions are characterized by the following aspects of stability: stability of usage (they are not created in speech but are reproduced ready-made); lexical stability (the components are irreplaceable (e.g. red tape, mare’s nest) or partly replaceable within the limits of lexical meaning (e.g. to dance to smb’s tune/pipe; a skeleton in the cupboard/closet; to be in deep water/waters)); semantic complexity (despite all occasional changes the meaning is preserved); syntactic fixity.
Idioms and semi-idioms are much more complex in structure than phraseomatic units. They have a broad stylistic range and they admit of more complex occasional changes.
An integral part of this approach is a method of phraseological identification which helps to single out set expressions in Modern English.
2. Types and ways of forming words
According to Smirnitsky, word-formation is a system of derivative types of words and the process of creating new words from the material available in the language after certain structural and semantic patterns. The two main types are word-derivation and word-composition (compounding).
The basic ways of forming words in word-derivation are affixation and conversion (the formation of a new word by bringing a stem of this word into a different formal paradigm, e.g. a fall from to fall).
There exist other types: semantic word-building (homonymy, polysemy), sound and stress interchange (e.g. blood – bleed; increase), acronymy (e.g. NATO), blending (e.g. smog = smoke + fog) and shortening of words (e.g. lab, maths). They differ in principle from derivation and compounding because they show the result but not the process.
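Three of the minor types named above (acronymy, blending and shortening) are mechanical enough to imitate in code. The sketch below is only an illustration: the function names and the cut-off positions passed as arguments are assumptions, since in real usage the point at which a word is clipped or blended is a matter of convention, not of rule.

```python
def acronym(phrase):
    """Build an acronym from the initial letter of each word."""
    return "".join(word[0].upper() for word in phrase.split())


def blend(first, second, head_len, tail_len):
    """Join the first head_len letters of one word to the last tail_len of another."""
    tail = second[-tail_len:] if tail_len else ""
    return first[:head_len] + tail


def clip(word, keep):
    """Shorten a word to its first `keep` letters."""
    return word[:keep]


print(acronym("North Atlantic Treaty Organization"))  # NATO
print(blend("smoke", "fog", 2, 2))                    # smog
print(clip("laboratory", 3))                          # lab
```

The functions reproduce the finished forms only when they are told where to cut, which is in line with the remark that these types show the result but not the process.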
EXAMINATION CARD No. 13
1. Origin of derivational affixes
From the point of view of their origin, derivational affixes are subdivided into native (e.g. suffixes -ness, -ish, -dom; prefixes be-, mis-, un-) and foreign (e.g. suffixes -ation, -ment, -able; prefixes dis-, ex-, re-).
Many native affixes were historically independent words, such as -dom, -hood and -ship. Borrowed words brought with them their derivatives, formed after the word-building patterns of their languages. In this way many suffixes and prefixes of foreign origin have become an integral part of the existing word-formation system (e.g. suffix -age; prefixes dis-, re-, non-). The adoption of foreign words resulted in the appearance of hybrid words in the English vocabulary. Sometimes a foreign stem is combined with a native suffix (e.g. colourless) and vice versa (e.g. joyous).
Reinterpretation of words has also served as a source of suffix-like elements (e.g. “-scape” in seascape and moonscape comes from landscape, although strictly speaking it is not a suffix).
2. Correlation types of compounds
Motivation and regularity of semantic and structural correlation with free word-groups are the basic factors favouring a high degree of productivity of composition and may be used to set rules guiding spontaneous, analogic formation of new compound words.
The description of compound words through the correlation with variable word-groups makes it possible to classify them into four major classes: 1) adjectival-nominal, 2) verbal-nominal, 3) nominal and 4) verbal-adverbial.
I. Adjectival-nominal compounds comprise four subgroups of compound adjectives:
1) the polysemantic n+a pattern that gives rise to two types:
a) Compound adjectives based on semantic relations of resemblance: snow-white, skin-deep, age-long, etc. Comparative type (as…as).
b) Compound adjectives based on a variety of adverbial relations: colour-blind, road-weary, care-free, etc.
2) the monosemantic pattern n+ven based mainly on instrumental, locative and temporal relations, e.g. state-owned, home-made. The type is highly productive. Correlative relations are established with word-groups of the Ven + with/by + N type.
3) the monosemantic num + n pattern, which gives rise to a small and peculiar group of adjectives used only attributively, e.g. (a) two-day (beard), (a) seven-day (week), etc. It expresses the quantitative type of relations.
4) a highly productive monosemantic pattern of derivational compound adjectives based on semantic relations of possession conveyed by the suffix -ed. The basic variant is [(a+n) + -ed], e.g. long-legged. The pattern has two more variants: [(num+n) + -ed] and [(n+n) + -ed], e.g. one-sided, bell-shaped, doll-faced. The type correlates accordingly with phrases with (having) + A + N, with (having) + Num + N, with + N + N or with + N + of + N.
The three other types are classed as compound nouns. All the three types are productive.
II. Verbal-nominal compounds may be described through one derivational structure n+nv, i.e. a combination of a noun-base (in most cases simple) with a deverbal, suffixal noun-base. All the patterns correlate in the final analysis with V+N and V+prp+N type which depends on the lexical nature of the verb:
1) [n+(v+-er)], e.g. bottle-opener, stage-manager, peace-fighter. The pattern is monosemantic and is based on agentive relations that can be interpreted as ‘one/that/who does smth’.
2) [n+(v+-ing)], e.g. stage-managing, rocket-flying. The pattern is monosemantic and may be interpreted as ‘the act of doing smth’.
3) [n+(v+-tion/-ment)], e.g. office-management, price-reduction.
4) [n+(v+conversion)], e.g. wage-cut, dog-bite, hand-shake. The pattern is based on semantic relations of result, instance, agent, etc.
III. Nominal compounds are all nouns with the most polysemantic and highly-productive derivational pattern n+n; both bases are generally simple stems, e.g. windmill, horse-race, pencil-case. The pattern conveys a variety of semantic relations; the most frequent are the relations of purpose and location. The pattern correlates with nominal word-groups of the N+prp+N type.
IV. Verb-adverb compounds are all derivational nouns, highly productive and built with the help of conversion according to the pattern [(v+adv) + conversion]. The pattern correlates with free phrases of the V + Adv type and with all phrasal verbs of different degrees of stability. The pattern is polysemantic and reflects the manifold semantic relations of result.
EXAMINATION CARD No. 14
1. Hyponymic structures and lexico-semantic groups
The grouping of the English word stock based on the principle of proximity may be graphically presented by means of “concentric circles”.
[Figure: concentric circles labelled “lexico-semantic groups”, “lexical sets”, “synonyms” and “semantic field”.]
The relations between layers are that of inclusion.
The most general term is the hyperonym; the more specific term is the hyponym (a member of the group).
The meaning of the word “plant” includes the idea conveyed by “flower”, which in its turn includes the notion of any particular flower. Flower – hyperonym to… and plant – hyponym to…
Hyponymic relations are always hierarchic. If we apply substitution rules, we see that hyponyms may be replaced by hyperonyms but not vice versa (e.g. in “I bought roses yesterday”, “roses” may be replaced by “flowers” and the sentence will not essentially change its meaning).
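The hierarchy and the one-way substitution rule can be modelled with a simple parent table. This is a minimal sketch: HYPERONYM_OF and both function names are hypothetical, and the entries “tulip” and “tree” are added here for illustration only and do not come from the notes.

```python
# Each word points to its immediate hyperonym (the more general term).
HYPERONYM_OF = {
    "rose": "flower",
    "tulip": "flower",   # illustrative addition
    "flower": "plant",
    "tree": "plant",     # illustrative addition
}


def hyperonym_chain(word):
    """List every more general term above `word`, nearest first."""
    chain = []
    while word in HYPERONYM_OF:
        word = HYPERONYM_OF[word]
        chain.append(word)
    return chain


def can_replace(word, replacement):
    """A word may be replaced by any of its hyperonyms, never the reverse."""
    return replacement in hyperonym_chain(word)


print(hyperonym_chain("rose"))        # ['flower', 'plant']
print(can_replace("rose", "flower"))  # True
print(can_replace("flower", "rose"))  # False
```

Storing only the immediate hyperonym keeps the table small, and the chain is recovered by walking upwards, which mirrors the relation of inclusion between layers.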
Words describing different sides of one and the same general notion are united in a lexico-semantic group if: a) the underlying notion is not too generalized and all-embracing, like the notions of “time”, “life”, “process”; b) the reference to the underlying notion is not just an implication in the meaning of the lexical unit but forms an essential part of its semantics.
Thus, it is possible to single out the lexico-semantic group of names of “colours” (e.g. pink, red, black, green, white); lexico-semantic group of verbs denoting “physical movement” (e.g. to go, to turn, to run) or “destruction” (e.g. to ruin, to destroy, to explode, to kill).
2. Causes and ways of borrowing
The great influx of borrowings from Latin, French and Scandinavian can be accounted for by a number of historical causes. Due to the great influence of Roman civilisation, Latin was for a long time used in England as the language of learning and religion. Old Norse was the language of the conquerors who were on the same level of social and cultural development and who merged rather easily with the local population in the 9th, 10th and the first half of the 11th century. French (the Norman dialect) was the language of the other conquerors, who brought with them a lot of new notions of a higher social system (developed feudalism); it was the language of the upper classes, of official documents and of school instruction from the middle of the 11th century to the end of the 14th century.
In the study of the borrowed element in English the main emphasis is as a rule placed on the Middle English period. Borrowings of later periods became the object of investigation only in recent years. These investigations have shown that the flow of borrowings has been steady and uninterrupted. The greatest number has come from French. They refer to various fields of social-political, scientific and cultural life. A large portion of borrowings is scientific and technical terms.
The number and character of borrowed words tell us of the relations between the peoples, the level of their culture, etc.
Some borrowings, however, cannot be explained by the direct influence of certain historical conditions; they do not come along with any new objects or ideas. Such were, for instance, the words air, place, brave, gay, borrowed from French.
Also we can say that the closer the languages, the deeper the influence. Thus under the influence of the Scandinavian languages, which were closely related to Old English, some classes of words were borrowed that could not have been adopted from non-related or distantly related languages (the pronouns they, their, them); a number of Scandinavian borrowings were felt as derived from native words (they were of the same root and the connection between them was easily seen), e.g. drop (AS.) – drip (Scand.), true (AS.) – tryst (Scand.); the Scandinavian influence even accelerated to a certain degree the development of the grammatical structure of English.
Borrowings enter the language in two ways: through oral speech (early periods of history, usually short and they undergo changes) and through written speech (recent times, preserve spelling and peculiarities of the sound form).
Borrowings may be direct or indirect (e.g., through Latin, French).
EXAMINATION CARD No. 15
1. Types of English dictionaries
English dictionaries may all be roughly divided into two groups — encyclopaedic and linguistic.
The encyclopaedic dictionaries (e.g. The Encyclopaedia Britannica and The Encyclopedia Americana) are scientific reference books dealing with every branch of knowledge, or with one particular branch, usually in alphabetical order. They give information about the extra-linguistic world; they deal with facts and concepts. Linguistic dictionaries are word-books the subject-matter of which is lexical units and their linguistic properties such as pronunciation, meaning, peculiarities of use, etc.
Linguistic dictionaries may be divided into different categories by different criteria.
1. According to the nature of their word-list, we may speak about general dictionaries (which include frequency dictionaries, rhyming dictionaries and thesauri) and restricted dictionaries (to which belong terminological, phraseological and dialectal word-books, dictionaries of new words, of foreign words, of abbreviations, etc.).
2. According to the information they provide all linguistic dictionaries fall into two groups: explanatory and specialized.
Explanatory dictionaries present a wide range of data, especially with regard to the semantic aspect of the vocabulary items entered (e.g. New Oxford Dictionary of English).
Specialized dictionaries deal with lexical units only in relation to some of their characteristics (e.g. etymology, frequency, pronunciation, usage).
3. According to the language of explanations all dictionaries are divided into monolingual and bilingual.
4. With regard to time, dictionaries also fall into diachronic and synchronic. Diachronic (historical) dictionaries reflect the development of the English vocabulary by recording the history of form and meaning for every word registered (e.g. Oxford English Dictionary). Synchronic (descriptive) dictionaries are concerned with the present-day meaning and usage of words (e.g. Advanced Learner’s Dictionary of Current English).
(Phraseological dictionaries, New Words dictionaries, Dictionaries of slang, Usage dictionaries, Dictionaries of word-frequency, A Reverse dictionary, Pronouncing dictionaries, Etymological dictionaries, Ideographic dictionaries, synonym-books, spelling reference books, hard-words dictionaries, etc.)
2. The role of native and borrowed elements in English
The number of borrowings in Old English was small. In the Middle English period there was an influx of loans. It is often contended that since the Norman Conquest borrowing has been the chief factor in the enrichment of the English vocabulary and as a result there was a sharp decline in the productivity of word-formation. Historical evidence, however, testifies to the fact that throughout its entire history, even in the periods of the mightiest influxes of borrowings, other processes, no less intense, were in operation — word-formation and semantic development, which involved both native and borrowed elements.
If the estimation of the role of borrowings is based on the study of words recorded in the dictionary, it is easy to overestimate the effect of the loan words, as the number of native words is extremely small compared with the number of borrowings recorded. The only true way to estimate the relation of the native to the borrowed element is to consider the two as actually used in speech. If one counts every word used, including repetitions, in some reading matter, the proportion of native to borrowed words will be quite different. On such a count, every writer uses considerably more native words than borrowings. Shakespeare, for example, has 90% native words, Milton 81%, Tennyson 88%. This shows how important the comparatively small nucleus of native words is.
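The difference between counting dictionary entries and counting words as actually used can be shown with a short calculation. The sketch below is an illustration under stated assumptions: native_share is a hypothetical helper, the sample sentence is invented, and the NATIVE set is a hand-made etymological label list for that one sentence, not a real lexicon.

```python
from collections import Counter

# Hand-labelled native (Old English) words for the sample sentence only;
# "royal" and "palace" are French borrowings and are therefore left out.
NATIVE = {"the", "king", "and", "queen", "saw"}


def native_share(tokens, native_words):
    """Return (share of distinct words, share of running words) that are native."""
    tokens = [t.lower() for t in tokens]
    if not tokens:
        return 0.0, 0.0
    counts = Counter(tokens)
    native_types = sum(1 for word in counts if word in native_words)
    native_tokens = sum(n for word, n in counts.items() if word in native_words)
    return native_types / len(counts), native_tokens / len(tokens)


sample = "the king and the queen saw the royal palace".split()
by_entry, by_use = native_share(sample, NATIVE)
print(round(by_entry, 2), round(by_use, 2))  # 0.71 0.78
```

Counting repetitions raises the native share because the most frequent items, such as “the”, are native; this is the same effect, on a tiny scale, as the one described for running text above.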
Different borrowings are marked by different frequency value. Those well established in the vocabulary may be as frequent in speech as native words, whereas others occur very rarely.
EXAMINATION CARD No. 16
1. The main variants of the English language
In Modern linguistics the distinction is made between Standard English and territorial variants and local dialects of the English language.
Standard English may be defined as that form of English which is current and literary, substantially uniform and recognized as acceptable wherever English is spoken or understood. It is the variety most widely accepted and understood either within an English-speaking country or throughout the entire English-speaking world.
Variants of English are regional varieties possessing a literary norm. A distinction is drawn between variants existing on the territory of the United Kingdom (British English, Scottish English and Irish English) and variants existing outside the British Isles (American English, Canadian English, Australian English, New Zealand English, South African English and Indian English). British English is often identified with Written Standard English and the pronunciation known as Received Pronunciation (RP).
Local dialects are varieties of English peculiar to some districts, used as means of oral communication in small localities; they possess no normalized literary form.
Variants of English in the United Kingdom
Scottish English and Irish English have a special linguistic status as compared with dialects because of the literature composed in them.
Variants of English outside the British Isles
Outside the British Isles the following variants of the English language are distinguished: American English, Canadian English, Australian English, New Zealand English, South African English, Indian English and some others. Each of these has developed a literature of its own, and is characterized by peculiarities in phonetics, spelling, grammar and vocabulary.
2. Basic problems of dictionary-compiling
Lexicography, the science of dictionary-compiling, is closely connected with lexicology: both deal with the same problems (the form, meaning, usage and origin of vocabulary units) and make use of each other’s achievements.
Some basic problems of dictionary-compiling:
1) the selection of lexical units for inclusion,
2) their arrangement,
3) the setting of the entries,
4) the selection and arrangement (grouping) of word-meanings,
5) the definition of meanings,
6) illustrative material,
7) supplementary material.
1) The selection of lexical units for inclusion.
It is necessary to decide: a) what types of lexical units will be chosen for inclusion; b) the number of items; c) what to select and what to leave out in the dictionary; d) which form of the language, spoken or written or both, the dictionary is to reflect; e) whether the dictionary should contain obsolete units, technical terms, dialectisms, colloquialisms, and so forth.
The choice depends upon the type to which the dictionary will belong, the aim the compilers pursue, the prospective user of the dictionary, its size, the linguistic conceptions of the dictionary-makers and some other considerations.
2) Arrangement of entries.
There are two modes of presentation of entries: the alphabetical order and the cluster-type (arranged in nests, based on some principle – words of the same root).
3) The setting of the entries.
Since different types of dictionaries differ in their aim, in the information they provide, in their size, etc., they of necessity differ in the structure and content of the entry.
The most complicated type of entry is that found in general explanatory dictionaries of the synchronic type. Such an entry usually presents the following data: accepted spelling and pronunciation; grammatical characteristics, including the indication of the part of speech of each entry word, whether nouns are countable or uncountable, the transitivity and intransitivity of verbs, and irregular grammatical forms; definitions of meanings; modern currency; illustrative examples; derivatives; phraseology; etymology; sometimes also synonyms and antonyms.
4) The selection and arrangement (grouping) of word-meanings.
The number of meanings a word is given and their choice in this or that dictionary depend, mainly, on two factors: 1) on what aim the compilers set themselves and 2) what decisions they make concerning the extent to which obsolete, archaic, dialectal or highly specialised meanings should be recorded, how the problem of polysemy and homonymy is solved, how cases of conversion are treated, how the segmentation of different meanings of a polysemantic word is made, etc.
There are at least three different ways in which the word meanings are arranged: a) in the sequence of their historical development (called historical order), b) in conformity with frequency of use that is with the most common meaning first (empirical or actual order), c) in their logical connection (logical order).
5) The definition of meanings.
Meanings of words may be defined in different ways: 1) by means of linguistic definitions that are only concerned with words as speech material, 2) by means of encyclopaedic definitions that are concerned with things for which the words are names (nouns, proper nouns and terms), 3) by means of synonymous words and expressions (verbs, adjectives), 4) by means of cross-references (derivatives, abbreviations, variant forms). The choice depends on the nature of the word (the part of speech) and on the aim and size of the dictionary.
6) Illustrative material.
It depends on the type of the dictionary and on the aim the compilers set themselves.
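Two of the three ways of arranging word-meanings described under point 4) above, the historical order and the empirical order, amount to sorting the same senses by different keys. The sketch below illustrates this; the Sense class and every date and frequency figure in it are invented placeholders, not data about any real word. The logical order depends on a lexicographer’s judgement and is not modelled.

```python
from dataclasses import dataclass


@dataclass
class Sense:
    gloss: str
    first_attested: int  # placeholder year, invented for the example
    frequency: int       # placeholder usage count, invented for the example


SENSES = [
    Sense("sense A", 1400, 120),
    Sense("sense B", 1200, 15),
    Sense("sense C", 1650, 300),
]


def historical_order(senses):
    """Earliest recorded meaning first."""
    return sorted(senses, key=lambda s: s.first_attested)


def empirical_order(senses):
    """Most common meaning first."""
    return sorted(senses, key=lambda s: s.frequency, reverse=True)


print([s.gloss for s in historical_order(SENSES)])  # ['sense B', 'sense A', 'sense C']
print([s.gloss for s in empirical_order(SENSES)])   # ['sense C', 'sense A', 'sense B']
```

The same entry can thus open with quite different meanings depending on the order the compilers choose.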
EXAMINATION CARD No. 17
1. Sources of compounds
The actual process of building compound words may take different forms: 1) Compound words as a rule are built spontaneously according to productive distributional formulas of the given period. Formulas productive at one time may lose their productivity at another period. Thus at one time the process of building verbs by compounding adverbial and verbal stems was productive, and numerous compound verbs like, e.g. outgrow, offset, inlay (adv + v), were formed. The structure ceased to be productive and today practically no verbs are built in this way.
2) Compounds may be the result of a gradual process of semantic isolation and structural fusion of free word-groups. Such compounds as forget-me-not; bull’s-eye (‘the centre of a target; a kind of hard, globular candy’); mainland (‘a continent’) all go back to free phrases which became semantically and structurally isolated in the course of time. The words that once made up these phrases have lost their integrity within these particular formations; the whole phrase has become isolated in form, specialized in meaning and thus turned into an inseparable unit, a word having acquired semantic and morphological unity. Most of the syntactic compound nouns of the (a+n) structure, e.g. bluebell, blackboard, mad-doctor, are the result of such semantic and structural isolation of free word-groups; to give but one more example, highway was once actually a high way, for it was raised above the surrounding countryside for better drainage and ease of travel. Now we use highway without any idea of the original sense of the first element.
2. Lexical differences of territorial variants of English
All lexical units may be divided into general English (common to all the variants) and locally-marked (specific to present-day usage in one of the variants and not found in the others). Different variants of English use different words for the same objects (BE vs. AE: flat/apartment, underground/subway, pavement/sidewalk, post/mail).
Speaking about lexical differences between the two variants of the English language, the following cases are of importance:
1. Cases where there are no equivalent words in one of the variants (British English has no equivalent to the American word drive-in (‘a cinema or restaurant that one can visit without leaving one’s car’)).
2. Cases where different words are used for the same denotatum, e.g. sweets (BrE) — candy (AmE); reception clerk (BrE) — desk clerk (AmE).
3. Cases where some words are used in both variants but are much commoner in one of them. For example, shop and store are used in both variants, but the former is frequent in British English and the latter in American English.
4. Cases where one (or more) lexico-semantic variant(s) is (are) specific to either British English or American English (e.g. faculty, denoting ‘all the teachers and other professional workers of a university or college’ is used only in American English; analogous opposition in British English or Standard English — teaching staff).
5. Cases where one and the same word in one of its lexico-semantic variants is used oftener in British English than in American English (brew: ‘a cup of tea’ (BrE), ‘a beer or coffee drink’ (AmE)).
6. Cases where the same words have different semantic structure in British English and American English (homely: ‘home-loving, domesticated, house-proud’ (BrE), ‘unattractive in appearance’ (AmE); politician: ‘a person who is professionally involved in politics’, neutral (BrE), ‘a person who acts in a manipulative and devious way, typically to gain advancement within an organisation’ (AmE)).
Besides, British English and American English have their own derivational peculiarities. Some of the affixes more frequently used in American English are -ee (draftee: ‘a young man about to be enlisted’), -ster (roadster: ‘motor-car for long journeys by road’) and super- (super-market: ‘a very large shop that sells food and other products for the home’). AmE favours morphologically more complex words (transportation), whereas BrE uses clipped forms (transport); AmE prefers to form words by means of affixes (burglarize), whereas BrE uses back-formation (burgle from burglar).
EXAMINATION CARD No. 18
1. Methods and procedures of lexicological analysis
The process of scientific investigation may be subdivided into several stages:
1. Observation (statements of fact must be based on observation)
2. Classification (orderly arrangement of the data)
3. Generalization (formulation of a generalization or hypothesis, a rule or a law)
4. The verifying process. Here, various procedures of linguistic analysis are commonly applied:
1). Contrastive analysis attempts to find out similarities and differences in both phylogenetically related and non-related languages. In fact, contrastive analysis grew out of the study of errors which are made recurrently by foreign language students. These errors can often be traced back to the differences in structure between the target language and the language of the learner; the detailed comparison of these two languages has been named contrastive analysis.
Contrastive analysis brings to light the essence of what is usually described as idiomatic English, idiomatic Russian etc., i.e. the peculiar way in which every language combines and structures in lexical units various concepts to denote extra-linguistic reality.
2). Statistical analysis is the quantitative study of a language phenomenon. Statistical linguistics is nowadays generally recognised as one of the major branches of linguistics. (frequency – room, collocability)
3). Immediate constituents analysis. The theory of Immediate Constituents (IC) was originally elaborated as an attempt to determine the ways in which lexical units are relevantly related to one another. The fundamental aim of IC analysis is to segment a set of lexical units into two maximally independent sequences or ICs thus revealing the hierarchical structure of this set.
4). Distributional analysis and co-occurrence. By the term distribution we understand the occurrence of a lexical unit relative to other lexical units of the same level (the position which lexical units occupy or may occupy in the text or in the flow of speech). Distributional analysis is mainly applied by the linguist to find out sameness or difference of meaning.
5). Transformational analysis can be defined as the repatterning of various distributional structures in order to discover difference or sameness of meaning of practically identical distributional patterns. It may also be described as a kind of translation (transference of a message by different means).
6). Componential analysis (1950s). In this analysis linguists proceed from the assumption that the smallest units of meaning are sememes (semantic units) or semes (minimal units of content) and that sememes and lexemes (or lexical items) are usually not in one-to-one but in one-to-many correspondence (e.g. in the lexical item “woman” the sememes are human, female, adult). This analysis deals with individual meanings.
7). Method of Semantic Differential (set up by American psycholinguists). The analysis is concerned with measurement of differences of the connotational meaning, or the emotive charge, which is very hard to grasp.
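The componential analysis described in point 6 can be sketched in a few lines of Python, treating each lexical item as a set of semes. This is only an illustration: the entry for "woman" follows the example in the notes, while the entries for "man" and "girl" and the helper name compare are assumptions added for the sake of comparison.

```python
# Illustrative seme inventories. Only "woman" comes from the notes;
# "man" and "girl" are assumed entries added for comparison.
SEMES = {
    "woman": {"human", "female", "adult"},
    "man": {"human", "male", "adult"},
    "girl": {"human", "female", "young"},
}


def compare(word_a, word_b):
    """Return (shared semes, semes only in word_a, semes only in word_b)."""
    a, b = SEMES[word_a], SEMES[word_b]
    return a & b, a - b, b - a


shared, only_woman, only_girl = compare("woman", "girl")
print(sorted(shared))      # ['female', 'human']
print(sorted(only_woman))  # ['adult']
print(sorted(only_girl))   # ['young']
```

Representing meanings as sets makes the one-to-many correspondence visible: one lexeme maps to several components, and the difference between two words reduces to the components they do not share.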
2. Ways and means of enriching the vocabulary of English
The development of the vocabulary can be described as a process of never-ending growth. There are two ways of enriching the vocabulary:
A. Vocabulary extension: the appearance of new lexical items. New vocabulary units appear mainly as a result of: 1) productive or patterned ways of word-formation (affixation, conversion, composition); 2) non-patterned ways of word-creation: lexicalization, the transformation of a word-form into a word (e.g. arms – arm, customs (‘the customs service’) – custom); shortening, the transformation of a word-group into a word or a change of the word-structure resulting in a new lexical item (e.g. Rd for Road, St for Street); substantivization (the finals for the final exams); acronyms (NATO) and letter abbreviations (D.J. – disc jockey); blendings (brunch – breakfast and lunch); clipping, the shortening of a word of two or more syllables (bicycle – bike; pop, clipping plus substantivization – popular music); 3) borrowing from other languages.
Borrowing as a means of replenishing the vocabulary of present-day English is of much lesser importance and is active mainly in the field of scientific terminology. 1) Words made up of morphemes of Latin and Greek origin (e.g. –tron: mesotron; tele-: telelecture; -in: protein). 2) True borrowings which reflect the way of life, the peculiarities of development of speech communities from which they come. (e.g. kolkhoz, sputnik). 3) Loan-translations also reflect the peculiarities of life and easily become stable units of the vocabulary (e.g. fellow-traveler, self-criticism)
B. Semantic extension: the appearance of new meanings of existing words, which may result in homonyms. The semantic development of words already available in the language is the main source of the qualitative growth of the vocabulary but does not essentially change the vocabulary quantitatively.
The most active ways of word creation are clippings and acronyms.
EXAMINATION CARD No. 19
1. Means of composition
From the point of view of the means by which the components are joined together compound words may be classified into:
1) Words formed by merely placing one constituent after another (e.g. house-dog, pot-pie) can be: asyntactic (the order of bases runs counter to the order in which the words can be brought together under the rules of syntax of the language, e.g. red-hot, pale-blue, oil-rich) and syntactic (the order of words arranged according to the rules of syntax, e.g. mad-doctor, blacklist).
2) Compound words whose ICs are joined together with a special linking-element — linking vowels (o) and consonants (s), e.g. speedometer, tragicomic, statesman.
The additive compound adjectives linked with the help of the vowel [ou] are limited to the names of nationalities and represent a specific group with a bound root for the first component, e.g. Sino-Japanese, Afro-Asian, Anglo-Saxon.
2. Synchronic and diachronic approaches to conversion
Conversion is the formation of a new word through changes in its paradigm (category of a part of speech). As a paradigm is a morphological category, conversion can be described as a morphological way of forming words (Smirnitsky). The term was introduced by Henry Sweet.
The causes that made conversion so widespread are to be approached diachronically. Nouns and verbs have become identical in form firstly as a result of the loss of endings. A similar phenomenon can be observed in words borrowed from the French language. Thus, from the diachronic point of view distinctions should be made between homonymous word-pairs, which appeared as a result of the loss of inflections (endings, the changeable part of a word).
In the course of time the semantic structure of the base may acquire a new meaning or several meanings under the influence of the meanings of the converted word (reconversion).
Synchronically we deal with pairs of words related through conversion that coexist in contemporary English. A careful examination of the relationship between the lexical meaning of the root-morpheme and the part-of-speech meaning of the stem within a conversion pair reveals that in one of the two words the former does not correspond to the latter.
EXAMINATION CARD No. 20
1. Denotational and connotational aspects of meaning
The lexical meaning comprises two main components: the denotational aspect of meaning and the connotational aspect of meaning. The term «denotational aspect of meaning» is derived from «to denote», and it is through this component of meaning that the main information is conveyed in the process of communication. Besides, it helps to ensure reference to things common to all the speakers of the given language (e.g. «chemistry»: I’m not an expert in it, but I know what it is about; likewise «dentist», «spaceship»).
The connotational aspect may be called «optional». It conveys additional information in the process of communication. And it may denote the emotive charge and the stylistic value of the word. The emotive charge is the emotive evaluation inherent in the connotational component of the lexical meaning (e.g. «notorious» => [widely known] => for criminal acts, bad behaviour, bad traits of character; «famous» => [widely known] => for special achievement etc.).
Positive/Negative evaluation; emotive charge/stylistic value.
«to love» — neutral
«to adore» — to love greatly => the emotive charge is higher than in «to love»
«to shake» — neutral.
«to shiver» — is stronger => higher emotive charge.
Mind that the emotive charge is not a speech characteristic of the word. It is a language phenomenon => it remains stable within the basic meaning of the word.
If associations with the lexical meaning concern the situation, the social circumstances (formal/informal), the social relations between the interlocutors (polite/rough), or the type or purpose of communication (poetic/official), the connotation is stylistically coloured. It is termed stylistic reference. The main stylistic layers of the vocabulary are:
Literary «parent» «to pass into the next world» — bookish
Neutral «father» «to die»
Colloquial «dad» «to kick the bucket»
But the denotational meaning is the same.
2. Semantic fields
lexico-semantic groups
lexical sets
synonyms
semantic field
The broadest semantic group is usually referred to as the semantic field. It is a closely knit section of vocabulary characterized by a common concept (e.g. emotions). The common semantic component of the field is called the common denominator. All members of the field are semantically interdependent, as the meaning of each is determined by the presence of the others. A semantic field may be very extensive, covering big conceptual areas (emotions, movements, space). Words comprising the field may belong to different parts of speech.
If the underlying notion is broad enough to include almost all-embracing sections of vocabulary we deal with semantic fields (e.g. cosmonaut, spacious, to orbit – belong to the semantic field of ‘space’).
EXAMINATION CARD No. 21
1. Assimilation of borrowings
The term ‘assimilation of borrowings’ is used to denote a partial or total conformation to the phonetical, graphical and morphological standards of the English language and its semantic system.
According to the degree of assimilation all borrowed words can be divided into three groups:
1) completely assimilated borrowings;
2) partially assimilated borrowings;
3) unassimilated borrowings or barbarisms.
1. Completely assimilated borrowed words follow all morphological, phonetical and orthographic standards and take an active part in word-formation. The morphological structure and motivation of completely assimilated borrowings usually remain transparent, so that they are morphologically analyzable and therefore supply the English vocabulary not only with free forms but also with bound forms, as affixes are easily perceived and separated in series of borrowed words that contain them (e.g. the French suffixes -age, -ance and -ment).
They are found in all the layers of older borrowings, e.g. cheese (the first layer of Latin borrowings), husband (Scand), face (Fr), animal (Latin, borrowed during the revival of learning).
A loan word never brings into the receiving language the whole of its semantic structure if it is polysemantic in the original language (e.g., ‘sport’ in Old French: ‘pleasures, making merry and entertainments in general’; now: outdoor games and exercise).
2. Partially assimilated borrowed words may be subdivided depending on the aspect that remains unaltered into:
a) borrowings not completely assimilated graphically (e.g., Fr. ballet, buffet; some may keep a diacritic mark: café, cliché; some retain digraphs (ch, qu, ou, etc.): bouquet, brioche);
b) borrowings not completely assimilated phonetically (e.g., Fr. machine, cartoon, police (the accent is on the final syllable); [ʒ] in bourgeois, prestige, regime (the stress and sounds or combinations of sounds that are not standard for the English language));
c) borrowings not assimilated grammatically (e.g., Latin or Greek borrowings retain original plural forms: crisis – crises, phenomenon – phenomena);
d) borrowings not assimilated semantically because they denote objects and notions peculiar to the country from which they come (e.g. sari, sombrero, shah, rajah, toreador, rickshaw (Japanese), etc.).
3. Unassimilated borrowings or barbarisms. This group includes words from other languages used by English people in conversation or in writing but not assimilated in any way, and for which there are corresponding English equivalents, e.g. the Italian addio, ciao: ‘good-bye’.
Etymological doublets are two or more words originating from the same etymological source, but differing in phonetic shape and meaning (e.g. the words ‘whole’ (originally meant ‘healthy’, ‘free from disease’) and ‘hale’ both come from OE ‘hal’: one by the normal development of OE ‘a’ into ‘o’, the other from a northern dialect in which this modification did not take place. Only the latter has survived in its original meaning).
2. Semi-affixes
There is a specific group of morphemes whose derivational function does not allow one to refer them unhesitatingly either to the derivational affixes or bases. In words like half-done, half-broken, half-eaten and ill-fed, ill-housed, ill-dressed the ICs ‘half-‘ and ‘ill-‘ are given in linguistic literature different interpretations: they are described both as bases and as derivational prefixes. The comparison of these ICs with the phonetically identical stems in independent words ‘ill’ and ‘half’ as used in such phrases as to speak ill of smb, half an hour ago makes it obvious that in words like ill-fed, ill-mannered, half-done the ICs ‘ill-‘ and ‘half-‘ are losing both their semantic and structural identity with the stems of the independent words. They are all marked by a different distributional meaning which is clearly revealed through the difference of their collocability as compared with the collocability of the stems of the independently functioning words. As to their lexical meaning they have become more indicative of a generalizing meaning of incompleteness and poor quality than the individual meaning proper to the stems of independent words and thus they function more as affixational morphemes similar to the prefixes ‘out-, over-, under-, semi-, mis-‘ regularly forming whole classes of words.
Besides, the high frequency of these morphemes in the above-mentioned generalized meaning in combination with the numerous bases built on past participles indicates their closer ties with derivational affixes than bases. Yet these morphemes retain certain lexical ties with the root-morphemes in the stems of independent words and that is why are felt as occupying an intermediate position, as morphemes that are changing their class membership regularly functioning as derivational prefixes but still retaining certain features of root-morphemes. That is why they are sometimes referred to as semi-affixes. To this group we should also refer ‘well-‘ and ‘self-‘ (well-fed, well-done, self-made), ‘-man’ in words like postman, cabman, chairman, ‘-looking’ in words like foreign-looking, alive-looking, strange-looking, etc.
EXAMINATION CARD No. 22
1. Degrees of assimilation of borrowings and factors determining it
Even a superficial examination of the English word-stock shows that there are words among them that are easily recognized as foreign. And there are others that have become so firmly rooted in the language that it is sometimes extremely difficult to distinguish them from words of Anglo-Saxon origin (e.g. pupil, master, city, river, etc.).
Unassimilated words differ from assimilated ones in their pronunciation, spelling, semantic structure, frequency and sphere of application. There are also words that are assimilated in some respects and unassimilated in others – partially assimilated words (graphically, phonetically, grammatically, semantically).
The degree of assimilation depends in the first place upon the time of borrowing: the older the borrowing, the more thoroughly it tends to follow normal English habits of accentuation, pronunciation, etc. (window, chair, dish, box).
Also those of recent date may be completely made over to conform to English patterns if they are widely and popularly employed (French – clinic, diplomat).
Another factor determining the process of assimilation is the way in which the borrowings were taken over into the language. Words borrowed orally are assimilated more readily; they undergo greater changes, whereas with words adopted through writing the process of assimilation is longer and more laborious.
2. Lexical, grammatical valency of words
There are two factors that influence the ability of words to form word-groups: the lexical and the grammatical valency of words. The point is that the compatibility of words is determined by restrictions imposed by the inner structure of the English word stock (e.g. a bright idea = a good idea, but it is impossible to say «a bright performance» or «a bright film»; «a heavy meal» means one that is difficult to digest, but it is impossible to say «heavy cheese»; it is possible to say «to take (or catch) a chance», but only «to take precautions»).
The range of syntactic structures or patterns in which words may appear is defined as their grammatical valency. The grammatical valency depends on the grammatical structure of the language (e.g. to convince smb. of smth/that smb do smth; to persuade smb to do smth).
Any departure from the norms of lexical or grammatical valency can either make a phrase unintelligible or be felt as a stylistic device.
EXAMINATION CARD No. 23
1. Classification of homonyms
Homonyms are words that are identical in their sound-form or spelling but different in meaning and distribution.
1) Homonyms proper are words similar in their sound-form and graphic form but different in meaning (e.g. «a ball»: a round object for playing; «a ball»: a meeting for dances).
2) Homophones are words similar in their sound-form but different in spelling and meaning (e.g. «peace» – «piece», «sight» – «site»).
3) Homographs are words which have similar spelling but different sound-form and meaning (e.g. «a row» [rau]: «a quarrel»; «a row» [rəu]: «a number of persons or things in a more or less straight line»).
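The three-way classification above amounts to a simple decision rule over two features, spelling and sound-form. A minimal Python sketch follows, assuming the two words are already known to differ in meaning; the function name classify_pair and the transcriptions given for ball and peace/piece are illustrative assumptions, not part of the notes.

```python
def classify_pair(spelling_a, sound_a, spelling_b, sound_b):
    """Classify two words of different meaning by spelling and sound-form."""
    same_spelling = spelling_a == spelling_b
    same_sound = sound_a == sound_b
    if same_spelling and same_sound:
        return "homonyms proper"
    if same_sound:
        return "homophones"
    if same_spelling:
        return "homographs"
    return "not homonyms"


# Transcriptions for "ball" and "peace"/"piece" are assumed for illustration.
print(classify_pair("ball", "bɔːl", "ball", "bɔːl"))    # homonyms proper
print(classify_pair("peace", "piːs", "piece", "piːs"))  # homophones
print(classify_pair("row", "rau", "row", "rəu"))        # homographs
```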
There is another classification, by Smirnitsky. According to the type of meaning in which homonyms differ, homonyms proper can be classified into:
I. Lexical homonyms, which differ in lexical meaning (e.g. «ball»);
II. Lexical-grammatical homonyms, which differ in lexical-grammatical meanings (e.g. «a seal», a sea animal; «to seal», to close securely).
III. Grammatical homonyms, which differ in grammatical meaning only (e.g. «used», Past Indefinite, and «used», Past Participle; «pupils», the meaning of plurality, and «pupil’s», the meaning of possessive case).
All cases of homonymy may be subdivided into full and partial homonymy. If words are identical in all their forms, they are full homonyms (e.g. «ball»-«ball»). But: «a seal» — «to seal» have only two homonymous forms, hence, they are partial homonyms.
2. Lexical and grammatical meanings of word-groups
1. The lexical meaning of the word-group may be defined as the combined lexical meaning of the component words. Thus, the lexical meaning of the word-group “red flower” may be described denotationally as the combined meaning of the words “red” and “flower”. It should be pointed out, however, that the term combined lexical meaning is not to imply that the meaning of the word-group is a mere additive result of all the lexical meanings of the component members. The lexical meaning of the word-group predominates over the lexical meanings of its constituents.
2. The structural meaning of the word-group is the meaning conveyed mainly by the pattern of arrangement of its constituents (e.g. “school grammar”, a grammar for schools, and “grammar school”, a type of school, are semantically different because of the difference in the pattern of arrangement of the component words). The structural meaning is the meaning expressed by the pattern of the word-group and not by either the word school or the word grammar.
The lexical and structural components of meaning in word-groups are interdependent and inseparable, e.g. the structural pattern of the word-groups all day long, all night long, all week long in ordinary usage and the word-group all the sun long is identical. Replacing day, night, week by another noun – sun doesn’t change the structural meaning of the pattern. But the noun sun continues to carry the semantic value, the lexical meaning that it has in word-groups of other structural patterns.
EXAMINATION CARD No. 24
1. Derivational bases
A derivational base is the part of the word which establishes connections with the lexical unit that motivates the derivative and defines its lexical meaning. It is the part of the word to which the rule of word-formation is applied. Structurally, derivational bases fall into 3 classes: 1. bases that coincide with morphological stems (e.g. beautiful (d.b.) in beautifully); 2. bases that coincide with word-forms (e.g. known in unknown); 3. bases that coincide with word-groups, mainly in adjectives and nouns (e.g. blue-eyed, ‘having blue eyes’; easy-going).
2. Emotive charge and stylistic reference
The emotive charge is the emotive evaluation inherent in the connotational component of the lexical meaning (e.g. «notorious» => [widely known] => for criminal acts, bad behaviour, bad traits of character; «famous» => [widely known] => for special achievement etc.).
Positive/Negative evaluation; emotive charge/stylistic value.
«to love» — neutral
«to adore» — to love greatly => the emotive charge is higher than in «to love»
«to shake» — neutral.
«to shiver» — is stronger => higher emotive charge.
Mind that the emotive charge is not a speech characteristic of the word. It is a language phenomenon => it remains stable within the basic meaning of the word.
The emotive charge varies in different word-classes. In some of them, e.g. in interjections, the emotive element prevails, whereas in conjunctions the emotive charge is as a rule practically non-existent. The emotive implication of the word is to a great extent subjective, as it greatly depends on the personal experience of the speaker and the mental imagery the word evokes in him (e.g. hospital means different things to an architect, an invalid or the man living across the road).
If associations with the lexical meaning concern the situation, the social circumstances (formal/informal), the social relations between the interlocutors (polite/rough), or the type or purpose of communication (poetic/official), the connotation is stylistically coloured. It is termed stylistic reference. The main stylistic layers of the vocabulary are:
Literary «parent» «to pass into the next world» — bookish
Neutral «father» «to die»
Colloquial «dad» «to kick the bucket»
In literary (bookish) words we can single out: 1) terms or scientific words (e.g. renaissance, genocide, teletype); 2) poetic words and archaisms (e.g. aught—’anything’, ere—’before’, nay—’no’); 3) barbarisms and foreign words (e.g. bouquet).
The colloquial words may be subdivided into:
1) Common colloquial words.
2) Slang (e.g. governor for ‘father’, missus for ‘wife’, a gag for ‘a joke’, dotty for ‘insane’).
3) Professionalisms — words used in narrow groups bound by the same occupation (e.g., lab for ‘laboratory’, a buster for ‘a bomb’).
4) Jargonisms — words marked by their use within a particular social group and bearing a secret and cryptic character (e.g. a sucker — ‘a person who is easily deceived’).
5) Vulgarisms: coarse words that are not generally used in public (e.g. bloody, hell, damn, shut up).
6) Dialectal words (e.g. lass, ‘a girl’; kirk, ‘a church’).
7) Colloquial coinages (e.g. newspaperdom, allrightnik).
Stylistic reference and emotive charge of words are closely connected and to a certain degree interdependent. As a rule stylistically coloured words — words belonging to all stylistic layers except the neutral style are observed to possess a considerable emotive charge (e.g. daddy, mammy are more emotional than the neutral father, mother).
EXAMINATION CARD No. 25
1. Historical changeability of word-structure
The derivational structure of a word is liable to various changes in the course of time. Certain morphemes may become fused together or may be lost altogether (simplification). As a result of this process, radical changes in the word may take place: root morphemes may turn into affixational and semi-affixational morphemes, compound words may be transformed into derived or even simple words, polymorphic words may become monomorphic.
E.g. the derived word wisdom goes back to the compound word wīsdom, in which -dom was a root-morpheme and the stem of an independent word with the meaning ‘decision, judgment’. The whole compound word meant ‘a wise decision’. In the course of time the meaning of the second component dom became more generalized and it turned into a suffix forming abstract nouns (e.g. freedom, boredom).
Sometimes the spelling of some Modern English words, as compared with their sound-form, reflects the changes these words have undergone (e.g. cupboard [‘kʌbəd] is a monomorphic non-motivated simple word. But earlier it consisted of two bases, [kʌp] and [bɔːd], and signified ‘a board to put cups on’. Nowadays it denotes neither cup nor board: a boot cupboard, a clothes cupboard).
2. Criteria of synonymity
1. It is sometimes argued that the meaning of two words is identical if they can denote the same referent (if an object or a certain class of objects can always be denoted by either of the two words).
This approach to synonymy does not seem acceptable because the same referent in different speech situations can always be denoted by different words which cannot be considered synonyms (e.g. the same woman can be referred to as my mother by her son and my wife by her husband – both words denote the same referent but there is no semantic relationship of synonymy between them).
2. Attempts have been made to introduce into the definition of synonymity the criterion of interchangeability in linguistic contexts (on this view, synonyms are words which can replace each other in any given context without the slightest alteration in the denotational or connotational meaning). It is argued that for the linguist similarity of meaning implies that the words are synonymous if either of them can occur in the same context. But words interchangeable in any given context are very rare.
3. Modern linguists generally assume that there are no complete synonyms — if two words are phonemically different then their meanings are also different (buy, purchase – Purchasing Department). It follows that practically no words are substitutable for one another in all contexts (e.g. the rain in April was abnormal/exceptional – are synonymous; but My son is exceptional/abnormal – have different meaning).
Also, interchangeability alone cannot serve as a criterion of synonymity. We may safely assume that synonyms are words interchangeable in some contexts. But the reverse is certainly not true, as semantically different words of the same part of speech are interchangeable in quite a number of contexts (e.g. in I saw a little girl playing in the garden the adjective little may be replaced by a number of different adjectives: pretty, tall, English).
Thus a more acceptable definition of synonyms seems to be the following: synonyms are words different in their sound-form, but similar in their denotational meaning or meanings and interchangeable at least in some contexts.
EXAMINATION CARD No. 26
1. Immediate Constituents analysis
The theory of Immediate Constituents (IC) was originally elaborated as an attempt to determine the ways in which lexical units are relevantly related to one another. The fundamental aim of IC analysis is to segment a set of lexical units into two maximally independent sequences or ICs, thus revealing the hierarchical structure of this set (e.g. the word-group a black dress in severe style is divided into a black dress / in severe style). Successive segmentation results in Ultimate Constituents (UC): two-facet units that cannot be segmented into smaller units having both sound-form and meaning (e.g. a | black | dress | in | severe | style).
The meaning of the sentence, word-group, etc. and the IC binary segmentation are interdependent (e.g. fat major’s wife may mean that either ‘the major is fat’ (fat major’s | wife) or ‘his wife is fat’ (fat | major’s wife)).
The Immediate Constituent analysis is mainly applied in lexicological investigation to find out the derivational structure of lexical units (e.g. to denationalise => de | nationalise; it is a prefixal derivative, because there are no such sound-forms as *denation or *denational). There are also numerous cases when identical morphemic structure of different words is insufficient proof of the identical pattern of their derivative structure, which can be revealed only by IC analysis (e.g. words which contain two root-morphemes and one derivational morpheme: snow-covered is a compound consisting of two stems, snow + covered, but blue-eyed is a suffixal derivative, (blue+eye)+-ed). It may be inferred from the examples above that ICs represent the word-formation structure while the UCs show the morphemic structure of polymorphic words.
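The binary segmentation described above can be pictured as a nested tree, sketched here in Python. A unit is either a string (an ultimate constituent) or a pair of ICs. Only the first cut of the phrase and the two readings of fat major’s wife come from the notes; the deeper cuts inside PHRASE and all names in the code are assumptions made for illustration.

```python
# A unit is either a string (an ultimate constituent) or a pair of ICs.
# Only the first cut (a black dress / in severe style) is given in the notes;
# the deeper cuts are an assumed segmentation for illustration.
PHRASE = (("a", ("black", "dress")), ("in", ("severe", "style")))

# The two readings of the ambiguous word-group from the notes.
MAJOR_IS_FAT = (("fat", "major's"), "wife")
WIFE_IS_FAT = ("fat", ("major's", "wife"))


def ultimate_constituents(unit):
    """Segment repeatedly until only unsegmentable units remain."""
    if isinstance(unit, str):
        return [unit]
    left, right = unit
    return ultimate_constituents(left) + ultimate_constituents(right)


def immediate_constituents(unit):
    """Return the two ICs produced by the first binary cut, as flat strings."""
    left, right = unit
    return (" ".join(ultimate_constituents(left)),
            " ".join(ultimate_constituents(right)))


print(immediate_constituents(PHRASE))        # ('a black dress', 'in severe style')
print(ultimate_constituents(PHRASE))         # ['a', 'black', 'dress', 'in', 'severe', 'style']
print(immediate_constituents(MAJOR_IS_FAT))  # ("fat major's", 'wife')
print(immediate_constituents(WIFE_IS_FAT))   # ('fat', "major's wife")
```

Both readings of fat major’s wife yield the same ultimate constituents; they differ only in where the first cut falls, which is why the segmentation and the meaning are interdependent.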
2. Characteristic features of learner’s dictionaries
Traditionally the term learner’s dictionaries is confined to dictionaries specifically compiled to meet the demands of learners for whom English is not their mother tongue. They may be classified in accordance with different principles, the main ones being: 1) the scope of the word-list, and 2) the nature of the information afforded. Depending on that, learner’s dictionaries are usually divided into: a) elementary/basic/pre-intermediate; b) intermediate; c) upper-intermediate/advanced learner’s dictionaries.
1. The scope of the word-list. Pre-intermediate as well as intermediate learner’s dictionaries contain only the most essential and important – key words of English, whereas upper-intermediate learner’s dictionaries contain lexical units that the prospective user may need.
Purpose: to give information on what is currently accepted in modern English. Excluded: archaic and dialectal words, technical and scientific terms, substandard words and phrases. Included: colloquial and slang words, foreign words, if they are of the sort likely to be met in reading or conversation. (frequency)
2. The nature of the information afforded. They may be divided into two groups: 1) learner’s dictionaries proper (those giving equal attention to the word’s semantic characteristics and the way it is used in speech); 2) those presenting different aspects of the vocabulary: dictionaries of collocations, derivational dictionaries (word-structure), dictionaries of synonyms and antonyms and some others.
Pre-intermediate and intermediate learner’s dictionaries differ from advanced sometimes greatly in the number of meanings given and the language used for the description of these meanings.
Pictorial material is widely used. Pictures may define the meanings of different nouns as well as adjectives, verbs, and adverbs. The order of arrangement of meaning is empiric (beginning with the main meaning to minor ones).
The supplementary material in learner’s dictionaries may include lists of irregular verbs, common abbreviations, geographic names, special signs and symbols used in various branches of science, tables of weights and measures and so on.
EXAMINATION CARD No. 27
1. Links between lexicology and other branches of linguistics
Lexicology is a branch of linguistics dealing with a systematic description and study of the vocabulary of the language as regards its origin, development, meaning and current use. The term is composed of 2 words of Greek origin: lexis — word + logos – word’s discourse. So lexicology is a word about words, or the science of a word. However, lexicology is concerned not only with words because the study of the structure of words implies references to morphemes which make up words.
On the other hand, the study of the semantic properties of a word implies references to variable or stable (set) word groups, of which words are component parts, because it is the semantic properties of words that define the general rules of their joining together.
Comparative linguistics and Contrasted linguistics are of great importance in classroom teaching and translation.
Lexicology is inseparable from: phonetics, grammar, and linguostylistics because phonetics also investigates vocabulary units but from the point of view of their sounds. Grammar in its turn deals with various means of expressing grammar peculiarities and grammar relations between words. Linguostylistics studies the nature, functioning and structure of stylistic devices and the styles of a language.
Language is a means of communication, therefore the social essence is inherent in the language itself. The branch of linguistics dealing with relations between the way the language functions and develops on the one hand and the development of social life on the other is called sociolinguistics.
2. Grammatical and lexical meanings of words
The word «meaning» is not homogeneous. Its components are described as «types of meaning». The two main types of meaning are grammatical and lexical meaning.
The grammatical meaning is the component of meaning recurrent in identical sets of individual forms of words (e.g. reads, draws, writes – 3rd person, singular; books, boys – plurality; boy’s, father’s – possessive case).
The lexical meaning is the meaning proper to the linguistic unit in all its forms and distribution (e.g. boy, boys, boy’s, boys’ – grammatical meaning and case are different but in all of them we find the semantic component «male child»).
Both grammatical meaning and lexical meaning make up the word meaning and neither of them can exist without the other.
There is also a third type: lexico-grammatical (or part-of-speech) meaning. It is a common denominator of all the meanings of words belonging to a lexical-grammatical class (nouns, verbs, adjectives etc. – all nouns have the common meaning of thingness, while all verbs express process or state).
EXAMINATION CARD No. 28
1. Types of word segmentability
Within the English word stock, morphologically segmentable and non-segmentable words may be distinguished (soundless, rewrite – segmentable; book, car – non-segmentable).
Morphemic segmentability may be of three types: 1. complete, 2. conditional, 3. defective.
A). Complete segmentability is characteristic of words with transparent morphemic structure. Their morphemes can be easily isolated and are called morphemes proper or full morphemes (e.g. senseless, endless, useless). The transparent morphemic structure is conditioned by the fact that their constituent morphemes recur with the same meaning in a number of other words.
B). Conditional segmentability characterizes words whose segmentation into constituent morphemes is doubtful for semantic reasons (e.g. retain, detain, contain). The sound clusters «re-, de-, con-» seem to be easily isolated since they recur in other words, but they have nothing in common with the morphemes «re-, de-, con-» which are found in the words «rewrite», «decode», «condensation». The sound-clusters «re-, de-, con-» possess neither lexical meaning nor part-of-speech meaning, but they have differential and distributional meaning. Morphemes of this kind are called pseudo-morphemes (quasi-morphemes).
C). Defective morphemic segmentability is the property of words whose component morphemes seldom or never recur in other words. Such morphemes are called unique morphemes. A unique morpheme can be isolated and displays a more or less clear meaning which is upheld by the denotational meaning of the other morpheme of the word (cranberry, strawberry, hamlet).
2. Basic criteria of semantic derivation within conversion pairs
There are different criteria for differentiating between the source and the derived word in a conversion pair.
1. The criterion of the non-correspondence between the lexical meaning of the root-morpheme and the part-of-speech meaning of the stem in one of the two words in a conversion pair. This criterion cannot be applied to abstract nouns.
2. The synonymity criterion is based on the comparison of a conversion pair with analogous synonymous word-pairs (e.g. comparing to chat – chat with synonymous pair of words to converse – conversation, it becomes obvious that the noun chat is the derived member as their semantic relations are similar). This criterion can be applied only to deverbal substantives.
3. The criterion of derivational relations. In the word-cluster hand – to hand – handful – handy the derived words of the first degree of derivation have suffixes added to the nominal base. Thus, the noun hand is the center of the word-cluster. This fact makes it possible to conclude that the verb to hand is the derived member.
4. The criterion of semantic derivation is based on semantic relations within the conversion pairs. If the semantic relations are typical of denominal verbs – verb is the derived member, but if they are typical of deverbal nouns – noun is the derived member (e.g. crowd – to crowd are perceived as those of ‘an object and an action characteristic of an object’ – the verb is the derived member).
5. According to the criterion of the frequency of occurrence a lower frequency value shows the derived character. (e.g. to answer (63%) – answer (35%) – the noun answer is the derived member).
6. The transformational criterion is based on the transformation of the predicative syntagma into a nominal syntagma (e.g. Mike visited his friends. – Mike’s visit to his friends. – then it is the noun that is derived member, but if we can’t transform the sentence, noun cannot be regarded as a derived member – Ann handed him a ball – XXX).
EXAMINATION CARD No. 29
1. Word-formation: definition, basic peculiarities
According to Smirnitsky, word-formation is a system of derivative types of words and the process of creating new words from the material available in the language after certain structural and semantic patterns. The two main types are: word-derivation and word-composition (compounding).
The basic ways of forming words in word-derivation are affixation and conversion (the formation of a new word by bringing a stem of this word into a different formal paradigm, e.g. a fall from to fall).
There exist other types: semantic word-building (homonymy, polysemy), sound and stress interchange (e.g. blood – bleed; increase), acronymy (e.g. NATO), blending (e.g. smog = smoke + fog) and shortening of words (e.g. lab, maths). But they are different in principle from derivation and compounding because they show the result but not the process.
2. Specialized dictionaries
Phraseological dictionaries have accumulated vast collections of idiomatic or colloquial phrases, proverbs and other, usually image-bearing word-groups with profuse illustrations. (An Anglo-Russian Phraseological Dictionary by A. V. Koonin)
New Words dictionaries have it as their aim adequate reflection of the continuous growth of the English language. (Berg P. A Dictionary of New Words in English)
Dictionaries of slang contain vulgarisms, jargonisms, taboo words, curse-words, colloquialisms, etc. (Dictionary of Slang and Unconventional English by E. Partridge)
Usage dictionaries pass judgement on usage problems of all kinds, on what is right or wrong. Designed for native speakers, they supply a variety of information on such usage problems as, e.g., the difference in meaning between words (like comedy, farce and burlesque; formality and formalism), the proper pronunciation of words, the plural forms of nouns (e.g. flamingo), the meaning of foreign and archaic words. (Dictionary of Modern English Usage by H. W. Fowler.)
Dictionaries of word-frequency inform the user as to the frequency of occurrence of lexical units in speech (oral or written). (M. West’s General Service List.)
A Reverse dictionary (back-to-front dictionaries) is a list of words in which the entry words are arranged in alphabetical order starting with their final letters. (Rhyming Dictionary of the English Language).
Pronouncing dictionaries record contemporary pronunciation. They indicate variant pronunciations (which are numerous in some cases), as well as the pronunciation of different grammatical forms. (English Pronouncing Dictionary by Daniel Jones)
Etymological dictionaries trace present-day words to the oldest forms available, establish their primary meanings and point out the immediate source of borrowing, its origin, and parallel forms in cognate languages. (Oxford Dictionary of English Etymology edited by C. T. Onions.)
Ideographic dictionaries designed for English-speaking writers, orators or translators seeking to express their ideas adequately contain words grouped by the concepts expressed. (Thesaurus of English Words and Phrases.)
Besides the most important and widely used types of English dictionaries discussed above there are some others, such as synonym-books, spelling reference books, hard-words dictionaries, etc.
EXAMINATION CARD No. 30
1. Meaning in morphemes
A morpheme is the smallest indivisible two-facet (form and meaning) language unit which implies an association of a certain meaning and sound-form. Unlike words, morphemes cannot function independently (they occur in speech only as parts of words).
Morphemes have certain semantic peculiarities that distinguish them from words: they don’t have grammatical meaning. Concrete lexical meaning is found mainly in root-morphemes (e.g. ‘friend’ – friendship). Lexical meaning of affixes is generalized (e.g. -er – doer of an action; re- – repetition of some action).
Lexical meaning in morphemes may be analyzed into connotational and denotational components. The connotational aspect of meaning may be found in root-morphemes and affixational morphemes (e.g. diminutive meaning: booklet).
The part-of-speech meaning is characteristic only of affixal morphemes; moreover, some affixal morphemes are devoid of any meaning but part-of-speech meaning (e.g. –ment).
Morphemes possess specific meanings (of their own). These are: 1) differential meaning and 2) distributional meaning.
Differential meaning is the semantic component that serves to distinguish one word from others containing identical morphemes (e.g. bookshelf, bookcase, bookhaunter).
Distributional meaning is the meaning of order and arrangement of morphemes that make up the word (e.g. heartless X lessheart).
Identical morphemes may have different sound-forms (e.g. divide, divisible, division – the root morpheme is represented phonetically in different ways). They are called allomorphs, or morpheme variants of one and the same morpheme.
2. Morphemic types of words
According to the number of morphemes, words may be classified into: monomorphic (root) words (e.g. live, house) and polymorphic words that consist of more than one morpheme (e.g. merciless).
Polymorphic words are subdivided into:
1. Monoradical (one-root) words may be of 3 subtypes: a) radical-suffixal words (e.g. helpless), b) radical-prefixal words (e.g. mistrust), c) prefixo-radical-suffixal words (e.g. misunderstanding).
2. Polyradical (two or more roots) words fall into: a) root morphemes without affixes (e.g. bookcase) and b) root morphemes with suffixes (e.g. straw-colored).
Task 1. Note the meanings of the 9 prefixes given below. Make new words with the given prefixes. Decide on the part of speech for each of the words. Then work out the approximate meaning of the words that follow before checking their meanings in a good dictionary:
over=too much co=together en=make
under= too little il, in, im, ir, un=not.
dose-___________________________
shadow__________________________
privileged___________________________
habit______________________________
large________________________________
literate________________________________
measurable_____________________________
compromising_______________________________
Can you think of three more beginning with each of the prefixes listed in the exercise?
Task 2. Note the meanings of the 5 prefixes given in the box below. Make new words with the given prefixes. Decide on the part of speech for each of the words. Then work out the approximate meaning of the words that follow before checking their meanings in a good dictionary:
action_____________________________
planetary_____________________________
historic______________________________
going_________________________________
humorous_______________________________
consider___________________________________
colonise__________________________________
Can you think of three more words beginning with each of the prefixes listed in the exercise?
Task 3. Note the meanings of the 5 prefixes given in the box below. Make new words with the given prefixes. Decide on the part of speech for each of the words. Then work out the approximate meaning of the words that follow before checking their meanings in a good dictionary:
trans= across, to the other side dis= causes the action to be reversed anti
counter= against, in opposition to mis= in the wrong manner
continental_____________________________
clockwise________________________________
balance__________________________________
count____________________________________
reputable___________________________________
handle______________________________________
understanding___________________________________
Can you think of three more words beginning with each of the prefixes listed in the exercise?
Task 4
In each sentence one word needs the addition of a prefix to give meaning to the sentence.
Identify the words which need prefixes and add them.
- Known as Saint Nicholas in Germany, Santa Claus was usually accompanied by Black Peter, an elf, who punished ____________ obedient children.
- Unemployment and costs have to ______________ acceptable levels.
- If he has his opinions on a subject, he is ____________ moveable.
- She headed back home and left her mission ______________ accomplished.
- She is rather _______________ trustful person to strangers.
- It was an ___________ mistakable step of his: he had own.
- It was ________________ rational to react in that manner.
- It is ________________ legal to drive while intoxicated.
- It was _____________ modest of them to say that.
- He had made progress that was previously ____________ achievable.
- It would be ____________ accurate to say that she has been dismissed.
- He felt a growing ______________ satisfaction with himself and his position.
- It is a _____________ alcoholic drink.
- You were ___________ attentive at the lecture, that’s why you didn’t understand anything.
- The great Himalayan region is one of the few remaining isolated and ____________ accessible areas in the world today.
Task 5. In each sentence one word needs the addition of a prefix to give meaning to the sentence. Identify the words which need prefixes and add them.
- He never phones his friends or goes out any more: he’s becoming really social. ________________
- With 600 billion people, the country faces population. ________________
- Don’t you think it was very responsible to leave a six-year-old alone in the house? ________________
- There are too many mistakes in this essay: I’m afraid you’ll have to write it. ________________
- He added a script to his letter to say that he received her check. ________________
- I think I have done the steaks: they’re very tough. ________________
- Drugs are legal in almost every country on earth. ________________
Task 6. In each sentence the word in capital letters needs the addition of a prefix to give meaning.
I have decided to write my 1 _________________ BIOGRAPHY! Now, you may think at 25 that I am too 2 _______________ MATURE to embark upon such an ambitious project but I think age is completely 3 __________________ RELEVANT. Anyway, I’m sure that my literary abilities will allow me to 4 ____________ COME that hurdle only too easily. It will be written in the form of a 5 ______________ LOGUE in which I tell the world about some of the 6 ______________ BELIEVABLY interesting events in my life so far. I also intend to clear up some very common and totally 7 _______________ LOGICAL 8 _____________ CONCEPTIONS about the 9 ____________ NATURAL and finally convince people that all those pseudo-intellectuals at universities have got it all wrong. Being my friend, I hope you will buy a copy, for it would be extremely 10 _______________ LOYAL not to do so, after all.
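Task 7. Complete this chart using the prefixes in the box to make the opposites of the adjectives and verbs given.
in-  im-  un-  mis-  dis-
Adjective/Verb | Opposite
active | 1 ____________________
secure | 2 ____________________
capable | 3 ____________________
experienced | 4 ____________________
possible | 5 ____________________
fortunate | 6 ____________________
conscious | 7 ____________________
healthy | 8 ____________________
understand | 9 ____________________
calculate | 10 ____________________
approve | 11 ____________________
obey | 12 ____________________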
Task 8 Form nouns from the given words with the help of the prefixes with the opposite meaning.
Example: employment- unemployment
honesty-____________________
difference-______________________
fortune-________________________
understanding-_______________________
dependence-_________________________
importance-__________________________
security-___________________________
expensive-_________________________
obedience-__________________________
population-__________________________
alcoholic-___________________
Task 9 Supply the right adjectival forms.
Example: I suspect he isn’t honest. In fact he’s quite dishonest
- This arrangement isn’t strictly legal. Some people would regard it as _________________
- Sometimes she doesn’t behave in a responsible manner. She’s quite _________________
- Such a situation is barely imaginable. It is quite _________________
- Bob’s not very capable. He’s _________________ of making sound decisions.
- This fish hasn’t been cooked enough. It’s _________________
- This scheme isn’t very practical. In fact, it’s quite _________________
- This dates from before the war. It’s _________________
Task 10. Form adjectives from the given ones with the help of the prefixes and point out the changes in meaning.
Example: practical- impractical
possible__________________
urban______________________
appointing_____________________
conscious _______________________
informed__________________________
accurate___________________________
believable___________________________
acceptable_____________________________
legal_________________________________
able_________________________________
complete______________________________
married________________________________
bearable_______________________________
successful______________________________
calculation_____________________________
approval________________________________
measurable______________________________
expensive________________________________
modest___________________________________
Keys.
Task 1. Possible answers: overdose, overshadow, underprivileged, cohabit, illiterate, immeasurable, uncompromising
Task2. Possible answers: interaction, interplanetary, prehistoric, foregoing, posthumous, reconsider, recolonise
Task3. Possible answers: transcontinental, anticlockwise, counterbalance, discount, disreputable, mishandle, misunderstanding
Task4. Possible answers: 1. disobedient, 2. unacceptable, 3. immovable, 4. unaccomplished, 5. distrustful, 6. unmistakable, 7. irrational, 8. illegal, 9. immodest, 10. unachievable, 11. inaccurate, 12. dissatisfaction, 13. non-alcoholic, 14. inattentive, 15. inaccessible
Task 5. 1 anti-social, 2. over-population, 3. irresponsible, 4. rewrite, 5. postscript, 6. overdone, 7 illegal
Task6 1. AUTOBIOGRAPHY, 2. IMMATURE, 3. IRRELEVANT, 4 OVERCOME, 5. MONOLOGUE/DIALOGUE
6. UNBELIEVABLY, 7. ILLOGICAL, 8. MISCONCEPTIONS, 9. SUPERNATURAL, 10. DISLOYAL
Task7 .1 inactive, 2 insecure, 3 incapable, 4 inexperienced, 5 Impossible, 6 unfortunate, 7 unconscious, 8 unhealthy, 9 misunderstand, 10 miscalculate, 11 disapprove, 12 disobey
Task8 dishonesty, indifference, misfortune, misunderstanding, independence, unimportance, insecurity, inexpensive, disobedience, overpopulation, non-alcoholic
Task9 illegal, irresponsible, unimaginable, incapable, uncooked, impractical, pre-war
Task10 impossible, interurban, disappointing, unconscious, unavailable, misinformed, inaccurate, unbelievable, unacceptable, illegal, unable, incomplete, unmarried, unbearable, unsuccessful, miscalculation, disapproval, immeasurable, inexpensive
NLTK's WordNet interface can be used to find synonyms and antonyms of words. The NLTK corpus package reads the corpus so that the lexical semantics of the words in a document can be examined. WordNet is a lexical database that records semantic relations between words and their meanings. These relations include synonyms, hypernyms, hyponyms, holonyms, and meronyms. NLTK exposes WordNet through synsets (sets of synonymous senses), which give access to a word's usages, definitions, and examples, and to the relations between senses. Relation type detection builds on these lexical semantics: a dog is a mammal, which can be expressed as an “IS-A” relation. NLTK WordNet can therefore support tasks such as finding relations between the words of a document, spam detection, duplicate detection, and describing the words in a text together with their POS tags.
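The “IS-A” relation can be explored by walking hypernym links upward from a synset. The following is a minimal sketch rather than a definitive implementation: `hypernym_chain`, `is_a`, and `demo` are hypothetical helper names, and the helpers only assume objects that expose a `hypernyms()` method, as NLTK's `Synset` objects do. Following only the first hypernym at each step is a simplifying assumption, since a synset may have several.

```python
def hypernym_chain(synset):
    """Follow the first hypernym link upward until none is left.

    Works with any object exposing hypernyms(), such as an NLTK Synset.
    Only the first hypernym is followed at each step (a simplification).
    """
    chain = [synset]
    while True:
        parents = chain[-1].hypernyms()
        # Stop at the top of the hierarchy, or if a link would loop back.
        if not parents or parents[0] in chain:
            break
        chain.append(parents[0])
    return chain


def is_a(synset, ancestor):
    """Return True if `ancestor` lies on the first-hypernym chain of `synset`."""
    return ancestor in hypernym_chain(synset)[1:]


def demo():
    # Requires NLTK and its WordNet data, e.g. nltk.download("wordnet").
    from nltk.corpus import wordnet

    dog = wordnet.synset("dog.n.01")
    for step in hypernym_chain(dog):
        print(step.name())
```

Calling `demo()` prints one chain of increasingly general senses for “dog”; the exact chain depends on the installed WordNet version.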
NLTK lemmatization, stemming, tokenization, and POS tagging are all related to NLTK WordNet in Natural Language Processing. To use the Natural Language Toolkit's WordNet effectively for text processing and text cleaning, the lexical relations it provides (synonyms, antonyms, holonyms, hypernyms, and hyponyms) should be used together. This NLTK WordNet Python tutorial covers synonym and antonym finding, along with word similarity calculation, using the NLTK corpus reader for the English language.
A quick example of the synonym and antonym finding with NLTK Python can be found below.
def synonym_antonym_extractor(phrase):
    from nltk.corpus import wordnet
    synonyms = []
    antonyms = []
    # Every synset is one sense of the phrase; its lemmas are the synonyms.
    for syn in wordnet.synsets(phrase):
        for l in syn.lemmas():
            synonyms.append(l.name())
            if l.antonyms():
                antonyms.append(l.antonyms()[0].name())
    print(set(synonyms))
    print(set(antonyms))

synonym_antonym_extractor(phrase="word")
OUTPUT >>>
{'tidings', 'password', 'Holy_Writ', 'Good_Book', 'Bible', 'discussion', 'news', 'parole', 'give_voice', 'articulate', 'Son', 'word', 'Holy_Scripture', 'Book', 'give-and-take', 'Christian_Bible', 'intelligence', 'Logos', 'phrase', 'word_of_honor', 'formulate', 'Scripture', 'Word', 'watchword', 'countersign', 'Word_of_God'}
set()
The synonym and antonym finding example above defines a custom function that imports “wordnet” from “nltk.corpus”, loops over the synsets of the phrase, and calls “lemmas()” on each synset and “antonyms()” on each lemma. The phrase “word” is used as the example. According to the WordNet data shipped with NLTK, “word” has no antonym, while its synonyms include “password”, “Holy_Writ”, “Good_Book”, “Bible”, “discussion”, “news”, and “parole”. The synonyms returned by WordNet cover every sense of the word, so both close lexical synonyms and context-dependent ones appear in the result.
This Python and NLTK guide covers the use of NLTK WordNet for lexical semantics and word similarities, as well as synonyms, antonyms, hypernyms, hyponyms, verb frames, and more.
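Word similarity between two words can be approximated by comparing all of their senses pairwise and keeping the best score. The sketch below is one possible approach under stated assumptions, not the tutorial's own method: `best_path_similarity` and `demo` are hypothetical names, and the helper only assumes objects with a `path_similarity(other)` method that returns a number or `None`, which is how NLTK's `Synset.path_similarity` behaves.

```python
def best_path_similarity(synsets_a, synsets_b):
    """Return the highest path similarity over all sense pairs, or None.

    Each element of synsets_a must expose path_similarity(other), which
    may return None when two senses cannot be compared.
    """
    best = None
    for a in synsets_a:
        for b in synsets_b:
            score = a.path_similarity(b)
            if score is not None and (best is None or score > best):
                best = score
    return best


def demo(first="dog", second="cat"):
    # Requires NLTK and its WordNet data, e.g. nltk.download("wordnet").
    from nltk.corpus import wordnet

    score = best_path_similarity(wordnet.synsets(first), wordnet.synsets(second))
    print(first, second, score)
```

Taking the maximum over sense pairs is a design choice: it answers "how similar can these two words be in some reading", which suits synonym discovery better than averaging over unrelated senses.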
How to Find Synonyms of a Word with NLTK WordNet and Python?
To find the synonyms of a word with NLTK WordNet, the instructions below should be followed.
- Import “wordnet” from “nltk.corpus”.
- Create a list to hold the synonyms of the word.
- Call the “synsets” method with the word.
- Loop over “syn.lemmas()” for every synset and append each lemma name to the list.
- Convert the list to a set to remove duplicates, and print it.
An example of the finding of the synonym of a word via NLTK and Python is below.
from nltk.corpus import wordnet

synonyms = []
for syn in wordnet.synsets("love"):
    for i in syn.lemmas():
        # Each lemma name is one synonym for this sense of the word.
        synonyms.append(i.name())
print(set(synonyms))
OUTPUT >>>
{'dearest', 'love_life', 'get_it_on', 'roll_in_the_hay', 'lie_with', 'screw', 'bonk', 'passion', 'honey', 'sleep_together', 'lovemaking', 'making_love', 'make_love', 'have_sex', 'jazz', 'bed', 'erotic_love', 'dear', 'do_it', 'have_it_away', 'be_intimate', 'fuck', 'have_a_go_at_it', 'sleep_with', 'hump', 'enjoy', 'eff', 'have_it_off', 'know', 'have_intercourse', 'make_out', 'bang', 'beloved', 'love', 'get_laid', 'sexual_love'}
In the example above, the word “love” is used to find synonyms across its different senses with NLTK and Python. The synonyms found for “love” include “dearest”, “lie_with”, “screw”, “bonk”, “passion”, and “honey”, along with more specific phrases such as “sexual_love” and “erotic_love”. Because every sense of the word contributes its own lemmas, indirectly related words also end up in the synonym list. NLTK WordNet can therefore be used to find contextual synonyms and sibling phrases for a word. Search engines make use of synonyms as well as compositional and non-compositional compounds, so for a search engine optimization or search engine creation project, WordNet synonyms help in understanding the context of textual data. WordNet is also mentioned in Google Patents documents as a methodology for synonym finding.
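Since a flat set mixes the synonyms of every sense, it can help to keep them grouped by synset. The following is a minimal sketch under stated assumptions: `synonyms_by_sense` and `demo` are hypothetical names, and the helper relies only on objects that provide `name()` and `lemmas()`, as NLTK synsets and lemmas do.

```python
def synonyms_by_sense(synsets):
    """Map each synset name to its lemma names, without duplicates.

    Insertion order is preserved, so senses appear in the order given.
    """
    grouped = {}
    for syn in synsets:
        names = []
        for lemma in syn.lemmas():
            if lemma.name() not in names:
                names.append(lemma.name())
        grouped[syn.name()] = names
    return grouped


def demo(word="love"):
    # Requires NLTK and its WordNet data, e.g. nltk.download("wordnet").
    from nltk.corpus import wordnet

    for sense, names in synonyms_by_sense(wordnet.synsets(word)).items():
        print(sense, names)
```

Grouping by sense makes it easier to pick the synonyms that fit a given context instead of filtering one large set by hand.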
How to Find Antonyms of a Word with NLTK WordNet and Python?
To find the Antonyms of a Word with NLTK WordNet and Python, the following instructions should be followed.
- Import “wordnet” from “nltk.corpus”.
- Create a list to hold the antonyms of the word.
- Call the “synsets” method with the word.
- Loop over “syn.lemmas()” for every synset.
- Call the “antonyms()” method on each lemma and, if it returns anything, append the “name()” of the first antonym to the list.
- Convert the list to a set to remove duplicates, and print it.
from nltk.corpus import wordnet

antonyms = []
for syn in wordnet.synsets("love"):
    for i in syn.lemmas():
        # Only some lemmas have antonyms, so check before indexing.
        if i.antonyms():
            antonyms.append(i.antonyms()[0].name())
print(set(antonyms))
OUTPUT >>>
{'hate'}
The NLTK antonym finding example returns “hate” as the antonym of “love”. Tokenizing a sentence and finding the synonyms and antonyms of each word helps reveal possible contextual connections when analyzing content with NLP, so a custom Python function that does this for a whole text is useful. The next section of this NLTK synonym and antonym finding tutorial with WordNet covers the creation of such a function.
How to use a custom Python Function for Finding Synonyms and Antonyms with NLTK WordNet?
To use a custom Python Function for finding synonyms and antonyms with NLTK, follow the instructions below.
- Create a custom function with the Python “def” keyword.
- Use the text for synonym and antonym finding as the argument of the custom synonym and antonym finder Python function.
- Import “word_tokenize” from “nltk.tokenize”.
- Import “wordnet” from “nltk.corpus”.
- Import “defaultdict” from “collections”.
- Import “pprint” to pretty-print the antonyms and synonyms.
- Tokenize the words within the sentence for synonym and antonym finding with NLTK.
- Create the antonym and synonym containers with “defaultdict(list)”.
- Use a for loop over the tokens of the sentence tokenized with NLTK.
- Use a nested for loop over the “synsets” of each token.
- Use an “if” statement to check whether the antonym of the word exists or not.
- Append all of the synonyms and antonyms for every word within the sentence to the synonym and antonym defaultdict lists.
- Use “pprint.pformat” and “dict” to make the synonym and antonym lists writable to a txt file.
- Open a new txt file.
- Write all of the synonyms and antonyms to the txt file.
- Close the txt file.
An example of using the WordNet NLTK for finding synonyms and antonyms from an example sentence can be found below.
def text_parser_synonym_antonym_finder(text: str):
    from nltk.tokenize import word_tokenize
    from nltk.corpus import wordnet
    from collections import defaultdict
    import pprint

    tokens = word_tokenize(text)
    synonyms = defaultdict(list)
    antonyms = defaultdict(list)
    for token in tokens:
        for syn in wordnet.synsets(token):
            for i in syn.lemmas():
                synonyms[token].append(i.name())
                # Only some lemmas have antonyms, so check before indexing.
                if i.antonyms():
                    antonyms[token].append(i.antonyms()[0].name())
    pprint.pprint(dict(synonyms))
    pprint.pprint(dict(antonyms))
    synonym_output = pprint.pformat(dict(synonyms))
    antonyms_output = pprint.pformat(dict(antonyms))
    # The file name is built from the first five characters of the text.
    # The "with" block closes the file, so no explicit close() is needed.
    with open(str(text[:5]) + ".txt", "a") as f:
        f.write("Starting of Synonyms of the Words from the Sentences: " + synonym_output + "\n")
        f.write("Starting of Antonyms of the Words from the Sentences: " + antonyms_output + "\n")

text_parser_synonym_antonym_finder(text="WordNet is a lexical database that has been used by a major search engine. From the WordNet, information about a given word or phrase can be calculated such as")
OUTPUT >>>
Starting of Synonyms of the Words from the Sentences: {'WordNet': ['wordnet',
'WordNet',
'Princeton_WordNet',
'wordnet',
'WordNet',
'Princeton_WordNet'],
'a': ['angstrom',
'angstrom_unit',
'A',
'vitamin_A',
'antiophthalmic_factor',
'axerophthol',
'A',
'deoxyadenosine_monophosphate',
'A',
'adenine',
'A',
'ampere',
'amp',
'A',
'A',
'a',
'A',
'type_A',
'group_A',
'angstrom',
'angstrom_unit',
'A',
'vitamin_A',
'antiophthalmic_factor',
'axerophthol',
'A',
'deoxyadenosine_monophosphate',
'A',
'adenine',
'A',
'ampere',
'amp',
'A',
'A',
'a',
'A',
'type_A',
'group_A',
'angstrom',
'angstrom_unit',
'A',
'vitamin_A',
'antiophthalmic_factor',
'axerophthol',
'A',
'deoxyadenosine_monophosphate',
'A',
'adenine',
'A',
'ampere',
'amp',
'A',
'A',
'a',
'A',
'type_A',
'group_A'],
'about': ['about',
'astir',
'approximately',
'about',
'close_to',
'just_about',
'some',
'roughly',
'more_or_less',
'around',
'or_so',
'about',
'around',
'about',
'around',
'about',
'around',
'about',
'around',
'about',
'about',
'almost',
'most',
'nearly',
'near',
'nigh',
'virtually',
'well-nigh'],
'as': ['arsenic',
'As',
'atomic_number_33',
'American_Samoa',
'Eastern_Samoa',
'AS',
'angstrom',
'angstrom_unit',
'A',
'vitamin_A',
'antiophthalmic_factor',
'axerophthol',
'A',
'deoxyadenosine_monophosphate',
'A',
'adenine',
'A',
'ampere',
'amp',
'A',
'A',
'a',
'A',
'type_A',
'group_A',
'equally',
'as',
'every_bit'],
'be': ['beryllium',
'Be',
'glucinium',
'atomic_number_4',
'be',
'be',
'be',
'exist',
'be',
'be',
'equal',
'be',
'constitute',
'represent',
'make_up',
'comprise',
'be',
'be',
'follow',
'embody',
'be',
'personify',
'be',
'be',
'live',
'be',
'cost',
'be'],
'been': ['be',
'be',
'be',
'exist',
'be',
'be',
'equal',
'be',
'constitute',
'represent',
'make_up',
'comprise',
'be',
'be',
'follow',
'embody',
'be',
'personify',
'be',
'be',
'live',
'be',
'cost',
'be'],
'by': ['by', 'past', 'aside', 'by', 'away'],
'calculated': ['calculate',
'cipher',
'cypher',
'compute',
'work_out',
'reckon',
'figure',
'calculate',
'estimate',
'reckon',
'count_on',
'figure',
'forecast',
'account',
'calculate',
'forecast',
'calculate',
'calculate',
'aim',
'direct',
'count',
'bet',
'depend',
'look',
'calculate',
'reckon',
'deliberate',
'calculated',
'measured'],
'can': ['can',
'tin',
'tin_can',
'can',
'canful',
'can',
'can_buoy',
'buttocks',
'nates',
'arse',
'butt',
'backside',
'bum',
'buns',
'can',
'fundament',
'hindquarters',
'hind_end',
'keister',
'posterior',
'prat',
'rear',
'rear_end',
'rump',
'stern',
'seat',
'tail',
'tail_end',
'tooshie',
'tush',
'bottom',
'behind',
'derriere',
'fanny',
'ass',
'toilet',
'can',
'commode',
'crapper',
'pot',
'potty',
'stool',
'throne',
'toilet',
'lavatory',
'lav',
'can',
'john',
'privy',
'bathroom',
'can',
'tin',
'put_up',
'displace',
'fire',
'give_notice',
'can',
'dismiss',
'give_the_axe',
'send_away',
'sack',
'force_out',
'give_the_sack',
'terminate'],
'database': ['database'],
'engine': ['engine',
'engine',
'locomotive',
'engine',
'locomotive_engine',
'railway_locomotive',
'engine'],
'given': ['given',
'presumption',
'precondition',
'give',
'yield',
'give',
'afford',
'give',
'give',
'give',
'pay',
'hold',
'throw',
'have',
'make',
'give',
'give',
'throw',
'give',
'gift',
'present',
'give',
'yield',
'give',
'pay',
'devote',
'render',
'yield',
'return',
'give',
'generate',
'impart',
'leave',
'give',
'pass_on',
'establish',
'give',
'give',
'give',
'sacrifice',
'give',
'pass',
'hand',
'reach',
'pass_on',
'turn_over',
'give',
'give',
'dedicate',
'consecrate',
'commit',
'devote',
'give',
'give',
'apply',
'give',
'render',
'grant',
'give',
'move_over',
'give_way',
'give',
'ease_up',
'yield',
'feed',
'give',
'contribute',
'give',
'chip_in',
'kick_in',
'collapse',
'fall_in',
'cave_in',
'give',
'give_way',
'break',
'founder',
'give',
'give',
'give',
'afford',
'open',
'give',
'give',
'give',
'give',
'yield',
'give',
'give',
'give',
'give',
'give',
'give',
'give',
'give',
'give',
'give',
'give',
'given',
'granted',
'apt',
'disposed',
'given',
'minded',
'tending'],
'has': ['hour_angle',
'HA',
'have',
'have_got',
'hold',
'have',
'feature',
'experience',
'receive',
'have',
'get',
'own',
'have',
'possess',
'get',
'let',
'have',
'consume',
'ingest',
'take_in',
'take',
'have',
'have',
'hold',
'throw',
'have',
'make',
'give',
'have',
'have',
'have',
'experience',
'have',
'induce',
'stimulate',
'cause',
'have',
'get',
'make',
'accept',
'take',
'have',
'receive',
'have',
'suffer',
'sustain',
'have',
'get',
'have',
'get',
'make',
'give_birth',
'deliver',
'bear',
'birth',
'have',
'take',
'have'],
'information': ['information',
'info',
'information',
'information',
'data',
'information',
'information',
'selective_information',
'entropy'],
'is': ['be',
'be',
'be',
'exist',
'be',
'be',
'equal',
'be',
'constitute',
'represent',
'make_up',
'comprise',
'be',
'be',
'follow',
'embody',
'be',
'personify',
'be',
'be',
'live',
'be',
'cost',
'be'],
'lexical': ['lexical', 'lexical'],
'major': ['major',
'Major',
'John_Major',
'John_R._Major',
'John_Roy_Major',
'major',
'major',
'major',
'major',
'major',
'major',
'major',
'major',
'major',
'major',
'major'],
'or': ['Oregon',
'Beaver_State',
'OR',
'operating_room',
'OR',
'operating_theater',
'operating_theatre',
'surgery'],
'phrase': ['phrase',
'phrase',
'musical_phrase',
'idiom',
'idiomatic_expression',
'phrasal_idiom',
'set_phrase',
'phrase',
'phrase',
'give_voice',
'formulate',
'word',
'phrase',
'articulate',
'phrase'],
'search': ['search',
'hunt',
'hunting',
'search',
'search',
'lookup',
'search',
'search',
'search',
'seek',
'look_for',
'search',
'look',
'research',
'search',
'explore',
'search'],
'such': ['such', 'such'],
'used': ['use',
'utilize',
'utilise',
'apply',
'employ',
'use',
'habituate',
'use',
'expend',
'use',
'practice',
'apply',
'use',
'use',
'used',
'exploited',
'ill-used',
'put-upon',
'used',
'victimized',
'victimised',
'secondhand',
'used'],
'word': ['word',
'word',
'news',
'intelligence',
'tidings',
'word',
'word',
'discussion',
'give-and-take',
'word',
'parole',
'word',
'word_of_honor',
'word',
'Son',
'Word',
'Logos',
'password',
'watchword',
'word',
'parole',
'countersign',
'Bible',
'Christian_Bible',
'Book',
'Good_Book',
'Holy_Scripture',
'Holy_Writ',
'Scripture',
'Word_of_God',
'Word',
'give_voice',
'formulate',
'word',
'phrase',
'articulate']}
Starting of Antonyms of the Words from the Sentences: {'be': ['differ'],
'been': ['differ'],
'can': ['hire'],
'given': ['take', 'starve'],
'has': ['lack', 'abstain', 'refuse'],
'is': ['differ'],
'major': ['minor', 'minor', 'minor', 'minor', 'minor', 'minor', 'minor'],
'used': ['misused']}
In the example above, a sentence is passed to a custom Python function, “text_parser_synonym_antonym_finder”, to find the synonyms and antonyms of its words. The output shown above is the content of the “.txt” file that the function writes.
For the synonym and antonym extraction, the function creates a new “.txt” file named after the first five characters of the text. It is important to notice that with NLTK WordNet and Python, the same synonym can appear several times for one word, because the word belongs to several synsets (different senses and different parts of speech) and every synset contributes its own lemmas.
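Because of this repetition, the raw lists are noisy. The following is a minimal sketch of one way to collapse them. The helper name `unique_in_order` is hypothetical and is not part of NLTK; it works on any list of lemma names, such as the “by” entry from the output above.

```python
def unique_in_order(names):
    """Return the names without duplicates, keeping first-seen order."""
    # dict keys are unique and remember insertion order (Python 3.7+).
    return list(dict.fromkeys(names))


# The "by" entry copied from the synonym output above.
by_synonyms = ["by", "past", "aside", "by", "away"]
print(unique_in_order(by_synonyms))  # ['by', 'past', 'aside', 'away']
```

The comparison is case-sensitive, so entries such as “A” and “a” stay separate.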
How to use POS Tagging for Synonym and Antonym Finding with NLTK WordNet?
To use POS tagging for synonym and antonym finding with NLTK WordNet, the “pos” parameter of “wordnet.synsets()” should be used. An example of using the POS filter to find the synsets of a word with NLTK WordNet is below.
from nltk.corpus import wordnet

print("VERB of Love: ", wordnet.synsets("love", pos = wordnet.VERB))
print("ADJECTIVE of Love: ", wordnet.synsets("love", pos = wordnet.ADJ))
print("NOUN of Love: ", wordnet.synsets("love", pos = wordnet.NOUN))
OUTPUT >>>
VERB of Love: [Synset('love.v.01'), Synset('love.v.02'), Synset('love.v.03'), Synset('sleep_together.v.01')]
ADJECTIVE of Love: []
NOUN of Love: [Synset('love.n.01'), Synset('love.n.02'), Synset('beloved.n.01'), Synset('love.n.04'), Synset('love.n.05'), Synset('sexual_love.n.02')]
The POS filter for synonyms and antonyms with NLTK WordNet returns different synsets (synonym rings) for the different senses of a word. For instance, “love.v.01” and “love.v.02” are both verb senses of “love”, but they do not mean the same thing. To see how the senses of a word differ, the “definition()” method of a synset can be used together with the POS filter. To learn more about NLTK POS tagging, read the related guide and tutorial.
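The synset names printed above follow a “lemma.pos.number” pattern. As a small illustration in plain Python, the sketch below splits such a name into its parts. The names `parse_synset_name` and `POS_NAMES` are hypothetical helpers written for this article, not NLTK functions.

```python
# Part-of-speech letters that appear in WordNet synset names.
POS_NAMES = {"n": "noun", "v": "verb", "a": "adjective",
             "s": "adjective satellite", "r": "adverb"}


def parse_synset_name(name):
    """Split a synset name such as 'love.v.01' into lemma, POS, and sense number."""
    # Split from the right so that a lemma containing dots stays intact.
    lemma, pos, sense = name.rsplit(".", 2)
    return lemma, POS_NAMES[pos], int(sense)


print(parse_synset_name("love.v.01"))         # ('love', 'verb', 1)
print(parse_synset_name("sexual_love.n.02"))  # ('sexual_love', 'noun', 2)
```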
How to Find the Definition of a Synonym Word with NLTK WordNet?
To find the definition of a word sense with NLTK WordNet, the “wordnet.synset().definition()” method should be used with a synset name that combines the lemma, the part of speech, and the sense number, such as “love.v.01”. To see two different meanings of the same word, the word “love” is used as an example below.
wordnet.synset("love.v.01").definition()
OUTPUT >>>
'have a great affection or liking for'
The example above shows that the first verb sense of “love” is defined as “have a great affection or liking for”. The example below shows the second verb definition of “love”.
wordnet.synset("love.v.02").definition()
OUTPUT >>>
'get pleasure from'
The second verb sense of “love” is defined as “get pleasure from”. Thus, even when two senses share the same string, their meanings can be different: a word form can belong to several synsets with the same spelling but different meanings. With NLTK WordNet, these senses can be told apart by combining the POS filter with the definition of each synset. To improve the contextual understanding of a sentence further, the usage examples of a word can be retrieved. There are other ways to find a word definition with Python, such as PyDictionary, but NLTK WordNet provides other benefits, such as example sentences for the words and the different senses of a word with its antonyms and synonyms.
How to find the sentence examples for words within NLTK WordNet?
To find the example sentences of a word sense with NLTK WordNet, the “examples()” method of a synset is used. An example of sentence example extraction with NLTK WordNet can be found below.
for i in wordnet.synset("love.v.01").examples():
    print(i)
OUTPUT >>>
I love French food
She loves her boss and works hard for him
In the example above, the first verb sense of “love” is used with the “wordnet.synset().examples()” method. “I love French food” and “She loves her boss and works hard for him” are sentences in which the word “love” is used with that specific meaning.
for i in wordnet.synset("love.v.02").examples():
    print(i)
OUTPUT >>>
I love cooking
The second verb sense of “love” is used in the example above. The sentence “I love cooking” is returned by NLTK WordNet as an example of the second meaning of the verb “love”. The NLTK WordNet “examples()” method is useful to see the exact context of a specific word, identified by its POS tag and its sense number.
How to Extract the Synonyms and their Definitions at the same time with NLTK WordNet?
To extract the synonyms and their definitions with NLTK WordNet, the “wordnet.synsets()” and “lemmas()” methods should be used together with the “definition()” method. The instructions below should be followed for extracting the synonyms and their definitions at the same time with NLTK WordNet.
- Call “wordnet.synsets()” for a word such as “love” or “phrase”.
- Take the lemmas of each synonym ring with the “lemmas()” method.
- Print the “lemma.name()” and the “definition()” of the synset at the same time.
Below, you can find the example code and its output.
for i in wordnet.synsets("love"):
    for lemma in i.lemmas():
        print("Synonym of Word: " + lemma.name(), "| Definition of Synonym: " + i.definition())
OUTPUT >>>
Synonym of Word: love | Definition of Synonym: a strong positive emotion of regard and affection
Synonym of Word: love | Definition of Synonym: any object of warm affection or devotion
Synonym of Word: passion | Definition of Synonym: any object of warm affection or devotion
Synonym of Word: beloved | Definition of Synonym: a beloved person; used as terms of endearment
Synonym of Word: dear | Definition of Synonym: a beloved person; used as terms of endearment
Synonym of Word: dearest | Definition of Synonym: a beloved person; used as terms of endearment
Synonym of Word: honey | Definition of Synonym: a beloved person; used as terms of endearment
Synonym of Word: love | Definition of Synonym: a beloved person; used as terms of endearment
Synonym of Word: love | Definition of Synonym: a deep feeling of sexual desire and attraction
Synonym of Word: sexual_love | Definition of Synonym: a deep feeling of sexual desire and attraction
Synonym of Word: erotic_love | Definition of Synonym: a deep feeling of sexual desire and attraction
Synonym of Word: love | Definition of Synonym: a score of zero in tennis or squash
Synonym of Word: sexual_love | Definition of Synonym: sexual activities (often including sexual intercourse) between two people
Synonym of Word: lovemaking | Definition of Synonym: sexual activities (often including sexual intercourse) between two people
Synonym of Word: making_love | Definition of Synonym: sexual activities (often including sexual intercourse) between two people
Synonym of Word: love | Definition of Synonym: sexual activities (often including sexual intercourse) between two people
Synonym of Word: love_life | Definition of Synonym: sexual activities (often including sexual intercourse) between two people
Synonym of Word: love | Definition of Synonym: have a great affection or liking for
Synonym of Word: love | Definition of Synonym: get pleasure from
Synonym of Word: enjoy | Definition of Synonym: get pleasure from
Synonym of Word: love | Definition of Synonym: be enamored or in love with
Synonym of Word: sleep_together | Definition of Synonym: have sexual intercourse with
Synonym of Word: roll_in_the_hay | Definition of Synonym: have sexual intercourse with
Synonym of Word: love | Definition of Synonym: have sexual intercourse with
Synonym of Word: make_out | Definition of Synonym: have sexual intercourse with
Synonym of Word: make_love | Definition of Synonym: have sexual intercourse with
Synonym of Word: sleep_with | Definition of Synonym: have sexual intercourse with
Synonym of Word: get_laid | Definition of Synonym: have sexual intercourse with
Synonym of Word: have_sex | Definition of Synonym: have sexual intercourse with
Synonym of Word: know | Definition of Synonym: have sexual intercourse with
Synonym of Word: do_it | Definition of Synonym: have sexual intercourse with
Synonym of Word: be_intimate | Definition of Synonym: have sexual intercourse with
Synonym of Word: have_intercourse | Definition of Synonym: have sexual intercourse with
Synonym of Word: have_it_away | Definition of Synonym: have sexual intercourse with
Synonym of Word: have_it_off | Definition of Synonym: have sexual intercourse with
Synonym of Word: screw | Definition of Synonym: have sexual intercourse with
Synonym of Word: fuck | Definition of Synonym: have sexual intercourse with
Synonym of Word: jazz | Definition of Synonym: have sexual intercourse with
Synonym of Word: eff | Definition of Synonym: have sexual intercourse with
Synonym of Word: hump | Definition of Synonym: have sexual intercourse with
Synonym of Word: lie_with | Definition of Synonym: have sexual intercourse with
Synonym of Word: bed | Definition of Synonym: have sexual intercourse with
Synonym of Word: have_a_go_at_it | Definition of Synonym: have sexual intercourse with
Synonym of Word: bang | Definition of Synonym: have sexual intercourse with
Synonym of Word: get_it_on | Definition of Synonym: have sexual intercourse with
Synonym of Word: bonk | Definition of Synonym: have sexual intercourse with
The example above lists every sense of the word “love” with its synonyms and definitions. It shows how content can be made richer with a more varied vocabulary, and how the context can be deepened further to improve relevance. An Information Retrieval system can use these synonyms and antonyms to better understand the purpose of the content. Thus, synonym and antonym extraction with NLTK WordNet, along with the examination of a word’s definitions and example sentences, is important.
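The listing above repeats each definition once per lemma. A more compact view groups the lemma names under their shared definition. The sketch below is one possible way to do that in plain Python; `group_by_definition` is a hypothetical helper, and the sample pairs are copied from the output above rather than queried from WordNet.

```python
from collections import defaultdict


def group_by_definition(pairs):
    """Map each definition to the unique lemma names listed for it."""
    grouped = defaultdict(list)
    for lemma_name, definition in pairs:
        if lemma_name not in grouped[definition]:
            grouped[definition].append(lemma_name)
    return dict(grouped)


# A few (synonym, definition) pairs copied from the output above.
pairs = [
    ("love", "any object of warm affection or devotion"),
    ("passion", "any object of warm affection or devotion"),
    ("love", "get pleasure from"),
    ("enjoy", "get pleasure from"),
]
for definition, lemma_names in group_by_definition(pairs).items():
    print(definition, "->", ", ".join(lemma_names))
```

With real data, the pairs could be produced by the same nested loop as above, appending `(lemma.name(), i.definition())` instead of printing.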
How to extract synonyms and antonyms from other languages besides English via NLTK Wordnet?
To extract synonyms and antonyms from languages other than English via NLTK WordNet, a language code should be passed to the “lang” parameter of methods such as “lemma_names()”; the available codes are returned by the “wordnet.langs()” method. NLTK WordNet uses three-letter ISO 639 language codes. The non-English lemmas come from the Open Multilingual WordNet data, which has to be downloaded for NLTK in addition to WordNet. The language codes that can be used with NLTK WordNet can be seen below.
- eng
- als
- arb
- bul
- cat
- cmn
- dan
- ell
- eus
- fas
- fin
- fra
- glg
- heb
- hrv
- ind
- ita
- jpn
- nld
- nno
- nob
- pol
- por
- qcn
- slv
- spa
- swe
- tha
- zsm
To use the ISO 639 language codes with NLTK WordNet to find synonyms and antonyms with the “lang” parameter, you can examine the example below.
wordnet.synset("love.v.01").lemma_names("fra")
OUTPUT >>>
['aimer', 'amour', 'bien', "faire_l'amour", 'Mange']
The example above uses the “lang” parameter to list the French lemmas for the first verb sense of “love”.
These cross-language synonyms from different contexts can be used to find the contextual relevance between documents in different languages. Thus, NLTK is a valuable tool for search engines. ISO 639 language codes are also used for the hreflang attribute in the context of SEO, although hreflang takes the two-letter codes while the NLTK WordNet “lang” parameter takes the three-letter ones.
What other Lexical Semantics can be extracted with NLTK WordNet besides Antonyms and Synonyms?
The other lexical semantic relations that can be extracted with NLTK WordNet besides antonyms and synonyms are listed below.
- Hypernyms: A hypernym is the inverse of a hyponym. It is the broader, superior class that a thing belongs to. NLTK WordNet can be used for extracting the hypernyms of a word with the “hypernyms()” method.
- Hyponyms: A hyponym is the inverse of a hypernym. It is a narrower, inferior member of a class of things. NLTK WordNet can be used for extracting the hyponyms of a word with the “hyponyms()” method.
- Holonyms: A holonym is the inverse of a meronym. It is the name of the whole thing that has multiple parts or members. NLTK WordNet can be used for extracting the holonyms of a word with methods such as “member_holonyms()”.
- Meronyms: A meronym is the inverse of a holonym. It is the name of a part or member within the whole. NLTK WordNet can be used for extracting the meronyms of a word with methods such as “member_meronyms()”.
Lexical Semantics involves hypernyms, hyponyms, holonyms, meronyms, antonyms, synonyms, and more semantic word relations. Semantic Role Labeling and Lexical Semantics are directly connected to Semantic SEO and Natural Language Processing. In this context, NLTK WordNet and Lexical Relations such as hypernyms, hyponyms, meronyms are important for SEO and NLP.
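The “inverse” wording in the list above can be made concrete with a toy mapping in plain Python. The sketch below does not call NLTK: `HYPERNYM_OF` is a small hand-written table holding three child-to-parent pairs that also appear in the WordNet outputs later in this article, and `invert` is a hypothetical helper name.

```python
from collections import defaultdict

# Direct hypernym of a few noun senses (child -> parent), written by hand.
HYPERNYM_OF = {
    "love.n.01": "emotion.n.01",
    "agape.n.01": "love.n.01",
    "amorousness.n.01": "love.n.01",
}


def invert(relation):
    """Turn a child -> parent mapping into a parent -> children mapping."""
    inverse = defaultdict(list)
    for child, parent in relation.items():
        inverse[parent].append(child)
    return dict(inverse)


# Reading the hypernym table backwards yields the hyponyms.
HYPONYMS_OF = invert(HYPERNYM_OF)
print(HYPONYMS_OF["love.n.01"])  # ['agape.n.01', 'amorousness.n.01']
```

The same inversion idea applies to the holonym and meronym pair: reading a part-to-whole table backwards gives the whole-to-parts view.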
How to Find Hypernym of a Word with NLTK WordNet and Python?
To find the hypernyms of a word and to see its superior class names, the “hypernyms()” method of a synset in NLTK WordNet should be used; the related “hypernym_distances()” method returns every hypernym up to the root together with its distance from the sense. The hypernym is a lexical relation in NLTK WordNet that points to the broader, superior concepts of a word. A hypernym can show the context of the word. An example of finding the hypernyms of a word with their distances can be seen below.
for syn in wordnet.synsets("love"):
    print(syn.hypernym_distances())
OUTPUT >>>
{(Synset('feeling.n.01'), 2), (Synset('attribute.n.02'), 4), (Synset('love.n.01'), 0), (Synset('entity.n.01'), 6), (Synset('abstraction.n.06'), 5), (Synset('state.n.02'), 3), (Synset('emotion.n.01'), 1)}
{(Synset('love.n.02'), 0), (Synset('cognition.n.01'), 3), (Synset('content.n.05'), 2), (Synset('psychological_feature.n.01'), 4), (Synset('entity.n.01'), 6), (Synset('abstraction.n.06'), 5), (Synset('object.n.04'), 1)}
{(Synset('whole.n.02'), 5), (Synset('physical_entity.n.01'), 7), (Synset('entity.n.01'), 8), (Synset('entity.n.01'), 5), (Synset('organism.n.01'), 3), (Synset('object.n.01'), 6), (Synset('beloved.n.01'), 0), (Synset('living_thing.n.01'), 4), (Synset('physical_entity.n.01'), 4), (Synset('lover.n.01'), 1), (Synset('person.n.01'), 2), (Synset('causal_agent.n.01'), 3)}
{(Synset('abstraction.n.06'), 6), (Synset('state.n.02'), 4), (Synset('sexual_desire.n.01'), 1), (Synset('attribute.n.02'), 5), (Synset('entity.n.01'), 7), (Synset('love.n.04'), 0), (Synset('feeling.n.01'), 3), (Synset('desire.n.01'), 2)}
{(Synset('score.n.03'), 1), (Synset('measure.n.02'), 4), (Synset('number.n.02'), 2), (Synset('entity.n.01'), 6), (Synset('abstraction.n.06'), 5), (Synset('love.n.05'), 0), (Synset('definite_quantity.n.01'), 3)}
{(Synset('sexual_activity.n.01'), 1), (Synset('organic_process.n.01'), 3), (Synset('process.n.06'), 4), (Synset('sexual_love.n.02'), 0), (Synset('entity.n.01'), 6), (Synset('physical_entity.n.01'), 5), (Synset('bodily_process.n.01'), 2)}
{(Synset('love.v.01'), 0)}
{(Synset('like.v.02'), 1), (Synset('love.v.02'), 0)}
{(Synset('love.v.03'), 0), (Synset('love.v.01'), 1)}
{(Synset('copulate.v.01'), 1), (Synset('sleep_together.v.01'), 0), (Synset('connect.v.01'), 3), (Synset('join.v.04'), 2)}
The explanation of the “how to find the hypernyms of a word with NLTK” code block is below.
- Import WordNet from NLTK.
- Call the “wordnet.synsets()” method for the word.
- Use a for loop to print the “hypernym_distances()” of every sense of the word.
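Each line of the output above is an unordered set, so the path from the sense up to the root is hard to read. Sorting the pairs by distance recovers the chain. The sketch below shows this in plain Python with the pairs of the first output line copied by hand as strings; `hypernym_chain` is a hypothetical helper name. With real Synset objects, the name of each element is available through `synset.name()`.

```python
# (synset name, distance) pairs copied from the first line of the output above.
love_n_01 = {
    ("feeling.n.01", 2), ("attribute.n.02", 4), ("love.n.01", 0),
    ("entity.n.01", 6), ("abstraction.n.06", 5), ("state.n.02", 3),
    ("emotion.n.01", 1),
}


def hypernym_chain(distances):
    """Order (name, distance) pairs from the sense itself up to the root."""
    return [name for name, _ in sorted(distances, key=lambda pair: pair[1])]


print(" -> ".join(hypernym_chain(love_n_01)))
```

This simple sort assumes a single path to the root. A sense with several paths, such as the third output line where “entity.n.01” appears at two distances, would need the paths to be separated first.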
The example covers the different “noun” and “verb” senses of the selected word. Thus, there are many different hypernym paths. The hypernym distance is the number of steps from the sense up to each broader concept in the lexical hierarchy. For instance, the first “noun” sense of “love” has “emotion” as its direct hypernym (distance 1) and “feeling” one level above it (distance 2), while the second “noun” sense has “object” as its direct hypernym. The sense of a word can be seen with its definition as below.
wordnet.synset("love.n.01").definition()
OUTPUT>>>
'a strong positive emotion of regard and affection'
WordNet says that “love.n.01” means a strong positive emotion. Accordingly, the direct hypernym of the word “love” for the first noun sense is “emotion”, and “feeling” follows one level higher. For the second “noun” sense of “love”, the definition is below.
wordnet.synset("love.n.02").definition()
OUTPUT >>>
'any object of warm affection or devotion'
The second noun sense of “love”, which is “any object of warm affection or devotion”, has “object” (“object.n.04”) as its direct hypernym. Thus, according to the sense of a word, the meaning and the hypernyms will change. The WordNet hypernym paths and distances can affect the topicality score and semantic relevance of a content piece to a query or a context. Another “hypernym” finding example can be found below.
dog = wordnet.synset('dog.n.01')
print(dog.hypernyms())
OUTPUT >>>
[Synset('canine.n.02'), Synset('domestic_animal.n.01')]
The first noun sense of “dog” has two direct hypernyms, “canine” and “domestic animal”. Narrower terms such as “dalmatian”, “griffon”, “puppy”, and “working dog” are not hypernyms of “dog”; they are its hyponyms, and they are returned by “dog.hyponyms()”. Depending on the general context of the document, one of these narrower terms can be closer to the intended meaning of “dog”. Finding hypernyms and finding hyponyms are connected to each other. Hyponyms can complete the meaning of a hypernym for the selected word within NLTK WordNet.
How to Find Hyponym of a Word with NLTK WordNet and Python?
To find the hyponyms of a word with NLTK WordNet and Python, the “hyponyms()” method can be used. Hyponym finding is beneficial to see the lexical relations of a word when it acts as a hypernym. Hyponyms are the narrower, more specific types of a word in each of its senses. To find hyponyms with NLTK and NLP, use the example below.
for syn in wordnet.synsets("love"):
    print(syn.hyponyms())
OUTPUT >>>
[Synset('agape.n.01'), Synset('agape.n.02'), Synset('amorousness.n.01'), Synset('ardor.n.02'), Synset('benevolence.n.01'), Synset('devotion.n.01'), Synset('filial_love.n.01'), Synset('heartstrings.n.01'), Synset('lovingness.n.01'), Synset('loyalty.n.02'), Synset('puppy_love.n.01'), Synset('worship.n.02')]
[]
[]
[]
[]
[]
[Synset('adore.v.01'), Synset('care_for.v.02'), Synset('dote.v.02'), Synset('love.v.03')]
[Synset('get_off.v.06')]
[Synset('romance.v.02')]
[Synset('fornicate.v.01'), Synset('take.v.35')]
The explanation of the hyponym finding with the NLTK code example is below.
- Import WordNet from NLTK.
- Call “wordnet.synsets()” for the selected word.
- Call “hyponyms()” for every sense of the word.
The example above for the word “love” shows that there are different hyponyms for the different senses of “love”. For the first noun sense, the hyponyms include two senses of “agape” (“agape.n.01” and “agape.n.02”), while the other noun senses of “love” have no hyponyms at all. A sense can have multiple hyponyms within NLTK WordNet; “amorousness”, for example, is another hyponym of “love” for the first noun sense. It means that when we check the hypernyms of a hyponym, the same concept will appear to complete the hypernym path. An example of this bidirectional hypernym-hyponym check with NLTK WordNet is below.
for syn in wordnet.synsets("amorousness"):
    print(syn.hypernyms())
OUTPUT >>>
[Synset('love.n.01')]
[Synset('sexual_desire.n.01')]
The hypernym of the first sense of “amorousness” is “love”. The second sense of “amorousness” has “sexual desire” as its hypernym, which signals the context of the connection between “love” and “amorousness”. The same process can be followed for the first hyponym of “love”, which is “agape”.
for syn in wordnet.synsets("agape"):
    print(syn.hypernyms())
OUTPUT >>>
[Synset('love.n.01')]
[Synset('love.n.01')]
[Synset('religious_ceremony.n.01')]
[]
Naturally, “agape” has “love” as the hypernym for its first two senses. Its third sense has “religious ceremony” as a hypernym, which shows the context of its connection to the word “love”. If we check the synonyms and the definition of “agape”, this connection becomes clearer.
wordnet.synset("agape.n.01").definition()
OUTPUT >>>
'(Christian theology) the love of God or Christ for mankind'
The definition of “agape” shows the religious context that connects the word “love” and its hyponym. The synonyms of “agape” can make the context of this connection clearer.
for syn in wordnet.synsets("agape"):
    for l in syn.lemmas():
        print(l.name())
OUTPUT >>>
agape
agape
agape_love
agape
love_feast
agape
gaping
The synonyms of “agape” represent its Christian context as a hyponym of the word “love”, because “love feast” is one of the synonyms of “agape”, and “love feast” is a term from Christian tradition.
The NLTK WordNet hypernyms and hyponyms show the context of the word and the possible topical associations of the concept. Hyponym finding via NLTK and NLP can be supported by auditing the hypernyms and synonyms, along with the definitions of the words. The hypernym and hyponym connections are also relevant to Topic Modeling. In this context, Topic Modeling with BERTopic can be given as an example.
How to Find Verb Frames of a Verb with NLTK WordNet and Python?
The verb frames of a verb can be found with the “frame_ids()” and “frame_strings()” methods of a lemma in NLTK WordNet. A verb frame is a generic sentence pattern, such as “Somebody run PP”, that shows how the specific verb can be used. Below, you can see an example usage of “frame_ids()” and “frame_strings()” with NLTK WordNet to find the verb frames.
for lemma in wordnet.synset('run.v.02').lemmas():
    print(lemma, lemma.frame_ids())
    print(" | ".join(lemma.frame_strings()))
OUTPUT >>>
Lemma('scat.v.01.scat') [1, 2, 22]
Something scat | Somebody scat | Somebody scat PP
Lemma('scat.v.01.run') [1, 2, 22]
Something run | Somebody run | Somebody run PP
Lemma('scat.v.01.scarper') [1, 2, 22]
Something scarper | Somebody scarper | Somebody scarper PP
Lemma('scat.v.01.turn_tail') [1, 2, 22]
Something turn_tail | Somebody turn_tail | Somebody turn_tail PP
Lemma('scat.v.01.lam') [1, 2, 22]
Something lam | Somebody lam | Somebody lam PP
Lemma('scat.v.01.run_away') [1, 2, 22]
Something run_away | Somebody run_away | Somebody run_away PP
Lemma('scat.v.01.hightail_it') [1, 2, 22]
Something hightail_it | Somebody hightail_it | Somebody hightail_it PP
Lemma('scat.v.01.bunk') [1, 2, 22]
Something bunk | Somebody bunk | Somebody bunk PP
Lemma('scat.v.01.head_for_the_hills') [1, 2, 22]
Something head_for_the_hills | Somebody head_for_the_hills | Somebody head_for_the_hills PP
Lemma('scat.v.01.take_to_the_woods') [1, 2, 22]
Something take_to_the_woods | Somebody take_to_the_woods | Somebody take_to_the_woods PP
Lemma('scat.v.01.escape') [1, 2, 22]
Something escape | Somebody escape | Somebody escape PP
Lemma('scat.v.01.fly_the_coop') [1, 2, 22]
Something fly_the_coop | Somebody fly_the_coop | Somebody fly_the_coop PP
Lemma('scat.v.01.break_away') [1, 2, 22]
Something break_away | Somebody break_away | Somebody break_away PP
The example above demonstrates how to find the verb frames of every lemma of a verb sense. The second sense of the verb “run” resolves to the synset “scat.v.01”, which contains variations and synonyms such as “turn_tail”, “scat”, “break_away”, “escape”, and other contextual synonyms. The verb frames are helpful to find the possible word replacements and contextual connections between the sentences. If the specific verb can be replaced by one of the lemmas sharing the same verb frame without changing the meaning of the sentence or the context of the paragraph, it means that the verb frames are used properly.
How to Find Similar Words for a targeted Word with NLTK WordNet and Python?
To measure how similar two words are with NLTK WordNet and Python, methods such as “lch_similarity()” and “path_similarity()” are used. NLTK WordNet measures the word similarity based on the hypernym and hyponym taxonomy. The distance between the words within the hypernym and hyponym paths represents the similarity level between them. The similarity types and methods that can be used within NLTK WordNet to measure the word similarity are listed below.
- Resnik Similarity with “synset1.res_similarity(synset2, ic)”, where “ic” is an information content dictionary.
- Wu-Palmer Similarity with “synset1.wup_similarity(synset2)”.
- Leacock-Chodorow Similarity with “synset1.lch_similarity(synset2)”.
- Path Similarity with “synset1.path_similarity(synset2)”.
Example measurement of the word similarity with NLTK WordNet can be found below.
wordnet.synset("dog.n.01").path_similarity(wordnet.synset("cat.n.01"))
OUTPUT >>>
0.2
The path similarity score within NLTK WordNet represents how close two word senses are in the taxonomy. The path similarity score is between 0 and 1: a score close to 0 means that the senses are far apart, while 1 means that the two senses are identical. Thus, the example measurement for word similarity with Python above shows that the first noun senses of “cat” and “dog” have a path similarity of 0.2.
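To make the 0.2 score less abstract, the short sketch below reproduces the arithmetic in plain Python. NLTK computes path similarity as 1 / (shortest path length + 1). The function name is hypothetical, and the path length of 4 edges between “dog.n.01” and “cat.n.01” is an assumption inferred from the score above, not a value printed by the article.

```python
def path_similarity_from_distance(distance):
    """Path similarity for a shortest path of `distance` edges."""
    return 1 / (distance + 1)


print(path_similarity_from_distance(0))  # 1.0, a sense compared with itself
print(path_similarity_from_distance(4))  # 0.2, matches dog.n.01 vs cat.n.01
```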
The “Leacock-Chodorow Similarity” uses the shortest path between two senses in the hypernym and hyponym taxonomy, together with the maximum depth of that taxonomy. The shorter the path is relative to the depth, the higher the score, so the score is not limited to the range between 0 and 1. Below, you can find example usage of the Leacock-Chodorow Similarity with NLTK WordNet.
wordnet.synset("dog.n.01").lch_similarity(wordnet.synset("cat.n.01"))
OUTPUT >>>
2.0281482472922856
The example above shows the score of the word similarity based on the Leacock-Chodorow Similarity with NLTK WordNet. Finding similar words with Python and NLTK WordNet is a broad topic that can be handled with formulas like “-log(p/2d)”, where “p” is the length of the shortest path and “d” is the depth of the taxonomy, with other similarity measurements, or with root node attributes. Word similarity is useful for evaluating word predictions and replacements. An NLP algorithm can replace the words based on their similarity to check the context shifts. If the context shifts too much after a replacement, it means that the content is more relevant to the first context candidate. Word similarity with NLTK can also be used for relevance calculation or Information Retrieval systems.
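As a worked instance of the “-log(p/2d)” formula, the sketch below recomputes the dog and cat score in plain Python. The function name is hypothetical. The inputs are assumptions chosen because they reproduce the score printed above: a shortest path of 4 edges (counted as 5 nodes for “p”) and a noun taxonomy depth of 19 for “d”.

```python
import math


def lch_from_path(path_edges, taxonomy_depth):
    """Leacock-Chodorow score: -log(p / (2 * d)), with p counted in nodes."""
    p = path_edges + 1  # nodes on the shortest path, endpoints included
    return -math.log(p / (2 * taxonomy_depth))


# Assumed inputs: 4 edges between dog.n.01 and cat.n.01, noun depth 19.
print(round(lch_from_path(4, 19), 4))  # 2.0281
```

Because the depth differs between parts of speech, scores of this kind are only comparable between senses of the same part of speech.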
How to Find Topic Domains of a Word with NLTK WordNet and Python?
NLTK WordNet has a “topic domain” for a specific word sense. The topic domain shows the word’s context and the knowledge domain it belongs to. NLTK WordNet can therefore be used to understand the topicality of a piece of content and its topical relevance to another. The topic domains can be calculated for all of the documents from a website, for a book, or for all of the sentences in a piece of content, word by word. The dominant topic domain can signal the main context of the document. For a search engine, NLTK WordNet, or another semantic network with a proper dataset, is therefore useful.
To find the topic domains of a word with NLTK WordNet and Python, follow the steps below.
- Import “wordnet” from “nltk.corpus” to find the topic domain.
- Choose an example word or phrase whose topic domain will be looked up.
- Use the “synset” method of WordNet for the chosen word.
- Use the “topic_domains()” method of the “synset” object.
- Read the output of the “topic_domains()” example.
Example usage of the NLTK WordNet to find the topic domain of a word can be found below.
wordnet.synset('code.n.03').topic_domains()
OUTPUT >>>
[Synset('computer_science.n.01')]
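The idea of a dominant topic domain for a whole document can be sketched as a simple count over the words' domains. The helper names below (“wordnet_topic_domains”, “dominant_topic_domain”, “toy_lookup”) are hypothetical, and the toy lookup table is made-up illustration data so that the example runs without the WordNet corpus; with NLTK and its WordNet data installed, “wordnet_topic_domains” could be passed as the lookup instead.

```python
from collections import Counter
from typing import Callable, Iterable, List, Optional


def wordnet_topic_domains(word: str) -> List[str]:
    """Topic domain names over all WordNet senses of a word (needs NLTK data)."""
    from nltk.corpus import wordnet  # imported lazily; not needed for the toy run
    names: List[str] = []
    for synset in wordnet.synsets(word):
        names.extend(domain.name() for domain in synset.topic_domains())
    return names


def dominant_topic_domain(
    words: Iterable[str], lookup: Callable[[str], List[str]]
) -> Optional[str]:
    """Return the most frequent topic domain across the words, or None."""
    counts: Counter = Counter()
    for word in words:
        counts.update(lookup(word))
    return counts.most_common(1)[0][0] if counts else None


def toy_lookup(word: str) -> List[str]:
    """Made-up stand-in for WordNet, used only to keep the example self-contained."""
    toy = {
        "code": ["computer_science.n.01"],
        "compiler": ["computer_science.n.01"],
        "court": ["law.n.01"],
    }
    return toy.get(word, [])


print(dominant_topic_domain(["code", "compiler", "court", "tree"], toy_lookup))
# computer_science.n.01
```

Counting over every sense of every word is a crude choice; a more careful version would first disambiguate each word to a single synset.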
The example above shows that the topic domain of the third noun sense of the word “code” is “computer science”. One of the problems with diagnosing topic domains for words from NLTK WordNet is that the topic modeling and hierarchy might not be detailed enough. To make up for it, WordNet Domains can be used. To use WordNet Domains, it is necessary to apply with an email address and accept the Creative Commons license. With WordNet Domains, more than 400 topic domains can be explored. To print the topic domains within WordNet Domains, use the code example below.
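from collections import defaultdict
from nltk.corpus import wordnet as wn

# Map each domain to its synset IDs, and each synset ID to its domains.
domain2synsets = defaultdict(list)
synset2domains = defaultdict(list)

# Each line of the file holds a synset ID, a tab, and space-separated domains.
with open('wn-domains-3.2-20070223', 'r') as f:
    for line in f:
        ssid, doms = line.strip().split('\t')
        doms = doms.split()
        synset2domains[ssid] = doms
        for d in doms:
            domain2synsets[d].append(ssid)

# Print every NLTK WordNet synset that has at least one domain label.
for ss in wn.all_synsets():
    ssid = str(ss.offset()).zfill(8) + "-" + ss.pos()
    if synset2domains[ssid]:
        print(ss, ssid, synset2domains[ssid])

# Print the first three synset IDs of each domain.
for dom in sorted(domain2synsets):
    print(dom, domain2synsets[dom][:3])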
OUTPUT >>>
acoustics ['02584104-n', '02584268-n', '02584812-n']
administration ['00045146-n', '00556291-n', '00556427-n']
agriculture ['00429923-n', '00866914-n', '00996641-n']
anatomy ['00037703-n', '00133136-n', '00353921-n']
animal_husbandry ['00792299-n', '00860674-n', '00861073-n']
animals ['00012748-n', '00962111-n', '01153586-n']
anthropology ['00210724-n', '00211160-n', '00211365-n']
applied_science ['03985477-n', '04266345-n', '04352832-n']
archaeology ['00040040-n', '01328460-n', '01891224-n']
archery ['00423600-n', '09181370-n', '09608089-n']
architecture ['00577011-n', '00871831-n', '02578017-n']
art ['00258392-n', '00573836-n', '00672395-n']
artisanship ['00869978-n', '00870256-n', '00870389-n']
astrology ['03407158-n', '04436236-n', '05444230-n']
astronautics ['00280016-n', '02827728-n', '02966235-n']
astronomy ['00045801-n', '02655846-n', '02656041-n']
athletics ['00410707-n', '00410925-n', '00414898-n']
atomic_physic ['02657581-n', '02685588-n', '02736848-n']
aviation ['00047580-n', '00047871-n', '00159777-n']
badminton ['00455850-n', '00456227-n', '00458699-n']
Finding topics within documents via the topic domains of their words can be done in a better way by using WordNet Domains. The output above shows the first domains printed by the WordNet Domains example with Python.
The Google search engine has an understanding of topicality and topic domains similar to that of NLTK WordNet and WordNet Domains. The Google NLP API gives more than 100 topics for a specific section. In this context, reading the “Google Knowledge Graph API and Python” tutorial and guideline is beneficial to see the topics, entities, and their classification based on the text.
To learn more, read the WordNet Domains Guideline.
How to Find Region Domains of a Word with NLTK WordNet and Python?
A region domain represents the region in which a specific word is used. It is useful to see the cultural affinity of the word. A region domain can signal the topic domain, but the difference is that the region domain represents the geographical and cultural category of the word rather than its main topic. To find the region domain with NLTK WordNet, the “region_domains()” method is used. The instructions to find the region domains of a word with NLTK WordNet are below.
- Import “wordnet” from “nltk.corpus” to find the region domain of a word.
- Choose a word to find the region domains.
- Use “wordnet.synset()” for the example word.
- Use the “region_domains()” method.
An example of finding region domains with NLTK WordNet and Python can be found below.
wordnet.synset('pukka.a.01').region_domains()
OUTPUT >>>
[Synset('india.n.01')]
The example above shows that the word “Pukka” as an adjective has India as the region domain. The same process can be implemented for all of the words from a document to find the overall region signals of a document with NLTK WordNet.
The difference between the topic domain and the region domain is that the topic domain focuses on the meaning of the word, while the region domain focuses on the word’s geography and culture. Similarly, the “usage domain” focuses on which language style uses the specific word. For instance, a word can belong to medicine as a topic and Japan as a region while being used in scientific language. Thus, NLTK WordNet provides information for exploring language tonality, region signals, and topicality. The next section will demonstrate an example of NLTK WordNet usage domains.
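The three kinds of domain can be gathered side by side for a single synset. The sketch below is one possible way to do that: “domain_profile” is a hypothetical helper name, and the stand-in objects only mimic the “topic_domains()”, “region_domains()”, and “usage_domains()” methods so that the example runs without the WordNet corpus.

```python
from types import SimpleNamespace
from typing import Dict, List


def domain_profile(synset) -> Dict[str, List[str]]:
    """Collect the topic, region, and usage domain names of one synset."""
    return {
        "topic": [d.name() for d in synset.topic_domains()],
        "region": [d.name() for d in synset.region_domains()],
        "usage": [d.name() for d in synset.usage_domains()],
    }


# Hypothetical stand-ins that imitate an NLTK synset. With NLTK installed,
# a real synset can be passed instead, for example:
#   domain_profile(wordnet.synset("pukka.a.01"))
def _fake_domain(name: str) -> SimpleNamespace:
    return SimpleNamespace(name=lambda: name)


fake_synset = SimpleNamespace(
    topic_domains=lambda: [],
    region_domains=lambda: [_fake_domain("india.n.01")],
    usage_domains=lambda: [],
)

profile = domain_profile(fake_synset)
print(profile)
# {'topic': [], 'region': ['india.n.01'], 'usage': []}
```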
How to Find Usage Domains of a Word with NLTK WordNet and Python?
The usage domain describes the language style in which a word is used. A word can be used by scientists, or it can be used in slang. The usage domain can be used to learn a piece of content’s authenticity, its target audience, or the author’s writing character. In this context, the register of a written text can be seen. To find the usage domain of a word with NLTK WordNet, the “usage_domains()” method should be used. The instructions for finding usage domains with NLTK WordNet are below.
- Import “wordnet” from “nltk.corpus”.
- Choose a word to find the usage domains.
- Use “wordnet.synset()” for the word.
- Use the “usage_domains()” method.
Example usage for the NLTK WordNet usage domain finding is below.
wn.synset('fuck.n.01').usage_domains()
OUTPUT >>>
[Synset('obscenity.n.02'), Synset('slang.n.02')]
The example above, which finds the usage domain of a word with NLTK WordNet and Python, shows a word whose usage domains are “obscenity” and “slang”. NLTK WordNet usage domains can be a good signal of the overall content character of a website, a document, or a book.
How to Use WordNet for other languages with Python NLTK?
To use NLTK WordNet with another language, a language code is passed to methods such as “lemma_names()” or “wordnet.lemmas()”, and “wordnet.langs()” lists the languages that are available. Three-letter ISO-639 language codes, such as “jpn” for Japanese and “ita” for Italian, identify the language that will be used. Below, you can find example usage of NLTK WordNet for other languages to find synonyms or antonyms along with other lexical relations with Python.
wordnet.synset("love.v.01").lemma_names("jpn")
OUTPUT >>>
['いとおしむ',
'いとおしがる',
'傾慕+する',
'好く',
'寵愛+する',
'愛しむ',
'愛おしむ',
'愛好+する',
'愛寵+する',
'愛慕+する',
'慕う',
'ほれ込む']
The example above finds the Japanese synonyms for the word “love” with NLTK WordNet and Python. NLTK WordNet can also be used to find the synonyms and lemmas of English words starting from words in other languages. The example below shows how to find the English senses of the Italian word “macchina”.
wordnet.lemmas('macchina', lang='ita')
OUTPUT >>>
[Lemma('car.n.01.macchina'),
Lemma('locomotive.n.01.macchina'),
Lemma('machine.n.01.macchina'),
Lemma('machine.n.02.macchina')]
Using words from other languages to find synonyms in English via NLTK WordNet is useful to see the possible connections between English and those languages. A word from Italian can have different types of lexical relations within English. Cross-language synonym finding shows an understanding of semantics in a language-agnostic way. Thus, using NLTK WordNet for multi-language applications such as search engines is useful to see a topic with more layers.
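The step from a foreign word to its English synonyms can be sketched as a small helper that follows each lemma back to its synset. “english_lemmas_for” is a hypothetical name, and the stand-in reader below only imitates “wordnet.lemmas()” with illustrative data so that the example runs without NLTK; with NLTK and its multilingual WordNet data installed, the real “wordnet” object could be passed as the reader.

```python
from types import SimpleNamespace
from typing import Dict, List


def english_lemmas_for(word: str, lang: str, reader) -> Dict[str, List[str]]:
    """Map each synset of a foreign-language word to its English lemma names."""
    result: Dict[str, List[str]] = {}
    for lemma in reader.lemmas(word, lang=lang):
        synset = lemma.synset()
        result[synset.name()] = synset.lemma_names()
    return result


# Hypothetical stand-in for the WordNet reader, holding illustrative data only.
def _fake_lemma(synset_name: str, english: List[str]) -> SimpleNamespace:
    synset = SimpleNamespace(name=lambda: synset_name, lemma_names=lambda: english)
    return SimpleNamespace(synset=lambda: synset)


def _fake_lemmas(word: str, lang: str) -> list:
    if (word, lang) == ("macchina", "ita"):
        return [_fake_lemma("car.n.01", ["car", "auto", "automobile"])]
    return []


fake_reader = SimpleNamespace(lemmas=_fake_lemmas)

print(english_lemmas_for("macchina", "ita", fake_reader))
# {'car.n.01': ['car', 'auto', 'automobile']}
```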
Other NLTK tasks for NLP that are related to NLTK WordNet can be found below.
- NLTK Tokenize is related to NLTK WordNet because every word that is tokenized via NLTK can be audited with its hypernyms, hyponyms, or synonyms within WordNet.
- NLTK Lemmatize is related to NLTK WordNet as an NLP task because it provides the different variations and versions of the same word to understand its context.
- NLTK Stemming is related to NLTK WordNet as an NLP task because it gives the different stemmed versions of words.
- NLTK Part of Speech Tag is related to NLTK WordNet as an NLP task because it gives the different roles of a word within a sentence while preserving its context.
Terms related to WordNet from NLTK comprise lexical relations and semantic relevance along with similarity. The Natural Language Toolkit’s WordNet is connected to the terms below.
- FrameNet: FrameNet is connected to NLTK WordNet because it involves semantic role labels based on the predicates of sentences and their meanings.
- Lexical Relations: Lexical relations are connected to WordNet NLTK because they provide lexical similarities and connections between different terms and concepts.
- Semantic Relevance: Semantic Relevance is connected to NLTK WordNet because it shows how a word is relevant to another one based on semantic relations.
- Semantic Similarity: Semantic Similarity is connected to NLTK because it provides similarity between two words based on their contexts.
- Hypernyms: Hypernyms are connected to WordNet because they are the more general, superordinate terms above a word.
- Hyponyms: Hyponyms are connected to WordNet because they are the more specific, subordinate terms below a word.
- Synonyms: Synonyms are connected to WordNet because they are the other words that have the same meaning.
- Antonyms: Antonyms are connected to WordNet because they are the words with the opposite meaning of a word.
- Holonyms: Holonyms are connected to WordNet because they name the whole of a thing.
- Meronyms: Meronyms are connected to WordNet because they name the sub-parts of a thing.
- Partonym: Partonym is connected to WordNet because it involves the change of a word to another one with different suffixes or prefixes.
- Polysemy: Polysemy is connected to WordNet because it covers the same phrases with different meanings.
- Natural Language Processing is connected to WordNet because it is the process of understanding human language with machines.
- Semantic Search is connected to WordNet because it provides meaningful connections between different words within a semantic map.
- Semantic SEO is connected to WordNet because WordNet can be used for better content writing practices.
- Semantic Web is connected to WordNet because semantic web behavior patterns have meaningful word relations.
- Named Entity Recognition is connected to WordNet because it provides recognition of the named entities.
Last Thoughts on NLTK WordNet and Holistic SEO
NLTK WordNet and Holistic SEO should be used together. Holistic SEO covers every vertical and angle of search engine optimization. NLTK WordNet can provide different contexts for a specific word so that an SEO can check the possible contextual connections between different phrases. NLTK WordNet is a prominent tool for understanding text, alongside text cleaning and text processing. Google and other semantic search engines such as Microsoft Bing can use synonyms, antonyms, and hypernyms or hyponyms for query rewriting. A search engine can process a query by tokenizing it and replacing its words with related words from different contexts. NLTK WordNet can help in understanding the topical relevance of a specific content piece to a query or query cluster. Based on this, NLTK WordNet and Holistic SEO should be taken and processed together.
The NLTK Guide will continue to be updated regularly based on the new NLP and NLTK updates.
Owner and Founder at Holistic SEO & Digital
Koray Tuğberk GÜBÜR is the CEO and Founder of Holistic SEO & Digital where he provides SEO Consultancy, Web Development, Data Science, Web Design, and Search Engine Optimization services with strategic leadership for the agency’s SEO Client Projects. Koray Tuğberk GÜBÜR performs SEO A/B Tests regularly to understand the Google, Microsoft Bing, and Yandex like search engines’ algorithms, and internal agenda. Koray uses Data Science to understand the custom click curves and baby search engine algorithms’ decision trees. Tuğberk used many websites for writing different SEO Case Studies. He published more than 10 SEO Case Studies with 20+ websites to explain the search engines. Koray Tuğberk started his SEO Career in 2015 in the casino industry and moved into the white-hat SEO industry. Koray worked with more than 700 companies for their SEO Projects since 2015. Koray used SEO to improve the user experience, and conversion rate along with brand awareness of the online businesses from different verticals such as retail, e-commerce, affiliate, and b2b, or b2c websites. He enjoys examining websites, algorithms, and search engines.
Description of the presentation by individual slides:
Slide 1
Word Meaning
Lecture # 6
Grigoryeva M.
Slide 2
Word Meaning
Approaches to word meaning
Meaning and Notion (понятие)
Types of word meaning
Types of morpheme meaning
Motivation
Slide 3
Each word has two aspects:
the outer aspect (its sound form): cat
the inner aspect (its meaning): long-legged, furry animal with sharp teeth and claws
Slide 4
Sound and meaning do not always constitute a constant unit even in the same language
EX. a temple
a part of a human head
a large church
Slide 5
Semantics (Semasiology)
is a branch of lexicology which studies the meaning of words and word equivalents
Slide 6
Approaches to Word Meaning
The Referential (analytical) approach
The Functional (contextual) approach
The Operational (information-oriented) approach
Slide 7
The Referential (analytical) approach
formulates the essence of meaning by establishing the interdependence between words and the things or concepts they denote
distinguishes between three components closely connected with meaning:
the sound-form of the linguistic sign,
the concept,
the actual referent
Slide 8
Basic Triangle
concept (thought, reference) – the thought of the object that singles out its essential features
referent – object denoted by the word, part of reality
sound-form (symbol, sign) – linguistic sign
EX. concept – flower, sound-form – [rәuz], referent
Slide 9
In what way does meaning correlate with each element of the triangle?
In what relation does meaning stand to each of them?
Slide 10
Meaning and Sound-form
are not identical
different sound-forms BUT the same meaning
EX. dove – [dΛv] English
[golub’] Russian
[taube] German
Slide 11
Meaning and Sound-form
nearly identical sound-forms have different meanings in different languages
EX. [kot] Russian – a male cat
[kot] English – a small bed for a child
identical sound-forms have different meanings (homonyms)
EX. knight [nait]
night [nait]
Slide 12
Meaning and Sound-form
even considerable changes in sound-form do not affect the meaning
EX. Old English lufian [luvian] – love [lΛv]
Slide 13
Meaning and Concept
concept is a category of human cognition
concept is abstract and reflects the most common and typical features of different objects and phenomena in the world
meanings of words are different in different languages
Slide 14
Meaning and Concept
identical concepts may have different semantic structures in different languages
EX. concept “a building for human habitation”: English HOUSE, Russian ДОМ
in Russian ДОМ also means “fixed residence of family or household”, which in English is HOME
Slide 15
Meaning and Referent
one and the same object (referent) may be denoted by more than one word of a different meaning
cat
pussy
animal
tiger
Slide 16
Meaning
is not identical with any of the three points of the triangle –
the sound form,
the concept,
the referent
BUT
is closely connected with them.
Slide 17
Functional Approach
studies the functions of a word in speech
meaning of a word is studied through its relations with other linguistic units
EX. to move (we move, move a chair)
movement (movement of smth, slow movement)
The distribution (the position of the word in relation to others) of the verb to move and the noun movement is different, as they belong to different classes of words and their meanings are different
Slide 18
Operational approach
is centered on defining meaning through its role in the process of communication
EX. John came at 6
Besides the direct meaning, the sentence may imply that:
He was late
He failed to keep his promise
He was punctual as usual
He came but he didn’t want to
The implication depends on the concrete situation
Slide 19
Lexical Meaning and Notion
Notion denotes the reflection in the mind of real objects
Notion is a unit of thinking
Lexical meaning is the realization of a notion by means of a definite language system
Word is a language unit
Slide 20
Lexical Meaning and Notion
Notions are international, especially among nations of the same cultural level
Meanings are nationally limited
EX. GO (E) – ИДТИ (R)
“To move”
BUT
To GO by bus (E) – ЕХАТЬ (R)
EX. Man – мужчина, человек
Она – хороший человек (R)
She is a good person (E)
Slide 21
Types of Meaning
grammatical meaning
lexico-grammatical meaning
lexical meaning: denotational, connotational
Slide 22
Grammatical Meaning
component of meaning recurrent in identical sets of individual forms of different words
EX. girls, winters, toys, tables – grammatical meaning of plurality
asked, thought, walked – meaning of past tense
Slide 23
Lexico-grammatical meaning (part-of-speech meaning)
is revealed in the classification of lexical items into:
major word classes (N, V, Adj, Adv)
minor ones (artc, prep, conj)
words of one lexico-grammatical class have the same paradigm
Slide 24
Lexical Meaning
is the meaning proper to the given linguistic unit in all its forms and distributions
EX. go – goes – went
lexical meaning – process of movement
Slide 25
PRACTICE
Group the words into 3 columns according to the grammatical, lexical or part-of-speech meaning
Boy’s, nearest, at, beautiful,
think, man, drift, wrote,
tremendous, ship’s, the most beautiful,
table, near, for, went, friend’s,
handsome, thinking, boy,
nearer, thought, boys,
lamp, go, during.
Slide 26
Grammatical
The case of nouns: boy’s, ship’s, friend’s
The degree of comparison of adj: nearest, the most beautiful
The tense of verbs: wrote, went, thought
Lexical
Think, thinking, thought
Went, go
Boy’s, boy, boys
Nearest, near, nearer
At, for, during (“time”)
Beautiful, the most beautiful
Part-of-speech
Nouns, verbs, adj, prep
Slide 27
Aspects of Lexical meaning
The denotational aspect
The connotational aspect
The pragmatic aspect
Slide 28
Denotational Meaning
“denote” – “to be a sign of, stand as a symbol for”
establishes the correlation between the name and the object
makes communication possible
EX. booklet
“a small thin book that gives info about smth”
Slide 29
PRACTICE
Explain denotational meaning
A lion-hunter
To have a heart like a lion
To feel like a lion
To roar like a lion
To be thrown to the lions
The lion’s share
To put your head in lion’s mouth
Slide 30
PRACTICE
A lion-hunter
A host that seeks out celebrities to impress guests
To have a heart like a lion
To have great courage
To feel like a lion
To be in the best of health
To roar like a lion
To shout very loudly
To be thrown to the lions
To be criticized strongly or treated badly
The lion’s share
Much more than one’s share
To put your head in lion’s mouth
Slide 31
Connotational Meaning
reflects the attitude of the speaker towards what he speaks about
it is optional – a word either has it or not
Connotation gives additional information and includes:
The emotive charge EX. Daddy (for father)
Intensity EX. to adore (for to love)
Imagery EX. to wade through a book
“to walk with an effort”
Slide 32
PRACTICE
Give possible interpretation of the sentences
She failed to buy it and felt a strange pang.
Don’t be afraid of that woman! It’s just barking!
He got up from his chair moving slowly, like an old man.
The girl went to her father and pulled his sleeve.
He was longing to begin to be generous.
She was a woman with shiny red hands and work-swollen finger knuckles.
Slide 33
PRACTICE
Give possible interpretation of the sentences
She failed to buy it and felt a strange pang.
(pain – dissatisfaction that makes her suffer)
Don’t be afraid of that woman! It’s just barking!
(make loud sharp sound – the behavior that implies that the person is frightened)
He got up from his chair moving slowly, like an old man.
(to go at slow speed – was suffering or was ill)
The girl went to her father and pulled his sleeve.
(to move smth towards oneself – to try to attract smb’s attention)
He was longing to begin to be generous.
(to start doing – hadn’t been generous before)
She was a woman with shiny red hands and work-swollen finger knuckles.
(colour – a labourer involved in physical work, constant contact with water)
Slide 34
The pragmatic aspect of lexical meaning
the situation in which the word is uttered,
the social circumstances (formal, informal, etc.),
social relationships between the interlocutors (polite, rough, etc.),
the type and purpose of communication (poetic, official, etc.)
EX. horse (neutral)
steed (poetic)
nag (slang)
gee-gee (baby language)
Slide 35
PRACTICE
State what image underlies the meaning
I heard what she said but it didn’t sink into my mind.
You should be ashamed of yourself, crawling to the director like that.
They seized on the idea.
Bill, chasing some skirt again?
I saw him dive into a small pub.
Why are you trying to pin the blame on me?
He only married her for her dough.
Slide 36
PRACTICE
State what image underlies the meaning
I heard what she said but it didn’t sink into my mind.
(to understand completely)
You should be ashamed of yourself, crawling to the director like that.
(to behave humbly in order to win favour)
They seized on the idea.
(to be eager to take and use)
Bill, chasing some skirt again?
(a girl)
I saw him dive into a small pub.
(to enter suddenly)
Why are you trying to pin the blame on me?
(to blame smb unfairly)
He only married her for her dough.
(money)
Slide 37
Types of Morpheme Meaning
lexical
differential
functional
distributional
Slide 38
Lexical Meaning in Morphemes
root-morphemes that are homonymous to words possess lexical meaning
EX. boy – boyhood – boyish
affixes have lexical meaning of a more generalized character
EX. -er “agent, doer of an action”
Slide 39
Lexical Meaning in Morphemes
has denotational and connotational components
EX. -ly, -like, -ish –
denotational meaning of similarity
womanly, womanish
connotational component –
-ly (positive evaluation), -ish (derogatory): женственный – женоподобный
Slide 40
Differential Meaning
a semantic component that serves to distinguish one word from all others containing identical morphemes
EX. cranberry, blackberry, gooseberry
Slide 41
Functional Meaning
found only in derivational affixes
a semantic component which serves to refer the word to a certain part of speech
EX. just, adj. – justice, n.
Slide 42
Distributional Meaning
the meaning of the order and the arrangement of morphemes making up the word
found in words containing more than one morpheme
different arrangement of the same morphemes would make the word meaningless
EX. sing- + -er = singer,
-er + sing- = ?
Slide 43
Motivation
denotes the relationship between the phonetic or morphemic composition and structural pattern of the word on the one hand, and its meaning on the other
can be phonetical
morphological
semantic
Slide 44
Phonetical Motivation
when there is a certain similarity between the sounds that make up the word and those produced by animals, objects, etc.
EX. sizzle, boom, splash, cuckoo
Slide 45
Morphological Motivation
when there is a direct connection between the structure of a word and its meaning
EX. finger-ring – ring-finger
A direct connection between the lexical meaning of the component morphemes
EX. think – rethink “thinking again”
Slide 46
Semantic Motivation
based on co-existence of direct and figurative meanings of the same word
EX. a watchdog – “a dog kept for watching property”
a watchdog – “a watchful human guardian” (semantic motivation)
Slide 48
Analyze the meaning of the words.
Define the type of motivation
a) morphologically motivated
b) semantically motivated
Driver
Leg
Horse
Wall
Hand-made
Careless
Piggish
Slide 49
Analyze the meaning of the words.
Define the type of motivation
a) morphologically motivated
b) semantically motivated
Driver
Someone who drives a vehicle
morphologically motivated
Leg
The part of a piece of furniture such as a table
semantically motivated
Horse
A piece of equipment shaped like a box, used in gymnastics
semantically motivated
Slide 50
Wall
Emotions or behavior preventing people from feeling close
semantically motivated
Hand-made
Made by hand, not machine
morphologically motivated
Careless
Not taking enough care
morphologically motivated
Piggish
Selfish
semantically motivated
Slide 51
I heard what she said but it didn’t sink in my mind
“go down to the bottom” –
“to be accepted by mind” (semantic motivation)
Why are you trying to pin the blame on me?
“fasten smth somewhere using a pin” –
“to blame smb” (semantic motivation)
I was following the man when he dived into a pub.
“jump into deep water” –
“to enter into suddenly” (semantic motivation)
You should be ashamed of yourself, crawling to the director like that
“to move along on hands and knees close to the ground” –
“to behave very humbly in order to win favor” (semantic motivation)