The discussion about the affix -er addressed a question about word formation: how we use that particular affix to build up complex words when we attach it to a root. Matters of word formation are the traditional object of inquiry of morphology. At the same time that we inspected the internal structure of words, however, we needed to consider the meaning of the different parts of the word, and how they combine to compose the meaning of the whole. The study of meaning is the subject matter of semantics. Morphology and semantics are different levels of organization of the grammar of a language. Each level has its own specific sets of units (e.g. roots and affixes in morphology) and ways to combine those units into larger structures. While the general principles for conducting linguistic analysis are the same across levels, it is important to learn about the particular units and structures at each level. That is the goal of these notes.
Besides morphology and semantics, there are other levels of analysis in linguistics. The study of sounds and sound patterns in human languages is taken on by phonology and phonetics. We will have very little to say about those levels in this course. We will spend quite some time, however, talking about syntax, the level of analysis that examines how words are put together to form sentences and phrases. Morphology, syntax, and semantics, then, will constitute our object of study.
These levels are distinct from each other, but they are not isolated from each other. In recent years, quite a lot of research in linguistics has focused on their interactions, or interfaces. For instance, consider the nature of the following words:
dreamcatcher | songwriter |
hairdryer | kingmaker |
nutcracker |
We recognize the agentive affix -er in all of these words, again. The remainder of the word, however, is not a single root, but two: a compound. There are examples of compounds that do not require an affix, like teapot or bookshelf. What is interesting about the -er compounds is that the second root is an action word (e.g. catch), and the first root is the thing or person that the action is oriented towards. In a way it is as if an expression like dreamcatcher is the packaging of a phrase like ‘the thing that catches dreams’ into a single word. We cannot analyze the structure of -er compounds without making reference to the phrases the compounds seem to originate from.
To develop a sound analysis of the -er compounds, then, we need to introduce some terms from syntax. Words like catch, dry, crack, write, and make, which we informally characterized as expressing an action, are more explicitly treated as the predicate of a sentence. The thing or person performing the action denoted by the -er compound corresponds to the subject of a sentence, and the first root in the compound to the object of a sentence. Subject, object, and predicate are grammatical functions (or relations). They pertain to the level of syntax, not morphology. Notice that it would be inappropriate to call the first root in the -er compounds the ‘object’ of the word, then, since ‘object’ is not a concept we apply to morphological analysis. Hence the key notion of a correspondence between levels. For instance, we can diagram the correspondences between syntax and morphology in the case of -er compounds as follows:
[SheSUBJ(1) writesPRED(2) songsOBJ(3)]SENT
[songROOT(3) – writ(e)ROOT(2) – erAFFIX]WORD(1)
Interface phenomena will always require us to define, identify, and justify correspondences between elements from different levels.
Correspondences between levels may help us arrive at a better understanding of linguistic phenomena, since in many cases a distinction at one level has the function of making a distinction at another level perceptible or understandable. After all, language is a system for making the imperceptible (meaning) into something that our senses can capture and transmit to others (sounds or signs). But from a methodological point of view it is crucial to learn to identify the elements of a level without reference to the elements of other levels. I will refer to this principle as the principle of analytical autonomy.
For instance, when we found out that -er was an affix, we relied on the shared aspects of meaning between the words ending in -er, e.g. ‘doer of X’ (where X is a root that corresponds to a predicate). But we could have used another line of reasoning that does not require us to mention semantics (or syntax) at all. We could have justified our analysis by stating that words like dreamer and dancer are formally equivalent (i.e. they can both be subjects or objects), and that dream and dance are both words in our language; therefore the common piece -er must be an affix.
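The second half of that formal argument (removing -er leaves something that is itself a word) can be sketched in a few lines of Python. This is only an illustration: the `LEXICON` set is a made-up toy word list, the function name is hypothetical, and the spelling rule for a silent final e is an assumption added to cope with dancer.

```python
# Toy lexicon: an assumption for illustration, not a real word list.
LEXICON = {"dream", "dance", "write", "catch",
           "dreamer", "dancer", "writer", "catcher", "water"}

def candidate_roots(word, ending="er"):
    """Return lexicon words that remain once `ending` is removed from `word`."""
    if not word.endswith(ending):
        return []
    base = word[: -len(ending)]
    # Allow for a silent final e in the spelling (dance + er -> dancer).
    return [form for form in (base, base + "e") if form in LEXICON]

for w in ("dreamer", "dancer", "water"):
    print(w, candidate_roots(w))
```

No meaning is consulted anywhere: dreamer and dancer each leave a word behind, while water does not, so under this test only the first two are candidates for containing the affix.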
This line of argumentation is more easily grasped when we approach a dataset from a language we may not know, and in which meaning cannot help guide our analysis. Consider the following words:
vhanwa | vhakhounwa | uḑonwa |
ufunza | ukhoufunza | uḑofunza |
ushuma | vhakhoushuma | vhaḑoshuma |
vhavhala | ukhouvhala | vhaḑovhala |
We can easily see that there are four repeated endings: -nwa, -funza, -shuma, and -vhala. Also, words can begin with either vha- or u-, and these two initial components may be optionally followed by -khou- or -ḑo-. We have thus identified the pieces that make up these words by reference to their mutual co-occurrence possibilities, without any consideration of meaning whatsoever.
We will refer to the pieces that make up a word, whether roots or affixes, as morphs. The table below has the words above segmented into their constituent morphs, with a translation of their meaning (these examples are from Venda, a Bantu language spoken in the Transvaal region of the Republic of South Africa).
vha-nwa | ‘they drink’ |
u-funza | ‘he teaches’ |
u-shuma | ‘he works’ |
vha-vhala | ‘they read’ |
vha-khou-nwa | ‘they are drinking’ |
u-khou-funza | ‘she is teaching’ |
vha-khou-shuma | ‘they are working’ |
u-khou-vhala | ‘she is reading’ |
u-ḑo-nwa | ‘she will drink’ |
u-ḑo-funza | ‘she will teach’ |
vha-ḑo-shuma | ‘they will work’ |
vha-ḑo-vhala | ‘they will read’ |
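The purely formal segmentation described above can be sketched as a small Python function. It assumes only the three morph inventories read off the word list (initial, optional middle, final); the names `INITIAL`, `MIDDLE`, `FINAL`, and `segment` are hypothetical, and the Unicode normalization step is an added precaution so that ḑ compares equal regardless of how it is encoded.

```python
import unicodedata

def nfc(text):
    """Normalize to NFC so that ḑ compares equal however it was typed."""
    return unicodedata.normalize("NFC", text)

# Morph inventories read off the word list (no meanings involved).
INITIAL = [nfc(m) for m in ("vha", "u")]
MIDDLE = [nfc(m) for m in ("khou", "ḑo")]
FINAL = [nfc(m) for m in ("nwa", "funza", "shuma", "vhala")]

def segment(word):
    """Return the morphs of `word` as a list, or None if no combination fits."""
    word = nfc(word)
    for first in INITIAL:
        if not word.startswith(first):
            continue
        rest = word[len(first):]
        if rest in FINAL:
            return [first, rest]
        for mid in MIDDLE:
            if rest.startswith(mid) and rest[len(mid):] in FINAL:
                return [first, mid, rest[len(mid):]]
    return None

for w in ("vhanwa", "ukhoufunza", "vhaḑoshuma"):
    print(w, "->", segment(w))
```

The function never looks at a translation; it relies only on which pieces can co-occur, which is the point of the principle of analytical autonomy.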
When analysing the semantic structure of a polysemantic word, it is necessary to distinguish between two levels of analysis.

a) On the first level, the semantic structure of a word is treated as a system of meanings. For example, the semantic structure of the noun fire could be presented as follows (see also the scheme on p. 133):
Fire, n. → I. Flame →
    II. An instance of destructive burning: a forest fire.
    III. Burning material in a fireplace: a camp fire.
    IV. The shooting of guns, etc.: to open (cease) fire.
    V. Strong feeling, passion: a speech lacking fire.
The above suggests that meaning I (flame) holds a kind of dominance over the other meanings, conveying the concept in the most general way, whereas meanings II – V are associated with special circumstances.

Meaning I (generally referred to as the main meaning) presents the centre of the semantic structure of the word, holding it together. It is mainly through meaning I that meanings II – V (they are called secondary meanings) can be associated with one another.
b) Yet, it is not in every polysemantic word that such a centre can be found. Some semantic structures are arranged on a different principle. In the following list of meanings of the adjective dull one can hardly find a generalized meaning covering and holding together the rest of the semantic structure (see also p. 134):
Dull, adj.
    I. Uninteresting, monotonous, boring; e.g. a dull book, a dull film.
    II. Slow in understanding, stupid; e.g. a dull student.
    III. Not clear or bright; e.g. dull (overcast) weather, a dull day.
    IV. Not loud or distinct; e.g. a dull (muffled) sound.
    V. Not sharp; e.g. a dull knife.
Yet, one distinctly feels that there is something that all these meanings have in common, and that is the implication of deficiency, be it of colour (m. III), wits (m. II), interest (m. I), sharpness (m. V). The implication of insufficient quality, of something lacking, can be clearly distinguished in each separate meaning.
The scheme of the semantic structure of dull shows that the centre holding together the complex semantic structure of this word is not one of the meanings but a certain component that can be singled out within each separate meaning. This brings us to the second level of analysis of the semantic structure of a word. The semantic structure of the word is “divisible” not only at the level of different meanings but, also, at a deeper level.

Each separate meaning seems to be subject to structural analysis in which it may be represented as sets of semantic components. In terms of componential analysis, one of the modern methods of semantic research, the meaning of a word is defined as a set of elements of meaning which are not part of the vocabulary of the language itself, but rather theoretical elements.
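As a rough illustration of the second level, each meaning of dull can be modelled as a set of components, and the shared centre then falls out as their intersection. The component labels below are hypothetical, chosen only to mirror the glosses given for the five meanings (the label for meaning IV in particular is an assumption); componential analysis itself does not prescribe them.

```python
# Each meaning of "dull" as a set of semantic components.
# The labels are hypothetical theoretical elements, not words of the language.
DULL = {
    "I": {"DEFICIENT", "INTEREST"},
    "II": {"DEFICIENT", "WITS"},
    "III": {"DEFICIENT", "COLOUR"},
    "IV": {"DEFICIENT", "LOUDNESS"},
    "V": {"DEFICIENT", "SHARPNESS"},
}

def shared_components(meanings):
    """Return the components present in every meaning of a word."""
    return set.intersection(*meanings.values())

print(shared_components(DULL))
```

Under these assumed labels the only component common to all five meanings is the one standing for deficiency, which matches the observation that no single meaning, but rather one component, holds the structure together.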
Therefore, the semantic structure of a word should be investigated at both these levels: a) of different meanings, b) of semantic components within each separate meaning. For a monosemantic word (i.e. a word with one meaning) the first level is excluded.
What Is Linguistic Analysis?
Linguistic analysis refers to the scientific analysis of a language sample. It involves at least one of the five main branches of linguistics, which are phonology, morphology, syntax, semantics, and pragmatics. Linguistic analysis can be used to describe the unconscious rules and processes that speakers of a language use to create spoken or written language, and this can be useful to those who want to learn a language or translate from one language to another. Some argue that it can also provide insight into the minds of the speakers of a given language, although this idea is controversial.

The discipline of linguistics is defined as the scientific study of language. People who have an education in linguistics and practice linguistic analysis are called linguists. The drive behind linguistic analysis is to understand and describe the knowledge that underlies the ability to speak a given language, and to understand how the human mind processes and creates language.

An extended language analysis may cover all five of the branches, or it may focus on only one aspect of the language being analyzed. Each of the five branches focuses on a single area of language.
Phonology refers to the study of the sounds of a language. Every language has its own inventory of sounds and logical rules for combining those sounds to create words. The phonology of a language essentially refers to its sound system and the processes used to combine sounds in spoken language.
Morphology refers to the study of the internal structure of the words of a language. In any given language, there are many words to which a speaker can add a suffix, prefix, or infix to create a new word. In some languages, these processes are more productive than in others. The morphology of a language refers to the word-building rules speakers use to create new words or alter the meaning of existing words in their language.
Syntax is the study of sentence structure. Every language has its own rules for combining words to create sentences. Syntactic analysis attempts to define and describe the rules that speakers use to put words together to create meaningful phrases and sentences.
Semantics is the study of meaning in language. Linguists attempt to identify not only how speakers of a language discern the meanings of words in their language, but also the logical rules speakers apply to determine the meaning of phrases, sentences, and entire paragraphs. The meaning of a given word can depend on the context in which it is used, and the definition of a word may vary slightly from speaker to speaker.
Pragmatics is the study of the social use of language. All speakers of a language use different registers, or different conversational styles, depending on the company in which they find themselves. A linguistic analysis that focuses on pragmatics may describe the social aspects of the language sample being analyzed, such as how the status of the individuals involved in the speech act might affect the meaning of a given utterance.
Linguistic analysis has been used to determine historical relationships between languages and people from different regions of the world. Some governmental agencies have used linguistic analysis to confirm or deny individuals’ claims of citizenship. This use of linguistic analysis remains controversial, because language use can vary greatly across geographical regions and social classes, which makes it difficult to accurately define and describe the language spoken by the citizens of a particular country.
Text or speech in natural language can be analyzed at different levels, called language levels. Each language level is defined by the main language element, or the class of elements, typical of that level. Each level has an input and an output representation.
1-Phonetic level
Phonetics is a science on the border of linguistics, anatomy, physiology, and physics. This level is concerned with processing speech signals, i.e. sorting and classifying them. The basic unit is the so-called “phone”.
Phonetics can be further divided into:
- articulatory, i.e. according to the place where sounds are formed (position of the tongue, teeth, opening of the oral cavity, etc.),
- acoustic, i.e. the transmission of sounds, described by frequency,
- perceptual, i.e. the way the listener receives sounds.
Phonetics describes the formation of vowels and consonants (long / short, high / low / falling tone, voiced / voiceless, nasal / non-nasal). The output of the phonetic level is a sequence of phones written in the phonetic alphabet.
2-Phonological level
Phonology deals with the function of sounds. Like phonetics, this level studies the sound side of natural language, specifically the sound differences that can distinguish meaning in a particular language. The basic unit is the so-called “phoneme”, that is, a sound unit used to distinguish morphemes, words, and word forms of the same language that have different meanings (lexical or grammatical). The phoneme itself can only be recognized through its realization as a speech sound.
A particular way of articulating a phoneme is called an “allophone”; it denotes one of the possible sounds realizing that phoneme. An example of sounds that are given a phonological function (e.g. “j”) in Czech: chin – gin. The phonological level also covers distinctive features, that is, the differences between individual phonemes, and between higher-level sound phenomena, that can distinguish the meanings of words. For example, in Czech such a feature is “solidity” (three – three, fifth – fifth), as is the differentiation of pairs of sounds (t / d). Another important and indivisible unit in linguistics is the so-called “grapheme”. Graphemes include letters, characters, symbols, digits, and punctuation marks. Usually one grapheme corresponds to one phoneme; it is the recording of a sound with a graphic symbol. The output of the phonological level is a sequence of symbols of an abstract alphabet usable at the phonological level.
3-Morphology
Morphology is the linguistic discipline that studies inflection, that is, declension and conjugation. It also examines the regular derivation of words using prefixes, suffixes, and infixes. Morphology studies the relationships between different parts of words. The basic unit is the so-called “morpheme”. It is the smallest unit that carries meaning, and it is a unit of the language system. A morph is a surface realization of a morpheme, i.e. a unit of speech; for example, the specific morphs “ber-” and “br-” are realizations of one morpheme. Different morphs that are realizations of the same morpheme are called allomorphs.
There are two types of morphemes:
- lexical morpheme – the stem of a word, which carries its meaning
- grammatical morpheme – determines the grammatical role of a word form
From a morphological point of view, words are divided into inflected (subject to declension or conjugation) and uninflected.
- Morphological level – The input to this level is a sequence of phonemes written in the abstract alphabet. The basic element is the morphoneme, and the composite elements are the so-called morphs. The output is a sequence of morphonemes divided into morphs.
- Morphematic level – The input is a sequence of morphs. The basic element is the so-called “seme”, and the composite elements are the “morpheme” and the “formeme”. The output is a sequence of word forms, including semantic (lexical) and grammatical information. The formeme corresponds to the word form. Morphemes are lexical (for example, the stem meaning “healthy”) and grammatical (for example, the ending meaning “more”). Semes are lexical, such as part of speech, and grammatical. The output of morphology is the processing of sentence structure.
4-Syntactic level
Syntax is the linguistic discipline that deals with the relationships between words in a sentence, as well as the correct formation of sentence structures and word order. Syntax does not describe the meaning of individual words and phrases. The basic unit is the sentence. The syntax of a natural language describes a language that arose through natural evolution. Natural language is typically (syntactically) ambiguous.
The input to the syntactic level is a sequence of morphemes. The basic element is the so-called “tagmeme”, that is, a sentence constituent. It need not be a single word; it may consist of several words, such as “in the house” or “I did”. The composite element is the so-called “syntagmeme”, or sentence. Syntactic categories are, for example, subject, predicate, object, adverbial, and complement. The output of the syntactic level is a sentence structure (a tree denoting the relations within the sentence).
5-Semantic level
Semantics is part of semiotics. It deals with the meaning of expressions from different structural levels of language: morphemes, words, idioms, and sentences, or even higher units of text. The relationships of these expressions to reality then give them meaning. The input to the semantic level is a sentence tree denoting sentence relations. The basic element is the so-called “semanteme”, which corresponds to the tagmeme.
The semantic level also deals with:
- coordination – conjunctive (Czech a, i, ani, nebo), where the sentences are equivalent in content; adversative (but), where the second sentence expresses a fact contrary to the fact of the first sentence; and disjunctive (either – or), where the contents of the two combined sentences are mutually exclusive,
- coreference – the agreement of a subject with a predicate over so-called long distances; it is the relation of two or more expressions in the text to one object, even if this object is replaced by a pronoun,
- deep vs. surface features.
Sentence division: The sentence is divided into a theme, which is the basis and starting point (what we already know), and a rheme, which has the function of a core and a focus (what we say that is new about what we already know). Within the starting point or the focus, the sentence constituents are arranged in the systemic word order. This is the deep word order.
The output of the semantic level is a sentence structure with the sentence relationships determined.
6-Pragmatic level
Pragmatics as a scientific discipline falls within linguistics and philosophy; it deals with spoken expression, that is, speech and utterances. At this level, real-world objects (which are not part of the linguistic content) are assigned to specific nodes of the sentence structure.
The pragmatic level deals with practical communication problems, in particular the individual interpretation of a text. A sign is interpreted only in relation to other signs, objects, and users. Through language it is possible to understand and describe specific objects of our thinking.
This level concerns the choice, use, and effect of all spoken or written signs in a given communication situation, and assesses whether the speaker has chosen the right strategy for the receiver to reach understanding. Interpretation can also be influenced by the interpreter’s own knowledge and attitude towards that knowledge.
At the pragmatic level, and also beyond it, there is the so-called discourse, which in ordinary communication can be understood as a discussion, a treatise, or an explanation of a certain topic, in the form of a dialogue between several speakers or just a monologue.
The output of the pragmatic level is a logical form of the text that can be judged true or false.
Levels of Linguistic Analysis
The following are the levels (or branches) of linguistic analysis:-
Phonetics
Phonology
Morphology
Lexicology
Syntax
Semantics
Pragmatics
Discourse
1: Phonetics:-
Phonetics is the study of the production, transmission, and perception of speech sounds. It is concerned with the sounds of languages, how these sounds are articulated, and how the hearer perceives them. Phonetics is related to the science of acoustics in that it uses much the same techniques in the analysis of sound that acoustics does. There are three branches of phonetics:-
1-Articulatory Phonetics:-
It is the study of the production of speech sounds.
2-Acoustic Phonetics:-
It is the study of the physical production and transmission of speech sounds.
3-Auditory Phonetics:-
It is the study of the perception of speech sounds.
2: Phonology:-
It is the study of the sound patterns of language. It is concerned with how sounds are organized in a language. It examines what happens to speech sounds when they are combined to form a word and how these speech sounds interact with one another, and it attempts to explain these phonological processes in terms of formal rules.
The phonemes of a particular language are those minimal distinct units of sound that can distinguish meaning in that language. E.g. in English the /p/ sound is a phoneme because it is the smallest unit of sound that distinguishes the word pill from bill, till, or drill. The vowel sound of pill is also a phoneme because its distinctness in sound makes pill, which means one thing, sound different from pal, which means another.
3: Morphology:-
It is the study of word formation and structure. It studies how words are put together from their smaller parts and the rules governing this process. The elements that combine to form words are called morphemes. A morpheme is the smallest unit of grammar you can have in a language; the word cats, e.g., contains the morphemes cat and the plural -s.
4: Lexicology:-
It is the study of words. It covers word formation and word classes. The lexeme is the smallest unit of the lexis.
5: Syntax:-
It is the study of sentence structure. It attempts to describe what the syntactic rules of a particular language are. These rules specify an underlying structure and a transformational process. The underlying structure of English, e.g., would have a subject-verb-object sentence order. For example: John hit the ball.
The transformational process would allow a change in word order, which could give us something like: The ball was hit by John.
6: Semantics:-
It is the study of meaning in language. It is concerned with describing how we represent the meaning of a word in our mind and how we use this representation in building sentences. It relies to a large extent on the study of logic in philosophy.
7: Pragmatics:-
It studies the factors that govern our choice of language in social interaction and the effects of our choices on others. In theory, we can say anything we like. In practice, we follow a large number of social rules (some of them unconsciously) that constrain the way we speak.
E.g. there is no law that says we must not tell jokes during a funeral, but it is generally not done.
8: Discourse:-
It is the study of stretches of spoken and written language above the sentence
or, put another way,
the way in which sentences work in sequence to produce coherent stretches of language.
There’s something exceptional about humans.
We’re capable of doing unbelievably complex tasks. Even more amazing is that most of the things easiest for us are incredibly difficult for machines to learn.
Our day-to-day activities, like talking or writing, take the form of natural language.
We created programming languages to help us communicate on the same level as a computer. But can you imagine a world where humans talked to each other in the equivalent of code? It would be very dry…
A programmer is going to the grocery store and his wife tells him, “Buy a gallon of milk, and if there are eggs, buy a dozen.”
So the programmer goes, buys everything, and drives back to his house.
Upon arrival, his wife angrily asks him, “Why did you get 13 gallons of milk?” The programmer says, “There were eggs!”
Fortunately for us, there’s little chance that we’ll adopt Python as a spoken language. We can keep the beauty and complexity of the languages we speak and write, with their vast vocabulary, double-meanings, sarcasm, slang, abbreviations, and idiosyncrasies!
Natural language simply refers to the way we communicate with each other: speech and text.
Processing refers to making natural language usable for computational tasks.
Natural language processing, also referred to as text analytics, plays a vital role today because of the sheer volume of text data that users generate around the world on digital channels such as social media apps, e-commerce websites, blog posts, etc. Natural Language Processing works on multiple levels and, most often, these levels complement each other. This article will offer a brief overview of each and provide some examples of how they are used in information retrieval.
Morphological
Morphology has been a part of mainstream linguistics for sixty years or more. The morphological level of linguistic processing deals with the study of word structures and word formation, focusing on the analysis of the individual components of words. According to the classical approach in linguistics, words are formed of morphemes, which are the minimal (that is, non-decomposable) linguistic units that carry meaning.
Many language processing applications need to extract the information encoded in words. Parsers that analyze sentence structure need to check agreement between:
- subjects and verbs
- adjectives and nouns
Information retrieval systems benefit from knowing what the stem of a word is, and machine translation systems analyze words into their components and generate words with specific features in the target language.
Take, for example, the word “undesirableness”. It can be broken down into three morphemes (prefix, stem, and suffix), each conveying some form of meaning: the prefix un- means “not being”, while the suffix -ness means “a state of being”. The stem desirable is considered a free morpheme since it is a “word” in its own right. Bound morphemes (prefixes and suffixes) require a free morpheme to attach to, and therefore cannot appear as “words” on their own.
In Information Retrieval, document and query terms can be stemmed to match the morphological variants of terms between the documents and query; such that the singular form of a noun in a query will match even with its plural form in the document, and vice versa, thereby increasing recall.
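To make the recall argument concrete, here is a minimal Python sketch of stemmed matching. The `naive_stem` function is a deliberately crude stand-in that only strips two plural endings; it is not the Porter stemmer or any library's algorithm, and all names are hypothetical.

```python
def naive_stem(term):
    """Strip a couple of English plural endings (a crude stand-in for a real stemmer)."""
    term = term.lower()
    if term.endswith("ies") and len(term) > 4:
        return term[:-3] + "y"          # stories -> story
    if term.endswith("s") and not term.endswith("ss") and len(term) > 3:
        return term[:-1]                # cats -> cat
    return term

def stem_set(text):
    """Stem every whitespace-separated token of a text."""
    return {naive_stem(tok) for tok in text.split()}

def matching_terms(query, document):
    """Return the stems shared by the query and the document."""
    return stem_set(query) & stem_set(document)

print(matching_terms("cat stories", "Stories about cats"))
```

An exact string comparison would find no common terms between this query and document; after stemming, both query terms match, which is the increase in recall described above.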
Lexical
It involves identifying and analyzing the structure of words and parts of speech. The lexicon of a language is the collection of words and phrases in that language. Lexical analysis divides the whole chunk of text into paragraphs, sentences, and words. This level of linguistic processing utilizes a language’s lexicon, which is a collection of individual lexemes. A lexeme is a basic unit of lexical meaning: an abstract unit of morphological analysis that represents the set of forms or “senses” taken by a single morpheme.
“Better”, for example, can be a noun, a verb, or an adjective, but its part of speech and lexical meaning can only be derived from the context of the other words used in the phrase or sentence. This, in fact, is an early step towards a more sophisticated Information Retrieval system where precision is improved through part-of-speech tagging.
For a simple application like spam detection, lexical processing works just fine, but it is usually not enough in more complex applications, like, say, machine translation. For example, the sentences “My cat ate its third meal” and “My third cat ate its meal”, have very different meanings. However, lexical processing will treat the two sentences as equal, as the “group of words” in both sentences is the same. Hence, we clearly need a more advanced system of analysis.
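The point about the two cat sentences can be checked with a short Python sketch. It assumes that “lexical processing” here means a bag-of-words representation (word counts with order discarded) and that tokens are obtained by lowercasing and splitting on whitespace; both are simplifying assumptions.

```python
from collections import Counter

def bag_of_words(sentence):
    """Count lowercase whitespace-separated tokens, ignoring their order."""
    return Counter(sentence.lower().split())

a = "My cat ate its third meal"
b = "My third cat ate its meal"

# Same multiset of words, so a bag-of-words model cannot tell them apart.
print(bag_of_words(a) == bag_of_words(b))
# The token sequences differ, which is the information word order carries.
print(a.lower().split() == b.lower().split())
```

The first comparison succeeds and the second fails, so any distinction between the sentences has to come from a level of analysis that keeps track of structure.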
Syntactic
The next step after lexical analysis is where we try to extract more meaning from the sentence, by using its syntax this time. Instead of only looking at the words, we look at the syntactic structures, i.e., the grammar of the language to understand what the meaning is.
In Information Retrieval, parsing can be leveraged to improve indexing, since phrases can be used as representations of documents and provide better information than single-word indices. In the same way, phrases that are syntactically derived from the query offer better search keys to match against documents that are similarly parsed.
One example is differentiating between the subject and the object of the sentence, i.e., identifying who is performing the action and who is the person affected by it. For example, “Chris thanked Brett” and “Brett thanked Chris” are sentences with different meanings from each other because in the first instance, the action of ‘thanking’ is done by Chris and affects Brett, whereas, in the other one, it is done by Brett and affects Chris. Hence, a syntactic analysis that is based on a sentence’s subjects and objects, will be able to make this distinction.
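As a toy illustration of why role assignment matters, the sketch below reads a three-word English sentence as subject, verb, and object purely by position. This is an assumption made for the example only; a real syntactic parser handles arbitrary sentence structure, and the function name is hypothetical.

```python
def naive_svo(sentence):
    """Read a three-word sentence as subject, verb, object by position alone."""
    subject, verb, obj = sentence.rstrip(".").split()
    return {"subject": subject, "verb": verb, "object": obj}

first = naive_svo("Chris thanked Brett")
second = naive_svo("Brett thanked Chris")

print(first)
print(second)
print(first == second)
```

Both sentences contain exactly the same words, yet the two role assignments differ, which is the distinction a subject/object analysis makes available to later processing.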
There are various other ways in which these syntactic analyses can help us enhance our understanding. For example, a question answering system that is asked the question “Who is the Prime Minister of USA?”, will perform much better, if it can understand that the words “Prime Minister” are related to “USA”. It can then look up in its database, and provide the answer.
Semantic
Lexical and syntactic processing don’t suffice when it comes to building advanced NLP applications such as language translation, chatbots, etc. The machine, after the two steps given above, will still be incapable of actually understanding the meaning of the text.
The semantic level of linguistic processing deals with the determination of what a sentence really means by relating syntactic features and disambiguating words with multiple definitions to the given context. This level entails the appropriate interpretation of the meaning of sentences, rather than the analysis at the level of individual words or phrases.
In Information Retrieval, the query and document matching process can be performed on a conceptual level, as opposed to simple terms, thereby further increasing system precision. Moreover, by applying semantic analysis to the query, term expansion becomes possible with the use of lexical sources, offering improved retrieval of the relevant documents even if exact terms are not used in the query. Precision may increase with query expansion, and recall will probably increase as well.
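Term expansion can be sketched in a few lines. The `SYNONYMS` table below is a hypothetical, hand-made stand-in for a real lexical source, and the matching rule (any shared term) is deliberately simplistic.

```python
# Hypothetical hand-made lexical resource, for illustration only.
SYNONYMS = {
    "car": {"automobile", "vehicle"},
    "cheap": {"inexpensive", "affordable"},
}


def expand_query(query):
    """Return the query terms plus any synonyms listed in SYNONYMS."""
    terms = set(query.lower().split())
    expanded = set(terms)
    for term in terms:
        expanded |= SYNONYMS.get(term, set())
    return expanded


def matches(query_terms, document):
    """A document matches if it shares at least one term with the query."""
    return bool(query_terms & set(document.lower().split()))


doc = "Affordable automobile for sale"

print(matches(set("cheap car".split()), doc))   # False: no exact term in common
print(matches(expand_query("cheap car"), doc))  # True: found through the synonyms
```

The relevant document is retrieved even though it shares no exact term with the query.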
Such an incapability can be a problem for, say, a question answering system, as it may be unable to understand that “PM” and “Prime Minister” mean the same thing. Hence, when somebody asks it the question “Who is the PM of USA?”, it may not be able to give an answer unless it has a separate database for PMs. You could store the answer separately for both variants (PM and Prime Minister), but how many of these variants are you going to store manually? At some point, your machine should be able to identify synonyms, antonyms, etc. on its own. This is typically done by inferring a word’s meaning from the collection of words that usually occur around it. So, if PM and Prime Minister occur very frequently around similar words, you can assume that their meanings are similar as well.
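The idea that words occurring in similar contexts have similar meanings can be illustrated with co-occurrence counts and cosine similarity. This is only a sketch under several assumptions: the corpus is invented, “Prime Minister” is pre-joined into the single token `prime_minister`, and the stopword list contains just “the”. Real systems learn such representations from very large corpora.

```python
import math
from collections import Counter

# Tiny invented corpus; "prime_minister" is treated as one token (an assumption).
CORPUS = [
    "the pm met the cabinet today",
    "the prime_minister met the cabinet today",
    "the pm addressed parliament",
    "the prime_minister addressed parliament",
    "the cat chased the mouse",
]
STOPWORDS = {"the"}


def context_vectors(sentences, window=2):
    """Count, for each word, the words seen within `window` positions of it."""
    vectors = {}
    for sentence in sentences:
        tokens = [t for t in sentence.split() if t not in STOPWORDS]
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            context = tokens[lo:i] + tokens[i + 1:hi]
            vectors.setdefault(word, Counter()).update(context)
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


vectors = context_vectors(CORPUS)
sim_same = cosine(vectors["pm"], vectors["prime_minister"])
sim_other = cosine(vectors["pm"], vectors["cat"])
print(sim_same > sim_other)  # True: "pm" shares its contexts with "prime_minister", not with "cat"
```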
Once you have the meaning of the words, obtained via semantic analysis, you can use it for a variety of applications. Machine translation, chatbots and many other applications require a complete understanding of the text, from the lexical level through syntax to meaning. Hence, in most of these applications, lexical and semantic processing simply form the “pre-processing” layer of the overall process. In some simpler applications, lexical processing alone is enough for pre-processing.
Some other Natural Language Processing (NLP) methods concerned with semantics include:
- Using numbers to represent the meanings of words/sentences in text and how they relate to one another
- Translating text from one language to another
- Creating human readable text from structured data (rows and columns)
- Determining the text in a given image (go from screenshot to Word document)
- Forming answers to questions about a given set of data
- Deciding on the positive or negative sentiment of a given text
- Separating text into self-contained topics
- Deciding on the meaning of an ambiguous word given the context (Roll over? Eat a sushi roll?)
- Given an example like above, where we find proper nouns, figuring out the relationships between these (Melissa is Victor’s wife)
Semantics is a flourishing field. Needless to say, there is a lot of progress being made in helping our computers find meaning in text, and in turn helping us perform much more powerful analytics.
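As one illustration of the list above, deciding on the meaning of an ambiguous word can be sketched with a simplified overlap method in the spirit of the Lesk algorithm: pick the sense whose definition shares the most words with the sentence. The two sense definitions for “roll” and the stopword list are hypothetical and written by hand for this example.

```python
import re

# Hypothetical hand-written sense definitions, for illustration only.
SENSES = {
    "roll": {
        "turn_over": "move by turning over and over on the ground",
        "food": "a small piece of bread or rice wrapped food such as sushi to eat",
    }
}
STOPWORDS = {"a", "the", "of", "or", "to", "and", "by", "on", "as", "such"}


def words(text):
    """Lowercase the text and return its content words as a set."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def disambiguate(word, sentence):
    """Return the sense whose definition overlaps most with the sentence."""
    context = words(sentence) - {word}
    senses = SENSES[word]
    return max(senses, key=lambda sense: len(context & words(senses[sense])))


print(disambiguate("roll", "Roll over?"))         # turn_over
print(disambiguate("roll", "Eat a sushi roll?"))  # food
```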
Discourse
Discourse processing is a suite of Natural Language Processing (NLP) tasks that uncover linguistic structures in texts at several levels, which can support many NLP applications. It deals with the analysis of the structure and meaning of text beyond a single sentence, making connections between words and sentences. At this level, anaphora resolution is also achieved by identifying the entity referenced by an anaphor (most commonly, but not only, a pronoun). An example is shown below.
“I love Dominoes pizza because they put extra cheese”, she said.
Here there are two entities, the speaker (“I”) and “Dominoes”, and two anaphors: “she” refers to the same person as “I”, and “they” refers to “Dominoes”. Discourse processing links each anaphor to the entity it refers to.
Much of discourse is used when trying to train chatbots to interact well with humans and be easily understandable. If you tell a chatbot on a cosmetics website that you’re looking for a good moisturizer, it’s unhelpful for the bot to ask what your favorite book is.
With the capability to recognize and resolve anaphoric relationships, document and query representations are improved: at the lexical level, the implicit presence of concepts is accounted for throughout the document as well as in the query, while at the semantic and discourse levels, an integrated content representation of the documents and queries is generated.
Further methods of Natural Language Processing (NLP) that concern themselves with discourse include:
- Working out which mentions refer to which entities (“they” refers to “Dominoes”)
- Categorizing the type of text (is it a question, statement or assertion?)
- Grading the quality and coherence of text (automatic essay scoring)
Pragmatic
In everyday usage, “pragmatic” means practical or logical: if someone calls you pragmatic, they mean that you tend to think in terms of the practical rather than the ideal situation. In linguistics, pragmatics is concerned with how context shapes what an utterance means.
The pragmatic level of linguistic processing deals with the use of real-world knowledge and understanding of how this impacts the meaning of what is being communicated. By analyzing the contextual dimension of the documents and queries, a more detailed representation is derived.
Examples of Pragmatics:
- Will you crack open the door? I am getting hot.
Semantically, the word “crack” would mean to break, but pragmatically we know that the speaker means to open the door just a little to let in some air.
- I heart you!
Semantically, “heart” refers to an organ in our body that pumps blood and keeps us alive. However, pragmatically, “heart” in this sentence means “love”: hearts are commonly used as a symbol for love, and to “heart” someone has come to mean that you love them.
- If you eat all of that food, it will make you bigger!
Semantically, “bigger” in this sentence would mean larger than you are currently. Think about how this sentence, pragmatically, would mean something different depending on the context. If it is said to a young child, pragmatically, it would mean to grow bigger. If it is said to a grown person who is already obese, it would mean something entirely different.
In Information Retrieval, this level of Natural Language Processing primarily concerns query processing and understanding: it integrates the user’s history and goals as well as the context in which the query is made. Contexts may include time and location.
This level of analysis can bring major improvements to Information Retrieval, as it facilitates a conversation between the IR system and its users. The system can elicit the purpose for which the information is being sought, which helps ensure that the retrieval system is fit for purpose.
Levels of Linguistic Analysis
Introduction
- For language study, areas are marked out and subdivided; this helps in analytic and systematic study.
- Language has a hierarchical structure.
- Language is made up of smaller units, which are made up of still smaller units, down to the smallest indivisible unit: a single distinguishable sound called a phoneme.
Introduction (contd.)
- The other way round is also possible: phonemes combine to make up morphemes, which combine to make up words, which combine to make up phrases and sentences, and finally text or discourse.
- At each stage (or level), certain rules operate which permit the occurrence and combination of smaller units.
Rules
- Rules of phonology determine the occurrence and combination of particular phonemes.
- Rules of word formation govern the behaviour of morphemes.
- Rules of sentence formation govern the combination and positioning of words in a sentence.
- So each level is a system in itself.
Rules (contd.)
- Because of the rules at each level, we can analyse each level independently of the others.
- Although each level is linked to the others in the hierarchy, it is still independent because of distinct rules that can be described, analysed and understood.
- Each level of analysis corresponds to a level of the structure of language.
Levels
Levels of analysis | Levels of structure |
Phonetics & Phonology | Sounds |
Morphology | Word formation |
Syntax | Sentence formation |
Semantics | Meanings |
Discourse | Connected sentences |
Levels of language linked
- Levels of language are not completely separate; important and vital linkages are present.
- Earlier thought: phonology had no link with semantics.
- Now we know the link is much more complex than thought earlier, e.g. discourse is made up of all levels working together, and semantics analyses both word meaning and sentence meaning.
Levels (Brief description)
- Phonetics: Phonetics explores how the linguistically relevant sounds in the languages of the world are produced, and how these sounds are perceived, using experimental and computational tools.
- It studies language at the level of sounds: articulated by the human speech mechanism and received by the auditory mechanism.
Phonetics (contd.)
- It studies how sounds can be distinguished and characterized by the manner in which they are produced.
- It also deals with the different symbols (phonetic symbols) used to represent different sounds and letters.
Phonology
- Phonology examines how sounds pattern in languages, how sounds are combined to make words, how sounds near each other affect each other, and how sounds are affected by where in the word or phrase they occur.
- It studies the formation of syllables and larger units.
Phonology (contd.)
- It studies the combination of sounds into organized units of speech, and the formation of syllables and larger units.
- It describes the sound system of a particular language, and the combination and distribution of sounds which occur in that language.
Phonology (contd.)
- Classification is made on the basis of the concept of the phoneme, e.g. /m/, /g/, /p/.
- These distinct sounds enter into combination with others; the rules of combination differ from language to language.
Morphology
- Morphology examines the structure of words and the principles that govern the formation of words.
- Words are also made up of a number of units: the word ‘unhappiness’ involves three elements (or morphemes), un-, -happy- and -ness.
- Morphology deals with how languages add morphemes together.
Morphology (contd.)
- It studies the patterns of formation of words by the combination of sounds into minimal distinctive units of meaning called morphemes.
- A morpheme cannot be broken up further; if it is, it no longer makes sense, e.g. bat (a single morpheme).
- Single morpheme: bat; two morphemes: bat + s.
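Splitting a word into morphemes can be sketched with a naive affix stripper. Everything here is an assumption made for illustration: the prefix, suffix and root lists are tiny hand-made samples, and the only spelling rule handled is the change of y to i before a suffix (happy + ness gives happiness).

```python
# Tiny hand-made lists, for illustration only.
PREFIXES = ("un", "re", "dis")
SUFFIXES = ("ness", "s")
ROOTS = {"happy", "bat"}


def segment(word):
    """Split a word into morphemes; return [] if no analysis is found."""
    if word in ROOTS:
        return [word]
    for prefix in PREFIXES:
        if word.startswith(prefix):
            rest = segment(word[len(prefix):])
            if rest:
                return [prefix + "-"] + rest
    for suffix in SUFFIXES:
        if word.endswith(suffix):
            stem = word[:-len(suffix)]
            # Also try undoing the y -> i spelling change (happi -> happy).
            candidates = [stem, stem[:-1] + "y"] if stem.endswith("i") else [stem]
            for candidate in candidates:
                rest = segment(candidate)
                if rest:
                    return rest + ["-" + suffix]
    return []


print(segment("unhappiness"))  # ['un-', 'happy', '-ness']
print(segment("bats"))         # ['bat', '-s']
print(segment("bat"))          # ['bat']
```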
Morphology (contd.)
- The level of morphology is related to phonology on the one hand and to semantics on the other.
- take, took: a change in one of the sounds.
- take = the action ‘take’ + present time; took = the action ‘take’ + past time: a change in meaning.
Syntax
- Syntax investigates the structure of sentences and the common principles that determine how phrases and sentences are built up from words.
- It also explores the way that languages vary in their application of these common principles by looking at the variation across languages.
Syntax (contd.)
- Syntax describes the rules of positioning of elements in a sentence: nouns and noun phrases, verbs and verb phrases, adverbial phrases.
- Syntax also describes the function of elements in a sentence, e.g. the noun ‘boy’ has different functions or roles in (a) and (b):
(a) The boy likes cricket.
(b) The old man loved the boy.
Syntax (contd.)
- Rules of syntax should explain how grammatical and meaningful sentences are formed.
- e.g. Colourless green ideas sleep furiously (grammatical but meaningless)
Semantics
- Semantics studies the meanings of words and sentences independently of any context.
- Semantics seeks to explain how it is that we come to have such a clear understanding of the language we use.
- It analyses the structure of meaning in language.
Semantics (contd.)
- Semantics analyzes how words, similar and different, are related; it attempts to show these inter-relationships by forming categories.
- It attempts to analyze and define ‘abstract’ words.
- Example: it is easy to define ‘tree’ or ‘table’, but difficult to define ‘love’ or ‘feel’.
Discourse
- Discourse: a unit of text used by linguists for the analysis of linguistic phenomena that range over more than one sentence.
- A formal, orderly and usually extended expression of thought on a subject.
- Connected speech or writing.
- A linguistic unit (such as a conversation or a story) larger than a sentence.
Discourse (contd.)
- At discourse level we analyze the inter-sentential links that form a connected or cohesive text.
- Cohesion: the relation formed between a sentence and the sentences before and after it, by using connectives.
- Through this study we can see how a piece of connected language can have a meaning that is greater than the sum of the individual sentences.
Some other studies
- Graphology: the study of the writing system of a language and the conventions used in representing speech in writing, e.g. the formation of letters.
- Lexicology: studies the manner in which lexical items are grouped together, as in the compilation of dictionaries.
Two views about scope
- Micro-linguistic: study confined to phonology, morphology and syntax.
- Macro-linguistic: other aspects of language and its relationship with many areas of human activity.