From Wikipedia, the free encyclopedia
In formal language theory, a context-free language (CFL) is a language generated by a context-free grammar (CFG).
Context-free languages have many applications in programming languages; in particular, most arithmetic expressions are generated by context-free grammars.
Background
Context-free grammar
Different context-free grammars can generate the same context-free language. Intrinsic properties of the language can be distinguished from extrinsic properties of a particular grammar by comparing multiple grammars that describe the language.
Automata
The set of all context-free languages is identical to the set of languages accepted by pushdown automata, which makes these languages amenable to parsing. Further, for a given CFG, there is a direct way to produce a pushdown automaton for the grammar (and thereby the corresponding language), though going the other way (producing a grammar given an automaton) is not as direct.
Examples
An example context-free language is $L = \{a^n b^n : n \geq 1\}$, the language of all non-empty even-length strings, the entire first halves of which are a's, and the entire second halves of which are b's. $L$ is generated by the grammar $S \to aSb \mid ab$.
This language is not regular.
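It is accepted by a pushdown automaton whose transition function $\delta$ pushes one stack symbol for each a it reads and pops one for each b, accepting once the input and the pushed symbols are exhausted together.[note 1]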
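The push-then-pop behaviour of such an automaton can be sketched in Python. This is a minimal illustration, not the automaton from the original article: the state names `"push"` and `"pop"`, the stack marker `"A"`, and the function name `accepts_anbn` are all assumptions made for the example.

```python
def accepts_anbn(word: str) -> bool:
    """Simulate a deterministic pushdown automaton for non-empty a...ab...b
    strings with equally many a's and b's (hypothetical state names)."""
    stack = []        # the pushdown store: one marker per unmatched 'a'
    state = "push"    # "push" while reading a's, "pop" once the first b is seen
    for symbol in word:
        if state == "push" and symbol == "a":
            stack.append("A")          # remember one more 'a'
        elif symbol == "b" and stack:
            stack.pop()                # match this 'b' against a stored 'a'
            state = "pop"
        else:
            return False               # an 'a' after a 'b', an unmatched 'b', or a foreign symbol
    # accept only if at least one 'b' was read and every 'a' was matched
    return state == "pop" and not stack
```

The stack is what a finite automaton lacks: the number of a's to remember is unbounded, which is why the language is not regular.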
Unambiguous CFLs are a proper subset of all CFLs: there are inherently ambiguous CFLs. An example of an inherently ambiguous CFL is the union of $\{a^n b^m c^m d^n \mid n, m > 0\}$ with $\{a^n b^n c^m d^m \mid n, m > 0\}$. This set is context-free, since the union of two context-free languages is always context-free. But there is no way to unambiguously parse strings in the (non-context-free) subset $\{a^n b^n c^n d^n \mid n > 0\}$, which is the intersection of these two languages.[1]
Dyck language
The language of all properly matched parentheses is generated by the grammar $S \to SS \mid (S) \mid \varepsilon$.
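For a single kind of bracket, the stack of a pushdown automaton only ever holds copies of one symbol, so it can be replaced by a counter. The following is a small sketch of a recognizer built on that observation; the function name `is_balanced` is an assumption for illustration.

```python
def is_balanced(word: str) -> bool:
    """Recognize properly matched parentheses over the alphabet {'(', ')'}."""
    depth = 0                      # number of currently open parentheses
    for ch in word:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:          # a ')' with nothing left to close
                return False
        else:
            return False           # symbol outside the alphabet
    return depth == 0              # everything opened has been closed
```

With several bracket types the counter is no longer enough, and an explicit stack of the open brackets is needed.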
Properties
Context-free parsing
The context-free nature of the language makes it simple to parse with a pushdown automaton.
Determining an instance of the membership problem, i.e. given a string $w$, determining whether $w \in L(G)$ where $L(G)$ is the language generated by a given grammar $G$, is also known as recognition. Context-free recognition for Chomsky normal form grammars was shown by Leslie G. Valiant to be reducible to Boolean matrix multiplication, thus inheriting its complexity upper bound of $O(n^{2.3728596})$.[2][note 2]
Conversely, Lillian Lee has shown $O(n^{3-\varepsilon})$ Boolean matrix multiplication to be reducible to $O(n^{3-3\varepsilon})$ CFG parsing, so a lower bound for Boolean matrix multiplication carries over to CFG parsing.[3]
Practical uses of context-free languages also require producing a derivation tree that exhibits the structure that the grammar associates with the given string. The process of producing this tree is called parsing. Known parsers have a time complexity that is cubic in the length of the string that is parsed.
Formally, the set of all context-free languages is identical to the set of languages accepted by pushdown automata (PDA). Parsing algorithms for context-free languages include the CYK algorithm and Earley's algorithm.
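As an illustration of cubic-time recognition, here is a compact sketch of the CYK algorithm for a grammar in Chomsky normal form. The dictionary encoding of the grammar and the sample grammar for the language of a's followed by equally many b's (`S → A B | A T`, `T → S B`, `A → a`, `B → b`) are assumptions chosen for the example, not taken from the text above.

```python
from itertools import product

# Hypothetical CNF grammar for non-empty strings a...ab...b with equal counts.
UNARY = {"a": {"A"}, "b": {"B"}}                       # terminal -> heads of A -> terminal
BINARY = {("A", "B"): {"S"}, ("A", "T"): {"S"},        # (B, C) -> heads of A -> B C
          ("S", "B"): {"T"}}

def cyk(word, unary=UNARY, binary=BINARY, start="S"):
    """Return True if `start` derives `word` under the given CNF grammar."""
    n = len(word)
    if n == 0:
        return False   # CNF rules as encoded here cannot derive the empty word
    # table[i][l - 1] holds the nonterminals that derive word[i:i + l]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][0] = set(unary.get(ch, ()))
    for length in range(2, n + 1):                     # substring length
        for i in range(n - length + 1):                # substring start
            for split in range(1, length):             # length of the left part
                left = table[i][split - 1]
                right = table[i + split][length - split - 1]
                for b, c in product(left, right):
                    table[i][length - 1] |= binary.get((b, c), set())
    return start in table[0][n - 1]
```

The three nested loops over `length`, `i` and `split` account for the cubic dependence on the input length; keeping back-pointers alongside each table entry would turn the recognizer into a parser that recovers a derivation tree.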
A special subclass of context-free languages is the deterministic context-free languages, which are defined as the set of languages accepted by a deterministic pushdown automaton and can be parsed by an LR(k) parser.[4]
See also parsing expression grammar as an alternative approach to grammar and parser.
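Closure properties
The class of context-free languages is closed under a number of operations. That is, if L and P are context-free languages, then so are, among others, the union $L \cup P$, the concatenation $L \cdot P$, the Kleene star $L^*$, the reversal of $L$, and the image and inverse image of $L$ under a homomorphism.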
Nonclosure under intersection, complement, and difference
The context-free languages are not closed under intersection. This can be seen by taking the languages $A = \{a^n b^n c^m \mid m, n \geq 0\}$ and $B = \{a^m b^n c^n \mid m, n \geq 0\}$, which are both context-free.[note 3] Their intersection is $A \cap B = \{a^n b^n c^n \mid n \geq 0\}$, which can be shown to be non-context-free by the pumping lemma for context-free languages. As a consequence, context-free languages cannot be closed under complementation, as for any languages A and B, their intersection can be expressed by union and complement: $A \cap B = \overline{\overline{A} \cup \overline{B}}$. In particular, context-free languages cannot be closed under difference, since complement can be expressed by difference: $\overline{L} = \Sigma^* \setminus L$.[12]
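The intersection argument can be sanity-checked by brute force on short words. The sketch below assumes the standard witness languages, A with equally many a's and b's followed by any number of c's, and B with any number of a's followed by equally many b's and c's; the function names are hypothetical, and the check is an illustration on a finite sample, not a proof.

```python
import re
from itertools import product

BLOCKS = re.compile(r"(a*)(b*)(c*)")   # words of the shape a...ab...bc...c

def in_A(w):
    m = BLOCKS.fullmatch(w)
    return bool(m) and len(m.group(1)) == len(m.group(2))   # #a == #b

def in_B(w):
    m = BLOCKS.fullmatch(w)
    return bool(m) and len(m.group(2)) == len(m.group(3))   # #b == #c

def in_anbncn(w):
    n = len(w) // 3
    return w == "a" * n + "b" * n + "c" * n

def intersection_matches(max_len=6):
    """Compare A ∩ B with {a^n b^n c^n} on every word up to max_len."""
    for length in range(max_len + 1):
        for letters in product("abc", repeat=length):
            w = "".join(letters)
            if (in_A(w) and in_B(w)) != in_anbncn(w):
                return False
    return True
```

Each of `in_A` and `in_B` compares only two block lengths, which a single stack can do; the intersection forces all three to agree, which is what the pumping lemma rules out for a context-free language.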
However, if L is a context-free language and D is a regular language then both their intersection and their difference are context-free languages.[13]
Decidability
In formal language theory, questions about regular languages are usually decidable, but ones about context-free languages are often not. It is decidable whether such a language is finite, but not whether it contains every possible string, is regular, is unambiguous, or is equivalent to a language with a different grammar.
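The following problems are undecidable for arbitrarily given context-free grammars A and B: equivalence (is $L(A) = L(B)$?), disjointness (is $L(A) \cap L(B) = \emptyset$?), containment (is $L(A) \subseteq L(B)$?), and universality (is $L(A) = \Sigma^*$?).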
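The following problems are decidable for arbitrary context-free languages: emptiness (given a grammar A, is $L(A) = \emptyset$?), finiteness (is $L(A)$ finite?), and membership (given a word $w$, is $w \in L(A)$?).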
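Emptiness, for instance, can be decided by computing which nonterminals derive at least one terminal string. The sketch below assumes a grammar given as a Python dictionary from nonterminals to lists of right-hand sides; this encoding and the name `is_empty` are illustrative choices, not part of the original text.

```python
def is_empty(productions, start="S"):
    """Decide whether the grammar generates no string at all.

    productions: dict mapping each nonterminal to a list of right-hand sides,
    each a tuple of symbols. Symbols that are not keys are treated as terminals.
    """
    generating = set()     # nonterminals known to derive some terminal string
    changed = True
    while changed:         # iterate until no new nonterminal can be added
        changed = False
        for head, bodies in productions.items():
            if head in generating:
                continue
            for body in bodies:
                # a body is usable if every symbol is a terminal or already generating
                if all(sym in generating or sym not in productions for sym in body):
                    generating.add(head)
                    changed = True
                    break
    return start not in generating
```

For example, `is_empty({"S": [("a", "S", "b"), ("a", "b")]})` is `False`, while a grammar whose only rules are `S → A` and `A → A a` generates nothing, since `A` can never get rid of itself.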
According to Hopcroft, Motwani, Ullman (2003),[25] many of the fundamental closure and (un)decidability properties of context-free languages were shown in the 1961 paper of Bar-Hillel, Perles, and Shamir.[26]
Languages that are not context-free
The set $\{a^n b^n c^n d^n \mid n > 0\}$ is a context-sensitive language, but there does not exist a context-free grammar generating this language.[27] So there exist context-sensitive languages which are not context-free. To prove that a given language is not context-free, one may employ the pumping lemma for context-free languages[26] or a number of other methods, such as Ogden's lemma or Parikh's theorem.[28]
Notes
References
- Hopcroft & Ullman 1979, p. 100, Theorem 4.7.
- Valiant, Leslie G. (April 1975). "General context-free recognition in less than cubic time". Journal of Computer and System Sciences. 10 (2): 308–315. doi:10.1016/s0022-0000(75)80046-8.
- Lee, Lillian (January 2002). "Fast Context-Free Grammar Parsing Requires Fast Boolean Matrix Multiplication" (PDF). Journal of the ACM. 49 (1): 1–15. arXiv:cs/0112018. doi:10.1145/505241.505242. S2CID 1243491. Archived (PDF) from the original on 2003-04-27.
- Knuth, D. E. (July 1965). "On the translation of languages from left to right". Information and Control. 8 (6): 607–639. doi:10.1016/S0019-9958(65)90426-2.
- Hopcroft & Ullman 1979, p. 131, Corollary of Theorem 6.1.
- Hopcroft & Ullman 1979, p. 142, Exercise 6.4d.
- Hopcroft & Ullman 1979, pp. 131–132, Corollary of Theorem 6.2.
- Hopcroft & Ullman 1979, p. 132, Theorem 6.3.
- Hopcroft & Ullman 1979, pp. 142–144, Exercise 6.4c.
- Hopcroft & Ullman 1979, p. 142, Exercise 6.4b.
- Hopcroft & Ullman 1979, p. 142, Exercise 6.4a.
- Scheinberg, Stephen (1960). "Note on the Boolean Properties of Context Free Languages" (PDF). Information and Control. 3 (4): 372–375. doi:10.1016/s0019-9958(60)90965-7. Archived (PDF) from the original on 2018-11-26.
- Beigel, Richard; Gasarch, William. "A Proof that if L = L1 ∩ L2 where L1 is CFL and L2 is Regular then L is Context Free Which Does Not use PDA's" (PDF). University of Maryland Department of Computer Science. Archived (PDF) from the original on 2014-12-12. Retrieved June 6, 2020.
- Hopcroft & Ullman 1979, p. 203, Theorem 8.12(1).
- Hopcroft & Ullman 1979, p. 202, Theorem 8.10.
- Salomaa 1973, p. 59, Theorem 6.7.
- Hopcroft & Ullman 1979, p. 135, Theorem 6.5.
- Hopcroft & Ullman 1979, p. 203, Theorem 8.12(2).
- Hopcroft & Ullman 1979, p. 203, Theorem 8.12(4).
- Hopcroft & Ullman 1979, p. 203, Theorem 8.11.
- Hopcroft & Ullman 1979, p. 205, Theorem 8.15.
- Hopcroft & Ullman 1979, p. 206, Theorem 8.16.
- Hopcroft & Ullman 1979, p. 137, Theorem 6.6(a).
- Hopcroft & Ullman 1979, p. 137, Theorem 6.6(b).
- Hopcroft, John E.; Motwani, Rajeev; Ullman, Jeffrey D. (2003). Introduction to Automata Theory, Languages, and Computation. Addison Wesley. Here: Sect. 7.6, p. 304, and Sect. 9.7, p. 411.
- Bar-Hillel, Yehoshua; Perles, Micha Asher; Shamir, Eli (1961). "On Formal Properties of Simple Phrase-Structure Grammars". Zeitschrift für Phonetik, Sprachwissenschaft und Kommunikationsforschung. 14 (2): 143–172.
- Hopcroft & Ullman 1979.
- "How to prove that a language is not context-free?".
Works cited
- Hopcroft, John E.; Ullman, Jeffrey D. (1979). Introduction to Automata Theory, Languages, and Computation (1st ed.). Addison-Wesley. ISBN 9780201029888.
- Salomaa, Arto (1973). Formal Languages. ACM Monograph Series.
Further reading
- Autebert, Jean-Michel; Berstel, Jean; Boasson, Luc (1997). "Context-Free Languages and Push-Down Automata". In G. Rozenberg; A. Salomaa (eds.). Handbook of Formal Languages (PDF). Vol. 1. Springer-Verlag. pp. 111–174. Archived (PDF) from the original on 2011-05-16.
- Ginsburg, Seymour (1966). The Mathematical Theory of Context-Free Languages. New York, NY, USA: McGraw-Hill.
- Sipser, Michael (1997). "2: Context-Free Languages". Introduction to the Theory of Computation. PWS Publishing. pp. 91–122. ISBN 0-534-94728-X.
1
HANDLING CONTEXT-FREE WORDS
2
Context-free words have an important role to play in the translating process. They usually have permanent equivalents in TL which, in most cases, can be used in TT. The translator is thus provided with reference points helping him to choose the appropriate translation variants. The permanent equivalents of context-free words are often formed by transcription (with possible elements of transliteration) or loan translations.
3
Three methods of transferring proper names can be distinguished. Translation: each frequently occurring name corresponds to an equivalent in the target language that has become historically accepted. Transcription: the foreign proper name corresponds to a word in the target language that reproduces its sound as accurately as the target language allows. Transliteration: the proper name is transferred letter by letter from the writing system of the original language into the writing system of the target language.
4
Proper and geographical names are transcribed with TL letters, e.g.: Smith – Смит, Brown – Браун, John Fitzgerald Kennedy – Джон Фитцжеральд Кеннеди; Cleveland – Кливленд, Rhode Island – Род-Айленд, Ontario – Онтарио; Downing Street – Даунинг-стрит, Foley Square – Фоли-сквер. The same is true about the titles of periodicals and the names of firms and corporations, e.g.: Life – «Лайф», US News and World Report – «ЮС ньюс энд уорлд рипорт», General Motors Corporation – «Дженерал моторс корпорейшн», Harriman and Brothers – «Гарриман энд бразерс», Anaconda Mining Company – «Анаконда майнинг компани».
5
Practical transcription: Newton – Ньютон, Campbell – Кэмпбелл, Folkstone – Фолькстон, Malcolm – Малькольм, Palm – Палм, Robert – Роберт, Whistle – Уистл.
6
Transcription is also used to reproduce in TL the names of ships, aircraft, missiles and pieces of military equipment: Queen Elisabeth – «Куин Элизабет», Spitfire – «Спитфайр», Hawk – «Хок», Trident – «Трайдент», Honest John – «Онест Джон».
7
The rules of transcription have two minor exceptions. First, transcription is sometimes supplemented by elements of transliteration, when SL letters are reproduced in TT instead of sounds. This technique is used with mute and double consonants between vowels or at the end of the word and with neutral vowels (Dorset – Дорсет, Bonners Ferry – Боннерс Ферри) as well as to preserve some elements of SL spelling so as to make the TL equivalent resemble some familiar pattern (the Hercules missile – ракета «Геркулес», Columbia – Колумбия). Second, there are some traditional exceptions in rendering the names of historical personalities and geographical names, e.g.: Charles I – Карл I, James II – Яков II, Edinburgh – Эдинбург. Some geographical names are made up of common nouns and are translated word-for-word: the United States of America – Соединенные Штаты Америки, the United Kingdom – Соединенное Королевство, the Rocky Mountains – Скалистые горы.
8
If the name includes both a proper name and a common name, the former is transcribed while the latter is either translated or transcribed or both: the Atlantic Ocean – Атлантический океан, Kansas City – Канзас-сити, New Hampshire – Нью-Хемпшир, Firth of Clyde – залив Ферт-оф-Клайд. Names of political parties, trade unions and similar bodies are usually translated word-for-word (with or without a change in the word-order): the Republican Party – республиканская партия, the United Automobile Workers Union – Объединенный профсоюз рабочих автомобильной промышленности, the Federal Bureau of Investigation – Федеральное бюро расследований.
9
Terminological words are also relatively context-free though the context often helps to identify the specific field to which the term belongs. In the sentence «These rifles are provided with a new type of foresight», the context clearly shows that the meaning of «foresight» is that of a military term and therefore all other meanings of the word can be disregarded. The context may also help to understand the meaning of the term in the text when it can denote more than one specific concept. For instance, in the US political terminology the term «state» can refer either to a national state or to one of the states within a federal entity. The following context will enable the translator to make the correct choice: «Both the state and Federal authorities were accused of establishing a police state.» In the first case the term «state» is contrasted with «Federal» and will be translated as «штат», while in the second case it obviously means «государство».
10
As a rule, English technical terms (as well as political terms and terms in any other specific field) have their permanent equivalents in the respective Russian terminological systems: magnitude – величина, oxygen – кислород, surplus value – прибавочная стоимость, Embassy – посольство, legislation – законодательство. Many Russian equivalents have been formed from the English terms by transcription or loan translations: computer – компьютер, electron – электрон, Congressman – конгрессмен, impeachment – импичмент, shadow cabinet – «теневой кабинет», nuclear deterrent – ядерное устрашение. Quite a few among them are international terms: theorem – теорема, television – телевидение, president – президент, declaration – декларация, diplomacy – дипломатия.
11
In some cases there are parallel forms in Russian: one formed by transcription and the other, so to speak, native, e.g.: резистор and сопротивление, бустер and ускоритель, индустрия and промышленность, тред-юнион and профсоюз, лидер and руководитель.
12
The translator makes his choice considering whether ST is highly technical or not, for a borrowed term is usually more familiar to specialists than to laymen. He also has to take into account the possible differences between the two forms in the way they are used in TL. For example, the Russian «индустрия» is restricted in usage and somewhat old-fashioned, «тред-юнион» always refers to British trade-unions and «лидер» gives the text a slightly foreign flavour.
13
Dealing with context-free words the translator must be aware of two common causes of translation errors. First, English and Russian terms can be similar in form but different in meaning. A «decade» is not «декада», an «instrument» is not «инструмент», and a «department» in the United States is not «департамент». Such words belong to the so-called false friends of the translator (see below). Second, the translator should not rely on the «inner form» of the English term to understand its meaning or to find a proper Russian equivalent, for it is often misleading. A «packing industry» is not «упаковочная» but «консервная промышленность», «conventional armaments» are not «условные» but «обычные вооружения» and a «public school» in Britain is not «публичная» or «общедоступная» but «частная школа».
14
Translation of technical terms puts a premium on the translator’s knowledge of the subject-matter of ST. He must take great pains to get familiar with the system of terms in the appropriate field and make good use of technical dictionaries and other books of reference.
15
Thank you for your attention!
General definition: proper names are nouns used to name a certain person, place or thing. Until recently they were not entered in dictionaries, as translators thought they were untranslatable.
1) Proper names form a heterogeneous system: personal names; pet names (diminutive, endearing); geographical names; names of institutions, etc.
2) They can be translated following a well-established tradition (names of monarchs, biblical names) or according to modern demand. The two approaches should not be mixed, so as not to create doublets. The choice may depend on the nationality of the bearer: Hugo – Хьюго – Гюго.
Personal names fall into 2 groups depending on their translation: those translated according to tradition and those translated in keeping with the modern tendency.
Pet names are used to 1) characterize a person in ?; 2) express interpersonal distances; 3) show the attitude. This can be expressed implicitly or explicitly. In translation various means are used (Shorty – Коротышка; Scout – Глазастик; Матрёшенька – tiny Matreshka).
Firms, ships, newspapers are transcribed.
Organizations, societies are loan translated: Фонд обязательного мед страхования – Obligatory Medical Insurance Fund.
Geographical names and microtoponyms are handled depending on several factors: 1) the type of nomination (one word or multi-word); 2) the structure of multi-word complexes.
1) One-word names are transliterated (Gibraltar – Гибралтар).
2) Multi-word names are a) transcribed and/or transliterated (Downing Street – Даунинг-стрит); b) translated (Cape of Good Hope – Мыс Доброй Надежды); c) both (Bull Lake – озеро Булл Лэйк).
26. Handling English noun phrases with multiple premodifiers in translation.
They are often used in publicistic writing, because of the dense packaging of information (a take-it-or-leave-it resolution); in academic writing, because they have a terminological character (electric equipment supply); and in fiction, for their emotional and expressive properties (a please-leave-me-alone expression).
There are many differences between Russian and English in meaning, form and usage.
Semantic structure: 1) polysemy of the word-group due to polysemy of the pattern (Washington support); 2) a broad range of semantic relations between elements (war expenditure – relations of purpose; war heroes – qualifying relations).
Form: 1) according to the number of premodifiers: 2-member and multiple-member (1 – redundancy dismissals, 2 – retail price index increase); 2) the premodifiers can be various parts of speech in different cases (adjectives; -ed participles; -ing participles; nouns).
Translation:
1) Identification of the semantic and structural center of the phrase. It is necessary to take into account noun-headed phrases of 4 structural types, premodified by: 1) general adjectives (official negotiations); 2) an -ed participial modifier (restricted area); 3) an -ing participial modifier (growing problem); 4) nouns (stabilization fund).
2) Semantic analysis usually moves from left to right, but it often produces ambiguous results due to a wide range of meaning relationships when the exact relationship is not explicit.
3) Translation proper: The world's first push-button controlled solid fuel central heating system – первая в мире центральная отопительная система на твёрдом топливе, имеющая кнопочное управление. Translation proceeds left to right.
Such phrases are translated by: 1) equivalents (random sampling); 2) borrowing: transcription, loan translation; 3) variants (pink elephants – бред алког.; чертики в глазах); 4) description (wage deadlock).
The "context" can be explained with regard to the production rules allowed for the different grammars in the Chomsky hierarchy.
If you consider context-free grammars, their production rules have the following form:
$$ A \rightarrow \alpha $$
So, you can observe that the left part of this kind of rules is made up of only one non-terminal symbol; thus, the substitution of the non-terminal symbol takes place without considering its «context», that is the other symbols it is surrounded by.
On the other hand, if you consider production rules of context-sensitive grammars, they have the following form:
$$ \beta A \gamma \rightarrow \beta \alpha \gamma $$
where $A$ is a non-terminal and $\alpha$, $\beta$, $\gamma$ are sequences of non-terminals and terminals.
In this case the «context» (i.e., $\beta$ and $\gamma$) of the non-terminal symbol to be substituted influences the effect of the substitution and it is part of the rule itself.
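The difference between the two rule shapes can be made concrete with a small Python sketch. Sentential forms are represented as lists of symbols; the function names and this representation are assumptions made for illustration only.

```python
def apply_context_free(form, head, body):
    """Rewrite the leftmost `head` with `body`, whatever surrounds it."""
    if head not in form:
        return None                         # rule not applicable
    i = form.index(head)
    return form[:i] + body + form[i + 1:]

def apply_context_sensitive(form, left, head, right, body):
    """Rewrite `head` with `body` only where it sits between `left` and `right`."""
    pattern = left + [head] + right
    for i in range(len(form) - len(pattern) + 1):
        if form[i:i + len(pattern)] == pattern:
            # the context (left, right) is copied unchanged; only head is replaced
            return form[:i] + left + body + right + form[i + len(pattern):]
    return None                             # the required context was not found
```

For example, `apply_context_free(["c", "A", "b"], "A", ["x"])` succeeds regardless of the neighbours of `A`, whereas `apply_context_sensitive(["c", "A", "b"], ["a"], "A", ["b"], ["x"])` returns `None` because `A` is not preceded by `a`.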
You can find more details in this answer on mathematics and in this answer on software engineering.