Properties of Context Free Languages and Turing Machines
Context-Free Grammar (CFG): S -> aS | b
A context-free grammar (CFG) is a formal way to describe the syntax of languages. It consists of:
- A set of production rules, like
S -> aS | b
, which dictate how strings in the language are formed. - Terminals: The actual characters in the language (e.g.,
a
,b
). - Non-terminals: Variables (e.g.,
S
) that are placeholders for patterns in the language. - Start symbol: The initial non-terminal that generates strings.
For example, in the rule S -> aS | b
, we have:
S
: A non-terminal that can generate more characters.a
andb
: Terminals (characters from the language).|
: Means “or”, soS
can either becomeaS
orb
.
Types of CFG
A CFG can be:
- linear:
S->aSb|L
one non terminal on RHS- non-linear:
S->aSb|bSa|SS|L
more than one non terminal on RHS
Example Rule: S -> aS | b
This rule generates strings that consist of one or more a
s followed by a single b
. Here’s how it works:
- Start with
S
:
- Apply the rule
S -> aS
:
- Apply the rule again
S -> aS
:
- Keep applying until you want to stop with
S -> b
:
Thus, the string aaab
is generated by applying the production rules step-by-step.
Derivations
A sequence of applications of production rules. For the above rule, here’s an example:
- Start with
S
. - Apply
S -> aS
:
- Apply
S -> aS
again:
- Finally, apply
S -> b
:
So, the derivation for aaab
is:
Example Strings
Using the rule S -> aS | b
, the following strings can be generated:
b
(by directly applyingS -> b
)ab
(by applyingS -> aS
, thenS -> b
)aab
(by applyingS -> aS -> aS -> b
)aaab
,aaaab
, and so on (morea
s followed by oneb
).
- Recursion: The rule
S -> aS
is recursive becauseS
can call itself. This allows for generating strings with any number ofa
s. - Termination: The recursion stops when we use the rule
S -> b
, ensuring that the string eventually ends with ab
.
String belongs to Grammar or not?
- Start with the S start symbol and choose the closest production that matches to the given string.
- Replace the Variables with its most appropriate production. Repeat the process until the string is generated or until no other productions are left.
Left and Rightmost
Leftmost derivation
Rightmost Derivation
Ambiguous Grammar
Warning
The left one is the right one! A compiler can not understand ambigious grammar.
a+bc
Regex
aab
Grammar vs Language Ambiguity
Normal Forms
To sanitize grammar for the compiler.
- CNF
- GNF
- Membership problem: a string is a part of the language or not?
- Enabled parsing: you must stop after K steps, quit if over limit.
CNF
- Eliminate
L
productions - Eliminate unit productions: each step must increase the length of the sentencial form or the number of terminals.
- Eliminate useless productions and symbols.
- Deriviablility: each variable must derive a string / end with set of terminals or
L
- Reachability: each variable must be reachable from
S
- Deriviablility: each variable must derive a string / end with set of terminals or