Natural Language
Natural language is context-sensitive and therefore difficult to parse, in contrast to programming languages, which are largely context-free.
Natural Language Processing
Natural language processing is typically done in several steps:
- Tokenizing: Separate the text into individual words
- Tagging: Detect word type (Noun, Verb, etc.)
- Chunking: Group words into phrases
- Extraction: Analyze meaning
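The first step, tokenizing, can be sketched with a regular expression. This is only a minimal illustration, assuming that words are runs of word characters and that each punctuation mark is a token of its own; the function name `tokenize` is made up for this example, and real tokenizers handle many more cases (contractions, abbreviations, numbers).

```python
import re

def tokenize(text):
    """Split text into word tokens and single-character punctuation tokens."""
    # \w+ matches a run of word characters; [^\w\s] matches one punctuation mark.
    return re.findall(r"\w+|[^\w\s]", text)

print(tokenize("The little dog will fly."))
# ['The', 'little', 'dog', 'will', 'fly', '.']
```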
Part of Speech (POS) Tagging
Tag | Description | Example |
---|---|---|
DT | Determiner (article) | the, a |
NN | Noun | dog, car |
VB | Verb | fly |
JJ | Adjective | little |
IN | Preposition or subordinating conjunction | at, on, if |
MD | Modal | shall, will |
EX | Existential "there" | there |
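As a rough illustration of tagging, the sketch below looks each word up in a small dictionary built only from the example words in the table. The names `LEXICON` and `tag` and the fallback to `NN` for unknown words are assumptions of this example. Real taggers also use the surrounding context, because a word such as "fly" can be a noun or a verb.

```python
# Toy lexicon containing only the example words from the table (an assumption
# of this sketch, not a complete tag dictionary).
LEXICON = {
    "the": "DT", "a": "DT",
    "dog": "NN", "car": "NN",
    "fly": "VB",
    "little": "JJ",
    "at": "IN", "on": "IN", "if": "IN",
    "shall": "MD", "will": "MD",
    "there": "EX",
}

def tag(tokens, default="NN"):
    """Return (word, tag) pairs; unknown words fall back to `default`."""
    return [(token, LEXICON.get(token.lower(), default)) for token in tokens]

print(tag(["The", "little", "dog", "will", "fly"]))
# [('The', 'DT'), ('little', 'JJ'), ('dog', 'NN'), ('will', 'MD'), ('fly', 'VB')]
```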
Chunking
For each type of phrase (e.g. noun phrase), every word is tagged with one of three IOB tags: I (inside), O (outside), or B (begin). The first word of a phrase gets B, the following words get I if they belong to the same phrase, and all other words get O.
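The IOB scheme can be illustrated with a short sketch that turns known phrase positions into tags. The function name `to_iob`, the `(start, end)` span format, and the `B-NP` / `I-NP` spelling (tag plus phrase type) are assumptions of this example, not something the text above prescribes.

```python
def to_iob(words, spans, label="NP"):
    """Tag each word as B-<label>, I-<label>, or O.

    `spans` holds (start, end) index pairs for each phrase, end exclusive.
    """
    tags = ["O"] * len(words)          # every word starts outside any phrase
    for start, end in spans:
        tags[start] = f"B-{label}"     # first word of the phrase
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"     # remaining words of the same phrase
    return list(zip(words, tags))

words = ["the", "little", "dog", "will", "fly"]
print(to_iob(words, [(0, 3)]))
# [('the', 'B-NP'), ('little', 'I-NP'), ('dog', 'I-NP'), ('will', 'O'), ('fly', 'O')]
```

Using B for the first word makes it possible to tell two adjacent phrases of the same type apart, which I and O alone could not do.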
Chunk | Description | Example |
---|---|---|
NP | Noun Phrase | the little dog |
VP | Verb Phrase | will fly |
PP | Prepositional Phrase | to |
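A very simple rule-based chunker over POS tags might look like the following sketch. The three patterns (optional article, adjectives, then nouns for NP; optional modal plus verbs for VP; a single preposition for PP) and the names `RULES` and `chunk` are hypothetical choices for this example. They cover only the table's examples and are not a complete grammar.

```python
import re

# Hypothetical chunk rules written as regexes over space-terminated POS tags.
# The first rule that matches at the current position wins.
RULES = [
    ("NP", re.compile(r"(DT )?(JJ )*(NN )+")),
    ("VP", re.compile(r"(MD )?(VB )+")),
    ("PP", re.compile(r"IN ")),
]

def chunk(tagged):
    """Group (word, tag) pairs into (label, [words]) chunks."""
    chunks, i = [], 0
    while i < len(tagged):
        # Remaining tag sequence, e.g. "DT JJ NN MD VB ".
        rest = "".join(pos + " " for _, pos in tagged[i:])
        for label, pattern in RULES:
            match = pattern.match(rest)
            if match:
                n = match.group(0).count(" ")  # number of tags consumed
                chunks.append((label, [word for word, _ in tagged[i:i + n]]))
                i += n
                break
        else:
            # No rule matched: the word stays outside every chunk.
            chunks.append(("O", [tagged[i][0]]))
            i += 1
    return chunks

tagged = [("the", "DT"), ("little", "JJ"), ("dog", "NN"), ("will", "MD"), ("fly", "VB")]
print(chunk(tagged))
# [('NP', ['the', 'little', 'dog']), ('VP', ['will', 'fly'])]
```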