An example of language parsing
July 14, 2008

Before we get too generic about how to parse language, let's start with a simple problem: how to parse the statement "My name is Daniel" and update our data structure to reflect it.
Defining inputs and outputs
Our input is a list of words:
    "My", "name", "is", "Daniel"
Our desired output is a value assignment that will modify our data structure:
    speaker.first_name = "Daniel"
... where speaker is a new entity in the data structure to represent the person we're conversing with.
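As a rough sketch of the input and the desired result, here is one way they might be modeled in Python. The language choice and the names Entity, tokens, and attributes are assumptions made for illustration; the post itself does not prescribe a representation.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A named node in the data structure, holding attribute values (hypothetical)."""
    name: str
    attributes: dict = field(default_factory=dict)


# Input: the statement as a list of words.
tokens = ["My", "name", "is", "Daniel"]

# Desired output: a new entity for the person we're conversing with,
# updated by the assignment speaker.first_name = "Daniel".
speaker = Entity("speaker")
speaker.attributes["first_name"] = "Daniel"
```

The parsing problem is then to get from tokens to that final assignment without hard-coding it.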
Transformations
We can achieve our goal by applying a series of transformations to the list of words. For instance, take the transformation "my {noun} -> speaker.$1".
What this says is that if the word "my" is followed by a noun, the pair could be referring to speaker.noun. In our example, "My name" becomes speaker.name.
However, we have a bit of an issue, since we want speaker.first_name. This highlights that "name" is still a word; it hasn't been mapped to an entity in the data structure yet. What we need is a mapping from words to entities. In our example, we want the word "name" to map to the entity first_name.
Thus, when a transformation such as "my {noun} -> speaker.$1" is applied, a second step resolves the noun to its possible entities.
What we have now is an intermediate representation, such as:
    [speaker.first_name] is Daniel
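The step from the word list to this intermediate representation could be sketched as follows. This is only an illustration under stated assumptions: the NOUN_ENTITIES table, the apply_possessive function, and the ("entity", ...) / ("word", ...) tagging are all hypothetical names, and picking the first candidate entity stands in for whatever disambiguation a real system would do.

```python
# Hypothetical word-to-entity mapping; a word may map to several candidates.
NOUN_ENTITIES = {"name": ["first_name"]}


def apply_possessive(tokens):
    """Apply "my {noun} -> speaker.$1", resolving the noun to an entity."""
    result = []
    i = 0
    while i < len(tokens):
        word = tokens[i]
        nxt = tokens[i + 1].lower() if i + 1 < len(tokens) else None
        if word.lower() == "my" and nxt in NOUN_ENTITIES:
            # Take the first candidate; a fuller system would weigh alternatives.
            entity = NOUN_ENTITIES[nxt][0]
            result.append(("entity", f"speaker.{entity}"))
            i += 2  # consume both "my" and the noun
        else:
            # Anything not matched stays a plain word for later transformations.
            result.append(("word", word))
            i += 1
    return result


intermediate = apply_possessive(["My", "name", "is", "Daniel"])
# intermediate == [("entity", "speaker.first_name"), ("word", "is"), ("word", "Daniel")]
```

Tagging each item as either an entity or a word keeps the partially transformed statement in a single list, so later transformations can match on both kinds.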
The first part, speaker.first_name, is fully transformed, but the rest of the statement is still a list of words. The next transformation we need is:
    {noun} is {word} -> $1 = $2
What this says is that if the word "is" appears between a noun and a word, the word could define the value of the noun. In our example, the word "Daniel" defines the value of "My name":
    [speaker.first_name] is Daniel -> speaker.first_name = "Daniel"
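A minimal sketch of this last transformation, assuming the partially transformed statement is held as a list of tagged pairs in which ("entity", ...) marks a resolved noun and ("word", ...) marks a remaining word. The apply_is function, the tagging scheme, and the flat data dictionary are hypothetical choices for illustration only.

```python
def apply_is(items):
    """Apply "{noun} is {word} -> $1 = $2" to a list of tagged items."""
    for i in range(len(items) - 2):
        (kind1, target), (kind2, middle), (kind3, value) = items[i:i + 3]
        if (
            kind1 == "entity"
            and (kind2, middle.lower()) == ("word", "is")
            and kind3 == "word"
        ):
            # The word after "is" defines the value of the resolved noun.
            return (target, value)
    return None  # the pattern did not match anywhere


# The data structure is modeled here as a flat dict keyed by entity path.
data = {}
assignment = apply_is(
    [("entity", "speaker.first_name"), ("word", "is"), ("word", "Daniel")]
)
if assignment is not None:
    target, value = assignment
    data[target] = value
```

Returning None when nothing matches leaves room for other transformations to be tried on the same statement.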