An example of language parsing
July 14, 2008

Before we get too generic about how to parse language, let's start with a simple problem: how to parse the statement "My name is Daniel" and update our data structure to reflect it.

Defining inputs and outputs

Our input is a list of words:

"My", "name", "is", "Daniel"

Our desired output is a value assignment that will modify our data structure:

speaker.first_name = "Daniel"

... where speaker is a new entity in the data structure to represent the person we're conversing with.
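To make the input and output concrete, here is a minimal Python sketch. The Speaker class and the dictionary of entities are assumptions made for illustration; the post itself doesn't specify how the data structure is stored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Speaker:
    # Hypothetical entity for the person we're conversing with.
    first_name: Optional[str] = None


# Input: the statement, already split into words.
words = ["My", "name", "is", "Daniel"]

# The data structure, holding a freshly created speaker entity.
entities = {"speaker": Speaker()}

# Desired output: the value assignment the parser should end up producing.
entities["speaker"].first_name = "Daniel"
```

Everything that follows is about getting from the word list to that assignment automatically.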

Transformations

We can achieve our goal by applying a series of transformations to the list of words. For instance:

my {noun} -> speaker.$1

This says that when the word "my" is followed by a noun, the pair may refer to that noun as a property of the speaker. The $1 stands for whatever matched {noun}. In our example:

my name -> speaker.name
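One way this rule might look in Python is sketched below. The NOUNS set is a stand-in for real part-of-speech knowledge, and the tuple ("speaker", noun) is just one possible way to represent speaker.$1; both are assumptions.

```python
# Hypothetical list of words we treat as nouns.
NOUNS = {"name", "age", "city"}


def apply_my_noun(words):
    """Rewrite every 'my {noun}' pair as a ("speaker", noun) reference."""
    out = []
    i = 0
    while i < len(words):
        has_next = i + 1 < len(words)
        if words[i].lower() == "my" and has_next and words[i + 1].lower() in NOUNS:
            # $1 is the matched noun; it is still a word, not yet an entity.
            out.append(("speaker", words[i + 1].lower()))
            i += 2
        else:
            out.append(words[i])
            i += 1
    return out


print(apply_my_noun(["My", "name", "is", "Daniel"]))
# prints [('speaker', 'name'), 'is', 'Daniel']
```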

However, this isn't quite what we want: our target is speaker.first_name, not speaker.name. This highlights that "name" is still a word; it hasn't been mapped to an entity in the data structure yet. What we need is a mapping from words to entities. In our example, we want:

"name" -> first_name

Thus, when a transformation such as "my {noun} -> speaker.$1" is applied, a second step resolves the matched noun to its possible entities.
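The word-to-entity mapping could be sketched as a simple lookup table. The WORD_TO_ENTITIES name and the resolve helper are hypothetical; each word maps to a list because a noun may have more than one possible entity, although only "name" -> first_name comes from the example.

```python
# Hypothetical mapping from words to candidate entities.
WORD_TO_ENTITIES = {
    "name": ["first_name"],
}


def resolve(owner, noun):
    """Return the possible entity paths for a noun belonging to owner."""
    candidates = WORD_TO_ENTITIES.get(noun.lower(), [])
    return [f"{owner}.{entity}" for entity in candidates]


print(resolve("speaker", "name"))  # prints ['speaker.first_name']
print(resolve("speaker", "shoe"))  # prints []
```

An empty result means the noun couldn't be resolved, so the transformation shouldn't be considered fully applied.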

What we have now is an intermediate representation, such as:

[speaker.first_name] is Daniel

The first part, speaker.first_name, is fully transformed, but the rest of the statement is still a list of words. The next transformation we need is:

{noun} is {word} -> $1 = $2

This says that when the word "is" appears between a noun and a word, the word may define the value of the noun. In our example, the word "Daniel" defines the value of "My name", which has already been resolved to speaker.first_name:

[speaker.first_name] is Daniel -> speaker.first_name = "Daniel"
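Putting both transformations together, the whole example might be sketched like this. The Ref and Assign classes, the single-entity WORD_TO_ENTITY table, and the fixed rule order are simplifying assumptions rather than a definitive design. In particular, {noun} in the second rule is taken to mean an already-resolved reference.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Speaker:
    first_name: Optional[str] = None


@dataclass(frozen=True)
class Ref:
    owner: str   # e.g. "speaker"
    field: str   # e.g. "first_name"


@dataclass(frozen=True)
class Assign:
    target: Ref
    value: str


# Hypothetical word-to-entity mapping (one entity per word, for simplicity).
WORD_TO_ENTITY = {"name": "first_name"}


def rule_my_noun(tokens):
    """my {noun} -> speaker.$1, resolving the noun to its entity."""
    out, i = [], 0
    while i < len(tokens):
        tok = tokens[i]
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if (isinstance(tok, str) and tok.lower() == "my"
                and isinstance(nxt, str) and nxt.lower() in WORD_TO_ENTITY):
            out.append(Ref("speaker", WORD_TO_ENTITY[nxt.lower()]))
            i += 2
        else:
            out.append(tok)
            i += 1
    return out


def rule_is(tokens):
    """{noun} is {word} -> $1 = $2."""
    out, i = [], 0
    while i < len(tokens):
        if (i + 2 < len(tokens) and isinstance(tokens[i], Ref)
                and tokens[i + 1] == "is" and isinstance(tokens[i + 2], str)):
            out.append(Assign(tokens[i], tokens[i + 2]))
            i += 3
        else:
            out.append(tokens[i])
            i += 1
    return out


def parse(words, entities):
    """Apply both rules in order, then execute any resulting assignments."""
    tokens = rule_is(rule_my_noun(words))
    for tok in tokens:
        if isinstance(tok, Assign):
            setattr(entities[tok.target.owner], tok.target.field, tok.value)
    return tokens


entities = {"speaker": Speaker()}
result = parse(["My", "name", "is", "Daniel"], entities)
print(entities["speaker"].first_name)  # prints Daniel
```

Any words that no rule matches are left in the token list untouched, which mirrors the intermediate representation described above.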