Transformation: A specification
Summary
Transformations are used to iteratively transform a statement or question written in a human language into a format easily understood by the computer.
Example transformation:
What this transformation says is that if a noun follows the word "my", it is possible that the noun refers to a property of the person who spoke/wrote the statement.
For example:
Notice how the $1 token above refers to the entity that "filled in the blank", which in this case, is the age entity.
Now consider this transformation:
This transformation says that if a noun with a 's suffix is followed by a second noun, then it is possible that the second noun refers to a property of the first noun.
For example:
| Daniel's age -> Daniel.age |
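The possessive rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the NOUNS lexicon and the apply_possessive function are hypothetical names, and a real system would need its own way of deciding which words resolve to a noun entity.

```python
# Hypothetical noun lexicon; a real system would need its own way to
# decide which words resolve to the {noun} entity type.
NOUNS = {"Daniel", "age", "car"}


def apply_possessive(words):
    """Rewrite every "<noun>'s <noun>" pair as "<noun>.<noun>"."""
    result = []
    i = 0
    while i < len(words):
        word = words[i]
        # Strip the 's suffix to find the candidate owner, if any.
        owner = word[:-2] if word.endswith("'s") else None
        nxt = words[i + 1] if i + 1 < len(words) else None
        if owner in NOUNS and nxt in NOUNS:
            # $1 is the owner, $2 is the noun that follows it.
            result.append(f"{owner}.{nxt}")
            i += 2
        else:
            result.append(word)
            i += 1
    return result


print(apply_possessive(["Daniel's", "age"]))  # prints ['Daniel.age']
```

Words that do not match the pattern are passed through unchanged, so the function can be applied to a longer list of words without disturbing the rest of it.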
Specification
The following is a specification for transformations. For more details on terminology, see the glossary section below.
  | A transformation consists of an input specification and an output specification. |
  | The input specification is represented by a list of tokens. |
 |   | A token can be one of two types: |
 |   | A literal (usually a word, but it can also be something such as 's) |
 |   | An entity type (entity type tokens are surrounded by curly braces) |
  | The output specification can be either an assignment or a value. |
 |   | An assignment is of the form a.b = c, where a, b and c are entities. We'll call a the object, b the property, and c the value. |
 |   | In some cases, a will be of the form a1.a2, or a1.a2.a3, etc. |
 |   | An output specification that is a value is of the form a, a.b, a.b.c, etc. In other words, a property. |
 |   | Any entity in the output specification can be represented by a variable. Variables are numbered and begin with a dollar sign. They correspond to the actual entities that the entity type tokens of the input specification resolve to. |
 |   | Any entity in an output specification can be represented by a fragment. See the glossary for more details. |
 |   | Dot notation implies that there exists a has_a relationship between each adjacent entity. |
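One possible way to model the structure listed above is with a few small data classes. The class and field names here (Literal, EntityType, Variable, Entity, Transformation, path, value) are assumptions made for illustration and are not part of the specification. Fragments inside output specifications are left out to keep the sketch short, and variables are assumed to be numbered from 1 in the order the entity type tokens appear.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass(frozen=True)
class Literal:
    text: str            # e.g. "my" or "'s"


@dataclass(frozen=True)
class EntityType:
    name: str            # e.g. "noun", written {noun} in an input specification


@dataclass(frozen=True)
class Variable:
    index: int           # $1 refers to the first entity type token


@dataclass(frozen=True)
class Entity:
    name: str            # a concrete entity such as "speaker"


OutputPart = Union[Variable, Entity]


@dataclass(frozen=True)
class Transformation:
    input_spec: List[Union[Literal, EntityType]]
    path: List[OutputPart]              # a.b.c stored as [a, b, c]
    value: Optional[OutputPart] = None  # set only for an assignment a.b = c

    def is_assignment(self) -> bool:
        return self.value is not None

    def check_variables(self) -> bool:
        """Every $n must point at an existing entity type token."""
        slots = sum(isinstance(t, EntityType) for t in self.input_spec)
        parts = list(self.path)
        if self.value is not None:
            parts.append(self.value)
        return all(1 <= p.index <= slots
                   for p in parts if isinstance(p, Variable))


# A noun with a 's suffix followed by a second noun, producing the value $1.$2.
possessive = Transformation(
    input_spec=[EntityType("noun"), Literal("'s"), EntityType("noun")],
    path=[Variable(1), Variable(2)],
)
```

Keeping the dotted path as a list makes the has_a chain explicit: each adjacent pair in the list is expected to be related by has_a.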
Consider the following:
Its input specification consists of three tokens:
  | {noun}: An entity type token |
  | 's: A literal token |
  | {noun}: An entity type token |
Its output specification is as follows:
  | It is a value output specification |
  | $1 is the object |
  | $2 is the property |
  | The relationship $1 has_a $2 must exist |
  | It references the value: object.Get(property) |
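The has_a check and the object.Get(property) lookup might be sketched as follows. The Entity class, its dictionary storage, and the resolve helper are hypothetical, as is the choice that setting a property is what establishes the has_a relationship. The age value is illustrative only.

```python
class Entity:
    """Hypothetical entity that stores its properties in a dict."""

    def __init__(self, name):
        self.name = name
        self._properties = {}

    def set(self, prop, value):
        # Assumption: setting a property establishes the has_a relationship.
        self._properties[prop] = value

    def has_a(self, prop):
        return prop in self._properties

    def get(self, prop):
        # Counterpart of object.Get(property); fails if has_a does not hold.
        if not self.has_a(prop):
            raise KeyError(f"{self.name} has no property {prop!r}")
        return self._properties[prop]


def resolve(obj, path):
    """Follow a dotted path such as ["car", "age"] starting from obj."""
    for prop in path:
        obj = obj.get(prop)
    return obj


daniel = Entity("Daniel")
daniel.set("age", 30)  # illustrative value only
print(resolve(daniel, ["age"]))  # prints 30
```

Longer values such as a.b.c resolve the same way, one has_a step at a time.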
When a transformation is applied to a list of words, the words that match are replaced with either an entity or a property. A list of words that is transformed in this way, passing through various intermediate forms, is called a fragment.
Glossary
  | Token: A general term used to mean "a discrete part". Some things, such as fragments and properties, consist of a list of tokens. |
  | Literal token: A literal token is usually just a word, but can also represent such things as 's. |
  | Entity token: A token that represents an entity. For example, person, age, and car are entities. |
  | Property token: A property token refers to a property of an entity. For example, speaker.age refers to the age of the person who is speaking/writing. Property tokens themselves consist of a list of tokens. In the previous example, speaker and age are both entity tokens. Properties in general can also contain fragments, which, if present, are surrounded by two sets of round brackets. Finally, properties used within transformation output specifications can contain variable tokens. |
  | Fragment: In the most basic sense, a fragment is a list of words. But as transformations are applied to a fragment, its words (literal tokens) are replaced with other types of tokens: property tokens or entity tokens. The input specification of a transformation is like a fragment but has different restrictions on which types of tokens it can contain, i.e. it may contain entity type tokens, but cannot contain property tokens. |
Formatting specifics
  | Entity type tokens, which are found in transformation input specifications, are surrounded by curly braces. |
  | When a literal token is replaced with an entity token, the entity is surrounded by square brackets. |
  | When one or more tokens are replaced with a property token, the property is surrounded by square brackets. |
  | If a property token contains a fragment, the fragment is surrounded by two sets of round brackets. |
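The bracket conventions above could be rendered with a few helper functions. The function names are hypothetical, and two details are assumptions rather than part of the specification: the words of a fragment are joined with spaces, and a fragment nested in a property sits between dots like any other part.

```python
def render_entity_type(name):
    # Entity type tokens are surrounded by curly braces.
    return "{" + name + "}"


def render_entity(name):
    # Entity tokens are surrounded by square brackets.
    return "[" + name + "]"


def render_fragment(words):
    # Two sets of round brackets mark a fragment nested in a property.
    return "((" + " ".join(words) + "))"


def render_property(parts):
    """parts holds entity names (str) or nested fragments (list of words)."""
    rendered = [p if isinstance(p, str) else render_fragment(p) for p in parts]
    return "[" + ".".join(rendered) + "]"


print(render_entity_type("noun"))                               # {noun}
print(render_entity("age"))                                     # [age]
print(render_property(["speaker", "age"]))                      # [speaker.age]
print(render_property(["speaker", ["best", "friend"], "age"]))  # [speaker.((best friend)).age]
```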