Parsing numbers
September 6, 2008

After some thought, here is a strategy for parsing numbers.

Step 1: Word mappings and entity types

The following word mappings and entity types are required:

'one' -> 1: digit
'two' -> 2: digit
'three' -> 3: digit
'four' -> 4: digit
'five' -> 5: digit
'six' -> 6: digit
'seven' -> 7: digit
'eight' -> 8: digit
'nine' -> 9: digit

'eleven' -> 11: teen_number
'twelve' -> 12: teen_number
'thirteen' -> 13: teen_number
'fourteen' -> 14: teen_number
'fifteen' -> 15: teen_number
'sixteen' -> 16: teen_number
'seventeen' -> 17: teen_number
'eighteen' -> 18: teen_number
'nineteen' -> 19: teen_number

'twenty' -> 20: group_of_ten
'thirty' -> 30: group_of_ten
'forty' -> 40: group_of_ten
'fifty' -> 50: group_of_ten
'sixty' -> 60: group_of_ten
'seventy' -> 70: group_of_ten
'eighty' -> 80: group_of_ten
'ninety' -> 90: group_of_ten

'hundred' -> 100: multiplier
'thousand' -> 1000: multiplier
'million' -> 1000000: multiplier
'billion' -> 1000000000: multiplier
'trillion' -> 1000000000000: multiplier
'quadrillion' -> 1000000000000000: multiplier

number_part
digit is_a number_part
teen_number is_a number_part
group_of_ten is_a number_part
100+_number_part is_a number_part
100+_number_part is_a number
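
As a rough illustration, the mappings and is_a relationships above could be held in two plain dictionaries. This is only a sketch: the names LEXICON, IS_A and is_a() are made up for the example, and it contains exactly the words listed above (so no 'zero' or 'ten').

```python
# Word -> (value, entity type), following the mappings listed above.
LEXICON = {}
for i, w in enumerate("one two three four five six seven eight nine".split(), 1):
    LEXICON[w] = (i, "digit")
for i, w in enumerate("eleven twelve thirteen fourteen fifteen sixteen "
                      "seventeen eighteen nineteen".split(), 11):
    LEXICON[w] = (i, "teen_number")
for i, w in enumerate("twenty thirty forty fifty sixty seventy eighty ninety".split(), 2):
    LEXICON[w] = (10 * i, "group_of_ten")
for i, w in enumerate("thousand million billion trillion quadrillion".split(), 1):
    LEXICON[w] = (1000 ** i, "multiplier")
LEXICON["hundred"] = (100, "multiplier")

# is_a relationships: entity type -> list of parent types.
IS_A = {
    "digit": ["number_part"],
    "teen_number": ["number_part"],
    "group_of_ten": ["number_part"],
    "100+_number_part": ["number_part", "number"],
}

def is_a(entity_type, target):
    """True if entity_type is target or inherits from it through IS_A."""
    if entity_type == target:
        return True
    return any(is_a(parent, target) for parent in IS_A.get(entity_type, []))

print(LEXICON["forty"])                    # (40, 'group_of_ten')
print(is_a("digit", "number_part"))        # True
print(is_a("digit", "number"))             # False
```

Note that with the relationships as listed, only a 100+_number_part counts as a number; a bare digit does not.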

Step 2: New transformation type

This step introduces a new transformation type that evaluates a numerical formula. For example:

{group_of_ten} {digit} -> # $1 + $2

The # prefix indicates that the transformation's output specification is a numeric formula.

In addition, it would be helpful to be able to specify, as part of any transformation, the entity type of the result. For example:

{group_of_ten} {digit} -> # $1 + $2 (number_part)
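
To make the notation concrete, here is a minimal sketch of how such a rule might be read and its formula evaluated. The function names parse_rule() and evaluate_formula() are hypothetical, and the evaluator is assumed to need only + and * (with * binding tighter), since those are the only operators used here.

```python
import re
from math import prod

def parse_rule(rule):
    """Split '{a} {b} -> # $1 + $2 (type)' into (pattern, formula, result_type).

    result_type is None when the rule gives no type in parentheses.
    """
    lhs, rhs = rule.split("->", 1)
    m = re.fullmatch(r"\s*#\s*(.*?)\s*(?:\((\S+)\))?\s*", rhs)
    if m is None:
        raise ValueError("output is not a '#' numeric formula: " + rule)
    return lhs.split(), m.group(1), m.group(2)

def evaluate_formula(formula, values):
    """Evaluate a formula where $n stands for the n-th matched value."""
    def operand(token):
        token = token.strip()
        if token.startswith("$"):
            return values[int(token[1:]) - 1]
        return int(token)
    # Sum of products: split on '+' first, then on '*' inside each term.
    return sum(prod(operand(f) for f in term.split("*"))
               for term in formula.split("+"))

pattern, formula, result_type = parse_rule(
    "{group_of_ten} {digit} -> # $1 + $2 (number_part)")
print(pattern)                               # ['{group_of_ten}', '{digit}']
print(result_type)                           # number_part
print(evaluate_formula(formula, [20, 1]))    # 21
```

So 'twenty one' would match the pattern with $1 = 20 and $2 = 1, and the result 21 would be tagged as a number_part.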

Step 3: Transformations

{group_of_ten} {digit} -> # $1 + $2 (number_part)
{number_part} {multiplier} -> # $1 * $2 (100+_number_part)
{multiplier} {number_part} -> # $1 + $2 (100+_number_part)
{100+_number_part} {100+_number_part} -> # $1 + $2 (100+_number_part)
{100+_number_part} {number_part} -> # $1 + $2
{100+_number_part} and {number_part} -> # $1 + $2
{number} + {number} -> # $1 + $2
{number} * {number} -> # $1 * $2
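
Putting the pieces together, the sketch below applies the first six transformations to a phrase until nothing more matches. Several things here are assumptions rather than part of the plan above: rules are tried top to bottom, the leftmost match wins, the search restarts after every rewrite, and a result with no stated type is tagged 'number'. All function and variable names are invented for the example.

```python
# Word -> (value, entity type); a compact copy of the Step 1 mappings.
LEXICON = {}
for i, w in enumerate("one two three four five six seven eight nine".split(), 1):
    LEXICON[w] = (i, "digit")
for i, w in enumerate("eleven twelve thirteen fourteen fifteen sixteen "
                      "seventeen eighteen nineteen".split(), 11):
    LEXICON[w] = (i, "teen_number")
for i, w in enumerate("twenty thirty forty fifty sixty seventy eighty ninety".split(), 2):
    LEXICON[w] = (10 * i, "group_of_ten")
for i, w in enumerate("thousand million billion trillion quadrillion".split(), 1):
    LEXICON[w] = (1000 ** i, "multiplier")
LEXICON["hundred"] = (100, "multiplier")

IS_A = {"digit": ["number_part"], "teen_number": ["number_part"],
        "group_of_ten": ["number_part"],
        "100+_number_part": ["number_part", "number"]}

def is_a(etype, target):
    return etype == target or any(is_a(p, target) for p in IS_A.get(etype, []))

# (pattern, operator, result type) for the first six Step 3 transformations.
RULES = [
    (["{group_of_ten}", "{digit}"], "+", "number_part"),
    (["{number_part}", "{multiplier}"], "*", "100+_number_part"),
    (["{multiplier}", "{number_part}"], "+", "100+_number_part"),
    (["{100+_number_part}", "{100+_number_part}"], "+", "100+_number_part"),
    (["{100+_number_part}", "{number_part}"], "+", None),
    (["{100+_number_part}", "and", "{number_part}"], "+", None),
]

def matches(slot, token):
    """A {type} slot matches by is_a; any other slot matches a literal word."""
    value, etype = token
    if slot.startswith("{"):
        return etype is not None and is_a(etype, slot[1:-1])
    return etype is None and value == slot

def apply_one_rule(tokens):
    """Apply the first matching rule in place; return True if one applied."""
    for pattern, op, result_type in RULES:
        n = len(pattern)
        for i in range(len(tokens) - n + 1):
            window = tokens[i:i + n]
            if all(matches(s, t) for s, t in zip(pattern, window)):
                a, b = [v for v, etype in window if etype is not None]
                value = a + b if op == "+" else a * b
                # Assumption: untyped results are treated as 'number'.
                tokens[i:i + n] = [(value, result_type or "number")]
                return True
    return False

def parse_number(text):
    """Reduce a phrase to a list of (value, entity type) tokens."""
    tokens = [LEXICON.get(w, (w, None)) for w in text.lower().split()]
    while apply_one_rule(tokens):
        pass
    return tokens

print(parse_number("twenty one"))                       # [(21, 'number_part')]
print(parse_number("two hundred and twenty one"))       # [(221, 'number')]
print(parse_number("three thousand two hundred five"))  # [(3205, 'number')]
```

The order in which transformations are tried is not settled above, and it matters: with this top-to-bottom strategy, a phrase such as 'one hundred twenty thousand' is sensitive to which pair gets combined first.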