

Principle of maximum entropy
August 21, 2008
Something to investigate is the principle of maximum entropy, which the CodePlex project relies on.

Penn Treebank Project
August 21, 2008
Here's an interesting project, called the Penn Treebank Project.

"The Penn Treebank Project annotates naturally-occurring text for linguistic structure. Most notably, we produce skeletal parses showing rough syntactic and semantic information -- a bank of linguistic trees. We also annotate text with part-of-speech tags, and for the Switchboard corpus of telephone conversations, dysfluency annotation. We are located in the LINC Laboratory of the Computer and Information Science Department at the University of Pennsylvania."

Also:

CC      Coordinating conjunction
CD      Cardinal number
DT      Determiner
EX      Existential there
FW      Foreign word
IN      Preposition/subordinate conjunction
JJ      Adjective
JJR     Adjective, comparative
JJS     Adjective, superlative
LS      List item marker
MD      Modal
NN      Noun, singular or mass
NNP     Proper noun, singular
NNPS    Proper noun, plural
NNS     Noun, plural
PDT     Predeterminer
POS     Possessive ending
PRP     Personal pronoun
PRP$    Possessive pronoun
RB      Adverb
RBR     Adverb, comparative
RBS     Adverb, superlative
RP      Particle
SYM     Symbol
TO      to
UH      Interjection
VB      Verb, base form
VBD     Verb, past tense
VBG     Verb, gerund/present participle
VBN     Verb, past participle
VBP     Verb, non-3rd ps. sing. present
VBZ     Verb, 3rd ps. sing. present
WDT     wh-determiner
WP      wh-pronoun
WP$     Possessive wh-pronoun
WRB     wh-adverb
``      Left open double quote
,       Comma
''      Right close double quote
.       Sentence-final punctuation
:       Colon, semi-colon
$       Dollar sign
#       Pound sign
-LRB-   Left parenthesis *
-RRB-   Right parenthesis *

* The Penn Treebank uses the ( and ) symbols, but these are used elsewhere by the OpenNLP parser.
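
To make the bracket footnote concrete, here is a minimal sketch in Python of swapping literal parentheses for the -LRB- and -RRB- tokens before text goes to a parser, and swapping them back afterwards. The function names and the sample sentence are my own invention for illustration; this is not OpenNLP's actual API, just the substitution idea under the assumption that the text has already been split into tokens.

```python
# Hypothetical helpers: the names below are illustrative, not part of OpenNLP.
# The token spellings -LRB- and -RRB- come from the tag list.
BRACKET_ESCAPES = {"(": "-LRB-", ")": "-RRB-"}
BRACKET_UNESCAPES = {escaped: raw for raw, escaped in BRACKET_ESCAPES.items()}


def escape_brackets(tokens):
    """Replace literal parenthesis tokens with their -LRB-/-RRB- forms."""
    return [BRACKET_ESCAPES.get(token, token) for token in tokens]


def unescape_brackets(tokens):
    """Reverse escape_brackets, restoring literal parentheses."""
    return [BRACKET_UNESCAPES.get(token, token) for token in tokens]


# Made-up example sentence, already tokenized.
tokens = ["He", "(", "reluctantly", ")", "agreed", "."]
escaped = escape_brackets(tokens)
print(escaped)
print(unescape_brackets(escaped) == tokens)
```

The point of the swap is that a bracketed parse tree uses ( and ) to mark structure, so a parenthesis that appears in the sentence itself has to be spelled differently to avoid being read as part of the tree.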
This is all stuff I need to get my head into.

Interesting language parsing article on CodePlex
August 21, 2008
http://www.codeproject.com/KB/recipes/englishparsing.aspx
This is quite an interesting-looking article about natural language parsing.