A text meaning representation (TMR) is a language-neutral description (an interlingua) of the meaning conveyed in a natural language text. It is derived by syntactic, semantic, and pragmatic analysis of the text. Because the TMR is intended to be language neutral, it is also deliberately syntax neutral: it avoids terms such as clause, proposition, and tense, which are tied more closely to the syntactic structure of a particular language.
In addition to providing information about the lexical-semantic dependencies in the text, the TMR represents stylistic factors, discourse relations, speaker attitudes, and other pragmatic factors present in the discourse structure. The TMR thus captures not only the meaning of individual elements in the text but also the relations between those elements, covering both the propositional and the nonpropositional components of textual meaning.
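To make the layers described above concrete, here is a minimal Python sketch of how such a representation might be held in memory. It is an illustration only and not the OntoSem TMR format: the class names (`Frame`, `TMR`), the field names, the concept and relation labels (`BUY`, `AGENT`, `THEME`), and the example sentence are all assumptions chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """One meaning element: an instance of a concept (hypothetical layout)."""
    concept: str                      # illustrative concept label, e.g. "BUY"
    index: int                        # distinguishes instances of the same concept
    properties: dict = field(default_factory=dict)

    @property
    def name(self) -> str:
        return f"{self.concept}-{self.index}"


@dataclass
class TMR:
    """Propositional content (frames) plus nonpropositional layers."""
    frames: dict = field(default_factory=dict)
    discourse_relations: list = field(default_factory=list)
    attitudes: list = field(default_factory=list)
    style: dict = field(default_factory=dict)

    def add(self, frame: Frame) -> Frame:
        self.frames[frame.name] = frame
        return frame

    def dependencies(self) -> list:
        """Return (head, relation, dependent) triples that link two frames."""
        triples = []
        for frame in self.frames.values():
            for relation, value in frame.properties.items():
                # Only string values that name another frame count as links.
                if isinstance(value, str) and value in self.frames:
                    triples.append((frame.name, relation, value))
        return triples


# Assumed example text: "Luckily, Mary bought a car."
tmr = TMR()
tmr.add(Frame("BUY", 1, {"AGENT": "HUMAN-1", "THEME": "AUTOMOBILE-1"}))
tmr.add(Frame("HUMAN", 1, {"NAME": "Mary"}))
tmr.add(Frame("AUTOMOBILE", 1))

# Nonpropositional meaning: the speaker's positive evaluation of the event.
tmr.attitudes.append(
    {"type": "evaluative", "scope": "BUY-1", "value": "positive",
     "attributed_to": "speaker"}
)
tmr.style["formality"] = "neutral"

print(tmr.dependencies())
```

Keeping attitudes, style, and discourse relations in fields separate from the frames mirrors the distinction drawn above between propositional and nonpropositional meaning; nothing in the structure refers to clauses, tense, or other syntactic notions.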
We have recently begun creating libraries of gold- and bronze-standard TMRs: text meaning representations that are automatically generated by the OntoSem analyzer and then manually checked and, where needed, corrected by people. Gold-standard TMRs include all aspects of semantic and pragmatic analysis (including reference resolution and the interpretation of indirect speech acts), whereas bronze-standard TMRs are limited to lexical disambiguation and the establishment of the semantic dependency structure. Both kinds can serve as input to knowledge-based reasoning engines and to statistical processing, and both can be used to study a range of language phenomena.
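The difference between the two standards can be sketched as a check on which analysis layers have been manually verified. This is a rough illustration under stated assumptions: the layer names and the function `tmr_standard` are hypothetical, and the gold-only set lists just the two examples named in the text, whereas a full gold standard covers all aspects of semantic and pragmatic analysis.

```python
from typing import Iterable, Optional

# Layers the text assigns to bronze-standard TMRs (names are illustrative).
BRONZE_LAYERS = frozenset({"lexical_disambiguation",
                           "semantic_dependency_structure"})

# Partial stand-in for the additional gold-standard layers: only the two
# examples mentioned in the text, not an exhaustive list.
GOLD_EXTRA_LAYERS = frozenset({"reference_resolution",
                               "indirect_speech_acts"})


def tmr_standard(checked_layers: Iterable[str]) -> Optional[str]:
    """Label a TMR by the layers a person has checked and corrected."""
    checked = set(checked_layers)
    if not BRONZE_LAYERS <= checked:
        return None          # bronze layers incomplete: neither standard
    if GOLD_EXTRA_LAYERS <= checked:
        return "gold"
    return "bronze"


print(tmr_standard(BRONZE_LAYERS))
print(tmr_standard(BRONZE_LAYERS | GOLD_EXTRA_LAYERS))
```

Treating the gold layers as a superset of the bronze layers reflects the description above, in which bronze-standard TMRs are a restricted form of the same manually verified analysis.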
