RealPro is a text generation "engine" that performs syntactic realization — i.e., the transformation of abstract syntactic specifications of natural language sentences (or phrases) into their corresponding surface forms. It supports multiple languages and levels of linguistic representation, with performance suitable for real-world applications.

In applications such as machine translation, text generators must handle a wide variety of inputs and produce fluent text for each one. Whether the input is at the semantic, conceptual, or phrasal level, there are often too many syntactic rules to take into account for a simple approach based on phrase concatenation to be feasible. This contrasts with applications such as data summarization, where the system designer has more control over the syntactic variety required in the generated text, and a template-based approach is often practical.

RealPro provides a grammar rule engine that can generate text from sophisticated, multi-level linguistic representations. The abstraction it provides makes it easy to generate many syntactic variants of the same semantic content on demand. With template-based approaches, by contrast, each combination of syntactic choices needs its own template, so the number of templates quickly becomes unmanageable.
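To illustrate the contrast with templates, here is a minimal sketch in Python. It is not RealPro's API or grammar: the `Clause` class, the `realize` function, and the one-entry lexicon are hypothetical names invented for this example, and the "grammar" handles only singular noun phrases and three features (tense, voice, negation). One abstract clause specification yields eight surface variants, whereas a template-based system would need a separate template for each combination.

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical toy lexicon (an assumption for this sketch):
# lemma -> (3rd-person-singular present, past, past participle)
VERB_FORMS = {"publish": ("publishes", "published", "published")}


@dataclass(frozen=True)
class Clause:
    """Abstract content of a clause, independent of its surface form."""
    verb: str     # verb lemma
    agent: str    # singular noun phrase
    patient: str  # singular noun phrase


def realize(clause, tense="present", voice="active", negated=False):
    """Turn the abstract clause into one surface sentence."""
    present, past, participle = VERB_FORMS[clause.verb]
    neg = " not" if negated else ""
    if voice == "passive":
        # Passive: the patient becomes the subject, the agent a by-phrase.
        aux = "is" if tense == "present" else "was"
        words = f"{clause.patient} {aux}{neg} {participle} by {clause.agent}"
    elif negated:
        # Active negation needs do-support plus the bare verb lemma.
        aux = "does" if tense == "present" else "did"
        words = f"{clause.agent} {aux} not {clause.verb} {clause.patient}"
    else:
        finite = present if tense == "present" else past
        words = f"{clause.agent} {finite} {clause.patient}"
    return words[0].upper() + words[1:] + "."


def all_variants(clause):
    """Realize every combination of tense, voice, and negation."""
    options = product(("present", "past"), ("active", "passive"), (False, True))
    return [realize(clause, t, v, n) for t, v, n in options]


clause = Clause(verb="publish", agent="the agency", patient="the forecast")
for sentence in all_variants(clause):
    print(sentence)
```

Each additional binary feature doubles the number of variants, but in this sketch it adds only a branch to `realize`; a template inventory would have to double in size.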


  • Java- and XML-based implementation provides high performance and easy integration with other application components.
  • Includes a general-purpose English lexicon and grammar, suitable for many applications.
  • Syntactic specifications are based on the deep-syntactic structures of Meaning-Text Theory (predicate-argument and predicate-modifier structures).
  • Can be customized to generate different languages.
  • Integration with Exemplars allows authors to mix and match "deep" generation with template-based approaches.
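The deep-syntactic structures mentioned in the list above can be pictured as dependency trees whose nodes are uninflected lexemes and whose labeled arcs mark either an argument of a predicate (the Meaning-Text Theory relations I, II, III, ...) or a modifier (ATTR). The following Python sketch shows one way such a tree might be represented. The `Node` class, the feature names, and the bracketed print format are assumptions made for illustration; they are not RealPro's actual input format.

```python
from dataclasses import dataclass, field

# Arc labels borrowed from Meaning-Text Theory; only a subset is listed here.
ARGUMENT_RELATIONS = ("I", "II", "III")  # predicate-argument relations
MODIFIER_RELATION = "ATTR"               # predicate-modifier relation


@dataclass
class Node:
    """One lexeme in a (hypothetical) deep-syntactic dependency tree."""
    lexeme: str
    features: dict = field(default_factory=dict)
    dependents: list = field(default_factory=list)  # (relation, Node) pairs

    def add(self, relation, child):
        """Attach a dependent under the given relation; returns self for chaining."""
        self.dependents.append((relation, child))
        return self

    def arguments(self):
        """Dependents linked by a predicate-argument relation."""
        return [(r, c.lexeme) for r, c in self.dependents if r in ARGUMENT_RELATIONS]

    def modifiers(self):
        """Dependents linked by the predicate-modifier relation."""
        return [c.lexeme for r, c in self.dependents if r == MODIFIER_RELATION]


def to_bracketed(node):
    """Render the tree in a compact bracketed notation."""
    feats = " ".join(f"{k}:{v}" for k, v in node.features.items())
    head = f"{node.lexeme} [{feats}]" if feats else node.lexeme
    if not node.dependents:
        return head
    deps = " ".join(f"{rel} {to_bracketed(child)}" for rel, child in node.dependents)
    return f"{head} ({deps})"


# Abstract specification behind a sentence such as "This boy sees Mary".
boy = Node("boy", {"number": "sg"}).add("ATTR", Node("this"))
see = Node("see", {"class": "verb", "tense": "pres"})
see.add("I", boy).add("II", Node("Mary", {"class": "proper_noun"}))

print(to_bracketed(see))
```

Note that the tree contains no word order, agreement, or inflection: in this style of representation those are left to the realizer, which is what makes the specification abstract.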


CoGenTex has used RealPro in these applications and research projects:

  • ModelExplainer — generating textual descriptions of object-oriented data models
  • MeteoCogent — generating weather forecasts from meteorological data
  • LIDA — developing domain models by automatically processing "legacy documents"
  • DesignExpert — advising software engineers on non-functional requirements


  • RealPro General English Grammar User Manual. [Acrobat, 88 Kb]
  • Lavoie, Benoit; and Rambow, Owen (1997). A Fast and Portable Realizer for Text Generation Systems. In Proceedings of the Fifth Conference on Applied Natural Language Processing, Washington, DC. [Acrobat, 183 Kb] [PostScript, 144 Kb]