LIDA (Linguistic Assistant for Domain Analysis)

LIDA (LInguistic Assistant for Domain Analysis) helps analysts develop object-oriented models of a domain, using a subset of UML. To develop such models, the requirements analyst or knowledge engineer often needs to analyze large volumes of text from "legacy documents", which might include user manuals of legacy systems, company policies, use cases, or transcripts of interviews with domain experts. LIDA facilitates this analysis by compiling a list of the words and multi-word terms in a document and providing a graphical interface in which the user marks them as corresponding to elements of a model. It also lets the user validate models as they are created, through integration with our ModelExplainer tool, which generates textual descriptions of a model.


  • State-of-the-art linguistic processing is used to group different forms of the same base word together, to determine part of speech (noun, verb, adjective), and to detect multi-word terms.
  • The full text, word lists, and evolving UML model are displayed in parallel, letting the user compare different views.
  • Words and multi-word terms can be assigned a type in the model (Class, Attribute, Role, etc.) with the click of a button. The corresponding strings are then color-coded in the text display, and the resulting model elements are shown graphically in the Modeler window.
  • Words and multi-word terms can be sorted alphabetically, by frequency, by part of speech, or by assigned type.
  • A KWIC (Key Word In Context) view displays only those sentences containing a chosen word or group of words.
  • Completed models can be exported to visual modeling tools such as Visio.
  • Future plans include import capabilities from modeling tools, drag-and-drop model construction, and automatic extraction of model candidates from structured use cases.
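
The word-list and KWIC features above can be illustrated with a short Python sketch. It is not LIDA's implementation: the function names (`base_form`, `word_list`, `kwic`) are hypothetical, and the "base form" step is a deliberately crude stand-in (lowercasing and stripping a trailing plural "s") for the linguistic processing LIDA actually performs. It only shows the idea of grouping word forms, sorting them by frequency, and filtering sentences by a chosen word.

```python
import re
from collections import Counter

WORD = re.compile(r"[A-Za-z]+")


def sentences(text):
    # Naive sentence splitter: break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def base_form(word):
    # Assumption: a toy normalizer, NOT real morphological analysis.
    # Lowercases the word and strips a single trailing plural "s".
    w = word.lower()
    if len(w) > 3 and w.endswith("s") and not w.endswith("ss"):
        return w[:-1]
    return w


def word_list(text):
    # Count base forms, then sort by descending frequency, ties alphabetically.
    counts = Counter(base_form(w) for w in WORD.findall(text))
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def kwic(text, keyword):
    # Keep only the sentences that contain some form of the chosen word.
    target = base_form(keyword)
    return [
        s for s in sentences(text)
        if target in {base_form(w) for w in WORD.findall(s)}
    ]


if __name__ == "__main__":
    sample = "A customer places orders. Each order has a date. The clerk checks stock."
    for word, freq in word_list(sample):
        print(freq, word)
    for sentence in kwic(sample, "orders"):
        print(sentence)
```

In this sketch, "orders" and "order" fall into the same word-list entry, and asking for the context of "orders" returns both sentences that mention either form. A real system would replace `base_form` with a proper lemmatizer and add part-of-speech tagging and multi-word term detection.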


  • Overmyer, Scott; Lavoie, Benoit; and Rambow, Owen (2001). Conceptual Modeling through Linguistic Analysis Using LIDA. In Proceedings of the 23rd International Conference on Software Engineering (ICSE 2001), Toronto, Canada.