MedLEE - A Medical Language Extraction and Encoding System


Welcome to MedLEE's Home Page!


The goal of MedLEE is to extract, structure, and encode clinical information in textual patient reports so that the data can be used by subsequent automated processes. This page describes how to use the most recent demonstration version of MedLEE on the Web. MedLEE was created by Carol Friedman in collaboration with the Department of Biomedical Informatics at Columbia University, the Radiology Department at Columbia University, and the Department of Computer Science at Queens College of CUNY.


Acknowledgments: This demonstration has been made possible by grants LM008635, LM07659, LM06274, and LM05397 from the National Library of Medicine and The Columbia Center for Advanced Technology, supported by the New York State Science and Technology Foundation.


START THE SYSTEM


Instructions

The Web version of MedLEE is for demonstration purposes only; it is neither the complete nor the most recent version of MedLEE. The live MedLEE demonstration has been disabled, but visitors can still view a static demonstration. Requests for licenses to use MedLEE for academic or commercial purposes should be addressed to:

Donna See
Science & Technology Ventures
Columbia University
dks26@columbia.edu

This demo contains sample reports from three domains: discharge summaries ("D/C"), radiographic reports of the chest ("CXR"), and mammography reports ("Mammo"). When you select a sample report, it appears in the text window. You may then set the following parameters or keep their default values:

Output format: choose one of indented (default), hl7, markup, nested, xml, or line.

Markup highlights certain information in the original report; clinical conditions are shown in red, medications in green, and procedures in blue.

HL7 is used to upload data to the CIS patient database. This form contains MED codes and also shows the corresponding MED terms.

Line consists of nested lists; it is a general form showing the structured extracted information before encoding. This form of output is useful as input to other automated applications.

Indented presents the data to the user in the most readable form: each finding starts on a new line, and the indented lines that follow show the modifiers of that finding.

XML presents the structured output as XML.
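To make the indented layout concrete, here is a minimal Python sketch that prints each finding on its own line and its modifiers on indented lines beneath it. It is an illustration only, not MedLEE's code: the function name `render_indented`, the tuple-based data structure, the `name: value` modifier syntax, and the sample findings are all assumptions, and the exact text MedLEE emits may differ.

```python
from typing import Iterable, List, Tuple

# Hypothetical structure: (finding, [(modifier_name, modifier_value), ...]).
Finding = Tuple[str, List[Tuple[str, str]]]


def render_indented(findings: Iterable[Finding], indent: str = "    ") -> str:
    """Render one finding per line, with its modifiers on indented lines below it."""
    lines: List[str] = []
    for name, modifiers in findings:
        lines.append(name)  # the finding itself starts a new, unindented line
        for mod_name, mod_value in modifiers:
            lines.append(f"{indent}{mod_name}: {mod_value}")
    return "\n".join(lines)


# Invented sample data, used only to show the layout.
sample: List[Finding] = [
    ("pneumonia", [("certainty", "no")]),
    ("effusion", [("region", "left"), ("degree", "small")]),
]

if __name__ == "__main__":
    print(render_indented(sample))
```

A flat layout like this is easy to scan by eye, which is presumably why the page recommends the indented form for human readers and the list-based form for downstream programs.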

Parsing mode: choose one of best (default), mode1, mode2, mode3, mode4, or mode5. Mode1 is the most accurate; mode5 is the least accurate but yields the highest recall. The "best" mode first attempts a parse with the most accurate mode; if that attempt fails, it tries the next most accurate mode, and so on.
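The fallback behavior of the "best" mode can be sketched in Python as follows. This is a minimal illustration of the strategy described above, not MedLEE's implementation: `parse_best`, the `try_parse` callback, and the toy parser with its small vocabulary are hypothetical stand-ins, and the rule that stricter modes tolerate fewer unknown words is invented purely to make the example runnable.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Modes ordered from most accurate (mode1) to highest recall (mode5).
MODES: Sequence[str] = ("mode1", "mode2", "mode3", "mode4", "mode5")

ParseFn = Callable[[str, str], Optional[List[str]]]


def parse_best(sentence: str, try_parse: ParseFn,
               modes: Sequence[str] = MODES) -> Optional[Tuple[str, List[str]]]:
    """Try each mode in order of decreasing accuracy; return the first success."""
    for mode in modes:
        result = try_parse(sentence, mode)
        if result is not None:
            return mode, result
    return None  # no mode produced a parse


# Hypothetical toy parser: a stricter mode tolerates fewer unknown words.
KNOWN_WORDS = {"no", "evidence", "of", "pneumonia"}


def toy_parse(sentence: str, mode: str) -> Optional[List[str]]:
    words = sentence.lower().split()
    unknown = [w for w in words if w not in KNOWN_WORDS]
    allowed = int(mode[-1]) - 1  # mode1 allows 0 unknown words, mode5 allows 4
    return words if len(unknown) <= allowed else None


if __name__ == "__main__":
    print(parse_best("no evidence of pneumonia", toy_parse))
    print(parse_best("no evidence of focal pneumonia", toy_parse))
```

Returning the mode alongside the result lets a caller see how far down the accuracy ladder the parse had to go, which may be useful when deciding how much to trust an extracted finding.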


Carol Friedman, Ph.D. <friedman@dbmi.columbia.edu>

Created: October 19, 1995.
Revised: November 17, 2006.