SemRep Gold Standard Annotation

In early 2011, we conducted a gold standard annotation study in which we annotated 500 sentences, randomly selected from MEDLINE abstracts, with semantic predications. The results are intended mainly as an evaluation testbed for SemRep, but they can also be used by other information extraction systems based on UMLS domain knowledge. The study consisted of three phases: a) a practice phase, b) the main annotation phase, and c) an adjudication phase.

Here, we present two sets of annotations from the main phase as well as the adjudicated gold standard. For further details, refer to our BMC Bioinformatics paper "Constructing A Semantic Predication Gold Standard from the Biomedical Literature" or contact Halil Kilicoglu.

To access the SemRep Gold Standard Annotation files, you must have accepted the terms of the UMLS Metathesaurus License Agreement, which requires you to respect the copyrights of the constituent vocabularies and to file a brief annual report on your use of the UMLS. You must also have activated a UMLS Terminology Services (UTS) account. For information on how we use UTS authentication, please select here.

For details of the licenses see the UMLS Metathesaurus License Agreement and How to License and Access the Unified Medical Language System (UMLS) Data.

Available Files:

Annotator A: Main Phase (main_A.xml) (1.3 MB)

Annotator B: Main Phase (main_B.xml) (1.4 MB)

Annotator C: Adjudication (adjudicated.xml) (1.4 MB)

DTD file (annotations.dtd) (1.8 KB)
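
Since the annotations are distributed as XML, a quick first step after downloading is to see which elements a file actually contains. The following is a minimal sketch, not part of the release: it counts element tags using Python's standard library and assumes nothing about the element names defined in annotations.dtd. The function name tag_counts and the tags in the in-memory sample are hypothetical placeholders. Note that xml.etree.ElementTree does not validate against a DTD, so this only inspects structure.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def tag_counts(source):
    """Count how often each element tag occurs in an XML document.

    `source` may be a file path or a binary file-like object.
    The document is streamed, so the whole tree is never held in memory.
    """
    counts = Counter()
    for _event, elem in ET.iterparse(source, events=("end",)):
        counts[elem.tag] += 1
        elem.clear()  # release the element's children once it is counted
    return counts


# Stand-in document with placeholder tags; these are NOT the tags
# used in the gold standard files.
sample = io.BytesIO(b"<doc><item/><item/><note/></doc>")
for tag, n in sorted(tag_counts(sample).items()):
    print(tag, n)
```

To inspect a downloaded file, pass its path instead of the sample, for example tag_counts("adjudicated.xml"), and consult annotations.dtd for the meaning of each element.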