Objective. To develop and test a method for automatically detecting inconsistencies between the parent-child is-a relationships in the Metathesaurus and the ancestor-descendant relationships in the Semantic Network of the Unified Medical Language System (UMLS).

Methods. We exploited the fact that each Metathesaurus concept is assigned one or more semantic types from the UMLS Semantic Network and that the semantic types are arranged in a hierarchy. We compared the semantic types of each pair of parent and child concepts to determine whether the types "explained" the Metathesaurus is-a relationship between them. We considered cases where the semantic type of the parent was neither the same as, nor an ancestor of, the semantic type of the child to be "unexplained." We applied this method to the January 2002 release of the UMLS and examined the unexplained cases we discovered to determine their causes.

Results. We found that 17,022 (24.3%) of the parent-child is-a relationships in the UMLS Metathesaurus could not be explained by the semantic types of the concepts. Causes for these discrepancies included cases where the parent or child was missing a semantic type, cases where the semantic type of the child was too general or the semantic type of the parent was too specific, cases where the parent-child relationship was incorrect, and cases where an ancestor-descendant relationship should be added to the UMLS Semantic Network. In many cases, the specific cause of the discrepancy cannot be resolved without authoritative judgment by the UMLS developers.

Conclusions. Our method successfully detects inconsistencies between the hierarchies of the UMLS Metathesaurus and Semantic Network. We believe that our method should be added to the set of tools that the UMLS developers use to maintain and audit the UMLS knowledge sources.
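The comparison described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation. The hierarchy fragment, the concept identifiers (`A` to `D`), and the function names are hypothetical and chosen only for illustration. The abstract does not say how concepts with several semantic types were handled, so the sketch assumes that a relationship counts as explained when any semantic type of the parent is the same as, or an ancestor of, any semantic type of the child.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical toy fragment of a semantic-type hierarchy: type -> its parent type.
TYPE_PARENT: Dict[str, Optional[str]] = {
    "Entity": None,
    "Physical Object": "Entity",
    "Organism": "Physical Object",
    "Substance": "Physical Object",
}

# Hypothetical concepts and their assigned semantic types (D has none).
CONCEPT_TYPES: Dict[str, List[str]] = {
    "A": ["Physical Object"],
    "B": ["Organism"],
    "C": ["Substance"],
    "D": [],
}

# Hypothetical is-a relationships, written as (parent, child).
IS_A_PAIRS: List[Tuple[str, str]] = [("A", "B"), ("B", "C"), ("A", "D")]


def ancestors_or_self(sem_type: str) -> Set[str]:
    """Return the type together with all of its ancestors in TYPE_PARENT."""
    result: Set[str] = set()
    current: Optional[str] = sem_type
    while current is not None and current not in result:
        result.add(current)
        current = TYPE_PARENT.get(current)
    return result


def is_explained(parent_types: List[str], child_types: List[str]) -> bool:
    """Assumed rule: some parent type equals, or is an ancestor of, some child type.

    A concept with no semantic type can never be explained under this rule.
    """
    parent_set = set(parent_types)
    return any(parent_set & ancestors_or_self(t) for t in child_types)


def unexplained_pairs(
    pairs: List[Tuple[str, str]], concept_types: Dict[str, List[str]]
) -> List[Tuple[str, str]]:
    """Collect the (parent, child) pairs whose semantic types do not explain them."""
    return [
        (parent, child)
        for parent, child in pairs
        if not is_explained(concept_types.get(parent, []), concept_types.get(child, []))
    ]


if __name__ == "__main__":
    for parent, child in unexplained_pairs(IS_A_PAIRS, CONCEPT_TYPES):
        print(f"unexplained: {parent} -> {child}")
```

In this toy data the pair (`B`, `C`) is flagged because Organism is neither Substance nor one of its ancestors, and (`A`, `D`) is flagged because the child has no semantic type. Flagging a pair only marks it for review; as the Results note, deciding which of the possible causes applies still requires human judgment.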
All Science Journal Classification (ASJC) codes
- Health Informatics
- Computer Science Applications