Rephetio: Repurposing drugs on a hetnet [rephetio]

Development and evaluation of a crowdsourcing methodology for knowledge base construction: identifying relationships between clinical problems and medications

Extracting indications from the ehrlink resource

ehrlink is our name for a study where an EHR system prompted clinicians to report the problem that a medication was prescribed for [1]. The resulting high-confidence set contained 11,166 problem-medication pairs with precision exceeding 95%. Thus far, the comments pertaining to ehrlink have been scattered across other discussions, so this discussion is meant to consolidate them and provide a home for further analysis.

Here is the history of this collaborative integration effort:

  1. @b_good initially suggested the resource and located the data supplement.
  2. @dhimmel converted the PDF data supplement to a TSV file (comment, notebook, download).
  3. @dhimmel determined that the identifiers were not from a standard terminology.
  4. @allisonmccoy joined the discussion, confirming the proprietary identifiers and providing additional related studies.
  5. @allisonmccoy and @TIOprea discussed the reliability of the resource.
  6. @alizee mapped the medication terms from ehrlink to RxNorm (comment, repository).
  7. @dhimmel mapped the RxNorm concepts matched by @alizee to RxNorm ingredients (comment, notebook, download).

Mapping ehrlink diseases to the DO

The ehrlink high-confidence set contains indications for 1,596 problems (download). We used a simple string-matching scheme to map these terms to the Disease Ontology (DO): lowercased ehrlink problem names were matched against lowercased DO names and synonyms (notebook, results).
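The matching scheme can be sketched roughly as follows. This is a minimal illustration, not the code from the linked notebook: the function names and the toy records (including the identifiers) are hypothetical, and the only rule assumed is the case-insensitive exact match described above.

```python
def build_do_lookup(do_terms):
    """Map each lowercased DO name or synonym to the set of DO ids that use it."""
    lookup = {}
    for do_id, name, synonyms in do_terms:
        for label in [name, *synonyms]:
            lookup.setdefault(label.lower(), set()).add(do_id)
    return lookup


def map_problems(problems, lookup):
    """Return {problem name: sorted DO ids} for problems with an exact lowercase match."""
    mapping = {}
    for problem in problems:
        matches = lookup.get(problem.lower())
        if matches:
            mapping[problem] = sorted(matches)
    return mapping


# Toy records for illustration only; these are not real DO identifiers.
do_terms = [
    ("DOID:toy1", "Hypertension", ["high blood pressure"]),
    ("DOID:toy2", "Asthma", []),
]
problems = ["HYPERTENSION", "High Blood Pressure", "Chest pain"]

lookup = build_do_lookup(do_terms)
mapping = map_problems(problems, lookup)
recall = len(mapping) / len(problems)
print(mapping)
print(f"{recall:.1%} of problems mapped")
```

Because only exact matches are accepted, unmatched problems such as "Chest pain" in the toy input are simply dropped, which favors precision over recall.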

Of the 1,596 ehrlink problems, 365 (22.9%) mapped to the Disease Ontology. Of the 137 DO slim terms, 50 had a matching ehrlink problem. When we include propagated matching to DO slim terms, 5 additional diseases are matched. While these recall numbers appear low, we do recover many of the major complex diseases with few to no false positives.
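One way to read "propagated matching" is that a match to a more specific DO term also counts toward any DO slim term that is its ancestor. The sketch below illustrates that reading only; it is an assumption rather than the notebook's actual procedure, and the function names, the child-to-parents dictionary, and the toy terms are all hypothetical.

```python
def ancestors(term, parents):
    """Return every ancestor of `term`, given a child -> parents mapping."""
    seen = set()
    stack = list(parents.get(term, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return seen


def matched_slim_terms(matched_terms, slim_terms, parents, propagate=True):
    """Return the slim terms covered by the matched terms.

    With propagate=False only direct matches count; with propagate=True a
    matched term also covers any slim term among its ancestors (assumed rule).
    """
    covered = set()
    for term in matched_terms:
        candidates = {term}
        if propagate:
            candidates |= ancestors(term, parents)
        covered |= candidates & slim_terms
    return covered


# Toy hierarchy for illustration only; not real DO terms or relationships.
parents = {
    "subtype_a": ["slim_1"],
    "subtype_b": ["subtype_a"],
    "slim_2": ["root"],
}
slim_terms = {"slim_1", "slim_2", "slim_3"}
matched = {"subtype_b", "slim_2"}

direct = matched_slim_terms(matched, slim_terms, parents, propagate=False)
propagated = matched_slim_terms(matched, slim_terms, parents, propagate=True)
print(sorted(direct), sorted(propagated))
```

In the toy data, propagation adds one slim term that had no direct match, mirroring how the additional diseases above could be picked up.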

Mapping ehrlink to DO and RxNorm ingredient terms

We created a version of ehrlink containing the subset of problem-medication pairs that mapped to standardized terminologies (notebook, download). We converted problems to DO terms (see above). Then we converted medications to RxNorm concepts, using the mapping produced by @alizee. We excluded any RxNorm matches with score < 55, as errors were observed below this threshold. Overall, the approximateTerm function of the RxNorm API performed impressively. Next, we converted RxNorm concepts into their active ingredients and restricted to single-ingredient medications.
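The filtering steps in this pipeline can be sketched as below. This is a hedged illustration under stated assumptions: the lookup tables are plain dictionaries standing in for the DO mapping, the approximateTerm results, and the ingredient mapping; the function name, identifiers, and scores are made up; and no call to the RxNorm API is shown.

```python
MIN_SCORE = 55  # matches scoring below this threshold are excluded


def standardize_pairs(pairs, problem_to_do, med_to_rxnorm,
                      rxnorm_to_ingredients, min_score=MIN_SCORE):
    """Keep pairs whose problem maps to a DO term and whose medication maps,
    with a sufficient score, to an RxNorm concept having exactly one ingredient."""
    results = set()
    for problem, medication in pairs:
        do_id = problem_to_do.get(problem)
        match = med_to_rxnorm.get(medication)
        if do_id is None or match is None:
            continue  # problem or medication did not map
        rxcui, score = match
        if score < min_score:
            continue  # low-scoring approximate match
        ingredients = rxnorm_to_ingredients.get(rxcui, [])
        if len(ingredients) != 1:
            continue  # restrict to single-ingredient medications
        results.add((do_id, ingredients[0]))
    return sorted(results)


# Toy inputs for illustration only; identifiers and scores are hypothetical.
pairs = [
    ("hypertension", "lisinopril 10 mg tab"),
    ("hypertension", "mystery pill"),
    ("asthma", "combo inhaler"),
    ("chest pain", "aspirin 81 mg"),
]
problem_to_do = {"hypertension": "DOID:toy1", "asthma": "DOID:toy2"}
med_to_rxnorm = {
    "lisinopril 10 mg tab": ("rx1", 90),
    "mystery pill": ("rx2", 40),
    "combo inhaler": ("rx3", 80),
    "aspirin 81 mg": ("rx4", 100),
}
rxnorm_to_ingredients = {
    "rx1": ["lisinopril"],
    "rx2": ["unknown"],
    "rx3": ["drug_a", "drug_b"],
    "rx4": ["aspirin"],
}

standardized = standardize_pairs(pairs, problem_to_do, med_to_rxnorm,
                                 rxnorm_to_ingredients)
print(standardized)
```

Each `continue` corresponds to one exclusion described above, so the fraction of pairs that survive reflects the combined loss from all three filters.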

Of the original 11,166 problem-medication pairs, 3,719 (33.3%) successfully mapped to both an ingredient and a DO term. Users should note that our mapping procedure was designed for precision and automation rather than recall.

Status: Completed
Cite this as
Daniel Himmelstein (2015) Extracting indications from the ehrlink resource. Thinklab. doi:10.15363/thinklab.d62

Creative Commons License