## Predictions of whether a compound treats a disease

Project Rephetio has finally reached its prediction stage. As a brief recap, we created Hetionet v1.0, an integrative network with 2,250,197 relationships of 24 types. Then we extracted features from the network to quantify the prevalence of specific path types between each compound and disease. Finally, we fit a model to translate from network-based features to a probability of treatment for a given compound–disease pair.

In total, we make predictions for 209,168 compound–disease pairs between 1,538 approved small molecule compounds and 136 complex diseases. Our model was trained on 755 disease-modifying indications (treatments) from PharmacotherapyDB v1.0. Hence, our predicted probabilities assume a true prevalence of treatment of 0.36%. Note that in reality the prevalence of treatment is higher, and thus our predicted probabilities are on the low side.
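The 0.36% figure follows directly from the counts above. Here is a minimal sketch of the arithmetic; the variable names are illustrative and not taken from the project code.

```python
# Counts quoted in the text above.
n_compounds = 1_538
n_diseases = 136
n_treatments = 755  # disease-modifying indications used as positives

# Every compound is paired with every disease.
n_pairs = n_compounds * n_diseases

# Fraction of pairs that are known treatments: the prevalence the model assumes.
prevalence = n_treatments / n_pairs

print(f"{n_pairs:,} pairs, prevalence = {prevalence:.2%}")
```

Any predicted probability above this baseline indicates that the model considers the pair more likely than average to be a treatment.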

Our top predictions (probability > 1%) are available in this online spreadsheet. The full set of predictions is available as a TSV file.

Our model relied heavily on the following features which positively influence the prediction. Each feature corresponds to a type of path:

• CbGbCtD: whether the compound binds to the same genes as compounds which treat the disease
• CbGaD: whether the compound binds to genes that are associated with the disease
• CiPCiCtD: whether the compound belongs to the same pharmacologic classes as compounds that treat the disease
• CrCtD: whether the compound chemically resembles compounds that treat the disease
• CtDrD: whether the compound treats diseases which resemble the disease
• CrCrCtD: whether the compound resembles compounds that resemble compounds that treat the disease
• CcSEcCtD: whether the compound causes the same side effects as compounds that treat the disease
• CpDpCtD: whether the compound palliates the same diseases as compounds that treat the disease
• CbGpPWpGaD: whether the compound binds to genes that participate in the same pathways as genes associated with the disease
• CbGeAlD: whether the compound binds to genes that are expressed in the anatomies affected by the disease

We are working on a method for investigating the specific network paths supporting each prediction. Stay tuned and let us know what you think of the predictions.

Daniel Himmelstein Researcher

# Assessing prediction performance

We've assembled four sets of indications to assess our predictions on:

• Disease Modifying — the 755 disease modifying treatments in PharmacotherapyDB v1.0. These indications are included in the hetnet as treats edges and used to train the logistic regression model. Due to edge dropout contamination and self-testing, overfitting could potentially inflate performance on this set. Therefore, for the three remaining indication sets, we remove any observations that were positives in this set.
• DrugCentral — We discovered the DrugCentral database after completing our physician curation for PharmacotherapyDB. This database contained 210 additional indications. While we haven't curated these indications yet, we observed a high proportion of disease modifying therapy.
• Clinical Trial — We compiled indications that have been investigated by clinical trial from ClinicalTrials.gov. This set contains 5,594 indications.
• Symptomatic — 390 symptomatic indications from PharmacotherapyDB. These edges are included in the hetnet as palliates edges.
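The filtering step described for the three non-training sets can be sketched as follows. This is a toy illustration under assumptions: the pairs, the function name `build_evaluation_set`, and the tuple representation are all hypothetical and not taken from the project code.

```python
# Hypothetical compound-disease pairs, written as (compound, disease) tuples.
all_pairs = [
    ("amitriptyline", "migraine"),
    ("decitabine", "lung cancer"),
    ("sevoflurane", "epilepsy"),
]
disease_modifying = {("amitriptyline", "migraine")}  # training positives
clinical_trial = {("amitriptyline", "migraine"), ("decitabine", "lung cancer")}


def build_evaluation_set(all_pairs, positives, training_positives):
    """Label each pair for one indication set, skipping training positives."""
    return {
        pair: int(pair in positives)
        for pair in all_pairs
        if pair not in training_positives
    }


evaluation = build_evaluation_set(all_pairs, clinical_trial, disease_modifying)
for pair, label in evaluation.items():
    print(pair, label)
```

Dropping the training positives from the other sets keeps their performance estimates from being inflated by pairs the model has already seen labeled as treatments.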

## Visualizing performance

The above figure assesses how well our predictions prioritize four sets of indications. A) The y-axis labels denote the number of indications (+) and non-indications (−) composing each set. Violin plots with quartile lines show the distribution of indications when compound–disease pairs are ordered by their prediction. In all four cases, the actual indications were ranked highly by our predictions. B) ROC Curves with AUROCs in the legend. C) Precision–Recall Curves with AUPRCs in the legend.
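For readers less familiar with the AUROC reported in Panel B, it can be read as the probability that a randomly chosen indication is ranked above a randomly chosen non-indication. The sketch below computes it by brute force on made-up scores; it is not the code used to produce the figure, and the scores and labels are hypothetical.

```python
def auroc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted probabilities and labels (1 = indication).
scores = [0.165, 0.086, 0.014, 0.004, 0.002, 0.001]
labels = [1, 1, 0, 1, 0, 0]
print(auroc(scores, labels))
```

The pairwise loop is quadratic, which is fine for a toy example; for sets as large as the ones above, a rank-based implementation from a statistics library would be the practical choice.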

A variant of Panel A above designed to be more familiar to the biologist shows where the positives lie:

Daniel Himmelstein Researcher

# Announcing Prediction Browsing and Visualization

The Project Rephetio Browser is live at het.io/repurpose letting you browse 209,168 drug repurposing predictions. Users can navigate by compound or by disease. Each prediction can be investigated in the Hetionet Browser — a read only Neo4j database available at neo4j.het.io that we just released.

For each prediction, we created a guide for the Neo4j Browser to provide additional information. Details include a query to visualize the top ten paths contributing to a prediction as well as links to clinical trials investigating a compound–disease pair.

## Seeking feedback

We'd love to hear what people think about our predictions. Are they reasonable? Are they interesting? Where do we do well? Where do we do poorly?

We're especially interested in a few examples that showcase our approach. So please share any compound–disease pairs where our approach did a good job capturing the relevant pharmacology and disease pathophysiology.

Paging our contributors with clinical or pharmacological expertise: @pouyakhankhanian @chrissyhessler @TIOprea @cknoxrun @mkgilson @alexanderpico @allisonmccoy @ritukhare. All feedback is welcome!

Chrissy Hessler Researcher

Great work! I have just started familiarizing myself with this database so I apologize if I misunderstand the terminology.

I notice that in migraine, amitriptyline does not have a high predicted probability of treating migraine (0.013). Yet this is a medication often used for migraine, similar to nortriptyline, which scores higher (0.086).

Am I understanding this correctly? If so, why does amitriptyline not score highly?

Chrissy Hessler Researcher

A small point: I think we should use the term "epilepsy" rather than "epilepsy syndrome". The word syndrome implies that the epilepsy is part of a constellation of associated features. Perhaps Ari and Pouya can weigh in.

Daniel Himmelstein Researcher

I think we should use the term "epilepsy" rather than "epilepsy syndrome"

We rely on the Disease Ontology for coding diseases [1, 2]. I'm hesitant to deviate from the Disease Ontology names because it adds an extra layer of complexity and upkeep. Instead, I relayed your request to the DO Issue Tracker, as we've done in the past [3].

Presently, we also have issues with some compound names that are ALL CAPS due to upstream issues in DrugBank. If many people feel strongly about these names, I'll reconsider fixing them on our end.

Daniel Himmelstein Researcher

# Amitriptyline versus nortriptyline for treating migraine

I notice that in migraine, amitriptyline does not have a high predicted probability of treating migraine (0.013). Yet this is a medication often used for migraine, similar to nortriptyline, which scores higher (0.086).

@chrissyhessler, great question regarding our migraine predictions.

## Amitriptyline

We can use the Neo4j Browser to investigate the amitriptyline–migraine prediction (follow this link and press the play button). I've copied the first paragraph of the guide:

Project Rephetio predicted a probability of 1.390% that Amitriptyline (DB00321) treats migraine (DOID:6364). This probability represents a 2.85-fold enrichment over the background prevalence of treatment. This prediction is in the 97.8th percentile for Amitriptyline and the 93.9th percentile for migraine.

So one thing to note is that while the probability is low, amitriptyline comparatively scores highly. Our algorithm bases its probabilities around a positive prevalence of 0.36%, since there are 755 known treatments out of 209,168 possible compound–disease pairs. Therefore, while 1.39% sounds low, it's actually a 2.85-fold enrichment. Now let's look into the top 10 paths supporting that amitriptyline treats migraine (by executing the Cypher query in the guide):
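The guide text does not spell out how the fold enrichment is calculated. The figures quoted in this thread look consistent with the relative excess over the background prevalence, (p − prevalence) / prevalence, rather than the plain ratio p / prevalence. That reading is an inference from the numbers, not something confirmed against the project code, so treat the sketch below as an assumption.

```python
def relative_enrichment(probability, prevalence):
    """Excess of a predicted probability over the background, in units of the background.

    Assumed formula: inferred from the quoted figures, not taken from the project code.
    """
    return (probability - prevalence) / prevalence


prevalence = 755 / 209_168  # background prevalence of treatment, about 0.36%

# Predicted probabilities quoted in this thread.
quoted = {
    "amitriptyline-migraine": 0.01390,
    "nortriptyline-migraine": 0.08615,
    "bupropion-nicotine dependence": 0.01265,
}
for pair, probability in quoted.items():
    print(pair, round(relative_enrichment(probability, prevalence), 2))
```

Under this assumption the results agree with the quoted enrichments to within rounding of the inputs.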

Amitriptyline–migraine is actually a cool example because it's not overwhelmingly supported by a single path. The 10th most influential path above (amitriptyline–binds–HTR1A–associates–migraine) still provides 3.53% of the total support (as shown in the second slide of the guide). 29.7% of the support is due to CpDpCtD paths (as shown in the third slide), which means amitriptyline palliates the same diseases as compounds that treat migraine. 29.0% of the support is due to CbGaD paths: amitriptyline binds to migraine-associated proteins. 16.7% of the support is due to CbGbCtD paths: amitriptyline binds to similar proteins as compounds that treat migraine. 9.1% of the support is due to CrCrCtD paths: amitriptyline is chemically similar to a compound that is chemically similar to a migraine treatment. The list goes on with support due to pathways (CbGpPWpGaD), side effects (CcSEcCtD), and tissue specificity (CbGeAlD).
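To make the per-metapath percentages concrete, here is a sketch of how path-level contributions could be rolled up by path type. The records are invented for illustration (they are not the amitriptyline–migraine values), and the function name is hypothetical.

```python
from collections import defaultdict

# Invented (metapath, contribution) records, one per supporting path.
path_contributions = [
    ("CpDpCtD", 0.20), ("CpDpCtD", 0.10),
    ("CbGaD", 0.25), ("CbGaD", 0.05),
    ("CbGbCtD", 0.15),
    ("CrCrCtD", 0.10),
    ("CbGpPWpGaD", 0.15),
]


def percent_support_by_metapath(records):
    """Sum contributions per metapath and express each as a percent of the total."""
    totals = defaultdict(float)
    for metapath, contribution in records:
        totals[metapath] += contribution
    grand_total = sum(totals.values())
    return {metapath: 100 * value / grand_total for metapath, value in totals.items()}


support = percent_support_by_metapath(path_contributions)
for metapath, percent in sorted(support.items(), key=lambda item: item[1], reverse=True):
    print(f"{metapath}: {percent:.1f}%")
```

A prediction with support spread across several metapaths, as in this toy data, is less dependent on any single edge in the network being correct.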

## Nortriptyline

If you already have the Neo4j Browser open, you can play the nortriptyline–migraine guide by running :play https://neo4j.het.io/guides/rep/DB00540/DOID_6364.html. The overview and 10 most supportive paths are copied below:

Project Rephetio predicted a probability of 8.615% that Nortriptyline (DB00540) treats migraine (DOID:6364). This probability represents a 22.86-fold enrichment over the background prevalence of treatment. This prediction is in the 100.0th percentile for Nortriptyline and the 99.4th percentile for migraine.

Notice that many paths are the same for nortriptyline and amitriptyline. Both compounds target serotonin receptors that are associated with migraine. Both compounds share symptomatic uses that support their efficacy against migraine. Now the one major difference is the nortriptyline–resembles–amitriptyline–treats–migraine path, which provides 24.2% of the total support for the nortriptyline prediction.

Why doesn't amitriptyline draw support from an amitriptyline–resembles–nortriptyline–treats–migraine path? PharmacotherapyDB doesn't contain an indication between nortriptyline and migraine but does contain an indication between amitriptyline and migraine. While we ignore the amitriptyline–migraine treatment when extracting paths for the amitriptyline–migraine prediction, we do not ignore this treatment when extracting paths for the nortriptyline–migraine prediction.
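The asymmetry can be reproduced on a two-compound toy network. The sketch below uses a raw path count as a stand-in for whatever feature the model actually computes; the network, the function name, and the set-based representation are assumptions made for illustration.

```python
# Toy network, assumed for illustration only; the real hetnet is far larger.
resembles = {frozenset({"amitriptyline", "nortriptyline"})}
treats = {("amitriptyline", "migraine")}  # nortriptyline-migraine is absent


def count_CrCtD_paths(compound, disease, resembles, treats):
    """Count compound-resembles-X-treats-disease paths, masking the pair being predicted."""
    # Hide the treats edge for the pair under prediction, if it exists.
    visible_treats = treats - {(compound, disease)}
    count = 0
    for edge in resembles:
        if compound in edge:
            (neighbor,) = edge - {compound}
            if (neighbor, disease) in visible_treats:
                count += 1
    return count


print(count_CrCtD_paths("nortriptyline", "migraine", resembles, treats))  # 1
print(count_CrCtD_paths("amitriptyline", "migraine", resembles, treats))  # 0
```

Nortriptyline gets a CrCtD path through amitriptyline's known indication, while amitriptyline gets none because the only compound it resembles has no recorded migraine indication.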

Hence, @chrissyhessler stumbled upon an interesting example of how missing knowledge in Hetionet combines with our machine learning approach to mediate predictions.

Daniel Himmelstein Researcher

# Bupropion for nicotine dependence

I was looking into several historical instances of repurposing (compiled from a 2013 article [1]). One example that caught my eye was bupropion for nicotine dependence. Here's some historical context from a 2008 review [2]:

Bupropion was developed as an antidepressant for the treatment of major depressive disorder in 1989 as a thrice-daily immediate release formulation [3]. In 1996 a twice daily sustained release (SR) formulation was produced and, in 1997 the smoking cessation properties were first noticed in the United States of America (US) [4]. Following evaluation as an anti-smoking agent [5], it became licensed as an aid to smoking cessation and is now a recognised first line antismoking agent in both the UK and US.

The following quote from a 2012 article speaks to the serendipity of this repurposing [6]:

The ability of bupropion to facilitate smoking cessation was discovered serendipitously when it was shown to decrease cigarette consumption in depressed patients. The precise mechanism through which bupropion facilitates abstinence is unclear but nevertheless raises the intriguing possibility that other FDA-approved medications could likewise facilitate smoking cessation through novel mechanisms of action.

I looked into our prediction for bupropion and nicotine dependence (see in browser):

Project Rephetio predicted a probability of 1.265% that Bupropion (DB01156) treats nicotine dependence (DOID:0050742). This probability represents a 2.50-fold enrichment over the background prevalence of treatment. This prediction is in the 96.3th percentile for Bupropion and the 99.5th percentile for nicotine dependence.

While the predicted probability is not exceptional, bupropion was in the 99.5th percentile (ranked 9th) for nicotine dependence. The following diagram shows the 10 paths providing the most support for this prediction:

Our approach picks up that bupropion causes terminal insomnia as does varenicline — the only other FDA-approved smoking cessation compound. Indeed, insomnia is one of the primary side effects of bupropion [7, 2] and is also a documented side effect of varenicline [8]. This shared side effect could underlie a common mechanism of action.

## Top predictions for nicotine dependence

Our top 3 predictions for treating nicotine dependence (or smoking cessation) are nicotine, cytisine, and galantamine. Nicotine is included in PharmacotherapyDB as a symptomatic indication and matched 86 clinical trials. Cytisine is an encouraging prediction because it's shown efficacy in trial [9, 10] and has been used by smokers for decades as a well-tolerated and low-cost cessation therapy [11]. There is currently an ongoing clinical trial at the University of Pennsylvania for galantamine and quitting smoking. The trial has recently shown positive early results [12].

Daniel Himmelstein Researcher

# Predictions for decitabine

Decitabine (DB01262) is a treatment for myelodysplastic syndromes — which are not included in Hetionet v1.0 as a disease. In Europe, decitabine was approved for acute myeloid leukemia (AML) in 2012, but Hetionet v1.0 does not contain any indications for this drug. Therefore, all of decitabine's predictions are novel from the perspective of the network.

The top prediction at 16.5% is hematologic cancer, a supertype of AML. This prediction draws support from a large number of paths. Decitabine binds DNMT3A and DCK, two hematologic cancer associated genes. Furthermore, several treatments for hematologic cancer — including Pentostatin, Clofarabine, Nelarabine, Cytarabine — belong to the "Nucleic Acid Synthesis Inhibitors" pharmacologic class along with decitabine. And finally, decitabine chemically resembles the four other compounds mentioned above.

Other top cancer predictions for decitabine include the lymphatic system (4.87%), stomach (2.94%), breast (2.74%), pancreas (1.76%), bladder (1.71%), lung (1.48%), and kidney (1.47%). All of these cancers besides the stomach have clinical trials for decitabine. Below we visualize the most supportive paths for lung cancer. Most of the support comes from decitabine's similarity to the lung cancer therapy gemcitabine. Our method also picked up that decitabine's target genes DNMT1 and DNMT3A are expressed in relevant anatomies for lung cancer. However, we really only care about the tissue-specificity of the genes contained within potentially mechanistic paths, such as DCK below, which is required to enzymatically activate decitabine [1]. Therefore, a future direction could be to engineer meta-patterns (rather than paths) that capture the benefits of tissue-specificity [2].

The fifth ranking prediction, however, is not a cancer but instead multiple sclerosis (MS). The MS prediction is based on decitabine's resemblance to cladribine — an efficacious MS treatment that was never approved due to safety concerns [3, 4, 5], although risks may have been overstated [6] and regulatory review has recommenced. Additional support for treating MS with decitabine came from azathioprine — a longstanding and economical MS therapy [7, 8] — also being a nucleic acid synthesis inhibitor. Supporting this prediction, a 2014 abstract reported that decitabine completely blocked symptoms in EAE (a preclinical model of MS) [9]. The decitabine–MS prediction is visualized below:

Daniel Himmelstein Researcher

# Predictions for trifluridine

Trifluridine (DB00432) is an antiviral drug most commonly applied via solution to the eye for treatment of herpes simplex virus (HSV). Trifluridine does not contain any indications in Hetionet v1.0. Here are the top five predictions for trifluridine:

| Name | Prediction | Compound Pctl | Disease Pctl | Visualize |
|---|---|---|---|---|
| acquired immunodeficiency syndrome | 14.4% | 100% | 99.4% | browser |
| stomach cancer | 13.1% | 99.3% | 99.9% | browser |
| hepatitis B | 5.37% | 98.5% | 99.5% | browser |
| hematologic cancer | 3.19% | 97.8% | 96.0% | browser |
| multiple sclerosis | 2.18% | 97.1% | 98.2% | browser |

HSV isn't included as a disease in Hetionet v1.0, yet Project Rephetio picked up on the antiviral powers of trifluridine and suggested treatment of acquired immunodeficiency syndrome (AIDS) and hepatitis B (HBV). The AIDS prediction (visualized below) is supported by the five existing AIDS therapies that are also nucleoside analogs. The similarity between trifluridine and stavudine/zidovudine is further supported by chemical resemblance and in the case of zidovudine a common binding to thymidine kinase-1 (TK1 gene). Thymidine kinase-1 phosphorylates trifluridine into its active form, trifluridine monophosphate [1].

Several of the top predictions for trifluridine were cancers. It turns out that in 2015, the FDA approved a combination of trifluridine and tipiracil — code named TAS-102 and trade named Lonsurf — to treat advanced colorectal cancer after it increased median survival from 5.3 to 7.1 months in Phase 3 trial [2].

The therapeutic effect of TAS-102 on colorectal cancer is due to the cytotoxicity of trifluridine monophosphate [3]. Cytotoxicity occurs due to thymidylate synthase inhibition and DNA dysfunction [4, 5]. The addition of tipiracil improves the bioavailability of trifluridine by inhibiting thymidine phosphorylase, which degrades trifluridine. Colon cancer was the eighth-highest prediction for trifluridine at 1.5% (99.0th percentile for colon cancer) and is visualized below:

The prediction is supported by other colon cancer treatments (capecitabine and fluorouracil) also being nucleic acid synthesis inhibitors. Furthermore, trifluridine binds to thymidylate synthetase (TYMS gene) and thymidine phosphorylase (TYMP gene), which are respectively responsible for its efficacy and short half-life as explained above. Our method picks up that TYMP is expressed in the endothelium. A pharmacologist could interpret this knowledge to infer that tipiracil is necessary to prolong bioavailability when delivering trifluridine to endothelial cancers. Also important for identifying which cancers trifluridine will treat best is the tissue-specific expression of TYMS and TK1. In fact, a recent study found stomach cancers (our top cancer prediction for trifluridine) highly express both the TYMS and TK1 proteins [6]. We can query Hetionet to see which stomach-cancer-affected tissues upregulate thymidylate genes:

    MATCH path = (compound:Compound)-[:BINDS_CbG]-(gene:Gene)-[:UPREGULATES_AuG]-(:Anatomy)-[:LOCALIZES_DlA]-(disease:Disease)
    WHERE
      compound.name = 'Trifluridine' AND
      gene.description CONTAINS 'thymid' AND
      disease.name = 'stomach cancer'
    RETURN path

Finally, trifluridine was also predicted to treat multiple sclerosis. The mechanisms underlying this prediction are similar to those for decitabine. Trifluridine resembles cladribine. Furthermore, like the MS-treatment azathioprine, trifluridine is a nucleic acid synthesis inhibitor and nucleoside analog. Additionally, the immunosuppressant methotrexate is an effective therapy for MS [7]. While the exact mechanism of methotrexate's efficacy against MS remains elusive [8], our prediction highlights the potential involvement of thymidylate synthetase (TYMS).

In conclusion, Project Rephetio identified three categories of disease — neoplastic, autoimmune, and viral — where trifluridine may exhibit efficacy. While little literature existed which probed the non-cancer applications, this appears to be an area of ongoing and relentless discovery.

Pouya Khankhanian Researcher

In seeking other examples, I perused somewhat randomly the top predictions of three diseases that I'm familiar with. I think it would be a very difficult process to evaluate diseases that I'm not familiar with.

Epilepsy:
The top ~45 indications all have scores greater than 10. The distribution of scores seems much higher than in other diseases.
scores 50-60 — 4 indications
scores 40-50 — 8 indications
scores 30-40 — 8 indications
scores 20-30 — 9 indications
scores 10-20 — 14 indications
Of the top ~45, all but 2 are clinically classified as "anti-epileptics" so that's excellent! There are three of interest:
– Sevoflurane is not known as an "anti-epileptic". But sevoflurane is an Anesthetic with rather incompletely known mechanism of action. At least a handful of other anesthetics have been repurposed for epilepsy. The mechanism of action could involve GABA receptors.
– Amitriptyline (score 25, 98.4 percentile) is actually relatively contraindicated in epilepsy; it can worsen epilepsy. It would be potentially interesting to explore why this drug was picked.
– Tiagabine (score 28, 98.6 percentile) I believe was mis-categorized as a "NOT" by all three reviewers, sadly. I think it should have been a DM based on the rules.

Multiple Sclerosis:
The distribution of scores is much lower.
Scores 10-15 — 3 indications
Scores 5-10 — 6 indications
Of these top 9, I think 7 are steroids so that makes sense. Here are the others:
– Clofarabine. It's a chemotherapy for cancers of leukocytes. Chemotherapies for leukocyte cancers have been previously repurposed for MS (like Rituximab and Ocrelizumab for example).
– Pemetrexed is a chemotherapy for small cell lung cancer (among others). Chemotherapies as a large class have been previously repurposed for MS.

Migraine:
Scores 30-40 — 1 indication
Scores 20-30 — 1 indication
Scores 10-20 — 6 indications
Scores 5-10 — 14 indications
– Oxcarbazepine, top score 33, was an outlier score, but as we have seen above the score is often less relevant than the rank. Still, I looked further into it, apparently people did believe in it (it is of a class that had already been successfully repurposed for migraine), but it failed after multiple trials.
All the drugs with score >5 are of known drug classes that have been successfully repurposed for migraine (anti-epileptics, neuroleptics, TCAs, B-blockers, SSRI).

In summary, the top scores come from drugs that are good candidates because they belong to a drug class that has already been repurposed for a given indication. I have not dug deep enough to find a "novel idea" (a repurposing suggestion for a drug that is not already in a class of drugs that has been previously repurposed successfully). In searching for a novel idea, I can think of two approaches
1. delve deeper into lower scores and lower percentiles in each disease, specifically scores less than 5, and percentiles less than 95%.
2. look for high-score / high-percentile "novel ideas" in all the other diseases. This would require massive user input (Ari, Chrissy, and I working together would probably require like 30 solid days of work). But it could be automated.

Daniel Himmelstein Researcher

# Epilepsy

@pouyakhankhanian, that's exciting to hear our top epilepsy predictions (above 10%) had a precision of 95.6% (43 / 45). Epilepsy seems to be an interesting disease because despite the many established therapies, there's still a large portion of patients with seizures that are uncontrolled by current medications [1].

## Tiagabine–Epilepsy

Tiagabine was classified as a non-indication in PharmacotherapyDB v1.0, but is used to treat epilepsy [2]. However, in some settings tiagabine may also trigger seizures [3]. This example brings up an interesting consideration: many anti-epileptic drugs may also be epileptogenic (epilepsy inducing) depending on the context [4].

## Amitriptyline–Epilepsy

@pouyakhankhanian brought up that amitriptyline is contraindicated for epilepsy. Accordingly, several studies suggest that amitriptyline causes seizures [5, 6, 7]. Our approach found the following support for amitriptyline treating epilepsy:

• 21.5% of the prediction resulted from: Amitriptyline–treats–Migraine–resembles–Epilepsy
• 15.2% of the prediction resulted from: Amitriptyline–resembles–Oxcarbazepine–treats–Epilepsy
• Many other paths of the following metapaths also contributed: CbGaD, CpDpCtD, CrCrCtD, CbGbCtD

It's likely that our method cannot fully differentiate between indications and contraindications. In other words, certain paths may increase the probability of indication and contraindication similarly and our predictions may also enrich for detrimental therapies.

## Sevoflurane–Epilepsy

Regarding the prediction that sevoflurane treats epilepsy, this is the top prediction for sevoflurane and is in the 97.3rd percentile for epilepsy. Here are the top ten paths supporting this prediction:

Contrary to our prediction, I did find some evidence that sevoflurane could be epileptogenic [8, 9, 10]. Potentially a bigger concern is whether sevoflurane has appropriate pharmacokinetics for epilepsy. According to DrugBank (DB01236), sevoflurane does cross the blood–brain barrier, which is important. But sevoflurane is volatile and administered via inhalation. This may not always be a disqualifying factor: for example, some pharmaceutical companies are more interested in "drug repositioning aided by reformulation" than solely repositioning [11]. In this case, though, I just don't know.

Daniel Himmelstein Researcher

# Clofarabine

Clofarabine (DB00631) is a purine nucleoside analogue, which was approved in 2004 for special cases of paediatric leukaemia [1, 2, 3]. Our top prediction for clofarabine at 18.5% is hematologic cancer (a superterm of its approved indication), which is included in the network as a disease-modifying indication. At 10.2% the second prediction is lymphatic system cancer, which has been investigated by 14 trials.

The third prediction is multiple sclerosis (MS) at 8.80%, representing a 24-fold enrichment over the null probability. MS is the only non-cancer prediction for clofarabine in the top 10. Furthermore, clofarabine is the fourth highest MS prediction with the three higher predictions corresponding to known disease-modifying therapies. Hetnet support for the prediction is shown below:

Clofarabine is a hybrid of cladribine and fludarabine [2, 4] as detected by Hetionet's chemical resemblance relationships. Cladribine showed promising phase 3 results and was approved in Australia and Russia before being withdrawn due to safety concerns [5, 6] (see more discussion on cladribine for MS above). Our method picked up on the similarities between cladribine and clofarabine, both in terms of structure and targets. A recent phase II add-on study found that fludarabine may have greater efficacy against MS than methylprednisolone [7]. In addition, clofarabine is a nucleic acid synthesis inhibitor like azathioprine, an effective multiple sclerosis treatment [8]. Clofarabine relies on deoxycytidine kinase (DCK gene) for intracellular phosphorylation and activation [9].

While I didn't see any trials for clofarabine in MS, purine nucleoside analogues have had success in autoimmune disease [10]. Additionally, US Patent US7772206 claims the following application for clofarabine:

In a preferred embodiment, the invention encompasses a method for treating, preventing, or managing multiple sclerosis utilizing doses higher than 1 mg/kg per day, preferably higher than 1.25 mg/kg per day.

Pouya Khankhanian Researcher

The epilepsy discussion has been moved to:
https://thinklab.com/discussion/prediction-in-epilepsy/224#2

Daniel Himmelstein Researcher

# Clofarabine Synopsis

Above, I discussed our clofarabine predictions. I created a summary of those findings for our project report. However, in private communications @pouyakhankhanian noted:

Regarding clofarabine, my only issue is that it's very obvious. I suspect that if you ask ten clinicians about using clofarabine in MS, I suspect most would guess that clofarabine would work in MS patients. Perhaps you are looking for something obvious in order to validate your approach. But it's not something that will ever result in someone repurposing a drug based on your predictions.

Therefore, we've decided to remove this example from the draft report. To preserve the content, I will post it here.

Clofarabine is an FDA-approved treatment for certain cases of acute lymphoblastic leukemia (ALL) [1]. Our top prediction for clofarabine was hematologic cancer (50.27-fold over null), which is a supertype of ALL and listed as a disease-modifying indication in PharmacotherapyDB. Second was lymphatic system cancer (27.14-fold), which matched 14 clinical trials. Third was multiple sclerosis (MS, 23.37-fold). Notably, MS was the only non-cancer in the top 10 predictions for clofarabine, and clofarabine was in the 99.8th percentile of MS predictions. As shown in Figure 5, our approach based this prediction on the successful repurposing of other chemotherapeutics for MS [2], particularly cladribine [3] and azathioprine [4]. Besides a patent that encompasses treating MS with clofarabine, there is little literature on this potential repurposing, despite the favorable pharmacological properties of clofarabine compared to cladribine [5].

### Figure

Legend for figure above:

Evidence supporting the repurposing of clofarabine for multiple sclerosis. In total, 769 paths of 10 types provided positive support for repurposing clofarabine for MS. The ten most supportive paths are visualized. Several important aspects of clofarabine's pharmacology are illustrated. Clofarabine is a hybrid of cladribine and fludarabine and relies on deoxycytidine kinase (DCK) for phosphorylation into its active form [5]. Ultimately, the purine-analog-metabolites of clofarabine, cladribine, and azathioprine inhibit nucleic acid synthesis. Since the resulting cytotoxicity is pronounced in lymphocytes, these compounds are attractive MS therapeutics [2].

Pouya Khankhanian Researcher

# Plot of prediction scores by disease

The y-axis is the prediction score (the predicted probability of treatment) for compounds in each disease. The 97th, 98th, and 99th percentile prediction scores in each disease are also shown. Allergic rhinitis, asthma, coronary artery disease, epilepsy, hematologic cancer, hypertension, osteoporosis, psoriasis, and type 2 diabetes are highlighted in red; these diseases had the highest prediction score tail distributions.
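The per-disease percentile lines in a plot like this can be computed with the standard library. The sketch below uses made-up score distributions, not the actual predictions, and the function name is hypothetical.

```python
from statistics import quantiles

# Hypothetical prediction scores (probabilities) for compounds in two diseases.
scores_by_disease = {
    "epilepsy": [i / 1000 for i in range(1, 601)],   # heavier upper tail
    "migraine": [i / 10000 for i in range(1, 601)],  # lighter upper tail
}


def upper_percentiles(scores):
    """Return the 97th, 98th, and 99th percentile scores."""
    cuts = quantiles(scores, n=100, method="inclusive")  # 99 cut points
    return {97: cuts[96], 98: cuts[97], 99: cuts[98]}


tails = {disease: upper_percentiles(s) for disease, s in scores_by_disease.items()}
for disease, percentiles in tails.items():
    print(disease, percentiles)
```

Comparing these upper percentiles across diseases is one way to flag the diseases whose top predictions sit well above the rest, as with the ones highlighted in red.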

Cite this as
Daniel Himmelstein, Chrissy Hessler, Pouya Khankhanian (2016) Predictions of whether a compound treats a disease. Thinklab. doi:10.15363/thinklab.d203