Inspiration

In drug discovery and drug repurposing, the interactions between objects are significant. Graph-based (network-based) inference can use and expose these interactions efficiently. This kind of inference relies on the topology of the graph, but the attributes of the objects can also play an important role in the analysis. Combining the topological properties of the graph with the data attributes can therefore give better results than using either one alone.

What it does

  • Search ANY string attribute inside the TigerGraph database easily.
  • Filter attributes and types on string search.
  • Also search client-side data and highlight the matches in the user interface.
  • Run 'Interpreted' and 'Installed' GSQL queries.
  • Inspect attributes of a vertex/edge.
  • Inspect attributes of multiple vertices and edges as a table.
  • Dynamically resize graph canvas.
  • Save and load data as JSON for visualization.
  • Summarize crowded neighborhoods with containers to make the visualization readable.
  • Go back and forward through the history of the graph.
  • Get Adamic-Adar values of a vertex to all other visible nodes to predict connections.
  • Get Jaccard similarity of a vertex to all other vertices in the whole database. Here Jaccard similarity is defined as (count of the intersection of 1-neighborhood) / (count of the union of 1-neighborhood).
  • Get InChI similarity of a Compound to all other visible Compounds. Here the similarity of InChI values is calculated with an edit distance algorithm.
  • Bring neighbors of a vertex from a specific type or all types.
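
The two topology-based scores in the list above can be sketched in a few lines of Python. This is only an illustration, not the project's GSQL implementation: the adjacency dictionary, the vertex names (c1, g1, ...), and the function names are hypothetical. The Jaccard function follows the definition given above, and the Adamic-Adar function uses the standard definition (sum of 1 / log(degree) over common neighbors), which is assumed to match what the tool computes.

```python
import math


def jaccard_similarity(adj, u, v):
    """Size of the intersection of the 1-neighborhoods over the size of their union."""
    nu, nv = adj.get(u, set()), adj.get(v, set())
    union = nu | nv
    if not union:
        return 0.0  # neither vertex has neighbors
    return len(nu & nv) / len(union)


def adamic_adar(adj, u, v):
    """Sum of 1 / log(degree(z)) over every common neighbor z of u and v."""
    score = 0.0
    for z in adj.get(u, set()) & adj.get(v, set()):
        degree = len(adj.get(z, set()))
        if degree > 1:  # guard: log(1) == 0 would cause a division by zero
            score += 1.0 / math.log(degree)
    return score


# Hypothetical undirected toy graph: two compounds (c1, c2) and three genes (g1-g3).
adj = {
    "c1": {"g1", "g2"},
    "c2": {"g1", "g2", "g3"},
    "g1": {"c1", "c2"},
    "g2": {"c1", "c2"},
    "g3": {"c2"},
}

print(jaccard_similarity(adj, "c1", "c2"))  # 2 shared neighbors out of 3 in the union
print(adamic_adar(adj, "c1", "c2"))         # 1/log(2) + 1/log(2)
```

Common neighbors with a low degree contribute more to the Adamic-Adar score, so a shared rare neighbor counts as stronger evidence for a possible connection than a shared hub.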

How we built it

First, we created the dataset using Python, Jupyter Notebook, and the pyTigerGraph library. The code for creating the dataset, creating the data schema, and inserting the data into TigerGraph is available in another repository called "derman". Derman also contains all the data as text inside a compressed file, so you can recreate the database yourself.

To generate the database, we used the DRKG dataset and then enriched it with the DGIdb and Hetionet datasets. In short, we merged these three datasets to create one extensive dataset.

We used Angular and Angular Material in the frontend.

Challenges we ran into

  • Understanding the domain was difficult. We read lots of papers and got help from friends in molecular biology.
  • Creating a useful and understandable dataset was hard. DRKG is a knowledge graph dataset, so it doesn't contain any data properties. We wanted to use both data properties, such as the International Chemical Identifier (InChI), and the topology of the graph. That's why we enriched DRKG with the Hetionet and DGIdb datasets.
  • Writing an efficient GSQL query that scans the whole database is hard. We wrote some queries like that, but later saw that they did not perform well, so we switched to running them on subsets given as parameters.

Accomplishments that we're proud of

  • Original and generic (schema-agnostic) GSQL algorithms such as adamicAdar, editDistance, and jaccardSimilarity.
  • A domain-specific GSQL algorithm, inchiSimilarity.
  • Generic (schema-agnostic) way to search for ANY string inside the database.
  • A resizable graph canvas for easy interaction through the user interface.
  • Compound nodes that make complex graphs a lot more readable.
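
To illustrate the idea behind editDistance and inchiSimilarity, here is a minimal Python sketch of the classic Levenshtein edit distance and a similarity score derived from it. It is not the project's GSQL code. The function names are hypothetical, and the normalization (1 - distance / length of the longer string) is an assumption, since the writeup only states that the similarity is calculated with an edit distance algorithm.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions that turn a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # deleting the first i characters of a
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def string_similarity(a: str, b: str) -> float:
    """Assumed normalization: 1.0 for identical strings, 0.0 for fully different ones."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


# InChI strings of methane and ethane, used only as sample input.
methane = "InChI=1S/CH4/h1H4"
ethane = "InChI=1S/C2H6/c1-2/h1-2H3"
print(edit_distance(methane, ethane))
print(string_similarity(methane, ethane))
```

The row-by-row formulation keeps only two rows of the dynamic programming table in memory, which matters when one string is compared against many others.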

What we learned

  • Powerful GSQL features such as dynamic arrays
  • Using TigerGraph with Docker
  • A little knowledge about drug discovery

What's next for Dervish

  • Many different algorithms can be used to find similarities or connections between diseases and/or compounds.
  • More collaboration with domain experts can make the tool more user-friendly.

References

  • https://medium.com/@orbifold/drug-repurposing-using-tigergraph-graph-machine-learning-5e7fa4e12b0
  • https://het.io/
  • https://dgidb.org/
  • Masoudi-Sobhanzadeh, Yosef, et al. "Drug databases and their contributions to drug repurposing." Genomics 112.2 (2020): 1087-1095.
  • Mathur, Sachin, and Deendayal Dinakarpandian. "Drug repositioning using disease associated biological processes and network analysis of drug targets." AMIA annual symposium proceedings. Vol. 2011. American Medical Informatics Association, 2011.
  • Emig, Dorothea, et al. "Drug target prediction and repositioning using an integrated network-based approach." PloS one 8.4 (2013): e60618.
  • Martinez, Victor, et al. "DrugNet: network-based drug–disease prioritization by integrating heterogeneous data." Artificial intelligence in medicine 63.1 (2015): 41-49.
  • Chen, Xing, Ming-Xi Liu, and Gui-Ying Yan. "Drug–target interaction prediction by random walk on the heterogeneous network." Molecular BioSystems 8.7 (2012): 1970-1978.
  • Hu, Guanghui, and Pankaj Agarwal. "Human disease-drug network based on genomic expression profiles." PloS one 4.8 (2009): e6536.
  • Cheng, Feixiong, et al. "Prediction of drug-target interactions and drug repositioning via network-based inference." PLoS computational biology 8.5 (2012): e1002503.
