Creating the Onco-Tensor: An AI-ready repository for multi-omic cancer data

Bainom
Chicago, Illinois
Biology, Data Science
$50
Pledged
1%
Funded
$5,250
Goal
15
Days Left

About This Project

Multi-omic data (e.g., genomics, transcriptomics) holds the key to identifying novel solutions for some of the most pressing problems in the biomedical sciences, such as cancer precision therapy [1]. A large amount of this data exists in databases but remains unanalyzed [2].

We hypothesize that extracting and integrating these multi-omic features across major cancer types will reveal novel molecular signatures that correlate with treatment response or disease progression.


What is the context of this research?

Advances in next-generation sequencing (NGS) methods (e.g., for genomics and transcriptomics) have enabled a range of cancer omic analyses at the molecular level [1]. These methods identify features underlying cancer pathophysiology and are pivotal to devising therapeutics [2]. Integrated analysis of multiple omic features can suggest the optimal course of treatment for an individual patient, an approach termed precision medicine [3].

To gain insights, patient omic data is matched against existing omic patterns accumulated through years of research [4]. A major hurdle is that a large amount of omic data with targetable but as yet unanalyzed features sits in several databases. This data is also heterogeneous, requires complex analysis steps, has not been validated for clinical actionability, and lacks a multi-omic view [5].

We hypothesize that integrating 15 key omic features across major cancer types will reveal molecular patterns that can guide more effective, data-driven approaches to cancer precision therapy.

What is the significance of this project?

A multi-omic view of disease progression holds the key to identifying novel solutions for some of the most pressing problems in healthcare and the biomedical sciences, such as cancer precision therapy and the discovery of new drug targets or biomarkers [1].

Rapid advances in DNA, RNA, and protein sequencing methods have made it possible to obtain a complete molecular profile of a patient [2]. However, the existing multi-omic data needed for matching against patient data remains largely unanalyzed and scattered [3]. This omic data is large and heterogeneous and, importantly, extracting meaningful insights from it requires specific technical skills, scientific know-how, and computational infrastructure, which are largely lacking among primary stakeholders [4].

This project will provide a proof of concept for how a user-facing, AI/ML-driven tool encompassing pre-analyzed, pre-validated, clinically actionable omic features can drive deeper and faster biological insights.

What are the goals of the project?

Completing this project would provide a proof of concept and identify novel multi-omic oncologic patterns that correlate with clinical outcomes in cancer patients.

The outcome of this work is also critical for our larger goal of developing a disease-wide Multi-omic Tensor covering 50+ omic features across 85+ human disease conditions (>2 billion data points). Our eventual goal is to combine this Multi-omic Tensor with AI/ML-driven omic data analysis kits in the cloud to build the world's first GPS for biomedical sciences.


Budget


The funds will cover the computational costs of mining, processing, and analyzing the omic data. They will also pay for one bioinformatics analyst and for training and testing an unsupervised machine learning model. Finally, the funds will allow us to deploy the Onco-Tensor in the cloud, where stakeholders can easily explore these patterns.


Endorsed by

This project solves longstanding problems for many researchers who wish to use transcriptomic and genomic data in their daily work. Deepak is exceptionally well qualified to solve these problems.
I am truly enthusiastic about Dr. Sharma’s project, which promises to deliver novel insights into cancer biology by integrating multi-omics data analysis with clinical information. Dr. Sharma brings the expertise, deep knowledge, and unwavering enthusiasm necessary to drive this ambitious endeavor to success.

Project Timeline

Sequencing data (FASTQ format) for 1000 cancer datasets from 3 major cancer databases (TCGA, GEO, and UK Biobank) will be mined. The data will be harmonized, quality-controlled, and mapped to the genome. Using bioinformatic workflows, genome-wide counts for 15 omic features will be obtained. Machine learning will stratify features into cancer-specific molecular subtypes. A repository and a user-facing analytic dashboard will be developed for exploration.
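
To make the last two analysis steps more concrete, here is a minimal sketch of how per-sample feature counts could be normalized and then grouped without labels. It is an illustration under stated assumptions, not the project's pipeline: the project does not specify its normalization or clustering method, so log2 counts-per-million and k-means (with deterministic farthest-first seeding) are stand-ins, the function names are hypothetical, and the count matrix is invented toy data with 6 samples and 4 features.

```python
import math

def log_cpm(counts):
    """Log2 counts-per-million for one sample, with a pseudocount of 1."""
    total = sum(counts)
    return [math.log2(c / total * 1e6 + 1.0) for c in counts]

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_first(points, k):
    """Deterministic seeding: start at the first point, then repeatedly
    pick the point farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(nxt)
    return centroids

def kmeans(points, k, max_iter=100):
    """Plain k-means: alternate assignment and centroid update until
    the labels stop changing (or max_iter is reached)."""
    centroids = [list(c) for c in farthest_first(points, k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Invented toy data: rows are samples, columns are feature counts.
raw_counts = [
    [900, 800, 10, 20],
    [950, 780, 15, 25],
    [870, 820, 12, 18],
    [15, 20, 880, 910],
    [10, 25, 900, 870],
    [20, 15, 860, 930],
]

normalized = [log_cpm(sample) for sample in raw_counts]
labels = kmeans(normalized, k=2)
print(labels)  # prints [0, 0, 0, 1, 1, 1]
```

On real data the number of clusters would not be known in advance, and a library implementation with a cluster-selection criterion would likely be preferable to this hand-rolled version; the sketch only shows the shape of the computation, from counts to normalized features to subtype labels.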

Nov 20, 2025

Project Launched

Dec 31, 2025

Mining of raw sequencing data for 1000 cancer datasets from TCGA, GEO, and UK Biobank (15 cancer types).

Jan 21, 2026

Harmonizing the data and setting up feature extraction.

Feb 28, 2026

Data quality control, mapping to genome, analysis and extraction of 15 omic features.

Mar 16, 2026

Machine-learning-based unsupervised clustering.

Meet the Team

Deepak Sharma
Principal Bioinformatics Scientist

Affiliates

Bainom

Deepak Sharma

I am a PhD scientist with a background in analyzing large-scale multi-omic data (e.g., genomics, transcriptomics, proteomics) and finding clinically actionable patterns in it [1]. Ever since I began my journey in science, I have been fascinated by the hidden patterns that omic data holds, and by the patterns that are already present in the large amount of existing data but are currently difficult to interpret.

Throughout my 15+ year research career, I have worked at several universities and published >30 research articles in prestigious journals (e.g., Nature [2], JCI [3]). Currently I work as the Principal Bioinformatics Scientist at Bainom, a startup that I co-founded [4]. At Bainom, our goal is to create a GPS for biomedical sciences by merging multi-omic data analysis tools (standalone, query-specific, biology-centric "bioinformatic kits") with all existing data from 80+ databases into a single flow that accelerates drug discovery, precision health, and biomedical research.
