Siddharth Jain

Advisor: Professor Jehoshua (Shuki) Bruck
Electrical Engineering Department
Email: sidjain at caltech dot edu
Phone: +1-6266521958
Photo Credits: Prachi Parihar

Research Interests

I am interested in understanding the interplay between computation, information, and evolution. This quest has led me to work on problems ranging from theoretical questions about the fundamental limits of evolution, studied with tools from information theory and the theory of computation, to applied problems in healthcare, primarily cancer classification and viral evolution. From time to time, I have used data science/ML methods on genomic datasets. Because datasets are often noisy and biased, I am also interested in designing algorithms and approaches with theoretical guarantees that ensure robustness in data science/ML. My other research interests include DNA storage and the connections between data compression and machine learning.


Sep 25, 2020: Our paper "Robust Correction of Sampling Bias Using Cumulative Distribution Functions" was accepted to NeurIPS 2020 [arXiv Link]

July 27, 2020: New paper on bioRxiv, "Predicting the Emergence of SARS-CoV-2 Clades" [Link]

May 7, 2020: New paper on arXiv, "Coding for Optimized Writing Rate in DNA Storage" [Link] [Video Link]

Apr 22, 2020: New paper on arXiv, "CodNN - Robust Neural Networks from Coded Classification" [Link]

Mar 31, 2020: Three new papers accepted in IEEE ISIT 2020.

  • "Coding for Optimized Writing in DNA Storage"
  • "CodNN - Robust Neural Networks from Coded Classification"
  • "What is the Value of Data? On Mathematical Methods for Data Quality Estimation"

Jan 17, 2020: Paper accepted in NVMW 2020, "Coded Deep Neural Networks for Robust Neural Computation"

Jan 9, 2020: New paper on arXiv, "What is the Value of Data? On Mathematical Methods for Data Quality Estimation" [Link]

Sep 24, 2019: Our work on cancer classification was accepted to the LMRL workshop at NeurIPS 2019

June 20, 2019: We filed a patent on "Mutation Profile and related labeled Genomic Components, Methods and Systems".

May 17, 2019: Our work "Data Equals Money, But How Much? On Mathematical Methods for Data Pricing" was accepted to the CodML workshop at ICML

Apr 1, 2019: I successfully defended my PhD thesis "Decoding the Past" [Link]

Mar 21, 2019: I gave an invited talk at the Conference on Information Sciences and Systems (CISS) at Johns Hopkins

Feb 13, 2019: I gave an invited graduation day talk at the ITA Workshop in San Diego

Jan 11, 2019: Our article on Cancer Classification using healthy DNA is available on bioRxiv [Link]

Jan 11, 2019: Our article on the statistical bias present in short tandem repeats in amplified samples on TCGA is now available on bioRxiv [Link]


  • Reviewer for IEEE Transactions on Information Theory
  • Reviewer for IEEE Transactions on Communications
  • Reviewer for ACM Transactions on Algorithms
  • Reviewer for AAAS Science Advances
  • Reviewer for Frontiers in Physiology
  • Reviewer for IEEE Communications Letters
  • Reviewer for IEEE International Symposium on Information Theory (ISIT)