I'm working on compression of tabular datasets at Compressable Corp., a startup I co-founded, and on indexing and searching genomic datasets at the Ji Lab at Stanford. My postdoctoral work was in information theory and its applications to statistics. My PhD work was in information theory, analysis of single molecule experiments, and error correction using quantum optical systems.
I received my PhD from Stanford Physics in 2014 (advised by Hideo Mabuchi) and my A.B. in Physics from Harvard in 2007. From 2014 until 2019 I was a postdoc jointly in Tsachy Weissman's group at Stanford Electrical Engineering and Hanlee P. Ji's group at Stanford Oncology.
dmitrip [at] dmitrip [dot] com
Google Scholar profile
I live in Redwood City, California with my spouse.
Information theory and statistics. Estimation when the sample size is not large enough for classical techniques to perform well; fluctuations of Markov chains and neutral genetic drift; analogies between data compression and physical processes; optimally symmetrizing probability distributions.
Approximate profile maximum likelihood, Dmitri Pavlichin, Jiantao Jiao, and Tsachy Weissman. Journal of Machine Learning Research (JMLR) 2019 [paper].
Quantum channel capacities per unit cost, David Ding, Dmitri Pavlichin, and Mark Wilde. IEEE Transactions on Information Theory 2018 [paper].
Minimum power to maintain a nonequilibrium distribution of a Markov chain, Dmitri Pavlichin, Yihui Quek, and Tsachy Weissman, 2018 [paper].
Chained Kullback-Leibler divergences, Dmitri Pavlichin and Tsachy Weissman. International Symposium on Information Theory (ISIT) 2016 [paper].
Nearest symmetric distributions, Dmitri Pavlichin. International Symposium on Information Theory (ISIT) 2016 [paper].
Data compression of genomic and tabular datasets.
Compressing tabular data via pairwise dependencies, Dmitri Pavlichin, Amir Ingber, and Tsachy Weissman. Poster at Data Compression Conference (DCC) 2017 [paper].
The human genome contracts again, Dmitri Pavlichin, Golan Yona, and Tsachy Weissman. Bioinformatics 2013 [paper] (a new compression ratio record at the time).
Indexing and searching genomic data. Data structures and algorithms for indexing large, heterogeneous collections of genomic data, and making public genomic resources searchable in a common way.
Profiling SARS-CoV-2 mutation fingerprints that range from the viral pangenome to individual infection quasispecies, Billy T. Lau, Dmitri Pavlichin, Anna Hooker, Alison Almeda, Giwon Shin, Jiamin Chen, ChunHong Huang, Benjamin A. Pinsky, HoJoon Lee, Hanlee P. Ji, 2020. In submission [paper].
Unique K-mer sequences for validating cancer-related substitution, insertion and deletion mutations, HoJoon Lee, Ahmed Shuaibi, John M. Bell, Dmitri Pavlichin, Hanlee P. Ji, 2020. In submission [paper].
Provisional patent: Improved K-mer Storage and Retrieval for Next-Generation Sequencing, Hanlee P. Ji, HoJoon Lee, Tsachy Weissman, and Dmitri Pavlichin. Submitted in July 2020 [Stanford Office of Technology Licensing page].
Quantum optics and single molecule biophysics. Modeling, simulation, experimental data analysis, and engineering of small, noisy systems (ribozymes and quantum optical systems). Stochastic computing, an information-theoretic analysis of the proposed schemes, and how to systematically turn anything, say a pizza, into a communication channel.
PhD Dissertation: Photonic circuits and probabilistic computing, Dmitri Pavlichin, 2014 [pdf].
Photonic circuits for iterative decoding of a class of low-density parity-check codes, Dmitri Pavlichin and Hideo Mabuchi. New Journal of Physics, 2014 [paper].
Specification of photonic circuits using quantum hardware description language, Nikolas Tezak, Armand Niederberger, Dmitri Pavlichin, Gopal Sarma, and Hideo Mabuchi. Phil. Trans. Royal Soc. A, 2012 [paper].
Single Molecule Analysis Research Tool (SMART): an integrated approach for analyzing single molecule data, Max Greenfeld, Dmitri Pavlichin, Hideo Mabuchi, and Daniel Herschlag. PLoS One, 2012 [paper].
Design of nanophotonic circuits for autonomous subsystem quantum error correction, Joseph Kerckhoff, Dmitri Pavlichin, Hamidreza Chalabi, and Hideo Mabuchi. New Journal of Physics 2011 [paper].
Press, teaching, and other works
A popular summary of data compression in general and as applied to the particularities of genomic data, with bonus "Silicon Valley" material at the end.
The Desperate Quest for Genomic Compression Algorithms, Dmitri Pavlichin and Tsachy Weissman, illustration by Greg Mably, IEEE Spectrum, August 2018 [article].
A lecture I gave on tabular and genomic data compression in Stanford's Information Theory course EE376A, Winter 2019.
Genomic and tabular data compression + sundry IT adventures, Dmitri Pavlichin [video].
I consulted for the show (along with dozens of others).
The New Brain Behind the Whiteboards—and More—for HBO’s "Silicon Valley", by Tekla Perry at IEEE Spectrum, April 2017 [article].
Mount Rose summit near Lake Tahoe