exRNA Atlas: Publication in Cell

04 Apr 2019 » James Diao » New Haven, CT


As part of the Gerstein Lab at Yale, I contributed to a large multi-institutional effort to document and catalog the variety of exRNA molecules found in human biofluids. My main contribution was a dimensionality reduction and visualization tool that the consortium used for quality control and hypothesis generation. Our work was published just today, more than 5 years after the consortium was founded, and more than 3 years after I started working on it.

Murillo OD, Thistlethwaite W, Rozowsky J, Subramanian SL, Lucero R, Shah N, Jackson AR, Srinivasan S, Chung A, Laurent CD, Kitchen RR, Galeev T, Warrell J, Diao JA, Welsch JA, Hanspers K, Riutta A, Burgstaller-Muehlbacher S, Shah R, Yeri A, Jenkins L, Ahsen ME, Cordon-Cardo C, Dogra N, Gifford SM, Smith JT, Stolovitzky G, Tewari AK, Wunsch BH, Yadav KK, Danielson KM, Filant J, Moeller C, Nejad P, Paul A, Simonson B, Wong DK, Zhang X, Balaj L, Gandhi R, Sood AK, Alexander RP, Wang L, Wu C, Wong D, Galas DJ, Van Keuren-Jensen K, Patel T, Jones JC, Das S, Cheung K, Pico AR, Su AI, Raffai RL, Laurent LC, Roth ME, Gerstein MB, Milosavljevic A. “ExRNA Atlas analysis reveals distinct extracellular RNA types and their carriers present across human biofluids.” Cell. 4 Apr 2019. https://doi.org/10.1016/j.cell.2019.02.018.


Non-technical summary

Our bodily fluids (e.g., blood and urine) contain many different molecules that can be used to diagnose and track disease. Most non-invasive diagnostic tests, from pregnancy tests to cancer biomarkers, look for abnormal levels of specific proteins. More recently, scientists have discovered that another type of molecule, extracellular RNA (exRNA), can also be measured from fluid samples for diagnostic purposes. To help standardize the study of exRNA, the NIH put together a consortium: the Extracellular RNA Communication Consortium (ERCC).

A primary goal for the ERCC was to create a catalog of exRNA molecules found in human biofluids, like blood, saliva, and urine. The resulting data, consisting of >50,000 samples from >2,000 donors across 13 different biofluids, was uniformly processed through a pipeline that our lab developed (our paper on this came out today too!) and compiled into the exRNA Atlas.

This paper aims to describe the suite of data resources and analysis tools available in the exRNA Atlas, and to demonstrate how they can be used to better understand exRNA sequencing data. To accomplish the second aim, the Atlas resource was used to develop a model for explaining variability between exRNA profiles in terms of the cargo types found in each sample (i.e., which carriers are helping to shuttle the exRNAs around). The hope is that modeling these sources of variation can help us better understand our data, and that correcting for them can improve statistical power and decrease false positives for studies that aim to demonstrate clinical correlations.

My work

I was involved with the non-negative matrix factorization (NMF) deconvolution portion of the exRNA Atlas project from very early on, starting with mouse exRNA-Seq data and then moving to human data. My efforts differed from the approach that was eventually published in two ways: I was trying to use tissue-specific data to supervise the deconvolution, and I focused on biofluids instead of cargo types. Although my approach didn’t end up generating interesting results, working on it meant that I understood the final approach quite well and was able to contribute in other ways.
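
For readers who haven’t seen NMF deconvolution before, here is a minimal sketch of the general idea. It is not the consortium’s pipeline or the published method. It assumes the textbook multiplicative-update algorithm, and the toy "carrier profiles", mixing weights, and function names are all made up for illustration. Each sample is modeled as a non-negative mixture of a few underlying profiles, and NMF tries to recover both the profiles and the per-sample weights from the mixed data alone.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Frobenius norm of V - WH, i.e. the reconstruction error."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rwh in zip(v, wh) for x, y in zip(rv, rwh)) ** 0.5


def nmf(v, k, n_iter=500, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (n x m) into W (n x k) and H (k x m).

    Uses multiplicative updates, which keep every entry non-negative
    as long as the starting values are positive.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)] for i in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(n)]
    return w, h


# Hypothetical data: two made-up profiles over four RNA features,
# and five samples that are different mixtures of them.
profiles = [[8.0, 1.0, 0.0, 1.0], [0.0, 2.0, 6.0, 2.0]]
mixing = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.2, 0.8], [0.9, 0.1]]
samples = matmul(mixing, profiles)

w, h = nmf(samples, k=2)
error = frobenius_error(samples, w, h)
print("reconstruction error:", error)
```

The rows of `h` play the role of the recovered profiles and the rows of `w` give each sample’s weights. Keep in mind that the factors are only determined up to scaling and ordering, so they need to be normalized before being read as proportions.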

My main contribution was the dimensionality reduction and plotting tool. I had done some data visualization work at Harvard, and we had a great opportunity to produce something similar for the exRNA Atlas. Unlike the exceRpt project, which was entirely internal to the Gerstein Lab, this project involved working alongside other ERCC labs. After building the beta version on my own, I worked closely with William Thistlethwaite (co-first author) on developing it further. This was a great experience; it taught me how to work closely with long-distance colleagues I had never met, as well as how to develop a tool for a broad user base with diverse technical backgrounds.


The exRNA Atlas resource was such a large project that I’m still amazed coordination was possible just via conference calls and Google Docs. For my next steps, I think I have a lot to learn from completing a smaller project with more ownership. Still, working on the Atlas gave me the chance to be part of something broader than myself, my lab, and my institution, and I’m grateful for that.
