Single-cell RNA-seq: past and future

Posted by beauty33 on December 15th, 2019

A brief overview of bulk and single-cell RNA sequencing:

  • Bulk RNA-seq was a major breakthrough in the late 2000s (it replaced microarrays) and has been widely used since.
  • It measures the average expression level of each gene across a large population of input cells.
  • It is useful for comparative transcriptomics, for example comparing samples of the same tissue from different species.
  • It is useful for quantifying expression signatures in tissues, for example in disease studies.
  • It is insufficient for studying heterogeneous systems, such as early development or complex tissues (the brain).
  • It provides no insight into the stochastic nature of gene expression.
  • Single-cell RNA-seq (scRNA-seq) is a newer technology, first published by Tang et al.
  • It did not gain widespread popularity until 2014, when new protocols and lower sequencing costs made it more accessible.
  • It measures the distribution of expression levels of each gene across a population of cells.
  • It allows the study of new biological questions in which cell-specific changes in the transcriptome matter, for example cell type identification, heterogeneity of cellular responses, stochasticity of gene expression and inference of gene regulatory networks across cells.
  • Data sets range from 10^2 to 10^6 cells, and the number is increasing every year.
  • Several different protocols are currently in use, for example SMART-seq2 (Picelli 2013), CEL-seq (Hashimshony 2012) and Drop-seq (Macosko 2015).
  • There are also some commercial platforms, including Fluidigm, Wafergen and 10X.
  • Several computational analysis methods from bulk RNA-seq can be reused.
  • In most cases, however, computational analysis requires adapting existing methods or developing new ones.
  • There are several reviews of scRNA-seq analysis, including Stegle, Teichmann, and Marioni 2015.
  • Falco is a single-cell RNA-seq processing framework on the cloud;
  • SCONE (Single-Cell Overview of Normalized Expression) is a package for single-cell RNA-seq data quality control and normalization;
  • Seurat is an R package for quality control, analysis and exploration of single-cell RNA-seq data;
  • ASAP (Automated Single-cell Analysis Pipeline) is a web-based interactive single-cell analysis platform.
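
The difference between an average expression level (bulk) and a distribution of expression levels (single cell) can be illustrated with a small sketch. The expression values below are made up for illustration, and the function names are hypothetical; the point is only that two very different cell populations can produce the same bulk value.

```python
import statistics

# Hypothetical expression values (arbitrary units) for one gene in 8 cells.
# Population A: every cell expresses the gene at a similar level.
# Population B: half of the cells are "off" and half are strongly "on".
population_a = [5, 5, 5, 5, 5, 5, 5, 5]
population_b = [0, 0, 0, 0, 10, 10, 10, 10]


def bulk_measurement(cells):
    """What a bulk experiment reports: one average over all input cells."""
    return statistics.mean(cells)


def single_cell_summary(cells):
    """What per-cell data adds: the spread and the fraction of expressing cells."""
    return {
        "mean": statistics.mean(cells),
        "stdev": statistics.pstdev(cells),
        "fraction_expressing": sum(1 for c in cells if c > 0) / len(cells),
    }


for name, cells in [("A", population_a), ("B", population_b)]:
    print(name, "bulk:", bulk_measurement(cells), "single cell:", single_cell_summary(cells))
```

Both populations give the same bulk value, but only the per-cell summary shows that population B is a mixture of expressing and non-expressing cells.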

The main difference between bulk and single-cell RNA-seq is that each sequencing library represents a single cell, not a population of cells. Therefore, great care must be taken when comparing results from different cells (sequencing libraries). The main sources of discrepancy between libraries are:

  • Amplification (up to 1 million fold);
  • Gene 'dropouts', in which a gene is observed at a moderate expression level in one cell but is not detected in another (Kharchenko, Silberstein and Scadden 2014).

In both cases, the discrepancy arises from the low starting amount of transcript, since the RNA comes from only one cell. Increasing transcript capture efficiency and reducing amplification bias are currently active research areas. However, some of these issues can be mitigated through proper normalization and correction.
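
As a rough illustration of the normalization mentioned above, the sketch below scales each cell's counts by its library size and then log-transforms them. Scaling to counts per 10,000 followed by log1p is one common choice and is assumed here for illustration; the article does not name a specific method. The counts, cell names and gene names are made up.

```python
import math

# Hypothetical raw counts for two cells sequenced to very different depths.
raw_counts = {
    "cell_1": {"GeneA": 20, "GeneB": 80, "GeneC": 100},
    "cell_2": {"GeneA": 0, "GeneB": 8, "GeneC": 12},  # GeneA is a dropout here
}


def normalize(cell_counts, scale=10_000):
    """Scale counts to a common library size, then apply log(1 + x)."""
    total = sum(cell_counts.values())
    if total == 0:
        raise ValueError("cell has no counts; it should be filtered out first")
    return {gene: math.log1p(count / total * scale) for gene, count in cell_counts.items()}


normalized = {cell: normalize(counts) for cell, counts in raw_counts.items()}

for cell, values in normalized.items():
    print(cell, {gene: round(v, 2) for gene, v in values.items()})
```

After scaling, GeneB is comparable between the two cells even though the raw counts differ tenfold. The dropout of GeneA in cell_2 stays at zero: library-size normalization corrects for sequencing depth but cannot recover a transcript that was never captured.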

The development of new methods and protocols for scRNA-seq is currently a very active area of research and several protocols have been published over the past few years. Here is a brief list:

CEL-seq (Hashimshony et al. 2012);

CEL-seq2 (Hashimshony et al. 2016);

Drop-seq (Macosko et al. 2015);

InDrop-seq (Klein et al. 2015);

MARS-seq (Jaitin et al. 2014);

SCRB-seq (Soumillon et al. 2014);

Seq-well (Gierahn et al. 2017);

Smart-seq (Picelli et al. 2014);

Smart-seq2 (Picelli et al. 2014);

SMARTer

STRT-seq (Islam et al. 2013)

These methods can be categorized in different ways, but the two most important aspects are quantification and capture.

For quantification, there are two types: full-length and tag-based. The former attempts to obtain uniform read coverage of each transcript. In contrast, tag-based protocols capture only the 5' or 3' end of each RNA. The choice of quantification method determines what types of analysis the data can be used for. In theory, full-length protocols should provide uniform coverage of transcripts, but in practice the coverage is usually biased. The main advantage of tag-based protocols is that they can be combined with unique molecular identifiers (UMIs), which help improve the accuracy of quantification. On the other hand, being restricted to one end of the transcript may reduce mappability and also makes it harder to distinguish between different isoforms (Archer et al., 2016).
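
To see why UMIs help with quantification, consider the minimal sketch below. It assumes each read has already been assigned a cell barcode, a UMI and a gene, and that UMIs contain no sequencing errors; real pipelines also have to handle such errors. The read records and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical aligned reads: (cell barcode, UMI, gene).
reads = [
    ("CELL1", "AACG", "GeneA"),
    ("CELL1", "AACG", "GeneA"),  # amplification duplicate of the same molecule
    ("CELL1", "AACG", "GeneA"),  # another duplicate
    ("CELL1", "TTGC", "GeneA"),  # a second, distinct molecule
    ("CELL1", "GGAT", "GeneB"),
    ("CELL2", "AACG", "GeneA"),  # same UMI, different cell: counted separately
]


def count_reads(reads):
    """Naive quantification: every read counts, so duplicates inflate the value."""
    counts = defaultdict(int)
    for cell, _umi, gene in reads:
        counts[(cell, gene)] += 1
    return dict(counts)


def count_umis(reads):
    """UMI quantification: count distinct UMIs per (cell, gene) pair."""
    umis = defaultdict(set)
    for cell, umi, gene in reads:
        umis[(cell, gene)].add(umi)
    return {key: len(seen) for key, seen in umis.items()}


print("read counts:", count_reads(reads))
print("UMI counts: ", count_umis(reads))
```

Reads that share a cell barcode, gene and UMI are treated as copies of one original molecule, so amplification duplicates no longer inflate the expression estimate.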

The strategy used for capture determines the throughput, how the cells are selected, and what additional information besides the sequencing can be obtained. The three most widely used options are microwell-based, microfluidic-based and droplet-based capture.
