Single-cell transcriptomics analysis for pedestrians

This document compares two approaches to analyse single-cell data: First, we use Seurat, and then we perform all steps using only basic R, in order to be able to study and modify each step. I have presented it at the Single Cell Center Heidelberg meeting in Mar 2019.

Example data

As example data, I use the study on the development of the mouse brain by Carter et al.: “A single-cell transcriptional atlas of the developing murine cerebellum”, Current Biology 28:2910 (2018).

Specifically, I will use their sample E13a, i.e., which comprises cells from the cerebellum of a mouse embryo at day 13.5. The “a” means that it is the first of their two replicates.

Seurat analysis

I first perform a standard analysis with Seurat.

library( Seurat )

Kevin has downloaded the bam files from the European Nucleotide Archive (accession PRJEB23051) and run them again through CellRanger. In the next line, I read in the output from CellRanger. If you want to rerun this script yourself, you can find the three CellRanger output files for sample E13a here.

seu <- Read10X("E13_A/")

Now I create a Seurat object, keeping only the genes that are expressed in at least 3 cells, and only those cells expressing at least 1000 genes.

seu <- CreateSeuratObject( seu, min.cells = 3, min.genes=1000 )

I normalize the data. This performs, by default, the transformation \(y_{ij} = \log\left( 1 + 10^4 \frac{k_{ij}}{\sum_{i'}k_{i'k}}\right)\) (where \(k_{ij}\) is the UMI count for gene \(i\) in cell \(j\)).

seu <- NormalizeData( seu )

A short detour here on the topic of normalization:

The median UMI count in our cells is

median_umi_count_per_cell <- median( colSums(seu@raw.data) )
median_umi_count_per_cell

## [1] 6199

Here is a plot showing how the transformation changes UMI counts for such a “median” cell

plot( 0:100, log( 1 + 1e4 * (0:100) / median_umi_count_per_cell ), 
      type = "l", xlab="UMI count", ylab="expression measure", ylim = c(0, 6.5) )

meet_at_50 <- function(y)
  y / y[51] * log( 1 + 1e4 * 50 / median_umi_count_per_cell )

lines( 0:100, meet_at_50( log( 1 + 1e3 * (0:100) / median_umi_count_per_cell ) ), type = "l", col="darkgreen" )
lines( 0:100, meet_at_50( log( 1 + 1e5 * (0:100) / median_umi_count_per_cell ) ), type = "l", col="darkblue" )
lines( 0:100, meet_at_50( log10( 1 + (0:100) ) ), type = "l", col="green" )
lines( 0:100, meet_at_50( sqrt( (0:100) / median_umi_count_per_cell ) ), type = "l", col="orange" )

The black line above is Seurat’s default normalization. For future reference, I have also included the transformation if one changes Seurat’s scaling factor from the default \(10^4\) (black) to \(10^3\) (dark green) or \(10^5\) (dark blue). Light green is the often used simple \(y_{ij} = \log_{10}(1+k_{ij})\) and orange is the square-root transform \(y_{ij} = \sqrt{ {k_{ij}}/{\sum_{i'}k_{i'k}}}\). I have rescaled these extra lines so that they all meet at 50.

Continuing with the standard Seuirat work-flow: Find the variable genes

seu <- FindVariableGenes( seu )

Scale and center the data and regress out UMI totals

seu <- ScaleData( seu, vars.to.regress = "nUMI" )

## Regressing out: nUMI

## 
## Time Elapsed:  29.526376247406 secs

## Scaling data matrix

Run a PCA to reduce the data

seu <- RunPCA( seu, do.print = FALSE )

Perform Louvain clustering

seu <- FindClusters( seu, print.output = FALSE, save.SNN = TRUE )

Get an embedding. I use UMAP. By default, RunUMAP only uses the first 5 PCS, but this gives quite unsatisfactory results. Hence, I use a few more, otherwise it get’s too blurry.

seu <- RunUMAP( seu, dims.use=1:20 )

Plot the embedding, and highlight the Louvain clusters:

DimPlot( seu, "umap" )

Inspect it with Sleepwalk

library( sleepwalk )

sleepwalk( seu@dr$umap@cell.embeddings, seu@dr$pca@cell.embeddings )

Pedestrian analysis

Now I perform an analysis only using standard R and tidyverse packages.

library( tidyverse )

Read CellRanger output

First, read the CellRanger output. We read the three files one by one.

First the genes.tsv file. It contains two columns: the gene IDs and the gene names

genes <- read_tsv( "E13_A/genes.tsv", col_names = c( "gene_id", "gene_name" ) )

## Parsed with column specification:
## cols(
##   gene_id = col_character(),
##   gene_name = col_character()
## )

head( genes )

## # A tibble: 6 x 2
##   gene_id            gene_name    
##   <chr>              <chr>        
## 1 ENSMUSG00000102693 4933401J01Rik
## 2 ENSMUSG00000064842 Gm26206      
## 3 ENSMUSG00000051951 Xkr4         
## 4 ENSMUSG00000102851 Gm18956      
## 5 ENSMUSG00000103377 Gm37180      
## 6 ENSMUSG00000104017 Gm37363

There are a few duplicated gene names. I disambiguate these few cases by adding the Ensemble ID to the name.

duplicate_gene_names <- genes$gene_name[ duplicated(genes$gene_name) ]

genes$gene_name <- ifelse(
  genes$gene_name %in% duplicate_gene_names,
     str_c( genes$gene_name, "_", genes$gene_id ),
     genes$gene_name )

Next, the cell barcodes. This is simply a file with one barcode per line.

barcodes <- readLines( "E13_A/barcodes.tsv" )
str( barcodes )

##  chr [1:737280] "AAACATACAAAACG-1" "AAACATACAAAAGC-1" ...

Note that these are the unfiltered files. We have over 700,000 barcodes – presumably most of them empty droplets.

I remove the superfluous -1 suffix.

barcodes <- str_remove( barcodes, "-1" )

The third file and biggest file in the CellRanger output is the matrix of counts in coordinate-sparse form, stored according to the MatrixMarket format.

matrixTable <- read_delim( "E13_A/matrix.mtx", 
   delim=" ", comment = "%", col_names = c( "gene", "barcode", "count" ) )

## Parsed with column specification:
## cols(
##   gene = col_double(),
##   barcode = col_double(),
##   count = col_double()
## )

head(matrixTable)

## # A tibble: 6 x 3
##    gene barcode    count
##   <dbl>   <dbl>    <dbl>
## 1 53801  737280 10042484
## 2 32438       1        1
## 3 39070       1        1
## 4 52789       1        1
## 5  5635       4        1
## 6 10830       4        1

Note how the first non-comment row of a MatrixMarket file contains the total number of genes, barcodes and UMIs. The first two numbers agree with the number of rows in the genes and barcodes files. From the second row on, we have actual data: the first and second columns are indices into the gene and barcode list, the third one contains the UMI counts.

The sparseMatrix function from the base R package Matrix is designed to handle such data. Actually, it turns it from triplet-sparse to column-sparse (unless we set giveCsparse=FALSE) but that might actually be better for performance. (Seurat, by the way, keeps it in triplet-sparse format.)

The sparseMatrix functions takes all the data we have just loaded:

library( Matrix )

counts <-
   sparseMatrix( 
      i = matrixTable$gene[-1], j = matrixTable$barcode[-1], x = matrixTable$count[-1],
      dims = c( matrixTable$gene[1], matrixTable$barcode[1] ),
      dimnames = list( gene = genes$gene_name, barcode = barcodes ) )

counts[1:5,1:5]

## 5 x 5 sparse Matrix of class "dgCMatrix"
##                barcode
## gene            AAACATACAAAACG AAACATACAAAAGC AAACATACAAACAG
##   4933401J01Rik              .              .              .
##   Gm26206                    .              .              .
##   Xkr4                       .              .              .
##   Gm18956                    .              .              .
##   Gm37180                    .              .              .
##                barcode
## gene            AAACATACAAACGA AAACATACAAAGCA
##   4933401J01Rik              .              .
##   Gm26206                    .              .
##   Xkr4                       .              .
##   Gm18956                    .              .
##   Gm37180                    .              .

I remove the raw data tables to save RAM

rm( barcodes, genes, matrixTable )

Do the knee plot

As I have read the complete data, I can do the popular knee plot that is also shown in CellRanger’s HTML summary page.

colSums( counts>0 ) tells us for each barcode how many genes have been detected for this. I calculate this and sort:

n_detected_genes_sorted <-  sort( colSums( counts>0 ), decreasing = TRUE )

head(n_detected_genes_sorted)

## ATCCCGTGACTAGC TAATGAACACTCAG CTTTGATGGCTCCT TCGTAGGAGGGAGT AATACTGAATTCGG 
##           6451           5862           5638           5346           5224 
## TGGAACACCGAACT 
##           5212

Now I plot these numbers in a log-log plot

plot( seq.int(ncol(counts)), n_detected_genes_sorted, log = "xy", type="s",
      xlab = "barcodes", ylab = "detected genes" )

## Warning in xy.coords(x, y, xlabel, ylabel, log): 457576 y values <= 0
## omitted from logarithmic plot

abline( h=1000, col="gray" )

I use the same criterion as above and call everything with at least 1000 detected genes (above the gray line) a cell.

cell_barcodes <- names(which( colSums( counts>0 ) >= 1000 ))
str(cell_barcodes)

##  chr [1:3255] "AAACATTGGCTCCT" "AAACATTGTCGACA" "AAACGCACCCACCT" ...

Are these the same barcodes as the ones in the Seurat object?

all( colnames(seu@raw.data) == cell_barcodes )

## [1] TRUE

They are. Even in the same order.

I remove the rest to make my matrix smaller

counts <- counts[ , cell_barcodes ]

dim(counts)

## [1] 53801  3255

I also remove the genes that appear in none of the remaining cells

counts <- counts[ rowSums(counts) > 0,  ]

dim(counts)

## [1] 25174  3255

Normalisation

For each cell, I simply divide the UMI count per gene by the total UMI count over all genes: \(y_{ik} = k_{ij} / \sum_{i'} k_{i'j}\) The double transposition is to get R to divide the columns and not the rows by the appropriate total.

nrm_counts <- t( t(counts) / colSums(counts) )

Highly variable genes.

A preliminary: It seems there is no function to efficiently calculate row variances for a column-sparse matrix. Here is my own implementation. I might look a bit ugly but it works.

colVars_spm <- function( spm ) {
  stopifnot( is( spm, "dgCMatrix" ) )
  ans <- sapply( seq.int(spm@Dim[2]), function(j) {
    mean <- sum( spm@x[ (spm@p[j]+1):spm@p[j+1] ] ) / spm@Dim[1]
    sum( ( spm@x[ (spm@p[j]+1):spm@p[j+1] ] - mean )^2 ) +
      mean^2 * ( spm@Dim[1] - ( spm@p[j+1] - spm@p[j] ) ) } ) / ( spm@Dim[1] - 1 )
  names(ans) <- spm@Dimnames[[2]]
  ans
}

rowVars_spm <- function( spm ) {
  colVars_spm( t(spm) )
}

With this, I can get the variance-to-mean ratio for each gene. I plot them against the means:

gene_means <- rowMeans( nrm_counts )
gene_vars <- rowVars_spm( nrm_counts )

plot( gene_means, gene_vars / gene_means,
      log = "xy", cex = .3, col = adjustcolor("black", alpha=.3), 
      xlab = "mean", ylab = "variance / mean" )

poisson_vmr <- mean( 1 / colSums( counts ) )

abline( h = 1:3 * poisson_vmr, col="lightblue" )

The lowest of the light blue line marks the Poisson noise level (poisson_vmr). It is placed at the mean of the reciprocals of the UMI totals per cell. This is immediately not obvious; I’ll write up the proof.

I chose the genes three times above the Poisson level as informative genes

informative_genes <- names(which( 
   gene_vars / gene_means  >  3 * poisson_vmr ))

str( informative_genes )

##  chr [1:591] "Eya1" "Tfap2b" "Khdrbs2" "1500015O10Rik" "Gm26649" ...

How many of these are also among the variable genes chosen by Seurat?

grid::grid.newpage()
print( grid::grid.draw( 
  VennDiagram::venn.diagram( list( Seurat = seu@var.genes, mine =  informative_genes ), NULL ) ) )

## NULL

Embedding

I calculate a UMAP embedding, using only the informative genes

library( umap )

my_umap <- umap( as.matrix( t( sqrt( nrm_counts[ informative_genes,  ] ) ) ) )

The square root here is a varianmce-stabilizing transformation. I use it instead of the log used by Seurat, because in UMI data, the dominant noise is Poissonean. (This, too, needs to be explained in more detail, along with the calculation of the Poisson noise level.)

plot( my_umap$layout, asp=1 )

Those outliers are annoying. I pull them in

my_umap$layout[ my_umap$layout[,1] < -8, 1 ] <- -8

Plot again, now colour with the Louvain cluster found by Seurat

plot( my_umap$layout, asp=1, col=seu@ident, cex=.3 )

I save the environment, so I can easily restart here later.

save.image("comparison.rda")

Now I can compare the two embeddings. Left the one by Seurat, right the one with my workflow.

sleepwalk( 
  list( seu@dr$umap@cell.embeddings, my_umap$layout ), 
  list( seu@dr$pca@cell.embeddings, as.matrix( t(sqrt(nrm_counts[informative_genes,])) ) ) )

(Sorry, the UMAP is inverted in the screenshots compared to what you see in the R plots, because I ran it twice with different random seeds.)

I have selected a few points with the mouse (using Sleepwalk’s lasso feature). These then appear in the environment as a variable called selPoints. To be able to knit this document non-interactively, I list the selected point here:

selPoints <-
c(1L, 4L, 5L, 7L, 14L, 16L, 26L, 28L, 29L, 32L, 34L, 35L, 37L, 
45L, 60L, 61L, 69L, 71L, 76L, 85L, 95L, 98L, 104L, 107L, 111L, 
...
3228L, 3231L, 3234L, 3237L, 3238L, 3240L, 3244L, 3250L, 3252L, 
3253L)

This is the structure that I have selected with the mouse:

selPoints_bool <-seq.int(ncol(nrm_counts)) %in% selPoints

plot( my_umap$layout, asp=1, col=1+selPoints_bool )

What genes are special among these cells? I like to use AUROC for this, as it is non-parametric

library( pROC )

## Type 'citation("pROC")' for a citation.

## 
## Attaching package: 'pROC'

## The following objects are masked from 'package:stats':
## 
##     cov, smooth, var

aucs <- sapply( informative_genes, function(g)
   auc( roc( selPoints_bool, nrm_counts[g,] ) ) )

head( sort( aucs, decreasing = TRUE ), 40 )

##     Meis2      Lhx9      Nrn1     Ralyl    Hs3st5     Nrxn1    Tfap2b 
## 0.9465052 0.8764802 0.8576308 0.8356687 0.7974714 0.7930246 0.7878751 
##      Nnat     Ncam1     Erbb4   Slc25a5      Rtn1    Phlda1   Igfbpl1 
## 0.7775149 0.7730306 0.7719841 0.7566909 0.7558819 0.7548668 0.7515794 
##   Tmem163  Cacna2d1    Lhx1os      Ppia     Cd24a    Nkain3     Nrxn3 
## 0.7476554 0.7468441 0.7439741 0.7372272 0.7317169 0.7307682 0.7281671 
##    Tubb2b     Fabp7    Gm2694     Auts2     C1ql4      Lhx2     Stmn1 
## 0.7274364 0.7254094 0.7218321 0.7188515 0.7186425 0.7147797 0.7076640 
##      Pax3     Sox11 Hist3h2ba    Evx1os     Tubb5      Sox4  Trp53i11 
## 0.7035411 0.7022047 0.7011367 0.7001113 0.6965152 0.6914617 0.6899750 
##     Hmgn2    Malat1     Basp1       Ina     Hmgb2 
## 0.6894896 0.6852528 0.6828976 0.6801940 0.6774803

According to the cheat sheet I got from Kevin, Meis2, Lhx2 and Lhx9 are markers for the precursor cells for the cerebellar nuclei.

Let’s check out the top gene, Meis2

plot( colSums(counts), jitter(counts["Meis2",]), log="x", cex=.3, col = 1+selPoints_bool  )

In this plot, the UMI counts for the gene of interest are on the y axis and the UMI totals for the cell on the x axis. This allows me to distinguish “informative” zeroes (bottom right) from zeroes due to low coverage (bottom left). I have also added vertical jitter to reduce overplotting. The red points are the cells in the selected structure.

I prefer this plot over the usual violin plot:

ggplot( data.frame( selected = selPoints_bool, expr = sqrt(nrm_counts["Meis2",]) ) ) +
  geom_violin(aes(x=selected, y=expr))

Nevertheless, the violin plot helps understanding what the AUROC criterion does and how the ROC curver for the gene is derived.

plot( roc( selPoints_bool, nrm_counts["Meis2",] ) )

I next plot the gene in the UMAP embedding. I highlight all cells with a least two UMI counts for the gene.

plot( my_umap$layout, col = 1 + ( counts["Meis2",] >= 2 ), asp=1, cex=.1 )

Note how the tip of the structure does not seem to be part of whatever this gene marks. Also note that the boundary between the black tip and the red main part do not correspond to the Louvain cluster boundary.

Was 2 UMI counts really a good cut-off? Our LinkedCharts library allows to add a sigmoid colour slider to the scatter plot. This can help here:

library( rlc )
openPage( useViewer = FALSE )
lc_scatter( dat(
   x = my_umap$layout[,1],
   y = my_umap$layout[,2],
   colourValue = nrm_counts["Meis2",],
   size = 1 ),
   id = "scatter")

lc_colourSlider( chart="scatter" )

Next, I have selected the long arm with the Sleepwalk lasso:

selPoints <-
c(2L, 3L, 8L, 9L, 10L, 11L, 12L, 13L, 15L, 17L, 18L, 21L, 22L, 
...
3236L, 3239L, 3241L, 3247L, 3249L, 3251L, 3254L)

These cells:

selPoints_bool <-seq.int(ncol(nrm_counts)) %in% selPoints

plot( my_umap$layout, asp=1, col=1+selPoints_bool )

Check again by AUROC

aucs <- sapply( informative_genes, function(g)
   auc( roc( selPoints_bool, nrm_counts[g,] ) ) )

head( sort( aucs, decreasing = TRUE ), 40 )

##   Gm26688     Nrxn3    Lhx1os    Tfap2b      Ebf1      Lhx5     Nhlh2 
## 0.9181459 0.9021246 0.8972830 0.8957367 0.8721018 0.8612993 0.8087934 
##      Pid1     Foxp2    Cdkn1c    Cited2      Uncx    Tagln3   Ppp2r2b 
## 0.8064388 0.8009522 0.7934326 0.7880298 0.7878655 0.7877650 0.7800723 
##      Gad2      Ftl1     Sparc  Serpinh1   Igfbpl1     Erbb4     Tenm2 
## 0.7765886 0.7543529 0.7420285 0.7415008 0.7386379 0.7357047 0.7328726 
##     Pcdh9     Actg1     Stmn2     Stmn1      Ldha     Basp1     H3f3a 
## 0.7324070 0.7319108 0.7306687 0.7305721 0.7298708 0.7285679 0.7283207 
##     Rpl41    Malat1     Celf4    Klhl35     Fabp7      Actb       Id2 
## 0.7252911 0.7239458 0.7213464 0.7182265 0.7173180 0.7169015 0.7168335 
##    Tmsb10     Fabp5    Mpped2       Mif       Ran 
## 0.7121611 0.7100093 0.7064039 0.7052148 0.7011633

(This sapply is not well done. I should presort the counts to speed it up.)

I spot Foxp2, a marker for Purkinje cells.

Where else do we see that one?

openPage( useViewer = FALSE )
lc_scatter( dat(
   x = my_umap$layout[,1],
   y = my_umap$layout[,2],
   colourValue = nrm_counts["Foxp2",],
   size = 1 ),
   id = "scatter")

lc_colourSlider( chart="scatter" )

Are there other genes that increase in expression along the length of my selected structure? As the structure is elongated in vertical direction on the UMAP embedding, I can find them by checking, for each gene, the correlation of this gene in a cell with the cell’s y cordinate in the umap embedding. I calculate the correlation coefficient only using the selected counts

Correlate all genes in these cells with y coordinate of UMAP:

cors <- sapply( informative_genes, function(g)
   suppressWarnings(cor( 
     nrm_counts[ g, selPoints_bool ],    # expression of gene 'g' in the selected cells
     my_umap$layout[ selPoints, 2 ],     # umap y corrdinate of the selected cells
     method="spearman" ) ))

cat("\n")

head( sort( cors, decreasing = TRUE ), 40 )

##    Tmsb10     Gap43    Malat1     Cnih2     Stmn3      Meg3     Ncam1 
## 0.8942025 0.8453662 0.7664583 0.7662315 0.7611152 0.7009983 0.6985235 
##       Ndn       Ina    Tubb2a Hist3h2ba      Gng3      Nnat    Mllt11 
## 0.6863740 0.6852637 0.6336845 0.6315892 0.6272005 0.6252312 0.6155675 
##      Ly6h      Rtn1      Gad2       Id4      Ftl1    Tubb2b     Stmn2 
## 0.6130553 0.6104698 0.6066861 0.5882843 0.5843818 0.5692370 0.5618663 
##   Khdrbs2      Dner     Ptprd     Nrxn3     Epha5    Adarb2     Pcdh9 
## 0.5379458 0.5333362 0.5268840 0.5254053 0.5217668 0.5170168 0.5132195 
##      Ebf1    Lhx1os      Rora      Ppia      Nrg3     Tubb5     Stmn4 
## 0.5035401 0.4928012 0.4886042 0.4676053 0.4674764 0.4543159 0.4486940 
##     Foxp2    Nkain3     Tenm2     H3f3a     Celf4 
## 0.4403896 0.4392023 0.4332404 0.4310572 0.4281805

(The supressWarnings is only to remove the warnings about genes with zero SD.)

Let’s plot one of these genes. This time we use the cubeHelix colour palette.

as_tibble( my_umap$layout ) %>% 
add_column( expr = nrm_counts[ "Cdkn1c", ] ) %>%
ggplot +
  geom_point(aes( x=V1, y=V2, col=expr ), size=.3 ) +
  coord_fixed() +
  scale_color_gradientn( colours=rev(rje::cubeHelix(50))[15:40] )

## Warning: `as_tibble.matrix()` requires a matrix with column names or a `.name_repair` argument. Using compatibility `.name_repair`.
## This warning is displayed once per session.

Further work

This is just a demosntration. For a real analysis, we should, of course, check more than just one marker for each cell type, see whether they mark the same cell types, look more closely at the gradualy changing genes, and work towards some biological hypotheses. And, not to forget: There are many more samples in the Zeisel data.

Session Info

The versions of R and of all packages:

sessionInfo()

## R version 3.5.3 (2019-03-11)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.2 LTS
## 
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=de_DE.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=de_DE.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] pROC_1.13.0     umap_0.2.0.0    forcats_0.4.0   stringr_1.4.0  
##  [5] dplyr_0.8.0.1   purrr_0.3.1     readr_1.3.1     tidyr_0.8.3    
##  [9] tibble_2.0.1    tidyverse_1.2.1 Seurat_2.3.4    Matrix_1.2-16  
## [13] cowplot_0.9.4   ggplot2_3.1.0  
## 
## loaded via a namespace (and not attached):
##   [1] Rtsne_0.15           colorspace_1.4-0     class_7.3-15        
##   [4] modeltools_0.2-22    ggridges_0.5.1       mclust_5.4.2        
##   [7] htmlTable_1.13.1     futile.logger_1.4.3  base64enc_0.1-3     
##  [10] rje_1.9              rstudioapi_0.9.0     proxy_0.4-23        
##  [13] npsurv_0.4-0         flexmix_2.3-15       bit64_0.9-7         
##  [16] RSpectra_0.13-1      fansi_0.4.0          lubridate_1.7.4     
##  [19] mvtnorm_1.0-10       xml2_1.2.0           codetools_0.2-16    
##  [22] splines_3.5.3        R.methodsS3_1.7.1    lsei_1.2-0          
##  [25] robustbase_0.93-3    knitr_1.22           Formula_1.2-3       
##  [28] jsonlite_1.6         broom_0.5.1          ica_1.0-2           
##  [31] cluster_2.0.7-1      kernlab_0.9-27       png_0.1-7           
##  [34] R.oo_1.22.0          compiler_3.5.3       httr_1.4.0          
##  [37] backports_1.1.3      assertthat_0.2.0     lazyeval_0.2.1      
##  [40] cli_1.0.1            formatR_1.6          lars_1.2            
##  [43] acepack_1.4.1        htmltools_0.3.6      tools_3.5.3         
##  [46] igraph_1.2.4         gtable_0.2.0         glue_1.3.0          
##  [49] RANN_2.6.1           reshape2_1.4.3       Rcpp_1.0.0          
##  [52] cellranger_1.1.0     trimcluster_0.1-2.1  gdata_2.18.0        
##  [55] ape_5.2              nlme_3.1-137         iterators_1.0.10    
##  [58] fpc_2.1-11.1         gbRd_0.4-11          lmtest_0.9-36       
##  [61] xfun_0.5             rvest_0.3.2          irlba_2.3.3         
##  [64] gtools_3.8.1         DEoptimR_1.0-8       MASS_7.3-51.1       
##  [67] zoo_1.8-4            scales_1.0.0         hms_0.4.2           
##  [70] doSNOW_1.0.16        parallel_3.5.3       lambda.r_1.2.3      
##  [73] RColorBrewer_1.1-2   yaml_2.2.0           reticulate_1.11.1   
##  [76] pbapply_1.4-0        gridExtra_2.3        rpart_4.1-13        
##  [79] segmented_0.5-3.0    latticeExtra_0.6-28  stringi_1.3.1       
##  [82] foreach_1.4.4        checkmate_1.9.1      caTools_1.17.1.2    
##  [85] bibtex_0.4.2         Rdpack_0.10-1        SDMTools_1.1-221    
##  [88] rlang_0.3.1          pkgconfig_2.0.2      dtw_1.20-1          
##  [91] prabclus_2.2-7       bitops_1.0-6         evaluate_0.13       
##  [94] lattice_0.20-38      ROCR_1.0-7           labeling_0.3        
##  [97] htmlwidgets_1.3      bit_1.1-14           tidyselect_0.2.5    
## [100] plyr_1.8.4           magrittr_1.5         R6_2.4.0            
## [103] generics_0.0.2       snow_0.4-3           gplots_3.0.1.1      
## [106] Hmisc_4.2-0          haven_2.1.0          pillar_1.3.1        
## [109] foreign_0.8-70       withr_2.1.2          fitdistrplus_1.0-14 
## [112] mixtools_1.1.0       survival_2.43-3      nnet_7.3-12         
## [115] tsne_0.1-3           modelr_0.1.4         crayon_1.3.4        
## [118] hdf5r_1.0.1          futile.options_1.0.1 utf8_1.1.4          
## [121] KernSmooth_2.23-15   rmarkdown_1.11       readxl_1.3.0        
## [124] grid_3.5.3           data.table_1.12.0    metap_1.1           
## [127] digest_0.6.18        diptest_0.75-7       VennDiagram_1.6.20  
## [130] R.utils_2.8.0        stats4_3.5.3         munsell_0.5.0