This document compares two approaches to analyse single-cell data: First, we use Seurat, and then we perform all steps using only basic R, in order to be able to study and modify each step. I have presented it at the Single Cell Center Heidelberg meeting in Mar 2019.
As example data, I use the study on the development of the mouse brain by Carter et al.: “A single-cell transcriptional atlas of the developing murine cerebellum”, Current Biology 28:2910 (2018).
Specifically, I will use their sample E13a, i.e., which comprises cells from the cerebellum of a mouse embryo at day 13.5. The “a” means that it is the first of their two replicates.
I first perform a standard analysis with Seurat.
library( Seurat )
Kevin has downloaded the bam files from the European Nucleotide Archive (accession PRJEB23051) and run them again through CellRanger. In the next line, I read in the output from CellRanger. If you want to rerun this script yourself, you can find the three CellRanger output files for sample E13a here.
seu <- Read10X("E13_A/")
Now I create a Seurat object, keeping only the genes that are expressed in at least 3 cells, and only those cells expressing at least 1000 genes.
seu <- CreateSeuratObject( seu, min.cells = 3, min.genes=1000 )
I normalize the data. This performs, by default, the transformation \(y_{ij} = \log\left( 1 + 10^4 \frac{k_{ij}}{\sum_{i'}k_{i'k}}\right)\) (where \(k_{ij}\) is the UMI count for gene \(i\) in cell \(j\)).
seu <- NormalizeData( seu )
A short detour here on the topic of normalization:
The median UMI count in our cells is
median_umi_count_per_cell <- median( colSums(seu@raw.data) )
median_umi_count_per_cell
## [1] 6199
Here is a plot showing how the transformation changes UMI counts for such a “median” cell
plot( 0:100, log( 1 + 1e4 * (0:100) / median_umi_count_per_cell ),
type = "l", xlab="UMI count", ylab="expression measure", ylim = c(0, 6.5) )
meet_at_50 <- function(y)
y / y[51] * log( 1 + 1e4 * 50 / median_umi_count_per_cell )
lines( 0:100, meet_at_50( log( 1 + 1e3 * (0:100) / median_umi_count_per_cell ) ), type = "l", col="darkgreen" )
lines( 0:100, meet_at_50( log( 1 + 1e5 * (0:100) / median_umi_count_per_cell ) ), type = "l", col="darkblue" )
lines( 0:100, meet_at_50( log10( 1 + (0:100) ) ), type = "l", col="green" )
lines( 0:100, meet_at_50( sqrt( (0:100) / median_umi_count_per_cell ) ), type = "l", col="orange" )
The black line above is Seurat’s default normalization. For future reference, I have also included the transformation if one changes Seurat’s scaling factor from the default \(10^4\) (black) to \(10^3\) (dark green) or \(10^5\) (dark blue). Light green is the often used simple \(y_{ij} = \log_{10}(1+k_{ij})\) and orange is the square-root transform \(y_{ij} = \sqrt{ {k_{ij}}/{\sum_{i'}k_{i'k}}}\). I have rescaled these extra lines so that they all meet at 50.
Continuing with the standard Seuirat work-flow: Find the variable genes
seu <- FindVariableGenes( seu )
Scale and center the data and regress out UMI totals
seu <- ScaleData( seu, vars.to.regress = "nUMI" )
## Regressing out: nUMI
##
## Time Elapsed: 29.526376247406 secs
## Scaling data matrix
Run a PCA to reduce the data
seu <- RunPCA( seu, do.print = FALSE )
Perform Louvain clustering
seu <- FindClusters( seu, print.output = FALSE, save.SNN = TRUE )
Get an embedding. I use UMAP. By default, RunUMAP
only uses the first 5 PCS, but this gives quite unsatisfactory results. Hence, I use a few more, otherwise it get’s too blurry.
seu <- RunUMAP( seu, dims.use=1:20 )
Plot the embedding, and highlight the Louvain clusters:
DimPlot( seu, "umap" )
Inspect it with Sleepwalk
library( sleepwalk )
sleepwalk( seu@dr$umap@cell.embeddings, seu@dr$pca@cell.embeddings )
Now I perform an analysis only using standard R and tidyverse packages.
library( tidyverse )
First, read the CellRanger output. We read the three files one by one.
First the genes.tsv
file. It contains two columns: the gene IDs and the gene names
genes <- read_tsv( "E13_A/genes.tsv", col_names = c( "gene_id", "gene_name" ) )
## Parsed with column specification:
## cols(
## gene_id = col_character(),
## gene_name = col_character()
## )
head( genes )
## # A tibble: 6 x 2
## gene_id gene_name
## <chr> <chr>
## 1 ENSMUSG00000102693 4933401J01Rik
## 2 ENSMUSG00000064842 Gm26206
## 3 ENSMUSG00000051951 Xkr4
## 4 ENSMUSG00000102851 Gm18956
## 5 ENSMUSG00000103377 Gm37180
## 6 ENSMUSG00000104017 Gm37363
There are a few duplicated gene names. I disambiguate these few cases by adding the Ensemble ID to the name.
duplicate_gene_names <- genes$gene_name[ duplicated(genes$gene_name) ]
genes$gene_name <- ifelse(
genes$gene_name %in% duplicate_gene_names,
str_c( genes$gene_name, "_", genes$gene_id ),
genes$gene_name )
Next, the cell barcodes. This is simply a file with one barcode per line.
barcodes <- readLines( "E13_A/barcodes.tsv" )
str( barcodes )
## chr [1:737280] "AAACATACAAAACG-1" "AAACATACAAAAGC-1" ...
Note that these are the unfiltered files. We have over 700,000 barcodes – presumably most of them empty droplets.
I remove the superfluous -1
suffix.
barcodes <- str_remove( barcodes, "-1" )
The third file and biggest file in the CellRanger output is the matrix of counts in coordinate-sparse form, stored according to the MatrixMarket format.
matrixTable <- read_delim( "E13_A/matrix.mtx",
delim=" ", comment = "%", col_names = c( "gene", "barcode", "count" ) )
## Parsed with column specification:
## cols(
## gene = col_double(),
## barcode = col_double(),
## count = col_double()
## )
head(matrixTable)
## # A tibble: 6 x 3
## gene barcode count
## <dbl> <dbl> <dbl>
## 1 53801 737280 10042484
## 2 32438 1 1
## 3 39070 1 1
## 4 52789 1 1
## 5 5635 4 1
## 6 10830 4 1
Note how the first non-comment row of a MatrixMarket file contains the total number of genes, barcodes and UMIs. The first two numbers agree with the number of rows in the genes
and barcodes
files. From the second row on, we have actual data: the first and second columns are indices into the gene and barcode list, the third one contains the UMI counts.
The sparseMatrix
function from the base R package Matrix
is designed to handle such data. Actually, it turns it from triplet-sparse to column-sparse (unless we set giveCsparse=FALSE
) but that might actually be better for performance. (Seurat, by the way, keeps it in triplet-sparse format.)
The sparseMatrix functions takes all the data we have just loaded:
library( Matrix )
counts <-
sparseMatrix(
i = matrixTable$gene[-1], j = matrixTable$barcode[-1], x = matrixTable$count[-1],
dims = c( matrixTable$gene[1], matrixTable$barcode[1] ),
dimnames = list( gene = genes$gene_name, barcode = barcodes ) )
counts[1:5,1:5]
## 5 x 5 sparse Matrix of class "dgCMatrix"
## barcode
## gene AAACATACAAAACG AAACATACAAAAGC AAACATACAAACAG
## 4933401J01Rik . . .
## Gm26206 . . .
## Xkr4 . . .
## Gm18956 . . .
## Gm37180 . . .
## barcode
## gene AAACATACAAACGA AAACATACAAAGCA
## 4933401J01Rik . .
## Gm26206 . .
## Xkr4 . .
## Gm18956 . .
## Gm37180 . .
I remove the raw data tables to save RAM
rm( barcodes, genes, matrixTable )
As I have read the complete data, I can do the popular knee plot that is also shown in CellRanger’s HTML summary page.
colSums( counts>0 )
tells us for each barcode how many genes have been detected for this. I calculate this and sort:
n_detected_genes_sorted <- sort( colSums( counts>0 ), decreasing = TRUE )
head(n_detected_genes_sorted)
## ATCCCGTGACTAGC TAATGAACACTCAG CTTTGATGGCTCCT TCGTAGGAGGGAGT AATACTGAATTCGG
## 6451 5862 5638 5346 5224
## TGGAACACCGAACT
## 5212
Now I plot these numbers in a log-log plot
plot( seq.int(ncol(counts)), n_detected_genes_sorted, log = "xy", type="s",
xlab = "barcodes", ylab = "detected genes" )
## Warning in xy.coords(x, y, xlabel, ylabel, log): 457576 y values <= 0
## omitted from logarithmic plot
abline( h=1000, col="gray" )
I use the same criterion as above and call everything with at least 1000 detected genes (above the gray line) a cell.
cell_barcodes <- names(which( colSums( counts>0 ) >= 1000 ))
str(cell_barcodes)
## chr [1:3255] "AAACATTGGCTCCT" "AAACATTGTCGACA" "AAACGCACCCACCT" ...
Are these the same barcodes as the ones in the Seurat object?
all( colnames(seu@raw.data) == cell_barcodes )
## [1] TRUE
They are. Even in the same order.
I remove the rest to make my matrix smaller
counts <- counts[ , cell_barcodes ]
dim(counts)
## [1] 53801 3255
I also remove the genes that appear in none of the remaining cells
counts <- counts[ rowSums(counts) > 0, ]
dim(counts)
## [1] 25174 3255
For each cell, I simply divide the UMI count per gene by the total UMI count over all genes: \(y_{ik} = k_{ij} / \sum_{i'} k_{i'j}\) The double transposition is to get R to divide the columns and not the rows by the appropriate total.
nrm_counts <- t( t(counts) / colSums(counts) )
A preliminary: It seems there is no function to efficiently calculate row variances for a column-sparse matrix. Here is my own implementation. I might look a bit ugly but it works.
colVars_spm <- function( spm ) {
stopifnot( is( spm, "dgCMatrix" ) )
ans <- sapply( seq.int(spm@Dim[2]), function(j) {
mean <- sum( spm@x[ (spm@p[j]+1):spm@p[j+1] ] ) / spm@Dim[1]
sum( ( spm@x[ (spm@p[j]+1):spm@p[j+1] ] - mean )^2 ) +
mean^2 * ( spm@Dim[1] - ( spm@p[j+1] - spm@p[j] ) ) } ) / ( spm@Dim[1] - 1 )
names(ans) <- spm@Dimnames[[2]]
ans
}
rowVars_spm <- function( spm ) {
colVars_spm( t(spm) )
}
With this, I can get the variance-to-mean ratio for each gene. I plot them against the means:
gene_means <- rowMeans( nrm_counts )
gene_vars <- rowVars_spm( nrm_counts )
plot( gene_means, gene_vars / gene_means,
log = "xy", cex = .3, col = adjustcolor("black", alpha=.3),
xlab = "mean", ylab = "variance / mean" )
poisson_vmr <- mean( 1 / colSums( counts ) )
abline( h = 1:3 * poisson_vmr, col="lightblue" )
The lowest of the light blue line marks the Poisson noise level (poisson_vmr
). It is placed at the mean of the reciprocals of the UMI totals per cell. This is immediately not obvious; I’ll write up the proof.
I chose the genes three times above the Poisson level as informative genes
informative_genes <- names(which(
gene_vars / gene_means > 3 * poisson_vmr ))
str( informative_genes )
## chr [1:591] "Eya1" "Tfap2b" "Khdrbs2" "1500015O10Rik" "Gm26649" ...
How many of these are also among the variable genes chosen by Seurat?
grid::grid.newpage()
print( grid::grid.draw(
VennDiagram::venn.diagram( list( Seurat = seu@var.genes, mine = informative_genes ), NULL ) ) )
## NULL
I calculate a UMAP embedding, using only the informative genes
library( umap )
my_umap <- umap( as.matrix( t( sqrt( nrm_counts[ informative_genes, ] ) ) ) )
The square root here is a varianmce-stabilizing transformation. I use it instead of the log used by Seurat, because in UMI data, the dominant noise is Poissonean. (This, too, needs to be explained in more detail, along with the calculation of the Poisson noise level.)
plot( my_umap$layout, asp=1 )
Those outliers are annoying. I pull them in
my_umap$layout[ my_umap$layout[,1] < -8, 1 ] <- -8
Plot again, now colour with the Louvain cluster found by Seurat
plot( my_umap$layout, asp=1, col=seu@ident, cex=.3 )
I save the environment, so I can easily restart here later.
save.image("comparison.rda")
Now I can compare the two embeddings. Left the one by Seurat, right the one with my workflow.
sleepwalk(
list( seu@dr$umap@cell.embeddings, my_umap$layout ),
list( seu@dr$pca@cell.embeddings, as.matrix( t(sqrt(nrm_counts[informative_genes,])) ) ) )
(Sorry, the UMAP is inverted in the screenshots compared to what you see in the R plots, because I ran it twice with different random seeds.)
I have selected a few points with the mouse (using Sleepwalk’s lasso feature). These then appear in the environment as a variable called selPoints
. To be able to knit this document non-interactively, I list the selected point here:
selPoints <-
c(1L, 4L, 5L, 7L, 14L, 16L, 26L, 28L, 29L, 32L, 34L, 35L, 37L,
45L, 60L, 61L, 69L, 71L, 76L, 85L, 95L, 98L, 104L, 107L, 111L,
...
3228L, 3231L, 3234L, 3237L, 3238L, 3240L, 3244L, 3250L, 3252L,
3253L)
This is the structure that I have selected with the mouse:
selPoints_bool <-seq.int(ncol(nrm_counts)) %in% selPoints
plot( my_umap$layout, asp=1, col=1+selPoints_bool )
What genes are special among these cells? I like to use AUROC for this, as it is non-parametric
library( pROC )
## Type 'citation("pROC")' for a citation.
##
## Attaching package: 'pROC'
## The following objects are masked from 'package:stats':
##
## cov, smooth, var
aucs <- sapply( informative_genes, function(g)
auc( roc( selPoints_bool, nrm_counts[g,] ) ) )
head( sort( aucs, decreasing = TRUE ), 40 )
## Meis2 Lhx9 Nrn1 Ralyl Hs3st5 Nrxn1 Tfap2b
## 0.9465052 0.8764802 0.8576308 0.8356687 0.7974714 0.7930246 0.7878751
## Nnat Ncam1 Erbb4 Slc25a5 Rtn1 Phlda1 Igfbpl1
## 0.7775149 0.7730306 0.7719841 0.7566909 0.7558819 0.7548668 0.7515794
## Tmem163 Cacna2d1 Lhx1os Ppia Cd24a Nkain3 Nrxn3
## 0.7476554 0.7468441 0.7439741 0.7372272 0.7317169 0.7307682 0.7281671
## Tubb2b Fabp7 Gm2694 Auts2 C1ql4 Lhx2 Stmn1
## 0.7274364 0.7254094 0.7218321 0.7188515 0.7186425 0.7147797 0.7076640
## Pax3 Sox11 Hist3h2ba Evx1os Tubb5 Sox4 Trp53i11
## 0.7035411 0.7022047 0.7011367 0.7001113 0.6965152 0.6914617 0.6899750
## Hmgn2 Malat1 Basp1 Ina Hmgb2
## 0.6894896 0.6852528 0.6828976 0.6801940 0.6774803
According to the cheat sheet I got from Kevin, Meis2, Lhx2 and Lhx9 are markers for the precursor cells for the cerebellar nuclei.
Let’s check out the top gene, Meis2
plot( colSums(counts), jitter(counts["Meis2",]), log="x", cex=.3, col = 1+selPoints_bool )
In this plot, the UMI counts for the gene of interest are on the y axis and the UMI totals for the cell on the x axis. This allows me to distinguish “informative” zeroes (bottom right) from zeroes due to low coverage (bottom left). I have also added vertical jitter to reduce overplotting. The red points are the cells in the selected structure.
I prefer this plot over the usual violin plot:
ggplot( data.frame( selected = selPoints_bool, expr = sqrt(nrm_counts["Meis2",]) ) ) +
geom_violin(aes(x=selected, y=expr))
Nevertheless, the violin plot helps understanding what the AUROC criterion does and how the ROC curver for the gene is derived.
plot( roc( selPoints_bool, nrm_counts["Meis2",] ) )
I next plot the gene in the UMAP embedding. I highlight all cells with a least two UMI counts for the gene.
plot( my_umap$layout, col = 1 + ( counts["Meis2",] >= 2 ), asp=1, cex=.1 )
Note how the tip of the structure does not seem to be part of whatever this gene marks. Also note that the boundary between the black tip and the red main part do not correspond to the Louvain cluster boundary.
Was 2 UMI counts really a good cut-off? Our LinkedCharts library allows to add a sigmoid colour slider to the scatter plot. This can help here:
library( rlc )
openPage( useViewer = FALSE )
lc_scatter( dat(
x = my_umap$layout[,1],
y = my_umap$layout[,2],
colourValue = nrm_counts["Meis2",],
size = 1 ),
id = "scatter")
lc_colourSlider( chart="scatter" )
Next, I have selected the long arm with the Sleepwalk lasso:
selPoints <-
c(2L, 3L, 8L, 9L, 10L, 11L, 12L, 13L, 15L, 17L, 18L, 21L, 22L,
...
3236L, 3239L, 3241L, 3247L, 3249L, 3251L, 3254L)
These cells:
selPoints_bool <-seq.int(ncol(nrm_counts)) %in% selPoints
plot( my_umap$layout, asp=1, col=1+selPoints_bool )
Check again by AUROC
aucs <- sapply( informative_genes, function(g)
auc( roc( selPoints_bool, nrm_counts[g,] ) ) )
head( sort( aucs, decreasing = TRUE ), 40 )
## Gm26688 Nrxn3 Lhx1os Tfap2b Ebf1 Lhx5 Nhlh2
## 0.9181459 0.9021246 0.8972830 0.8957367 0.8721018 0.8612993 0.8087934
## Pid1 Foxp2 Cdkn1c Cited2 Uncx Tagln3 Ppp2r2b
## 0.8064388 0.8009522 0.7934326 0.7880298 0.7878655 0.7877650 0.7800723
## Gad2 Ftl1 Sparc Serpinh1 Igfbpl1 Erbb4 Tenm2
## 0.7765886 0.7543529 0.7420285 0.7415008 0.7386379 0.7357047 0.7328726
## Pcdh9 Actg1 Stmn2 Stmn1 Ldha Basp1 H3f3a
## 0.7324070 0.7319108 0.7306687 0.7305721 0.7298708 0.7285679 0.7283207
## Rpl41 Malat1 Celf4 Klhl35 Fabp7 Actb Id2
## 0.7252911 0.7239458 0.7213464 0.7182265 0.7173180 0.7169015 0.7168335
## Tmsb10 Fabp5 Mpped2 Mif Ran
## 0.7121611 0.7100093 0.7064039 0.7052148 0.7011633
(This sapply is not well done. I should presort the counts to speed it up.)
I spot Foxp2, a marker for Purkinje cells.
Where else do we see that one?
openPage( useViewer = FALSE )
lc_scatter( dat(
x = my_umap$layout[,1],
y = my_umap$layout[,2],
colourValue = nrm_counts["Foxp2",],
size = 1 ),
id = "scatter")
lc_colourSlider( chart="scatter" )
Are there other genes that increase in expression along the length of my selected structure? As the structure is elongated in vertical direction on the UMAP embedding, I can find them by checking, for each gene, the correlation of this gene in a cell with the cell’s y cordinate in the umap embedding. I calculate the correlation coefficient only using the selected counts
Correlate all genes in these cells with y coordinate of UMAP:
cors <- sapply( informative_genes, function(g)
suppressWarnings(cor(
nrm_counts[ g, selPoints_bool ], # expression of gene 'g' in the selected cells
my_umap$layout[ selPoints, 2 ], # umap y corrdinate of the selected cells
method="spearman" ) ))
cat("\n")
head( sort( cors, decreasing = TRUE ), 40 )
## Tmsb10 Gap43 Malat1 Cnih2 Stmn3 Meg3 Ncam1
## 0.8942025 0.8453662 0.7664583 0.7662315 0.7611152 0.7009983 0.6985235
## Ndn Ina Tubb2a Hist3h2ba Gng3 Nnat Mllt11
## 0.6863740 0.6852637 0.6336845 0.6315892 0.6272005 0.6252312 0.6155675
## Ly6h Rtn1 Gad2 Id4 Ftl1 Tubb2b Stmn2
## 0.6130553 0.6104698 0.6066861 0.5882843 0.5843818 0.5692370 0.5618663
## Khdrbs2 Dner Ptprd Nrxn3 Epha5 Adarb2 Pcdh9
## 0.5379458 0.5333362 0.5268840 0.5254053 0.5217668 0.5170168 0.5132195
## Ebf1 Lhx1os Rora Ppia Nrg3 Tubb5 Stmn4
## 0.5035401 0.4928012 0.4886042 0.4676053 0.4674764 0.4543159 0.4486940
## Foxp2 Nkain3 Tenm2 H3f3a Celf4
## 0.4403896 0.4392023 0.4332404 0.4310572 0.4281805
(The supressWarnings
is only to remove the warnings about genes with zero SD.)
Let’s plot one of these genes. This time we use the cubeHelix colour palette.
as_tibble( my_umap$layout ) %>%
add_column( expr = nrm_counts[ "Cdkn1c", ] ) %>%
ggplot +
geom_point(aes( x=V1, y=V2, col=expr ), size=.3 ) +
coord_fixed() +
scale_color_gradientn( colours=rev(rje::cubeHelix(50))[15:40] )
## Warning: `as_tibble.matrix()` requires a matrix with column names or a `.name_repair` argument. Using compatibility `.name_repair`.
## This warning is displayed once per session.
This is just a demosntration. For a real analysis, we should, of course, check more than just one marker for each cell type, see whether they mark the same cell types, look more closely at the gradualy changing genes, and work towards some biological hypotheses. And, not to forget: There are many more samples in the Zeisel data.
The versions of R and of all packages:
sessionInfo()
## R version 3.5.3 (2019-03-11)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.2 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=de_DE.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=de_DE.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=de_DE.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] pROC_1.13.0 umap_0.2.0.0 forcats_0.4.0 stringr_1.4.0
## [5] dplyr_0.8.0.1 purrr_0.3.1 readr_1.3.1 tidyr_0.8.3
## [9] tibble_2.0.1 tidyverse_1.2.1 Seurat_2.3.4 Matrix_1.2-16
## [13] cowplot_0.9.4 ggplot2_3.1.0
##
## loaded via a namespace (and not attached):
## [1] Rtsne_0.15 colorspace_1.4-0 class_7.3-15
## [4] modeltools_0.2-22 ggridges_0.5.1 mclust_5.4.2
## [7] htmlTable_1.13.1 futile.logger_1.4.3 base64enc_0.1-3
## [10] rje_1.9 rstudioapi_0.9.0 proxy_0.4-23
## [13] npsurv_0.4-0 flexmix_2.3-15 bit64_0.9-7
## [16] RSpectra_0.13-1 fansi_0.4.0 lubridate_1.7.4
## [19] mvtnorm_1.0-10 xml2_1.2.0 codetools_0.2-16
## [22] splines_3.5.3 R.methodsS3_1.7.1 lsei_1.2-0
## [25] robustbase_0.93-3 knitr_1.22 Formula_1.2-3
## [28] jsonlite_1.6 broom_0.5.1 ica_1.0-2
## [31] cluster_2.0.7-1 kernlab_0.9-27 png_0.1-7
## [34] R.oo_1.22.0 compiler_3.5.3 httr_1.4.0
## [37] backports_1.1.3 assertthat_0.2.0 lazyeval_0.2.1
## [40] cli_1.0.1 formatR_1.6 lars_1.2
## [43] acepack_1.4.1 htmltools_0.3.6 tools_3.5.3
## [46] igraph_1.2.4 gtable_0.2.0 glue_1.3.0
## [49] RANN_2.6.1 reshape2_1.4.3 Rcpp_1.0.0
## [52] cellranger_1.1.0 trimcluster_0.1-2.1 gdata_2.18.0
## [55] ape_5.2 nlme_3.1-137 iterators_1.0.10
## [58] fpc_2.1-11.1 gbRd_0.4-11 lmtest_0.9-36
## [61] xfun_0.5 rvest_0.3.2 irlba_2.3.3
## [64] gtools_3.8.1 DEoptimR_1.0-8 MASS_7.3-51.1
## [67] zoo_1.8-4 scales_1.0.0 hms_0.4.2
## [70] doSNOW_1.0.16 parallel_3.5.3 lambda.r_1.2.3
## [73] RColorBrewer_1.1-2 yaml_2.2.0 reticulate_1.11.1
## [76] pbapply_1.4-0 gridExtra_2.3 rpart_4.1-13
## [79] segmented_0.5-3.0 latticeExtra_0.6-28 stringi_1.3.1
## [82] foreach_1.4.4 checkmate_1.9.1 caTools_1.17.1.2
## [85] bibtex_0.4.2 Rdpack_0.10-1 SDMTools_1.1-221
## [88] rlang_0.3.1 pkgconfig_2.0.2 dtw_1.20-1
## [91] prabclus_2.2-7 bitops_1.0-6 evaluate_0.13
## [94] lattice_0.20-38 ROCR_1.0-7 labeling_0.3
## [97] htmlwidgets_1.3 bit_1.1-14 tidyselect_0.2.5
## [100] plyr_1.8.4 magrittr_1.5 R6_2.4.0
## [103] generics_0.0.2 snow_0.4-3 gplots_3.0.1.1
## [106] Hmisc_4.2-0 haven_2.1.0 pillar_1.3.1
## [109] foreign_0.8-70 withr_2.1.2 fitdistrplus_1.0-14
## [112] mixtools_1.1.0 survival_2.43-3 nnet_7.3-12
## [115] tsne_0.1-3 modelr_0.1.4 crayon_1.3.4
## [118] hdf5r_1.0.1 futile.options_1.0.1 utf8_1.1.4
## [121] KernSmooth_2.23-15 rmarkdown_1.11 readxl_1.3.0
## [124] grid_3.5.3 data.table_1.12.0 metap_1.1
## [127] digest_0.6.18 diptest_0.75-7 VennDiagram_1.6.20
## [130] R.utils_2.8.0 stats4_3.5.3 munsell_0.5.0