Journal article
License

Closed access

Truncated Robust Principal Component Analysis and Noise Reduction for Single Cell RNA Sequencing Data

Authors
Gogolewski, Krzysztof
Gambin, Anna
Sykulski, Maciej
Chung, Neo Christopher
Publication date
2019
Abstract

The development of single cell RNA sequencing (scRNA-seq) has enabled innovative approaches to investigating mRNA abundances. In our study, we are interested in extracting the systematic patterns of scRNA-seq data in an unsupervised manner; thus, we have developed two extensions of robust principal component analysis (RPCA). First, we present a truncated version of RPCA (tRPCA), which is much faster and memory efficient. Second, we introduce a noise reduction in tRPCA with L2 regularization. Unlike RPCA that only considers a low-rank L and sparse S matrices, the proposed method can also extract a noise E matrix inherent in modern genomic data. We demonstrate its usefulness by applying our methods on the peripheral blood mononuclear cell scRNA-seq data. Particularly, the clustering of a low-rank L matrix showcases better classification of unlabeled single cells. Overall, the proposed variants are well suited for high-dimensional and noisy data that are routinely generated in genomics.
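
To make the decomposition described in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes the data matrix M is split as M = L + S + E by alternating a rank-truncated singular value thresholding step (for the low-rank part L) with entrywise soft-thresholding (for the sparse part S), and that the residual E plays the role of the noise term under a squared (L2) penalty. The function name `truncated_rpca_sketch` and the parameters `rank`, `lam`, `tau`, and `n_iter` are hypothetical and are not taken from the paper.

```python
import numpy as np


def soft_threshold(x, t):
    """Entrywise soft-thresholding: shrink every value toward zero by t."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def truncated_rpca_sketch(M, rank, lam=0.5, tau=1.0, n_iter=50):
    """Illustrative split of M into low-rank L, sparse S and residual E.

    Assumption (not from the paper): alternate between
      L <- rank-truncated singular value thresholding of (M - S)
      S <- soft-thresholding of (M - L)
    which is a block-coordinate heuristic for
      tau * ||L||_* + lam * ||S||_1 + 0.5 * ||M - L - S||_F^2
    with L restricted to at most `rank` singular components.
    """
    M = np.asarray(M, dtype=float)
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    for _ in range(n_iter):
        # A full SVD is used here for simplicity; a memory-efficient version
        # would compute only the leading `rank` singular triplets.
        U, s, Vt = np.linalg.svd(M - S, full_matrices=False)
        s_kept = np.maximum(s[:rank] - tau, 0.0)
        L = (U[:, :rank] * s_kept) @ Vt[:rank]
        S = soft_threshold(M - L, lam)
    E = M - L - S  # whatever neither L nor S explains
    return L, S, E
```

In this sketch the three parts always sum back to M exactly, and every entry of E is bounded in magnitude by `lam`, because E is what remains of M - L after soft-thresholding. The truncation to `rank` components is where the speed and memory savings would come from if a partial SVD routine were substituted for the full one.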

Keywords
matrix decomposition
principal component analysis
robust PCA
single cell RNA-seq
unsupervised learning
PBN discipline
computer science
Journal
Journal of Computational Biology
Volume
26
Issue
8
Pages
782-793
ISSN
1066-5277
Open access license
Closed access