Journal article
License

Closed access

Enhanced performance of gene expression predictive models with protein-mediated spatial chromatin interactions

Authors
Plewczyński, Dariusz
Agarwal, Abhishek
Ruan, Yijun
Lipiński, Jakub
Chiliński, Mateusz
Publication date
2023
Abstract (EN)

There have been multiple attempts to predict gene expression from sequence, epigenetics, and various other factors. To improve those predictions, we investigated adding protein-specific 3D interactions, which play a significant role in the condensation of the chromatin structure in the cell nucleus. To achieve this, we used the architecture of one of the state-of-the-art algorithms, ExPecto, and investigated the changes in the model metrics upon adding the spatially relevant data. We used ChIA-PET interactions mediated by cohesin (24 cell lines), CTCF (4 cell lines), and RNAPOL2 (4 cell lines). As the output of the study, we developed the Spatial Gene Expression (SpEx) algorithm, which shows statistically significant improvements in most cell lines. We compared our results with the baseline ExPecto model, which obtained a Spearman's rank correlation coefficient (SCC) of 0.82, and with the 0.85 reported by the newer Enformer; our models obtained an average SCC of 0.83. However, in some cases (e.g. RNAPOL2 on GM12878) our improvement reached 0.04, and in others (e.g. RNAPOL2 on H1) we reached an SCC of 0.86.
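
The scores quoted in the abstract are Spearman's rank correlation coefficients between observed and predicted expression. As a minimal sketch of how such a score can be computed (not the authors' evaluation code), the function below ranks both vectors, averaging tied ranks, and takes the Pearson correlation of the ranks. The names `average_ranks` and `spearman_scc` and the toy expression values are hypothetical and chosen only for illustration.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values equal to the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_scc(observed, predicted):
    """Spearman's rank correlation: Pearson correlation of the rank vectors."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(observed), average_ranks(predicted)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy values (not data from the paper): observed vs. predicted
# expression for five genes.
observed = [0.1, 2.3, 1.7, 4.0, 3.2]
predicted = [0.3, 1.9, 2.1, 3.8, 3.0]
print(round(spearman_scc(observed, predicted), 3))
```

Because the coefficient depends only on ranks, any monotonic rescaling of the predictions leaves the score unchanged, which is why it suits comparisons across models with differently scaled outputs.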

PBN discipline
biological sciences
Journal
Scientific Reports
Volume
13
Issue
1
ISSN
2045-2322
Open access license
Closed access