Autoregressive Upscaling of Sparse Single-Cell Data Improves Interpretability

Date issued

2025

Publisher

Západočeská univerzita v Plzni

Abstract

We introduce a novel generative training procedure for modeling single-cell RNA sequencing (scRNA-seq) data, based on an autoregressive neural network architecture. Our model sequentially samples UMI-tagged transcripts and captures the complex, sparse distributions inherent in scRNA-seq datasets. This generative framework supports realistic synthetic cell generation, gene expression inpainting, and measurement upscaling. Moreover, the pretrained model serves as a robust foundation for downstream predictive tasks, such as disease classification. Finally, we propose a novel unsupervised cell-typing approach that leverages the model’s intrinsic generative structure: cell-type hierarchies emerge naturally from tracing generative sampling paths, offering both interpretability and biological insight.
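To make the idea of sequential transcript sampling concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes that each new transcript's gene is drawn from a distribution conditioned on the counts sampled so far, and it uses a hypothetical linear-softmax scorer (`next_gene_probs`, with made-up `weights` and `bias`) as a stand-in for the neural network. Under one plausible reading of "measurement upscaling", continuing the same sampling loop from an observed count vector adds further transcripts to a shallow measurement.

```python
import math
import random


def next_gene_probs(counts, weights, bias):
    """Toy conditional p(next gene | counts so far).

    Assumption: a softmax over linear scores of log1p(counts) stands in
    for the autoregressive neural network described in the abstract.
    """
    n = len(counts)
    feats = [math.log1p(c) for c in counts]
    scores = [bias[g] + sum(weights[g][j] * feats[j] for j in range(n))
              for g in range(n)]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_cell(n_transcripts, weights, bias, rng, start_counts=None):
    """Draw transcripts one at a time, updating the count vector after each."""
    n = len(bias)
    counts = list(start_counts) if start_counts is not None else [0] * n
    for _ in range(n_transcripts):
        probs = next_gene_probs(counts, weights, bias)
        gene = rng.choices(range(n), weights=probs, k=1)[0]
        counts[gene] += 1
    return counts


# Hypothetical parameters: 5 genes, each mildly self-reinforcing.
rng = random.Random(0)
n_genes = 5
bias = [0.0] * n_genes
weights = [[0.5 if i == j else 0.0 for j in range(n_genes)]
           for i in range(n_genes)]

# Synthetic cell generation: start from an empty cell.
cell = sample_cell(20, weights, bias, rng)

# Upscaling (as interpreted here): keep sampling from an existing cell.
upscaled = sample_cell(30, weights, bias, rng, start_counts=cell)
```

In this sketch, generation and upscaling share one loop and differ only in the starting count vector, which is the property an autoregressive formulation makes convenient.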

Subject(s)

autoregressive neural network, generative training procedure, synthetic cell generation, scRNA-seq analysis

Citation