R API in beta.

CZ CELLxGENE Discover Census


The CZ CELLxGENE Discover Census provides efficient computational tooling to access, query, and analyze all single-cell RNA data from CZ CELLxGENE Discover.

Using a new access paradigm of cell-based slicing and querying, you can interact with the data across datasets through TileDB-SOMA, or get slices in AnnData or Seurat objects.

Get started on using the Census:

Citing the Census

Please follow the citation guidelines offered by CZ CELLxGENE Discover.

Census Capabilities

The Census is a data object publicly hosted online and a convenience API to open it. The object is built using the SOMA API and data model via its implementation, TileDB-SOMA. As such, the Census has all the data capabilities offered by TileDB-SOMA, including:

Data access at scale

  • Cloud-based data access.

  • Efficient access for larger-than-memory slices of data.

  • Query and access data based on cell or gene metadata at low latency.
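
As a minimal sketch of metadata-based querying, the Python snippet below reads cell metadata for a filtered subset of cells. It assumes the `cellxgene_census` package is installed and that network access is available. `build_value_filter` and `read_obs` are hypothetical helper names introduced here for illustration, and the column names used in the example (`tissue`, `sex`, `cell_type`, `assay`) are assumed to exist in the Census cell metadata.

```python
def build_value_filter(**equals):
    """Build a filter string such as "tissue == 'brain' and sex == 'male'".

    Hypothetical helper: each keyword argument becomes one equality clause.
    """
    return " and ".join(f"{col} == {val!r}" for col, val in equals.items())


def read_obs(value_filter, column_names, organism="homo_sapiens"):
    """Read the cell metadata rows matching `value_filter` into pandas.

    Hypothetical helper; needs the cellxgene_census package and network access.
    """
    import cellxgene_census  # imported lazily so the filter helper works offline

    with cellxgene_census.open_soma() as census:
        obs = census["census_data"][organism].obs
        # read() returns an iterator of Arrow tables; concat() joins them.
        table = obs.read(value_filter=value_filter, column_names=column_names).concat()
        return table.to_pandas()
```

For example, `read_obs(build_value_filter(tissue="brain", sex="male"), ["cell_type", "assay"])` would return a pandas data frame with one row per matching cell.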

Interoperability with existing single-cell toolkits

  • Load and create AnnData objects.

  • Load and create Seurat objects (coming soon).
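
A slice of the Census can be loaded as an AnnData object roughly as follows. This is a sketch, assuming the `cellxgene_census` Python package and its `get_anndata` function; `in_list_filter` and `fetch_anndata` are hypothetical helper names, and the filter columns `cell_type` and `feature_id` are assumed to be present in the cell and gene metadata.

```python
def in_list_filter(column, values):
    """Build a membership filter such as "cell_type in ['neuron', 'T cell']".

    Hypothetical helper for illustration.
    """
    return f"{column} in {list(values)!r}"


def fetch_anndata(cell_types, gene_ids, organism="Homo sapiens"):
    """Return an AnnData object restricted to the given cell types and genes.

    Hypothetical helper; needs the cellxgene_census package and network access.
    """
    import cellxgene_census  # imported lazily so the filter helper works offline

    with cellxgene_census.open_soma() as census:
        return cellxgene_census.get_anndata(
            census=census,
            organism=organism,
            obs_value_filter=in_list_filter("cell_type", cell_types),
            var_value_filter=in_list_filter("feature_id", gene_ids),
        )
```

Filtering on both cells and genes keeps the resulting object small enough to hold in memory; dropping either filter selects every cell or every gene along that axis.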

Interoperability with existing Python or R data structures

  • From Python create PyArrow objects, SciPy sparse matrices, NumPy arrays, and pandas data frames.

  • From R create R Arrow objects, sparse matrices (via the Matrix package), and standard data frames and (dense) matrices.
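
To illustrate the Python side of this interoperability, the sketch below turns a sparse slice into a SciPy matrix. It assumes the slice has already been read as three parallel sequences (row ids, column ids, values), and that the ids are global identifiers that must be remapped to 0-based positions before building a local matrix. `compact_index` and `triples_to_csr` are hypothetical helper names, not part of any Census API.

```python
def compact_index(join_ids):
    """Map each distinct id to a 0-based position, in ascending id order."""
    return {jid: pos for pos, jid in enumerate(sorted(set(join_ids)))}


def triples_to_csr(rows, cols, values):
    """Build a SciPy CSR matrix from (row id, column id, value) triples.

    Hypothetical helper; needs NumPy and SciPy.
    """
    import numpy as np
    from scipy import sparse

    row_pos = compact_index(rows)
    col_pos = compact_index(cols)
    # Translate global ids into local matrix coordinates.
    r = np.fromiter((row_pos[i] for i in rows), dtype=np.int64)
    c = np.fromiter((col_pos[j] for j in cols), dtype=np.int64)
    shape = (len(row_pos), len(col_pos))
    return sparse.coo_matrix((np.asarray(values), (r, c)), shape=shape).tocsr()
```

The dictionaries returned by `compact_index` can be kept alongside the matrix to map its rows and columns back to the original ids.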

Census Data Releases

The Census data release plans are detailed here.

In short, starting on May 15, 2023, long-term supported Census data releases will be published every 6 months and will remain publicly accessible for at least 5 years. In addition, weekly releases are published without any guarantee of permanence.
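
Because weekly releases may disappear, a reproducible analysis would typically pin a specific release. The sketch below assumes that `cellxgene_census.open_soma` accepts a `census_version` argument that is either `"stable"` (the most recent long-term supported release), `"latest"` (the most recent weekly release), or a release date written as `YYYY-MM-DD`. `normalize_census_version` and `open_release` are hypothetical helper names.

```python
from datetime import date


def normalize_census_version(tag):
    """Return `tag` if it is "stable", "latest", or an ISO date string.

    Hypothetical helper; raises ValueError for anything else.
    """
    if tag in ("stable", "latest"):
        return tag
    date.fromisoformat(tag)  # raises ValueError if tag is not a valid date
    return tag


def open_release(tag="stable"):
    """Open the Census at the requested release.

    Hypothetical helper; needs the cellxgene_census package and network access.
    The caller is responsible for closing the returned handle.
    """
    import cellxgene_census  # imported lazily so validation works offline

    return cellxgene_census.open_soma(census_version=normalize_census_version(tag))
```

Pinning a dated long-term supported release, for example `open_release("2023-05-15")`, keeps results comparable across reruns, whereas `"latest"` tracks data that may later be removed.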

Questions, feedback and issues

Coming soon

  • We are currently working on creating the tooling necessary to perform data modeling at scale with seamless integration of the Census and PyTorch.

  • To increase the usability of the Census for research, in 2023 and 2024 we are planning to explore the following areas:

    • Include organism-wide normalized layers.

    • Include organism-wide embeddings.

    • On-demand information-rich subsampling.

Projects and tools using Census

If you are interested in listing a project here, please reach out to us at soma@chanzuckerberg.com.