marker_genes

API pages include interactive (HTML) plots that may not render correctly on mobile devices.

marker_genes(data: AnnData, groups: Sequence[str] | None = None, *, key: str = 'rank_genes_groups', n_genes: int = 5) -> list[str]

Select the top-ranked marker gene names from a precomputed ranking.

The returned list is meant to be passed straight into the keys argument of the dimensional and distribution plots (e.g. umaps, violins), so each marker becomes one panel.

Parameters:
  • data (AnnData) – The single-cell data object holding the precomputed differential expression ranking.

  • groups (Sequence[str] | None, default None) – Subset of groups to pull markers from, in order. None keeps all groups in their stored order.

  • key (str, default 'rank_genes_groups') – The key under which the precomputed ranking is stored on data.

  • n_genes (int, default 5) – Number of top genes to pull per group.

Returns:

list[str] – Flattened, order-preserving list of marker gene names with duplicates removed (a gene ranked highly in several groups is kept once, at its first occurrence).
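The selection and de-duplication described above can be sketched in plain Python. This is only an illustration of the documented behavior, not the library's implementation: the `top_markers` helper and the `ranking` mapping (group name to gene names, best first) are hypothetical stand-ins for the ranking stored on data.

```python
from collections.abc import Mapping, Sequence
from typing import Optional


def top_markers(
    ranking: Mapping[str, Sequence[str]],
    groups: Optional[Sequence[str]] = None,
    *,
    n_genes: int = 5,
) -> list[str]:
    """Hypothetical stand-in: flatten the top genes per group, dropping repeats."""
    # None keeps every group in its stored order.
    selected = list(ranking) if groups is None else list(groups)
    # Take the first n_genes names of each selected group, in group order.
    flat = [gene for group in selected for gene in ranking[group][:n_genes]]
    # dict.fromkeys keeps only the first occurrence of each gene, in order.
    return list(dict.fromkeys(flat))


# Made-up ranking for illustration only.
ranking = {
    "A": ["g1", "g2", "g3"],
    "B": ["g2", "g4", "g5"],
}
print(top_markers(ranking, n_genes=2))  # ['g1', 'g2', 'g4']
```

Here g2 ranks in the top two of both groups but appears once, at the position where group A first contributed it.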

Raises:
  • UnsupportedDataTypeError – If data is not a supported single-cell data object.

  • KeyNotFoundError – If the ranking result or a requested group is missing.

  • ValueError – If n_genes is out of range.

  • TypeError – If groups is neither a Sequence of strings nor None.

Notes

Reads only the gene names, not their scores; use markers to plot the ranking itself. Use marker_genes_dict to keep the per-group grouping.
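To illustrate the difference between a flat list and a per-group grouping, the sketch below builds both shapes from a made-up ranking. The dictionary shape is an assumption made for illustration; the return type of marker_genes_dict is not specified on this page.

```python
# Made-up ranking for illustration: group name -> gene names, best first.
ranking = {"A": ["g1", "g2", "g3"], "B": ["g2", "g4", "g5"]}
n_genes = 2

# Per-group shape (assumed): one list of top genes per group.
per_group = {group: list(genes[:n_genes]) for group, genes in ranking.items()}

# Flat shape: concatenated in group order, first occurrence of each gene kept.
flat = list(dict.fromkeys(gene for genes in per_group.values() for gene in genes))

print(per_group)  # {'A': ['g1', 'g2'], 'B': ['g2', 'g4']}
print(flat)       # ['g1', 'g2', 'g4']
```

In the per-group shape the shared gene g2 is listed under both groups, whereas flattening keeps it once and loses the information about which group it came from.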

Examples

Color a UMAP grid by the top markers of a single group.

import scanpy as sc

import cellestial as cl

data = sc.datasets.pbmc68k_reduced()

cl.umaps(
    data,
    keys=cl.marker_genes(data, groups=["CD14+ Monocyte"], n_genes=4),
    size=2,
    ncol=2,
)

Draw a heatmap of the top markers of all groups (3 genes per group), with duplicates removed.

cl.heatmap(data, keys=cl.marker_genes(data, n_genes=3), group_by="louvain")