volcano

volcano(data: AnnData, group: str, *, group_by: str | None = None, key: str = 'rank_genes_groups', use_adjusted_pvalue: bool = True, logfoldchange_threshold: float = 1.0, pvalue_threshold: float = 0.05, mapping: FeatureSpec | None = None, color_up: str = '#b22222', color_down: str = '#6495ed', color_nonsignificant: str = '#bebebe', size: float = 2.0, alpha: float = 0.8, show_threshold_lines: bool = True, threshold_color: str = '#3f3f3f', threshold_size: float = 0.4, threshold_linetype: str = 'dashed', threshold_kwargs: dict | None = None, top_n: int | None = 10, label_color: str = '#1f1f1f', label_size: float = 4.0, segment_size: float = 0.4, label_kwargs: dict | None = None, nonsignificant_subsample: int | None = 2000, variable_column: str = 'variable', logfoldchange_column: str = 'logfoldchange', pvalue_column: str = 'pvalue', neg_log_pvalue_column: str = 'neg_log_pvalue', significance_column: str = 'significance', up_label: str = 'up', down_label: str = 'down', nonsignificant_label: str = 'ns', rank_genes_kwargs: dict | None = None, tooltips: Literal['none'] | Sequence[str] | FeatureSpec | None = None, interactive: bool = False, **point_kwargs) → PlotSpec

Volcano plot.

Plots -log10(pvalue) against log fold change for the differential expression results of group (vs the rest), colored by significance and optionally annotated with the top up- and down-regulated genes.

Parameters:
  • data (AnnData) – The AnnData object holding the differential expression results in data.uns[key] (scanpy’s rank_genes_groups convention).

  • group (str) – The group name to plot results for.

  • group_by (str | None, default None) – Observation column to group by. When provided, data.uns[key] is (re)computed via scanpy.tl.rank_genes_groups if it is missing or was computed with a different groupby.

  • key (str, default 'rank_genes_groups') – The key under data.uns storing the differential expression results.

  • use_adjusted_pvalue (bool, default True) – Whether to use adjusted p-values (pvals_adj) instead of raw p-values (pvals) for both the significance call and the -log10 transform.

  • logfoldchange_threshold (float, default 1.0) – Absolute log fold change threshold used to label features as up- or down-regulated.

  • pvalue_threshold (float, default 0.05) – P-value threshold used to label significance.

  • mapping (FeatureSpec | None, default None) – Additional aesthetic mappings, the result of aes(). Merged on top of the default aes(x=logfoldchange, y=neg_log_pvalue, color=significance).

  • color_up (str, default '#b22222') – Color for significantly up-regulated points (firebrick red).

  • color_down (str, default '#6495ed') – Color for significantly down-regulated points (cornflower blue).

  • color_nonsignificant (str, default '#bebebe') – Color for non-significant points (gray).

  • size (float, default 2.0) – Default point size; can be overridden via point_kwargs.

  • alpha (float, default 0.8) – Default point alpha; can be overridden via point_kwargs.

  • show_threshold_lines (bool, default True) – Whether to draw the dashed threshold lines.

  • threshold_color (str, default '#3f3f3f') – Color of the threshold lines.

  • threshold_size (float, default 0.4) – Size (thickness) of the threshold lines.

  • threshold_linetype (str, default 'dashed') – Line type of the threshold lines.

  • threshold_kwargs (dict | None, default None) – Additional parameters passed to the threshold geom_hline and geom_vline layers.

  • top_n (int | None, default 10) – Number of top up- and top down-regulated genes (by -log10(pvalue)) to label. Set to None or 0 to disable labels.

  • label_color (str, default '#1f1f1f') – Color of the gene labels.

  • label_size (float, default 4.0) – Size of the gene labels.

  • segment_size (float, default 0.4) – Width of the line segment connecting the label to the point.

  • label_kwargs (dict | None, default None) – Additional parameters passed to the geom_text label layer.

  • nonsignificant_subsample (int | None, default 2000) – Maximum number of non-significant points to keep. Significant points (up/down) are always kept in full. Reducing this shrinks the embedded data in the rendered plot (smaller HTML/notebook output) at no visible cost since the non-significant cloud is heavily overplotted. Set to None to keep every non-significant point. Sampling is deterministic (seed=42).

  • variable_column (str, default 'variable') – Output column name for the gene/feature names.

  • logfoldchange_column (str, default 'logfoldchange') – Output column name for the log fold changes.

  • pvalue_column (str, default 'pvalue') – Output column name for the p-values.

  • neg_log_pvalue_column (str, default 'neg_log_pvalue') – Output column name for the -log10(pvalue) transform.

  • significance_column (str, default 'significance') – Output column name for the categorical significance label.

  • up_label (str, default 'up') – Label for significantly up-regulated features.

  • down_label (str, default 'down') – Label for significantly down-regulated features.

  • nonsignificant_label (str, default 'ns') – Label for non-significant features.

  • rank_genes_kwargs (dict | None, default None) – Additional keyword arguments forwarded to scanpy.tl.rank_genes_groups when group_by triggers a (re)compute (e.g. method='t-test', use_raw=False, layer=..., corr_method='bonferroni').

  • tooltips ({'none'} | Sequence[str] | FeatureSpec | None, default None) – Tooltips to show when hovering over the geom. Accepts a Sequence[str], or the result of layer_tooltips() for more complex tooltips. Use 'none' to disable tooltips.

  • interactive (bool, default False) – Whether to make the plot interactive.

  • **point_kwargs – Additional parameters for the geom_point layer.
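
The significance call described by logfoldchange_threshold and pvalue_threshold can be sketched with plain pandas. This is an illustration only, not the library's implementation: the helper name classify is hypothetical, the column names follow the documented defaults, and the exact comparison operators (strict < for the p-value, inclusive for the fold change) are assumptions.

```python
import numpy as np
import pandas as pd


def classify(df, lfc_threshold=1.0, pvalue_threshold=0.05):
    """Add -log10(p) and an up/down/ns label (hypothetical helper)."""
    out = df.copy()
    # The y-axis of the volcano plot.
    out["neg_log_pvalue"] = -np.log10(out["pvalue"])
    # Assumption: a feature is significant when p is strictly below the threshold.
    significant = out["pvalue"] < pvalue_threshold
    # Assumption: the fold-change threshold is applied to the absolute value.
    up = significant & (out["logfoldchange"] >= lfc_threshold)
    down = significant & (out["logfoldchange"] <= -lfc_threshold)
    out["significance"] = np.select([up, down], ["up", "down"], default="ns")
    return out


toy = pd.DataFrame(
    {
        "variable": ["g1", "g2", "g3", "g4"],
        "logfoldchange": [2.5, -1.8, 0.3, 3.0],
        "pvalue": [1e-6, 1e-4, 1e-8, 0.2],
    }
)
result = classify(toy)
print(result[["variable", "significance"]])
```

In the toy data, g3 has a very small p-value but a fold change below the threshold, and g4 has a large fold change but a p-value above the threshold, so both end up in the non-significant class.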

Returns:

PlotSpec – Volcano plot.

Examples

A simple volcano plot.

import scanpy as sc

import cellestial as cl

data = sc.read_h5ad("data/pbmc3k_pped.h5ad")

cl.volcano(
    data,
    group="B Cells",
    group_by="cell_type_lvl1",
)
WARNING: It seems you use rank_genes_groups on the raw count data. Please logarithmize your data before calling rank_genes_groups.
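
The behaviour documented for top_n and nonsignificant_subsample can be sketched in the same way. The helpers select_labels and subsample_ns below are hypothetical and only mirror the documented behaviour (top features per direction ranked by -log10(pvalue), all significant points kept, seed 42); using pandas sample with random_state is an assumption about how the deterministic sampling might be done.

```python
import pandas as pd


def select_labels(df, top_n=10):
    """Pick the top_n 'up' and top_n 'down' rows by -log10(p) (hypothetical helper)."""
    if not top_n:  # None or 0 disables labels
        return df.iloc[0:0]
    parts = [
        df[df["significance"] == label].nlargest(top_n, "neg_log_pvalue")
        for label in ("up", "down")
    ]
    return pd.concat(parts)


def subsample_ns(df, max_ns=2000, seed=42):
    """Keep every up/down row; cap the 'ns' rows at max_ns (hypothetical helper)."""
    is_ns = df["significance"] == "ns"
    ns = df[is_ns]
    if max_ns is not None and len(ns) > max_ns:
        # A fixed seed makes the subsample reproducible between calls.
        ns = ns.sample(n=max_ns, random_state=seed)
    return pd.concat([df[~is_ns], ns])


toy = pd.DataFrame(
    {
        "variable": [f"g{i}" for i in range(8)],
        "neg_log_pvalue": [9.0, 7.0, 5.0, 8.0, 6.0, 0.5, 0.2, 0.1],
        "significance": ["up", "up", "up", "down", "down", "ns", "ns", "ns"],
    }
)
labels = select_labels(toy, top_n=2)
kept = subsample_ns(toy, max_ns=1)
print(labels["variable"].tolist())
print(len(kept))
```

Only the non-significant rows are ever dropped, which is why lowering the cap shrinks the plot data without removing any labelled or colored feature.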