cellestial.build_frame
- build_frame(data: AnnData, *, variable_keys: str | Sequence[str] | None = None, axis: Literal[0, 1] | None = None, observations_name: str = 'barcode', variables_name: str = 'variable', include_dimensions: bool | int = False) -> DataFrame
Build a DataFrame from an AnnData object.
- Parameters:
  data (AnnData) – The AnnData object containing the variables.
  variable_keys (str | Sequence[str] | None) – Variable keys to add to the DataFrame. If None, no additional keys are added.
  axis ({0, 1} | None) – The axis to build the frame for. 0 for observations, 1 for variables.
  observations_name (str) – The name of the observations column, default is ‘barcode’.
  variables_name (str) – Name for the variables index column, default is ‘variable’.
  include_dimensions (bool | int) – Whether to include dimensions in the DataFrame, default is False. Providing an integer limits the number of dimensions to the given number.
- Returns:
  DataFrame – A polars DataFrame containing the variables.
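The bool | int type of include_dimensions has a subtlety worth illustrating: in Python, bool is a subclass of int, so a naive integer check would read True as "keep 1 dimension". The following is a minimal sketch of how such an argument could be interpreted, not cellestial's actual implementation. The helper name resolve_dimension_limit and the parameter available are hypothetical, and clamping an oversized integer to the number of available dimensions is an assumption.

```python
from __future__ import annotations


def resolve_dimension_limit(include_dimensions: bool | int, available: int) -> int:
    """Return how many embedding dimensions to keep (hypothetical helper).

    `available` is the number of dimensions an embedding actually has.
    """
    # bool is a subclass of int, so it must be tested first;
    # otherwise True would be treated as the integer 1.
    if isinstance(include_dimensions, bool):
        return available if include_dimensions else 0
    if include_dimensions < 0:
        raise ValueError("include_dimensions must be non-negative")
    # Assumption: an integer larger than what exists is clamped, not an error.
    return min(include_dimensions, available)


print(resolve_dimension_limit(False, 50))  # 0  (no dimension columns)
print(resolve_dimension_limit(True, 50))   # 50 (all dimensions)
print(resolve_dimension_limit(2, 50))      # 2  (limited to two)
```

Under this reading, include_dimensions=2 keeps only the first two dimensions of each embedding.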
Examples
Provide axis explicitly: 0 for the observations axis, 1 for the variables axis.
    import cellestial as cl
    import scanpy as sc

    data = sc.read_h5ad("data/pbmc3k_pped.h5ad")
    frame = cl.build_frame(data, axis=0, include_dimensions=2)
    frame.head()
    shape: (5, 33)
    barcode sample n_genes_by_counts log1p_n_genes_by_counts total_counts log1p_total_counts pct_counts_in_top_50_genes pct_counts_in_top_100_genes pct_counts_in_top_200_genes pct_counts_in_top_500_genes total_counts_mt log1p_total_counts_mt pct_counts_mt total_counts_ribo log1p_total_counts_ribo pct_counts_ribo total_counts_hb log1p_total_counts_hb pct_counts_hb n_genes doublet_score predicted_doublet leiden leiden_res_0.02 leiden_res_0.50 leiden_res_2.00 cell_type_lvl1 X_PCA1 X_PCA2 X_TSNE1 X_TSNE2 X_UMAP1 X_UMAP2
    str cat i32 f64 f32 f32 f64 f64 f64 f64 f32 f32 f32 f32 f32 f32 f32 f32 f32 i64 f64 bool cat cat cat cat cat f32 f32 f32 f32 f32 f32
    "AAACCCAAGGATGGCT-1" "s1d1" 2103 7.651596 8663.0 9.066932 42.721921 59.667552 69.744892 79.348955 460.0 6.133398 5.309938 3650.0 8.202756 42.133209 17.0 2.890372 0.196237 2103 0.036113 false "0" "0" "0" "0" "Lymphocytes" -2.698756 -1.970923 34.229698 -30.396513 -0.758709 9.41935
    "AAACCCAAGGCCTAGA-1" "s1d1" 3916 8.273081 12853.0 9.461411 35.843772 44.26204 52.376877 62.763557 1790.0 7.49053 13.92671 1719.0 7.450079 13.37431 58.0 4.077538 0.451257 3912 0.183381 false "2" "1" "1" "1" "Monocytes" -5.051104 9.427238 -9.459503 65.066498 1.472651 -2.361804
    "AAACCCAAGTGAGTGC-1" "s1d1" 683 6.527958 1631.0 7.397562 56.284488 62.599632 70.386266 88.77989 581.0 6.36647 35.622318 63.0 4.158883 3.862661 13.0 2.639057 0.797057 683 0.04532 false "3" "2" "2" "2" "Erythroid" -2.535384 -1.728063 -63.691681 -6.706005 3.407673 16.263371
    "AAACCCACAAGAGGCT-1" "s1d1" 4330 8.373554 17345.0 9.761117 27.66215 38.420294 48.901701 62.023638 780.0 6.660575 4.496973 3936.0 8.278174 22.692417 44.0 3.806663 0.253675 4328 0.04532 false "5" "0" "3" "3" "Lymphocytes" -3.846453 1.094153 -40.505711 57.683064 -0.056542 3.495222
    "AAACCCACATCGTGGC-1" "s1d1" 325 5.786897 555.0 6.320768 49.90991 59.459459 77.477477 100.0 159.0 5.075174 28.648647 26.0 3.295837 4.684685 26.0 3.295837 4.684685 323 0.016181 false "6" "0" "4" "4" "Lymphocytes" 0.211629 -1.380853 -26.950605 -43.669479 -5.413215 8.537583

Providing variable_keys allows the function to infer the axis as 0.
    import cellestial as cl
    import scanpy as sc

    data = sc.read_h5ad("data/pbmc3k_pped.h5ad")
    frame = cl.build_frame(data, variable_keys=["CD14", "HBA1"], include_dimensions=2)
    frame.head()
    shape: (5, 35)
    barcode sample n_genes_by_counts log1p_n_genes_by_counts total_counts log1p_total_counts pct_counts_in_top_50_genes pct_counts_in_top_100_genes pct_counts_in_top_200_genes pct_counts_in_top_500_genes total_counts_mt log1p_total_counts_mt pct_counts_mt total_counts_ribo log1p_total_counts_ribo pct_counts_ribo total_counts_hb log1p_total_counts_hb pct_counts_hb n_genes doublet_score predicted_doublet leiden leiden_res_0.02 leiden_res_0.50 leiden_res_2.00 cell_type_lvl1 X_PCA1 X_PCA2 X_TSNE1 X_TSNE2 X_UMAP1 X_UMAP2 CD14 HBA1
    str cat i32 f64 f32 f32 f64 f64 f64 f64 f32 f32 f32 f32 f32 f32 f32 f32 f32 i64 f64 bool cat cat cat cat cat f32 f32 f32 f32 f32 f32 f32 f32
    "AAACCCAAGGATGGCT-1" "s1d1" 2103 7.651596 8663.0 9.066932 42.721921 59.667552 69.744892 79.348955 460.0 6.133398 5.309938 3650.0 8.202756 42.133209 17.0 2.890372 0.196237 2103 0.036113 false "0" "0" "0" "0" "Lymphocytes" -2.698756 -1.970923 34.229698 -30.396513 -0.758709 9.41935 0.0 0.854953
    "AAACCCAAGGCCTAGA-1" "s1d1" 3916 8.273081 12853.0 9.461411 35.843772 44.26204 52.376877 62.763557 1790.0 7.49053 13.92671 1719.0 7.450079 13.37431 58.0 4.077538 0.451257 3912 0.183381 false "2" "1" "1" "1" "Monocytes" -5.051104 9.427238 -9.459503 65.066498 1.472651 -2.361804 0.375316 1.629056
    "AAACCCAAGTGAGTGC-1" "s1d1" 683 6.527958 1631.0 7.397562 56.284488 62.599632 70.386266 88.77989 581.0 6.36647 35.622318 63.0 4.158883 3.862661 13.0 2.639057 0.797057 683 0.04532 false "3" "2" "2" "2" "Erythroid" -2.535384 -1.728063 -63.691681 -6.706005 3.407673 16.263371 0.0 1.523574
    "AAACCCACAAGAGGCT-1" "s1d1" 4330 8.373554 17345.0 9.761117 27.66215 38.420294 48.901701 62.023638 780.0 6.660575 4.496973 3936.0 8.278174 22.692417 44.0 3.806663 0.253675 4328 0.04532 false "5" "0" "3" "3" "Lymphocytes" -3.846453 1.094153 -40.505711 57.683064 -0.056542 3.495222 0.0 0.515772
    "AAACCCACATCGTGGC-1" "s1d1" 325 5.786897 555.0 6.320768 49.90991 59.459459 77.477477 100.0 159.0 5.075174 28.648647 26.0 3.295837 4.684685 26.0 3.295837 4.684685 323 0.016181 false "6" "0" "4" "4" "Lymphocytes" 0.211629 -1.380853 -26.950605 -43.669479 -5.413215 8.537583 0.0 0.0
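
The axis inference used in the second example (variable_keys given, axis omitted, frame built for axis 0) can be pictured with a small sketch. This is an illustration under stated assumptions, not cellestial's code: normalize_keys and resolve_axis are hypothetical names, and raising ValueError when neither argument is given is only a placeholder, since the documentation does not say what happens in that case. The sketch also does not model how an explicit axis=1 interacts with variable_keys.

```python
from __future__ import annotations

from collections.abc import Sequence


def normalize_keys(variable_keys: str | Sequence[str] | None) -> list[str]:
    """Turn a single key, a sequence of keys, or None into a list (hypothetical)."""
    if variable_keys is None:
        return []
    # A lone string is one key, not a sequence of characters.
    if isinstance(variable_keys, str):
        return [variable_keys]
    return list(variable_keys)


def resolve_axis(axis: int | None, variable_keys: str | Sequence[str] | None) -> int:
    """Pick the axis, inferring 0 when only variable_keys is given (hypothetical)."""
    keys = normalize_keys(variable_keys)
    if axis is None:
        if keys:
            # Variable keys imply the observations axis, as the docs state.
            return 0
        # Placeholder behavior: the source does not document this case.
        raise ValueError("provide axis or variable_keys")
    if axis not in (0, 1):
        raise ValueError("axis must be 0 or 1")
    return axis


print(resolve_axis(None, ["CD14", "HBA1"]))  # 0
print(resolve_axis(1, None))                 # 1
```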