scvelo.pp.filter_and_normalize
scvelo.pp.filter_and_normalize(data, min_counts=None, min_counts_u=None, min_cells=None, min_cells_u=None, min_shared_counts=None, min_shared_cells=None, n_top_genes=None, retain_genes=None, subset_highly_variable=True, flavor='seurat', log=True, layers_normalize=None, copy=False, **kwargs)
Filtering, normalization and log transform.
Expects non-logarithmized data. If using logarithmized data, pass log=False.
Runs the following steps:
    scv.pp.filter_genes(adata)
    scv.pp.normalize_per_cell(adata)
    if n_top_genes is not None:
        scv.pp.filter_genes_dispersion(adata)
    if log:
        scv.pp.log1p(adata)
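The step order above can be illustrated with a minimal pure-Python sketch on a toy cells x genes matrix. This is not scVelo's implementation: the function name `filter_and_normalize_sketch` is hypothetical, the normalization target (median of per-cell totals) is an assumption, and plain variance / mean stands in for the normalized dispersion that the real dispersion filter computes.

```python
import math
import statistics


def filter_and_normalize_sketch(counts, min_counts=None, n_top_genes=None, log=True):
    """Toy re-creation of the step order on a cells x genes list of lists."""
    n_genes = len(counts[0])

    # Step 1 (filter_genes): keep genes whose total count reaches min_counts.
    keep = [
        g for g in range(n_genes)
        if min_counts is None or sum(row[g] for row in counts) >= min_counts
    ]
    data = [[float(row[g]) for g in keep] for row in counts]

    # Step 2 (normalize_per_cell): rescale each cell to a common total.
    # Assumption: the target is the median of the per-cell totals.
    totals = [sum(row) for row in data]
    target = statistics.median(totals)
    data = [
        [v * target / total if total > 0 else 0.0 for v in row]
        for row, total in zip(data, totals)
    ]

    # Step 3 (filter_genes_dispersion): runs only when n_top_genes is given.
    # Assumption: plain variance / mean stands in for normalized dispersion.
    if n_top_genes is not None:
        def dispersion(g):
            column = [row[g] for row in data]
            mean = statistics.fmean(column)
            return statistics.pvariance(column) / mean if mean > 0 else 0.0

        ranked = sorted(range(len(keep)), key=dispersion, reverse=True)
        top = sorted(ranked[:n_top_genes])
        data = [[row[g] for g in top] for row in data]
        keep = [keep[g] for g in top]

    # Step 4 (log1p): skipped when log=False.
    if log:
        data = [[math.log1p(v) for v in row] for row in data]

    return data, keep


toy_counts = [
    [10, 0, 3, 1],
    [20, 0, 1, 1],
    [5, 1, 9, 0],
]
normalized, kept_genes = filter_and_normalize_sketch(
    toy_counts, min_counts=2, n_top_genes=2, log=False
)
```

The log transform comes last, so the filtering and normalization steps all operate on non-logarithmized values, which is why the function expects raw counts unless log=False is passed.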
- data: AnnData
Annotated data matrix.
- min_counts: int (default: None)
Minimum number of counts required for a gene to pass filtering (spliced).
- min_counts_u: int (default: None)
Minimum number of counts required for a gene to pass filtering (unspliced).
- min_cells: int (default: None)
Minimum number of cells in which a gene must be expressed to pass filtering (spliced).
- min_cells_u: int (default: None)
Minimum number of cells in which a gene must be expressed to pass filtering (unspliced).
- min_shared_counts: int, optional (default: None)
Minimum number of counts (both unspliced and spliced) required for a gene.
- min_shared_cells: int, optional (default: None)
Minimum number of cells in which a gene must be expressed (both unspliced and spliced).
- n_top_genes: int (default: None)
Number of genes to keep. If None, the dispersion-based gene filtering step is skipped.
- retain_genes: list, optional (default: None)
List of gene names to be retained independent of thresholds.
- subset_highly_variable: bool (default: True)
Whether to subset to the highly variable genes, or only to store the flag in .var['highly_variable'].
- flavor: {'seurat', 'cell_ranger', 'svr'}, optional (default: 'seurat')
Choose the flavor for computing normalized dispersion. If choosing 'seurat', this expects non-logarithmized data.
- log: bool (default: True)
Whether to apply the log1p transform as the final step.
- layers_normalize: list of str (default: None)
List of layers to be normalized. If set to None, the layers {'X', 'spliced', 'unspliced'} are considered for normalization, after testing whether each has already been normalized (by checking the type of its entries: int -> unprocessed, float -> processed).
- copy: bool (default: False)
Return a copy of adata instead of updating it.
- **kwargs:
Keyword arguments passed to pp.normalize_per_cell (e.g. counts_per_cell).
- Returns
Returns a copy of adata if copy=True; otherwise updates adata in place and returns None.
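The layers_normalize description above says that, when the argument is None, each candidate layer is tested for prior normalization by looking at its entries. A minimal sketch of that idea follows, assuming "int" means integer-valued entries. The helper names `is_integer_valued` and `pick_layers_to_normalize` are hypothetical and not part of scVelo, and layers are modeled as plain lists rather than AnnData matrices.

```python
def is_integer_valued(values):
    """True when every entry is a whole number, i.e. looks like raw counts."""
    return all(float(v).is_integer() for v in values)


def pick_layers_to_normalize(layers, layers_normalize=None):
    """Choose which layers to normalize (hypothetical helper).

    layers: dict mapping a layer name to a flat list of its entries.
    """
    if layers_normalize is not None:
        # An explicit list is taken at face value in this sketch.
        return list(layers_normalize)
    # Default candidates, kept only if they still look like raw counts.
    candidates = ["X", "spliced", "unspliced"]
    return [
        name for name in candidates
        if name in layers and is_integer_valued(layers[name])
    ]


toy_layers = {
    "X": [1.0, 2.0, 0.0],      # whole numbers stored as floats
    "spliced": [0.5, 1.2],     # fractional values: treated as processed
    "unspliced": [3, 0, 7],    # raw integer counts
}
selected = pick_layers_to_normalize(toy_layers)
```

Under this heuristic a layer with fractional values is skipped, so calling the function twice on the same data would not normalize it a second time.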