scvelo.pp.filter_and_normalize
scvelo.pp.filter_and_normalize(data, min_counts=None, min_counts_u=None, min_cells=None, min_cells_u=None, min_shared_counts=None, min_shared_cells=None, n_top_genes=None, retain_genes=None, subset_highly_variable=True, flavor='seurat', log=True, layers_normalize=None, copy=False, **kwargs)
Filtering, normalization and log transform.
Expects non-logarithmized data. If using logarithmized data, pass log=False.
Runs the following steps:
    scv.pp.filter_genes(adata)
    scv.pp.normalize_per_cell(adata)
    if n_top_genes is not None:
        scv.pp.filter_genes_dispersion(adata)
    if log:
        scv.pp.log1p(adata)
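The order of these steps can be illustrated on a toy count matrix. The sketch below is a plain-Python stand-in, not scvelo's implementation: the helper names (`filter_genes`, `normalize_per_cell`, `log1p_matrix`) are hypothetical, scaling each cell to the median total count is an assumption made for illustration, and the optional dispersion-based gene selection is omitted.

```python
import math
from statistics import median


def filter_genes(counts, min_counts):
    """Keep gene columns whose total count across all cells is >= min_counts."""
    n_genes = len(counts[0])
    keep = [g for g in range(n_genes)
            if sum(row[g] for row in counts) >= min_counts]
    return [[row[g] for g in keep] for row in counts], keep


def normalize_per_cell(counts):
    """Scale each cell (row) so its total equals the median total (assumed target)."""
    totals = [sum(row) for row in counts]
    target = median(totals)
    return [[v * target / t if t else 0.0 for v in row]
            for row, t in zip(counts, totals)]


def log1p_matrix(matrix):
    """Apply log(1 + x) to every entry."""
    return [[math.log1p(v) for v in row] for row in matrix]


# Rows are cells, columns are genes (raw integer counts).
raw = [
    [4, 0, 1],
    [8, 1, 1],
    [2, 0, 0],
]

filtered, kept = filter_genes(raw, min_counts=2)  # step 1: gene filtering
normalized = normalize_per_cell(filtered)         # step 2: per-cell normalization
logged = log1p_matrix(normalized)                 # step 3: log transform
```

Filtering comes first, so the per-cell totals used for normalization are computed only over the genes that survive, and the log transform is applied last to the normalized values.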
Parameters: - data : AnnData
Annotated data matrix.
- min_counts : int (default: None)
Minimum number of counts required for a gene to pass filtering (spliced).
- min_counts_u : int (default: None)
Minimum number of counts required for a gene to pass filtering (unspliced).
- min_cells : int (default: None)
Minimum number of cells in which a gene must be expressed to pass filtering (spliced).
- min_cells_u : int (default: None)
Minimum number of cells in which a gene must be expressed to pass filtering (unspliced).
- min_shared_counts : int, optional (default: None)
Minimum number of counts (both unspliced and spliced) required for a gene.
- min_shared_cells : int, optional (default: None)
Minimum number of cells in which a gene must be expressed (both unspliced and spliced).
- n_top_genes : int (default: None)
Number of genes to keep. If not None, dispersion-based filtering (filter_genes_dispersion) is applied.
- retain_genes : list, optional (default: None)
List of gene names to be retained independent of thresholds.
- subset_highly_variable : bool (default: True)
Whether to subset the data to highly variable genes or only to flag them in .var['highly_variable'].
- flavor : {'seurat', 'cell_ranger', 'svr'}, optional (default: 'seurat')
Choose the flavor for computing normalized dispersion. If choosing 'seurat', this expects non-logarithmized data.
- log : bool (default: True)
Whether to log-transform the data (log1p) after normalization.
- layers_normalize : list of str (default: None)
List of layers to be normalized. If set to None, the layers {'X', 'spliced', 'unspliced'} are considered, and each is normalized only if it does not appear to have been normalized already (judged by the type of its entries: int -> unprocessed, float -> processed).
- copy : bool (default: False)
Return a copy of adata instead of updating it.
- **kwargs
Keyword arguments passed to pp.normalize_per_cell (e.g. counts_per_cell).
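The default behaviour of layers_normalize rests on a simple type heuristic. The following minimal sketch takes the description literally; `looks_unprocessed` is a hypothetical helper, not scvelo's internal check, which may differ in detail. A layer whose entries are all of type int is treated as raw counts, and any float entry marks the layer as already processed.

```python
def looks_unprocessed(values):
    """Return True if every entry is an int, i.e. the layer looks like raw counts."""
    return all(isinstance(v, int) for v in values)


# Toy layers, each flattened to a list of entries (illustrative values only).
layers = {
    "X": [3, 0, 7],                 # integer counts: unprocessed
    "spliced": [2, 0, 5],           # integer counts: unprocessed
    "unspliced": [0.69, 0.0, 1.1],  # floats: treated as already processed
}

# Only layers that still look like raw counts are selected for normalization.
to_normalize = [name for name, values in layers.items()
                if looks_unprocessed(values)]
```

Passing an explicit list in layers_normalize bypasses this guess, which matters when raw counts happen to be stored in a form the heuristic would misread.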
Returns: Returns or updates adata depending on copy.
- data :
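A typical call might look as follows. This is a usage sketch: the wrapper `preprocess` and its threshold values are illustrative assumptions, not recommendations from this page. Only the function name and its parameters come from the signature above. scvelo is imported inside the function, so defining the wrapper has no import-time dependency.

```python
def preprocess(adata, min_shared_counts=20, n_top_genes=2000):
    """Filter, normalize and log-transform `adata`, then return it."""
    import scvelo as scv  # requires scvelo to be installed

    # With the default copy=False, adata is updated rather than copied.
    scv.pp.filter_and_normalize(
        adata,
        min_shared_counts=min_shared_counts,
        n_top_genes=n_top_genes,
    )
    return adata
```

If the input has already been log-transformed, log=False should be passed as well, since the function expects non-logarithmized data by default.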