sainsc.LazyKDE.assign_celltype

sainsc.LazyKDE.assign_celltype(signatures, *, log=False, min_transcripts=None, chunk=(500, 500))

Calculate the cosine similarity with known cell-type signatures.

For each bead, calculate the cosine similarity with a set of cell-type signatures. The cell type with the highest score is assigned to the corresponding bead.
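The scoring rule described above can be illustrated with a minimal, library-independent sketch. This is not the sainsc implementation: the helper names `cosine_similarity` and `assign_best`, the toy signatures, and the choice to score all-zero vectors as 0.0 are assumptions made only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: an empty profile gets score 0.0 rather than raising.
        return 0.0
    return dot / (norm_a * norm_b)


def assign_best(expression, signatures):
    """Return the best-scoring signature name and all scores for one profile."""
    scores = {
        name: cosine_similarity(expression, signature)
        for name, signature in signatures.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical toy data: three genes, two cell types.
signatures = {
    "neuron": [5.0, 0.0, 1.0],
    "astrocyte": [0.0, 4.0, 1.0],
}
bead = [2.0, 0.1, 0.5]

best, scores = assign_best(bead, signatures)
print(best, scores)
```

Because cosine similarity depends only on the direction of the expression vector, a bead with few transcripts and a bead with many can receive the same score if their relative gene proportions match.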

Parameters:
  • signatures (DataFrame) – DataFrame of cell-type signatures. Columns are cell types and the index contains genes.

  • log (bool) – Whether to log transform the KDE when calculating the cosine similarity. This is useful if the gene signatures are derived from log-transformed data.

  • min_transcripts (int | None) – Minimum number of transcripts a chunk must contain to be processed. Can be used to skip chunks that contain only a few “noisy” transcripts.

  • chunk (tuple[int, int]) – Size of the chunks for processing. Larger chunks require more memory but involve less duplicated computation.
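The memory versus duplicated-computation trade-off for `chunk` can be made concrete with a rough estimate. The sketch below assumes that each chunk is padded by half the kernel size on every side, so that the padded border is computed again by neighbouring chunks. That padding scheme and the helper `padding_overhead` are assumptions for illustration, not a description of the sainsc internals.

```python
def padding_overhead(chunk, kernel_shape):
    """Fraction of extra area processed when a chunk is padded by the kernel radius.

    Assumes (hypothetically) a padding of kernel_size // 2 on each side.
    """
    pad_rows = kernel_shape[0] // 2
    pad_cols = kernel_shape[1] // 2
    padded_area = (chunk[0] + 2 * pad_rows) * (chunk[1] + 2 * pad_cols)
    return padded_area / (chunk[0] * chunk[1]) - 1.0


kernel_shape = (21, 21)  # hypothetical kernel size
small = padding_overhead((100, 100), kernel_shape)
large = padding_overhead((500, 500), kernel_shape)
print(f"100x100 chunks: {small:.1%} extra area")
print(f"500x500 chunks: {large:.1%} extra area")
```

Under this assumption, the relative overhead shrinks as the chunk grows, while the memory needed to hold a single chunk grows with its area.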

Raises:
  • ValueError – If not all genes of the signatures are available.

  • ValueError – If self.kernel is not set.

  • ValueError – If chunk is smaller than the shape of self.kernel.
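The three error conditions above can be checked up front before starting a long run. The following is a minimal sketch using plain Python values. The helper `check_inputs` is hypothetical, and treating a chunk as "smaller" when either dimension is below the corresponding kernel dimension is an assumption about the exact comparison.

```python
def check_inputs(signature_genes, available_genes, chunk, kernel_shape):
    """Return a list of human-readable problems; an empty list means no issue found.

    kernel_shape is None when no kernel has been set.
    """
    problems = []

    # Every gene used in the signatures must be present in the data.
    missing = sorted(set(signature_genes) - set(available_genes))
    if missing:
        problems.append(f"genes missing from the data: {missing}")

    if kernel_shape is None:
        problems.append("kernel is not set")
    elif chunk[0] < kernel_shape[0] or chunk[1] < kernel_shape[1]:
        # Assumption: the check applies to each dimension separately.
        problems.append(f"chunk {chunk} is smaller than kernel shape {kernel_shape}")

    return problems


# Hypothetical example values.
ok = check_inputs(["A", "B"], ["A", "B", "C"], (500, 500), (21, 21))
bad = check_inputs(["A", "X"], ["A", "B", "C"], (10, 500), (21, 21))
print(ok)
print(bad)
```

In practice, restricting the signatures to the genes present in the data before calling the method is one way to avoid the first error, at the cost of scoring against a reduced gene set.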