# stands
Spatial Transcriptomics ANomaly Detection and Subtyping (STANDS) is an innovative computational method to detect anomalous tissue domains from multi-sample spatial transcriptomics (ST) data and reveal their biologically heterogeneous subdomains, which can be individual-specific or shared by all individuals.
Detecting and characterizing anomalous anatomic regions in tissue samples from affected individuals is crucial for clinical and biomedical research. This procedure, which we refer to as Detection and Dissection of Anomalous Tissue Domains (DDATD), serves as the first and foremost step in the analysis of clinical tissues because it reveals factors, such as pathogenic or differentiated cell types, associated with the development of diseases or biological traits. Traditionally, DDATD has relied on either laborious expert visual inspection or computer vision algorithms applied to histology images. ST provides an unprecedented opportunity to enhance DDATD by incorporating spatial gene expression information. However, to the best of our knowledge, no existing methods can perform de novo DDATD from ST datasets.
STANDS is built on state-of-the-art generative models for de novo DDATD from multi-sample ST, integrating multimodal information including spatial gene expression, histology images, and single-cell gene expression. STANDS concurrently fulfills DDATD's three sequential core tasks: detecting, aligning, and subtyping anomalous tissue domains across multiple samples. STANDS first integrates and harnesses multimodal information from spatial transcriptomics and associated histology images to pinpoint anomalous tissue regions across multiple target datasets. Next, STANDS aligns the anomalies identified in the target datasets in a common data space via style-transfer learning to mitigate their non-biological variations. Finally, STANDS dissects the aligned anomalies into biologically heterogeneous subtypes that are either common or unique to the target datasets. STANDS combines these processes into a unified framework that maintains methodological coherence, which leads to its unparalleled performance in DDATD from multi-sample ST.
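The modules and classes documented below compose into this three-stage workflow. Below is a minimal end-to-end sketch, assuming the package-level API on this page; the call conventions of `fit` and `predict` (what they take and return) are not fully specified in these docs, so those details are assumptions:

```python
import scanpy as sc
import stands

# Hypothetical inputs: a healthy reference sample and a diseased target.
ref = sc.read_h5ad("reference.h5ad")
tgt = sc.read_h5ad("target.h5ad")

# Build graph representations of both samples (see read_cross below).
graphs = stands.read_cross(ref, tgt, return_type="graph")

# Stage 1: detect anomalous tissue domains.
detector = stands.AnomalyDetect(n_epochs=10, random_state=0)
detector.fit(graphs)                  # assumption: trains on the reference graph
anomalies = detector.predict(graphs)  # assumption: flags anomalous target spots

# Stage 2: align detected anomalies across samples via style transfer.
aligner = stands.BatchAlign(n_epochs=10)
aligner.fit(graphs)  # documented only as "Remove batch effects"

# Stage 3 (subtyping the aligned anomalies) is part of STANDS but is
# not covered by the classes documented on this page.
```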
Modules:

| Name | Description |
|---|---|
| `read` | Read a single spatial dataset and preprocess it if required. |
| `read_cross` | Read spatial data from two sources and preprocess them if required. |
| `read_multi` | Read multiple spatial datasets and preprocess them if required. |
| `pretrain` | Pretrain STANDS using spatial data. |
| `evaluate` | Calculate various metrics (including SGD). |
## AnomalyDetect

```python
AnomalyDetect(
    n_epochs: int = 10,
    batch_size: int = 128,
    learning_rate: float = 0.0003,
    n_dis: int = 2,
    GPU: Union[bool, str] = True,
    random_state: Optional[int] = None,
    weight: Optional[Dict[str, float]] = None,
)
```
Source code in src\stands\anomaly.py
### UpdateD

Update the discriminator.
Source code in src\stands\anomaly.py
### UpdateG

Update the generator.
Source code in src\stands\anomaly.py
### fit

Train STANDS on the reference graph.
Source code in src\stands\anomaly.py
### init_weight

Initialize the pretrained weights and the memory block.
Source code in src\stands\anomaly.py
### predict

Detect anomalous spots on the target graph.
Source code in src\stands\anomaly.py
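A usage sketch for the class, assuming `fit` consumes the graph data built by the reader functions below and `predict` does the same for the target sample; only the constructor signature is documented verbatim above, so the method call conventions here are assumptions:

```python
import scanpy as sc
import stands

# Hypothetical file names; read_cross builds reference/target graphs.
ref = sc.read_h5ad("reference.h5ad")
tgt = sc.read_h5ad("target.h5ad")
graphs = stands.read_cross(ref, tgt, return_type="graph")

model = stands.AnomalyDetect(
    n_epochs=10,          # documented defaults
    batch_size=128,
    learning_rate=0.0003,
    n_dis=2,              # assumed: discriminator updates per generator update
    GPU=True,             # assumed: also accepts a device string, e.g. "cuda:0"
    random_state=42,
)
model.fit(graphs)               # "Train STANDS on the reference graph"
labels = model.predict(graphs)  # "Detect anomalous spots on the target graph"
```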
## BatchAlign

```python
BatchAlign(
    n_epochs: int = 10,
    batch_size: int = 128,
    learning_rate: float = 0.0003,
    n_dis: int = 3,
    GPU: Union[bool, str] = True,
    random_state: Optional[int] = None,
    weight: Optional[Dict[str, float]] = None,
)
```
Source code in src\stands\align.py
### UpdateD

Update the discriminator.
Source code in src\stands\align.py
### UpdateG

Update the generator.
Source code in src\stands\align.py
### fit

Remove batch effects.
Source code in src\stands\align.py
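A usage sketch mirroring `AnomalyDetect`; the constructor signature is documented above, while the input to `fit` (assumed to be the same graph data produced by the readers) is an assumption:

```python
import scanpy as sc
import stands

# Hypothetical multi-sample input merged into one graph (see read_multi below).
adatas = [sc.read_h5ad(f) for f in ("sample1.h5ad", "sample2.h5ad")]
graph = stands.read_multi(adatas, return_type="graph")

aligner = stands.BatchAlign(
    n_epochs=10,
    batch_size=128,
    learning_rate=0.0003,
    n_dis=3,    # note: default differs from AnomalyDetect (3 vs. 2)
    GPU=True,
)
aligner.fit(graph)  # documented only as "Remove batch effects"
```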
## read_cross

```python
read_cross(
    ref: AnnData,
    tgt: AnnData,
    spa_key: str = "spatial",
    preprocess: bool = True,
    n_genes: int = 3000,
    patch_size: Optional[int] = None,
    n_neighbors: int = 4,
    augment: bool = True,
    return_type: Literal["anndata", "graph"] = "graph",
)
```
Read spatial data from two sources and preprocess them if required. The data are transformed into reference and target graphs.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `ref` | `AnnData` | Reference AnnData object. | *required* |
| `tgt` | `AnnData` | Target AnnData object. | *required* |
| `spa_key` | `str` | Key for spatial information in the AnnData objects. | `'spatial'` |
| `preprocess` | `bool` | Whether to perform data preprocessing. | `True` |
| `n_genes` | `int` | Number of genes for feature selection. | `3000` |
| `patch_size` | `Optional[int]` | Patch size for H&E images. | `None` |
| `n_neighbors` | `int` | Number of neighbors for building the spatial graph. | `4` |
| `augment` | `bool` | Whether to use data augmentation. | `True` |
| `return_type` | `Literal['anndata', 'graph']` | Type of data to return. | `'graph'` |

Returns:

| Type | Description |
|---|---|
| `Union[Tuple, Dict]` | Depending on `return_type`, returns either a tuple of AnnData objects or a dictionary of graph-related data. |
Source code in src\stands\_read.py
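A brief usage sketch for both return types; file names are hypothetical placeholders:

```python
import scanpy as sc
import stands

ref = sc.read_h5ad("reference.h5ad")
tgt = sc.read_h5ad("target.h5ad")

# Default ('graph'): a dictionary of graph-related data for model training.
graphs = stands.read_cross(ref, tgt, n_genes=3000, n_neighbors=4)

# 'anndata': the preprocessed reference/target AnnData pair.
ref_pp, tgt_pp = stands.read_cross(ref, tgt, return_type="anndata")
```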
## read_multi

```python
read_multi(
    adata_list: List[AnnData],
    patch_size: Optional[int] = None,
    gene_list: Optional[List[str]] = None,
    preprocess: bool = True,
    n_genes: int = 3000,
    n_neighbors: int = 4,
    augment: bool = True,
    spa_key: str = "spatial",
    return_type: Literal["anndata", "graph"] = "graph",
)
```
Read multiple spatial datasets and preprocess them if required. All datasets are merged into a single graph.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `adata_list` | `List[AnnData]` | List of AnnData objects. | *required* |
| `patch_size` | `Optional[int]` | Patch size for H&E images. | `None` |
| `gene_list` | `Optional[List[str]]` | List of pre-selected genes. | `None` |
| `preprocess` | `bool` | Whether to perform data preprocessing. | `True` |
| `n_genes` | `int` | Number of genes for feature selection. | `3000` |
| `n_neighbors` | `int` | Number of neighbors for building the spatial graph. | `4` |
| `augment` | `bool` | Whether to use data augmentation. | `True` |
| `spa_key` | `str` | Key for spatial information in the AnnData objects. | `'spatial'` |
| `return_type` | `Literal['anndata', 'graph']` | Type of data to return. | `'graph'` |

Returns:

| Type | Description |
|---|---|
| `Union[List, Dict]` | Depending on `return_type`, returns either a list of AnnData objects or a dictionary of graph-related data. |
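A brief usage sketch for both return types; file names are hypothetical placeholders:

```python
import scanpy as sc
import stands

# Multiple target samples, e.g. one slide per individual.
adatas = [sc.read_h5ad(f) for f in ("donor1.h5ad", "donor2.h5ad", "donor3.h5ad")]

# Default ('graph'): all datasets are merged into a single graph dictionary.
graph = stands.read_multi(adatas, n_genes=3000, n_neighbors=4)

# 'anndata': the list of preprocessed AnnData objects.
adata_list = stands.read_multi(adatas, return_type="anndata")
```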