Note

See this notebook for an explanation with examples of the different types of representations in torchmil.

Spatial and sequential representation

In torchmil, bags can be represented in two ways: sequential and spatial.

In the sequential representation, bag['X'] is a tensor of shape (bag_size, dim), with one row per instance. This is the most common representation in MIL.

When the bag has some spatial structure, the sequential representation can be coupled with a graph using an adjacency matrix or with the coordinates of the instances. These are stored as bag['adj'] (of shape (bag_size, bag_size)) and bag['coords'] (of shape (bag_size, coords_dim)), respectively.
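
As an illustration, the three entries described above can be assembled as follows. This is a minimal sketch and not torchmil code: a plain Python dict stands in for the bag, the coordinate values are made up, and the adjacency rule (4-connectivity on an integer grid) is an assumption, since the text does not prescribe how bag['adj'] is built.

```python
import torch

bag_size, dim = 5, 8

# Instance coordinates on a 2D integer grid (illustrative values).
coords = torch.tensor([[0, 0], [0, 1], [1, 0], [1, 1], [2, 1]])

# Assumed adjacency rule: two instances are neighbours when their
# Manhattan distance is exactly 1 (4-connectivity).
dist = (coords[:, None, :] - coords[None, :, :]).abs().sum(dim=-1)
adj = (dist == 1).float()

# A plain dict used as a stand-in for a bag.
bag = {
    "X": torch.randn(bag_size, dim),  # (bag_size, dim)
    "adj": adj,                       # (bag_size, bag_size)
    "coords": coords,                 # (bag_size, coords_dim)
}

for key, value in bag.items():
    print(key, tuple(value.shape))
```

Other adjacency rules (for example 8-connectivity or distance-weighted edges) fit the same (bag_size, bag_size) shape.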

Alternatively, the spatial representation can be used. In this case, bag['X'] is a tensor of shape (coord1, ..., coordN, dim), where N=coords_dim is the number of dimensions of the space.

In torchmil, you can convert from one representation to the other using the functions torchmil.data.seq_to_spatial and torchmil.data.spatial_to_seq. Both functions need the coordinates of the instances in the bag, stored as bag['coords'].

Example: Whole Slide Images

Due to their large resolution, Whole Slide Images (WSIs) are usually represented as bags of patches. Each patch is an image, from which a feature vector is typically extracted. The spatial representation of a WSI has shape (height, width, feat_dim), while the sequential representation has shape (bag_size, feat_dim). The coordinates correspond to the positions of the patches in the WSI.
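
The two shapes can be compared on a toy slide. The sketch below uses plain PyTorch indexing rather than torchmil functions. The patch positions are hypothetical, and two choices are assumptions: the grid extent is taken from the largest coordinate, and grid cells without a patch are left at zero.

```python
import torch

feat_dim = 4

# (row, col) grid position of each patch kept from the slide (hypothetical values).
coords = torch.tensor([[0, 0], [0, 2], [1, 1], [2, 0], [2, 3]])
bag_size = coords.shape[0]

# Sequential representation: one feature vector per patch.
X_seq = torch.randn(bag_size, feat_dim)  # (bag_size, feat_dim)

# Assumption: the grid extent is the largest coordinate plus one.
height, width = (coords.max(dim=0).values + 1).tolist()

# Assumption: cells without a patch stay at zero.
X_spatial = torch.zeros(height, width, feat_dim)
X_spatial[coords[:, 0], coords[:, 1]] = X_seq  # (height, width, feat_dim)

print(X_seq.shape)      # torch.Size([5, 4])
print(X_spatial.shape)  # torch.Size([3, 4, 4])
```

Only 5 of the 12 grid cells hold a patch here, which is why the sequential form is usually the more compact of the two.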

SETMIL is an example of a model that uses the spatial representation of a WSI.


torchmil.data.seq_to_spatial(X, coords)

Computes the spatial representation of a bag given the sequential representation and the coordinates.

Given the input tensor X of shape (batch_size, bag_size, dim) and the coordinates coords of shape (batch_size, bag_size, n), this function returns the spatial representation X_esp of shape (batch_size, coord1, coord2, ..., coordn, dim).

In this representation, the coordinates index the elements of the spatial tensor: X_esp[batch, i1, i2, ..., in, :] = X[batch, idx, :] where (i1, i2, ..., in) = coords[batch, idx].

Parameters:

  • X (Tensor) –

    Sequential representation of shape (batch_size, bag_size, dim).

  • coords (Tensor) –

    Coordinates of shape (batch_size, bag_size, n).

Returns:

  • X_esp (Tensor) –

    Spatial representation of shape (batch_size, coord1, coord2, ..., coordn, dim).
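
The indexing relation above can be reproduced with plain PyTorch advanced indexing. The function below is a hypothetical re-implementation for illustration (the name seq_to_spatial_sketch is not part of torchmil), and it makes no claim about how the library computes the result. It assumes non-negative integer coordinates, sizes each spatial axis from the largest coordinate in the batch, and leaves unvisited positions at zero.

```python
import torch


def seq_to_spatial_sketch(X: torch.Tensor, coords: torch.Tensor) -> torch.Tensor:
    """Illustrative sketch, not the torchmil implementation."""
    batch_size, bag_size, dim = X.shape
    n = coords.shape[-1]
    coords = coords.long()

    # Assumption: each spatial axis spans 0..max coordinate over the whole batch.
    sizes = (coords.reshape(-1, n).max(dim=0).values + 1).tolist()

    # Assumption: positions that no instance maps to stay at zero.
    X_esp = torch.zeros(batch_size, *sizes, dim, dtype=X.dtype, device=X.device)

    # Batch index of every instance, shape (batch_size, bag_size).
    batch_idx = torch.arange(batch_size, device=X.device)[:, None].expand(batch_size, bag_size)

    # X_esp[batch, i1, ..., in, :] = X[batch, idx, :]
    index = (batch_idx,) + tuple(coords[..., k] for k in range(n))
    X_esp[index] = X
    return X_esp


X = torch.arange(12, dtype=torch.float32).reshape(2, 3, 2)  # (batch_size=2, bag_size=3, dim=2)
coords = torch.tensor([[[0, 0], [0, 1], [1, 1]],
                       [[1, 0], [0, 0], [1, 1]]])           # (2, 3, n=2)
X_esp = seq_to_spatial_sketch(X, coords)
print(X_esp.shape)  # torch.Size([2, 2, 2, 2])
```

If two instances in the same bag share a coordinate, only one of them survives in this sketch, so it assumes coordinates are unique within a bag.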


torchmil.data.spatial_to_seq(X_esp, coords)

Computes the sequential representation of a bag given the spatial representation and the coordinates.

Given the spatial tensor X_esp of shape (batch_size, coord1, coord2, ..., coordn, dim) and the coordinates coords of shape (batch_size, bag_size, n), this function returns the sequential representation X_seq of shape (batch_size, bag_size, dim).

Each element of the sequential representation is read from the spatial tensor at the position given by its coordinates: X_seq[batch, idx, :] = X_esp[batch, i1, i2, ..., in, :] where (i1, i2, ..., in) = coords[batch, idx].

Parameters:

  • X_esp (Tensor) –

    Spatial representation of shape (batch_size, coord1, coord2, ..., coordn, dim).

  • coords (Tensor) –

    Coordinates of shape (batch_size, bag_size, n).

Returns:

  • X_seq (Tensor) –

    Sequential representation of shape (batch_size, bag_size, dim).
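
The gather described above can likewise be written with plain PyTorch indexing. As before, spatial_to_seq_sketch is a hypothetical name used only for illustration, and the code is a sketch of the stated relation under the assumption of in-range integer coordinates, not the library's implementation.

```python
import torch


def spatial_to_seq_sketch(X_esp: torch.Tensor, coords: torch.Tensor) -> torch.Tensor:
    """Illustrative sketch, not the torchmil implementation."""
    batch_size, bag_size, n = coords.shape
    coords = coords.long()

    # Batch index of every instance, shape (batch_size, bag_size).
    batch_idx = torch.arange(batch_size, device=X_esp.device)[:, None].expand(batch_size, bag_size)

    # X_seq[batch, idx, :] = X_esp[batch, i1, ..., in, :]
    index = (batch_idx,) + tuple(coords[..., k] for k in range(n))
    return X_esp[index]  # (batch_size, bag_size, dim)


X_esp = torch.randn(2, 3, 4, 5)  # (batch_size, coord1, coord2, dim)
coords = torch.tensor([[[0, 0], [2, 3], [1, 1]],
                       [[1, 2], [0, 3], [2, 0]]])  # (batch_size, bag_size, n)
X_seq = spatial_to_seq_sketch(X_esp, coords)
print(X_seq.shape)  # torch.Size([2, 3, 5])
```

Because the result only reads the positions listed in coords, any spatial positions not referenced there are simply dropped.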