Multi-Concept Standard dataset

torchmil.datasets.MCStandardMILDataset

Bases: Dataset

Multi-Concept Standard MIL Dataset. Implements Algorithm 2 from the paper "Reproducibility in Multiple Instance Learning: A Case For Algorithmic Unit Tests".

__init__(D, num_bags, pos_class_prob=0.5, train=True, seed=0)

Parameters:

  • D (int) –

    Dimensionality of the data.

  • num_bags (int) –

    Number of bags in the dataset.

  • pos_class_prob (float, default: 0.5 ) –

    Probability of a bag being positive.

  • train (bool, default: True ) –

    If True, create the training dataset; otherwise, create the test dataset.

  • seed (int, default: 0 ) –

    Seed for the random number generator.

__getitem__(index)

Parameters:

  • index (int) –

    Index of the bag to retrieve.

Returns:

  • bag_dict ( TensorDict ) –

    Dictionary containing the following keys:

    • X: Bag features of shape (bag_size, feat_dim).
    • Y: Label of the bag.
    • y_inst: Instance labels of the bag.