openfl.federated.data.federated_data.FederatedDataSet

class openfl.federated.data.federated_data.FederatedDataSet(X_train, y_train, X_valid, y_valid, batch_size=1, num_classes=None, train_splitter=None, valid_splitter=None)

Bases: PyTorchDataLoader

A data loader class that represents a federated dataset held as in-memory NumPy arrays.

Parameters:

train_splitter (NumPyDataSplitter)
valid_splitter (NumPyDataSplitter)

train_splitter

An object that splits the training data.

Type: NumPyDataSplitter

valid_splitter

An object that splits the validation data.

Type: NumPyDataSplitter

__init__(X_train, y_train, X_valid, y_valid, batch_size=1, num_classes=None, train_splitter=None, valid_splitter=None)

Initializes the FederatedDataSet object.

Parameters:

X_train (np.array) – The training features.
y_train (np.array) – The training labels.
X_valid (np.array) – The validation features.
y_valid (np.array) – The validation labels.
batch_size (int, optional) – The batch size for the data loader. Defaults to 1.
num_classes (int, optional) – The number of classes the model will be trained on. Defaults to None.
train_splitter (NumPyDataSplitter, optional) – The object that splits the training data. Defaults to None.
valid_splitter (NumPyDataSplitter, optional) – The object that splits the validation data. Defaults to None.

Methods

`__init__`(X_train, y_train, X_valid, y_valid) – Initializes the FederatedDataSet object.
`get_feature_shape`() – Returns the shape of an example feature array.
`get_infer_loader`() – Returns the data loader for inferencing data.
`get_train_data_size`() – Returns the total number of training samples.
`get_train_loader`([batch_size, num_batches]) – Returns the data loader for the training data.
`get_valid_data_size`() – Returns the total number of validation samples.
`get_valid_loader`([batch_size]) – Returns the data loader for the validation data.
`split`(num_collaborators) – Splits the dataset into equal parts for each collaborator and returns a list of FederatedDataSet objects.

Attributes

`train_splitter`
`valid_splitter`

get_feature_shape()

Returns the shape of an example feature array.

Returns: The shape of an example feature array.
Return type: tuple
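
As an illustration only, the shape of "an example feature array" can be read as the shape of a single sample, with the leading sample axis removed. The sketch below assumes the shape is taken from the first training sample; that is an assumption about the behaviour, not a statement of how OpenFL implements it, and the array sizes are made up.

```python
import numpy as np

# Hypothetical training features: 100 samples, each a 28x28 array.
X_train = np.zeros((100, 28, 28), dtype=np.float32)

# Assumption: the feature shape is the shape of one sample,
# i.e. the array shape without the leading sample axis.
feature_shape = X_train[0].shape

print(feature_shape)
```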

get_infer_loader()

Returns the data loader for inferencing data.

Raises: NotImplementedError – This method must be implemented by a child class.

get_train_data_size()

Returns the total number of training samples.

Returns: The total number of training samples.
Return type: int

get_train_loader(batch_size=None, num_batches=None)

Returns the data loader for the training data.

Parameters:

batch_size (int, optional) – The batch size for the data loader (default is None).
num_batches (int, optional) – The number of batches for the data loader (default is None).

Returns: The DataLoader object for the training data.
Return type: DataLoader
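
To make the interaction between `batch_size` and `num_batches` concrete, here is a minimal sketch of a shuffled batch generator over in-memory NumPy arrays. `batch_generator` and its `seed` argument are hypothetical names introduced for illustration. The sketch assumes that omitting `num_batches` yields enough batches to cover every sample once; the actual loader may behave differently.

```python
import math

import numpy as np


def batch_generator(X, y, batch_size, num_batches=None, seed=0):
    """Yield (features, labels) batches in a shuffled order.

    Hypothetical helper, not part of OpenFL.
    """
    rng = np.random.default_rng(seed)
    idxs = rng.permutation(len(X))
    # Assumption: with no explicit limit, cover the whole array once.
    if num_batches is None:
        num_batches = math.ceil(len(X) / batch_size)
    for i in range(num_batches):
        batch = idxs[i * batch_size:(i + 1) * batch_size]
        yield X[batch], y[batch]


X = np.arange(20).reshape(10, 2)  # 10 samples, 2 features each
y = np.arange(10)

# Full pass: the last batch is smaller when 10 is not divisible by 4.
batch_sizes = [len(xb) for xb, _ in batch_generator(X, y, batch_size=4)]

# Capped pass: only the first two batches are produced.
limited = [len(xb) for xb, _ in batch_generator(X, y, batch_size=4, num_batches=2)]
```

In this sketch, `num_batches` caps the number of batches per pass, so a caller can train on part of the data without changing `batch_size`.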

get_valid_data_size()

Returns the total number of validation samples.

Returns: The total number of validation samples.
Return type: int

get_valid_loader(batch_size=None)

Returns the data loader for the validation data.

Parameters: batch_size (int, optional) – The batch size for the data loader (default is None).
Returns: The DataLoader object for the validation data.
Return type: DataLoader

split(num_collaborators)

Splits the dataset into equal parts for each collaborator and returns a list of FederatedDataSet objects.

Parameters:

num_collaborators (int) – The number of collaborators to split the dataset between.

Returns: A list of FederatedDataSet objects, each representing a slice of the dataset for one collaborator.
Return type: list
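
The idea of splitting into equal parts can be sketched with plain NumPy. This is an illustration, not OpenFL's implementation: `split_equally` is a hypothetical helper, and it assumes contiguous shards produced by `np.array_split`. The configured `train_splitter` and `valid_splitter` may shuffle or partition the data differently.

```python
import numpy as np


def split_equally(X, y, num_collaborators):
    """Return one (X_shard, y_shard) pair per collaborator.

    Hypothetical helper, not part of OpenFL.
    """
    # np.array_split accepts sizes that do not divide evenly:
    # the leading shards each receive one extra sample.
    idx_shards = np.array_split(np.arange(len(X)), num_collaborators)
    return [(X[idx], y[idx]) for idx in idx_shards]


X_train = np.arange(20).reshape(10, 2)  # 10 samples, 2 features each
y_train = np.arange(10)

shards = split_equally(X_train, y_train, num_collaborators=3)
shard_sizes = [len(xs) for xs, _ in shards]
```

Each sample lands in exactly one shard, so the shard sizes always sum to the size of the original dataset.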
