openfl.utilities.data_splitters.numpy.DirichletNumPyDataSplitter
- class openfl.utilities.data_splitters.numpy.DirichletNumPyDataSplitter(alpha=0.5, min_samples_per_col=10, seed=0)
Bases: NumPyDataSplitter
Class for splitting NumPy arrays of data according to a Dirichlet distribution.
Repeatedly draws random integer split points from a Dirichlet distribution until the smallest subset is at least as large as the specified threshold. This behavior is a parametrized version of the non-i.i.d. split in the FedMA algorithm. Original source: https://github.com/IBM/FedMA/blob/master/utils.py#L96
- Parameters:
alpha (float, optional) – Dirichlet distribution parameter. Defaults to 0.5.
min_samples_per_col (int, optional) – Minimum number of samples per collaborator. Defaults to 10.
seed (int, optional) – Random number generator seed. Defaults to 0.
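The redraw-until-large-enough behavior described above can be illustrated with a minimal NumPy-only sketch. It is not the OpenFL implementation: the function name `dirichlet_split`, its return format (one list of sample indices per collaborator), the per-class loop, and the use of `numpy.random.default_rng` are assumptions made for illustration, loosely following the FedMA-style split referenced above.

```python
import numpy as np


def dirichlet_split(labels, num_collaborators, alpha=0.5,
                    min_samples_per_col=10, seed=0):
    """Hypothetical helper: return one list of sample indices per collaborator."""
    labels = np.asarray(labels)
    n = len(labels)
    if num_collaborators * min_samples_per_col > n:
        # Without this guard the redraw loop below could never terminate.
        raise ValueError("not enough samples to satisfy min_samples_per_col")

    rng = np.random.default_rng(seed)
    min_size = 0
    # Redraw the whole partition until every collaborator has enough samples.
    while min_size < min_samples_per_col:
        idx_batch = [[] for _ in range(num_collaborators)]
        for k in np.unique(labels):
            idx_k = np.flatnonzero(labels == k)
            rng.shuffle(idx_k)
            # Share of class k for each collaborator, drawn from Dirichlet(alpha).
            proportions = rng.dirichlet(np.full(num_collaborators, alpha))
            # Collaborators already holding a fair share get none of this class.
            not_full = np.array([len(b) < n / num_collaborators for b in idx_batch])
            proportions = proportions * not_full
            proportions = proportions / proportions.sum()
            # Convert cumulative shares into integer cut points within idx_k.
            cuts = (np.cumsum(proportions) * len(idx_k)).astype(int)[:-1]
            for batch, part in zip(idx_batch, np.split(idx_k, cuts)):
                batch.extend(part.tolist())
        min_size = min(len(b) for b in idx_batch)
    return idx_batch


# Example: 1000 samples across 10 classes, split among 4 collaborators.
labels = np.repeat(np.arange(10), 100)
parts = dirichlet_split(labels, num_collaborators=4)
print([len(p) for p in parts])
```

In this sketch, smaller values of `alpha` concentrate each class on fewer collaborators, giving a more strongly non-i.i.d. split, while larger values push the shares toward uniform.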
Methods
- split(data, num_collaborators)
Split the data.