Benchmarking Results for MPI-based federated learning
The latest benchmark experimental results are available at:
https://app.wandb.ai/automl/fedml/reports/FedML-Benchmark-Experimental-Results--VmlldzoxODE2NTU

The FedML white paper (https://arxiv.org/pdf/2007.13518.pdf) also summarizes the dataset list and related benchmarks. We take our hyperparameters from papers published at top-tier ML conferences and reproduce their results. The source of each reference hyperparameter is listed after the corresponding table below.
Linear Models
Data | Model | Alg | Partition | #C | #C_p | bs | c_opt | lr | e | #R | acc |
---|---|---|---|---|---|---|---|---|---|---|---|
MNIST | LR | FedAvg | Power Law | 1000 | 10 | 10 | SGD | 0.03 | 1 | >100 | >75 |
Federated EMNIST | LR | FedAvg | Power Law | 200 | 10 | 10 | SGD | 0.003 | 1 | >200 | 10~40 |
Synthetic(α,β) | LR | FedAvg | Power Law | 30 | 10 | 10 | SGD | 0.01 | 1 | >200 | >60 |
Note: #C stands for client_num_in_total; #C_p stands for client_num_per_round; bs = batch_size; c_opt = client optimizer; e = epochs; #R = number of rounds; acc = accuracy. For Synthetic(α,β), (α,β) is one of (0,0), (0.5,0.5), or (1,1).
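To make the abbreviations concrete, the MNIST row can be written out as a configuration object. This is only a minimal sketch: the field names follow the full names given in the note, but `BenchmarkConfig` and `sample_clients` are hypothetical helpers for illustration and are not part of the FedML API. The table lists the number of rounds as a lower bound (>100), so the value 100 here is an assumed placeholder.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkConfig:
    """Hypothetical container for one table row (not a FedML class)."""
    dataset: str
    model: str
    client_num_in_total: int   # "#C" in the table
    client_num_per_round: int  # "#C_p" in the table
    batch_size: int            # "bs"
    client_optimizer: str      # "c_opt"
    lr: float
    epochs: int                # "e", local epochs per round
    comm_round: int            # "#R"; the table gives a lower bound only


def sample_clients(cfg: BenchmarkConfig, round_idx: int, seed: int = 0) -> list:
    """Pick client_num_per_round distinct client ids for one round.

    Uniform sampling without replacement is an assumption made for this
    sketch; the benchmark code may select clients differently.
    """
    rng = random.Random(seed + round_idx)
    return sorted(rng.sample(range(cfg.client_num_in_total),
                             cfg.client_num_per_round))


# MNIST / LR / FedAvg row; comm_round=100 is an assumed placeholder for ">100".
mnist_lr = BenchmarkConfig("MNIST", "LR", 1000, 10, 10, "SGD", 0.03, 1, 100)
round0_clients = sample_clients(mnist_lr, round_idx=0)
```

With this row, each round trains on 10 of the 1000 clients, each running 1 local epoch with batch size 10.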
- MNIST – Logistic Regression – FedAvg
  - Partition Method: ‘Federated optimization in heterogeneous networks’, page 7, Section 5.1, ‘Real data’
  - client_num_in_total: ‘Federated optimization in heterogeneous networks’, page 7, Section 5.1, ‘Real data’
  - client_num_per_round: ‘Federated optimization in heterogeneous networks’, page 18, Appendix C.2, ‘Hyperparameters’
  - batch_size: ‘Federated optimization in heterogeneous networks’, page 18, Appendix C.2, ‘Hyperparameters’
  - client_optimizer: ‘Federated optimization in heterogeneous networks’, page 8, Section 5.1, ‘Implementation’
  - lr: ‘Federated optimization in heterogeneous networks’, page 18, Appendix C.2, ‘Hyperparameters’
  - epochs: ‘Federated optimization in heterogeneous networks’, page 21, Appendix C.3.2, Figure 9 description
  - comm_round: ‘Federated optimization in heterogeneous networks’, page 21, Appendix C.3.2, Figure 10
  - accuracy: ‘Federated optimization in heterogeneous networks’, page 21, Appendix C.3.2, Figure 10
- Federated EMNIST – Logistic Regression – FedAvg
  - Partition Method: ‘Federated optimization in heterogeneous networks’, page 7, Section 5.1, ‘Real data’
  - client_num_in_total: ‘Federated optimization in heterogeneous networks’, page 7, Section 5.1, ‘Real data’
  - client_num_per_round: ‘Federated optimization in heterogeneous networks’, page 18, Appendix C.2, ‘Hyperparameters’
  - batch_size: ‘Federated optimization in heterogeneous networks’, page 18, Appendix C.2, ‘Hyperparameters’
  - client_optimizer: ‘Federated optimization in heterogeneous networks’, page 8, Section 5.1, ‘Implementation’
  - lr: ‘Federated optimization in heterogeneous networks’, page 18, Appendix C.2, ‘Hyperparameters’
  - epochs: ‘Federated optimization in heterogeneous networks’, page 21, Appendix C.3.2, Figure 9 description
  - comm_round: ‘Federated optimization in heterogeneous networks’, page 21, Appendix C.3.2, Figure 10
  - accuracy: ‘Federated optimization in heterogeneous networks’, page 21, Appendix C.3.2, Figure 10
- Synthetic(α,β) – Logistic Regression – FedAvg
  - Partition Method: ‘Federated optimization in heterogeneous networks’, page 18, Appendix C.1, ‘Synthetic’
  - client_num_in_total: ‘Federated optimization in heterogeneous networks’, page 18, Appendix C.1, ‘Synthetic’
  - client_num_per_round: ‘Federated optimization in heterogeneous networks’, page 18, Appendix C.2, ‘Hyperparameters’
  - batch_size: ‘Federated optimization in heterogeneous networks’, page 18, Appendix C.2, ‘Hyperparameters’
  - client_optimizer: ‘Federated optimization in heterogeneous networks’, page 8, Section 5.1, ‘Implementation’
  - lr: ‘Federated optimization in heterogeneous networks’, page 18, Appendix C.2, ‘Hyperparameters’
  - epochs: ‘Federated optimization in heterogeneous networks’, page 8, Section 5.1, ‘Hyperparameters & evaluation metrics’
  - comm_round: ‘Federated optimization in heterogeneous networks’, page 19, Appendix C.3.2, Figure 6
  - accuracy: ‘Federated optimization in heterogeneous networks’, page 19, Appendix C.3.2, Figure 6
Lightweight and shallow neural network models
Task | Data Set | Model | Algorithm | Partition Method | Partition Alpha | client_num_in_total | client_num_per_round | batch_size | client_optimizer | lr | wd | epochs | comm_round | accuracy |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
CV | Federated EMNIST | CNN (2 Conv + 2 FC) | FedAvg | Power Law | - | 3400 | 10 | 20 | SGD | 0.1 | - | 1 | >1500 | 84.9 |
CV | CIFAR-100 | ResNet-18 + group normalization | FedAvg | Pachinko Allocation | 100/500 (ex/cli) | 500 | 10 | 20 | SGD | 0.1 | - | 1 | >4000 | 44.7 |
NLP | Shakespeare | RNN (2 LSTM + 1 FC) | FedAvg | realistic partition | - | 715 | 10 | 4 | SGD | 1 | - | 1 | >1200 | 56.9 |
NLP | StackOverflow | RNN (1 LSTM + 2 FC) | FedAvg | Pachinko Allocation | - | 342477 | 50 | 16 | SGD | pow(10,-0.5) | - | 1 | >1500 | 19.5 |
- Federated EMNIST – CNN – FedAvg (https://openreview.net/pdf?id=LkFG3lB13U5)
  - Partition Method: ‘Adaptive federated optimization’ (https://openreview.net/pdf?id=LkFG3lB13U5), page 23, Appendix C.2
  - client_num_in_total: ‘Adaptive federated optimization’, page 23, Appendix C Dataset & Models, Table 2
  - client_num_per_round: ‘Adaptive federated optimization’, page 6, Section 4, ‘Optimizer and hyperparameters’
  - batch_size: ‘Adaptive federated optimization’, page 27, Appendix D Experiment Hyperparameters, Table 7
  - client_optimizer: ‘Adaptive federated optimization’, page 25, Appendix D.1, Paragraph 1
  - lr: ‘Adaptive federated optimization’, page 27, Appendix D.4, Table 8
  - wd (learning rate decay): ‘Adaptive federated optimization’, page 34, Appendix E.6, Paragraph 2
  - epochs: ‘Adaptive federated optimization’, page 34, Appendix E.6, Paragraph 1
  - comm_round: ‘Adaptive federated optimization’, page 28, Appendix E.1, Figure 3
  - accuracy: ‘Adaptive federated optimization’, page 7, Section 5, Table 1
- CIFAR-100 – ResNet-18 – FedAvg
  - Partition Method: ‘Adaptive federated optimization’, page 23, Appendix C.1, Paragraph 3
  - Partition_alpha: ‘Adaptive federated optimization’, page 23, Appendix C.1, Paragraph 2
  - client_num_in_total: ‘Adaptive federated optimization’, page 23, Appendix C Dataset & Models, Table 2
  - client_num_per_round: ‘Adaptive federated optimization’, page 6, Section 4, ‘Optimizer and hyperparameters’
  - batch_size: ‘Adaptive federated optimization’, page 27, Appendix D Experiment Hyperparameters, Table 7
  - client_optimizer: ‘Adaptive federated optimization’, page 25, Appendix D.1, Paragraph 1
  - lr: ‘Adaptive federated optimization’, page 27, Appendix D.4, Table 8
  - epochs: ‘Adaptive federated optimization’, page 6, Section 4, ‘Optimizer and hyperparameters’
  - comm_round: ‘Adaptive federated optimization’, page 7, Section 4, Figure 1
  - accuracy: ‘Adaptive federated optimization’, page 7, Section 5, Table 1
- Shakespeare – RNN – FedAvg
  - Partition Method: ‘Adaptive federated optimization’, page 23, Appendix C.3
  - client_num_in_total: ‘Adaptive federated optimization’, page 23, Appendix C Dataset & Models, Table 2
  - client_num_per_round: ‘Adaptive federated optimization’, page 6, Section 4, ‘Optimizer and hyperparameters’
  - batch_size: ‘Adaptive federated optimization’, page 27, Appendix D Experiment Hyperparameters, Table 7
  - client_optimizer: ‘Adaptive federated optimization’, page 25, Appendix D.1, Paragraph 1
  - lr: ‘Adaptive federated optimization’, page 27, Appendix D.4, Table 8
  - epochs: ‘Adaptive federated optimization’, page 6, Section 4, ‘Optimizer and hyperparameters’
  - comm_round: ‘Adaptive federated optimization’, page 7, Section 4, Figure 1
  - accuracy: ‘Adaptive federated optimization’, page 7, Section 5, Table 1
- StackOverflow – RNN – FedAvg
  - Partition Method: ‘Adaptive federated optimization’, page 23, Appendix C.4, Paragraph 2
  - client_num_in_total: ‘Adaptive federated optimization’, page 25, Appendix C.4, Paragraph 1
  - client_num_per_round: ‘Adaptive federated optimization’, page 6, Section 4, ‘Optimizer and hyperparameters’
  - batch_size: ‘Adaptive federated optimization’, page 27, Appendix D Experiment Hyperparameters, Table 7
  - client_optimizer: ‘Adaptive federated optimization’, page 25, Appendix D.1, Paragraph 1
  - lr: ‘Adaptive federated optimization’, page 27, Appendix D.4, Table 8
  - epochs: ‘Adaptive federated optimization’, page 6, Section 4, ‘Optimizer and hyperparameters’
  - comm_round: ‘Adaptive federated optimization’, page 7, Section 4, Figure 1
  - accuracy: ‘Adaptive federated optimization’, page 7, Section 5, Table 1
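The StackOverflow learning rate is given in the table as the expression pow(10,-0.5) rather than as a decimal. As a quick check, it can be evaluated directly; the variable name below is illustrative only.

```python
import math

# Learning rate for the StackOverflow row, written as pow(10,-0.5) in the table.
stackoverflow_lr = math.pow(10, -0.5)
print(f"{stackoverflow_lr:.4f}")  # prints 0.3162
```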
Benchmarking using modern DNNs
Data | Model | Alg | #C | #C_p | bs | c_opt | lr | wd | e | round | IID acc | non-IID acc |
---|---|---|---|---|---|---|---|---|---|---|---|---|
CIFAR10 | ResNet-56 | FedAvg | 10 | 10 | 64 | SGD | 0.001 | 0.001 | 20 | 100 | 93.19 | 87.12 |
CIFAR100 | ResNet-56 | FedAvg | 10 | 10 | 64 | SGD | 0.001 | 0.001 | 20 | 100 | 68.91 | 64.70 |
CINIC10 | ResNet-56 | FedAvg | 10 | 10 | 64 | SGD | 0.001 | 0.001 | 20 | 100 | 82.57 | 73.49 |
CIFAR10 | MobileNet | FedAvg | 10 | 10 | 64 | SGD | 0.001 | 0.001 | 20 | 100 | 91.12 | 86.32 |
CIFAR100 | MobileNet | FedAvg | 10 | 10 | 64 | SGD | 0.001 | 0.001 | 20 | 100 | 55.12 | 53.54 |
CINIC10 | MobileNet | FedAvg | 10 | 10 | 64 | SGD | 0.001 | 0.001 | 20 | 100 | 79.95 | 71.23 |
Note: The non-IID distribution is generated using LDA (Latent Dirichlet Allocation) with alpha = 0.5. #C stands for client_num_in_total; #C_p stands for client_num_per_round; bs = batch size; c_opt = client optimizer; e = epochs; round = number of communication rounds.
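As a rough illustration of what a Dirichlet-based non-IID split with alpha = 0.5 looks like, the sketch below assigns the samples of each class to clients using proportions drawn from a symmetric Dirichlet distribution. This is a commonly used label-partitioning scheme and is an assumption here; the function name `dirichlet_partition` and its parameters are hypothetical, and the benchmark's actual partitioning code may differ in detail.

```python
import numpy as np


def dirichlet_partition(labels, num_clients=10, alpha=0.5, seed=0):
    """Split sample indices across clients with Dirichlet label skew.

    For every class, per-client proportions are drawn from
    Dirichlet(alpha, ..., alpha); a smaller alpha gives a more skewed split.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        rng.shuffle(idx)
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        # Convert cumulative proportions into split points within this class.
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client_id, part in enumerate(np.split(idx, cuts)):
            client_indices[client_id].extend(part.tolist())
    return client_indices


# Toy labels: 10 classes with 100 samples each, split over 10 clients.
toy_labels = np.repeat(np.arange(10), 100)
parts = dirichlet_partition(toy_labels, num_clients=10, alpha=0.5)
```

Every sample is assigned to exactly one client, but the class mix per client varies; larger alpha values move the split closer to the IID case.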