[Federated Learning] Detailed Explanation and Code Implementation of FedAvg Aggregation Algorithm

Last updated: 2023/4/5 Fixed the abnormal display of formulas in Markdown
This article was first published on Roy C's blog: Detailed Explanation and Code Implementation of FedAvg Aggregation Algorithm. Please indicate the source if reproduced.

Original paper: Communication-Efficient Learning of Deep Networks from Decentralized Data

Overview

Machine learning is now used across many fields, but it typically needs a large amount of data to perform well. In some cases, collecting that data on a central server for training is not feasible for privacy and security reasons. This is where federated learning comes in: a machine learning paradigm in which models are trained on local devices, and only model parameters, not the raw data, are sent to a server. This blog post introduces FedAvg (Federated Averaging), a commonly used federated learning algorithm.

FedAvg aggregates model parameters through weighted averaging. The basic idea is that each local device trains a model on its own data and uploads the resulting parameters to the server; the server computes the weighted average of the uploaded parameters and broadcasts it back to the local devices. This process is repeated over multiple rounds until the global model converges.

The average is weighted rather than uniform. The parameters uploaded by each device are weighted in proportion to the size of that device's local dataset, so devices with more data have more influence on the global model.
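As a formula, the weighted average can be written as follows. The notation is introduced here for illustration: $K$ devices take part, device $i$ holds $n_i$ local samples and uploads parameters $w_i$, and $\bar{w}$ is the aggregated result.

$$\bar{w} = \sum_{i=1}^{K} \frac{n_i}{n} w_i, \qquad n = \sum_{i=1}^{K} n_i$$

For example, with two devices holding 100 and 300 samples, the weights are $100/400 = 0.25$ and $300/400 = 0.75$, so $\bar{w} = 0.25\,w_1 + 0.75\,w_2$.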

Advantages of FedAvg

Compared to other federated learning algorithms, FedAvg has the following advantages:

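Low communication overhead: Devices upload only model parameters, never raw data, and each device performs several local training steps per round, so fewer communication rounds are needed than when every single gradient step is synchronized.
Support for heterogeneous data: Each device trains on its own dataset, and these datasets do not have to share the same size or distribution. Strongly non-IID data can still slow convergence, however.
Strong generalization: Because the global model is aggregated from models trained on the local data of all participating devices, it draws on more varied data than any single device's model, which can improve accuracy and generalization performance.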

Disadvantages of FedAvg

Despite having many advantages, FedAvg still has some disadvantages:

Coordination required: FedAvg relies on a centralized coordinator (the server) to orchestrate the computations of the local devices. The coordinator can become a performance bottleneck or a single point of failure.
Data imbalance: Because the aggregation weights are proportional to the size of each device's local data, devices with little data contribute little to the global model. If those devices hold data that the larger devices lack, the global model may generalize poorly to it.
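To make the data imbalance point concrete, here is a minimal sketch of how sample counts turn into aggregation weights. The function name `aggregation_weights` and the sample counts are hypothetical, chosen only for illustration.

```python
def aggregation_weights(num_samples):
    """Return each device's share of the total sample count."""
    total = sum(num_samples)
    return [n / total for n in num_samples]


# Hypothetical sample counts for three devices.
weights = aggregation_weights([5000, 4000, 1000])
print(weights)  # [0.5, 0.4, 0.1]
```

The third device holds 10% of the data, so its parameters account for only 10% of the aggregated model, whatever that data contains.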

Algorithm Flow of FedAvg

Pseudocode

Explanation

1. The server initializes the global model parameters $w_0$ and broadcasts them to the local devices;
2. Each participating device trains on batches drawn from its local dataset, starting from the current global parameters, and obtains local model parameters $w_i$;
3. Each participating device uploads its local model parameters $w_i$ to the server;
4. The server calculates the weighted average $\bar{w}$ of the uploaded local model parameters and broadcasts it to the local devices;
5. Each device uses $\bar{w}$ as the initial value of its local model parameters, and steps 2-4 are repeated until the global model converges.
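The steps above can be sketched as a small single-process simulation. This is an illustrative sketch under stated assumptions, not the implementation from the paper: the "devices" are NumPy arrays in one process, the model is a linear regression trained by full-batch gradient descent, every device takes part in every round, and a fixed number of rounds stands in for a convergence check. The names `local_update` and `run_fedavg` are hypothetical.

```python
import numpy as np


def local_update(w, X, y, lr=0.1, epochs=5):
    """Train locally for a few epochs of full-batch gradient descent."""
    w = w.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
        w -= lr * grad
    return w


def run_fedavg(clients, dim, rounds=50):
    """clients: list of (X, y) pairs, one per simulated device."""
    w_global = np.zeros(dim)  # step 1: server initializes the global model
    n_total = sum(len(y) for _, y in clients)
    for _ in range(rounds):
        # steps 2-3: each device trains from the global model and "uploads" w_i
        local_ws = [local_update(w_global, X, y) for X, y in clients]
        # step 4: average weighted by local sample count, then "broadcast"
        w_global = sum(
            len(y) / n_total * w_i for (_, y), w_i in zip(clients, local_ws)
        )
    return w_global


# Synthetic data: three devices of different sizes sharing one true model.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for n in (20, 50, 200):
    X = rng.normal(size=(n, 2))
    clients.append((X, X @ true_w))  # noiseless targets

w = run_fedavg(clients, dim=2)
print(w)  # should be close to true_w
```

Because all simulated devices share the same underlying model here, the data is effectively IID; with devices whose data disagree, the averaged model would settle on a compromise instead.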

Code Implementation

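import numpy as np

def fedavg(self):
    # Weighted FedAvg. Assumes self.weights[i] is the list of per-layer NumPy
    # arrays uploaded by client i, and self.num_samples[i] is the number of
    # local samples on that client.
    total_samples = sum(self.num_samples)
    base = [0] * len(self.weights[0])
    for i, client_weight in enumerate(self.weights):
        # total_samples already covers all clients, so the shares sum to 1
        share = self.num_samples[i] / total_samples
        for j, v in enumerate(client_weight):
            base[j] += share * v.astype(np.float64)

    # Return the aggregated layers, to be used as the new global model
    return base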
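For trying the aggregation outside of a class, here is a standalone sketch of the same weighted average with a small worked example. The function name `fedavg_aggregate`, its arguments, and the two toy clients are hypothetical; each client's weights are assumed to be a list of NumPy arrays, one per layer.

```python
import numpy as np


def fedavg_aggregate(client_weights, num_samples):
    """Average per-layer arrays, weighting each client by its sample count."""
    total = sum(num_samples)
    aggregated = [
        np.zeros_like(layer, dtype=np.float64) for layer in client_weights[0]
    ]
    for layers, n in zip(client_weights, num_samples):
        for j, layer in enumerate(layers):
            aggregated[j] += n / total * layer.astype(np.float64)
    return aggregated


# Two toy clients, each with a 1x2 "kernel" layer and a 1-element "bias" layer.
client_a = [np.array([[1.0, 2.0]]), np.array([0.0])]
client_b = [np.array([[3.0, 6.0]]), np.array([4.0])]

result = fedavg_aggregate([client_a, client_b], num_samples=[100, 300])
print(result[0])  # [[2.5 5. ]]
print(result[1])  # [3.]
```

With 100 and 300 samples the weights are 0.25 and 0.75, so the first layer is 0.25 * [1, 2] + 0.75 * [3, 6] = [2.5, 5.0] and the second is 0.75 * 4 = 3.0.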

Conclusion

In conclusion, FedAvg is an effective federated learning algorithm: it trains a global model from data that stays on distributed local devices, which helps protect privacy and reduces communication overhead, while aggregating the local models can improve the accuracy and generalization performance of the global model. As federated learning develops and its application scenarios expand, research on and application of FedAvg will continue to deepen. In the future, FedAvg is expected to be further improved in areas such as algorithm optimization, privacy protection, and model compression, and to be applied to more fields and scenarios.