Abstract
In cross-silo federated learning, data can be partitioned by feature.
This setting, known as vertical (i.e. feature-wise) federated learning
(Yang et al. 2019), applies to multiple datasets that share the same
sample ID space but differ in feature space. An image dataset can also
be partitioned by label. To improve the model performance of isolated
parties that hold feature-wise or label-wise partitions, the most
effective method is to federate the model results of those parties.
However, allowing the participating parties to share model results
without violating each party's data privacy is a non-trivial task. In
this paper, within the framework of
principal component analysis (PCA), we propose a Federated-PCA machine
learning approach, in which the PCA method is used to reduce the
dimensionality of sample data for all parties and extract the principal
component feature information to improve the efficiency of subsequent
training. This process does not reveal the original data of any party.
The federated system helps the parties build a strategy for their common
benefit, and under this federated mechanism all parties have the same
identity and status. By comparing the federated results of
the isolated parties with the result of the unseparated party across
multiple sets of experiments, we find that the results of the two
settings are close and that the proposed method effectively improves the
trained model performance of most participating parties.
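
The per-party dimensionality reduction described above can be
illustrated with a minimal sketch. It assumes a vertical partition with
two hypothetical parties (`party_a`, `party_b`) that hold different
feature blocks for the same samples. The helper `local_pca`, the
synthetic data, and the choice to concatenate the reduced features are
illustrative assumptions, not the protocol of this paper.

```python
import numpy as np

def local_pca(X, k):
    """Project one party's feature block onto its top-k principal components.

    Hypothetical helper: runs entirely on the party's own data.
    """
    Xc = X - X.mean(axis=0)  # center each feature
    # SVD of the centered data; the rows of Vt are the principal directions
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

rng = np.random.default_rng(0)
n_samples = 200

# Assumed vertical partition: same sample IDs, disjoint feature spaces
party_a = rng.normal(size=(n_samples, 30))
party_b = rng.normal(size=(n_samples, 50))

# Each party reduces its own block locally
reduced_a = local_pca(party_a, k=5)
reduced_b = local_pca(party_b, k=8)

# Only the reduced features are combined for subsequent training
federated_features = np.hstack([reduced_a, reduced_b])
print(federated_features.shape)  # (200, 13)
```

In this sketch the raw feature blocks never leave their owners, and only
the low-dimensional projections are combined. Whether such projections
meet a given privacy requirement depends on the threat model and is not
settled by the example.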