
HtFLlib: A unified benchmark library for evaluating heterogeneous federated learning methods across modalities

AI institutions develop heterogeneous models for their specific tasks but face data scarcity during training. Traditional federated learning (FL) supports only homogeneous model collaboration, requiring the same architecture across all clients. In practice, however, clients design model architectures for their own requirements, and sharing an effortfully trained local model exposes intellectual property and reduces participants' willingness to collaborate. Heterogeneous federated learning (HtFL) addresses these limitations, but the literature lacks a unified benchmark for evaluating HtFL methods across diverse domains and aspects.

Background and categories of HtFL methods

Existing FL benchmarks focus on data heterogeneity while using uniform client models, ignoring the practical scenarios that involve model heterogeneity. Representative HtFL approaches fall into three main categories. Partial parameter sharing methods, such as LG-FedAvg, FedGen, and FedGH, keep feature extractors heterogeneous while assuming a homogeneous classifier head for knowledge transfer. Mutual distillation methods (e.g., FML, FedKD, and FedMRL) train and share small auxiliary models through distillation. Prototype sharing methods transfer lightweight class prototypes as global knowledge: clients upload local prototypes, and the server aggregates them to guide local training. However, it remains unclear whether existing HtFL methods perform consistently across diverse scenarios.
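The prototype sharing idea above can be sketched in a few lines. This is a minimal, hypothetical illustration (not HtFLlib's or any specific method's actual implementation): each client computes per-class mean embeddings, and the server averages prototypes class by class across the clients that hold that class.

```python
import numpy as np

def local_prototypes(features, labels, num_classes):
    """Client side: compute a per-class mean embedding (local prototype)."""
    protos = {}
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            protos[c] = features[mask].mean(axis=0)
    return protos

def aggregate_prototypes(client_protos, num_classes):
    """Server side: average each class's prototype over the clients that have it."""
    global_protos = {}
    for c in range(num_classes):
        vecs = [p[c] for p in client_protos if c in p]
        if vecs:
            global_protos[c] = np.mean(vecs, axis=0)
    return global_protos

# Toy example: two clients, 4-dimensional embeddings, 2 classes.
rng = np.random.default_rng(0)
client_protos = []
for _ in range(2):
    feats = rng.normal(size=(10, 4))        # embeddings from a heterogeneous extractor
    labels = np.array([0, 1] * 5)           # both classes present on this client
    client_protos.append(local_prototypes(feats, labels, num_classes=2))

global_protos = aggregate_prototypes(client_protos, num_classes=2)
```

Because only these small class-wise vectors travel between clients and server, communication cost stays low and the local model architectures never need to match.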

Introducing HtFLlib: a unified benchmark

Researchers from Shanghai Jiao Tong University, Peking University, Chongqing University, Tongji University, The Hong Kong Polytechnic University, and Queen's University Belfast have proposed HtFLlib, an easy-to-use and extensible framework for evaluating methods across multiple datasets and model heterogeneity scenarios. The library integrates:

  • 12 datasets spanning multiple domains, modalities, and data heterogeneity scenarios.
  • 40 model architectures, ranging from small to large, across three modalities.
  • A modular and easy-to-extend HtFL codebase with implementations of 10 representative HtFL methods.
  • Systematic evaluations covering accuracy, convergence, computation costs, and communication costs.

Datasets and modalities in HtFLlib

HtFLlib covers detailed data heterogeneity scenarios, divided into three settings: label skew (with pathological and Dirichlet partitions as subsettings), feature shift, and real-world. It integrates 12 datasets: CIFAR-10, CIFAR-100, Flowers102, Tiny-ImageNet, KVASIR, COVIDx, DomainNet, Camelyon17, AG News, Shakespeare, HAR, and PAMAP2. These datasets vary widely in domain, data volume, and number of classes, demonstrating HtFLlib's comprehensiveness and versatility. The researchers focus primarily on image data, especially the label skew setting, because image tasks are the most common across fields. HtFL methods are evaluated across image, text, and sensor-signal tasks to assess their respective strengths and weaknesses.
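The Dirichlet partition mentioned above is a standard way to simulate label skew in FL experiments. The sketch below is a generic, hypothetical implementation of that technique (not HtFLlib's actual code): for each class, the fraction of its samples assigned to each client is drawn from a Dirichlet distribution, where a smaller `alpha` produces more skewed per-client label distributions.

```python
import numpy as np

def dirichlet_partition(labels, num_clients, alpha, seed=0):
    """Split sample indices across clients; per-class client shares are
    drawn from Dirichlet(alpha). Smaller alpha -> stronger label skew."""
    rng = np.random.default_rng(seed)
    client_idx = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        rng.shuffle(idx)
        # Proportion of class c's samples assigned to each client.
        props = rng.dirichlet(alpha * np.ones(num_clients))
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_idx[client].extend(part.tolist())
    return client_idx

# Toy example: 10 classes with 100 samples each, split across 5 clients.
labels = np.repeat(np.arange(10), 100)
parts = dirichlet_partition(labels, num_clients=5, alpha=0.1)
```

With `alpha=0.1` most clients end up holding samples from only a few classes, mimicking the non-IID conditions the label skew setting is meant to test; `alpha=100` would give nearly uniform splits.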

Performance analysis across modalities

For image data, most HtFL methods lose accuracy as model heterogeneity increases. FedMRL shows superior robustness, aided by its combination of global and local models. When heterogeneous classifiers are introduced, making partial parameter sharing methods inapplicable, FedTGP maintains its advantage across settings thanks to its adaptive prototype refinement capability. Experiments on medical datasets with black-box pre-trained heterogeneous models show that HtFL improves model quality, with larger gains than auxiliary-model-based methods such as FML. For text data, FedMRL's advantage in the label skew setting diminishes in the real-world setting, while FedProto and FedTGP perform relatively poorly compared to image tasks.

Conclusion

In summary, the researchers introduced HtFLlib, a framework that addresses a key gap in HtFL benchmarking by providing unified evaluation criteria across diverse fields and scenarios. HtFLlib's modular design and extensible architecture offer a detailed benchmark for both research and practical applications of HtFL. Furthermore, its support for heterogeneous models in collaborative learning opens the way for future research to leverage complex pre-trained large models, black-box systems, and diverse architectures across varied tasks and modalities.


Check out the Paper and GitHub page. All credit for this research goes to the researchers of this project.


Sajjad Ansari is a final-year undergraduate at IIT Kharagpur. As a technology enthusiast, he delves into the practical applications of AI, focusing on understanding AI technology and its real-world impact. He aims to articulate complex AI concepts in a clear and accessible way.
