New approach greatly reduces the energy consumption of artificial intelligence

AI applications such as large language models (LLMs) have become an integral part of our daily lives. The computing, storage, and transmission capacity they require is provided by data centers that consume enormous amounts of energy. In Germany alone, data centers consumed around 16 billion kilowatt-hours in 2020, roughly 1% of the country's total energy consumption. By 2025, this figure is expected to rise to 22 billion kWh.
The new method is 100 times faster
The arrival of more complex AI applications will greatly increase demands on data center capacity in the coming years, and training the underlying neural networks will consume large amounts of energy. To counteract this trend, the researchers have developed a training method that is 100 times faster while achieving accuracy comparable to existing procedures, which would substantially reduce the energy needed for training.
The neural networks used in AI for tasks such as image recognition or language processing are inspired by the way the human brain works. These networks consist of interconnected nodes called artificial neurons. Each node weights the incoming signals with certain parameters and sums them; if a defined threshold is exceeded, the signal is passed on to the next node. To train a network, the parameter values are usually initialized at random, typically drawn from a normal distribution, and then adjusted in small steps to gradually improve the network's predictions. Because many iterations are required, this kind of training is extremely demanding and consumes a lot of power.
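To illustrate the conventional, iterative approach described above, here is a minimal sketch in Python/NumPy: a small one-hidden-layer network whose weights start as random draws from a normal distribution and are then refined over thousands of gradient-descent steps. The network size, learning rate, and toy data are illustrative assumptions, not details from the study.

```python
import numpy as np

# Minimal sketch of conventional iterative training:
# weights start as random normal draws and are adjusted
# step by step (gradient descent) over many iterations.
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 400)[:, None]      # toy inputs
y = np.sin(2 * x[:, 0])                   # toy targets

n_hidden = 50
W = rng.normal(size=(n_hidden, 1))        # random initial weights
b = rng.normal(size=n_hidden)
beta = rng.normal(size=n_hidden)

lr = 0.01
for step in range(5000):                  # many small updates -> high compute cost
    h = np.tanh(x @ W.T + b)              # weighted sum + nonlinearity at each node
    pred = h @ beta
    err = pred - y

    # Gradients of the mean squared error with respect to all parameters.
    grad_beta = h.T @ err / len(x)
    grad_h = np.outer(err, beta) * (1 - h**2)
    grad_W = grad_h.T @ x / len(x)
    grad_b = grad_h.mean(axis=0)

    beta -= lr * grad_beta
    W -= lr * grad_W
    b -= lr * grad_b

print("final MSE:", np.mean(err**2))
```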
Parameters selected probabilistically
Felix Dietrich, professor of Physics-enhanced Machine Learning, and his team have developed a new approach. Instead of determining the parameters between nodes iteratively, their method selects them probabilistically. The probabilistic approach focuses on values at critical locations in the training data, where the values change rapidly and strongly. The aim of the current study is to use this approach to learn energy-conserving dynamical systems from data. Such systems change over time according to certain rules and are found, for example, in climate models and in financial markets.
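The following is a rough, hypothetical sketch of this sampling idea, loosely in the spirit of the "sampling weights" papers cited below: each hidden neuron's weights are constructed from a pair of training points, so neurons are placed where the data actually varies, and only the output layer is fitted in closed form by least squares. The function names, scaling choices, and toy data are illustrative assumptions, not the exact published algorithm.

```python
import numpy as np

def sample_network(x, y, n_hidden=200, rng=None):
    """Sketch: build a one-hidden-layer network without backpropagation.

    Hidden weights are derived from randomly chosen pairs of training
    points (so neurons point along directions where the data changes),
    and the output layer is fitted by ordinary least squares.
    The scaling below is an illustrative assumption, not the exact
    recipe from the cited papers.
    """
    rng = np.random.default_rng(rng)
    n, d = x.shape

    # Pick a pair of distinct training points for each hidden neuron.
    i = rng.integers(0, n, size=n_hidden)
    j = rng.integers(0, n, size=n_hidden)
    j = np.where(i == j, (j + 1) % n, j)

    diff = x[j] - x[i]                                   # direction between the pair
    norm_sq = np.maximum(np.sum(diff**2, axis=1), 1e-12)

    W = diff / norm_sq[:, None]                          # weights per hidden neuron
    b = -np.sum(W * x[i], axis=1)                        # bias anchors the neuron at the pair

    H = np.tanh(x @ W.T + b)                             # hidden activations for all data
    # Output weights via one least-squares solve -- no iterative gradient descent.
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)
    return W, b, beta

def predict(x, W, b, beta):
    return np.tanh(x @ W.T + b) @ beta

# Usage: fit a toy 1-D regression problem.
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 400)[:, None]
y = np.sin(2 * x[:, 0]) + 0.05 * rng.standard_normal(400)
W, b, beta = sample_network(x, y, n_hidden=300, rng=0)
print("train MSE:", np.mean((predict(x, W, b, beta) - y) ** 2))
```

Because the only fitting step is a single linear least-squares solve, there is no loop over thousands of gradient updates, which is where the speed and energy savings reported by the researchers come from.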
“Our approach makes it possible to determine the required parameters with minimal computing power. This can make the training of neural networks much faster and, as a result, more energy-efficient,” said Felix Dietrich. “In addition, we have seen that the accuracy of the new method is comparable to that of iteratively trained networks.”
Rahma, Atamert, Chinmay Datar, and Felix Dietrich. 2024. “Training Hamiltonian Neural Networks Without Backpropagation.” Machine Learning and the Physical Sciences Workshop at the 38th Conference on Neural Information Processing Systems (NeurIPS 2024).
Bolager, Erik L., Iryna Burak, Chinmay Datar, Qing Sun, and Felix Dietrich. 2023. “Sampling Weights of Deep Neural Networks.” In Advances in Neural Information Processing Systems 36: 63075–116. Curran Associates, Inc.