Article Highlight | 23-Sep-2025

Compression using automatically differentiable tensor networks: Fewer parameters with equal or better performance

Deep tensor networks compactly encode neural network parameters

Intelligent Computing

Neural networks are used extensively in machine learning tasks for applications as diverse as facial recognition and movie recommendation. To address their growing complexity and issues such as overfitting, loss of generalization power, and excessive hardware cost, Qing et al. developed a general compression scheme based on automatically differentiable tensor networks that considerably reduces the number of variational parameters needed while maintaining or improving the network’s performance. Their work was published on May 15, 2025, in an article titled “Compressing Neural Networks Using Tensor Networks with Exponentially Fewer Variational Parameters” in Intelligent Computing, a Science Partner Journal.

Across different neural network models, including linear and convolutional layers, the proposed automatically differentiable tensor network scheme achieves exponential parameter compression while maintaining or even improving accuracy. For example, approximately 10 million parameters were reduced to 424 while test accuracy improved from 90.17% to 91.74% on a standard 10-class dataset. This efficiency follows a principled scaling: a weight tensor with 2^Q entries can be encoded with roughly O(MQ) trainable parameters at moderate depth M, which explains why deep tensor networks outperform shallow decompositions and factorizations. Owing to its flexibility, the automatically differentiable tensor network allows further optimization by tuning hyperparameters such as depth and bond dimension or by making modest architectural adjustments. While inference currently requires an additional contraction step, integrating the contraction with forward passes is an active engineering direction. Looking ahead, combining tensor-network compression with large-scale model training could offer a more sustainable path toward trillion-parameter artificial intelligence systems, extending the benefits to demanding domains such as natural language processing and robotics and opening the possibility of models built entirely on tensor-network representations.
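To see that scaling concretely, the short Python sketch below tallies the 2^Q entries of a dense weight tensor with Q dimension-2 indices against a rough O(MQ) parameter count for a brick-wall network of depth M. The per-block parameter count of 16, the example values of Q and M, and the function names are assumptions made purely for illustration; they are not taken from the paper and are not tuned to reproduce its reported counts.

    # Back-of-the-envelope comparison (not the authors' code): dense entries
    # versus a rough O(M*Q) parameter count for a moderate-depth brick-wall
    # tensor network. Block size and depth below are illustrative assumptions.

    def dense_entries(Q: int) -> int:
        """A weight tensor reshaped into Q dimension-2 indices has 2**Q entries."""
        return 2 ** Q

    def brickwall_parameters(Q: int, M: int, block_params: int = 16) -> int:
        """Rough count for an M-layer brick-wall network acting on Q indices.

        Each layer holds about Q // 2 two-index blocks; with dimension-2 indices
        a block has 2*2*2*2 = 16 entries (an assumption used for illustration).
        """
        return M * (Q // 2) * block_params

    if __name__ == "__main__":
        Q, M = 23, 4  # 2**23 is roughly 8.4 million entries; depth chosen arbitrarily
        print(f"dense tensor:       {dense_entries(Q):>12,d} entries")
        print(f"brick-wall network: {brickwall_parameters(Q, M):>12,d} parameters")

The point of the toy numbers is only the gap in growth rates: the dense count doubles with every added index, while the brick-wall count grows linearly in Q and M.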

The method developed in this research encodes the variational parameters of a neural network into tensor networks with a “brick-wall” structure, in which each block represents a tensor and the bonds connected to a block represent the indices of that tensor. Obtaining the tensors involves two main stages. First, in a pretraining stage, the Euclidean distance between the tensor network’s output and the original parameters serves as the loss function, which yields the tensors and enhances stability. Second, the authors minimize the loss function of the machine learning task itself. Because the scheme directly encodes the parameter tensors into deep tensor networks, it differs from network distillation and model quantization; and because its low-rank structure is realized through a deep architecture, it also differs from conventional low-rank decompositions, which produce shallow structures.
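As a rough illustration of this two-stage recipe, the Python (PyTorch) sketch below encodes one 16-by-16 weight matrix in a small brick-wall network of two-leg tensors, first pulling the encoded weight toward the original by minimizing the Euclidean distance and then switching to a task loss. It is not the authors’ implementation; the depth, the near-identity initialization, the fixed boundary tensor, the toy data, and the helper names (apply_gate, encode) are all assumptions made for the example.

    # Hedged two-stage sketch: encode a 16x16 weight in a toy brick-wall network.
    import torch
    import torch.nn.functional as F

    torch.manual_seed(0)
    Q, M = 8, 2                     # 8 dimension-2 legs (2**8 = 256 entries), depth 2
    W_target = torch.randn(16, 16)  # stands in for a pretrained layer's weight

    def gate_positions(Q, M):
        """Leg positions of the two-leg blocks in an M-layer brick-wall pattern."""
        return [(m, i) for m in range(M) for i in range(m % 2, Q - 1, 2)]

    # One trainable (2,2,2,2) block per position, initialized near the identity.
    gates = [torch.nn.Parameter(torch.eye(4).reshape(2, 2, 2, 2)
                                + 0.01 * torch.randn(2, 2, 2, 2))
             for _ in gate_positions(Q, M)]

    def apply_gate(state, gate, i):
        """Contract a two-leg block with legs i and i+1 of a Q-leg tensor."""
        q = state.dim()
        out = torch.tensordot(gate, state, dims=([2, 3], [i, i + 1]))
        perm = [2 + j for j in range(i)] + [0, 1] + list(range(i + 2, q))
        return out.permute(perm)

    def encode(gates):
        """Contract the brick-wall network into a dense 16x16 weight matrix."""
        state = torch.ones([2] * Q) / 2 ** (Q / 2)  # fixed boundary tensor
        for (_, i), g in zip(gate_positions(Q, M), gates):
            state = apply_gate(state, g, i)
        return state.reshape(16, 16)

    # Stage 1 (pretraining): pull the encoded weight toward the original one
    # by minimizing the Euclidean distance.
    opt = torch.optim.Adam(gates, lr=1e-2)
    for _ in range(500):
        opt.zero_grad()
        torch.norm(encode(gates) - W_target).backward()
        opt.step()

    # Stage 2: switch to the machine-learning loss of the task itself, here a
    # toy 10-class problem on a randomly generated batch.
    x, y = torch.randn(64, 16), torch.randint(0, 10, (64,))
    head = torch.nn.Linear(16, 10)  # the rest of the (hypothetical) model
    opt = torch.optim.Adam(gates + list(head.parameters()), lr=1e-3)
    for _ in range(200):
        opt.zero_grad()
        F.cross_entropy(head(x @ encode(gates)), y).backward()
        opt.step()

In the actual scheme the encoded weight tensors are far larger, which is where the gap between 2^Q dense entries and O(MQ) trainable block parameters becomes significant.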

Overall, the method, tested on five different neural network models, achieved high compression ratios for both linear and convolutional layers, with compressed models retaining between 98% and 103% of the baseline accuracy on three different image datasets. The compression scheme is flexible: the choice of which layers to compress, the number of automatically differentiable tensor networks, and the hyperparameters of each tensor network can all be varied and adjusted. Even compared with shallow tensor networks, the moderate-depth tensor networks achieved high compression ratios with comparable or better accuracy. When a compressed network underfits or overfits, adjusting the number of automatically differentiable tensor networks may provide a remedy, and performing the compression layer by layer helps to overcome local-minimum problems.
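The layer-by-layer schedule can be pictured with the hedged Python sketch below, which replaces the linear layers of a toy two-layer classifier one at a time and briefly fine-tunes on the task loss after each swap. The “compressed” reparameterization used here, a small trainable core pushed through a fixed random map, is only a stand-in so the example stays short; in the paper that role is played by a deep automatically differentiable tensor network. The class name CompressedLinear, the layer sizes, and the training settings are illustrative assumptions.

    # Hedged sketch of layer-by-layer compression on a toy classifier.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    torch.manual_seed(0)

    class CompressedLinear(nn.Module):
        """Linear layer whose weight is rebuilt at forward time from a small
        trainable core (a stand-in for the paper's deep tensor network)."""
        def __init__(self, linear: nn.Linear, n_core: int = 64, fit_steps: int = 400):
            super().__init__()
            target = linear.weight.detach()
            self.shape = target.shape
            self.bias = linear.bias
            self.core = nn.Parameter(0.01 * torch.randn(n_core))
            # Fixed random map from the core to the full weight (illustration only).
            self.register_buffer("mixer", torch.randn(n_core, target.numel()) / n_core ** 0.5)
            # Stage 1: fit the core so the rebuilt weight matches the original
            # weight under the Euclidean-distance loss.
            opt = torch.optim.Adam([self.core], lr=1e-2)
            for _ in range(fit_steps):
                opt.zero_grad()
                torch.norm(self.weight() - target).backward()
                opt.step()

        def weight(self):
            return (self.core @ self.mixer).reshape(self.shape)

        def forward(self, x):
            return F.linear(x, self.weight(), self.bias)

    # Toy pretrained model and data standing in for the network to be compressed.
    model = nn.Sequential(nn.Linear(32, 32), nn.ReLU(), nn.Linear(32, 10))
    x, y = torch.randn(256, 32), torch.randint(0, 10, (256,))

    # Compress one linear layer at a time; after each swap, fine-tune on the
    # task loss (stage 2), the schedule said to help avoid poor local minima.
    for idx in [0, 2]:  # positions of the Linear layers in the Sequential
        model[idx] = CompressedLinear(model[idx])
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
        for _ in range(100):
            opt.zero_grad()
            F.cross_entropy(model(x), y).backward()
            opt.step()

The design point the sketch tries to convey is only the schedule: each layer is fitted and fine-tuned before the next one is touched, rather than compressing every layer at once.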
