As artificial intelligence models grow more complex, they demand significant energy and memory, making them costly to run on everyday devices like smartphones. A new study from Qualcomm Research introduces a method that slashes these requirements without sacrificing performance, potentially leading to faster, more efficient AI applications for consumers. This breakthrough matters because it could extend battery life in mobile gadgets and reduce operational costs in data centers, making advanced AI more accessible.
Researchers discovered that by transforming and selectively compressing data within AI models, they can maintain accuracy while using less memory. Specifically, they applied sequence-aware transformations that exploit the similarity between neighboring elements of a sequence, so that most of the data can be stored at lower precision. This approach, called STaMP, effectively 'concentrates' important information into a small set of elements kept at high precision, minimizing the error introduced by compressing the rest.
The team used mathematical transforms, such as the Discrete Wavelet Transform, to reorganize data sequences in models like those for image and text generation. These transforms exploit natural correlations in data (adjacent pixels in an image, or neighboring words in a sentence) to redistribute information efficiently. The researchers then allocated higher precision to the key data segments and lower precision to the others. The method requires no retraining of the models, only calibration on standard datasets.
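To make the idea concrete, here is a minimal sketch in Python with NumPy. It is not the paper's implementation. It assumes the simplest wavelet (Haar), synthetic data in which neighboring tokens are strongly correlated, a simple per-token rounding scheme, and an illustrative split of 4 tokens at 8 bits with the remaining 60 at 4 bits. All function names and parameter values are hypothetical.

```python
import numpy as np

def haar_step(x):
    """One orthonormal Haar level along the sequence axis (axis 0, even length)."""
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2.0)   # local averages (low frequency)
    detail = (even - odd) / np.sqrt(2.0)   # local differences (high frequency)
    return np.concatenate([approx, detail], axis=0)

def haar_step_inverse(y):
    """Exact inverse of haar_step."""
    half = y.shape[0] // 2
    approx, detail = y[:half], y[half:]
    x = np.empty_like(y)
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x

def haar_multilevel(x):
    """Full Haar transform; sequence length is assumed to be a power of two."""
    y = x.astype(np.float64).copy()
    n = y.shape[0]
    while n >= 2:
        y[:n] = haar_step(y[:n])   # re-transform the averages only
        n //= 2
    return y

def haar_multilevel_inverse(y):
    x = y.astype(np.float64).copy()
    n = 2
    while n <= x.shape[0]:
        x[:n] = haar_step_inverse(x[:n])
        n *= 2
    return x

def quantize(x, bits):
    """Symmetric uniform quantization with one scale per token (row)."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(x).max(axis=-1, keepdims=True) / qmax
    scale = np.where(scale == 0, 1.0, scale)   # avoid division by zero
    return np.clip(np.round(x / scale), -qmax - 1, qmax) * scale

def mixed_precision_quantize(y, n_high, high_bits, low_bits):
    """Keep the first n_high transformed tokens at high precision, the rest low."""
    out = np.empty_like(y)
    out[:n_high] = quantize(y[:n_high], high_bits)
    out[n_high:] = quantize(y[n_high:], low_bits)
    return out

def sqnr_db(reference, approximation):
    """Signal-to-quantization-noise ratio in decibels (higher is better)."""
    noise = np.sum((reference - approximation) ** 2)
    return 10.0 * np.log10(np.sum(reference ** 2) / noise)

def make_activations(seq_len=64, channels=32, seed=0):
    """Synthetic stand-in for activations: each token is close to its neighbor."""
    rng = np.random.default_rng(seed)
    base = 3.0 * rng.normal(size=(1, channels))
    walk = np.cumsum(0.1 * rng.normal(size=(seq_len, channels)), axis=0)
    return base + walk

x = make_activations()
baseline = quantize(x, bits=4)                 # plain 4-bit, no transform
y = haar_multilevel(x)                         # energy moves to the first rows
y_q = mixed_precision_quantize(y, n_high=4, high_bits=8, low_bits=4)
restored = haar_multilevel_inverse(y_q)

print("energy in first 4 transformed tokens:", np.sum(y[:4] ** 2) / np.sum(y ** 2))
print("SQNR, uniform 4-bit (dB):", sqnr_db(x, baseline))
print("SQNR, transform + mixed precision (dB):", sqnr_db(x, restored))
```

In this toy setup the average cost is (4 × 8 + 60 × 4) / 64 = 4.25 bits per token, only slightly above plain 4-bit storage. Because the transform is orthonormal, the rounding error made in the transformed domain carries over unchanged in size when the data is transformed back, which is why shrinking the error on the many low-energy tokens pays off.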
Results from the paper show that STaMP improved signal quality by over 10% in image generation tests and reduced errors in language models, as detailed in Tables 1 and 2. For instance, when combined with existing compression techniques, it enabled models to operate with 4-bit precision—a level that typically causes significant performance drops—while maintaining near-original output quality, as illustrated in Figure 6.
This innovation could lead to more responsive AI assistants on phones, quicker image editing apps, and greener computing in cloud services, benefiting regular users through lower costs and enhanced device capabilities. By drawing on principles from media compression like JPEG, it bridges classic engineering with modern AI to solve practical energy challenges.
However, the method's effectiveness depends on how strongly the data in a sequence is correlated, so it may not suit all AI tasks equally, as the paper's limitations note. Further research is needed to adapt it to diverse model types and ensure compatibility with various hardware setups.
About the Author
Guilherme A.
Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.