In cloud computing environments where programs wait in queues before execution, accurately predicting how long they'll take to run can significantly reduce waiting times and optimize resource use. Researchers from Lomonosov Moscow State University have developed AI approaches that can estimate execution times for programs across different computer installations—even when those programs have never run on specific systems before. This breakthrough addresses a key bottleneck in federated cloud computing, where programs move between various computing resources managed by different organizations.
The core discovery is that programs and computer installations can be represented mathematically in ways that reveal performance relationships. The researchers created two main methods. The first groups similar computer installations based on how programs perform on them. The second uses matrix decomposition techniques, similar to those behind the recommendation systems of Amazon or Netflix, to generate "embeddings": numerical representations that capture performance characteristics.
To develop these methods, the team analyzed execution data from MPI and OpenMP benchmarks run on hundreds of computer installations. They constructed a Programs-Computers matrix where rows represented programs and columns represented computer installations, with entries showing execution times. The challenge was predicting missing entries in this matrix—estimating how long programs would take on installations where they hadn't previously run.
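As a rough illustration of the Programs-Computers matrix described above, the sketch below builds one from a list of run records and lists the cells that would need to be predicted. The record format, the names `build_matrix` and `missing_cells`, and the sample timings are hypothetical and not taken from the paper.

```python
# Hypothetical run records: (program, installation, execution time in seconds).
runs = [
    ("prog_a", "inst_1", 120.0),
    ("prog_a", "inst_2", 240.0),
    ("prog_b", "inst_1", 60.0),
    ("prog_b", "inst_3", 90.0),
]

def build_matrix(records):
    """Return (programs, installations, matrix); unknown cells are None."""
    programs = sorted({p for p, _, _ in records})
    installations = sorted({i for _, i, _ in records})
    row = {p: r for r, p in enumerate(programs)}
    col = {i: c for c, i in enumerate(installations)}
    # Rows are programs, columns are installations, as in the article.
    matrix = [[None] * len(installations) for _ in programs]
    for p, i, t in records:
        matrix[row[p]][col[i]] = t
    return programs, installations, matrix

def missing_cells(matrix):
    """List (row, column) positions that have no recorded execution time."""
    return [(r, c)
            for r, values in enumerate(matrix)
            for c, v in enumerate(values)
            if v is None]

programs, installations, matrix = build_matrix(runs)
to_predict = missing_cells(matrix)
```

Here `to_predict` holds the program/installation pairs with no recorded run, which are exactly the entries the two methods try to estimate.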
The first approach uses the Pearson correlation coefficient to identify computer installations with similar performance patterns. When installations show high correlation (close to 1 in absolute value), they form "cliques" where execution times scale predictably. This method works well when the matrix is densely populated, with at least 95% of entries filled, achieving prediction errors as low as 6.8%.
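One way to picture the correlation-based idea is the minimal sketch below: it computes the Pearson coefficient between two installations over the programs both have run and, if the correlation is strong, uses a least-squares line to transfer a known time from one installation to the other. The 0.95 threshold, the minimum of three shared programs, the linear fit, and the function name `predict_from_similar` are assumptions made for illustration; the paper's actual grouping procedure may differ.

```python
from math import sqrt

def predict_from_similar(reference, target, ref_time, threshold=0.95):
    """Estimate a time on `target` from `ref_time` measured on `reference`.

    `reference` and `target` are per-program execution times on two
    installations (None = not measured). Returns None when the two
    installations are not strongly correlated on their shared programs.
    """
    pairs = [(r, t) for r, t in zip(reference, target)
             if r is not None and t is not None]
    if len(pairs) < 3:  # assumed minimum overlap for a meaningful correlation
        return None
    xs = [r for r, _ in pairs]
    ys = [t for _, t in pairs]
    n = len(pairs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    r = cov / sqrt(var_x * var_y)  # Pearson correlation coefficient
    if abs(r) < threshold:
        return None
    # Least-squares line mapping reference times to target times.
    slope = cov / var_x
    intercept = mean_y - slope * mean_x
    return slope * ref_time + intercept

# Hypothetical timings: the fourth program has not run on the target yet.
reference_times = [10.0, 20.0, 30.0, 40.0]
target_times = [21.0, 41.0, 61.0, None]
estimate = predict_from_similar(reference_times, target_times, reference_times[3])
```

Because the fit depends on only a handful of shared programs, a single outlier can distort both the correlation and the line, which is consistent with the outlier sensitivity the researchers report for the grouping method.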
The second approach applies matrix decomposition techniques commonly used in recommendation systems. By breaking down the Programs-Computers matrix into component matrices, the method generates vector representations (embeddings) for both programs and installations. The remarkable property of these 1-dimensional embeddings is that they establish a performance ordering—installations with smaller embedding values generally run programs faster. This method excels with sparse matrices, maintaining reasonable accuracy even when 80% of execution time data is missing.
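To make the embedding idea concrete, here is a minimal sketch of a rank-1 factorization that approximates each known execution time as the product of a program value and an installation value, fitted only on observed cells. It uses alternating least squares, which is a common choice in recommendation systems but is an assumption here; the article does not state which decomposition algorithm the researchers used. The function name `factorize_rank1` and the sample data are hypothetical.

```python
def factorize_rank1(matrix, iterations=500):
    """Fit matrix[p][i] ~ u[p] * v[i] on observed cells (None = missing).

    Alternating least squares: hold v fixed and solve for each u[p],
    then hold u fixed and solve for each v[i].
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    u = [1.0] * n_rows  # 1-dimensional program embeddings
    v = [1.0] * n_cols  # 1-dimensional installation embeddings
    for _ in range(iterations):
        for p in range(n_rows):
            obs = [(matrix[p][i], v[i]) for i in range(n_cols)
                   if matrix[p][i] is not None]
            denom = sum(w * w for _, w in obs)
            if denom > 0:  # skip programs with no observations
                u[p] = sum(t * w for t, w in obs) / denom
        for i in range(n_cols):
            obs = [(matrix[p][i], u[p]) for p in range(n_rows)
                   if matrix[p][i] is not None]
            denom = sum(w * w for _, w in obs)
            if denom > 0:  # skip installations with no observations
                v[i] = sum(t * w for t, w in obs) / denom
    return u, v

# Hypothetical times (rows: programs, columns: installations), two cells missing.
times = [
    [10.0, 20.0, 40.0],
    [20.0, 40.0, None],
    [30.0, None, 120.0],
]
u, v = factorize_rank1(times)
predicted = u[1] * v[2]  # estimate for program 1 on installation 2
ranking = sorted(range(len(v)), key=lambda i: v[i])  # smaller value = faster
```

In this toy data every row is a multiple of the first, so the estimate for the missing cell should approach 80 and the installation values sort from fastest to slowest. Real benchmark data would not be exactly rank-1, so the fit would only be approximate.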
Experimental results using MPI2007 benchmarks (396 installations, 13 programs) and OpenMP benchmarks (25 installations, 15 programs) showed that ensemble methods combining both approaches delivered the most robust predictions. For matrices with up to 4% missing data, grouping-based methods performed best. For sparser matrices (15-94% missing data), the embedding approach dominated, with prediction errors staying below 60% even with 80% of data removed.
For everyday cloud computing users, this means faster job completion, as cloud orchestrators can make smarter decisions about where to run programs. Instead of relying on extensive execution histories or test runs, which often aren't available, cloud systems can now estimate performance across federated installations using minimal data. The embedding approach is particularly valuable because it requires only the program's identifier and requested resources, information typically available in modern computing environments.
The research acknowledges limitations, including sensitivity to outliers in the grouping method and the challenge of maintaining accurate predictions as computing environments dynamically change. The methods assume program behavior remains consistent across different arguments and that installation performance characteristics stay relatively stable over time—assumptions that may not always hold in practice.
About the Author
Guilherme A.
Former dentist (MD) from Brazil, 41 years old, husband, and AI enthusiast. In 2020, he transitioned from a decade-long career in dentistry to pursue his passion for technology, entrepreneurship, and helping others grow.