USE OF LOOP TRANSFORMATION TECHNIQUES TO OPTIMIZE PARALLEL APPLICATIONS

O.A. Chemerys, Z. Kh. Borukayev, I.V. Blinov

Èlektron. model. 2021, 44(1):53-68

https://doi.org/10.15407/emodel.44.01.053

ABSTRACT

The possibility of programs optimizing, in particular, existing parallel ones, is considered. Minimization of program execution time on a parallel computer system was chosen as the optimization function. To optimize, we use algorithms for affine transformation of the iterative space of loop operators, each of which is represented as a graph based on the dependencies between the operators that create relationships in the iteration graph of the loop operator. As an example, we consider the process of optimizing the software package MFDn, which is used in nuclear physics to find a multibody nuclear Hamiltonian. Optimized program run time decreasing is shown.

KEYWORDS

parallelization, program optimization, program transformation, affine transformations.

REFERENCES

  1. Sternberg, P. (2008), “Accelerating Configuration Interaction Calculations for Nuclear Structure”, Proceedings of the ACM/IEEE Conference on Supercomputing, Austin, TX, USA, рр. 1-12.
    https://doi.org/10.1109/SC.2008.5220090
  2. Collard, J.F. (1995), “Construction of do loops from systems of affine constraints”, Parallel Processing Letters, Vol. 5, pp. 421-436.
    https://doi.org/10.1142/S0129626495000394
  3. Chemeris, A., Gorunova, J. and Lazorenko, D. (2012), “Loop Nests Parallelization for Digital System Synthesis”, Proceedings of IEEE East-West Design & Test Symposium (EWDTS’2012), Kharkov, Ukraine, September 14–17, 2012, рр. 118-121.
  4. Bielecki, W. and Hyduke, S. (1999), “Kompilator jezyka VHDL do syntezy ukladow logicznych”, Reprogramowalne uklady cyfrowe (RUC’99), Szczecin, рр. 183-190.
  5. Lim, A.W. and Lam, M.S. (1995), “Communication-free parallelization via affine transformations”, Languages and Compilers for Parallel Computing: 7th International Workshop Ithaca, Berlin, Springer Berlin Heidelberg, рр. 92—106.
    https://doi.org/10.1007/BFb0025873
  6. Vasilache, N., Cohen, A. and Pouchet, L.-N. (2007), “Automatic Correction of Loop Transformations”, Parallel Architectures and Compilation Techniques, Conference Proceedings, PACT, Brasov, Romania, рр. 292-304.
    https://doi.org/10.1109/PACT.2007.4336220
  7. Fraboulet, A., Huard, G. and Mignotte, A. (1999), “Loop Alignment for Memory Accesses Optimization”, Twelfth International Symposium on System Synthesis, Piscataway, NJ, USA, рр. 71–77.
  8. Aktulga, H.M., Buluç, A., Williams, S. and Yang, C. (2014), “Optimizing Sparse Matrix-Multiple Vectors Multiplication for Nuclear Configuration Interaction Calculations”, 2014 IEEE 28th International Parallel and Distributed Processing Symposium, pp. 1213-1222, 
    https://doi.org/10.1109/IPDPS.2014.125
  9. Bielecki, W. (2003), “Finding Synchronization - Free Parallelism for Non-uniform Loops”, Proceedings of the Computational Science (ICCS 2003), Lecture Notes in Computer Science, Vol. 2658, pp. 925-934.
    https://doi.org/10.1007/3-540-44862-4_100

Full text: PDF