We are pleased to announce the long-awaited version 1.2 of HPCC. It contains many bug fixes, major features, and minor enhancements, many of which were contributed by users. The main focus of this release was to improve the accuracy of the reported results and to ensure that the code scales on the largest supercomputer installations, with hundreds of thousands of computational cores.
Summary of Changes
- Changes in the FFT component:
  - Added flexibility in choosing vector sizes and processor counts: the code now supports powers of 2, 3, and 5 in both the sequential and the parallel tests.
  - FFTW can now run with the ESTIMATE flag (not just MEASURE): this might produce worse performance results, but it often reduces the time needed to run the test and causes less memory fragmentation.
- Changes in the DGEMM component:
- Changes in the RandomAccess component:
  - Removed time-bound functionality: only runs that perform the complete computation are now possible.
  - Made the timing more accurate: main array initialization is no longer counted towards performance timing.
  - Cleaned up the code: some non-portable C language constructs have been removed.
  - Added new algorithms: new algorithms from Sandia, based on a hypercube network topology, can now be chosen at compile time, which yields much better performance on many types of parallel systems.
  - Fixed potential resource leaks by adding function calls required by the MPI standard.
- Changes in the HPL component:
- Changes in the PTRANS component:
- Miscellaneous changes:
  - Added better support for Windows-based clusters by taking advantage of the Win32 API.
  - Added a custom memory allocator to deal with memory fragmentation on some systems.
  - Added better reporting of configuration options in the output file.
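
To illustrate the FFT size flexibility mentioned in the list, the following minimal sketch checks whether a candidate vector size or process count has no prime factors other than 2, 3, and 5. It assumes that "powers of 2, 3, and 5" means products of such powers; the function name `is_235_size` is hypothetical and is not taken from the HPCC source.

```c
#include <stddef.h>

/* Hypothetical helper (not from HPCC): returns 1 if n > 0 and n has no
 * prime factors other than 2, 3 and 5, and 0 otherwise. */
static int is_235_size(unsigned long n)
{
    static const unsigned long primes[] = {2, 3, 5};
    size_t i;

    if (n == 0)
        return 0;

    /* Divide out every factor of 2, 3 and 5; anything left over is a
     * prime factor outside the supported set. */
    for (i = 0; i < sizeof primes / sizeof primes[0]; i++)
        while (n % primes[i] == 0)
            n /= primes[i];

    return n == 1;
}
```

Under this assumption, a size such as 1080 (2^3 · 3^3 · 5) would be accepted, while 1001 (7 · 11 · 13) would not.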
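
The two RandomAccess timing items in the list (no time bound, and initialization excluded from the measurement) can be sketched roughly as follows. This is a simplified, single-process illustration and not the HPCC implementation: the names `next_random`, `run_updates`, and `timed_updates` are hypothetical, the feedback constant `POLY` is an assumption of this sketch, and `clock()` stands in for whatever timer the benchmark actually uses.

```c
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Feedback constant assumed for this sketch. */
#define POLY 0x0000000000000007ULL

/* Advance the pseudo-random sequence: shift left, XOR in POLY if the
 * top bit was set before the shift. */
static uint64_t next_random(uint64_t r)
{
    return (r << 1) ^ ((r >> 63) ? POLY : 0);
}

/* Apply all n_updates XOR updates; there is no time limit that could
 * stop the loop early. table_size must be a power of two. */
static void run_updates(uint64_t *table, uint64_t table_size,
                        uint64_t n_updates)
{
    uint64_t r = 1, i;

    for (i = 0; i < n_updates; i++) {
        r = next_random(r);
        table[r & (table_size - 1)] ^= r;
    }
}

/* Returns the CPU seconds spent in the update loop only, or -1.0 if the
 * table cannot be allocated. The caller owns *table_out and frees it. */
static double timed_updates(uint64_t table_size, uint64_t n_updates,
                            uint64_t **table_out)
{
    uint64_t *table = malloc(table_size * sizeof *table);
    clock_t start, stop;
    uint64_t i;

    if (table == NULL)
        return -1.0;

    /* Initialization runs before the clock starts, so it is not timed. */
    for (i = 0; i < table_size; i++)
        table[i] = i;

    start = clock();
    run_updates(table, table_size, n_updates);
    stop = clock();

    *table_out = table;
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Because each update is an XOR, running the same update sequence a second time restores the initial table contents, which gives a simple way to check that every update was actually applied.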
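
The announcement does not describe the Sandia algorithms beyond their hypercube basis, so the following is only a generic sketch of how communication partners are usually derived on a hypercube, not the code shipped with HPCC. With 2^d processes, a process exchanges data in stage k with the process whose rank differs in bit k. The helper names `hypercube_partner` and `hypercube_hops` are hypothetical.

```c
/* Partner of `rank` along dimension `dim` of a hypercube: the rank
 * obtained by flipping bit `dim`. */
static unsigned hypercube_partner(unsigned rank, unsigned dim)
{
    return rank ^ (1u << dim);
}

/* Number of hypercube links between two ranks: one hop for every bit
 * in which the two ranks differ. */
static unsigned hypercube_hops(unsigned src, unsigned dest)
{
    unsigned diff = src ^ dest;
    unsigned hops = 0;

    while (diff != 0) {
        hops += diff & 1u;
        diff >>= 1;
    }
    return hops;
}
```

In a scheme of this kind, data reaches any destination after at most d exchange stages, so each process talks to d partners instead of to every other process directly.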