The first release of the benchmark consists of the following tests:
- HPL benchmark (MPI on whole system)
- DGEMM (single CPU)
- *DGEMM (embarrassingly parallel)
- STREAM (single CPU)
- *STREAM (embarrassingly parallel)
- PTRANS A = A + B^T (MPI on whole system)
- RandomAccess (single CPU)
- RandomAccess (MPI on whole system)
- MPI-FFTE (MPI on whole system)
- FFTE (single CPU)
- *FFTE (embarrassingly parallel)
- *RandomAccess (embarrassingly parallel)
- Latency/Bandwidth (under varying conditions and between
multiple pairs of nodes)
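
To make the PTRANS entry in the list above concrete, here is a minimal serial sketch of the update A = A + B^T on square row-major matrices. The helper name `add_transpose` is hypothetical and is not taken from the benchmark source. The actual test distributes both matrices across MPI processes, so this shows only the arithmetic, not the communication pattern being measured.

```c
#include <stddef.h>

/* Serial sketch of the PTRANS update A = A + B^T for n-by-n matrices
   stored in row-major order. add_transpose is a hypothetical helper,
   not a routine from the benchmark source. */
void add_transpose(size_t n, double *a, const double *b)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            a[i * n + j] += b[j * n + i];   /* A(i,j) += B(j,i) */
        }
    }
}
```

In the distributed setting, the element B(j,i) generally lives on a different process than A(i,j), which is why this test stresses the interconnect rather than the floating-point units.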
Rules for Running the Benchmark
There must be one baseline run submitted for each computer system entered in
the archive. An optimized run may also be submitted for each computer system.
- Baseline Runs
Optimizations as described below are allowed.
- Compile and load options
Compiler or loader flags which are supported and documented by the supplier
are allowed. These include porting, optimization, and preprocessor
invocation.
- Libraries
Linking to optimized versions of the following libraries is allowed:
Acceptable use of such libraries is subject to the following rules:
- All libraries used shall be disclosed with the results submission.
Each library shall be identified by library name, revision,
and institution supplying the source code.
- Libraries which are not generally available are not permitted unless
they are made available by the reporting organization within 6 months.
Upon request, these libraries should be usable by others (possibly under
NDA).
- Calls to library subroutines should have the same syntax and semantics
as in the released benchmark code. Code modifications to accommodate
various library call formats are not allowed.
- Software Tools
Any tools used to build and run the benchmark (including pre-processors,
compilers, static and dynamic linkers, operating systems) must be generally
available on the tested system (or they must be made available by the
reporting organization within 6 months).
- Only complete benchmark output may be submitted; partial results will
not be accepted.
- Optimized Runs
- Libraries
Linking to optimized versions of the following libraries is
allowed:
Upon request, these libraries should be usable by others (possibly under NDA).
- Code modification
Provided that the input and output specification is
preserved, the following routines may be substituted:
- In HPL:
HPL_pdgesv(), HPL_pdtrsv()
(factorization and substitution functions)
- No changes are allowed in the DGEMM testing harness, and
the substituted DGEMM routine (if any) should conform to the BLAS definition.
- In PTRANS:
pdtrans()
- In STREAM:
tuned_STREAM_Copy(),
tuned_STREAM_Scale(),
tuned_STREAM_Add(),
tuned_STREAM_Triad()
- In RandomAccess:
Power2NodesMPIRandomAccessUpdate(),
AnyNodesMPIRandomAccessUpdate(),
and RandomAccessUpdate()
- In FFTE:
fftw_malloc(),
fftw_free(),
fftw_create_plan(),
fftw_one(),
fftw_destroy_plan(),
fftw_mpi_create_plan(),
fftw_mpi_local_sizes(),
fftw_mpi(),
fftw_mpi_destroy_plan() (all of these functions
are compatible with FFTW 2.1.5, so the benchmark code
can be linked directly against FFTW 2.1.5 by adding only the
proper compiler and linker flags, including
-DUSING_FFTW)
- Changes are allowed in parts of the b_eff
component, but portability and conformance to the MPI
standard (MPI 1.1 or later) must be preserved.
A detailed list of removed and added MPI function calls must
be provided upon submission. Modified source code is
subject to review by the HPC Challenge Committee.
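
As a reminder of what a substituted STREAM routine has to preserve, the sketch below spells out the arithmetic of the four standard STREAM kernels (Copy, Scale, Add, Triad). The function names and explicit array parameters are assumptions made for illustration; they are not the signatures of the tuned_STREAM_*() routines in the benchmark harness, which should be taken from the released source.

```c
#include <stddef.h>

/* Reference arithmetic of the four STREAM kernels on arrays of length n.
   The sketch_* names and parameter lists are hypothetical; the benchmark's
   tuned_STREAM_*() routines have their own interface. */

void sketch_copy(size_t n, double *c, const double *a)
{
    for (size_t i = 0; i < n; i++)
        c[i] = a[i];                 /* Copy:  c = a */
}

void sketch_scale(size_t n, double *b, const double *c, double s)
{
    for (size_t i = 0; i < n; i++)
        b[i] = s * c[i];             /* Scale: b = s * c */
}

void sketch_add(size_t n, double *c, const double *a, const double *b)
{
    for (size_t i = 0; i < n; i++)
        c[i] = a[i] + b[i];          /* Add:   c = a + b */
}

void sketch_triad(size_t n, double *a, const double *b, const double *c,
                  double s)
{
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + s * c[i];      /* Triad: a = b + s * c */
}
```

A tuned replacement may change how these loops are scheduled (for example, through vectorization or threading), but under the rule above it must leave the inputs and outputs as specified.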
- Limitations of Optimization
- Code with limited calculation accuracy
The calculation should be carried out in full precision (64-bit or
the equivalent). However, the substitution of algorithms is allowed (see
Exchange of the used mathematical algorithm).
- Exchange of the used mathematical algorithm
Any change of algorithms must be fully disclosed and is subject to review
by the HPC Challenge Committee. Passing the verification test is a necessary
condition for such an approval. The substituted algorithm must be as robust
as the baseline algorithm. For the matrix multiply in the HPL benchmark,
the Strassen algorithm may not be used because it changes the operation
count of the algorithm.
- Using the knowledge of the solution
Any modification of the code or input data sets that uses knowledge
of the solution or of the verification test is not permitted.
- Code to circumvent the actual computation
Any modification of the code to circumvent the actual computation is not
permitted.
- Only complete benchmark output may be submitted; partial results will
not be accepted.
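
The restriction on the Strassen algorithm under "Exchange of the used mathematical algorithm" can be illustrated by counting scalar multiplications. The sketch below assumes square matrices of order n = 2^k and a Strassen recursion carried all the way down to 1-by-1 blocks; both function names are hypothetical. The classical algorithm needs n^3 = 8^k multiplications, while Strassen needs 7^k, so a rate computed from the classical operation count would overstate the work actually performed.

```c
#include <stdint.h>

/* Scalar multiplications in an n-by-n matrix product, n = 2^k.
   Both helpers are illustrative only and valid for k <= 21, where
   the results still fit in 64 bits. */

/* Classical algorithm: n^3 = 8^k multiplications. */
uint64_t classical_mults(unsigned k)
{
    uint64_t n = (uint64_t)1 << k;
    return n * n * n;
}

/* Strassen recursion down to 1-by-1 blocks: 7 half-size products
   per level, giving 7^k multiplications. */
uint64_t strassen_mults(unsigned k)
{
    uint64_t m = 1;
    for (unsigned i = 0; i < k; i++)
        m *= 7;
    return m;
}
```

For n = 1024 (k = 10), this gives 1073741824 classical multiplications against 282475249 for Strassen, roughly a factor of 3.8.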