Base Results
Optimized Results
Base and Optimized
Manufacturer/Processor Type, Speed, Count, Threads, Processes
Includes the manufacturer/processor type, processor speed, number of processors, threads, and number of processes.
Move the mouse over this column in any row to display additional information, including manufacturer, system name, interconnect, MPI, affiliation, and submission date.

Run Type

Run Type indicates whether the benchmark was a base run or an optimized run.

Processors

Processors is the number of processors used in the benchmark, as entered in the form by the benchmark submitter.

EP-STREAM ( per process )
The Embarrassingly Parallel STREAM benchmark is a simple synthetic benchmark program that measures sustainable memory bandwidth and the corresponding computation rate for simple numerical vector kernels. It is run in an embarrassingly parallel manner: all computational processes perform the benchmark at the same time, and the arithmetic average rate is reported. Unit: Giga Bytes per Second
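As a rough illustration of what a "simple numerical vector kernel" and its bandwidth figure look like, the sketch below uses the triad operation a = b + q*c. The choice of triad, the 8-byte word size, and all function names are assumptions made for this example; the page itself does not say which kernel the reported number comes from.

```python
def triad(b, c, q):
    """Vector kernel a[i] = b[i] + q * c[i] (assumed example kernel)."""
    return [bi + q * ci for bi, ci in zip(b, c)]


def triad_bandwidth_gbs(n, seconds, bytes_per_word=8):
    """Bandwidth in GB/s, counting two arrays read and one written."""
    return 3 * n * bytes_per_word / seconds / 1e9


def average_rate(rates):
    """Arithmetic average of the per-process rates, as described above."""
    return sum(rates) / len(rates)
```

In an embarrassingly parallel run, every process would time its own kernel and the per-process rates would then be passed to average_rate.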
EP-DGEMM ( per process )
The Embarrassingly Parallel DGEMM benchmark measures the floating-point execution rate of double precision real matrix-matrix multiply performed by the DGEMM subroutine from the BLAS (Basic Linear Algebra Subprograms). It is run in an embarrassingly parallel manner: all computational processes perform the benchmark at the same time, and the arithmetic average rate is reported. Unit: Giga Flops per Second
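The operation DGEMM performs, C = alpha*A*B + beta*C, can be sketched in a few lines. This is a plain-Python illustration and not the BLAS routine; the function names are hypothetical, and the 2*m*n*k operation count is the usual convention for matrix multiply, assumed here rather than taken from this page.

```python
def dgemm(alpha, a, b, beta, c):
    """Return alpha * (a @ b) + beta * c for lists of lists."""
    m, k, n = len(a), len(b), len(b[0])
    return [
        [alpha * sum(a[i][p] * b[p][j] for p in range(k)) + beta * c[i][j]
         for j in range(n)]
        for i in range(m)
    ]


def dgemm_gflops(m, n, k, seconds):
    """Execution rate in GFlop/s, assuming 2*m*n*k floating-point operations."""
    return 2.0 * m * n * k / seconds / 1e9
```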
Randomly Ordered Ring Bandwidth ( per process )
Randomly Ordered Ring Bandwidth reports the bandwidth achieved in the ring communication pattern. The communicating processes are ordered randomly in the ring (with respect to the natural ordering of the MPI default communicator). The result is averaged over various random assignments of processes in the ring. Unit: Giga Bytes per second per process
Randomly-Ordered Ring Latency ( per process )
Randomly-Ordered Ring Latency reports the latency in the ring communication pattern. The communicating processes are ordered randomly in the ring (with respect to the natural ordering of the MPI default communicator). The result is averaged over various random assignments of processes in the ring. Unit: microseconds
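The random ring ordering used by both ring measurements can be sketched as follows. This only builds the neighbor assignment and averages a caller-supplied measurement over several random rings; it performs no MPI communication, and the names random_ring, mean_over_rings, and measure are hypothetical.

```python
import random


def random_ring(n_procs, rng):
    """Map each rank to its (left, right) neighbors in a shuffled ring."""
    order = list(range(n_procs))
    rng.shuffle(order)
    neighbors = {}
    for pos, rank in enumerate(order):
        left = order[(pos - 1) % n_procs]
        right = order[(pos + 1) % n_procs]
        neighbors[rank] = (left, right)
    return neighbors


def mean_over_rings(measure, n_procs, n_rings, seed=0):
    """Average measure(ring) over n_rings random process assignments."""
    rng = random.Random(seed)
    results = [measure(random_ring(n_procs, rng)) for _ in range(n_rings)]
    return sum(results) / len(results)
```

In a real run, measure would time messages exchanged between each rank and its two ring neighbors.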







Global and per Processor Results - Optimized Runs Only - 41 Systems - Generated on Thu Jun 23 16:00:34 2022
System Information
System - Processor - Speed - Count - Threads - Processes
MA/PT/PS/PC/TH/PR/CM/CS/IC/IA/SD
G-HPL (TFlop/s)   PP-HPL (TFlop/s)   G-PTRANS (GB/s)   PP-PTRANS (GB/s)   G-RandomAccess (Gup/s)   PP-RandomAccess (Gup/s)   G-FFT (GFlop/s)   PP-FFT (GFlop/s)
Manufacturer: Cray Inc.
Processor Type: Cray X1 MSP
Processor Speed: 0.8GHz
Processor Count: 252
Threads: 1
Processes: 252
System Name: X1
Interconnect: X1
MPI: MPT 2.4
Affiliation: Oak Ridge National Laboratory
Submission Date: 04-26-04
Cray Inc. X1 Cray MSP   0.8GHz   252   1   252
G-HPL: 2.37 TFlop/s
PP-HPL: 0.0094 TFlop/s
G-PTRANS: 96.14 GB/s
PP-PTRANS: 0.3815 GB/s




Manufacturer: Cray Inc.
Processor Type: Cray X1 MSP
Processor Speed: 0.8GHz
Processor Count: 60
Threads: 1
Processes: 60
System Name: X1
Interconnect: Cray modified 2D torus
MPI: MPT 2.4
Affiliation: U.S. Army Engineer Research and Development Center Major Shared Resource Center
Submission Date: 04-26-04
Cray Inc. X1 Cray MSP   0.8GHz   60   1   60
G-HPL: 0.58 TFlop/s
PP-HPL: 0.0096 TFlop/s
G-PTRANS: 31.07 GB/s
PP-PTRANS: 0.5179 GB/s




Manufacturer: Cray Inc.
Processor Type: Cray X1 MSP
Processor Speed: 0.8GHz
Processor Count: 124
Threads: 1
Processes: 124
System Name: X1
Interconnect: Cray modified 2D torus
MPI: MPT.2.3.0.3
Affiliation: Army High Performance Computing Research Center (AHPCRC)
Submission Date: 05-03-04
Cray Inc. X1 Cray MSP   0.8GHz   124   1   124
G-HPL: 1.18 TFlop/s
PP-HPL: 0.0095 TFlop/s
G-PTRANS: 39.38 GB/s
PP-PTRANS: 0.3176 GB/s




Manufacturer: Cray Inc.
Processor Type: Cray X1 MSP
Processor Speed: 0.8GHz
Processor Count: 124
Threads: 1
Processes: 124
System Name: X1
Interconnect: Cray modified 2D torus
MPI: MPT 2.3.0.3
Affiliation: Army High Performance Computing Research Center (AHPCRC)
Submission Date: 05-05-04
Cray Inc. X1 Cray MSP   0.8GHz   124   1   124
G-HPL: 1.18 TFlop/s
PP-HPL: 0.0095 TFlop/s
G-PTRANS: 39.38 GB/s
PP-PTRANS: 0.3176 GB/s




Manufacturer: IBM
Processor Type: IBM PowerPC 440
Processor Speed: 0.7GHz
Processor Count: 1024
Threads: 1
Processes: 1024
System Name: Blue Gene/L
Interconnect: Custom
MPI: MPICH 1.0 customized for Blue Gene/L
Affiliation: Blue Gene Computational Center at IBM T.J. Watson Research Center
Submission Date: 04-11-05
IBM Blue Gene/L PowerPC 440   0.7GHz   1024   1   1024
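G-HPL: 1.42 TFlop/s
PP-HPL: 0.0014 TFlop/s
G-PTRANS: 27.99 GB/s
PP-PTRANS: 0.0273 GB/s
G-RandomAccess: 0.13 Gup/s
PP-RandomAccess: 0.0001 Gup/s
G-FFT: 49.93 GFlop/s
PP-FFT: 0.0488 GFlop/s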
Manufacturer: Cray Inc.
Processor Type: Cray X1E
Processor Speed: 1.13GHz
Processor Count: 248
Threads: 1
Processes: 248
System Name: mfeg8
Interconnect: Modified 2D Torus
MPI: mpt 2.4
Affiliation: Cray
Submission Date: 06-15-05
Cray Inc. mfeg8 Cray X1E   1.13GHz   248   1   248
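G-HPL: 3.39 TFlop/s
PP-HPL: 0.0137 TFlop/s
G-PTRANS: 66.01 GB/s
PP-PTRANS: 0.2662 GB/s
G-RandomAccess: 1.85 Gup/s
PP-RandomAccess: 0.0075 Gup/s
G-FFT: -1.00 GFlop/s
PP-FFT: -0.0040 GFlop/s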
Manufacturer: Cray Inc.
Processor Type: Cray X1E
Processor Speed: 1.13GHz
Processor Count: 1008
Threads: 1
Processes: 1008
System Name: X1
Interconnect: Cray Modified 2D torus
MPI: MPT
Affiliation: DOE/Office of Science/ORNL
Submission Date: 11-02-05
Cray Inc. X1 Cray E   1.13GHz   1008   1   1008
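G-HPL: 12.27 TFlop/s
PP-HPL: 0.0122 TFlop/s
G-PTRANS: 144.97 GB/s
PP-PTRANS: 0.1438 GB/s
G-RandomAccess: 7.69 Gup/s
PP-RandomAccess: 0.0076 Gup/s
G-FFT: 245.09 GFlop/s
PP-FFT: 0.2431 GFlop/s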
Manufacturer: IBM
Processor Type: IBM PowerPC 440
Processor Speed: 0.7GHz
Processor Count: 131072
Threads: 1
Processes: 65536
System Name: Blue Gene/L
Interconnect: Custom Torus / Tree
MPI: MPICH2 1.0.1
Affiliation: National Nuclear Security Administration
Submission Date: 11-02-05
IBM Blue Gene/L PowerPC 440   0.7GHz   131072   1   65536
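G-HPL: 252.30 TFlop/s
PP-HPL: 0.0019 TFlop/s
G-PTRANS: 369.63 GB/s
PP-PTRANS: 0.0028 GB/s
G-RandomAccess: 35.47 Gup/s
PP-RandomAccess: 0.0003 Gup/s
G-FFT: 2311.09 GFlop/s
PP-FFT: 0.0176 GFlop/s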
Manufacturer: IBM
Processor Type: IBM PowerPC 440
Processor Speed: 0.7GHz
Processor Count: 131072
Threads: 1
Processes: 65536
System Name: Blue Gene/L
Interconnect: Custom Torus / Tree
MPI: MPICH2 1.0.1
Affiliation: National Nuclear Security Administration
Submission Date: 11-02-05
IBM Blue Gene/L PowerPC 440   0.7GHz   131072   1   65536
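G-HPL: 259.21 TFlop/s
PP-HPL: 0.0020 TFlop/s
G-PTRANS: 374.42 GB/s
PP-PTRANS: 0.0029 GB/s
G-RandomAccess: 32.98 Gup/s
PP-RandomAccess: 0.0003 Gup/s
G-FFT: 2228.39 GFlop/s
PP-FFT: 0.0170 GFlop/s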
Manufacturer: IBM
Processor Type: IBM PowerPC 440
Processor Speed: 0.7GHz
Processor Count: 32768
Threads: 1
Processes: 16384
System Name: Blue Gene/L
Interconnect: Blue Gene Custom Interconnect
MPI: MPICH 1.1
Affiliation: IBM T.J. Watson Research Center
Submission Date: 11-04-05
IBM Blue Gene/L PowerPC 440   0.7GHz   32768   1   16384
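G-HPL: 67.12 TFlop/s
PP-HPL: 0.0020 TFlop/s
G-PTRANS: 137.24 GB/s
PP-PTRANS: 0.0042 GB/s
G-RandomAccess: 17.29 Gup/s
PP-RandomAccess: 0.0005 Gup/s
G-FFT: 988.18 GFlop/s
PP-FFT: 0.0302 GFlop/s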
Manufacturer: Cray Inc.
Processor Type: AMD Opteron
Processor Speed: 2.4GHz
Processor Count: 5208
Threads: 1
Processes: 5208
System Name: XT3
Interconnect: Cray Seastar
MPI: xt-mpt/1.3.07
Affiliation: Oak Ridge National Laboratory, DOE Office of Science
Submission Date: 11-10-05
Cray Inc. XT3 AMD Opteron   2.4GHz   5208   1   5208
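G-HPL: 20.42 TFlop/s
PP-HPL: 0.0039 TFlop/s
G-PTRANS: 942.25 GB/s
PP-PTRANS: 0.1809 GB/s
G-RandomAccess: 0.66 Gup/s
PP-RandomAccess: 0.0001 Gup/s
G-FFT: 779.43 GFlop/s
PP-FFT: 0.1497 GFlop/s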
Manufacturer: Cray Inc.
Processor Type: AMD Opteron
Processor Speed: 2.4GHz
Processor Count: 5208
Threads: 1
Processes: 5208
System Name: XT3
Interconnect: Cray Seastar
MPI: xt-mpt/1.3.07
Affiliation: Oak Ridge National Laboratories - DOE Office of Science
Submission Date: 11-12-05
Cray Inc. XT3 AMD Opteron   2.4GHz   5208   1   5208
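G-HPL: 20.42 TFlop/s
PP-HPL: 0.0039 TFlop/s
G-PTRANS: 942.25 GB/s
PP-PTRANS: 0.1809 GB/s
G-RandomAccess: 0.66 Gup/s
PP-RandomAccess: 0.0001 Gup/s
G-FFT: 779.43 GFlop/s
PP-FFT: 0.1497 GFlop/s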
Manufacturer: Cray Inc.
Processor Type: AMD Opteron
Processor Speed: 2.4GHz
Processor Count: 5208
Threads: 1
Processes: 5208
System Name: XT3
Interconnect: Cray Seastar
MPI: xt-mpt/1.3.07
Affiliation: Oak Ridge National Lab - DOD Office of Science
Submission Date: 11-12-05
Cray Inc. XT3 AMD Opteron   2.4GHz   5208   1   5208
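G-HPL: 20.34 TFlop/s
PP-HPL: 0.0039 TFlop/s
G-PTRANS: 944.21 GB/s
PP-PTRANS: 0.1813 GB/s
G-RandomAccess: 0.69 Gup/s
PP-RandomAccess: 0.0001 Gup/s
G-FFT: 855.24 GFlop/s
PP-FFT: 0.1642 GFlop/s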
Manufacturer: NEC
Processor Type: NEC SX-8
Processor Speed: 2GHz
Processor Count: 40
Threads: 1
Processes: 40
System Name: NEC SX-7C
Interconnect: IXS
MPI: MPI/SX 7.1.3
Affiliation: Tohoku University, Information Synergy Center
Submission Date: 03-24-06
NEC SX-7C SX-8   2GHz   40   1   40
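G-HPL: 0.61 TFlop/s
PP-HPL: 0.0153 TFlop/s
G-PTRANS: 70.09 GB/s
PP-PTRANS: 1.7523 GB/s
G-RandomAccess: 0.01 Gup/s
PP-RandomAccess: 0.0002 Gup/s
G-FFT: 92.83 GFlop/s
PP-FFT: 2.3207 GFlop/s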
Manufacturer: NEC
Processor Type: NEC SX-8
Processor Speed: 2GHz
Processor Count: 40
Threads: 8
Processes: 5
System Name: NEC SX-7C
Interconnect: IXS
MPI: MPI/SX 7.1.3
Affiliation: Tohoku University, Information Synergy Center
Submission Date: 03-24-06
NEC SX-7C SX-8   2GHz   40   8   5
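G-HPL: 0.30 TFlop/s
PP-HPL: 0.0076 TFlop/s
G-PTRANS: 20.13 GB/s
PP-PTRANS: 0.5032 GB/s
G-RandomAccess: 0.00 Gup/s
PP-RandomAccess: 0.0001 Gup/s
G-FFT: 29.62 GFlop/s
PP-FFT: 0.7406 GFlop/s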
Manufacturer: NEC
Processor Type: NEC SX-7
Processor Speed: 0.552GHz
Processor Count: 32
Threads: 1
Processes: 32
System Name: NEC SX-7
Interconnect: non
MPI: MPI/SX 7.0.6
Affiliation: Tohoku University, Information Synergy Center
Submission Date: 03-24-06
NEC SX-7   0.552GHz   32   1   32
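G-HPL: 0.26 TFlop/s
PP-HPL: 0.0082 TFlop/s
G-PTRANS: 36.17 GB/s
PP-PTRANS: 1.1303 GB/s
G-RandomAccess: 0.26 Gup/s
PP-RandomAccess: 0.0081 Gup/s
G-FFT: 79.48 GFlop/s
PP-FFT: 2.4837 GFlop/s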
Manufacturer: NEC
Processor Type: NEC SX-7
Processor Speed: 0.552GHz
Processor Count: 32
Threads: 16
Processes: 2
System Name: NEC SX-7
Interconnect: non
MPI: MPI/SX 7.0.6
Affiliation: Tohoku University, Information Synergy Center
Submission Date: 03-24-06
NEC SX-7   0.552GHz   32   16   2
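G-HPL: 0.18 TFlop/s
PP-HPL: 0.0056 TFlop/s
G-PTRANS: 22.02 GB/s
PP-PTRANS: 0.6881 GB/s
G-RandomAccess: 0.15 Gup/s
PP-RandomAccess: 0.0046 Gup/s
G-FFT: 8.00 GFlop/s
PP-FFT: 0.2500 GFlop/s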
Manufacturer: IBM
Processor Type: IBM Power5+
Processor Speed: 2.2GHz
Processor Count: 64
Threads: 1
Processes: 64
System Name: P5 P575+
Interconnect: HPS
MPI: poe 4.2.2.3
Affiliation: IBM
Submission Date: 05-08-06
IBM P5 P575+ Power5+   2.2GHz   64   1   64
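G-HPL: 0.49 TFlop/s
PP-HPL: 0.0077 TFlop/s
G-PTRANS: 44.30 GB/s
PP-PTRANS: 0.6922 GB/s
G-RandomAccess: 0.26 Gup/s
PP-RandomAccess: 0.0041 Gup/s
G-FFT: 23.25 GFlop/s
PP-FFT: 0.3632 GFlop/s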
Manufacturer: IBM
Processor Type: IBM Power5+
Processor Speed: 2.2GHz
Processor Count: 128
Threads: 1
Processes: 128
System Name: P5 P575+
Interconnect: HPS
MPI: poe 4.2.2.3
Affiliation: IBM
Submission Date: 05-08-06
IBM P5 P575+ Power5+   2.2GHz   128   1   128
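G-HPL: 0.99 TFlop/s
PP-HPL: 0.0077 TFlop/s
G-PTRANS: 89.99 GB/s
PP-PTRANS: 0.7031 GB/s
G-RandomAccess: 0.44 Gup/s
PP-RandomAccess: 0.0034 Gup/s
G-FFT: 41.48 GFlop/s
PP-FFT: 0.3241 GFlop/s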
Manufacturer: Cray Inc.
Processor Type: AMD Opteron
Processor Speed: 2.6GHz
Processor Count: 10404
Threads: 1
Processes: 10404
System Name: XT3 Dual-Core
Interconnect: Cray SeaStar
MPI: xt-mpt 1.5.25
Affiliation: Oak Ridge National Lab
Submission Date: 11-06-06
Cray Inc. XT3 Dual-Core AMD Opteron   2.6GHz   10404   1   10404
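G-HPL: 43.51 TFlop/s
PP-HPL: 0.0042 TFlop/s
G-PTRANS: 2038.92 GB/s
PP-PTRANS: 0.1960 GB/s
G-RandomAccess: 10.67 Gup/s
PP-RandomAccess: 0.0010 Gup/s
G-FFT: 1122.70 GFlop/s
PP-FFT: 0.1079 GFlop/s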
Manufacturer: Cray Inc.
Processor Type: AMD Opteron
Processor Speed: 2.4GHz
Processor Count: 12960
Threads: 1
Processes: 25920
System Name: Red Storm/XT3
Interconnect: Cray custom
MPI: MPICH 2 v1.0.2
Affiliation: NNSA/Sandia National Laboratories
Submission Date: 11-10-06
Cray Inc. Red Storm/XT3 AMD Opteron   2.4GHz   12960   1   25920
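G-HPL: 90.99 TFlop/s
PP-HPL: 0.0070 TFlop/s
G-PTRANS: 2351.50 GB/s
PP-PTRANS: 0.1814 GB/s
G-RandomAccess: 29.82 Gup/s
PP-RandomAccess: 0.0023 Gup/s
G-FFT: 1529.14 GFlop/s
PP-FFT: 0.1180 GFlop/s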
Manufacturer: Cray Inc.
Processor Type: AMD Opteron
Processor Speed: 2.4GHz
Processor Count: 12800
Threads: 1
Processes: 25600
System Name: Red Storm/XT3
Interconnect: Seastar
MPI: xt-mpt/1.5.39 based on MPICH 2.0
Affiliation: DOE/NNSA/Sandia National Laboratories
Submission Date: 11-06-07
Cray Inc. Red Storm/XT3 AMD Opteron   2.4GHz   12800   1   25600
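G-HPL: 93.58 TFlop/s
PP-HPL: 0.0073 TFlop/s
G-PTRANS: 4993.64 GB/s
PP-PTRANS: 0.3901 GB/s
G-RandomAccess: 33.56 Gup/s
PP-RandomAccess: 0.0026 Gup/s
G-FFT: 1515.42 GFlop/s
PP-FFT: 0.1184 GFlop/s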
Manufacturer: Cray Inc.
Processor Type: AMD Opteron
Processor Speed: 2.4GHz
Processor Count: 12960
Threads: 1
Processes: 25920
System Name: Red Storm/XT3
Interconnect: Seastar
MPI: xt-mpt/1.5.39 based on MPICH 2.0
Affiliation: DOE/NNSA/Sandia National Laboratories
Submission Date: 11-06-07
Cray Inc. Red Storm/XT3 AMD Opteron   2.4GHz   12960   1   25920
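G-HPL: 93.24 TFlop/s
PP-HPL: 0.0072 TFlop/s
G-PTRANS: 2371.40 GB/s
PP-PTRANS: 0.1830 GB/s
G-RandomAccess: 29.46 Gup/s
PP-RandomAccess: 0.0023 Gup/s
G-FFT: 2870.88 GFlop/s
PP-FFT: 0.2215 GFlop/s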
Manufacturer: NEC
Processor Type: NEC SX-9
Processor Speed: 3.2GHz
Processor Count: 32
Threads: 16
Processes: 2
System Name: SX-9
Interconnect: IXS
MPI: MPI/SX 8.0.0/ISC
Affiliation: TOHOKU UNIVERSITY
Submission Date: 11-06-08
NEC SX-9   3.2GHz   32   16   2
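G-HPL: 1.83 TFlop/s
PP-HPL: 0.0570 TFlop/s
G-PTRANS: 128.98 GB/s
PP-PTRANS: 4.0306 GB/s
G-RandomAccess: 0.10 Gup/s
PP-RandomAccess: 0.0030 Gup/s
G-FFT: 57.98 GFlop/s
PP-FFT: 1.8120 GFlop/s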
Manufacturer: NEC
Processor Type: NEC SX-9
Processor Speed: 3.2GHz
Processor Count: 256
Threads: 1
Processes: 256
System Name: SX-9
Interconnect: IXS
MPI: MPI/SX 8.0.0/ISC
Affiliation: TOHOKU UNIVERSITY
Submission Date: 11-06-08
NEC SX-9   3.2GHz   256   1   256
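G-HPL: 20.19 TFlop/s
PP-HPL: 0.0789 TFlop/s
G-PTRANS: 778.82 GB/s
PP-PTRANS: 3.0423 GB/s
G-RandomAccess: 1.40 Gup/s
PP-RandomAccess: 0.0055 Gup/s
G-FFT: 2377.31 GFlop/s
PP-FFT: 9.2864 GFlop/s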
Manufacturer: IBM
Processor Type: PowerPC 450
Processor Speed: 0.85GHz
Processor Count: 32768
Threads: 4
Processes: 32768
System Name: Blue Gene/P
Interconnect: Torus
MPI: MPICH 2
Affiliation: Argonne National Lab - LCF
Submission Date: 11-17-08
IBM Blue Gene/P PowerPC 450   0.85GHz   32768   4   32768
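G-HPL: 173.36 TFlop/s
PP-HPL: 0.0053 TFlop/s
G-PTRANS: 625.20 GB/s
PP-PTRANS: 0.0191 GB/s
G-RandomAccess: 103.18 Gup/s
PP-RandomAccess: 0.0031 Gup/s
G-FFT: 5079.59 GFlop/s
PP-FFT: 0.1550 GFlop/s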
Manufacturer: Cray, Inc.
Processor Type: AMD Opteron
Processor Speed: 2.6GHz
Processor Count: 98304
Threads: 3
Processes: 32768
System Name: XT5
Interconnect: SeaStar 2+
MPI: MPT 3.4.2
Affiliation: National Institute for Computational Sciences
Submission Date: 11-02-09
Cray, Inc. XT5 AMD Opteron   2.6GHz   98304   3   32768
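G-HPL: 657.62 TFlop/s
PP-HPL: 0.0067 TFlop/s
G-PTRANS: 1559.64 GB/s
PP-PTRANS: 0.0159 GB/s
G-RandomAccess: 18.50 Gup/s
PP-RandomAccess: 0.0002 Gup/s
G-FFT: 7529.50 GFlop/s
PP-FFT: 0.0766 GFlop/s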
Manufacturer: Cray
Processor Type: AMD Opteron
Processor Speed: 2.6GHz
Processor Count: 196608
Threads: 3
Processes: 65536
System Name: XT5
Interconnect: Seastar
MPI: MPT 3.4.2
Affiliation: Oak Ridge National Laboratory
Submission Date: 11-10-09
Cray XT5 AMD Opteron   2.6GHz   196608   3   65536
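G-HPL: 1338.67 TFlop/s
PP-HPL: 0.0068 TFlop/s
G-PTRANS: 1889.16 GB/s
PP-PTRANS: 0.0096 GB/s
G-RandomAccess: 36.43 Gup/s
PP-RandomAccess: 0.0002 Gup/s
G-FFT: 10698.50 GFlop/s
PP-FFT: 0.0544 GFlop/s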
Manufacturer: Cray
Processor Type: AMD Opteron
Processor Speed: 2.6GHz
Processor Count: 223112
Threads: 2
Processes: 111556
System Name: XT5
Interconnect: Seastar
MPI: MPT 3.4.2
Affiliation: Oak Ridge National Laboratory
Submission Date: 11-10-09
Cray XT5 AMD Opteron   2.6GHz   223112   2   111556
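G-HPL: 1467.66 TFlop/s
PP-HPL: 0.0066 TFlop/s
G-PTRANS: 13723.20 GB/s
PP-PTRANS: 0.0615 GB/s
G-RandomAccess: 37.69 Gup/s
PP-RandomAccess: 0.0002 Gup/s
G-FFT: 3879.21 GFlop/s
PP-FFT: 0.0174 GFlop/s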
Manufacturer: NEC
Processor Type: SX-9
Processor Speed: 3.2GHz
Processor Count: 960
Threads: 1
Processes: 960
System Name: SX-9
Interconnect: IXS
MPI: MPI/SX 8.0.10
Affiliation: Japan Agency for Marine-Earth Science and Technology (JAMSTEC)
Submission Date: 11-11-09
NEC SX-9   3.2GHz   960   1   960
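G-HPL: 79.55 TFlop/s
PP-HPL: 0.0829 TFlop/s
G-PTRANS: 2317.08 GB/s
PP-PTRANS: 2.4136 GB/s
G-RandomAccess: 2.07 Gup/s
PP-RandomAccess: 0.0022 Gup/s
G-FFT: 6942.39 GFlop/s
PP-FFT: 7.2317 GFlop/s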
Manufacturer: IBM
Processor Type: Power PC 450
Processor Speed: 0.85GHz
Processor Count: 131072
Threads: 4
Processes: 32768
System Name: Dawn
Interconnect: Custom Torus + Tree + Barrier
MPI: MPICH2 1.0.7
Affiliation: NNSA - Lawrence Livermore National Laboratory
Submission Date: 11-11-09
IBM Dawn Power PC 450   0.85GHz   131072   4   32768
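G-HPL: 367.82 TFlop/s
PP-HPL: 0.0028 TFlop/s
G-PTRANS: 757.12 GB/s
PP-PTRANS: 0.0058 GB/s
G-RandomAccess: 117.13 Gup/s
PP-RandomAccess: 0.0009 Gup/s
G-FFT: 3201.20 GFlop/s
PP-FFT: 0.0244 GFlop/s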
Manufacturer: NEC
Processor Type: SX-9
Processor Speed: 3.2GHz
Processor Count: 8
Threads: 1
Processes: 2
System Name: SX-9
Interconnect: IXS
MPI: MPI/SX 8.0.10
Affiliation: Japan Agency for Marine-Earth Science and Technology (JAMSTEC)
Submission Date: 11-16-09
NEC SX-9   3.2GHz   8   1   2
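G-HPL: 0.21 TFlop/s
PP-HPL: 0.0260 TFlop/s
G-PTRANS: 70.05 GB/s
PP-PTRANS: 8.7565 GB/s
G-RandomAccess: 0.16 Gup/s
PP-RandomAccess: 0.0197 Gup/s
G-FFT: 0.46 GFlop/s
PP-FFT: 0.0580 GFlop/s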
Manufacturer: NEC
Processor Type: SX-9
Processor Speed: 3.2GHz
Processor Count: 16
Threads: 1
Processes: 2
System Name: SX-9
Interconnect: IXS
MPI: MPI/SX 8.0.10
Affiliation: Japan Agency for Marine-Earth Science and Technology (JAMSTEC)
Submission Date: 11-16-09
NEC SX-9   3.2GHz   16   1   2
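G-HPL: 0.59 TFlop/s
PP-HPL: 0.0368 TFlop/s
G-PTRANS: 100.70 GB/s
PP-PTRANS: 6.2938 GB/s
G-RandomAccess: 0.09 Gup/s
PP-RandomAccess: 0.0057 Gup/s
G-FFT: 0.48 GFlop/s
PP-FFT: 0.0301 GFlop/s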
Manufacturer: NEC
Processor Type: SX-9
Processor Speed: 3.2GHz
Processor Count: 1280
Threads: 1
Processes: 1280
System Name: SX-9
Interconnect: IXS
MPI: MPI/SX 8.0.12a
Affiliation: Japan Agency for Marine-Earth Science and Technology (JAMSTEC)
Submission Date: 11-11-10
NEC SX-9   3.2GHz   1280   1   1280
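G-HPL: 100.28 TFlop/s
PP-HPL: 0.0783 TFlop/s
G-PTRANS: 3167.77 GB/s
PP-PTRANS: 2.4748 GB/s
G-RandomAccess: 2.58 Gup/s
PP-RandomAccess: 0.0020 Gup/s
G-FFT: 11876.00 GFlop/s
PP-FFT: 9.2781 GFlop/s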
Manufacturer: Fujitsu Ltd.
Processor Type: Fujitsu SPARC64 VIIIfx
Processor Speed: 2GHz
Processor Count: 147456
Threads: 8
Processes: 18432
System Name: K computer
Interconnect: Tofu interconnect
MPI: Parallelnavi Technical Computing Language V1.0L20
Affiliation: RIKEN Advanced Institute for Computational Science (AICS)
Submission Date: 10-31-11
Fujitsu Ltd. K computer Fujitsu SPARC64 VIIIfx   2GHz   147456   8   18432
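G-HPL: 2114.19 TFlop/s
PP-HPL: 0.0143 TFlop/s
G-PTRANS: 5935.73 GB/s
PP-PTRANS: 0.0403 GB/s
G-RandomAccess: 77.61 Gup/s
PP-RandomAccess: 0.0005 Gup/s
G-FFT: 32499.50 GFlop/s
PP-FFT: 0.2204 GFlop/s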
Manufacturer: Fujitsu Ltd.
Processor Type: Fujitsu SPARC64 VIIIfx
Processor Speed: 2GHz
Processor Count: 147456
Threads: 8
Processes: 18432
System Name: K computer
Interconnect: Tofu interconnect
MPI: Parallelnavi Technical Computing Language V1.0L20
Affiliation: RIKEN Advanced Institute for Computational Science (AICS)
Submission Date: 11-08-11
Fujitsu Ltd. K computer Fujitsu SPARC64 VIIIfx   2GHz   147456   8   18432
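G-HPL: 2117.70 TFlop/s
PP-HPL: 0.0144 TFlop/s
G-PTRANS: 5833.41 GB/s
PP-PTRANS: 0.0396 GB/s
G-RandomAccess: 121.10 Gup/s
PP-RandomAccess: 0.0008 Gup/s
G-FFT: 34718.50 GFlop/s
PP-FFT: 0.2354 GFlop/s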
Manufacturer: IBM
Processor Type: IBM Power7 Quad-Chip module
Processor Speed: 3.836GHz
Processor Count: 1470
Threads: 32
Processes: 1470
System Name: IBM Power775
Interconnect: IBM Hub Chip integrated interconnect
MPI: IBM PE MPI release 1206
Affiliation: IBM Development Engineering - DARPA Trial Subset
Submission Date: 07-15-12
IBM Power775 Power7 Quad-Chip module   3.836GHz   1470   32   1470
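G-HPL: 1067.79 TFlop/s
PP-HPL: 0.7264 TFlop/s
G-PTRANS: 46006.30 GB/s
PP-PTRANS: 31.2968 GB/s
G-RandomAccess: 1571.91 Gup/s
PP-RandomAccess: 1.0693 Gup/s
G-FFT: 94855.80 GFlop/s
PP-FFT: 64.5278 GFlop/s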
Manufacturer: Fujitsu
Processor Type: Fujitsu SPARC64 VIIIfx
Processor Speed: 2GHz
Processor Count: 663552
Threads: 8
Processes: 82944
System Name: K computer
Interconnect: Tofu Interconnect
MPI: Parallelnavi Technical Computing Language V1.0L20
Affiliation: RIKEN Advanced Institute for Computational Science
Submission Date: 10-23-12
Fujitsu K computer SPARC64 VIIIfx   2GHz   663552   8   82944
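G-HPL: 9795.56 TFlop/s
PP-HPL: 0.0148 TFlop/s
G-PTRANS: 16551.80 GB/s
PP-PTRANS: 0.0249 GB/s
G-RandomAccess: 471.94 Gup/s
PP-RandomAccess: 0.0007 Gup/s
G-FFT: 205936.00 GFlop/s
PP-FFT: 0.3104 GFlop/s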
Manufacturer: IBM
Processor Type: IBM Power7
Processor Speed: 3.836GHz
Processor Count: 1989
Threads: 32
Processes: 1989
System Name: Power 775
Interconnect: Custom IBM Hub Chip
MPI: IBM PE v1209
Affiliation: IBM Development Engineering
Submission Date: 11-08-12
IBM Power 775 Power7   3.836GHz   1989   32   1989
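G-HPL: 1343.67 TFlop/s
PP-HPL: 0.6756 TFlop/s
G-PTRANS: 60473.80 GB/s
PP-PTRANS: 30.4041 GB/s
G-RandomAccess: 2020.77 Gup/s
PP-RandomAccess: 1.0160 Gup/s
G-FFT: 132658.00 GFlop/s
PP-FFT: 66.6958 GFlop/s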
Manufacturer: IBM
Processor Type: IBM PowerPC A2
Processor Speed: 1.6GHz
Processor Count: 49152
Threads: 16
Processes: 196608
System Name: Blue Gene/Q (MIRA)
Interconnect: BGQ 5D TORUS
MPI: MPICH2 version 1.5
Affiliation: Argonne Leadership Computing Facility/Argonne National Laboratory
Submission Date: 10-26-14
IBM Blue Gene/Q (MIRA) PowerPC A2   1.6GHz   49152   16   196608
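G-HPL: 5709.28 TFlop/s
PP-HPL: 0.1162 TFlop/s
G-PTRANS: 12751.80 GB/s
PP-PTRANS: 0.2594 GB/s
G-RandomAccess: 417.79 Gup/s
PP-RandomAccess: 0.0085 Gup/s
G-FFT: 226101.00 GFlop/s
PP-FFT: 4.6000 GFlop/s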
Manufacturer: Fujitsu
Processor Type: Fujitsu SPARC64 VIIIfx
Processor Speed: 2GHz
Processor Count: 82944
Threads: 8
Processes: 82944
System Name: K computer
Interconnect: Tofu Interconnect
MPI: Parallelnavi Technical Computing Language V1.0L20
Affiliation: RIKEN Advanced Institute for Computational Science
Submission Date: 11-13-16
Fujitsu K computer SPARC64 VIIIfx   2GHz   82944   8   82944
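G-HPL: 9515.09 TFlop/s
PP-HPL: 0.1147 TFlop/s
G-PTRANS: 15911.00 GB/s
PP-PTRANS: 0.1918 GB/s
G-RandomAccess: 460.50 Gup/s
PP-RandomAccess: 0.0056 Gup/s
G-FFT: 252100.00 GFlop/s
PP-FFT: 3.0394 GFlop/s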

 

Note:
Blank fields in the table above are from early benchmark runs that did not include that individual benchmark,
in particular G-RandomAccess and G-FFT.

Column Definitions
G-HPL ( system performance )
Solves a randomly generated dense linear system of equations in double floating-point precision (IEEE 64-bit) arithmetic using MPI. The linear system matrix is stored in a two-dimensional block-cyclic fashion and multiple variants of code are provided for computational kernels and communication patterns. The solution method is LU factorization through Gaussian elimination with partial row pivoting followed by a backward substitution. Unit: Tera Flops per Second
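The solution method named above (LU factorization by Gaussian elimination with partial row pivoting, then backward substitution) can be sketched serially as follows. This is a small single-process illustration, not the HPL code: it uses no MPI and no block-cyclic distribution, and the function name lu_solve is hypothetical.

```python
def lu_solve(a, b):
    """Solve a x = b by Gaussian elimination with partial row pivoting."""
    n = len(a)
    # Work on copies so the caller's matrix and right-hand side are untouched.
    a = [row[:] for row in a]
    b = b[:]
    for k in range(n):
        # Partial pivoting: choose the row with the largest |entry| in column k.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[p][k] == 0.0:
            raise ValueError("matrix is singular")
        a[k], a[p] = a[p], a[k]
        b[k], b[p] = b[p], b[k]
        # Eliminate column k below the pivot.
        for i in range(k + 1, n):
            m = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= m * a[k][j]
            b[i] -= m * b[k]
    # Backward substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x
```

For example, lu_solve([[0.0, 1.0], [1.0, 0.0]], [2.0, 3.0]) only succeeds because the pivoting step swaps the two rows first.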
PP-HPL ( per processor )
Solves a randomly generated dense linear system of equations in double floating-point precision (IEEE 64-bit) arithmetic using MPI. The linear system matrix is stored in a two-dimensional block-cyclic fashion and multiple variants of code are provided for computational kernels and communication patterns. The solution method is LU factorization through Gaussian elimination with partial row pivoting followed by a backward substitution. Unit: Tera Flops per Second
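The per-processor figures in the table appear to be the system-wide result divided by the Processor Count field. That relationship is inferred from the tabulated numbers, not stated on this page, so the sketch below should be read as an assumption. It checks the inference against the first Cray X1 entry (252 processors); the function name is hypothetical.

```python
def per_processor(global_value, processor_count):
    """Assumed relation: per-processor value = global value / processor count."""
    return global_value / processor_count


# First table entry: Cray X1, 252 processors.
g_hpl, g_ptrans, count = 2.37, 96.14, 252
pp_hpl = round(per_processor(g_hpl, count), 4)        # table lists 0.0094
pp_ptrans = round(per_processor(g_ptrans, count), 4)  # table lists 0.3815
print(pp_hpl, pp_ptrans)
```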
G-PTRANS (A=A+B^T, MPI) ( system performance )
Implements a parallel matrix transpose for two-dimensional block-cyclic storage. It is an important benchmark because it exercises the communications of the computer heavily on a realistic problem where pairs of processors communicate with each other simultaneously. It is a useful test of the total communications capacity of the network. Unit: Giga Bytes per Second
PP-PTRANS (A=A+B^T, MPI) ( per processor )
Implements a parallel matrix transpose for two-dimensional block-cyclic storage. It is an important benchmark because it exercises the communications of the computer heavily on a realistic problem where pairs of processors communicate with each other simultaneously. It is a useful test of the total communications capacity of the network. Unit: Giga Bytes per Second
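The operation A = A + B^T and the block-cyclic ownership rule can be illustrated serially. The sketch below is not the PTRANS code: add_transpose works on in-memory lists, and block_cyclic_owner uses the standard two-dimensional block-cyclic mapping with a hypothetical block size nb and a p x q process grid. It shows why element (i, j) and its transposed partner (j, i) generally live on different processes, which then have to exchange data.

```python
def add_transpose(a, b):
    """Return a + transpose(b); b must have the transposed shape of a."""
    n_rows, n_cols = len(a), len(a[0])
    if len(b) != n_cols or any(len(row) != n_rows for row in b):
        raise ValueError("b must be the transposed shape of a")
    return [[a[i][j] + b[j][i] for j in range(n_cols)] for i in range(n_rows)]


def block_cyclic_owner(i, j, nb, p, q):
    """Grid coordinates of the process owning element (i, j)."""
    return ((i // nb) % p, (j // nb) % q)
```

With nb = 2 on a 2 x 2 grid, element (2, 0) belongs to process (1, 0) while its transposed partner (0, 2) belongs to process (0, 1).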
G-RandomAccess ( system performance )
Global RandomAccess, also called GUPs, measures the rate at which the computer can update pseudo-random locations of its memory - this rate is expressed in billions (giga) of updates per second (GUP/s). Unit: Giga Updates per Second
PP-RandomAccess ( per processor )
Global RandomAccess, also called GUPs, measures the rate at which the computer can update pseudo-random locations of its memory - this rate is expressed in billions (giga) of updates per second (GUP/s). Unit: Giga Updates per Second
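A minimal sketch of the kind of update loop this rate describes: a pseudo-random stream selects table locations, and each selected entry is XOR-updated. The shift-register generator, its polynomial constant, the seed, and the table size are assumptions chosen for illustration and may differ from the actual benchmark source.

```python
POLY = 0x7                 # assumed feedback polynomial for the generator
MASK64 = (1 << 64) - 1     # keep values within 64 bits


def next_random(ran):
    """Advance a 64-bit shift-register pseudo-random sequence by one step."""
    high_bit = ran >> 63
    ran = (ran << 1) & MASK64
    if high_bit:
        ran ^= POLY
    return ran


def apply_updates(table, n_updates, seed=1):
    """XOR-update pseudo-random entries of table; len(table) must be a power of two."""
    ran = seed
    mask = len(table) - 1
    for _ in range(n_updates):
        ran = next_random(ran)
        table[ran & mask] ^= ran


def gups(n_updates, seconds):
    """Update rate in billions of updates per second."""
    return n_updates / seconds / 1e9
```

Because XOR is its own inverse, applying the same update stream twice restores the original table, which gives a simple way to check such a loop.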
G-FFT ( system performance )
Global FFT performs the same test as FFT but across the entire system by distributing the input vector in block fashion across all the processes. Unit: Giga Flops per Second
PP-FFT ( per processor )
Global FFT performs the same test as FFT but across the entire system by distributing the input vector in block fashion across all the processes. Unit: Giga Flops per Second
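For orientation, the sketch below shows a serial radix-2 FFT of a complex vector and a GFlop/s figure derived from it. It is not the distributed Global FFT: nothing is spread across processes, and the 5 * n * log2(n) operation count is a commonly used convention that is assumed here, not taken from this page.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def fft_gflops(n, seconds):
    """Rate in GFlop/s, assuming 5 * n * log2(n) floating-point operations."""
    return 5.0 * n * math.log2(n) / seconds / 1e9
```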



