Benchmarking LEON3


2012-02-23 Published by Sven-Åke Andersson

Introduction


In computing, a benchmark is the act of running a computer program, a set of programs, or other operations in order to assess the relative performance of an object, normally by running a number of standard tests and trials against it. The term 'benchmark' is also commonly used for the elaborately designed benchmarking programs themselves.

Benchmarking is usually associated with assessing performance characteristics of computer hardware, for example, the floating point operation performance of a CPU, but there are circumstances when the technique is also applicable to software. Software benchmarks are, for example, run against compilers or database management systems.

CPU core benchmarking


Although it doesn’t reflect how you would use a processor in a real application, sometimes it’s important to isolate the CPU’s core from the other elements of the processor and focus on one key element. For example, you might want to ignore memory and I/O effects and focus primarily on the pipeline operation. This is CoreMark’s domain. CoreMark can test a processor’s basic pipeline structure, as well as basic read/write operations, integer operations, and control operations.


CoreMark


CoreMark is a benchmark that aims to measure the performance of central processing units (CPUs) used in embedded systems. It was developed in 2009 by Shay Gal-On at EEMBC and is intended to become an industry standard, replacing the antiquated Dhrystone benchmark. The code is written in C and contains implementations of the following algorithms: list processing (find and sort), matrix manipulation (common matrix operations), state machine (determine if an input stream contains valid numbers), and CRC.
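To give a feel for the kind of integer and control work the CRC part involves, here is a minimal sketch of a bitwise CRC-16 in C. It is an illustration only, not CoreMark's own source: the function names crc16_update and crc16_buffer are made up for this example, and the reflected polynomial 0xA001 is an assumption chosen because it is a common CRC-16 variant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold one byte into a running CRC-16 (reflected polynomial 0xA001).
 * Hypothetical helper for illustration; not taken from CoreMark. */
uint16_t crc16_update(uint16_t crc, uint8_t byte)
{
    crc ^= byte;
    for (int bit = 0; bit < 8; bit++) {
        if (crc & 1u)
            crc = (uint16_t)((crc >> 1) ^ 0xA001u);
        else
            crc = (uint16_t)(crc >> 1);
    }
    return crc;
}

/* Run the CRC over a whole buffer, starting from the given seed. */
uint16_t crc16_buffer(const uint8_t *data, size_t len, uint16_t seed)
{
    assert(data != NULL || len == 0);
    uint16_t crc = seed;
    for (size_t i = 0; i < len; i++)
        crc = crc16_update(crc, data[i]);
    return crc;
}
```

The inner loop is nothing but shifts, XORs and a data-dependent branch, which is why a routine like this exercises the integer pipeline rather than memory or I/O.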






 

Downloading CoreMark

 

The test suite can be downloaded from www.coremark.org
 

After downloading and unpacking we have the following directory structure. We will skip the Makefile and instead compile the marked files directly.
 


 


Adapt to the LEON3 platform


The only thing we have to do is edit the file core_portme.h and define the COMPILER_VERSION and COMPILER_FLAGS. If we have missed something please let us know.
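As a rough sketch, the two definitions in core_portme.h could look like the fragment below. The exact strings are our assumption: the version is taken from the compiler name (sparc-elf-gcc-4.4.2) and the flags from the compile command used in this post. The surrounding layout of the stock header may differ from what is shown here.

```c
/* Fragment for core_portme.h (illustrative values, adjust to your toolchain). */

/* Compiler version string that is printed in the CoreMark report. */
#ifndef COMPILER_VERSION
#define COMPILER_VERSION "GCC 4.4.2"
#endif

/* Compiler flags string that is printed in the CoreMark report.
 * It should match the flags actually passed on the command line. */
#ifndef COMPILER_FLAGS
#define COMPILER_FLAGS "-O3 -mv8 -funroll-loops -fgcse-sm -msoft-float -mcpu=v8"
#endif
```

These strings only document the build in the printed report; they do not change how the code is compiled.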




 

Compiling the benchmark


We put all the marked files in the same directory and use the following command to compile the code:

sparc-elf-gcc-4.4.2 -O3 -mv8 -funroll-loops -fgcse-sm -msoft-float -mcpu=v8 -o coremark.exe core_list_join.c core_main.c core_matrix.c core_state.c core_util.c core_portme.c -DPERFORMANCE_RUN=1 -DITERATIONS=2000
 

GCC options used
 

For more information see the GCC User's Manual.


-O3

Optimize yet more. ‘-O3’ turns on all optimizations specified by ‘-O2’ and also turns on the ‘-finline-functions’, ‘-funswitch-loops’, ‘-fpredictive-commoning’, ‘-fgcse-after-reload’, ‘-ftree-vectorize’ and ‘-fipa-cp-clone’ options.
 

-funroll-loops

Unroll loops whose number of iterations can be determined at compile time or upon entry to the loop. ‘-funroll-loops’ implies ‘-frerun-cse-after-loop’. This option makes code larger, and may or may not make it run faster.
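To make the effect of loop unrolling concrete, the sketch below shows a simple loop next to a hand-unrolled version. This is only an illustration of the transformation; the function names are made up for this example, and the code GCC actually generates (unroll factor, remainder handling) may look different.

```c
#include <assert.h>
#include <stddef.h>

/* Straightforward loop: one add and one loop test per element. */
long sum_rolled(const int *v, size_t n)
{
    assert(v != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Hand-unrolled by four: one loop test per four adds,
 * followed by a short loop for the leftover elements. */
long sum_unrolled4(const int *v, size_t n)
{
    assert(v != NULL || n == 0);
    long total = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        total += v[i];
        total += v[i + 1];
        total += v[i + 2];
        total += v[i + 3];
    }
    for (; i < n; i++)
        total += v[i];
    return total;
}
```

Both functions return the same sum. The unrolled one executes fewer branches but takes more instruction memory, which is the size versus speed trade-off described above.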
 

-fgcse-sm

When ‘-fgcse-sm’ is enabled, a store motion pass is run after global common subexpression elimination. This pass will attempt to move stores out of loops. When used in conjunction with ‘-fgcse-lm’, loops containing a load/store sequence can be changed to a load before the loop and a store after the loop.
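The load/store transformation described above can be sketched by hand as follows. This is an illustration under the assumption that the accumulator does not alias the input array; the compiler only performs the motion when it can prove that it is safe. The function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Before: the accumulator lives in memory, so every iteration
 * may load *acc, add to it and store it back. */
void accumulate_in_memory(int *acc, const int *v, size_t n)
{
    assert(acc != NULL);
    for (size_t i = 0; i < n; i++)
        *acc += v[i];
}

/* After: one load before the loop, the work is done in a local
 * variable (a register), and one store after the loop.
 * Only equivalent when acc does not point into v. */
void accumulate_hoisted(int *acc, const int *v, size_t n)
{
    assert(acc != NULL);
    int tmp = *acc;
    for (size_t i = 0; i < n; i++)
        tmp += v[i];
    *acc = tmp;
}
```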
 

-msoft-float

Do not use the hardware floating-point instructions for floating-point operations. When ‘-msoft-float’ is specified, functions in ‘libgcc.a’ will be used to perform floating-point operations.
 

-mcpu=cpu_type

Set the instruction set and instruction scheduling parameters for machine type cpu_type.
 

-mv8

Specify that the target processor is the SPARC V8.

 

Using the Makefile


Here is a description of how to modify the Makefile setup:
http://tech.groups.yahoo.com/group/leon_sparc/message/21289
 

Running the benchmark


We use GRMON to load and run the benchmark. Here is the result:
 



 

 

Benchmark result


LEON3 running at 66 MHz in a Spartan-6 FPGA. CoreMark 1.0 score: 126.218 (1.91 CoreMark/MHz)
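The arithmetic behind these numbers is simple enough to sketch in a few lines of C. The CoreMark score is the number of iterations completed per second, and the per-MHz figure divides that score by the clock frequency. The helper names below are made up for this example; the inputs are the values from this post (ITERATIONS=2000, 66 MHz, score 126.218).

```c
#include <assert.h>

/* CoreMark score: iterations completed per second of run time. */
double coremark_score(unsigned long iterations, double seconds)
{
    assert(seconds > 0.0);
    return (double)iterations / seconds;
}

/* Clock-normalised score (CoreMark/MHz). */
double coremark_per_mhz(double score, double clock_mhz)
{
    assert(clock_mhz > 0.0);
    return score / clock_mhz;
}

/* Run time in seconds implied by an iteration count and a score. */
double run_seconds(unsigned long iterations, double score)
{
    assert(score > 0.0);
    return (double)iterations / score;
}
```

With the values above, coremark_per_mhz(126.218, 66.0) gives about 1.91, and run_seconds(2000, 126.218) gives a run time of roughly 15.8 seconds. As far as we know, the CoreMark run rules ask for a run of at least 10 seconds, so 2000 iterations is enough at this clock frequency.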


Here are scores for more CPUs.