Benchmarking OpenRISC 1200
2012-03-27 Published by Sven-Åke Andersson
In computing, a benchmark is the act of running a computer program, a set of programs, or other operations in order to assess the relative performance of an object, normally by running a number of standard tests and trials against it. The term 'benchmark' is also commonly used for the elaborately designed benchmarking programs themselves.
Benchmarking is usually associated with assessing performance characteristics of computer hardware, for example, the floating point operation performance of a CPU, but there are circumstances when the technique is also applicable to software. Software benchmarks are, for example, run against compilers or database management systems.
CPU core benchmarking
Although it doesn’t reflect how you would use a processor in a real application, sometimes it’s important to isolate the CPU’s core from the other elements of the processor and focus on one key element. For example, you might want to ignore memory and I/O effects and focus primarily on the pipeline operation. This is CoreMark’s domain. CoreMark tests a processor’s basic pipeline structure, as well as basic read/write operations, integer operations, and control operations.
CoreMark is a benchmark that aims to measure the performance of central processing units (CPUs) used in embedded systems. It was developed in 2009 by Shay Gal-On at EEMBC and is intended to become an industry standard, replacing the antiquated Dhrystone benchmark. The code is written in C and contains implementations of the following algorithms: list processing (find and sort), matrix manipulation (common matrix operations), a state machine (determining whether an input stream contains valid numbers), and CRC.
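To give a feel for the kind of kernel involved, here is a minimal sketch of a bitwise CRC-16 in Python. It assumes the common CRC-16/ARC parameters (reflected polynomial 0xA001, initial value 0, no final XOR), chosen only for illustration. The function name crc16_arc is hypothetical, and this is not CoreMark's actual C source.

```python
def crc16_arc(data: bytes, crc: int = 0) -> int:
    """Bitwise CRC-16 with the CRC-16/ARC parameters (assumed for illustration)."""
    for byte in data:
        crc ^= byte                        # fold the next byte into the low bits
        for _ in range(8):                 # process one bit at a time, LSB first
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001  # shift and apply the reflected polynomial
            else:
                crc >>= 1
    return crc


if __name__ == "__main__":
    # "123456789" is the conventional check string used in CRC catalogues.
    print(hex(crc16_arc(b"123456789")))
```

Because the running CRC is passed in and returned, a long buffer can be processed in pieces by feeding each result into the next call. The loop consists only of shifts, XORs, and a data-dependent branch, which is the kind of integer and control work a core benchmark exercises.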
The test suite can be downloaded from www.coremark.org
After downloading and unpacking we have the following directory structure.
We will add two port directories called or1k and atlys. In these directories we put three files modified for our design. The files in the or1k directory will be used when compiling CoreMark for running in the simulator and the files in the atlys directory will be used when compiling for the Atlys board.
Without any optimization option, the compiler's goal is to reduce the cost of compilation and to make debugging produce the expected results. Statements are independent: if you stop the program with a breakpoint between statements, you can then assign a new value to any variable or change the program counter to any other statement in the function and get exactly the results you would expect from the source code.
Turning on optimization flags makes the compiler attempt to improve the performance and/or code size at the expense of compilation time and possibly the ability to debug the program. The compiler performs optimization based on the knowledge it has of the program. Compiling multiple files at once into a single output file allows the compiler to use information gained from all of the files when compiling each of them. Here is a link to a page describing possible optimization options. Depending on the compiler options we choose, the benchmark result may vary. When comparing different processors, it is important to use the same compiler options to get comparable results.
The port directory
New GNU toolchain
The first thing I had to do was to ask Julius for an updated GNU toolchain. The precompiled 1.0rc1 version I had didn't let me compile and run the CoreMark benchmark. Julius compiled a new toolchain from revision 789 of the OpenCores SVN repository and promised to put it on the OpenCores FTP site. Here is a link to download the latest version.
Compiling for the simulator
The following command compiles the benchmark for running in the OR1K simulator:
make PORT_DIR=or1k ITERATIONS=2000
When changing the number of iterations, force a rebuild with the following command:
make PORT_DIR=or1k ITERATIONS=4000 REBUILD=1
Use this command to start the simulator:
or32-elf-sim -m8M coremark.exe
Compiling for the board
The following command compiles the benchmark for running on the Atlys board:
make PORT_DIR=atlys ITERATIONS=2000
Create a bare metal boot image
To create a u-boot image from a bare metal program, the program is first converted to raw binary format and then wrapped with the u-boot tool mkimage, which is available in u-boot's tools/ directory. The following commands create an uncompressed bare metal image called 'coremark' with load address 0 and entry point 0x100:
or32-elf-objcopy -O binary coremark.exe coremark.bin
mkimage -A or1k -T standalone -C none -a 0 -e 0x100 -n coremark -d coremark.bin /tftpboot/coremark.ub
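To make the -a and -e options more concrete, the sketch below builds the 64-byte header that mkimage puts in front of the payload, assuming the legacy u-boot image layout (big-endian fields, magic 0x27051956, 32-byte name). The function name and the OS and architecture codes passed in the example are assumptions for illustration; check them against image.h in your u-boot tree. Treat this as a rough model and keep using mkimage for real images.

```python
import struct
import time
import zlib

IH_MAGIC = 0x27051956      # legacy u-boot image magic number
IH_TYPE_STANDALONE = 1     # corresponds to "-T standalone"
IH_COMP_NONE = 0           # corresponds to "-C none"


def build_legacy_header(payload, load_addr, entry_point, name,
                        os_code, arch_code, timestamp=None):
    """Return a 64-byte legacy image header for payload (hypothetical helper)."""
    if timestamp is None:
        timestamp = int(time.time())
    name_field = name.encode("ascii")[:32]   # "32s" pads the name with NUL bytes
    data_crc = zlib.crc32(payload) & 0xFFFFFFFF

    def pack(header_crc):
        return struct.pack(
            ">7I4B32s",
            IH_MAGIC, header_crc, timestamp, len(payload),
            load_addr, entry_point, data_crc,
            os_code, arch_code, IH_TYPE_STANDALONE, IH_COMP_NONE,
            name_field,
        )

    # The header CRC is computed over the header with its own CRC field zeroed.
    header_crc = zlib.crc32(pack(0)) & 0xFFFFFFFF
    return pack(header_crc)


if __name__ == "__main__":
    dummy_payload = bytes(16)  # stand-in for the contents of coremark.bin
    header = build_legacy_header(
        dummy_payload, load_addr=0x0, entry_point=0x100, name="coremark",
        os_code=0,     # placeholder value, an assumption
        arch_code=21,  # assumed OpenRISC architecture code; verify in image.h
    )
    load, entry = struct.unpack(">II", header[16:24])
    print(len(header), hex(load), hex(entry))
```

The load address tells u-boot where to copy the payload, and the entry point is the address it jumps to afterwards. Here that is 0x100, matching the -e option in the command above.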
Here are the conditions during the benchmark.
|Development board|Digilent Atlys (Xilinx University Program)|
|FPGA|Xilinx Spartan-6 XC6SLX45 CSG324C|
|Processor clock|50 MHz|
|Instruction cache|32 KB|
|Data cache|32 KB|
|Floating point|Single precision|
Running on the board
Here is the result from running CoreMark on the Atlys board. Note that no compiler optimization options other than -O2 have been used.
This gives a CoreMark value of 63.411/50 = 1.27 CoreMark/MHz. We will try to improve this value by adding some compiler options, recompiling the program, and rerunning the test.
Here are the results from trying to optimize the compilation phase. Without any optimization at all:
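The normalization above is plain arithmetic, sketched here in Python so it can be reused for later runs. A CoreMark score is the number of iterations completed per second, and dividing it by the clock frequency in MHz gives a figure that can be compared across clock speeds. The function names are hypothetical.

```python
def coremark_score(iterations: int, seconds: float) -> float:
    """CoreMark score: benchmark iterations completed per second."""
    return iterations / seconds


def coremark_per_mhz(score: float, clock_mhz: float) -> float:
    """Normalize a CoreMark score by the processor clock in MHz."""
    return score / clock_mhz


if __name__ == "__main__":
    # Values from the Atlys run above: score 63.411 at a 50 MHz clock.
    print(f"{coremark_per_mhz(63.411, 50.0):.2f} CoreMark/MHz")
```

Run as a script, this prints 1.27 CoreMark/MHz for the board result.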
Memory system benchmark
The CoreMark benchmark is set up mainly to test the processor part of our system. Stefan Kristiansson has written a test program that measures the efficiency of the memory system. It can be downloaded from his Git repository using the following command:
git clone git://git.chokladfabriken.org/membenchmark
After downloading the files (main.c and makefile) we use the following command to compile and make a bin file for loading into our system.
Here is the result from running the program on our Atlys board.