VOLK Benchmarking Scripts

The Python programs in this directory are designed to help benchmark
and compare Volk enhancements to GNU Radio. There are two kinds of
scripts here: those that collect data and those that display it. Data
collection is done by running a Volk testing script that populates a
SQLite database file (volk_results.db by default). The plotting
utility provided here reads from the database files and plots bar
graphs to compare the different installations.

These benchmarks can be used to compare previous, pre-Volk versions
of GNU Radio against versions that use Volk; they can be used to
compare different Volk proto-kernels by editing the volk_config file;
or they can be used to compare performance between different machines
and/or processors.

======================================================================

Volk Profiling

Before doing any kind of Volk benchmarking, it is important to run
the volk_profile program. The profiler builds a config file that
records the best SIMD architecture for each kernel on your processor.

Run volk_profile, which is installed into $PREFIX/bin. This program
tests every Volk kernel against each proto-kernel supported by the
processor. When finished, it writes the best architecture for each
Volk function to $HOME/.volk/volk_config. This file is read whenever
a Volk function is used to determine the best version of the function
to execute.

The volk_config file contains a line for each kernel, where each line
looks like:

    volk_<kernel_name> <architecture>

The architecture will be something like sse, sse2, sse3, avx, neon,
etc., depending on your processor.

======================================================================

Benchmark Tests

There are currently two benchmark scripts for collecting data. One
runs through the type conversions that have been converted to Volk
(volk_types.py); the other runs through the math operators that have
been converted to Volk (volk_math.py).

Script prototypes

Both have the same structure for use:

----------------------------------------------------------------------
./volk_<test>.py [-h] -L LABEL [-D DATABASE] [-N NITEMS]
                 [-I ITERATIONS]
                 [--tests [{0,1,2,3} [{0,1,2,3} ...]]]
                 [--list] [--all]

optional arguments:
  -h, --help            show this help message and exit
  -L LABEL, --label LABEL
                        Label of database table [default: None]
  -D DATABASE, --database DATABASE
                        Database file to store data in
                        [default: volk_results.db]
  -N NITEMS, --nitems NITEMS
                        Number of items per iteration
                        [default: 1000000000.0]
  -I ITERATIONS, --iterations ITERATIONS
                        Number of iterations [default: 20]
  --tests [{0,1,2,3} [{0,1,2,3} ...]]
                        A list of tests to run; can be a single test
                        or a space-separated list.
  --list                List the available tests
  --all                 Run all tests
----------------------------------------------------------------------

To run, specify the tests to run and a label to store along with the
results. To find out what tests are available, use the '--list'
option. To run a subset of tests, use '--tests' with a
space-separated list of test numbers (e.g., --tests 0 2 4 9). Use
'--all' to run all tests.

The label is an identifier for the benchmarking run currently being
done. It is required because it organizes the data in the database
(each label is its own table). Usually, the label will specify the
type of run being done, such as "volk_aligned" or "v3_5_1". Here,
the "volk_aligned" label marks a benchmark of a GNU Radio version
that uses the aligned scheduler and Volk calls in the work functions,
while the "v3_5_1" label would mark a benchmark of an installed
version 3.5.1 of GNU Radio, which is pre-Volk.
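Because each label becomes its own table in the SQLite file, you can
check which labels a database already holds before adding to it.
Below is a minimal sketch using Python's standard sqlite3 module; the
file name is just the default from above:

----------------------------------------------------------------------
#!/usr/bin/env python
# List the labels (tables) stored in a Volk benchmark results file.
import sqlite3

conn = sqlite3.connect("volk_results.db")
cursor = conn.cursor()

# Every label passed with -L becomes a table, so sqlite_master
# gives the full list of labels collected so far.
cursor.execute("SELECT name FROM sqlite_master WHERE type='table'")
for (label,) in cursor.fetchall():
    print(label)

conn.close()
----------------------------------------------------------------------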
Benchmarks collected under different labels can then be plotted
against each other to see the timing differences.

The 'database' option writes the results to a different database
file. This can be useful for separating the output of different runs
or of different benchmarks (say, the types script versus the math
script), or for distinguishing results from different computers. If
rerun using the same database and label, the entries in the table are
simply replaced by the new results. It is often useful to use the
'sqlitebrowser' program to interrogate the database file further if
you are interested in its structure or the raw data.

The other parameters of these scripts set the number of items to
process and the number of iterations to use when computing the
benchmark data. These default to 1 billion items per iteration over
20 iterations, so expect a default run to take a long time. The '-N'
and '-I' options can be used to shorten the runtime of the
benchmarks, but the defaults are set high to reduce the variance
between iterations.

======================================================================

Plotting Results

The volk_plot.py script reads a given database file and plots the
results. The default behavior is to read all of the labels stored in
the database and plot them as data sets on a bar graph. This shows
the average time taken to process the given number of items.

The options for the plotting script are:

usage: volk_plot.py [-h] [-D DATABASE] [-E] [-P {mean,min,max}]
                    [-% table]

Plot Volk performance results from a SQLite database. Run one of the
volk tests first (e.g., volk_math.py)

----------------------------------------------------------------------
optional arguments:
  -h, --help            show this help message and exit
  -D DATABASE, --database DATABASE
                        Database file to read data from
                        [default: volk_results.db]
  -E, --errorbars       Show error bars (1 standard dev.)
  -P {mean,min,max}, --plot {mean,min,max}
                        Set the type of plot to produce
                        [default: mean]
  -% table, --percent table
                        Show percent difference to the given type
                        [default: None]
----------------------------------------------------------------------

This script allows you to specify the database to use (-D), but it
always reads all rows from all tables in that database and displays
them. You can also turn on error bars (1 standard deviation from the
mean). Be careful, though, as some older versions of Matplotlib might
have an issue with this option.

The mean time is only one statistic we might be interested in when
looking at the data. It represents the average user experience when
running a given block. On the other hand, the minimum runtime best
represents the actual performance of a block when there are minimal
OS interruptions while running. Right now, the data collected
includes the mean, variance, min, and max over the number of
iterations given. Using the '-P' option, you can specify the type of
data to plot (mean, min, or max).

Another useful way of looking at the data is to compare the percent
improvement of one benchmark over another. This is done using the
'-%' option with the provided table (or label) as the baseline. So if
we were interested in how much faster 'volk_aligned' was than
'v3_5_1', we would specify '-% v3_5_1'. The plot would then show only
the percent speedup observed using Volk for each of the blocks.
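To make that comparison concrete, the sketch below reproduces one
plausible definition of the percent speedup directly from the raw
data. The column names 'kernel' and 'avg' are assumptions about the
table schema (inspect the file with sqlitebrowser to confirm), and
the labels match the examples above:

----------------------------------------------------------------------
#!/usr/bin/env python
# Sketch: percent speedup of one label over a baseline label.
# ASSUMPTION: each table has columns 'kernel' and 'avg' (mean run
# time); verify the schema against your database before using this.
import sqlite3

conn = sqlite3.connect("volk_results.db")
cursor = conn.cursor()

def mean_times(label):
    cursor.execute('SELECT kernel, "avg" FROM "%s"' % label)
    return dict(cursor.fetchall())

base = mean_times("v3_5_1")        # baseline (pre-Volk)
test = mean_times("volk_aligned")  # label to compare against it

for kernel in sorted(base):
    if kernel not in test:
        continue
    # Positive values mean 'test' ran faster than the baseline.
    speedup = 100.0 * (base[kernel] - test[kernel]) / base[kernel]
    print("%s: %+.1f%%" % (kernel, speedup))

conn.close()
----------------------------------------------------------------------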
======================================================================

Benchmarking Walkthrough

This section walks through an example of benchmarking the new Volk
implementation versus pre-Volk GNU Radio. It also shows how to
compare the SIMD-optimized versions against the generic
implementations. Since we introduced Volk in GNU Radio 3.5.2, we will
use the following labels for our data:

  1.) volk_aligned: v3.5.2 with the volk_profile results in
      .volk/volk_config
  2.) v3_5_2: v3.5.2 with the generic (non-SIMD) calls to Volk
  3.) v3_5_1: an installation of GNU Radio from version v3.5.1

We assume that we have installed two versions of GNU Radio:

  v3.5.2 installed into /opt/gr-3_5_2
  v3.5.1 installed into /opt/gr-3_5_1

To test cases 1 and 2 above, we have to run GNU Radio from the
v3.5.2 installation, so we set the following environment variables.
Note that this is written for Ubuntu 11.10; these commands and
directories may have to be changed depending on your OS and versions.

  export LD_LIBRARY_PATH=/opt/gr-3_5_2/lib
  export PYTHONPATH=/opt/gr-3_5_2/lib/python2.7/dist-packages

Now we can run the benchmark tests; here we will focus on the math
operators:

  ./volk_math.py -D volk_results_math.db --all -L volk_aligned

When this finishes, 'volk_results_math.db' will contain the results
for this run.

We next want to benchmark the generic, non-SIMD calls. This can be
done by changing the Volk kernel settings in $HOME/.volk/volk_config.
First, make a backup of this file. Then edit it and change all of the
architecture entries (sse, sse2, etc.) to 'generic'; one way to
script this edit is sketched at the end of this file. Now Volk will
only call the generic versions of these functions, so we rerun the
benchmark with:

  ./volk_math.py -D volk_results_math.db --all -L v3_5_2

Notice that the only thing changed here was the label, to 'v3_5_2'.

Next, we want to collect data for the non-Volk version of GNU Radio.
This is important because some internal changes were made to GNU
Radio when adding support for Volk, so it is nice to know what those
changes do to our performance. First, we set the environment
variables to point to the v3.5.1 installation:

  export LD_LIBRARY_PATH=/opt/gr-3_5_1/lib
  export PYTHONPATH=/opt/gr-3_5_1/lib/python2.7/dist-packages

When we run the test, we use the same command line, but the GNU Radio
libraries and Python files used now come from v3.5.1. We also change
the label to record the different version:

  ./volk_math.py -D volk_results_math.db --all -L v3_5_1

We now have a database populated with three tables for the three
different labels. We can plot them all together by simply running:

  ./volk_plot.py -D volk_results_math.db

This shows the average run times of all the math functions tested
under each of the three configurations. We might also be interested
in the difference in performance relative to the v3.5.1 version, so
we can run:

  ./volk_plot.py -D volk_results_math.db -% v3_5_1

That plots both 'volk_aligned' and 'v3_5_2' as a percentage
improvement over v3_5_1. A positive value indicates that the version
runs faster than the v3.5.1 version.

----------------------------------------------------------------------

Another interesting test case is to compare results on different
processors. If you have machines with different generations of Intel
or AMD processors (or anything else), you can simply pass the .db
file around and run the Volk benchmark scripts on each machine to add
its results to the database. For this, you would specify a label that
uniquely identifies the processor type, such as '-L i7_2620M'.
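Finally, as promised in the walkthrough above, here is one way to
script the 'generic' edit of volk_config instead of doing it by hand.
This is a minimal sketch that assumes the 'volk_<kernel_name>
<architecture>' line format described earlier and saves a backup so
the profiled results can be restored afterward:

----------------------------------------------------------------------
#!/usr/bin/env python
# Sketch: force every kernel in ~/.volk/volk_config to 'generic'.
# ASSUMPTION: lines look like "volk_<kernel_name> <architecture>".
# Restore the profiled settings afterward with:
#   mv ~/.volk/volk_config.bak ~/.volk/volk_config
import os
import shutil

config = os.path.expanduser("~/.volk/volk_config")
shutil.copy(config, config + ".bak")  # keep the volk_profile results

lines = []
for line in open(config):
    fields = line.split()
    if len(fields) >= 2 and fields[0].startswith("volk_"):
        line = "%s generic\n" % fields[0]  # swap in the generic arch
    lines.append(line)

open(config, "w").writelines(lines)
----------------------------------------------------------------------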