Hello! This is my SPO 600 blog, and this post will be long, sorry. The goal is to pick one CPU-intensive project written in C or C++, experiment with different compiler options, and present the results. That is why it will be long: there is a lot of data to show.
I chose the AWK project (https://github.com/onetrueawk/awk). It is a handy tool for processing files. Parsing, sorting, and filtering are some of the simple operations it performs that are CPU intensive. To make the workload harder, I created a huge XML file for awk to parse, counting the tags.
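A test input like this can be generated with a few lines of code. The sketch below is only an illustration, not the generator used for this post: the tag names, the file layout, and the `write_xml` and `count_tags` helpers are all made up. The Python tag counter is there only as a reference count to compare against whatever the awk program reports.

```python
import random
import re
from collections import Counter

# Hypothetical tag names; the real file's tags are not described in the post.
TAGS = ["item", "name", "price", "note"]

# Matches the name of an opening tag such as <item>, but not a closing </item>.
OPEN_TAG = re.compile(r"<([A-Za-z_][\w.-]*)")


def write_xml(path, records, seed=0):
    """Write a synthetic XML file with `records` one-line elements."""
    rng = random.Random(seed)  # fixed seed so every run produces the same file
    with open(path, "w", encoding="utf-8") as f:
        f.write("<root>\n")
        for i in range(records):
            tag = rng.choice(TAGS)
            f.write(f"  <{tag}>value {i}</{tag}>\n")
        f.write("</root>\n")


def count_tags(path):
    """Count opening tags per tag name, reading the file line by line."""
    counts = Counter()
    with open(path, encoding="utf-8") as f:
        for line in f:
            counts.update(OPEN_TAG.findall(line))
    return counts
```

Raising `records` into the millions gives a file big enough to keep the CPU busy for a while.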
I described the machines in my last post; if you missed it, here it is.
I also created a script to run the candidates and collect the data. I planned to run each candidate 10 times, but a few attempts did not produce any data. So I decided to nest the loops in such a way that even if someone killed my process, the data collected so far could still be used. Guess what? It happened!
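The original script is not shown here, so the following is only a sketch of one way to get that behaviour. It assumes the outer loop goes over rounds and the inner loop over candidates, so that a killed process still leaves roughly the same number of samples for every candidate. The `run_benchmark` name, the CSV layout, and the flush after every row are choices made for this illustration.

```python
import csv
import subprocess
import time


def run_benchmark(candidates, runs, out_path):
    """Time every candidate once per round and append each result to a CSV.

    candidates: dict mapping a label to the argv list to execute.
    """
    with open(out_path, "a", newline="") as f:
        writer = csv.writer(f)
        for run in range(1, runs + 1):             # outer loop: rounds
            for name, argv in candidates.items():  # inner loop: candidates
                start = time.perf_counter()
                subprocess.run(argv, stdout=subprocess.DEVNULL, check=True)
                elapsed = time.perf_counter() - start
                writer.writerow([run, name, f"{elapsed:.6f}"])
                f.flush()  # hand the row to the OS right away, so a kill loses at most the run in progress
```

A call could look like `run_benchmark({"awkO0": ["./awkO0", "-f", "count.awk", "huge.xml"]}, 10, "times.csv")`, where `count.awk` and `huge.xml` are placeholder names.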
To produce the candidates, I just changed the CFLAGS inside the makefile and ran the make command. I took care to clean everything between the compilations. Here is the list of the candidates:
awkO0 – CFLAGS = -O0 – no optimization; shortest compilation time (the compiler default)
awkO1 – CFLAGS = -O1 – optimization for code size and execution time
awkO2 – CFLAGS = -O2 – more optimization for execution time than -O1
awkO3 – CFLAGS = -O3 – the most aggressive optimization for execution time, possibly at the cost of code size
awkOs – CFLAGS = -Os – optimization for code size
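The candidates above were built by editing the makefile by hand. The same steps could be scripted roughly as below. This is a sketch under a few assumptions: that the makefile has a `clean` target, that the build leaves its binary in a file called `a.out`, and that passing `CFLAGS=...` on the make command line is an acceptable stand-in for editing the file. Keep in mind that a command-line `CFLAGS` replaces the makefile's whole value, so any other flags set there would be lost.

```python
import os
import shutil
import subprocess

# Candidate name -> optimization flag, matching the list above.
FLAGS = {
    "awkO0": "-O0",
    "awkO1": "-O1",
    "awkO2": "-O2",
    "awkO3": "-O3",
    "awkOs": "-Os",
}


def build_commands(cflags):
    """Commands for one clean rebuild with the given CFLAGS."""
    return [["make", "clean"], ["make", f"CFLAGS={cflags}"]]


def build_all(src_dir, built_binary="a.out"):
    """Rebuild once per candidate and keep a renamed copy of each binary.

    built_binary is an assumption about the makefile's output name.
    """
    for name, cflags in FLAGS.items():
        for cmd in build_commands(cflags):
            subprocess.run(cmd, cwd=src_dir, check=True)
        shutil.copy2(os.path.join(src_dir, built_binary),
                     os.path.join(src_dir, name))
```

Running `make clean` before every build matters: otherwise make would reuse object files compiled with the previous flags.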
Overall, awkO3 performed best on all machines except Ccharlie. awkO0 and awkOs are 60% and 15% slower than the optimum, respectively. On the AArch64 architecture, awkO1, awkO2 and awkO3 performed about the same. On the x86_64 architecture, however, the gain from one optimization level to the next is evident. Maybe there is room to improve what the -O1, -O2 and -O3 flags do in the AArch64 compiler.
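For anyone who wants to reproduce this kind of comparison from their own timings, here is a small sketch of the arithmetic. It assumes "slower than the optimum" means the mean run time of a candidate relative to the fastest candidate's mean on the same machine; the `percent_slower` helper is hypothetical.

```python
from statistics import mean


def percent_slower(times):
    """Return how much slower each candidate is than the fastest one, in percent.

    times: dict mapping a candidate label to a list of run times in seconds.
    """
    means = {name: mean(samples) for name, samples in times.items()}
    best = min(means.values())  # the fastest candidate is the baseline
    return {name: (m / best - 1.0) * 100.0 for name, m in means.items()}
```

The fastest candidate comes out at 0, and a candidate whose mean time is one and a half times the best comes out at 50.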
Here is the complete data. The time axis is in seconds. See you in the next post.