Today’s topic is compiler optimizations. Besides translating our source code into an executable binary, the compiler can also produce faster executables, depending on the optimization parameters it is given. Just by adding some parameters to the compiler, we can get better performance or a smaller executable, for example. There are hundreds of those parameters, which we can turn on or off using the prefixes -f and -fno-. However, instead of setting them one by one, we can enable whole groups of them using the -O parameter. It ranges from 0 (no optimization, the default) to 3 (the highest). Using those parameters has a cost: usually, the faster the executable, the larger it is.
How does the compiler make it faster if my code is perfect?
I’m going to show a few techniques here, but if you want, here is more detail. Also, bear in mind that most optimizations are performed on the program's intermediate representation, not on the source code. So, the examples below are rewritten at the source level just to illustrate the transformations.
Strength Reduction: replace slow operations with faster ones.
Before:

    int x;
    for (x = 0; x < 10; x++) {
        printf("%d\n", x*6);
    }

After:

    int x;
    for (x = 0; x < 60; x += 6) {
        printf("%d\n", x);
    }
Instead of multiplying on each iteration, it is faster to keep adding the factor to an accumulator.
Hoisting: move operations outside the loop
Before:

    int t, x;
    double c;
    t = readtemp();
    for (x = 0; x < 200; x++) {
        c = (t-32)/1.8000 + 273.15;
        foo(x,c);
    }

After:

    int t, x;
    double c;
    t = readtemp();
    c = (t-32)/1.8000 + 273.15;
    for (x = 0; x < 200; x++) {
        foo(x,c);
    }
Instead of performing the calculation 200 times inside the loop, it can be done once outside. This works because “c” does not change during the loop: it depends only on “t”, which is read before the loop starts. The optimizer looks for such loop-invariant computations and can move them outside wherever they appear (variables, expressions, the loop condition, etc.).
Loop Unswitching: turn a condition inside a loop into a loop inside each branch of the condition
Before:

    void foo(float ctl) {
        int x;
        for (x = 0; x < 10000; x++) {
            if (ctl == 0) {
                bar(x);
            } else {
                qux(x);
            }
        }
    }

After:

    void foo(float ctl) {
        int x;
        if (ctl == 0) {
            for (x = 0; x < 10000; x++) {
                bar(x);
            }
        } else {
            for (x = 0; x < 10000; x++) {
                qux(x);
            }
        }
    }
Instead of evaluating the condition 10000 times inside the loop, it is faster to evaluate it once and then run the appropriate loop. Please note that this is only valid because the condition doesn’t change during the loop, and that the executable might be bigger, since the loop is now duplicated.
And there are more! The point here is to write the code in the way that is easiest for us to understand and let the compiler rewrite and optimize it. Additionally, the compiler can use machine-specific instructions or rearrange code blocks to improve performance. If you want to see how those parameters affect the performance, check this out. See you in the next post.