Time to get our hands dirty and do some benchmarking. The goal of Lab6 is to run the different sound volume algorithms described in my last post on five different machines and compare them.
I talked about the algorithms in the last post, so now it’s time to talk about the machines. Here they are:
| | AARCHIE | BBETTY | CCHARLIE | ISRAEL | XERXES |
|---|---|---|---|---|---|
| OS | Fedora 28 | Fedora 31 | Fedora 30 | Ubuntu 19.04 | Fedora 30 |
| Architecture | aarch64 | aarch64 | aarch64 | aarch64 | x86_64 |
| CPU(s) | 24 | 8 | 8 | 16 | 8 |
| Thread(s) per core | 1 | 1 | 1 | 1 | 2 |
| Model name | Cortex-A53 | Cortex-A57 | X-Gene | Cortex-A72 | Intel(R) Xeon(R) CPU E5-1630 v4 @ 3.70GHz |
| L1d cache | 32K | - | unknown | 32K | 32K |
| L1i cache | 32K | - | unknown | 48K | 32K |
| L2 cache | 256K | - | unknown | 1024K | 256K |
| L3 cache | 4096K | - | - | - | 10240K |
Before running anything, we need to make sure we measure only the time consumed by the algorithm itself. So I had to change the provided code to take the start and end timestamps at the right points, compute the elapsed time, and display it. Here is an example.

    #include <stdlib.h>
    #include <stdio.h>
    #include <stdint.h>
    #include <time.h>
    #include "vol.h"

    // Function to scale a sound sample using a volume_factor
    // in the range of 0.00 to 1.00.
    static inline int16_t scale_sample(int16_t sample, float volume_factor) {
        return (int16_t) (volume_factor * (float) sample);
    }

    // ADDED FUNCTION: time between two clock() readings, in milliseconds
    long timediff(clock_t t1, clock_t t2) {
        long elapsed;
        elapsed = ((double)t2 - t1) / CLOCKS_PER_SEC * 1000;
        return elapsed;
    }

    int main() {
        clock_t t1, t2; // ADDED VARIABLES
        long elapsed;

        // Allocate memory for large data array
        int16_t* data;
        data = (int16_t*) calloc(SAMPLES, sizeof(int16_t));
        if (data == NULL) {
            return 1; // allocation failed
        }

        int x;
        int ttl = 0;

        // Seed the pseudo-random number generator
        srand(1);

        // Fill the array with random data
        for (x = 0; x < SAMPLES; x++) {
            data[x] = (rand()%65536)-32768;
        }

        // ######################################
        // This is the interesting part!
        // Scale the volume of all of the samples
        t1 = clock(); // ADDED start time
        for (x = 0; x < SAMPLES; x++) {
            data[x] = scale_sample(data[x], 0.75);
        }
        t2 = clock(); // ADDED end time
        elapsed = timediff(t1, t2); // ADDED elapsed time
        printf("%ld ", elapsed); // ADDED print time (ms)
        // ######################################

        // Sum up the data
        for (x = 0; x < SAMPLES; x++) {
            ttl = (ttl+data[x])%1000;
        }

        // Print the sum
        printf("Result: %d\n", ttl);

        free(data);
        return 0;
    }

To get the most accurate data possible, I chose to run each program 100 times. I also put a delay of 5 minutes between executions. Then I started the runs around 10 pm so I could collect the data the next morning. From the data, I extracted the average elapsed time, along with the fastest and slowest runs. Here is my script to do the hard work for me.

    #!/bin/bash
    QTY=100

    if [[ -n "$1" ]]
    then
        FILE_EXE="${1}_exe.txt"
        FILE_RPT="${1}_rpt.txt"
        echo "Testing: $1"
        echo "Executions: $FILE_EXE"
        echo "Report: $FILE_RPT"

        # Start with an empty executions file
        > "$FILE_EXE"

        for ((n=0; n<QTY; n++))
        do
            echo "$(date +"%Y-%m-%d %H:%M:%S,%3N") - INI: $n"
            "./$1" >> "$FILE_EXE"
            echo "$(date +"%Y-%m-%d %H:%M:%S,%3N") - FIN: $n"
            echo ''
            sleep 5m
        done

        # The first field of each line is the elapsed time in milliseconds
        awk '{
            if (min == "") { min = max = $1 };
            if ($1 > max) { max = $1 };
            if ($1 < min) { min = $1 };
            sum += $1
        } END {
            print "Executions: " NR "\nMin Time : " min "\nMax Time : " max "\nAvg time : " sum/NR
        }' "$FILE_EXE" > "$FILE_RPT"
    else
        echo "Please specify the program to be tested"
    fi

Here are the results (numbers in milliseconds):
| Method | | AARCHIE | BBETTY | CCHARLIE | ISRAEL | XERXES |
|---|---|---|---|---|---|---|
| Multiplication Method | Min | 7571.00 | 933.00 | 1715.00 | 1455.00 | 340.00 |
| | Max | 11548.00 | 942.00 | 1722.00 | 1456.00 | 394.00 |
| | Avg | 7622.43 | 934.68 | 1716.26 | 1455.53 | 353.02 |
| Lookup Table Method | Min | 12732.00 | 1376.00 | 2220.00 | 2558.00 | 268.00 |
| | Max | 34445.00 | 1390.00 | 2574.00 | 2591.00 | 348.00 |
| | Avg | 13083.50 | 1379.64 | 2406.17 | 2572.67 | 281.33 |
| Binary Math Method | Min | 4079.00 | 782.00 | 1231.00 | 503.00 | 211.00 |
| | Max | 4442.00 | 795.00 | 1237.00 | 505.00 | 254.00 |
| | Avg | 4101.68 | 782.91 | 1232.35 | 503.02 | 218.30 |
We can see a clear difference between the algorithms. The binary math method is the fastest on all platforms. The surprise is how the other two compare: on the aarch64 machines the multiplication method beats the lookup table, while on x86_64 (XERXES) it is the opposite, and the lookup table beats multiplication. However, we can't directly compare the times between machines, since their hardware and systems are different. See you!