Hey! Were you curious about the results of profiling AWK? Me too! Quick recap, what is profiling, and how to do it? Profiling is a technique to map the time execution of each part of the application. We can add instrumentation to the executable, or use interruption sampling to generate that map. Here, I’ll use both. Click here for more details on profiling.
For the instrumentation, we have to tell the compiler to add the tools needed to collect the execution data. So, I’ve changed the “makefile” file, CFLAGS variable with “-g -Og -pg” and ran the make command. Then, I just ran the awk the same way I did to benchmark it. Here is the command line:
./awk 'BEGIN {FS = "<|:|=";} {if ($8 == "DDD>") a ++;} END {print "count: " a;}' bmark-data.txt
This awk version, instrumented, generates a file gmon.out, which contains all execution data. This is the raw material to create a profile report using gprof. I did this on all class machines, and they produced somewhat the same results. I have the text and the graph report to show you, gprof.png and gprof.txt below.
For the sampling, we don’t need to add anything to the executable. So, I’ve changed the “makefile” file, CFLAGS variable with “-g -Og” and ran the make command. No -pg param. However, to collect the execution data, we need to execute a different command:
perf record ./awk 'BEGIN {FS = "<|:|=";} {if ($8 == "DDD>") a ++;} END {print "count: " a;}' bmark-data.txt
This command generated a file called perf.data with our execution data to make the profiling report. This tool doesn’t create a visual graph, only creates a textual and interactive report. Unfortunately, it is not possible to show the interactive one, but the text is. It is the perf.txt below.
Analyzing the reports, I found the hot spots. Those are executed many times and spend the most execution time. There are two strings:
1 – execute -> fldbld -> refldbld -> nematch
2 – execute -> program -> getrec -> readrec
GPROF.png
GPROF.txt
Flat profile:
Each sample counts as 0.01 seconds.
% cumulative self self total
time seconds seconds calls s/call s/call name
31.45 43.67 43.67 240000000 0.00 0.00 nematch
12.10 60.47 16.80 30000001 0.00 0.00 readrec
11.51 76.44 15.98 30000000 0.00 0.00 refldbld
7.81 87.29 10.84 240000008 0.00 0.00 execute
5.44 94.84 7.55 30000000 0.00 0.00 fldbld
5.41 102.35 7.51 150000001 0.00 0.00 setfval
5.02 109.32 6.97 270000030 0.00 0.00 is_number
3.16 113.71 4.39 120000009 0.00 0.00 get_str_val
2.70 117.46 3.76 30000001 0.00 0.00 getrec
1.29 119.26 1.80 30000000 0.00 0.00 relop
1.22 120.96 1.70 30000000 0.00 0.00 makedfa
1.15 122.55 1.59 30000000 0.00 0.00 pastat
1.07 124.04 1.49 30000001 0.00 0.00 tfree
0.98 125.40 1.36 120000008 0.00 0.00 getsval
0.97 126.75 1.35 60000000 0.00 0.00 getfval
0.88 127.96 1.22 30000000 0.00 0.00 incrdecr
0.86 129.16 1.20 30000001 0.00 0.00 gettemp
0.79 130.26 1.10 30000000 0.00 0.00 setlastfld
0.72 131.26 1.00 30000000 0.00 0.00 indirect
0.66 132.18 0.92 30000000 0.00 0.00 ifstat
0.64 133.07 0.89 30000000 0.00 0.00 fieldadr
0.55 133.83 0.76 60000000 0.00 0.00 cleanfld
0.52 134.55 0.72 30000019 0.00 0.00 adjbuf
0.50 135.24 0.69 2 0.35 0.35 isclvar
0.49 135.92 0.68 1 0.68 32.35 program
0.45 136.54 0.62 30000001 0.00 0.00 savefs
0.42 137.13 0.59 newfld
0.28 137.52 0.39 funnyvar
0.21 137.81 0.29 1 0.29 0.29 cat
0.14 138.00 0.19 2 0.10 0.10 getargv
0.12 138.16 0.16 matchop
0.11 138.31 0.15 2 0.08 0.08 setsval
0.11 138.46 0.15 pmatch
0.06 138.55 0.09 1 0.09 0.60 initgetrec
0.05 138.62 0.07 condexpr
0.05 138.69 0.07 nawk_tolower
0.03 138.73 0.05 arith
0.03 138.78 0.05 getnf
0.03 138.82 0.04 1 0.04 0.04 recinit
0.01 138.84 0.02 freefa
0.01 138.85 0.02 boolop
0.00 138.85 0.00 125 0.00 0.00 hash
0.00 138.85 0.00 125 0.00 0.00 tostring
0.00 138.85 0.00 89 0.00 0.00 input
0.00 138.85 0.00 63 0.00 0.00 ptoi
0.00 138.85 0.00 61 0.00 0.00 lookup
0.00 138.85 0.00 58 0.00 0.00 resize_state
0.00 138.85 0.00 58 0.00 0.00 setsymtab
0.00 138.85 0.00 41 0.00 0.00 gettok
0.00 138.85 0.00 30 0.00 0.00 yylex
0.00 138.85 0.00 28 0.00 0.00 cgoto
0.00 138.85 0.00 27 0.00 0.00 nodealloc
0.00 138.85 0.00 16 0.00 0.00 unput
0.00 138.85 0.00 14 0.00 0.00 node2
0.00 138.85 0.00 13 0.00 0.00 op2
0.00 138.85 0.00 10 0.00 0.00 node1
0.00 138.85 0.00 9 0.00 0.00 intalloc
0.00 138.85 0.00 8 0.00 0.00 celltonode
0.00 138.85 0.00 7 0.00 0.00 binsearch
0.00 138.85 0.00 7 0.00 0.00 linkum
0.00 138.85 0.00 7 0.00 0.00 word
0.00 138.85 0.00 6 0.00 0.00 peek
0.00 138.85 0.00 6 0.00 0.00 relex
0.00 138.85 0.00 5 0.00 0.00 first
0.00 138.85 0.00 5 0.00 0.00 follow
0.00 138.85 0.00 3 0.00 0.00 concat
0.00 138.85 0.00 3 0.00 0.00 itonp
0.00 138.85 0.00 3 0.00 0.00 makesymtab
0.00 138.85 0.00 3 0.00 0.00 node3
0.00 138.85 0.00 3 0.00 0.00 primary
0.00 138.85 0.00 3 0.00 0.00 stat3
0.00 138.85 0.00 3 0.00 0.00 string
0.00 138.85 0.00 3 0.00 0.00 unary
0.00 138.85 0.00 3 0.00 0.00 unputstr
0.00 138.85 0.00 2 0.00 0.00 exptostat
0.00 138.85 0.00 2 0.00 0.00 makefields
0.00 138.85 0.00 2 0.00 0.00 op1
0.00 138.85 0.00 2 0.00 0.00 rehash
0.00 138.85 0.00 2 0.00 0.00 setfree
0.00 138.85 0.00 1 0.00 0.00 alt
0.00 138.85 0.00 1 0.00 0.00 arginit
0.00 138.85 0.00 1 0.00 0.08 assign
0.00 138.85 0.00 1 0.00 0.00 cfoll
0.00 138.85 0.00 1 0.00 0.00 closeall
0.00 138.85 0.00 1 0.00 0.00 envinit
0.00 138.85 0.00 1 0.00 0.00 freetr
0.00 138.85 0.00 1 0.00 0.00 getpssval
0.00 138.85 0.00 1 0.00 0.00 growfldtab
0.00 138.85 0.00 1 0.00 0.00 makeinit
0.00 138.85 0.00 1 0.00 0.00 mkdfa
0.00 138.85 0.00 1 0.00 0.00 notnull
0.00 138.85 0.00 1 0.00 0.00 penter
0.00 138.85 0.00 1 0.00 0.00 printstat
0.00 138.85 0.00 1 0.00 0.00 regexp
0.00 138.85 0.00 1 0.00 0.00 reparse
0.00 138.85 0.00 1 0.00 0.00 resizesetvec
0.00 138.85 0.00 1 0.00 137.26 run
0.00 138.85 0.00 1 0.00 0.00 stat2
0.00 138.85 0.00 1 0.00 0.00 stdinit
0.00 138.85 0.00 1 0.00 0.00 syminit
0.00 138.85 0.00 1 0.00 0.00 yyparse
PERF.txt
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 972K of event 'cycles:uppp'
# Event count (approx.): 236100649774
#
# Overhead Command Shared Object Symbol
# ........ ....... ................ ..................................
#
17.46% awkPerf awkPerf [.] nematch
16.84% awkPerf libc-2.27.so [.] __GI_____strtod_l_internal
8.52% awkPerf libc-2.27.so [.] _IO_getc
7.58% awkPerf awkPerf [.] readrec
5.78% awkPerf awkPerf [.] refldbld
4.30% awkPerf awkPerf [.] execute
3.78% awkPerf libc-2.27.so [.] str_to_mpn.isra.0
3.63% awkPerf libc-2.27.so [.] __GI_strlen
3.19% awkPerf awkPerf [.] fldbld
2.89% awkPerf awkPerf [.] setfval
2.89% awkPerf awkPerf [.] is_number
2.10% awkPerf libc-2.27.so [.] __GI___strnlen
1.77% awkPerf awkPerf [.] getrec
1.71% awkPerf libc-2.27.so [.] strncpy
1.64% awkPerf libc-2.27.so [.] round_and_return
1.57% awkPerf awkPerf [.] get_str_val
1.53% awkPerf libc-2.27.so [.] __memcpy_generic
1.06% awkPerf awkPerf [.] _IO_getc@plt
0.93% awkPerf libc-2.27.so [.] strtod
0.84% awkPerf libc-2.27.so [.] strcmp
0.73% awkPerf libc-2.27.so [.] __errno_location
0.64% awkPerf awkPerf [.] makedfa
0.60% awkPerf awkPerf [.] incrdecr
0.60% awkPerf awkPerf [.] relop
0.60% awkPerf awkPerf [.] getfval
0.53% awkPerf awkPerf [.] ifstat
0.52% awkPerf awkPerf [.] gettemp
0.49% awkPerf libc-2.27.so [.] strcpy
0.42% awkPerf awkPerf [.] strtod@plt
0.42% awkPerf awkPerf [.] pastat
0.39% awkPerf libc-2.27.so [.] __mpn_construct_double
0.39% awkPerf awkPerf [.] cleanfld
0.38% awkPerf awkPerf [.] getsval
0.37% awkPerf libc-2.27.so [.] __mpn_lshift
0.37% awkPerf awkPerf [.] savefs
0.35% awkPerf awkPerf [.] setlastfld
0.31% awkPerf awkPerf [.] indirect
0.31% awkPerf awkPerf [.] program
0.30% awkPerf awkPerf [.] fieldadr
0.28% awkPerf awkPerf [.] adjbuf
0.25% awkPerf awkPerf [.] tfree
0.23% awkPerf awkPerf [.] __errno_location@plt
0.17% awkPerf awkPerf [.] strncpy@plt
0.15% awkPerf awkPerf [.] strlen@plt
0.08% awkPerf awkPerf [.] strcmp@plt
0.07% awkPerf awkPerf [.] strcpy@plt
0.01% awkPerf libc-2.27.so [.] _IO_file_underflow@@GLIBC_2.17
0.01% awkPerf [unknown] [k] 0xffff000010096654
0.01% awkPerf libc-2.27.so [.] __uflow
0.01% awkPerf libc-2.27.so [.] _IO_default_uflow
0.01% awkPerf libc-2.27.so [.] _IO_switch_to_get_mode
0.01% awkPerf libc-2.27.so [.] read
0.00% awkPerf libc-2.27.so [.] _IO_file_read
0.00% awkPerf libc-2.27.so [.] read_alias_file
0.00% awkPerf libc-2.27.so [.] _IO_vsnprintf
0.00% awkPerf ld-2.27.so [.] _dl_runtime_resolve
0.00% awkPerf libc-2.27.so [.] __strchrnul
0.00% awkPerf libc-2.27.so [.] _dl_addr
0.00% awkPerf ld-2.27.so [.] do_lookup_x
0.00% awkPerf ld-2.27.so [.] strcmp
0.00% awkPerf ld-2.27.so [.] _dl_relocate_object
0.00% awkPerf ld-2.27.so [.] _dl_check_all_versions
0.00% awkPerf ld-2.27.so [.] read
0.00% awkPerf ld-2.27.so [.] dl_main
0.00% awkPerf ld-2.27.so [.] __GI___tunables_init
0.00% awkPerf ld-2.27.so [.] _dl_start
0.00% awkPerf ld-2.27.so [.] _start
0.00% awkPerf [unknown] [k] 0xffff000010081544
#
# (Cannot load tips.txt file, please install perf!)
#
Comments
Post a Comment