
Posts

Showing posts from March, 2020

Benchmarking

Photo by Alex manlyx on Unsplash

Time to get our hands dirty and do some benchmarking. The goal of Lab6 is to run the different sound volume algorithms described in my last post on five different machines and compare the results. I covered the algorithms in the last post, so now it’s time to talk about the machines. Here they are:

                    AARCHIE     BBETTY      CCHARLIE   ISRAEL        XERXES
OS                  Fedora 28   Fedora 31   Fedora 30  Ubuntu 19.04  Fedora 30
Architecture        aarch64     aarch64     aarch64    aarch64       x86_64
CPU(s)              24          8           8          16            8
Thread(s) per core  1           1           1          1             2
Model name          Cortex-A53  Cortex-A57  X-Gene     Cortex-A72    Intel(R) Xeon(R) CPU E5-1630 v4 @ 3.70GHz
L1d cache           32K         -           unknown    32K           32K
L1i cache           32K         -           unknown    48K           32K
L2 cache            256K        -           unknown    1024K         256K
L3 cache            4096K       -           -          -             10240K

Before running anything, we need to make sure to get the time consumed by the algorithm o...

Performance Tuning Hero

Photo by Ayo Ogunseinde on Unsplash

Week 8! I’m Rodrigo, and this is my blog about my SPO 600 course. I’ve been posting since January, and so far I haven’t told you what SPO stands for, right? It is Software Portability and Optimization. Today we will approach the Optimization part differently. Instead of squeezing the compiler, we will care about how the software works. I have solid experience in performance tuning for Oracle Database, PL/SQL and SQL. I can say that, by far, the most significant gain in execution time lies in how the software is designed. The same steps that I’ve used there are the ones we are going to use in the course. First, do not touch the code without knowing how bad it is. Benchmarking is a must. Before, during and after, these metrics will guide our work and justify hours of analysis and development. It must be done right, like a methodical scientist collecting vital data without rushing. The more in-depth information you gather, the easier the next steps will be. Se...

Just a Loop

Photo by Tine Ivanič on Unsplash

Hey everyone! This post is dedicated to Lab 5. I’ll say upfront that this was the most challenging loop of my career. Yes! If you guessed assembly, you were right. Our professor was kind enough to provide the necessary code for each platform. He also gave us access to several machines with different hardware capabilities for both architectures. Combined with the previous lecture, we had everything we needed to succeed. The goal was to create a program that shows “loop” ten times. Next, we had to change it to include the index number. Finally, we had to extend it to show the message thirty times, with the index suppressing the leading zero. Don’t forget, on BOTH architectures! If it were in C, C++, Java, JavaScript or Bash, I would have had time for a coffee. To be fair, the first task was to get the “hello world” running with the script provided. That was easy, and the only one that I accomplished. To compile it, I added a new target to the Makefile provided. The result y...

Hello World in Assembly

Photo by Martin Sanchez on Unsplash

Assembly alert! I promised to talk more about the compiler, but before that, more assembly. It is for a good reason, though. I was surprised to see my professor doing the “hello world” in assembly! Not only on x86_64 but also on ARMv8, each with different source code. He coded it just like C or C++, saved it, compiled it and ran it. The output was exactly like that of any other language. Nobody knew, but he was doing Lab 5! I wish I had recorded it. Our goal in this class was to compare the compiler output with the hand-crafted assembly. As expected, the compiler produced non-optimal executables even when we played with some of the fine-tuning options that I mentioned in the last post. How do you compile assembly code? Here are the commands:

- Using GNU Assembler
> as -g -o test.o test.s
> ld -o test test.o
- Using NASM Assembler
> nasm -g -f elf64 -o test.o test.s
> ld -o test test.o
- Using GCC
> gcc -g -o test.o test.S ...

x86_64 vs ARMv8

Photo by Brian Kostiuk on Unsplash

Things are getting interesting in the SPO 600 course. It’s time to get familiar with modern processor architectures: the x86_64, which powers almost everything today, and the newer ARMv8, which is gaining traction mostly because of its energy efficiency. Also, for the first time, we will “forget” assembly and focus on the compiler. So, what is the difference between x86_64 and ARMv8? Designing a processor from scratch is hard and expensive, so instead the 32-bit x86 was extended to work in 64 bits: x86_64. That strategy popularized the 64-bit environment. On the other side, the ARMv8 was designed for 64 bits from the beginning, and its energy efficiency made it attractive for mobile applications. Who remembers the RISC vs CISC competition? The RISC concept tells us to execute simple operations quickly. The CISC concept is quite the opposite: complex operations will perform better than a bunch of simple ones. Who won? Well, everybody won!...