Recently I’ve started working on a Rust application. Shortly after the first prototype was ready, I decided to profile it to make sure I didn’t make any horrible mistakes that could affect performance.

Due to the nature of the application, I couldn’t just profile it locally (if that were an option, I’d use flamegraph). It needed to run in production (Kubernetes), so I thought I’d just use perf.

bash
$ perf
bash: perf: command not found

Ok, it needs to be installed first. The internet says to run apt-get install linux-tools-generic. Sure thing… Oh no:

bash
$ perf
WARNING: perf not found for kernel 5.10.226

  You may need to install the following packages for this specific kernel:
    linux-tools-5.10.226-214.880.amzn2.x86_64
    linux-cloud-tools-5.10.226-214.880.amzn2.x86_64

  You may also want to install one of the following packages to keep up to date:
    linux-tools-214.880.amzn2.x86_64
    linux-cloud-tools-214.880.amzn2.x86_64

perf leverages Linux kernel functionality, so the kernel version is important. But no matter what I did, I couldn’t find a package for this specific kernel.

Compiling perf from the source

After wasting an hour chasing the specific perf package I needed, I decided it was likely more productive to compile it from scratch. So here’s what I ran:

bash
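# Build tools needed to compile perf
apt-get update
apt-get install -y git make gcc flex bison
# Shallow, treeless clone of a single tag; -n skips the initial checkout
git clone -b v5.10.205 --single-branch -n --depth=1 --filter=tree:0 \
  git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
cd linux-stable
# Only materialize the directories the perf build needs
git sparse-checkout set --no-cone tools scripts
git checkout
cd tools/perf
make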

Note the git tag passed to -b: it pins the checkout to the specific kernel version I’m using.
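If you want to derive the tag instead of typing it by hand, here is a minimal sketch. The kernel_tag helper is hypothetical (my own name), and it assumes the release string looks like the Amazon Linux one above, where everything before the first dash is the upstream version. Other distros number their kernels differently, so double-check that the resulting tag actually exists in linux-stable.

```shell
# Hypothetical helper: turn a kernel release string into an upstream tag name.
# Assumes the part before the first "-" is the upstream version,
# e.g. 5.10.226-214.880.amzn2.x86_64 -> v5.10.226
kernel_tag() {
  release="${1:-$(uname -r)}"   # default to the running kernel
  version="${release%%-*}"      # drop the distro-specific suffix
  printf 'v%s\n' "$version"
}

# Usage (not executed here):
#   git clone -b "$(kernel_tag)" --single-branch -n --depth=1 --filter=tree:0 \
#     git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
```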

These steps create a fully functioning perf binary, but there’s a catch. perf can be compiled with various options, which can be viewed by running perf version --build-options. This is what you WANT to see:

bash
                         dwarf: [ on  ]
            dwarf_getlocations: [ on  ]
                         glibc: [ on  ]
                        libbfd: [ on  ]
                libbfd-buildid: [ on  ]
                        libcap: [ on  ]
                        libelf: [ on  ]
                       libnuma: [ on  ]
        numa_num_possible_cpus: [ on  ]
                       libperl: [ OFF ]
                     libpython: [ OFF ]
                     libcrypto: [ on  ]
                     libunwind: [ on  ]
            libdw-dwarf-unwind: [ on  ]
                          zlib: [ on  ]
                          lzma: [ on  ]
                     get_cpuid: [ on  ]
                           bpf: [ on  ]
                        libaio: [ on  ]
                       libzstd: [ on  ]
        disassembler-four-args: [ on  ]

Without many of these options (e.g. dwarf), you won’t be able to get accurate profiling data.
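Scanning that list by eye gets old after a few rebuilds, so here is a rough sketch that prints only the features reported as OFF. The missing_features name is made up for this post, and it assumes the output format shown above (a feature name, a colon, then [ on  ] or [ OFF ]).

```shell
# Hypothetical helper: read `perf version --build-options` output on stdin
# and print the names of the features that are marked OFF.
missing_features() {
  awk -F: '/\[ OFF \]/ {
    gsub(/^[ \t]+/, "", $1)   # strip the leading padding before the name
    print $1
  }'
}

# Usage (not executed here):
#   ./perf version --build-options | missing_features
```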

Most of them were OFF in my case, so I needed to install these dependencies before compiling the final binary:

bash
apt-get install -y libunwind8-dev libdwarf-dev \
  libelf-dev libdw-dev systemtap-sdt-dev \
  libssl-dev libslang2-dev binutils-dev \
  libzstd-dev libbabeltrace-dev libiberty-dev \
  libnuma-dev libcap-dev

Runtime

Finally, you can copy the compiled binary to your application and start profiling.

Just kidding, of course not 🫠.

At the end of the previous section, we installed a bunch of dependencies in order to compile perf, but running it requires the matching runtime libraries too.

I was able to get them (not sure if it’s a complete list, to be honest) with:

bash
apt-get install -y libunwind8 libdwarf1 libelf1 libdw1 systemtap \
      libslang2 binutils libzstd1 libbabeltrace1 libnuma1 libcap2
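Since I’m not sure the list is complete, one way to check is to ask ldd which shared libraries the binary still can’t resolve. This is only a sketch: missing_libs is a hypothetical helper name, and it assumes the usual glibc ldd output, where unresolved libraries show up as "=> not found".

```shell
# Hypothetical helper: read `ldd` output on stdin and print the names of
# shared libraries that could not be resolved.
missing_libs() {
  awk '/=> not found/ { print $1 }'
}

# Usage (not executed here), run where perf is going to live:
#   ldd ./perf | missing_libs
```

An empty result suggests the dynamic linker can find everything the binary links against.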

Now perf should mostly work.

One final adjustment you’ll probably need to make is tweaking the perf_event_paranoid level. See more here.
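For a quick idea of what the current level means, here is a small sketch based on my reading of the kernel’s sysctl documentation. The paranoid_summary helper is hypothetical, the descriptions are paraphrased, and some distros patch in extra levels above 2 that it doesn’t distinguish. The restrictions apply to unprivileged users and are cumulative as the level goes up.

```shell
# Hypothetical helper: describe a perf_event_paranoid level.
# Restrictions apply to unprivileged users and stack up with the level.
paranoid_summary() {
  case "$1" in
    -1) echo "all events allowed for all users" ;;
    0)  echo "raw and ftrace tracepoints restricted" ;;
    1)  echo "CPU event access also restricted" ;;
    *)  echo "kernel profiling also restricted" ;;   # 2 (the default) or higher
  esac
}

# Usage (not executed here):
#   paranoid_summary "$(cat /proc/sys/kernel/perf_event_paranoid)"
#   sysctl -w kernel.perf_event_paranoid=1   # needs root
```

In a container, /proc/sys may well be read-only, so the sysctl call might have to happen on the node itself or in a privileged container. Treat that as an assumption to verify for your cluster.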

Good luck!