Recently I’ve started working on a Rust application. Shortly after the first prototype was ready, I decided to profile it to make sure I didn’t make any horrible mistakes that could affect performance.
Due to the nature of the application, I couldn’t just profile it locally (if that were an option, I’d use flamegraph). It needed to run in production (Kubernetes). So I thought I’d just use `perf`.
```
$ perf
bash: perf: command not found
```
OK, it needs to be installed first. The internet says to run `apt-get install linux-tools-generic`. Sure thing… Oh no:
```
$ perf
WARNING: perf not found for kernel 5.10.226

  You may need to install the following packages for this specific kernel:
    linux-tools-5.10.226-214.880.amzn2.x86_64
    linux-cloud-tools-5.10.226-214.880.amzn2.x86_64

  You may also want to install one of the following packages to keep up to date:
    linux-tools-214.880.amzn2.x86_64
    linux-cloud-tools-214.880.amzn2.x86_64
```
`perf` leverages Linux kernel functionality, so the kernel version is important. But no matter what I did, I couldn’t find a package for this specific kernel.
Compiling perf from the source
After wasting an hour chasing the specific perf package I needed, I decided it was likely more productive to compile it from scratch. So here’s what I ran:
```bash
apt-get update
apt-get install -y git make gcc flex bison
# Fetch only the tagged commit, without checking out any files yet
git clone -b v5.10.205 --single-branch -n --depth=1 --filter=tree:0 \
  git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
cd linux-stable
# Check out only the directories needed to build perf
git sparse-checkout set --no-cone tools scripts
git checkout
cd tools/perf
make
```
Note the git tag with the specific version I’m using.
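If you would rather derive the tag from the running kernel than hard-code it, something like the sketch below might help. The `kernel_tag` helper is hypothetical (it is not part of git or perf), and it assumes the release string starts with `MAJOR.MINOR.PATCH` followed by a dash, as in the Amazon Linux kernel above. Other distributions may use a different naming scheme.

```bash
#!/bin/sh
# Hypothetical helper: turn a kernel release string into a linux-stable tag name.
# Assumption: the release looks like 5.10.226-214.880.amzn2.x86_64.
kernel_tag() {
    # Drop everything from the first "-" onwards and prefix the rest with "v".
    printf 'v%s\n' "${1%%-*}"
}

# Tag candidate for the kernel this shell is running on.
kernel_tag "$(uname -r)"

# Before cloning, you could check that the tag actually exists upstream:
#   git ls-remote --tags \
#     git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git \
#     "$(kernel_tag "$(uname -r)")"
```

If `git ls-remote` prints nothing, the tag does not exist and you will need to pick a nearby one by hand.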
These steps create a fully functioning `perf` binary, but there’s a catch. `perf` can be compiled with various options, which can be viewed by running `perf version --build-options`. This is what you WANT to see:
```
dwarf: [ on ]
dwarf_getlocations: [ on ]
glibc: [ on ]
libbfd: [ on ]
libbfd-buildid: [ on ]
libcap: [ on ]
libelf: [ on ]
libnuma: [ on ]
numa_num_possible_cpus: [ on ]
libperl: [ OFF ]
libpython: [ OFF ]
libcrypto: [ on ]
libunwind: [ on ]
libdw-dwarf-unwind: [ on ]
zlib: [ on ]
lzma: [ on ]
get_cpuid: [ on ]
bpf: [ on ]
libaio: [ on ]
libzstd: [ on ]
disassembler-four-args: [ on ]
```
Without many of these options (e.g. dwarf), you won’t be able to get accurate profiling data.
Most of them were OFF in my case, so I needed to install these dependencies before compiling the final binary:
```bash
apt-get install -y libunwind8-dev libdwarf-dev \
  libelf-dev libdw-dev systemtap-sdt-dev \
  libssl-dev libslang2-dev binutils-dev \
  libzstd-dev libbabeltrace-dev libiberty-dev \
  libnuma-dev libcap-dev
```
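After installing these packages and rebuilding, it can be handy to list only the features that are still off instead of scanning the whole table. Here is a minimal sketch. `list_off_features` is a hypothetical helper name, and it assumes the output keeps the `name: [ OFF ]` shape shown above.

```bash
#!/bin/sh
# Hypothetical helper: read `perf version --build-options` output on stdin
# and print the name of every feature reported as OFF.
list_off_features() {
    awk -F: '/\[ OFF \]/ { gsub(/^[ \t]+|[ \t]+$/, "", $1); print $1 }'
}

# Demo on a captured fragment of the output. With a real build you would run:
#   ./perf version --build-options | list_off_features
printf '%s\n' ' dwarf: [ on ]' ' libperl: [ OFF ]' ' libpython: [ OFF ]' \
    | list_off_features
```

An empty result means every listed feature was compiled in.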
Runtime
Finally, you can copy the compiled binary to your application and start profiling.
Just kidding, of course not 🫠.
At the end of the previous section, we installed a bunch of dependencies in order to compile `perf`, but the corresponding runtime libraries are also needed to run it.
I was able to get them (not sure if it’s a complete list, to be honest) with:
```bash
apt-get install -y libunwind8 libdwarf1 libelf1 libdw1 systemtap \
  libslang2 binutils libzstd1 libbabeltrace1 libnuma1 libcap2
```
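One way to check whether the list is complete on the target image is to ask `ldd` which shared libraries the binary cannot resolve. The sketch below filters its output. `missing_libs` is a hypothetical helper, and it assumes `ldd` marks unresolved entries with `=> not found`, as glibc's `ldd` does. It will not catch libraries that are loaded lazily at runtime.

```bash
#!/bin/sh
# Hypothetical helper: read `ldd` output on stdin and print unresolved libraries.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Demo on sample ldd-style output. On the target image you would run:
#   ldd ./perf | missing_libs
printf '\tlibelf.so.1 => /lib/x86_64-linux-gnu/libelf.so.1 (0x00007f0000000000)\n\tlibunwind.so.8 => not found\n' \
    | missing_libs
```

Each name printed still needs a runtime package installed in the container.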
Now `perf` should mostly work.
One final adjustment you’ll probably need to make is tweaking the `perf_event_paranoid` level. See more here.
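As a rough sketch of what that looks like in practice, the snippet below reads the current level and prints a short summary of it. The summaries paraphrase my understanding of the kernel's sysctl documentation, and `describe_paranoid` is a hypothetical helper. Some distributions patch in extra levels, so treat this as a starting point.

```bash
#!/bin/sh
# Hypothetical helper: summarize a perf_event_paranoid level for unprivileged users.
describe_paranoid() {
    case "$1" in
        -1) echo "almost all events allowed for all users" ;;
        0)  echo "CPU events and kernel profiling allowed, raw tracepoints restricted" ;;
        1)  echo "user and kernel profiling allowed, CPU-wide events restricted" ;;
        *)  echo "user-space profiling only" ;;
    esac
}

level_file=/proc/sys/kernel/perf_event_paranoid
if [ -r "$level_file" ]; then
    level="$(cat "$level_file")"
    echo "perf_event_paranoid=$level: $(describe_paranoid "$level")"
fi

# Lowering the level needs root. In a container it usually has to be changed
# on the host (node), since /proc/sys is typically read-only inside the pod:
#   sysctl -w kernel.perf_event_paranoid=1
```

Running `perf` as root, or granting the container the relevant capabilities, sidesteps the restriction without changing the sysctl.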
Good luck!