Linux Perf Profiler Support in Native Image
The perf profiler is a performance analysis tool in Linux that enables you to collect and analyze various performance-related data such as CPU utilization, memory usage, and more.
It is particularly useful for profiling and understanding the behavior of applications.
Installation
Most Linux distributions ship with perf pre-installed. If it is not available, install it using your package manager.
To install perf on Oracle Linux/Red Hat/CentOS, run this command:
sudo yum install perf
To install perf on Debian/Ubuntu, run the following commands one by one:
sudo apt update
sudo apt install linux-tools-common linux-tools-generic
After installing perf, back up the default values of the following options:
cat /proc/sys/kernel/perf_event_paranoid > perf_event_paranoid.backup
cat /proc/sys/kernel/kptr_restrict > kptr_restrict.backup
Then set them to the new desired values (writing to these files requires root privileges):
echo -1 > /proc/sys/kernel/perf_event_paranoid
echo 0 > /proc/sys/kernel/kptr_restrict
In the example above, -1 and 0 are used as values. These are the least restrictive settings, so using them in a production environment is not recommended.
You can customize these values according to your needs.
perf_event_paranoid has four different levels (values):
- -1: Allow use of (almost) all events by all users.
- >=0: Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN.
- >=1: Disallow CPU event access by users without CAP_SYS_ADMIN.
- >=2: Disallow kernel profiling by users without CAP_SYS_ADMIN.
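Because the levels are cumulative, it can help to see which restrictions a given value implies. The following is a minimal sketch; the function name `describe_perf_event_paranoid` is hypothetical and only restates the list above.

```shell
#!/usr/bin/env bash
# Print the restrictions implied by a perf_event_paranoid value.
# The levels are cumulative: a value of 2 also implies the restrictions
# listed for >=0 and >=1.
# describe_perf_event_paranoid is a hypothetical helper name.
describe_perf_event_paranoid() {
  local level="$1"
  if [ "$level" -lt 0 ]; then
    echo "allow use of (almost) all events by all users"
    return 0
  fi
  echo "ftrace function tracepoint requires CAP_SYS_ADMIN"
  if [ "$level" -ge 1 ]; then
    echo "CPU event access requires CAP_SYS_ADMIN"
  fi
  if [ "$level" -ge 2 ]; then
    echo "kernel profiling requires CAP_SYS_ADMIN"
  fi
}
```

To describe the value currently in effect, call it as describe_perf_event_paranoid "$(cat /proc/sys/kernel/perf_event_paranoid)".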
kptr_restrict has three different levels (values):
- 0: Kernel pointers are readable by all users.
- 1: Kernel pointers are only accessible to privileged users (those with the CAP_SYS_ADMIN capability).
- 2: Kernel pointers are hidden from all users.
Once finished using perf, restore the original values:
cat perf_event_paranoid.backup > /proc/sys/kernel/perf_event_paranoid
cat kptr_restrict.backup > /proc/sys/kernel/kptr_restrict
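The back up, set, and restore steps above can be bundled so the relaxed values are in effect only while a profiling command runs. This is a sketch under stated assumptions: the function name `with_relaxed_perf_settings` and the `SYSCTL_DIR` variable are hypothetical, and `SYSCTL_DIR` exists only so the function can be tried against a scratch directory before it is run as root against the real files.

```shell
#!/usr/bin/env bash
# Sketch: temporarily relax perf-related kernel settings while a command runs.
# SYSCTL_DIR is a hypothetical override for dry runs; by default it points at
# the real files, which require root privileges to modify.
SYSCTL_DIR="${SYSCTL_DIR:-/proc/sys/kernel}"

with_relaxed_perf_settings() {
  local paranoid_file="$SYSCTL_DIR/perf_event_paranoid"
  local kptr_file="$SYSCTL_DIR/kptr_restrict"
  local old_paranoid old_kptr status

  # Back up the current values.
  old_paranoid="$(cat "$paranoid_file")" || return 1
  old_kptr="$(cat "$kptr_file")" || return 1

  # Apply the least restrictive values (not recommended in production).
  printf '%s\n' -1 > "$paranoid_file" || return 1
  if ! printf '%s\n' 0 > "$kptr_file"; then
    # Undo the first change if the second one could not be applied.
    printf '%s\n' "$old_paranoid" > "$paranoid_file"
    return 1
  fi

  # Run the wrapped command and remember its exit status.
  "$@"
  status=$?

  # Restore the original values.
  printf '%s\n' "$old_paranoid" > "$paranoid_file"
  printf '%s\n' "$old_kptr" > "$kptr_file"
  return "$status"
}
```

Run as root, a call such as with_relaxed_perf_settings perf record -o perf.data <your_executable> would profile with the relaxed settings and then restore the previous ones. The sketch does not restore the values if the shell is interrupted by a signal while the command runs.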
Building Native Executables
The following command assumes that native-image is installed and available on the system path.
If it is not installed, refer to the Getting Started guide.
native-image -g -H:+PreserveFramePointer <entry_class>
The -g option instructs Native Image to produce debug information for the generated binary.
perf can use this debug information, for example, to provide proper names for types and methods in traces.
The -H:+PreserveFramePointer option instructs Native Image to save frame pointers on the stack.
This allows perf to reliably unwind stack frames and reconstruct the call hierarchy.
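As an illustration, the build command can be wrapped in a small function that first checks that native-image is on the system path. This is a sketch only: the function name `build_for_perf` is hypothetical, and the options are the two described above.

```shell
#!/usr/bin/env bash
# Sketch: build a native executable suitable for profiling with perf.
# build_for_perf is a hypothetical helper name.
# Usage: build_for_perf <entry_class>
build_for_perf() {
  local entry_class="$1"
  if ! command -v native-image >/dev/null 2>&1; then
    echo "native-image not found on the system path" >&2
    return 127
  fi
  # -g adds debug information; -H:+PreserveFramePointer keeps frame pointers
  # on the stack so that perf can unwind stack frames.
  native-image -g -H:+PreserveFramePointer "$entry_class"
}
```

For a class named HelloWorld (an assumed example), the call would be build_for_perf HelloWorld.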
Profiling of Runtime-Compiled Methods
Native Image can generate detailed runtime compilation metadata for perf in the jitdump format. This enables perf profiling of runtime-compiled methods, for example, for Truffle compilations.
jitdump
The jitdump format stores detailed metadata for runtime-compiled code. This requires post-processing of the perf data to inject the runtime compilation metadata.
- Build with jitdump support:
  native-image -g -H:+PreserveFramePointer -H:+RuntimeDebugInfo -H:RuntimeDebugInfoFormat=jitdump ...
  At image run time, a jitdump file (named in the form _jit-*.dump_) is created in the output directory, and runtime compilation metadata is written to it. The output directory can be configured with the `-R:RuntimeJitdumpDir=` option (defaults to _./jitdump_).
- Record with perf:
  When recording profiling data, use the -k 1 option to ensure time-based events are ordered correctly for injection:
  perf record -k 1 -o perf.data <your-application>
  If the perf data was not recorded with -k 1, injecting runtime compilation metadata from a jitdump file will fail.
- Inject jitdump into perf data:
  perf inject -j -i perf.data -o perf.jit.data
  This step:
  - Locates the jitdump file.
  - Generates a .so file for each runtime compilation entry in the jitdump file.
  - Injects runtime compilation metadata into the profiling data and stores it in perf.jit.data.
- Inspect profiling data:
  perf report -i perf.jit.data
  Symbols from the jitdump file appear as coming from shared objects named _jitted-*.so_, whose names include `code_id`, the index of a compilation entry in the jitdump file.
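The record and inject steps above can be chained in one function. The sketch below assumes the application was already built with jitdump support; the function name `profile_with_jitdump` is hypothetical, and the file names match those used in the steps.

```shell
#!/usr/bin/env bash
# Sketch: record a profile of an application built with jitdump support,
# then inject the runtime compilation metadata into the perf data.
# profile_with_jitdump is a hypothetical helper name.
# Usage: profile_with_jitdump <your-application> [args...]
profile_with_jitdump() {
  # -k 1 keeps time-based events ordered correctly for injection.
  perf record -k 1 -o perf.data "$@"
  if [ ! -s perf.data ]; then
    echo "perf.data was not written; nothing to inject" >&2
    return 1
  fi
  # Merge the jitdump metadata into a new data file.
  perf inject -j -i perf.data -o perf.jit.data || return 1
  echo "Inspect the result with: perf report -i perf.jit.data"
}
```

The function checks for a non-empty perf.data rather than the exit status of perf record, so a non-zero exit code from the profiled application does not by itself skip the injection step.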
Basic Operations
CPU Profiling
- List all available events:
  perf list
  This command displays a list of all available events that you can use for profiling.
- Record CPU events:
  perf record -e <event> -o perf.data <your_executable>
  Replace <event> with the desired event from the list. This command profiles your executable and saves the data to a file named perf.data.
- Generate a report:
  perf report
  This command generates a report based on the collected data. You can use various options to customize the output.
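The record and report steps can be combined for a quick look at a single event. This is a minimal sketch: the function name `cpu_profile` is hypothetical, adding -g to capture call graphs is a choice made here, and --stdio makes perf report print plain text instead of opening its interactive view.

```shell
#!/usr/bin/env bash
# Sketch: record one event for an executable and print a text report.
# cpu_profile is a hypothetical helper name.
# Usage: cpu_profile <event> <your_executable> [args...]
cpu_profile() {
  if [ "$#" -lt 2 ]; then
    echo "usage: cpu_profile <event> <your_executable> [args...]" >&2
    return 2
  fi
  local event="$1"
  shift
  # Record the chosen event with call graphs into perf.data.
  perf record -e "$event" -g -o perf.data "$@"
  # Print the report as plain text.
  perf report -i perf.data --stdio
}
```

For example, cpu_profile cycles <your_executable> samples the cycles hardware event. On systems where hardware counters are unavailable, such as some virtual machines, a software event like cpu-clock may be needed instead; perf list shows what the system supports.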
Memory Profiling
- Record memory events:
  perf record -e memory:<event> -o perf.data <your_executable>
  Replace <event> with a specific memory event. This command profiles memory-related events.
- Generate a memory report:
  perf report --sort=dso
  This command generates a report focused on memory-related events, sorted by dynamic shared object (DSO).
Tracing
- Record system-wide traces:
  sudo perf record -a -g -o perf.data
  This command records system-wide traces, including call-graph information, and saves the data to a file named perf.data. Use sudo for system-wide tracing. Because no executable is given, recording continues until you stop it with Ctrl+C.
- Generate a trace report:
  perf script
  This command prints the recorded trace data as text, which can then be processed by scripts for further analysis.
Generating Flame Graphs from Profiling Data
FlameGraph is a tool written in Perl that can be used to produce flame graphs from perf profiling data. Flame graphs generated by this tool visualize stack samples as interactive SVGs, making it easy to identify hot code paths in an application.
- Download the tool and record profiling data as described in Basic Operations.
  Make sure the profiling data was recorded with -g to capture call graphs, otherwise the flame graph will be flat.
- Fold stacks:
  perf script -i perf.data | ./stackcollapse-perf.pl > perf.data.folded
- Render an SVG:
  ./flamegraph.pl perf.data.folded > perf.data.svg
- Open the flame graph:
  Use an application to view the generated SVG file (for example, firefox or chromium).
  firefox perf.data.svg
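The fold and render steps above can be sketched as one function. The function name `make_flamegraph` and the `FLAMEGRAPH_DIR` variable are hypothetical; `FLAMEGRAPH_DIR` is assumed to name the directory that contains stackcollapse-perf.pl and flamegraph.pl, and defaults to the current directory as in the steps.

```shell
#!/usr/bin/env bash
# Sketch: fold the stacks in a perf data file and render them as an SVG.
# FLAMEGRAPH_DIR is a hypothetical variable naming the directory that holds
# stackcollapse-perf.pl and flamegraph.pl.
FLAMEGRAPH_DIR="${FLAMEGRAPH_DIR:-.}"

# make_flamegraph is a hypothetical helper name.
# Usage: make_flamegraph [perf-data-file]   (defaults to perf.data)
make_flamegraph() {
  local data="${1:-perf.data}"
  # Fold stacks: one line per unique stack, followed by its sample count.
  perf script -i "$data" | "$FLAMEGRAPH_DIR/stackcollapse-perf.pl" > "$data.folded" || return 1
  # Render the folded stacks as an interactive SVG.
  "$FLAMEGRAPH_DIR/flamegraph.pl" "$data.folded" > "$data.svg"
}
```

With the FlameGraph scripts checked out from https://github.com/brendangregg/FlameGraph into a directory named FlameGraph, setting FLAMEGRAPH_DIR=FlameGraph and calling make_flamegraph perf.data writes perf.data.folded and perf.data.svg next to the input file.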
Highlighting Runtime-Compiled Methods
If the native image supports profiling of runtime-compiled methods, it is possible to highlight runtime-compiled symbols in the flame graph.
- Build the native image with jitdump support, record profiling data, and inject the jitdump information as described in jitdump.
- Fold stacks:
  This involves folding the stacks for the non-jitdump-injected perf.data and the jitdump-injected perf.jit.data.
  perf script -i perf.data | ./stackcollapse-perf.pl > perf.data.folded
  perf script -i perf.jit.data | ./stackcollapse-perf.pl > perf.jit.data.folded
- Generate a consistent color palette map:
  Use the non-jitdump-injected perf.data.folded to create a consistent palette map in palette.map for events in perf.data. The first call with --cp creates the map, while subsequent calls with --cp reuse the map for consistent coloring of known events. This also produces a flame graph for the non-jitdump-injected data.
  ./flamegraph.pl --cp perf.data.folded > perf.data.svg
- Reuse the color palette map:
  Use the consistent palette for already known events with --cp for the jitdump-injected perf.jit.data.folded. That is, events already seen in the non-jitdump-injected perf.data.folded get a fixed coloring. New events get a random coloring from the palette selected with the --color option (for example, mem).
  ./flamegraph.pl --cp --color mem perf.jit.data.folded > perf.jit.data.svg
- Open the flame graph:
  firefox perf.jit.data.svg
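The two-pass palette workflow above can be sketched as a single function. The function name `make_highlighted_flamegraphs` and the `FLAMEGRAPH_DIR` variable are hypothetical; the sketch assumes perf.data and perf.jit.data already exist in the current directory.

```shell
#!/usr/bin/env bash
# Sketch: render a baseline flame graph and a jitdump-injected one that
# share a color palette, so runtime-compiled symbols stand out.
# FLAMEGRAPH_DIR is a hypothetical variable naming the directory that holds
# stackcollapse-perf.pl and flamegraph.pl.
FLAMEGRAPH_DIR="${FLAMEGRAPH_DIR:-.}"

# make_highlighted_flamegraphs is a hypothetical helper name.
make_highlighted_flamegraphs() {
  # Fold stacks for both the original and the jitdump-injected data.
  perf script -i perf.data | "$FLAMEGRAPH_DIR/stackcollapse-perf.pl" > perf.data.folded || return 1
  perf script -i perf.jit.data | "$FLAMEGRAPH_DIR/stackcollapse-perf.pl" > perf.jit.data.folded || return 1

  # First pass: --cp creates palette.map from the non-injected data.
  "$FLAMEGRAPH_DIR/flamegraph.pl" --cp perf.data.folded > perf.data.svg || return 1

  # Second pass: --cp reuses palette.map for known events; events that are
  # new in the injected data are colored from the "mem" palette.
  "$FLAMEGRAPH_DIR/flamegraph.pl" --cp --color mem perf.jit.data.folded > perf.jit.data.svg
}
```

Both flamegraph.pl calls run from the same working directory so that the second call finds the palette.map written by the first.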
Generate an Invocation-Time-Ordered Flame Graph
Generate a stack-reversed flame graph with the topmost frames shown at the bottom of the flame graph in order of invocation time. Calls appear left-to-right in chronological order, with stack frames in each call arranged top-to-bottom from oldest to newest. Events from all threads contributing to the profiling data are shown interleaved.
./flamegraph.pl --reverse perf.data.folded > perf.data.svg
firefox perf.data.svg