Truffle Language Safepoint Tutorial

As of version 21.1, Truffle supports guest language safepoints. Truffle safepoints allow guest language execution to be interrupted in order to perform thread local actions submitted by a language or tool. A safepoint is a location during guest language execution where the state is consistent and other operations can read it.

This replaces previous instrumentation or assumption-based approaches to safepoints, which required the code to be invalidated for a thread local action to be performed. The new implementation uses fast thread local checks and callee register saved stub calls to optimize for performance and keep the overhead minimal. This means that every loop back-edge and method exit performs an additional non-volatile read, which can lead to slight slowdowns.

Use Cases #

Common use-cases of Truffle language safepoints are:

  • Cancellation, requested exit or interruptions during guest language execution. The stack is unwound by submitting a thread local action.
  • Reading the current stack trace information for threads other than the currently executing thread.
  • Enumerating all object references active on the stack.
  • Running a guest signal handler or guest finalizer on a given thread.
  • Implementing guest languages that expose a safepoint mechanism as part of their development toolkit.
  • Debuggers evaluating expressions in languages that do not support execution on multiple threads.

Language Support #

Safepoints are explicitly polled by invoking the TruffleSafepoint.poll(Node) method. A Truffle guest language implementation must ensure that a safepoint is polled repeatedly within a constant time interval. For example, a single arithmetic expression completes within a constant number of CPU cycles. However, a loop that sums the values of an array takes a non-constant amount of time that depends on the actual array size. This typically means that safepoints are best polled at the end of loops and at the end of function or method calls to cover recursion. In addition, any guest language code that blocks execution, such as guest language locks, needs to use the TruffleSafepoint.setBlocked(Interrupter) API to allow cooperative polling of safepoints while the thread is waiting.

Please read more details on what steps language implementations need to take to support thread local actions in the javadoc.
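
The polling discipline described above can be illustrated without Truffle. The following is a minimal sketch in plain Java, not Truffle's implementation: the class SafepointPollSketch and its methods submit, poll and sum are hypothetical names introduced only for this illustration, and a volatile field stands in for the fast thread local check. It shows a loop whose running time depends on the array size polling once per back-edge, so that a submitted action runs within a bounded amount of work.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;

/** Simplified, hypothetical model of cooperative safepoint polling (not the Truffle API). */
final class SafepointPollSketch {
    // Actions submitted by other threads for the polling thread.
    private final ConcurrentLinkedQueue<Runnable> pending = new ConcurrentLinkedQueue<>();
    // Cheap flag read on every poll; it is only set when work has been queued.
    private volatile boolean actionPending;

    /** Queues an action; it runs the next time the polling thread reaches a poll. */
    void submit(Runnable action) {
        pending.add(action);   // publish the action first ...
        actionPending = true;  // ... then raise the flag
    }

    /** Fast path is a single field read; the slow path drains the queue. */
    void poll() {
        if (actionPending) {
            actionPending = false;
            Runnable action;
            while ((action = pending.poll()) != null) {
                action.run();
            }
        }
    }

    /** Non-constant-time work: one poll per loop back-edge. */
    long sum(int[] values) {
        long total = 0;
        for (int value : values) {
            total += value;
            poll();
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        SafepointPollSketch safepoint = new SafepointPollSketch();
        AtomicBoolean stop = new AtomicBoolean();
        // The worker spins like a long-running guest loop and polls on every iteration.
        Thread worker = new Thread(() -> {
            while (!stop.get()) {
                safepoint.poll();
            }
        });
        worker.start();
        // The submitted action is executed by the worker thread, not by the submitter.
        safepoint.submit(() -> {
            System.out.println("action ran on " + Thread.currentThread().getName());
            stop.set(true);
        });
        worker.join();
    }
}
```

In a real language implementation the poll would be a call to TruffleSafepoint.poll(Node) at the corresponding places in the interpreter, and the pending actions would be managed by the Truffle runtime rather than by the language.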

Thread Local Actions #

Languages and instruments can submit actions using their environment.

Usage example:


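Env env; // language or instrument environment

// Passing null instead of a thread array submits the action to all threads.
env.submitThreadLocal(null, new ThreadLocalAction(true /*side-effecting*/, true /*synchronous*/) {
     @Override
     protected void perform(Access access) {
         // Invoked on each target thread when it reaches a safepoint.
         assert access.getThread() == Thread.currentThread();
     }
});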

Read more in the javadoc.

Current Limitations #

There is currently no way to run thread local actions while the thread is executing in boundary-annotated methods, unless the method cooperatively polls safepoints or uses the blocking API. Unfortunately, it is not always possible to poll safepoints cooperatively, for example, if the code is currently executing third-party native code. A future improvement will allow running code for other threads while they are blocked. This will make it possible to collect guest language stack traces from other threads while they are blocked by uncooperative native code. When the native call returns, the thread will then need to wait for any thread local action that is currently executing on its behalf. This is one of the reasons why it is recommended to use ThreadLocalAction.Access.getThread() instead of directly using Thread.currentThread(). Currently, the action is performed at the next safepoint location after the native code returns.
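
The idea behind the blocking API mentioned above can be sketched in plain Java. This is an illustrative model only, under the assumption that the blocked operation is interruptible: the class BlockedSafepointSketch and its methods submit, poll, lockCooperatively and demo are hypothetical and are not part of Truffle. The submitter queues an action and interrupts the waiting thread; the waiting thread runs the action and then resumes waiting. Code that blocks in a way that cannot be interrupted, such as uncooperative native code, has no equivalent of the catch block below, which is the limitation this section describes.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical model of cooperative blocking (not the Truffle API). */
final class BlockedSafepointSketch {
    private final ConcurrentLinkedQueue<Runnable> pending = new ConcurrentLinkedQueue<>();

    /** Queues an action for the target thread and wakes it up if it is blocked. */
    void submit(Thread target, Runnable action) {
        pending.add(action);
        target.interrupt(); // plays the role of an interrupter
    }

    /** Runs all queued actions on the calling thread. */
    void poll() {
        Runnable action;
        while ((action = pending.poll()) != null) {
            action.run();
        }
    }

    /** Acquires the lock, but processes submitted actions whenever the wait is interrupted. */
    void lockCooperatively(ReentrantLock lock) {
        while (true) {
            try {
                lock.lockInterruptibly();
                return;
            } catch (InterruptedException e) {
                poll(); // woken up by submit(): run the actions, then wait again
            }
        }
    }

    /** Returns true if an action ran on the waiting thread while the caller still held the lock. */
    static boolean demo() {
        BlockedSafepointSketch safepoint = new BlockedSafepointSketch();
        ReentrantLock lock = new ReentrantLock();
        CountDownLatch ran = new CountDownLatch(1);
        AtomicReference<Thread> ranOn = new AtomicReference<>();
        lock.lock(); // the caller owns the lock, so the worker has to wait for it
        Thread worker = new Thread(() -> {
            safepoint.lockCooperatively(lock);
            lock.unlock();
        });
        worker.start();
        try {
            safepoint.submit(worker, () -> {
                ranOn.set(Thread.currentThread());
                ran.countDown();
            });
            boolean done = ran.await(5, TimeUnit.SECONDS);
            return done && ranOn.get() == worker;
        } catch (InterruptedException e) {
            return false;
        } finally {
            lock.unlock(); // only now can the worker acquire the lock and finish
        }
    }

    public static void main(String[] args) {
        System.out.println("action ran while worker was waiting: " + demo());
    }
}
```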

Tooling for Debugging #

There are several debug options available:

Exercise safepoints with SafepointALot #

SafepointALot is a tool to exercise every safepoint of an application and collect statistics.

If enabled with the --engine.SafepointALot option, it prints statistics on the CPU time interval between safepoints at the end of an execution.

For example, running:

graalvm/bin/js --engine.SafepointALot js-benchmarks/harness.js -- octane-deltablue.js

prints the following output to the log on context close:

DeltaBlue: 540
[engine] Safepoint Statistics
  --------------------------------------------------------------------------------------
   Thread Name         Safepoints | Interval     Avg              Min              Max
  --------------------------------------------------------------------------------------
   main                  48384054 |            0.425 us           0.1 us       42281.1 us
  -------------------------------------------------------------------------------------
   All threads           48384054 |            0.425 us           0.1 us       42281.1 us

It is recommended that guest language implementations try to keep the interval between safepoints below 1 ms on average. Note that precise timing can depend on the CPU and on interruptions by the GC. Since GC times are included in the safepoint interval times, it is expected that the maximum is close to the maximum GC interruption time. Future versions of this tool will be able to exclude GC interruption times from this statistic.
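
To make the Avg, Min and Max columns concrete, the following is a small sketch of how such interval statistics could be computed for one thread. It is not the tool's implementation: the class SafepointIntervalStats and its methods are hypothetical, and timestamps are passed in by the caller. The demo in main uses System.nanoTime(), which measures elapsed time rather than CPU time, so its numbers are only a rough analogue of what SafepointALot reports.

```java
/** Hypothetical helper that tracks the time between consecutive safepoint polls of one thread. */
final class SafepointIntervalStats {
    private boolean hasLast;  // false until the first poll has been seen
    private long lastNanos;   // timestamp of the previous poll
    private long intervals;   // number of measured intervals
    private long totalNanos;  // sum of all intervals
    private long minNanos = Long.MAX_VALUE;
    private long maxNanos = Long.MIN_VALUE;

    /** Records one poll; every poll after the first contributes one interval. */
    void recordPoll(long nowNanos) {
        if (hasLast) {
            long delta = nowNanos - lastNanos;
            intervals++;
            totalNanos += delta;
            minNanos = Math.min(minNanos, delta);
            maxNanos = Math.max(maxNanos, delta);
        }
        hasLast = true;
        lastNanos = nowNanos;
    }

    long intervals() {
        return intervals;
    }

    /** Average interval in microseconds, or NaN if fewer than two polls were recorded. */
    double averageMicros() {
        return intervals == 0 ? Double.NaN : totalNanos / 1_000.0 / intervals;
    }

    /** Shortest interval in microseconds, or NaN if fewer than two polls were recorded. */
    double minMicros() {
        return intervals == 0 ? Double.NaN : minNanos / 1_000.0;
    }

    /** Longest interval in microseconds, or NaN if fewer than two polls were recorded. */
    double maxMicros() {
        return intervals == 0 ? Double.NaN : maxNanos / 1_000.0;
    }

    public static void main(String[] args) {
        SafepointIntervalStats stats = new SafepointIntervalStats();
        long sink = 0;
        for (int i = 0; i < 1_000_000; i++) {
            sink += i;                            // stand-in for guest language work
            stats.recordPoll(System.nanoTime());  // stand-in for a safepoint poll
        }
        System.out.printf("intervals=%d avg=%.3f us min=%.3f us max=%.3f us (sink=%d)%n",
                stats.intervals(), stats.averageMicros(), stats.minMicros(), stats.maxMicros(), sink);
    }
}
```

A single long pause, for example a GC interruption between two polls, shows up directly in the maximum while barely moving the average, which matches the shape of the output above.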

Trace thread local actions #

The option --engine.TraceThreadLocalActions traces all thread local actions, regardless of their origin.

Example output:

[engine] [tl] submit                 0  thread[main]                action[SampleAction$8@5672f0d1]     all-threads[alive=4]        side-effecting     asynchronous
[engine] [tl]   perform-start        0  thread[pool-1-thread-410]   action[SampleAction$8@5672f0d1]
[engine] [tl]   perform-start        0  thread[pool-1-thread-413]   action[SampleAction$8@5672f0d1]
[engine] [tl]   perform-start        0  thread[pool-1-thread-412]   action[SampleAction$8@5672f0d1]
[engine] [tl]   perform-done         0  thread[pool-1-thread-413]   action[SampleAction$8@5672f0d1]
[engine] [tl]   perform-done         0  thread[pool-1-thread-410]   action[SampleAction$8@5672f0d1]
[engine] [tl]   perform-start        0  thread[pool-1-thread-411]   action[SampleAction$8@5672f0d1]
[engine] [tl]   perform-done         0  thread[pool-1-thread-412]   action[SampleAction$8@5672f0d1]
[engine] [tl]   perform-done         0  thread[pool-1-thread-411]   action[SampleAction$8@5672f0d1]
[engine] [tl] done                   0  thread[pool-1-thread-411]   action[SampleAction$8@5672f0d1]

Print guest and host stack frames at a regular interval #

The option --engine.TraceStackTraceInterval=1000 sets the time interval, in milliseconds, at which the current stack trace is repeatedly printed. Note that the stack trace is printed at the next safepoint poll and therefore might not be accurate.

graalvm/bin/js --engine.TraceStackTraceInterval=1000 js-benchmarks/harness.js -- octane-deltablue.js

Prints the following output:

[engine] Stack Trace Thread main: org.graalvm.polyglot.PolyglotException
	at <js> BinaryConstraint.chooseMethod(octane-deltablue.js:359-381:9802-10557)
	at <js> Constraint.satisfy(octane-deltablue.js:176:5253-5275)
	at <js> Planner.incrementalAdd(octane-deltablue.js:597:16779-16802)
	at <js> Constraint.addConstraint(octane-deltablue.js:165:4883-4910)
	at <js> UnaryConstraint(octane-deltablue.js:219:6430-6449)
	at <js> StayConstraint(octane-deltablue.js:297:8382-8431)
	at <js> chainTest(octane-deltablue.js:817:23780-23828)
	at <js> deltaBlue(octane-deltablue.js:883:25703-25716)
	at <js> MeasureDefault(harness.js:552:20369-20383)
	at <js> BenchmarkSuite.RunSingleBenchmark(harness.js:614:22538-22550)
	at <js> RunNextBenchmark(harness.js:340:11560-11614)
	at <js> RunStep(harness.js:141:5673-5686)
	at <js> BenchmarkSuite.RunSuites(harness.js:160:6247-6255)
	at <js> runBenchmarks(harness.js:686-688:24861-25023)
	at <js> main(harness.js:734:26039-26085)
	at <js> :program(harness.js:783:27470-27484)
	at org.graalvm.polyglot.Context.eval(Context.java:348)
	at com.oracle.truffle.js.shell.JSLauncher.executeScripts(JSLauncher.java:347)
	at com.oracle.truffle.js.shell.JSLauncher.launch(JSLauncher.java:88)
	at org.graalvm.launcher.AbstractLanguageLauncher.launch(AbstractLanguageLauncher.java:124)
	at org.graalvm.launcher.AbstractLanguageLauncher.launch(AbstractLanguageLauncher.java:71)
	at com.oracle.truffle.js.shell.JSLauncher.main(JSLauncher.java:73)

[engine] Stack Trace Thread main: org.graalvm.polyglot.PolyglotException
	at <js> EqualityConstraint.execute(octane-deltablue.js:528-530:14772-14830)
	at <js> Plan.execute(octane-deltablue.js:781:22638-22648)
	at <js> chainTest(octane-deltablue.js:824:24064-24077)
	at <js> deltaBlue(octane-deltablue.js:883:25703-25716)
	at <js> MeasureDefault(harness.js:552:20369-20383)
	at <js> BenchmarkSuite.RunSingleBenchmark(harness.js:614:22538-22550)
	at <js> RunNextBenchmark(harness.js:340:11560-11614)
	at <js> RunStep(harness.js:141:5673-5686)
	at <js> BenchmarkSuite.RunSuites(harness.js:160:6247-6255)
	at <js> runBenchmarks(harness.js:686-688:24861-25023)
	at <js> main(harness.js:734:26039-26085)
	at <js> :program(harness.js:783:27470-27484)
	at org.graalvm.polyglot.Context.eval(Context.java:348)
	at com.oracle.truffle.js.shell.JSLauncher.executeScripts(JSLauncher.java:347)
	at com.oracle.truffle.js.shell.JSLauncher.launch(JSLauncher.java:88)
	at org.graalvm.launcher.AbstractLanguageLauncher.launch(AbstractLanguageLauncher.java:124)
	at org.graalvm.launcher.AbstractLanguageLauncher.launch(AbstractLanguageLauncher.java:71)
	at com.oracle.truffle.js.shell.JSLauncher.main(JSLauncher.java:73)

Further Reading #

Daloze, Benoit, Chris Seaton, Daniele Bonetta, and Hanspeter Mössenböck. “Techniques and applications for guest-language safepoints.” In Proceedings of the 10th Workshop on Implementation, Compilation, Optimization of Object-Oriented Languages, Programs and Systems, pp. 1-10. 2015.

https://dl.acm.org/doi/abs/10.1145/2843915.2843921