Java has quite possibly one of the worst ways to interact with native methods: JNI (Java Native Interface). The JNI APIs follow weird conventions, and interop across the language boundary is unbelievably cumbersome to deal with. Graal VM introduces a new way to cross the language boundary that is easier to maintain and allows interop with any native library.
Exchanging anything other than primitive data via JNI typically involves reaching back into the VM (very slow), using a serialization library like Protobuf (still slow), or directly accessing memory by address in Java via sun.misc.Unsafe (fast but unstable). That’s why I was super excited about Graal VM’s polyglot abilities, which will essentially eliminate the need for JNI and introduce a safer way to cross the language boundary from Java. The LLVM interoperability section of the reference manual gave me a lot of hope and showed how simple this can be.
It turns out there are still some limitations in the polyglot capabilities of Graal VM. Java interop with native methods requires you to create a native image (compile the Java application to executable code).
The only problem is that the native image building functionality in Graal VM is quite limited (as of now), and it is not always possible to build one for complex Java applications. I still wanted to see whether this is indeed simpler than using JNI and whether there is any performance gap between the two approaches (the fastest way that I know of is to expose the address via a JNI method and write wrapper classes that access memory directly using Unsafe in Java).
Exposing C data structures
Let’s start with a simple data structure (in file triple.h) and create the corresponding structure for it in Java.
The C struct package (org.graalvm.nativeimage.c.struct) provides the functionality required to achieve this. For our example, we need two concepts to complete the mapping:
- CField - Maps a primitive field in struct to Java.
- CFieldAddress - Maps a complex (or nested) struct by leveraging composition in Java.
value_t is pretty straightforward, and CField lets us create getters and setters that directly interact with native memory.
That’s not so bad: no hardcoded constants or crazy long method names, and it allows bi-directional mutations. This is already better than JNI. Similarly, we can map Triple, which is a composite structure that carries 3 values. A slight difference here is that we will reference the Java Value interface using address indirection. This means we don’t get setters/getters like in the previous example, which makes sense. You cannot really set a new subject on the C struct as it is defined (it is possible if those member fields are pointers to
Voila! We can now access these just like the rest of our Java code.
I glossed over one thing to get to the mapping part: Graal VM requires these struct classes to be enclosed in a CContext. This is used during the native-image build to resolve the offsets correctly.
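To make the mapping concrete, here is a rough sketch of what such declarations can look like with the org.graalvm.nativeimage.c annotations. It is not the original code from this post: the field names (id, subject, predicate, object), the struct name triple_t, the header path and the holder class TripleHeaders are all assumptions made for illustration. It only compiles against the GraalVM SDK and is only meaningful as part of a native-image build.

```java
import java.util.Collections;
import java.util.List;

import org.graalvm.nativeimage.c.CContext;
import org.graalvm.nativeimage.c.struct.CField;
import org.graalvm.nativeimage.c.struct.CFieldAddress;
import org.graalvm.nativeimage.c.struct.CStruct;
import org.graalvm.word.PointerBase;

// Hypothetical holder class: the CContext tells native-image which header to parse.
@CContext(TripleHeaders.Directives.class)
final class TripleHeaders {

    static final class Directives implements CContext.Directives {
        @Override
        public List<String> getHeaderFiles() {
            // Assumed location, quoted the same way as in a C #include directive.
            return Collections.singletonList("\"/path/to/triple.h\"");
        }
    }

    // Assumes: typedef struct { int64_t id; /* ... */ } value_t;
    @CStruct("value_t")
    interface Value extends PointerBase {
        @CField("id")
        long getId();          // reads the field straight from native memory

        @CField("id")
        void setId(long id);   // writes the field straight to native memory
    }

    // Assumes: typedef struct { value_t subject, predicate, object; } triple_t;
    @CStruct("triple_t")
    interface Triple extends PointerBase {
        @CFieldAddress("subject")
        Value subject();       // address of the embedded struct, no copy

        @CFieldAddress("predicate")
        Value predicate();

        @CFieldAddress("object")
        Value object();
    }
}
```

With declarations along these lines, something like `triple.subject().getId()` would read the nested field directly, with the offsets resolved at image build time rather than hardcoded.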
Interacting with native functions
The section above showed how C data structures can be exposed to Java. But we still need some functions to create these structures so the rest of the code can interact with them. One thing I like about this is that object/struct creation is completely managed by native code, unlike JNI where there is no such strict contract (I may be wrong, but I couldn’t find any documentation that says otherwise).
Let’s say we have this basic function, which creates the triple and returns the address of the created object.
Calling this from Java is really simple, with just one annotation and a reference to the Triple class created above.
After that Graal still needs to be informed which library provides this method and this can be done with another annotation (assuming the shared library built with above called is named
The transition mentioned above can be tuned further to improve performance. Using the default value for the transition causes Graal to switch the thread state from Java to C, which keeps the Java parts of the stack available. No transition can be used in tighter loops where the native method is just doing some raw computation.
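As a rough illustration of the function binding and the transition setting, the sketch below uses the CFunction and CLibrary annotations from org.graalvm.nativeimage.c.function. The C function names (create_triple, triple_checksum), their parameters, the library name "triple" and the class TripleFunctions are hypothetical, not taken from the original code. In a real project these declarations would normally sit in the same CContext-annotated class as the struct interfaces, and they only work inside a native image.

```java
import org.graalvm.nativeimage.c.function.CFunction;
import org.graalvm.nativeimage.c.function.CFunction.Transition;
import org.graalvm.nativeimage.c.function.CLibrary;
import org.graalvm.word.PointerBase;

// Hypothetical library name: asks the image builder to link against it.
@CLibrary("triple")
final class TripleFunctions {

    // Stand-in for the mapped Triple struct interface, kept opaque here.
    interface TriplePointer extends PointerBase {
    }

    // Assumed C signature: triple_t *create_triple(int64_t s, int64_t p, int64_t o);
    // Uses the default transition, so the thread state switches from Java to C.
    @CFunction("create_triple")
    static native TriplePointer createTriple(long subject, long predicate, long object);

    // Assumed C signature: int64_t triple_checksum(const triple_t *t);
    // Pure computation that never calls back into Java, so the transition is skipped.
    @CFunction(value = "triple_checksum", transition = Transition.NO_TRANSITION)
    static native long checksum(TriplePointer triple);
}
```

Skipping the transition is only reasonable for short native calls that do not block and do not call back into Java, which is why it is shown here on the computation-only function.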
Native interaction via JNI/Unsafe
Here is an example of creating a similar interface using sun.misc.Unsafe and getting access to the native data structure in Java. Not only is this very error-prone and hard to debug, it is also extremely hard to evolve as the structure changes over the development lifecycle (not to mention the deprecation of Unsafe in recent JDK versions). A safer alternative is to create a DirectByteBuffer via the JNI API and use that to achieve similar functionality (with a small performance penalty, since it involves reaching back into the VM from the native context).
First we need to create a special JNI method that allows Java to bind native methods. JNI methods need to follow the specific naming convention Java_package_name_ClassName_method for them to be detected correctly.
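The naming convention can be made concrete with a small helper that derives the expected native symbol from a class name and a method name. This is a sketch of the basic mangling rules from the JNI specification (package separators become underscores, a literal underscore becomes _1, other non-alphanumeric characters become _0 followed by four lowercase hex digits). The class JniSymbolNames and the example names are made up for illustration, and overloaded native methods, which append an argument signature, are not handled.

```java
import java.util.Locale;

final class JniSymbolNames {
    private JniSymbolNames() {
    }

    /** Escapes one name (class or method) using the basic JNI mangling rules. */
    static String mangle(String name) {
        StringBuilder out = new StringBuilder();
        for (char c : name.toCharArray()) {
            if (c == '.' || c == '/') {
                out.append('_');            // package separator
            } else if (c == '_') {
                out.append("_1");           // literal underscore
            } else if (c == ';') {
                out.append("_2");
            } else if (c == '[') {
                out.append("_3");
            } else if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9')) {
                out.append(c);
            } else {
                // Any other character is written as _0xxxx with lowercase hex digits.
                out.append(String.format(Locale.ROOT, "_0%04x", (int) c));
            }
        }
        return out.toString();
    }

    /** Builds the symbol the JVM looks up for a non-overloaded native method. */
    static String symbolFor(String fullyQualifiedClassName, String methodName) {
        return "Java_" + mangle(fullyQualifiedClassName) + "_" + mangle(methodName);
    }
}
```

For a hypothetical class com.example.Native_Lib with a native method getAddress, this yields Java_com_example_Native_1Lib_getAddress, which is the name the C function would need to carry.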
Basically, all we are doing with the native method is exposing the raw memory address to Java, and we use Unsafe APIs to access the memory directly. Obviously, this is very unsafe, so we need to hide this abomination from callers. Here’s an example of a wrapper class that hides these details (it is still a major pain to maintain the magic numbers on an ongoing basis).
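To show what "hiding the magic numbers" looks like in practice, here is a minimal sketch of such a wrapper. Note the swap: instead of sun.misc.Unsafe and a raw address, it wraps a direct ByteBuffer (the safer alternative mentioned earlier), so it runs on a plain JDK. The struct layout, the offsets and the class name TripleView are assumptions for illustration and would have to match the real C definitions.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class TripleView {
    // Assumed layout, for illustration only:
    //   typedef struct { int64_t id; int32_t type; /* 4 bytes padding */ } value_t;
    //   typedef struct { value_t subject, predicate, object; } triple_t;
    static final int VALUE_SIZE = 16;
    static final int ID_OFFSET = 0;
    static final int TYPE_OFFSET = 8;
    static final int SUBJECT_OFFSET = 0;
    static final int PREDICATE_OFFSET = VALUE_SIZE;
    static final int OBJECT_OFFSET = 2 * VALUE_SIZE;
    static final int TRIPLE_SIZE = 3 * VALUE_SIZE;

    private final ByteBuffer memory;

    TripleView(ByteBuffer memory) {
        if (memory.limit() < TRIPLE_SIZE) {
            throw new IllegalArgumentException("buffer too small for a triple");
        }
        // duplicate() shares the bytes but resets the byte order, so set it explicitly.
        this.memory = memory.duplicate().order(ByteOrder.nativeOrder());
    }

    // Callers only see named accessors; every offset calculation stays in this class.
    long subjectId() { return memory.getLong(SUBJECT_OFFSET + ID_OFFSET); }

    void subjectId(long id) { memory.putLong(SUBJECT_OFFSET + ID_OFFSET, id); }

    int subjectType() { return memory.getInt(SUBJECT_OFFSET + TYPE_OFFSET); }

    long predicateId() { return memory.getLong(PREDICATE_OFFSET + ID_OFFSET); }

    long objectId() { return memory.getLong(OBJECT_OFFSET + ID_OFFSET); }

    void objectId(long id) { memory.putLong(OBJECT_OFFSET + ID_OFFSET, id); }
}
```

Every constant above has to be kept in sync with the C header by hand, which is exactly the maintenance burden described here. An Unsafe-based version would look the same from the caller's side, with the buffer replaced by a base address.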
If you need to abstract this out further, take a look at the Javolution codebase. It’s a great reference for building a generic framework that can eliminate many of the constants in the above code.
As always, take these micro benchmarks with a grain of salt and make sure to conduct a similar experiment for your own application to really quantify the difference.
The linked source code below has a very basic benchmark that runs 100M iterations interacting with these methods.
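For readers who want to repeat the measurement, a timing loop of this kind can be sketched as follows. This is not the benchmark from the repository: the class IterationTimer, the warm-up size and the use of a LongSupplier are assumptions chosen to keep the example small.

```java
import java.util.function.LongSupplier;

final class IterationTimer {
    // Written once after the loops so the JIT cannot discard the measured work.
    static volatile long sink;

    private IterationTimer() {
    }

    /** Runs the operation repeatedly and returns the average nanoseconds per call. */
    static double nanosPerIteration(long iterations, LongSupplier operation) {
        if (iterations <= 0) {
            throw new IllegalArgumentException("iterations must be positive");
        }
        long acc = 0;
        // Short warm-up so the timed loop mostly runs compiled code.
        long warmup = Math.min(iterations, 100_000L);
        for (long i = 0; i < warmup; i++) {
            acc += operation.getAsLong();
        }
        long start = System.nanoTime();
        for (long i = 0; i < iterations; i++) {
            acc += operation.getAsLong();
        }
        long elapsed = System.nanoTime() - start;
        sink = acc;
        return (double) elapsed / iterations;
    }
}
```

A call such as `IterationTimer.nanosPerIteration(100_000_000L, () -> view.subjectId())`, where `view` is whatever wrapper is under test, gives a per-iteration figure. A harness like JMH is the more reliable choice for numbers that need to be trusted.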
Here are the results,
|Method||Time per iteration|
Not only is the code cleaner with Graal VM, but it is also about 1.4x faster than the fastest JNI implementation I know of. I am not sure how much of that is due to AOT compilation and how much is something fundamental in the native method interaction.
At this point, you might be wondering why even bother crossing the language boundary, as performance differences between modern languages are quite small. For a database, it is beneficial to handle requests and responses in Java: this lets us use server frameworks like Netty or support a variety of serialization formats and charset encodings. At the same time, a database is all about control: you want to make sure resources are released when they are supposed to be released and that query limits (memory or time) are enforced strictly. Having a fast, maintainable way to cross the language boundary when needed makes these types of use cases possible.
Even though it is required to go through the native image route in Graal VM, the actual interaction between Java and native methods is much easier and more succinct than using JNI. With sufficient care, it could even be backward compatible, as long as changes to structs only add new members. I am hopeful that once the existing limitations of native image are removed, language interop will become straightforward and open up a new class of applications on Graal VM.
All of the code mentioned in this post is available at the following repository: graal-native-interaction.