diff --git a/CHANGELOG.md b/CHANGELOG.md index 6b0e308..0e50d58 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,6 +1,12 @@ # Change Log All notable changes to this project will be documented in this file. This change log follows the conventions of [keepachangelog.com](http://keepachangelog.com/). +## [0.1.220] - 2021-10-09 +### Fixed +- All-primitive method types still used serialization when called from `cfn` +- Arrays deserialized to non-vector sequences +- Non-primitive argument types fail to link + ## [0.1.205] - 2021-10-06 ### Added - An `address?` predicate @@ -41,6 +47,7 @@ All notable changes to this project will be documented in this file. This change - Support for serializing and deserializing arbitrary Clojure functions - Support for serializing and deserializing arbitrary Clojure data structures +[0.1.220]: https://github.com/IGJoshua/coffi/compare/v0.1.205...v0.1.220 [0.1.205]: https://github.com/IGJoshua/coffi/compare/v0.1.192...v0.1.205 [0.1.192]: https://github.com/IGJoshua/coffi/compare/v0.1.184...v0.1.192 [0.1.184]: https://github.com/IGJoshua/coffi/compare/v0.1.176...v0.1.184 diff --git a/README.md b/README.md index 49ff808..c896dce 100644 --- a/README.md +++ b/README.md @@ -17,8 +17,8 @@ This library is available on Clojars. Add one of the following entries to the `:deps` key of your `deps.edn`: ```clojure -org.suskalo/coffi {:mvn/version "0.1.205"} -io.github.IGJoshua/coffi {:git/tag "v0.1.205" :git/sha "0149012"} +org.suskalo/coffi {:mvn/version "0.1.220"} +io.github.IGJoshua/coffi {:git/tag "v0.1.220" :git/sha "abcbf0f"} ``` If you use this library as a git dependency, you will need to prepare the @@ -590,6 +590,406 @@ This functionality can be extended by specifying new types as implementations of the multimethod `reify-symbolspec`, although it's recommended that for any library authors who do so, namespaced keywords be used to name types. +## Alternatives +This library is not the only Clojure library providing access to native code. 
In +addition, the following libraries exist: + +- [dtype-next](https://github.com/cnuernber/dtype-next) +- [tech.jna](https://github.com/techascent/tech.jna) +- [clojure-jna](https://github.com/Chouser/clojure-jna) + +Dtype-next supports Java versions 8-16 and GraalVM, but it is focused +strongly on array-based programming, doesn't provide facilities for +callbacks, and favors keeping memory on the native side rather +than marshaling data to and from Clojure-native structures. In Java 16, it +uses the first iteration of Panama, while in other Java versions it uses JNA. + +Tech.jna and clojure-jna both use the JNA library in all cases, and neither +provides support for dealing with struct types or callbacks. + +An additional alternative to coffi is to use the JNI directly. It is the +longest-standing method of wrapping native code in the JVM, but comes with the +downside that it requires you to write both native and Java code, even if +you only intend to use it from Clojure. + +If your application needs to run on versions of the JVM earlier than +17, or you don't want to use incubator functionality, you should consider these +other options. Dtype-next provides the most robust support for native code, but +if you are wrapping a simple library then the other libraries may be more +appealing, as they have a smaller API surface area and make it easier to wrap +functions. + +### Benchmarks +An additional consideration when thinking about alternatives is the performance +of each available option. It's an established fact that JNA (used by all three +alternative libraries on JDK <16) introduces more overhead when calling native +code than JNI does. + +To see how much of a difference the different +native interfaces make, we can use +[criterium](https://github.com/hugoduncan/criterium) to benchmark each. 
+[GLFW](https://www.glfw.org)'s +[`glfwGetTime`](https://www.glfw.org/docs/latest/group__input.html#gaa6cf4e7a77158a3b8fd00328b1720a4a) +function will be used for the test as it performs a simple operation, and is +conveniently already wrapped in JNI by the excellent +[LWJGL](https://www.lwjgl.org/) library. + +The following benchmarks were run on a Lenovo Thinkpad with an Intel i7-10610U +running Manjaro Linux, using Clojure 1.10.3 on Java 17. + +#### JNI +The baseline for performance is the JNI. Using LWJGL it's relatively simple to +benchmark. The following Clojure CLI command will start a repl with LWJGL and +criterium loaded. + +```sh +$ clj -Sdeps '{:deps {org.lwjgl/lwjgl {:mvn/version "3.2.3"} + org.lwjgl/lwjgl-glfw {:mvn/version "3.2.3"} + org.lwjgl/lwjgl$natives-linux {:mvn/version "3.2.3"} + org.lwjgl/lwjgl-glfw$natives-linux {:mvn/version "3.2.3"} + criterium/criterium {:mvn/version "0.4.6"}}}' +``` + +Then from the repl + +```clojure +user=> (import 'org.lwjgl.glfw.GLFW) +org.lwjgl.glfw.GLFW +user=> (require '[criterium.core :as bench]) +nil +user=> (GLFW/glfwInit) +true +user=> (bench/bench (GLFW/glfwGetTime) :verbose) +amd64 Linux 5.10.68-1-MANJARO 8 cpu(s) +OpenJDK 64-Bit Server VM 17+35-2724 +Runtime arguments: -Dclojure.basis=/home/jsusk/.clojure/.cpcache/2667074721.basis +Evaluation count : 1613349900 in 60 samples of 26889165 calls. + Execution time sample mean : 32.698446 ns + Execution time mean : 32.697811 ns +Execution time sample std-deviation : 1.274600 ns + Execution time std-deviation : 1.276437 ns + Execution time lower quantile : 30.750813 ns ( 2.5%) + Execution time upper quantile : 33.757662 ns (97.5%) + Overhead used : 6.400704 ns +nil +``` + +GLFW requires that we initialize it before calling the `glfwGetTime` function. +Besides that this is a simple interop call which directly maps to the native +function. + +This gives us a basis of 32.7 ns +/-1.3 ns. All other libraries will be +evaluated relative to this result. 
+ +To ensure fairness, we'll also get that overhead value to be used in further +tests. + +```clojure +user=> bench/estimated-overhead-cache +6.400703613065185E-9 +``` + +#### Coffi +The dependencies when using coffi are simpler, but it also requires some JVM +options to support the foreign access api. + +```sh +$ clj -Sdeps '{:deps {org.suskalo/coffi {:mvn/version "0.1.205"} + criterium/criterium {:mvn/version "0.4.6"}}}' \ + -J--add-modules=jdk.incubator.foreign \ + -J--enable-native-access=ALL-UNNAMED +``` + +In order to ensure fair comparisons, we're going to use the same overhead value +on each run, so before we do the benchmark we'll set it to the observed value +from last time. + +```clojure +user=> (require '[criterium.core :as bench]) +nil +user=> (alter-var-root #'bench/estimated-overhead-cache (constantly 6.400703613065185E-9)) +6.400703613065185E-9 +user=> (require '[coffi.ffi :as ffi]) +nil +user=> (require '[coffi.mem :as mem]) +nil +user=> (ffi/load-system-library "glfw") +nil +user=> ((ffi/cfn "glfwInit" [] ::mem/int)) +1 +user=> (let [f (ffi/cfn "glfwGetTime" [] ::mem/double)] + (bench/bench (f) :verbose)) +amd64 Linux 5.10.68-1-MANJARO 8 cpu(s) +OpenJDK 64-Bit Server VM 17+35-2724 +Runtime arguments: --add-modules=jdk.incubator.foreign --enable-native-access=ALL-UNNAMED -Dclojure.basis=/home/jsusk/.clojure/.cpcache/72793624.basis +Evaluation count : 1657995600 in 60 samples of 27633260 calls. + Execution time sample mean : 31.382665 ns + Execution time mean : 31.386493 ns +Execution time sample std-deviation : 1.598571 ns + Execution time std-deviation : 1.608818 ns + Execution time lower quantile : 29.761194 ns ( 2.5%) + Execution time upper quantile : 33.228276 ns (97.5%) + Overhead used : 6.400704 ns +nil +``` + +This result is about 1.3 ns faster, and while that is less than the standard +deviation of 1.6, it's quite close to it. 
+ +#### Clojure-JNA +Clojure-JNA uses the JNA library, which was designed to provide Java with an +easy way to access native libraries, but which is known for not having the +greatest performance. Since this is an older project, I'm also including the +clojure dependency to ensure the correct version is used. + +```sh +$ clj -Sdeps '{:deps {org.clojure/clojure {:mvn/version "1.10.3"} + net.n01se/clojure-jna {:mvn/version "1.0.0"} + criterium/criterium {:mvn/version "0.4.6"}}}' +``` + +The naive way to call the function using Clojure-JNA is to use `jna/invoke`. + +```clojure +user=> (require '[criterium.core :as bench]) +nil +user=> (alter-var-root #'bench/estimated-overhead-cache (constantly 6.400703613065185E-9)) +6.400703613065185E-9 +user=> (require '[net.n01se.clojure-jna :as jna]) +nil +user=> (jna/invoke Integer glfw/glfwInit) +1 +user=> (bench/bench (jna/invoke Double glfw/glfwGetTime) :verbose) +amd64 Linux 5.10.68-1-MANJARO 8 cpu(s) +OpenJDK 64-Bit Server VM 17+35-2724 +Runtime arguments: -Dclojure.basis=/home/jsusk/.clojure/.cpcache/3229486237.basis +Evaluation count : 195948720 in 60 samples of 3265812 calls. + Execution time sample mean : 350.335614 ns + Execution time mean : 350.373520 ns +Execution time sample std-deviation : 24.833070 ns + Execution time std-deviation : 24.755929 ns + Execution time lower quantile : 300.000019 ns ( 2.5%) + Execution time upper quantile : 365.759273 ns (97.5%) + Overhead used : 6.400704 ns + +Found 13 outliers in 60 samples (21.6667 %) + low-severe 12 (20.0000 %) + low-mild 1 (1.6667 %) + Variance from outliers : 53.4220 % Variance is severely inflated by outliers +nil +``` + +As you can see, this method of calling functions is very bad for performance, +with call overhead dominating function runtime by an order of magnitude. That +said, this isn't a completely fair comparison, nor the most realistic, because +this way of calling functions looks the function up on each invocation. 
+ +To adjust for this, we'll use the `jna/to-fn` function to get a persistent +handle to the function that we can call. + +```clojure +user=> (let [f (jna/to-fn Double glfw/glfwGetTime)] + (bench/bench (f) :verbose)) +amd64 Linux 5.10.68-1-MANJARO 8 cpu(s) +OpenJDK 64-Bit Server VM 17+35-2724 +Runtime arguments: -Dclojure.basis=/home/jsusk/.clojure/.cpcache/3229486237.basis +Evaluation count : 611095020 in 60 samples of 10184917 calls. + Execution time sample mean : 104.623634 ns + Execution time mean : 104.638406 ns +Execution time sample std-deviation : 7.649296 ns + Execution time std-deviation : 7.638963 ns + Execution time lower quantile : 92.446016 ns ( 2.5%) + Execution time upper quantile : 110.258832 ns (97.5%) + Overhead used : 6.400704 ns +nil +``` + +This is much better, but is still about 3x slower than JNI, meaning the overhead +from using JNA is still bigger than the function runtime. + +This performance penalty is still small in the scope of longer-running +functions, and so may not be a concern for your application, but it is something +to be aware of. + +#### tech.jna +The tech.jna library is similar in scope to Clojure-JNA, but was written to +fit into an ecosystem of libraries meant for array-based programming for machine +learning and data science. + +```sh +$ clj -Sdeps '{:deps {techascent/tech.jna {:mvn/version "4.05"} + criterium/criterium {:mvn/version "0.4.6"}}}' +``` + +This library is also quite simple to use. The only slightly odd thing I'm doing +here is dereferencing the var outside the benchmark to ensure it's an +apples-to-apples comparison. We don't want var dereference time mucking up our +benchmark. 
+ +```clojure +user=> (require '[criterium.core :as bench]) +nil +user=> (alter-var-root #'bench/estimated-overhead-cache (constantly 6.400703613065185E-9)) +6.400703613065185E-9 +user=> (require '[tech.v3.jna :as jna]) +nil +user=> (jna/def-jna-fn "glfw" glfwInit "initialize glfw" Integer) +#'user/glfwInit +user=> (glfwInit) +Oct 09, 2021 10:30:50 AM clojure.tools.logging$eval1122$fn__1125 invoke +INFO: Library glfw found at [:system "glfw"] +1 +user=> (jna/def-jna-fn "glfw" glfwGetTime "gets the time as a double since init" Double) +#'user/glfwGetTime +user=> (let [f @#'glfwGetTime] + (bench/bench (f) :verbose)) +amd64 Linux 5.10.68-1-MANJARO 8 cpu(s) +OpenJDK 64-Bit Server VM 17+35-2724 +Runtime arguments: -Dclojure.basis=/home/jsusk/.clojure/.cpcache/2910209237.basis +Evaluation count : 323281680 in 60 samples of 5388028 calls. + Execution time sample mean : 203.976803 ns + Execution time mean : 203.818712 ns +Execution time sample std-deviation : 14.557312 ns + Execution time std-deviation : 14.614080 ns + Execution time lower quantile : 179.732593 ns ( 2.5%) + Execution time upper quantile : 213.929374 ns (97.5%) + Overhead used : 6.400704 ns +nil +``` + +This version is even slower than Clojure-JNA. I'm unsure where this overhead is +coming from, but I'll admit that I haven't looked at their implementations very +closely. + +#### dtype-next +The library dtype-next replaced tech.jna in the toolkit of the group working on +machine learning and array-based programming, and it includes support for +composite data types including structs, as well as primitive functions. + +In addition, dtype-next has two different ffi backends. First is JNA, which is +usable on any JDK version, and is what we'll use for the first benchmark. Second +is the Java 16 version of Project Panama, which will be shown next. + +In order to use the dtype-next ffi with the JNA backend, the JNA library has to +be included in the dependencies. 
+ +```sh +$ clj -Sdeps '{:deps {cnuernber/dtype-next {:mvn/version "8.032"} + net.java.dev.jna/jna {:mvn/version "5.8.0"} + criterium/criterium {:mvn/version "0.4.6"}}}' +``` + +The dtype-next library also requires some more ceremony around declaring native +functions. One advantage this has is that multiple symbols with the same name +can be loaded from different shared libraries, but it also does increase +friction when defining native wrappers. + +Some easier ways to define native wrappers are provided than what is seen here, +but they share some disadvantages in documentation over the core methods +provided in coffi, although they are comparable to the data model provided in +coffi. + +```clojure +user=> (require '[criterium.core :as bench]) +nil +user=> (alter-var-root #'bench/estimated-overhead-cache (constantly 6.400703613065185E-9)) +6.400703613065185E-9 +user=> (require '[tech.v3.datatype.ffi :as dt-ffi]) +nil +user=> (def fn-defs {:glfwInit {:rettype :int32} :glfwGetTime {:rettype :float64}}) +#'user/fn-defs +user=> (def library-def (dt-ffi/define-library fn-defs)) +#'user/library-def +user=> (def library-instance (dt-ffi/instantiate-library library-def "/usr/lib/libglfw.so")) +#'user/library-instance +user=> (def init (:glfwInit @library-instance)) +#'user/init +user=> (init) +1 +user=> (let [f (:glfwGetTime @library-instance)] + (bench/bench (f) :verbose)) +amd64 Linux 5.10.68-1-MANJARO 8 cpu(s) +OpenJDK 64-Bit Server VM 17+35-2724 +Runtime arguments: -Dclojure.basis=/home/jsusk/.clojure/.cpcache/643862289.basis +Evaluation count : 710822100 in 60 samples of 11847035 calls. 
+ Execution time sample mean : 90.900112 ns + Execution time mean : 90.919917 ns +Execution time sample std-deviation : 6.463312 ns + Execution time std-deviation : 6.470108 ns + Execution time lower quantile : 79.817126 ns ( 2.5%) + Execution time upper quantile : 95.454652 ns (97.5%) + Overhead used : 6.400704 ns +nil +``` + +This version of JNA usage is significantly faster than either of the other JNA +libraries, but is still substantially slower than using JNI or coffi. + +In addition to the JNA backend, dtype-next has a Java 16-specific backend that +uses an older version of Panama. This version requires similar setup to coffi in +order to run. + +```sh +$ clj -Sdeps '{:deps {cnuernber/dtype-next {:mvn/version "8.032"} + criterium/criterium {:mvn/version "0.4.6"}}}' \ + -J--add-modules=jdk.incubator.foreign \ + -J-Dforeign.restricted=permit \ + -J--add-opens=java.base/java.lang=ALL-UNNAMED \ + -J-Djava.library.path=/usr/lib/x86_64-linux-gnu +``` + +The actual code to run the benchmark is identical to the last example, but is +reproduced here for completeness. 
+ +```clojure +user=> (require '[criterium.core :as bench]) +nil +user=> (alter-var-root #'bench/estimated-overhead-cache (constantly 6.400703613065185E-9)) +6.400703613065185E-9 +user=> (require '[tech.v3.datatype.ffi :as dt-ffi]) +nil +user=> (def fn-defs {:glfwInit {:rettype :int32} :glfwGetTime {:rettype :float64}}) +#'user/fn-defs +user=> (def library-def (dt-ffi/define-library fn-defs)) +#'user/library-def +user=> (def library-instance (dt-ffi/instantiate-library library-def "/usr/lib/libglfw.so")) +#'user/library-instance +user=> (def init (:glfwInit @library-instance)) +#'user/init +user=> (init) +1 +user=> (let [f (:glfwGetTime @library-instance)] + (bench/bench (f) :verbose)) +amd64 Linux 5.10.68-1-MANJARO 8 cpu(s) +OpenJDK 64-Bit Server VM 16.0.2+7 +Runtime arguments: --add-modules=jdk.incubator.foreign -Dforeign.restricted=permit --add-opens=java.base/java.lang=ALL-UNNAMED -Djava.library.path=/usr/lib/x86_64-linux-gnu -Dclojure.basis=/home/jsusk/.clojure/.cpcache/2337051659.basis +Evaluation count : 1588513080 in 60 samples of 26475218 calls. + Execution time sample mean : 58.732468 ns + Execution time mean : 58.647361 ns +Execution time sample std-deviation : 9.732389 ns + Execution time std-deviation : 9.791738 ns + Execution time lower quantile : 31.318115 ns ( 2.5%) + Execution time upper quantile : 65.449222 ns (97.5%) + Overhead used : 6.400704 ns + +Found 14 outliers in 60 samples (23.3333 %) + low-severe 8 (13.3333 %) + low-mild 4 (6.6667 %) + high-mild 2 (3.3333 %) + Variance from outliers : 87.6044 % Variance is severely inflated by outliers +nil +``` + +Not reproduced here, but notable for comparison, in my testing Java 16's version +of the JNI version performed about the same. + +This is significantly faster than the JNA version of dtype-next, but it is still +slower than modern Panama. 
This is likely a result of optimizations +and changes to the Panama API. When dtype-next is updated to use the Java 17 +version of Panama, I expect it will perform in line with coffi, and this +benchmark will be rerun when that happens. Still, this shows that as it +stands, coffi is the fastest FFI available to Clojure developers. + ## Known Issues The project author is aware of these issues and plans to fix them in a future release: @@ -603,6 +1003,28 @@ These features are planned for future releases. - Functions for wrapping structs in padding following various standards - Header parsing tool for generating a data model? - Generic type aliases +- Helpers for generating enums & bitflags +- Helper macro for out arguments +- Improve error messages from defcfn macro +- Mapped memory + +### Future JDKs +The purpose of coffi is to provide a wrapper for published versions of Project +Panama, starting with JDK 17. As new JDKs are released, coffi will be ported to +the newer versions of Panama. When JDK 18 is released, a tag will be added to +mark the final release of coffi that is compatible with Java 17, as that is the +LTS release of the JDK. Development of new features and fixes as well as support +for new Panama idioms and features will continue with focus only on the latest +JDK. If a particular feature is not specific to the newer JDK, PRs backporting +it to versions of coffi supporting Java 17 will likely be accepted. + +### 1.0 Release +Because the feature that coffi wraps in the JDK is an incubator feature (and +likely in JDK 19 a [preview +feature](https://mail.openjdk.java.net/pipermail/panama-dev/2021-September/014946.html)) +coffi itself will not be released in a 1.0.x version until the feature becomes a +core part of the JDK, likely before or during the next LTS release, Java 21, in +September 2023. 
## License diff --git a/src/clj/coffi/ffi.clj b/src/clj/coffi/ffi.clj index bfb497f..a3bc11c 100644 --- a/src/clj/coffi/ffi.clj +++ b/src/clj/coffi/ffi.clj @@ -20,6 +20,7 @@ CLinker FunctionDescriptor MemoryLayout + MemorySegment SegmentAllocator))) ;;; FFI Code loading and function access @@ -108,10 +109,10 @@ (defn- insn-layout "Gets the type keyword or class for referring to the type in bytecode." [type] - (when-some [prim (mem/primitive-type type)] - (if (not= prim ::mem/pointer) - (keyword (name prim)) - (mem/java-layout type)))) + (or (when-some [prim (mem/primitive-type type)] + (when (not= prim ::mem/pointer) + (keyword (name prim)))) + (mem/java-layout type))) (def ^:private unbox-fn-for-type "Map from type name to the name of its unboxing function." @@ -268,7 +269,10 @@ (-> symbol ensure-address (make-downcall args ret) - (make-serde-wrapper args ret))) + (cond-> + (not (every? #(= % (mem/primitive-type %)) + (cons ret args))) + (make-serde-wrapper args ret)))) (defn vacfn-factory "Constructs a varargs factory to call the native function referenced by `symbol`. @@ -547,7 +551,6 @@ args-types (gensym "args-types") ret-type (gensym "ret-type") address (gensym "symbol") - invoke (gensym "invoke") native-sym (gensym "native") [arity fn-tail] (-> args :wrapper :fn-tail) fn-tail (case arity @@ -561,14 +564,7 @@ `(let [~args-types ~(:native-arglist args) ~ret-type ~(:return-type args) ~address (find-symbol ~(name (:symbol args))) - ~invoke (make-downcall ~address ~args-types ~ret-type) - ~(or (-> args :wrapper :native-fn) native-sym) - ~(if (and (every? 
#(= % (mem/primitive-type %)) - (:native-arglist args)) - (= (:return-type args) - (mem/primitive-type (:return-type args)))) - invoke - `(make-serde-wrapper ~invoke ~args-types ~ret-type)) + ~native-sym (cfn ~address ~args-types ~ret-type) fun# ~(if (:wrapper args) `(fn ~(:name args) ~@fn-tail) diff --git a/src/clj/coffi/mem.clj b/src/clj/coffi/mem.clj index a237442..7378384 100644 --- a/src/clj/coffi/mem.clj +++ b/src/clj/coffi/mem.clj @@ -643,9 +643,9 @@ (defmethod deserialize-from ::array [segment [_array type count]] - (map #(deserialize-from % type) - (slice-segments (slice segment 0 (* count (size-of type))) - (size-of type)))) + (mapv #(deserialize-from % type) + (slice-segments (slice segment 0 (* count (size-of type))) + (size-of type)))) (s/def ::type (s/spec