Merge branch 'release/2021-10-09'

Joshua Suskalo 2021-10-09 11:26:53 -05:00
commit a99ee34783
4 changed files with 444 additions and 19 deletions

@ -1,6 +1,12 @@
# Change Log
All notable changes to this project will be documented in this file. This change log follows the conventions of [keepachangelog.com](http://keepachangelog.com/).
## [0.1.220] - 2021-10-09
### Fixed
- All-primitive method types still used serialization when called from `cfn`
- Arrays deserialized to non-vector sequences
- Non-primitive argument types fail to link
## [0.1.205] - 2021-10-06
### Added
- An `address?` predicate
@ -41,6 +47,7 @@ All notable changes to this project will be documented in this file. This change
- Support for serializing and deserializing arbitrary Clojure functions
- Support for serializing and deserializing arbitrary Clojure data structures
[0.1.220]: https://github.com/IGJoshua/coffi/compare/v0.1.205...v0.1.220
[0.1.205]: https://github.com/IGJoshua/coffi/compare/v0.1.192...v0.1.205
[0.1.192]: https://github.com/IGJoshua/coffi/compare/v0.1.184...v0.1.192
[0.1.184]: https://github.com/IGJoshua/coffi/compare/v0.1.176...v0.1.184

README.md
@ -17,8 +17,8 @@ This library is available on Clojars. Add one of the following entries to the
`:deps` key of your `deps.edn`:
```clojure
org.suskalo/coffi {:mvn/version "0.1.205"}
io.github.IGJoshua/coffi {:git/tag "v0.1.205" :git/sha "0149012"}
org.suskalo/coffi {:mvn/version "0.1.220"}
io.github.IGJoshua/coffi {:git/tag "v0.1.220" :git/sha "abcbf0f"}
```
If you use this library as a git dependency, you will need to prepare the
@ -590,6 +590,406 @@ This functionality can be extended by specifying new types as implementations of
the multimethod `reify-symbolspec`, although it's recommended that for any
library authors who do so, namespaced keywords be used to name types.
## Alternatives
This library is not the only Clojure library providing access to native code. In
addition, the following libraries exist:
- [dtype-next](https://github.com/cnuernber/dtype-next)
- [tech.jna](https://github.com/techascent/tech.jna)
- [clojure-jna](https://github.com/Chouser/clojure-jna)
Dtype-next supports Java versions 8-16 and GraalVM, but is focused strongly on
array-based programming and on keeping memory on the native side rather than
marshaling data to and from Clojure-native structures, and it doesn't provide
facilities for callbacks. In Java 16, it uses the first iteration of Panama,
while in other Java versions it uses JNA.
Tech.jna and clojure-jna both use the JNA library in all cases, and neither
provides support for dealing with struct types or callbacks.
An additional alternative to coffi is to directly use the JNI, which is the
longest-standing method of wrapping native code in the JVM, but comes with the
downside that it requires you to write both native and Java code to use, even if
you only intend to use it from Clojure.
If your application needs to be able to run in earlier versions of the JVM than
17, or you don't want to use incubator functionality, you should consider these
other options. Dtype-next provides the most robust support for native code, but
if you are wrapping a simple library then the other libraries may be more
appealing, as they have a smaller API surface area and it's easier to wrap
functions.
### Benchmarks
An additional consideration when thinking about alternatives is the performance
of each available option. It's an established fact that JNA (used by all three
alternative libraries on JDK <16) introduces more overhead when calling native
code than JNI does.
In order to provide a benchmark to see how much of a difference the different
native interfaces make, we can use
[criterium](https://github.com/hugoduncan/criterium) to benchmark each.
[GLFW](https://www.glfw.org)'s
[`glfwGetTime`](https://www.glfw.org/docs/latest/group__input.html#gaa6cf4e7a77158a3b8fd00328b1720a4a)
function will be used for the test as it performs a simple operation, and is
conveniently already wrapped in JNI by the excellent
[LWJGL](https://www.lwjgl.org/) library.
The following benchmarks were run on a Lenovo Thinkpad with an Intel i7-10610U
running Manjaro Linux, using Clojure 1.10.3 on Java 17.
#### JNI
The baseline for performance is the JNI. Using LWJGL it's relatively simple to
benchmark. The following Clojure CLI command will start a repl with LWJGL and
criterium loaded.
```sh
$ clj -Sdeps '{:deps {org.lwjgl/lwjgl {:mvn/version "3.2.3"}
org.lwjgl/lwjgl-glfw {:mvn/version "3.2.3"}
org.lwjgl/lwjgl$natives-linux {:mvn/version "3.2.3"}
org.lwjgl/lwjgl-glfw$natives-linux {:mvn/version "3.2.3"}
criterium/criterium {:mvn/version "0.4.6"}}}'
```
Then from the repl
```clojure
user=> (import 'org.lwjgl.glfw.GLFW)
org.lwjgl.glfw.GLFW
user=> (require '[criterium.core :as bench])
nil
user=> (GLFW/glfwInit)
true
user=> (bench/bench (GLFW/glfwGetTime) :verbose)
amd64 Linux 5.10.68-1-MANJARO 8 cpu(s)
OpenJDK 64-Bit Server VM 17+35-2724
Runtime arguments: -Dclojure.basis=/home/jsusk/.clojure/.cpcache/2667074721.basis
Evaluation count : 1613349900 in 60 samples of 26889165 calls.
Execution time sample mean : 32.698446 ns
Execution time mean : 32.697811 ns
Execution time sample std-deviation : 1.274600 ns
Execution time std-deviation : 1.276437 ns
Execution time lower quantile : 30.750813 ns ( 2.5%)
Execution time upper quantile : 33.757662 ns (97.5%)
Overhead used : 6.400704 ns
nil
```
GLFW requires that we initialize it before calling the `glfwGetTime` function.
Besides that this is a simple interop call which directly maps to the native
function.
This gives us a basis of 32.7 ns +/-1.3 ns. All other libraries will be
evaluated relative to this result.
To ensure fairness, we'll also get that overhead value to be used in further
tests.
```clojure
user=> bench/estimated-overhead-cache
6.400703613065185E-9
```
#### Coffi
The dependencies when using coffi are simpler, but it also requires some JVM
options to support the foreign access API.
```sh
$ clj -Sdeps '{:deps {org.suskalo/coffi {:mvn/version "0.1.205"}
criterium/criterium {:mvn/version "0.4.6"}}}' \
-J--add-modules=jdk.incubator.foreign \
-J--enable-native-access=ALL-UNNAMED
```
In order to ensure fair comparisons, we're going to use the same overhead value
on each run, so before we do the benchmark we'll set it to the observed value
from last time.
```clojure
user=> (require '[criterium.core :as bench])
nil
user=> (alter-var-root #'bench/estimated-overhead-cache (constantly 6.400703613065185E-9))
6.400703613065185E-9
user=> (require '[coffi.ffi :as ffi])
nil
user=> (require '[coffi.mem :as mem])
nil
user=> (ffi/load-system-library "glfw")
nil
user=> ((ffi/cfn "glfwInit" [] ::mem/int))
1
user=> (let [f (ffi/cfn "glfwGetTime" [] ::mem/double)]
(bench/bench (f) :verbose))
amd64 Linux 5.10.68-1-MANJARO 8 cpu(s)
OpenJDK 64-Bit Server VM 17+35-2724
Runtime arguments: --add-modules=jdk.incubator.foreign --enable-native-access=ALL-UNNAMED -Dclojure.basis=/home/jsusk/.clojure/.cpcache/72793624.basis
Evaluation count : 1657995600 in 60 samples of 27633260 calls.
Execution time sample mean : 31.382665 ns
Execution time mean : 31.386493 ns
Execution time sample std-deviation : 1.598571 ns
Execution time std-deviation : 1.608818 ns
Execution time lower quantile : 29.761194 ns ( 2.5%)
Execution time upper quantile : 33.228276 ns (97.5%)
Overhead used : 6.400704 ns
nil
```
This result is about 1.3 ns faster, and while that is less than the standard
deviation of 1.6, it's quite close to it.
#### Clojure-JNA
Clojure-JNA uses the JNA library, which was designed to provide Java with an
easy way to access native libraries, but which is known for not having the
greatest performance. Since this is an older project, I'm also including the
Clojure dependency to ensure the correct version is used.
```sh
$ clj -Sdeps '{:deps {org.clojure/clojure {:mvn/version "1.10.3"}
net.n01se/clojure-jna {:mvn/version "1.0.0"}
criterium/criterium {:mvn/version "0.4.6"}}}'
```
The naive way to call the function using Clojure-JNA is to use `jna/invoke`.
```clojure
user=> (require '[criterium.core :as bench])
nil
user=> (alter-var-root #'bench/estimated-overhead-cache (constantly 6.400703613065185E-9))
6.400703613065185E-9
user=> (require '[net.n01se.clojure-jna :as jna])
nil
user=> (jna/invoke Integer glfw/glfwInit)
1
user=> (bench/bench (jna/invoke Double glfw/glfwGetTime) :verbose)
amd64 Linux 5.10.68-1-MANJARO 8 cpu(s)
OpenJDK 64-Bit Server VM 17+35-2724
Runtime arguments: -Dclojure.basis=/home/jsusk/.clojure/.cpcache/3229486237.basis
Evaluation count : 195948720 in 60 samples of 3265812 calls.
Execution time sample mean : 350.335614 ns
Execution time mean : 350.373520 ns
Execution time sample std-deviation : 24.833070 ns
Execution time std-deviation : 24.755929 ns
Execution time lower quantile : 300.000019 ns ( 2.5%)
Execution time upper quantile : 365.759273 ns (97.5%)
Overhead used : 6.400704 ns
Found 13 outliers in 60 samples (21.6667 %)
low-severe 12 (20.0000 %)
low-mild 1 (1.6667 %)
Variance from outliers : 53.4220 % Variance is severely inflated by outliers
nil
```
As you can see, this method of calling functions is very bad for performance,
with call overhead dominating function runtime by an order of magnitude. That
said, this isn't a completely fair comparison, nor the most realistic, because
this way of calling functions looks the function up on each invocation.
To adjust for this, we'll use the `jna/to-fn` function to give a persistent
handle to the function that we can call.
```clojure
user=> (let [f (jna/to-fn Double glfw/glfwGetTime)]
(bench/bench (f) :verbose))
amd64 Linux 5.10.68-1-MANJARO 8 cpu(s)
OpenJDK 64-Bit Server VM 17+35-2724
Runtime arguments: -Dclojure.basis=/home/jsusk/.clojure/.cpcache/3229486237.basis
Evaluation count : 611095020 in 60 samples of 10184917 calls.
Execution time sample mean : 104.623634 ns
Execution time mean : 104.638406 ns
Execution time sample std-deviation : 7.649296 ns
Execution time std-deviation : 7.638963 ns
Execution time lower quantile : 92.446016 ns ( 2.5%)
Execution time upper quantile : 110.258832 ns (97.5%)
Overhead used : 6.400704 ns
nil
```
This is much better, but is still about 3x slower than JNI, meaning the overhead
from using JNA is still bigger than the function runtime.
This performance penalty is still small in the scope of longer-running
functions, and so may not be a concern for your application, but it is something
to be aware of.
#### tech.jna
The tech.jna library is similar in scope to Clojure-JNA, but was written to
fit into an ecosystem of libraries meant for array-based programming for machine
learning and data science.
```sh
$ clj -Sdeps '{:deps {techascent/tech.jna {:mvn/version "4.05"}
criterium/criterium {:mvn/version "0.4.6"}}}'
```
This library is also quite simple to use. The only slightly odd thing I'm doing
here is to dereference the var outside the benchmark in order to ensure it's an
apples-to-apples comparison. We don't want var dereference time mucking up our
benchmark.
```clojure
user=> (require '[criterium.core :as bench])
nil
user=> (alter-var-root #'bench/estimated-overhead-cache (constantly 6.400703613065185E-9))
6.400703613065185E-9
user=> (require '[tech.v3.jna :as jna])
nil
user=> (jna/def-jna-fn "glfw" glfwInit "initialize glfw" Integer)
#'user/glfwInit
user=> (glfwInit)
Oct 09, 2021 10:30:50 AM clojure.tools.logging$eval1122$fn__1125 invoke
INFO: Library glfw found at [:system "glfw"]
1
user=> (jna/def-jna-fn "glfw" glfwGetTime "gets the time as a double since init" Double)
#'user/glfwGetTime
user=> (let [f @#'glfwGetTime]
(bench/bench (f) :verbose))
amd64 Linux 5.10.68-1-MANJARO 8 cpu(s)
OpenJDK 64-Bit Server VM 17+35-2724
Runtime arguments: -Dclojure.basis=/home/jsusk/.clojure/.cpcache/2910209237.basis
Evaluation count : 323281680 in 60 samples of 5388028 calls.
Execution time sample mean : 203.976803 ns
Execution time mean : 203.818712 ns
Execution time sample std-deviation : 14.557312 ns
Execution time std-deviation : 14.614080 ns
Execution time lower quantile : 179.732593 ns ( 2.5%)
Execution time upper quantile : 213.929374 ns (97.5%)
Overhead used : 6.400704 ns
nil
```
This version is even slower than Clojure-JNA. I'm unsure where this overhead is
coming from, but I'll admit that I haven't looked at their implementations very
closely.
#### dtype-next
The library dtype-next replaced tech.jna in the toolkit of the group working on
machine learning and array-based programming, and it includes support for
composite data types including structs, as well as primitive functions.
In addition, dtype-next has two different FFI backends. First is JNA, which is
usable on any JDK version, and is what we'll use for the first benchmark. Second
is the Java 16 version of Project Panama, which will be shown next.
In order to use the dtype-next FFI with the JNA backend, the JNA library has to
be included in the dependencies.
```sh
$ clj -Sdeps '{:deps {cnuernber/dtype-next {:mvn/version "8.032"}
net.java.dev.jna/jna {:mvn/version "5.8.0"}
criterium/criterium {:mvn/version "0.4.6"}}}'
```
The dtype-next library also requires some more ceremony around declaring native
functions. One advantage this has is that multiple symbols with the same name
can be loaded from different shared libraries, but it also does increase
friction when defining native wrappers.
Some easier ways to define native wrappers are provided than what is seen here,
but they share some disadvantages in documentation over the core methods
provided in coffi, although they are comparable to the data model provided in
coffi.
```clojure
user=> (require '[criterium.core :as bench])
nil
user=> (alter-var-root #'bench/estimated-overhead-cache (constantly 6.400703613065185E-9))
6.400703613065185E-9
user=> (require '[tech.v3.datatype.ffi :as dt-ffi])
nil
user=> (def fn-defs {:glfwInit {:rettype :int32} :glfwGetTime {:rettype :float64}})
#'user/fn-defs
user=> (def library-def (dt-ffi/define-library fn-defs))
#'user/library-def
user=> (def library-instance (dt-ffi/instantiate-library library-def "/usr/lib/libglfw.so"))
#'user/library-instance
user=> (def init (:glfwInit @library-instance))
#'user/init
user=> (init)
1
user=> (let [f (:glfwGetTime @library-instance)]
(bench/bench (f) :verbose))
amd64 Linux 5.10.68-1-MANJARO 8 cpu(s)
OpenJDK 64-Bit Server VM 17+35-2724
Runtime arguments: -Dclojure.basis=/home/jsusk/.clojure/.cpcache/643862289.basis
Evaluation count : 710822100 in 60 samples of 11847035 calls.
Execution time sample mean : 90.900112 ns
Execution time mean : 90.919917 ns
Execution time sample std-deviation : 6.463312 ns
Execution time std-deviation : 6.470108 ns
Execution time lower quantile : 79.817126 ns ( 2.5%)
Execution time upper quantile : 95.454652 ns (97.5%)
Overhead used : 6.400704 ns
nil
```
This version of JNA usage is significantly faster than either of the other JNA
libraries, but is still substantially slower than using JNI or coffi.
In addition to the JNA backend, dtype-next has a Java 16-specific backend that
uses an older version of Panama. This version requires similar setup to coffi in
order to run.
```sh
$ clj -Sdeps '{:deps {cnuernber/dtype-next {:mvn/version "8.032"}
criterium/criterium {:mvn/version "0.4.6"}}}' \
-J--add-modules=jdk.incubator.foreign \
-J-Dforeign.restricted=permit \
-J--add-opens=java.base/java.lang=ALL-UNNAMED \
-J-Djava.library.path=/usr/lib/x86_64-linux-gnu
```
The actual code to run the benchmark is identical to the last example, but is
reproduced here for completeness.
```clojure
user=> (require '[criterium.core :as bench])
nil
user=> (alter-var-root #'bench/estimated-overhead-cache (constantly 6.400703613065185E-9))
6.400703613065185E-9
user=> (require '[tech.v3.datatype.ffi :as dt-ffi])
nil
user=> (def fn-defs {:glfwInit {:rettype :int32} :glfwGetTime {:rettype :float64}})
#'user/fn-defs
user=> (def library-def (dt-ffi/define-library fn-defs))
#'user/library-def
user=> (def library-instance (dt-ffi/instantiate-library library-def "/usr/lib/libglfw.so"))
#'user/library-instance
user=> (def init (:glfwInit @library-instance))
#'user/init
user=> (init)
1
user=> (let [f (:glfwGetTime @library-instance)]
(bench/bench (f) :verbose))
amd64 Linux 5.10.68-1-MANJARO 8 cpu(s)
OpenJDK 64-Bit Server VM 16.0.2+7
Runtime arguments: --add-modules=jdk.incubator.foreign -Dforeign.restricted=permit --add-opens=java.base/java.lang=ALL-UNNAMED -Djava.library.path=/usr/lib/x86_64-linux-gnu -Dclojure.basis=/home/jsusk/.clojure/.cpcache/2337051659.basis
Evaluation count : 1588513080 in 60 samples of 26475218 calls.
Execution time sample mean : 58.732468 ns
Execution time mean : 58.647361 ns
Execution time sample std-deviation : 9.732389 ns
Execution time std-deviation : 9.791738 ns
Execution time lower quantile : 31.318115 ns ( 2.5%)
Execution time upper quantile : 65.449222 ns (97.5%)
Overhead used : 6.400704 ns
Found 14 outliers in 60 samples (23.3333 %)
low-severe 8 (13.3333 %)
low-mild 4 (6.6667 %)
high-mild 2 (3.3333 %)
Variance from outliers : 87.6044 % Variance is severely inflated by outliers
nil
```
Not reproduced here, but notable for comparison: in my testing, the JNI
benchmark performed about the same on Java 16.
This is significantly faster than the JNA version of dtype-next, but it is still
slower than modern Panama. This is likely to simply be a result of optimizations
and changes to the Panama API, and when dtype-next is updated to use the Java 17
version of Panama I expect it will perform in line with coffi; this
benchmark will be rerun when that happens. Still, this shows that as it
stands, coffi is the fastest FFI available to Clojure developers.
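To put the results side by side, the following sketch expresses each mean as a multiple of the JNI baseline. The values are the `Execution time mean` figures copied from the runs above; the keyword labels and the `relative-to-jni` helper are names chosen here for illustration and are not part of any of the libraries.

```clojure
;; Mean execution times in nanoseconds, copied from the benchmark output above.
;; The keyword labels are illustrative names, not part of any library.
(def mean-ns
  {:jni               32.697811
   :coffi             31.386493
   :clojure-jna       350.373520   ; jna/invoke
   :clojure-jna-to-fn 104.638406   ; jna/to-fn
   :tech-jna          203.818712
   :dtype-next-jna    90.919917
   :dtype-next-panama 58.647361})  ; Java 16 Panama backend

(defn relative-to-jni
  "Returns a map from label to its mean divided by the JNI mean."
  [means]
  (let [baseline (:jni means)]
    (into {} (map (fn [[k v]] [k (/ v baseline)])) means)))

;; Print each result as a multiple of the JNI baseline, fastest first.
(doseq [[k ratio] (sort-by val (relative-to-jni mean-ns))]
  (println (format "%-20s %.2fx" (name k) ratio)))
```

With these numbers the `jna/to-fn` result comes out at roughly 3.2 times the JNI mean, which matches the "about 3x slower" estimate given earlier.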
## Known Issues
The project author is aware of these issues and plans to fix them in a future
release:
@ -603,6 +1003,28 @@ These features are planned for future releases.
- Functions for wrapping structs in padding following various standards
- Header parsing tool for generating a data model?
- Generic type aliases
- Helpers for generating enums & bitflags
- Helper macro for out arguments
- Improve error messages from defcfn macro
- Mapped memory
### Future JDKs
The purpose of coffi is to provide a wrapper for published versions of Project
Panama, starting with JDK 17. As new JDKs are released, coffi will be ported to
the newer versions of Panama. When JDK 18 is released, a tag will be added to
mark the final release of coffi that is compatible with Java 17, as that is the
LTS release of the JDK. Development of new features and fixes as well as support
for new Panama idioms and features will continue with focus only on the latest
JDK. If a particular feature is not specific to the newer JDK, PRs backporting
it to versions of coffi supporting Java 17 will likely be accepted.
### 1.0 Release
Because the feature that coffi wraps in the JDK is an incubator feature (and
likely in JDK 19 a [preview
feature](https://mail.openjdk.java.net/pipermail/panama-dev/2021-September/014946.html)),
coffi itself will not be released in a 1.0.x version until the feature becomes a
core part of the JDK, likely before or during the next LTS release, Java 21, in
September 2023.
## License

@ -20,6 +20,7 @@
CLinker
FunctionDescriptor
MemoryLayout
MemorySegment
SegmentAllocator)))
;;; FFI Code loading and function access
@ -108,10 +109,10 @@
(defn- insn-layout
"Gets the type keyword or class for referring to the type in bytecode."
[type]
(when-some [prim (mem/primitive-type type)]
(if (not= prim ::mem/pointer)
(keyword (name prim))
(mem/java-layout type))))
(or (when-some [prim (mem/primitive-type type)]
(when (not= prim ::mem/pointer)
(keyword (name prim))))
(mem/java-layout type)))
(def ^:private unbox-fn-for-type
"Map from type name to the name of its unboxing function."
@ -268,7 +269,10 @@
(-> symbol
ensure-address
(make-downcall args ret)
(make-serde-wrapper args ret)))
(cond->
(every? #(= % (mem/primitive-type %))
(cons ret args))
(make-serde-wrapper args ret))))
(defn vacfn-factory
"Constructs a varargs factory to call the native function referenced by `symbol`.
@ -547,7 +551,6 @@
args-types (gensym "args-types")
ret-type (gensym "ret-type")
address (gensym "symbol")
invoke (gensym "invoke")
native-sym (gensym "native")
[arity fn-tail] (-> args :wrapper :fn-tail)
fn-tail (case arity
@ -561,14 +564,7 @@
`(let [~args-types ~(:native-arglist args)
~ret-type ~(:return-type args)
~address (find-symbol ~(name (:symbol args)))
~invoke (make-downcall ~address ~args-types ~ret-type)
~(or (-> args :wrapper :native-fn) native-sym)
~(if (and (every? #(= % (mem/primitive-type %))
(:native-arglist args))
(= (:return-type args)
(mem/primitive-type (:return-type args))))
invoke
`(make-serde-wrapper ~invoke ~args-types ~ret-type))
~native-sym (cfn ~address ~args-types ~ret-type)
fun# ~(if (:wrapper args)
`(fn ~(:name args)
~@fn-tail)

@ -643,7 +643,7 @@
(defmethod deserialize-from ::array
[segment [_array type count]]
(map #(deserialize-from % type)
(mapv #(deserialize-from % type)
(slice-segments (slice segment 0 (* count (size-of type)))
(size-of type))))
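
The switch from `map` to `mapv` in this hunk lines up with the changelog entry about arrays deserializing to non-vector sequences: `map` returns a lazy sequence, while `mapv` eagerly builds a vector. A minimal illustration using plain Clojure data in place of memory segments (the `elements` name is hypothetical):

```clojure
;; Stand-in data; the real code maps over slices of a native memory segment.
(def elements [1 2 3])

(vector? (map inc elements))  ; false: map returns a lazy seq
(vector? (mapv inc elements)) ; true: mapv eagerly builds a vector
```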