diff --git a/README.md b/README.md index 76e068a..c1d6dc4 100644 --- a/README.md +++ b/README.md @@ -2,14 +2,16 @@ Current [semantic](http://semver.org/) version: ```clojure [com.taoensso/nippy "1.2.1"] ; Stable -[com.taoensso/nippy "1.3.0-alpha3"] ; Development (adds crypto support!) +[com.taoensso/nippy "2.0.0-alpha1"] ; Development (see notes below) ``` +2.x adds pluggable compression, crypto support (also pluggable), an improved API (including much better error messages), and hugely improved performance. It **is backwards compatible**, but please note that the `freeze-to-bytes`/`thaw-from-bytes` API has been **deprecated** in favor of `freeze`/`thaw`. + # Nippy, a Clojure serialization library Clojure's [rich data types](http://clojure.org/datatypes) are *awesome*. And its [reader](http://clojure.org/reader) allows you to take your data just about anywhere. But the reader can be painfully slow when you've got a lot of data to crunch (like when you're serializing to a database). -Nippy is an attempt to provide a drop-in, high-performance alternative to the reader. It's a fork of [Deep-Freeze](https://github.com/halgari/deep-freeze) and is used as the [Carmine Redis client](https://github.com/ptaoussanis/carmine) serializer. +Nippy is an attempt to provide a reliable, high-performance **drop-in alternative to the reader**. It's used, among others, as the [Carmine Redis client](https://github.com/ptaoussanis/carmine) and [Faraday DynamoDB client](https://github.com/ptaoussanis/faraday) serializer. ## What's in the box™? * Small, uncomplicated **all-Clojure** library. @@ -17,8 +19,8 @@ Nippy is an attempt to provide a drop-in, high-performance alternative to the re * Comprehensive, extensible **support for all major data types**. * **Reader-fallback** for difficult/future types (including Clojure 1.4+ tagged literals). * **Full test coverage** for every supported type. 
- * [Snappy](http://code.google.com/p/snappy/) **integrated de/compression** for efficient storage and network transfer. - * Enable **high-strength encryption** with a single `:password [:salted "my-password"]` option. (1.3.0+) + * Fully pluggable **compression**, including built-in high-performance [Snappy](http://code.google.com/p/snappy/) compressor. + * Fully pluggable **encryption**, including built-in high-strength AES128 enabled with a single `:password [:salted "my-password"]` option. (2.0.0+) ## Getting started @@ -76,24 +78,21 @@ nippy/stress-data :bigdec (bigdec 3.1415926535897932384626433832795) :ratio 22/7 - - ;; Clojure 1.4+ - ;; :tagged-uuid (java.util.UUID/randomUUID) - ;; :tagged-date (java.util.Date.) - } + :tagged-uuid (java.util.UUID/randomUUID) + :tagged-date (java.util.Date.)} ``` Serialize it: ```clojure -(def frozen-stress-data (nippy/freeze-to-bytes nippy/stress-data)) +(def frozen-stress-data (nippy/freeze nippy/stress-data)) => # ``` Deserialize it: ```clojure -(nippy/thaw-from-bytes frozen-stress-data) +(nippy/thaw frozen-stress-data) => {:bytes (byte-array [(byte 1) (byte 2) (byte 3)]) :nil nil :boolean true @@ -104,14 +103,14 @@ Couldn't be simpler! ### Encryption (currently in **ALPHA**) -As of 1.3.0, Nippy also gives you **dead simple data encryption**. Add a single option to your usual freeze/thaw calls like so: +As of 2.0.0, Nippy also gives you **dead simple data encryption**. Add a single option to your usual freeze/thaw calls like so: ```clojure -(nippy/freeze-to-bytes nippy/stress-data :password [:salted "my-password"]) ; Encrypt -(nippy/thaw-from-bytes :password [:salted "my-password"]) ; Decrypt +(nippy/freeze nippy/stress-data {:password [:salted "my-password"]}) ; Encrypt +(nippy/thaw {:password [:salted "my-password"]}) ; Decrypt ``` -There's two forms of encryption on offer: `:salted` and `:cached`. Each of these makes carefully-chosen trade-offs and is suited to one of two common use cases. 
See the `aes128-salted` and `aes128-cached` [docstrings](http://ptaoussanis.github.io/nippy/taoensso.nippy.crypto.html) for a detailed explanation of why/when you'd want one or the other. +There are two default forms of encryption on offer: `:salted` and `:cached`. Each of these makes carefully-chosen trade-offs and is suited to one of two common use cases. See the `default-aes128-encryptor` [docstring](http://ptaoussanis.github.io/nippy/taoensso.nippy.encryption.html) for a detailed explanation of why/when you'd want one or the other. ## Performance @@ -138,4 +137,4 @@ Otherwise reach me (Peter Taoussanis) at [taoensso.com](https://www.taoensso.com ## License -Copyright © 2012, 2013 Peter Taoussanis. Distributed under the [Eclipse Public License](http://www.eclipse.org/legal/epl-v10.html), the same as Clojure. +Copyright © 2012, 2013 Peter Taoussanis. Distributed under the [Eclipse Public License](http://www.eclipse.org/legal/epl-v10.html), the same as Clojure. \ No newline at end of file diff --git a/benchmarks/chart.png b/benchmarks/chart.png index 77608db..8fbbff7 100644 Binary files a/benchmarks/chart.png and b/benchmarks/chart.png differ diff --git a/project.clj b/project.clj index ae44847..7f34a34 100644 --- a/project.clj +++ b/project.clj @@ -1,16 +1,23 @@ -(defproject com.taoensso/nippy "1.3.0-alpha3" +(defproject com.taoensso/nippy "2.0.0-alpha1" :description "Clojure serialization library" :url "https://github.com/ptaoussanis/nippy" :license {:name "Eclipse Public License" :url "http://www.eclipse.org/legal/epl-v10.html"} - :dependencies [[org.clojure/clojure "1.3.0"] + :dependencies [[org.clojure/clojure "1.4.0"] + [expectations "1.4.43"] [org.iq80.snappy/snappy "0.3"]] - :profiles {:1.3 {:dependencies [[org.clojure/clojure "1.3.0"]]} - :1.4 {:dependencies [[org.clojure/clojure "1.4.0"]]} - :1.5 {:dependencies [[org.clojure/clojure "1.5.1"]]} - :dev {:dependencies []} - :test {:dependencies [[org.xerial.snappy/snappy-java "1.0.5-M3"]]}} - :aliases 
{"test-all" ["with-profile" "test,1.3:test,1.4:test,1.5" "test"]} - :plugins [[codox "0.6.4"]] + :profiles {:1.4 {:dependencies [[org.clojure/clojure "1.4.0"]]} + :1.5 {:dependencies [[org.clojure/clojure "1.5.1"]]} + :dev {:dependencies []} + :test {:dependencies [[org.xerial.snappy/snappy-java "1.0.5-M3"]]} + :bench {:dependencies [] + :jvm-opts ["-server" "-XX:+UseCompressedOops"]}} + :aliases {"test-all" ["with-profile" "test,1.4:test,1.5" "expectations"] + "test-auto" ["with-profile" "test" "autoexpect"] + "start-dev" ["with-profile" "dev,test,bench" "repl" ":headless"] + "start-bench" ["trampoline" "start-dev"]} + :plugins [[lein-expectations "0.0.7"] + [lein-autoexpect "0.2.5"] + [codox "0.6.4"]] :min-lein-version "2.0.0" :warn-on-reflection true) diff --git a/src/taoensso/nippy.clj b/src/taoensso/nippy.clj index 2ab1b57..ba54b48 100644 --- a/src/taoensso/nippy.clj +++ b/src/taoensso/nippy.clj @@ -1,16 +1,30 @@ (ns taoensso.nippy - "Simple, high-performance Clojure serialization library. Adapted from - Deep-Freeze." + "Simple, high-performance Clojure serialization library. Originally adapted + from Deep-Freeze." {:author "Peter Taoussanis"} - (:require [taoensso.nippy.utils :as utils] - [taoensso.nippy.crypto :as crypto]) + (:require [taoensso.nippy + (utils :as utils) + (compression :as compression) + (encryption :as encryption)]) (:import [java.io DataInputStream DataOutputStream ByteArrayOutputStream ByteArrayInputStream] [clojure.lang Keyword BigInt Ratio PersistentQueue PersistentTreeMap PersistentTreeSet IPersistentList IPersistentVector IPersistentMap IPersistentSet IPersistentCollection])) -;;;; Define type IDs +;; TODO Allow ba or wrapped-ba input? +;; TODO Provide ToFreeze, Frozen, Encrypted, etc. 
tooling helpers + +;;;; Header IDs +;; Nippy 2.x+ prefixes frozen data with a 5-byte header: + +(def ^:const id-nippy-magic-prefix (byte 17)) +(def ^:const id-nippy-header-ver (byte 0)) +;; * Compressor id (0 if no compressor) +;; * Encryptor id (0 if no encryptor) +(def ^:const id-nippy-reserved (byte 0)) + +;;;; Data type IDs ;; 1 (def ^:const id-bytes (int 2)) @@ -53,74 +67,80 @@ ;;;; Shared low-level stream stuff -(defn- write-id! [^DataOutputStream stream ^Integer id] (.writeByte stream id)) +(defn- write-id [^DataOutputStream stream ^Integer id] (.writeByte stream id)) -(defn- write-bytes! +(defn- write-bytes "Writes arbitrary byte data, preceded by its length." [^DataOutputStream stream ^bytes ba] (let [size (alength ba)] (.writeInt stream size) ; Encode size of byte array (.write stream ba 0 size))) -(defn- write-biginteger! - "Wrapper around `write-bytes!` for common case of writing a BigInteger." +(defn- write-biginteger + "Wrapper around `write-bytes` for common case of writing a BigInteger." [^DataOutputStream stream ^BigInteger x] - (write-bytes! stream (.toByteArray x))) + (write-bytes stream (.toByteArray x))) -(defn- read-bytes! +(defn- read-bytes "Reads arbitrary byte data, preceded by its length." ^bytes [^DataInputStream stream] (let [size (.readInt stream) ba (byte-array size)] (.read stream ba 0 size) ba)) -(defn- read-biginteger! - "Wrapper around `read-bytes!` for common case of reading a BigInteger. +(defn- read-biginteger + "Wrapper around `read-bytes` for common case of reading a BigInteger. Note that as of Clojure 1.3, java.math.BigInteger ≠ clojure.lang.BigInt." ^BigInteger [^DataInputStream stream] - (BigInteger. (read-bytes! stream))) + (BigInteger. (read-bytes stream))) ;;;; Freezing -(defprotocol Freezable (freeze [this stream])) +(defprotocol Freezable (freeze-to-stream* [this stream])) -(defmacro freezer +(defn- freeze-to-stream + "Like `freeze-to-stream*` but with metadata support." 
+ [x ^DataOutputStream s] + (if-let [m (meta x)] + (do (write-id s id-meta) + (freeze-to-stream m s))) + (freeze-to-stream* x s)) + +(defmacro ^:private freezer "Helper to extend Freezable protocol." [type id & body] `(extend-type ~type ~'Freezable - (~'freeze [~'x ~(with-meta 's {:tag 'DataOutputStream})] - (write-id! ~'s ~id) + (~'freeze-to-stream* [~'x ~(with-meta 's {:tag 'DataOutputStream})] + (write-id ~'s ~id) ~@body))) -(defmacro coll-freezer +(defmacro ^:private coll-freezer "Extends Freezable to simple collection types." [type id & body] `(freezer ~type ~id - (.writeInt ~'s (count ~'x)) ; Encode collection length - (doseq [i# ~'x] (freeze-to-stream!* ~'s i#)))) + (.writeInt ~'s (count ~'x)) + (doseq [i# ~'x] (freeze-to-stream i# ~'s)))) -(defmacro kv-freezer +(defmacro ^:private kv-freezer "Extends Freezable to key-value collection types." [type id & body] `(freezer ~type ~id - (.writeInt ~'s (* 2 (count ~'x))) ; Encode num kvs + (.writeInt ~'s (* 2 (count ~'x))) (doseq [[k# v#] ~'x] - (freeze-to-stream!* ~'s k#) - (freeze-to-stream!* ~'s v#)))) + (freeze-to-stream k# ~'s) + (freeze-to-stream v# ~'s)))) -(freezer (Class/forName "[B") id-bytes (write-bytes! s x)) +(freezer (Class/forName "[B") id-bytes (write-bytes s x)) (freezer nil id-nil) (freezer Boolean id-boolean (.writeBoolean s x)) (freezer Character id-char (.writeChar s (int x))) -(freezer String id-string (write-bytes! s (.getBytes x "UTF-8"))) +(freezer String id-string (write-bytes s (.getBytes x "UTF-8"))) (freezer Keyword id-keyword (.writeUTF s (if-let [ns (namespace x)] (str ns "/" (name x)) (name x)))) -(declare freeze-to-stream!*) - (coll-freezer PersistentQueue id-queue) (coll-freezer PersistentTreeSet id-sorted-set) (kv-freezer PersistentTreeMap id-sorted-map) @@ -135,154 +155,188 @@ (freezer Short id-short (.writeShort s x)) (freezer Integer id-integer (.writeInt s x)) (freezer Long id-long (.writeLong s x)) -(freezer BigInt id-bigint (write-biginteger! 
s (.toBigInteger x))) -(freezer BigInteger id-bigint (write-biginteger! s x)) +(freezer BigInt id-bigint (write-biginteger s (.toBigInteger x))) +(freezer BigInteger id-bigint (write-biginteger s x)) (freezer Float id-float (.writeFloat s x)) (freezer Double id-double (.writeDouble s x)) (freezer BigDecimal id-bigdec - (write-biginteger! s (.unscaledValue x)) + (write-biginteger s (.unscaledValue x)) (.writeInt s (.scale x))) (freezer Ratio id-ratio - (write-biginteger! s (.numerator x)) - (write-biginteger! s (.denominator x))) + (write-biginteger s (.numerator x)) + (write-biginteger s (.denominator x))) ;; Use Clojure's own reader as final fallback -(freezer Object id-reader (write-bytes! s (.getBytes (pr-str x) "UTF-8"))) +(freezer Object id-reader (write-bytes s (.getBytes (pr-str x) "UTF-8"))) -(defn- freeze-to-stream!* [^DataOutputStream s x] - (if-let [m (meta x)] - (do (write-id! s id-meta) - (freeze-to-stream!* s m))) - (freeze x s)) +(defn- wrap-nippy-header [data-ba compressor encryptor password] + (let [header-ba (byte-array + [id-nippy-magic-prefix + id-nippy-header-ver + (byte (if compressor (compression/header-id compressor) 0)) + (byte (if password (encryption/header-id encryptor) 0)) + id-nippy-reserved])] + (utils/ba-concat header-ba data-ba))) -(defn freeze-to-stream! - "Serializes x to given output stream." - ([data-output-stream x] ; For <= 1.0.1 compatibility - (freeze-to-stream! data-output-stream x true)) - ([data-output-stream x print-dup?] - (binding [*print-dup* print-dup?] ; For `pr-str` - (freeze-to-stream!* data-output-stream x)))) - -(defn freeze-to-bytes - "Serializes x to a byte array and returns the array." - ^bytes [x & {:keys [compress? print-dup? password] - :or {compress? true - print-dup? true}}] +(defn freeze + "Serializes arg (any Clojure data type) to a byte array. Enable + `:legacy-mode?` flag to produce bytes readable by Nippy < 2.x." + ^bytes [x & [{:keys [print-dup? password compressor encryptor legacy-mode?] 
+ :or {print-dup? true + compressor compression/default-snappy-compressor + encryptor encryption/default-aes128-encryptor}}]] (let [ba (ByteArrayOutputStream.) stream (DataOutputStream. ba)] - (freeze-to-stream! stream x print-dup?) + (binding [*print-dup* print-dup?] (freeze-to-stream x stream)) (let [ba (.toByteArray ba) - ba (if compress? (utils/compress-bytes ba) ba) - ba (if password (crypto/encrypt-aes128 password ba) ba)] - ba))) + ba (if compressor (compression/compress compressor ba) ba) + ba (if password (encryption/encrypt encryptor password ba) ba)] + (if legacy-mode? ba (wrap-nippy-header ba compressor encryptor password))))) ;;;; Thawing -(declare thaw-from-stream!*) +(declare thaw-from-stream) -(defn coll-thaw! +(defn coll-thaw "Thaws simple collection types." - [^DataInputStream s] - (repeatedly (.readInt s) #(thaw-from-stream!* s))) + [coll ^DataInputStream s] + (utils/repeatedly-into coll (.readInt s) #(thaw-from-stream s))) -(defn coll-thaw-kvs! +(defn coll-thaw-kvs "Thaws key-value collection types." - [^DataInputStream s] - (repeatedly (/ (.readInt s) 2) - (fn [] [(thaw-from-stream!* s) (thaw-from-stream!* s)]))) + [coll ^DataInputStream s] + (utils/repeatedly-into coll (/ (.readInt s) 2) + (fn [] [(thaw-from-stream s) (thaw-from-stream s)]))) -(defn- thaw-from-stream!* +(defn- thaw-from-stream [^DataInputStream s] (let [type-id (.readByte s)] (utils/case-eval type-id - id-reader (read-string (String. (read-bytes! s) "UTF-8")) - id-bytes (read-bytes! s) + id-reader (read-string (String. (read-bytes s) "UTF-8")) + id-bytes (read-bytes s) id-nil nil id-boolean (.readBoolean s) id-char (.readChar s) - id-string (String. (read-bytes! s) "UTF-8") + id-string (String. (read-bytes s) "UTF-8") id-keyword (keyword (.readUTF s)) - id-queue (into (PersistentQueue/EMPTY) (coll-thaw! s)) - id-sorted-set (into (sorted-set) (coll-thaw! s)) - id-sorted-map (into (sorted-map) (coll-thaw-kvs! 
s)) + id-queue (coll-thaw (PersistentQueue/EMPTY) s) + id-sorted-set (coll-thaw (sorted-set) s) + id-sorted-map (coll-thaw-kvs (sorted-map) s) - id-list (into '() (reverse (coll-thaw! s))) - id-vector (into [] (coll-thaw! s)) - id-set (into #{} (coll-thaw! s)) - id-map (into {} (coll-thaw-kvs! s)) - id-coll (doall (coll-thaw! s)) + id-list (into '() (rseq (coll-thaw [] s))) + id-vector (coll-thaw [] s) + id-set (coll-thaw #{} s) + id-map (coll-thaw-kvs {} s) + id-coll (seq (coll-thaw [] s)) - id-meta (let [m (thaw-from-stream!* s)] (with-meta (thaw-from-stream!* s) m)) + id-meta (let [m (thaw-from-stream s)] (with-meta (thaw-from-stream s) m)) id-byte (.readByte s) id-short (.readShort s) id-integer (.readInt s) id-long (.readLong s) - id-bigint (bigint (read-biginteger! s)) + id-bigint (bigint (read-biginteger s)) id-float (.readFloat s) id-double (.readDouble s) - id-bigdec (BigDecimal. (read-biginteger! s) (.readInt s)) + id-bigdec (BigDecimal. (read-biginteger s) (.readInt s)) - id-ratio (/ (bigint (read-biginteger! s)) - (bigint (read-biginteger! s))) + id-ratio (/ (bigint (read-biginteger s)) + (bigint (read-biginteger s))) ;;; DEPRECATED id-old-reader (read-string (.readUTF s)) id-old-string (.readUTF s) - id-old-map (apply hash-map (repeatedly (* 2 (.readInt s)) - #(thaw-from-stream!* s))) + id-old-map (apply hash-map (utils/repeatedly-into [] (* 2 (.readInt s)) + #(thaw-from-stream s))) (throw (Exception. (str "Failed to thaw unknown type ID: " type-id)))))) -(defn thaw-from-stream! - "Deserializes an object from given input stream." - [data-input-stream read-eval?] - (binding [*read-eval* read-eval?] - (let [;; Support older versions of Nippy that wrote a version header - maybe-schema-header (thaw-from-stream!* data-input-stream)] - (if (and (string? 
maybe-schema-header) - (.startsWith ^String maybe-schema-header "\u0000~")) - (thaw-from-stream!* data-input-stream) - maybe-schema-header)))) +(defn thaw + "Deserializes frozen bytes to their original Clojure data type. Enable + `:legacy-mode?` to read bytes written by Nippy < 2.x. -(defn thaw-from-bytes - "Deserializes an object from given byte array." - [ba & {:keys [compressed? read-eval? password] - :or {compressed? true - read-eval? false ; For `read-string` injection safety - NB!!! - }}] - (try - (-> (let [ba (if password (crypto/decrypt-aes128 password ba) ba) - ba (if compressed? (utils/uncompress-bytes ba) ba)] - ba) - (ByteArrayInputStream.) - (DataInputStream.) - (thaw-from-stream! read-eval?)) - (catch Exception e - (throw (Exception. - (cond password "Thaw failed. Unencrypted data or bad password?" - compressed? "Thaw failed. Encrypted or uncompressed data?" - :else "Thaw failed. Encrypted and/or compressed data?") - e))))) + WARNING: Enabling `:read-eval?` can lead to security vulnerabilities unless + you are sure you know what you're doing." + [^bytes ba & [{:keys [read-eval? password compressor encryptor legacy-mode? + strict?] + :or {compressor compression/default-snappy-compressor + encryptor encryption/default-aes128-encryptor}}]] -(comment - (-> (freeze-to-bytes "my data" :password [:salted "password"]) - (thaw-from-bytes)) - (-> (freeze-to-bytes "my data" :compress? true) - (thaw-from-bytes :compressed? false))) + (let [ex (fn [msg & [e]] (throw (Exception. (str "Thaw failed. " msg) e))) + thaw-data (fn [data-ba compressor password] + (let [ba data-ba + ba (if password (encryption/decrypt encryptor password ba) ba) + ba (if compressor (compression/decompress compressor ba) ba) + stream (DataInputStream. (ByteArrayInputStream. ba))] + (binding [*read-eval* read-eval?] (thaw-from-stream stream))))] -(def stress-data - "Reference data used for tests & benchmarks." - (let [support-tagged-literals? - (utils/version-sufficient? 
(clojure-version) "1.4.0")] + (if legacy-mode? ; Nippy < 2.x + (try (thaw-data ba compressor password) + (catch Exception e + (cond password (ex "Unencrypted data or wrong password?" e) + compressor (ex "Encrypted or uncompressed data?" e) + :else (ex "Encrypted and/or compressed data?" e)))) + ;; Nippy >= 2.x, we have a header! + (let [[[id-magic* id-header* id-comp* id-enc* _] data-ba] + (utils/ba-split ba 5) + + compressed? (not (zero? id-comp*)) + encrypted? (not (zero? id-enc*))] + + (cond + (not= id-magic* id-nippy-magic-prefix) + (ex (str "Not Nippy data, data frozen with Nippy < 2.x, " + "or data may be corrupt?\n" + "Enable `:legacy-mode?` option for data frozen with Nippy < 2.x.")) + + (> id-header* id-nippy-header-ver) + (ex "Data frozen with newer Nippy version. Please upgrade.") + + (and strict? (not encrypted?) password) + (ex (str "Data is not encrypted. Try again w/o password.\n" + "Disable `:strict?` option to ignore this error. ")) + + (and strict? (not compressed?) compressor) + (ex (str "Data is not compressed. Try again w/o compressor.\n" + "Disable `:strict?` option to ignore this error.")) + + (and encrypted? (not password)) + (ex "Data is encrypted. Please try again with a password.") + + (and encrypted? password + (not= id-enc* (encryption/header-id encryptor))) + (ex "Data encrypted with a different Encrypter.") + + (and compressed? compressor + (not= id-comp* (compression/header-id compressor))) + (ex "Data compressed with a different Compressor.") + + :else + (try (thaw-data data-ba (when compressed? compressor) + (when encrypted? password)) + (catch Exception e + (if (and encrypted? password) + (ex "Wrong password, or data may be corrupt?" e) + (ex "Data may be corrupt?" e))))))))) + +(comment (thaw (freeze "hello")) + (thaw (freeze "hello" {:compressor nil})) + (thaw (freeze "hello" {:compressor nil}) {:strict? 
true}) ; ex + (thaw (freeze "hello" {:password [:salted "p"]})) ; ex + (thaw (freeze "hello") {:password [:salted "p"]})) + +;;;; Stress data + +(def stress-data "Reference data used for tests & benchmarks." + (let [] {:bytes (byte-array [(byte 1) (byte 2) (byte 3)]) :nil nil :boolean true @@ -323,6 +377,25 @@ :ratio 22/7 - ;; Clojure 1.4+ - :tagged-uuid (when support-tagged-literals? (java.util.UUID/randomUUID)) - :tagged-date (when support-tagged-literals? (java.util.Date.))})) \ No newline at end of file + ;; Clojure 1.4+ tagged literals + :tagged-uuid (java.util.UUID/randomUUID) + :tagged-date (java.util.Date.)})) + +;;;; Deprecated API + +(defn freeze-to-bytes "DEPRECATED: Use `freeze` instead." + ^bytes [x & {:keys [print-dup? compress? password] + :or {print-dup? true + compress? true}}] + (freeze x {:print-dup? print-dup? + :compressor (when compress? compression/default-snappy-compressor) + :password password + :legacy-mode? true})) + +(defn thaw-from-bytes "DEPRECATED: Use `thaw` instead." + [ba & {:keys [read-eval? compressed? password] + :or {compressed? true}}] + (thaw ba {:read-eval? read-eval? + :compressor (when compressed? compression/default-snappy-compressor) + :password password + :legacy-mode? 
true})) \ No newline at end of file diff --git a/src/taoensso/nippy/benchmarks.clj b/src/taoensso/nippy/benchmarks.clj index 902c5c1..0b2d554 100644 --- a/src/taoensso/nippy/benchmarks.clj +++ b/src/taoensso/nippy/benchmarks.clj @@ -1,26 +1,31 @@ (ns taoensso.nippy.benchmarks {:author "Peter Taoussanis"} - (:use [taoensso.nippy :as nippy :only (freeze-to-bytes thaw-from-bytes)]) - (:require [taoensso.nippy.utils :as utils] - [taoensso.nippy.crypto :as crypto])) + (:require [taoensso.nippy :as nippy :refer (freeze thaw)] + [taoensso.nippy.utils :as utils])) ;; Remove stuff from stress-data that breaks reader (def data (dissoc nippy/stress-data :queue :queue-empty :bytes)) -(defmacro bench [& body] `(utils/bench 10000 (do ~@body) :warmup-laps 1000)) +(defmacro bench [& body] `(utils/bench 10000 (do ~@body) :warmup-laps 2000)) -(defn reader-freeze [x] (binding [*print-dup* false] (pr-str x))) -(defn reader-thaw [x] (binding [*read-eval* false] (read-string x))) -(def reader-roundtrip (comp reader-thaw reader-freeze)) +(defn freeze-reader [x] (binding [*print-dup* false] (pr-str x))) +(defn thaw-reader [x] (binding [*read-eval* false] (read-string x))) +(def roundtrip-reader (comp thaw-reader freeze-reader)) -(def roundtrip-defaults (comp nippy/thaw-from-bytes nippy/freeze-to-bytes)) -(def roundtrip-encrypted (comp #(nippy/thaw-from-bytes % :password [:cached "p"]) - #(nippy/freeze-to-bytes % :password [:cached "p"]))) -(def roundtrip-fast (comp #(nippy/thaw-from-bytes % :compressed? false) - #(nippy/freeze-to-bytes % :compress? 
false))) +(def roundtrip-defaults (comp thaw freeze)) +(def roundtrip-encrypted (comp #(thaw % {:password [:cached "p"]}) + #(freeze % {:password [:cached "p"]}))) +(def roundtrip-fast (comp #(thaw % {}) + #(freeze % {:compressor nil}))) -(defn autobench [] (bench (roundtrip-defaults data) - (roundtrip-encrypted data))) +(defn autobench [] + (println "Benchmarking roundtrips") + (println "-----------------------") + (let [results {:defaults (bench (roundtrip-defaults data)) + :encrypted (bench (roundtrip-encrypted data)) + :fast (bench (roundtrip-fast data))}] + (println results) + results)) (comment @@ -30,38 +35,42 @@ (println {:reader - {:freeze (bench (reader-freeze data)) - :thaw (let [frozen (reader-freeze data)] - (bench (reader-thaw frozen))) - :round (bench (reader-roundtrip data)) - :data-size (count (.getBytes ^String (reader-freeze data) "UTF-8"))}}) + {:freeze (bench (freeze-reader data)) + :thaw (let [frozen (freeze-reader data)] (bench (thaw-reader frozen))) + :round (bench (roundtrip-reader data)) + :data-size (count (.getBytes ^String (freeze-reader data) "UTF-8"))}}) (println {:defaults - {:freeze (bench (freeze-to-bytes data)) - :thaw (let [frozen (freeze-to-bytes data)] - (bench (thaw-from-bytes frozen))) - :round (bench (roundtrip-defaults data)) - :data-size (count (freeze-to-bytes data))}}) + {:freeze (bench (freeze data)) + :thaw (let [frozen (freeze data)] (bench (thaw frozen))) + :round (bench (roundtrip-defaults data)) + :data-size (count (freeze data))}}) (println {:encrypted - {:freeze (bench (freeze-to-bytes data :password [:cached "p"])) - :thaw (let [frozen (freeze-to-bytes data :password [:cached "p"])] - (bench (thaw-from-bytes frozen :password [:cached "p"]))) - :round (bench (roundtrip-encrypted data)) - :data-size (count (freeze-to-bytes data :password [:cached "p"]))}}) + {:freeze (bench (freeze data {:password [:cached "p"]})) + :thaw (let [frozen (freeze data {:password [:cached "p"]})] + (bench (thaw frozen {:password 
[:cached "p"]}))) + :round (bench (roundtrip-encrypted data)) + :data-size (count (freeze data {:password [:cached "p"]}))}}) (println {:fast - {:freeze (bench (freeze-to-bytes data :compress? false)) - :thaw (let [frozen (freeze-to-bytes data :compress? false)] - (bench (thaw-from-bytes frozen :compressed? false))) - :round (bench (roundtrip-fast data)) - :data-size (count (freeze-to-bytes data :compress? false))}}) + {:freeze (bench (freeze data {:compressor nil})) + :thaw (let [frozen (freeze data {:compressor nil})] + (bench (thaw frozen))) + :round (bench (roundtrip-fast data)) + :data-size (count (freeze data {:compressor nil}))}}) (println "Done! (Time for cake?)")) + ;;; 13 June 2013: Clojure 1.5.1, Nippy 2.0.0-alpha1 + ;; {:reader {:freeze 23124, :thaw 26469, :round 47674, :data-size 22923}} + ;; {:defaults {:freeze 4007, :thaw 2520, :round 6038, :data-size 12387}} + ;; {:encrypted {:freeze 5560, :thaw 3867, :round 9157, :data-size 12405}} + ;; {:fast {:freeze 3429, :thaw 2078, :round 5577, :data-size 13237}} + ;;; 11 June 2013: Clojure 1.5.1, Nippy 1.3.0-alpha1 ;; {:reader {:freeze 17042, :thaw 31579, :round 48379, :data-size 22954}} ;; {:fast {:freeze 3078, :thaw 4684, :round 8117, :data-size 13274}} diff --git a/src/taoensso/nippy/compression.clj b/src/taoensso/nippy/compression.clj new file mode 100644 index 0000000..7c2b625 --- /dev/null +++ b/src/taoensso/nippy/compression.clj @@ -0,0 +1,23 @@ +(ns taoensso.nippy.compression + "Alpha - subject to change." 
+ {:author "Peter Taoussanis"} + (:require [taoensso.nippy.utils :as utils])) + +;;;; Interface + +(defprotocol ICompressor + (header-id [compressor]) ; Unique, >0, <= 128 + (compress ^bytes [compressor ba]) + (decompress ^bytes [compressor ba])) + +;;;; Default implementations + +(deftype DefaultSnappyCompressor [] + ICompressor + (header-id [_] 1) + (compress [_ ba] (org.iq80.snappy.Snappy/compress ba)) + (decompress [_ ba] (org.iq80.snappy.Snappy/uncompress ba 0 (alength ^bytes ba)))) + +(def default-snappy-compressor + "Default org.iq80.snappy.Snappy compressor." + (DefaultSnappyCompressor.)) \ No newline at end of file diff --git a/src/taoensso/nippy/crypto.clj b/src/taoensso/nippy/encryption.clj similarity index 53% rename from src/taoensso/nippy/crypto.clj rename to src/taoensso/nippy/encryption.clj index f3582f3..8224d45 100644 --- a/src/taoensso/nippy/crypto.clj +++ b/src/taoensso/nippy/encryption.clj @@ -1,24 +1,19 @@ -(ns taoensso.nippy.crypto +(ns taoensso.nippy.encryption "Alpha - subject to change. Simple no-nonsense crypto with reasonable defaults. Because your Clojure data deserves some privacy." {:author "Peter Taoussanis"} - (:require [clojure.string :as str] - [taoensso.nippy.utils :as utils])) + (:require [taoensso.nippy.utils :as utils])) ;;;; Interface -(defprotocol IEncrypter - (gen-key ^javax.crypto.spec.SecretKeySpec [encrypter salt-ba pwd]) - (encrypt ^bytes [encrypter pwd ba]) - (decrypt ^bytes [encrypter pwd ba])) +(defprotocol IEncryptor + (header-id [encryptor]) ; Unique, >0, <= 128 + (encrypt ^bytes [encryptor pwd ba]) + (decrypt ^bytes [encryptor pwd ba])) -(defrecord AES128Encrypter [key-work-factor key-cache]) +;;;; Default digests, ciphers, etc. -;;;; Digests, ciphers, etc. - -;; 128bit keys have good JVM availability and are -;; entirely sufficient, Ref. 
http://goo.gl/2YRQG (def ^:private ^:const aes128-block-size (int 16)) (def ^:private ^:const salt-size (int 16)) @@ -36,10 +31,10 @@ (defn- sha512-key "SHA512-based key generator. Good JVM availability without extra dependencies (PBKDF2, bcrypt, scrypt, etc.). Decent security with multiple rounds." - [salt-ba ^String pwd key-work-factor] + [salt-ba ^String pwd] (loop [^bytes ba (let [pwd-ba (.getBytes pwd "UTF-8")] (if salt-ba (utils/ba-concat salt-ba pwd-ba) pwd-ba)) - n (* (int Short/MAX_VALUE) key-work-factor)] + n (* (int Short/MAX_VALUE) (if salt-ba 5 64))] (if-not (zero? n) (recur (.digest sha512-md ba) (dec n)) (-> ba (java.util.Arrays/copyOf aes128-block-size) @@ -52,37 +47,62 @@ (time (sha512-key nil "hi" 128)) ; ~4500ms (paranoid) ) -;;;; Default implementation +;;;; Default implementations -(extend-type AES128Encrypter - IEncrypter - (gen-key [{:keys [key-work-factor key-cache]} salt-ba pwd] - ;; Trade-off: salt-ba and key-cache mutually exclusive - (utils/memoized key-cache sha512-key salt-ba pwd key-work-factor)) +(defn- destructure-typed-pwd + [typed-password] + (letfn [(throw-ex [] + (throw (Exception. + (str "Expected password form: " + "[<#{:salted :cached}> ].\n " + "See `default-aes128-encryptor` docstring for details!"))))] + (if-not (vector? typed-password) + (throw-ex) + (let [[type password] typed-password] + (if-not (#{:salted :cached} type) + (throw-ex) + [type password]))))) - (encrypt [{:keys [key-cache] :as this} pwd data-ba] - (let [salt? (not key-cache) - iv-ba (rand-bytes aes128-block-size) - salt-ba (when salt? (rand-bytes salt-size)) - prefix-ba (if-not salt? iv-ba (utils/ba-concat iv-ba salt-ba)) - key (gen-key this salt-ba pwd) - iv (javax.crypto.spec.IvParameterSpec. 
iv-ba)] - (.init aes128-cipher javax.crypto.Cipher/ENCRYPT_MODE key iv) +(comment (destructure-typed-pwd [:salted "foo"])) + +(defrecord DefaultAES128Encryptor [key-cache] + IEncryptor + (header-id [_] 1) + + (encrypt [this typed-pwd data-ba] + (let [[type pwd] (destructure-typed-pwd typed-pwd) + salt? (= type :salted) + iv-ba (rand-bytes aes128-block-size) + salt-ba (when salt? (rand-bytes salt-size)) + prefix-ba (if-not salt? iv-ba (utils/ba-concat iv-ba salt-ba)) + key (utils/memoized (when-not salt? (:key-cache this)) + sha512-key salt-ba pwd) + iv (javax.crypto.spec.IvParameterSpec. iv-ba)] + (.init aes128-cipher javax.crypto.Cipher/ENCRYPT_MODE + ^javax.crypto.spec.SecretKeySpec key iv) (utils/ba-concat prefix-ba (.doFinal aes128-cipher data-ba)))) - (decrypt [{:keys [key-cache] :as this} pwd ba] - (let [salt? (not key-cache) + (decrypt [this typed-pwd ba] + (let [[type pwd] (destructure-typed-pwd typed-pwd) + salt? (= type :salted) prefix-size (+ aes128-block-size (if salt? salt-size 0)) [prefix-ba data-ba] (utils/ba-split ba prefix-size) [iv-ba salt-ba] (if-not salt? [prefix-ba nil] (utils/ba-split prefix-ba aes128-block-size)) - key (gen-key this salt-ba pwd) + key (utils/memoized (when-not salt? (:key-cache this)) + sha512-key salt-ba pwd) iv (javax.crypto.spec.IvParameterSpec. iv-ba)] - (.init aes128-cipher javax.crypto.Cipher/DECRYPT_MODE key iv) + (.init aes128-cipher javax.crypto.Cipher/DECRYPT_MODE + ^javax.crypto.spec.SecretKeySpec key iv) (.doFinal aes128-cipher data-ba)))) -(def aes128-salted - "USE CASE: You want more than a small, finite number of passwords (e.g. each +(def default-aes128-encryptor + "Alpha - subject to change. + Default 128bit AES encryptor with multi-round SHA-512 keygen. + + Password form [:salted \"my-password\"] + --------------------------------------- + USE CASE: You want more than a small, finite number of passwords (e.g. each item encrypted will use a unique user-provided password). 
IMPLEMENTATION: Uses a relatively cheap key hash, but automatically salts @@ -94,13 +114,13 @@ particular key. Slower than `aes128-cached`, and easier to attack any particular key - but - keys are independent." - (AES128Encrypter. 5 nil)) + keys are independent. -(def aes128-cached - "USE CASE: You want only a small, finite number of passwords (e.g. a limited - number of staff/admins, or you'll be using a single password to - encrypt many items). + Password form [:cached \"my-password\"] + --------------------------------------- + USE CASE: You want only a small, finite number of passwords (e.g. a limited + number of staff/admins, or you'll be using a single password to + encrypt many items). IMPLEMENTATION: Uses a _very_ expensive (but cached) key hash, and no salt. @@ -112,37 +132,19 @@ Faster than `aes128-salted`, and harder to attack any particular key - but increased danger if a key is somehow compromised." - (AES128Encrypter. 64 (atom {}))) + (DefaultAES128Encryptor. (atom {}))) -(defn- destructure-typed-password - "[ ] -> [Encrypter ]" - [typed-password] - (letfn [(throw-ex [] - (throw (Exception. - (str "Expected password form: " - "[<#{:salted :cached}> ].\n " - "See `aes128-salted`, `aes128-cached` for details."))))] - (if-not (vector? 
typed-password) - (throw-ex) - (let [[type password] typed-password] - [(case type :salted aes128-salted :cached aes128-cached (throw-ex)) - password])))) - -(defn encrypt-aes128 [typed-password ba] - (let [[encrypter password] (destructure-typed-password typed-password)] - (encrypt encrypter password ba))) - -(defn decrypt-aes128 [typed-password ba] - (let [[encrypter password] (destructure-typed-password typed-password)] - (decrypt encrypter password ba))) +;;;; Default implementation (comment - (encrypt-aes128 "my-password" (.getBytes "Secret message")) ; Malformed - (time (gen-key aes128-salted nil "my-password")) - (time (gen-key aes128-cached nil "my-password")) - (time (->> (.getBytes "Secret message" "UTF-8") - (encrypt-aes128 [:salted "p"]) - (encrypt-aes128 [:cached "p"]) - (decrypt-aes128 [:cached "p"]) - (decrypt-aes128 [:salted "p"]) + (def dae default-aes128-encryptor) + (def secret-ba (.getBytes "Secret message" "UTF-8")) + (encrypt dae "p" secret-ba) ; Malformed + (time (encrypt dae [:salted "p"] secret-ba)) + (time (encrypt dae [:cached "p"] secret-ba)) + (time (->> secret-ba + (encrypt dae [:salted "p"]) + (encrypt dae [:cached "p"]) + (decrypt dae [:cached "p"]) + (decrypt dae [:salted "p"]) (String.)))) \ No newline at end of file diff --git a/src/taoensso/nippy/utils.clj b/src/taoensso/nippy/utils.clj index a02f2e5..d4d1910 100644 --- a/src/taoensso/nippy/utils.clj +++ b/src/taoensso/nippy/utils.clj @@ -1,7 +1,6 @@ (ns taoensso.nippy.utils {:author "Peter Taoussanis"} - (:require [clojure.string :as str]) - (:import org.iq80.snappy.Snappy)) + (:require [clojure.string :as str])) (defmacro case-eval "Like `case` but evaluates test constants for their compile-time value." @@ -14,25 +13,25 @@ clauses) ~(when default default)))) -(defn pairs - "Like (partition 2 coll) but faster and returns lazy seq of vector pairs." 
- [coll] - (lazy-seq - (when-let [s (seq coll)] - (let [n (next s)] - (cons [(first s) (first n)] (pairs (next n))))))) +(defn repeatedly-into + "Like `repeatedly` but faster and `conj`s items into given collection." + [coll n f] + (if-not (instance? clojure.lang.IEditableCollection coll) + (loop [v coll idx 0] + (if (>= idx n) + v + (recur (conj v (f)) (inc idx)))) + (loop [v (transient coll) idx 0] + (if (>= idx n) + (persistent! v) + (recur (conj! v (f)) (inc idx)))))) -(defmacro time-ns - "Returns number of nanoseconds it takes to execute body." - [& body] - `(let [t0# (System/nanoTime)] - ~@body - (- (System/nanoTime) t0#))) +(defmacro time-ns "Returns number of nanoseconds it takes to execute body." + [& body] `(let [t0# (System/nanoTime)] ~@body (- (System/nanoTime) t0#))) (defmacro bench "Repeatedly executes form and returns time taken to complete execution." - [num-laps form & {:keys [warmup-laps num-threads as-ms?] - :or {as-ms? true}}] + [num-laps form & {:keys [warmup-laps num-threads as-ns?]}] `(try (when ~warmup-laps (dotimes [_# ~warmup-laps] ~form)) (let [nanosecs# (if-not ~num-threads @@ -44,23 +43,17 @@ doall (map deref) dorun))))] - (if ~as-ms? (Math/round (/ nanosecs# 1000000.0)) nanosecs#)) + (if ~as-ns? nanosecs# (Math/round (/ nanosecs# 1000000.0)))) (catch Exception e# (str "DNF: " (.getMessage e#))))) -(defn version-compare - "Comparator for version strings like x.y.z, etc." - [x y] - (let [vals (fn [s] (vec (map #(Integer/parseInt %) (str/split s #"\."))))] - (compare (vals x) (vals y)))) +(defn version-compare "Comparator for version strings like x.y.z, etc." + [x y] (let [vals (fn [s] (vec (map #(Integer/parseInt %) (str/split s #"\."))))] + (compare (vals x) (vals y)))) -(defn version-sufficient? - [version-str min-version-str] +(defn version-sufficient? 
[version-str min-version-str] (try (>= (version-compare version-str min-version-str) 0) (catch Exception _ false))) -(defn compress-bytes [^bytes ba] (Snappy/compress ba)) -(defn uncompress-bytes [^bytes ba] (Snappy/uncompress ba 0 (alength ba))) - (defn memoized "Like `memoize` but takes an explicit cache atom (possibly nil) and immediately applies memoized f to given arguments." diff --git a/test/taoensso/nippy/tests/main.clj b/test/taoensso/nippy/tests/main.clj new file mode 100644 index 0000000..a0d4e07 --- /dev/null +++ b/test/taoensso/nippy/tests/main.clj @@ -0,0 +1,51 @@ +(ns taoensso.nippy.tests.main + (:require [expectations :as test :refer :all] + [taoensso.nippy :as nippy :refer (freeze thaw)] + [taoensso.nippy.benchmarks :as benchmarks])) + +;; Remove stuff from stress-data that breaks roundtrip equality +(def test-data (dissoc nippy/stress-data :bytes)) + +(def roundtrip-defaults (comp thaw freeze)) +(def roundtrip-encrypted (comp #(thaw % {:password [:salted "p"]}) + #(freeze % {:password [:salted "p"]}))) +(def roundtrip-defaults-legacy (comp #(thaw % {:legacy-mode? true}) + #(freeze % {:legacy-mode? true}))) +(def roundtrip-encrypted-legacy (comp #(thaw % {:password [:salted "p"] + :legacy-mode? true}) + #(freeze % {:password [:salted "p"] + :legacy-mode? 
true}))) + +;;; Basic data integrity +(expect test-data (roundtrip-defaults test-data)) +(expect test-data (roundtrip-encrypted test-data)) +(expect test-data (roundtrip-defaults-legacy test-data)) +(expect test-data (roundtrip-encrypted-legacy test-data)) + +(expect ; Snappy lib compatibility (for legacy versions of Nippy) + (let [^bytes raw-ba (freeze test-data {:compressor nil}) + ^bytes xerial-ba (org.xerial.snappy.Snappy/compress raw-ba) + ^bytes iq80-ba (org.iq80.snappy.Snappy/compress raw-ba)] + (= (thaw raw-ba) + (thaw (org.xerial.snappy.Snappy/uncompress xerial-ba)) + (thaw (org.xerial.snappy.Snappy/uncompress iq80-ba)) + (thaw (org.iq80.snappy.Snappy/uncompress iq80-ba 0 (alength iq80-ba))) + (thaw (org.iq80.snappy.Snappy/uncompress xerial-ba 0 (alength xerial-ba)))))) + +;;; API stuff + +;; Strict/auto mode - compression +(expect test-data (thaw (freeze test-data {:compressor nil}))) +(expect Exception (thaw (freeze test-data {:compressor nil}) {:strict? true})) + +;; Strict/auto mode - encryption +(expect test-data (thaw (freeze test-data) {:password [:salted "p"]})) +(expect Exception (thaw (freeze test-data) {:password [:salted "p"] :strict? 
true})) + +;; Encryption - passwords +(expect Exception (thaw (freeze test-data {:password "malformed"}))) +(expect Exception (thaw (freeze test-data {:password [:salted "p"]}))) +(expect test-data (thaw (freeze test-data {:password [:salted "p"]}) + {:password [:salted "p"]})) + +(expect (benchmarks/autobench)) ; Also tests :cached passwords \ No newline at end of file diff --git a/test/test_nippy/main.clj b/test/test_nippy/main.clj deleted file mode 100644 index 30a11d1..0000000 --- a/test/test_nippy/main.clj +++ /dev/null @@ -1,30 +0,0 @@ -(ns test-nippy.main - (:use [clojure.test]) - (:require [taoensso.nippy :as nippy] - [taoensso.nippy.benchmarks :as benchmarks])) - -;; Remove stuff from stress-data that breaks roundtrip equality -(def test-data (dissoc nippy/stress-data :bytes)) - -(def roundtrip-defaults (comp nippy/thaw-from-bytes nippy/freeze-to-bytes)) -(def roundtrip-encrypted (comp #(nippy/thaw-from-bytes % :password [:cached "secret"]) - #(nippy/freeze-to-bytes % :password [:cached "secret"]))) -(deftest test-roundtrip-defaults (is (= test-data (roundtrip-defaults test-data)))) -(deftest test-roundtrip-encrypted (is (= test-data (roundtrip-encrypted test-data)))) - -(println "Benchmarking roundtrips (x3)") -(println "----------------------------") -(println (benchmarks/autobench)) -(println (benchmarks/autobench)) -(println (benchmarks/autobench)) - -(deftest test-snappy-library-compatibility - (let [thaw #(nippy/thaw-from-bytes % :compressed? false) - ^bytes raw-ba (nippy/freeze-to-bytes test-data :compress? false) - ^bytes xerial-ba (org.xerial.snappy.Snappy/compress raw-ba) - ^bytes iq80-ba (org.iq80.snappy.Snappy/compress raw-ba)] - (is (= (thaw raw-ba) - (thaw (org.xerial.snappy.Snappy/uncompress xerial-ba)) - (thaw (org.xerial.snappy.Snappy/uncompress iq80-ba)) - (thaw (org.iq80.snappy.Snappy/uncompress iq80-ba 0 (alength iq80-ba))) - (thaw (org.iq80.snappy.Snappy/uncompress xerial-ba 0 (alength xerial-ba))))))) \ No newline at end of file
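
Note on the encryptor change above: the two password forms handled by `DefaultAES128Encryptor` can be illustrated with a minimal, standalone sketch that uses only JDK classes. This is not the library's code. The helper names (`derive-key`, `key-for`, `ba-concat`) and the explicit `rounds` argument are hypothetical, introduced so the example runs quickly; the diff itself hardcodes `Short/MAX_VALUE` times 5 (salted) or 64 (cached) rounds. The cipher transformation `"AES/CBC/PKCS5Padding"` is also an assumption, since the diff does not show how `aes128-cipher` is constructed.

```clojure
(import '[java.security MessageDigest SecureRandom]
        '[javax.crypto Cipher]
        '[javax.crypto.spec IvParameterSpec SecretKeySpec]
        '[java.util Arrays])

(def ^:private block-size 16) ; AES block size in bytes, also the IV size
(def ^:private salt-size  16)

(defn- rand-bytes ^bytes [n]
  (let [ba (byte-array n)]
    (.nextBytes (SecureRandom.) ba)
    ba))

(defn- ba-concat ^bytes [^bytes a ^bytes b]
  (let [out (byte-array (+ (alength a) (alength b)))]
    (System/arraycopy a 0 out 0 (alength a))
    (System/arraycopy b 0 out (alength a) (alength b))
    out))

(defn- derive-key
  "Hashes (salt ++ password) `rounds` times with SHA-512 and keeps the
  first 16 bytes as an AES-128 key. `rounds` is a hypothetical parameter."
  ^SecretKeySpec [^bytes salt-ba ^String pwd rounds]
  (let [md     (MessageDigest/getInstance "SHA-512")
        pwd-ba (.getBytes pwd "UTF-8")]
    (loop [^bytes ba (if salt-ba (ba-concat salt-ba pwd-ba) pwd-ba)
           n         (long rounds)]
      (if (pos? n)
        (recur (.digest md ba) (dec n))
        (SecretKeySpec. (Arrays/copyOf ba (int block-size)) "AES")))))

;; Only :cached keys are memoized; :salted keys depend on a fresh random salt.
(defonce ^:private key-cache (atom {}))

(defn- key-for ^SecretKeySpec [type salt-ba pwd rounds]
  (if (= type :salted)
    (derive-key salt-ba pwd rounds)
    (or (get @key-cache [pwd rounds])
        (let [k (derive-key nil pwd rounds)]
          (swap! key-cache assoc [pwd rounds] k)
          k))))

(defn encrypt
  "Returns iv ++ [salt] ++ ciphertext for a [:salted pwd] or [:cached pwd] form."
  [[type pwd] ^bytes data-ba rounds]
  (when-not (#{:salted :cached} type)
    (throw (ex-info "Expected [:salted pwd] or [:cached pwd]" {:type type})))
  (let [salt?   (= type :salted)
        iv-ba   (rand-bytes block-size)
        salt-ba (when salt? (rand-bytes salt-size))
        prefix  (if salt? (ba-concat iv-ba salt-ba) iv-ba)
        cipher  (doto (Cipher/getInstance "AES/CBC/PKCS5Padding") ; assumed mode
                  (.init Cipher/ENCRYPT_MODE
                         (key-for type salt-ba pwd rounds)
                         (IvParameterSpec. iv-ba)))]
    (ba-concat prefix (.doFinal cipher data-ba))))

(defn decrypt
  "Splits the prefix written by `encrypt`, re-derives the key, and decrypts."
  [[type pwd] ^bytes ba rounds]
  (let [salt?   (= type :salted)
        offset  (+ block-size (if salt? salt-size 0))
        iv-ba   (Arrays/copyOfRange ba 0 (int block-size))
        salt-ba (when salt?
                  (Arrays/copyOfRange ba (int block-size) (int offset)))
        cipher  (doto (Cipher/getInstance "AES/CBC/PKCS5Padding")
                  (.init Cipher/DECRYPT_MODE
                         (key-for type salt-ba pwd rounds)
                         (IvParameterSpec. iv-ba)))]
    (.doFinal cipher ba (int offset) (int (- (alength ba) offset)))))
```

Usage mirrors the `comment` block in the diff, for example `(String. (decrypt [:salted "p"] (encrypt [:salted "p"] (.getBytes "Secret message" "UTF-8") 1000) 1000) "UTF-8")`. The design trade-off is the one the docstring describes: the salted form stores a per-item salt in the output and pays the key-derivation cost on every call, while the cached form omits the salt so the derived key can be reused across items.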