June 8, 2017

An implementation of Mircea Popescu's Fabulous Hash Function

Filed under: common lisp, tmsr — Benjamin Vulpes @ 5:40 p.m.

At the beginning of March (you're stuck believing me, I keep detailed logs of this sort of thing), I wrote an implementation of Mircea Popescu's Fabulous Hash Function, in the process finding a few mistakes in his documentation.

I left off before publishing at the time, as while the individual operations worked, the assemblage together didn't return the same values I saw on Trilema. I got distracted, life moved on, and the proggy sat on my drive, further unexamined. Today, the hash function's author pointed out that he'd published results from an implementation of an older spec, Stan exhorted me to publish and reconcile, and here we are.

Updated: June 9, 2017: 4:03

'twas only my just deserts for sitting on a piece of code for months and then releasing it abruptly without loading it into my head fully: asciilifeform found the most glaringest flaw in the whole thing, that while the rest of the program operates in a destructive fashion, the invert operation returned the array that should have been emplaced into its input argument. More beatings is the only remedy...

Patch to fix follows:
In the interest of preserving my sanity, I'll not be hosting patches for this thing.

Updated: June 9, 2017: 7:24

Turns out, it doesn't work at all.

Updated: June 10, 2017: 01:05

I'm excising the old code below, and commit to maintaining an up-to-date version below. To paraphrase Mircea Popescu, it does something, although whether it does the right thing is hard to tell. Hash functions, amirite?

Please join us in #trilema on the Freenode IRC network to play with the bot. Demo session here:

Updated: Jun 10, 2017 01:24

Bug in the spec. Code updated. Combatants licking their wounds.

Updated: June 10, 2017 06:02
Reordered updates first to last, like a sane person.

Astonishment and awe, as the element expected to bring chaos brought instead entirely predictable order.Ambiguity in the spec this time!

And ultimately, sweet sweet synchrony.

With candidate canonical implementations in the field, it's time for another round of tests. I'll list some results here and hopefully enterprising souls will come up with some new mismatches to run down.

(mpfhf:mpfhf #*1111 4)

(mpfhf:mpfhf #*11111111111111111111111111111111 64)

(mpfhf:mpfhf #*1111111111111111111111111111111111111111111111111111111111111111 64)

 (mpfhf:mpfhf #*1111111111111111111111111111111111111111111111111111111111111111 128)

(mpfhf:mpfhf #*11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111 256)

And, the final torture test: a 64-bit MPFHF of the text document found here:


Updated: June 30, 2017 06:02

I've squeezed the obvious gains out of the thing, but it's still clocking in at some ~4X the runtime of sina's golang implementation. I can squeeze another 25% out of it by constraining hash output sizes, but that's not actually a useful optimization.

I find myself woefully underequipped to profile this code for the hot spots. More research on benchmarking and profiling CL to follow...

Currently operational implementation:

(defpackage :mpfhf
  (:use :cl :bit-smasher)
  (:export :flip :invert :screw :expand :mpfhf :hash-string :main))
(in-package :mpfhf)

(defparameter *log-out* nil)

(declaim (inline expand flip invert underlying-screw screw half-screw rewind))

(defun flip (element target)
  (declare ((array bit 1) element)
           (fixnum target))
  (let* ((original (bit element target))
         (new (if (= 0 original) 1 0)))
    (setf (bit element target) new)

(defun invert (element)
  (declare ((array bit 1) element))
   element (array-dimensions element)
   :initial-contents (bit-not element)))

(defun underlying-screw (b m-position count)
  (declare ((array bit 1) b)
           (fixnum m-position count))
  (loop for i integer below count do
       (let* ((product (* i m-position))
              (target (rem product (length b))))
         (flip b target))))

(defun half-screw (a b m-position)
  (declare ((array bit 1) a b)
           (fixnum m-position))
  (underlying-screw b m-position (floor (/ (length a) 2))))

(defun screw (a b m-position)
  "screw a in b"
  (declare ((array bit 1) a b)
       (fixnum m-position))
  (underlying-screw b m-position (length a)))

(defun expand (element)
  (declare ((array bit 1) element))
  (let ((original-size (array-total-size element)))
    (adjust-array element (+ 1 original-size) :initial-element 0)

(defun rewind (place)
  (declare (fixnum place))
  (if (= 0 place)
      (decf place)))

(defun mpfhf (m hash-length)
  (declare ((array bit 1) m)
           (fixnum hash-length))
  (let ((s (make-array 1 :element-type 'bit :initial-element 0 :adjustable t))
        (r (make-array hash-length :element-type 'bit :initial-element 0 :adjustable t))
        (message-length (length m)))
       for message-position fixnum = 0 then (+ 1 message-position)
       for steps fixnum = 0 then (+ 1 steps)
         (>= message-position message-length)
           (case (elt m message-position)
              (expand s)
              (screw s r message-position)
              (case (elt r (rem message-position hash-length))
                (0 (flip r (rem message-position hash-length))
                   (if (= 0 message-position) nil (decf message-position)))
                (t (flip r (rem message-position hash-length))
                   (invert s))))
              (half-screw s r message-position)
              (let ((idx-a (rem message-position hash-length))
                    (idx-b (rem message-position (array-total-size s))))
                  ((= (the bit (elt r idx-a))
                      (the bit (elt s idx-b)))
                   (expand s)
                   (screw r s message-position))
                   (flip r (rem message-position hash-length)))))))))
    (values r *log-out*)))

(defun hash-string (message length)
  (declare (simple-string message)
           (fixnum length))
  (let* ((bits (loop for c character across message appending
                    (coerce (bit-smasher:bits<- (char-code c)) 'list)))
         (bit-message (make-array (* 8 (length message))
                                  :element-type 'bit
                                  :initial-contents bits)))
    (mpfhf bit-message length)))

(defun main ()
  (let ((message (second sb-ext:*posix-argv*))
        (length (parse-integer (third sb-ext:*posix-argv*))))
    (format t "~A~%" (hash-string message length))))

Not shown above, the MPFHF returning a log stream (and the log lines, for sanity's sake) so that the bot can slam it into my pastulator. As of this update, the logs are so verbose that the pastulator won't eat them for larger hashes, and my MPFHF implementation consumes absurd amounts of memory when logging.

The test cases are from the Trilema article linked above, and as they are notionally from runs of an implementation of a different spec, none match.

Reconciliation notes, bug reports, portability mistake reports, and any other sort of suggestions welcome.


RSS feed for comments on this post. TrackBack URL


« The CIA's flickr account --- "Hey kid, want to buy some VERBOTEN HATE SPEECH?" »