RSS / Atom

Reading on Clojure

2011-06-18 , , , Comment

A few years ago I attended a conference where Rick Hickey presented on his new JVM-based language Clojure. At the time I thought, “Cool. A lisp with Java interop and software transactional memory (STM)” and pushed it onto the pile of things to investigate. Last month I got around to it.

I’ve finished Halloway’s Programming Clojure which I’ve had sitting here since New Years when I ordered Programming Erlang and a few other books and started Fogus’s The Joy of Clojure. I prefer “Joy”, it’s a different kind of introductory book much more to my taste of jumping in with both feet (and included the pdf ebook gratis) and I’ve begun converting some existing small projects from Common Lisp. I’m impressed, as much with the community as with the language: Emacs/Slime (snapshot)/swank work with barely any tweaking of my existing config, Leiningen hides the Maven ugliness (which is itself an improvement over Ant), plugins exist for the Netbeans and Eclipse IDEs (neither are more compelling than Emacs where I’ve spent the last decade, maybe I’m not getting it), pretty much everything I’ve looked for has turned up in Clojars and #clojure is populated by helpful and fairly tolerant people.

It’s a nice language and a reasonably pleasant way to work with Java (though I did have to break out Java in a Nutshell and popped for the upgrade to the pdf ebook).


Crockford's Base32 in Common Lisp

2010-11-18 , , ,

Two nights ago I couldn’t sleep and ran across Doug Crockford’s Human Error Resistant Base32 encoding. This kind of twiddling is fun so I wrote an implementation in Common Lisp. However, there only seem to be a few implementations out there (one I found in Perl gives identical results, one in Python pads to an even-length with leading zeros) so it’s mostly tested against itself and some scribbling on paper. Developed on SBCL, lightly tested on CCL and ECL so should work on others (ignoring my attempts at optimization).

Download here base32-crockford.lisp


weblocks, finally!

2010-04-19 , , , Comment

I have a little web app in mind so I decided to experiment with Weblocks, a continuations-based web framework in Common Lisp, which has been on my list of cool things to check out. I’m on OSX and it’s been a true experience in yak shaving...

  1. The easiest method is to install it via clbuild. Except there’s no tarball.
  2. The clbuild repo is in darcs. Ok, I’ll build it from macports. Except, the dependencies are broken.
  3. Mix and match versions, local port repo, a couple of patches. Nada. Drop back to older release. At least I didn’t have to fight with ghc. Ok, off and running again.
  4. Nope, need mercurial. Ok, build that.
  5. Oops, rebuild SBCL to include threading. Really ready now!
  6. No luck- problems with cl-json and symbols have changed names in parenscript. Great, fix that to find…
  7. What? Some deps don’t build properly within clbuild. Ok, build them one at a time.
  8. Problems with hunchentoot, so pull the latest and build.
  9. Wrong, need a newer version of chunga, build that too.
  10. Try newer versions of various deps.
  11. Try older versions of various deps.
  12. Rebuild several deps by hand, skipping clbuild entirely.

Lather, rinse, repeat a few times starting at step #7. Sleep. Work. Hack. Curse. Hack. Drink. Sleep. Work. Hack. Curse.

Finally, for no obvious reason it’s working!

weblocks demo

I’ll follow up with the (current, as of today) magic combination of versions and build steps but I first need to balance the cursing with drinking.


See you at ILC 2009

2009-03-19 , , , Comment [1]


Good advice, solve the failure first

2009-01-27 , , ,

Kenny posted Tilton’s Law, which is summarized as “Solve the failure first” (and his corollary, solve the first problem).


Optimization, or how to waste time trying

2008-08-18 , , ,

I’ve been on some very long conference calls lately so I made a try of log parsing in common lisp:

   (defun split (delim line)
      "Return a list of subsequences separated by a one character delimiter, delim itself is not returned"
      (loop :for mark = 0 :then (1+ point)
            :for point = (position delim line :start mark)
            :collect (subseq line mark point)
            :while point))
   (defun splitcols (columns file &key (delim #\Space) grep)
      "Iterate over the lines of a file, returning the specified columns.
    If :grep is supplied only lines matching that string will be returned."
      (with-open-file (s file)
        (loop :for line = (read-line s nil)
              :while line
              :when (or (eq grep nil) (search grep line))
              :collect (loop :for c :in columns
                             :collect (elt (split delim line) c)))))

It suits the simple case where one might use awk to pull out columns from a log but it’s quite a bit slower on SBCL 1.0.18 on my machine than just using awk. So as a challenge to myself I tried to optimize it which resulted in spending a few minutes salting it heavily with declarations to reach this:

   (defun split-optimized (delim line)
      "Return a list of subsequences separated by a one character delimiter, delim itself is not returned"
      (declare (optimize (speed 3) (debug 0) (space 0) (safety 0)))
      (declare (type simple-string line)
               (type character delim)
               (type seq-index point))
      (loop :for mark = (the fixnum 0) :then (the fixnum (1+ point))
            :for point = (position delim (the simple-string line) :start mark)
            :collect (subseq (the string line) mark point)
            :while point))
   (defun splitcols-optimized-1 (columns file &key (delim #\Space) grep)
      "Iterate over the lines of a file, returning the specified columns.
    If :grep is supplied only lines matching that string will be returned."
      ; this runs in about half the time of splitcols
      (declare (optimize (speed 3) (debug 0) (space 0) (safety 0)))
      (declare (type simple-string grep)
               (type character delim))
      (declare (inline split-optimized))
      (with-open-file (s file)
        (loop :for line of-type simple-string = (read-line s nil)
              :while line
              :when (or (eq grep nil) (search grep line)) ; maybe inefficient search...
              :collect (loop :for c :in columns
                             :collect (elt (split-optimized delim line) c)))))

which is imperfect but quite a bit better in terms of runtime, as tested against a 25MB apache log:

   CL-USER> (sb-profile:report)
      seconds  |    consed   | calls |  sec/call  |  name
         1.394 | 138,110,624 |     1 |   1.393998 | SPLITCOLS-INLINE-SPLIT
         1.361 | 138,114,088 |     1 |   1.360998 | SPLITCOLS
         0.848 | 137,340,504 |     1 |   0.847998 | SPLITCOLS-OPTIMIZED-1
         3.603 | 413,565,216 |     3 |            | Total

and there I was. All that consing (and time) was going somewhere so I spent a few hours over the next day trying permutations of declarations, preallocating, using `do` instead of `loop`, writing my own `position` and `subseq` so I can share state. None of it was successful in reducing runtime or consing.

This evening- on another call- I noticed that similarity. Maybe it’s not really my awful code. So I wrote a null function that just slurps the file. Bingo. Here’s how it came out:

   CL-USER> (sb-profile:report)
      seconds  |    consed   | calls |  sec/call  |  name  
         1.366 | 138,108,688 |     1 |   1.365998 | SPLITCOLS
         0.819 | 137,341,248 |     1 |   0.818998 | SPLITCOLS-OPTIMIZED-1
         0.602 | 129,961,032 |     1 |   0.601998 | DO-NOTHING-WITH-FILE
         0.591 | 129,960,592 |     1 |   0.590998 | DO-NOTHING-WITH-FILE-OPTIMIZED
         3.378 | 535,371,560 |     4 |            | Total

With that in mind, I profiled a bit more thoughtfully:

   CL-USER> (sb-profile:report)
      seconds  |    consed   |  calls  |  sec/call  |  name
         1.953 | 387,329,576 | 366,735 |   0.000005 | READ-LINE
         1.428 |           0 | 244,488 |   0.000006 | SEARCH
         0.279 |  11,002,520 |       1 |   0.278550 | SPLITCOLS-OPTIMIZED-1
         0.103 |  15,072,864 | 192,159 |   0.000001 | SUBSEQ
         0.070 |   9,810,576 |       1 |   0.070177 | SPLITCOLS
         0.050 |  10,138,608 |       1 |   0.050177 | SPLITCOLS-INLINE-SPLIT
         0.000 |           0 |  21,285 |   0.000000 | ELT
         3.884 | 433,354,144 | 824,670 |            | Total

and see that the time and gc is mostly spent in `read-line`. A quick look at it, a skim of `sb-streams` and some googling informs me that I’m not ready to take on doing a special purpose rewrite of that. Yet.

Still, this was a fun exercise and a reminder of C.A.R. Hoare’s advice that premature optimization is the root of all evil.


« Older Posts