Navigation


RSS / Atom



Optimization, or how to waste time trying

2008-08-19 , , R. Lonstein

I’ve been on some very long conference calls lately so I made a try of log parsing in common lisp:

   (defun split (delim line)
      "Return a list of subsequences separated by a one character delimiter, delim itself is not returned"
      (loop :for mark = 0 :then (1+ point)
            :for point = (position delim line :start mark)
            :collect (subseq line mark point)
            :while point))
   (defun splitcols (columns file &key (delim #\Space) grep)
      "Iterate over the lines of a file, returning the specified columns.
    If :grep is supplied only lines matching that string will be returned."
      (with-open-file (s file)
        (loop :for line = (read-line s nil)
              :while line
              :when (or (eq grep nil) (search grep line))
              :collect (loop :for c :in columns
                             :collect (elt (split delim line) c)))))

It suits the simple case where one might use awk to pull out columns from a log but it’s quite a bit slower on SBCL 1.0.18 on my machine than just using awk. So as a challenge to myself I tried to optimize it which resulted in spending a few minutes salting it heavily with declarations to reach this:

   (defun split-optimized (delim line)
      "Return a list of subsequences separated by a one character delimiter, delim itself is not returned"
      (declare (optimize (speed 3) (debug 0) (space 0) (safety 0)))
      (declare (type simple-string line)
               (type character delim)
               (type seq-index point))
      (loop :for mark = (the fixnum 0) :then (the fixnum (1+ point))
            :for point = (position delim (the simple-string line) :start mark)
            :collect (subseq (the string line) mark point)
            :while point))
   (defun splitcols-optimized-1 (columns file &key (delim #\Space) grep)
      "Iterate over the lines of a file, returning the specified columns.
    If :grep is supplied only lines matching that string will be returned."
      ; this runs in about half the time of splitcols
      (declare (optimize (speed 3) (debug 0) (space 0) (safety 0)))
      (declare (type simple-string grep)
               (type character delim))
      (declare (inline split-optimized))
      (with-open-file (s file)
        (loop :for line of-type simple-string = (read-line s nil)
              :while line
              :when (or (eq grep nil) (search grep line)) ; maybe inefficient search...
              :collect (loop :for c :in columns
                             :collect (elt (split-optimized delim line) c)))))

which is imperfect but quite a bit better in terms of runtime, as tested against a 25MB apache log:

   CL-USER> (sb-profile:report)
      seconds  |    consed   | calls |  sec/call  |  name
    ---------------------------------------------------------
         1.394 | 138,110,624 |     1 |   1.393998 | SPLITCOLS-INLINE-SPLIT
         1.361 | 138,114,088 |     1 |   1.360998 | SPLITCOLS
         0.848 | 137,340,504 |     1 |   0.847998 | SPLITCOLS-OPTIMIZED-1
    ---------------------------------------------------------
         3.603 | 413,565,216 |     3 |            | Total

and there I was. All that consing (and time) was going somewhere so I spent a few hours over the next day trying permutations of declarations, preallocating, using `do` instead of `loop`, writing my own `position` and `subseq` so I can share state. None of it was successful in reducing runtime or consing.

This evening- on another call- I noticed that similarity. Maybe it’s not really my awful code. So I wrote a null function that just slurps the file. Bingo. Here’s how it came out:

   CL-USER> (sb-profile:report)
      seconds  |    consed   | calls |  sec/call  |  name  
    ---------------------------------------------------------
         1.366 | 138,108,688 |     1 |   1.365998 | SPLITCOLS
         0.819 | 137,341,248 |     1 |   0.818998 | SPLITCOLS-OPTIMIZED-1
         0.602 | 129,961,032 |     1 |   0.601998 | DO-NOTHING-WITH-FILE
         0.591 | 129,960,592 |     1 |   0.590998 | DO-NOTHING-WITH-FILE-OPTIMIZED
    ---------------------------------------------------------
         3.378 | 535,371,560 |     4 |            | Total

With that in mind, I profiled a bit more thoughtfully:

   CL-USER> (sb-profile:report)
      seconds  |    consed   |  calls  |  sec/call  |  name
    -----------------------------------------------------------
         1.953 | 387,329,576 | 366,735 |   0.000005 | READ-LINE
         1.428 |           0 | 244,488 |   0.000006 | SEARCH
         0.279 |  11,002,520 |       1 |   0.278550 | SPLITCOLS-OPTIMIZED-1
         0.103 |  15,072,864 | 192,159 |   0.000001 | SUBSEQ
         0.070 |   9,810,576 |       1 |   0.070177 | SPLITCOLS
         0.050 |  10,138,608 |       1 |   0.050177 | SPLITCOLS-INLINE-SPLIT
         0.000 |           0 |  21,285 |   0.000000 | ELT
    -----------------------------------------------------------
         3.884 | 433,354,144 | 824,670 |            | Total

and see that the time and gc is mostly spent in `read-line`. A quick look at it, a skim of `sb-streams` and some googling informs me that I’m not ready to take on doing a special purpose rewrite of that. Yet.

Still, this was a fun exercise and a reminder of C.A.R. Hoare’s advice that premature optimization is the root of all evil.

Comment


Commenting is closed for this article.