Living Lazy, Without Variables

Programmers coming to functional languages for the first time cannot imagine life without variables. I address this head-on in the Clojure book. In Section 2.7 (free download here), I port an imperative method from the Apache Commons Lang to Clojure. First the Java version:

  // From Apache Commons Lang, http://commons.apache.org/lang/
  public static int indexOfAny(String str, char[] searchChars) {
    if (isEmpty(str) || ArrayUtils.isEmpty(searchChars)) {
  	  return -1;
    }
    for (int i = 0; i < str.length(); i++) {
  	  char ch = str.charAt(i);
  	  for (int j = 0; j < searchChars.length; j++) {
  	    if (searchChars[j] == ch) {
  		    return i;
  	    } 
  	  }
    }
    return -1;
  }

And now the Clojure code. I have shown the supporting function indexed as well:

  (defn indexed [s] (map vector (iterate inc 0) s))
  (defn index-of-any [s chars]
    (some (fn [[idx char]] (if (get chars char) idx)) 
  	      (indexed s)))

There are many things I like about the Clojure version, but I want to focus on something I didn't mention already in the book. A reader thought the Clojure version did too much work:

...the [Java] version can be seen as *more efficient* when a match is found because scanning stops right there, whereas "indexed" constructs the whole list of pairs, regardless of whether or not a match WILL be found....

The reader's assumption is reasonable, but incorrect. Clojure's sequence library functions are generally lazy. So the call to indexed is really just a promise to generate indexes if they are actually needed.

To see this, create a logging-seq that writes to stdout every time it actually yields an element:

  (defn logging-seq [s]
    (if s
      (do (println "Iterating over " (first s))
  	(lazy-cons (first s) (logging-seq (rest s))))))

Now, you can add logging-seq to indexed so that each element of indexed is of the form [index, element, logged-element].

  (defn indexed [s] (map vector (iterate inc 0) s (logging-seq s)))

Test the modified indexed function at the Clojure REPL:

  user=> (indexed "foo")
  Iterating over  f
  (Iterating over  o
  [0 \f \f] Iterating over  o
  [1 \o \o] [2 \o \o])

As you can see, the indexed sequence is only produced as needed. (At the REPL it is needed to print the return value.)

Finally, you can test indexed-of-any and see that Clojure only produces enough of the sequence to get an answer. For a match on the first character, it only goes to the first character:

  (index-of-any "foo" #{\f})
  Iterating over  f
  0

If there is no match, index-of-any has to traverse the entire string:

  (index-of-any "foo" #{\z})
  Iterating over  f
  Iterating over  o
  Iterating over  o
  nil

So give up on those variables, and live lazy!

Get In Touch