An Introduction to the Clojure Standard Library

The Clojure standard library is a treasure trove of useful functions. Knowing what's in there can speed up development or even lead to a better functional design.

At the time of this writing there are:

user=> (clojure-version)  
"1.9.0-alpha14"
user=> (count (ns-publics 'clojure.core))  
649  

public functions and macros available in the clojure.core namespace alone! What are those? Why are they useful?
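One way to start answering the first question is to ask the namespace itself. The following is a minimal sketch, not part of the original listing: the names core-vars, core-macros and core-others are made up for illustration, and the counts depend on the Clojure version in use, so no output is shown.

```clojure
;; All public vars of clojure.core (the same ones counted above).
(def core-vars (vals (ns-publics 'clojure.core)))

;; Macros carry :macro true in their var metadata.
(def core-macros (filter #(:macro (meta %)) core-vars))

;; Everything else: functions plus a handful of plain vars.
(def core-others (remove #(:macro (meta %)) core-vars))

(println (count core-macros) "macros,"
         (count core-others) "other public vars")
```

Sorting the keys of (ns-publics 'clojure.core) is a similarly quick way to browse the names alphabetically.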

Building your developer tool box

Software development is often compared to a craft, despite the fact that it's predominantly an intellectual activity. While software development is abstract in nature there are many craft-oriented aspects to it:

  • The keyboard requires time and dedication to operate correctly. There are endless discussions on the best keyboard layout for programmers, for example on how to speed up typing: Dvorak users often claim huge benefits compared to QWERTY.
  • The development environment is a key aspect of programmers' productivity and another source of endless debate. Mastering a development environment often translates into learning useful key combinations and ways to customise the most common operations.
  • Libraries, tools and idioms surrounding the language also take practice: almost everything above the pure syntax rules.
  • Proficiency in several programming languages is definitely a plus in the job marketplace. The way to achieve it is by practicing them on a regular basis, including getting familiar with APIs and libraries the language offers.
  • Many other aspects require specific skills depending on the area of application: teaching, presenting or leadership.

The focus on mastering programming skills is so important that it became one of the key objectives of the Software Craftsmanship Movement. Software Craftsmanship advocates learning through practice and promotes an apprenticeship process similar to other professions.

A development kit usually comes for free as part of the language installation. The standard library functions are normally packaged with the language and can be used almost immediately. Interestingly, the standard library doesn't get the amount of attention you would expect for a tool so easy to reach; it is mostly learned on the go and by necessity.

Why should I care more about the Clojure Standard Library?

The expressiveness of a language is often described as the speed at which ideas can be translated into working software. Part of the expressiveness comes from the language itself (in terms of syntax), but another fundamental part comes from the standard library.

A good standard library liberates the programmer from the most mundane tasks like connecting to data sources, parsing XML, dealing with numbers and a lot more.

When the standard library does a good job, developers are free to concentrate on core business aspects of an application, boosting productivity and return on investment.

Consider also that a deep knowledge of the standard library is often what distinguishes an average developer from an expert developer. The expert can solve problems more elegantly and faster than the beginner because, apart from having solved the same problem before, they can compose a complex solution by pulling small pieces together from the standard library.

Finally, the standard library contains solutions to common programming problems that have been battle-tested over generations of previous applications. It is certainly the case for Clojure. The robustness and reliability that come with that kind of stress are difficult to achieve otherwise.

What's inside the Clojure standard library?

The Clojure standard library is quite comprehensive and can be divided roughly into three parts:

  1. What is commonly referred to as core, which is the content of the single namespace clojure.core. Core contains the functions that have evolved to be the main public API for the language, including basic math operators, functions to create and manipulate other functions, conditionals and much more. Functions in core are always available without any explicit reference from any namespace.
  2. Other namespaces that are shipped as part of the Clojure installation. These are usually prefixed with clojure followed by a descriptive name, like clojure.test, clojure.zip or clojure.string. Functions in these namespaces need to be required.
  3. Finally, the content of the Java SDK which is easily available as part of Clojure Java interoperability features.
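The three parts can be seen side by side in a short sketch. The var names (total, shout, root) are hypothetical and only serve the illustration; the functions called are regular members of clojure.core, clojure.string and java.lang.Math.

```clojure
;; 1. clojure.core: available everywhere, no require needed.
(def total (reduce + (map inc [1 2 3])))      ; 9

;; 2. Other namespaces shipped with Clojure: require them first.
(require '[clojure.string :as string])
(def shout (string/upper-case "hello"))       ; "HELLO"

;; 3. The Java SDK: reachable directly through interop.
(def root (Math/sqrt 16.0))                   ; 4.0
```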

The standard library content can be roughly categorised by looking at the major features Clojure introduces and by the most common programming tasks. There are, for example, big groups of functions dedicated to Software Transactional Memory, concurrency or persistent collections. Of course Clojure also adds all the necessary support for common tasks like IO, math operations, formatting, strings and much more.

Apparently missing from the Clojure standard library are solutions already provided by the Java SDK, for example cryptography, low-level networking, HTTP, 2D graphics and so on. For all practical purposes, those features are not missing: they are usable as they are from Java, without the need to rewrite them in Clojure. Java interoperability is one of the big strengths of Clojure, opening the possibility to easily use the Java SDK (Software Development Kit) from a Clojure program. Here's a broad categorisation:

  • The core support namespaces integrate core with additional functionalities on top of those already present. clojure.string is possibly the best example. Core already contains str but any other useful string functionalities have been moved out into the clojure.string namespace. clojure.template contains a few helpers for macro creation. clojure.set is about the "set" data structure. clojure.pprint contains formatters for almost all Clojure data types so they can print in a nice, human-readable form. Finally, clojure.stacktrace contains functions to handle Java exception manipulation and formatting.
  • The REPL and server namespaces contain functionalities dedicated to the REPL. clojure.main includes handling of the main entry point into the Clojure executable and part of the REPL functionalities, which were split into clojure.repl at a later time. The latest addition, clojure.core.server, implements the server socket functionality.
  • The general supporting namespaces are about additional APIs, beyond what core has to offer. The namespaces present here enrich Clojure with new functionalities. clojure.walk and clojure.zip, for example, are two ways to walk and manipulate tree-like data structures. clojure.xml offers XML parsing capabilities. clojure.test is the unit test framework included with Clojure. clojure.java.shell contains functions to "shell-out" commands to the operating system. clojure.core.reducers offers a model of parallel computation.
  • The Java namespaces are dedicated to Java interop beyond what core already has to offer. clojure.java.browse and clojure.java.javadoc offer the possibility to open a native browser to display generic web pages or javadoc documentation, respectively. clojure.reflect wraps the Java reflection APIs, offering an idiomatic Clojure layer on top of them. clojure.java.io offers a sane approach to java.io, removing all the idiosyncrasies that made Java IO so confusing, like knowing the correct combination of constructors to transform a Stream into a Reader and vice-versa. Finally, clojure.inspector offers a simple UI to navigate data structures.
  • Finally, the data serialization namespaces are about ways in which Clojure data can be encoded as strings for use as an exchange format. clojure.edn is the main entry point into EDN format serialization. clojure.data contains only one user-dedicated function, diff, to compute differences between data structures. clojure.instant defines encoding of time-related types.
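To give a feel for some of these groups, here is a small sample with one call per namespace. It is only a sketch: the var names are hypothetical, the inputs are made up, and each namespace offers much more than the single function shown.

```clojure
(require '[clojure.set :as set]
         '[clojure.walk :as walk]
         '[clojure.edn :as edn]
         '[clojure.data :as data])

;; Core support: set algebra from clojure.set.
(def common (set/intersection #{1 2 3} #{2 3 4}))     ; #{2 3}

;; General support: walk a nested map, turning string keys into keywords.
(def keyed (walk/keywordize-keys {"a" {"b" 1}}))      ; {:a {:b 1}}

;; Data serialization: read EDN text back into Clojure data.
(def parsed (edn/read-string "{:id 1 :tags #{:x}}"))  ; {:id 1, :tags #{:x}}

;; clojure.data/diff returns [only-in-first only-in-second in-both].
(def delta (data/diff {:a 1 :b 2} {:a 1 :b 3}))       ; [{:b 2} {:b 3} {:a 1}]
```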

Making your development life easier

The standard library is not just there to solve the usual recurring programming problems, but to offer elegant solutions to new development challenges. "Elegant" in this context translates to composable solutions that are easy to read and maintain. Let's look at the following example.

Suppose that you're given the task to create a report to display information on screen in a human readable form. Information is coming from an external system, and a library has already taken care of that communication. All you know is that the input arrives structured as the following XML (here saved as a local balance var definition):

(def balance
  "<balance>
    <accountId>3764882</accountId>
    <lastAccess>20120121</lastAccess>
    <currentBalance>80.12389</currentBalance>
  </balance>")

The balance needs to be displayed in a user-friendly way, which means:

  1. Removing any unwanted symbols other than letters (like the colon at the beginning of each key)
  2. Separating the words (using uppercase letters as delimiters)
  3. Formatting the balance as a currency with two decimal digits.

You might be tempted to solve the problem like this:

(require '[clojure.java.io :as io])
(require '[clojure.xml :as xml])

(defn- to-double [k m]
  (update-in m [k] #(Double/valueOf %)))

(defn parse [xml] ; <1>
  (let [xml-in (java.io.ByteArrayInputStream. (.getBytes xml))
        results 
          (to-double
            :currentBalance
            (apply merge
              (map #(hash-map (:tag %) (first (:content %)))
                (:content (xml/parse xml-in)))))]
    (.close xml-in)
    results))

(defn clean-key [k] ; <2>
  (let [kstr (str k)]
    (if (= \: (first kstr))
      (apply str (rest kstr))
      kstr)))

(defn- up-first [[head & others]]
  (apply str (conj others (.toUpperCase (str head)))))

(defn separate-words [k] ; <3>
  (let [letters (map str k)]
    (up-first (reduce 
      #(str %1 (if (= %2 (.toLowerCase %2)) 
                  %2 
                  (str " " %2))) "" letters))))

(defn format-decimals [v] ; <4>
  (if (float? v)
    (let [[_ nat dec] (re-find #"(\d+)\.(\d+)" (str v))]
      (cond
        (= (count dec) 1) (str v "0")
        (> (count dec) 2) (apply str nat "." (take 2 dec))
        :else (str v)))
    v))

(defn print-balance [xml] ; <5>
  (let [balance (parse xml)]
    (letfn [(transform [acc item]
              (assoc acc
                     (separate-words (clean-key item))
                     (format-decimals (item balance))))]
      (reduce transform {} (keys balance)))))

(print-balance balance)
;; {"Account Id" "3764882",
;;  "Last Access" "20120121",
;;  "Current Balance" "80.12"}

  1. parse takes the XML input string and parses it into a hash-map containing just the necessary keys. parse also converts :currentBalance into a double.
  2. clean-key solves the problem of removing the ":" at the beginning of each attribute name. It checks the beginning of the attribute before removing potentially unwanted characters.
  3. separate-words takes care of searching for upper-case letters and prepending a space. reduce is used here to store the accumulation of changes so far while we read the original string as the input. up-first was extracted as a handy helper to upper-case the first letter.
  4. format-decimals handles the format of floating point numbers. It searches for digits with re-find and then either pads with a zero or truncates the decimal digits.
  5. Finally, print-balance puts all the transformations together. Again, reduce is used to create a new map with the transformations while we read the original one. The reducing function was big enough to suggest extracting it as a local function in a letfn form. The core of the function is to assoc the newly formatted attribute with the formatted value in the new map to display.

While being relatively easy to read (the three formatting rules are somewhat separated into functions), the example shows minimal use of what the standard library has to offer. It contains map, reduce, apply and a few others, including XML parsing, which are of course important functions (and usually what beginners learn first). But there are definitely other functions in the standard library that would make the same code more concise and readable.

Let's have a second look at the requirements to see if we can do a better job. The source of complexity in the code above can be traced back to the following:

  • String processing: strings need to be analyzed and de-composed. The clojure.string namespace comes to mind and possibly subs.
  • Hash-map related computations: both keys and values need specific processing. reduce is used here because we want to gradually transform both the key and the value at the same time. But zipmap sounds like a viable alternative worth exploring.
  • Formatting rules of the final output: things like string padding of numerals or rounding of decimals. There is an interesting clojure.pprint/cl-format function that might come in handy.
  • Other details like nested forms and IO side effects. In the first case, threading macros can be used to improve readability. In the second, macros like with-open remove the need for developers to remember to close the Java IO resource at the end.
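Before rewriting anything, each candidate can be tried in isolation at the REPL. The experiments below are a sketch with made-up inputs and hypothetical var names; they only check that the functions behave the way the list above assumes.

```clojure
(require '[clojure.string :as string]
         '[clojure.pprint :refer [cl-format]]
         '[clojure.java.io :as io])

;; String processing: subs slices by index, and string/replace accepts a
;; regex whose groups can be referenced as $1 in the replacement.
(def head (subs "accountId" 0 1))                           ; "a"
(def spaced (string/replace "AccountId" #"([A-Z])" " $1"))  ; " Account Id"

;; Map processing: zipmap pairs a sequence of keys with a sequence of values.
(def pairs (zipmap ["a" "b"] [1 2]))                        ; {"a" 1, "b" 2}

;; Formatting: the ~$ directive prints two decimal digits by default.
(def money (cl-format nil "~$" 80.12389))                   ; "80.12"

;; IO: with-open closes the stream when the body exits.
(def first-byte
  (with-open [in (io/input-stream (.getBytes "abc"))]
    (.read in)))                                            ; 97, the byte for \a
```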

By reasoning on the aspects of the problem we need to solve, we listed a few functions or macros that might be helpful. The next step is to verify our assumptions and rewrite the example:

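(require '[clojure.java.io :as io])
(require '[clojure.xml :as xml])
(require 'clojure.string) ; used fully qualified below
(require 'clojure.pprint) ; used fully qualified below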

(defn- to-double [k m]
  (update-in m [k] #(Double/valueOf %)))

(defn parse [xml] ; <1>
  (with-open [xml-in (io/input-stream (.getBytes xml))]
    (->> (xml/parse xml-in)
         :content
         (map #(hash-map (:tag %) (first (:content %))))
         (apply merge)
         (to-double :currentBalance))))

(defn separate-words [s]
  (-> (str (.toUpperCase (subs s 0 1)) (subs s 1))    ; <2>
      (clojure.string/replace #"([A-Z][a-z]*)" "$1 ") ; <3>
      clojure.string/trim))

(defn format-decimals [v]
  (if (float? v)
    (clojure.pprint/cl-format nil "~$" v) ; <4>
    v))

(defn print-balance [xml]
  (let [balance (parse xml)
        ks (map (comp separate-words name) (keys balance))
        vs (map format-decimals (vals balance))]
    (zipmap ks vs))) ; <5>

(print-balance balance)
;; {"Account Id" "3764882",
;;  "Last Access" "20120121",
;;  "Current Balance" "80.12"}

  1. parse now avoids the let block and the annoying chore of closing the input stream explicitly, thanks to the with-open macro. The ->> threading macro gives a linear flow to the previously nested XML processing.
  2. subs makes it really easy to process sub-strings. We don't need an additional function anymore because turning the first letter to upper-case is now a short one-liner.
  3. The key function in the new separate-words version is clojure.string/replace. The regex finds groups of one upper-case letter followed by lower-case letters. The last argument conveniently offers the possibility to refer to matching groups. We just need to append a space.
  4. format-decimals delegates almost completely to clojure.pprint/cl-format, which does all the work of formatting decimals.
  5. zipmap brings in another dramatic change in the way we process the map. We can isolate changes to the keys (composing word separation and removal of the unwanted ":") and changes to the values into two separate map operations. zipmap conveniently combines them back into a new map without the need for reduce or assoc.
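Splitting keys and values apart is only safe because keys and vals are documented to return their results in the same order as (seq map), so the two sequences stay aligned. A small sketch of that property follows; the map m is a hypothetical stand-in for the parsed balance.

```clojure
;; Hypothetical stand-in for the result of (parse balance).
(def m {:accountId "3764882"
        :lastAccess "20120121"
        :currentBalance 80.12389})

;; keys and vals walk the entries in the same order, so zipping them
;; back together reproduces the original map.
(def round-trip (zipmap (keys m) (vals m)))

;; Transforming only the keys keeps every value attached to the right key.
(def renamed (zipmap (map name (keys m)) (vals m)))
```

This holds when both sequences come from the same map value; sequences taken from two different maps carry no such alignment.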

The second example shows an important fact about "knowing your tools" (in this case the Clojure standard library): the use of a different set of functions not only cuts the number of lines from 45 to 30, but also opens up the design to completely different decisions. Apart from the cases where we delegated entire sub-tasks to other functions (like cl-format for decimals or name to clean a key), the main algorithmic logic took a different approach that does not use reduce or assoc. A solution that is shorter and more expressive is clearly easier to evolve and maintain.

The well-kept secret of the Clojure Ninja

Learning about the functions in the standard library is usually a process that starts at the very beginning. It happens when you first approach a tutorial or book, for example when the author shows a beautiful one-liner that solves an apparently big problem.

Usually developers don't pay explicit attention to the functions in the standard library, assuming knowledge will somehow increase while studying the features of the language. This approach can work up to a certain point, but it is unlikely to scale. If you are serious about learning the language, consider allocating explicit time to understand the different nuances of similar functions or the content of some obscure namespace. The proof that this is time well spent can be found by reading other people's experiences: the web contains many articles describing the process of learning Clojure or documenting discoveries (possibly the best example is Jay Fields' blog).

The following is a trick that works wonders for becoming a true Clojure Ninja. Along with learning tools like tutorials, books or exercises like the Clojure Koans, consider adding the following:

  • Choose a specific time of the day to select a function from the Clojure standard library. It could be lunch or commuting time, for example.
  • Study the details of the function sitting in front of you. Look at the official docs first, try out examples at the REPL, search the web or www.github.com for Clojure projects using it.
  • Try to find where the function breaks or other special corner cases. Pass nil or unexpected types as arguments and see what happens.
  • Rinse and repeat the next day.
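A session of this kind could look like the following sketch. The choice of some as the function of the day and the names some-source, probes and inc-nil are arbitrary; doc and source-fn come from clojure.repl, and source-fn returns nil when the source file is not on the classpath.

```clojure
(require '[clojure.repl :refer [doc source-fn]])

;; Read the official docstring (doc prints to standard output).
(doc some)

;; Fetch the source as a string, if it is available on the classpath.
(def some-source (source-fn 'clojure.core/some))

;; Probe corner cases: many core functions treat nil as an empty collection.
(def probes
  {:first-nil (first nil)      ; nil
   :count-nil (count nil)      ; 0
   :conj-nil  (conj nil 1)     ; (1)
   :str-nil   (str nil)        ; ""
   :get-nil   (get nil :k)})   ; nil

;; Others are not so forgiving: catch the exception to record the failure.
(def inc-nil
  (try (inc nil)
       (catch Exception e :threw)))
```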

Don't forget to open up the sources for the function, especially if it belongs to the "core" Clojure namespace. By looking at the Clojure sources, you have the unique opportunity to learn from the work of Rich Hickey and the core team. You'll be surprised to see how much design and thinking goes into a function in the standard library. You could even find the history of a function intriguing, especially if it goes back to the origins of Lisp: apply, for example, links directly to the MIT AI labs where Lisp was born in 1958! (eval and apply are at the core of the meta-circular interpreter of Lisp fame. The whole Lisp history is another fascinating read on its own. See any paper from Herbert Stoyan on that matter.) Only by expanding your knowledge about the content of the standard library will you be able to fully appreciate the power of Clojure.

Summary

  • The standard library is the collection of functions and macros that come out of the box when installing Clojure.
  • The Clojure Standard Library is rich and robust, allowing developers to concentrate on core business aspects of an application.
  • Deep knowledge of the content of the Standard Library significantly improves code expressiveness.
  • Knowing what's inside the standard library can speed up development or even lead to a better functional design.