Posted on August 20, 2020

Actually, Maybe is Great

Like Programming in Haskell in General

One of the reasons I like Haskell is because of how much it simplifies dealing with certain classes of problems. Take dealing with potentially missing values, for example. Some languages use some form of NULL to attempt to address this, essentially incorporating NULL as a member of every type.

In Haskell, the absence of NULL and the presence of algebraic data types and typeclasses allows for a tremendously powerful and flexible approach to working with these kinds of optional values: Maybe lets us avoid the hazards and frustration in dealing with NULL values, and helps us eliminate boilerplate significantly.

Preface

The catalyst for this post was a discussion I had with my colleagues about NULL vs. Maybe. Fairly quickly after the discussion began, it became clear to me that many believe using Maybe throughout a codebase is tedious and boilerplate-heavy.

I would like to show why this is not the case using clear and simple examples in Haskell, building up to a guided refactoring of some code used in a real application. I hope that even readers without much Haskell experience will find these examples understandable, but some familiarity with basic Haskell syntax is assumed: I expect you’ll be able to read basic type signatures, understand how variables and functions are declared and used, and have a passing familiarity with do notation. Regardless, if you’re otherwise an experienced programmer in some other language, I’m hoping that if you squint you should be able to understand a lot of these examples, regardless of your familiarity with Haskell.

Some of these examples will make it obvious how they provide utility, others will seem contrived and may take imagination to appreciate. My goal is to impress upon the reader how the features I mentioned above (ADTs and typeclasses) allow for designing around sets of lawful operations–operations which you can compose freely within the bounds of their lawful behavior. This approach makes building flexible, modular components quite enjoyable once you’ve started to understand how things fit together.

I am hoping this piece will fall solidly in the lower third of the Haskell pyramid, somewhere around the “productive” line. I also hope that, by the end of this piece, you will have developed some intuition of how much utility is provided by Maybe and more importantly, the language features that make Maybe nice to work with in the first place.

The Haskell Approach

A common sort of example given when introducing Maybe is to use a case expression or function with pattern matching, like so:

maybeInc :: Maybe Int -> Maybe Int
maybeInc mi =
    case mi of
      Just s -> Just (s + 1)
      Nothing -> Nothing

-- equivalent to

maybeInc :: Maybe Int -> Maybe Int
maybeInc (Just s) = Just (s + 1)
maybeInc Nothing = Nothing

-- ghci
λ> maybeInc (Just 1)
Just 2
λ> maybeInc Nothing
Nothing
λ> 

Unfortunately, these simple examples barely hint at what you get by making optionality first-class.

In fact, I don’t have to write functions like the one above, because Maybe has a Functor typeclass instance. This means we can “lift” functions into the Maybe context with fmap; I can directly apply functions as-is to any Maybe value:

λ> fmap (+ 1) (Just 1)
Just 2
λ> fmap (+ 1) Nothing
Nothing
λ> (+ 1) <$> Just 1 -- (<$>) is Applicative infix syntax for fmap
Just 2

It is exactly the fact that we’ve been able to make this concept of optionality first class–or to be more accurate, the concept of a tagged union first class–that makes it possible for us to apply general operations over it in a concise fashion.

The Power of Typeclasses

Taking a quick glance at some of the instances defined for Maybe we see:

data Maybe a = Nothing | Just a 	-- Defined in ‘GHC.Maybe’
instance Applicative Maybe -- Defined in ‘GHC.Base’
instance Eq a => Eq (Maybe a) -- Defined in ‘GHC.Maybe’
instance Functor Maybe -- Defined in ‘GHC.Base’
instance Monad Maybe -- Defined in ‘GHC.Base’
instance Semigroup a => Monoid (Maybe a) -- Defined in ‘GHC.Base’
instance Ord a => Ord (Maybe a) -- Defined in ‘GHC.Maybe’
instance Semigroup a => Semigroup (Maybe a)
  -- Defined in ‘GHC.Base’
instance Show a => Show (Maybe a) -- Defined in ‘GHC.Show’
instance Read a => Read (Maybe a) -- Defined in ‘GHC.Read’
instance Foldable Maybe -- Defined in ‘Data.Foldable’
instance MonadFail Maybe -- Defined in ‘Control.Monad.Fail’
instance Traversable Maybe -- Defined in ‘Data.Traversable’
...

Each typeclass instance represents a useful set of general, composable functions defined for the Maybe type. These same typeclasses have instances defined for many other standard types in Haskell, and are very often provided for types in libraries as well, so they present a consistent interface that anyone familiar with the language will immediately recognize. Many of these typeclasses have associated laws which provide guarantees when working with them.

This is just a subset of what is available for Maybe, and there are many more instances out there in various libraries in the wild. That said, in the list above are some of the most fundamental and commonly-used typeclasses that you’ll encounter in Haskell.

Let’s take some examples from the above to see what we get out of the box.

We can compare Maybe instances if we can compare what they contain (Ord):

λ> Just 1 > Just 2
False
λ> Nothing > Just 2
False
λ> Just 3 > Just 2
True
λ> 

We can append Maybe instances together if we can apply that same operation to their contents (Semigroup/Monoid):

λ> Just "foo" <> Just "bar"
Just "foobar"
λ> Just "foo" <> Nothing
Just "foo"
λ> Just [1] <> Just [2] -- etc.
Just [1,2]

We can apply functions to the contents as you’ve already seen (Applicative):

-- <$> is infix fmap
λ> (\a' b' -> if (a' > 0) then a' else b') <$> (Just 2) <*> (Just 1)
Just 2
λ> (\a' b' -> if (a' > 0) then a' else b') <$> (Just 0) <*> (Just 1)
Just 1

As hinted at previously, another way to think about this is as if we are lifting functions into the Applicative context:

λ> (liftA2 (\a b -> if (a > 0) then a else b)) (Just 2) (Just 1)
Just 2
λ> (liftA2 (\a b -> if (a > 0) then a else b)) (Just 0) (Just 1)
Just 1

In fact, Applicative gives us the ability to think about “effects” or an “effectful context” distinctly from the “wrapped” values we’re working with.

For example, let’s say we are building a security service for authenticating users and setting their permissions. We have to run a sequence of checks, each of which has to be valid in order for someone to be logged-in. Additionally, with our last check we want to confirm that the user has a score above a certain threshold, which will give us an extra bit of information by which to filter the actions available to the user.

One way we can model this is to use a new sum type to represent our sequence of checks, and wrap those in Maybe to represent the success or failure of a given check–Nothing indicates failure of any check, and the Bool value we get at the end in the presence of a non-Nothing value tells us if our user’s score was great enough.

(If it’s not obvious, please understand that this is almost certainly not how we’d actually model this in Haskell, or any other language, I’d imagine.)

data Checks = Authenticated | Authorized | DogLikeabilityScore { score :: Int }
  deriving Show

Then we can run our sequence of checks like this:

λ> (>200) . score <$> (Just Authenticated *> Just Authorized *> Just (DogLikeabilityScore 10))
Just False
λ> (>200) . score <$> (Just Authenticated *> Just Authorized *> Just (DogLikeabilityScore 201))
Just True
λ> (>200) . score <$> (Just Authenticated *> Nothing *> Just (DogLikeabilityScore 201))
Nothing
λ> (>200) . score <$> (Nothing *> Just Authorized *> Just (DogLikeabilityScore 201))
Nothing
λ> 

*> is a function in the Control.Applicative package, with the type Applicative f => f a -> f b -> f b. It has a companion <* with the type Applicative f => f a -> f b -> f a. They allow you to process your effects and take or ignore the corresponding values in different order; they differ only in which Applicative value they return.

Using this Applicative approach, we are able to short circuit our chain of Checks so that we fail immediately when any one of their effects evaluates to Nothing. At the same time we are able to apply a function against the value of our final Maybe instance only, without boilerplate.

We’re not limited to this ordering either. Let’s say that, for some reason, we need to calculate our DogLikeabilityScore before the user is Authorized, but after they are Authenticated:

λ> (>200) . score <$> (Just Authenticated *> Just (DogLikeabilityScore 201) <* Just Authorized)
Just True
λ> 

I want to highlight the fact that, not only are we able to work with Maybe values without boilerplate, but we have a streamlined way to work with the separate “layers” of Maybe values at the same time; we can process the effects in parallel, but distinctly. This is not just safer and nicer than working with NULL values or any of its many iterations, but it is far more general and powerful.

However, there’s further we can go with power and flexibility (by sacrificing some generality and composability): with Monad we can make decisions about whether or not to process an effect within a context. We also get do notation, which makes it easy to exploit the flexibility provided by Monad.

Let’s revisit the function we were previously lifting into an Applicative context, and make a Monad-based version:

posIntPred_App :: Maybe Int -> Maybe Int -> Maybe Int
posIntPred_App = liftA2 (\a b -> if (a > 0) then a else b)

posIntPred_Mon :: Maybe Int -> Maybe Int -> Maybe Int
posIntPred_Mon a b = do
  as <- a
  if (as > 0) then return as else b

Or, maybe it’s more clear if we write this like so:

posIntPred_Mon' :: Maybe Int -> Maybe Int -> Maybe Int
posIntPred_Mon' a b = a >>= (\a' -> if (a' > 0) then return a' else b)

The point is that Monad lets us selectively process computations as we see fit. It gives us the power of control, where Applicative can only chain effectful computations together (which is sometimes just what is needed).

This may be more obvious if we compare the two functions’ behavior with Nothing values. Here, we see short-circuiting behavior–we cannot control whether or not the Nothing we pass triggers a “Nothing cascade:”

λ> posIntPred_App (Just 1) Nothing
Nothing

Whereas with Monad we never evaluate our second argument:

λ> posIntPred_Mon (Just 1) Nothing
Just 1

As an aside, I want to note that we can also implement this function like so:

posIntPred_Ord :: Maybe Int -> Maybe Int -> Maybe Int
posIntPred_Ord a b = if (a > (Just 0)) then a else b

It’s important not to lose the forest for the trees sometimes, especially when you have such a rich set of tools to choose from!

A (More) Practical Example

There are many more typeclasses to explore, but some of the most important foundational ones have been introduced, so let’s examine a bit of code that actually does something useful.

We’ll start with something that’s a bit of a mess, and see how we can structure our Maybe computations more effectively as we refactor:

type CacheString = String
type Cache = M.Map Address ForecastJson
type Address = T.Text
type ForecastJson = Value

...

getFirstEntryEndTime :: ForecastJson -> Maybe LT.ZonedTime
getFirstEntryEndTime forecastJson = (UTC.iso8601ParseM . T.unpack . fromJust)
  (forecastJson ^? key "properties" . key "periods" . _Array . _head . key "endTime" . _String)

mainApp :: (MonadIO m, MonadHttp m) => T.Text -> m ()
mainApp address = do
  cache <- liftIO (readCache ".cache")
  now <- liftIO LT.getZonedTime
  forecastJson <- case (M.lookup address cache) of
                    Just cachedJson -> let firstEntryEndTime = fromJust (getFirstEntryEndTime cachedJson)
                                       in if (LT.zonedTimeToLocalTime firstEntryEndTime >
                                               LT.zonedTimeToLocalTime now)
                                          then liftIO (putStrLn "Using cached JSON response")
                                               >> pure cachedJson
                                          else (getLatestForecast address cache)
                    Nothing         -> (getLatestForecast address cache)
  liftIO (putStrLn (D.renderForecast forecastJson))
  let updatedCache = M.insert address forecastJson cache
  liftIO (writeCache ".cache" updatedCache)

Yikes. This represents the main application logic for a little weather forecast utility. Everything is executing inside of a monadic context. In this case we’re simply defining the constraints we know we need–MonadIO for file-handling and printing, MonadHttp for making calls with Req. If this isn’t making a lot of sense right now, don’t worry–just think back to the do notation we introduced, and consider the MonadIO and MonadHttp constraints as dictating what we’re allowed to do within that do block.

Within the body you’ll see liftIO a lot, which is boilerplate that’s needed when running IO actions in this kind of constrained monadic context; it may be helpful here to you if it’s unclear which actions are IO actions. The MonadHttp constraint is only really relevant in considering the calls to getLatestForecast, where the actual HTTP requests to various services are made, when we aren’t using a cached response.

So, what’s wrong with this code? Well, for one, the branching logic used to calculate forecastJson is a muddle of concerns. It includes a lot of nesting and a confusing mix of a case (to process the Maybe value Data.Map.lookup returns) and an if (to check that our initial cached JSON entry’s end time is not older than now) inside a let. We are also calling getLatestForecast multiple times, when we should ideally be able to collapse our failure states into a single logical flow and call it just once.

More generally, we are dealing with a ton of Maybe values all over the place which we should be able to handle better and more coherently. In a few places we are even risking program failure because of calls to the partial function fromJust–this is handy when testing code out, but is a bad idea for anything where you want to ensure robustness.

First let’s write an idealized version of this function and see if we can write code to support that:

mainApp :: (MonadIO m, MonadHttp m) => T.Text -> m ()
mainApp address = do
  cache <- liftIO (readCache ".cache")
  forecastJson <- getForecast address cache
  liftIO (putStrLn (D.renderForecast forecastJson))
  let updatedCache = M.insert address forecastJson cache
  liftIO (writeCache ".cache" updatedCache)

Much cleaner! We’ve removed anything that isn’t strictly about reading/writing the cache or dumping out our formatted forecast table and pushed it into a separate function, which only cares about getting the forecast.

Of course, now we actually have to convert the rest of this mess (along with our getFirstEntryEndTime function) into something nicer:

  now <- liftIO LT.getZonedTime
  forecastJson <- case (M.lookup address cache) of
                    Just cachedJson -> let firstEntryEndTime = fromJust (getFirstEntryEndTime cachedJson)
                                       in if (LT.zonedTimeToLocalTime firstEntryEndTime >
                                               LT.zonedTimeToLocalTime now)
                                          then liftIO (putStrLn "Using cached JSON response")
                                               >> pure cachedJson
                                          else (getLatestForecast address cache)
                    Nothing         -> (getLatestForecast address cache)

First, let’s create the shell of a function which will at least type-check, and which fits into the “hole” we’ve created in our idealized sketch of the main flow above:

getForecast :: (MonadIO m, MonadHttp m) => Address -> Cache -> m ForecastJson
getForecast address cache = undefined

Pretty stupid, but it typechecks (and shows us why, if you hear a Haskell programmer say “if it typechecks it works,” you should hope they are being sarcastic). It has the MonadHttp constraint we know we’ll need to look up the forecast with getLatestForecast, and MonadIO for getting our current time and printing out a message.

Let’s make it work. We know we’re going to need our forecastJson at the end, so let’s get the logic for that in place, and we’ll keep putting placeholders in as needed. One goal we have is to centralize the logic around whether our not our cache can supply a value for this specific address within the right time range. We also want to avoid duplicated calls to getLatestForecast, but that should fall out of our first goal:

getCachedForecastJson :: LT.ZonedTime -> Address -> Cache -> Maybe ForecastJson
getCachedForecastJson = undefined

getForecast :: (MonadIO m, MonadHttp m) => Address -> Cache -> m ForecastJson
getForecast address cache = do
  now <- liftIO LT.getZonedTime
  case (getCachedForecastJson now address cache) of
    Just cachedJson -> liftIO (putStrLn "Using cached JSON")
                       >> pure cachedJson
    Nothing         -> getLatestForecast address

The logic about whether or not we can use a cached value is central. The only other thing we do is get a timestamp for the current time, and we dump out a notification to the user when we end up using our cache.

In the case we determine we cannot use a cached value, we look up the forecast in getLatestForecast. And, note that we’ve naturally structured the logic such that we only call getLatestForecast once here. (Examining that function is outside the scope of this post, but you can read the entire codebase for this utility here.)

We explicitly check our Maybe values with a case expression here, in the style of our first example in this piece. But here this is appropriate–this explicitness illuminates a crucial branch point in the computation, where we engage in one of two very different behaviors depending on the outcome.

Taking Advantage of the Maybe Monad

While we wanted to be explicit in our last refactoring, now we’d like to chain all the remaining computations together into a single computation that returns a single Maybe value–this is what we need to implement getCachedForecastJson:

case (M.lookup address cache) of -- we need to do this lookup, and
                                 -- everything beneath here up to
                                 -- where we call getLatestForecast
  Just cachedJson -> let firstEntryEndTime = fromJust (getFirstEntryEndTime cachedJson)
                     in if (LT.zonedTimeToLocalTime firstEntryEndTime >
                             LT.zonedTimeToLocalTime now)
                        then liftIO (putStrLn "Using cached JSON response")
                             >> pure cachedJson
                        -- this is in getForecast
                        else (getLatestForecast address cache)
                        -- this is in getForecast
  Nothing         -> (getLatestForecast address cache)

But before we tackle that, remember our getFirstEntryEndTime function?

getFirstEntryEndTime :: ForecastJson -> Maybe LT.ZonedTime
getFirstEntryEndTime forecastJson = (UTC.iso8601ParseM . T.unpack . fromJust)
  (forecastJson ^? key "properties" . key "periods" . _Array . _head . key "endTime" . _String)

I’ll briefly explain what this does. First, using lens, a possibly-empty “endTime” is extracted from the cached JSON. Then it’s converted from Text into a String as that’s what UTC.iso8601ParseM expects (string types are a bit of a wart in Haskell, you can read more about it here–and, incidentally, that whole article is great, as is everything on that blog, check it out).

(An aside for those curious about lens: think of lens as a very generalized method of querying and manipulating all kinds of data in Haskell. In this case we are able to use it to drill into a value of type…Value, which is supplied by the library aeson (and these specific lenses are actually supplied by the library lens-aeson). A Value is essentially the Haskell equivalent of JSON data, so it supplies all the same Object, String, etc. types. So here lens gives us the ability to extract values from JSON data and manipulate it as we see fit, but it can be used for much more than that.)

However in this first version I cheated and used a partial function just to get it working, because I was reasonably certain I would always have a value at the end of my preview (^?) into the JSON data. This may be true, but, it may also be a bad assumption. Bottom line, I’d prefer to avoid partial functions whenever I can.

First, the text conversion and date parsing bit will go inline into to the body of getCachedForecastJson (see below), and all of the “lensy” stuff will be split out into its own isolated function. This just keeps things a bit cleaner, and since we are wrapping up all our Maybe logic in a single function, it makes sense to move those functions there anyways.

getFirstEntryEndTimeStr :: ForecastJson -> Maybe T.Text
getFirstEntryEndTimeStr = preview ( key "properties"
                                  . key "periods"
                                  . _Array
                                  . _head
                                  . key "endTime"
                                  . _String )

Finally, we get to getCachedForecastJson. Here we can leverage the Monad instance for Maybe. This is where we can really see how powerful this approach is:

getCachedForecastJson :: LT.ZonedTime -> Address -> Cache -> Maybe ForecastJson
getCachedForecastJson now address cache = do
  cachedJson <- M.lookup address cache
  firstEntryEndTime <- getFirstEntryEndTimeStr cachedJson
                       >>= UTC.iso8601ParseM . T.unpack :: Maybe LT.ZonedTime
  let isNowAfterEndTime = LT.zonedTimeToLocalTime firstEntryEndTime > LT.zonedTimeToLocalTime now
  cachedJson <$ guard isNowAfterEndTime

Notice anything? There’s no pattern matching, or explicit case, or if handling of Maybe values anywhere here. Nonetheless we’ve eliminated all calls to the partial function fromJust. The only mention of Maybe is where we explicitly annotate our UTC.iso8601parseM function to ensure it returns a Maybe LT.ZonedTime, and of course in the type signature. We could actually leave it out, but it’s good practice to include, and provides useful documentation–a glance at the type signature tells us we’re dealing with Maybe values in our do block, as opposed to some other Monad instance.

Boilerplate is basically…NULL (sorry).

I’d like to discuss a few of the pieces here in more depth to highlight how all of Haskell’s standard library and typeclasses fit together to enable this:

The function UTC.iso8601parseM is actual far more general; its type signature is (MonadFail m, UTC.ISO8601 t) => String -> m t. MonadFail basically provides a fail function that is called when a pattern match in do notation fails. Here we leverage the fact that there is a Maybe instance of fail–it simply returns Nothing.

Similarly we leverage the Alternative instance for Maybe when we convert the Bool results of our isNowAfterEndTime predicate to a Maybe value with <$ guard. You can think of Alternative like a combination of Applicative and Monoid. If that means nothing, the important thing here is that, in particular, it provides an empty function which evaluates to Nothing (see the docs for guard and (<$) for more).

We have the flexibility to use intermediate values like cachedJson throughout our do block, but we can also simply chain these calculations together when we don’t need to work with the intermediate values directly–see how we bind (>>=) the output of getFirstEntryEndTime cachedJson to UTC.iso8601parseM . T.unpack–all we care about is passing the output of one calculation to the next.

At any point a Nothing value may result from one of these calculations, but we defer handling it and simply chain our Maybe calculations together, returning a single Maybe value at the end. The evaluation of our Maybe effect happens only at the point we need to make a decision–in the case expression in getForecast.

The final code, in its entireity:

getFirstEntryEndTimeStr :: ForecastJson -> Maybe T.Text
getFirstEntryEndTimeStr = preview ( key "properties"
                                  . key "periods"
                                  . _Array
                                  . _head
                                  . key "endTime"
                                  . _String )

getCachedForecastJson :: LT.ZonedTime -> Address -> Cache -> Maybe ForecastJson
getCachedForecastJson now address cache = do
  cachedJson <- M.lookup address cache
  firstEntryEndTime <- getFirstEntryEndTimeStr cachedJson
                       >>= UTC.iso8601ParseM . T.unpack :: Maybe LT.ZonedTime
  let isNowAfterEndTime = LT.zonedTimeToLocalTime firstEntryEndTime > LT.zonedTimeToLocalTime now
  cachedJson <$ guard isNowAfterEndTime

getForecast :: (MonadIO m, MonadHttp m) => Address -> Cache -> m ForecastJson
getForecast address cache = do
  now <- liftIO LT.getZonedTime
  case (getCachedForecastJson now address cache) of
    Just cachedJson -> liftIO (putStrLn "Using cached JSON")
                       >> pure cachedJson
    Nothing         -> getLatestForecast address

mainApp :: (MonadIO m, MonadHttp m) => T.Text -> m ()
mainApp address = do
  cache <- liftIO (readCache ".cache")
  forecastJson <- getForecast address cache
  liftIO (putStrLn (D.renderForecast forecastJson))
  let updatedCache = M.insert address forecastJson cache
  liftIO (writeCache ".cache" updatedCache)

A bit longer, but much cleaner and easier to understand. This is because of, not in spite of, the ubiquity of Maybe values in Haskell.

Afterword

It’s hard to reconcile the examples just explored with the claim that using Maybe leads inevitably to boilerplate and suffering.

Having considered this for some time, I’ve come to the conclusion that many programmers base their assumptions about Maybe on their own experience of dealing with NULL. That is, I imagine most programmers who have worked in a language with NULL have been faced with the situation where something blew up because of a NULL value, and they were forced to deal with it. Extrapolating from there, I can imagine how most programmers would assume that having Maybe values everywhere something could be NULL would end up littering their code with if/else checks where (maybe) none were necessary before.

I believe this is partially because, ironically, the presence of NULL encourages programmers to simply stop thinking about potential missing values in their code, and especially in a dynamically-typed language, ends up promoting a reactionary approach to safety in the presence of potential missing values. After all, when any value can be NULL, it can be hard to know where to go once you’ve taken care of the obvious places–and that’s not even taking into consideration whatever philosophy of NULL-handling library authors in your language may subscribe to (not much of one, in my experience).

So, I can understand how anyone with this kind of experience would make the same assumption about Maybe. It’s not easy to imagine an alternative if you haven’t really experienced it. And for most programmers who haven’t spent time with Haskell, it’s far from obvious how nice it is to work with and how well it solves this problem.

I hope this piece helps make that more clear.


Thank you very much for reading this far! I invite you to drop me a line if you have any feedback–my address is ddellacosta at Google’s famous email service.

You can get the code written for the second half of this post here. I welcome feedback on that code as well.

If this piece has piqued your interest in Haskell and you’d like to learn more, I recommend Graham Hutton’s Programming in Haskell. There are also a bunch of learning resources linked from the main Haskell site, and the Haskell Reddit is a good source of news about Haskell and the community, with resources listed on the sidebar there as well.

Finally, thanks to merijn in #haskell on freenode for the <$ guard technique.