Michael Xavier

blog $ takeWhile bloggable thoughts

Projects GitHub Resume Email All Posts RSS

  • GHC Type-Level Literals for Pagination

    March 25, 2017

    I recently came up with a pretty nice idea to use type-level literals that arrived in GHC 7.10. I hadn’t directly used type-level literals before so this also served as a pretty gentle introduction to them.

    Type-level Naturals

    I don’t intend to completely cover type-level literals or type-level naturals here. The Haskell wiki has a decent jumping off point for reference. Here’s how I like to think about it:

    Haskell’s type system is powerful and only getting more powerful over time. Beyond just letting you define types and checking them as you construct a program, it is also starting to make it easier for the developer to say “Hey GHC, here’s some type info. Please remember it because I’m going to need you to reference it later.” I do this all the time with phantom types: type variables that have no bearing on the data but can be used for type-level assertions later:

    data S3Ref a = S3Ref ObjectKey
    
    storeOnS3 :: (Serialize a) => S3Ref a -> a -> m ()
    
    getFromS3 :: (Deserialize a) => S3Ref a -> m (Either DeserializeError a)

    The above pseudocode gives us a type variable to keep information just for the type system: even though an S3 object key is essentially a piece of text, we let the developer associate it with a type. The compiler will remember that type, assert you don’t mix it up with another type and recover it in the implementation to produce the expected value. Pretty neat!

    Type-level naturals are a baby step forward: they let us put natural numbers (positive integers) into the type system and recover them at runtime. You can do a lot of fancy stuff with this ability, but it got me thinking: what common functionality do I deal with that have the concept of a positive integer and how could it benefit from lifting that into the type system?

    Pagination

    I work on web apps in Haskell quite a bit. In pretty much any API or traditional web app I write, I always have some sort of “listing” endpoint: users, products, etc. In most of those cases I have to implement pagination: the ability to get the data 1 page at a time, because there ends up being too much data to reasonably return it all in one request. Simple pagination tends to have 2 components: the page number (1 and onward) and the number of records per-page to return. We’ll be discussing per-page in this article because it has some interesting properties:

    1. Per-page must be a positive integer (sound familiar?)
    2. Per-page must have a limit. If sending a whole table is too costly, we should not let a user do it by setting a too-high per-page count.
    3. It often makes sense to set the per-page limit on a per-resource basis. If we have a very compact resource or if we know that users need quite a few of them at a time, it may make sense to have a higher limit than a resource where each item is always much larger and more costly to send over the wire.

    First Attempt: Smart-Constructor

    I’m a big fan of smart-constructors. In short, its a technique in Haskell where you do not export the default constructor for a type from the module where you define it. You only export the type and a way to construct it and retrieve its internals. Once you test the constructor thoroughly for correctness, you can be certain that any time outside of the module where you see the value that it maintains the invariants you set forth.

    So for per-page we want to make it impossible to set a value too low (0 or lower) or too high. Previously I did this:

    module Pagination
        ( PerPage
        , perPage
        , mkPerPage
        , PerPageError(..)
        ) where
    
    newtype PerPage = PerPage
      { perPage :: Int
      } deriving (Show, Eq, Ord)
    
    
    data PerPageError = TooSmall
                      | TooBig
                      deriving (Show, Eq)
    
    mkPerPage :: Int -> Either PerPageError PerPage
    mkPerPage n
      | n <= 0 = Left TooSmall
      | n >= 20 = Left TooBig
      | otherwise = Right (PerPage n)

    This is pretty good. It meets our criteria for restricting values but it is inflexible. We cannot specialize it to each use case. In order to do that we’d probably have to:

    1. Write one of these constructors for each different resource. A UserPerPage, a ProductPerPage, etc. Also, to get that module separation, we’d have to dump all these pagination constructors into a separate module from where we’d use it.
    2. We could let you specify a limit in the mkPerPage constructor. Indeed, you may even end up doing this in option 1 to DRY things up. The type would look like mkPerPage :: Int -> Int -> Either PerPageError PerPage. That doesn’t seem great since it would be easy to screw up the arguments and pass the user-supplied pagination as the limit!

    Improving PerPage with Type-Level Naturals

    What if we could have just one type to deal with pagination and leave the maximum up to the use site? After all, the only time we reference the maximum is once in the constructor. Here’s what I came up with:

    module Pagination
        ( PerPage
        , perPage
        , mkPerPage
        , PerPageError(..)
        ) where
    
    import qualified Data.Proxy as P
    import qualified GHC.TypeLits as TL
    
    newtype PerPage (max :: TL.Nat) = PerPage
      { perPage :: Int
      } deriving (Show, Eq, Ord)
    
    
    data PerPageError = TooSmall
                      | TooBig
                      deriving (Show, Eq)
    
    mkPerPage :: (TL.KnownNat max) => Int -> Either PerPageError (PerPage max)
    mkPerPage n
      | n <= 0 = Left TooSmall
      | n >= (fromInteger (TL.natVal (P.Proxy :: P.Proxy max))) = Left TooBig
      | otherwise = Right (PerPage n)

    We add a type variable, max to our PerPage type. Its like a phantom type because it is not used in the actual data structure. We’re asking the compiler to remember max because we will use it later. We don’t take just any max though, we take a max of type Nat which is short for Natural. We say that max can be one of many (in fact, infinite) types that fall under the umbrella of natural numbers.

    In our constructor, we add the constraint KnownNat, which is always present for natural numbers. The constraint means that the compiler has remembered which natural number is inhabiting max and can retrieve it for us whenever we want.

    Lastly, we have (fromInteger (TL.natVal (P.Proxy :: P.Proxy max))). Proxy is basically a general purpose type with 1 phantom type variable:

    data Proxy a = Proxy

    It is a handy way to refer to things at the type level when you don’t have anything at the value level. natVal takes a Proxy referring to a known natural number and gives you at the value level that number as an Integer. Lastly, we use fromInteger to convert it from Integer -> Int.

    So how do we use this? Let’s say we’re writing users code and our app decides that 20 is the maximum number of users we can return per page. It would look like this:

    type UserPerPage = PerPage 20

    That’s it! We can now specialize any of our user-facing code to safely limit the pagination. If we want a function with unrestricted pagination, say for internal scripts, we can do that too!

    -- Anything at the web layer that we don't trust must be limited to 20 per page.
    untrustedGetUsers :: PageNum -> UserPerPage -> m [User]
    untrustedGetUsers = internalGetUsers -- internalGetUsers is just a more generalized version of this
    
    -- We can use this in any code where we're not concerned about fetching too much per page.
    internalGetUsers :: (TL.KnownNat maxpp) => PageNum -> PerPage maxpp -> m [User]
    internalGetUsers = error "todo"
    Comments
  • Enterprise Haskell Pattern: Lensed Reader

    April 3, 2016

    Recently I’ve gotten some feedback from a non-Haskelling colleague that from their perspective, the Haskell community does not discuss design patterns very often. On one hand, I do feel that functional programming languages are not as fertile of a breeding ground for patterns as OOP languages. I suspect that languages that tend towards heavy usage of “nouns” also tend towards heirarchical thinking and classification. Who knows?

    But this is a bit of a cop out. We do use patterns in Haskell. About a year and a half ago I went from being a hobbyist Haskeller to a full-time Haskeller. This was my first opportunity to see “real” Haskell code in the wild. I remember having the same worries before using Haskell in my day job that I would immediately run into problems that would require a pattern I didn’t know. While I think these worries were definitely overblown, I’d like to discuss some of the techniques I’ve picked up in the hopes that other Haskellers looking to use Haskell in “enterprise” software can at least have a jumping off point.

    This article will be discussing what I’ve called the “Lensed Reader” pattern.

    The Precursor: ReaderT-based Transformer Stack

    First and foremost, not long after having the idea to write about this pattern, I came across a wonderful talk that covers many of the points. If you prefer learning through videos, I highly recommend Next Level MTL by George Wilson.

    Most applications I’ve worked on need a big piece of read-only state. Things you’ll commonly find in this state object are:

    • Database connection pools
    • Application configuration
    • Logging environment

    A ReaderT-based monad transformer is usually perfect for this. For things like your logging environment you may want to add namespaces or pause logging from time to time, but thankfully, MonadReader implementations provide a local combinator which temporarily modifies the reader context and restores it automatically, so you don’t need to necessarily resort to MonadState. I usually end up defining a newtype transformer stack at the heart of my application. Its important to use a newtype wrapper to define instances for your stack without resorting to orphaned instances. It’ll end up looking something like this:

    {-# LANGUAGE GeneralizedNewtypeDeriving #-}
    import Control.Monad.Reader
    
    data AppState = AppState -- ..
    
    newtype AppT m a = AppT { unAppT :: ReaderT AppState m a}
                     deriving ( Functor
                              , Applicative
                              , Monad
                              , MonadIO
                              , MonadReader AppState)
    
    runAppT :: AppState -> AppT m a -> m a
    runAppT s m = runReaderT (unAppT m) s

    A couple of quick notes if any of this looks unfamiliar:

    • GeneralizedNewtypeDeriving lets us piggyback on ReaderT’s instances. By and large, if ReaderT r m a has an instance, our stack can get it without any boilerplate. If we need a customized instance, we’re free to write it ourselves.
    • runAppT can be read in the following order:
      1. unAppT unwraps your AppT m a to a plain ReaderT AppState m a.
      2. runReaderT further unwraps it to AppState -> m a
      3. We pass in the AppState and get an m a.

    Baby’s First Whitelabel App

    Now we’re going to set up a real rinky-dink, useless app to demonstrate the technique. All our app can do is log. And we obviously want to be able to license this groundbreaking tech to any outfit willing to pay, so we’ll be able to configure the app to have a configurable name.

    data Config = Config {
          companyName :: String
        }
    
    data AppState = AppState {
          asConfig :: Config
        , asLogger :: String -> IO ()
        }

    Cool! Now we can define some helper functions we’ll need in our app:

    logMsg :: String -> AppT IO ()
    logMsg msg = do
      logger <- asks asLogger
      logger msg
    
    getCompanyName :: AppT IO String
    getCompanyName = asks (companyName . asConfig)

    All this looks great, but there’s a problem. These functions are very specific about the monad they run in. Sure, you can log a message and get the company name in AppT IO, but you can make due with a lot less. Its also a code smell that getCompanyName has IO in its type because it isn’t even doing any IO. There’s virtue in generic functions in Haskell because they communicate the capabilities they require and thus shrink the solution space. No cards hiding in the sleeve as it were. Put another way, you can hide a hell of a lot in IO. If IO in a function is any m that implements Monad, then even if it resolves to IO in the end, we can be sure that this particular function doesn’t avail itself of the evils of IO.

    Also, when you’re specific about your monad stack, you have to throw in lots of lifts when you try to use those functions from deeper in a stack. Its like having a home appliance that only works on the 2nd floor. For instance, say we were using EitherT to encapsulate some operation that could fail and mix it in with our app’s operations.

    import Control.Monad.Trans.Either
    
    -- | Try to download an update for the software
    tryUpdate :: IO (Either String ())
    tryUpdate = return (Left "Psych! Thats the wrong number!")
    
    update :: EitherT String (AppT IO) ()
    update = do
      EitherT tryUpdate -- will abort if there's an error, which there will be
      lift (logMsg "Update complete") -- never gonna happen

    Yuck. Lifting. What if we are a few more layers deep in a monad transformer stack? What if we refactored some of this code somewhere else in the stack? We constantly have to keep track of how many lifts we’ll need to do. Wouldn’t it be nicer is to say that logging and company name can be accessed wherever you have access to AppState?

    Use the MTL!

    import Control.Monad.IO.Class
    
    logMsg :: (MonadIO m, MonadReader AppState m) => String -> m ()
    logMsg msg = do
      logger <- asks asLogger
      liftIO (logger msg)
    
    getCompanyName :: (MonadReader AppState m) => m String
    getCompanyName = asks (companyName . asConfig)
    
    update :: (MonadIO m, MonadReader AppState m) => EitherT String m ()
    update = do
      EitherT (liftIO tryUpdate) -- will abort if there's an error, which there will be
      logMsg "Update complete"

    Great! Here’s what we got:

    • MonadReader AppState m says in this monad, we could call ask and get an AppState. asks lets us refine that a bit with a selector function to just grab a piece of the state.
    • logMsg will run in any monad that has access to AppState and can run IO. These constraints act like capabilities and we only ask for what we need. We could easily create an alternative transformer stack in test that satisfied these constraints.
    • getCompanyName no longer needs IO, which is great because it has no business doing IO.
    • No more lifts!

    More Granularity with Lensed Reader

    In one of my real world applications, I wrote a utility for some analysts. It used the large AppState like record to generate a report. Much to my dismay, I found the analysts were avoiding using it because they didn’t have the databases (like PostgreSQL and Redis) the normal app needed, so when the app loaded up that AppState, the connection pools failed to establish and the whole thing crashed.

    The analysts were being reasonable. The actual task this tool was performing didn’t really need databases. It just needed the config. If all my code was using MonadReader AppState m, then everything would require the whole AppState, even if it wasn’t going to use the whole thing. The solution I arrived at was to break down AppState into just what I needed. So I used classy lenses.

    {-# LANGUAGE GeneralizedNewtypeDeriving #-}
    import Control.Monad.Reader
    
    data Config = Config {
          _companyName :: String
        }
    
    
    data AppState = AppState {
          _asConfig :: Config
        , _asLogger :: String -> IO ()
        }
    
    makeLenses ''AppState
    makeClassy ''Config
    
    instance HasConfig AppState where
      config = asConfig

    That makeClassy gives us something like this:

    class HasConfig a where
      config :: Lens' a Config
      companyName :: Lens' a String
    
    instance HasConfig Config where -- ...

    In other words, we now have a way to specify data types that contain Config. Note that companyName has a default implementation that pulls it off of Config. I’ve heard this type of abstraction refered to as a “seam”. It is a line in the fabric of the code that can be easily opened up and modified if need be.

    The final piece of the puzzle is view from lens, which is just like asks from MonadReader but it takes a lens.

    Now we can have:

    getCompanyName :: (MonadReader r m, HasConfig r) => m String
    getCompanyName = view (config . companyName)

    Take care to note that lenses compose in the opposite direction of functions, so we access config first, then companyName from there. Now, in a reporting function, we can be specific about what context each function needs and hook it up to a lighter context or even a totally different transformer stack.

    heavyReport :: (MonadReader AppState m) => m String
    heavyReport = do
      cn <- getCompanyName
      return (cn ++ " is the best company!")
    
    
    lightReport :: (MonadReader r m, HasConfig r) => m String
    lightReport = do
      cn <- getCompanyName
      return (cn ++ " is the best company!")
    
    runReport :: Config -> String
    runReport = runReader lightReport

    Check that out! We didn’t need AppT or IO. lightReport is just as happy being used in a minimal Reader as it is in our official AppT monad.

    tl;dr

    • Create “classy” lenses for your app’s state type and any subcomponents of it that you are likely to need to access independently. Don’t go too crazy on this. Try to observe YAGNI. It isn’t a huge deal to take some existing code and break up big chunks of state into smaller ones as you see fit. The great thing is that the type system will guide you as to how you’ll need to update your type signatures throughout your application.
    • Use constraints throughout the code instead of concrete transformer stacks. You only end up specifying the stack near main where you actually run the thing.
    • Try to use the minimal set of constraints needed for your functions. Low-level functions end up with smaller sets of constraints. Larger ones accumulate the combined constraints. It can be a bit of a pain having to write out constraints but on the flip side, it is nice to look at a function and see exactly what capabilities are required to implement it. If your function needs a Config, it carries (MonadReader r m, HasConfig r). If it doesn’t have that, GHC will give you a type error and tell you exactly what constraints you’re missing!
    • I’ve noticed that if you configure your projects with -Wall -Werror (and I strongly recommend that you do), GHC 8.0 will warn you about unnecessary constraints, so as your code evolves, if constraints stop being necessary, GHC will remind you to drop them!
    Comments
  • Protect Ya Neck: Enhancing Type Safety of 3rd Party Libraries

    October 24, 2015

    Over the last few years I’ve been transitioning most of the software development work I do from dynamic programming languages that offer very little in the way of type safety (Ruby, JavaScript) to ones that offer more (TypeScript, Haskell, PureScript), with plenty of trips back and forth between the two camps. Because the grass is always greener on the other side, it is easy early on to attribute safety to a language and to have an overly simplified, overly generous definition of what safety is. With more experience under my belt, I’ve come to realize that type safety is not a checkbox, but more of a dial, and while the language you use may control how far in either direction that dial goes, it is for better or worse up to the user to turn the knob. While type-safety is supposed to be a tool where a machine assists you in building more reliable software, its effectiveness is still largely controlled by the user’s knowledge and desire to protect themselves and their user. In the immortal words of The Wu:

    Aint a damn thing changed boy, protect ya neck.

    I may make this a series. Future posts will likely be less rambly and will stick to providing a real world case where the language didn’t automatically protect me, but rather where I recognized I was implementing something risky and chose to protect myself with the tools available.

    Example: Externally-Typed Resources

    At work we use an instrumentation tool. It is conceptually pretty simple: it provides you with some simple tools to instrument your codes with a few different types of measurements, namely counters and timers. It provides a generic backend interface to allow you to ship the collected metrics off to external services for analysis. It is not unlike the ekg package.

    The interface reminds me a lot of redis. It offers some dead simple types where you specify an arbitrary string key and it stores the data there for later processing. Some pseudo-typed examples are:

    -- | Increment a counter stored at the key by 1
    incrementI :: String -> m ()
    -- | Time a computation and store it as a sampling to key
    timeI :: String -> m a -> m a

    This is a usable, easy to understand interface for this library to have. You can’t necessarily expect 3rd party libraries to go much beyond this. As a user though, this is completely unsafe. It would be very easy to mix up the keys and accidentally increment a timer or time a counter. Furthermore, if I had to work with the same timer or counter in multiple places in my code, I could easily mistype the key or change it in one place but not somewhere else and screw up my data.

    Try GADTs!

    I don’t indend to explain GADTs from first principles in this post. Plenty of other sources do that better. What I will explain is why I chose them. GADTs offer some nice properties for solving this problem:

    1. They can easily represent “phantom” type variables (I.E. type variables representing types that are not actually members of the data structure). This is a great way to lift some information up to the type level and have the type checker do what its good at: keep track of it for you.
    2. They can join values with disparate phantom types into a single type for the times when you want to deal with them all in one place.

    Here’s the code I ended up using

    data Counter
    data Timer
    
    data Metric a where
      RequestTime :: Host -> Path -> Metric Timer
      ErrorCount :: Metric Counter
    
    toName :: Metric a -> String
    toName (RequestTime h p) = "request-time-" <> h <> ":" <> p
    toName ErrorCount = "error-count"
    
    incrementI :: Metric Counter -> m ()
    incrementI c = I.incrementI (toName c)
    
    timeI :: Metric Timer -> m ()
    timeI c = I.timeI (toName c)

    This requires the GADTs and EmptyDataDecls extensions. Take note of a few things here:

    1. I use empty data declarations for the metric types. I don’t actually need to construct the value, I use it only at the type level.
    2. I can use richer types than Strings in each Metric if I want to. This is helpful for both expressiveness and to prevent type mixups.
    3. I import the underlying library qualified under I and reexport its functions with enriched types. Everywhere in my app I would import this module and not the underlying library.
    4. All of the metrics my application supports are all in one place.
    5. I wait until the very last second before I use toName, which loses type information.

    Another great thing is you can set up this barrier on the other side of the library as well. Say the underlying library provides these functions:

    getCounter :: String -> m CounterValue
    getTimer :: String -> m TimerValue

    We can wrap those up again with our metric type and be sure that we can’t be looking up a timer with a counter’s key or vice versa:

    getCounter :: Metric Counter -> m CounterValue
    getTimer :: Metric Timer -> m TimerValue
    Comments
  • Adding a UUID Column to a Persistent Table

    April 14, 2015

    This is just a quick snippet I’ve been meaning to post for a few weeks. A few weeks ago I needed to add a Postgres UUID column to one of my tables using Persistent. I dug around and all I found were vague, closed tickets, and old irrelevant blog posts on the Yesod site that mentioned UUIDs but didn’t give any good examples. The solution ended up being simple but I hope it helps someone else who is having this problem, or more likely, future me when I forget how this is done.

    Let’s say you’ve got some schema TH that looks like:

    share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
    Person
        name String
        age Int Maybe
        deriving Show
    |]

    We want to:

    1. Add a UUID column to Person.
    2. Give it a default value so any existing columns are backfilled.
    3. Give it a unique index for fast lookups and guaranteed uniqueness.
    4. Wrap it in a distinct type to avoid confusing it with other UUIDs, much like how persistent gives us distinct types for primary keys.

    Our schema TH now looks like:

    share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
    Person
        name String
        age Int Maybe
        uuid PersonUUID default=uuid_generate_v4()
        UniquePersonUUID uuid
    
        deriving Show
    |]

    In a module accessible from your schema’s you’ll also add something like:

    {-# LANGUAGE GeneralizedNewtypeDeriving #-}
    {-# LANGUAGE OverloadedStrings #-}
    module MyApp.Schema where
    
    import Control.Error
    import Control.Lens
    import Data.ByteString (ByteString)
    import Data.UUID
    import Data.Text (Text)
    import qualified Data.Text as T
    import System.Random
    
    newtype PersonUUID = PersonUUID {
          _personUuid :: UUID
        } deriving (Show, Eq, Ord, Random)
    
    makeLenses ''PersonUUID
    
    instance PersistFieldSql PersonUUID where
      sqlType = const $ SqlOther "uuid"
    
    instance PersistField PersonUUID where
      toPersistValue = toPersistValueUUID personUuid
      fromPersistValue = fromPersistValueUUID personUuid
    
    _ASCIIBytes :: Prism' ByteString UUID
    _ASCIIBytes = prism toASCIIBytes $ \bs -> note bs $ fromASCIIBytes bs
    
    toPersistValueUUID :: Iso' a UUID -> a -> PersistValue
    toPersistValueUUID i a = PersistDbSpecific $ a ^. i . re _ASCIIBytes
    
    fromPersistValueUUID :: Iso' a UUID -> PersistValue -> Either Text a
    fromPersistValueUUID i (PersistDbSpecific bs) =
      note "Could not parse UUID" $ bs ^? _ASCIIBytes . from i
    fromPersistValueUUID _ x = Left $ "Invalid value for UUID: " <> showT x
    
    showT :: Show a => a -> Text
    showT = T.pack . show

    Let’s break this down a bit. First, we create a newtype around UUID to distinguish the type and then derive an Iso that can get us to and from the UUID via makeLenses. We also create a Prism between ByteString and UUID. I like to read prisms as the left type variable (ByteString) is the “wider” type and the right type variable (UUID) is the “narrow” one. That is to say, you know you can always go from the narrow type to the wide one but not necessarily the other way. Conveniently, PersistDbSpecific expects a ByteString so this is exactly what we need to serialize our type to the database.

    note is a great little function from the errors package of type e -> Maybe a -> Either e a that upgrades a Maybe into an Either.

    Lastly, you’ll want to make sure that the uuid-ossp extension is enabled in your database. You can issue the command CREATE EXTENSION IF NOT EXISTS "uuid-ossp"; to do so.

    Comments
  • Hidden Gem in Control.Applicative

    February 20, 2015

    The other day I was looking through some haskell code and found a curious little combinator: <$. At first I thought it was <$>, the infix alias for fmap. I’m so used to seeing <$> because I see and write it many times a day. I decided to look it up on the official haddocks to see what was going on. The description is:

    Replace all locations in the input with the same value. The default definition is fmap . const, but this may be overridden with a more efficient version.

    I know I’ve read that before and didn’t understood it. Locations? What’s that supposed to mean? Why would I want to use this? “Whatever, I’ll probably never need it,” I thought. And so I never ended up using <$ until now, and that’s a real shame. Let’s take a closer look at the documentation and the type and see if we can figure out what it means.

    The type is

    (<$) :: Functor f => a -> f b -> f a

    So we only require a Functor constraint to use it. As Bartosz Milewski notes, Functors are Containers. Remembering this analogy of “Containers” made the “locations” phrasing click for me. A list is a functor, and intuitively, locations in a list are the elements. So let’s see what it looks like to replace all locations in the input with the “same value”, or the first argument:

    Prelude Control.Applicative> 3 <$ [7,8,9]
    [3,3,3]

    Ah ha! This makes sense too, looking at the type.

    A Closer Look at the Types

    The first argument is of type a, the second f b, a container of b. And that’s the last time we see b. The only thing Functor provides us is

    fmap :: Functor f => (a -> b) -> f a -> f b

    Given we know nothing about a and b, this only lets us apply a function to every “location” in a functor/container. There’s no breaking out early, nor adding new elements. <$ doesn’t even take a function, just a single value of a. It knows nothing about a and nothing about f b except how to map over it. It couldn’t even return an “empty” f b because functor doesn’t give it the tools to do that. The only thing <$ can do is replace each location in the functor with the given a and indeed the only implementation it can have is the one it has, fmap . const. Pretty cool!

    So back to the list example. I guess this could be helpful. You could imagine a list or Map of items to validate and some condition causing you to replace all values with some invalidated value. Or maybe a list of tests and we want to replace them all with failures. This is a bit of a stretch though. There is a much more useful case…

    The Killer Feature: Parsers

    Haskell parsers tend to be Applicative Functors and are really nice to use so you end up using them a lot. You may be writing an Applicative parser for command line options using optparse-applicative, a parser using attoparsec, or a FromJSON instance in aeson. In any of these scenarios, if you find yourself parsing some fixed token, <$ can help!

    Let’s say we’ve got the following type:

    data Status = Staged | Running | Finished

    I’ve been writing parsers like this for some time:

    string :: Text -> Parser Text
    
    parseStatus :: Parser Status
    parseStatus = parseStaged <|> parseRunning <|> parseFinished
      where
        parseStaged   = string "Staged" *> pure Staged
        parseRunning  = string "Running" *> pure Running
        parseFinished = string "Finished" *> pure Finished

    Parser is a Functor, Applicative, and Alternative (which gives us <|>). This is where its hard to apply the description <$ of replacing all “locations” in a functor. What are locations in a parser? In this case the type is more illuminating. The f b in the type is Parser Text. f is Parser, and b is Text. We want to throw out the Text once this parser succeeds and replace it with our token. So we instead can write:

    parseStatus :: Parser Status
    parseStatus = parseStaged <|> parseRunning <|> parseFinished
      where
        parseStaged   = Staged <$ string "Staged"
        parseRunning  = Running <$ string "Running"
        parseFinished = Finished <$ string "Finished"

    This feels nicer. We no longer have to “lift” the value into the parser with pure. You can read this as “the result is Staged if I’m given a the string ‘Staged’”.

    This little thought exercise has helped underscore the importance of reasoning with the tools that typeclasses give us. The simpler typeclasses have this great property of being extremely polymorphic, which at once makes them very powerful in their use and very constrained in their implementations.

    Comments
Copyright © 2010-2013 Michael Xavier
Site generated with hakyll .