Consider the following problem: you want to manage multiple kinds of "archive" files (e.g. zip, tar.gz, tar.bz2, .tar, etc.). You want to write one function that returns the most appropriate kind of archive, and functions that work on generic archive objects.
The problem is that the type checker forces you to return a concrete type. Something like this isn't allowed past the type checker:
makeArchive :: (Archive a) => String -> a
Why not? Because makeArchive
makes no guarantees on what it's returning. Any function that uses it doesn't really know what type it's getting back. Sure, it looks similar to read
:
read :: (Read a) => String -> a
Except that the type checker can infer which version of read
you're trying to use. If you're expecting to read an integer, the type checker will infer that you want the version of read
that returns an integer. If you're expecting to read some indeterminate type, the type checker will tell you give you an error ("Ambiguous type variable...
"):The problem is that a function likeworks = do s <- getLine
let i = read s
print (i + 1) -- read must provide an Int
doesntWork = do s <- getLine
let i = read s
print i -- no constraint what read provides
makeArchive
cannot return any instance of a typeclass like Archive
, it needs to return a specific type. Another way of looking at it is that you cannot write a version of makeArchive
that returns one instance of Archive
in one place, and a different instance of Archive
in another place. That would be like writing a function that sometimes returns strings, sometimes returns integers, and sometimes returns lists of tuples -- such a function is simply not allowed by the type system.In object oriented programs, a common solution would be to return a proxy object that delegates all of its operations to some specific object. Fortunately, that's exactly what we need to do in Haskell. All of the necessary types, including the proxy type, will be related to each other because they will be members of the same typeclass:
Next, we need to define a proxy type that will wrap any type that is anclass Archive a where
getContents :: a -> [String]
...
data ZipArchive = ...
instance Archive ZipArchive where
getContents = ...
data TarArchive = ...
instance Archive TarArchive where
getContents = ...
data TarGzArchive = ...
instance Archive TarGzArchive where
getContents = ...
data TarBz2Archive = ...
instance Archive TarBz2Archive where
getContents = ...
Archive
. To do this, we need to use existential types:Now, we can write{-# OPTIONS -fglasgow-exts #-}
data ArchiveProxy = forall a. (Archive a) => ArchiveProxy a
instance Archive ArchiveProxy where
getContents (ArchiveProxy a) = getContents a
makeArchive
to create a specific concrete type (like ZipArchive
, TarArchive
, or whatnot), and wrap that in the ArchiveProxy
constructor to return something of type ArchiveProxy
:The value that's returned bymakeArchive :: String -> ArchiveProxy
makeArchive fn
| ".zip" `isSuffixOf` fn = ArchiveProxy (ZipArchive ...)
| ".tar" `isSuffixOf` fn = ArchiveProxy (TarArchive ...)
| ...
makeArchive
is opaque, and can perform any methods that are defined by the Archive
typeclass. The benefit is that functions can be written in terms of this generic proxy, but always execute the most specific operation for the archive under consideration.All of this is explained in the Haskell Wiki under existential types, but that discussion approaches the topic from an academic and mathematical perspective. I suspect many professional programmers would be more comfortable with this object oriented view of the world, using proxies and delegates. The terms are different, but the concept is the same.
3 comments:
Your data declaration has only one constructor and it takes only one parameter. Therefore you can use a "newtype" declaration:
newtype ArchiveProxy = forall a. (Archive a) => ArchiveProxy a
The "newtype" creates a distinct type just like "data" but the compiler knows it is really an alias for "forall a. Archive a => a" and after type checking it deletes the type from the program.
So the running program is guaranteed to have zero overhead from constructing/wrapping and destructing/unwrapping the Archive type.
I'm copying my comment from Reddit in case you don't frequent that site:
Maybe I'm missing something, but I would do that example as
data Archive = ZipArchive ZipArchive
| TarArchive TarArchive
| TarGzArchive TarGzArchive
| TarBz2Archive TarBz2Archive
makeArchive :: String -> Archive
makeArchive fn
| ".zip" `isSuffixOf` fn = ZipArchive (makeZipArchive fn)
| ".tar" `isSuffixOf` fn = TarArchive (makeTarArchive fn)
| ...
To anonymous: That only works if you have an known number of fixed types to begin with. Part of the advantage of using objects is that you can extend functionality at any time by just adding a new class. The URL class in java has a factory that can generate one of many types of URLs given the url string. Anyone can add a new url type (ie- blah:// in addition to http://, ftp:// etc.) and simply register it for use with the system. You need existential types in Haskell to do the same thing.
Post a Comment