Copyright | (c) Universiteit Utrecht 2010-2011, University of Oxford 2012-2013 |
---|---|
License | see libraries/base/LICENSE |
Maintainer | [email protected] |
Stability | internal |
Portability | non-portable |
Safe Haskell | Trustworthy |
Language | Haskell2010 |
If you're using GHC.Generics
, you should consider using the http://hackage.haskell.org/package/generic-deriving package, which contains many useful generic functions.
Since: 4.6.0.0
Datatype-generic functions are are based on the idea of converting values of a datatype T
into corresponding values of a (nearly) isomorphic type Rep T
. The type Rep T
is built from a limited set of type constructors, all provided by this module. A datatype-generic function is then an overloaded function with instances for most of these type constructors, together with a wrapper that performs the mapping between T
and Rep T
. By using this technique, we merely need a few generic instances in order to implement functionality that works for any representable type.
Representable types are collected in the Generic
class, which defines the associated type Rep
as well as conversion functions from
and to
. Typically, you will not define Generic
instances by hand, but have the compiler derive them for you.
The key to defining your own datatype-generic functions is to understand how to represent datatypes using the given set of type constructors.
Let us look at an example first:
data Tree a = Leaf a | Node (Tree a) (Tree a)
deriving Generic
The above declaration (which requires the language pragma DeriveGeneric
) causes the following representation to be generated:
instanceGeneric
(Tree a) where typeRep
(Tree a) =D1
D1Tree (C1
C1_0Tree (S1
NoSelector
(Par0
a)):+:
C1
C1_1Tree (S1
NoSelector
(Rec0
(Tree a)):*:
S1
NoSelector
(Rec0
(Tree a)))) ...
Hint: You can obtain information about the code being generated from GHC by passing the -ddump-deriv
flag. In GHCi, you can expand a type family such as Rep
using the :kind!
command.
There are many datatype-generic functions that do not distinguish between positions that are parameters or positions that are recursive calls. There are also many datatype-generic functions that do not care about the names of datatypes and constructors at all. To keep the number of cases to consider in generic functions in such a situation to a minimum, it turns out that many of the type constructors introduced above are actually synonyms, defining them to be variants of a smaller set of constructors.
K1
The type constructors Par0
and Rec0
are variants of K1
:
typePar0
=K1
P
typeRec0
=K1
R
Here, P
and R
are type-level proxies again that do not have any associated values.
M1
The type constructors S1
, C1
and D1
are all variants of M1
:
typeS1
=M1
S
typeC1
=M1
C
typeD1
=M1
D
The types S
, C
and D
are once again type-level proxies, just used to create several variants of M1
.
Next to K1
, M1
, :+:
and :*:
there are a few more type constructors that occur in the representations of other datatypes.
V1
For empty datatypes, V1
is used as a representation. For example,
data Empty deriving Generic
yields
instanceGeneric
Empty where typeRep
Empty =D1
D1EmptyV1
U1
If a constructor has no arguments, then U1
is used as its representation. For example the representation of Bool
is
instanceGeneric
Bool where typeRep
Bool =D1
D1Bool (C1
C1_0BoolU1
:+:
C1
C1_1BoolU1
)
As :+:
and :*:
are just binary operators, one might ask what happens if the datatype has more than two constructors, or a constructor with more than two fields. The answer is simple: the operators are used several times, to combine all the constructors and fields as needed. However, users /should not rely on a specific nesting strategy/ for :+:
and :*:
being used. The compiler is free to choose any nesting it prefers. (In practice, the current implementation tries to produce a more or less balanced nesting, so that the traversal of the structure of the datatype from the root to a particular component can be performed in logarithmic rather than linear time.)
A datatype-generic function comprises two parts:
Generic
, performs the conversion between the original value and its Rep
-based representation and then invokes the generic instances.As an example, let us look at a function encode
that produces a naive, but lossless bit encoding of values of various datatypes. So we are aiming to define a function
encode :: Generic
a => a -> [Bool]
where we use Bool
as our datatype for bits.
For part 1, we define a class Encode'
. Perhaps surprisingly, this class is parameterized over a type constructor f
of kind * -> *
. This is a technicality: all the representation type constructors operate with kind * -> *
as base kind. But the type argument is never being used. This may be changed at some point in the future. The class has a single method, and we use the type we want our final function to have, but we replace the occurrences of the generic type argument a
with f p
(where the p
is any argument; it will not be used).
class Encode' f where encode' :: f p -> [Bool]
With the goal in mind to make encode
work on Tree
and other datatypes, we now define instances for the representation type constructors V1
, U1
, :+:
, :*:
, K1
, and M1
.
In order to be able to do this, we need to know the actual definitions of these types:
dataV1
p -- lifted version of Empty dataU1
p =U1
-- lifted version of () data (:+:
) f g p =L1
(f p) |R1
(g p) -- lifted version ofEither
data (:*:
) f g p = (f p):*:
(g p) -- lifted version of (,) newtypeK1
i c p =K1
{unK1
:: c } -- a container for a c newtypeM1
i t f p =M1
{unM1
:: f p } -- a wrapper
So, U1
is just the unit type, :+:
is just a binary choice like Either
, :*:
is a binary pair like the pair constructor (,)
, and K1
is a value of a specific type c
, and M1
wraps a value of the generic type argument, which in the lifted world is an f p
(where we do not care about p
).
The instance for V1
is slightly awkward (but also rarely used):
instance Encode' V1
where
encode' x = undefined
There are no values of type V1 p
to pass (except undefined), so this is actually impossible. One can ask why it is useful to define an instance for V1
at all in this case? Well, an empty type can be used as an argument to a non-empty type, and you might still want to encode the resulting type. As a somewhat contrived example, consider [Empty]
, which is not an empty type, but contains just the empty list. The V1
instance ensures that we can call the generic function on such types.
There is exactly one value of type U1
, so encoding it requires no knowledge, and we can use zero bits:
instance Encode'U1
where encode'U1
= []
In the case for :+:
, we produce False
or True
depending on whether the constructor of the value provided is located on the left or on the right:
instance (Encode' f, Encode' g) => Encode' (f:+:
g) where encode' (L1
x) = False : encode' x encode' (R1
x) = True : encode' x
In the case for :*:
, we append the encodings of the two subcomponents:
instance (Encode' f, Encode' g) => Encode' (f:*:
g) where encode' (x:*:
y) = encode' x ++ encode' y
The case for K1
is rather interesting. Here, we call the final function encode
that we yet have to define, recursively. We will use another type class Encode
for that function:
instance (Encode c) => Encode' (K1
i c) where encode' (K1
x) = encode x
Note how Par0
and Rec0
both being mapped to K1
allows us to define a uniform instance here.
Similarly, we can define a uniform instance for M1
, because we completely disregard all meta-information:
instance (Encode' f) => Encode' (M1
i t f) where encode' (M1
x) = encode' x
Unlike in K1
, the instance for M1
refers to encode'
, not encode
.
We now define class Encode
for the actual encode
function:
class Encode a where encode :: a -> [Bool] default encode :: (Generic
a) => a -> [Bool] encode x = encode' (from
x)
The incoming x
is converted using from
, then we dispatch to the generic instances using encode'
. We use this as a default definition for encode
. We need the 'default encode' signature because ordinary Haskell default methods must not introduce additional class constraints, but our generic default does.
Defining a particular instance is now as simple as saying
instance (Encode a) => Encode (Tree a)
It is not always required to provide instances for all the generic representation types, but omitting instances restricts the set of datatypes the functions will work for:
:+:
instance is given, the function may still work for empty datatypes or datatypes that have a single constructor, but will fail on datatypes with more than one constructor.:*:
instance is given, the function may still work for datatypes where each constructor has just zero or one field, in particular for enumeration types.K1
instance is given, the function may still work for enumeration types, where no constructor has any fields.V1
instance is given, the function may still work for any datatype that is not empty.U1
instance is given, the function may still work for any datatype where each constructor has at least one field.An M1
instance is always required (but it can just ignore the meta-information, as is the case for encode
above).
Datatype-generic functions as defined above work for a large class of datatypes, including parameterized datatypes. (We have used Tree
as our example above, which is of kind * -> *
.) However, the Generic
class ranges over types of kind *
, and therefore, the resulting generic functions (such as encode
) must be parameterized by a generic type argument of kind *
.
What if we want to define generic classes that range over type constructors (such as Functor
, Traversable
, or Foldable
)?
Generic1
classLike Generic
, there is a class Generic1
that defines a representation Rep1
and conversion functions from1
and to1
, only that Generic1
ranges over types of kind * -> *
. The Generic1
class is also derivable.
The representation Rep1
is ever so slightly different from Rep
. Let us look at Tree
as an example again:
data Tree a = Leaf a | Node (Tree a) (Tree a)
deriving Generic1
The above declaration causes the following representation to be generated:
instance Generic1
Tree where type Rep1
Tree = D1
D1Tree (C1
C1_0Tree (S1
NoSelector
Par1
) :+:
C1
C1_1Tree (S1
NoSelector
(Rec1
Tree) :*:
S1
NoSelector
(Rec1
Tree))) ...
The representation reuses D1
, C1
, S1
(and thereby M1
) as well as :+:
and :*:
from Rep
. (This reusability is the reason that we carry around the dummy type argument for kind-*
-types, but there are already enough different names involved without duplicating each of these.)
What's different is that we now use Par1
to refer to the parameter (and that parameter, which used to be a
), is not mentioned explicitly by name anywhere; and we use Rec1
to refer to a recursive use of Tree a
.
* -> *
typesUnlike Par0
and Rec0
, the Par1
and Rec1
type constructors do not map to K1
. They are defined directly, as follows:
newtypePar1
p =Par1
{unPar1
:: p } -- gives access to parameter p newtypeRec1
f p =Rec1
{unRec1
:: f p } -- a wrapper
In Par1
, the parameter p
is used for the first time, whereas Rec1
simply wraps an application of f
to p
.
Note that K1
(in the guise of Rec0
) can still occur in a Rep1
representation, namely when the datatype has a field that does not mention the parameter.
The declaration
data WithInt a = WithInt Int a
deriving Generic1
yields
classRep1
WithInt where typeRep1
WithInt =D1
D1WithInt (C1
C1_0WithInt (S1
NoSelector
(Rec0
Int):*:
S1
NoSelector
Par1
))
If the parameter a
appears underneath a composition of other type constructors, then the representation involves composition, too:
data Rose a = Fork a [Rose a]
yields
classRep1
Rose where typeRep1
Rose =D1
D1Rose (C1
C1_0Rose (S1
NoSelector
Par1
:*:
S1
NoSelector
([]:.:
Rec1
Rose)
where
newtype (:.:
) f g p =Comp1
{unComp1
:: f (g p) }
Void: used for datatypes without constructors
Unit: used for constructors without arguments
U1 |
Used for marking occurrences of the parameter
Recursive calls of kind * -> *
Constants, additional parameters and recursion of kind *
Meta-information (constructor names, etc.)
data (f :+: g) p infixr 5 Source
Sums: encode choice between constructors
data (f :*: g) p infixr 6 Source
Products: encode multiple arguments to constructors
(f p) :*: (g p) infixr 6 |
newtype (f :.: g) p infixr 7 Source
Composition of functors
Type synonym for encoding recursion (of kind *)
Type synonym for encoding parameters (other than the last)
Tag for K1: recursion (of kind *)
Tag for K1: parameters (other than the last)
Type synonym for encoding meta-information for datatypes
Type synonym for encoding meta-information for constructors
Type synonym for encoding meta-information for record selectors
Tag for M1: datatype
Tag for M1: constructor
Tag for M1: record selector
Class for datatypes that represent datatypes
datatypeName :: t d (f :: * -> *) a -> [Char] Source
The name of the datatype (unqualified)
moduleName :: t d (f :: * -> *) a -> [Char] Source
The fully-qualified name of the module where the type is declared
isNewtype :: t d (f :: * -> *) a -> Bool Source
Marks if the datatype is actually a newtype
class Constructor c where Source
Class for datatypes that represent data constructors
conName :: t c (f :: * -> *) a -> [Char] Source
The name of the constructor
conFixity :: t c (f :: * -> *) a -> Fixity Source
The fixity of the constructor
conIsRecord :: t c (f :: * -> *) a -> Bool Source
Marks if this constructor is a record
Class for datatypes that represent records
data NoSelector Source
Used for constructor fields without a name
Datatype to represent the fixity of a constructor. An infix | declaration directly corresponds to an application of Infix
.
Prefix | |
Infix Associativity Int |
data Associativity Source
Datatype to represent the associativity of a constructor
LeftAssociative | |
RightAssociative | |
NotAssociative |
Datatype to represent the arity of a tuple.
Get the precedence of a fixity value.
Representable types of kind *. This class is derivable in GHC with the DeriveGeneric flag on.
Convert from the datatype to its representation
Convert from the representation to the datatype
Generic Bool | |
Generic Char | |
Generic Double | |
Generic Float | |
Generic Int | |
Generic Ordering | |
Generic () | |
Generic Associativity | |
Generic Fixity | |
Generic Arity | |
Generic Any | |
Generic All | |
Generic Void | |
Generic [a] | |
Generic (U1 p) | |
Generic (Par1 p) | |
Generic (Maybe a) | |
Generic (Last a) | |
Generic (First a) | |
Generic (Product a) | |
Generic (Sum a) | |
Generic (Endo a) | |
Generic (Dual a) | |
Generic (ZipList a) | |
Generic (Identity a) | |
Generic (Either a b) | |
Generic (Rec1 f p) | |
Generic (a, b) | |
Generic (Proxy * t) | |
Generic (WrappedMonad m a) | |
Generic (Const a b) | |
Generic (K1 i c p) | |
Generic ((:+:) f g p) | |
Generic ((:*:) f g p) | |
Generic ((:.:) f g p) | |
Generic (a, b, c) | |
Generic (Alt k f a) | |
Generic (WrappedArrow a b c) | |
Generic (M1 i c f p) | |
Generic (a, b, c, d) | |
Generic (a, b, c, d, e) | |
Generic (a, b, c, d, e, f) | |
Generic (a, b, c, d, e, f, g) |
Representable types of kind * -> *. This class is derivable in GHC with the DeriveGeneric flag on.
from1 :: f a -> Rep1 f a Source
Convert from the datatype to its representation
Convert from the representation to the datatype
Generic1 [] | |
Generic1 Maybe | |
Generic1 Last | |
Generic1 First | |
Generic1 Product | |
Generic1 Sum | |
Generic1 Dual | |
Generic1 ZipList | |
Generic1 Identity | |
Generic1 (Either a) | |
Generic1 ((,) a) | |
Generic1 (WrappedMonad m) | |
Generic1 (Const a) | |
Generic1 ((,,) a b) | |
Generic1 (Alt * f) | |
Generic1 (WrappedArrow a b) | |
Generic1 ((,,,) a b c) | |
Generic1 ((,,,,) a b c d) | |
Generic1 ((,,,,,) a b c d e) | |
Generic1 ((,,,,,,) a b c d e f) |
© The University of Glasgow and others
Licensed under a BSD-style license (see top of the page).
https://downloads.haskell.org/~ghc/7.10.3/docs/html/libraries/base-4.8.2.0/GHC-Generics.html