6.19.1. Rewrite rules¶
- {-# RULES "⟨name⟩" forall ⟨binder⟩ ... . ⟨expr⟩ = ⟨expr⟩ ... #-}¶
- Where:
- top-level 
 - Define a rewrite rule to be used to optimize a source program. 
The programmer can specify rewrite rules as part of the source program (in a pragma). Here is an example:
{-# RULES
      "map/map"    forall f g xs.  map f (map g xs) = map (f.g) xs
  #-}
Use the debug flag -ddump-simpl-stats to see what rules fired. If
you need more information, then -ddump-rule-firings shows you each
individual rule firing and -ddump-rule-rewrites also shows what the
code looks like before and after the rewrite.
- -fenable-rewrite-rules¶
- Allow the compiler to apply rewrite rules to the source program. 
6.19.1.1. Syntax¶
From a syntactic point of view:
- There may be zero or more rules in a - RULESpragma, separated by semicolons (which may be generated by the layout rule).
- The layout rule applies in a pragma. Currently no new indentation level is set, so if you put several rules in single - RULESpragma and wish to use layout to separate them, you must lay out the starting in the same column as the enclosing definitions.- {-# RULES "map/map" forall f g xs. map f (map g xs) = map (f.g) xs "map/append" forall f xs ys. map f (xs ++ ys) = map f xs ++ map f ys #-} - Furthermore, the closing - #-}should start in a column to the right of the opening- {-#.
- Each rule has a name, enclosed in double quotes. The name itself has no significance at all. It is only used when reporting how many times the rule fired. 
- A rule may optionally have a phase-control number (see Phase control), immediately after the name of the rule. Thus: - {-# RULES "map/map" [2] forall f g xs. map f (map g xs) = map (f.g) xs #-} - The - [2]means that the rule is active in Phase 2 and subsequent phases. The inverse notation- [~2]is also accepted, meaning that the rule is active up to, but not including, Phase 2.- Rules support the special phase-control notation - [~], which means the rule is never active. This feature supports plugins (see Compiler Plugins), by making it possible to define a RULE that is never run by GHC, but is nevertheless parsed, typechecked etc, so that it is available to the plugin.
- Each (term) variable mentioned in a rule must either be in scope (e.g. - map), or bound by the- forall(e.g.- f,- g,- xs). The variables bound by the- forallare called the pattern variables. They are separated by spaces, just like in a type- forall.
- A pattern variable may optionally have a type signature. If the type of the pattern variable is polymorphic, it must have a type signature. For example, here is the - foldr/buildrule:- "fold/build" forall k z (g::forall b. (a->b->b) -> b -> b) . foldr k z (build g) = g k z - Since - ghas a polymorphic type, it must have a type signature.
- If - ExplicitForAllis enabled, type/kind variables can also be explicitly bound. For example:- {-# RULES "id" forall a. forall (x :: a). id @a x = x #-}- When a type-level explicit - forallis present, each type/kind variable mentioned must now also be either in scope or bound by the- forall. In particular, unlike some other places in Haskell, this means free kind variables will not be implicitly bound. For example:- "this_is_bad" forall (c :: k). forall (x :: Proxy c) ... "this_is_ok" forall k (c :: k). forall (x :: Proxy c) ... - When bound type/kind variables are needed, both foralls must always be included, though if no pattern variables are needed, the second can be left empty. For example: - {-# RULES "map/id" forall a. forall. map (id @a) = id @[a] #-}
- The left hand side of a rule must consist of a top-level variable applied to arbitrary expressions. For example, this is not OK: - "wrong1" forall e1 e2. case True of { True -> e1; False -> e2 } = e1 "wrong2" forall f. f True = True "wrong3" forall x. Just x = Nothing - In - "wrong1", the LHS is not an application; in- "wrong2", the LHS has a pattern variable in the head. In- "wrong3", the LHS consists of a constructor, rather than a variable, applied to an argument.
- A rule does not need to be in the same module as (any of) the variables it mentions, though of course they need to be in scope. 
- All rules are implicitly exported from the module, and are therefore in force in any module that imports the module that defined the rule, directly or indirectly. (That is, if A imports B, which imports C, then C’s rules are in force when compiling A.) The situation is very similar to that for instance declarations. 
- Inside a - RULES“- forall” is treated as a keyword, regardless of any other flag settings. Furthermore, inside a- RULES, the language extension- ScopedTypeVariablesis automatically enabled; see Lexically scoped type variables.
- Like other pragmas, - RULESpragmas are always checked for scope errors, and are typechecked. Typechecking means that the LHS and RHS of a rule are typechecked, and must have the same type. However, rules are only enabled if the- -fenable-rewrite-rulesflag is on (see Semantics).
6.19.1.2. Semantics¶
From a semantic point of view:
- Rules are enabled (that is, used during optimisation) by the - -fenable-rewrite-rulesflag. This flag is implied by- -O, and may be switched off (as usual) by- -fno-enable-rewrite-rules. (NB: enabling- -fenable-rewrite-ruleswithout- -Omay not do what you expect, though, because without- -OGHC ignores all optimisation information in interface files; see- -fignore-interface-pragmas). Note that- -fenable-rewrite-rulesis an optimisation flag, and has no effect on parsing or typechecking.
- Rules are regarded as left-to-right rewrite rules. When GHC finds an expression that is a substitution instance of the LHS of a rule, it replaces the expression by the (appropriately-substituted) RHS. By “a substitution instance” we mean that the LHS can be made equal to the expression by substituting for the pattern variables. 
- GHC makes absolutely no attempt to verify that the LHS and RHS of a rule have the same meaning. That is undecidable in general, and infeasible in most interesting cases. The responsibility is entirely the programmer’s! 
- GHC makes no attempt to make sure that the rules are confluent or terminating. For example: - "loop" forall x y. f x y = f y x - This rule will cause the compiler to go into an infinite loop. 
- If more than one rule matches a call, GHC will choose one arbitrarily to apply. 
- GHC currently uses a very simple, syntactic, matching algorithm for matching a rule LHS with an expression. It seeks a substitution which makes the LHS and expression syntactically equal modulo alpha conversion. The pattern (rule), but not the expression, is eta-expanded if necessary. (Eta-expanding the expression can lead to laziness bugs.) But not beta conversion (that’s called higher-order matching). - Matching is carried out on GHC’s intermediate language, which includes type abstractions and applications. So a rule only matches if the types match too. See Specialisation below. 
- GHC keeps trying to apply the rules as it optimises the program. For example, consider: - let s = map f t = map g in s (t xs) - The expression - s (t xs)does not match the rule- "map/map", but GHC will substitute for- sand- t, giving an expression which does match. If- sor- twas (a) used more than once, and (b) large or a redex, then it would not be substituted, and the rule would not fire.
- GHC will never match a forall’d variable in a template with an expression which contains locally bound variables. For example, it is permitted to write a rule which contains a case expression: - {-# RULES "test/case-tup" forall (x :: (Int, Int)) (y :: Int) (z :: Int). test (case x of (l, r) -> y) z = case x of (l, r) -> test y z #-} - But the rule will not match when - ycontains either of- lor- rbecause they are locally bound. Therefore the following application will fail to trigger the rule:- prog :: (Int, Int) -> (Int, Int) prog x = test (case x of (p, q) -> p) 0 - because - ywould have to match against- p(which is locally bound) but it will fire for:- prog :: (Int, Int) -> (Int, Int) prog x = test (case x of (p, q) -> 0) 0 - because - ycan match against- 0.
- A rule that has a forall binder with a polymorphic type, is likely to fail to fire. E. g., - {-# RULES forall (x :: forall a. Num a => a -> a). f x = blah #-}- Here - xhas a polymorphic type. This applies to a forall’d binder with a type class constraint, such as:- {-# RULES forall @m (x :: KnownNat m => Proxy m). g x = blah #-}- See #21093 for discussion. 
6.19.1.3. How rules interact with INLINE/NOINLINE pragmas¶
Ordinary inlining happens at the same time as rule rewriting, which may lead to unexpected results. Consider this (artificial) example
f x = x
g y = f y
h z = g True
{-# RULES "f" f True = False #-}
Since f's right-hand side is small, it is inlined into g, to
give
g y = y
Now g is inlined into h, but f's RULE has no chance to
fire. If instead GHC had first inlined g into h then there would have
been a better chance that f's RULES might fire.
The way to get predictable behaviour is to use a NOINLINE pragma, or an
INLINE[⟨phase⟩] pragma, on f, to ensure that it is not inlined until
its RULES have had a chance to fire. The warning flag
-Winline-rule-shadowing (see Warnings and sanity-checking) warns about
this situation.
6.19.1.4. How rules interact with CONLIKE pragmas¶
GHC is very cautious about duplicating work. For example, consider
f k z xs = let xs = build g
           in ...(foldr k z xs)...sum xs...
{-# RULES "foldr/build" forall k z g. foldr k z (build g) = g k z #-}
Since xs is used twice, GHC does not fire the foldr/build rule.
Rightly so, because it might take a lot of work to compute xs, which
would be duplicated if the rule fired.
Sometimes, however, this approach is over-cautious, and we do want the
rule to fire, even though doing so would duplicate redex. There is no
way that GHC can work out when this is a good idea, so we provide the
CONLIKE pragma to declare it, thus:
{-# INLINE CONLIKE [1] f #-}
f x = blah
CONLIKE is a modifier to an INLINE or NOINLINE pragma. It specifies that
an application of f to one argument (in general, the number of arguments
to the left of the = sign) should be considered cheap enough to
duplicate, if such a duplication would make rule fire. (The name
“CONLIKE” is short for “constructor-like”, because constructors
certainly have such a property.) The CONLIKE pragma is a modifier to
INLINE/NOINLINE because it really only makes sense to match
f on the LHS of a rule if you are sure that f is not going to be inlined
before the rule has a chance to fire.
6.19.1.5. How rules interact with class methods¶
Giving a RULE for a class method is a bad idea:
class C a where
  op :: a -> a -> a
instance C Bool where
  op x y = ...rhs for op at Bool...
{-# RULES "f" op True y = False #-}
In this example, op is not an ordinary top-level function; it is a
class method. GHC rapidly rewrites any occurrences of
op-used-at-type-Bool to a specialised function, say opBool,
where
opBool :: Bool -> Bool -> Bool
opBool x y = ..rhs for op at Bool...
So the RULE never has a chance to fire, for just the same reasons as in How rules interact with INLINE/NOINLINE pragmas.
The solution is to define the instance-specific function yourself, with a pragma to prevent it being inlined too early, and give a RULE for it:
instance C Bool where
  op = opBool
opBool :: Bool -> Bool -> Bool
{-# NOINLINE [1] opBool #-}
opBool x y = ..rhs for op at Bool...
{-# RULES "f" opBool True y = False #-}
If you want a RULE that truly applies to the overloaded class method, the only way to do it is like this:
class C a where
  op_c :: a -> a -> a
op :: C a => a -> a -> a
{-# NOINLINE [1] op #-}
op = op_c
{-# RULES "reassociate" op (op x y) z = op x (op y z) #-}
Now the inlining of op is delayed until the rule has a chance to
fire. The down-side is that instance declarations must define op_c,
but all other uses should go via op.
6.19.1.6. List fusion¶
The RULES mechanism is used to implement fusion (deforestation) of common list functions. If a “good consumer” consumes an intermediate list constructed by a “good producer”, the intermediate list should be eliminated entirely.
The following are good producers:
- List comprehensions 
- Enumerations of - Int,- Integerand- Char(e.g.- ['a'..'z']).
- Explicit lists (e.g. - [True, False])
- The cons constructor (e.g - 3:4:[])
- ++
- map
- take,- filter
- iterate,- repeat
- zip,- zipWith
The following are good consumers:
- List comprehensions 
- array(on its second argument)
- ++(on its first argument)
- foldr
- map
- take,- filter
- concat
- unzip,- unzip2,- unzip3,- unzip4
- zip,- zipWith(but on one argument only; if both are good producers,- zipwill fuse with one but not the other)
- partition
- head
- and,- or,- any,- all
- sequence_
- msum
So, for example, the following should generate no intermediate lists:
array (1,10) [(i,i*i) | i <- map (+ 1) [0..9]]
This list could readily be extended; if there are Prelude functions that you use a lot which are not included, please tell us.
If you want to write your own good consumers or producers, look at the Prelude definitions of the above functions to see how to do so.
6.19.1.7. Specialisation¶
Rewrite rules can be used to get the same effect as a feature present in earlier versions of GHC. For example, suppose that:
genericLookup :: Ord a => Table a b   -> a   -> b
intLookup     ::          Table Int b -> Int -> b
where intLookup is an implementation of genericLookup that works
very fast for keys of type Int. You might wish to tell GHC to use
intLookup instead of genericLookup whenever the latter was
called with type Table Int b -> Int -> b. It used to be possible to
write
{-# SPECIALIZE genericLookup :: Table Int b -> Int -> b = intLookup #-}
This feature is no longer in GHC, but rewrite rules let you do the same thing:
{-# RULES "genericLookup/Int" genericLookup = intLookup #-}
This slightly odd-looking rule instructs GHC to replace
genericLookup by intLookup whenever the types match. What is
more, this rule does not need to be in the same file as
genericLookup, unlike the SPECIALIZE pragmas which currently do
(so that they have an original definition available to specialise).
It is Your Responsibility to make sure that intLookup really
behaves as a specialised version of genericLookup!!!
An example in which using RULES for specialisation will Win Big:
toDouble :: Real a => a -> Double
toDouble = fromRational . toRational
{-# RULES "toDouble/Int" toDouble = i2d #-}
i2d (I# i) = D# (int2Double# i) -- uses Glasgow prim-op directly
The i2d function is virtually one machine instruction; the default
conversion—via an intermediate Rational-is obscenely expensive by
comparison.
6.19.1.8. Controlling what’s going on in rewrite rules¶
- Use - -ddump-rulesto see the rules that are defined in this module. This includes rules generated by the specialisation pass, but excludes rules imported from other modules.
- Use - -ddump-simpl-statsto see what rules are being fired. If you add- -dppr-debugyou get a more detailed listing.
- Use - -ddump-rule-firingsor- -ddump-rule-rewritesto see in great detail what rules are being fired. If you add- -dppr-debugyou get a still more detailed listing.
- The definition of (say) - buildin- GHC/Base.hslooks like this:- build :: forall a. (forall b. (a -> b -> b) -> b -> b) -> [a] {-# INLINE build #-} build g = g (:) [] - Notice the - INLINE! That prevents- (:)from being inlined when compiling- PrelBase, so that an importing module will “see” the- (:), and can match it on the LHS of a rule.- INLINEprevents any inlining happening in the RHS of the- INLINEthing. I regret the delicacy of this.
- In - libraries/base/GHC/Base.hslook at the rules for- mapto see how to write rules that will do fusion and yet give an efficient program even if fusion doesn’t happen. More rules in- GHC/List.hs.