i-am-tom / holmes Goto Github PK

View Code? Open in Web Editor NEW

298.0 13.0 13.0 111 KB

A reference library for constraint-solving with propagators and CDCL.

Home Page: https://hackage.haskell.org/package/holmes

License: MIT License

Haskell 100.00%

propagation constraint-solver backtracking logic-programming

holmes's Introduction

🕵️‍♂️ Holmes

Holmes is a library for computing constraint-solving problems. Under the hood, it uses propagator networks and conflict-directed clause learning to optimise the search over the parameter space.

Now available on Hackage!

👟 Example

Dinesman's problem is a nice first example of a constraint problem. In this problem, we imagine five people — Baker, Cooper, Fletcher, Miller, and Smith — living in a five-story apartment block, and we must figure out the floor on which each person lives. Here's how we state the problem with Holmes:

import Data.Holmes

dinesman :: IO (Maybe [ Defined Int ])
dinesman = do
  let guesses = 5 `from` [1 .. 5]

  guesses `satisfying` \[ baker, cooper, fletcher, miller, smith ] -> and'
    [ distinct [ baker, cooper, fletcher, miller, smith ]
    , baker ./= 5
    , cooper ./= 1
    , fletcher ./= 1 .&& fletcher ./= 5
    , miller .> cooper
    , abs' (smith .- fletcher) ./= 1
    , abs' (fletcher .- cooper) ./= 1
    ]

👣 Step-by-step problem-solving

Now we've written the poster example, how do we go about stating and solving our own constraint problems?

⚖️ 0. Pick a parameter type

Right now, there are two parameter type constructors: Defined and Intersect. The choice of type determines the strategy by which we solve the problem:

Defined only permits two levels of knowledge about a value: nothing and everything. In other words, it doesn't support a notion of partial information; we either know a value, or we don't. This is fine for small problem spaces, particularly when few branches are likely to fail, but we can usually achieve faster results using another type.
Intersect stores a set of "possible answers", and attempts to eliminate possibilities as the computation progresses. For problems with many constraints, this will produce significantly faster results than Defined as we can hopefully discover failures much earlier.

It would seem that Intersect would be the best choice in most cases, but beware: it will only work for small enum types. An Intersect Int for which we have no knowledge will contain every possible Int, and will therefore take an intractable time to compute. Defined has no such restrictions.

🗺 1. State the parameter space

Next, we need to produce a Config stating the search space we want to explore when looking for satisfactory inputs. The simplest way to do this is with the from function:

from :: Int -> [ x ] -> Config Holmes (Defined x)

from :: Int -> [ x ] -> Config Holmes (Intersect x)

If, for example, we wanted to solve a Sudoku problem, we might say something like this:

definedConfig :: Config Holmes (Defined Int)
definedConfig = 81 `from` [ 1 .. 9 ]

We read this as, "81 variables whose values must all be numbers between 1 and 9". At this point, we place no constraints (such as uniqueness of rows or columns); we're just stating the possible range of values that could exist in each parameter.

We could do the same for Intersect, but we'd first need to produce some enum type to represent our parameter space:

data Value = V1 | V2 | V3 | V4 | V5 | V6 | V7 | V8 | V9
  deriving stock (Eq, Ord, Show, Enum, Bounded, Generic)
  deriving anyclass (Hashable)

instance Num Value where -- Syntactic sugar for numeric literals.
  fromInteger = toEnum . pred . fromInteger

Now, we can produce an Intersect parameter space. Because we can now work with a type who has only 9 values, rather than all possible Int values, producing the initial possibilities list becomes tractable:

intersectConfig :: Config Holmes (Intersect Value)
intersectConfig = 81 `from` [ 1 .. 9 ]

There's one more function that lets us do slightly better with an Intersect strategy, and that's using:

using :: [ Intersect Value ] -> Config Holmes (Intersect Value)

With using, we can give a precise "initial state" for all the Intersect variables in our system. This, it turns out, is very convenient when we're trying to state sudoku problems:

squares :: Config Holmes (Intersect Value)
squares = let x = mempty in using
    [ x, 5, 6,   x, x, 3,   x, x, x
    , 8, 1, x,   x, x, x,   x, x, x
    , x, x, x,   5, 4, x,   x, x, x

    , x, x, 4,   x, x, x,   x, 8, 2
    , 6, x, 8,   2, x, 4,   3, x, 7
    , 7, 2, x,   x, x, x,   4, x, x

    , x, x, x,   x, 7, 8,   x, x, x
    , x, x, x,   x, x, x,   x, 9, 3
    , x, x, x,   3, x, x,   8, 2, x
    ]

Now, let's write some constraints!

📯 2. Declare your constraints

Typically, your constraints should be stated as a predicate on the input parameters, with a type that, when specialised to your problem, should look something like [Prop Holmes (Defined Int)] -> Prop Holmes (Defined Bool). Now, what's this Prop type?

🕸 Propagators

If this library has done its job properly, this predicate shouldn't look too dissimilar to regular predicates. However, behind the scenes, the Prop type is wiring up a lot of relationships.

As an example, consider the (+) function. This has two inputs and one output, and the output is the sum of the two inputs. This is totally fixed, and there's nothing we can do about it. This is fine when we write normal programs, because we only have one-way information flow: input flows to output, and it's as simple as that.

When we come to constraint problems, however, we have multi-way information flow: we might know the output before we know the inputs! Ideally, it'd be nice in these situations if we could "work backwards" to the information we're missing.

When we say x .+ y .== z, we actually wire up multiple relationships: x + y = z, z - y = x, and z - x = y. That way, as soon as we learn two of the three values involved in addition, we can infer the other!

The operators provided by this library aim to maximise information flow around a propagator network by automatically wiring up all the different identities for all the different operators. We'll see later that this allows us to write seemingly-magical functions like backwards: given a function and an output, we can produce the function's input!

🛠 The problem-solving toolkit

With all this in mind, the following functions are available to us for multi-directional information flow. We'll leave the type signatures to Haddock, and instead just run through the functions and either their analogous regular functions or a brief explanation of what they do:

🎚 Boolean functions

Function	Analogous function / notes
`(.&&)`	`(&&)`
`all'`	`all`
`allWithIndex'`	`all'`, but the predicate also receives the list index
`and'`	`and`
`(.\|\|)`	`(\|\|)`
`any'`	`any`
`anyWithIndex'`	`any'`, but the predicate also receives the list index
`or'`	`or`
`not'`	`not`
`false`	`False`
`true`	`True`

🏳️‍🌈 Equality functions

Function	Analogous function / notes
`(.==)`	`(==)`
`(./=)`	`(/=)`
`distinct`	Are all list elements different (according to `(./=)`)?

🥈 Comparison functions

Function	Analogous function / notes
`(.<)`	`(<)`
`(.<=)`	`(<=)`
`(.>)`	`(>)`
`(.>=)`	`(>=)`

🎓 Arithmetic functions

Function	Analogous function / notes
`(.*)`	`(*)`
`(./)`	`(/)`
`(.+)`	`(+)`
`(.-)`	`(-)`
`(.%.)`	`mod`
`(.*.)`	`()` for integral* functions
`(./.)`	`div`
`abs'`	`abs`
`negate'`	`negate`
`recip'`	`recip`

🌱 Information-generating functions

Function	Analogous function / notes
`(.$)`	Apply a function to the value within the parameter type.
`zipWith'`	Similar to `liftA2`; generate results from the parameters.
`(.>>=)`	Turn each value within the parameter type into the parameter type.

The analogy gets stretched a bit here, unfortunately. It's perhaps helpful to think of these functions in terms of Intersect:

(.$) maps over the remaining candidates in an Intersect.
zipWith' creates an Intersect of the cartesian product of the two given Intersects, with the pairs applied to the given function.
(.>>=) takes every remaining candidate, applies the given function, then unions the results to produce an Intersect of all possible results.

Using the above toolkit, we could express the constraints of our sudoku example. After we establish some less interesting functions for splitting up our 81 inputs into helpful chunks...

rows :: [ x ] -> [[ x ]]
rows [] = []
rows xs = take 9 xs : rows (drop 9 xs)

columns :: [ x ] -> [[ x ]]
columns = transpose . rows

subsquares :: [ x ] -> [[ x ]]
subsquares xs = do
  x <- [ 0 .. 2 ]
  y <- [ 0 .. 2 ]

  let subrows = take 3 (drop (y * 3) (rows xs))
      values  = foldMap (take 3 . drop (x * 3)) subrows

  pure values

... we can use the propagator toolkit to specify our constraints in a delightfully straightforward way:

constraints :: forall m. MonadCell m => [ Prop m (Intersect Value) ] -> Prop m (Intersect Bool)
constraints board = and'
  [ all' distinct (columns    board)
  , all' distinct (rows       board)
  , all' distinct (subsquares board)
  ]

The type signature looks a little bit ugly here, but the polymorphism is to guarantee that predicate computations are totally generic propagator networks that can be run in any interpretation strategy. As we'll see later, Holmes isn't the only one capable of solving a mystery!

Typically, we write the constraint predicate inline (as we did for the Dinesman example above), so we never usually write this signature anyway!)

We've explained all the rules and constraints of the sudoku puzzle, and designed a propagator network to solve it! Now, why don't we get ourselves a solution?

💡 3. Solving the puzzle

Currently, Holmes only exposes two strategies for solving constraint problems:

satisfying, which returns the first valid configuration that is found, if one exists. As soon as this result has been found, computation will cease, and this program will return the result.
whenever, which returns all valid configurations in the search space. This function could potentially run for a long time, depending on the size of the search space, so you might find better results by sticking to satisfying and simply adding more constraints to eliminate the results you don't want!

These functions are named to be written as infix functions, which hopefully makes our programs a lot easier to read:

sudoku :: IO (Maybe [ Intersect Value ])
sudoku = squares `satisfying` constraints

At last, we combine the three steps to solve our problem. This README is a literate Haskell file containing a complete sudoku solver, so feel free to run cabal new-test readme and see for yourself!

🎁 Bonus surprises

We've now covered almost the full API of the library. However, there are a couple extra little surprises in there for the curious few:

📖 `Control.Monad.Watson`

Watson knows Holmes' methods, and can apply them to compute results. Unlike Holmes, however, Watson is built on top of ST rather than IO, and is thus a much purer soul.

Users can import Control.Monad.Watson and use the equivalent satisfying and whenever functions to return results without the IO wrapper, thus making these computations observably pure! For most computations — certainly those outlined in this README — Watson is more than capable of deducing results.

🎲 Random restart with `shuffle`

Watson isn't quite as capable as Holmes, however. Consider a typical Config:

example :: Config Holmes (Defined Int)
example = 1 `from` [1 .. 10]

With this Config, a program will run with a single parameter. For the first run, that parameter will be set to Exactly 1. For the second run, it will be set to Exactly 2. In other words, it tries each value in order.

For many problems, however, we can get to results faster — or produce more desirable results — by applying some randomness to this order. This is especially useful in problems such as procedural generation, where randomness tends to lead to more natural-seeming outputs. See the WaveFunctionCollapse example for more details!

♻️ Running functions forwards and backwards

With satisfying and whenever, we build a predicate on the input parameters we supply. However, we can use propagators to create normal functions, too! Consider the following function:

celsius2fahrenheit :: MonadCell m => Prop m (Defined Double) -> Prop m (Defined Double)
celsius2fahrenheit c = c .* (9 ./ 5) .+ 32

This function converts a temperature written in celsius to fahrenheit. The interesting part of this, however, is that this is a function over propagator networks. This means that, while we can use it as a regular function...

fahrenheit :: Maybe (Defined Double)
fahrenheit = forward celsius2fahrenheit 40.0 -- Just 104.0

... the "input" and "output" labels are meaningless! In fact, we can just as easily pass a value to the function as the output and get back the input!

celsius :: Maybe (Defined Double)
celsius = backward celsius2fahrenheit 104.0 -- Just 40.0

Because neither forward nor backward require any parameter search, they are both computed by Watson, so the results are pure!

🚂 Exploring the code

Now we've covered the what, maybe you're interested in the how! If you're new to the code and want to get a feel for how the library works:

The best place to start is probably in Data/JoinSemilattice/Class/* (we can ignore Merge until the next step). These will give you an idea of how we represent relationships (as opposed to functions) in Holmes.
After that, Control/Monad/Cell/Class.hs gives an overview of the primitives for building a propagator network. In particular, see unary and binary for an idea of how we lift our relationships into a network. Here's where src/Data/JoinSemilattice/Class/Merge gets used, too, so the write primitive should give you an idea of why it's useful.
src/Data/Propagator.hs introduces the high-level user-facing abstraction for stating constraints. Most of these functions are just wrapped calls to the aforementioned unary or binary, and really just add some syntactic sugar.
Finally, Control/Monad/MoriarT.hs is a full implementation of the interface including support for provenance and backtracking. It also uses the functions in Data/CDCL.hs to optimise the parameter search. This is the base transformer on top of which we build Control/Monad/Holmes.hs and Control/Monad/Watson.hs.

Thus concludes our whistle-stop tour of my favourite sights in the repository!

☎️ Questions?

If anything isn't clear, feel free to open an issue, or just message me on Twitter; it's where you'll most likely get a reply! I want this project to be an accessible way to approach the fields of propagators, constraint-solving, and CDCL. If there's anything I can do to improve this repository towards that goal, please let me know!

💐 Acknowledgements

Edward Kmett, whose propagators repository* gave us the Prop abstraction. I spent several months looking for alternative ways to represent computations, and never came close to something as neat.
Marco Sampellegrini, Alex Peitsinis, Irene Papakonstantinou, and plenty others who have helped me figure out how to present this library in a maximally-accessible way.

* This repository also approaches propagator network computations using Andy Gill's observable sharing methods, which may be of interest! Neither Holmes nor Watson implement this, as it requires some small breaks to purity and referential transparency, of which users must be aware. We sacrifice some performance gains for ease of use.

holmes's People

Contributors

Stargazers

Watchers

Forkers

webyrd gwils markus1189 robx jingmouren obrienwrite tjweir olivierdeckers stjordanis ju1m j-mueller silky tristano8

holmes's Issues

Failing to build Watson: MonadFail not in scope

Tried to install holmes from Cabal and got this error:

src/Control/Monad/Watson.hs:55:10: error:
    Not in scope: type constructor or class ‘MonadFail’
   |
55 | instance MonadFail (Watson h) where
   |          ^^^^^^^^^

I'm using GHC version 8.6.5, it seems to be failing on Hackage CI as well.

Produce solutions where the order doesn't matter

First off, thanks for your work on this library and for providing such extensive docs. It was super easy for me to get started!

I'm currently experimenting with a constellation where the order of the elements in the solution doesn't matter. I'm pretty sure I'm missing something here, so maybe you can help me out.

data M = A | B | C | D deriving (Bounded, Eq, Enum, Ord, Generic, Hashable, Show)

solve :: IO [[Defined M]]
solve = do
  let guesses = 3 `from` [A .. D]
  guesses `whenever` \solution -> and' [distinct solution]

run :: IO ()
run = solve >>= traverse_ print . take 3

This will print

[Exactly A,Exactly B,Exactly C]
[Exactly A,Exactly B,Exactly D]
[Exactly A,Exactly C,Exactly B]

However, I'd like to add a constraint that makes the solver treat the first and the third solution as equal solutions (and therefore delete one of them from the set of possible solutions).
I could map over the solution afterwards, but this doesn't help in making the computation run faster. That's why I want to include it in the constraints.

Is there a way to do that?

A small example produces a result I didn't expect

Hello, big fan of your library and it seems very cool. I'm trying it out with this years advent of code, and ran into this while doing the first problem. Boiled down, basically:

test :: IO (Maybe [ Defined Int ])
test = do
  let guesses = 2 `from` [1, 3]

  guesses `satisfying` \[a, b] -> and'
    [ (a .+ b) .== 5 ]

returns Just [Exactly 1,Exactly 4], where as I would've expected it to fail without a match.

I'm guessing I'm using something wrong but was surprised. Thanks again for your library!

When using shuffle, Holmes can't always find a solution

See https://github.com/olivierdeckers/holmes/blob/aquarium-example/examples/Aquarium.hs for an example

when running solve example4, it sometimes returns the solution, and sometimes returns Nothing.

stack overflow / infinite loop when solving with Intersect instead of Defined

Chances are I'm doing something wrong here, but since I'm at a point where I'm likely to shelve this toy project, thought I should report my problem just in case it's an issue with Holmes.

Concretely, my puzzle solver runs fine and slow with Defined, and typechecks but doesn't run at all with Intersect. (Doesn't run at all means: stack overflow in GHCI, non-termination in compiled executable.)

The offending function seems to be:

-- given a list of values at indexes, count how many of these occur
countEqual ::
  (Eq x, Mapping f c, c x, c v, Num v, Num (f v), SumR (f v), MonadCell m) =>
  [(Int, x)] ->
  [Prop m (f x)] ->
  Prop m (f v)
countEqual vals cells = foldr (.+) (lift 0) (map f vals)
  where
    isEqual v w = if w == v then 1 else 0
    f (i, v) = isEqual v .$ (cells !! i)

With this (and a bounded integer type Val4 with range 0..4), I see:

> let cfg = 1 `from` [1] :: Config Holmes (Defined Int)
> let prop = (\cells -> countEqual [(0, 3::Int)] cells .>= lift (pure (0::Int))) :: MonadCell m => [ Prop m (Defined Int)] -> Prop m (Defined Bool)
> cfg `satisfying` prop
Just [Exactly 1]
> let cfg = 1 `from` [1] :: Config Holmes (Intersect Val4)
> let prop cells = countEqual [(0, 3::Val4)] cells .>= lift (singleton (0::Val4))
> cfg `satisfying` prop
*** Exception: stack overflow

(Chances are there's a better way to do what I'm doing. The thing I'm trying to encode is that some values on a Sudoku-like board count certain other cells in the board according to their values and relative position.)

Thanks for making Holmes by the way! Certainly had fun figuring things out this far.

Improve type inference for (.==), (.>=), ...

In order to produce lift, the following class was created:

class Lifting (f :: Type -> Type) (c :: Type -> Constraint) | f -> c where
  lift' :: c x => x -> f x

Now, if we say lift 3 .== lift 4 :: Prop m (Defined Bool), we run into a problem: we can't infer a type of Defined Int (specifically the Defined part) for the 3 and 4, because there's no reversing functional dependency on the EqR class to get us from Defined Bool back to Defined Int. This makes sense - the instances are of the shape EqR (Defined x) (Defined Bool), so you couldn't have a functional dependency in the other direction even if you wanted one.

The type inference could be fixed if we restructured the EqR and OrdR classes to be of the same shape:

class EqR (f :: Type -> Type) (c :: Type -> Constraint) | f -> c where
  eqR :: c x => (f x, f x, f Bool) -> (f x, f x, f Bool)

Now, we've guaranteed that the parameter type of the input and bool result match. This means that with lift 3 .== lift 4 :: Prop m (Defined Bool), it is clear that the type of 3 must be (Num x => Defined x), which is what we're after.

This is not a lot of work, but it might be a good first issue if someone fancies getting involved in the library!

Thoughts on deterministic shuffling, for repeatable procedural generation?

Hi there, thanks for this lovely library! I have grand plans to (eventually) incorporate procedural generation via wavefunction collapse into https://github.com/swarm-game/swarm/ (it currently has procedural generation but only based on Perlin noise) . However, I was reading the documentation today, and it seems like I want something in between Holmes and Watson:

I do want to generate "natural-looking" procedurally generated things, so I definitely want the random shuffling that the Holmes monad does. Besides just generating more natural-looking things, I also don't want to get the same solution every time.
However, because I am generating infinite worlds and only dynamically generating/loading the parts of the world that are actually explored, the generation needs to be deterministic (based on a seed value), like Minecraft. I also just really don't want to introduce IO.

So I would really like a monad that (1) is built over ST, but (2) does random shuffling based on a seeded PRNG. This seems like it should be totally possible in theory --- from what I can tell Holmes is implemented in terms of shuffle from hedgehog, which has type MonadGen m => [a] -> m [a] and does not itself depend on IO in any way --- the IO only comes in from the use of sample :: MonadIO m => Gen a -> m a, which is one convenient way to run a MonadGen computation but not the only way.

Curious to hear what you think about the possibility of exposing a slightly different API which allows this. I'd be very happy to try to submit a PR if you think it seems reasonable. I saw elsewhere that you mentioned working on a rewrite of this library, so I'm not quite sure what the status is.

some common way to embed values across Defined / Intersect would be nice

TLDR: I can use pure to get a value to test against into Defined, and singleton for Intersect, but there's nothing polymorphic. Would be nice to have something.

The story behind it:

I wrote a solver for a puzzle type using Defined, which worked out fine but turned out pretty slow. So I thought I'd try out Intersect to see if that improved things. In what may have been a mistaked I tried to generalize the constraints to be able to use both Defined and Intersect.

I ran into a couple issues there:

I had used over (fmap f) (because (.$) was broken when I started with this), which worked fine due to the Functor instance on Defined, but Intersect doesn't have that. Switching to (.$) turned out to work at the expense of having to specify terrifying type signatures. I suppose Intersect doesn't naturally admit a Functor instance? Otherwise that would be nice to have.
I used some equality checks against given values with x .== lift (pure v), which caused an Applicative constraint which again works for Defined but not Intersect. That's this ticket.

I'll have to disclaim this report by being very much unsure what I'm doing here -- I'm plugging things together getting them to work rather than really understanding how this stuff should function, so there's a good chance I'm asking for things that don't make sense.

Recommend Projects

React

A declarative, efficient, and flexible JavaScript library for building user interfaces.
Vue.js

🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
Typescript

TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
TensorFlow

An Open Source Machine Learning Framework for Everyone
Django

The Web framework for perfectionists with deadlines.
Laravel

A PHP framework for web artisans
D3

Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

javascript

JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
web

Some thing interesting about web. New door for the world.
server

A server is a program made to process requests and deliver data to clients.
Machine learning

Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Visualization

Some thing interesting about visualization, use data art
Game

Some thing interesting about game, make everyone happy.

Recommend Org

Facebook

We are working to build community through open source technology. NB: members must have two-factor auth.
Microsoft

Open source projects and samples from Microsoft.
Google

Google ❤️ Open Source for everyone.
Alibaba

Alibaba Open Source for everyone
D3

Data-Driven Documents codes.
Tencent

China tencent open source team.