# Computationally solving Cardinal Chains: An adventure with graphs.

Cardinal Chains is a simple to understand, fantastically fun puzzle game (you can get it here, and on any mobile device).The premise is as follows: The player is given a grid of numbers, and a coloured start point. The objective is to fill the entire grid with colour by moving from the start to every square in the grid. You are not allowed to backtrack, and you can only move to a square with a higher or equal number than the one you’re currently on. Being the extremely cool person giant nerd that I am, my first thought was “Can I get a computer to play this?”

In drawing out diagrams of the levels, it became apparent they could be represented as graphs. Rather than a graph of a function, which most people are familiar with, I’m referring to a mathematical structure  made up of nodes connected together by edges. For example, the level above can be represented like this: Graphs can be weighted or unweighted, and can be undirected, directed or mixed. Weighted vs unweighted simply refers to whether the vertices have a value associated with them. These can be used to represent the “cost” of travelling along that vertex. For example, if your nodes are cities, and your vertices roads, the weights might represent how far apart the cities are (i.e. how “long” each road between them is).

In an undirected graph, every edge can be travelled along in both directions. In a directed graph, each edge only goes one way. A mixed graph is, unsurprisingly, one that contains both directed and undirected edges. It’s worth noting that any mixed (or indeed undirected) graph can be made into a directed graph by simply converting every undirected edge to two directed ones, one in each direction.

So what we have above is an unweighted mixed graph.

#### Complications

Graph theory is a pretty well established subject, so all I need to do now is look up the algorithm for visiting every node on a graph exactly once, code it up, and we’re done right? Wrong. The problem of visiting every node on a graph exactly once has indeed been studied at length (it’s called the Hamiltonian path problem), but it has been shown to be np-complete. What this means is that it’s very easy to check whether a solution is valid, but there is no efficient way to come up with a solution in the first place. Consider soduku – if someone hands you a completed grid it’s trivial to check whether it’s correct, but actually solving it in the first place is much harder. The Hamiltonian path problem has a similar type of difficulty. Note that both with soduku and the cardinal chains levels, most everyday examples are small enough that they can simply be solved with brute force: try every possible solution until you come up with the right one. I’d quite like to avoid this if possible, though.

At this point I realised I wasn’t going to be able to solve any arbitrary, randomly generated level. However, the levels in Cardinal Chains have been designed to be solvable by a human. Hell, I did the first 75 levels or so in about half an hour. So, I decided on a new strategy: play the game, and slowly, step-by-step, figure out what logical deductions I’m using to solve the problems. If I can generalise these into rules, perhaps I can come up with a step-by-step way a machine can solve each level. After doing this for a little while (and spending lots of time drawing and staring at graph representations on my whiteboard), I came up with the following rules (I refer to directed edges leaving or arriving at a node as exits and entrances):

1. “End points”, if they exist, can be located as nodes with either a single two-way edge or no exit edges.
2. If a node with a two-way edge is an end point, the edge can be made into an entry edge.
3. If a node has only a two-way edge and exit edges, the two-way edge can be converted to an entry edge (since all nodes must be accessible somehow).
4. If a node has only a two-way edge and entry edges, the two-way edge can be converted to an exit edge (since if it’s not an end point, you need to be able to leave it).
5. If a node is part of the only path to an end point, that node can’t have any other exits (as if you took them you’d never get to the end point).
6. If a node has only a single exit, all other entrances to that destination can be destroyed (since you have to arrive at it from that node).
7. If every node (apart from Start) has a single exit edge, and an end point has been found, the solution is found by backtracking from End and only moving to Start when it is the only possible option.

Following only these rules I was able to solve several levels “by hand” on my whiteboard. How exciting! Time to write some code. I chose to use Python, since it’s by far the language I’m most comfortable with, and enough  my brain is already occupied with problem solving to have to also worry about unfamiliar syntax. If I want to visualise the solutions later, I’ll use Tk, or write a little snippet that throws the relevant information to p5.js.

#### Implementing the rules

The first thing we need is something to represent our graph. For an unweighted graph, we can just use a dictionary: Each node has an ID, which we use as a key, and its values are the IDs of the nodes it is connected to. This representation is implicitly directed: {'A':['B', 'C']} represents a graph with 3 nodes and two one way vertices going from A to B and C respectively. If we want two way vertices, we have to add vertices going back the other way:

So we need a graph class, a dictionary to store the nodes and their vertices and a method to construct a graph from a Cardinal Chains level. We also need a method to check if a vertex between two nodes is two-way, since I refer to undirected vertices a lot in my rules, while our graph representation is directed. I chose to represent the levels as matrices, with a value of -1 representing an unused grid square, 0 representing a start square and then the other squares simply represented by their numbers. As a result, we can construct the graph as follows:

1. Iterate over each grid square and do the following:
• If the square contains -1, do nothing
• Otherwise, look at all the adjacent squares. If their values are greater than or equal to the current square’s value, construct vertices to them from the current square.

Dealing with all the edge cases turned this method into a bit of a monster of for loops and if/else statements, but the comments should make it clear enough.
(mouse over or click the ⇔ icon to expand the code box so you don’t need to scroll sideways).

At this point I realised a visualiser of some kind would probably prove a very hand debugging tool down the road. I’m not terribly experienced using Tkinter, so the code is a little hack-y, but I’ve included it and my test script anyways just in case you’re interested. The result confirms my graph representation is working as expected.
(again, mouse over or click the ⇔ icon to expand the code box) Great. We’re getting somewhere. Now I need to code-ify the rules I’ve come up with, and see if we can solve this level with it. I started with finding the end point, and quickly found a problem with my implementation: if a node has no vertices leaving it, it doesn’t exist in  self.nodes . So I modified  from_matrix to initialise every node as an with an empty array as the list of vertices. This also removed the slightly ugly “try…except KeyError” workaround I was using. (I could have also just used a defaultdict, whose existence I managed to forget about)

Overall, rules 1-4  and 6 were really straightforward to implement:

Rule 5 is going to be slightly more tricky – we need a way of testing if a node is part of the only path to an end point. The first thing that occurred to me was this:

1. Start at the end point
2. Backtrack as long as you have only one option of where to go
3. Keep a list of all the nodes you visit

It’s not the prettiest code I’ve ever written, and it’s rather inefficient, but at this point I’m not too worried about that; the cardinal chains levels are small enough that iterating through the nodes takes a negligible amount of time.

Repeatedly applying my check_edges() function solves the simple level we’ve been working with in 3 iterations. In fact, it solves some fairly complex levels in only a few iterations too: #### Complications….again.

My algorithm runs in to a brick wall really hard on some other levels: That’s it. We remove two nodes in one iteration, and then nothing. Despite the fact that it’s easy to “see” a solution to the level,

Spoilers!

Clockwise rotation outwards from centre (starting at 6 o’clock) followed by anticlockwise rotation around the edge.

[collapse]
we’re missing some insight to deriving it logically. Staring at the top-right “2” (that was what, for me at least, made the solution to this puzzle obvious), I realised I had overlooked the “inverse” version of rule 6:

Rule 6.5: If a node has only a single entrance, all other exits from the entering node can be destroyed.

Adding this in takes a single simple if statement and gets us much closer to solving the level:  The unsolved, problematic region in the center-right

We’re still stuck with a little chain on the right, though. Again, to a human, it’s blatantly obvious you can’t take the right-leading vertex when you reach the top-right 2. So why is it obvious? Well, once you take that vertex, part of the graph has become unreachable and we’ve therefore failed. This is a more complicated statement to test than it seems though: what we’re actually saying is “No paths from the node this vertex leads to have access to a part of the graph we have yet to visit“. To test this, we have to evaluate all the paths from a given node, and since we’re applying all our rules to each node, we’d end up evaluating every possible path in the graph, which would be even less efficient than brute-forcing every possible route from start to end. It seems there isn’t a way deterministically find the correct path using the kind of rules we’ve been following up to this point (at least, I haven’t been able to find one!). Time to leverage the fact that we are indeed using a machine, and apply some brute-force. I found a depth-first-traversal algorithm by Neelam Yadav over at GeeksforGeeks, and modified it to do what I wanted: Find every possible path (without repetition of nodes) from the start to the end point of the graph, and select only the paths that visit every node.

Obviously, this solution only works if we’ve found an end point. If we haven’t we have to do it for every possible end point (i.e. every node in the graph that isn’t a start point), which is significantly more computationally expensive. So, now we can evaluate how long it takes to implement the original graph compared to our “simplified” one. Since we’re going to be looking at performance, it’s important we’re only calling check_vertices() until it stops removing edges:

#### Performance Comparison

In order to test a large number of iterations without creating a new graph object every time, we’re going to need a function to reset the graph to its original state

And here’s our test script and results

We’re seeing a reasonably large benefit to simplifying the graph in advance; just brute-forcing takes about 1.5x longer. Let’s add some complexity to our level by placing an additional couple of rows of 4s along the bottom and see how that affects the results. The modified level 20, with 10 additional interconnected nodes

Adding more complexity increased the performance gain to about 2.5x. This is what you’d expect: the more edges in the original graph, the more potential paths you cull by simplifying it. I had to reduce the number of iterations in order to run the test in a reasonable amount of time, but since the difference in results generated between running it for 10 and 100 iterations is negligible, I’m satisfied this hasn’t affected anything.

#### Conclusions

Sadly, I wasn’t able to find a way to solve all the single start-point Cardinal Chains levels without resorting to brute force at some point. However, I was able to create an algorithm that tangibly speeds up the process of solving them, particularly for large levels with many edges. Cardinal Chains has many more levels with multiple start points, that I intend to tackle in a later post. I’d also really like to get my computer to actually play the game – that is, read the screen, calculate the solution, then move the mouse to actually implement it. This will require some fun image analysis techniques, so I’m looking forward to doing that, too.

Again, if you fancy playing some Cardinal Chains, you can get it here, the app store, or on Google play. Alternatively, if you’d like to play with the Graph class and solving methods I wrote, you can get it here. 