# Graph Traversal: solving the 8-puzzle with basic A.I.

I’ve been working through Peter Norvig and Stuart Russell’s *Artificial
Intelligence: A Modern Approach* (thanks to the Square engineering
library) and one of the most helpful chapters involved methodically
demonstrating basic graph traversal algorithms for problem solving. If
that sounds heady, it’s not; I think you’ll enjoy it.

First, let’s talk about graphs. A graph is any set of points (nodes) and
the lines (edges) between those points. A simple kind of graph is a tree
structure. This is where there’s a single ‘root’ which has 1 or more
branches that then have their own branches, etc. Many things in computer
science can be expressed using some kind of tree structure graph.

Here’s how this relates to AI: the root node (the one at the top of
the tree) is the state the world is in right now. Each other node
represents a different state of the world that’s reachable across an
edge from the node immediately above it. The line between them is some
kind of action. Moving a car’s wheels, turning on a servo, whatever. The
A.I. just needs to know that if you start at node A and do action X
you’ll get to node B. Your software will appear to be intelligent if it
can start at the root node and find its way to a better state of the
world through a series of actions.

## A world where you try to get a big number

Let’s reuse the above image as an example. Imagine it’s a complete map
of all possible states of the world. The only actions that are
available to you are “go left” and “go right” (‘cause you’re always
starting at the root (top) node and heading downward). Imagine,
additionally, that the value of the number of a node is how much we like
that node. At a glance you can easily tell that one of the nodes has a
value of **11** and is therefore the “best” node in the graph; it’s the
“best” state of the world that we could possibly get to. If our software
is intelligent it should start at the root node and find the **11** node
as its destination.

Since we can easily see where the **11** node is, the right answer is “go
left, then right, then right again.” But how do we teach our code to
find the best node?
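
To make the example concrete, here is one way such a tree could be represented in Ruby. This is only a sketch: the `Node` struct, the `node` and `follow` helpers, and every value except the **11** are invented for illustration. The one thing taken from the text is that the **11** sits at “left, then right, then right again.”

```ruby
# Hypothetical tree node: a value plus a list of child nodes.
Node = Struct.new(:value, :children)

# Small helper so the tree reads top-down; leaves get an empty child list.
def node(value, *children)
  Node.new(value, children)
end

# Made-up values, apart from the 11 at left -> right -> right.
root = node(5,
            node(3,
                 node(1),
                 node(4,
                      node(2),
                      node(11))),
            node(8,
                 node(6),
                 node(7)))

# Follow a list of :left / :right actions down from a starting node.
def follow(start, actions)
  actions.reduce(start) do |current, action|
    action == :left ? current.children.first : current.children.last
  end
end

puts follow(root, [:left, :right, :right]).value # prints 11
```

Each action is an edge and each node is a state, so “finding the best state” amounts to finding the list of actions that leads to it.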

## Recursive Depth-first search

Here’s the simplest way we can find the right node on the given tree.

```
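@best = nil                                    # An instance variable, because a plain
def walk(node)                                 # local isn't visible inside a `def`.
  @best = node if @best.nil? || node > @best   # Assumes nodes are comparable by value.
  node.children.each {|child| walk child }
end
walk root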
```

Here we “walk” down each branch of the tree all the way to the end using
recursion. This is known as recursive depth-first search and is a great
tool when you think that any path might have a good node really far down
so you just want to get really deep really fast. It’s also the least
code possible to find a solution to our problem. Unfortunately, the
simple implementation of depth-first involves recursion, which means
we’re limited to only traversing graphs that have a total depth less
than our runtime’s stack frame limit. If you’ve ever seen a “stack
overflow error” it’s because your program recursed so deeply that the
runtime ran out of room on its call stack and gave up. To demonstrate
this, try running this simple program in `irb`:

```
def go(n)     # On my machine the last thing printed was 8247 and then I saw:
  puts n      # SystemStackError: stack level too deep
  go n + 1    # which means Ruby let me use 8,247 stack frames
end           # before giving up. If I were to try going 9,000 nodes deep
go 0          # in a graph I'd get this error.
```

Now, if you wanted to do a depth-first search without recursion you can,
but it’s no longer the simplest code so I’ll skip it here. Suffice it to
say that it would mean you’ll have to manually keep track of which nodes
to visit in which order rather than letting your programming language do
it for you.
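
For the curious, a non-recursive version might look something like the sketch below. It assumes a hypothetical `Node` struct with `value` and `children` (not part of the original code) and keeps its own explicit stack in an array, so the depth it can reach is limited by memory rather than by Ruby’s call stack.

```ruby
# Hypothetical tree node: a value plus a list of child nodes.
Node = Struct.new(:value, :children)

# Depth-first search with an explicit stack instead of recursion.
def best_value_depth_first(root)
  stack = [root]
  best = nil
  until stack.empty?
    current = stack.pop                  # Take the most recently added node.
    best = current.value if best.nil? || current.value > best
    # Push children in reverse so the leftmost child is explored first,
    # matching the order of the recursive version.
    current.children.reverse_each { |child| stack.push(child) }
  end
  best
end

tree = Node.new(5, [
  Node.new(3, [Node.new(1, []), Node.new(4, [Node.new(2, []), Node.new(11, [])])]),
  Node.new(8, [Node.new(6, []), Node.new(7, [])])
])

puts best_value_depth_first(tree) # prints 11
```

The only real difference from the recursive `walk` is that the array plays the role the call stack played before.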

## Breadth-first search in a loop

Another simple way to traverse the graph is to look at the nodes from
left to right on each level rather than going all the way down one
branch and then all the way down another branch. This is called
breadth-first search and is better if you think there’s a good node
close to the root.

This is also a very simple bit of code *and* it frees us
from having to rely on our runtime’s call stack. Rather than recursing
through methods and letting Ruby keep track of our work (as in the above
example) we’ll just run in a loop and store our work in an array. The
advantage here is you can put more than 9,000 items in an array;
it’s only bounded by how much memory is on your machine.

```
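queue = [root]
best = root                                           # The root is the best node we've seen so far
until queue.empty?
  current = queue.shift                               # Take the next node from the front...
  best = [current, best].max                          # (assumes nodes are comparable by value)
  current.children.each {|child| queue.push child }   # ...and add its children to the back
end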
```

The name “breadth-first” comes from the fact that it’ll look at all the
nodes at each level from side to side before proceeding down to the next
level. Notice that the array variable is named `queue`. This is because
in a breadth-first implementation you’re always going to have a list that
you put newly-discovered nodes onto the end of and pull nodes to explore
off of the front.
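
As a small illustration of that visiting order, the sketch below records the values in the order a queue-driven loop reaches them. The `Node` struct and the tree values are hypothetical, chosen so that the level-by-level order is easy to see.

```ruby
# Hypothetical tree node: a value plus a list of child nodes.
Node = Struct.new(:value, :children)

# Returns node values in the order breadth-first search visits them.
def breadth_first_order(root)
  queue = [root]
  order = []
  until queue.empty?
    current = queue.shift            # Pull from the front of the queue...
    order << current.value
    queue.concat(current.children)   # ...and append children to the back.
  end
  order
end

tree = Node.new(1, [
  Node.new(2, [Node.new(4, []), Node.new(5, [])]),
  Node.new(3, [Node.new(6, [])])
])

p breadth_first_order(tree) # prints [1, 2, 3, 4, 5, 6]
```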

Breadth-first search is easy to reason about, you won’t run out of stack
space like when you used recursion (although it’s possible you’ll run
out of memory), and if your environment supports concurrency primitives
you might be able to run it in parallel quite easily. C# has a Consumer
Queue that can help with this and Clojure has multiple ways to iterate
through a list in parallel. Ruby requires you to do more work
synchronizing when threads get to append their newly-discovered nodes to
the end of the queue but otherwise you get the parallelism cheaply.

## Solving the sliding-block puzzle

We walk along the edges of a graph from the starting node to a better
node. With me so far? If this graph represents a set of world states
that are reachable from each other via actions then searching the graph
is the same thing as figuring out what to do.

Let’s use this technique to try to solve a problem that has a clear
starting state and a clear ending state with many (possibly *very* many)
intermediate states. The sliding-block puzzle (often called an 8-puzzle
or, in its larger variant, a 15-puzzle) is a great case for us to
tackle.

In an 8-puzzle you’ve got a bunch of tiles in the wrong places and just
one empty space that you can move around until the tiles are in the
right order. How do you know which move to make first? How do you know
when you’re on the right track? How do you know if you’re going in
loops?

If we model each possible action as an edge in a graph and each
potential puzzle state as a node then we just start at the beginning and
begin exploring the graph. We’ll stop once we’ve found the solution (or,
if we built our code poorly, we’ll stop when we run out of memory or
time).

Now, there are two ways we can set up our data for this problem. One is
to generate all possible states (nodes) that the puzzle can have and to
then connect the adjacent states. We would then have a complete graph we
could traverse and we’d even be able to mark the solution ahead of time
and know where it was located in the graph. Unfortunately, the number of
states is 9-factorial or about 360,000. Generating that many puzzle
arrangements and then iterating through each one would take, at
best, `(9!)(9! - 1)/2` node-comparison operations (the formula for how
many edges can exist between `n` nodes in a graph is `n * (n - 1) / 2`).

So let’s not do that. Rather, let’s start at the root node (the starting
state) and then create branches from each node as we go. We’ll stop
when we discover our solution, hopefully long before we examine 360,000
states.
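
The arithmetic behind those numbers is easy to check. The helper names `factorial` and `max_edges` below are made up for this sketch. It also prints half of 9!, because a well-known property of sliding puzzles is that only half of the tile arrangements can be reached from any given starting position.

```ruby
# n! computed with plain integer multiplication.
def factorial(n)
  (1..n).reduce(1, :*)
end

# Maximum number of edges between n nodes: n * (n - 1) / 2.
def max_edges(n)
  n * (n - 1) / 2
end

states = factorial(9)
puts states             # prints 362880
puts states / 2         # prints 181440 (arrangements reachable from one start)
puts max_edges(states)  # prints 65840765760
```

Roughly 65.8 billion pairwise comparisons is a lot of work just to build a graph we mostly won’t visit.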

## Defining a puzzle class

We’re going to need a few tools. First, let’s put together a way to
represent a puzzle board with tiles in a particular position:

```
class Puzzle
  Solution = [0, 1, 2,   # Let's put all the tiles in ascending order
              3, 4, 5,   # the same way you'd see them on a phone keypad.
              6, 7, 8]   # We use a '0' for the blank cell because
                         # `nil` doesn't play well with others.
  attr_reader :cells

  def initialize cells
    @cells = cells
  end

  def solution?
    Solution == @cells
  end
end
```

What did we just do there? That is a `Puzzle` class where each instance
knows whether it’s a solution. The cells/tiles of the puzzle are kept in
a list.

Now let’s construct a way to represent a state (a node on the
solution graph). A state isn’t just a representation of puzzle tile
position but *also* the history of how that puzzle arrangement was
reached from the starting point. This is key: if we don’t keep track of
how we got to a solution node on the graph then we’ll never be able to
report how to solve the puzzle. So we need to keep a list as we go of
which actions we’ve taken to arrive at the current node.

```
class Puzzle                                            # We're extending the Puzzle class.
  def zero_position                                     # `zero_position` tells us which cell has
    @cells.index(0)                                     # the '0'.
  end

  def swap swap_index                                   # `swap` tells the puzzle:
    new_cells = @cells.clone                            # "give me you, but with the '0' cell
    new_cells[zero_position] = new_cells[swap_index]    # replaced by the cell at some other
    new_cells[swap_index] = 0                           # location of my choice."
    Puzzle.new new_cells                                # This is how we'll simulate moving a tile.
  end
end

class State                                             # Each `State` instance represents a node
  Directions = [:left, :right, :up, :down]              # in our solution graph. It keeps track of
                                                        # both a puzzle and the list of actions
                                                        # required to arrive there from the
  attr_reader :puzzle, :path                            # starting node.

  def initialize puzzle, path = []
    @puzzle, @path = puzzle, path                       # The `path` is a list of actions
  end                                                   # like 'up', 'down', 'right', 'right'

  def solution?                                         # Each node in our graph should
    puzzle.solution?                                    # know if it's a solution.
  end

  def branches                                          # Returns all adjacent possible states
    Directions.map do |dir|                             # including steps needed to get there.
      branch_toward dir                                 # Most nodes will have 2-4 branches based
    end.compact.shuffle                                 # on how many directions the blank tile
  end                                                   # can try to go.

  private

  def branch_toward direction
    blank_position = puzzle.zero_position
    blankx = blank_position % 3
    blanky = (blank_position / 3).to_i
    cell = case direction                               # The only reason this method is so long
           when :left                                   # is because sometimes the blank tile is already
             blank_position - 1 unless 0 == blankx      # at a wall and that direction isn't possible
           when :right
             blank_position + 1 unless 2 == blankx
           when :up
             blank_position - 3 unless 0 == blanky
           when :down
             blank_position + 3 unless 2 == blanky
           end
    State.new puzzle.swap(cell), @path + [direction] if cell
  end
end
```

This `State` class knows about one particular arrangement of the puzzle
and is able to determine next steps. When we call `State#branches` we get
a list of adjacent puzzle arrangements (anything reachable by moving the
empty space over by 1 square) and each of these new states includes the
full list of steps necessary to reach it.

That’s the setup. Now that we have some problem-specific helpers we can
use our breadth-first algorithm from up above to start tackling this.

```
require 'set'                               # We'll remember what we've seen in a set, it has
                                            # way better lookup times than an array.
def search state
  state.branches.reject do |branch|         # Important: don't revisit puzzles
    @visited.include? branch.puzzle.cells   # you've already seen!
  end.each do |branch|                      # The list of places we need to search
    @frontier << branch                     # is known as the 'frontier'
  end
end

def solve puzzle
  @visited = Set.new
  @frontier = []
  state = State.new puzzle
  loop {
    @visited << state.puzzle.cells
    break if state.solution?                # This is the `base` or end condition
    search state
    state = @frontier.shift                 # Pull another off the list, keep chugging along
  }
  state
end
```

If we feed in a solvable puzzle we can see that this code works. Let’s
try one where the empty tile was moved right and then down. The solution
should be to move it up and then left:

```
p solve(Puzzle.new [1, 4, 2,      # `solve` is going to return a State
                    3, 0, 5,      # instance. We care about its #path
                    6, 7, 8]).path
## => [:up, :left]
```
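
A side note on “solvable”: not every arrangement of tiles can be solved, and `solve` as written would search until the frontier ran dry on one that can’t. A common way to check up front on a 3×3 board is to count inversions (pairs of tiles that appear in the wrong relative order, ignoring the blank); with the goal layout used here, a board is solvable when that count is even. The `inversions` and `solvable?` helpers below are hypothetical names for this sketch, not part of the reference code.

```ruby
# Count pairs of tiles where a larger tile appears before a smaller one.
# The blank (0) is ignored.
def inversions(cells)
  tiles = cells.reject(&:zero?)
  tiles.each_with_index.sum do |tile, i|
    tiles[(i + 1)..-1].count { |later| later < tile }
  end
end

# On a 3x3 board whose goal has zero inversions, an even count means the
# goal is reachable.
def solvable?(cells)
  inversions(cells).even?
end

p solvable?([1, 4, 2, 3, 0, 5, 6, 7, 8]) # prints true
p solvable?([0, 2, 1, 3, 4, 5, 6, 7, 8]) # prints false (tiles 1 and 2 swapped)
```

The rule relies on the board being three cells wide; wider boards with an even width need a slightly different check.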

So… it works, but it’s just kinda wandering around until it finds a
solution. We gave it a problem that was only 2 steps from a solution so
if we gave it something harder would it ever finish? And how long would the
solution path be?

Here’s our code running with a puzzle whose optimal solution is 20 steps
away:

```
p solve(Puzzle.new [7, 6, 2,      # I generated this by creating a solution state and
                    5, 3, 1,      # running `state = state.branches.sample` in a big loop.
                    0, 4, 8]).path
## => [:up, :up, :right, :down, :right, :down, :left, :left,
##     :up, :up, :right, :down, :down, :right, :up, :left,
##     :down, :left, :up, :up]
## Time: 27 seconds
```

It works! Eventually. But 27 seconds is a bit slow. What if we were
tackling the 15-puzzle instead? Rather than the 9! (360K) options we
would be searching through 16! (roughly 21 trillion) options. That would
take almost literally forever.
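
One way to convince yourself that a returned path really solves the puzzle is to replay it move by move. The `replay` helper and the `OFFSETS` table below are hypothetical additions for this sketch; it trusts that every move in the path is legal and does no wall checking.

```ruby
# How far the blank's index shifts for each direction on a 3x3 board.
OFFSETS = { left: -1, right: 1, up: -3, down: 3 }.freeze

# Apply each direction in `path` to the blank cell and return the final board.
def replay(cells, path)
  path.reduce(cells) do |board, direction|
    blank = board.index(0)
    target = blank + OFFSETS.fetch(direction)
    moved = board.dup
    moved[blank], moved[target] = moved[target], 0   # Slide the tile into the gap.
    moved
  end
end

long_path = [:up, :up, :right, :down, :right, :down, :left, :left,
             :up, :up, :right, :down, :down, :right, :up, :left,
             :down, :left, :up, :up]

p replay([1, 4, 2, 3, 0, 5, 6, 7, 8], [:up, :left]) # prints [0, 1, 2, 3, 4, 5, 6, 7, 8]
p replay([7, 6, 2, 5, 3, 1, 0, 4, 8], long_path)    # prints [0, 1, 2, 3, 4, 5, 6, 7, 8]
```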

## Uniform-cost search

As we walk the graph we’re keeping a `frontier`, a list of states
we’re hoping to explore in the future. Since we always add to the back
of the list and take (`shift`) from the front it’s technically a FIFO
queue rather than just a list.

What if, rather than picking the next element from the queue to explore,
we tried to pick the *best* one? Then we wouldn’t have to explore quite
so many trillions of nodes in our state graph.

Uniform-cost search entails keeping track of how far any given node
is from the root node and using that as its `cost`. For our puzzle
example that means that however many steps `n` it takes to get to state
`s` then `s` has a cost of `n`. In code:
`s.cost = steps_to_reach_from_start(s)`. A variant of this is called
Dijkstra’s Algorithm.

There’s one missing piece here though: we don’t want to examine every
item in the entire `frontier` queue every time we want to pick the
next lowest-cost element. What we need is a priority queue that
automatically sorts its members by some value so looking up an element
by cost is cheap and doesn’t slow down the rest of what we’re trying to
do.

```
class PriorityQueue              # This is a terrible implementation of a
  def initialize &comparator     # priority queue. The `#sort!` method iterates
    @comparator = comparator     # through every item every time.
    @elements = []               # What you want is a priority queue backed by a heap data structure.
  end                            # In Ruby you should use the `PriorityQueue` gem
                                 # and on the JVM there's a good Java implementation.
  def <<(element)
    @elements << element
    sort!
  end

  def pop                        # `pop` is the typical queue-polling nomenclature.
    @elements.shift              # Your implementation may call it something else.
  end

  private

  def sort!
    @elements = @elements.sort_by &@comparator   # This line is why this implementation
                                                 # sucks. Don't use this IRL.
  end
end

class State
  def cost       # The cost is pretty simple to calculate here.
    path.size    # The path contains all the steps, in order, that we
  end            # used to arrive at this state. So the cost
end              # is just the number of steps.

require 'set'
def solve puzzle
  @visited = Set.new
  @frontier = PriorityQueue.new {|s| s.cost }
  state = State.new puzzle
  loop {
    @visited << state.puzzle.cells
    break if state.solution?
    search state
    state = @frontier.pop
  }
  state
end
```
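
For a sense of what a heap-backed queue involves, here is a minimal binary min-heap with the same `<<` and `pop` interface. `HeapQueue` is a hypothetical name for this sketch, not the gem mentioned above, and it skips extras such as updating the priority of an item already in the queue.

```ruby
# Minimal binary min-heap: `<<` and `pop` each take O(log n) time instead
# of re-sorting the whole list.
class HeapQueue
  def initialize(&priority)
    @priority = priority          # Block that maps an element to its cost.
    @heap = []                    # Array of [cost, element] pairs.
  end

  def empty?
    @heap.empty?
  end

  def <<(element)
    @heap << [@priority.call(element), element]
    sift_up(@heap.size - 1)
    self
  end

  # Removes and returns the element with the lowest cost (nil when empty).
  def pop
    return nil if @heap.empty?
    top = @heap.first
    last = @heap.pop
    unless @heap.empty?
      @heap[0] = last             # Move the last pair to the root...
      sift_down(0)                # ...and let it sink to its place.
    end
    top.last
  end

  private

  def sift_up(i)
    while i > 0
      parent = (i - 1) / 2
      break if @heap[parent].first <= @heap[i].first
      @heap[parent], @heap[i] = @heap[i], @heap[parent]
      i = parent
    end
  end

  def sift_down(i)
    loop do
      left = 2 * i + 1
      right = 2 * i + 2
      smallest = i
      smallest = left if left < @heap.size && @heap[left].first < @heap[smallest].first
      smallest = right if right < @heap.size && @heap[right].first < @heap[smallest].first
      break if smallest == i
      @heap[i], @heap[smallest] = @heap[smallest], @heap[i]
      i = smallest
    end
  end
end

queue = HeapQueue.new { |n| n }
[5, 3, 8, 1, 9, 2].each { |n| queue << n }
drained = []
drained << queue.pop until queue.empty?
p drained # prints [1, 2, 3, 5, 8, 9]
```

Because it takes the same block as the `PriorityQueue` above, it should be possible to swap it into `solve` without other changes, though that combination is untested here.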

**Sidebar:** you may be wondering why this gains us any advantage. Sure,
we’re now picking the best node from the queue rather than whichever
one was added first but we still have to explore all of them, right?

Actually, no. Because we’re sorting by the ‘cost’ of the nodes we can be
guaranteed that whenever we find a solution it’s the best one. There
may be other paths to solutions in our graph but they are all guaranteed
to be of equal or higher cost. So this uniform-cost search lets us leave
a vast section of the queue unexplored.

What does that do to our performance? Well, if we re-run our above
20-step puzzle the time will drop considerably from 27 seconds to 10
seconds (on my machine).

This is a big speedup and, for larger problems, can shave days off the
calculation time. But there’s much more we can do.

## A* Search

The uniform-cost search picks the best next state from the frontier.
Let’s enhance the code’s understanding of what makes something
“best” by calculating not only the distance from the start to where we
are but the distance from where we are to the goal.

Old cost function: `steps_to_get_to(s)`

New cost function: `steps_to_get_to(s) + steps_to_goal_from(s)`

But, uh, how do we know how far we are from the solution? If we knew how
far away the solution was we’d probably already have found it, right?

Right. So rather than being exact, let’s just pick a healthy estimate of
how far we are from a solution. One approximation would be “how many
tiles are out of place?” That would at least differentiate
almost-solution nodes from not-even-close ones. But we’d like to be a
bit more precise.
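
The “tiles out of place” estimate is short enough to sketch. `GOAL` and `misplaced_tiles` are hypothetical names, and leaving the blank out of the count is a choice made for this sketch so that a board one move from the goal scores 1 rather than 2.

```ruby
GOAL = [0, 1, 2, 3, 4, 5, 6, 7, 8].freeze

# Number of tiles (ignoring the blank) that are not in their goal cell.
def misplaced_tiles(cells)
  cells.zip(GOAL).count { |tile, goal| tile != 0 && tile != goal }
end

p misplaced_tiles([1, 4, 2, 3, 0, 5, 6, 7, 8]) # prints 2
p misplaced_tiles([7, 6, 2, 5, 3, 1, 0, 4, 8]) # prints 6
```

The second board is the 20-step puzzle from earlier, so a score of 6 badly underestimates the work left, which is why a more precise measure is worth having.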

So let’s say that the distance cost between a given node and the
solution node is the number of tile-movements that would be required if
tiles could move through each other and go straight to their goal
positions. So a near-solution node might have a distance cost of 3 and a
not-even-close node might have a distance cost of 26. That should give
us decent precision while also being fair. It’s important that our
cost-to-get-to-goal function doesn’t accidentally deprioritize good
near-solution states.

To help us we’ll calculate the Manhattan Distance between each
tile and where it’s supposed to be. Manhattan Distance is the distance
between two places if you have to travel along city blocks.
Essentially, you’re adding up the short sides of a right triangle rather
than shortcutting across the hypotenuse. The formula is pretty simple:

```
class Puzzle
  def distance_to_goal                                  # Here we `zip` the current puzzle with the solution
    @cells.zip(Solution).inject(0) do |sum, (a, b)|     # and total up the distances between each cell
      sum + manhattan_distance(a % 3, (a / 3).to_i,     # and where that cell should be.
                               b % 3, (b / 3).to_i)     # This % and / stuff is just turning an integer
    end                                                 # into puzzle x,y coordinates
  end

  private

  def manhattan_distance x1, y1, x2, y2                 # The manhattan distance of something is just
    (x1 - x2).abs + (y1 - y2).abs                       # the distance between x coordinates
  end                                                   # plus the distance between y coordinates
end

class State
  def cost
    steps_from_start + steps_to_goal    # Now we have a more informed cost method
  end                                   # so our priority queue should be giving
                                        # us better results.
  def steps_from_start
    path.size
  end

  def steps_to_goal
    puzzle.distance_to_goal
  end
end

require 'set'
def solve puzzle
  @visited = Set.new
  @frontier = PriorityQueue.new {|s| s.cost }
  state = State.new puzzle
  loop {
    @visited << state.puzzle.cells      # Without this, `search` never skips a seen puzzle
    break if state.solution?
    search state
    state = @frontier.pop
  }
  state
end
```
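
One detail worth a second look: the `distance_to_goal` method above adds in the blank cell’s distance along with the tiles’. Treatments of this heuristic usually leave the blank out so that the estimate never exceeds the true number of moves remaining, which is the property that lets A* promise the shortest path. The standalone `manhattan` function below is a hypothetical sketch (it assumes the same goal layout, where tile `n` belongs at index `n`) that shows the difference on a board one move from the goal.

```ruby
# Sum of Manhattan distances between each tile and its goal cell.
# Tile n belongs at index n, so its goal x,y come from the tile value
# and its current x,y come from its index.
def manhattan(cells, count_blank: true)
  cells.each_with_index.sum do |tile, index|
    next 0 if tile.zero? && !count_blank
    (tile % 3 - index % 3).abs + (tile / 3 - index / 3).abs
  end
end

one_move_away = [1, 0, 2, 3, 4, 5, 6, 7, 8]
p manhattan(one_move_away)                     # prints 2, more than the 1 move needed
p manhattan(one_move_away, count_blank: false) # prints 1

twenty_steps = [7, 6, 2, 5, 3, 1, 0, 4, 8]
p manhattan(twenty_steps)                      # prints 14
p manhattan(twenty_steps, count_blank: false)  # prints 12
```

Counting the blank can still work well in practice, but if a guaranteed-shortest path matters, skipping it is the safer variant.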

If you’re following along at home (and using a real priority queue) you
might think the code is broken because it exited so fast. With a proper
priority queue implementation this latest search took 0.07 seconds.

This A* search is able to quickly pick the best candidate to explore in
any situation where the distance from the current state to the goal
state can be estimated. In real-world pathfinding, for example, you can
use the geospatial distance between two points. It doesn’t work at all,
however, in situations where you know the goal when you see it but can’t
determine how close you are. A robot trying to find a door in unexplored
territory would not be able to use this; it would have to just keep
bumbling around.

The full reference code for this is on GitHub, including a full
implementation in Clojure.

Huge thanks to my reviewer Ashish Dixit, without whom this post would
have been a typo-filled mess of half-conveyed ideas.

A quick recap of the relative time and memory costs for these search
algorithms:

```
uninformed depth-first: {stack overflow error}
breadth-first w/o tracking `visited`: {out of memory error}
uninformed breadth-first: 27 seconds, 47,892 explored states
uniform-cost (Dijkstra's): 10 seconds, 51,963 explored states
A* search: 0.07 seconds, 736 explored states
```