GraphineForPythonistas
Getting Started
Downloading and installing Graphine is really simple, and is documented
on the main Documentation page.
To make sure your installation went according to plan, navigate to the directory
you downloaded graphine to and do the following:
username@hostname:~/graphine$ cd graph username@hostname:~/graphine/graph$ ./test.py CorrectnessTest
If all goes well you should see a series of dots, some output, and OK.
If it doesn’t, let us know and we’ll try to help you out.
Of course, all the code examples provided here are in the package you just downloaded, under “examples”.
Why Should You Care?
I'm assuming that by the time you’re this interested in a graph library you already
know what one is and why you might want it, and the only question you really want
answered is whether I'm wasting your time or if Graphine will really make your job
easier. Because I have no idea what your job is, I'm hoping that you’ll tell me- but
in the meantime, I’ll answer the only way I can: by telling you how I've used it, or
would have had it existed.
Before I get to that, though, I want to answer another question that’s implicit in this
one: why not use another graph library? After all, there’s no shortage of them- and
some, like NetworkX and LEDA, are excellent, well-documented projects, run by
people who actually know what they’re talking about.
I developed Graphine because there didn’t seem to be any existing projects that
met my needs- first, for a graph library without any external dependencies, and
secondly for one that would play nicely with Python 3. In addition, I got tired of
having to represent my data structures as being a collection of big, beefy nodes
with some dainty little lines connecting them, when many graph applications are
primarily edge-centric, and finally, I wanted to be able to have a single graph class
that was capable of representing both directed and undirected edges. If any of that
sounds like something you’re interested in, keep reading- otherwise, NetworkX is a
truly remarkable project with a huge number of excellent third party tools, and I
highly recommend that you take a look at it.
Enough with the heart-to-heart though- on to the code.
Using Graphine
Graphine is divided up into one package with two modules, only one of which is really
interesting to us from here on out: graph.base. You can see the basic documentation
for it on the Base page.
Obviously, though, all of that is just examples. What happens when you try to build
a real application using Graphine? Does it wind up helping or hurting? Let’s find out.
Example 1: State Machine
A Finite State Machine is a model of computation that most programmers write and don’t
think about- its basic objective is just to track the state of execution against a given set
of input data. Here, we represent one using Graphine- a basic string pattern matcher that
nevertheless can be extended to match patterns of considerable complexity, albiet not
as efficiently as the regex module.
#! /usr/bin/env python3.0 from graph.base import Graph import sys def walk_fsm(data): fsm = Graph() fsm.add_edge("consume_one", "loop_on_zero", consumes="1", accept=False) fsm.add_edge("loop_on_zero", "loop_on_zero", consumes="0", accept=False) fsm.add_edge("loop_on_zero", "end_with_one", consumes="1", accept=True) accept = False w = fsm.walk_path("consume_one") for edges in w: try: next = {e.consumes: e for e in edges}[data.pop(0)] accept = next.accept w.send(next) except: return False return accept and not data if __name__ == "__main__": accept = walk_fsm(list(sys.argv[1])) if accept: print("Accepted!") else: print("Not accepted!")
The bulk of the work is done by the walk() function, which simply selects from the possible candidate edges
(transitions) one which will consume the given character, and bails if one isn’t found.
Example 2: Exploring Websites
Let’s say that you’re a web developer- who isn’t these days, right?– and you’re worried about the usability of
your website. Specifically, you want to make sure that all your pages are reachable from your index within
a certain number of clicks. You could test that by hand, but if you were interested in doing every boring and
repetitive task you came across by hand you probably wouldn’t have learned Python in the first place. So,
in addition to presuming you’re a web developer, let’s assume you’re a lazy web developer, and want to set
up a script to automatically tell you how many clicks it will take to get from your start page to any number
of other pages on your site. Again, Graphine makes this easy- just represent pages as nodes, with links
being the edges between them, then use the get_shortest_paths() function to get all the interesting path
lengths. Here’s the code:
#! /usr/bin/env python3.0 import sys import os.path from html.parser import HTMLParser from graph.base import Graph G = Graph() class UrlGetter(HTMLParser): """Extracts all the links from the given html""" links = None def handle_starttag(self, tag, attrs): attrs = dict(attrs) if tag == "a": href = attrs.get("href", False) if href: if not self.links: self.links = [] self.links.append(href) def get_urls(filename): """Extracts all the links from the given file.""" parser = UrlGetter() txt = str(open(filename).read()) parser.feed(txt) parser.close() return parser.links if __name__ == "__main__": # get the starting point start = sys.argv[1] # get the files we're interested in files = sys.argv[1:] # add nodes G.add_node(start) for f in files: G.add_node(f) # get all their links for f1 in files: links = get_urls(f1) # filter them against the files list for link in links: for f2 in files: if link == f2: G.add_edge(f1, f2) # get the shortest paths shortest_paths = G.get_shortest_paths(start) # print the results for node, path in shortest_paths.items(): print("The shortest path from %s to %s is %d clicks" % (start, node.name, path[0]))
As you can see, Graphine does pretty much all the non-HTML work here for you. In fact, the major problem here is
deciding what information we’re really interested in- the get_shortest_paths() function also returns information about
the complete path between the start point and end point which we cheerfully ignore, and it would be trivial to use
depth_first_traversal() or get_connected_components() to find out which other pages are reachable from each of the
others. Neat, huh?
A quick note: this code assumes that the names in the “href=” portion of the anchor tags will exactly match the
names you pass it to start out with. Since I'm showing off Graphine and not my l33t regex skillz, I’ll leave that as
an exercise for the reader, along with having it traverse the webroot rather than being passed in a static list.
Example 3: Mazes
Let’s say that you’re writing a game- one in which a player must complete a given objective before their opponent, an AI.
On one level, that challenge is a random maze, consisting of rooms with less than 5 doors. The goal is to get out
of the maze.
This is a non-trivial problem; it has several layers of complexity, two decision making processes, and a piece of data- the map-
which is itself not easy to represent. With Graphine, however, the problem quickly becomes simple- define the maze as a Graph,
with each room being a node and each edge being an open wall. Representing the players is also extremely simple- they are
going to be given the choice of where to go, with the player using the walk_nodes() function to make successive, and apparently
random, choices, while the AI uses a most-doors heuristic and heuristic_traversal..
Here’s the code:
#! /usr/bin/env python3.0 import random from graph.base import Graph NUM_ROOMS = 25 NUM_DOORS = 4 def build_maze(): """Creates the maze.""" maze = Graph() # add the player's starting positions and the end p1_start = maze.add_node("p1_start") p2_start = maze.add_node("p2_start") end = maze.add_node("END") # create a list of connected components nodes = [[p1_start], [p2_start], [end]] # add all the rooms, making each its own component for i in range(NUM_ROOMS): nodes.append([maze.add_node(i)]) # while some components are unconnected while len(nodes) > 1: # choose two components at random component_1, component_2 = random.sample(nodes, 2) # and one node from each component node_1 = random.choice(component_1) node_2 = random.choice(component_2) # and if they don't have too many doors... if len(node_1.outgoing) << NUM_DOORS and len(node_2.outgoing) < NUM_DOORS: # connect them maze.add_edge(node_1, node_2, is_directed=False) # then merge the components component_1.extend(component_2) nodes.remove(component_2) return p1_start, p2_start, maze def ai_path(start, maze): """Defines the AI's heuristic and finds the path it will take.""" # selector is the selection heuristic that the AI will use def selector(candidates): best = (0, -1, None) for pos, room in enumerate(candidates): if room.name == "END": return candidates.pop(pos) num_doors = len(room.outgoing) if num_doors > best[0]: best = (num_doors, pos, room) return candidates.pop(best[1]) # the total distance traveled distance = 0 previous = start # traverse the maze, using selector() as your heuristic for node in maze.heuristic_traversal(start, selector): # take all the steps between dead ends distance += maze.get_shortest_paths(previous)[node][0] # and end if you're at the end if node.name == "END": return distance previous = node def player_select(options): """Prints the player's options and gets their choice.""" selections = {} print("You have %s options:" % (len(options))) for pos, option in enumerate(options): selections[pos] = option print("%d. Room %s, with %s doors" % (pos, option.name, len(option.outgoing))) choice = int(input("Which do you want to take? ")) return selections[choice] def handle_player(start, maze, max_length): """Walks the maze, giving the player the choice of where to go.""" w = maze.walk_nodes(start) for next in w: # while the AI hasn't beat you and you've got places to go if next and max_length: max_length -= 1 # go where the player tells you selection = player_select(next) if selection.name == "END": print("You win!") return next = w.send(selection) # otherwise, lose print("You lose!") if __name__ == "__main__": # build the maze p1_start, p2_start, maze = build_maze() # run the simulation for the AI path_to_beat = ai_path(p1_start, maze) # get the player's moves handle_player(p2_start, maze, path_to_beat)
Conclusion
As you can see, Graphine makes pretty short work of some otherwise nontrivial tasks. We processed simple strings,
explored websites, and built an AI to traverse a maze, each in less than 100 lines of code. If you have any questions-
about this code, about Graphine, or other pieces of code you'd like to see worked through, you can post on our Google
group here or email me directly at debatem1@remove.this.gmail.com.

