Getting Started

Downloading and installing Graphine is really simple, and is documented
on the main Documentation page.

To make sure your installation went according to plan, navigate to the directory
you downloaded graphine to and do the following:

username@hostname:~/graphine$ cd graph
username@hostname:~/graphine/graph$ ./test.py CorrectnessTest

If all goes well you should see a series of dots, some output, and OK.
If it doesn’t, let us know and we’ll try to help you out.

Of course, all the code examples provided here are in the package you just downloaded, under “examples”.

Why Should You Care?

I'm assuming that by the time you’re this interested in a graph library you already
know what one is and why you might want it, and the only question you really want
answered is whether I'm wasting your time or if Graphine will really make your job
easier. Because I have no idea what your job is, I'm hoping that you’ll tell me- but
in the meantime, I’ll answer the only way I can: by telling you how I've used it, or
would have had it existed.

Before I get to that, though, I want to answer another question that’s implicit in this
one: why not use another graph library? After all, there’s no shortage of them- and
some, like NetworkX and LEDA, are excellent, well-documented projects, run by
people who actually know what they’re talking about.

I developed Graphine because there didn’t seem to be any existing projects that
met my needs- first, for a graph library without any external dependencies, and
secondly for one that would play nicely with Python 3. In addition, I got tired of
having to represent my data structures as being a collection of big, beefy nodes
with some dainty little lines connecting them, when many graph applications are
primarily edge-centric, and finally, I wanted to be able to have a single graph class
that was capable of representing both directed and undirected edges. If any of that
sounds like something you’re interested in, keep reading- otherwise, NetworkX is a
truly remarkable project with a huge number of excellent third party tools, and I
highly recommend that you take a look at it.

Enough with the heart-to-heart though- on to the code.

Using Graphine

Graphine is divided up into one package with two modules, only one of which is really
interesting to us from here on out: graph.base. You can see the basic documentation
for it on the Base page.

Obviously, though, all of that is just examples. What happens when you try to build
a real application using Graphine? Does it wind up helping or hurting? Let’s find out.

Example 1: State Machine

A Finite State Machine is a model of computation that most programmers write and don’t
think about- its basic objective is just to track the state of execution against a given set
of input data. Here, we represent one using Graphine- a basic string pattern matcher that
nevertheless can be extended to match patterns of considerable complexity, albiet not
as efficiently as the regex module.

#! /usr/bin/env python3.0

from graph.base import Graph  
import sys  

def walk_fsm(data):  
    fsm = Graph()  
    fsm.add_edge("consume_one", "loop_on_zero", consumes="1", accept=False)  
    fsm.add_edge("loop_on_zero", "loop_on_zero", consumes="0", accept=False)  
    fsm.add_edge("loop_on_zero", "end_with_one", consumes="1", accept=True)  
    accept = False  
    w = fsm.walk_path("consume_one")  
    for edges in w:  
        try:  
            next = {e.consumes: e for e in edges}[data.pop(0)]  
            accept = next.accept  
            w.send(next)  
        except:  
            return False  
    return accept and not data  

if __name__ == "__main__":  
    accept = walk_fsm(list(sys.argv[1]))  
    if accept: print("Accepted!")  
    else: print("Not accepted!")

The bulk of the work is done by the walk() function, which simply selects from the possible candidate edges
(transitions) one which will consume the given character, and bails if one isn’t found.

Example 2: Exploring Websites

Let’s say that you’re a web developer- who isn’t these days, right?– and you’re worried about the usability of
your website. Specifically, you want to make sure that all your pages are reachable from your index within
a certain number of clicks. You could test that by hand, but if you were interested in doing every boring and
repetitive task you came across by hand you probably wouldn’t have learned Python in the first place. So,
in addition to presuming you’re a web developer, let’s assume you’re a lazy web developer, and want to set
up a script to automatically tell you how many clicks it will take to get from your start page to any number
of other pages on your site. Again, Graphine makes this easy- just represent pages as nodes, with links
being the edges between them, then use the get_shortest_paths() function to get all the interesting path
lengths. Here’s the code:

#! /usr/bin/env python3.0

import sys  
import os.path  
from html.parser import HTMLParser  

from graph.base import Graph  

G = Graph()  

class UrlGetter(HTMLParser):  
    """Extracts all the links from the given html"""  

    links = None  

    def handle_starttag(self, tag, attrs):  
        attrs = dict(attrs)  
        if tag == "a":  
            href = attrs.get("href", False)  
            if href:  
                if not self.links:  
                    self.links = []  
                self.links.append(href)  


def get_urls(filename):  
    """Extracts all the links from the given file."""  
    parser = UrlGetter()  
    txt = str(open(filename).read())  
    parser.feed(txt)  
    parser.close()  
    return parser.links  

if __name__ == "__main__":  

    # get the starting point  
    start = sys.argv[1]  

    # get the files we're interested in  
    files = sys.argv[1:]  

    # add nodes  
    G.add_node(start)  
    for f in files:  
        G.add_node(f)  

    # get all their links  
    for f1 in files:  
        links = get_urls(f1)  
        # filter them against the files list  
        for link in links:  
            for f2 in files:  
                if link == f2:  
                    G.add_edge(f1, f2)  

    # get the shortest paths  
    shortest_paths = G.get_shortest_paths(start)  

    # print the results  
    for node, path in shortest_paths.items():  
        print("The shortest path from %s to %s is %d clicks" % (start, node.name, path[0]))

As you can see, Graphine does pretty much all the non-HTML work here for you. In fact, the major problem here is
deciding what information we’re really interested in- the get_shortest_paths() function also returns information about
the complete path between the start point and end point which we cheerfully ignore, and it would be trivial to use
depth_first_traversal() or get_connected_components() to find out which other pages are reachable from each of the
others. Neat, huh?

A quick note: this code assumes that the names in the “href=” portion of the anchor tags will exactly match the
names you pass it to start out with. Since I'm showing off Graphine and not my l33t regex skillz, I’ll leave that as
an exercise for the reader, along with having it traverse the webroot rather than being passed in a static list.

Example 3: Mazes

Let’s say that you’re writing a game- one in which a player must complete a given objective before their opponent, an AI.
On one level, that challenge is a random maze, consisting of rooms with less than 5 doors. The goal is to get out
of the maze.

This is a non-trivial problem; it has several layers of complexity, two decision making processes, and a piece of data- the map-
which is itself not easy to represent. With Graphine, however, the problem quickly becomes simple- define the maze as a Graph,
with each room being a node and each edge being an open wall. Representing the players is also extremely simple- they are
going to be given the choice of where to go, with the player using the walk_nodes() function to make successive, and apparently
random, choices, while the AI uses a most-doors heuristic and heuristic_traversal..

Here’s the code:

#! /usr/bin/env python3.0

import random  

from graph.base import Graph  

NUM_ROOMS = 25  
NUM_DOORS = 4  

def build_maze():  
    """Creates the maze."""  
    maze = Graph()  
    # add the player's starting positions and the end  
    p1_start = maze.add_node("p1_start")  
    p2_start = maze.add_node("p2_start")  
    end = maze.add_node("END")  
    # create a list of connected components  
    nodes = [[p1_start], [p2_start], [end]]  
    # add all the rooms, making each its own component  
    for i in range(NUM_ROOMS):  
        nodes.append([maze.add_node(i)])  
    # while some components are unconnected  
    while len(nodes) > 1:  
        # choose two components at random  
        component_1, component_2 = random.sample(nodes, 2)  
        # and one node from each component  
        node_1 = random.choice(component_1)  
        node_2 = random.choice(component_2)  
        # and if they don't have too many doors...  
        if len(node_1.outgoing) << NUM_DOORS and len(node_2.outgoing) < NUM_DOORS:  
            # connect them  
            maze.add_edge(node_1, node_2, is_directed=False)  
            # then merge the components  
            component_1.extend(component_2)  
            nodes.remove(component_2)  
    return p1_start, p2_start, maze  

def ai_path(start, maze):  
    """Defines the AI's heuristic and finds the path it will take."""  
    # selector is the selection heuristic that the AI will use  
    def selector(candidates):  
        best = (0, -1, None)  
        for pos, room in enumerate(candidates):  
            if room.name == "END": return candidates.pop(pos)  
            num_doors = len(room.outgoing)  
            if num_doors > best[0]:  
                best = (num_doors, pos, room)  
        return candidates.pop(best[1])  
    # the total distance traveled  
    distance = 0  
    previous = start  
    # traverse the maze, using selector() as your heuristic  
    for node in maze.heuristic_traversal(start, selector):  
        # take all the steps between dead ends  
        distance += maze.get_shortest_paths(previous)[node][0]  
        # and end if you're at the end  
        if node.name == "END": return distance  
        previous = node  

def player_select(options):  
    """Prints the player's options and gets their choice."""  
    selections = {}  
    print("You have %s options:" % (len(options)))  
    for pos, option in enumerate(options):  
        selections[pos] = option  
        print("%d. Room %s, with %s doors" % (pos, option.name, len(option.outgoing)))  
    choice = int(input("Which do you want to take? "))  
    return selections[choice]  

def handle_player(start, maze, max_length):  
    """Walks the maze, giving the player the choice of where to go."""  
    w = maze.walk_nodes(start)  
            for next in w:  
            # while the AI hasn't beat you and you've got places to go  
            if next and max_length:  
                max_length -= 1  
                # go where the player tells you  
                selection = player_select(next)  
                if selection.name == "END":  
                    print("You win!")  
                    return  
                next = w.send(selection)  
            # otherwise, lose  
            print("You lose!")  

if __name__ == "__main__":  
    # build the maze  
    p1_start, p2_start, maze = build_maze()  
    # run the simulation for the AI  
    path_to_beat = ai_path(p1_start, maze)  
    # get the player's moves  
    handle_player(p2_start, maze, path_to_beat)

Conclusion

As you can see, Graphine makes pretty short work of some otherwise nontrivial tasks. We processed simple strings,
explored websites, and built an AI to traverse a maze, each in less than 100 lines of code. If you have any questions-
about this code, about Graphine, or other pieces of code you'd like to see worked through, you can post on our Google
group here or email me directly at debatem1@remove.this.gmail.com.