Undirected graphs are used to represent connections between points.
A simple example would be a graph that represents your "Friends" on
FaceBook.
All your immediate FaceBook Friends are considered to be a "distance" of 1
link from you.
Each one of your Friends has connections to all their Friends, so all their
Friends are considered to have a "distance" of 2 links from you.
The above Graph Animation is presented to illustrated how complex the network of
connections can be, even for such a small graph.
A graph representing the whole FaceBook network of N total members would be huge.
The number of all potential connections is 1 * 2 * 3 * ... * (N - 1), also called
N factorial (more precisely, it is N - 1 factorial).
This value increases with N even faster than exponential growth.
You Can't Get There From Here?
One of the most common problems in graph theory is determining connectivity.
Sometimes there is just no way to get from on node to another. For example,
no matter how many levels of connections between successive FaceBook Friends
you search, you could never find a connection to me: I don't belong to FaceBook because I
think it is evil (just kidding, sort of).
The two most basic Algorithms used to efficiently determine connectivity between
"vertices" (points) in a graph are called "Breadth First Search" and "Depth
First Search".
These algorithms will be illustrated in upcoming articles.
A Friend of Mine is a Friend of Yours
There are several ways you can store the data representing a graph.
The most simple way is to use 2 arrays: one to represent all the vertices,
and another to represent all the connections.
A connection is also called an "edge" of the graph, which is just the
pair of the vertices that are connected.
To solve problems such as determining connectivity, it pays off to do some
preprocessing on the graph data in "edge format" (an array of edges just
described.)
Instead of using a simple list of edges, we store the data for
connectivity in terms of array from each vertex.
Just like in FaceBook, each member (vertex) has a list (array) of people
they are connected to.
The lists are symmetric: all your Friends on are in your connection list, and
you are in the connection list of each of your Friends.
This way of storing graph data is called "Adjacency" format, since for each
vertex, you are storing an array of the the adjacent vertices.
The Python code below contains several methods used to construct graphs
in Adjacency format from various input files, as well as methods to print
out the data. This code is a basic building block for all the graph
algorithms to be described in future articles.
class graph:
'''
un-directed graph
V = number of vertices
E = number of edges
adj = list[vertex i][vertex j] = edge
'''
def __init__(self, V=None):
'''
V = number of vertices
'''
if not V:
V = 0
self.V = V
self.E = 0
self.rev_order = False
self.init_adjacency_lists()
def init_adjacency_lists(self):
self.adj = []
for i in range(0, self.V):
self.adj.append([])
def set_rev_order(self, flag):
'''
Standard operation is for an edge to be appended to the end of the
adjacency list for that vertex. The order can be reversed so that
an edge is prepended to the adjacency list. This can be useful when
the results from depth-first-search are compared to textbook examples,
since the order of the edges will affect the paths.
'''
if type(flag) is not bool:
raise ValueError("graph.set_rev_order error: flag must be a bool")
self.rev_order = flag
def add_edge(self, v, w):
'''
Add an undirected edge to the adjacency list of both vertices.
v and w are vertex numbers
'''
if v < 0 or v >= self.V:
raise ValueError("graph.add_edge error: v is out of range = " + str(v))
if w < 0 or w >= self.V:
raise ValueError("graph.add_edge error: w is out of range = " + str(w))
self.E += 1
if not self.rev_order:
self.adj[v].append(w)
self.adj[w].append(v)
else:
self.adj[v].insert(0, w)
self.adj[w].insert(0, v)
def get_adj(self, v):
if v < 0 or v >= self.V:
raise ValueError("graph.get_adj error: v is out of range = " + str(v))
return self.adj[v]
def copy(self):
g = graph(self.V)
for i in range(0, self.V):
for j in range(0, self.adj[i]):
g.add_edge(i, self.adj[i][j])
return g
def read_data(self, file_name):
'''
data file is in edge format:
V
E
v1 w1
v2 w2
...
vE wE
'''
if not os.path.exists(file_name):
raise ValueError("graph.read_data error: file does not exist = " + file_name)
data_file = open(file_name, 'r')
x = data_file.readline().strip()
V = int(x)
x = data_file.readline().strip()
E = int(x)
self.__init__(V)
for x in data_file:
x = x.strip()
x = x.replace("\t", " ")
xlist = x.split(" ")
self.add_edge(int(xlist[0]), int(xlist[1]))
if self.E != E:
raise ValueError("graph.read_data exception: number of edges does not match input")
def read_data_adj(self, file_name):
'''
data file is in adjacency format
'''
if not os.path.exists(file_name):
raise ValueError("graph.read_data error: file does not exist = " + file_name)
data_file = open(file_name, 'r')
x = data_file.readline().strip()
x = x.replace("\t", " ")
xlist = x.split(" ");
V = int(xlist.pop(0))
self.__init__(V)
for x in data_file:
x = x.strip()
x = x.replace("\t", " ")
xlist = x.split(" ")
x0 = int(xlist[0])
x1 = int(xlist[1])
if x1 > x0:
self.add_edge(x0, x1)
def populate_random_graph(self, V, E):
self.__init__(V)
def random_edge(V):
'''
returns a random edge with end points 0 <= v,w < V
'''
if V < 2:
raise ValueError("random_edge: V must be >= 2")
v = random.randint(0, V - 1)
w = random.randint(0, V - 1)
# Remove self-loops
while w == v:
w = random.randint(0, V - 1)
return (v, w)
for i in range(0, E):
(v, w) = random_edge(V)
self.add_edge(v, w)
def print_graph(self):
'''
prints graph in edge format
'''
print "N = " + str(self.V)
print "E = " + str(self.E)
for i in range(0, self.N):
for j in range(0, len(self.adj[i])):
print str(i) + ", " + str(self.adj[i][j])
def print_graph_adj(self):
'''
prints graph in adjacency format
'''
print "N = " + str(self.V)
print "E = " + str(self.E)
for i in range(0, self.V):
g_string = "%d: " % i
for j in range(0, len(self.adj[i])):
g_string += " %d" % self.adj[i][j]
print g_string