A graph-searching algorithm can discover much about the structure of a graph. The graph algorithm is essential in social network searching, artificial intelligence, etc. This chapter mainly discusses about the representation of graphs, BFS and DFS.
Representation of Graphs
We have two kinds of graph: directed and undirected graphs. In either way, we can represent the graph in adjacency-list representation or adjacency-matrix representation.
(From CLRS)
Adjacency-List Representation
A graph \(G=(V, E)\) consists of an array \(Adj[V]\) of \(|V|\) lists, one for each vertex in \(V\). For each $uV $, the adjacency list \(Adj[u]\) contains all the vertices \(v\) such that there is an edge \((u,v)\in E\) where from \(u\) point to \(v\).
In the directed graph, the sum of the lengths of the adjacency list is \(|E|\), while in the undirected graph, it is \(2|E|\).
For both directed and undirected graphs, the adjacency-list representation requires \(\Theta(V+E)\).
Adjacency-Matrix Representation
A graph \(G=(V, E)\), we assume that the vertices are numbered \(1, 2, ..., |V|\). Then the matrix reorientation consists of a \(|V|*|V|\) matrix \(A=(a_{ij})\) such that \[ a_{ij} =\begin{cases} 1 & \text{if}\ ( i,j) \ \in E,\\ 0 & \text{otherwise}. \end{cases} \] Obviously, the adjacency-matrix representation requires \(\Theta(V^2)\) memory.
We can also conclude that the the matrix of an undirected graph is symmetry since the \((u,v)\) and \((v, u)\) represent the same edge. Therefore, the matrix \(A\) of an undirected graph is its own transpose: \(A=A^T\).
In-Degree and Out-Degree
- Degree: The degree of vertex is number of edges that are connected to the vertex.
- In Degree: represents the number of edges incoming to a vertex.
- Out Degree: represents the number of edges outgoing from a vertex.
Vertex | Degree | Out Degree | In Degree |
---|---|---|---|
4 | 6 | 3 | 3 |
3 | 3 | 1 | 2 |
7 | 3 | 1 | 2 |
Breath-First Search (BFS)
BFS is a basic graph method that explores every vertex that is reachable.
Suppose we have a source vertex \(s\), a graph \(G=(V,E)\). BFS could be illustrated in Java code with commands:
// code refer to Algorithm, 4th edition
public void BFS(Graph G, int s) {
// mark the known vertex
boolean[] marked = new boolean[G.V()];
// last vertex adjacent to this vertex
int[] edgeTo = new int[G.V()];
// use FIFS queue to manage processing vertex
Queue<Integer> q = new Queue<>();
// mark the source
marked[s] = true;
// put s into queue
q.enqueue(s);
while (!q.isEmpty()) {
int u = q.dequeue(); // process the next vertex
for (int v: G.adj(u)) {
if (!marked[v]) {
edgeTo[v] = u; // save last edge on a shortest path
marked[v] = true; // mark it since it is known
queue.enqueue(v); // put it into queue
}
}
}
}
The following illustrational image and table can further demonstrate the process of BFS, while \(d\) represents the depth of source and current vertex, and \(\pi\) denotes the parent of current vertex.
vertex | r | s | t | u | v | w | x | y |
---|---|---|---|---|---|---|---|---|
\(d\) | 4 | 3 | 1 | 0 | 5 | 2 | 1 | 1 |
\(\pi\) | s | w | u | Null | r | t | u | u |
Analysis
The total running time of BFS is \(O(V+E)\).
Explain: The total time to queue operation is \(O(V)\), and the total time in scanning the adjacency list is \(O(E)\). Therefore, the total running time is \(O(V+E)\).
- if using adjacency matrix representation, the running time will be \(O(V+V^2)\), since \(E=V^2\) in matrix representation.
Shortest paths
In graph \(G=(V, E)\), from a given source \(s\), we define the shortest-path distance \(\delta(s,v)\) as the shortest path from \(s\) to \(v\). If there is no path from \(s\) to \(v\), then \(\delta(s,u)=\infty\).
Lemma 1
In a direct or undirected graph, for any edge \((u,v)\in E\), \(\delta(s,v)\le \delta(s,u)+1\).
Proof: If u is reachable from s, then v is reachable from s. Then the inequality holds. If u is not reachable from s, the \(\delta(s,u)=\infty\), and the inequality still holds.
Lemma 2
BFS is run on \(G\) from a given source \(s\in V\). Then the depth \(v.d \ge \delta(s,v)\).
Proof: Idiot question in the first place... But can be proved applying Lemma 1. \[ \begin{equation} \begin{aligned} v.d &= u.d+1\\&\ge\delta(s,u)+1\\ &\ge \delta(s,v) \end{aligned} \end{equation} \]
Depth-First Search (DFS)
DFS is to search the deeper in the graph whenever possible.
In DFS, the predecessor subgraphs are composed of several trees, since the search may repeat from multiple sources. We let \(G_\pi=(V,E_\pi)\), where \[ E_\pi=\{(v,\pi,v): v \in V \text{ and }v.\pi\ne NUL \} \] A depth-first forest comprises several depth-first trees. The edges in \(E_\pi\) are tree edges.
// refer to Algorithm, 4th edition
public void dfs(Graph G, int v) {
// mark the known vertex
boolean[] marked = new boolean[G.V()];
// last vertex adjacent to this vertex
int[] edgeTo = new int[G.V()];
// mark the exploring vertex
marked[v] = true;
for (int w: G.adj(v)) {
if (!marked[v]) {
edgeTo[w] = v;
dfs(G, w);
}
}
}
Analysis
The running time of DFS is \(\Theta(V+E)\) when using list representation, while \(\Theta(V+V^2)\) in matrix representation.
Classification of Edges
The color of vertex \(v\) tells us something about the edge:
- WHITE indicates a tree edge,
- GRAY indicates a back edge
- BLACK indicates a forward or cross edge
We can define four types of edge in terms of the depth-first forest \(G_\pi\) produced by a depth-first search on \(G\):
- Tree Edges: edges in the depth-first forest \(G_\pi\)
- Back Edges: edges \((u,v)\) connecting a vertex \(u\) to an ancestor \(v\) in a depth-first tree.
- Forward Edges: edges \((u,v)\) connecting a vertex \(u\) to a descendant \(v\) in a depth-first tree.
- Cross Edges: all other edges.
Properties of DFS
d | f | |
---|---|---|
s | 1 | 10 |
z | 2 | 9 |
y | 3 | 6 |
x | 4 | 5 |
w | 7 | 8 |
t | 11 | 16 |
v | 12 | 13 |
u | 14 | 15 |
Parenthesis Theorem
In any directed or undirected graph \(G=(V,E)\), for any two vertices \(u\) and \(v\), exactly one of the following three conditions holds:
- the intervals \([u.d, u.f]\) and \([v.d, v.f]\) are entirely disjoint, and neither \(u\) nor v is a descendant of the other in the depth-first forest
- the intervals \([u.d, u.f]\) is contained entirely within the interval \([v.d, v.f]\), and \(u\) is a descendant of \(v\) in a depth-first tree
- or inversely
Undirected Graph Edge Theorem
In a depth-first search of an undirected graph \(G\), every edge of \(G\) is either a tree edge or a back edge.