GRAPH CONNECTIVITY.
Goal of searching a graph: find everything reachable from a start vertex in O(m+n) time (m = # of edges, n = # of vertices).
Generic Search Algorithm (given graph G, start vertex s):
    Initially, s is explored, all other vertices are unexplored.
    While possible:
        Choose an edge (u,v) with u explored and v unexplored.
        Mark v explored.
BFS: explores nodes in layers; uses a queue (FIFO).
DFS: explores aggressively, going deep and backtracking; uses a stack (LIFO).
BFS(Graph G, start vertex s):
    Mark s explored.
    Q = queue, initialized with s.
    While Q is not empty:
        Remove the first node of Q, call it v.
        For each edge (v,w):
            if w unexplored: mark w explored; add w to Q.
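A minimal Python sketch of BFS as described above, assuming the graph is an adjacency list (a dict mapping each vertex to a list of its neighbours); the names adj and bfs are just illustrative:

from collections import deque

def bfs(adj, s):
    """Return the set of vertices reachable from s."""
    explored = {s}
    q = deque([s])                   # queue initialized with s
    while q:                         # while Q is not empty
        v = q.popleft()              # remove the first node of Q
        for w in adj[v]:
            if w not in explored:    # only unexplored endpoints get added
                explored.add(w)
                q.append(w)
    return explored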
COMPUTE SHORTEST PATH:
Use BFS, plus a little extra bookkeeping:
    Initialize dist(v) = 0 if v == s, +infinity otherwise.
    When considering edge (v,w): if w is unexplored, set dist(w) = dist(v) + 1.
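The same BFS with the extra distance bookkeeping (same adjacency-list assumption; dist doubles as the explored set, since a vertex is explored exactly when it gets a distance):

from collections import deque

def bfs_shortest_dist(adj, s):
    """Return a dict of shortest-path distances (# of edges) from s; unreachable vertices are absent."""
    dist = {s: 0}                    # dist(s) = 0; everything else implicitly +infinity
    q = deque([s])
    while q:
        v = q.popleft()
        for w in adj[v]:
            if w not in dist:        # w unexplored
                dist[w] = dist[v] + 1
                q.append(w)
    return dist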
UNDIRECTED CONNECTIVITY:
To compute all connected components:
    Mark all nodes unexplored.
    for i = 1 to n (n = number of nodes):
        if i not yet explored: BFS(G,i)    // each call discovers one whole component
Running time: O(m+n).
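A Python sketch of this outer loop, with the BFS inlined (undirected edges are assumed to appear in both adjacency lists; names are illustrative):

from collections import deque

def connected_components(adj):
    """Return a list of connected components of an undirected graph, each as a set of vertices."""
    explored = set()
    components = []
    for i in adj:                            # for i = 1 to n
        if i not in explored:
            comp = {i}                       # BFS from i finds i's whole component
            q = deque([i])
            while q:
                v = q.popleft()
                for w in adj[v]:
                    if w not in comp:
                        comp.add(w)
                        q.append(w)
            explored |= comp
            components.append(comp)
    return components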
----------------------------------------------------------------------------------------------------
DFS Algo:
Mimics BFS but uses a stack instead of a queue; can also be done recursively (try it out).
DFS(Graph G, vertex s):
    Mark s explored.
    For every edge (s,v):
        if v unexplored:
            DFS(G,v)
    % set f(s) = current_label; current_label-- %   (used by topological sort, below)
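A recursive Python sketch of DFS (same adjacency-list assumption; for very deep graphs an explicit stack avoids hitting Python's recursion limit):

def dfs(adj, s, explored=None):
    """Return the set of vertices reachable from s, explored depth-first."""
    if explored is None:
        explored = set()
    explored.add(s)
    for v in adj[s]:
        if v not in explored:
            dfs(adj, v, explored)
    return explored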
------------------------------------------------------------------------------------------------------------
Topological Ordering:
A topological ordering of a directed graph G is a labelling f of G's nodes (with the values 1..n) such that:
    for every edge (u,v) of G, f(u) < f(v).
If G has a cycle, it has no topological ordering.
Every directed acyclic graph has a sink vertex (a vertex with no outgoing edges).
Straightforward solution:
    Let v be a sink vertex of G.
    Set f(v) = n.
    Delete v and recurse on G - {v}.
DFS-based solution:
DFS_LOOP(Graph G):
    mark all nodes unexplored.
    current_label = n    (keeps track of the ordering)
    for each vertex v in G:
        if v not yet explored:
            DFS(G,v)
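A Python sketch of DFS_LOOP with the f-labelling, assuming the input dict really is a DAG (vertex -> list of successors); names are illustrative:

def topo_sort(adj):
    """Return a dict vertex -> label in 1..n giving a topological ordering."""
    explored = set()
    f = {}
    current_label = len(adj)             # labels are handed out from n downwards

    def dfs_topo(s):
        nonlocal current_label
        explored.add(s)
        for v in adj[s]:
            if v not in explored:
                dfs_topo(v)
        f[s] = current_label             # s is finished: it gets the current label
        current_label -= 1

    for v in adj:
        if v not in explored:
            dfs_topo(v)
    return f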
---------------------------------------------------------------------------------------------------------
STRONGLY CONNECTED COMPONENTS:
A directed graph is strongly connected if you can get from any vertex to any other vertex along a directed path.
Problem: a single DFS uncovers exactly one SCC only if it starts from the right node (inside a "sink" SCC); started elsewhere, it spills across several SCCs.
KOSARAJU'S TWO-PASS ALGORITHM:
1) Reverse all the arcs of the graph; call the result G^(rev).
2) Run DFS-LOOP on the reverse graph, processing nodes from the highest label to the lowest. (It computes a "magical" ordering of the nodes.)
    - Let f(v) be the finishing time of each vertex v.
3) Run DFS-LOOP on the original graph, processing nodes in decreasing order of finishing time. (Going through the magical ordering, it reveals the SCCs one by one.)
    - SCCs = maximal groups of nodes that share the same leader.
DFS-LOOP(Graph G):
    global variable t = 0          (# of nodes processed so far; used in the first pass)
    global variable s = NULL       (most recent source vertex; used in the second pass)
    Assume nodes are labelled 1 to n.
    for i = n down to 1:
        if i unexplored:
            s := i
            DFS(G,i)
DFS(Graph G, vertex i):
    mark i as explored.
    set leader(i) = s              // second pass
    for each arc (i,j):
        if j unexplored:
            DFS(G,j)
    t++                            // first pass
    set f(i) = t                   // first pass
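A Python sketch of the two passes, assuming every vertex appears as a key of adj (vertex -> list of successors) and the graph is small enough for recursive DFS; instead of an explicit t counter it records the finishing order in a list:

def kosaraju_scc(adj):
    """Return a dict vertex -> leader; vertices with the same leader form one SCC."""
    # Pass 0: build the reversed graph G^(rev).
    rev = {v: [] for v in adj}
    for v in adj:
        for w in adj[v]:
            rev[w].append(v)

    # Pass 1: DFS-loop on G^(rev) to compute the finishing order.
    explored = set()
    finish_order = []                    # vertices in increasing finishing time

    def dfs_first(i):
        explored.add(i)
        for j in rev[i]:
            if j not in explored:
                dfs_first(j)
        finish_order.append(i)           # i is finished

    for i in adj:
        if i not in explored:
            dfs_first(i)

    # Pass 2: DFS-loop on G, processing nodes in decreasing order of finishing time.
    explored.clear()
    leader = {}

    def dfs_second(i, s):
        explored.add(i)
        leader[i] = s
        for j in adj[i]:
            if j not in explored:
                dfs_second(j, s)

    for i in reversed(finish_order):
        if i not in explored:
            dfs_second(i, i)             # i is the source vertex, hence the leader of its SCC
    return leader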
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
STRUCTURE OF THE WEB (OPTIONAL):
Vertices: Web Pages
Directed edges: Hyperlinks
--------------------------------------------------------------------------------------------------------------
DIJKSTRA'S ALGORITHM:
Initialize:
    X = {s}              // vertices processed so far
    A[s] = 0             // A[] holds the computed shortest-path distances
    B[s] = empty path    // B[] holds the shortest paths themselves; just for understanding, not for coding.
Consider two groups of vertices: X (processed) and V-X (not yet processed).
MAIN LOOP:
    while X != V:
        among all edges (v,w) with v in X and w in V-X,
        pick the one that minimizes A[v] + l(v,w)    (Dijkstra's greedy criterion)
        add w to X
        set A[w] = A[v] + l(v,w)
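A Python sketch of this straightforward (non-heap) version, assuming the graph is a dict mapping each vertex to a list of (neighbour, edge length) pairs with nonnegative lengths; names are illustrative:

def dijkstra_naive(adj, s):
    """O(mn) Dijkstra: return dict A of shortest-path distances from s (reachable vertices only)."""
    X = {s}                                  # vertices processed so far
    A = {s: 0}                               # computed shortest-path distances
    while len(X) < len(adj):
        best = None
        for v in X:                          # scan all crossing edges (v,w), v in X, w not in X
            for w, length in adj[v]:
                if w not in X:
                    score = A[v] + length    # Dijkstra's greedy criterion
                    if best is None or score < best[0]:
                        best = (score, w)
        if best is None:                     # no crossing edge: the rest of the graph is unreachable
            break
        A[best[1]] = best[0]
        X.add(best[1])
    return A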
Dijkstra does not work with negative edge lengths.
Running time of the straightforward implementation: O(mn) (at most n-1 iterations, O(m) work per iteration to scan the crossing edges).
For a super fast implementation, use heaps.
HEAP IMPLEMENTATION:
Invariant: the heap holds the vertices of V-X, where key[v] = the smallest value of A[u] + l(u,v) over edges (u,v) with u in X (+infinity if no such edge exists). Every iteration, Extract-Min hands us the next vertex to add to X. When a new vertex w is added to X, the keys of w's neighbours still in V-X may need to be lowered, since new crossing edges out of w appear.
Algorithm:
When w is extracted from the heap and added to X:
    for each edge (w,v) with v in {V-X}:
        delete v from the heap
        key[v] = min(key[v], A[w] + l(w,v))
        re-insert v into the heap with its new key
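A Python sketch of heap-based Dijkstra (same (neighbour, length) adjacency-list assumption). Python's heapq has no delete-from-the-middle, so instead of the delete/re-insert step above this sketch pushes duplicate entries and skips the stale ones ("lazy deletion"); the bound is still O(m log n):

import heapq

def dijkstra_heap(adj, s):
    """Heap-based Dijkstra: return dict A of shortest-path distances from s."""
    A = {}                                   # finalized distances (the set X)
    heap = [(0, s)]                          # entries are (key, vertex)
    while heap:
        key, w = heapq.heappop(heap)         # Extract-Min
        if w in A:
            continue                         # stale entry: w was already added to X
        A[w] = key                           # add w to X with A[w] = its key
        for v, length in adj[w]:
            if v not in A:
                heapq.heappush(heap, (key + length, v))   # may effectively lower v's key
    return A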
-------------------------------------------------------------------------------------------------------------------------------------
HEAPS:
A container for objects that have keys. Supports two main operations, Insert and Extract-Min, both in O(log n) time (n = # of objects in the heap).
Other Ops:
1) Heapify: initialize a heap out of n elements. Calling Insert n times costs O(n log n); doing it in one batch takes only linear time.
2) Delete (an arbitrary element): in O(log n) time.
When to use a heap? Notice whether your program does repeated minimum computations.
Heap Sort: 1) Insert all n elements into a heap. 2) Extract the min one by one. Running time: O(n log n).
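A tiny sketch of heap sort using Python's built-in heapq (heapify is the linear-time batch initialization mentioned above):

import heapq

def heap_sort(items):
    """Sort by repeated Extract-Min: O(n log n)."""
    heap = list(items)
    heapq.heapify(heap)                      # batch initialization, linear time
    return [heapq.heappop(heap) for _ in range(len(heap))]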
Applications:
1) Can be used for event records with timestamps.
2) Can be used to maintain the median as each new number arrives ("median maintenance"); see the sketch after this list.
    - Use two heaps: Hlow, a max-heap holding the smallest ~n/2 elements, and Hhigh, a min-heap holding the largest ~n/2 elements.
      The median then sits at a root: when n is even, the roots of Hlow and Hhigh are the two middle elements; when n is odd,
      the median is the root of whichever heap holds one more element.
    - If the sizes of Hlow and Hhigh ever differ by two, extract the root of the larger one and insert it into the smaller one.
3) Can be used to speed up Dijkstra's algorithm (see above).
How the two basic heap operations work:
    Insertion: place the new element in the next open slot (end of the last level) and bubble it up.
    Extract-Min: remove the root, move the last element into the root position, and bubble it down to restore the heap property.
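A Python sketch of median maintenance with two heaps. heapq only provides min-heaps, so Hlow is simulated by storing negated values; names are illustrative:

import heapq

def running_medians(stream):
    """Yield the median of the numbers seen so far, after each new number."""
    h_low, h_high = [], []                   # max-heap (negated values) / min-heap
    for x in stream:
        if not h_low or x <= -h_low[0]:
            heapq.heappush(h_low, -x)        # x belongs to the smaller half
        else:
            heapq.heappush(h_high, x)        # x belongs to the larger half
        # Rebalance so the sizes never differ by two.
        if len(h_low) > len(h_high) + 1:
            heapq.heappush(h_high, -heapq.heappop(h_low))
        elif len(h_high) > len(h_low) + 1:
            heapq.heappush(h_low, -heapq.heappop(h_high))
        # The median is the root of the heap holding more elements (Hlow on ties).
        yield -h_low[0] if len(h_low) >= len(h_high) else h_high[0]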
---------------------------------------------------------------------------------------------------------------------------------------------------------
BALANCED BINARY SEARCH TREES:
In a sorted array, we can support many operations (search, select, min/max, predecessor, rank, ...) in logarithmic or constant time, but only for static data. When the data changes dynamically, INSERTION and DELETION become expensive (linear time). Hence, we need balanced binary search trees.
So, roughly: BBST = sorted array + logarithmic-time insertion and deletion.
OPERATION                 SORTED ARRAY    BBST
Search                    O(log n)        O(log n)
Select                    O(1)            O(log n)
Min/Max                   O(1)            O(log n)
Predecessor/Successor     O(1)            O(log n)
Rank                      O(log n)        O(log n)
Output in sorted order    O(n)            O(n)
Insert                    O(n)            O(log n)
Delete                    O(n)            O(log n)
NOTE: The height of the tree determines how fast the operations are going to be.
Height can vary between ~log2(n) (perfectly balanced tree) and ~n (a chain).
Operations:
In-order traversal: prints the keys in sorted order; O(n) time.
Minimum of a tree: follow left-child pointers until you can't.
Maximum of a tree: follow right-child pointers until you can't.
Predecessor of a key k: two cases:
1) k's node has a left subtree: the predecessor is the max of the left subtree (go left once, then keep going right).
2) k's node has no left subtree: follow parent pointers up until you hit a node with a key smaller than k.
Deletion of a key k: 3 cases:
1) Easy case: k's node has no children. Just delete it.
2) Medium case: k's node has one child. Splice it out: attach the child directly to k's parent.
3) Difficult case: k's node has two children.
    1) Compute k's predecessor: go left once, then keep going right until you can't. Call it l.
    2) Swap k and l.
    3) Delete k from its new position. There k has at most a left child (it can't have a right child, since
       that child would have been the predecessor instead), so this reduces to the easy or medium case.
Running time: O(height).
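A plain (unbalanced) Python sketch of some of these operations: insert, in-order traversal, min, and the easy predecessor case. Deletion and the parent-pointer predecessor case are omitted to keep it short; the Node layout and names are just illustrative:

class Node:
    """Binary search tree node."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Insert key into the tree rooted at root; return the (possibly new) root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

def inorder(root):
    """Yield the keys in sorted order, O(n)."""
    if root is not None:
        yield from inorder(root.left)
        yield root.key
        yield from inorder(root.right)

def tree_min(root):
    """Follow left-child pointers until you can't."""
    while root.left is not None:
        root = root.left
    return root

def predecessor_with_left(node):
    """Predecessor of a node that has a left subtree: the max of that subtree
    (go left once, then keep going right)."""
    node = node.left
    while node.right is not None:
        node = node.right
    return node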
SELECT AND RANK:
For these, we store a little extra data in each node: size(x) = # of nodes in the subtree rooted at x, including x itself; i.e. size(x) = 1 + size(left subtree) + size(right subtree).
ith order statistic (SELECT): start at the root x, with left child y and right child z.
    Let a = size(y)   (a = 0 if x has no left child)
    if a = i-1: return x's key
    if a >= i: recursively find the ith order statistic of the left subtree (rooted at y)
    if a < i-1: recursively find the (i-a-1)th order statistic of the right subtree (rooted at z)
Rank: given a value, count how many keys in the tree are <= that value. Same idea as SELECT: walk down from the root, accumulating subtree sizes.
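A sketch of SELECT and RANK assuming a size-augmented node: each node has key, left, right and a size field equal to the number of nodes in its subtree (a hypothetical extension of the Node class above):

def select(x, i):
    """Return the ith smallest key (i >= 1) in the subtree rooted at x."""
    a = x.left.size if x.left is not None else 0
    if a == i - 1:
        return x.key                         # x itself is the ith smallest
    if a >= i:
        return select(x.left, i)             # the answer lives in the left subtree
    return select(x.right, i - a - 1)        # skip the left subtree and x itself

def rank(x, value):
    """Return the number of keys <= value in the subtree rooted at x."""
    if x is None:
        return 0
    if value < x.key:
        return rank(x.left, value)
    a = x.left.size if x.left is not None else 0
    return a + 1 + rank(x.right, value)      # whole left subtree, x itself, plus part of the right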
---------------------------------------------------------------------------------------------------------------------------------------------------
RED BLACK TREES:
1) Each node stores one extra bit of information: its colour, red or black.
2) The root is always black.
3) No two red nodes in a row (a red node's parent and children are black).
4) Every path from the root down to a NULL pointer passes through exactly the same number of black nodes.
Why bother? These invariants guarantee that a red-black tree with N nodes has height <= 2*log2(N+1), so every operation in the table above runs in O(log N).