Monday, June 04, 2012

Kruskal's Algorithm, Simplified

Note : This article is intended to the readers  who can implement Kruskal's algorithm manually at the least.


Kruskal's algorithm is an algorithm for building Minimum Spanning Tree (MST) from an undirected graph.  The algorithm can be summarized as "Go on building the MST by adding the edges in their ascending order of their cost such that they do not create cycles".This can be explained in steps as:-
1. Identify next minimum edge
2. See whether that edge creates a cycle,if added to the MST
3. If it doesn't, add it to MST ,else discard it and repeat step 1.
4. Go to step 1 if MST is not complete.


Consider that we are using the Adjacency Matrix for  representing  the graph. Then Step 1 is the easiest to carry out and Step 2 is the hardest. It is the Step 2 that I want to discuss in detail. Step 1 can be carried out simply by constructing nested loops. At the end of step 1, we have the next minimum cost and the name of the nodes connected by the minimum cost edge.


Now comes the trickiest part. Given an edge,how will we find whether that edge creates a cycle or not when added to the MST?This requires a clearer understanding of what cycles is.


Identifying Cycles

Cycles occurs when there are multiple paths from any node to any other node. So before adding an edge (x,y) to the MST, we have to see whether there is already a path that exists between nodes x and y. A path between nodes x and y can take  number of forms. Examples are:-
1. (x,z) (z,y)
    x is connected to z and z is in turn  connected to y,thus creating a path between x and y
2. (x,z) (z,a)(a,y)
    This can be similarly be explained as  example 1.

Explanation with an Example

Consider the below graph
The adjacency matrix representation of the above graph is given as follows:-
Note: The blank spaces can be regarded as infinity which denotes that the corresponding nodes are not related.Also the above matrix is a symmetric matrix.

Now we start applying the Kruskal's algorithm manually.
The first minimum cost edge is (1,3) or (3,1). We will add it to MST since it won't create any cycles for obvious reasons. Similarly by manual evaluation (4,6) , (2,5) and (3,6) are added to the MST. Now our the adjacency Matrix of the graph would look like as below:-
Note: For each edge added to the MST, the corresponding column is made blank(or infinity)

Now our MST graph would look like as below:-
The adjacency matrix for the same is given by:-
The next minimum cost edge is (1,4) with a cost of 5. But by manual evaluation we know that this edge will result in cyclic graph when added to MST. Now we will see how to infer whether an edge will result in cyclic property, from the adjacency matrix of the MST. Now we need to see whether there exists a path between 1 and 4.We follow the below steps:-
A.Take the columns of the first row. Name them as (1,k) where k=1 to n (n is the number of nodes). 
B.Select the non blank columns among (1,k)and name them as(1,nb)  . If any of the nb=4,then we can straight forward say that there is a path between 1 and 4.If non among (1,nb) has nb=4,then we will see whether there exists, a path between (nb,4) by applying same procedure as Step A.Thus this goes on recursively.
Based on this we develop a noPath() procedure,which will take the nodes,and MST matrix as arguments and returns 1 if there is no path between them else it will return 0.

procedure noPath(int node1,int node2,int nodePrev,int n,int MST[][])

begin

         int k,ret=1;

        for k in range 1 to n do

        begin

                 if(MST[node1][k]!=blank)

                 then

                       if k equal node2

                       then

                              ret=0

                      else if k not equal node1 and k not equal to nodePrev 

                                              then    
                                                                     ret=noPath(k,node2,node1,n,MST)
                                                                 end
                                             end
                       end
       return ret
end {noPath}

Note:The variable nodePrev has been used to avoid the reverse direction flow. For example
after having a search through edge (1,6),we do not want to turn back and search (6,1).


With the above algorithm I hope I have covered the most important aspect of Kruskal algorithm .You may see the real C program implementing Kruskal algo here

No comments:

Post a Comment