Remove Duplicate Elements in a Vector

Vector, from the java.util package, works as a dynamic array: any number of elements can be added to it. Sometimes it is useful to remove duplicate elements from a vector to save space. This is not needed all the time, but it is good to know the logic behind it. Let us see how we can do this.





import java.util.*;
class RemoveDuplicates
{

    public static void main(String args[])
    {
        Vector<String> v=new Vector<String>();
        v.add("Gowtham");
        v.add(" Gutha's");
        v.add(" Java");
        v.add("-");
        v.add("demos");
        v.add(".");
        v.add("blogspot");

        // '.' again!
        v.add(".");
        v.add("com ");

        // Gowtham again!
        v.add("gowtham");

        System.out.println("Original");

        for(int i=0;i<v.size();i++)
        {
            System.out.print(v.elementAt(i));
        }

        System.out.println("\nAfter removing duplicates");
        removeDuplicates(v);

        for(int i=0;i<v.size();i++)
        {
            System.out.print(v.elementAt(i));
        }

    }


    // Applicable for all types of vectors

    public static void removeDuplicates(Vector<?> v)
    {
        for(int i=0;i<v.size();i++)
        {
            for(int j=0;j<v.size();j++)
            {
                if(i!=j)
                {
                    if(v.elementAt(i).equals(v.elementAt(j)))
                    {
                        v.removeElementAt(j);
                        // The elements after j shift one place left,
                        // so step back to check the new element at j
                        j--;
                    }
                }
            }
        }
    }


    /*
        * Version for String vectors only, written with equalsIgnoreCase()
        * The code..


        public static void removeDuplicates(Vector<String> v)
        {
            for(int i=0;i<v.size();i++)
            {
                for(int j=0;j<v.size();j++)
                {
                    if(i!=j)
                    {
                        if(v.elementAt(i).equalsIgnoreCase(v.elementAt(j)))
                        {
                            v.removeElementAt(j);
                            // Step back to check the element that moved into j
                            j--;
                        }
                    }
                }
            }
        }
    */

}

Output


Original
Gowtham Gutha's Java-demos.blogspot.com gowtham
After removing duplicates
Gowtham Gutha's Java-demos.blogspotcom gowtham

Logic Analysis


A. Logic Applicable for all types of vectors

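This logic can be used for any type of Vector. java.util.Vector is a generic class, so it can store objects of any type. The equals() method compares two objects. For classes such as String that override it, equals() returns true when the two objects hold the same value and false otherwise; a class that does not override it falls back to Object.equals(), which returns true only for the very same object.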
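To see how equals() behaves on the kind of String elements used in this program, here is a small sketch. The class name EqualsDemo and the helper sameValue() are made up for this illustration and are not part of the program above.

```java
import java.util.Vector;

class EqualsDemo
{
    // Hypothetical helper: true if the elements at positions i and j hold equal values
    static boolean sameValue(Vector<?> v, int i, int j)
    {
        return v.elementAt(i).equals(v.elementAt(j));
    }

    public static void main(String args[])
    {
        Vector<String> v = new Vector<String>();
        v.add("Gowtham");
        v.add(".");
        v.add(".");
        v.add("gowtham");

        // Both elements are "." so equals() returns true
        System.out.println(sameValue(v, 1, 2));

        // "Gowtham" and "gowtham" differ in case, so equals() returns false
        System.out.println(sameValue(v, 0, 3));
    }
}
```

This is why the lower-case gowtham survives in the output above: for equals() it is a different value from Gowtham.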
A double loop is used here, which compares the element at one index with the elements at the other indexes. For a better understanding, let us consider a case.
At the statement,
for(int i=0;i<v.size();i++)
the initial value of i is 0 and the loop runs until i reaches v.size()-1 (because the elements of a vector are stored starting from index 0, not 1). v.size() returns the size of the vector, i.e. the number of elements in it. The value of i is incremented each time the loop body has executed. This continues until the loop condition fails, i.e. until i equals the size of the vector.

The next loop is the inner loop which consists of the same code but with i replaced by j.

A.1 Now what do the loops do?

The loops pair the element at a particular index with the element at every index, including its own; the i!=j check then skips the case where both indexes are the same. Didn't get it? No problem. I'll walk through it using the program.

First, when i=0 and control enters the inner for loop, j=0. The condition if(i!=j) checks that the value of i is not equal to j, so control enters the inner if condition only when i and j differ. For i=0 and j=0 it is therefore skipped.

On the next pass of the inner for loop (the j loop), j becomes 1 because of j++. The condition i!=j now holds, since i=0 and j=1, so control reaches the innermost if condition. It checks whether the element at i (here 0) is equal to the element at j (here 1), i.e. whether the first element (the one at index 0) is equal to the second element (the one at index 1). This continues until the inner for loop condition fails, i.e. until j equals the size of the vector. For i=0, the outer if condition (i!=j) is satisfied for every j except 0, and control reaches the inner condition each time.

If the inner condition is satisfied, the element at j is removed. What gets removed is the repeated element at the later position, not the one that occurred first. For example, if gowtham occurs at index 0 and again at index 4, the gowtham at index 4 is removed, not the one at index 0.

So whenever i and j are not equal and the element at position i is equal to the element at position j, the element at position j is removed.

i=0 and j=0; i!=j (false)
i=0 and j=1; i!=j (true) -> Check the inner if condition -> Remove duplicate if satisfied
i=0 and j=2; i!=j (true) -> Check the inner if condition -> Remove duplicate if satisfied
....
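
The trace above can also be generated by a few lines of code. This is only a sketch: the class PairTrace and the method comparedPairs() are names invented here, and it assumes that nothing is removed along the way, so it shows just which (i, j) pairs get past the i!=j check.

```java
import java.util.ArrayList;
import java.util.List;

class PairTrace
{
    // Hypothetical helper: lists the (i, j) pairs that pass the i!=j check
    // for a vector of the given size, assuming no element is removed
    static List<String> comparedPairs(int size)
    {
        List<String> pairs = new ArrayList<String>();
        for (int i = 0; i < size; i++)
        {
            for (int j = 0; j < size; j++)
            {
                if (i != j)
                {
                    pairs.add("i=" + i + " and j=" + j);
                }
            }
        }
        return pairs;
    }

    public static void main(String args[])
    {
        // For three elements this prints six pairs,
        // from "i=0 and j=1" to "i=2 and j=1"
        for (String pair : comparedPairs(3))
        {
            System.out.println(pair);
        }
    }
}
```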

So the 0th element is checked against the 1st element, 2nd element, 3rd element, 4th element and so on, up to the end of the vector. Any of them found to be a duplicate of the 0th element is removed.

Similarly, the 1st element is checked against the 0th element, 2nd element, 3rd element, 4th element... that is, against all the other elements in the vector, to find out whether it is repeated.

Similarly, the 2nd element is checked against the 0th element, 1st element, 3rd element, 4th element....

Note that the 0th element is not checked against itself, because an element is always equal to itself; in the same way, the 1st element is not checked against the 1st element. The i!=j condition prevents this check. In this way the duplicate elements are removed.
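
One detail worth keeping in mind is that removeElementAt(j) shifts every later element one index down, so the element that slides into position j has to be looked at as well. The sketch below is a possible variant of the same idea, not the program's own method: it compares each element only with the ones after it and steps j back after a removal. The class name ForwardScanDedup is made up for this example, and it assumes the vector holds no null elements.

```java
import java.util.Vector;

class ForwardScanDedup
{
    // Hypothetical variant: compare each element only with the elements after it
    static void removeDuplicates(Vector<?> v)
    {
        for (int i = 0; i < v.size(); i++)
        {
            for (int j = i + 1; j < v.size(); j++)
            {
                if (v.elementAt(i).equals(v.elementAt(j)))
                {
                    v.removeElementAt(j);
                    // Elements after j moved one place left, so look at j again
                    j--;
                }
            }
        }
    }

    public static void main(String args[])
    {
        Vector<String> v = new Vector<String>();
        v.add("a");
        v.add("x");
        v.add("a");
        v.add("a");

        removeDuplicates(v);
        System.out.println(v); // prints [a, x]
    }
}
```

Starting j at i+1 is enough here because, by the time the outer loop reaches i, every earlier element has already had its later duplicates removed.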

B. Logic applicable for String type vectors only

With this logic, case can be ignored when comparing the elements of a String vector. You might already know the equalsIgnoreCase() method. It compares two strings ignoring case. For example, java and Java are equal according to equalsIgnoreCase().
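
Here is a minimal sketch of how the case-insensitive version might look when run on its own. The class name IgnoreCaseDedup and the method name removeDuplicatesIgnoreCase() are invented for this example. It compares each element only with the ones after it and steps j back after a removal, because removing an element shifts the later ones one place left. It assumes the vector holds no null elements.

```java
import java.util.Vector;

class IgnoreCaseDedup
{
    // Hypothetical helper: removes later elements that match an earlier one, ignoring case
    static void removeDuplicatesIgnoreCase(Vector<String> v)
    {
        for (int i = 0; i < v.size(); i++)
        {
            for (int j = i + 1; j < v.size(); j++)
            {
                if (v.elementAt(i).equalsIgnoreCase(v.elementAt(j)))
                {
                    v.removeElementAt(j);
                    // Re-check position j, which now holds the next element
                    j--;
                }
            }
        }
    }

    public static void main(String args[])
    {
        Vector<String> v = new Vector<String>();
        v.add("Gowtham");
        v.add(".");
        v.add(".");
        v.add("gowtham");

        removeDuplicatesIgnoreCase(v);
        System.out.println(v); // prints [Gowtham, .]
    }
}
```

Unlike the equals() version, this one also drops the lower-case gowtham, since it matches Gowtham once case is ignored.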
