Tuesday, 9 April 2013

Removing duplicates from an Array or Collection - Part 2 (last part)


In the previous post we discussed how to remove duplicates from Arrays/Collections.

But then a question comes to mind:
What if we want to remove duplicate objects of user-defined classes using this approach?
Suppose we have a class with its own fields. How do we remove duplicates of such a class from an array or collection?
Suppose we have a class like this:


public class TestObject {

    private int id;    
    private String name;

    public TestObject(int id, String name) {
        this.id = id;
        this.name = name;
    }

    // getter and setter methods

    // Override toString() method to print the object in a readable 
    // format
    @Override
    public String toString() {
        return "TestObject{" + "id=" + id + ", name=" + name + '}';
    }
}

To remove duplicates from an Array of this class, we modify our method like this:

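// imports needed at the top of the file:
// import java.util.Arrays;
// import java.util.LinkedHashSet;
// import java.util.List;
// import java.util.Set;

public static TestObject[] removeDuplicates(TestObject[] inputArray) {

    // first, convert Array into List
    List<TestObject> objectsList = Arrays.asList(inputArray);

    // pass this List into the Set's constructor; LinkedHashSet drops
    // duplicates and keeps the original insertion order
    Set<TestObject> objectsSet = new LinkedHashSet<TestObject>(objectsList);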

    // create an Array of length equal to set.size() 
    TestObject[] outputArray = new TestObject[objectsSet.size()];

    // pass the newly created array to the toArray method to store 
    // Set's objects in Array and return the array
    return objectsSet.toArray(outputArray);
}
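
The same idea carries over from arrays to collections. Below is a minimal sketch of a generic variant; the class name DedupSketch and the generic signature are assumptions made for illustration and do not come from the previous post. It relies on the element type having value-based equals() and hashCode(), which String already does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;

public class DedupSketch {

    // Hypothetical generic helper: copies the input into a LinkedHashSet,
    // which drops repeated elements while keeping the order in which
    // elements were first seen, then copies the result into a new List.
    public static <T> List<T> removeDuplicates(Collection<T> input) {
        return new ArrayList<T>(new LinkedHashSet<T>(input));
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Maaz", "Hamza", "Maaz", "Salman", "Hamza");
        System.out.println(removeDuplicates(names)); // prints [Maaz, Hamza, Salman]
    }
}
```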
An example of its use is given below:


public static void main(String... args) {

    // create some objects of TestObject class
    TestObject obj1 = new TestObject(3, "Abdullah");
    // notice that obj2 holds the same values as obj1 but is a separate instance
    TestObject obj2 = new TestObject(3, "Abdullah");
    TestObject obj3 = new TestObject(1, "Maaz");
    // notice that obj4 holds the same values as obj3 but is a separate instance
    TestObject obj4 = new TestObject(1, "Maaz");
    TestObject obj5 = new TestObject(6, "Hamza");
    // obj6 refers to the same instance as obj5
    TestObject obj6 = obj5;
    TestObject obj7 = new TestObject(2, "Salman");
    TestObject obj8 = obj7;
    TestObject obj9 = new TestObject(5, "Hammad");
    TestObject obj10 = obj9;

    // create an array that contains duplicate objects
    TestObject[] arrayWithDuplicates = new TestObject[]{obj1, obj2, obj3, obj4, obj5, obj6, obj7, obj8, obj9, obj10};

    // print array with duplicates
    System.out.println("Printing Array with Duplicates------------------- ");
    for (TestObject obj : arrayWithDuplicates) {
        System.out.println(obj);
    }

    // pass this array to the removeDuplicates method and get the output in another array
    TestObject[] newArray = removeDuplicates(arrayWithDuplicates);

    System.out.println("Printing new Array------------------- ");
    for (TestObject obj : newArray) {
        System.out.println(obj);
    }
}

The output of this is:
run:
Printing Array with Duplicates------------------- 
TestObject{id=3, name=Abdullah}
TestObject{id=3, name=Abdullah}
TestObject{id=1, name=Maaz}
TestObject{id=1, name=Maaz}
TestObject{id=6, name=Hamza}
TestObject{id=6, name=Hamza}
TestObject{id=2, name=Salman}
TestObject{id=2, name=Salman}
TestObject{id=5, name=Hammad}
TestObject{id=5, name=Hammad}

Printing new Array------------------- 
TestObject{id=3, name=Abdullah}
TestObject{id=3, name=Abdullah}
TestObject{id=1, name=Maaz}
TestObject{id=1, name=Maaz}
TestObject{id=6, name=Hamza}
TestObject{id=2, name=Salman}
TestObject{id=5, name=Hammad}
BUILD SUCCESSFUL (total time: 2 seconds)
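As we can see, it has removed only the duplicates that share the same reference, because TestObject still inherits equals() and hashCode() from java.lang.Object, which treat two objects as equal only when they are the same instance, i.e.: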
    TestObject obj5 = new TestObject(6, "Hamza");
    TestObject obj6 = obj5;
    TestObject obj7 = new TestObject(2, "Salman");
    TestObject obj8 = obj7;
    TestObject obj9 = new TestObject(5, "Hammad");
    TestObject obj10 = obj9;

But it has not removed the duplicates that hold the same values in separate instances, like:
    TestObject obj1 = new TestObject(3, "Abdullah");
    TestObject obj2 = new TestObject(3, "Abdullah");
    TestObject obj3 = new TestObject(1, "Maaz");
    TestObject obj4 = new TestObject(1, "Maaz");
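
The difference between the two groups can be checked directly. The sketch below uses a hypothetical stand-in class named Plain (not part of the original code) that, like the TestObject shown so far, does not override equals() or hashCode():

```java
public class IdentityEqualityDemo {

    // Hypothetical stand-in for a class that does not override
    // equals()/hashCode() and so inherits them from java.lang.Object.
    static class Plain {
        final int id;
        final String name;

        Plain(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    public static void main(String[] args) {
        Plain a = new Plain(3, "Abdullah");
        Plain b = new Plain(3, "Abdullah"); // same values, separate instance
        Plain c = a;                         // same instance as a

        System.out.println(a.equals(b)); // prints false: Object.equals compares references
        System.out.println(a.equals(c)); // prints true: a and c are one object
    }
}
```

This is why the set kept both copies of "Abdullah" and "Maaz" but collapsed the aliased references.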

Solution:

To achieve our goal, we have to override the equals(Object) method inherited from java.lang.Object, so that it returns true when the given object has the same id and name as this object, and false otherwise.
Because LinkedHashSet is a hash-based collection, we must also override hashCode() so that equal objects return the same hash code; otherwise two equal objects can end up in different buckets and both be kept.
After overriding these methods, our TestObject class looks like:

import java.util.Objects;

public class TestObject {

    private int id;
    private String name;

    public TestObject(int id, String name) {
        this.id = id;
        this.name = name;
    }

    // getter and setter methods

    // Override toString() method to print the object in a readable
    // format
    @Override
    public String toString() {
        return "TestObject{" + "id=" + id + ", name=" + name + '}';
    }

    @Override
    public boolean equals(Object object) {

        // if given object is null, return false
        if (object == null) {
            return false;
        }

        // check if given object is an instance of TestObject class.
        // instanceof is an operator in Java that checks if an object
        // is an instance of a given class.
        if (object instanceof TestObject) {
            // cast object to TestObject
            TestObject other = (TestObject) object;

            // match the values of given object with this object and
            // return true if they match; Objects.equals also handles
            // a null name safely
            return this.id == other.id && Objects.equals(this.name, other.name);
        }
        return false;
    }

    // equal objects must return equal hash codes, otherwise hash-based
    // collections such as LinkedHashSet will not treat them as duplicates
    @Override
    public int hashCode() {
        return Objects.hash(id, name);
    }
}
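
One point worth checking when a hash-based set such as LinkedHashSet is used for this: the set compares hash codes before it ever calls equals(). The sketch below illustrates that with two hypothetical classes, EqualsOnly and EqualsAndHash, which are assumptions made for illustration and not part of the original code. No fixed output is claimed for the equals-only case, since it depends on the identity hash codes the JVM happens to assign.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Objects;

public class HashContractDemo {

    // Hypothetical class that overrides equals() but not hashCode().
    static class EqualsOnly {
        final int id;

        EqualsOnly(int id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof EqualsOnly && ((EqualsOnly) o).id == id;
        }
    }

    // Hypothetical class that overrides both methods consistently.
    static class EqualsAndHash {
        final int id;

        EqualsAndHash(int id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof EqualsAndHash && ((EqualsAndHash) o).id == id;
        }

        @Override
        public int hashCode() {
            return Objects.hash(id);
        }
    }

    // Number of elements a LinkedHashSet keeps from the given items.
    static int distinctCount(Object... items) {
        return new LinkedHashSet<Object>(Arrays.asList(items)).size();
    }

    public static void main(String[] args) {
        // Usually 2: the two instances almost always get different
        // identity hash codes, so equals() is never consulted.
        System.out.println(distinctCount(new EqualsOnly(3), new EqualsOnly(3)));

        // prints 1: equal hash codes, then equals() confirms the match
        System.out.println(distinctCount(new EqualsAndHash(3), new EqualsAndHash(3)));
    }
}
```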

After modifying the TestObject class, the output is:

run:
Printing Array with Duplicates------------------- 
TestObject{id=3, name=Abdullah}
TestObject{id=3, name=Abdullah}
TestObject{id=1, name=Maaz}
TestObject{id=1, name=Maaz}
TestObject{id=6, name=Hamza}
TestObject{id=6, name=Hamza}
TestObject{id=2, name=Salman}
TestObject{id=2, name=Salman}
TestObject{id=5, name=Hammad}
TestObject{id=5, name=Hammad}

Printing new Array------------------- 
TestObject{id=3, name=Abdullah}
TestObject{id=1, name=Maaz}
TestObject{id=6, name=Hamza}
TestObject{id=2, name=Salman}
TestObject{id=5, name=Hammad}
BUILD SUCCESSFUL (total time: 2 seconds)
Now we can see that all duplicates have been removed!
But the output is not sorted, because LinkedHashSet keeps elements in insertion order rather than sorting them.
If we try using a TreeSet in the removeDuplicates method instead, a ClassCastException is thrown at runtime, because TestObject does not implement Comparable.
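
The TreeSet failure can be reproduced in isolation. The following is a minimal sketch using a hypothetical class named Item (an assumption for illustration) that, like TestObject, does not implement Comparable:

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

public class TreeSetDemo {

    // Hypothetical class with no natural ordering (no Comparable).
    static class Item {
        final int id;

        Item(int id) {
            this.id = id;
        }
    }

    // Returns true if building a TreeSet from the items throws
    // ClassCastException, which happens when the TreeSet tries to
    // compare elements that are not Comparable.
    static boolean failsInTreeSet(List<?> items) {
        try {
            new TreeSet<Object>(items);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Item> items = Arrays.asList(new Item(1), new Item(2));
        System.out.println(failsInTreeSet(items)); // prints true
    }
}
```

Note that the code compiles without complaint; the failure only appears when the set is populated.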

In the next post, we will see how we can use TreeSet to remove duplicates and to sort Objects of such classes.
