TreeSet VS HashSet VS LinkedHashSet; TreeMap VS HashMap VS LinkedHashMap

时间:2022-07-13 21:51:55

From online resources

Set

HashSet is much faster than TreeSet (constant-time versus log-time for most operations like add, remove and contains) but offers no ordering guarantees like TreeSet.

HashSet

  • class offers constant time performance for the basic operations (add, remove, contains and size).
  • it does not guarantee that the order of elements will remain constant over time
  • iteration performance depends on the initial capacity and the load factor of the HashSet.
    • It's quite safe to accept default load factor but you may want to specify an initial capacity that's about twice the size to which you expect the set to grow.

TreeSet

  • guarantees log(n) time cost for the basic operations (add, remove and contains)
  • guarantees that elements of set will be sorted (ascending, natural, or the one specified by you via its constructor) (implements SortedSet)
  • doesn't offer any tuning parameters for iteration performance
  • offers a few handy methods to deal with the ordered set like first()last()headSet(), and tailSet() etc

Important points:

  • Both guarantee duplicate-free collection of elements
  • It is generally faster to add elements to the HashSet and then convert the collection to a TreeSet for a duplicate-free sorted traversal.
  • None of these implementation are synchronized. That is if multiple threads access a set concurrently, and at least one of the threads modifies the set, it must be synchronized externally.
  • LinkedHashSet is in some sense intermediate between HashSet and TreeSet. Implemented as a hash table with a linked list running through it, however it provides insertion-ordered iteration which is not same as sorted traversal guaranteed by TreeSet.

So choice of usage depends entirely on your needs but I feel that even if you need an ordered collection then you should still prefer HashSet to create the Set and then convert it into TreeSet.

  • e.g. SortedSet<String> s = new TreeSet<String>(hashSet);

Map is an important data structure.

1. Map Overview

There are 4 commonly used  implementations of Map in Java SE - HashMap, TreeMap, Hashtable and LinkedHashMap. If we use one sentence to describe each implementation, it would be the following:

  • HashMap is implemented as a hash table, and there is no ordering on keys or values.
  • TreeMap is implemented based on red-black tree structure, and it is ordered by the key.
  • LinkedHashMap preserves the insertion order
  • Hashtable is synchronized, in contrast to HashMap.

This gives us the reason that HashMap should be used if it is thread-safe, since Hashtable has overhead for synchronization.

2. HashMap

If key of the HashMap is self-defined objects, then equals() and hashCode() contract need to be followed.

class Dog {
String color;
 
Dog(String c) {
color = c;
}
public String toString(){
return color + " dog";
}
}
 
public class TestHashMap {
public static void main(String[] args) {
HashMap hashMap = new HashMap();
Dog d1 = new Dog("red");
Dog d2 = new Dog("black");
Dog d3 = new Dog("white");
Dog d4 = new Dog("white");
 
hashMap.put(d1, 10);
hashMap.put(d2, 15);
hashMap.put(d3, 5);
hashMap.put(d4, 20);
 
//print size
System.out.println(hashMap.size());
 
//loop HashMap
for (Entry entry : hashMap.entrySet()) {
System.out.println(entry.getKey().toString() + " - " + entry.getValue());
}
}
}

Output:

white dog – 5
black dog – 15
red dog – 10
white dog – 20

Note here, we add "white dogs" twice by mistake, but the HashMap takes it. This does not make sense, because now we are confused how many white dogs are really there.  The Dog class should be defined as follows:

class Dog {
String color;
 
Dog(String c) {
color = c;
}
 
public boolean equals(Object o) {
return ((Dog) o).color == this.color;
}
 
public int hashCode() {
return color.length();
}
 
public String toString(){
return color + " dog";
}
}

Now the output is:

red dog – 10
white dog – 20
black dog – 15

The reason is that HashMap doesn't allow two identical elements. By default, the hashCode() and equals() methods implemented in Object class are used. The default hashCode() method gives distinct integers for distinct objects, and the equals() method only returns true when  two references refer to the same object. Check out the hashCode() and equals() contract if this is not obvious to you.

3. TreeMap

A TreeMap is sorted by keys. Let's first take a look at the following example to understand the "sorted by keys" idea.

class Dog {
String color;
 
Dog(String c) {
color = c;
}
public boolean equals(Object o) {
return ((Dog) o).color == this.color;
}
 
public int hashCode() {
return color.length();
}
public String toString(){
return color + " dog";
}
}
 
 
public class TestTreeMap {
public static void main(String[] args) {
Dog d1 = new Dog("red");
Dog d2 = new Dog("black");
Dog d3 = new Dog("white");
Dog d4 = new Dog("white");
 
TreeMap treeMap = new TreeMap();
treeMap.put(d1, 10);
treeMap.put(d2, 15);
treeMap.put(d3, 5);
treeMap.put(d4, 20);
 
for (Entry entry : treeMap.entrySet()) {
System.out.println(entry.getKey() + " - " + entry.getValue());
}
}
}

Output:

Exception in thread “main” java.lang.ClassCastException: collection.Dog cannot be cast to java.lang.Comparable
at java.util.TreeMap.put(Unknown Source)
at collection.TestHashMap.main(TestHashMap.java:35)

Since TreeMaps are sorted by keys, the object for key has
to be able to compare with each other, that's why it has to implement
Comparable interface. For example,  you use String as key, because
String implements Comparable interface.  Let's change the Dog, and make
it comparable.

class Dog implements Comparable{
String color;
int size;
 
Dog(String c, int s) {
color = c;
size = s;
}
 
public String toString(){
return color + " dog";
}
 
@Override
public int compareTo(Dog o) {
return  o.size - this.size;
}
}
 
public class TestTreeMap {
public static void main(String[] args) {
Dog d1 = new Dog("red", 30);
Dog d2 = new Dog("black", 20);
Dog d3 = new Dog("white", 10);
Dog d4 = new Dog("white", 10);
 
TreeMap treeMap = new TreeMap();
treeMap.put(d1, 10);
treeMap.put(d2, 15);
treeMap.put(d3, 5);
treeMap.put(d4, 20);
 
for (Entry entry : treeMap.entrySet()) {
System.out.println(entry.getKey() + " - " + entry.getValue());
}
}
}

Output:

red dog – 10
black dog – 15
white dog – 20

It is sorted by key, i.e., dog size in this case.  If "Dog d4 = new Dog("white", 10);" is replaced with "Dog d4 = new Dog("white", 40);", the output would be:

white dog – 20
red dog – 10
black dog – 15
white dog – 5

The reason is that TreeMap now uses compareTo() method to compare keys. Different sizes make different dogs!

4. Hashtable

From Java Doc:  The HashMap class is roughly equivalent to Hashtable, except that it is unsynchronized and permits nulls.

5. LinkedHashMap

LinkedHashMap is a subclass of HashMap. That means it inherits the features of HashMap. In addition, the linked list preserves the insertion-order.  Let's replace the HashMap with LinkedHashMap using the same code used for HashMap.

class Dog {
String color;
 
Dog(String c) {
color = c;
}
 
public boolean equals(Object o) {
return ((Dog) o).color == this.color;
}
 
public int hashCode() {
return color.length();
}
 
public String toString(){
return color + " dog";
}
}
 
public class TestHashMap {
public static void main(String[] args) {
 
Dog d1 = new Dog("red");
Dog d2 = new Dog("black");
Dog d3 = new Dog("white");
Dog d4 = new Dog("white");
 
LinkedHashMap linkedHashMap = new LinkedHashMap();
linkedHashMap.put(d1, 10);
linkedHashMap.put(d2, 15);
linkedHashMap.put(d3, 5);
linkedHashMap.put(d4, 20);
 
for (Entry entry : linkedHashMap.entrySet()) {
System.out.println(entry.getKey() + " - " + entry.getValue());
}
}
}

Output is:

red dog – 10
black dog – 15
white dog – 20

The difference is that if we use HashMap the output could be the following - the insertion order is not preserved.

red dog – 10
white dog – 20
black dog – 15