大廠之路的第五篇 HashMap(散列表)
前面幾篇我們介紹了兩種線性表:順序表和鏈表搜锰。這兩種線性表它們各有優(yōu)缺點(diǎn):順序表適合隨機(jī)查找比較多的場(chǎng)景半夷,而鏈表適合與需要頻繁插入刪除的場(chǎng)景掩缓。
那么睛藻,有沒有一種集合查找也快插入刪除也沒那么耗時(shí)呢蓄诽?答案是肯定的功偿,它就是我們今天要介紹的 散列表 也稱 哈希表筑舅。
HashMap
是如何做到查找也快插入刪除也快的呢座慰?
老樣子,我們還是到源碼里面去一探究竟翠拣。
我們先看一下它的put
方法:
public V put(K key, V value) {
return putVal(hash(key), key, value, false, true);
}
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
boolean evict) {
Node<K,V>[] tab; Node<K,V> p; int n, i;
if ((tab = table) == null || (n = tab.length) == 0)
n = (tab = resize()).length;
if ((p = tab[i = (n - 1) & hash]) == null)
tab[i] = newNode(hash, key, value, null);
else {
Node<K,V> e; K k;
if (p.hash == hash &&
((k = p.key) == key || (key != null && key.equals(k))))
e = p;
else if (p instanceof TreeNode)
e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
else {
for (int binCount = 0; ; ++binCount) {
if ((e = p.next) == null) {
p.next = newNode(hash, key, value, null);
if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
treeifyBin(tab, hash);
break;
}
if (e.hash == hash &&
((k = e.key) == key || (key != null && key.equals(k))))
break;
p = e;
}
}
if (e != null) { // existing mapping for key
V oldValue = e.value;
if (!onlyIfAbsent || oldValue == null)
e.value = value;
afterNodeAccess(e);
return oldValue;
}
}
++modCount;
if (++size > threshold)
resize();
afterNodeInsertion(evict);
return null;
}
我們不難發(fā)現(xiàn)版仔,這個(gè)函數(shù)里面又一個(gè)很重要的結(jié)構(gòu)--Node<K,V>
,我們?cè)诜治?code>LinkedList的源碼過程中也有這么一個(gè)Node
,不同的地方在于在HashMap
中這個(gè)Node
是以key误墓,value即鍵值對(duì)的形式存在的蛮粮。
ok,那我們重點(diǎn)來看一下Node
這個(gè)內(nèi)部類谜慌。
static class Node<K,V> implements Map.Entry<K,V> {
final int hash;
final K key;
V value;
Node<K,V> next;
Node(int hash, K key, V value, Node<K,V> next) {
this.hash = hash;
this.key = key;
this.value = value;
this.next = next;
}
public final K getKey() { return key; }
public final V getValue() { return value; }
public final String toString() { return key + "=" + value; }
public final int hashCode() {
return Objects.hashCode(key) ^ Objects.hashCode(value);
}
public final V setValue(V newValue) {
V oldValue = value;
value = newValue;
return oldValue;
}
public final boolean equals(Object o) {
if (o == this)
return true;
if (o instanceof Map.Entry) {
Map.Entry<?,?> e = (Map.Entry<?,?>)o;
if (Objects.equals(key, e.getKey()) &&
Objects.equals(value, e.getValue()))
return true;
}
return false;
}
}
interface Entry<K,V> {
/**
* Returns the key corresponding to this entry.
*
* @return the key corresponding to this entry
* @throws IllegalStateException implementations may, but are not
* required to, throw this exception if the entry has been
* removed from the backing map.
*/
K getKey();
/**
* Returns the value corresponding to this entry. If the mapping
* has been removed from the backing map (by the iterator's
* <tt>remove</tt> operation), the results of this call are undefined.
*
* @return the value corresponding to this entry
* @throws IllegalStateException implementations may, but are not
* required to, throw this exception if the entry has been
* removed from the backing map.
*/
V getValue();
/**
* Replaces the value corresponding to this entry with the specified
* value (optional operation). (Writes through to the map.) The
* behavior of this call is undefined if the mapping has already been
* removed from the map (by the iterator's <tt>remove</tt> operation).
*
* @param value new value to be stored in this entry
* @return old value corresponding to the entry
* @throws UnsupportedOperationException if the <tt>put</tt> operation
* is not supported by the backing map
* @throws ClassCastException if the class of the specified value
* prevents it from being stored in the backing map
* @throws NullPointerException if the backing map does not permit
* null values, and the specified value is null
* @throws IllegalArgumentException if some property of this value
* prevents it from being stored in the backing map
* @throws IllegalStateException implementations may, but are not
* required to, throw this exception if the entry has been
* removed from the backing map.
*/
V setValue(V value);
/**
* Compares the specified object with this entry for equality.
* Returns <tt>true</tt> if the given object is also a map entry and
* the two entries represent the same mapping. More formally, two
* entries <tt>e1</tt> and <tt>e2</tt> represent the same mapping
* if<pre>
* (e1.getKey()==null ?
* e2.getKey()==null : e1.getKey().equals(e2.getKey())) &&
* (e1.getValue()==null ?
* e2.getValue()==null : e1.getValue().equals(e2.getValue()))
* </pre>
* This ensures that the <tt>equals</tt> method works properly across
* different implementations of the <tt>Map.Entry</tt> interface.
*
* @param o object to be compared for equality with this map entry
* @return <tt>true</tt> if the specified object is equal to this map
* entry
*/
boolean equals(Object o);
/**
* Returns the hash code value for this map entry. The hash code
* of a map entry <tt>e</tt> is defined to be: <pre>
* (e.getKey()==null ? 0 : e.getKey().hashCode()) ^
* (e.getValue()==null ? 0 : e.getValue().hashCode())
* </pre>
* This ensures that <tt>e1.equals(e2)</tt> implies that
* <tt>e1.hashCode()==e2.hashCode()</tt> for any two Entries
* <tt>e1</tt> and <tt>e2</tt>, as required by the general
* contract of <tt>Object.hashCode</tt>.
*
* @return the hash code value for this map entry
* @see Object#hashCode()
* @see Object#equals(Object)
* @see #equals(Object)
*/
int hashCode();
/**
* Returns a comparator that compares {@link Map.Entry} in natural order on key.
*
* <p>The returned comparator is serializable and throws {@link
* NullPointerException} when comparing an entry with a null key.
*
* @param <K> the {@link Comparable} type of then map keys
* @param <V> the type of the map values
* @return a comparator that compares {@link Map.Entry} in natural order on key.
* @see Comparable
* @since 1.8
*/
public static <K extends Comparable<? super K>, V> Comparator<Map.Entry<K,V>> comparingByKey() {
return (Comparator<Map.Entry<K, V>> & Serializable)
(c1, c2) -> c1.getKey().compareTo(c2.getKey());
}
/**
* Returns a comparator that compares {@link Map.Entry} in natural order on value.
*
* <p>The returned comparator is serializable and throws {@link
* NullPointerException} when comparing an entry with null values.
*
* @param <K> the type of the map keys
* @param <V> the {@link Comparable} type of the map values
* @return a comparator that compares {@link Map.Entry} in natural order on value.
* @see Comparable
* @since 1.8
*/
public static <K, V extends Comparable<? super V>> Comparator<Map.Entry<K,V>> comparingByValue() {
return (Comparator<Map.Entry<K, V>> & Serializable)
(c1, c2) -> c1.getValue().compareTo(c2.getValue());
}
/**
* Returns a comparator that compares {@link Map.Entry} by key using the given
* {@link Comparator}.
*
* <p>The returned comparator is serializable if the specified comparator
* is also serializable.
*
* @param <K> the type of the map keys
* @param <V> the type of the map values
* @param cmp the key {@link Comparator}
* @return a comparator that compares {@link Map.Entry} by the key.
* @since 1.8
*/
public static <K, V> Comparator<Map.Entry<K, V>> comparingByKey(Comparator<? super K> cmp) {
Objects.requireNonNull(cmp);
return (Comparator<Map.Entry<K, V>> & Serializable)
(c1, c2) -> cmp.compare(c1.getKey(), c2.getKey());
}
/**
* Returns a comparator that compares {@link Map.Entry} by value using the given
* {@link Comparator}.
*
* <p>The returned comparator is serializable if the specified comparator
* is also serializable.
*
* @param <K> the type of the map keys
* @param <V> the type of the map values
* @param cmp the value {@link Comparator}
* @return a comparator that compares {@link Map.Entry} by the value.
* @since 1.8
*/
public static <K, V> Comparator<Map.Entry<K, V>> comparingByValue(Comparator<? super V> cmp) {
Objects.requireNonNull(cmp);
return (Comparator<Map.Entry<K, V>> & Serializable)
(c1, c2) -> cmp.compare(c1.getValue(), c2.getValue());
}
}
可以看出然想,主要的數(shù)據(jù)操作還是對(duì)這個(gè)Node
的操作。
在分析HashMap
的幾個(gè)重要函數(shù)之前欣范,我想先給大家介紹一下HashMap
存儲(chǔ)數(shù)據(jù)的主要形式又沾,也就是說它的數(shù)據(jù)結(jié)構(gòu)到底是怎樣的弊仪。
下面這張圖可以大概說明。為什么說是大概說明杖刷,因?yàn)樵趈ava8中HashMap
引入了紅黑樹來取代鏈表励饵。
現(xiàn)在我們?cè)偕钊氲奶骄?code>HashMap的幾個(gè)重要函數(shù):
1.構(gòu)造函數(shù)
public HashMap() {
this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
}
public HashMap(int initialCapacity) {
this(initialCapacity, DEFAULT_LOAD_FACTOR);
}
public HashMap(int initialCapacity, float loadFactor) {
if (initialCapacity < 0)
throw new IllegalArgumentException("Illegal initial capacity: " +
initialCapacity);
if (initialCapacity > MAXIMUM_CAPACITY)
initialCapacity = MAXIMUM_CAPACITY;
if (loadFactor <= 0 || Float.isNaN(loadFactor))
throw new IllegalArgumentException("Illegal load factor: " +
loadFactor);
this.loadFactor = loadFactor;
this.threshold = tableSizeFor(initialCapacity);
}
public HashMap(Map<? extends K, ? extends V> m) {
this.loadFactor = DEFAULT_LOAD_FACTOR;
putMapEntries(m, false);
}
在前三個(gè)構(gòu)造函數(shù)中,主要是對(duì)HashMap
兩個(gè)重要的參數(shù)進(jìn)行賦值滑燃。
一個(gè)是loadFactor
:擴(kuò)容因子役听,要求要小于1大于0,當(dāng)容量達(dá)到閾值時(shí)表窘,就需要對(duì)哈希表進(jìn)行擴(kuò)容典予,而這個(gè)閾值就是由當(dāng)前容量和擴(kuò)容因子共同決定的。
另一個(gè)是initialCapacity
:即哈希表的初始容量乐严。
我們看一下第三個(gè)構(gòu)造函數(shù)中瘤袖,有這么一行代碼
this.threshold = tableSizeFor(initialCapacity);
我們仔細(xì)來看一下tableSizeFor
這個(gè)函數(shù):
static final int tableSizeFor(int cap) {
int n = cap - 1;
n |= n >>> 1;
n |= n >>> 2;
n |= n >>> 4;
n |= n >>> 8;
n |= n >>> 16;
return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}
關(guān)于這個(gè)算法的解釋,可以參考這篇文章昂验,在這里我們就不重復(fù)的去做解釋了捂敌。
接下來我們重點(diǎn)看第四個(gè)構(gòu)造函數(shù):
public HashMap(Map<? extends K, ? extends V> m) {
this.loadFactor = DEFAULT_LOAD_FACTOR;
putMapEntries(m, false);
}
final void putMapEntries(Map<? extends K, ? extends V> m, boolean evict) {
int s = m.size();
if (s > 0) {
if (table == null) { // pre-size
float ft = ((float)s / loadFactor) + 1.0F;
int t = ((ft < (float)MAXIMUM_CAPACITY) ?
(int)ft : MAXIMUM_CAPACITY);
if (t > threshold)
threshold = tableSizeFor(t);
}
else if (s > threshold)
resize();
for (Map.Entry<? extends K, ? extends V> e : m.entrySet()) {
K key = e.getKey();
V value = e.getValue();
putVal(hash(key), key, value, false, evict);
}
}
}
主要是看putMapEntries
這個(gè)函數(shù),先判斷哈希表的大小是不是大于0既琴,然后再判斷table
是不是為空占婉,也就是我們的哈希表內(nèi)有沒有元素。如果是空的甫恩,那么還得像前面那三個(gè)構(gòu)造函數(shù)一樣先對(duì)必要的參數(shù)賦值逆济;如果不為空判斷要添加的哈希表的size是否達(dá)到了擴(kuò)容閾值,如果達(dá)到了磺箕,就需要對(duì)哈希表進(jìn)行擴(kuò)容奖慌,也就是走resize
函數(shù)。
final Node<K,V>[] resize() {
Node<K,V>[] oldTab = table;
int oldCap = (oldTab == null) ? 0 : oldTab.length;
int oldThr = threshold;
int newCap, newThr = 0;
if (oldCap > 0) {
if (oldCap >= MAXIMUM_CAPACITY) {
threshold = Integer.MAX_VALUE;
return oldTab;
}
else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
oldCap >= DEFAULT_INITIAL_CAPACITY)
newThr = oldThr << 1; // double threshold
}
else if (oldThr > 0) // initial capacity was placed in threshold
newCap = oldThr;
else { // zero initial threshold signifies using defaults
newCap = DEFAULT_INITIAL_CAPACITY;
newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
}
if (newThr == 0) {
float ft = (float)newCap * loadFactor;
newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
(int)ft : Integer.MAX_VALUE);
}
threshold = newThr;
@SuppressWarnings({"rawtypes","unchecked"})
Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
table = newTab;
if (oldTab != null) {
for (int j = 0; j < oldCap; ++j) {
Node<K,V> e;
if ((e = oldTab[j]) != null) {
oldTab[j] = null;
if (e.next == null)
newTab[e.hash & (newCap - 1)] = e;
else if (e instanceof TreeNode)
((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
else { // preserve order
Node<K,V> loHead = null, loTail = null;
Node<K,V> hiHead = null, hiTail = null;
Node<K,V> next;
do {
next = e.next;
if ((e.hash & oldCap) == 0) {
if (loTail == null)
loHead = e;
else
loTail.next = e;
loTail = e;
}
else {
if (hiTail == null)
hiHead = e;
else
hiTail.next = e;
hiTail = e;
}
} while ((e = next) != null);
if (loTail != null) {
loTail.next = null;
newTab[j] = loHead;
}
if (hiTail != null) {
hiTail.next = null;
newTab[j + oldCap] = hiHead;
}
}
}
}
}
return newTab;
}
resize
這個(gè)函數(shù)很重要松靡,所以我們一行一行的來分析它升薯。
首先,先判斷舊的哈希表是否為空:如果不為空击困,根據(jù)原來的大小確定是否需要擴(kuò)容涎劈,如果需要擴(kuò)容的話則是將擴(kuò)容閾值擴(kuò)大到之前的2倍。如果為空阅茶,判斷舊的擴(kuò)容閾值是否大于0蛛枚。如果大于0,則將其賦給新的哈希表容量脸哀;否則蹦浦,新的哈希表容量則為默認(rèn)容量即16,擴(kuò)容閾值則為16*0.75即12.
第二步撞蜂,經(jīng)過上一輪賦值過后盲镶,判斷新的擴(kuò)容閾值是否等于0侥袜。如果等于0,則對(duì)新的擴(kuò)容閾值進(jìn)行賦值溉贿。最終也就確定的新的擴(kuò)容閾值枫吧。
第三步,根據(jù)新的容量創(chuàng)建一個(gè)新的數(shù)組newTab
,判斷以前的哈希表中的table
數(shù)組是否為空宇色,如果不為空九杂,對(duì)擴(kuò)容前的哈希表的table
進(jìn)行遍歷,
然后對(duì)table中每一個(gè)鏈表進(jìn)行遍歷宣蠕。簡(jiǎn)單來說就是:遍歷hashmap
每個(gè)bucket里的鏈表例隆,每個(gè)鏈表可能會(huì)被拆分成兩個(gè)鏈表,不需要移動(dòng)的元素置入loHead為首的鏈表抢蚀,需要移動(dòng)的元素置入hiHead為首的鏈表镀层,然后分別分配給老的buket和新的buket。
上面就是整個(gè)resize或者叫rehash的過程皿曲,可能理解起來會(huì)比較困難唱逢,需要反復(fù)的去思考去驗(yàn)證。
resize的過程主要是兩步谷饿,一步是擴(kuò)容惶我,一步是對(duì)擴(kuò)容前的hash表的數(shù)據(jù)重新擺放位置的過程妈倔。
在resize之后博投,還要將即將添加的數(shù)據(jù)加入到新的哈希表當(dāng)中去。主要是調(diào)用putVal
這個(gè)函數(shù)盯蝴。
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
boolean evict) {
Node<K,V>[] tab; Node<K,V> p; int n, i;
if ((tab = table) == null || (n = tab.length) == 0)
n = (tab = resize()).length;
if ((p = tab[i = (n - 1) & hash]) == null)
tab[i] = newNode(hash, key, value, null);
else {
Node<K,V> e; K k;
if (p.hash == hash &&
((k = p.key) == key || (key != null && key.equals(k))))
e = p;
else if (p instanceof TreeNode)
e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
else {
for (int binCount = 0; ; ++binCount) {
if ((e = p.next) == null) {
p.next = newNode(hash, key, value, null);
if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
treeifyBin(tab, hash);
break;
}
if (e.hash == hash &&
((k = e.key) == key || (key != null && key.equals(k))))
break;
p = e;
}
}
if (e != null) { // existing mapping for key
V oldValue = e.value;
if (!onlyIfAbsent || oldValue == null)
e.value = value;
afterNodeAccess(e);
return oldValue;
}
}
++modCount;
if (++size > threshold)
resize();
afterNodeInsertion(evict);
return null;
}
這個(gè)函數(shù)也很重要毅哗,我們還是一行一行的來解讀它。
第一個(gè)if
語句捧挺,判斷添加前哈希表是否為空虑绵,如果為空則需要走一個(gè)resize
的過程,上面我們已經(jīng)分析過了闽烙。
第二步翅睛,計(jì)算hash并判斷對(duì)應(yīng)位置上是否有值,如果沒有值則直接將新的Node
存入指定位置黑竞;如果有值:1.判斷key和hash是否相同捕发,如果相同的話則將說明已經(jīng)存在了指定的key,只需要更新對(duì)應(yīng)value就行了很魂。2.判斷對(duì)應(yīng)節(jié)點(diǎn)是不是紅黑樹扎酷,如果是紅黑樹則調(diào)用紅黑樹的對(duì)應(yīng)方法(這一點(diǎn)咱們暫時(shí)帶過,后面我們會(huì)單獨(dú)分析紅黑樹的數(shù)據(jù)結(jié)構(gòu))3.如果以上兩者都不滿足遏匆,則遍歷bucket對(duì)應(yīng)的鏈表法挨,要么找到對(duì)應(yīng)的key谁榜,只需要更新value;要么找不到凡纳,則將數(shù)據(jù)添加入鏈表尾部窃植。
以上兩個(gè)函數(shù)是HashMap
中最重要的兩個(gè)函數(shù),理解了這兩個(gè)函數(shù)惫企,那HashMap
其實(shí)也就理解的差不多了撕瞧。
總結(jié)一下,HashMap主要的兩個(gè)函數(shù):resize和putval狞尔。
對(duì)于一個(gè)集合來說丛版,最重要的操作就是增刪改查。而在HashMap
中偏序,這幾個(gè)操作必要的步驟都是先通過hash尋找到其在數(shù)組的位置页畦,然后對(duì)比key和hash來找到對(duì)應(yīng)的值。這個(gè)就是HashMap
的關(guān)鍵研儒。
所以豫缨,為什么說HashMap
結(jié)合了順序表和鏈表的缺點(diǎn)呢,因?yàn)樗鼘㈨樞虮淼臄?shù)組存儲(chǔ)和鏈表的鏈?zhǔn)酱鎯?chǔ)相結(jié)合端朵,所以就具有了這兩者的優(yōu)點(diǎn)好芭。
今天就分析到這了,晚安各位~~~