HashMap implementation principle analysis

Hash concept

The concept of hashing (also called a hash)

Hashing is a method of converting a string of characters (or other input) into a fixed-length, usually shorter, value or index, called a hash value. Since searching a database with a short hash value is faster than searching with the original value, hashing is commonly used for database indexing and lookup, and it is also used in various cryptographic algorithms.

HashMap concept

Concept

HashMap is an unsynchronized implementation of the Map interface based on a hash table. It provides all of the optional map operations and permits null values and the null key. HashMap stores key-value pairs and offers fast lookup. The class makes no guarantees about the order of the map; in particular, it does not guarantee that the order will remain constant over time.
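As a small illustration of the null handling described above, the following sketch stores one null key and one null value in a java.util.HashMap. The class name NullKeyDemo and the helper build() are made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; only HashMap and Map come from the JDK.
class NullKeyDemo {
    static Map<String, String> build() {
        Map<String, String> map = new HashMap<>();
        map.put(null, "value for the null key"); // a single null key is allowed
        map.put("k", null);                      // null values are allowed too
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = build();
        System.out.println(map.get(null));        // the value stored under the null key
        System.out.println(map.containsKey("k")); // the key exists even though its value is null
        System.out.println(map.get("k"));         // null, so get() alone cannot tell "absent" from "null value"
    }
}
```

Because get() returns null both for a missing key and for a key mapped to null, containsKey() is the way to tell the two cases apart.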

[Figure: the location of HashMap in the Map system]

Data structure

HashMap is actually a "linked-list hash" (separate chaining) data structure, that is, a combination of an array and linked lists.

Array: storage is contiguous and takes up a large block of memory; addressing is easy, but insertion and deletion are difficult.
Linked list: storage is scattered and memory use is flexible; addressing is difficult, but insertion and deletion are easy.
HashMap combines these two data structures to get both easy addressing and easy insertion and deletion.

[Figure: the structure of HashMap (picture from Alibaba Cloud)]

The basic storage principle and working principle of HashMap

The basic storage principle of HashMap and the composition of the stored content

Basic principle: first declare an array with a suitably large index range to store the elements. Then design a hash function that maps each element's key to a function value (an index into the array). Each element stored in the array is an Entry object with three fields: key and value (the key-value pair) and next (a reference to the next Entry).

For example, suppose the first key-value pair, A, comes in and the index computed from the hash of its key is 0. Record this as Entry[0] = A.
The second key-value pair, B, also gets index 0, so HashMap sets B.next = A and Entry[0] = B.
The third key-value pair, C, also gets index 0, so C.next = B and Entry[0] = C. Index 0 therefore holds three key-value pairs, A, B, and C, linked together through the next field. We call such a slot a bucket. (Inserting at the head of the bucket is the behavior of JDK 7 and earlier; since JDK 8, new nodes are appended at the tail.) Different elements may produce the same function value, which creates a "conflict" (collision) that has to be resolved. "Direct addressing" and "conflict resolution" are the two main characteristics of a hash table.
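The A, B, C walk-through above can be sketched in a few lines. This is only an illustration under the assumption that all three keys hash to index 0; Node, addFirst, and describe are hypothetical names, not the JDK's internal classes.

```java
// Hypothetical teaching sketch of one bucket with head insertion.
class BucketDemo {
    // One entry in a bucket: key, value, and a link to the next entry.
    static final class Node {
        final String key;
        final String value;
        Node next;

        Node(String key, String value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    // Insert at the head: the new node points at the old head and becomes the bucket's first entry.
    static Node addFirst(Node head, String key, String value) {
        return new Node(key, value, head);
    }

    // Join the keys of a bucket from head to tail.
    static String describe(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) {
                sb.append(" -> ");
            }
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node[] table = new Node[16];
        int index = 0; // assumption: A, B, and C all hash to index 0
        table[index] = addFirst(table[index], "A", "1");
        table[index] = addFirst(table[index], "B", "2");
        table[index] = addFirst(table[index], "C", "3");
        System.out.println(describe(table[index])); // prints "C -> B -> A"
    }
}
```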

The working principle of HashMap and the access method process

How HashMap works: HashMap is based on the principle of hashing. We use put(key, value) to store objects in a HashMap and get(key) to retrieve them. When we pass a key and a value to put(), HashMap first calls hashCode() on the key, and the returned hash code is used to find the bucket in which to store the Entry object. HashMap stores both the key object and the value object in the bucket as a Map.Entry; it does not store only the value.

HashMap specific analysis

HashMap specific access process

The process of the put(key, value) method is as follows (the original flowchart is from Meituan):

1. Check whether the bucket array table is null or empty; if so, call resize() to create it.
2. Compute the hash value from the key to get the array index i. If table[i] == null, add a new node directly and go to step 6; if table[i] is not empty, go to step 3.
3. Check whether the first element of table[i] has the same key. If so, overwrite its value directly; otherwise go to step 4. "Same" here means equal by both hashCode and equals.
4. Check whether table[i] is a TreeNode, that is, whether the bucket is a red-black tree. If it is, insert the key-value pair into the tree; otherwise go to step 5.
5. Traverse the linked list at table[i]. If the key is found during the traversal, overwrite its value directly. Otherwise append the new node to the list, and if the list length is then greater than 8, convert the list to a red-black tree (in JDK 8, if the table has fewer than 64 buckets, the table is resized instead of building a tree).
6. After a successful insertion, check whether the actual number of key-value pairs (size) exceeds the threshold (capacity * load factor). If it does, call resize() to expand.
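The steps above can be sketched as a small chained map. This is a teaching sketch, not the source of java.util.HashMap: it skips step 4 entirely (no red-black trees), and the class SimpleChainedMap and its members are hypothetical names chosen for this example.

```java
import java.util.Objects;

// Hypothetical simplified map: array of buckets plus singly linked lists, no trees.
class SimpleChainedMap<K, V> {
    private static final class Node<K, V> {
        final int hash;
        final K key;
        V value;
        Node<K, V> next;

        Node(int hash, K key, V value) {
            this.hash = hash;
            this.key = key;
            this.value = value;
        }
    }

    private static final float LOAD_FACTOR = 0.75f;
    private Node<K, V>[] table; // created lazily on the first put (step 1)
    private int size;
    private int threshold;

    private static int hash(Object key) {
        int h = Objects.hashCode(key); // 0 for a null key
        return h ^ (h >>> 16);         // mix the high bits into the low bits
    }

    /** Returns the previous value for the key, or null if there was none. */
    V put(K key, V value) {
        if (table == null || table.length == 0) {
            resize();                                        // step 1
        }
        int h = hash(key);
        int i = (table.length - 1) & h;                      // step 2
        Node<K, V> p = table[i];
        if (p == null) {
            table[i] = new Node<>(h, key, value);
        } else {
            while (true) {                                   // steps 3 and 5
                if (p.hash == h && Objects.equals(p.key, key)) {
                    V old = p.value;
                    p.value = value;                         // same key: overwrite
                    return old;
                }
                if (p.next == null) {
                    p.next = new Node<>(h, key, value);      // append at the tail
                    break;
                }
                p = p.next;
            }
        }
        if (++size > threshold) {
            resize();                                        // step 6
        }
        return null;
    }

    V get(Object key) {
        if (table == null) {
            return null;
        }
        int h = hash(key);
        for (Node<K, V> n = table[(table.length - 1) & h]; n != null; n = n.next) {
            if (n.hash == h && Objects.equals(n.key, key)) {
                return n.value;
            }
        }
        return null;
    }

    int size() {
        return size;
    }

    int capacity() {
        return table == null ? 0 : table.length;
    }

    @SuppressWarnings("unchecked")
    private void resize() {
        Node<K, V>[] old = table;
        int newCap = (old == null) ? 16 : old.length * 2;
        Node<K, V>[] fresh = (Node<K, V>[]) new Node[newCap];
        if (old != null) {
            for (Node<K, V> head : old) {
                while (head != null) {                       // relink every node into its new bucket
                    Node<K, V> next = head.next;
                    int i = (newCap - 1) & head.hash;
                    head.next = fresh[i];
                    fresh[i] = head;
                    head = next;
                }
            }
        }
        table = fresh;
        threshold = (int) (newCap * LOAD_FACTOR);
    }
}
```

With the assumed defaults of 16 buckets and a load factor of 0.75, the threshold is 12, so the 13th distinct key triggers a resize to 32 buckets.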

The process of the get(key) method is:
1. Given the key, compute its hash value through the hash function, starting from key.hashCode():
int hash = key.hashCode();

2. Call the internal method getNode() to get the bucket index. Conceptually this is the hash value modulo the number of buckets; because the table length is always a power of two, HashMap computes it with a bit mask:
int index = (table.length - 1) & hash;

3. Compare the key with the elements inside the bucket. If none is equal, the key is not found; if one is equal, return the value of that entry.

4. If the head node of the bucket happens to be a red-black tree node, call that node's getTreeNode() method; otherwise traverse the linked list. getTreeNode() performs the lookup by calling the tree node's find() method. Because the tree is kept ordered as nodes are added, the lookup is essentially a binary search, which is very efficient.

5. If a node's hash value equals the hash value being searched for, check whether the keys are equal. If they are, return that node directly; if not, continue searching recursively in the subtrees.
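Steps 1 and 2 can be illustrated with the hash-spreading and index computation used by java.util.HashMap since JDK 8. The class HashIndexDemo and the method names spread and indexFor are hypothetical; the two expressions themselves mirror what the JDK does.

```java
// Hypothetical demo of how a key's hash code becomes a bucket index.
class HashIndexDemo {
    // XOR the high 16 bits into the low 16 bits so that small tables still see the high bits.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // For a power-of-two table length, masking is equivalent to a non-negative modulo.
    static int indexFor(int hash, int tableLength) {
        return (tableLength - 1) & hash;
    }

    public static void main(String[] args) {
        int tableLength = 16;
        for (String key : new String[] {"apple", "banana", "cherry"}) {
            int hash = spread(key);
            System.out.println(key + " -> bucket " + indexFor(hash, tableLength));
        }
        // A negative hash still yields a valid index, unlike the % operator in Java.
        System.out.println(indexFor(-7, tableLength)); // prints 9
        System.out.println(-7 % tableLength);          // prints -7
    }
}
```

The mask form is the reason HashMap keeps its table length a power of two: the index is always in range, even for negative hash values.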

The direct address in HashMap is produced by the hash function; conflicts are resolved by the comparison function (equals). If each bucket contains only one element, a lookup needs only one comparison. Lookups are even faster when many buckets are empty, because a miss is detected immediately.

Collision detection in HashMap and solution to collision

1. What happens when the hashcodes of two objects are the same?

When the hash codes of two objects are the same, their bucket positions are the same and a "collision" occurs. Because HashMap stores the objects of a bucket in a linked list, the new Entry (the Map.Entry object containing the key-value pair) is added to that linked list. The two objects have the same hash code, but they may not be equal.

2. When the hashcodes of two objects are the same, how do you get the values of these two objects?

When the hash codes of two keys are the same and we call get(), HashMap uses the hash code of the key object to find the bucket and then traverses the linked list, calling key.equals() on each node until it finds the correct node and returns its value object.

Using immutable objects declared final, with proper equals() and hashCode() methods, as keys reduces collisions and improves efficiency. Immutability makes it possible to cache a key's hash code, which speeds up lookups overall. String and wrapper classes such as Integer are good choices for keys.
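A concrete pair of colliding keys makes this easier to see. The strings "Aa" and "BB" have the same hashCode() in Java (2112) but are not equal, so they land in the same bucket and equals() is what tells them apart. CollisionDemo and build() are hypothetical names for this sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo: two distinct keys with identical hash codes.
class CollisionDemo {
    static Map<String, String> build() {
        Map<String, String> map = new HashMap<>();
        map.put("Aa", "first");
        map.put("BB", "second"); // same hash code as "Aa", so same bucket
        return map;
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true: the keys collide
        System.out.println("Aa".equals("BB"));                  // false: they are different keys
        Map<String, String> map = build();
        System.out.println(map.get("Aa")); // first
        System.out.println(map.get("BB")); // second
    }
}
```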

HashMap expansion mechanism (how to adjust the size of HashMap)

Expansion (resize) recalculates the capacity. As elements keep being added to a HashMap, its internal array eventually cannot hold them within the load-factor limit, and the object must enlarge the array so that more elements can be stored.

An instance of HashMap has two parameters that affect its performance: initial capacity and load factor. The capacity is the number of buckets in the hash table, and the initial capacity is simply the capacity at the time the hash table is created. The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased. When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the hash table is rehashed (that is, its internal data structures are rebuilt) so that it has about twice the number of buckets.

As a general rule, the default load factor (0.75) provides a good compromise between time and space costs. Higher values reduce space overhead but increase lookup cost (reflected in most operations of the HashMap class, including get and put). When setting the initial capacity, take into account the expected number of entries in the map and its load factor, so as to minimize the number of rehash operations. If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash will ever occur.
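The sizing rule above is simple arithmetic, sketched below. The helper names initialCapacityFor and roundUpToPowerOfTwo are hypothetical; the rounding helper only imitates the fact that HashMap rounds the requested capacity up to a power of two, and is not the JDK's own code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helpers for choosing an initial capacity that avoids rehashing.
class CapacityDemo {
    // Smallest capacity c such that c * loadFactor >= expectedEntries.
    static int initialCapacityFor(int expectedEntries, float loadFactor) {
        return (int) Math.ceil(expectedEntries / (double) loadFactor);
    }

    // Round up to the next power of two, as HashMap does with the requested capacity.
    static int roundUpToPowerOfTwo(int cap) {
        return (cap <= 1) ? 1 : Integer.highestOneBit(cap - 1) << 1;
    }

    public static void main(String[] args) {
        int expected = 100;
        float loadFactor = 0.75f;
        int requested = initialCapacityFor(expected, loadFactor); // 134
        int buckets = roundUpToPowerOfTwo(requested);             // 256
        int threshold = (int) (buckets * loadFactor);             // 192
        System.out.println(requested + " requested, " + buckets + " buckets, threshold " + threshold);

        // Passing the computed capacity means 100 entries fit without any resize.
        Map<Integer, String> map = new HashMap<>(requested, loadFactor);
        for (int i = 0; i < expected; i++) {
            map.put(i, "v" + i);
        }
        System.out.println(map.size()); // 100
    }
}
```

For 100 expected entries and the default load factor, 100 / 0.75 rounds up to 134, which HashMap treats as 256 buckets with a threshold of 192, comfortably above 100.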

Possible problems with HashMap expansion

When a HashMap is resized in a multithreaded setting, a race condition can occur: if two threads both find that the HashMap needs to be resized, they will try to resize it at the same time. During resizing in JDK 7 and earlier, the order of the elements in a linked list is reversed, because when moving entries to the new bucket the HashMap inserts each one at the head of the list rather than the tail, to avoid traversing to the tail. If the race occurs, the list can end up containing a cycle, and later operations on that bucket loop forever.
