Mathematical foundation of Maps and collisions — Let’s
break it down from a mathematical perspective:
Ø What
is a Map in Math?
In mathematics, a map (or function) is a relationship
between a set of inputs (domain) and a set of possible outputs (codomain).
Definition:
A function f: A → B maps each element a ∈
A to a unique element b ∈ B.
Ø How
HashMap Reflects a Math Map
In Java:
Map<K,
V> map = new HashMap<>(); |
This conceptually mirrors:
f: Key →
Value |
But unlike pure math, we use hashing to determine
where to store/retrieve values, and this leads to collisions.
Ø Hash
Function as a Mathematical Function
A hash function is a mathematical function:
h:K→N |
Where:
- K is
the set of all possible keys.
- â„•
is the set of natural numbers (usually bounded by array size).
Example:
If you have:
java
hashCode("apple")
= 99162322 |
And bucket array size = 16:
h("apple")
= 99162322mod16 = 2 |
So "apple" gets stored in bucket index 2.
Ø Collisions
– The Core Problem
What is a Collision?
Two different keys k1 ≠ k2 map to the same index:
h(k1)modN=h(k2)modN |
This violates injectivity (1-1 mapping) in
mathematics.
Probability of Collisions: The Birthday Paradox
Let’s say:
- You
have n items (keys).
- HashMap
has m buckets (indexes).
The probability P(collision) increases as n grows:
$$P(collision)≈ 1−e^{-\frac{n^2}{(2m)}}$$ |
Example: If n = 23, m = 365 → there's a 50.7% chance
of at least one collision — just like birthdays!
Ø Collision
Resolution Techniques (Math-Wise)
A. Chaining (Linked Lists):
Multiple keys in the same bucket → stored as a list:
Bucket[i]=k1,k2,k3...ifh(k1)≡h(k2)≡h(k3)(modN) |
B. Open Addressing (Linear Probing):
If collision occurs, find the next available slot:
Index=(h(k)+i)modN |
Where i = 0, 1, 2... until empty spot is found.
C. Quadratic Probing:
Index=(h(k)+c1∗i+c2∗i^2)modN |
D. Double Hashing:
Index=(h1(k)+i∗h2(k))modN |
Ø Perfect
Hash Function (Ideal But Rare)
A perfect hash function has no collisions:
$$∀x\neq
y⇒h(x)\neq h(y)$$ |
Hard to achieve in practice unless the key set is known
ahead of time.
Ø
In math overview
Concept |
Math Equivalent |
Java Equivalent |
Map |
f: A → B |
Map<K, V> |
Hash Function |
h: K → N |
key.hashCode() |
Modulo Operation |
h(k) mod N |
Index in array |
Collision |
h(k1) ≡ h(k2) (mod N) |
Two keys, same bucket |
Resolution |
Probing / Chaining |
Linked list / Tree / Probing |
Let’s go deeper into how HashMap works with math
and the underlying logic — including formulas, modulus arithmetic, bitwise
tricks, and collision theory.
Ø Hashing
– The Math Behind It
Formula:
int hash =
key.hashCode(); int index =
hash % capacity; // classic formula |
Java's HashMap improves this to reduce collisions:
java
int hash =
hash(key); int index =
(n - 1) & hash; |
Why (n - 1) & hash instead of hash % n?
- n
must be a power of 2 (like 16, 32, 64...).
- Bitwise
AND is faster than %.
- Ensures
uniform distribution across buckets.
Example:
hash =
34 (binary: 100010) n = 16 (binary: 10000) n-1 = 15 (binary: 01111) index = 34
& 15 = 2 |
Ø Hash
Function and Distribution
Let’s say:
key =
"Java" hashCode =
key.hashCode(); // say: 2301506 |
With capacity n = 16:
java
index =
2301506 % 16 = 2 |
But collisions happen if:
java
2301506 % 16
== 50 % 16 == 2 |
Ø Multiple keys can land in the same bucket
Ø Collisions
Birthday Paradox Style:
As you insert more keys, the chance of two having the same
bucket increases (like two people sharing a birthday in a room of 23+ people).
Mathematically, the probability of collision increases
non-linearly as entries increase:
- With
16 buckets, inserting 8+ entries → ~50% collision chance.
Ø
Load Factor and Rehashing
java
Load Factor (LF)
= number of elements / number of buckets Default = 0.75 When size
>= capacity * loadFactor → resize happens E.g.,
capacity = 16: |
java
16 * 0.75 = 12
→ after 12 inserts → rehash to 32 |
Ø Custom
HashMap Recap with Math
Creating a custom HashMap in Java involves implementing the
core functionality of a hash-based data structure. Below is a simple
implementation of a custom HashMap. This implementation will include basic
methods such as put, get, remove, and containsKey. For simplicity, it won't
handle all edge cases and optimizations but should give you a foundational
understanding of how HashMap works.
Custom HashMap Implementation
java
import java.util.LinkedList; import java.util.Objects; public class CustomHashMap<K, V> { private static final int
DEFAULT_CAPACITY = 16; private static final float LOAD_FACTOR
= 0.75f; private LinkedList<Entry<K,
V>>[] table; private int size; // Entry class to store key-value pairs private static class Entry<K, V>
{ K key; V value; Entry<K,
V> next; Entry(K key, V
value, Entry<K, V> next) {
this.key = key;
this.value = value;
this.next = next; } }
@SuppressWarnings("unchecked") public CustomHashMap(int capacity) { table = new
LinkedList[capacity]; size = 0; } public CustomHashMap() {
this(DEFAULT_CAPACITY); } // Hash function to determine the index
in the table private int hash(Object key) { return
key.hashCode() % table.length; } // Add a key-value pair to the map public void put(K key, V value) { int index =
hash(key); if
(table[index] == null) {
table[index] = new LinkedList<>(); }
LinkedList<Entry<K, V>> bucket = table[index]; for
(Entry<K, V> entry : bucket) {
if (entry.key.equals(key)) {
entry.value = value;
return;
} } bucket.add(new
Entry<>(key, value, null)); size++; // Resize if
the load factor is exceeded if ((float)
size / table.length > LOAD_FACTOR) {
resize(); } } // Retrieve the value associated with a
key public V get(K key) { int index =
hash(key);
LinkedList<Entry<K, V>> bucket = table[index]; if (bucket !=
null) {
for (Entry<K, V> entry : bucket) {
if (entry.key.equals(key)) {
return entry.value;
}
} } return null; } // Remove a key-value pair from the map public V remove(K key) { int index =
hash(key);
LinkedList<Entry<K, V>> bucket = table[index]; if (bucket !=
null) {
Entry<K, V> prev = null;
for (Entry<K, V> entry : bucket) {
if (entry.key.equals(key)) {
if (prev != null) {
prev.next = entry.next;
} else {
bucket.remove(entry);
}
size--;
return entry.value;
}
prev = entry;
} } return null; } // Check if the map contains a key public boolean containsKey(K key) { int index =
hash(key);
LinkedList<Entry<K, V>> bucket = table[index]; if (bucket !=
null) {
for (Entry<K, V> entry : bucket) {
if (entry.key.equals(key)) {
return true;
}
} } return false; } // Resize the table when needed
@SuppressWarnings("unchecked") private void resize() {
LinkedList<Entry<K, V>>[] oldTable = table; table = new
LinkedList[oldTable.length * 2]; size = 0; for
(LinkedList<Entry<K, V>> bucket : oldTable) {
if (bucket != null) {
for (Entry<K, V> entry : bucket) {
put(entry.key, entry.value);
}
} } } // Get the current size of the map public int size() { return size; } // Main method for testing public static void main(String[] args)
{
CustomHashMap<String, Integer> map = new CustomHashMap<>();
map.put("One", 1);
map.put("Two", 2);
map.put("Three", 3);
System.out.println("Size: " + map.size()); // Output: Size: 3
System.out.println("Get 'Two': " + map.get("Two")); //
Output: Get 'Two': 2
map.remove("Two");
System.out.println("Contains 'Two': " +
map.containsKey("Two")); // Output: Contains 'Two': false
System.out.println("Size after removal: " + map.size()); // Output:
Size after removal: 2 } } |
Key Components
- Entry Class: Represents a key-value pair with
a reference to the next entry in the linked list.
- Hash Function: Maps a key to an index in the
table using its hash code.
- Put Method: Adds a key-value pair to the map.
Resizes the table if the load factor is exceeded.
- Get Method: Retrieves the value associated
with a key.
- Remove Method: Removes a key-value pair from
the map and adjusts the linked list accordingly.
- ContainsKey Method: Checks if the map contains
a specific key.
- Resize Method: Doubles the size of the table
and rehashes all entries when the load factor is exceeded.
- Main Method: Tests the basic functionality of
the CustomHashMap.
Ø Summary
of the Math Concepts
Concept |
Description |
Math Involved |
Hashing |
Convert key → int → bucket index |
hash % n, (n-1) & hash |
Load Factor |
Threshold to resize |
loadFactor = size / capacity |
Resizing |
Double the capacity, redistribute
entries |
newIndex = hash(key) % newCapacity |
Collision Probability |
More items = more chance of same
bucket |
Similar to Birthday Problem |
This custom implementation provides a fundamental structure
of a hash map with dynamic resizing. It can be expanded with additional
features or optimizations as needed.
custom HashMap |
For Tools information, visit:
Ø
Auto-Update
Batch File with Latest JAR & Start App Automatically
Ø
Connecting
to IBM WebSphere MQ in Java
Ø
How
to create maven project
Ø
VisualVM
monitoring like jconsole
Ø
Stylus
studio convert edifact message
Ø
JConsole
Monitoring for Java Standalone or Web application project
Ø Apache
Cluster router load balancer
Ø
Generate
a Token in Sonatype Nexus & Use It in Maven & Jenkins
Ø
CI/CD
Pipeline in Jenkins: Staging & Production Artifact Flow in java app
Ø
Master
in CI/CD pipeline using Maven with java
For Cloud information, visit:
Ø
creating
a hierarchical diagram for cloud logging
Ø
A
hierarchical structure that includes a broader range of google cloud services
For Chemistry information, visit:
Ø
Molecular
weight of chemistry in Java code
Ø
To
generate a chemical formula look using HTML
Ø
Orbitals
and Electron Configuration Hund’s Rule
For More Git related information, visit
Ø
Git
in IntelliJ and encountering the SSL certificate issue
0 Comments