Header Ads Widget

Responsive Advertisement

Creating a custom HashMap in Java

Mathematical foundation of Maps and collisions — Let’s break it down from a mathematical perspective:


Ø  What is a Map in Math?

In mathematics, a map (or function) is a relationship between a set of inputs (domain) and a set of possible outputs (codomain).

Definition:

A function f: A → B maps each element a A to a unique element b B.


Ø  How HashMap Reflects a Math Map

In Java:

Map<K, V> map = new HashMap<>();

 

This conceptually mirrors:

f: Key → Value

 

But unlike pure math, we use hashing to determine where to store/retrieve values, and this leads to collisions.


Ø  Hash Function as a Mathematical Function

A hash function is a mathematical function:

h:K→N

Where:

  • K is the set of all possible keys.
  • â„• is the set of natural numbers (usually bounded by array size).

Example:

If you have:

java

hashCode("apple") = 99162322

 

And bucket array size = 16:

h("apple") = 99162322mod16 = 2

 

So "apple" gets stored in bucket index 2.


Ø  Collisions – The Core Problem

What is a Collision?

Two different keys k1 ≠ k2 map to the same index:

h(k1)modN=h(k2)modN

 

This violates injectivity (1-1 mapping) in mathematics.

Probability of Collisions: The Birthday Paradox

Let’s say:

  • You have n items (keys).
  • HashMap has m buckets (indexes).

The probability P(collision) increases as n grows:

$$P(collision)≈ 1−e^{-\frac{n^2}{(2m)}}$$

 

Example: If n = 23, m = 365 → there's a 50.7% chance of at least one collision — just like birthdays!


Ø      Collision Resolution Techniques (Math-Wise)

A. Chaining (Linked Lists):

Multiple keys in the same bucket → stored as a list:

Bucket[i]=k1,k2,k3...ifh(k1)≡h(k2)≡h(k3)(modN)

 

B. Open Addressing (Linear Probing):

If collision occurs, find the next available slot:

Index=(h(k)+i)modN

 

Where i = 0, 1, 2... until empty spot is found.

C. Quadratic Probing:

Index=(h(k)+c1i+c2i^2)modN

 

D. Double Hashing:

Index=(h1(k)+ih2(k))modN

 


Ø       Perfect Hash Function (Ideal But Rare)

A perfect hash function has no collisions:

$$∀x\neq y⇒h(x)\neq h(y)$$

 

Hard to achieve in practice unless the key set is known ahead of time.


Ø  In math overview

Concept

Math Equivalent

Java Equivalent

Map

f: A → B

Map<K, V>

Hash Function

h: K → N

key.hashCode()

Modulo Operation

h(k) mod N

Index in array

Collision

h(k1) ≡ h(k2) (mod N)

Two keys, same bucket

Resolution

Probing / Chaining

Linked list / Tree / Probing

 

 


 

Let’s go deeper into how HashMap works with math and the underlying logic — including formulas, modulus arithmetic, bitwise tricks, and collision theory.


Ø      Hashing – The Math Behind It

Formula:

int hash = key.hashCode();

int index = hash % capacity; // classic formula

 

Java's HashMap improves this to reduce collisions:

java

int hash = hash(key);

int index = (n - 1) & hash;

 

Why (n - 1) & hash instead of hash % n?

  • n must be a power of 2 (like 16, 32, 64...).
  • Bitwise AND is faster than %.
  • Ensures uniform distribution across buckets.

Example:

hash = 34  (binary: 100010)

n = 16     (binary: 10000)

n-1 = 15   (binary: 01111)

 

index = 34 & 15 = 2

 


Ø  Hash Function and Distribution

Let’s say:

key = "Java"

hashCode = key.hashCode(); // say: 2301506

 

With capacity n = 16:

java

index = 2301506 % 16 = 2

 

But collisions happen if:

java

2301506 % 16 == 50 % 16 == 2

 

Ø      Multiple keys can land in the same bucket


Ø      Collisions

Birthday Paradox Style:

As you insert more keys, the chance of two having the same bucket increases (like two people sharing a birthday in a room of 23+ people).

Mathematically, the probability of collision increases non-linearly as entries increase:

  • With 16 buckets, inserting 8+ entries → ~50% collision chance.

  

Ø  Load Factor and Rehashing

java

Load Factor (LF) = number of elements / number of buckets

Default = 0.75

When size >= capacity * loadFactor → resize happens

E.g., capacity = 16:

 

 

java

16 * 0.75 = 12 → after 12 inserts → rehash to 32

 


Ø       Custom HashMap Recap with Math

Creating a custom HashMap in Java involves implementing the core functionality of a hash-based data structure. Below is a simple implementation of a custom HashMap. This implementation will include basic methods such as put, get, remove, and containsKey. For simplicity, it won't handle all edge cases and optimizations but should give you a foundational understanding of how HashMap works.

Custom HashMap Implementation

java

import java.util.LinkedList;

import java.util.Objects;

 

public class CustomHashMap<K, V> {

    private static final int DEFAULT_CAPACITY = 16;

    private static final float LOAD_FACTOR = 0.75f;

   

    private LinkedList<Entry<K, V>>[] table;

    private int size;

   

    // Entry class to store key-value pairs

    private static class Entry<K, V> {

        K key;

        V value;

        Entry<K, V> next;

       

        Entry(K key, V value, Entry<K, V> next) {

            this.key = key;

            this.value = value;

            this.next = next;

        }

    }

   

    @SuppressWarnings("unchecked")

    public CustomHashMap(int capacity) {

        table = new LinkedList[capacity];

        size = 0;

    }

   

    public CustomHashMap() {

        this(DEFAULT_CAPACITY);

    }

   

    // Hash function to determine the index in the table

    private int hash(Object key) {

        return key.hashCode() % table.length;

    }

   

    // Add a key-value pair to the map

    public void put(K key, V value) {

        int index = hash(key);

        if (table[index] == null) {

            table[index] = new LinkedList<>();

        }

       

        LinkedList<Entry<K, V>> bucket = table[index];

        for (Entry<K, V> entry : bucket) {

            if (entry.key.equals(key)) {

                entry.value = value;

                return;

            }

        }

       

        bucket.add(new Entry<>(key, value, null));

        size++;

       

        // Resize if the load factor is exceeded

        if ((float) size / table.length > LOAD_FACTOR) {

            resize();

        }

    }

   

    // Retrieve the value associated with a key

    public V get(K key) {

        int index = hash(key);

        LinkedList<Entry<K, V>> bucket = table[index];

        if (bucket != null) {

            for (Entry<K, V> entry : bucket) {

                if (entry.key.equals(key)) {

                    return entry.value;

                }

            }

        }

        return null;

    }

   

    // Remove a key-value pair from the map

    public V remove(K key) {

        int index = hash(key);

        LinkedList<Entry<K, V>> bucket = table[index];

        if (bucket != null) {

            Entry<K, V> prev = null;

            for (Entry<K, V> entry : bucket) {

                if (entry.key.equals(key)) {

                    if (prev != null) {

                        prev.next = entry.next;

                    } else {

                        bucket.remove(entry);

                    }

                    size--;

                    return entry.value;

                }

                prev = entry;

            }

        }

        return null;

    }

   

    // Check if the map contains a key

    public boolean containsKey(K key) {

        int index = hash(key);

        LinkedList<Entry<K, V>> bucket = table[index];

        if (bucket != null) {

            for (Entry<K, V> entry : bucket) {

                if (entry.key.equals(key)) {

                    return true;

                }

            }

        }

        return false;

    }

   

    // Resize the table when needed

    @SuppressWarnings("unchecked")

    private void resize() {

        LinkedList<Entry<K, V>>[] oldTable = table;

        table = new LinkedList[oldTable.length * 2];

        size = 0;

       

        for (LinkedList<Entry<K, V>> bucket : oldTable) {

            if (bucket != null) {

                for (Entry<K, V> entry : bucket) {

                    put(entry.key, entry.value);

                }

            }

        }

    }

   

    // Get the current size of the map

    public int size() {

        return size;

    }

   

    // Main method for testing

    public static void main(String[] args) {

        CustomHashMap<String, Integer> map = new CustomHashMap<>();

       

        map.put("One", 1);

        map.put("Two", 2);

        map.put("Three", 3);

       

        System.out.println("Size: " + map.size()); // Output: Size: 3

        System.out.println("Get 'Two': " + map.get("Two")); // Output: Get 'Two': 2

       

        map.remove("Two");

        System.out.println("Contains 'Two': " + map.containsKey("Two")); // Output: Contains 'Two': false

        System.out.println("Size after removal: " + map.size()); // Output: Size after removal: 2

    }

}

 

Key Components

  1. Entry Class: Represents a key-value pair with a reference to the next entry in the linked list.
  2. Hash Function: Maps a key to an index in the table using its hash code.
  3. Put Method: Adds a key-value pair to the map. Resizes the table if the load factor is exceeded.
  4. Get Method: Retrieves the value associated with a key.
  5. Remove Method: Removes a key-value pair from the map and adjusts the linked list accordingly.
  6. ContainsKey Method: Checks if the map contains a specific key.
  7. Resize Method: Doubles the size of the table and rehashes all entries when the load factor is exceeded.
  8. Main Method: Tests the basic functionality of the CustomHashMap.

 


Ø       Summary of the Math Concepts

Concept

Description

Math Involved

Hashing

Convert key → int → bucket index

hash % n, (n-1) & hash

Load Factor

Threshold to resize

loadFactor = size / capacity

Resizing

Double the capacity, redistribute entries

newIndex = hash(key) % newCapacity

Collision Probability

More items = more chance of same bucket

Similar to Birthday Problem

 

This custom implementation provides a fundamental structure of a hash map with dynamic resizing. It can be expanded with additional features or optimizations as needed.

 





custom HashMap
custom HashMap


For Tools information, visit:

Ø  Auto-Update Batch File with Latest JAR & Start App Automatically

Ø  Connecting to IBM WebSphere MQ in Java

Ø  How to create maven project

Ø  VisualVM monitoring like jconsole

Ø  Stylus studio convert edifact message

Ø  JConsole Monitoring for Java Standalone or Web application project

Ø  Apache Cluster router load balancer

Ø  Generate a Token in Sonatype Nexus & Use It in Maven & Jenkins

Ø  CI/CD Pipeline in Jenkins: Staging & Production Artifact Flow in java app

Ø  Master in CI/CD pipeline using Maven with java

 

For Cloud information, visit:

Ø  creating a hierarchical diagram for cloud logging

Ø  A hierarchical structure that includes a broader range of google cloud services

 

For Chemistry information, visit:

Ø  Molecular weight of chemistry in Java code

Ø  To generate a chemical formula look using HTML

Ø  Orbitals and Electron Configuration Hund’s Rule


For More Git related information, visit

Ø  How to use github

Ø  Git in IntelliJ and encountering the SSL certificate issue





Post a Comment

0 Comments