# Removing entries from a Map in Java

Today, I ran through some code iterating over a Java map to remove outdated values. I found the way it was done utterly disgusting. Yet, thinking it through, I would probably have done it the same way a couple of years ago.

As a consequence, I thought it would be a good idea to share the solution I now use.

# Classical solution for cleaning a Java map

## What you can’t do

The easiest way you can think of when cleaning a Java map would be the following:

However, this should produce a ConcurrentModificationException. Lucky for you! Would not that be the case, you would be removing elements from a collection you are iterating upon.

In some cases, I was able to execute such a code on a List, and this results in obvious errors (remove an element, shifting position and then moving to next index… you skipped one element!).

## What you can do

OK, so we established that you cannot remove elements while iterating on them. But you can save the elements to remove for later.

OK, this works. Now, why do I hate that?

## High complexity

Should I talk about complexity? Let be $n$ the magnitude of the number of entries in map (a magnitude only, absolutely no precise calculation).

In this solution, we have a first iteration to test each value in the map. So first, we read $n$ lines.

In first approximation, we can say that $n$ values are stored in the keysToRemove list. When iterating on it, we also read $n$ lines.

Then, for each key to remove, we call map.remove(). But when removing, we provide a key to the map, which will iterate to retrieve it. Optimized as it may be, it still can lead to reading $n$ lines.

As a result, since we have a loop in a loop, the second iteration actually leads to $n^2$ lines read.

All iterations taken into account, the current method has a $n+n^2\approx{}n^2$ iterations. Now I propose a solution which will decrease this complexity.

# Clean solution

## The code

What made me mad about seeing the code I was confronted with today was that the Iterator was used, and it is actually the main component of my solution (which I actually learned from Alcibiade).

Easy enough, right? But who knows of the Iterator.remove() method?

## Complexity

We iterate, reading $n$ lines. When removing, we do not have to iterate any more, since we have a direct pointer to the entry. The total complexity of the algorithm is $n$, and as you may know, $n\ll{}n^2$ (complexity $n$ is insignificant before complexity $n^2$).

# Comparison

Is it insignificant because I say it, or do I say it because it is?

Number of elements $n$ $n^2$
10 10 100
100 100 10,000
1,000 1,000 1,000,000
10,000 10,000 100,000,000

With such a table, it is easy to see that the more numerous the elements of the map are, the more important the gain will be.

But that is only a magnitude of the number of computer operations. Does that really translate into something we can understand?

To make sure there was a real interest in re-writing the code, I made a benchmark. I created a Map<String, Boolean>. The test for entry removal would return true if the boolean value of the entry was true.

The map I created contained 100,000 elements, and I iterated each solution 1,000 times. The benchmark was clear:

Average execution time:
Save and remove: 71 ms
Iterator: 30 ms

Now it is up to you to choose your way!