I recently found myself in need of a map-like sorted data structure, with ordering determined by the values and not the keys SortedMap
.
There is an implementation of an immutable
PriorityMap on GitHub already, though for the purposes of my assignment I would rather have mutability (I know...sue me).
Node
Lets start off with a simple map node wrapper object to store our key value pairs. Since our map is sorted by
by an entry's value we will have to implement the Ordered
trait as to allow ordering accordingly. We will consume an implicit
converter to support Rich
basic data type conversions for Int, Long, etc...
case class PMapNode[K, V](key: K, var value: V)(implicit ev: V => Ordered[V]) extends Ordered[PMapNode[K, V]] {
override def compare(that: PMapNode[K, V]): Int = this.value compare that.value
}
Now we can start creating the actual PriorityMap
object. Our implementation will consist of a remix of sorts, consisting of both
a heap to manage the ordering of our PMapNodes
, backed by a map for O(1) access to elements other than the head of the heap.
To instantiate the map we will need the types
for the key's and values as well as an implicit conversion for the value type (V)
in the case that V is not covariant with Ordered[V].
class PriorityMap[K, V](initSize: Int = 16)(implicit ev: V => Ordered[V]) extends mutable.Map[K, V] {
...
}
For quick access to any element in our heap we will use a Map
that when passed a key
it will return the index in the heap containing the entry associated with that key.
This index map will be updated at every insertion and removal operation.
To manage the heap we need know it's element count
and ensure that the element count
never exceeds the size of the array containing the elements. We can now implement the values needed for our map. It is
also a good idea to create a type alias for long type declarations like PMapNode[K,V] to clean up our code, this is completely
optional of course.
...
type MapNode = PMapNode[K, V] // type alias for the PMapNode[K,V] type
val ROOT_INDEX = 1
// The root of the heap is always in index 1
private var heap = new Array[MapNode](initSize) //
private var population = 0
private var indexMap: mutable.HashMap[K, Int] = mutable.HashMap()
override def size = population
...
Rules of the Heaps
To manage the ordering of our map we will use the classical heap data structure to store our PMapNodes
.
A heap consists of an Array
used to store the entrys, backed by a set of rules that create a tree-like
data structure.
Rule 1
: The root is always stored in index1
.Rule 2
: For any node in indexk
the left child is stored in index2 * k
.Rule 3
: For any node indexk
the right child is stored in index2k + 1
.Rule 4
: A node's parent index is equal tok / 2
wherek
is the node's current index in the heap.
Let's also declare a type alias for representing a MapNodes's children. We will use a pair of pairs where each element
in the pair will contain both the reference to the MapNode and the index of that child in the heap.
For a ChildPair
the first and second element will contain the storage data for the left and right child respectively.
type ChildPair = ((Option[MapNode], Int), (Option[MapNode], Int))
private def getChildren(index: Int): ChildPair = {
val leftInd = index * 2
val rightInd = leftInd + 1
(index * 2 <= population, index * 2 + 1 <= population) match { // Rule 2 and 3 for obtaining the child nodes.
case (hasLeft, hasRight) =>
val leftChild = hasLeft match {
case true => Some(heap(leftInd))
case false => None
}
val rightChild = hasRight match {
case true => Some(heap(rightInd))
case false => None
}
((leftChild, leftInd), (rightChild, rightInd))
}
}
private def parentIndex(index: Int) = index / 2
Functional Decomposition
I prefer to functionally decompose an operation before I write out the final sequence calls needed to complete said operation. Let's think of a problem as if it was a Ninja Warrior obstacle course, where each section of the course requires a unique set of skills to be completed successfully. That's functional decomposition as I have come to understand it, we first create the methods that form the arsenal of sub methods that our main method requires to compelte it's goal. We will use this technique for the first major operation in our collection.
Inserting AKA +='ing
Before we insert an element into our map we must first check if the array representing the heap is full, if it is then we double the size.
private def tryReload() = if (shouldReload) reload()
private def shouldReload: Boolean = population.toDouble == heap.length.toDouble
private def reload(): Unit = {
heap ++= new Array[MapNode](heap.length)
}
We'll try to reload before every insertion to be safe. When a new entry is inserted into our map it's key may
already be associated with a value, in which case we will update the value currently associated with said key.
If the key does not have a value already associated with it, we will add it to our collection as a new MapNode.
In either case we must maintain ordering by either sifting up
or sifting down
the MapNode to a
location in our heap where it's parent is ordered higher than itself.
Thanks to pattern matching we can represent both branches rather cleanly.
override def +=(kv: (K, V)): PriorityMap.this.type = {
tryReload()
indexMap.contains(kv._1) match {
case true =>
update(indexMap(kv._1), kv._2)
case false =>
val insertIndex = appendIndex
heap(insertIndex) = PMapNode[K, V](kv._1, kv._2)
indexMap += (kv._1 -> insertIndex)
population += 1
siftUp(insertIndex)
}
this
}
New Insertion (Sifting Up)
Elements added into a binary heap are placed in the next available index in the array.
We then continuously promote the new node up the heap until we reach the case where it's parent is of a higher
order than itself. We'll call this function siftUp
, and use it to
handle the case where a new key is being added to the map.
@tailrec
private def siftUp(targetInd: Int): Unit = {
parentIndex(targetInd) match {
case parentInd if parentInd != THE_ROOF =>
val parentNode = heap(parentInd)
val currentNode = heap(targetInd)
currentNode < parentNode match {
case true =>
swap(targetInd, (parentNode, parentInd))
siftUp(parentInd)
case false => ()
}
case _ => ()
}
}
That's it for Part 1
of our PriorityMap
implementation. In the next part of this tutorial
I will walk us through the Update
operation where we update the value of a pre-existing key in our map as well
as the Remove (-=)
Map
trait method.