This page looks best with JavaScript enabled

Leetcode Problem - Continuous Median

6th of 15 Leetcode problems

 ·   ·  ☕ 7 min read · 👀... views
Problem link: https://leetcode.com/problems/find-median-from-data-stream/
Difficulty: Hard
Category: Arrays

Step 1: Grasping the Problem

This problem is an ad-hoc problem. So there are tons of solutions with similar time and space complexity. Given a stream of input integers, at any time find the median of the integers. According to wiki, the definition of median is:

The median of a finite list of numbers is the “middle” number, when those numbers are listed in order from smallest to greatest.

If there is an odd number of numbers, the middle one is picked. For example, consider the list of numbers

[1, 3, 3, 6, 7, 8, 9]

This list contains seven numbers. The median is the fourth of them, which is 6.

If there is an even number of observations, then there is no single middle value; the median is then usually defined to be the mean of the two middle values.[1][2] For example, in the data set

[1, 2, 3, 4, 5, 6, 8, 9]

The the naive approach will be to sort the integers and pick the middle values (average them if needed). But it will take O(n*logn) for findMedian() function. Can we optimize this more? Notice that the values are given 1 at a time. So while getting the values with addNum(int) function, we can add data-structures specifically to get the medium as fast as possible.

Step 2: Building data structures concept with heaps

A binary heap is an abstract binary tree with the following properties:

  • A child parent relation will always hold in the whole tree. Most of the time, the relation is lesser than (<) or greater than (>).
  • It must be a complete binary tree. A complete binary tree is a binary tree in which every level, except possibly the last, is completely filled, and all nodes are as far left as possible.

Below are two representations of a min-heap (a heap where the child-parent relationship is “<”) and a max-heap (a heap where the child-parent relationship is “>”).

1
2
3
4
5
6
    1               10
   / \            /    \  
  5   3          7      3  
 /              / \    / \
7              3   5  1  -10
 min-heap        max-heap

The above two heaps can be represented in arrays like below:

1
2
min-heap: [1, 5, 3, 7]
max-heap: [10, 7, 3, 3, 5, 1, -10]

The implementation of heaps are very easy to do. But python has built-in heap modules to leverage our coding load. Below is TL;DR of the heaps documentation which is used to solve the problem. To learn more about heapq, please refer to the documentation.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
import heapq

an_array = []

heapq.heappush(an_array, 5)
heapq.heappush(an_array, 3)
heapq.heappush(an_array, 6)
heapq.heappush(an_array, 1)
heapq.heappush(an_array, 9)

print(an_array)
# [1, 3, 6, 5, 9]

print(heapq.heappop(an_array))
# 1
print(heapq.heappop(an_array))
# 3

print(an_array)
# [5, 9, 6] # 5 is top of tree

Step 3: addNum(int) implementation

Now that we are familiar with basic operations of heaps, In this problem, we are going to use two heaps to solve. The first one is a min-heap, and the second one is a max-heap. As we just want the middle of the numbers, the max-heap named left will keep the first half of all the numbers so that it has the middle value (or the left value of the middle values if there are even numbers) on top of the heap. And the min-heap named right will hold the second half of all the numbers so that it has the right value of the middle value/s. As there are multiple functions to use in the class, we have to make member variables to use these in all functions of the class.

1
2
3
4
def __init__(self):
    # Initialization
    self.left = [] # A minheap top is left[0]
    self.right = [] # A maxheap top is -right[0]

Python does not have any implementation for max-heaps. But it can be achieved with a very simple trick. Just negate the values before pushing them to the heap!

We will push the new values to the left first, then pop that value back and push it to the right heap. This makes the left and right arrays always heapified.

1
2
3
def addNum(self, num: int) -> None:
    heapq.heappush(self.left, num)
    heapq.heappush(self.right, -heapq.heappop(self.left))

To make the left heap always have the middle value, it needs always to have more or equal values than right heap. This way, it is guaranteed that left will hold the median if the number of elements is odd. But the new value is currently in the right heap (possibly making it bigger than left). So we compare the size of both, and move one value from right to left.

1
2
3

if len(self.left) < len(self.right):
    heapq.heappush(self.left, -heapq.heappop(self.right))

This completes the addNum(int) function implementation.

Step 4: findMedian() implementation

Because we have the median at the top of the two heaps, this implementation is pretty straight forward. We will check if we have even or odd values. We can also check if size of left is greater than size of right and return the top of left if not equal (total even numbers) or return the average of two tops if equal (total odd numbers). Remember that the top of the heap is already the first element of the array. And also remember that because the numbers were inserted into the max-heap after negating, you will have to negate it again to get the actual value. So when calculating the medium, we will subtract the top of right instead of adding with top of left.

1
2
3
4
def findMedian(self) -> float:
    if len(self.left) > len(self.right):
        return self.left[0]
    return (self.left[0] - self.right[0]) * 0.5

The complexity of addNum(int) is O(logn) because we are inserting a value in a binary tree. The complexity of findMedian(self) is constant.

The final solution is given below:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
import heapq

class MedianFinder:
    def __init__(self):
        # Initialization
        self.left = [] # A minheap top is left[0]
        self.right = [] # A maxheap top is -right[0]
        

    def addNum(self, num: int) -> None:
        heapq.heappush(self.left, num)
        heapq.heappush(self.right, -heapq.heappop(self.left))
        
        if len(self.left) < len(self.right):
            heapq.heappush(self.left, -heapq.heappop(self.right))

    def findMedian(self) -> float:
        if len(self.left) > len(self.right):
            return self.left[0]
        return (self.left[0] - self.right[0]) * 0.5
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
import heapq

class MedianFinder:
    def __init__(self):
        # Initialization
        self.left = [] # A minheap top is left[0]
        self.right = [] # A maxheap top is -right[0]
        

    def addNum(self, num: int) -> None:
        heapq.heappush(self.left, num)
        heapq.heappush(self.right, -heapq.heappop(self.left))
        
        if len(self.left) < len(self.right):
            heapq.heappush(self.left, -heapq.heappop(self.right))

    def findMedian(self) -> float:
        if len(self.left) > len(self.right):
            return self.left[0]
        return (self.left[0] - self.right[0]) * 0.5
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
class MedianFinder {
    priority_queue<int> left;
    priority_queue<int, vector<int>, greater<int> > right;
public:
    /** initialize your data structure here. */
    MedianFinder() { }
    
    void addNum(int num) {
        left.push(num);
        right.push(left.top());
        left.pop();

        if(left.size() < right.size()) {
            left.push(right.top());
            right.pop();
        }
    }
    
    double findMedian() {
        if(left.size() > right.size())
            return left.top();
        return (left.top() + right.top()) * 0.5;
    }
};

/**
 * Your MedianFinder object will be instantiated and called as such:
 * MedianFinder* obj = new MedianFinder();
 * obj->addNum(num);
 * double param_2 = obj->findMedian();
 */
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
import java.util.*;
import java.lang.*;

class MedianFinder {
    /** initialize your data structure here. */
    PriorityQueue<Integer> left;
    PriorityQueue<Integer> right;
    public MedianFinder() {
        left = new PriorityQueue<Integer>();    // Java min-heap
        right = new PriorityQueue<Integer>(     // Java max-heap
            (Integer val1, Integer val2) -> (- Integer.compare(val1, val2))
        );
    }
    
    public void addNum(int num) {
        left.add(num);
        right.add(left.remove());

        if(left.size() < right.size())
            left.add(right.remove());
    }
    
    public double findMedian() {
        if(left.size() > right.size())
            return left.peek();
        return (left.peek() + right.peek()) * 0.5d;
    }
}

/**
 * Your MedianFinder object will be instantiated and called as such:
 * MedianFinder obj = new MedianFinder();
 * obj.addNum(num);
 * double param_2 = obj.findMedian();
 */
Share on

Rahat Zaman
WRITTEN BY
Rahat Zaman
Graduate Research Assistant, School of Computing