Find median from data stream#

Levels: level-5
Data structures: heap

LeetCode

Description#

  • Median is the middle value in an ordered integer list.

  • If the size of the list is even, there is no middle value.

  • So the median is the mean of the two middle value.

    1For example,
    2[2,3,4], the median is 3
    3
    4[2,3], the median is (2 + 3) / 2 = 2.5
  • Design a data structure that supports the following two operations:

    • void addNum(int num): Add a integer number from the data stream to the data structure.
    • double findMedian(): Return the median of all elements so far.

Example#

1addNum(1)
2addNum(2)
3findMedian() -> 1.5
4addNum(3)
5findMedian() -> 2

Follow up#

  • If all integer numbers from the stream are between 0 and 100, how would you optimize it?
  • If 99% of all integer numbers from the stream are between 0 and 100, how would you optimize it?

Python Solution#

 1from heapq import heappush, heappushpop, heappop
 2
 3
 4class MedianFinder:
 5    def __init__(self):
 6        self.heaps = [], []
 7
 8    def addNum(self, num):
 9        small, large = self.heaps
10        heappush(small, -heappushpop(large, num))
11        if len(large) < len(small):
12            heappush(large, -heappop(small))
13
14    def findMedian(self):
15        small, large = self.heaps
16        if len(large) > len(small):
17            return float(large[0])
18        return (large[0] - small[0]) / 2.0