Sliding Window Median - Complete Solution Guide
Sliding Window Median is LeetCode problem 480, a Hard level challenge. This complete guide provides step-by-step explanations, multiple solution approaches, and optimized code in python3, java, cpp, c.
Problem Statement
The median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value. So the median is the mean of the two middle values. For examples, if arr = [2, 3 ,4] , the median is 3 . For examples, if arr = [1, 2,3 ,4] , the median is (2 + 3) / 2 = 2.5 . You are given an integer array nums and an integer k . There is a sliding window of size k which is moving from the very left of the array to the very right. You can only see the k numbers in the window. Ea
Detailed Explanation
The problem asks us to find the median of a sliding window of size `k` as it moves across an array `nums`. The window moves one position at a time, and we need to return an array containing the median value for each window position. The median is the middle element when the elements in the window are sorted. If the window size `k` is even, the median is the average of the two middle elements.
Solution Approach
The solution uses two heaps: a max-heap (`lo`) to store the smaller half of the elements in the current window and a min-heap (`hi`) to store the larger half. The max-heap's root will always be the largest value in the smaller half, and the min-heap's root will always be the smallest value in the larger half. This arrangement allows for efficient calculation of the median. When the window slides, we remove the outgoing element and add the incoming element while maintaining the heap balance. A hashmap `to_remove` is used to track elements that need to be removed from the heaps. The balancing is done to ensure that the sizes of the heaps are either equal (if `k` is even) or the max-heap has one more element (if `k` is odd). This enables calculating the median in O(1) time by looking at the top elements of the heaps.
Step-by-Step Algorithm
- Step 1: Initialize two heaps: `lo` (max-heap) and `hi` (min-heap). Also, initialize a hash map `to_remove` to keep track of elements to be removed.
- Step 2: Fill the initial window (first `k` elements) into the heaps. Push the first `k` elements into the max-heap `lo`. Then move the smallest `k // 2` elements from `lo` to the min-heap `hi` to balance them initially.
- Step 3: Calculate the median of the first window. If `k` is odd, the median is the top element of `lo`. If `k` is even, the median is the average of the top elements of `lo` and `hi`.
- Step 4: Iterate through the rest of the array (from index `k` to `nums.length - 1`).
- Step 5: For each iteration, remove the element that's leaving the window (`out_num`) and add the new element that's entering the window (`in_num`). Update the `to_remove` map with the `out_num`.
- Step 6: Based on whether `out_num` and `in_num` should belong to the `lo` or `hi` heap, update the heap and calculate the `balance` (difference in sizes) after each operation
- Step 7: Rebalance the heaps to maintain the property that `lo` has either the same number of elements as `hi` (if `k` is even) or one more (if `k` is odd).
- Step 8: Clean the heap's tops by removing delayed deletions, ensuring that any delayed deletions present at the top are cleared from both heaps. Delayed deletions are maintained within a hashmap.
- Step 9: Calculate the median of the current window as in step 3 and add it to the result.
- Step 10: Repeat steps 5-9 until the end of the `nums` array.
Key Insights
- Insight 1: Using two heaps (a min-heap and a max-heap) allows efficient tracking of the window's elements and finding the median without sorting the window in each step.
- Insight 2: A crucial optimization is to use a `to_remove` map to handle elements that are no longer in the sliding window. This avoids directly removing elements from the heaps, which would be an O(k) operation, significantly improving performance.
- Insight 3: Balancing the heaps to maintain the median efficiently is crucial. The max-heap stores the smaller half of the elements, and the min-heap stores the larger half. Maintaining this balance after each window slide is essential for O(1) median retrieval.
Complexity Analysis
Time Complexity: O(n log k)
Space Complexity: O(k)
Topics
This problem involves: Array, Hash Table, Sliding Window, Heap (Priority Queue).
Companies
Asked at: Datadog, DoorDash, Flipkart, Point72, Snowflake.