Difficulty: Hard
Category: Strings
Step 1: Grasping the Problem
A strings text
and characters
is given. We need to return the smallest substring of text
which contains all the characters in characters
.
The problem would have been easier if it said all the characters is the string characters
will be unique. Then we could just manage a set and push characters from text
until we get all the characters. But as the characters might not be unique, instead of set, we will need a hashtable/counter to keep track of the frequencies of each characters we encounter in string text
.
A substring is a sequential subset of the given string text
. And we need to minimize this sequential subset to as small as possible. This tells us that a sliding window minimization approach is best to solve this problem.
We will try to keep the sliding window as small as possible. For this, first we need to start the window with 0 length. We will have to option other than increasing the window size until a valid substring is found. Whenever substring is found, we will keep it as a potential result and try shrinking the window for smaller substring. When we cannot shrink the window, we will start increasing it again until the right bound of the window reaches the end.
Below is a visual demonstration of the process. Here, text = "ADCBCAAB"
and characters = "BCAC"
.
Step 2: Sliding Window Initialization
A sliding window needs four things to operate.
- The left boundary of the window (
left
for this solution). - The right boundary of the window (
right
for this solution). - The target variable (
best_text
for this solution). This variable stores the target we want to get from the window. It can be of many variables or states like the starting and ending of the substring (which will be more efficient). But for simplicity, in this solution, we will take the substring directly. - Any intermediate variables to assert the conditions of window expanding/shrinking ability. In this problem, we need two python
collections.Counter
objects to keep track if we have all the characters we need.
Initially, we want the target variable to be the worst or unfeasible. So that it will be overwritten by any existent solution in the window expanding/shrinking loop. The initialization will look something like this:
|
|
Step 3: Expand the window
The sliding window will run until we have reached the end of the string. But do we need the left
variable to reach the end too? Yes, because we are going to need the smallest value. But we do not need that condition to check in the outer loop, we will take care in the shrinking loop.
So looping until right
has reached end will suffice. And at each time window is shrank, we will get a character char
. We will take it in our `chars
|
|
Step 4: Shrinking ability check
We can only expand the window when we have all the characters we need in our chars_taken
counter. One way to check this is looping through all the elements in chars_given
and check if that character in chars_taken
is greater. Something like below:
|
|
But python has a more elegant &
operator with better time complexity to handle this. chars_given & chars_taken
returns intersection of the two counters. We need to check if the intersection is equal to chars_given
.
|
|
Step 5: Get Minimum Substring
We will need to update best_text
everytime when we change the window size. This is for the following reasons.
- When expanding the window: It would not be necessary at all if we had guaranteed substring. But if it wasn’t guaranteed,
best_text
needs to be updated if there is only one string matching our criteria. - When shrinking the window: We are going to need the smallest substring matching the criteria, so each time the window shrinks, there is a possibility of getting a smaller substring, so we need to update to the smallest.
We will check the window size (right - left
) with the length of best_text
then update best_text
by slicing if needed.
|
|
Step 6: Shrink the Window
Shrinking the window means removing a character from left
in this context. So we will remove a character from chars_taken
as well. And finally update the window bound left
.
|
|
Step 7: Return Smallest Substring
After the while loops ends, the best_text
is guaranteed to be the smallest possible substring. But, the problem statement does not guaranteed that there will be a valid substring in text
. So we need to check whether there is a valid substring or not before returning best_text
.
Before the loops, best_text
had a dummy string with length longer than the given string. If it is not updated inside the sliding window, that means best_text
will still have that large dummy string. So we can check if length is greater than length of text
and then return accordingly.
|
|
Every sliding window problem takes O(n)
time itself, where n
is the length of text
. Inside the loop, we are making a look-up in a counters which is implemented by hashtables and incrementing and decrementing variables. The intersection of counters however will take O(S)
time where S
is the minimum number of elements in best_text
(the answer). It can be told that S
is very much negligible compared to n
. So the time complexity of the algorithm will remain O(n)
and space complexity is also O(n)
in the worst case. The program is given below.
|
|
|
|
|
|