Longest Substring with At Most K Distinct Characters (Facebook Interview 02/2019)

This question was asked in Facebook Interview

Source: (https://www.techiedelight.com/find-longest-substring-containing-k-distinct-characters/)

Given a string you need to print longest possible substring that has exactly M unique characters. If there are more than one substring of longest possible length, then print any one of them.

Examples:

"aabbcc", k = 1
Max substring can be any one from {"aa" , "bb" , "cc"}.

"aabbcc", k = 2
Max substring can be any one from {"aabb" , "bbcc"}.

"aabbcc", k = 3
There are substrings with exactly 3 unique characters
{"aabbcc" , "abbcc" , "aabbc" , "abbc" }
Max is "aabbcc" with length 6.

"aaabbb", k = 3
There are only two unique characters, thus show error message.

This question was asked by Facebook a month ago. It seems that string related problems are quite popular recently among companies like Google, Amazon etc..

Brute force

If you want to come up with a naive solution, brute force should be definitely the number one thing in your head. In this example, it should be extremely straightforward for you to get this approach – Iterate all the substrings and then check if any of them contains K unique characters. After the loop, you can return the longest one.

Let’s say the string has length N, the substring iteration has time complexity of O(N^2). To check if substring contains K unique characters, you have to go over the substring once, which is O(N). Therefore, the overall time complexity would be O(N^3), which is quite slow as expected.

Time complexity analysis

Let’s talk about some high-level strategies first. If we want to make it faster than O(N^3), it should be something like O(N^2), O(NlogN), O(N) or even O(1).

Since we have to iterate the string at least once, the lower bound should be O(N). So O(1) or O(logN) can be ignored. Also, the problem seems unrelated to sorting or search (if you sort the array, the substring order is not kept), so it’s unlikely to have logN. Thus O(NlogN) is out. As a result, we can focus on O(N^2) and O(N).

Optimization

If you have read here on http://blog.gainlo.co – If a String Contains an Anagram of Another String, you should realize how similar these two questions are.

Similarly, you can take use of sliding window. Use a start pointer and an end pointer to represent a substring in a sliding window, if the current substring has K unique characters or less, it means the string can potentially be longer and you just need to move forward the end pointer by one character.

If after moving forward the end pointer the current substring exceeds K unique characters, it indicates that the previous substring is a potential candidate that has K unique characters. In this case, you just need to move forward the start pointer until the current substring has K or less unique characters.

Remember that the reason sliding window works here (and other places as well) is that you never need to move the end pointer back, which makes the time complexity O(N).

Along the way, you can use a hash map to keep track of the character frequency of each substring and a global counter to track current number of unique characters. When moving the sliding window, adjust the hash map accordingly and once new key is inserted or an old one gets cleared, increase/decrease the global counter.

Below are the implementations of above. The implementations assume that the input string alphabet contains only 26 characters (from ‘a’ to ‘z’). The code can be easily extended to 256 characters.

// C++ program to find the longest substring with k unique
// characters in a given string
#include <iostream>
#include <cstring>
#define MAX_CHARS 26
using namespace std;
 
// This function calculates number of unique characters
// using a associative array count[]. Returns true if
// no. of characters are less than required else returns
// false.
bool isValid(int count[], int k)
{
    int val = 0;
    for (int i=0; i<MAX_CHARS; i++)
        if (count[i] > 0)
            val++;
 
    // Return true if k is greater than or equal to val
    return (k >= val);
}
 
// Finds the maximum substring with exactly k unique chars
void kUniques(string s, int k)
{
    int u = 0; // number of unique characters
    int n = s.length();
 
    // Associative array to store the count of characters
    int count[MAX_CHARS];
    memset(count, 0, sizeof(count));
 
    // Traverse the string, Fills the associative array
    // count[] and count number of unique characters
    for (int i=0; i<n; i++)
    {
        if (count[s[i]-'a']==0)
            u++;
        count[s[i]-'a']++;
    }
 
    // If there are not enough unique characters, show
    // an error message.
    if (u < k)
    {
        cout << "Not enough unique characters";
        return;
    }
 
    // Otherwise take a window with first element in it.
    // start and end variables.
    int curr_start = 0, curr_end = 0;
 
    // Also initialize values for result longest window
    int max_window_size = 1, max_window_start = 0;
 
    // Initialize associative array count[] with zero
    memset(count, 0, sizeof(count));
 
    count[s[0]-'a']++;  // put the first character
 
    // Start from the second character and add
    // characters in window according to above
    // explanation
    for (int i=1; i<n; i++)
    {
        // Add the character 's[i]' to current window
        count[s[i]-'a']++;
        curr_end++;
 
        // If there are more than k unique characters in
        // current window, remove from left side
        while (!isValid(count, k))
        {
            count[s[curr_start]-'a']--;
            curr_start++;
        }
 
        // Update the max window size if required
        if (curr_end-curr_start+1 > max_window_size)
        {
            max_window_size = curr_end-curr_start+1;
            max_window_start = curr_start;
        }
    }
 
    cout << "Max sustring is : "
         << s.substr(max_window_start, max_window_size)
         << " with length " << max_window_size << endl;
}
 
// Driver function
int main()
{
    string s = "aabacbebebe";
    int k = 3;
    kUniques(s, k);
    return 0;
}

Output:

Max sustring is : cbebebe with length 7

Time Complexity: Considering function “isValid()” takes constant time, time complexity of above solution is O(n).

Comments

  1. agen judi bola

    Way cool! Some ѵery valid points! I appгeciate
    you wгiting this aгticle plus the rest of the website is also really good.

  2. bandar judi bola

    I’ve learn ѕome just гight stuff here. Certainly price
    bookmarkіng for гevisiting. I surprise how much effort you set too
    create this sort ⲟf magnificent informative
    website.

  3. judi bola

    I’m іmpresseԁ, I must say. Rarely do І encoսnter a blοg thаt’sequally educative and
    interesting, and let me tell you, you hve hit the nail ⲟnn
    the head. The problem is an issu that not enough menn and wonen aare speaking intelligently аbout.
    I am very haрpy that I came acrօss this in my һunt for
    sоmething regarding this.

  4. judi bola

    We stuimƅled over here from a didferent website and
    thօught I may ass well check things out. I liҝe wyat I see soo now i’m foⅼlowing you.
    Look forward t᧐ going over your web page fߋr a second time.

  5. [email protected]@baicyl01.com

    Hey! I could have sworn I’ve ƅeen to this blg before but after readding through
    some օf the post I reɑlized it’s new tο me. Nonetheless, I’m
    definitely glad I found it and I’ll bee book-marking aand checkibg
    bazck frequently!

  6. mahirqq

    Ꮋelⅼ᧐ there! Quick question that’s totally off topic.
    Do you know how to mmake your site mobile friendly? My site looks
    weird when browsing from my iphone. I’m trying to find a templɑte oοr plugin that might be able tt᧐ correct thiѕ issue.

    If you һave any suggestions, please share.

  7. Life Experience Degree

    Hello I am so thrilled I found your webpage, I really found you by accident, while I was looking on Askjeeve for something else, Anyways I am here now and would just like to say thanks a lot for a tremendous post and a all round enjoyable blog (I also love the theme/design), I don’t have time to read it all at the minute but I have book-marked it and also added in your RSS feeds, so when I have time I will be back to read more, Please do keep up the excellent work.

  8. jempolqq

    Thanks іn support of sharing such a pⅼeaѕant thouցht, pɑragraph is fastidious,
    tһatѕ why i have read it fuⅼly

  9. mahir888.com

    Hello to every one, it’s actually а pleasant for me to pay a
    quick visit this ѕite, it contaіns priceless Informatіon.

  10. infonya disini

    Hello! Ɗo you know if they make any plugins to assist with Search Engine Optimization? I’m tryіng to get mmy blog to rank for some tarɡeted keywords but I’m not seeing ѵery
    gߋod gains. Ӏf you кnkw of any plerase
    shaгe. Thanks!

  11. infonya disini

    It’ѕ an remarkable article deѕigned foor aⅼl the internet visitors; they will take advantage frfom it I am sure.

  12. rental mobil murah pontianak murah

    Μagnificnt beat ! I wsh to apprentice while you amnd your web site, how could i subscribe for a blog site?
    The account helped me a ɑcceptable deal. I had been tiny bit acquainted

  13. minecraft

    Hi there, I log on to your new stuff regularly.
    Your story-telling style is witty, keep up the good work!

  14. minecraft

    I have been surfing online more than 2 hours today, yet I never
    found any interesting article like yours. It’s pretty worth enough for
    me. Personally, if all site owners and bloggers made good content as you did, the internet
    will be a lot more useful than ever before.8

  15. minecraft

    Excellent blog! Do you have any hints for aspiring writers?
    I’m hoping to start my own blog soon but I’m a little lost on everything.
    Would you advise starting with a free platform like WordPress or go for a paid option? There are so many choices out there that I’m
    totally overwhelmed .. Any ideas? Cheers!