LeetCode 2781: Length of the Longest Valid Substring
Problem Description
You are given a string word and an array of strings forbidden.
A string is called valid if none of its substrings are present in forbidden.
Return the length of the longest valid substring of the string word.
A substring is a contiguous sequence of characters in a string, possibly empty.
Example 1
Input: word = "cbaaaabc", forbidden = ["aaa","cb"]
Output: 4
Explanation: There are 11 valid substrings in word: "c", "b", "a", "ba", "aa", "bc", "baa", "aab", "ab", "abc" and "aabc". The length of the longest valid substring is 4.
It can be shown that all other substrings contain either "aaa" or "cb" as a substring.
Example 2
Input: word = "leetcode", forbidden = ["de","le","e"]
Output: 4
Explanation: There are 11 valid substrings in word: "l", "t", "c", "o", "d", "tc", "co", "od", "tco", "cod", and "tcod". The length of the longest valid substring is 4.
It can be shown that all other substrings contain either "de", "le", or "e" as a substring.
Solution
The problem has two degrees of freedom: the start of the substring and the end of the substring. A brute-force method would need to check every possible substring word[start:end], about N^2 candidates in total, where N is the length of word. To improve the performance, DP or memoization usually helps.
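As a baseline, the brute-force idea can be sketched as follows. This is only an illustrative sketch, not part of the solutions below: the class name BruteForceLongestValid and the helper isValid are hypothetical names chosen here, and the code is intended for small inputs only, since it re-checks every substring of every candidate.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class, for illustration only.
class BruteForceLongestValid {
    // Returns true if no substring of word[start:end) is in the forbidden set.
    static boolean isValid(String word, int start, int end, Set<String> set) {
        for (int i = start; i < end; i++) {
            for (int j = i + 1; j <= end; j++) {
                if (set.contains(word.substring(i, j))) {
                    return false;
                }
            }
        }
        return true;
    }

    // Tries every (start, end) pair and keeps the longest valid candidate.
    static int longestValidSubstring(String word, List<String> forbidden) {
        Set<String> set = new HashSet<>(forbidden);
        int n = word.length();
        int best = 0;
        for (int start = 0; start < n; start++) {
            for (int end = start + 1; end <= n; end++) {
                if (isValid(word, start, end, set)) {
                    best = Math.max(best, end - start);
                }
            }
        }
        return best;
    }
}
```

On the two examples above this sketch returns 4 in both cases, which makes it a convenient reference when testing a faster solution on short strings.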
The idea is to build the solution from a smaller-size problem. Below, word[i:j] denotes the characters at indices i through j-1, and word[i:] is the suffix starting at index i. Suppose we already know the solution for word[i+1:]. How should we build the solution for word[i:]? We have to check the substrings starting at index i to see whether any of them appears in forbidden; in other words, we check word[i:i+1], word[i:i+2], ..., word[i:] in order of increasing length. If word[i:j] is the first one found in forbidden, then word[i:j-1] is a candidate valid substring, and we do not need to check anything longer, since every longer substring starting at i contains word[i:j].
However, word[i:j-1] is ONLY A CANDIDATE, since we do not yet know whether one of its substrings that starts later, word[k:m] (i<k<m<=j-1), is forbidden. But do we really need to check those? No, because they were already checked in earlier steps, when we solved for word[k:]. Therefore, this information should be remembered for later use, in the form of an index end such that word[k:end] is valid for every k>i. Then no forbidden string can start after i and finish before end, so we only need to scan the substrings of word[i:end] that start at i. When word[i:j] is the first one found in forbidden, end is set to j-1 (the index of the last character of the forbidden match), so that every valid substring in this and all later steps stops before that character. If nothing is found, end stays unchanged. Either way, word[i:end] is the longest valid substring starting at i, and its length is end-i.
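To make the role of end concrete, the sketch below records its value after each start position is processed, scanning from right to left. It is a hedged illustration rather than one of the solutions: the class name EndTrace and the method traceEnds are hypothetical, end is treated as an exclusive bound (the valid substring for a given start covers indices start through end-1), and the scan is not capped by the longest forbidden length.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical tracing helper, for illustration only.
class EndTrace {
    // ends[start] is the exclusive end bound after processing that start index.
    static int[] traceEnds(String word, List<String> forbidden) {
        Set<String> set = new HashSet<>(forbidden);
        int n = word.length();
        int[] ends = new int[n];
        int end = n;
        for (int start = n - 1; start >= 0; start--) {
            // Only substrings that begin at start need checking; anything
            // beginning later was handled in an earlier iteration.
            for (int j = start + 1; j <= end; j++) {
                if (set.contains(word.substring(start, j))) {
                    end = j - 1; // stop before the last character of the match
                    break;
                }
            }
            ends[start] = end;
        }
        return ends;
    }
}
```

For Example 1 (word = "cbaaaabc", forbidden = ["aaa","cb"]), the recorded bounds for start = 0..7 are 1, 4, 4, 5, 8, 8, 8, 8. The lengths end-start are therefore 1, 3, 2, 2, 4, 3, 2, 1, and the maximum is 4, reached at start = 4 with the substring "aabc".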
To efficiently check whether a substring exists in forbidden, we can use a HashSet or a Trie, which leads to the two possible solutions below.
Code
HashSet
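import java.util.HashSet;
import java.util.List;

class Solution {
    public int longestValidSubstring(String word, List<String> forbidden) {
        HashSet<String> set = new HashSet<>();
        int maxLen = 0; // length of the longest forbidden string
        for (String s : forbidden) {
            set.add(s);
            maxLen = Math.max(maxLen, s.length());
        }
        int res = 0;
        // Exclusive bound: a valid substring must stop before index end.
        int end = word.length();
        for (int start = end - 1; start >= 0; start--) {
            StringBuilder str = new StringBuilder();
            // Grow the substring that begins at start, at most maxLen characters.
            for (int tmp = start; tmp < end && (tmp - start) < maxLen; tmp++) {
                str.append(word.charAt(tmp));
                if (set.contains(str.toString())) {
                    end = tmp; // forbidden match ends at tmp, so stop before it
                    break;
                }
            }
            res = Math.max(res, end - start);
        }
        return res;
    }
}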
Trie
import java.util.List;

class Solution {
    class Node {
        boolean isEnd = false; // true if a forbidden string ends at this node
        Node[] children = new Node[26];
    }

    private Node root;

    // Inserts one forbidden string into the trie.
    private void insert(String s) {
        Node curr = this.root;
        for (char c : s.toCharArray()) {
            if (curr.children[c - 'a'] == null) curr.children[c - 'a'] = new Node();
            curr = curr.children[c - 'a'];
        }
        curr.isEnd = true;
    }

    // Walks the trie along str starting at index s, never reaching index e.
    // Returns the index of the last character of the first forbidden match,
    // or e if no forbidden string starts at s within the bound.
    private int findEnd(int s, int e, String str) {
        Node curr = this.root;
        for (int i = s; i < e; i++) {
            char c = str.charAt(i);
            if (curr.children[c - 'a'] == null) return e;
            curr = curr.children[c - 'a'];
            if (curr.isEnd) return i;
        }
        return e;
    }

    public int longestValidSubstring(String word, List<String> forbidden) {
        this.root = new Node();
        for (String s : forbidden) {
            this.insert(s);
        }
        int res = 0;
        // Exclusive bound: a valid substring must stop before index end.
        int end = word.length();
        for (int start = end - 1; start >= 0; start--) {
            end = this.findEnd(start, end, word);
            res = Math.max(res, end - start);
        }
        return res;
    }
}
The post is published under CC 4.0 License.