Third, the given problem is: "find the shortest string".
In your case, a "string" is equivalent to one line from the file.
So, if you have found a line that matches the condition, and the length of this line is X,
then you need to check only lines that have length < X.
For each new line check if it size is < X, if no - then just skip this line and take the next one.
If you find a new line shorter than X, that meet the condition too, then set X = length on this new line.
In this way you avoid searching substrings within the string probably for 50% or even more lines.

Ireneusz Kordal wrote:Third, the given problem is: "find the shortest string".
In your case, a "string" is equivalent to one line from the file.
So, if you have found a line that matches the condition, and the length of this line is X,
then you need to check only lines that have length < X.

Hmmm. Not so sure about that (although I totally agree that contains() is better than a regex).

If you used indexOf() and stored the results, you could then check the span between the first and last occurrence to work out your "shortest string", maybe something like:but the wording of the problem is very open to interpretation.

Winston

BTW: The above could be optimized along the lines that Ireneusz suggested to eliminate the third search if the first two produce a span that is greater than the one you already have; but personally I suspect it'll make the code a lot messier for not a lot of gain.

Isn't it funny how there's always time and money enough to do it WRONG?
Articles by Winston can be found here

What if the shortest string is between lines? What if the shortest string is a sub string in a line? ex: a w1 w2 w1 w3 x

I believe the shortest Time possible is O(n) for this problem, since divide and conquer would give O(n log n).

Your algorithm would be the almost correct one since checking for 6 patterns is still consider constant order, you would only have to change your line reader and evaluate more like the pseudo code from Ireneusz Kordal.

Performance means the fastest and right one.

To be completely honest this is a limited example of maximum or minimum sub sequence where S is only of w1, w2, w3 and a that represents any other word.