# Power Strings

Jason Menard
Sheriff
Posts: 6450
Borrowed from the Northeastern Illinois University Computer Science Society
Power Strings
=============
Given two strings a and b we define a*b to be their concatenation. For example, if a = "abc" and b = "def" then a*b = "abcdef". If we think of concatenation as multiplication, exponentiation by a non-negative integer is defined in the normal way: a^0 = "" (the empty string) and a^(n+1) = a*(a^n).
Each test case is a line of input representing s, a string of printable characters. For each s you should print the largest n such that s = a^n for some string a. The length of s will be at least 1 and will not exceed 1 million characters. A line containing a period follows the last test case.

Joe Pluta
Ranch Hand
Posts: 1376
Not exactly pretty, but I'm tired:

Kao-Wei Wan
Greenhorn
Posts: 7

[ July 20, 2003: Message edited by: Kao-Wei Wan ]

Jim Yingst
Wanderer
Sheriff
Posts: 18671
Here's my solution. Nothing as short and sweet as the above code, but I think my algorithm will be much more efficient for long strings. The basic idea is to find all the prime factors of the input length, and use this info to drastically reduce the number of test which need be performed.
E.g. consider an input of length 30, with factors 2, 3, 5. First test to see if a power of 2 works - is the first half of the string equal to the second? If not, then there will be no point in testing any other multiples of two either. For any even power, the first half of the string would have had to be equal to the second.
Alternately if 2 does work as a power, then to test for additional powers it's sufficient to test just the first half of the string, not the whole string. E.g if the max power was 6, the the power of 2 would have shown up first, and the first 15 chars will also have a power of 3. And once you've found powers of 2 and 3, that means that the initial 30-char string consists of 6 repetitions of a single 5-char string. The only remaining untested factor is 5, and so the only remaining thing to test is if that 5-char substring consists of a single char repeated 5 times. Which will fail if the max power was really 6.
Generally my approach will be a little bit slower than Joe's if the input has very high power, meaning it repeats every 1, 2, 3 chars or so. Becaust those are the first types of patterns Joe tests. But for patterns with longer periods, or patterns with no repetition at all, the factoring method should be a big win, performing far fewer tests.
Primes.java:
PrimeFactors.java:
PowerStrings.java:
And some JUnit tests which came in handy:
PowerStrings.java:
[ July 21, 2003: Message edited by: Jim Yingst ]

Jim Yingst
Wanderer
Sheriff
Posts: 18671
Heh. Well, after doing it my way I finally took a closer look at KWW's code, and see now that there's a reasonably simple algorithm I missed. Ah well. I already had the Primes and PrimeFactors code - the additional code I wrote here is roughly equivalent in complexity to KWW's code, but when you include the two preexisting classes as well, mine's a lot more complex. In terms of performance they're very comparable (both kicking Joe's code's butt) Ah well, live and learn...

Kao-Wei Wan
Greenhorn
Posts: 7
Thanks,Jim Yingst
Now I add a liitle code to the original to make it more efficient
This will use the searching result to help us to find the sub-string more quickly...
Don't waste any useful information

[ July 21, 2003: Message edited by: Kao-Wei Wan ]

Jim Yingst
Wanderer
Sheriff
Posts: 18671
Well, this is fun. I decided to look more at performance of these two programs (KWW's and mine) in various circumstances. I created a wider variety of tests:

I revised the makeSampleString() method to use lower-case alpha chars only, in a (pseudo-)random order. Though the random seed is set to 1 always, for reproducibility. This didn't have a big effect on stats. I also created a number of additional tests for very long strings with either very high or very low powers. These produced the most extreme variations in performance between the two programs
Here are the times for the last 17 tests for each program. (All the earlier tests were too short to measure really - all 0-10 milliseconds.

The last six were interesting - I created some very unusual strings. For rows 12-14 I created very long strings that were all the same character, except the very last char is different. E.g. oooooooooooX, but much longer. (So it looks like a very high-power string until the end, but it's actually power 1.) Then for rows 15-17 I put the odd char in front rather than at the end - Xooooooooooo. I was expecting KWW to perform poorer on 12-14, and better on 15-17 - was surprised to see the reverse. I'm still not sure why that is. It's very annoying - without rows 12-14 I think I could claim a clear victory. Oh well...
Incidentally, I did use KWW's revised code, but it didn't seem to make any significant difference.

Jim Yingst
Wanderer
Sheriff
Posts: 18671
Of course for brevity, there's also

Unfortunately the performance for this, well, sucks. Especially once it gets to line 13 from the previous table - I skipped 13 and 14 because they tookway too long. Here's the revised table:

Joe Pluta
Ranch Hand
Posts: 1376
(frumph!)
Hey, I'm an application developer, not a CompSci major . I didn't see anything about performance as part of the question! My job is to get the code to work.
Besides, my version kicks BUTT on one-character strings .
Anyway, good job, guys. You not only showed me some great code, but made my head hurt at the same time!
Joe

Jim Yingst
Wanderer
Sheriff
Posts: 18671
Hey, I'm an application developer, not a CompSci major . I didn't see anything about performance as part of the question! My job is to get the code to work.
Absolutely. And being first to market is a big plus, congratulations. But it also means that those who come later need to work harder to differentiate themselves somehow. Kicking your butt in performance seemed like a good way to do that.
Besides, my version kicks BUTT on one-character strings .

Anyway, good job, guys. You not only showed me some great code, but made my head hurt at the same time!
In fact this was the true objective.
I am curious if any regex whizzes out there (you know who you are) see any good alternative to the regex solution I posted. Seems like there are often a bunch of different ways to approach a problem with a regex; so far I'm only seeing one, so I figure I'm probably missing something. Of course we could replace //A and //Z with ^ and \$; no big deal there. Is there any significant variation possible?

Kao-Wei Wan
Greenhorn
Posts: 7
Jim Yingst, you are nice to test the performance
for us.
Thank you
Can you do me a favor to test the performance of the code below,please? Thank you very much
I want to know if this will be better or not
Thank you

[ July 23, 2003: Message edited by: Kao-Wei Wan ]
[ July 23, 2003: Message edited by: Kao-Wei Wan ]