# How to merge consecutive values into a range?

Greenhorn
Posts: 23
Sorted string input: 1, 2, 2A, 2B, 2C, 3, 3A, 4, 5, 6, 32C, 32D, 50, 51/1, 51/2, 60, 61, 62, 200-2E, 200-2F, 200-2G, 200-2H, 201C, 201/21P, 201/21R, 201/21S, 300, 300A, 301-2A, 542/2K, 542/2L, 583-1, 583-585D, 583-585E, 605, 605A, 605B, 605C, 800A.
The question is about merging consecutive values into a range. E.g. 4, 5, 6 are consecutive values, so the range is [4-6], and 2A, 2B, 2C are consecutive values, so the range is 2[A-C]. No other values can be in a range.

Expected output:
1
2
2[A-C]
3
3A
[4-6]
32[C-D]
50
51/1
51/2
[60-62]
200-2[E-H]
201C
201/21[P-S]
300
300A
301-2A
542/2[K-L]
583-1
583-585[D-E]
605
605[A-C]
800A

code output= merged: [1, 2, 2A, 2B, 2C, 3, 3A, [4-6], 32C, 32D, 50, 51/1, 51/2, [60-62], 200-2E, 200-2F, 200-2G, 200-2H, 201C, 201/21P, 201/21R, 201/21S, 300,300A,301-2A,542/2K, 542/2L,583-1, 583-585D, 583-585E, 605, 605A, 605B,605C 800A]

For example, my code merges the consecutive values 4, 5, 6 into the range [4-6].
I have managed to put consecutive numbers into this "[]" format, but I couldn't do the same for the letters. The order of the sorted list must not be disturbed.

Sheriff
Posts: 3837
66
Welcome to the Ranch!

I think we need more precise specification of the requirements. Generally, the input strings need to be split into parts on which the detection of consecutive ranges will happen. For example, "4" contains just one part. "2A" contains two: "2" and "A". "583-585D" contains how many? I can think of "583", "585" and "D", but perhaps "583-585" needs to stay together. What are all the rules that apply here?

Should the ranges be merged only for the last part of the string, or for the other parts too? For example, should "3A", "4A" and "5A" be merged into "[3-5]A"?

Why aren't "51/1" and "51/2" merged into "51/[1-2]" in the example output?

Without precise specification, no one can even tell whether a solution does meet all the requirements or not.

kiraz cevik
Greenhorn
Posts: 23
Thanks.
The expected output only combines ranges for the trailing letters. In 3A, 4A and 5A the numbers are consecutive but the letters are not, so they won't be merged. For example: 3A, 4B, 5C => [3-5][A-C].

Values that are pure numbers can be in a range, e.g. [8-11]. Values that differ only by a letter at the end can be in a range. If the trailing letters are consecutive, they must be merged, e.g. 200-2E, 200-2F, 200-2G, 200-2H => 200-2[E-H].
E.g. 200-2E, 200-2F, 200-2H => 200-2[E-F], 200-2H. The sequence is broken.

"51/1" and "51/2" => "51/[1-2]": no need to edit it.

I have to check the end of the value.

If there is a letter at the end of the value and the next value is a consecutive letter, I must include them in the range.

Martin Vashko
Sheriff
Posts: 3837
66
What we know so far:

Only the trailing part of the string has to be processed. The entire trailing part contains either digits or letters.

Questions:
• What about uppercase and lowercase letters - are they the same or different? Is it ok to transform "a" and "B" to "[a-b]"? Or "[A-B]" Or not at all?
• The trailing part is made of all consecutive digits/letters found at the end of the string? Consider "1234" and "1235". What's the desired output: "[1234-1235]" or "123[4-5]"?
• Does the trailing part contain more than one letter at all? Consider the sequence "A01", "A02", ..., "A30". Is "A[01-30]" the desired output?

• "51/1" and "51/2" => "51/[1-2]": no need to edit it.

Sorry, I don't understand this. Can you explain it in more detail, please?

kiraz cevik
Greenhorn
Posts: 23
Sorted string input: 1, 2, 2A, 2B, 2C, 3, 3A, 4, 5, 6, 32C, 32D, 50, 51/1, 51/2, 60, 61, 62, 200-2E, 200-2F, 200-2G, 200-2H, 201C, 201/21P, 201/21R, 201/21S, 300, 300A, 301-2A, 542/2K, 542/2L, 583-1, 583-585D, 583-585E, 605, 605A, 605B, 605C, 800A.
All possibilities should be worked out from this string. I've put all the necessary examples in the array.
There is no lower case. I wrote what the output ranges should be.
I've listed the consecutive numbers, e.g. 60, 61, 62 => [60-62].
The consecutive letters at the end of the values should be put together: 20A, 20B => 20[A-B]. To be included in a range, the numbers must be the same and the letters must be consecutive.

Marshal
Posts: 74381
334
Welcome to the Ranch again.
I am afraid that last example doesn't add anything to the discussion. Please explain the rules you are using; something like 200‑2F might mean 200F 201F 202F or it might include 200A 201B 202C and 202F. You haven't made the rules clear yet.
When you make the rules really clear, those rules will help explain how to write your code.

kiraz cevik
Greenhorn
Posts: 23
Thanks.

200F, 201F, 202F are not pure numbers. 200, 201, 202 => [200-202]

Only consecutive letters should be placed in the range.
2A, 2B, 2C =>2[A-C]

32C, 32D =>32[C-D]

200-2E, 200-2F, 200-2G, 200-2H =>200-2[E-H]

201/21P, 201/21R, 201/21S   =>  201/21[P-S]

542/2K, 542/2L  => 542/2[K-L]

583-585D, 583-585E => 583-585[D-E]

605A, 605B,605C =>605[A-C]

To be included in a range, the numbers must be the same and the letters must be consecutive. I've written all the possibilities.
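
A minimal sketch of the two rules as stated in this post, for readers who want to try them out. It is not the code from the thread. The class name `RangeMerger`, the splitting pattern and the alphabet string are all assumptions: the alphabet leaves out Q only because the expected output treats P, R, S as one run, and the pattern assumes at most one trailing capital letter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RangeMerger {
    // Assumption: capital letters without Q, so that P and R count as neighbours.
    static final String ALPHABET = "ABCDEFGHIJKLMNOPRSTUVWXYZ";
    // Hypothetical split: a prefix ending in a digit, then at most one trailing capital.
    static final Pattern PARTS = Pattern.compile("(.*[0-9])([A-Z]?)");

    static boolean isNumber(String s) {
        return s.matches("[0-9]{1,18}");
    }

    // True when b directly continues a, either as the next number or as the next letter.
    static boolean follows(String a, String b) {
        if (isNumber(a) && isNumber(b)) {
            return Long.parseLong(b) == Long.parseLong(a) + 1;
        }
        Matcher ma = PARTS.matcher(a);
        Matcher mb = PARTS.matcher(b);
        if (!ma.matches() || !mb.matches()) {
            return false;
        }
        String la = ma.group(2);
        String lb = mb.group(2);
        // Both values need a trailing letter and exactly the same prefix.
        if (la.isEmpty() || lb.isEmpty() || !ma.group(1).equals(mb.group(1))) {
            return false;
        }
        int ia = ALPHABET.indexOf(la);
        return ia >= 0 && ALPHABET.indexOf(lb) == ia + 1;
    }

    // Formats a run: [first-last] for numbers, prefix[X-Y] for trailing letters.
    static String range(String first, String last) {
        if (isNumber(first)) {
            return "[" + first + "-" + last + "]";
        }
        String prefix = first.substring(0, first.length() - 1);
        return prefix + "[" + first.charAt(first.length() - 1)
                + "-" + last.charAt(last.length() - 1) + "]";
    }

    // Walks the already sorted list once and collapses each run; order is preserved.
    static List<String> merge(List<String> sorted) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < sorted.size()) {
            int j = i;
            while (j + 1 < sorted.size() && follows(sorted.get(j), sorted.get(j + 1))) {
                j++;
            }
            out.add(j == i ? sorted.get(i) : range(sorted.get(i), sorted.get(j)));
            i = j + 1;
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(merge(Arrays.asList(
                "3", "3A", "4", "5", "6", "32C", "32D", "51/1", "51/2",
                "201C", "201/21P", "201/21R", "201/21S")));
    }
}
```

Because any run of two or more values is collapsed, this reading would also turn the leading 1, 2 of the sample list into [1-2], although the first post lists them separately; nothing in the thread explains that case, so it may need an extra rule.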

Campbell Ritchie
Marshal
Posts: 74381
334
So what's 200‑2 and how does it differ from 200, 201, 202?

Sheriff
Posts: 16675
278
It seems to me you have these rules:

General pattern:
number[chars][A-Za-z]

number has to be consecutive to have a sequence
If chars is present, then they have to be equals() to be in a sequence
If the trailing letter is present, they have to be consecutive to have a sequence

A sequence is then abbreviated using: first + "-" + last

Where first is the first value in the sequence and last is the last value in the sequence.

kiraz cevik
Greenhorn
Posts: 23
This is needed for a project. These are the numbers of something. I need to collapse the numbers into ranges so they take up less space.

Martin Vashko
Sheriff
Posts: 3837
66
Is this homework, or a real-world problem? I ask because in a real-world problem, I'd expect that the example you listed in the first post might not represent all possible patterns in the input data.

You mentioned "[8-11]" earlier. Does it mean that "Z8", "Z9", "Z10" and "Z11" will be merged into "Z[8-11]"? Even if the token length doesn't match?

I understand that all trailing digits are processed as a token. Therefore, "ABC1234" and "ABC1235" will be merged into "ABC[1234-1235]". What about consecutive letters? Will "123AZ" and "123BA" be merged into "123[AZ-BA]" (because, assuming an English alphabet, BA is the next item in sequence after AZ)? Similarly, should "8Z", "8AA" be merged into "8[Z-AA]"? (See how columns in Excel beyond the first 26 are named if you have trouble understanding these two examples.)

I still don't understand why "51/1" and "51/2" in the example weren't merged into "51/[1-2]". Did you make a mistake in your original post, or is it part of the requirement?

Junilu Lacar
Sheriff
Posts: 16675
278
So how does 300A, 301B, 302C get merged? Would it be [300-301][A-C] [300-302][A-C]?

kiraz cevik
Greenhorn
Posts: 23
Yes, thank you. @Junilu Lacar
English is not my native language. I'm sorry I couldn't explain it well.

Junilu Lacar
Sheriff
Posts: 16675
278

Martin Vashko wrote:
I still don't understand why "51/1" and "51/2" in the example weren't merged into "51/[1-2]". Did you make a mistake in your original post, or is it part of the requirement?

It seems my interpretation is correct or at least very close. Those items are not merged because the "chars" part of the pattern does not equals().

So I think this:

51/1, 52/1, 53/1 ==> not merged

51/1A, 51/1B, 51/1C ==> 51/1[A-C]

51/1A, 52/1B, 53/1C ==> [51-53]/1[A-C]

@OP please confirm my understanding is correct.

kiraz cevik
Greenhorn
Posts: 23
It is a real-world problem. "Z8", "Z9", "Z10" and "Z11" => there are no leading letters.
The letters are always at the end. There will only be one letter at the end of the value.

Martin Vashko
Sheriff
Posts: 3837
66

Junilu Lacar wrote:So how does 300A, 301B, 302C get merged? Would it be [300-301][A-C]?

Assuming this is true, what would we do with this sequence: 300A, 300B, 300C, 301A, 301B, 301C, 302A, 302B, 302C?

Edit: and why not [300-302][A-C]?

kiraz cevik
Greenhorn
Posts: 23
Thanks. Yes, Junilu Lacar.
It must be [300-302][A-C].

Junilu Lacar
Sheriff
Posts: 16675
278

Martin Vashko wrote:

Junilu Lacar wrote:So how does 300A, 301B, 302C get merged? Would it be [300-301][A-C]?

Assuming this is true, what would we do with this sequence: 300A, 300B, 300C, 301A, 301B, 301C, 302A, 302B, 302C?

I see what you're getting at but that's probably getting a little too fractal. It seems the rules are more straightforward: 300[A-C], 301[A-C], 302[A-C] is how I would expect that to be merged. You're probably thinking that would now look like a pattern that would be again merged to [300-302][A-C], right? I think you only do one level of merge.

Junilu Lacar
Sheriff
Posts: 16675
278

Martin Vashko wrote:

Junilu Lacar wrote:So how does 300A, 301B, 302C get merged? Would it be [300-301][A-C]?

Assuming this is true, what would we do with this sequence: 300A, 300B, 300C, 301A, 301B, 301C, 302A, 302B, 302C?

Edit: and why not [300-302][A-C]?

Sorry, that's what I meant to write.

Bartender
Posts: 4691
183
Two other quick questions:

1) are the letters always consecutive? Or can we have, say, 1A, 1C, 1Q?
2) are the letters always in alphabetic order, like 1A, 1B, 1C? Or is 1C, 1A, 1B possible?

Sorry if these questions are already answered, I might have missed them.

kiraz cevik
Greenhorn
Posts: 23
I sorted the string.

The order of the string should not be broken.

Junilu Lacar
Sheriff
Posts: 16675
278
Piet, by my understanding,

1A, 1C, 1Q would not be merged

1C, 1A,1B ==> 1C, 1[A-B]

Martin Vashko
Sheriff
Posts: 3837
66

Piet Souris wrote:Two other quick questions:

1) are the letters always consecutive? Or can we have, say, 1A, 1C, 1Q?
2) are the letters always in alphabetic order, like 1A, 1B, 1C? Or is 1C, 1A, 1B possible?

Sorry if these questions are already answered, I might have missed them.

1) They may or may not be consecutive. When they are, all consecutive items in a sequence are merged using brackets, as shown in the first post.
2) The list of strings is sorted before being processed. Now this might pose a problem: "8A, 9A, 10A" would be sorted into "10A, 8A, 9A", making sequence detection more complicated.
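
To illustrate the sorting concern in point 2, here is a small sketch of a comparator that compares the leading number numerically before looking at the rest of the value. The class and method names are hypothetical, and the sketch assumes every value starts with at least one digit. How ties such as 201C and 201/21P should be ordered is not settled in this thread; the remainder is simply compared as plain text here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class NaturalOrder {
    // Index just past the leading run of digits.
    static int digitsEnd(String s) {
        int end = 0;
        while (end < s.length() && Character.isDigit(s.charAt(end))) {
            end++;
        }
        return end;
    }

    // Numeric value of the leading digits (assumes there is at least one digit).
    static long leadingNumber(String s) {
        return Long.parseLong(s.substring(0, digitsEnd(s)));
    }

    // Everything after the leading digits, e.g. "A" or "/21P".
    static String rest(String s) {
        return s.substring(digitsEnd(s));
    }

    // Number first (as a number), then the remainder as plain text.
    static final Comparator<String> COMPARATOR =
            Comparator.comparingLong(NaturalOrder::leadingNumber)
                      .thenComparing(NaturalOrder::rest);

    public static void main(String[] args) {
        List<String> plain = new ArrayList<>(Arrays.asList("8A", "9A", "10A"));
        Collections.sort(plain);          // lexicographic String order
        System.out.println(plain);        // [10A, 8A, 9A]

        List<String> natural = new ArrayList<>(Arrays.asList("10A", "8A", "9A"));
        natural.sort(COMPARATOR);
        System.out.println(natural);      // [8A, 9A, 10A]
    }
}
```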

kiraz cevik
Greenhorn
Posts: 23
I'm sorting the string first.
Then I'm merging the consecutive values.

kiraz cevik
Greenhorn
Posts: 23

Junilu Lacar wrote:Piet, by my understanding,

1A, 1C, 1Q would not be merged

1C, 1A,1B ==> 1C, 1[A-B]

Exactly.

kiraz cevik
Greenhorn
Posts: 23

Piet Souris wrote:Two other quick questions:

1) are the letters always consecutive? Or can we have, say, 1A, 1C, 1Q?
2) are the letters always in alphabetic order, like 1A, 1B, 1C? Or is 1C, 1A, 1B possible?

Sorry if these questions are already answered, I might have missed them.

The letters are always in alphabetical order.

Junilu Lacar
Sheriff
Posts: 16675
278
Here's how I would approach it.

Track the three parts of the pattern. Only the number part is going to be required. If the number changes and is consecutive from the previous number in the input, you have at least a number sequence. If the optional chars part is present, then they need to be equals. At the same time track the optional letter part. If you have a sequence in either the number or letter part or both, do the range merge once one of them is broken.

That is

1C, 2D, 3F ==> [1-2][C-D], 3F

1C, 2D, 4E ==> [1-2][C-D], 4E

The middle chars part, if present will end a sequence when it changes between consecutive input values.

Saloon Keeper
Posts: 8760
71
This all seems less than concise. I see this as you have an input stream that can be compressed to an output stream in such a way as that the output stream could later be uncompressed into the original input stream.

The compression process would start with first sorting the stream which would require breaking each element down into three components: number, chars, and letters. Or, as a regular expression: "(([0-9]+)([^a-zA-Z0-9]*))([a-zA-Z]+)". The sorting order would be based, first on the numeric value of group(2) followed by the character order of group(3) followed by character order of group(4).

Ranges would be based on entries with identical numeric values and identical characters, followed by sequential letter values.

So, where have I gone wrong? And if I've gone wrong can someone correct this in clearly stated rules? This seems to be very elusive here.

Junilu Lacar
Sheriff
Posts: 16675
278

Carey Brown wrote:Or, as a regular expression: "(([0-9]+)([^a-zA-Z0-9]*))([a-zA-Z]+)".

That regex will not match input like 51/1 or 51-1 and similar input.

I got better match results with the sample inputs that OP gave with this: (([0-9]+)([^0-9]*[^a-zA-Z]*))([a-zA-Z]?)

Junilu Lacar
Sheriff
Posts: 16675
278

Junilu Lacar wrote:I got better match results with the sample inputs that OP gave with this: (([0-9]+)([^0-9]*[^a-zA-Z]*))([a-zA-Z]?)

I don't think even this is sufficient still. Best approach is to write some automated tests that you can use to break down different cases and validate against a candidate regex.
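
As a rough illustration of that kind of check, the sketch below runs a candidate regex over a few of the sample values and reports those where the last capture group is not the expected trailing letter. The class name, the table of expectations and the second pattern `(.*[0-9])([A-Z]?)` are assumptions made for this example; only the first pattern comes from the thread.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexCheck {
    // Sample value -> trailing letter we expect to extract ("" when there is none).
    static final Map<String, String> EXPECTED = new LinkedHashMap<>();
    static {
        EXPECTED.put("4", "");
        EXPECTED.put("2A", "A");
        EXPECTED.put("51/1", "");
        EXPECTED.put("200-2E", "E");
        EXPECTED.put("201/21P", "P");
        EXPECTED.put("583-585D", "D");
    }

    // Returns the samples whose last capture group differs from the expected letter,
    // or which the regex does not match as a whole.
    static List<String> failures(String regex) {
        Pattern pattern = Pattern.compile(regex);
        List<String> failed = new ArrayList<>();
        for (Map.Entry<String, String> e : EXPECTED.entrySet()) {
            Matcher m = pattern.matcher(e.getKey());
            if (!m.matches() || !e.getValue().equals(m.group(m.groupCount()))) {
                failed.add(e.getKey());
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Regex quoted in the thread: the letter of "2A" ends up in the middle group.
        System.out.println(failures("(([0-9]+)([^0-9]*[^a-zA-Z]*))([a-zA-Z]?)")); // [2A]
        // Hypothetical alternative: prefix ending in a digit, optional capital letter.
        System.out.println(failures("(.*[0-9])([A-Z]?)"));                        // []
    }
}
```

With a table like this, each new input pattern that turns up becomes one more entry rather than another round of guessing.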

Piet Souris
Bartender
Posts: 4691
183
I am experimenting with a TreeMap with a suitable Comparator. A bit clumsy right now, but this is how far I am.

kiraz cevik
Greenhorn
Posts: 23
There will be no lower case [a-z] after the numbers.

kiraz cevik
Greenhorn
Posts: 23

Piet Souris wrote:I am experimenting with a TreeMap with a suitable Comparator. A bit clumsy right now, but this is how far I am.

I looked at the output of the code.
Close to what I want.

Martin Vashko
Sheriff
Posts: 3837
66

I looked at the output of the code.
Close to what I want.

That's good news! Can you tell us how this solution differs from what you need?

kiraz cevik
Greenhorn
Posts: 23

Piet Souris wrote:I am experimenting with a TreeMap with a suitable Comparator. A bit clumsy right now, but this is how far I am.

code output:
1: [ ]
2: [ , A, B, C]
3: [ , A]
4: [ ]
5: [ ]
6: [ ]
32: [C, D]
50:[]
51/1:[]
51/2:[]
60: [ ]
61: [ ]
62: [ ]
200-2:[E, F, G, H]
201:[C]
201/21:[P, R, S]
300:[ , A]
301-2:[A]
542/2:[K, L]
583-1:[]
583-585:[D, E]
605:[ ,A, B, C]
800:[A]

Why does it keep producing [] and not merge the values into ranges?
How can I combine the letters into ranges, as follows?
Expected output:
1
2
2[A-C]
3
3A
[4-6]
32[C-D]
50
51/1
51/2
[60-62]
200-2[E-H]
201C
201/21[P-S]
300
300A
301-2A
542/2[K-L]
583-1
583-585[D-E]
605
605[A-C]
800A

Piet Souris
Bartender
Posts: 4691
183
hi Kiraz,
the map that I created was only meant to be an intermediate result. It was my intention to go on processing. My further ideas were:

create an enum

and have a new class:

Now, I would have a method that processes my intermediate results in the map, and creates a List<FinalToken>. Suppose you have this entry in the map:

Because of the empty char at the start of the list (coming from the Token 300 without any chars) I create two FinalTokens: with 300 and type JUST_A_NUMBER, and 300 with list [A, B, C], with type NUMBER_WITH_ChAR.
The last one will output the toString result: 300[A-C].
So I have a List<FinalToken>. My last step is to go from top to bottom, and when I encounter type JUST_A_NUMBER, I will check if the next couple of elements are also of type JUST_A_NUMBER and have consecutive ints. Then I will output the [300 - 350] form.

Well, sounds more complicated than it is (I hope), but I have not yet had time (or courage) to implement all this (well, I do have the enum).

This was what I had in mind. Wondering if others have come up with something simpler.
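
The enum and class themselves are not shown in this post, so the following is only a guess at what the first half of the idea might look like. `FinalToken` and `JUST_A_NUMBER` are taken from the description above; `TokenType`, `NUMBER_WITH_CHAR`, `fromEntry` and the field names are hypothetical. It assumes a map entry is a key plus a list of letters, where a blank element (or an empty list) stands for the bare key, and that the letters of one key form a single unbroken run.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Constant names follow the description where it gives them; the rest is hypothetical.
enum TokenType { JUST_A_NUMBER, NUMBER_WITH_CHAR }

class FinalToken {
    final String key;            // map key, e.g. "300" or "200-2"
    final List<String> letters;  // trailing letters, empty for a bare key
    final TokenType type;

    FinalToken(String key, List<String> letters, TokenType type) {
        this.key = key;
        this.letters = letters;
        this.type = type;
    }

    // Turns one map entry into one or two tokens.
    static List<FinalToken> fromEntry(String key, List<String> entry) {
        List<FinalToken> tokens = new ArrayList<>();
        List<String> letters = new ArrayList<>();
        boolean bare = entry.isEmpty();
        for (String s : entry) {
            if (s.trim().isEmpty()) {
                bare = true;      // blank element: the key also occurs on its own
            } else {
                letters.add(s);
            }
        }
        if (bare) {
            tokens.add(new FinalToken(key, Collections.emptyList(), TokenType.JUST_A_NUMBER));
        }
        if (!letters.isEmpty()) {
            tokens.add(new FinalToken(key, letters, TokenType.NUMBER_WITH_CHAR));
        }
        return tokens;
    }

    @Override
    public String toString() {
        if (type == TokenType.JUST_A_NUMBER) {
            return key;
        }
        if (letters.size() == 1) {
            return key + letters.get(0);
        }
        // Assumes the letters are one unbroken run, as discussed in the thread.
        return key + "[" + letters.get(0) + "-" + letters.get(letters.size() - 1) + "]";
    }

    public static void main(String[] args) {
        System.out.println(fromEntry("300", Arrays.asList("", "A", "B", "C"))); // [300, 300[A-C]]
    }
}
```

The second step described above, merging neighbouring JUST_A_NUMBER tokens with consecutive values into one bracketed range, is not part of this sketch.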

Carey Brown
Saloon Keeper
Posts: 8760
71
I don't understand how P, R, S is a sequence when it is missing the 'Q'.

kiraz cevik
Greenhorn
Posts: 23

Carey Brown wrote:I don't understand how P, R, S is a sequence when it is missing the 'Q'.

Our alphabet does not have Q, but it can be added.

kiraz cevik
Greenhorn
Posts: 23

Piet Souris wrote:hi Kiraz,
the map that I created was only meant to be an intermediate result. It was my intention to go on processing. My further ideas were:

create an enum

and have a new class:

Now, I would have a method that processes my intermediate results in the map, and creates a List<FinalToken>. Suppose you have this entry in the map:

Because of the empty char at the start of the list (coming from the Token 300 without any chars) I create two FinalTokens: with 300 and type JUST_A_NUMBER, and 300 with list [A, B, C], with type NUMBER_WITH_ChAR.
The last one will output the toString result: 300[A-C].
So I have a List<FinalToken>. My last step is to go from top to bottom, and when I encounter type JUST_A_NUMBER, I will check if the next couple of elements are also of type JUST_A_NUMBER and have consecutive ints. Then I will output the [300 - 350] form.

Well, sounds more complicated than it is (I hope), but I have not yet had time (or courage) to implement all this (well, I do have the enum).

This was what I had in mind. Wondering if others have come up with something simpler.

Thank you. Do I create the enum separately from the code you originally wrote? Is the FinalToken class created separately as well?

Piet Souris
Bartender
Posts: 4691
183

Carey Brown wrote:I don't understand how P, R, S is a sequence when it is missing the 'Q'.

I asked a question about that. The answer was (if I recall correctly) that these letters are always consecutive. Otherwise I guess it will be even more complex....
