
# Matching pattern in given list

samir ware
Ranch Hand
Posts: 192
Hello Rancher
I have a list with a few thousand String records in it, and the list is sorted. I want to find, for each letter of the alphabet, the index at which records starting with that letter begin. For example, if 5 records start with "A", 5 start with "B" and 6 start with "C", then some XYZ function should return a map like the one below:

A - 0
B - 5
C - 10

Each value is the index in the list at which records starting with that letter first appear, i.e. records starting with A start at index 0, records starting with B start at index 5, and records starting with C start at index 10 (indexes 0 to 4 hold the A records and 5 to 9 hold the B records, so the C records start at index 10).
In a nutshell, this is pattern matching on the list.
Any algorithm, pseudocode or white paper would be greatly appreciated.
Samir

Campbell Ritchie
Sheriff
Posts: 49382
62
One possible suggestion. There are many other ways to do this.

You realise that a char is a number? You can do arithmetic with it. You can even do arithmetic to get your chars into a consecutive sequence starting at 0, in which case you can use them as array indices.
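A minimal sketch of that idea, assuming the records start with plain ASCII letters and that upper and lower case should count as the same letter. The class and method names (`FirstLetterIndex`, `firstIndexByLetter`) and the sample records are made up for illustration. Subtracting `'A'` from the first character gives a slot from 0 to 25, which serves as an array index:

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FirstLetterIndex {

    // Maps each letter A..Z to the index of the first record starting with it.
    // Letters with no records are left out of the result.
    static Map<Character, Integer> firstIndexByLetter(List<String> records) {
        int[] first = new int[26];
        Arrays.fill(first, -1);                  // -1 means "letter not seen yet"

        for (int i = 0; i < records.size(); i++) {
            String s = records.get(i);
            if (s.isEmpty()) {
                continue;                        // nothing to inspect
            }
            char c = Character.toUpperCase(s.charAt(0));
            if (c < 'A' || c > 'Z') {
                continue;                        // assumption: ignore non-letters
            }
            int slot = c - 'A';                  // char arithmetic: 'A' -> 0, 'B' -> 1, ...
            if (first[slot] == -1) {
                first[slot] = i;                 // remember the first occurrence only
            }
        }

        Map<Character, Integer> result = new LinkedHashMap<>();
        for (int slot = 0; slot < 26; slot++) {
            if (first[slot] != -1) {
                result.put((char) ('A' + slot), first[slot]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> records = List.of(
                "Adam", "Alan", "Amy", "Ann", "Art",
                "Bea", "Ben", "Bob", "Bud", "Burt",
                "Cal", "Cy");
        System.out.println(firstIndexByLetter(records)); // {A=0, B=5, C=10}
    }
}
```

This is a single pass over the list. Because it only records the first occurrence of each letter, it does not depend on how the list was sorted; with a very large sorted list, a binary search per letter would be another option.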

Jelle Klap
Bartender
Posts: 1952
7
Also something to be aware of: if the List is sorted using String's natural ordering it will be in lexicographical order, not in alphabetical order.

Wendy Gibbons
Bartender
Posts: 1111
Jelle Klap wrote: Also something to be aware of: if the List is sorted using String's natural ordering it will be in lexicographical order, not in alphabetical order.

Sorry but what does this mean?
I found this in wikipedia but really am more confused than before

Lexicographical ordering

It is often useful to define an ordering on a set of strings. If the alphabet Σ has a total order (cf. alphabetical order) one can define a total order on Σ* called lexicographical order. For example, if Σ = {0, 1} and 0 < 1, then the lexicographical order on Σ* includes the relationships ε < 0 < 00 < 000 < ... < 0001 < 001 < 01 < 010 < 011 < 0110 < 01111 < 1 < 10 < 100 < 101 < 111 < 1111 < 11111 ...

I am guessing it means that apple won't be next to Apple in the order, but that Apple will come after zoo (or is it that apple will come after Zoo?).

Campbell Ritchie
Sheriff
Posts: 49382
62
Yes, Wendy, but the other way round.

Jelle Klap
Bartender
Posts: 1952
7
The second one (apple will come after Zoo), because the Unicode value of an uppercase letter is smaller than that of any lowercase letter. The values match those of the ASCII table, if that helps.
Also see String.compareTo().

Campbell Ritchie
Sheriff
Posts: 49382
62
Zoo precedes apple and apple precedes zoo. 10 precedes 11 and 19 precedes 2.
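A small sketch to see this for yourself. The class and method names (`OrderingDemo`, `naturalOrder`, `ignoreCaseOrder`) are made up for the example; `Collections.sort` and `String.CASE_INSENSITIVE_ORDER` are standard library. It sorts the same words by String's natural ordering and then ignoring case:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class OrderingDemo {

    // Sorts a copy using String's natural (lexicographical) ordering.
    static List<String> naturalOrder(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        Collections.sort(copy);
        return copy;
    }

    // Sorts a copy ignoring case, which is closer to dictionary order.
    static List<String> ignoreCaseOrder(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        copy.sort(String.CASE_INSENSITIVE_ORDER);
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = List.of("zoo", "apple", "Zoo");
        System.out.println(naturalOrder(words));    // [Zoo, apple, zoo]
        System.out.println(ignoreCaseOrder(words)); // [apple, zoo, Zoo]

        // Digits are compared character by character, not as numbers.
        List<String> numbers = List.of("2", "19", "11", "10");
        System.out.println(naturalOrder(numbers));  // [10, 11, 19, 2]
    }
}
```

If the list in the original question was sorted with the natural ordering, records starting with a lowercase letter will all sit after the ones starting with uppercase letters, which matters when you look for the first index of each letter.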
You will find that ordering called ASCIIbetical, and you can probably Google for that word.

Jelle Klap
Bartender
Posts: 1952
7
Campbell Ritchie wrote:You will find that ordering called ASCIIbetical, and you can probably Google for that word.

Hey, I've never heard of that term before, I like it!

Wendy Gibbons
Bartender
Posts: 1111
I never remember which is 65: 'a' or 'A'.

Campbell Ritchie
Sheriff
Posts: 49382
62
Don’t even try to remember. Get yourself a link like this Unicode page, and it will tell you.
0x0041 = 'A' and 0x0061 = 'a'. Now is 65 0x0041 or 0x0061?

It’s 0x0041, of course.
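If you would rather not look it up, you can ask Java directly. This is only a sketch, and `CharValues` and `codeOf` are made-up names; it relies on nothing more than a char widening to an int:

```java
class CharValues {

    // A char widens to int without a cast; the result is its UTF-16 code unit value.
    static int codeOf(char c) {
        return c;
    }

    public static void main(String[] args) {
        System.out.println(codeOf('A'));                        // 65
        System.out.println(codeOf('a'));                        // 97
        System.out.println(String.format("0x%04X", (int) 'A')); // 0x0041
        System.out.println(String.format("0x%04X", (int) 'a')); // 0x0061
        System.out.println('a' - 'A');                          // 32
    }
}
```

The difference of 32 between the two cases is the same for every ASCII letter, which is why all uppercase letters sort before all lowercase ones.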

Rob Spoor
Sheriff
Posts: 20546
57
I prefer asciitable.com for the ASCII characters.