Collectors.groupingBy() API | OCP 17

 
Ranch Foreman
Hi

First of all, thank you all for your valuable guidance and support so far. I appreciate it.

I was coding some examples for the Collectors API and came across the following example on Stack Overflow [ https://stackoverflow.com/questions/47019946/java-8-group-by-a-list-sort-and-display-the-total-count-of-it ]
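
For reference, a minimal sketch of the kind of example under discussion, assuming it follows the code in the linked Stack Overflow question. The class name, method names and the exact fruit list are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class FruitCountExample {

    // Counts occurrences of each item. groupingBy makes no promise about
    // the Map implementation it returns.
    static Map<String, Long> countItems(List<String> items) {
        return items.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    // Copies the entries into a LinkedHashMap, highest count first.
    // The comparator looks only at the values, so it does not by itself
    // decide the order of entries with equal counts.
    static Map<String, Long> sortByCountDescending(Map<String, Long> counts) {
        Map<String, Long> finalMap = new LinkedHashMap<>();
        counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
                .forEachOrdered(e -> finalMap.put(e.getKey(), e.getValue()));
        return finalMap;
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList(
                "apple", "apple", "banana", "apple", "orange", "banana", "papaya");
        Map<String, Long> result = countItems(items);
        System.out.println(result);
        System.out.println(sortByCountDescending(result));
    }
}
```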

My question is regarding the Comparator used by the Map.Entry here.

I understand how this example works. My only question is the order of printing of map entries.

When I run this example, the output is always {apple=3, banana=2, papaya=1, orange=1}, even after multiple executions of the code.

I want to understand how comparingByValue works internally when it has to break a tie, i.e. when two map values are equal. (The value is the sorting criterion in this example, in reversed order.)

In this case, I understand that the entries of the map are sorted in reverse natural order of their values (3, 2, 1, ...), but the value 1 corresponds to two keys, papaya and orange.

Why does the papaya entry always come before the orange entry, even though both keys map to the value 1?

In other words, why can't the output be {apple=3, banana=2, orange=1, papaya=1} in this case?



My logic:
As per the Javadoc, there is no guarantee about the specific Map implementation used by the groupingBy collector, so it can be unordered. Further, calling entrySet() on it gives a Set, which guarantees distinct elements at most (no ordering guarantee there either). Hence, the encounter order of the stream generated by entrySet().stream() should be random.

Yet, strangely, the intermediate sorted stream is always {apple=3, banana=2, papaya=1, orange=1}, with papaya always sorted before orange. This is the point I don't understand.

The stream is sorted by comparingByValue, and forEachOrdered is required to respect encounter order. Somehow, forEachOrdered adds the entries to the LinkedHashMap in the same insertion order on every run of my code. Since two keys have the value 1, there should be some chance that finalMap contains orange before papaya.

Please provide your guidance on it.

Thanks



 
Bartender
hi Dalvir,

I seldom use LinkedHashMaps, so I am not sure what is happening in your code, but here is a sure way to get orange before papaya:

output:

{orange=1, papaya=1, apple=3, pear=2}
{apple=3, pear=2, orange=1, papaya=1}
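
One way to produce output of this shape, as a sketch only: count into a LinkedHashMap so the keys keep their order of first appearance, then sort by value. The input list and all names here are assumptions chosen to match the two printed maps. Because a LinkedHashMap's entry set yields an ordered stream, and sorted() is documented as stable for ordered streams, entries with equal counts keep their relative order.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class InsertionOrderedCounts {

    // Counts items into a LinkedHashMap so keys keep the order of first appearance.
    static Map<String, Long> countInEncounterOrder(List<String> items) {
        return items.stream()
                .collect(Collectors.groupingBy(
                        Function.identity(), LinkedHashMap::new, Collectors.counting()));
    }

    // Sorts by count, highest first, and collects into another LinkedHashMap.
    // Ties keep the relative order they had in the source map.
    static Map<String, Long> sortByCountDescending(Map<String, Long> counts) {
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
                .collect(Collectors.toMap(
                        Map.Entry::getKey, Map.Entry::getValue,
                        (first, second) -> first, LinkedHashMap::new));
    }

    public static void main(String[] args) {
        // Hypothetical input: orange appears before papaya.
        List<String> items = List.of(
                "orange", "papaya", "apple", "pear", "apple", "pear", "apple");
        Map<String, Long> counts = countInEncounterOrder(items);
        System.out.println(counts);
        System.out.println(sortByCountDescending(counts));
    }
}
```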
 
Marshal
Can you see the logical misconception behind the following code?

Dalvir Singh Bains wrote:. . .. . .

What is more, I suspect there is a much simpler solution than anything they mentioned on that SO link.

Minor thing: when using Streams it is usual to indent the code so all the dot operators align vertically. Or something similar. They have used lines much too long.
 
Campbell Ritchie
Marshal

A few minutes ago, I wrote:. . . a much simpler solution . . .

I meant a simpler version than the code shown above; one person showed two better solutions still, avoiding the intermediate Map.
 
Campbell Ritchie
Marshal
Unless you specifically want a List, I would avoid Arrays.asList(...) ... list.stream(); you can write Stream#of(T...) instead.
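
A small sketch of that suggestion, with the fruit values assumed for illustration: the stream is built directly with Stream.of, so no intermediate List is created.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StreamOfExample {

    // Builds the stream directly from the values instead of going through Arrays.asList.
    static Map<String, Long> countFruits() {
        return Stream.of("apple", "apple", "banana", "apple", "orange", "banana", "papaya")
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(countFruits());
    }
}
```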
 
Campbell Ritchie
Marshal
I have thought of another reason why the version without using intermediate Maps might be better. It has to do with sorted().
 
Piet Souris
Bartender
........ Campbell Ritch(ie)cock, master of suspense......

But if you do not determine the frequencies first, I do not see how you get a map where the most frequent element is put first (well, I do see a way, but not a very efficient one: something like Collections.frequency(list, element)). I do not understand the necessity of a LinkedHashMap anyway.
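
A sketch of the map-free route alluded to above, assuming it refers to java.util.Collections.frequency (List itself has no frequencyOf method). Each comparison rescans the whole list, which is why it is not an efficient choice. The names are made up for illustration.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class FrequencyWithoutMap {

    // Orders the distinct items by how often they occur, most frequent first.
    // Collections.frequency walks the full list on every call.
    static List<String> distinctByFrequencyDescending(List<String> items) {
        return items.stream()
                .distinct()
                .sorted(Comparator
                        .comparingInt((String s) -> Collections.frequency(items, s))
                        .reversed())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> items = List.of(
                "apple", "apple", "banana", "apple", "orange", "banana", "papaya");
        System.out.println(distinctByFrequencyDescending(items));
    }
}
```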
 
Dalvir Singh Bains
Ranch Foreman
Thanks Piet and Campbell for your replies.

@Piet: following on from the improvised example you posted, I've refactored my original post as below:



@Campbell and @Piet: I understand the alternative ways to order the entries mentioned in that SO post, as well as chaining comparators (thenComparing), but that wasn't what I originally intended to ask. I was not very clear in my earlier explanation.

After refactoring the original post, I believe I understand it better myself and can try to explain again what I originally meant.

I think it all boils down to line 4 of this coding example. Maps generated from streams don't maintain any order unless a specific map implementation is requested (TreeMap, LinkedHashMap, etc.).

So, considering this point, the map entries printed at line 5 of this code should be in random order. But somehow, due to the internal map (possibly a HashMap implementation chosen by the groupingBy collector here), the order is always consistent, which makes the order of the result variable depend on the order of the map variable.

At the 19:00 mark of this YouTube video (https://www.youtube.com/watch?v=lwp2RZ__0ko), @Stuart Marks explains the iteration order of Map data structures. As per this video, the new unmodifiable collections of the Collections API do have randomized iteration order, but older Maps like HashMap, despite having unspecified iteration order, still sometimes give a consistent iteration order because of how the hashing algorithm behaves.

I believe this is what is happening here: the map generated at line 5 is always consistent in the order of its entries (i.e. papaya always shows before orange, even after multiple runs of the code).

Due to this consistency in the order of map entries at line 5, even after the sorting at line 9 produces a sorted stream of map entries, the relative order of the papaya and orange entries is preserved (as it was in the entry set), despite both mapping to the same value 1.
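
The "ties keep their incoming order" effect can be seen in isolation with a sketch like the one below. It uses a List as the source, an assumption made so that the stream is ordered; Stream.sorted() is documented as stable for ordered streams and makes no stability promise for unordered ones. The entry values are illustrative.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class StableSortTies {

    // Sorts entries by value, highest first, and returns just the keys.
    // comparingByValue treats equal values as a tie, so a stable sort
    // leaves tied entries in their incoming order.
    static List<String> keysSortedByValueDescending(List<Map.Entry<String, Long>> entries) {
        return entries.stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Long>> papayaFirst = List.of(
                Map.entry("papaya", 1L), Map.entry("orange", 1L),
                Map.entry("banana", 2L), Map.entry("apple", 3L));
        List<Map.Entry<String, Long>> orangeFirst = List.of(
                Map.entry("orange", 1L), Map.entry("papaya", 1L),
                Map.entry("banana", 2L), Map.entry("apple", 3L));

        System.out.println(keysSortedByValueDescending(papayaFirst)); // [apple, banana, papaya, orange]
        System.out.println(keysSortedByValueDescending(orangeFirst)); // [apple, banana, orange, papaya]
    }
}
```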

Please provide your guidance on it.

Thanks
 
Campbell Ritchie
Marshal
Ritchcock? Hahahahahahahahahahahahahahahaha!

I was a bit confused about how the Map was to be sorted. I am sure we can design a Comparator to work on frequencies (I think we have already got that) and there must be some way to put that information into a sorted map. That has the advantage of not needing lots of memory; a sorted Stream is stateful and in the worst case requires enough memory to store the entire contents of the Stream. Nobody will notice for a seven‑element Stream, but you might for 7,000,000 elements. Fortunately, as you will have seen from the SO link, it is probably possible to dispense with sorted() altogether.
You can also create a Comparator to put orange and papaya in the right order. You obviously know about thenComparing(...) already. The original Comparator didn't distinguish the names of the keys.
The concept problem I saw earlier is that a linked map is intended to give iteration order the same as insertion order, but a tree map gives iteration order according to a value (as specified by the Comparator used to populate it) (=ordered by sorting), so what is the point in doing any sorting beforehand?
If you already have a Map, it is quite easy to write new TreeMap(oldMap, myComparator). I think that will work.
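
A minimal sketch of the tie-breaking Comparator mentioned above, assuming counts are held in a Map<String, Long>: sort by value descending, then by key, so orange is placed before papaya regardless of the source map's iteration order. The class and method names are hypothetical.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

class TieBreakByKey {

    // Highest count first; equal counts are ordered alphabetically by key.
    static Map<String, Long> sortByCountThenName(Map<String, Long> counts) {
        Comparator<Map.Entry<String, Long>> byCountDescThenKey =
                Map.Entry.<String, Long>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<String, Long>comparingByKey());
        return counts.entrySet().stream()
                .sorted(byCountDescThenKey)
                .collect(Collectors.toMap(
                        Map.Entry::getKey, Map.Entry::getValue,
                        (first, second) -> first, LinkedHashMap::new));
    }

    public static void main(String[] args) {
        Map<String, Long> counts = new HashMap<>();
        counts.put("papaya", 1L);
        counts.put("orange", 1L);
        counts.put("banana", 2L);
        counts.put("apple", 3L);
        System.out.println(sortByCountThenName(counts)); // {apple=3, banana=2, orange=1, papaya=1}
    }
}
```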

The documentation for Map doesn't say anything about randomising iteration order: it says,

• The iteration order of mappings is unspecified and is subject to change.

That means the exact implementation of the unmodifiable Map might change in future.
HashMap doesn't say anything about iteration order, but it probably has something to do with h % c, where h is the hash code of the key and c the size of the backing array. SortedMap says it is sorted by the keys, either by their natural order or by a Comparator if you give it one as a constructor argument. So that makes me think it is a bit dubious to sort things by the “V” as the person is doing on SO. But we have seen how that can be done successfully. LinkedHashMap says it iterates by insertion order, but also tells you ways to alter its encounter order.
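
To make the "h % c" idea concrete, here is a sketch that mirrors how OpenJDK's HashMap (Java 8 and later) picks a bucket: the hash code is XORed with its own upper 16 bits and then masked with the table length minus one. This is an implementation detail rather than anything the HashMap documentation promises, and the default table length of 16 is assumed here.

```java
class BucketIndexSketch {

    // Mirrors the hash spreading step in OpenJDK's HashMap (implementation detail).
    static int spread(Object key) {
        int h = key.hashCode();
        return h ^ (h >>> 16);
    }

    // tableLength is assumed to be a power of two, as it is in HashMap.
    static int bucketIndex(Object key, int tableLength) {
        return (tableLength - 1) & spread(key);
    }

    public static void main(String[] args) {
        for (String key : new String[] {"apple", "banana", "orange", "papaya"}) {
            System.out.println(key + " -> bucket " + bucketIndex(key, 16));
        }
    }
}
```

Since String.hashCode() is fully specified, the same keys land in the same buckets on every run, which fits the consistent output seen in this thread.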
 
Saloon Keeper
@Campbell, I'm trying to get your suggestion to work. The TreeMap constructor does not take two arguments, so this is what I came up with. However, the compiler doesn't like lines 23 and 24, and I haven't been able to figure out what it wants. Help please.
 
Carey Brown
Saloon Keeper
GPT has this to say:

The issue with your code is on the line where you try to use addAll on the TreeMap object, which does not exist for TreeMap. Instead, you should use the putAll method to add all entries from the map.

Additionally, TreeMap expects a Comparator that compares its keys, but you are passing a Comparator that compares Map.Entry objects, which isn't directly compatible.
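
A minimal sketch of what that advice amounts to, under the assumption that the counts already sit in a Map<String, Long>: the TreeMap receives a Comparator over keys that looks each key's count up in the original map, and the entries are copied with putAll. The names are hypothetical, and this is not the code from the post above.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class TreeMapByCount {

    // Returns a TreeMap whose keys are ordered by count (highest first), then by name.
    // The comparator reads from 'counts', so it only works for keys present in that
    // map; looking up an unknown key in the result would throw a NullPointerException.
    static TreeMap<String, Long> sortedCopy(Map<String, Long> counts) {
        Comparator<String> byCountDescThenName =
                Comparator.comparing((String key) -> counts.get(key), Comparator.<Long>reverseOrder())
                        .thenComparing(Comparator.<String>naturalOrder());
        TreeMap<String, Long> sorted = new TreeMap<>(byCountDescThenName);
        sorted.putAll(counts);
        return sorted;
    }

    public static void main(String[] args) {
        Map<String, Long> counts = new HashMap<>();
        counts.put("papaya", 1L);
        counts.put("orange", 1L);
        counts.put("banana", 2L);
        counts.put("apple", 3L);
        System.out.println(sortedCopy(counts)); // {apple=3, banana=2, orange=1, papaya=1}
    }
}
```

The secondary comparison by name matters here: a TreeMap treats keys that compare as equal as the same key, so without it orange and papaya would collapse into one entry.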
 
Dalvir Singh Bains
Ranch Foreman
I'll try to summarize our discussion and the concepts that have been validated:

1. Maps generated from streams in general have unspecified iteration order unless a specific sorted map implementation is requested (obviously, every sorted collection is ordered). The Javadoc confirms that the order of map entries is unspecified.

2. Due to the unspecified iteration order, the map generated at line 4 in my refactored example has unspecified order. It just happens to be in a consistent order at line 5 across multiple runs in this example (by chance), but that is just coincidence. Nothing can be said definitely, as its behavior may change when it is generated from a large data set. Here, the initial data set of 7 items is not enough to deduce anything conclusively.

3. In this specific coding example, to make sure that orange is ordered before papaya, we have several possible solutions, such as the ones mentioned in that SO link or a chaining comparator.

and @Campbell by "value" in your following remark:

The concept problem I saw earlier is that a linked map is intended to give iteration order the same as insertion order, but a tree map gives iteration order according to a value (as specified by the Comparator used to populate it)



do you mean the numerical int "value" that is the result of the Comparator interface's method int compare(Object, Object), i.e. {-1, 0, 1}, for custom sorting of a TreeMap? It's not the Map's "Value" from the key-value pair, because TreeMaps base their sorting on Map keys, not values.

Thanks for your insights on this post @Campbell.

 
Carey Brown
Saloon Keeper

Dalvir Singh Bains wrote:2. Due to unspecified iteration order, map generated at line 4 in my refactored example has unspecified order. It just happens to be in a consistent order after multiple runs in this example( by chance) at line 5 but that is just co-incidence. Nothing can be said definitely. As it's behavior may change when generated using a large data set. Here, initial data set of 7 items is not enough to deduce anything conclusively.


Almost but not quite. The order will be consistent for a given JDK but the order is not specified for JDKs in general. The order is not "by chance". The order is not dependent on the size of the data set.
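
A small sketch of that point, with made-up names: two HashMaps filled with the same String keys iterate in the same order on a given JDK, because String.hashCode() is specified and the table layout follows deterministically from it. The specification still leaves the order itself open, so the sketch compares two runs rather than asserting a particular order.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ConsistentHashMapOrder {

    // Inserts the keys into a fresh HashMap and reports the iteration order of its key set.
    static List<String> iterationOrder(List<String> keys) {
        Map<String, Integer> map = new HashMap<>();
        for (String key : keys) {
            map.put(key, key.length());
        }
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        List<String> keys = List.of("apple", "banana", "orange", "papaya");
        List<String> first = iterationOrder(keys);
        List<String> second = iterationOrder(keys);
        System.out.println(first);
        System.out.println("same order on both runs: " + first.equals(second));
    }
}
```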
 
Campbell Ritchie
Marshal

Dalvir Singh Bains wrote:. . . map . . . just happens to be in a consistent order . . . by chance . . .

To continue from what Carey says, the order is unpredictable, but is consistent as long as the same implementation is used.

Thanks for your insights on this post @Campbell.

That's a pleasure. By value I meant the “V” of the Map, which in this case happens to be a Long.
 
Dalvir Singh Bains
Ranch Foreman
I've understood this concept. Thanks @Carey and @Campbell.

Today I feel like sharing my heart with you all while preparing for the OCP.

So far, in order to understand how the API works, I write code examples and try my best to cover edge cases. This has been my approach to learning the Java API. I know that, over time, using the API at work or in projects helps to recall it faster and kind of builds muscle memory, but I have yet to reach that stage. I also go over the problems posted by other users on this forum and on SO to build my code comprehension skills. Examples on CodeRanch and SO help me to see code in the context of a problem and its possible solutions (I submitted my first edit on SO recently and it got accepted; that felt good). I strongly believe in quality over quantity in life in general, and I try my best to apply this belief to learning Java.

I try to understand each concept concretely, to the best of my abilities. Whenever I don't understand something or am in doubt, I read the Javadoc, look at code examples, ask on CodeRanch and practice writing code myself. To take swimming as an analogy: it can only be learned by jumping in the pool. Coding can be learned by coding, self-introspecting, coding again, and repeating, while measuring and benchmarking the learning and adjusting the learning path. I feel kind of low today, so I just felt like sharing this with you all. I respect you all as my seniors in this field. I know I have to put in more hours and it's just the beginning, but I'll persist and persevere.

This is my method so far for understanding the Java API. As you are senior to me in this industry, I would appreciate your valuable feedback on my learning style and suggestions on areas for improvement.

This is the first time in my life that I'm giving Java dedicated hands-on practice.

Thanks
 
Campbell Ritchie
Marshal

Dalvir Singh Bains wrote:. . . . Thanks @Carey and @Campbell.

That's a pleasure.

. . . using API . . . help to recall API faster and it kinda builds muscle memory. . . .
. . . swimming, like it can only be learned by jumping in the pool. . . .

Yes, you will learn some of the API by coding, but that is usually not enough to pass a cert. exam. I recommend you read the documentation for all the methods you are using, and for those methods you expect to come up in the exam.
 
Piet Souris
Bartender
hi Dalvir,

well, your attitude certainly is superb! Practice as much as possible, and as you see you can get all the help you need from the people at SO and from us here.

Both for your very interesting topic, your contributions and as an encouragement for things to come, have a cow!

As an extra exercise: from the frequencyMap, can you create a datastructure that gives, when printed,

{3=[apple], 2=[pear], 1=[papaya, orange, banana]}

 
Campbell Ritchie
Marshal

Piet Souris wrote:. . . for your very interesting topic . . . have a cow! . . .

Agree. Have another cow
 
Campbell Ritchie
Marshal

A minute or two ago, I wrote:. . . Have another cow

I gave it on your thread about varargs.
 
Dalvir Singh Bains
Ranch Foreman
Thanks @Campbell and @Piet for your feedback on my post. I earned some more cows yayyyyyyyyy!!!

@Piet as per your following remark:

As an extra exercise: from the frequencyMap, can you create a datastructure that gives, when printed,

{3=[apple], 2=[pear], 1=[papaya, orange, banana]}



I've tried to come up with the following solution. I know you intend to solve this exercise only using Collectors API. The following example is giving the desired result but it's using another map to sort the map in reverse order of keys at line 12. I'm thinking of some way to do it using Collectors itself without using another map. For now, I've this code as solution:



Please review and provide your feedback.

Thanks


 
Piet Souris
Bartender
hi Dalvir,

in line 13, the lists are still not sorted, although from the output it may look like they are. So, to be absolutely sure, in line 13 and a half, you could do:

But your code requires three maps (the frequency map, map1 and map2) and it can be done with two maps, by using the third form of groupingBy, with the Supplier<M> as argument. Here is what I had, and you see the enormous verbosity of java (but if that is a problem, don't use java in the first place). Here goes:
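
A sketch of the three-argument groupingBy form described here, not the code from this post. It assumes a frequency map of fruit names to counts, with a reverse-ordered TreeMap as the map supplier and a reverse-ordered TreeSet for the names; all names in the sketch are hypothetical.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.function.Supplier;
import java.util.stream.Collectors;

class GroupByFrequency {

    // Groups names by their count. The outer map is sorted by count, highest first,
    // and each group of names is a set sorted in reverse alphabetical order.
    static Map<Long, Set<String>> namesByFrequency(Map<String, Long> frequencyMap) {
        Supplier<TreeMap<Long, Set<String>>> descendingMap =
                () -> new TreeMap<>(Comparator.<Long>reverseOrder());
        Supplier<Set<String>> descendingSet =
                () -> new TreeSet<>(Comparator.<String>reverseOrder());
        return frequencyMap.entrySet().stream()
                .collect(Collectors.groupingBy(
                        Map.Entry::getValue,
                        descendingMap,
                        Collectors.mapping(
                                Map.Entry::getKey,
                                Collectors.toCollection(descendingSet))));
    }

    public static void main(String[] args) {
        Map<String, Long> frequencyMap = new HashMap<>();
        frequencyMap.put("apple", 3L);
        frequencyMap.put("pear", 2L);
        frequencyMap.put("papaya", 1L);
        frequencyMap.put("orange", 1L);
        frequencyMap.put("banana", 1L);
        System.out.println(namesByFrequency(frequencyMap));
        // {3=[apple], 2=[pear], 1=[papaya, orange, banana]}
    }
}
```

Declaring the suppliers with explicit types keeps type inference simple, which may help with the "cannot infer" style of compiler errors that deeply nested collectors tend to produce.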
 
Carey Brown
Saloon Keeper

Piet Souris wrote:


Piet, I don't see where your "tree2" is needed at all. "Tree" is already made from a TreeMap+Reverse.order so it should need no further sorting, and the fruit name list is made with a TreeSet+Reverse.order and so it should not need any further sorting. Am I missing something?

EDIT:
It looks like tree2 was offering a second way of doing it using toList() instead of toCollection().
 
Piet Souris
Bartender
Indeed.

The Map(reverseOrder) says to use reverseOrder for the keys, it does not say anything about the values.

Anyway, it shows some things that are possible with 'groupingBy'. Its use here is far from simple. Not helped by the fact that I get numerous errors while editing the code (using NetBeans). For instance, at first I named my 'supMapTree' just 'supMap'. That was okay. Then I changed the name to 'supMapTree' and suddenly I got all sorts of errors from 'wrong arguments' to the dreaded 'cannot infer <K> <T> etc'. Having stared at all this for half an hour, I decided to give it another look later that day. So I pressed 'save' and, hey presto, gone were all the errors.
 
Dalvir Singh Bains
Ranch Foreman
Thanks @Piet for the solution. It's very interesting. It helped me revise the Collectors API related concepts.
 
Sheriff
I don't mean to muddy the already-muddy waters, but I notice that Java 22 has a preview feature named Stream Gatherers. As the JEP says:

JEP 461 wrote:Stream::gather(Gatherer) is to intermediate operations what Stream::collect(Collector) is to terminal operations.

Perhaps it might be of interest to see whether these new things would simplify the questions being discussed here.
 
Piet Souris
Bartender
Thanks, Paul.

My first impression after reading it once: same impression as when I had a first look at the Collectors-class. Gonna take some time, I guess...
 
Master Rancher

Piet Souris wrote:I do not understand the necessity of a LinkedHashMap anyway.


I think there was some confusion about the requirements, starting from the original post on StackOverflow. The example given (repeated above) shows output with papaya coming before orange, and says that he wants orange before papaya. But why should orange be before papaya? There are two possible reasons, I think:

1. Alphabetically, "orange" comes before "papaya".
2. In the original input, "orange" came before "papaya".

The first seems to be what the original poster intended: "I also wanted to sort based on the name of fruites" [sic]. However, because his only example showed input data with "orange" before "papaya", and because he included a LinkedList in his solution, many of the people on both StackOverflow and here read the problem the second way. I haven't read all the posts super carefully; maybe someone previously sorted this out, or understood differently - but that's how I see it. There is no reason for LinkedList in the original problem, and you pretty much do need to do a complete count of all the elements in order to sort properly - so yeah, you pretty much need to create the full map of counts first, then sort the entries - preserving the order in a Map using either TreeMap or LinkedHashMap. I don't see much room for optimization beyond your (Piet's) original solution - including with Gatherers.
 