• Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Campbell Ritchie
  • Liutauras Vilda
  • Tim Cooke
  • Jeanne Boyarsky
  • Bear Bibeault
Sheriffs:
  • Knute Snortum
  • paul wheaton
  • Devaka Cooray
Saloon Keepers:
  • Tim Moores
  • Stephan van Hulst
  • Ron McLeod
  • Piet Souris
  • Ganesh Patekar
Bartenders:
  • Tim Holloway
  • Carey Brown
  • salvin francis

Find powerset of numbers up to a given number

 
Ranch Hand
Posts: 36
1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hello, so I want to find a way that for a given input number N, the power set of all numbers 1,...,N is calculated. I don't want to store the powerset in memory, neither print or edit it, what i really want to do is check if each subset has a particular property and if not ,move on to calculate another subset and check for this property again. For example if N = 10, I want to start checking the subsets: {1,2,3,...,10} and if none of them satisfies my condition start checking the subsets: { {1,2}, {1,3},...{1,10}, {2,3},...,{2,10},...} and then start checking the subsets that contain 3 numbers and so on... I am basically trying to bruteforce a problem and find the smallest possible subset that satisfies a condition. I just can't think of a way to do it without having N number of for loops XD. I thnk recursion is the answer but I don't really know how to do it . Keep in mind that storing the powerset in memory and then checking each subset is not an option since N can be quite big like 34.
 
Marshal
Posts: 65474
249
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator

Petros Papatheodoru wrote:. . . I want to start checking the subsets: {1,2,3,...,34} . . . .

I presume you want {{1},{2},{3},...,{34}}. Have you remembered what the smallest member of the powerset is? It is {}.
You are correct that a powerset can be large; in fact the size of ℙ{1, 2,...34} (also written 𝒫{1, 2,...34}) is 2³⁴. Yes, you can use nested loops, but a 34‑level nested loop is going to look very strange. Yes, you can use recursion, but again you will have a large block of code.
Maybe it would be better to work from the condition and see if you can't reason about one of the sets satisfying that condition.
 
Master Rancher
Posts: 3322
31
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator

Campbell Ritchie wrote:Yes, you can use recursion, but again you will have a large block of code.


I disagree on this point - there is a recursive solution which is quite concise.  However the one I'm thinking of doesn't do all the short ones first, instead it may be something like this (for size 3):


Note that this is all the powersets that don't contain 1, followed by all the powersets that do contain 1.  Further patterns are left as an exercise for the reader.

So one question is, does the order matter?  Does it need to be in the order described earlier?  Or is any order OK?

I agree with Campbell's point about how it may be a good idea to learn more about the condition being checked for.  The nature of the condition may make it possible to eliminate some (or many) branches in the tree of all possibilities, which could make things much faster.
 
Petros Papatheodoru
Ranch Hand
Posts: 36
1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Sorry, I didn't mention the empty set because it is irrelevant. What I am trying to solve is a graph problem where you have N people and each person has a number of connections with other people. I want to find the minimum vertex cover of the graph, eg. the smallest possible subset of people that together are connected with everyone, this is the condition I was talking about. Yes I know bruteforcing is a terrible solution but I am strictly asked to do it that way. Solving it using 34 for loops cannot be considered since N may vary. The thing that matters here is that I check all the subsets with the same number of elements first (in whatever order) and then increment the number of elements allowed and search again. My current solution does not find the smallest subset by quite a lot, although it finds a solution. Mike, if I understood correctly in the example you provided N was 3 and this algorithm calculates the powerset of the last 2 numbers and then adds the third from the end number to each subset, so an example with N=4 would look something like this : I see many fluctuations of the number of elements in the subset, so I think that this wouldn't work at all.
 
Mike Simmons
Master Rancher
Posts: 3322
31
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
OK, I see, thanks for the clarifications.  I agree that the order is important here; my previous solution wouldn't be good.

On further thought there's also a fairly short, simple recursive solution to this as well, that preserves the order you want.  So, don't shy away from a recursive solution if that's what comes to mind.  There's nothing wrong with that here.

[Edit - I mean, there's a short, simple way to code getting the powerset, ordered by increasing length.  I don't think there's a short way to solve the overall problem, which is NP complete. - Mike]

I suspect that in the long run, it may be better to ignore the powerset, which will contain many sets not connected to each other... and instead follow connections.  But, it's not clear how to best organize this.  At this stage, you're probably best off ignoring efficiency anyway, for now, and concentrate on getting any solution that is correct, first.  Once you have that, you can consider ways to speed it up.
 
Saloon Keeper
Posts: 3462
149
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
hi Petros,

what about using some heuristics? You could start with making a Map with key the size of a List, and value the list of a vertex and all of its direct neighbors. Then start with the list with largest size, eliminate all the keys from the map that are within that list, et cetera. Does not guarantee a minimum, but it is a sort of an A* algorithm.
 
Sheriff
Posts: 24635
56
Eclipse IDE Firefox Browser MySQL Database
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
The easiest way to produce the power-set is to realize that the sets in the power-set of the integers from 1 to N are mapped one-to-one onto binary numbers with N digits.

For example the power-set of the numbers from 1 to 3 corresponds to 000, 001, 010, 011, 100, 101, 110, and 111.
 
Campbell Ritchie
Marshal
Posts: 65474
249
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
For 34 binary digits?
 
Piet Souris
Saloon Keeper
Posts: 3462
149
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
And, if the maximum outdegree is, say, 4, and there are 34 vertices, you need at leat 7 vertices to get a full coverage.
 
Mike Simmons
Master Rancher
Posts: 3322
31
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator

Campbell Ritchie wrote:For 34 binary digits?


Sure.  Why would that be a problem?  He said integers, not ints.  If ints are not big enough, it's trivial to use long instead, or even BigInteger, as far as the programming is concerned.  

Of course, this has exactly the same ordering issue as my post did, for purposes of solving Petros' actual problem.

And of course it could take some time to run this for n = 34.  But frankly that will probably be negligible next to the time required to check vertex coverage anyway.  
 
Petros Papatheodoru
Ranch Hand
Posts: 36
1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Wow, the binary observation is so clever, I think i can make it work! So, I will use a byte array of length = N (34) to reprisent the subsets, by filling the array with 1s in the appropriate places (e.g. 01011 is {4,2,1}) . For every subset I generate, I will check if it satisfies the condition and then move on to generate another one, by using the same array, this way I wont have to save all the possible subsets.
 
Piet Souris
Saloon Keeper
Posts: 3462
149
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
So you are still planning to go the brute force way?

I have some tips that may help ease the coding a little:

What I would do first is to try to reduce the size of the graph a little. Since the relation 'connected' is an equivalence relation, introducing a partition of the graph (into the maximal connected subgraphs), you can do the brute force on these (hopefully) reduced graphs (if the graph is connected, then this would fail, of course).

Also, I stated it before, if the maximal outdegree is 4, then you need at least 34 / 5 = 7 vertices, so there would be no need to investigate smaller subsets.

Lastly: I agree with Mike that using the Longs is a handy way. If you have all the Longs from 1 to 2^34 - 1, then you can create a Map<bitcount, List<Long> easily using the bitCount() method of a Long. To find which bits are set, I would turn my Long into a BitSet and use the method stream().

Happy coding!
 
Petros Papatheodoru
Ranch Hand
Posts: 36
1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator

Piet Souris wrote:So you are still planning to go the brute force way?

I have some tips that may help ease the coding a little:

What I would do first is to try to reduce the size of the graph a little. Since the relation 'connected' is an equivalence relation, introducing a partition of the graph (into the maximal connected subgraphs), you can do the brute force on these (hopefully) reduced graphs (if the graph is connected, then this would fail, of course).

Also, I stated it before, if the maximal outdegree is 4, then you need at least 34 / 5 = 7 vertices, so there would be no need to investigate smaller subsets.

Lastly: I agree with Mike that using the Longs is a handy way. If you have all the Longs from 1 to 2^34 - 1, then you can create a Map<bitcount, List<Long> easily using the bitCount() method of a Long. To find which bits are set, I would turn my Long into a BitSet and use the method stream().

Happy coding!


Thanks for the tips man! The thing is though that I HAVE to use bruteforce, it is not a choice XD. I guess their point is to make us see the difference between exhaustive search and a smarter greedy algorithm (I had already implemented what you suggested in your last comment ).
 
Mike Simmons
Master Rancher
Posts: 3322
31
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Earlier you needed them in a particular order... not a concern any more?  The binary number order is basically the same order showed earlier.
 
Petros Papatheodoru
Ranch Hand
Posts: 36
1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Yes you are right Mike this cannot work since I need to test all of the N elements each time before I increment the size of the subset. It only occured to me once I started implementing it. So, back to nothing
 
Piet Souris
Saloon Keeper
Posts: 3462
149
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I don't quite follow. The idea was to find a minimal set of vertices that would cover the entire graph. What order do you have in mind? My idea waas to start with all subsets of size 1, then 2, et cetera, for which I thought the map was an easy way to implement that.
 
Mike Simmons
Master Rancher
Posts: 3322
31
  • Likes 1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Piet - for my part I was talking about the idea of using binary numbers to generate the combinations, which has the same ordering problems as what I had suggested earlier.  

[Disregard the latter parts of this post; I was misreading Piet's idea. - Mike]
 
Piet Souris
Saloon Keeper
Posts: 3462
149
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
hi Mike,

why not? My idea is based on this: if you have a vertex 1 that has neighbors 2, 3, and 4, then I have a map.entry<1, {1,2, 3, 4}. Likewise, if vertex 5 has neighbors 8 and 9, we have a map.entry<5, {5, 8, 9}> (so in fact a slightly extended adjescent list). If I then have a long with the bits 1 and 5 set, I make a Set of the two groups {1, 2, 3, 4} and {5, 8, 9} and see if the size is 34. Going from bitcounts 2 to 34 would (in the very long run) find the nminimal set.

Edit: my reply was before I read your edit. You're right, I made a "small" error isn caculating 2^34! But then you don't need to do that in one go. The series you showed is one that I use myself to generate all subsets: first all subsets length 1, then, based on that, those of size 2 (I did not do it in my routine, but then you can eliminate the 1 subsets, et cetera). Well, all in all not easy at all, but doable.
 
Mike Simmons
Master Rancher
Posts: 3322
31
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
OK, sorry for editing my post after; I was still catching up on what you meant.  Please disregard my last post.

You seem to be discussing how to determine vertex coverage for each set.  Great, sounds good.  But you are assuming we are using a mechanism to generate these sets, that generates short sets before longer sets.  Because it's a waste of time to verify a 25-element set, if a solution exists with 9 elements.  I think we all agree, checking short sets before long sets is the way to go here.

In the latest posts from myself and Petros, we are thinking of Paul's suggestion to use binary numbers to generate all the possible combinations.  The list of binary combinations shown by Paul is pretty much exactly equivalent to the list of combinations I showed in my first post, with the same problem: it's not ordered in increasing set size.  Yes, we can use a long to concisely represent a combination, but we need a way to generate these combination in order of increasing set size.  Simply counting 1, 1, 10, 11, 100, 101, 110, 111 will not do it.
 
Mike Simmons
Master Rancher
Posts: 3322
31
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Piet - so, to get all subsets of length 1, would you count 0, 1, 10, 11, 100 etc and eliminate all with a length not equal to 1?  Then do the same for length 2, length 3, etc?  That's possible, but seems convoluted and inefficient.  There are more direct ways to get the desired order, recursively or otherwise.
 
Piet Souris
Saloon Keeper
Posts: 3462
149
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Well, I do use recursion, but not in the way you write. Here, I generate all subsets of length 1, check whether any fulfills the requirement, then, starting from these, all subsets of length 2, eliminate all length 1 subsets, et cetera. You could do it the other way, but I don't think that would be more efficient. I've never done any tests on efficiency though (I do not need this on a daily basis), but anyway, wondering what teacher would torture his/her students with these kinds of misery...  

Over to Petros...
 
Petros Papatheodoru
Ranch Hand
Posts: 36
1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I am trying to follow you guys since I am a noob so correct me. Piet you suggested a Map<bitcount, List<Long> is this suggestion still relevant? If yes, what would I gain if a had all the longs between 1 and 2^34 - 1in a map based on their bitcount? In your latest reply you mentioned " I generate all subsets of length 1... then all subsets of length 2, etc. How do you do that though haha? Thanks for your patience guys. Also, come tell our teacher that Piet hehe
 
Piet Souris
Saloon Keeper
Posts: 3462
149
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
hi Petros,

I apologize, I did not realize that this is the Beginners Forum, since your task is not what I would call a beginners assignment.

I just made a very simplified demo version, to see if it worked and how I could do justice to Mike's remarks. I'll give some code to illustrate what I mean, but a bit reluctantly, since this is an assignment. But I hope you get some ideas to implement in your own code.

For instance, to create the Map I was talking about, I have this:
and I have a Map creator like:
and I have a method to get the bits set in a long:

The process is then: starting with bitCount 1, traverse the List<Long>, determine the bits of each long, and add the corresponding sets as I described before, testing for the size of the result. Currently, I do an O(n) operation for each long, to get an idea of the speed, but that should be O(n*n).
Like:

Well, it took 2 minutes to calculate 5 of the 14 keys so far, so the real thing would take a complete holiday, I guess. I can think of some optimizations, but at the cost of making the whole thing even more complicated. It is certainly more complicated than I thought at first.

Anyway, I hope it is clear what I am talking about. But I would certainly start with a graph of only a few elements, to test whatever you have.
 
Piet Souris
Saloon Keeper
Posts: 3462
149
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
hi Petros,

gaiined an experience, lost an illusion, as we say in Holland...

well, Mike is right (clever guy he is), my idea is nice for small sizes, but very inefficient for larger values. You better get the powerset via a more classic way. And besides, brute force takes forever.

So I decided to give my initial idea a go, creating a slightly extended adjacencyMap, and using some simple elimination process. It looks very promising, for instance creating a graph of 100 vertices 0...99, with an average outdegree of 25 (i.e. every vertex has on average 25 neighbors), I got this result [55, 52, 79, 17, 60, 82, 46]. Can't prove that this is a minimal coverage, but as said it looks promising, and the run time was less than a second.

Not brute force, but if you are interested in my code nevertheless, let me know and I'll drop it on my GitHub account.

 
Petros Papatheodoru
Ranch Hand
Posts: 36
1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
The thing is though that I am not supposed to make any optimazations to the problem in the bruteforce solution. In my mind, I rephrased the problem as try to find all the combinations of k elements from n, for k in range 1 to 34 so googled just that and i found a recursive method that does this. I modified it a bit and added my condition check and I ended up with this little piece of code that seems to be doing the job! For the given input of 34 vertices, it seems to be taking forever to execute(no surprise) but I tried a custom input of 18 vertices and it finds a correct solution(consisting of 9 vertices) in half a second!
 
Campbell Ritchie
Marshal
Posts: 65474
249
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator

Petros Papatheodoru wrote:. . . I am not supposed to make any optimazations to the problem in the bruteforce solution . . .

Find some really violent movies, and they should give you ideas about what to do to people who insist on using brute force
 
Mike Simmons
Master Rancher
Posts: 3322
31
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Petros - great news.  I think "taking forever" is unfortunately built into this problem.  

Campbell - nice and concise.  Though it does ignore the desire to find the shortest solution.  Also, is there any benefit to using 0x22 rather than 34?  I understand using hex for bitwise AND, OR, etc... but  how is it helpful to know we're left-shifting 0x22 bits, rather than left-shit 34 bits?
 
Campbell Ritchie
Marshal
Posts: 65474
249
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator

Mike Simmons wrote:. . . Campbell - nice and concise. . . .

Thank you. And it was just something I doodled between cups of coffee
It should have been “between glasses of beer”, which would explain the 0xxx22
 
Piet Souris
Saloon Keeper
Posts: 3462
149
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
But I suspect that 'hasMatchingEdges' will take yet another pint!

About this takes forever, as I remarked:

A rough estimate is that when the size is n, and the average number of neighbors is m, then the coverage has between n/m and, say, limit = 1.5*n/m elements. So, brute forcing would get you the sum of (n over k), for k ranging from 1 to this limit.

With 18 vertices and a minimal coverage of 9, gives m = 2, so the number of calcultations is about 150K-250K. Half a second run time means about 200.000 checks/second.

If we have n = 34, m = 5; then brute force takes around 4 to 6 minutes or so. m = 2 does take forever.
 
Petros Papatheodoru
Ranch Hand
Posts: 36
1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
After analysing again the problem statement, It seems that what was asked was not to find the minimum vertex cover, but to find a combination of vertices that cover all edges (hence the name checkIfAllEdgesAreCovered()), it was not clearly stated. The actual condition is irrelevant to what we were trying to do here though, but it is relevant when it comes to execution times: the graph had a minimum vertex cover consisting of only 4 vertices but the actual solution needed 14 vertices and it took 4.8 hours to finish hehe. What's crazy is that there seems to be an even bigger test dataset given to us, consisting of 77 vertices XD. I thank all of you personaly guys for your help and patience
 
Piet Souris
Saloon Keeper
Posts: 3462
149
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
hi Petros,

you certainly kept us very nicely busy for a couple of days, with whatever was the problem (clear as mud to me now     ). A splendid moment therefore to stop. So, thank you, and keep'm coming!
 
No. No. No. No. Changed my mind. Wanna come down. To see this tiny ad:
professionally read, modify and write PDF files from Java
https://products.aspose.com/pdf/java
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
Boost this thread!