aspose file tools*
The moose likes Java in General and the fly likes Sort Directory paths, sort of. Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login
JavaRanch » Java Forums » Java » Java in General
Bookmark "Sort Directory paths, sort of." Watch "Sort Directory paths, sort of." New topic
Author

Sort Directory paths, sort of.

David Shepherd
Ranch Hand

Joined: Mar 02, 2001
Posts: 35
I have a vector full of strings that resemble directory paths. I just need to sort them into an order similar to the one below, maintaining the parent child relationship. I was hoping there was an EASY way to do it. Like an algorithm that someone knew off the top of thier head. Thank you very much if you have it!
Exhibit / David
Exhibit / David / Car
Exhibit / David / Car / Seat
Exhibit / David / Car / Seat / Red
Exhibit / David / Car / Seat / Blue
Exhibit / David / Car / Seat / Green
Exhibit / Eric
Exhibit / Eric / Truck
Exhibit / Eric / Truck / Engine
Exhibit / Eric / Truck / Color
Exhibit / Eric / Truck / Size
Exhibit / Eric / Truck / Size / Big
Exhibit / Eric / Truck / Size / RealBig
Exhibit / Eric / Truck / Size / Tiny
Manfred Leonhardt
Ranch Hand

Joined: Jan 09, 2001
Posts: 1492
Hi David,
If you want something ordered (text being the easiest) I would place them all into some set (i.e., HashSet, TreeSet) which will automatically order them for you. Then you just need to read them back out!
Regards,
Manfred.
David Shepherd
Ranch Hand

Joined: Mar 02, 2001
Posts: 35
I wish it were that easy. A set will return them ordered but not quite the way I need. I need to preserve the parent child relationship of a tree. A set does not seem to reliably do that. Is there such a thing as a tree sort or something similar?
I've been trying to use a combination of a set and breaking each string apart using a StringTokenizer. It just seems to me that somewhere someone has already done this trickery. After all, directory structures are sorted somehow. Thanks for the help.
Frank Carver
Sheriff

Joined: Jan 07, 1999
Posts: 6920
It looks to me as if these "paths" are just strings, sorted in their natural order. Here's a test program I wrote to demonstrate:

Is there something I'm missing, or is it really this simple?


Read about me at frankcarver.me ~ Raspberry Alpha Omega ~ Frank's Punchbarrel Blog
David Shepherd
Ranch Hand

Joined: Mar 02, 2001
Posts: 35
It comes close but no cigar. If you run the code above you do not always obtain the pattern I listed in my first posting.
I tried sorting using a set and then using an array.sort. I think these basically do the same thing so that did not quit get me anywhere. I have seen examples of a binary tree sort that come closest to what I need but have not seen a lot of Java code fo such a sort. I was hoping I could just use a JTree to sort the info as a cheat. Any good web resources for Java sort algorithms?
Frank Carver
Sheriff

Joined: Jan 07, 1999
Posts: 6920
I'm puzzled now. Can you give an example of a "wrong" ordering output from the code above, it always gives the same output for me. There's obviously some subtlety in the problem that I'm missing.
I've always used this sort of sort for traversing tree structures (I use it in my XML parser, for example, where grouping subnodes is vital), so I'm very interested if you have found some sort of loophole.
Cindy Glass
"The Hood"
Sheriff

Joined: Sep 29, 2000
Posts: 8521
Well it isn't alphabetical, cuz the Red is before the Bue and Green, and Engine is before Color, so I am at a loss as to what the order is.


"JavaRanch, where the deer and the Certified play" - David O'Meara
Steve Fahlbusch
Bartender

Joined: Sep 18, 2000
Posts: 571
    
    7

Greetings David,
Let's start by clarifying a few points, shall we?
1) What is the order of the items in the first post, or said another way, what determines the order since it seems not to be data within the strings, is it some other data that does not make
it to the strings, such as the date/time when the path (subtree) was created?
2) Why and How does Frank's example not keep the parent -
child relationship?
3) You said the Frank's example does not always get the
correct results, could you please explain? It seems to me
that either Frank's code will always work or never work for
your intended results?
David Shepherd
Ranch Hand

Joined: Mar 02, 2001
Posts: 35
I think Frank may be right. I am using this in an XML context to develop an application. I need the sort to allow me to dynamically generate XSL. I'll look at it further this weekend and see if it will work.
However, using the Array.sort() I recieved this result.
exhibit/section
exhibit/section/byline/title
exhibit/section/para
exhibit/section/section/byline
exhibit/section/section/byline/subbyline
And I was trying to get this result.
exhibit/section
exhibit/section/para *******
exhibit/section/byline/title
exhibit/section/section/byline
exhibit/section/section/byline/subbyline
The difference is in the (exhibit/section/para) and
(exhibit/section/byline/title) pair. It is subtle and may not matter. I will try it.
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
I think we all see where the difference is; the question is why do you think that "para" should come before "byline"? Is there some rule which you could explain to us (or your program) which explains why one ordering is "correct" and another is not? Offhand it seems as though it shouldn't matter; using alphabetical order is just the easiest sorting method to implement. If you can explain some other criteria to be used for sorting, it will be possible to create a java.util.Comparator class which can then be used to sort the classes the way you want (using the Arrays.sort(Object[], Comparator) method).

"I'm not back." - Bill Harding, Twister
David Shepherd
Ranch Hand

Joined: Mar 02, 2001
Posts: 35
I'll try to explain. It may not make any difference for my purpose (I have not tested it yet), however, once I get an idea in my head I'm stubborn. Sorry.
If section is the first path the next logical path (in my mind anyway) is direct children of section (that do not have children of thier own). "exhibit/para" would come before "exhibit/section/byline/title" because it has only one element past "exhibit/section", thus represents a direct child/descendant of "exhibit/section". If "exhibit/section" is considered the root, my sort would look for direct children of this root . After that the sort would look for direct children of the new root "exhibit/section/para", and so on. So, each path could be viewed as a parent while iterating through the sort.
So, as you iterate down the list:
- each string is a path that may have children (a root node)
- if the path has children, order them by how directly/ closely
they are related to the root. Most recent descendants
coming first.
I am having trouble just describing the logic, that's obviously why I am having trouble putting it into mathematical terms.
David Shepherd
Ranch Hand

Joined: Mar 02, 2001
Posts: 35
In addition to the above, descendants without children would come before descendants with children.
Frank Carver
Sheriff

Joined: Jan 07, 1999
Posts: 6920
I think the sort which I have described will generate the results you require, iff all "branch" nodes are present. In your recent example, I note that there is no "exhibit/section/byline" or "exhibit/section/section" entries, although your original example was "complete" in this way. In most forms of tree traversal these nodes would have to be present for their children to be there. Is this deliberate? Are these entries missing from your input for some important reason?
From your mention of XML, I guess the missing "tags" might not be shown because they have no significant (non-subtag) contents. For my own XML/string mapping I had to make sure these were stored to make the system work correctly.
Also, just in case you haven't come across this yet, there is a big problem with representing XML trees in this sort of way. Most DTDs allow repeated nodes:

which can cause a lot of problems unless you implement a way to handle or prevent this in your representation.
[This message has been edited by Frank Carver (edited March 10, 2001).]
David Shepherd
Ranch Hand

Joined: Mar 02, 2001
Posts: 35
The missing nodes is an excellent point. It is something I have overlooked. I will need to come up with a way of correcting that problem. If fact, it is something that would have been an issue and I'm glad you pointed it out to me now.
Thank you. I should have seen that earlier. Sometimes the little things are hard to see.
Thanks to everyone for the help.
 
 
subject: Sort Directory paths, sort of.