
I need a Hash function?

 
Monu Tripathi
Rancher
Posts: 1369
1
Android Eclipse IDE Java
I have a class that downloads a resource (an image file, etc.) in the background and writes it to persistent storage. When saving the file, I use the last portion of the URL as the file name. For example, if the URL is http://some.host.name/some/path/Resource.png, the file name would be "Resource.png".

The problem is there could be more than one URL which has "Resource.png" as the last component which causes overwrites and therefore loss of data. I need some way to generate a unique ID for a resource referenced by a URL.
[Note: the only information passed to this module about the resource is the URL]

Is there a Hash function for such cases that I can use?

Thanks.
 
Christophe Verré
Sheriff
Posts: 14691
16
Eclipse IDE Ubuntu VI Editor
You could make a synchronized method which would return the file name, with a timestamp appended to it.
 
Jesper de Jong
Java Cowboy
Saloon Keeper
Posts: 15205
36
Android IntelliJ IDE Java Scala Spring
Note that hash functions do not generate unique IDs - so you are not looking for a hash function.

One thing you could do is store the resources in a directory structure. For example, if the URL is http://some.host.name/some/path/Resource.png, then create a directory "some.host.name", containing a directory "some", containing a directory "path", in which you store Resource.png.
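A minimal sketch of this directory layout, assuming an absolute http URL with a host. The class name `UrlDirectoryLayout`, the method `toLocalFile`, and the `cache` root directory are hypothetical names chosen for illustration. The sketch ignores any query string and skips "." and ".." segments so the result stays under the root.

```java
import java.io.File;
import java.net.URI;

final class UrlDirectoryLayout {

    /** Maps a URL to root/host/segment/.../file, mirroring the URL's structure. */
    static File toLocalFile(File root, String url) {
        URI uri = URI.create(url);
        // Assumption: the URL is absolute, so getHost() is not null.
        File current = new File(root, uri.getHost());
        for (String segment : uri.getPath().split("/")) {
            // Skip empty segments, and "." / ".." so the file cannot escape the root.
            if (segment.isEmpty() || segment.equals(".") || segment.equals("..")) {
                continue;
            }
            current = new File(current, segment);
        }
        return current;
    }

    public static void main(String[] args) {
        File target = toLocalFile(new File("cache"),
                "http://some.host.name/some/path/Resource.png");
        // Prints the nested path, using the platform's separator character.
        System.out.println(target.getPath());
    }
}
```

Before writing the downloaded bytes, the caller would still need to create the parent directories, for example with `target.getParentFile().mkdirs()`.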

Or, if for some reason you can't do that, replace characters in the URL until you get a valid filename, for example some_host_name_some_path_Resource.png (although in principle you could still get name clashes if you do that).
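The character-replacement idea could be sketched roughly as follows. The class `UrlFileNames` and the method `toFlatFileName` are hypothetical names, and the choice of which characters to keep is an assumption: this version keeps dots and hyphens, so its output differs slightly from the example name above.

```java
final class UrlFileNames {

    /** Flattens a URL into a single file name by replacing unsafe characters. */
    static String toFlatFileName(String url) {
        // Drop the scheme prefix (for example "http://") if present.
        String withoutScheme = url.replaceFirst("^[A-Za-z][A-Za-z0-9+.-]*://", "");
        // Replace everything that is not a letter, digit, dot or hyphen with '_'.
        return withoutScheme.replaceAll("[^A-Za-z0-9.-]", "_");
    }

    public static void main(String[] args) {
        System.out.println(toFlatFileName("http://some.host.name/some/path/Resource.png"));
        // Two different URLs that flatten to the same name (a name clash):
        System.out.println(toFlatFileName("http://h/a/b"));
        System.out.println(toFlatFileName("http://h/a_b"));
    }
}
```

The last two calls illustrate the name clash mentioned above: "/" and "_" both end up as "_", so distinct URLs can map to one file name.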
 
Monu Tripathi
Rancher
Posts: 1369
1
Android Eclipse IDE Java
Christophe Verré wrote:You could make a synchronized method which would return the file name, with a timestamp appended to it.

Thanks for your answer.

I had considered using timestamps; the problem is that I have to retrieve the files again, given a URL, and I could not regenerate the same timestamp at that point. I failed to mention retrieval in my original post. I apologize.
 
Christophe Verré
Sheriff
Posts: 14691
16
Eclipse IDE Ubuntu VI Editor
Then Jesper's idea seems more appropriate.
 
Monu Tripathi
Rancher
Posts: 1369
1
Android Eclipse IDE Java
Jesper Young wrote:Note that hash functions do not generate unique IDs - so you are not looking for a hash function.

I knew this would come up, which is why I put a question mark in my thread topic. Anyway, I want unique hash values for my problem.

Jesper Young wrote:
One thing you could do is store the resources in a directory structure. For example, if the URL is http://some.host.name/some/path/Resource.png, then create a directory "some.host.name", containing a directory "some", containing a directory "path", in which you store Resource.png.

This occurred to me, but I ruled it out because creating the directory structure would take some time: I am dealing with lots of URLs here, each of varying depth, and this code will run on a mobile device.

Jesper Young wrote:
Or, if for some reason you can't do that, replace characters in the URL until you get a valid filename, for example some_host_name_some_path_Resource.png (although in principle you could still get name clashes if you do that).

This option of replacing characters in the URL appeals to me. The only drawback I can see is long file names. I don't see the potential for name clashes, because the URLs of unique resources have to be different (otherwise they point to the same resource, and then overwrites are not a problem).

Thanks for your reply.


 
Christophe Verré
Sheriff
Posts: 14691
16
Eclipse IDE Ubuntu VI Editor
You could use something like a MessageDigest to turn the URL into an MD5 hash, then transform the resulting byte array into hexadecimal values.

(Plenty of examples, like here)
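A minimal sketch of the two steps described in this post (digest the URL, then hex-encode the bytes), not necessarily the snippet the post referred to. The class `UrlDigest` and the method `md5Hex` are hypothetical names, and encoding the URL as UTF-8 is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class UrlDigest {

    /** Returns the MD5 digest of the URL as 32 lowercase hexadecimal characters. */
    static String md5Hex(String url) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(url.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                // Mask to an unsigned value so each byte becomes exactly two hex digits.
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is expected to be present on standard platforms.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        // The same URL always yields the same name, so the file can be found again later.
        System.out.println(md5Hex("http://some.host.name/some/path/Resource.png"));
    }
}
```

The result has a fixed length regardless of how long the URL is, and it can be recomputed from the URL at retrieval time. It is still a hash: two URLs could in principle produce the same value, although an accidental collision is very unlikely.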
 
David Newton
Author
Rancher
Posts: 12617
IntelliJ IDE Ruby
Or just use a key/value DB.
 
Monu Tripathi
Rancher
Posts: 1369
1
Android Eclipse IDE Java
Christophe Verré wrote:You could use something like a MessageDigest to turn the URL into an MD5 hash...

Thanks, Christophe. I have never used it, but I will certainly read up on it.

David Newton wrote:Or just use a key/value DB.

Yes, that is an option too. Thanks!
 
Campbell Ritchie
Sheriff
Posts: 48381
56
Too difficult a question for "beginning". Moving thread.
 
Jim Hoglund
Ranch Hand
Posts: 525
Is your problem duplicate file names? When you detect a duplicate, how about
the common approach of adding: (1), (2), (3), etc., just ahead of the file extension?
This way, you know exactly what to look for during further processing. And human
readers of the file names can easily see what is going on.

Jim ... ...
 
Monu Tripathi
Rancher
Posts: 1369
1
Android Eclipse IDE Java
Jim Hoglund wrote:Is your problem duplicate file names? When you detect a duplicate, how about
the common approach of adding: (1), (2), (3), etc., just ahead of the file extension?
This way, you know exactly what to look for during further processing. And human
readers of the file names can easily see what is going on.

Jim ... ...


Thanks for your suggestion Jim.

The problem is to write a resource to persistent storage and retrieve it again, given only the resource URL. As I see it, your approach will not allow easy retrieval unless I also find a way to store the URL information along with the file.
 
Jim Hoglund
Ranch Hand
Posts: 525
Monu: Did you consider David's idea? You could abstract the key
to a single integer by summing the URL characters, for example.

David Newton wrote:Or just use a key/value DB.

Jim ... :) ...
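A quick way to try out the character-sum idea, as a sketch; `CharSumKey` and `charSum` are hypothetical names. Because addition ignores order, URLs whose characters are rearranged share a sum, so such a key would not be unique on its own.

```java
final class CharSumKey {

    /** Sums the UTF-16 code units of the URL into a single integer. */
    static int charSum(String url) {
        int sum = 0;
        for (int i = 0; i < url.length(); i++) {
            sum += url.charAt(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        // These two URLs differ only in the order of two characters.
        int first = charSum("http://host/ab.png");
        int second = charSum("http://host/ba.png");
        System.out.println(first == second);
    }
}
```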
 
Monu Tripathi
Rancher
Posts: 1369
1
Android Eclipse IDE Java
Jim Hoglund wrote:Monu: Did you consider David's idea? You could abstract the key
to a single integer by summing the URL characters, for example....


David Newton wrote:Or just use a key/value DB.

Monu Tripathi wrote:Yes, that is an option too. Thanks!

Unfortunately, I can't set up a database now because of time constraints, but it is a valid and useful suggestion.
I am going with Jesper's solution and, in the meantime, also reading up on MessageDigest.
 
Jim Hoglund
Ranch Hand
Posts: 525
Yes, Jesper's idea looks very doable. And the results are
visible with just a file browser. Neat and clean . . .

Jim ... ...
 