I have written a web crawler that crawls tube sites. For each tube it finds, it grabs the thumbnails to show in search results, and these are stored on my server. At the moment, for each tube I find, I create a folder named after a random UUID and store the thumbnails in that folder. I then store that UUID alongside the URL of the tube so I know where the images for the tube are.
This is not good because if I ever lost the data that associates each URL with its randomly named folder, it would be hard to rebuild.
I want to create a UUID or similar based on the URL and then use that as the folder name. This way, if I lost the associating data, I could rebuild it by regenerating the UUID from the URL and then finding that folder.
I have looked at the UUID API, and I can use the UUID.nameUUIDFromBytes() method by converting the URL to a byte array and turning that into a UUID, but I can't convert the UUID back to the URL.
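For reference, here is a minimal sketch of the name-based approach, assuming the URL is encoded as UTF-8 before hashing. The class name, method name, and example URL are hypothetical. The result is deterministic, so the same URL always maps to the same folder name, but it is a one-way hash (a type 3, MD5-based UUID), which is why the URL cannot be recovered from it.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

class UrlFolderName {
    // Derives a deterministic, name-based (type 3) UUID from the URL's UTF-8 bytes.
    // The same URL always yields the same UUID string.
    static String folderNameFor(String url) {
        byte[] bytes = url.getBytes(StandardCharsets.UTF_8);
        return UUID.nameUUIDFromBytes(bytes).toString();
    }

    public static void main(String[] args) {
        String url = "http://example.com/videos/123"; // hypothetical URL
        String first = folderNameFor(url);
        String second = folderNameFor(url);
        System.out.println(first);
        System.out.println(first.equals(second)); // same input, same folder name
    }
}
```

Going from URL to folder works after data loss as long as the URLs themselves can be found again, for example by re-crawling. Going from folder to URL does not.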
Using the URL itself as the folder name results in folder names that are too long, and URLs contain too many characters that don't make for good folder names.
I have also thought about converting the URL to bytes and using the concatenated bytes as the folder name, but this also results in a folder name that is too long.
Why don't you look at building a pattern? For example, if you download an image from, say, google.com/images/foo/bar/mypic.jpg, then you can store the files in a folder structure like /google-images-foo-bar/mypic-thumb.jpg
This way you'll be able to retrieve the web URL by simply analyzing the folder name. Again, there are lots of things to consider, like '-' appearing in the URL itself, etc.
This could be a start for you to look at things differently.
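A rough sketch of how such a pattern could be made reversible is shown below. It assumes '/' is mapped to '-' and that a literal '-' or '_' in the URL is escaped with a leading '_'. That escaping scheme and the class and method names are illustrative choices, not something from the suggestion above. Unlike the example above, it keeps the full host name so that nothing is lost. It does not handle other characters that are awkward in folder names (such as ':' or '?'), and it does not shorten long URLs.

```java
class PathFolderName {
    // Encodes a URL path as a flat folder name:
    // '/' becomes '-', and a literal '-' or '_' is prefixed with '_' as an escape.
    static String encode(String path) {
        StringBuilder sb = new StringBuilder();
        for (char c : path.toCharArray()) {
            if (c == '/') {
                sb.append('-');
            } else if (c == '-' || c == '_') {
                sb.append('_').append(c);
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Reverses encode(): '_' means "take the next character literally",
    // and an unescaped '-' becomes '/'.
    static String decode(String name) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c == '_' && i + 1 < name.length()) {
                sb.append(name.charAt(++i));
            } else if (c == '-') {
                sb.append('/');
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String path = "google.com/images/foo/bar";
        String folder = encode(path);
        System.out.println(folder);
        System.out.println(decode(folder).equals(path));
    }
}
```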
subject: creating UUIDs or similar based on a string