URLEncoder question

Greg Ostravich
Ranch Hand

Joined: Jul 11, 2002
Posts: 112
I am using URLEncoder's encode method to encode a string with spaces in it that is to become a URL. I have two problems/questions.
I'm using the "UTF-8" encoding class and the spaces do not convert to %20, but instead convert to the "+" sign. I suspect that is because this is for converting data on a form and that I may need to use a different encoding class. The other problem is that any colons or slashes that are a legitimate portion of the URL are encoded and I do not want them to be converted.

Is there a character class I should be using that is more appropriate to URLs instead of form data? Do I need to define my own? Is there a different method I need to be using?
Or do I just need to substitute manually within my strings?

I did try and look at the character sets at the URL below but I couldn't figure out what I needed from the set names and aliases.
http://www.iana.org/assignments/character-sets

The only references I found when searching the JavaRanch Big Moose Saloon Archives were to using URLEncoder or to one person that did manual substitutions to get the %20 they wanted.

If there's an existing tool in the JDK to do this I'd rather use that instead of manual substitutions.

In case it matters, this is for JDK 1.4

Thanks in advance -

Greg


Greg Ostravich - SCPJ2
Joe Ess
Bartender

Joined: Oct 29, 2001
Posts: 8893
    

Originally posted by Greg Ostravich:
I'm using the "UTF-8" encoding class and the spaces do not convert to %20, but instead convert to the "+" sign.

This is correct behavior according to the documentation for java.net.URLEncoder

Originally posted by Greg Ostravich:

The other problem is that any colons or slashes that are a legitimate portion of the URL are encoded and I do not want them to be converted.

You are not supposed to encode the URL, only the form data.
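
To illustrate the point about encoding only the form data, here is a minimal sketch. The `/search` path and the `project` parameter name are made up for the example and do not come from the thread. It encodes just the parameter value, then shows what happens when the whole URL is passed through `URLEncoder` instead.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class QueryValueEncoding {

    // Appends one query parameter to a base URL, encoding only the value.
    // The base URL itself is left untouched.
    static String buildUrl(String base, String paramName, String rawValue) {
        return base + "?" + paramName + "=" + encodeUtf8(rawValue);
    }

    // Thin wrapper so callers do not have to handle the checked exception.
    static String encodeUtf8(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform.
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Only the value is encoded; spaces become '+'.
        System.out.println(buildUrl("http://someserver/search", "project", "Some Project Name"));
        // prints http://someserver/search?project=Some+Project+Name

        // Encoding the whole URL also escapes ':' and '/'.
        System.out.println(encodeUtf8("http://someserver/a b"));
        // prints http%3A%2F%2Fsomeserver%2Fa+b
    }
}
```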


Bear Bibeault
Author and ninkuma
Marshal

Joined: Jan 10, 2002
Posts: 61180
    

Joe is correct on both counts. The '+' is the correct encoding for space (why did you think otherwise?), and you should be individually encoding only the values of the request parameters.


Greg Ostravich
Ranch Hand

Joined: Jul 11, 2002
Posts: 112
In my situation, since I want to encode a URL, not form data, is there another encoding class that would perform my conversion correctly and exclude converting slashes and colons that are a real part of my string?

Or do I need to do my conversion manually by substituting the spaces for the %20 in my string?

Or is there another method somewhere that does what I need?
Joe Ess
Bartender

Joined: Oct 29, 2001
Posts: 8893
    

Originally posted by Greg Ostravich:
is there another encoding class that would perform my conversion correctly and exclude converting slashes and colons that are a real part of my string?


No. Perhaps you should give us an example. I'm confused as to why you are attempting to create a URL containing spaces, slashes and colons.
Greg Ostravich
Ranch Hand

Joined: Jul 11, 2002
Posts: 112
Here's my example.

I am generating a URL that will be e-mailed out to a list of people periodically based on database content. The URL will be composed of a path that has spaces in it as well as the project name, which has spaces in it. The two parts which make up the URL come from a database and are read in at run-time.

I suspect if I was opening the URL directly it would not be an issue, but because the URL is in an e-mail the link doesn't work correctly without some sort of encoding to escape the spaces in the generated URL that is in the e-mail message.

That's why I thought encoding was the way to go.

Here's an example of what it looks like before encoding:

http://someserver/ProjectServer 101/Lists/Some Project Name/overview.aspx

Here's what I thought it should look like after encoding:

http://someserver/ProjectServer%20101/Lists/Some%20Project%20Name/overview.aspx

I can do a straight substitution if I need to.
My question was if there was an existing "encode" that would handle the URL as I needed or if one of the Encoding classes was designed to do this already and I just needed to pick that instead of "UTF-8"

If there's not, that's fine, but I figured I'd ask here first.

Note: I just noticed after I posted that you can see the 'break' I'm talking about in my own URL included above. The one without the %20 is 'broken' in the URL.
[ June 13, 2005: Message edited by: Greg Ostravich ]
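
For the example URL above, one option that nobody in the thread mentions is `java.net.URI`, which has been in the JDK since 1.4. Its multi-argument constructors quote characters that are illegal in a path, such as spaces, while leaving the scheme's colon and the path's slashes alone. A minimal sketch, assuming the scheme, host and path are available as separate strings (the class and method names here are made up):

```java
import java.net.URI;
import java.net.URISyntaxException;

public class PathEscaping {

    // Builds a URI from its parts. The constructor percent-escapes characters
    // that are not legal in a path (a space becomes %20) and keeps '/' as-is.
    static String escape(String scheme, String host, String path) {
        try {
            return new URI(scheme, host, path, null).toASCIIString();
        } catch (URISyntaxException e) {
            // Thrown, for instance, if the path does not start with '/'.
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(escape("http", "someserver",
                "/ProjectServer 101/Lists/Some Project Name/overview.aspx"));
        // prints http://someserver/ProjectServer%20101/Lists/Some%20Project%20Name/overview.aspx
    }
}
```

One caveat: a `%` that is already in the path gets escaped again as `%25`, so this is meant for raw, not-yet-escaped input.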
Joe Ess
Bartender

Joined: Oct 29, 2001
Posts: 8893
    

Originally posted by Greg Ostravich:

My question was if there was an existing "encode" that would handle the URL as I needed or if one of the Encoding classes was designed to do this already and I just needed to pick that instead of "UTF-8"


No. Changing the encoding will not change the handling of the space character.
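
A quick way to check this for yourself: the second argument to `URLEncoder.encode` only decides which bytes are written out for non-ASCII characters. The sample string below is made up for the illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class CharsetComparison {

    // Encodes text with the named charset, wrapping the checked exception.
    static String encode(String text, String charset) {
        try {
            return URLEncoder.encode(text, charset);
        } catch (UnsupportedEncodingException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String sample = "caf\u00e9 menu"; // "café menu"
        // The accented letter is escaped differently per charset,
        // but the space becomes '+' in both cases.
        System.out.println(encode(sample, "UTF-8"));      // prints caf%C3%A9+menu
        System.out.println(encode(sample, "ISO-8859-1")); // prints caf%E9+menu
    }
}
```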
Greg Ostravich
Ranch Hand

Joined: Jul 11, 2002
Posts: 112
Thanks for all the replies.
I had wanted to use something built-in in case there are other characters I needed to modify but instead I did this and it worked for what I needed:
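
The code from this post is not preserved in this copy of the thread. The following is not Greg's code, only a sketch of the kind of manual substitution discussed earlier, using `String.replaceAll`, which is available in JDK 1.4. The class and method names are made up.

```java
public class SpaceSubstitution {

    // Replaces every space with %20 and leaves all other characters as they are.
    // Note: this does nothing about other characters that are illegal in a URL.
    static String escapeSpaces(String rawUrl) {
        return rawUrl.replaceAll(" ", "%20");
    }

    public static void main(String[] args) {
        System.out.println(escapeSpaces(
                "http://someserver/ProjectServer 101/Lists/Some Project Name/overview.aspx"));
        // prints http://someserver/ProjectServer%20101/Lists/Some%20Project%20Name/overview.aspx
    }
}
```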

 