
URLEncoder question

 
Greg Ostravich
Ranch Hand
Posts: 112
I am using URLEncoder's encode method to encode a string with spaces in it that is to become a URL. I have two problems/questions.
I'm using the "UTF-8" encoding class and the spaces do not convert to %20, but instead convert to the "+" sign. I suspect that is because this is for converting data on a form and that I may need to use a different encoding class. The other problem is that any colons or slashes that are a legitimate portion of the URL are encoded and I do not want them to be converted.
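The behavior described above can be reproduced with a few lines. This is only a minimal sketch: the class and method names are made up for illustration, and the sample strings are taken from later in this thread.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class UrlEncoderDemo {
    // Encodes text as application/x-www-form-urlencoded data using UTF-8.
    static String formEncode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Spaces become '+', not %20.
        System.out.println(formEncode("Some Project Name"));
        // Colons and slashes are escaped as well.
        System.out.println(formEncode("http://someserver/Some Project"));
    }
}
```

The first call prints `Some+Project+Name` and the second prints `http%3A%2F%2Fsomeserver%2FSome+Project`, which matches both symptoms.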

Is there a character class I should be using that is more appropriate to URLs instead of form data? Do I need to define my own? Is there a different method I need to be using?
Or do I just need to substitute manually within my strings?

I did try and look at the character sets at the URL below but I couldn't figure out what I needed from the set names and aliases.
http://www.iana.org/assignments/character-sets

The only references I found when searching the JavaRanch Big Moose Saloon Archives were to using URLEncoder or to one person that did manual substitutions to get the %20 they wanted.

If there's an existing tool in the JDK to do this I'd rather use that instead of manual substitutions.

In case it matters, this is for JDK 1.4

Thanks in advance -

Greg
 
Joe Ess
Bartender
Posts: 9280
Originally posted by Greg Ostravich:
I'm using the "UTF-8" encoding class and the spaces do not convert to %20, but instead convert to the "+" sign.

This is correct behavior according to the documentation for java.net.URLEncoder

Originally posted by Greg Ostravich:

The other problem is that any colons or slashes that are a legitimate portion of the URL are encoded and I do not want them to be converted.

You are not supposed to encode the URL, only the form data.
 
Bear Bibeault
Author and ninkuma
Marshal
Posts: 64835
Joe is correct on both counts. The '+' is the correct encoding for space (why did you think otherwise?), and you should be individually encoding only the values of the request parameters.
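As a rough illustration of encoding only the request parameters, the sketch below leaves the base URL untouched and runs just the parameter name and value through URLEncoder. The class name, the method names, the `http://someserver/search` address and the `project` parameter are all hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class QueryBuilder {
    // Form-encodes a single parameter name or value using UTF-8.
    static String encodePart(String part) {
        try {
            return URLEncoder.encode(part, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new RuntimeException(e);
        }
    }

    // Appends one encoded name=value pair to a base URL that is used as-is.
    static String withParameter(String baseUrl, String name, String value) {
        return baseUrl + "?" + encodePart(name) + "=" + encodePart(value);
    }

    public static void main(String[] args) {
        // Hypothetical base URL and parameter name.
        System.out.println(
            withParameter("http://someserver/search", "project", "Some Project Name"));
    }
}
```

This prints `http://someserver/search?project=Some+Project+Name`: the colon and slashes of the address survive, and the '+' appears only inside the query string, where it stands for a space.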
 
Greg Ostravich
Ranch Hand
Posts: 112
In my situation, since I want to encode a URL, not form data, is there another encoding class that would perform my conversion correctly and exclude converting slashes and colons that are a real part of my string?

Or do I need to do my conversion manually by substituting the spaces for the %20 in my string?

Or is there another method somewhere that does what I need?
 
Joe Ess
Bartender
Posts: 9280
Originally posted by Greg Ostravich:
is there another encoding class that would perform my conversion correctly and exclude converting slashes and colons that are a real part of my string?


No. Perhaps you should give us an example. I'm confused as to why you are attempting to create a URL containing spaces, slashes and colons.
 
Greg Ostravich
Ranch Hand
Posts: 112
Here's my example.

I am generating a URL that will be e-mailed out to a list of people periodically based on database content. The URL will be composed of a path that has spaces in it as well as the project name, which has spaces in it. The two parts which make up the URL come from a database and are read in at run-time.

I suspect if I was opening the URL directly it would not be an issue, but because the URL is in an e-mail the link doesn't work correctly without some sort of encoding to escape the spaces in the generated URL that is in the e-mail message.

That's why I thought encoding was the way to go.

Here's an example of what it looks like before encoding:

http://someserver/ProjectServer 101/Lists/Some Project Name/overview.aspx

Here's what I thought it should look like after encoding:

http://someserver/ProjectServer%20101/Lists/Some%20Project%20Name/overview.aspx

I can do a straight substitution if I need to.
My question was if there was an existing "encode" that would handle the URL as I needed or if one of the Encoding classes was designed to do this already and I just needed to pick that instead of "UTF-8"

If there's not, that's fine, but I figured I'd ask here first.

Note: I just noticed after I posted that you can see the 'break' I'm talking about in my own URL included above. The one without the %20 is 'broken' in the URL.
[ June 13, 2005: Message edited by: Greg Ostravich ]
 
Joe Ess
Bartender
Posts: 9280
Originally posted by Greg Ostravich:

My question was if there was an existing "encode" that would handle the URL as I needed or if one of the Encoding classes was designed to do this already and I just needed to pick that instead of "UTF-8"


No. Changing the encoding will not change the handling of the space character.
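For escaping a whole path rather than form data, one option that may be worth checking is java.net.URI, which has been in the JDK since 1.4. Its multi-argument constructors quote characters that are illegal in each component, so spaces in the path become %20 while the slashes are left alone. A minimal sketch, in which the class and method names are made up for illustration and the URL is assumed to be split into scheme, host and path by hand:

```java
import java.net.URI;
import java.net.URISyntaxException;

class PathEscaper {
    // Builds a URL from its parts and lets URI quote illegal characters in the path.
    static String escape(String scheme, String host, String path) {
        try {
            // Four-argument form: scheme, host, path, fragment (none here).
            return new URI(scheme, host, path, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e.getMessage());
        }
    }

    public static void main(String[] args) {
        System.out.println(escape("http", "someserver",
            "/ProjectServer 101/Lists/Some Project Name/overview.aspx"));
    }
}
```

This prints `http://someserver/ProjectServer%20101/Lists/Some%20Project%20Name/overview.aspx`. One caveat: these constructors always quote the '%' character, so a path that already contains %20 would be escaped a second time.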
 
Greg Ostravich
Ranch Hand
Posts: 112
Thanks for all the replies.
I had wanted to use something built-in in case there are other characters I needed to modify but instead I did this and it worked for what I needed:
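The snippet from this post is not preserved here. As a stand-in, and not necessarily what was actually used, a straight substitution of the kind discussed earlier in the thread might look like the sketch below. The class and method names are hypothetical, and it assumes spaces are the only characters that need escaping.

```java
class SpaceEscaper {
    // Replaces every space with %20 and leaves all other characters untouched.
    // Assumption: no other characters in the URL need escaping.
    static String escapeSpaces(String url) {
        return url.replaceAll(" ", "%20");
    }

    public static void main(String[] args) {
        System.out.println(escapeSpaces(
            "http://someserver/ProjectServer 101/Lists/Some Project Name/overview.aspx"));
    }
}
```

This prints `http://someserver/ProjectServer%20101/Lists/Some%20Project%20Name/overview.aspx`, the form given in the earlier example.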

 