I need to remove every "ENTER" (line break) from a string, whether it is \n, \r, or anything else. I searched this forum and Google but could not find a suitable answer.
In the code, I used System.getProperty("line.separator") to find the line separator and remove it from the string. The problem is that when the Java web application is deployed on a Linux box, it does not recognize ENTERs typed on a Windows client.
So, if Server is Windows and Client is also Windows, it works fine. If server is Linux and Client is Windows, it does not work.
I do not want to hard-code every possible line separator. Is there a way to find an ENTER in a string?
Well, there aren't that many possible line separators...
Unix uses the line feed, aka LF, with the ASCII value of 10. Windows uses the carriage return / line feed combo (two characters). The carriage return, aka CR, has an ASCII value of 13. And some systems use just the CR.
If you want to get rid of the line separators, get rid of all LFs and CRs, and you'll have done it for all systems.
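As a minimal sketch of that advice (the class and method names here are made up for illustration), plain literal replacement of both characters covers Unix, Windows, and CR-only input without consulting line.separator at all:

```java
final class StripLineBreaks {
    // Removes every CR (ASCII 13) and LF (ASCII 10), which covers
    // "\n", "\r\n" and a lone "\r" regardless of the server's platform.
    static String strip(String text) {
        return text.replace("\r", "").replace("\n", "");
    }

    public static void main(String[] args) {
        String fromWindows = "line one\r\nline two";
        String fromUnix = "line one\nline two";
        System.out.println(strip(fromWindows)); // line oneline two
        System.out.println(strip(fromUnix));    // line oneline two
    }
}
```

String.replace takes literal text rather than a regex, so no escaping questions come up in this form.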
You need to go through the regular expression classes, e.g. Matcher and Pattern, and find out what is used for line ends. Then you can use the replaceAll method of String, once you have worked out how to get these line-end characters into a regular expression.
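For anyone taking the Pattern route, here is a rough sketch. It assumes Java 8 or later, where Pattern has the \R linebreak matcher (it also matches a few less common Unicode line terminators); on older versions a character class such as [\r\n] would be needed instead. The class name is hypothetical.

```java
import java.util.regex.Pattern;

final class LineBreakPattern {
    // \R is the "any linebreak" matcher in java.util.regex (Java 8+).
    // Compiling once avoids recompiling the regex on every call.
    private static final Pattern LINE_BREAK = Pattern.compile("\\R");

    static String strip(String text) {
        return LINE_BREAK.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(strip("a\r\nb\rc\nd")); // abcd
    }
}
```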
Vijay Raj
Ranch Hand
Joined: Oct 10, 2005
Posts: 110
Thanks Henry.
So, there is no other way of removing line separators other than doing -
Which one would be better: solving it with a regular expression, or the one above? Won't the regular expression solution unnecessarily load the regex classes?
The one above IS a regex based solution. The replaceAll() method is just a convenience method that calls the regex replace methods.... And you can combine the two statements like so...
Also, note that strings are immutable. So you need to assign the result back to the "str" reference if you actually want to change it.
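The snippets from these posts did not survive, so the following is only a sketch of the two points, not the original code. It assumes a character class as one way to fold both replacements into a single call, and it shows why the result has to be assigned back; the class and method names are hypothetical.

```java
final class ReassignExample {
    // One character class covers both CR and LF, so a single call suffices.
    static String clean(String str) {
        return str.replaceAll("[\\r\\n]", "");
    }

    public static void main(String[] args) {
        String str = "first\r\nsecond";
        str.replaceAll("[\\r\\n]", "");   // result discarded: str is unchanged
        System.out.println(str.length()); // still 13
        str = clean(str);                 // assign the result back
        System.out.println(str);          // firstsecond
    }
}
```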
Note that although this works, you don't need to "double escape" them: the regex "\n" matches "\n". So, you could/should just do:
I am not too sure if I completely agree with this. I agree that it will work. But in the first case, I am sending a "\n" string (a backslash followed by an n) to the regex engine. And in the second case, I am sending an ASCII 10 character to the regex engine. The first should work because the regex engine will translate the "\n" escape into the LF character anyway.
But I would feel more comfortable sending readable text to the regex engine rather than control characters, whether it works or not.
Henry
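To make the two forms under discussion concrete, here is a small sketch (names are illustrative only). The double-escaped literal hands the regex engine two characters, a backslash and an n, which the engine interprets as LF; the single-escaped literal hands it the LF character directly. With default flags both behave the same:

```java
final class EscapeForms {
    static final String DOUBLE_ESCAPED = "\\n"; // two chars: backslash + 'n'
    static final String SINGLE_ESCAPED = "\n";  // one char: LF (ASCII 10)

    static String strip(String text, String regex) {
        return text.replaceAll(regex, "");
    }

    public static void main(String[] args) {
        System.out.println(DOUBLE_ESCAPED.length());       // 2
        System.out.println(SINGLE_ESCAPED.length());       // 1
        System.out.println(strip("a\nb", DOUBLE_ESCAPED)); // ab
        System.out.println(strip("a\nb", SINGLE_ESCAPED)); // ab
    }
}
```

One caveat that favors the readable form: if a pattern is compiled with Pattern.COMMENTS, whitespace inside the pattern is ignored, so a raw LF would no longer match, while the escaped \n still would.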
Piet Verdriet
Ranch Hand
Joined: Feb 25, 2006
Posts: 266
Hmmm... You got me thinking there.
<cricket sounds />
Well, I can't really figure out why one should be preferred over the other. Let's hope our true regex-monk, Alan Moore, has something to contribute on this subject.
; )
AFAIK, it only matters when the regex is coming from something outside the program, like a file or the console, where it can be difficult or impossible to input control characters like linefeeds. Then you have to use the escape sequence.
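A sketch of that situation, with a StringReader standing in for the file or console (the class name, method name, and the pattern text are all made up for illustration). The text arriving from outside contains the visible characters \r?\n, not control characters, and the regex engine resolves the escapes itself:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.regex.Pattern;

final class ExternalRegex {
    // Reads one line of regex text from an external source and compiles it.
    static Pattern readPattern(Reader source) {
        try {
            return Pattern.compile(new BufferedReader(source).readLine());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Simulated file content: the six visible characters \r?\n
        Reader simulatedFile = new java.io.StringReader("\\r?\\n");
        Pattern lineBreak = readPattern(simulatedFile);
        System.out.println(lineBreak.matcher("a\r\nb\nc").replaceAll("")); // abc
    }
}
```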
There are also a variety of regex editors, testers, and other plugins available for various IDEs. These usually are able to convert between Java string literals and the regex as seen by the regex engine. For people who use such plugins, it's nice to see the escape sequences rather than actual newlines, returns, tabs, etc. Otherwise you can't easily tell whether a line break is a \r, \n, or some combination. Or whether other whitespace is really a space, or perhaps a tab. So if you use such plugins, or if your code will be seen and used by people who use these plugins, it's probably a good idea to use the extra escapes. In my opinion, of course.