
[newbie] String.replace()/replaceAll() removed spaces

Jon Camilleri
Ranch Hand

Joined: Apr 25, 2008
Posts: 660

This code is intended to read a source file, and write back the contents deleting the line numbers.

1. After a couple of hiccups, it's running, however, the spaces seem to be removed. Why?
2. Do different OSs require something other than '\n' for a new line? I'm aware that Microsoft's Notepad can only read text files with CR+LF at the end of each line, so I'll have to update my code eventually. Anything else?




NOTE 1: Assuming that "code in one line" works, because the following runs.


NOTE 2: Attachments
1. source/input file
2. franky's child ..er... output file :) - must be placed within c:\source

NOTE 3: This forum does not seem to accept .RAR files as attachments; both files are kinda small.



Jon
Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 18717

After a couple of hiccups, it's running, however, the spaces seem to be removed. Why?


I already hinted at this in your previous topic. Basically, you are using a Scanner to read in the file, but you are not reading a line at a time; you are reading a token at a time. By default, the Scanner uses whitespace as the delimiter between tokens, so you will lose all whitespace: spaces, tabs, carriage returns, line feeds, etc.
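
As a rough illustration of the point above (a minimal sketch, not the code from the original post): the class name, method names and sample text below are made up for the example. It reads the same string once with next(), which splits on whitespace, and once with nextLine(), which keeps the spaces inside each line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical demo class; the names here are illustrative only.
final class TokensVersusLines {

    // Reads with next(): the default delimiter is whitespace, so the
    // spaces between words are consumed and never returned.
    static List<String> readTokens(String text) {
        List<String> tokens = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNext()) {
                tokens.add(scanner.next());
            }
        }
        return tokens;
    }

    // Reads with nextLine(): each line comes back with its inner spaces
    // intact; only the line terminator is dropped.
    static List<String> readLines(String text) {
        List<String> lines = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNextLine()) {
                lines.add(scanner.nextLine());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        String sample = "1 int x = 0;\n2 x++;\n";
        System.out.println(readTokens(sample)); // [1, int, x, =, 0;, 2, x++;]
        System.out.println(readLines(sample));  // [1 int x = 0;, 2 x++;]
    }
}
```

Writing the tokens back out without re-inserting separators would give exactly the "spaces removed" effect described in the question.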

Henry


Books: Java Threads, 3rd Edition, Jini in a Nutshell, and Java Gems (contributor)
Jon Camilleri
Ranch Hand

Joined: Apr 25, 2008
Posts: 660

Henry Wong wrote:
After a couple of hiccups, it's running, however, the spaces seem to be removed. Why?


I already hinted at this in your previous topic. Basically, you are using a Scanner to read in the file, but you are not reading a line at a time; you are reading a token at a time. By default, the Scanner uses whitespace as the delimiter between tokens, so you will lose all whitespace: spaces, tabs, carriage returns, line feeds, etc.

Henry

Thanks for reminding me; I must have missed it. Any idea how to go about it?
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 38363
Go to the Pattern class and see what the possibilities for line terminators are. Then you can create a regular expression to pick up all the kinds of line terminator, and pass that to your Scanner to use as a delimiter. I think Scanner has a useDelimiter method. You will probably find that of the many line terminator possibilities, only two are actually used.
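
One way this suggestion might look in practice (a sketch under assumptions, not a definitive implementation): the regular expression below covers the line terminators listed in the java.util.regex.Pattern documentation, and the class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

// Hypothetical demo class; the names here are illustrative only.
final class LineDelimitedScanner {

    // Line terminators as listed in the Pattern documentation.
    // "\r\n" is tried first so a Windows line end counts as a single
    // delimiter rather than two.
    static final Pattern LINE_END =
            Pattern.compile("\\r\\n|[\\n\\r\\u2028\\u2029\\u0085]");

    // Splits the text on line ends only, so spaces and tabs inside a
    // line are returned as part of the token.
    static List<String> readLines(String text) {
        List<String> lines = new ArrayList<>();
        try (Scanner scanner = new Scanner(text).useDelimiter(LINE_END)) {
            while (scanner.hasNext()) {
                lines.add(scanner.next());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Mixed Windows (\r\n) and Unix (\n) line ends.
        String sample = "1 int x = 0;\r\n2   x++;\n";
        for (String line : readLines(sample)) {
            System.out.println("[" + line + "]");
        }
    }
}
```

Scanner.nextLine() recognises line ends on its own, so the custom delimiter is mainly useful when you want to keep calling next() and hasNext().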
Jon Camilleri
Ranch Hand

Joined: Apr 25, 2008
Posts: 660

Campbell Ritchie wrote: Go to the Pattern class and see what the possibilities for line terminators are. Then you can create a regular expression to pick up all the kinds of line terminator, and pass that to your Scanner to use as a delimiter. I think Scanner has a useDelimiter method. You will probably find that of the many line terminator possibilities, only two are actually used.


How will line delimiters make the scanner read the line including the spaces? I'm probably misunderstanding something, because I tried checking whether a space is a delimiter and it was unsuccessful:

Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 38363
What I meant was to look for line ends only. Line ends count as whitespace, so if you look for whitespace (which is the default anyway) you get space, tab, form feed, newline, etc. So it will split into words, rather than lines. If you use the line terminator combinations (I think there are 6) you will get your source file one line at a time.
If you want it one statement at a time, that would be rather different.
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 38363
You ought to call isWhitespace on Character, since it is a static method. I could get a space to register as whitespace; I wonder why you didn't. I found the list of Unicode numbers in the Character#isWhitespace(char) documentation.
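
For reference, a small sketch of calling the static method directly on the Character class. The class name, helper method and sample string are assumptions made for the example, since the code under discussion is not shown in the thread.

```java
// Hypothetical demo class; the names here are illustrative only.
final class WhitespaceCheck {

    // Counts the characters for which the static method
    // Character.isWhitespace(char) returns true.
    static int countWhitespace(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if (Character.isWhitespace(text.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Called on the class, not on an instance.
        System.out.println(Character.isWhitespace(' '));   // true
        System.out.println(Character.isWhitespace('\t'));  // true
        System.out.println(Character.isWhitespace('\n'));  // true
        System.out.println(Character.isWhitespace('x'));   // false
        System.out.println(countWhitespace("int x = 0;")); // 3
    }
}
```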
David Newton
Author
Rancher

Joined: Sep 29, 2008
Posts: 12617

I just use an Apache Commons thingie for reading in a file by lines (FileUtils.lineIterator or something like that).
Jon Camilleri
Ranch Hand

Joined: Apr 25, 2008
Posts: 660

David Newton wrote:I just use an Apache Commons thingie for reading in a file by lines (FileUtils.lineIterator or something like that).


Seems to be what I need :) Does it ignore spaces like my code does? Do you have more specific information (e.g. code snippets) on how to use it, or where to look up the documentation?
Jon Camilleri
Ranch Hand

Joined: Apr 25, 2008
Posts: 660

Campbell Ritchie wrote: You ought to call isWhitespace on Character, since it is a static method. I could get a space to register as whitespace; I wonder why you didn't. I found the list of Unicode numbers in the Character#isWhitespace(char) documentation.



Thanks. The code shows me what kinds of characters count as whitespace, doesn't it? So does the scanner remove the whitespace when reading it into a string?



Is there any workaround?
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 38363
You will have to look in the Scanner documentation for those details. If you provide a delimiter, then the Scanner removes whatever is in the delimiter, and the default delimiter if you don't provide one is "whitespace". I presume that is the same as "whitespace" for the Character class, but am not sure.
Jon Camilleri
Ranch Hand

Joined: Apr 25, 2008
Posts: 660

Campbell Ritchie wrote:You will have to look in the Scanner documentation for those details. If you provide a delimiter, then the Scanner removes whatever is in the delimiter, and the default delimiter if you don't provide one is "whitespace". I presume that is the same as "whitespace" for the Character class, but am not sure.


I see your point. This seems to work.
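
For readers following along, here is a hedged sketch of one way the overall task could be done. It is not the code Jon posted. It assumes a line number is a run of digits at the very start of a line, optionally followed by one space; the class and method names are invented, and it works on a String rather than a file to stay self-contained.

```java
import java.util.Scanner;

// Hypothetical demo class; the names here are illustrative only.
final class LineNumberStripper {

    // Assumption: a line number is a run of digits at the very start of
    // the line, optionally followed by a single space. Everything after
    // that, including indentation and inner spaces, is kept.
    static String stripLineNumber(String line) {
        return line.replaceFirst("^\\d+ ?", "");
    }

    // Reads the text one line at a time with nextLine() so spaces inside
    // each line survive, then writes each line back out with the
    // platform's own line separator.
    static String stripAll(String text) {
        StringBuilder out = new StringBuilder();
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNextLine()) {
                out.append(stripLineNumber(scanner.nextLine()))
                   .append(System.lineSeparator());
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(stripAll("1 int x = 0;\n2     x++;\n"));
    }
}
```

System.lineSeparator() returns CR+LF on Windows and LF on Unix-like systems, which also covers the second question in the opening post.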

Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 38363
Jon Camilleri wrote:. . . This seems to work.
. . .
Well done
 