Linefeed problem while scanning a line

 
Jagat Bandhu
Greenhorn
Posts: 9
Hi,

I'm working on an application that reads a text file containing lines like the following:


When I try to read this line, I get two lines, because it is broken at a linefeed character inside one of the fields. (Please note that this is meant to be a single line; one of its fields contains linefeed characters, which makes it look like two lines.)

I tried Scanner and other file stream classes, but I was not able to read the line in one go.

Could anyone suggest how to read such a file or field, please?

Thanks.

PG.
 
William Brogden
Author and all-around good cowpoke
Rancher
Posts: 13061
6
You may think it is a single "line" containing a linefeed, but Java and most other languages will call it multiple lines.

There is no magic solution; you will just have to do some programming.

What would you like to use as an end of line indicator?

Do you want those line-feed characters preserved?
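
To make the end-of-line question concrete, here is a minimal sketch of one option: give Scanner your own record terminator instead of letting it break on every line break. It assumes, purely for illustration, that records end with "\r\n" while the field only ever contains a bare "\n"; the class name, method name and sample data are made up. If the same sequence can occur inside the field, this will not be enough on its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

public class RecordScanner {

    // Splits the input on a caller-chosen record terminator
    // instead of on every line break.
    static List<String> readRecords(String input, String terminator) {
        List<String> records = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            // Pattern.quote makes the terminator a literal, not a regex.
            scanner.useDelimiter(Pattern.quote(terminator));
            while (scanner.hasNext()) {
                records.add(scanner.next());
            }
        }
        return records;
    }

    public static void main(String[] args) {
        // Hypothetical data: records end with "\r\n",
        // and the second field of the first record holds a bare "\n".
        String data = "1|ab\ncd|x\r\n2|ef|y\r\n";
        for (String record : readRecords(data, "\r\n")) {
            // Show the preserved linefeed as a visible escape.
            System.out.println(record.replace("\n", "\\n"));
        }
    }
}
```

The embedded "\n" stays inside the token, so the linefeed is preserved. For a real file you would construct the Scanner from the file rather than from a String.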

Bill
 
Jagat Bandhu
Greenhorn
Posts: 9
Hi Bill,

Thanks for your reply.

I'm using newline as the end-of-line indicator in this file. If I open the file in, say, EditPlus or UltraEdit, they show this field properly. And yes, I want to preserve the line-feed characters that appear within that field.

I know that I need to do some coding, but some pointers on how to achieve it would really be helpful.

Thanks,
PG.
 
Campbell Ritchie
Sheriff
Pie
Posts: 48953
60
Remember newline is the common name for what is officially called linefeed (ctrl-J or (char)0x0a).
 
Jagat Bandhu
Greenhorn
Posts: 9
Thanks for pointing that out. I'm on Windows, so I think the characters are \r\n.

The problem I face is that the same pattern also appears within a field of the file, which breaks up the line prematurely.
I would appreciate it if someone could suggest how to handle this.
 
Campbell Ritchie
Sheriff
Pie
Posts: 48953
60
Yes, you are correct that Windows uses (char)0x0d (char)0x0a, usually spelt \r\n, for line breaks. But how can \r\n appear in the "middle" of a line? That appears to be a contradiction in terms.
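
A small sketch of why that is a contradiction as far as Java is concerned: BufferedReader.readLine() treats every \r\n as the end of a line, wherever it occurs. The class name and the sample record are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class LineBreakDemo {

    // Renders each character of s as a hex code, separated by spaces.
    static String hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("0x%02x", (int) c));
        }
        return sb.toString();
    }

    // Counts how many lines BufferedReader.readLine() reports for the text.
    static int countLines(String text) {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            int count = 0;
            while (reader.readLine() != null) {
                count++;
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hex("\r\n")); // prints 0x0d 0x0a
        // Made-up record: one logical row whose middle field contains "\r\n".
        System.out.println(countLines("1|ab\r\ncd|x\r\n")); // prints 2
    }
}
```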
 
Jagat Bandhu
Greenhorn
Posts: 9
Well... this is a field generated by some "black box" algorithm to uniquely identify a row in the file.

That's how these characters end up inside the field itself.

 
Campbell Ritchie
Sheriff
Pie
Posts: 48953
60
Have you got some sort of binary file? In which case, I don't know offhand.
But that is no longer a beginner's question. Moving.
 
Jagat Bandhu
Greenhorn
Posts: 9


To answer your question: no, this is a text file generated by some system, but it contains this one field, which is binary. Because the field is generated, it can contain linefeeds, and those prevent me from reading a complete line.

I have tried Scanner, but that fails unless I use two scanners. Something similar to the below:



The idea was to read the next line whenever a linefeed is encountered, but this also fails, as it throws a NullPointerException at some stage during processing.

I tried another approach, using readBytes(), as follows:



With the approach above, some of the records are no longer broken, but others still come out as two lines.

[Not able to post such a line in here]
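
For reference, one possible way to glue broken rows back together is sketched below. It is not the code from either attempt above, and it rests on assumptions that may not hold for this file: fields are separated by a single delimiter character ('|' here, purely hypothetical), every record has a known number of fields, the delimiter never occurs inside the binary field, and the lost line break was "\r\n" (readLine() discards it, so it has to be re-inserted). All names are made up.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class RecordJoiner {

    // Counts occurrences of the delimiter character in the text.
    static int countDelimiters(CharSequence text, char delimiter) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == delimiter) {
                count++;
            }
        }
        return count;
    }

    // Joins physical lines until a record holds fieldCount fields.
    static List<String> readRecords(BufferedReader reader, char delimiter, int fieldCount)
            throws IOException {
        List<String> records = new ArrayList<>();
        StringBuilder current = null;
        String line;
        // readLine() returns null at end of input; checking for it here
        // avoids a NullPointerException further down.
        while ((line = reader.readLine()) != null) {
            if (current == null) {
                current = new StringBuilder(line);
            } else {
                // Assumption: the break that split the field was "\r\n".
                current.append("\r\n").append(line);
            }
            if (countDelimiters(current, delimiter) >= fieldCount - 1) {
                records.add(current.toString());
                current = null;
            }
        }
        if (current != null) {
            records.add(current.toString()); // incomplete trailing record
        }
        return records;
    }

    // Convenience wrapper for in-memory text.
    static List<String> readRecords(String text, char delimiter, int fieldCount) {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            return readRecords(reader, delimiter, fieldCount);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up data: the middle field of the first record contains "\r\n".
        String data = "1|ab\r\ncd|x\r\n2|ef|y\r\n";
        System.out.println(readRecords(data, '|', 3).size()); // prints 2
    }
}
```

A known limitation of this sketch: if the broken field is the last one in the row, the record already looks complete before the break, so the remainder is wrongly treated as the start of the next record.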

Thanks,
P.
 