Parsing notepad content separated by pipes

Edgar Henz
Greenhorn

Joined: Oct 28, 2006
Posts: 2
Hi there,

I really need your help, guys. I need to read the content of a Notepad file and then parse it. Specifically, it is a string like this one:

AAA_AAAA.01 | BBB_BBBB.01 BBB_BBBB.02 | BBB_BBBB.03 | CCC_CCCC.01 | DDD_DDDD.01 | DDD_DDDD.02 | DDD_DDDD.03 | DDD_DDDD.04....and so on,

So far I have only this code:

package tokenizer;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.StringTokenizer;

/**
* @author Henz
*
*/
public class ThreadParser {

    /**
     *
     */
    public ThreadParser() {
        // Auto-generated constructor stub
    }

    /*
     * @param args
     */
    public static void main(String[] args) {
        // Auto-generated method stub
        String line;

        try {
            System.out.print("Enter the path: ");
            BufferedReader stdin = new BufferedReader(new InputStreamReader(System.in));
            String file = stdin.readLine();
            BufferedReader sfile = new BufferedReader(new FileReader(file));

            File myFile = new File(file);
            if (myFile.exists()) {
                System.out.println("File founded, preparing parsing...");
            } else {
                System.out.println("File not founded, verify path...");
            }

            while ((line = sfile.readLine()) != null)
                System.out.println(line);

            // Exactly here I need the string so I can parse it, but it throws a NullPointerException
            StringTokenizer st = new StringTokenizer(line);
            while (st.hasMoreTokens())
                System.out.println(st.nextToken());

        } catch (FileNotFoundException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }

}


But I've got only this output

Enter the path: C:\Test.txt
File founded, preparing parsing...
Exception in thread "main" java.lang.NullPointerException
at java.util.StringTokenizer.<init>(StringTokenizer.java:182)
at java.util.StringTokenizer.<init>(StringTokenizer.java:219)
at tokenizer.ThreadParser.main(ThreadParser.java:54)

The question is, what else do I need in order to get this output:

AAA_AAAA.01
BBB_BBBB.01 BBB_BBBB.02 BBB_BBBB.03
CCC_CCCC.01
DDD_DDDD.01 DDD_DDDD.02 DDD_DDDD.03 DDD_DDDD.04

The main objective is to parse the string above, eliminating the pipes, so that each output line contains only the words that share the same prefix.

Regards!

[ January 17, 2008: Message edited by: Edgar Henz ]
[ January 17, 2008: Message edited by: Bear Bibeault ]
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 39784
Welcome to the Ranch.

Please use "code" tags around quoted code; it makes it easier to read.
Suggest you use a JFileChooser to find the requisite file (if it asks for a Component in any of the methods you can pass "null").
Suggest you use String.split() rather than StringTokenizer; StringTokenizer is legacy code.
Suggest you use java.util.Scanner instead of a BufferedReader for text files; it is much easier.
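
To illustrate the String.split() suggestion, here is a minimal sketch rather than a finished solution. The class name PipeSplitSketch and the method splitOnPipes are made up for this example, and the sample line is a shortened version of the one in the question.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the original program.
class PipeSplitSketch {

    // Splits one line on "|" and trims the whitespace around each field.
    static List<String> splitOnPipes(String line) {
        List<String> fields = new ArrayList<>();
        // "|" is a regex metacharacter, so it has to be escaped as "\\|".
        for (String field : line.split("\\|")) {
            String trimmed = field.trim();
            if (!trimmed.isEmpty()) {
                fields.add(trimmed);
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = "AAA_AAAA.01 | BBB_BBBB.01 BBB_BBBB.02 | CCC_CCCC.01";
        for (String field : splitOnPipes(line)) {
            System.out.println(field);
        }
    }
}
```

Note that this only separates the fields at the pipes; words that share a field, such as BBB_BBBB.01 and BBB_BBBB.02 above, stay together on one line.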

The real problem, I think, however, is lack of {} around your "while" loop.

This is what happened with your code unchanged, then with the appropriate {}:
[Campbell@queeg java]$ java tokenizer.ThreadParser
Enter the path: /home/Campbell/java/Henz
File founded, preparing parsing...
AAA_AAAA.01 | BBB_BBBB.01 BBB_BBBB.02 | BBB_BBBB.03 | CCC_CCCC.01 | DDD_DDDD.01 | DDD_DDDD.02 | DDD_DDDD.03 | DDD_DDDD.04....and so on,
Exception in thread "main" java.lang.NullPointerException
at java.util.StringTokenizer.<init>(StringTokenizer.java:182)
at java.util.StringTokenizer.<init>(StringTokenizer.java:219)
at tokenizer.ThreadParser.main(ThreadParser.java:54)
[Campbell@queeg java]$ javac -d . ThreadParser.java
[Campbell@queeg java]$ java tokenizer.ThreadParser
Enter the path: /home/Campbell/java/Henz
File founded, preparing parsing...
AAA_AAAA.01 | BBB_BBBB.01 BBB_BBBB.02 | BBB_BBBB.03 | CCC_CCCC.01 | DDD_DDDD.01 | DDD_DDDD.02 | DDD_DDDD.03 | DDD_DDDD.04....and so on,
AAA_AAAA.01
|
BBB_BBBB.01
BBB_BBBB.02
|
BBB_BBBB.03
|
CCC_CCCC.01
|
DDD_DDDD.01
|
DDD_DDDD.02
|
DDD_DDDD.03
|
DDD_DDDD.04....and
so
on,
[Campbell@queeg java]$
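
For reference, the braces matter because without them only the println belongs to the loop; once the loop ends, line is null, and that null is what the StringTokenizer constructor receives. A minimal sketch of the braced version follows. The class name BracedLoopSketch and the method tokenizeAll are invented for this example, and a StringReader stands in for the file so the snippet runs on its own.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class; the original program reads from a FileReader instead.
class BracedLoopSketch {

    // Tokenizes each line inside the loop body, while `line` is still non-null.
    static List<String> tokenizeAll(BufferedReader reader) throws IOException {
        List<String> tokens = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            // No delimiter argument, so StringTokenizer splits on whitespace.
            StringTokenizer st = new StringTokenizer(line);
            while (st.hasMoreTokens()) {
                tokens.add(st.nextToken());
            }
        }
        return tokens;
    }

    public static void main(String[] args) throws IOException {
        BufferedReader reader = new BufferedReader(
                new StringReader("AAA_AAAA.01 | BBB_BBBB.01\nCCC_CCCC.01"));
        for (String token : tokenizeAll(reader)) {
            System.out.println(token);
        }
    }
}
```

As in the run above, the pipes still come out as tokens of their own, because whitespace is the only delimiter in use.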
You will need a regular expression to split the input on the pipes; because you passed no delimiter to StringTokenizer, it splits on whitespace by default.
You can actually set a Scanner to use a regular expression, so it will read the file and split the tokens in one operation!
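
A rough sketch of that idea, which also groups the tokens by prefix to get close to the requested output. Several things here are assumptions, not something stated in the thread: the "prefix" is taken to be the text before the first dot, pipes and whitespace are both treated as delimiters, and the names PrefixGroupSketch and groupByPrefix are invented. The sample input leaves out the trailing "....and so on," text.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Scanner;

// Hypothetical demo class; pass new Scanner(new File(path)) to read a real file.
class PrefixGroupSketch {

    // Reads tokens separated by pipes and/or whitespace and joins tokens
    // that share a prefix (assumed: the text before the first '.').
    static List<String> groupByPrefix(Scanner in) {
        in.useDelimiter("[\\s|]+");
        // LinkedHashMap keeps the prefixes in the order they first appear.
        Map<String, StringBuilder> groups = new LinkedHashMap<>();
        while (in.hasNext()) {
            String token = in.next();
            int dot = token.indexOf('.');
            String prefix = dot < 0 ? token : token.substring(0, dot);
            StringBuilder group = groups.get(prefix);
            if (group == null) {
                group = new StringBuilder();
                groups.put(prefix, group);
            } else {
                group.append(' ');
            }
            group.append(token);
        }
        List<String> lines = new ArrayList<>();
        for (StringBuilder group : groups.values()) {
            lines.add(group.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        Scanner in = new Scanner(
                "AAA_AAAA.01 | BBB_BBBB.01 BBB_BBBB.02 | BBB_BBBB.03 | CCC_CCCC.01 | DDD_DDDD.01 | DDD_DDDD.02");
        for (String line : groupByPrefix(in)) {
            System.out.println(line);
        }
    }
}
```

Treating whitespace as a delimiter as well handles the spot in the sample where two BBB_BBBB words are not separated by a pipe.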

CR
Bear Bibeault
Author and ninkuma
Marshal

Joined: Jan 10, 2002
Posts: 61603

Also, please read this.


Edgar Henz
Greenhorn

Joined: Oct 28, 2006
Posts: 2
Thanks a lot for your advice and your time. This is my first time in a forum, so I will try to improve the way I formulate my questions and posts, and I'll try not to distract or annoy anyone.

Have a nice day, thanks again
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 39784
You're welcome.

Have you got it to work? Please post what changes you made.
 