Problem inputting and outputting strings correctly (weird error)

Peter Mills
Greenhorn

Joined: Nov 17, 2003
Posts: 2
Hi,
I'm writing a program that will firstly remove all stop words ("a" and "the"), then all tabs ('\t') from a text file. I seem to have trouble when attempting to perform one action after the other. When I run the methods with the following code...
import java.util.*;
import java.io.* ;
public class RemoveAndUntabify
{
public static void main(String args[] ) throws IOException
{
if ( args.length != 0 && args.length == 3)
{
String input = args[0];
String output = args[1];
String wordFile = args[2];
String[] tabs = {input, output};
Untabify.main(tabs);
String[] words = {input, output, wordFile};
RemoveStopWords.main(words);
}
else
{
System.out.println("You must supply ''inFile, outFile'' ''stopWordFile''");
}
}
}
... It appears as if ONLY the "RemoveStopWords" class has been run: the output file looks as though untabify never happened, and only the "a"s and "the"s are gone. After checking, I believe both are actually being run, so I think something is wrong with the classes it's calling and how they're accessing/closing the files...

removeStopWords...
import java.util.*;
import java.io.* ;
public class RemoveStopWords
{
public static void main(String args[] ) throws IOException
{
if ( args.length != 0 && args.length == 3)
{
FileReader reader = new FileReader(args[0]);
// wrap reader into a BufferedReader so we can use readLine()
BufferedReader in = new BufferedReader(reader);
FileWriter writer = new FileWriter (args[1]);
// wrap writer into a PrintWriter so we can use println()
PrintWriter out = new PrintWriter(writer);
FileReader stopWordsReader = new FileReader(args[2]);
// wrap reader into a BufferedReader so we can use readLine()
BufferedReader stopWords = new BufferedReader(stopWordsReader);
Collection col = new ArrayList();
String input;
while ((input = stopWords.readLine()) != null)
col.add(input);
TextProcessing.removeStopWords(out, in, col);
in.close();
out.close();
stopWords.close();
//Test
writer.close();
reader.close();
stopWordsReader.close();
}
else
{
System.out.println("You must supply ''inFile, outFile'' ''stopWordFile''");
}
}

}

Untabify...

import java.util.*;
import java.io.* ;
public class Untabify
{
public static void main(String args[] ) throws IOException
{
if ( args.length != 0 && args.length == 2)
{
FileReader reader = new FileReader(args[0]);
// wrap reader into a BufferedReader so we can use readLine()
BufferedReader in = new BufferedReader(reader);
FileWriter writer = new FileWriter (args[1]);
// wrap writer into a PrintWriter so we can use println()
PrintWriter out = new PrintWriter(writer);

TextProcessing.untabify(out, in);
in.close();
out.close();
//Test
writer.close();
reader.close();
}
else
{
System.out.println("You must supply ''inFile, outFile''");
}
}

}

and finally...
import java.util.*;
import java.io.* ;
public class TextProcessing {
static final String delims = " .,;:!?-`'\"()";
static boolean isDelim (String str) {return (delims.indexOf(str) >= 0);}

public static String removeStopWords(String str, Collection stopWords)
{
StringTokenizer tokenizer = new StringTokenizer(str,delims, true);
StringBuffer sb = new StringBuffer();
while (tokenizer.hasMoreTokens())
{
String token = tokenizer.nextToken();
if (isDelim(token) || (!stopWords.contains(token)))
sb.append(token);
};
return (sb.toString());
}
// Task 1 - 2
public static void removeStopWords(PrintWriter out, BufferedReader in, Collection stopWords) throws IOException
{
String input;
while ((input = in.readLine()) != null)
{
StringTokenizer tokenizer = new StringTokenizer(input,delims, true);
StringBuffer sb = new StringBuffer();
while (tokenizer.hasMoreTokens())
{
String token = tokenizer.nextToken();
if (isDelim(token) || (!stopWords.contains(token)))
sb.append(token);
};
out.println(sb.toString());
}
}
// Task 1 - 3
//Doesn't work quite right - it swaps 'cloud' for eight blank spaces instead of \t char for now
public static void untabify(PrintWriter out, BufferedReader in) throws IOException
{
String input;
Collection stopWords = new ArrayList();
stopWords.add("cloud");
while ((input = in.readLine()) != null)
{
StringTokenizer tokenizer = new StringTokenizer(input,delims, true);
StringBuffer sb = new StringBuffer();
while (tokenizer.hasMoreTokens())
{
String token = tokenizer.nextToken();
//if (isDelim(token) || (!stopWords.contains(token)))
if (isDelim(token) || (!stopWords.contains(token)))
sb.append(token);
else
sb.append(" ");
};
out.println(sb.toString());
}
}

}

Sorry it's a lot of code there guys, but I think it's really close to working right; it would really make my day if anyone's got any ideas on how to fix it.
Cheers,
Peter
rom chatterjee
Ranch Hand

Joined: Dec 11, 2001
Posts: 46
Hi there,
I think you need to stop for a moment and ask yourself what you're coding. I don't mean this harshly, but I'm sure that anyone new to coding is going to go straight in and make something work. It is satisfying, but not pretty!
You have used the main method 3 times, did you really mean this? Remember the main method is your entry point to an application. Once you're up and running, why start more applications? This may be your problem (although I've not tested it).
Try to describe your application first. Name some objects that you think you might need, what methods they might have and what those methods will do.
For example this sounds like some kind of text filter application. So your class with the main method could be FileFilterApp. You could maybe have a TokenList (a list of all the tokens to remove), that would suggest a Token. These might be used by a FileFilter.
So, let's expand on that a bit:
A Token could have some methods like:
public Token(String token) //constructor
public String getToken() //get the Token value
A TokenList could have:
public TokenList() //create an empty list
public void addToken(Token token) //add a token to the list
public void removeToken(Token token) //remove a token from the list
public Iterator iterator() //get the token list iterator for walking the list
A FileFilter could have a constructor and one filter method (choose one return type; Java does not allow overloads that differ only in return type):
public FileFilter(TokenList list) // create a filter using this token list
public File filter(File file) // filter file based on list
public String filter(File file) // filter file based on list
public StringBuffer filter(File file) // filter file based on list
Finally, your FileFilterApp might look something like:
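A minimal sketch of such an app, assuming the Token, TokenList and FileFilter outline above. All class names here are hypothetical (this FileFilter is not java.io.FileFilter), the contains helper on TokenList is an addition for illustration, and filter is assumed to return the filtered text as a String:

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical classes following the outline in this post; none come from a library.
class Token {
    private final String token;
    Token(String token) { this.token = token; }
    String getToken() { return token; }
}

class TokenList {
    private final List<Token> tokens = new ArrayList<>();
    void addToken(Token token) { tokens.add(token); }
    Iterator<Token> iterator() { return tokens.iterator(); }

    // Assumed helper (not in the outline): true if some token equals the word.
    boolean contains(String word) {
        for (Token t : tokens) {
            if (t.getToken().equals(word)) return true;
        }
        return false;
    }
}

class FileFilter {
    // Delimiters are returned as tokens too, so spacing and punctuation survive.
    private static final String DELIMS = " \t.,;:!?-`'\"()";
    private final TokenList list;

    FileFilter(TokenList list) { this.list = list; }

    // Reads the file line by line and returns its text without the listed tokens.
    String filter(File file) throws IOException {
        StringBuilder sb = new StringBuilder();
        for (String line : Files.readAllLines(file.toPath())) {
            StringTokenizer tokenizer = new StringTokenizer(line, DELIMS, true);
            while (tokenizer.hasMoreTokens()) {
                String token = tokenizer.nextToken();
                if (!list.contains(token)) sb.append(token);
            }
            sb.append('\n');
        }
        return sb.toString();
    }
}

class FileFilterApp {
    // The single entry point: args[0] is the input file, args[1] the stop word file.
    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("You must supply inFile stopWordFile");
            return;
        }
        TokenList list = new TokenList();
        for (String word : Files.readAllLines(new File(args[1]).toPath())) {
            list.addToken(new Token(word));
        }
        System.out.print(new FileFilter(list).filter(new File(args[0])));
    }
}
```

Only FileFilterApp has a main method here; the other classes are plain objects that it wires together.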

This is intended only as a starting point, so come back if you need more help.


Rom Chatterjee
Sun Certified Java Programmer
Peter Mills
Greenhorn

Joined: Nov 17, 2003
Posts: 2
Hey, thanks for the advice, though it's a Software Architecture assignment so we can't change that much about the program (we're supposed to be demonstrating call return, pipe-filter and code adaptation). Incidentally, I did manage to fix my problem the next day; naturally it was one of those silly problems. Basically, both tasks were reading the same original input file, so the second task simply overwrote the first task's output. What I needed was for the first task to write its output and for the second task to read that output as its input. It's always the simplest things, isn't it!
Thanks,
Peter
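
The fix Peter describes can be sketched roughly as follows. This is not his actual solution, which was not posted; it is a minimal illustration that assumes a temporary file is used to carry the first stage's output into the second stage. The class name ChainedFilters, the eight-space tab width, and the use of String.replace for tabs are all assumptions:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collection;
import java.util.StringTokenizer;

// Hypothetical driver: stage 2 reads what stage 1 wrote, not the original input.
class ChainedFilters {
    static final String DELIMS = " .,;:!?-`'\"()";

    static boolean isDelim(String s) { return DELIMS.indexOf(s) >= 0; }

    // Stage 1: replace every tab with eight spaces (width is an assumption).
    static void untabify(BufferedReader in, PrintWriter out) throws IOException {
        String line;
        while ((line = in.readLine()) != null) {
            out.println(line.replace("\t", "        "));
        }
    }

    // Stage 2: copy each line, dropping tokens that are stop words.
    static void removeStopWords(BufferedReader in, PrintWriter out,
                                Collection<String> stopWords) throws IOException {
        String line;
        while ((line = in.readLine()) != null) {
            StringTokenizer tokenizer = new StringTokenizer(line, DELIMS, true);
            StringBuilder sb = new StringBuilder();
            while (tokenizer.hasMoreTokens()) {
                String token = tokenizer.nextToken();
                if (isDelim(token) || !stopWords.contains(token)) sb.append(token);
            }
            out.println(sb);
        }
    }

    // Pipe: input -> untabify -> temp file -> removeStopWords -> output.
    static void run(Path input, Path output, Collection<String> stopWords)
            throws IOException {
        Path temp = Files.createTempFile("untabified", ".txt");
        try {
            try (BufferedReader in = Files.newBufferedReader(input);
                 PrintWriter out = new PrintWriter(Files.newBufferedWriter(temp))) {
                untabify(in, out);
            }
            // The second stage opens the temp file, so the first stage's work is kept.
            try (BufferedReader in = Files.newBufferedReader(temp);
                 PrintWriter out = new PrintWriter(Files.newBufferedWriter(output))) {
                removeStopWords(in, out, stopWords);
            }
        } finally {
            Files.deleteIfExists(temp);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.out.println("You must supply inFile outFile stopWordFile");
            return;
        }
        run(Paths.get(args[0]), Paths.get(args[1]),
            Files.readAllLines(Paths.get(args[2])));
    }
}
```

In the original RemoveAndUntabify, both stages were handed the same input and output names, so the second stage re-read the untouched input and replaced whatever the first stage had written. Running untabify first also turns tabs into spaces, which are delimiters, so a stop word that was next to a tab becomes its own token.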