spell-checker: help with delimiters and moving Scanner into Array

Don Havan
Greenhorn

Joined: Nov 01, 2013
Posts: 4
I am having trouble getting my spell-checker program to work. I have to write two programs, one using an ArrayList and one using an array (yes, dictionaries and other more efficient structures exist, but I really do have to use these two for this part of the project). Each takes two files: a dictionary file and a file to spell-check. The class using an ArrayList I kind of got working. I'm still quite the beginner, so I have no idea if it's the most efficient class, but it works except for one small problem: I have to add a blank line at the top of the dictionary file, or else the spell-checker reports the space after a period as an incorrectly spelled word. The file I'm spell-checking says "This file has no words spelled incorrectly. None at all." without the quotes. Without the blank line in the dictionary file, the space after the period is flagged as a spelling error; with it, everything works fine. Maybe it's the syntax I'm using for the delimiter? How would I add a period and a space? Here's the program:



Then an array class has to be created, to compare how long the iterator takes in the class above with how long a binary search takes in the array class. But I can't even get the array filled. What I want to do is create an array that's only as long as it needs to be. When I use a while-loop to increment a counter, so I can find out how many words the file to check contains and therefore how long an array to create, it works fine. But when I run a second while-loop using the same hasNext() and next() methods, it doesn't work. Is there a way to set the internal iterator back to the beginning? Here's what I have so far. The first loop gives the variable the right value, but then I have trouble putting each word into the corresponding array index. I can't even worry about the comparisons and binary search that I'd like to implement, because I can't get the file to check into an array. And if I "cheat" and just create an array long enough to handle the text file I'm checking, the loop I use doesn't add the last word. Or it doesn't add every word, or it adds a space so that there's not enough room for all the actual words. Basically, I need a lot of help. Any and all is appreciated. Anyway, here's what I have so far for the array class:



That's what I have at the moment, anyway. I've been trying and deleting like crazy. But yeah, originally I wanted to have two loops, each using while (chck.hasNext()), so that I could make an array the right length. But it wasn't working. And I don't know if I'm using the delimiter syntax correctly. I thought what I have now would treat anything that's not a letter as a delimiter, but I don't know. Any guidance would be immensely appreciated.
Jim Venolia
Ranch Hand

Joined: Sep 07, 2013
Posts: 154
    

Why not stuff your dictionary into a, um, a dictionary. Then for each word of your input file you just see if it's in the dictionary.

Caveat: I've used hash tables in awk (I'm old) and dictionaries in Python, but have not used them in Java.

Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 38865
    
There are classes called Dictionary and Hashtable, but don't use them. They are implementations of maps, but the map you should use as a default is HashMap.
DH: your code is illegible on small screens because of the long lines. I shall see if I can sort it out, then you can see how to break lines.
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 38865
    
In most cases, it was the comments causing the problem. A lot of those // comments should have been /**…*/ comments, even on private members (well, I think they should).
You appear to have the same delimiter hardcoded thrice: that is not a good idea. It should be a constant: useDelimiter(DELIMITER)
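A minimal sketch of the constant idea, since the posted code is not visible here. The class name, the method name and the pattern itself are assumptions for illustration; the point is only that every Scanner picks up the same DELIMITER, so changing the pattern means changing one line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

final class SharedDelimiter {
    // Hypothetical constant: one or more characters that are not ASCII letters.
    private static final Pattern DELIMITER = Pattern.compile("[^a-zA-Z]+");

    // Splits the text into lower-case words using the shared delimiter.
    static List<String> words(String text) {
        List<String> result = new ArrayList<>();
        try (Scanner in = new Scanner(text).useDelimiter(DELIMITER)) {
            while (in.hasNext()) {
                result.add(in.next().toLowerCase());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(words("None at all.")); // prints [none, at, all]
    }
}
```

The same method would work with a Scanner built from a File instead of a String; compiling the Pattern once also avoids re-parsing the regex for each Scanner.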
Don Havan
Greenhorn

Joined: Nov 01, 2013
Posts: 4
Sorry, should have elaborated in that first sentence. But it literally does have to be an array and an ArrayList. It's a team project to compare the efficiency of different data structures and search methods, and I have those two. Other people got dictionary, HashMap, etc. We'll just be comparing everything at the end. So...mine are these. But yes, the existence of more efficient ones is part of the assignment.

I'll see if I can edit the comments out in the first post, make the lines shorter.

Plus I figured it'd be good practice for working with Scanner files, iterators, etc. But yeah, can't use dictionary or any other structure for this.
Tyson Lindner
Ranch Hand

Joined: May 16, 2012
Posts: 172
Use "\\W+" as your delimiter. That will however include digits, so you have to decide if you want something like "234" or "apple55" to be considered a misspelled word or not. In this forum's spellchecker the former is not but the latter is.
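To make the digit question concrete, here is a small sketch comparing "\\W+" with a letters-only pattern. The class and method names and the sample sentence are made up for illustration. In Java's default regex mode, \W matches anything other than a letter, a digit or an underscore, so digits stay inside tokens.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

final class WordDelimiterDemo {
    // Returns the tokens a Scanner produces for the given delimiter regex.
    static List<String> split(String text, String delimiterRegex) {
        List<String> tokens = new ArrayList<>();
        try (Scanner in = new Scanner(text).useDelimiter(delimiterRegex)) {
            while (in.hasNext()) {
                tokens.add(in.next());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String sample = "I ate 234 apple55s.";
        // \W+ keeps digits inside tokens.
        System.out.println(split(sample, "\\W+"));        // prints [I, ate, 234, apple55s]
        // A letters-only pattern treats digits as delimiters too.
        System.out.println(split(sample, "[^a-zA-Z]+"));  // prints [I, ate, apple, s]
    }
}
```

Which of the two is right depends on whether tokens such as "234" should ever reach the dictionary lookup.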

You don't need to explicitly get an iterator for your ArrayLists, a for/each loop is fine.
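As a sketch of the for/each version (the method and variable names are hypothetical, and lower-casing before the lookup is an assumption about how the dictionary is stored): the loop below does the same linear scan an explicit Iterator would, just with less code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ForEachCheck {
    // Collects every word that is not found in the dictionary list.
    static List<String> misspelled(List<String> words, List<String> dictionary) {
        List<String> result = new ArrayList<>();
        for (String word : words) {
            // List.contains walks the list element by element (linear search).
            if (!dictionary.contains(word.toLowerCase())) {
                result.add(word);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> dictionary = Arrays.asList("this", "file", "has", "no", "words");
        List<String> text = Arrays.asList("This", "fiel", "has", "no", "wurds");
        System.out.println(misspelled(text, dictionary)); // prints [fiel, wurds]
    }
}
```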

In creating a regular array, if you're allowed to submit both your programs as one, you can just copy your ArrayList to the array using toArray(...). If you have to submit both separately you can always just create a new instance of the Scanner to loop through again.
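A rough sketch of the toArray(...) route, with hypothetical names. It assumes the dictionary words are already in a List; the array is sorted once because Arrays.binarySearch only gives meaningful results on sorted input.

```java
import java.util.Arrays;
import java.util.List;

final class ArrayLookup {
    // Copies the list into an array of exactly the right length and sorts it.
    static String[] toSortedArray(List<String> words) {
        String[] array = words.toArray(new String[0]);
        Arrays.sort(array);
        return array;
    }

    // Binary search: a non-negative index means the word was found.
    static boolean contains(String[] sortedWords, String word) {
        return Arrays.binarySearch(sortedWords, word) >= 0;
    }

    public static void main(String[] args) {
        String[] dictionary = toSortedArray(Arrays.asList("none", "all", "at", "file"));
        System.out.println(contains(dictionary, "at"));   // prints true
        System.out.println(contains(dictionary, "fiel")); // prints false
    }
}
```

If the dictionary file is already in sorted order, the sort adds little, but it guards against a file that is not.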

Fwiw, this assignment seems pretty hard for a beginner.
Don Havan
Greenhorn

Joined: Nov 01, 2013
Posts: 4
I agree, ha. I'll try your suggestion for the delimiter. Is that the right syntax I'm using though? Why wouldn't it work the way I have it now? I thought that the ^ symbol means to use anything that ISN'T directly after it as a delimiter. So, if I have every letter after it, wouldn't that do what I'm trying to accomplish?
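One possible explanation, offered as a guess because the posted delimiter is not shown here: a negated class such as [^a-zA-Z] is valid syntax, but without a trailing + it matches exactly one character. In "incorrectly. None" the period and the space then count as two separate delimiters, and Scanner returns the empty string between them as a token. An empty string in the dictionary would hide that, which may be what the blank line is doing. The sketch below uses hypothetical names and shows only the token difference.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

final class EmptyTokenDemo {
    // Returns the tokens a Scanner produces for the given delimiter regex.
    static List<String> tokens(String text, String delimiterRegex) {
        List<String> result = new ArrayList<>();
        try (Scanner in = new Scanner(text).useDelimiter(delimiterRegex)) {
            while (in.hasNext()) {
                result.add(in.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String sample = "incorrectly. None";
        // Single-character delimiter: the gap between '.' and ' ' becomes an empty token.
        System.out.println(tokens(sample, "[^a-zA-Z]").size());  // prints 3
        // With '+', the run ". " is consumed as one delimiter.
        System.out.println(tokens(sample, "[^a-zA-Z]+").size()); // prints 2
    }
}
```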

And for the array one: that would work, yes. But since we'll be timing these, I wanted the dictionary word list and the text file to check (which may also be quite lengthy) to be read only once. The dictionary is about 500,000 words, and reading it twice would make my class take almost twice as long as it would if I read the word list only once. Unless I can semi-cheat and subtract however many milliseconds the second read takes from the total time. Isn't there a way to reset the internal iterator of the two Scanners? I tried .reset(), which didn't work. Tried a few others too, no luck.

There's no way to get the dictionary words and the text file to check into arrays, making an array only as long as it needs to be for the arbitrary text file to check, while only Scanning each one once? Or am I just missing a step?
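On the reset question: Scanner.reset() only restores the delimiter, locale and radix, not the read position, and Scanner has no method to rewind, so a second pass means a new Scanner. A single pass can still end with an exact-length array, though. The sketch below (hypothetical names, initial capacity chosen arbitrarily) doubles a plain array as it fills and trims it at the end with Arrays.copyOf.

```java
import java.util.Arrays;
import java.util.Scanner;

final class ReadOnce {
    // Reads every remaining token once and returns an array of exactly that many words.
    static String[] readWords(Scanner in) {
        String[] words = new String[16]; // arbitrary starting capacity
        int count = 0;
        while (in.hasNext()) {
            if (count == words.length) {
                // Array is full: copy into one twice as long.
                words = Arrays.copyOf(words, count * 2);
            }
            words[count++] = in.next();
        }
        // Trim unused slots so the result is only as long as it needs to be.
        return Arrays.copyOf(words, count);
    }

    public static void main(String[] args) {
        try (Scanner in = new Scanner("None at all.").useDelimiter("[^a-zA-Z]+")) {
            String[] words = readWords(in);
            System.out.println(words.length);           // prints 3
            System.out.println(Arrays.toString(words)); // prints [None, at, all]
        }
    }
}
```

This is roughly what an ArrayList does internally, so reading into an ArrayList and calling toArray(...) is an equivalent one-pass alternative if the assignment allows it.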