Some tips to create these methods

Hussein Baghdadi
clojure forum advocate
Bartender

Joined: Nov 08, 2003
Posts: 3479

Hi all.
I want to create a small application that takes a file as a parameter and performs the following tasks:
counts the number of characters.
counts the number of white spaces.
counts the number of lines.
counts the number of words.
searches for a specific word.
But the problem is that I'm not sure about these algorithms.
So would you mind giving me some tips to create these methods?
(For example, how can I tell that a line has ended, and how can I tell that a word has ended?)
here is some code :

Any corrections to the code above?
I think there is something wrong with counting the spaces and chars. What do you think?
Thanks a lot.
David Harkness
Ranch Hand

Joined: Aug 07, 2003
Posts: 1646
You've got everything counted except words. You need to define precisely what a word is for this program. English words? Any string of non-whitespace characters?

Once you decide, detecting them will be similar to your current checks. Use the Character helper methods as you have so far (and maybe new ones) to check for words.

This part seems like the crux of the assignment, so I don't want to hint too far. Try a few ways and post again if you don't get it.
Hussein Baghdadi
clojure forum advocate
Bartender

Joined: Nov 08, 2003
Posts: 3479

Thanks.
I want to count the English words.
But I am confused: in order to count spaces, should I use
isWhitespace( ) or isSpaceChar( )?
Which method in the Character class counts the words?
Thanks again.
Stan James
(instanceof Sidekick)
Ranch Hand

Joined: Jan 29, 2003
Posts: 8791
Hi, good start.

Do you see a way to only do sum++ once?

I'm dyin to write some hints on word counting but making myself wait until you try it first. You get all the fun!


A good question is never answered. It is not a bolt to be tightened into place but a seed to be planted and to bear more seed toward the hope of greening the landscape of the idea. John Ciardi
David Harkness
Ranch Hand

Joined: Aug 07, 2003
Posts: 1646
Originally posted by John Todd:
in order to count spaces, should I use :
isWhitespace( ) or isSpaceChar( ) ??
What does the program specification say you should count: spaces or whitespace? Read the JavaDocs for both methods and see which matches the spec and use it.
which method in Character class counts the words ??
How could a Character be able to tell you anything about words (other than single-letter words like "I" and "a")? Think about how you detect English words (try defining one for starters) in the context of a text file. Then translate that into a sequence of logic steps and finally code.
Hussein Baghdadi
clojure forum advocate
Bartender

Joined: Nov 08, 2003
Posts: 3479

May I ask you what is the difference between the space char and white space ?
I'm confused about them.
David Harkness
Ranch Hand

Joined: Aug 07, 2003
Posts: 1646
Originally posted by John Todd:
May I ask you what is the difference between the space char and white space ?
The definition varies from system to system. Typically, whitespace is considered to include space " ", horizontal tab "\t" and newline "\n". However, the JavaDoc for Character.isWhitespace(char) says:
A character is considered to be a Java whitespace character if and only if it satisfies one of the following criteria:
  • It is a Unicode space separator (category "Zs"), but is not a no-break space (\u00A0 or \uFEFF).
  • It is a Unicode line separator (category "Zl").
  • It is a Unicode paragraph separator (category "Zp").
  • It is \u0009, HORIZONTAL TABULATION.
  • It is \u000A, LINE FEED.
  • It is \u000B, VERTICAL TABULATION.
  • It is \u000C, FORM FEED.
  • It is \u000D, CARRIAGE RETURN.
  • It is \u001C, FILE SEPARATOR.
  • It is \u001D, GROUP SEPARATOR.
  • It is \u001E, RECORD SEPARATOR.
  • It is \u001F, UNIT SEPARATOR.

What that tells me is that you should never read JavaDocs before coffee. No wait, what that tells me is that the three I mentioned above are included in that much wider definition. But I would bet you that when your program is tested, only the three I mentioned will be considered (maybe carriage return if tested on a Mac). Regardless, using Character.isWhitespace(char) will count them all correctly.
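To see how the two methods from the question differ in practice, here is a minimal sketch. The class name WhitespaceDemo and the helper classify are made up for illustration; the only library calls are Character.isWhitespace(char) and Character.isSpaceChar(char).

```java
class WhitespaceDemo {
    // Hypothetical helper: reports how both Character methods classify one character.
    static String classify(char c) {
        return String.format("U+%04X isWhitespace=%b isSpaceChar=%b",
                (int) c, Character.isWhitespace(c), Character.isSpaceChar(c));
    }

    public static void main(String[] args) {
        // Space, tab, line feed, carriage return, and the no-break space (U+00A0).
        char[] samples = {' ', '\t', '\n', '\r', '\u00A0'};
        for (char c : samples) {
            System.out.println(classify(c));
        }
    }
}
```

For the tab this prints "U+0009 isWhitespace=true isSpaceChar=false", and for the no-break space "U+00A0 isWhitespace=false isSpaceChar=true". Only the plain space in this sample set is accepted by both methods, which is why the choice depends on what the specification asks you to count.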

The real trick is how to define and detect an "English word." How many words do the following sentences have?
  • This sentence has 5 or 7 words
  • Is punctuation part of the words preceding it or a separate word?
  • Perhaps you will only get letters and whitespace so it is easy
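As a rough illustration of why the definition matters, the sketch below counts words under two assumed definitions: a word is a run of non-whitespace characters, or a word is a run of letters. Neither is necessarily what the assignment wants, and the names WordCountDemo, countTokens and countLetterRuns are invented for this example.

```java
class WordCountDemo {
    // Counts maximal runs of "word characters". Which characters qualify
    // is the assumption that differs between the two definitions.
    private static int countRuns(String text, boolean lettersOnly) {
        int count = 0;
        boolean inWord = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean wordChar = lettersOnly
                    ? Character.isLetter(c)
                    : !Character.isWhitespace(c);
            if (wordChar && !inWord) {
                count++; // a new word starts at this character
            }
            inWord = wordChar;
        }
        return count;
    }

    // Assumed definition A: a word is any run of non-whitespace characters.
    static int countTokens(String text) {
        return countRuns(text, false);
    }

    // Assumed definition B: a word is any run of letters.
    static int countLetterRuns(String text) {
        return countRuns(text, true);
    }

    public static void main(String[] args) {
        String s = "This sentence has 5 or 7 words";
        System.out.println(countTokens(s));     // prints 7
        System.out.println(countLetterRuns(s)); // prints 5
    }
}
```

The first sample sentence really does have 5 or 7 words depending on the definition. The letters-only version also splits "don't" into two words, so it is not a finished answer either.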

Hussein Baghdadi
clojure forum advocate
Bartender

Joined: Nov 08, 2003
Posts: 3479

Yahoooooooooooooooooooooo
I found it, I found how to count the words.
But I have a question:
look at this code please:

I am incrementing the number of characters every time I encounter a Unicode space or a Java whitespace character, which of course will cause the sum to come out wrong.
If I want to count the number of chars in a file, which method should I use:
isWhitespace or isSpaceChar?
Note:
what is the difference between a char and a letter?
Thanks ranchers.
David Harkness
Ranch Hand

Joined: Aug 07, 2003
Posts: 1646
Think of it like this. For every character in the file, you want to take a set of actions.
  • Increment the total character counter.
  • If it's a whitespace character, increment the whitespace counter.
  • If it's a letter character, increment the letter counter.
  • ...

Note where the various counter increment steps take place. This will solve your problem with over-counting total characters. In fact, Stan pointed this out in an earlier reply.
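The steps above can be sketched as a single loop. This is only one possible shape, not the code from the thread: the class name FileStats, its fields, and the count method are assumptions, and a StringReader stands in for the file so the example is self-contained.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class FileStats {
    int chars;      // every character read
    int whitespace; // characters accepted by Character.isWhitespace
    int letters;    // characters accepted by Character.isLetter

    // Reads the input one character at a time and updates each counter.
    static FileStats count(Reader in) throws IOException {
        FileStats stats = new FileStats();
        int c;
        while ((c = in.read()) != -1) {
            stats.chars++; // incremented exactly once per character
            if (Character.isWhitespace((char) c)) {
                stats.whitespace++;
            }
            if (Character.isLetter((char) c)) {
                stats.letters++;
            }
        }
        return stats;
    }

    public static void main(String[] args) throws IOException {
        FileStats s = count(new StringReader("Hi all.\nBye"));
        // prints "11 2 8": 11 characters, 2 whitespace, 8 letters
        System.out.println(s.chars + " " + s.whitespace + " " + s.letters);
    }
}
```

Because the total is incremented outside the if statements, a character is never counted twice no matter how many categories it falls into. For a real file you would pass something like a FileReader wrapped in a BufferedReader instead of the StringReader.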
[ November 08, 2004: Message edited by: David Harkness ]
     