
Some tips to create these methods

 
Hussein Baghdadi
clojure forum advocate
Bartender
Posts: 3479
Clojure Mac Objective C
Hi all.
I want to create a small application that takes a file as a parameter and performs the following tasks:
counts the number of characters.
counts the number of white spaces.
counts the number of lines.
counts the number of words.
searches for a specific word.
But the problem is that I'm not sure about these algorithms.
So would you mind giving me some tips to create these methods?
(For example, how can I tell that a line has ended, and how can I tell that a word has ended?)
here is some code :

Any corrections to the code above?
I think there is something wrong with counting the spaces and chars. What do you think?
Thanks a lot.
 
David Harkness
Ranch Hand
Posts: 1646
You've got everything counted except words. You need to define precisely what a word is for this program. English words? Any string of non-whitespace characters?

Once you decide, detecting them will be similar to your current checks. Use the Character helper methods as you have so far (and maybe new ones) to check for words.

This part seems like the crux of the assignment, so I don't want to hint too far. Try a few ways and post again if you don't get it.
 
Hussein Baghdadi
clojure forum advocate
Bartender
Posts: 3479
Clojure Mac Objective C
Thanks.
I want to count English words.
But I am confused: in order to count spaces, should I use
isWhitespace( ) or isSpaceChar( )?
Which method in the Character class counts the words?
Thanks again.
 
Stan James
(instanceof Sidekick)
Ranch Hand
Posts: 8791
Hi, good start.

Do you see a way to only do sum++ once?

I'm dying to write some hints on word counting, but I'm making myself wait until you try it first. You get all the fun!
 
David Harkness
Ranch Hand
Posts: 1646
Originally posted by John Todd:
in order to count spaces, should I use :
isWhitespace( ) or isSpaceChar( ) ??
What does the program specification say you should count: spaces or whitespace? Read the JavaDocs for both methods and see which matches the spec and use it.
which method in Character class counts the words ??
How could a Character be able to tell you anything about words (other than single-letter words like "I" and "a")? Think about how you detect English words (try defining one for starters) in the context of a text file. Then translate that into a sequence of logic steps and finally code.
 
Hussein Baghdadi
clojure forum advocate
Bartender
Posts: 3479
Clojure Mac Objective C
May I ask you what is the difference between the space char and white space ?
I'm confused about them.
 
David Harkness
Ranch Hand
Posts: 1646
Originally posted by John Todd:
May I ask you what is the difference between the space char and white space ?
The definition varies from system to system. Typically, whitespace is considered to include space " ", horizontal tab "\t" and newline "\n". However, the JavaDoc for Character.isWhitespace(char) says:
A character is considered to be a Java whitespace character if and only if it satisfies one of the following criteria:
  • It is a Unicode space separator (category "Zs"), but is not a no-break space (\u00A0 or \uFEFF).
  • It is a Unicode line separator (category "Zl").
  • It is a Unicode paragraph separator (category "Zp").
  • It is \u0009, HORIZONTAL TABULATION.
  • It is \u000A, LINE FEED.
  • It is \u000B, VERTICAL TABULATION.
  • It is \u000C, FORM FEED.
  • It is \u000D, CARRIAGE RETURN.
  • It is \u001C, FILE SEPARATOR.
  • It is \u001D, GROUP SEPARATOR.
  • It is \u001E, RECORD SEPARATOR.
  • It is \u001F, UNIT SEPARATOR.

What that tells me is that you should never read JavaDocs before coffee. No wait, what that tells me is that the three I mentioned above are included in that much wider definition. But I would bet you that when your program is tested, only the three I mentioned will be considered (maybe carriage return if tested on a Mac). Regardless, using Character.isWhitespace(char) will count them all correctly.
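To see how the two methods differ in practice, here is a minimal sketch that prints what Character.isWhitespace and Character.isSpaceChar report for a handful of characters. The class name WhitespaceCheck and the helper classify are made up for illustration; only the two Character methods come from the JDK.

```java
public class WhitespaceCheck {
    // Hypothetical helper: reports how both Character methods classify one char.
    static String classify(char c) {
        return "isWhitespace=" + Character.isWhitespace(c)
             + " isSpaceChar=" + Character.isSpaceChar(c);
    }

    public static void main(String[] args) {
        // space, tab, line feed, carriage return, no-break space
        char[] samples = {' ', '\t', '\n', '\r', (char) 0x00A0};
        for (char c : samples) {
            System.out.printf("U+%04X %s%n", (int) c, classify(c));
        }
    }
}
```

A plain space satisfies both methods. Tab, line feed and carriage return are whitespace but not space characters, and the no-break space is the other way around.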

The real trick is how to define and detect an "English word." How many words do the following sentences have?
  • This sentence has 5 or 7 words
  • Is punctuation part of the words preceding it or a separate word?
  • Perhaps you will only get letters and whitespace so it is easy
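To make the ambiguity concrete, here is a rough sketch that counts words under two possible definitions. Both definitions, the class name WordCount and the method names are assumptions for illustration, not part of any assignment spec. Each method walks the text once and counts the transitions into a word.

```java
public class WordCount {
    // Assumed definition A: a word is any run of non-whitespace characters.
    static int countTokens(String text) {
        int count = 0;
        boolean inWord = false;
        for (int i = 0; i < text.length(); i++) {
            boolean ws = Character.isWhitespace(text.charAt(i));
            if (!ws && !inWord) {
                count++;          // first character of a new word
            }
            inWord = !ws;
        }
        return count;
    }

    // Assumed definition B: a word is any run of letters,
    // so digits and punctuation end a word.
    static int countLetterRuns(String text) {
        int count = 0;
        boolean inWord = false;
        for (int i = 0; i < text.length(); i++) {
            boolean letter = Character.isLetter(text.charAt(i));
            if (letter && !inWord) {
                count++;          // first letter of a new word
            }
            inWord = letter;
        }
        return count;
    }

    public static void main(String[] args) {
        String s = "This sentence has 5 or 7 words";
        System.out.println(countTokens(s));      // prints 7
        System.out.println(countLetterRuns(s));  // prints 5
    }
}
```

Neither definition is "right". Definition B, for example, splits "don't" into two words, so the choice depends on what the program is supposed to count.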

Hussein Baghdadi
clojure forum advocate
Bartender
Posts: 3479
Clojure Mac Objective C
Yahoooooooooooooooooooooo
I found it, I found how to count the words.
But I have a question.
Look at this code please:

I am incrementing the character count every time I encounter a Unicode space or a Java whitespace character, which of course makes the sum come out wrong.
If I want to count the number of chars in a file, which method should I use:
isWhitespace or isSpaceChar?
Note:
What is the difference between a char and a letter?
Thanks, ranchers.
     
    David Harkness
    Ranch Hand
    Posts: 1646
Think of it like this. For every character in the file, you want to take a set of actions.
  • Increment the total character counter.
  • If it's a whitespace character, increment the whitespace counter.
  • If it's a letter character, increment the letter counter.
  • ...

Note where the various counter increment steps take place. This will solve your problem with over-counting total characters. In fact, Stan pointed this out in an earlier reply.
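As a sketch of that structure (not the original poster's code, which isn't shown here), the loop below increments the total exactly once per character and then runs the independent checks. The class name CharStats, its fields and the sample text are assumptions for illustration. It reads from any Reader, so a FileReader could stand in for the StringReader.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class CharStats {
    int total;       // every character read
    int whitespace;  // characters for which Character.isWhitespace is true
    int letters;     // characters for which Character.isLetter is true

    static CharStats count(Reader in) throws IOException {
        CharStats stats = new CharStats();
        int c;
        while ((c = in.read()) != -1) {
            stats.total++;                    // once per character, unconditionally
            if (Character.isWhitespace(c)) {
                stats.whitespace++;
            }
            if (Character.isLetter(c)) {
                stats.letters++;
            }
            // further counters would be checked here in the same way
        }
        return stats;
    }

    public static void main(String[] args) throws IOException {
        CharStats s = count(new StringReader("Hi all.\nBye"));
        System.out.println(s.total);       // prints 11
        System.out.println(s.whitespace);  // prints 2
        System.out.println(s.letters);     // prints 8
    }
}
```

Because the total is incremented outside every if, no branch can count a character twice.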
[ November 08, 2004: Message edited by: David Harkness ]
     