
Punctuation Correct (Sentences)

 
Peter Shipway
Ranch Hand
Posts: 71
I am trying to make a program that corrects the punctuation of a paragraph; part of this is the sentences. Only the first character of a sentence is a capital; the rest are lowercase until a full stop (.) occurs. When it does, there are two spaces and a new sentence, hence a new capital letter. E.g.

Hello World. hello User.
would become
Hello world. Hello user.

I am using a StringTokenizer and building a completely new String (I tried a StringBuffer but it just doesn't have the methods I need). I am using a counter to count along the sentence, e.g. (please note I have spread the characters out so you can see how I am counting):
012345678910111213012345678910
Hello world . Hello user.

Two questions: how can I check whether the charAt position is 0 and change that one character to upper case (since I'm putting everything to lower case at the beginning of the process)? And how can I check for a . to add two spaces after it, so I can reset the counter?

Any help would be absolutely great. Also, I am having trouble understanding the API; does anyone know of a good site that has the classes and methods with examples of each method being described? Thanks all.
 
Paul Sturrock
Bartender
Posts: 10336
Eclipse IDE Hibernate Java
I wouldn't use a StringTokenizer for this, since it is normally used to tokenize by whole word, and what you are doing requires examining each character. So I'd use a character array and a StringBuffer. Just step through the character array, capitalizing the first alphabetic character after a '.', and append each character to your StringBuffer. Remember to watch for white space!
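
A minimal sketch of that character-array approach, assuming every letter that does not start a sentence should be lowercased (as in the original question). The class name SentenceCaseSketch and the method fix are made up for illustration:

```java
// Hypothetical names; a sketch of the char-array + StringBuffer idea only.
final class SentenceCaseSketch {

    // Capitalizes the first letter of the text and the first letter after
    // each '.', and lowercases every other letter.
    static String fix(String text) {
        char[] chars = text.toCharArray();
        StringBuffer out = new StringBuffer(chars.length);
        boolean capitalizeNext = true; // the very first letter starts a sentence

        for (int i = 0; i < chars.length; i++) {
            char c = chars[i];
            if (Character.isLetter(c)) {
                out.append(capitalizeNext ? Character.toUpperCase(c)
                                          : Character.toLowerCase(c));
                capitalizeNext = false;
            } else {
                if (c == '.') {
                    capitalizeNext = true; // next letter begins a new sentence
                }
                out.append(c); // white space, digits and punctuation pass through
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(fix("Hello World. hello User.")); // Hello world. Hello user.
    }
}
```

Because white space is copied through unchanged, the flag stays set across any number of spaces after the full stop. This sketch does nothing about proper nouns or abbreviations.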

However, how you stop your program stripping out correctly capitalized proper nouns I don't know...
 
Peter Shipway
Ranch Hand
Posts: 71
The problem I keep running into with that is you can't use the String methods on a char, and every time I try to convert I get errors.
 
Paul Sturrock
Bartender
Posts: 10336
Eclipse IDE Hibernate Java
*Ahem*


Why are you trying to use methods for a String object on a primitive character?
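
For a single char, the static helpers on java.lang.Character do the job without converting to a String first. A small sketch; CharMethodsSketch and capitalizeFirst are hypothetical names used only for this example:

```java
// Hypothetical names; shows char-level helpers in place of String methods.
final class CharMethodsSketch {

    // Returns the word with its first character (index 0) in upper case.
    static String capitalizeFirst(String word) {
        if (word.length() == 0) {
            return word; // nothing to capitalize
        }
        char first = word.charAt(0);
        // Character.toUpperCase works on the primitive directly;
        // char + String concatenates into a new String.
        return Character.toUpperCase(first) + word.substring(1);
    }

    public static void main(String[] args) {
        System.out.println(capitalizeFirst("hello"));      // Hello
        System.out.println(Character.isLetter('x'));       // true
        System.out.println(Character.isWhitespace(' '));   // true
    }
}
```

If a String really is needed from a char, String.valueOf(c) performs the conversion.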
 
Stan James
(instanceof Sidekick)
Ranch Hand
Posts: 8791
You could use StringTokenizer to break the paragraph into sentences rather than words. Just split on "."
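
Splitting on "." could look roughly like this. It is only a sketch: SentenceSplitSketch and sentences are invented names, and trimming each piece is an assumption about what you want to do with the leading spaces:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical names; breaks a paragraph into sentences at each '.'.
final class SentenceSplitSketch {

    static List<String> sentences(String paragraph) {
        List<String> result = new ArrayList<String>();
        // "." is the only delimiter, so spaces stay inside each token.
        StringTokenizer st = new StringTokenizer(paragraph, ".");
        while (st.hasMoreTokens()) {
            String sentence = st.nextToken().trim(); // drop surrounding white space
            if (sentence.length() > 0) {
                result.add(sentence);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sentences("Hello World. hello User."));
        // [Hello World, hello User]
    }
}
```

Note that the tokenizer throws the "." away, so it has to be appended again when the sentences are put back together.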

I'm looking back at how I did this in REXX about 20 years ago. This editor macro reformats a paragraph within margins and optionally capitalizes the first word of each sentence.

I'm looking for punctuation on the end of a word. If I find :;.!? I insert an extra space after the word. If I find .!? I capitalize the next word.
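
In Java rather than REXX, that rule might be sketched as follows. This is not the macro itself; PunctuationSpacingSketch and reflow are made-up names, and margins and justification are left out entirely:

```java
import java.util.StringTokenizer;

// Hypothetical names; applies the two punctuation rules word by word.
final class PunctuationSpacingSketch {

    static String reflow(String text) {
        StringBuffer out = new StringBuffer();
        StringTokenizer words = new StringTokenizer(text); // splits on white space
        boolean capitalizeNext = false; // only words after . ! ? are touched

        while (words.hasMoreTokens()) {
            String word = words.nextToken();
            if (capitalizeNext) {
                word = Character.toUpperCase(word.charAt(0)) + word.substring(1);
            }
            out.append(word);

            char last = word.charAt(word.length() - 1);
            capitalizeNext = ".!?".indexOf(last) >= 0;   // sentence enders
            if (words.hasMoreTokens()) {
                // extra space after : ; . ! ? and a single space otherwise
                out.append(":;.!?".indexOf(last) >= 0 ? "  " : " ");
            }
        }
        return out.toString();
    }
}
```

For example, reflow("one:   two. three; four! five") gives "one:  two.  Three;  four!  Five".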

There is a lot of other stuff going on here. We have left and right margin for the first line of a paragraph and all other lines, allowing indent, undent, etc. Rj is a boolean dealing with right justification - adding extra spaces to pad each line out to the right margin.

Boy, I hope my code has gotten a little easier to read in 20 years!
 
Stefan Wagner
Ranch Hand
Posts: 1923
Linux Postgres Database Scala
Trying to give tricky examples, i.e.:
John, George, and I went to Paris, Rome, etc. and had lunch there for 7.50 €.
John, george, and i went to paris, rome, etc. And had lunch there for 7. 50 €.

Spelling isn't regular in proper names: F.D.P. and SPD are political parties in Germany. -> "F. D. P. and spd"
[ May 26, 2004: Message edited by: Stefan Wagner ]
 
Peter Shipway
Ranch Hand
Posts: 71
OK, I have fixed some of it but am still having problems. I am trying to limit the number of spaces to one, except after a . where it is two spaces. Here is my current code:



As you can see, at the moment I am using a spaceCounter; however, it is still not limiting the spaces between words (my new favourite pastime is hitting my head against the keyboard). Any help would be great. Thanks all for the help so far, and thanks for any future help.
 
Stan James
(instanceof Sidekick)
Ranch Hand
Posts: 8791
I think you can simplify a great deal. Just deal with space-delimited words:
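
One way the space-delimited idea could be written, as a sketch rather than a definitive version. WordByWordSketch and fix are hypothetical names, and lowercasing the whole paragraph first is taken from the original question:

```java
import java.util.StringTokenizer;

// Hypothetical names; rebuilds the paragraph one word at a time.
final class WordByWordSketch {

    static String fix(String paragraph) {
        StringBuffer out = new StringBuffer();
        // Default delimiters are white space, so runs of spaces vanish here.
        StringTokenizer words = new StringTokenizer(paragraph.toLowerCase());
        boolean startOfSentence = true;

        while (words.hasMoreTokens()) {
            String word = words.nextToken();
            if (startOfSentence) {
                word = Character.toUpperCase(word.charAt(0)) + word.substring(1);
            }
            startOfSentence = word.endsWith("."); // next word opens a sentence
            out.append(word);
            if (words.hasMoreTokens()) {
                out.append(startOfSentence ? "  " : " "); // two spaces after '.'
            }
        }
        return out.toString();
    }
}
```

With this sketch, fix("hello    WORLD.   hello   User.") returns "Hello world.  Hello user." with exactly one space between words and two after the full stop. No space counter is needed because the spaces are added back explicitly.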
 
Peter Shipway
Ranch Hand
Posts: 71
But won't that still put in the normal spaces, given that the tokenizer separates words normally? Or are you saying to add an additional space after punctuation? At the moment I am trying to limit the number of spaces between words to one. Thanks for the help.
 
Tim West
Ranch Hand
Posts: 539
You'll learn a lot more by reading the Javadoc, or trying it out yourself. In this case, if you read it you'll find that with StringTokenizer (when used normally), a sequence of separator characters (spaces in your case) will be treated as one separator, so you'll only get the words themselves, with no spaces.
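
A quick way to try it out yourself; TokenizerSpacesSketch and wordCount are just illustrative names:

```java
import java.util.StringTokenizer;

// Hypothetical names; shows that runs of spaces act as one separator.
final class TokenizerSpacesSketch {

    // Counts the words a default StringTokenizer finds in the text.
    static int wordCount(String text) {
        return new StringTokenizer(text).countTokens();
    }

    public static void main(String[] args) {
        StringTokenizer st = new StringTokenizer("Hello     world.   Hello user.");
        while (st.hasMoreTokens()) {
            // Brackets make it visible that no spaces come back with the words.
            System.out.println("[" + st.nextToken() + "]");
        }
        // Prints:
        // [Hello]
        // [world.]
        // [Hello]
        // [user.]
    }
}
```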

I think Stan's given you a solid, simple approach to this problem...see what you can do with it.

-Tim
[ May 26, 2004: Message edited by: Tim West ]
 