regex bug

 
Ranch Hand
Posts: 155
Hi everyone. I would like to ask for assistance with this method.

It capitalizes the first letter of every sentence it finds, but not the first letter of the first sentence. Is something missing? Thanks.

 
Author
Posts: 12617
IntelliJ IDE Ruby
Does the first sentence start with a character in the first matching group?
 
Sheriff
Posts: 22783
Eclipse IDE Spring VI Editor Chrome Java Windows
You can get rid of using capturing groups completely. Instead, use a positive lookbehind. Include ^ as well - the start of the string.
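
As a rough illustration of this suggestion (the original method is not shown in the thread, so the class name, method name, and the exact pattern below are assumptions), the letter can be matched on its own, anchored either by `^` or by a lookbehind for the punctuation plus one space. The `^` is kept outside the lookbehind here purely to keep the pattern simple:

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SentenceCapitalizer {
    // A lowercase ASCII letter that sits either at the very start of the
    // string (^) or right after '.', '?' or '!' followed by a single space.
    private static final Pattern SENTENCE_START =
            Pattern.compile("(?:^|(?<=[.?!] ))[a-z]");

    public static String capitalize(String text) {
        Matcher m = SENTENCE_START.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // The whole match is the single letter, so no capturing groups are needed.
            m.appendReplacement(out, m.group().toUpperCase(Locale.ROOT));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // prints "Hello there. How are you? Fine! Ok."
        System.out.println(capitalize("hello there. how are you? fine! ok."));
    }
}
```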
 
mark goking
Ranch Hand
Posts: 155
I'm no good with regex stuff. Do you mean I should include ^ in my pattern string?
 
Ranch Hand
Posts: 47
Well for starters you need to change your check for the whitespace after a punctuation mark. It might not always be one space. It could also include a return or something like that. Use "([\\?!\\.]\\s*)([a-z])"

As for the first character of the first sentence, I messed around a little and couldn't find a regex that would easily get both that and the other sentences. I suggest just grabbing the first alpha character in the string, checking its case manually, and then running your regex.
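
A minimal sketch of this two-step approach, assuming a hypothetical class `TwoStepCapitalizer` (the original method is not shown in the thread): the first alphabetic character is upper-cased by hand, then the suggested pattern handles every later sentence.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TwoStepCapitalizer {
    // Group 1: punctuation plus any whitespace; group 2: the letter to upper-case.
    private static final Pattern AFTER_PUNCT =
            Pattern.compile("([\\?!\\.]\\s*)([a-z])");

    public static String capitalize(String text) {
        // Step 1: upper-case the first alphabetic character, wherever it is.
        char[] chars = text.toCharArray();
        for (int i = 0; i < chars.length; i++) {
            if (Character.isLetter(chars[i])) {
                chars[i] = Character.toUpperCase(chars[i]);
                break;
            }
        }
        // Step 2: upper-case the letter that follows each '.', '?' or '!'.
        Matcher m = AFTER_PUNCT.matcher(new String(chars));
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = m.group(1) + m.group(2).toUpperCase(Locale.ROOT);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Note that `\\s*` also matches zero whitespace characters, so text such as `example.com` would become `example.Com`; `\\s+` requires at least one whitespace character.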

 
mark goking
Ranch Hand
Posts: 155
@shane: thanks. yeah. that's the quickest hack
 
mark goking
Ranch Hand
Posts: 155
@david: yes, the first character is a letter
 
David Newton
Author
Posts: 12617
IntelliJ IDE Ruby
Not what I asked... I was giving you a clue to why the first sentence wasn't working as the others were.
 
mark goking
Ranch Hand
Posts: 155
Oops, my bad. I think I'll go see if setting group(1) to uppercase does the trick.
 
Rancher
Posts: 280
VI Editor C++ Debian
I am not entirely sure what you mean by "setting group(1) to uppercase", but what David meant by "first matching group" is the "[\\?!\\.] " portion of your RE.

Your RE, in effect, looks for a period, question mark, or exclamation mark, followed by a single space, followed by a lowercase letter. So it will not work for your first sentence, because nothing precedes that sentence: there is no punctuation mark and space before its first letter.

I am no native English speaker, and variable spaced fonts are in vogue, but I think most writing styles require two spaces after a period or question mark. To that end, the first portion of Shane Burgel's post is good advice (which you may have overlooked).
 
mark goking
Ranch Hand
Posts: 155
ok. thanks. noted. will check on this
 
David Newton
Author
Posts: 12617
IntelliJ IDE Ruby

Anand Hariharan wrote: I think most writing styles require two spaces after a period or question mark.


This is under some debate these days; I've moved away from it. There are lively discussions about it in the writing-oriented groups/lists/etc.
 
mark goking
Ranch Hand
Posts: 155
i added this code after the regex code as workaround

String.format("%s%s", Character.toUpperCase(mystr_variable.charAt(0)), mystr_variable.substring(1))

capitalizes first letter of the string he he he
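
One caveat with this workaround: `charAt(0)` throws a `StringIndexOutOfBoundsException` on an empty string. A guarded variant might look like the sketch below (the class and method names are hypothetical, not from the thread):

```java
public class FirstLetter {
    // Upper-cases the first character; null and empty strings are returned unchanged.
    public static String capitalizeFirst(String s) {
        if (s == null || s.isEmpty()) {
            return s;
        }
        return Character.toUpperCase(s.charAt(0)) + s.substring(1);
    }

    public static void main(String[] args) {
        System.out.println(capitalizeFirst("hello. World")); // prints "Hello. World"
    }
}
```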
 
mark goking
Ranch Hand
Posts: 155
i got another little problem. could not make this regex work

currently this is my regex pattern

/(\\S+)\\s?

it actually gets all words that start with slash /

however, if there is a word that starts with two slashes, it also gets that

I tried adding [^/] after the / in the pattern, but it doesn't work. The pattern should match only words that start with exactly one slash (/); the characters after it do not matter.
 
Bartender
Posts: 5167
Netbeans IDE Opera Java

mark goking wrote: i tried to add a [^/] after the / in the regex pattern but it doesnt work.


No, because the second slash isn't followed by slash.

Maybe you can get away with surrounding the slash by not-slash, or you may need to use negative lookahead/lookbehind.
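
A sketch of the lookaround variant, under the assumption that a "word" means a run of non-whitespace characters (the pattern that finally worked is not shown in the thread, and the class name is hypothetical). The lookbehind keeps the match from starting at the second slash of `//word`, and the lookahead rejects a slash that is followed by another slash:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SingleSlashWords {
    // (?<!\S)  the slash must start a word: start of input or preceded by whitespace
    // /(?!/)   one slash that is not followed by another slash
    // \S+      the rest of the word
    private static final Pattern SINGLE_SLASH =
            Pattern.compile("(?<!\\S)/(?!/)\\S+");

    public static List<String> find(String text) {
        List<String> words = new ArrayList<String>();
        Matcher m = SINGLE_SLASH.matcher(text);
        while (m.find()) {
            words.add(m.group());
        }
        return words;
    }

    public static void main(String[] args) {
        // prints [/one, /three]
        System.out.println(find("/one //two /three x/y ///four"));
    }
}
```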

More on http://www.regular-expressions.info/
 
mark goking
Ranch Hand
Posts: 155
hi darryl. that worked! thanks

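I tried a negative lookaround on my own, e.g. /(?!/), but to no avail. (That construct is actually a negative lookahead, and it still matches starting at the second slash of a double slash.)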

your regex works
 
mark goking
Ranch Hand
Posts: 155
hi all. this is more like an advice question.

The capitalize method is meant to be used with an RTFEditorKit, where the JTextPane has a KeyListener that, on every key typed, checks the whole text for letters that need to be capitalized.

I'm trying to avoid using a stream to repopulate the document's content. I thought a DocumentFilter would work, but its replace() method only sees the text one character at a time.

What I'm hoping to do is reformat the whole document's content with the capitalize method above on every key typed. Getting the document's text through doc.getText(0, doc.getLength()) seems to be the way to go, but I'm looking for a way to format the text and pass it back to the document object without using a stream.

am i on the right track using DocumentFilter?
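
One possible shape for this, offered only as an untested-in-a-GUI sketch: although a `DocumentFilter` callback receives just the text being typed, `FilterBypass.getDocument()` gives access to the whole document, so the filter can apply the edit first and then fix individual letters in place without any stream. The class name `CapitalizingFilter`, the `demo` helper, and the pattern are assumptions for illustration; existing character attributes are copied so that RTF styling is kept.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.swing.text.AttributeSet;
import javax.swing.text.BadLocationException;
import javax.swing.text.DefaultStyledDocument;
import javax.swing.text.Document;
import javax.swing.text.DocumentFilter;
import javax.swing.text.StyledDocument;

public class CapitalizingFilter extends DocumentFilter {
    // Lowercase letter at the start of the text or after [.?!] plus one whitespace char.
    private static final Pattern SENTENCE_START =
            Pattern.compile("(?:^|(?<=[.?!]\\s))[a-z]");

    @Override
    public void insertString(FilterBypass fb, int offset, String text, AttributeSet attrs)
            throws BadLocationException {
        super.insertString(fb, offset, text, attrs);
        fixCase(fb);
    }

    @Override
    public void replace(FilterBypass fb, int offset, int length, String text, AttributeSet attrs)
            throws BadLocationException {
        super.replace(fb, offset, length, text, attrs);
        fixCase(fb);
    }

    @Override
    public void remove(FilterBypass fb, int offset, int length) throws BadLocationException {
        super.remove(fb, offset, length);
        fixCase(fb);
    }

    // Re-reads the whole document and upper-cases sentence-initial letters in place.
    private static void fixCase(FilterBypass fb) throws BadLocationException {
        Document doc = fb.getDocument();
        String text = doc.getText(0, doc.getLength());
        Matcher m = SENTENCE_START.matcher(text);
        while (m.find()) {
            int pos = m.start();
            // Copy the existing character attributes so the styling survives the swap.
            AttributeSet attrs = (doc instanceof StyledDocument)
                    ? ((StyledDocument) doc).getCharacterElement(pos).getAttributes().copyAttributes()
                    : null;
            // One character replaces one character, so later offsets stay valid.
            fb.replace(pos, 1, m.group().toUpperCase(Locale.ROOT), attrs);
        }
    }

    // Non-GUI demonstration: "types" text into a filtered document and reads it back.
    public static String demo(String typed) {
        try {
            DefaultStyledDocument doc = new DefaultStyledDocument();
            doc.setDocumentFilter(new CapitalizingFilter());
            doc.insertString(0, typed, null);
            return doc.getText(0, doc.getLength());
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In a real editor the filter would be installed on the text pane's document, for example via `((AbstractDocument) textPane.getDocument()).setDocumentFilter(...)`. Scanning the whole text on every keystroke is simple but may become slow for large documents.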
 
mark goking
Ranch Hand
Posts: 155
hi daryl and to everyone. need a little help.

in my previous post, i worked on a regex that will get only words that start with / and only 1 slash

this is what i have so far




the problem is, if this is my string


it outputs


MATCH: /hehaehe
MATCH: /gox
MATCH: /ototh
MATCH: /gimme

The third one should not be part of it: the regex uses \\w+, so it should only match alphanumeric characters, right? I also wonder why it was included, and only up to /ototh, when I placed a \b at the beginning of the regex pattern to indicate that it should be a word.
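
The pattern and input string for this post are missing from the thread, so the following is only a guess at the behaviour being described, using a made-up input. Two general facts about Java regex are relevant: `\\w+` stops at the first non-word character but still reports the prefix it matched, and `\\b` before a `/` only matches when a word character comes immediately before the slash, because `/` itself is not a word character.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SlashTokenDemo {
    // Collects every match of the given regex in the text.
    static List<String> findAll(String regex, String text) {
        List<String> matches = new ArrayList<String>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "/gox /ototh!! x/y"; // hypothetical input, not from the thread

        // \w+ matches the word-character prefix of "/ototh!!": [/gox, /ototh, /y]
        System.out.println(findAll("/\\w+", text));

        // \b before '/' needs a word character just before the slash: [/y]
        System.out.println(findAll("\\b/\\w+", text));

        // Require whitespace (or the text boundary) on both sides of the token: [/gox]
        System.out.println(findAll("(?<!\\S)/\\w+(?!\\S)", text));
    }
}
```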

please let me know your thoughts. this is the code

 