
[newbie] regex anomaly

 
Jon Camilleri
Ranch Hand
Posts: 664
Eclipse IDE Chrome
I'm trying to create a program that reads source files line by line (typically .java source files that I've pasted from a book in .chm format) and removes the line numbers so that I can compile them straight away.

NOTE: Disregard my 2nd post. When I tried to edit my initial post, the option was not available, so I worked around it by using a quote.

I'm stuck at the regex part where I'm trying to match for:


1. import java.net.*;
2. import java.awt.*;
...
...



NOTE: The leading line number ("1.", "2.", ...) should be removed.

 
David Newton
Author
Posts: 12617
IntelliJ IDE Ruby
Why not just use String's replaceFirst(regex, "") method and use (any number of digits followed by a period) as the regex?

(This is really just a sed script, I'm assuming you're doing it this way to get more Java practice in.)
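
A minimal sketch of that suggestion, assuming each pasted line starts with optional whitespace, one or more digits, a period, and at most one separating space. The class name LineNumberStripper and the exact pattern are illustrative assumptions, not code from this thread:

```java
import java.util.List;

class LineNumberStripper {
    // Assumed line-number format: optional leading whitespace, digits,
    // a period, then at most one separating space. Only one space is
    // consumed so that the code's own indentation is preserved.
    private static final String LINE_NUMBER = "^\\s*\\d+\\.\\s?";

    static String strip(String line) {
        // replaceFirst() only touches the first match, which the ^ anchor
        // pins to the start of the line.
        return line.replaceFirst(LINE_NUMBER, "");
    }

    public static void main(String[] args) {
        List<String> pasted = List.of("1. import java.net.*;", "2. import java.awt.*;");
        for (String line : pasted) {
            System.out.println(strip(line));
        }
    }
}
```

Lines without a leading number pass through unchanged. One caveat: a source line that genuinely begins with digits and a period would also be trimmed, so this is only safe for text known to carry the book's numbering.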
 
Ranch Hand
Posts: 282
Eclipse IDE PHP Java
When you define a regex in Java, you have to be careful about using backslashes. When you want to use a backslash as part of your regex, you have to use two backslashes in your Java string.

So this:
should be this:
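
As a hypothetical illustration of the doubling rule (the pattern and the class name RegexEscapeDemo are assumptions, not the code from the post above): the regex \d+\. has to be typed with two backslashes for each one the regex engine should see.

```java
class RegexEscapeDemo {
    // In Java source, "\\d+\\." is the five-character string \d+\.
    // Writing "\d+\." with single backslashes does not compile, because
    // \d is not a valid escape sequence in a Java string literal.
    static final String DIGITS_THEN_PERIOD = "\\d+\\.";

    public static void main(String[] args) {
        // Shows the regex as the regex engine receives it.
        System.out.println(DIGITS_THEN_PERIOD);
        // Removes the leading "12." and leaves the rest of the string.
        System.out.println("12. text".replaceFirst(DIGITS_THEN_PERIOD, ""));
    }
}
```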
 
Jon Camilleri
Ranch Hand
Posts: 664
Eclipse IDE Chrome

David Newton wrote:Why not just use String's replaceFirst(regex, "") method and use (any number of digits followed by a period) as the regex?

(This is really just a sed script, I'm assuming you're doing it this way to get more Java practice in.)



The problem is finding the right regex pattern.
 
John de Michele
Rancher
Posts: 600

if (Pattern.matches("[\d]\..", line)) // this is incorrect; should include X+ ...
{
    // match?!
}



The '\d' is already a character class, so you don't need to put it in square brackets. You might have been thinking of this:


However, matches() most likely won't work, since it's looking for an exact match. David's suggestion of using String's replaceFirst() method is probably what you really want.

John.
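
A small sketch of why matches() tends to fail here, assuming the pattern is a digit, a period, and any one character (written with escaped backslashes). The class and method names are hypothetical:

```java
import java.util.regex.Pattern;

class MatchesVersusFind {
    // Assumed pattern: one digit, a literal period, then any single character.
    static final Pattern NUMBER_PREFIX = Pattern.compile("\\d\\..");

    // matches() succeeds only if the ENTIRE input fits the pattern.
    static boolean wholeLineMatches(String line) {
        return NUMBER_PREFIX.matcher(line).matches();
    }

    // find() succeeds if the pattern occurs anywhere in the input.
    static boolean containsMatch(String line) {
        return NUMBER_PREFIX.matcher(line).find();
    }

    public static void main(String[] args) {
        String line = "1. import java.net.*;";
        System.out.println(wholeLineMatches(line)); // false: the line is longer than three characters
        System.out.println(containsMatch(line));    // true: "1. " occurs at the start
    }
}
```

replaceFirst() and replaceAll() search the way find() does, which is why they suit this job better than Pattern.matches().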
 
Jon Camilleri
Ranch Hand
Posts: 664
Eclipse IDE Chrome

John de Michele wrote:

if (Pattern.matches("[\d]\..", line)) // this is incorrect; should include X+ ...
{
    // match?!
}



The '\d' is already a character class, so you don't need to put it in square brackets. You might have been thinking of this:


However, matches() most likely won't work, since it's looking for an exact match. David's suggestion of using String's replaceFirst() method is probably what you really want.

John.



It's a good idea actually, thanks. Now I'm getting an error indicating that something is wrong with the escape sequence.



 
David Newton
Author
Posts: 12617
IntelliJ IDE Ruby
Backslashes ("\") in Java Strings are meaningful, and must be escaped.
 
Jon Camilleri
Ranch Hand
Posts: 664
Eclipse IDE Chrome

David Newton wrote:Backslashes ("\") in Java Strings are meaningful, and must be escaped.


Thanks, that's it.
 
Jon Camilleri
Ranch Hand
Posts: 664
Eclipse IDE Chrome

I'm wondering why String.replaceAll is not removing the line numbers.


Output

1. //line numbers are not removed??
import
java.net.*;
2.
import
java.awt.*;
3.
import
java.awt.event.*;
4.
import
java.io.*;
5.
import
java.util.*;
6.
import
javax.naming.*;
7.
import
javax.naming.directory.*;
8.
import
javax.swing.*;
9.
10.
/**


[HENRY: Deleted tons of output that is probably not relevant]
 
Henry Wong
author
Posts: 23951
142
jQuery Eclipse IDE Firefox Browser VI Editor C++ Chrome Java Linux Windows


The pattern is one or more digits, followed by a period, followed by another character (any character). And since you are using a Scanner to extract tokens separated by whitespace, there aren't any characters after the period.

Henry
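
To make that concrete, here is a sketch that assumes the pattern is \d+\.. (reconstructed from the description above, not taken from the posted code) and compares applying it to each whitespace-separated token with applying it to each whole line. The class and method names are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class TokenVersusLine {
    // Assumed pattern: one or more digits, a period, then any one character.
    static final String PATTERN = "\\d+\\..";

    // Scanner.next() splits on whitespace, so the token "1." has no
    // character after the period and the pattern cannot match it.
    static List<String> stripPerToken(String text) {
        List<String> out = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNext()) {
                out.add(scanner.next().replaceAll(PATTERN, ""));
            }
        }
        return out;
    }

    // Scanner.nextLine() keeps the space after the period, so "1. " matches.
    static List<String> stripPerLine(String text) {
        List<String> out = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNextLine()) {
                out.add(scanner.nextLine().replaceAll(PATTERN, ""));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "1. import java.net.*;\n2. import java.awt.*;";
        System.out.println(stripPerToken(text)); // numbers survive as separate tokens
        System.out.println(stripPerLine(text));  // numbers are removed
    }
}
```

Reading with nextLine() also keeps each statement on one line, instead of the one-token-per-line output shown earlier.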
 
Jon Camilleri
Ranch Hand
Posts: 664
Eclipse IDE Chrome

Henry Wong wrote:

The pattern is one or more digits, followed by a period, followed by another character (any character). And since you are using a Scanner to extract tokens separated by whitespace, there aren't any characters after the period.

Henry


Thanks

oops
 