regular expression question

 
Viv Singh
Ranch Hand
Posts: 73
Hi,

I would like to extract some information from a text. I guess I will have to do that using regular expressions in Java.

Example:

OS info
*******
UOS = Windows Vista 32-bit Service Pack 1
Admin=NO

From this text I would like to extract the information that the operating system is Windows Vista 32-bit Service Pack 1 and that the user was not the admin, and store it in two variables, say String os = "Windows Vista 32-bit Service Pack 1" and String admin = "NO".

How could I do that?

Thanks in advance.
 
Henry Wong
author
Posts: 23951
I know that this is unlikely to be a homework problem, but JavaRanch is still a learning site... so, what have you tried so far? And what issues are you having?

Henry
 
Viv Singh
Ranch Hand
Posts: 73
For example to extract the Operating system, I have tried the following:





I want it to return the whole string: "Windows Vista 32-bit Service Pack 1". It is also possible that the input contains something like "Windows Vista 32-bit Service Pack 1 abcdef", but even then I still just want "Windows Vista 32-bit Service Pack 1".
 
Henry Wong
author
Posts: 23951
In your regex, you are only extracting the first group of word characters. So, it will only get the first word, as the space won't match.
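
To illustrate the point above, here is a minimal sketch. The original code was not preserved in this thread, so the pattern below is only an assumption about what it might have looked like (the class and method names are hypothetical too). It shows how a capture group made of \w stops at the first space:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordOnlyMatch {
    // Assumed pattern, NOT the poster's actual code: \w matches only
    // letters, digits and underscore, so the group ends at the first space.
    static final Pattern WORD_ONLY = Pattern.compile("UOS =\\s*(\\w+)");

    // Returns the captured group, or null if the pattern is not found.
    static String firstWord(String text) {
        Matcher m = WORD_ONLY.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String text = "OS info\n*******\nUOS = Windows Vista 32-bit Service Pack 1\nAdmin=NO\n";
        System.out.println(firstWord(text)); // prints "Windows"
    }
}
```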

Viv Singh wrote: It is also possible that the input contains something like "Windows Vista 32-bit Service Pack 1 abcdef", but even then I still just want "Windows Vista 32-bit Service Pack 1".



Well, to do this part, you'll need to have a mechanism to define what is a valid OS name. Your program doesn't magically know what names are valid, and what names are not. Where is this validity data coming from?

Henry
 
Viv Singh
Ranch Hand
Posts: 73
Isn't there any way to read till the end of the line?
The problem is that I do not know the exact data. There could be many variations, like Windows 2000 Service Pack 1, Windows 2000 Service Pack 2, Windows XP Service Pack 1, Windows 98 ...
 
Henry Wong
author
Posts: 23951
Viv Singh wrote: Isn't there any way to read till the end of the line?
The problem is that I do not know the exact data. There could be many variations, like Windows 2000 Service Pack 1, Windows 2000 Service Pack 2, Windows XP Service Pack 1, Windows 98 ...



Sure, you can change your match criteria to everything except carriage return and line feed, and it will match to the end of the line. Or you can read the input a line at a time and then match everything, which runs to the end of the line.
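
The second option, reading a line at a time, could be sketched roughly as follows. This is only an illustration: the class name, the method name and the choice of java.util.Scanner are assumptions, not something taken from the original code.

```java
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LineByLine {
    // Each line is matched on its own, so ".*" simply takes
    // everything that is left on that line.
    static final Pattern OS_LINE = Pattern.compile("UOS\\s*=\\s*(.*)");

    // Returns the value of the first "UOS = ..." line, or null if there is none.
    static String findOs(String text) {
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNextLine()) {
                Matcher m = OS_LINE.matcher(scanner.nextLine());
                if (m.matches()) {
                    return m.group(1).trim();
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String text = "OS info\n*******\nUOS = Windows Vista 32-bit Service Pack 1\nAdmin=NO\n";
        System.out.println(findOs(text));
    }
}
```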

However, in your example...

Windows Vista 32-bit Service Pack 1 abcdef



It wasn't separated by an EOL -- in this case, how do you know if abcdef isn't part of the OS name?

Henry
 
Viv Singh
Ranch Hand
Posts: 73
Henry Wong wrote:
Sure, you can change your match criteria to everything but the carriage return / line feed, and it will match to the end of line. Or you can read it a line at a time, then match everything, which is to the end of line.



How can I match everything but the carriage return?

Henry Wong wrote:
Windows Vista 32-bit Service Pack 1 abcdef

It wasn't separated by an EOL -- in this case, how do you know if abcdef isn't part of the OS name?



This is a problem; I will have to think of a solution for it.
 
Rob Prime
Sheriff
Posts: 22783
If I see that file format, I'm thinking of java.util.Properties to do the hard work for me.
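
A minimal sketch of that idea, assuming the whole input really is in the key=value shape shown in the first post (the class and method names are hypothetical). Keep in mind that Properties treats backslashes in values as escape characters and does not strip trailing whitespace from values, so it may not suit every input:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class OsInfoProperties {
    // Parses the text with java.util.Properties; "UOS = value" and
    // "Admin=NO" are both valid key/value lines in that format.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            // Not expected when reading from an in-memory string.
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        String text = "OS info\n*******\nUOS = Windows Vista 32-bit Service Pack 1\nAdmin=NO\n";
        Properties props = parse(text);
        String os = props.getProperty("UOS");
        String admin = props.getProperty("Admin");
        System.out.println(os);
        System.out.println(admin);
    }
}
```

The header lines ("OS info" and the row of asterisks) are also read as entries, but they are harmless as long as only the known keys are looked up.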
 
Henry Wong
author
Posts: 23951
Viv Singh wrote: How can I match everything but the carriage return?




In regex, to match any character except those in a given set, you use a negated character class such as [^abc], which matches any single character other than a, b, or c.

So, to match anything but CR and LF, you use [^\\r\\n] (written here as it appears inside a Java string literal).

If you use this in your original pattern instead of \\w, as in "UOS =\\s*([^\\r\\n]*)", it should capture everything up to the EOL.
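
Put together, that might look like the sketch below. The OS pattern is the one given above; the Admin pattern, the class name and the helper method are assumptions added for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ToEndOfLine {
    // Pattern from the post: capture everything up to the next CR or LF.
    static final Pattern OS = Pattern.compile("UOS =\\s*([^\\r\\n]*)");
    // Assumed analogue for the Admin line (not part of the original post).
    static final Pattern ADMIN = Pattern.compile("Admin\\s*=\\s*([^\\r\\n]*)");

    // Returns the trimmed first capture group, or null if nothing matches.
    static String extract(Pattern pattern, String text) {
        Matcher m = pattern.matcher(text);
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) {
        String text = "OS info\r\n*******\r\nUOS = Windows Vista 32-bit Service Pack 1\r\nAdmin=NO\r\n";
        String os = extract(OS, text);
        String admin = extract(ADMIN, text);
        System.out.println(os);
        System.out.println(admin);
    }
}
```

One thing to watch: \s also matches line breaks, so if the value after "UOS =" were empty, \\s* would skip to the next line and capture that instead. Using [ \\t]* in place of \\s* avoids this.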

Henry
 
Henry Wong
author
Posts: 23951
Rob Prime wrote: If I see that file format, I'm thinking of java.util.Properties to do the hard work for me.




Based on the format, I agree. But I am guessing that not everything is being shown here...

Henry
 