sed regular expression

Pat Denton
Greenhorn

Joined: Oct 26, 2005
Posts: 17
I have a value aaa.c11234.001.xml and I need to use the sed command to pull out c11234 and 001 as individual variables. So far, I am getting the 001 with the following command in my script:

var1=`echo aaa.c11234.001.xml | sed -e 's/^.*\.\(.*\)\..*$/\1/'`

But I am having issues with getting the c11234 part. I can't seem to nail down the regular expression to do so. Any help would be greatly appreciated.
Ernest Friedman-Hill
author and iconoclast
Marshal

Joined: Jul 08, 2003
Posts: 24187

Well, here is one option (deliberately similar to yours):

var2=`echo aaa.c11234.001.xml | sed -e 's/[^.]*\\.\([^.]*\)\\..*$/\1/'`

The regexp is (with group 1 parenthesized) "Any number of not-a-dot characters, followed by a dot, (followed by any number of not-a-dot characters), followed by a dot, followed by anything."
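As a rough sketch of how the same not-a-dot pattern can be extended to pull out both fields: the variable name `name` and the use of `$(...)` are illustrative assumptions, not part of the reply above. Inside `$(...)` a single backslash before the dot is enough, whereas inside backticks the shell reduces `\\` to `\` before sed sees it.

```shell
# Sample value from the question.
name=aaa.c11234.001.xml

# Second dot-separated field: skip one not-a-dot run and its dot,
# capture the next not-a-dot run, then discard the rest.
var2=$(printf '%s\n' "$name" | sed -e 's/^[^.]*\.\([^.]*\)\..*$/\1/')

# Third field: skip two not-a-dot runs before capturing.
var1=$(printf '%s\n' "$name" | sed -e 's/^[^.]*\.[^.]*\.\([^.]*\)\..*$/\1/')

echo "$var2 $var1"
```

Anchoring with `^` keeps the match pinned to the start of the value, so each expression always counts fields from the left.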


Pat Denton
Greenhorn

Joined: Oct 26, 2005
Posts: 17
Thanks, Ernest. That did the trick. And it helped me with some others I had to create as well. The best part was the worded explanation of the regex. That made the difference in my understanding of it.
[ November 09, 2005: Message edited by: Pat Denton ]
Harald Kirsch
Ranch Hand

Joined: Oct 14, 2005
Posts: 37
Why use sed?

Assuming you use (ba)sh, try this:


For the details of this percent and hash business see:
http://www.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02_06_02
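The snippet from this post did not survive, so what follows is only a minimal sketch of the kind of percent-and-hash parameter expansion the linked section describes, not necessarily the code originally posted. The variable names `name` and `rest` are assumptions made for illustration.

```shell
# Sample value from the question.
name=aaa.c11234.001.xml

# "#*." strips the shortest leading match of "*.", i.e. the first field and its dot.
rest=${name#*.}      # c11234.001.xml

# "%%.*" strips the longest trailing match of ".*", i.e. everything from the first dot on.
var2=${rest%%.*}     # c11234

# Repeat the same two steps to reach the next field.
rest=${rest#*.}      # 001.xml
var1=${rest%%.*}     # 001

echo "$var2 $var1"
```

No external process is started here; the shell does all the trimming itself.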
[ November 09, 2005: Message edited by: Harald Kirsch ]

Harald.
Stefan Wagner
Ranch Hand

Joined: Jun 02, 2003
Posts: 1923

Regular expressions are often unreadable, write-only code.
You can write them, but just try to read them!

It helps a little that you can combine multiple sed commands with semicolons: "sed 's1;s2'".
So you could snip away the text 'aaa.c' and '.xml' with those two commands:

Replacing `foo` with $(foo) makes nesting easier. Backticks are discouraged in bash for that reason.
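The two commands from this post are missing, so here is a hedged sketch of what two semicolon-separated substitutions inside `$(...)` might look like. The variable names `name` and `trimmed` and the exact patterns are assumptions, not the original code.

```shell
# Sample value from the question.
name=aaa.c11234.001.xml

# First substitution removes the leading "aaa.c",
# second removes the trailing ".xml"; both run in one sed call.
trimmed=$(printf '%s\n' "$name" | sed -e 's/^aaa\.c//;s/\.xml$//')

echo "$trimmed"
```

With the sample value this leaves 11234.001, which still has to be split at the remaining dot to get two separate variables.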


http://home.arcor.de/hirnstrom/bewerbung
Ernest Friedman-Hill
author and iconoclast
Marshal

Joined: Jul 08, 2003
Posts: 24187

Feh. Regular expressions can be ugly, so comments certainly help. One thing in their favor is that if the shell script gets turned into a Perl, Ruby, or Python program, or recoded in Java, the regexp can go along for the ride; the other techniques would require a total rewrite.
 