Extracting only Matched string in linux

rajesh thiru

Joined: May 24, 2010
Posts: 2
I'm new to Linux and need help from the gurus. Is it possible to extract only the matched string from each line? For example:


I need a way to extract output like this:


Any help or suggestions would be appreciated.


Kees Jan Koster
JavaMonitor Support

Joined: Mar 31, 2009
Posts: 251
Dear Rajesh,

Looks like the input is XML. You can use XSLT to get data out in the format you want.

Kees Jan

Java-monitor, JVM monitoring made easy <- right here on Java Ranch
Tim Holloway
Saloon Keeper

Joined: Jun 25, 2001
Posts: 17417

Welcome to the JavaRanch, Rajesh.

There are several ways to do it. If the XML is neatly formatted and the data is one row per line, you can use one of the regular-expression-based utilities such as sed, perl, awk or python to do the work. I did this, in fact, just yesterday. If you don't know how to use regular expressions, they're one of the most valuable things you can learn in a Linux/Unix environment.
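
As a rough illustration of the regular-expression route, here is a minimal Python sketch. The sample lines and the tag name `name` are assumptions made up for this example, since the original input isn't shown in the thread; adjust the pattern to whatever your real data looks like.

```python
import re

# Hypothetical sample input: the data from the original post is not shown,
# so these lines and the tag name "name" are assumptions for illustration.
lines = [
    "<row><id>1</id><name>alpha</name></row>",
    "<row><id>2</id><name>beta</name></row>",
    "a line without the tag",
]

# Capture only the text between <name> and </name>. The non-greedy ".*?"
# stops at the first closing tag on the line.
pattern = re.compile(r"<name>(.*?)</name>")


def extract_matches(lines):
    """Return the captured group from every line that matches the pattern."""
    matches = []
    for line in lines:
        found = pattern.search(line)
        if found:
            matches.append(found.group(1))
    return matches


print(extract_matches(lines))  # ['alpha', 'beta']
```

Lines that don't match are simply skipped. If you'd rather stay on the command line, GNU grep's `-o` option prints only the matched part of each line, which may be enough for simple cases.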

Another alternative is to use XSL, which actually processes and parses the XML itself. Many Linux systems come with an "xsltproc" utility program that can be used. XSL code is more readable than regexes, although for me, it requires a lot of work.

A third alternative is to use an XML parsing package. There are XML parsers for Perl, Python, Java, C and more. Java in particular has quite a few different ways to parse XML, from the simple SAX processor up to things like DOM, StAX, the Apache Digester, JAXB, and so forth.
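
For the parser route, a minimal sketch using Python's standard `xml.etree.ElementTree` module might look like the following. Again, the document structure and the element name `name` are hypothetical, chosen only to make the example runnable.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the element names are assumptions for illustration.
# Note that the first row is split across lines, which a line-based regex
# approach would have to handle specially.
xml_text = """<rows>
  <row><id>1</id>
    <name>alpha</name></row>
  <row><id>2</id><name>beta</name></row>
</rows>"""


def extract_names(text):
    """Parse the XML and return the text of every <name> element."""
    root = ET.fromstring(text)
    return [element.text for element in root.iter("name")]


print(extract_names(xml_text))  # ['alpha', 'beta']
```

Because the parser works on the document structure rather than on individual lines, it gives the same result however the XML happens to be wrapped.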

An IDE is no substitute for an Intelligent Developer.
Stefan Wagner
Ranch Hand

Joined: Jun 02, 2003
Posts: 1923

With Scala, it's quite easy:

XSL solutions are to be preferred, because they're agnostic to linefeeds in the string. However, for generated files that just happen not to contain linefeeds, sed is much faster, though it has a slightly different quoting policy, which looks like this:

(The parentheses are escaped here.)