Extracting only the matched string in Linux

 
rajesh thiru
Greenhorn
I'm new to Linux and need help from the gurus. Is it possible to extract only the matched strings from each line? For example, given this input:

<message><v></v><v></v><v></v><v>26.00000</v><v>-27.00000</v></message>
<message><v></v><v></v><v></v><v>26.00000</v><v>-27.00000</v></message>
<message><v></v><v></v><v></v><v>26.00000</v><v>-27.00000</v></message>


I need a way to extract the values like this:

26.00000,-27.00000
26.00000,-27.00000
26.00000,-27.00000


Any help or suggestions would be appreciated.


Regards,
Rajesh

 
Kees Jan Koster
JavaMonitor Support
Rancher
Dear Rajesh,

Looks like the input is XML. You can use XSLT to get data out in the format you want.

Kees Jan
 
Tim Holloway
Saloon Keeper
Welcome to the JavaRanch, Rajesh.

There are several ways to do it. If the XML is neatly formatted and the data is one row per line, you can use one of the regular-expression-based utilities such as sed, perl, awk or python to do the work. I did this, in fact, just yesterday. If you don't know how to use regular expressions, they're one of the most valuable things you can learn in a Linux/Unix environment.
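As a rough illustration of the regular-expression route, here is a minimal Python sketch. It assumes every record sits on a single line, exactly as in Rajesh's sample, and that the wanted values are the non-empty `<v>` elements. The names `V_VALUE` and `extract_values` are made up for this example.

```python
import re

# Matches a <v> element with non-empty text and captures that text.
# Empty elements such as <v></v> do not match because [^<]+ needs
# at least one character.
V_VALUE = re.compile(r"<v>([^<]+)</v>")

def extract_values(line: str) -> str:
    """Return the non-empty <v> values found on one line, comma-separated."""
    return ",".join(V_VALUE.findall(line))

sample = "<message><v></v><v></v><v></v><v>26.00000</v><v>-27.00000</v></message>"
print(extract_values(sample))  # 26.00000,-27.00000
```

To process a whole file, the same function could be applied to each line read from standard input. This breaks as soon as a record spans more than one line, which is the weakness of the regex approach.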

Another alternative is to use XSL, which actually processes and parses the XML itself. Many Linux systems come with an "xsltproc" utility program that can be used. XSL code is more readable than regexes, although for me, it requires a lot of work.

A third alternative is to use an XML parsing package. There are XML parsers for Perl, Python, Java, C and more. Java in particular has quite a few different ways to parse XML, from the simple SAX processor up to things like DOM, StAX, the Apache Digester, JAXB, and so forth.
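For comparison, a parser-based version might look like the sketch below, using the `xml.etree.ElementTree` module from Python's standard library. It assumes each `<message>` element is available as its own string. The function name `values_from_message` is hypothetical.

```python
import xml.etree.ElementTree as ET

def values_from_message(xml_text: str) -> str:
    """Parse one <message> element and join its non-empty <v> texts."""
    message = ET.fromstring(xml_text)
    values = []
    for v in message.iter("v"):
        # .text is None for an empty element such as <v></v>, so skip those.
        if v.text and v.text.strip():
            values.append(v.text.strip())
    return ",".join(values)

sample = "<message><v></v><v></v><v></v><v>26.00000</v><v>-27.00000</v></message>"
print(values_from_message(sample))  # 26.00000,-27.00000
```

Because a real parser is doing the work, linefeeds or extra whitespace between the elements make no difference to the result.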

 
Stefan Wagner
Ranch Hand
With Scala, it's quite easy:

XSL solutions are to be preferred because they're agnostic to linefeeds in the input. However, for generated files that just happen not to contain linefeeds within a record, sed is much faster, though it has a slightly different quoting policy, which looks like this:

(The parentheses are escaped here.)
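The sed command itself is not shown above, so the following is only a Python analogue of a sed-style substitution, not a reconstruction of it. It assumes the layout from the original question: any number of empty `<v>` elements followed by exactly two values. In sed's basic regular expressions, capture groups are written with escaped parentheses, `\(` and `\)`, whereas Python's `re` module uses them unescaped. The names `PATTERN` and `rewrite` are invented for this sketch.

```python
import re

# Skip any leading empty <v></v> elements, then capture the two values.
PATTERN = re.compile(
    r"<message>(?:<v></v>)*<v>([^<]*)</v><v>([^<]*)</v></message>"
)

def rewrite(line: str) -> str:
    """Replace a matching <message> record with 'value1,value2'."""
    # \1 and \2 refer to the captured values, much like sed's replacement text.
    return PATTERN.sub(r"\1,\2", line)

sample = "<message><v></v><v></v><v></v><v>26.00000</v><v>-27.00000</v></message>"
print(rewrite(sample))  # 26.00000,-27.00000
```

Lines that do not match the pattern are returned unchanged, which is also how a plain sed substitution behaves.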

 