Need help for Shell Script to reduce a file

 
Ranch Hand
Posts: 104
Hi

I am new to UNIX and UNIX shell scripting. I have a log file on a UNIX server. The file is large, approximately 50,000 lines. I want to write a UNIX shell script that removes the duplicate data and produces a smaller file (maybe 100 to 200 lines). I really don't know how to write that script. It would be great if someone could help me with a sample script. Please let me know if you need any other information. Thank you.

 
author
Posts: 5856
7
Android Eclipse IDE Ubuntu
Could you show us some example log lines (both duplicate lines and unique lines)?

Do the lines have timestamps in them? If so, then filtering will be much more difficult.

I'm sure someone with Lisp expertise could write a one-liner (with lots of parentheses) to do this, but if I were to do it I would have to use Python, PHP or some other higher-level scripting language; I wouldn't even want to think about how to do it in bash.
 
Sam Saha
Ranch Hand
Posts: 104
Yes, I have timestamps in the log file.

Here is an example of duplicate lines from the file:

 
Peter Johnson
author
Posts: 5856
Are you using Log4J to write these log entries? If so, this can be solved by setting "additivity" to false for your logger.
 
Sam Saha
Ranch Hand
Posts: 104
I did not write the application, so I do not know whether the people who wrote it are using Log4J. I only have the log files, and I have to reduce them. As I am very new to UNIX and UNIX scripting, I am wondering whether a UNIX shell script can remove the duplicate records from the log and make it smaller. If that is possible, could you please send me some sample code showing how to do it? I would really appreciate that.
 
Peter Johnson
author
Posts: 5856
I would first try to find the log4j configuration file and edit it to remove the duplicates.

I think that a general algorithm for printing only unique log entries would be:



The "read next line from log" step is a little complicated because you have to read the entire log entry, which appears to span multiple physical lines based on what you displayed. The algorithm assumes that duplicate entries will be adjacent. I would be comfortable tackling this in Java, or perhaps in Python or PHP. Someone might be able to do this in a line or two of Lisp. I wouldn't even try this in bash (though I'm sure it could be done).
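
For what it's worth, here is a rough Python sketch of that idea: group physical lines into entries, ignore the timestamp when comparing, and print an entry only if it differs from the one just before it. The timestamp pattern is purely an assumption, since the actual log lines were not shown in this thread, and the names `TIMESTAMP`, `read_entries`, `entry_key` and `unique_adjacent` are made up for illustration. Treat it as a starting point, not a finished tool.

```python
import re
from typing import Iterable, Iterator, List

# ASSUMPTION: each log entry starts with a line like "2011-03-01 12:34:56,789 ...".
# The real format is not shown in the thread, so adjust this pattern to the actual log.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}(?:[.,]\d+)?\s*")


def read_entries(lines: Iterable[str]) -> Iterator[List[str]]:
    """Group physical lines into entries; a timestamped line starts a new entry."""
    entry: List[str] = []
    for line in lines:
        line = line.rstrip("\n")
        if TIMESTAMP.match(line) and entry:
            yield entry
            entry = []
        entry.append(line)
    if entry:
        yield entry


def entry_key(entry: List[str]) -> str:
    """Entry text with the leading timestamp removed, used only for comparison."""
    first = TIMESTAMP.sub("", entry[0], count=1)
    return "\n".join([first] + entry[1:])


def unique_adjacent(lines: Iterable[str]) -> Iterator[str]:
    """Yield each entry unless it repeats the entry immediately before it."""
    previous = None
    for entry in read_entries(lines):
        key = entry_key(entry)
        if key != previous:
            yield "\n".join(entry)
        previous = key
```

To filter a real file you could pass an open file object, for example `for text in unique_adjacent(open("app.log")): print(text)`, where `app.log` is a placeholder name. Like the algorithm above, this only drops duplicates that are adjacent; catching duplicates scattered through the file would mean remembering every key seen so far (a `set`), at the cost of more memory.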
 