
Read log file from date to date as one object

 
kc pradeep
Greenhorn
Posts: 29
Chrome Eclipse IDE
Hi All,


I have a huge log file and I want to extract the lines that fall within a particular time frame.


Note: Each log entry starts with a date/time stamp in a fixed format, but not every line in the file has one. When an error occurs, the exception details are written on the lines that follow the timestamped line. I want to extract all the data for a particular day or range of days, including the exception details.

The file is in the format below:




03-29-11:05:25:04 [SAPEngine_Application_Thread[impl:3]_7] ERROR com.company.uii.core.WidgetRefImpl - ##### exception !!!
VEN-PRICING-1237: Expression did not evaluate to true or false [file D:\PMMPRD\builds\web\app\WEB-INF\meta\LineItemsWidget.jsp, line 26].
at java.security.AccessController.doPrivileged(Native Method)
at com.sap.engine.core.thread.impl3.SingleThread.execute(SingleThread.java:102)
at com.sap.engine.core.thread.impl3.SingleThread.run(SingleThread.java:172)

04-01-11:00:08:32 [SAPEngine_Application_Thread[impl:3]_20] INFO com.vendavo.uii.controller.PageController - User [usb08025] disconnected.


In the above case, how do I extract the data from 03-29-11:05:25:04 to 04-01-11:00:08:32?


Please suggest what I should use to extract this data. Should I use pattern matching / regular expressions to read and match the lines? Thanks.
 
Campbell Ritchie
Sheriff
Posts: 48381
Read the entire file into a database and use SELECT . . . WHERE xxx.DATE BETWEEN . . .

???
 
Ulf Dittmer
Rancher
Pie
Posts: 42967
If this was my problem I might create a Lucene index, since Lucene makes it relatively easy to query for date/time ranges. It also wouldn't matter how many log files there were, or where they are located - each new one would be added to the index, and would then be available for searches. It seems a better "fit" to the problem than storing the data in a DB (which would work just fine, as Campbell said). You could search the actual contents that way, too, which I would imagine might be a handy feature.
 
kc pradeep
Greenhorn
Posts: 29
Chrome Eclipse IDE
If I have to load all the data into a DB, I first have to split each log entry into its components:

timestamp, thread, priority and message, which is the format used to write the log file.

But my problem is that the message part may span any number of lines, so I need to read until the next timestamp starts.

To check the next line I have to read it, which moves the read pointer past it.

So when I come back to this function for the next entry, I will have missed one line.

Please suggest how to go about this.

I have no idea about Lucene indexes; I need to check.

I am new to the Java language.
 
Matt Cartwright
Ranch Hand
Posts: 152
Linux VI Editor
It looks like you have two indicators for a new 'record':

1 - the date stamp at the beginning of a line

2 - the empty line that, in your example, separates individual messages

So, to keep it simple (maybe not the most efficient), you could read the file line by line and
check for one of the indicators.

For 1, you could pre-compile a regular expression before reading the file and then
match each line against it.

For 2, just check for 'line.trim().length() == 0', your empty line.
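
The two checks above could be sketched in Java roughly as follows. This is only an illustration: the class name `EntryStartDetector` and its method names are hypothetical, and the regular expression assumes the `MM-dd-yy:HH:mm:ss` layout seen in the sample lines.

```java
import java.util.regex.Pattern;

final class EntryStartDetector {
    // Compiled once, before the file is read.
    // Assumption: a stamp looks like 03-29-11:05:25:04 and is followed by whitespace.
    private static final Pattern ENTRY_START =
            Pattern.compile("^\\d{2}-\\d{2}-\\d{2}:\\d{2}:\\d{2}:\\d{2}\\s");

    // Indicator 1: the line begins with a date stamp.
    static boolean startsNewEntry(String line) {
        return ENTRY_START.matcher(line).find();
    }

    // Indicator 2: the line is empty or contains only whitespace.
    static boolean isBlank(String line) {
        return line.trim().length() == 0;
    }

    public static void main(String[] args) {
        System.out.println(startsNewEntry("03-29-11:05:25:04 [thread] ERROR some.Class - message"));
        System.out.println(startsNewEntry("at java.security.AccessController.doPrivileged(Native Method)"));
        System.out.println(isBlank("   "));
    }
}
```

Of the two, the date stamp is probably the safer indicator, since a message is not guaranteed to be followed by an empty line.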

hope that helps
Matt
 
kc pradeep
Greenhorn
Posts: 29
Chrome Eclipse IDE
Matt,

That is what I am also thinking...

Can you help me with this problem:

I would be calling a function which returns an object for one log entry, say LogEntry, containing the following components: timestamp, thread, priority and the message part of the log.

If I read line by line, then to get one log entry I have to read from one timestamp to the next. In doing so I read the first line of the next log entry, so the file reader is left pointing at the second line of that entry, and I would miss its first line.

Correct me if I am wrong.
 
Matt Cartwright
Ranch Hand
Posts: 152
Linux VI Editor
I was thinking of coming up with some pseudo code, but wrote it down in Java.

The pre-compiled pattern and regular expression:


The method itself:


By processing the previous entry and then starting a new one whenever a timestamped line is read, we do not lose anything.
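
A minimal sketch of that idea, for readers following along. It is not the code originally posted: the class name `LogRangeExtractor`, the method `extract`, the helper `keepIfInRange` and the `MM-dd-yy:HH:mm:ss` timestamp layout (inferred from the sample lines) are all assumptions. It also filters entries by a date/time range, which is what the original question asked for.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LogRangeExtractor {
    // Assumption: every entry starts with a stamp like 03-29-11:05:25:04.
    private static final Pattern STAMP =
            Pattern.compile("^(\\d{2}-\\d{2}-\\d{2}:\\d{2}:\\d{2}:\\d{2})\\s");
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("MM-dd-yy:HH:mm:ss");

    // Returns the full text of every entry whose stamp lies between from and to (inclusive).
    static List<String> extract(BufferedReader reader, LocalDateTime from, LocalDateTime to)
            throws IOException {
        List<String> result = new ArrayList<>();
        StringBuilder entry = null;      // text of the entry being collected
        LocalDateTime entryTime = null;  // timestamp of that entry
        String line;
        while ((line = reader.readLine()) != null) {
            Matcher m = STAMP.matcher(line);
            if (m.find()) {
                // A new entry begins: finish the previous one first, then start
                // the new one with this very line, so no line is lost.
                keepIfInRange(result, entry, entryTime, from, to);
                entry = new StringBuilder(line);
                // Throws DateTimeParseException if the digits are not a valid date.
                entryTime = LocalDateTime.parse(m.group(1), FORMAT);
            } else if (entry != null && !line.trim().isEmpty()) {
                // Continuation line, for example a stack trace frame.
                entry.append('\n').append(line);
            }
        }
        keepIfInRange(result, entry, entryTime, from, to); // the last entry in the file
        return result;
    }

    private static void keepIfInRange(List<String> result, StringBuilder entry,
            LocalDateTime time, LocalDateTime from, LocalDateTime to) {
        if (entry != null && !time.isBefore(from) && !time.isAfter(to)) {
            result.add(entry.toString());
        }
    }

    public static void main(String[] args) throws IOException {
        String log = "03-29-11:05:25:04 [t1] ERROR a.B - failed\n"
                + "at a.B.run(B.java:1)\n"
                + "\n"
                + "04-01-11:00:08:32 [t2] INFO a.C - done\n";
        List<String> hits = extract(new BufferedReader(new StringReader(log)),
                LocalDateTime.of(2011, 3, 29, 0, 0),
                LocalDateTime.of(2011, 3, 31, 23, 59, 59));
        hits.forEach(System.out::println);
    }
}
```

For a real log file the reader could come from `Files.newBufferedReader(path)` instead of a `StringReader`. Because the file is read one line at a time, only the matching entries are held in memory; for a very large range it may be better to write each entry out as it is found rather than collect a list.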

You would need a method 'processLogEntryText(final String text)' that parses the individual log entry.
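
One possible shape for such a method, again only as a sketch. The `LogEntry` fields follow the components named earlier in the thread (timestamp, thread, priority, message); the class `LogEntryParser`, the `LogEntry` return type and the regular expression are assumptions based on the two sample entries.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical value class holding the four components of one log entry.
final class LogEntry {
    final String timestamp;
    final String thread;
    final String priority;
    final String message;

    LogEntry(String timestamp, String thread, String priority, String message) {
        this.timestamp = timestamp;
        this.thread = thread;
        this.priority = priority;
        this.message = message;
    }
}

final class LogEntryParser {
    // Assumed layout: <stamp> [<thread>] <PRIORITY> <message, possibly multi-line>
    // DOTALL lets the message group run across line breaks (stack traces).
    private static final Pattern ENTRY = Pattern.compile(
            "^(\\d{2}-\\d{2}-\\d{2}:\\d{2}:\\d{2}:\\d{2}) \\[(.+?)\\] ([A-Z]+) (.*)$",
            Pattern.DOTALL);

    static LogEntry processLogEntryText(final String text) {
        Matcher m = ENTRY.matcher(text);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a log entry: " + text);
        }
        return new LogEntry(m.group(1), m.group(2), m.group(3), m.group(4));
    }

    public static void main(String[] args) {
        LogEntry e = processLogEntryText(
                "04-01-11:00:08:32 [SAPEngine_Application_Thread[impl:3]_20] INFO "
                + "com.vendavo.uii.controller.PageController - User [usb08025] disconnected.");
        System.out.println(e.timestamp + " | " + e.thread + " | " + e.priority);
    }
}
```

The thread group is matched reluctantly (`.+?`) so that a thread name containing brackets of its own, such as `SAPEngine_Application_Thread[impl:3]_20`, is still captured whole: the match only ends at a `]` that is followed by a space and an upper-case priority.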

Have fun
Matt
 
kc pradeep
Greenhorn
Posts: 29
Chrome Eclipse IDE
Thanks Matt...

I will try it and let you know... Right now watching the cricket world cup final.....
 
Matt Cartwright
Ranch Hand
Posts: 152
Linux VI Editor
you definitely got your priorities right
 
kc pradeep
Greenhorn
Posts: 29
Chrome Eclipse IDE
India is playing the final...and I am a proud Indian fan
 
kc pradeep
Greenhorn
Posts: 29
Chrome Eclipse IDE
Matt,

Thanks a lot... I tried the code and it is working perfectly... Just as I wanted.

 