Win a copy of Mesos in Action this week in the Cloud/Virtualizaton forum!
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic

Automate Stripping of a record

 
Muhammad Imad Qureshi
Ranch Hand
Posts: 238
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I am new to Unix scripting and here is what I am looking for. We are sometime required to strip records from our input data file because the job fails due to erroneous record. We strip by record number. So lets take a typical file. A header record, data and then a trailer record (that contains the total count of file and looks like this: TR5009P6160480001001000023911 (this is a real example).

"TR50" is what all trailer records start with "09P6160480" is what we call a reference number for the job. "001001000023911" is the total number of records in the file.

Now Lets say I want to delete record number "4301" then I would first make a backup copy of the file and call it "filename.backup". Then I would open the original file using "vim". Then I do "set number" and then type "4301 G" to go that record. Then I copy the record to a different file to send it to business so they can look at it and then strip the record out and reduce the count of trailer record by one. In case above the new trailer record would be "TR5009P6160480001001000023910". I save the file and restart the failed jobs.

Can someone help me write a script to automate this. How can I automate the process of copying a record and email it to business before stripping it.

Thanks
Imad
 
Gerardo Tasistro
Ranch Hand
Posts: 362
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
How do you know which row needs to be deleted?
 
Muhammad Imad Qureshi
Ranch Hand
Posts: 238
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
When the job fails it gives the record number it failed on. So We know what record to strip. We have been successfully doing that for over a year now. I just want to automate this process.
 
Gerardo Tasistro
Ranch Hand
Posts: 362
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Note: I'm asuming your file is called exactly like the trailing record. In this example TR5009P6160480001001000023911 means the file, NOT the trailing record.

I believe awk should provide a solution

This will provide the record which faulted




This will provide the file witout the faulty row



Then you'll just need to create a script that creates the new names

As tip...

This will tell you the number of rows the file has



you can run it before and after awk to obtain the two numbers 23911 and 23910

Then use sed to replace 23911 with 23910 using regular expressions. You'll need to be carefull about bufferin 0s when for example you have 10000 rows turning into 9999 rows.
 
Andrew Monkhouse
author and jackaroo
Marshal Commander
Pie
Posts: 11881
196
C++ Firefox Browser IntelliJ IDE Java Mac Oracle
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
You also said you wanted to forward the failed record. Probably something like:

You probably want to automate this a little further, which would probably require automatically finding the record that failed.

Assuming you have a report along the lines of:

You could extract the record number with something like:

And then use that variable with the other statements ...

etc.
 
Stefan Wagner
Ranch Hand
Posts: 1923
Linux Postgres Database Scala
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Is TR5009P6160480001001000023910 the first line of the file, and the total number of records always running from column 15 to 25?

1 001 000 023 910 is a pretty big number for records. Is that the number of records in a single file?

I don't know how sed handles files with more than billions of records/lines. And the arithmetic fails with leading zeros.

First, 0100100002... is interpreted as an octal number, because of the leading 0, but 3910 contains a 9 which is too big for an octal number.

Second, if your number is 10000000000000 and you reduce it by one, you won't get 099..., but 99..

 
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic