
Cutting file_id from file name by looping through folders (UNIX)

 
prince davies
Ranch Hand
Posts: 74


My files are on a UNIX server. I want to extract the file_id and file_name from each file name and save them in a CSV file. How do I do that?

I have folders in a UNIX environment. The directory structure is as follows:

year folder -> 12 month folders inside it -> 30/31 day folders inside each month. Here is what ls shows as I go down into the year 2012:

2009 2010 2011 2012
$ cd 2012
$ ls
01 02 03 04 05 06 07 08 09
$ cd 09
$ ls
01 02 03 04 05 06 07 08 09 10 11 12 13
$ cd 13
$ ls

There is a folder for each year: 2009, 2010, 2011 and 2012.
Each year folder has 12 folders, one per month: 01, 02, 03, 04, 05, 06, 07, 08, 09, 10, 11, 12.
Each month folder has up to 31 folders, one per day: 01, 02, 03, etc... 29, 30, 31.


Each day folder contains files.

The file names look like this:
sasmm_fsbc_durds_id00020532_t20100313192606.dat.trnsfr.gz
sasmm_fsbc_durds_id00020513_t20120913003312.dat.trnsfr.gz

I want to cut 20532 into one column and put the whole file name in a second column: sasmm_fsbc_durds_id00020532_t20100313192606.dat

The CSV file will look like this:

file_id file_name
20532 sasmm_fsbc_durds_id00020532_t20100313192606.dat
20513 sasmm_fsbc_durds_id00020513_t20120913003312.dat



The file_id has to be cut from the file name. If you look at the file names closely, you can see the file_id right after the 000; in the examples above, the file_ids are 20532 and 20513.


How do I loop through the year 2012 folder, its 12 month folders and the 31 day folders inside them, and create a CSV file that has the data shown above?

I am very new to UNIX, so please help me out. If you can provide code, that would be great. Thanks.
 
prince davies
Ranch Hand
Posts: 74
Loop through the folders, read each file name and cut out the file_id string.
Create a new CSV file.
Save the cut value in the file_id column of the CSV file and the file name in the file_name column.


First file
located in 2010 -> 03 -> 13
sasmm_fsbc_durds_id00020532_t20100313192606.dat

Second file
located in 2012 -> 09 -> 13
sasmm_fsbc_durds_id00020513_t20120913003312.dat

OUTPUT CSV file

file_id | file_name
20532 | sasmm_fsbc_durds_id00020532_t20100313192606.dat
20513 | sasmm_fsbc_durds_id00020513_t20120913003312.dat


 
Tim Holloway
Saloon Keeper
Pie
Posts: 17620
39
Android Eclipse IDE Linux
You can use the "find" command to iterate through the tree and get the raw filename or filename/path of each file.

To parse and format the paths, you can use any of a number of popular text-formatting tools, including sed, awk, perl or python.
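As a rough illustration of the find-plus-sed route, here is a minimal sketch, not a tested production script. The function name make_csv and the output file name are hypothetical, and it assumes every file of interest is named like the examples in the question (..._id<digits>_t<digits>.dat.trnsfr.gz), that the CSV should be comma-separated, and that the .trnsfr.gz suffix should be dropped as in the sample output. Other files are skipped.

```shell
#!/bin/sh
# make_csv ROOT OUT  (hypothetical helper, not from the thread)
# Walks ROOT (e.g. the 2012 folder) with find, then uses sed to turn each
# matching path into "file_id,file_name" and writes the result to OUT.
make_csv() {
    root=$1
    out=$2

    # Header row; the comma separator is an assumption.
    echo 'file_id,file_name' > "$out"

    # find prints one path per line for every regular file under ROOT.
    # The sed expression:
    #   ^.*/                  drops the directory part of the path
    #   \( ... \)  (group 1)  keeps the file name up to and including ".dat"
    #   0*\([0-9][0-9]*\)     skips leading zeros, keeps the id as group 2
    # and prints "id,name"; lines that do not match are not printed (-n ... p).
    find "$root" -type f -name '*_id[0-9]*_t[0-9]*.dat.trnsfr.gz' |
        sed -n 's|^.*/\([^/]*_id0*\([0-9][0-9]*\)_t[0-9]*\.dat\)\.trnsfr\.gz$|\2,\1|p' |
        sort >> "$out"
}

# Run only when called with two arguments: sh make_csv.sh 2012 file_ids.csv
if [ "$#" -eq 2 ]; then
    make_csv "$1" "$2"
fi
```

Run from the folder that holds the year folders, something like `sh make_csv.sh 2012 file_ids.csv` would cover all month and day folders of 2012 in one pass. The sort step is only there to make the row order predictable; drop it if the order find produces is fine.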

Or (this being the JavaRanch), you can simply write a Java app that uses the java.io.File class and the java.util.regex package!
 
Anand Hariharan
Rancher
Posts: 272
C++ Debian VI Editor
Did you answer your own question?

Let me know if this works or if it needs tweaking (assumes a reasonably modern shell):



NB: NOT TESTED.

Hope this helps,
- Anand

[Edit: Changed search and replace expressions]
 