arrays - How to parse a specific NetBackup bpimagelist format data file, with records separated by an empty line and each line holding a data label and its data


Here is a snippet of the NBU bpimagelist data output format. Each record is separated by a blank (empty) line, and each line contains a variable-length data label, followed by a colon, a variable number of spaces, and then variable-length data content. A single record can be of variable length; it is not a fixed number of lines.

I'd like to convert the file to a comma-separated format to import into Excel in order to analyze it. I'm able to extract the data labels without a problem.

    client:            <hostname>
    backup id:         <hostname>_1396674012
    policy:            m-portwarew2k03-prod-clf
    policy type:       ms-windows (13)
    proxy client:      (none specified)
    creator:           root
    name1:             (none specified)
    sched label:       monthly_full
    schedule type:     full (0)
    retention level:   5 weeks (4)
    backup time:       sat apr  5 01:00:12 2014 (1396674012)
    elapsed time:      2448 second(s)
    expiration time:   sat apr  3 01:00:12 2021 (1617426012)
    compressed:        no
    client encrypted:  no
    kilobytes:         37997291
    number of files:   240819
    number of copies:  1
    number of fragments:   1
    histogram:         0 0 0 0 0 0 0 0 0 0
    db compressed:     no
    files file name:   m-portwarew2k03-prod-clf_1396674012_full.f
    ...many more lines of data labels and data per record.

I'd like the data in CSV format, like this...

    client,backup id,policy,policy type,proxy client,creator,...more labels
    <hostname>,<hostname>_id#,m-portwarew2k03-prod-clf,ms-windows (13),(none specified),root,(none specified),monthly_full,full (0),5 weeks (4),sat apr  5 01:00:12 2014 (1396674012),...more...

    # write the output headers from the first record in the file - a single record runs from blank line to blank line
    # take the first record, pull out the first column of data, and output it as a single comma-delimited line
    header=`sed '/^\s*$/q' $inputfile | cut -d: -f1 | tr '\n' ','`
    echo -e $header > $outfile

    # repeat the above on all lines in the file to pull the data (2nd column, after the :) instead, and output it comma-delimited
    # "cut -d: -f2-" removes the first column of data left of the colon delimiter, and
    # "tr -d ' ' " removes the leading white space between the colon and the start of the data, and
    # "tr '\n' ',' " or "paste -d, -s" replaces the newlines with commas between data.

Now, how do I add a trailing newline between records?

    sed '/^\s*$/d' $inputfile | cut -d: -f2- | tr -d ' ' | tr '\n' ',' >> $outfile

So I need to reformat the data lines, showing only the data to the right of the colon delimiter (removing the intervening spaces between the delimiter and the start of the data), removing the line feeds between each line (as in the source) and replacing them with commas, until a whole data record has been output. When the next blank line is reached in the source, the output should advance to a new line, and the process should repeat until the end of the data.
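One way to get that behaviour in a single pass is to let awk buffer each record and flush it when a blank line arrives. The following is only a sketch under stated assumptions, not a tested NetBackup tool: `bpimagelist_to_csv` is a hypothetical function name, the header row is taken from the labels of the first record only, and each line is split at its first colon so that timestamps such as `01:00:12` stay intact. Unlike `tr -d ' '`, it trims only the leading and trailing blanks, so spaces inside values survive.

```shell
# Hypothetical helper, not part of NetBackup: turns blank-line-separated
# "label: value" records into one CSV row per record.
bpimagelist_to_csv() {
    awk '
    # Emit the buffered record (and, once, the labels of the first record).
    function flush_record() {
        if (n == 0) return
        if (!header_done) { print labels; header_done = 1 }
        print values
        labels = ""; values = ""; n = 0
    }
    # Strip leading and trailing blanks, keep inner spacing untouched.
    function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
    # Quote a field only when it contains a comma or a double quote.
    function csv(s) {
        if (s ~ /[",]/) { gsub(/"/, "\"\"", s); s = "\"" s "\"" }
        return s
    }
    /^[ \t]*$/ { flush_record(); next }   # a blank line closes the record
    {
        pos = index($0, ":")              # split at the FIRST colon only
        if (pos == 0) next                # ignore lines without a label
        sep = (n++ ? "," : "")
        labels = labels sep csv(trim(substr($0, 1, pos - 1)))
        values = values sep csv(trim(substr($0, pos + 1)))
    }
    END { flush_record() }                # the last record may lack a blank line
    ' "$@"
}

# Example call (file names are placeholders):
#   bpimagelist_to_csv "$inputfile" > "$outfile"
```

Because awk streams the input line by line, memory use stays at one record regardless of file size. Note that records with more fragments will produce longer rows than the header taken from the first record, so the columns past the image-level fields will not line up between records.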

    client:            <hostname>
    backup id:         <hostname>_1349499621
    policy:            m-portwarew2k03-prod-clf
    policy type:       ms-windows (13)
    proxy client:      (none specified)
    creator:           root
    name1:             (none specified)
    sched label:       monthly_full
    schedule type:     full (0)
    retention level:   7 years (14)
    backup time:       sat oct  6 01:00:21 2012 (1349499621)
    elapsed time:      3457 second(s)
    expiration time:   sat oct  5 01:00:21 2019 (1570251621)
    compressed:        no
    client encrypted:  no
    kilobytes:         37090868
    number of files:   215304
    number of copies:  1
    number of fragments:   6
    histogram:         0 0 0 0 0 0 0 0 0 0
    db compressed:     no
    files file name:   m-portwarew2k03-prod-clf_1349499621_full.f
    previous backup files file name:   (none specified)
    parent backup image file name:   (none specified)
    sw version:        (none specified)
    options:           0x0
    mpx:               1
    tir info:          0
    tir expiration:    wed dec 31 19:00:00 1969 (0)
    keyword:           (none specified)
    ext security info: no
    file restore raw:  no
    image dump level:  0
    file system only:  no
    object descriptor: (none specified)
    previous bi time:  wed dec 31 19:00:00 1969 (0)
    bi full time:      wed dec 31 19:00:00 1969 (0)
    request pid:       0
    backup status:     0
    stream number:     0
    backup copy:       standard (0)
    files file size:     0
    pfi type:     0
    image_attribute:     0
    primary copy:      1
    image type:        0  (regular)
    job id:            2123444
    num resumes:       0
    resume expiration: wed dec 31 19:00:00 1969 (0)
    data classification:    (none specified)
    data_classification_id: (none specified)
    storage lifecycle policy:    (none specified)
    storage lifecycle policy version:    0
    stl_completed:      0
    remote expiration time: wed dec 31 19:00:00 1969 (0)
    origin master server:  (none specified)
    origin master guid:    (none specified)
    snap time:      wed dec 31 19:00:00 1969 (0)
    ir enabled:      no
    client character set:     0
    image on hold:     0
    indexing status:   0
    copy number:       1
     fragment:         1
     kilobytes:        0
     remainder:        0
     media type:       media manager (2)
     density:          hcart3 (20)
     file num:         8
     id:               k14753
     host:             <some_other_host>
     block size:       262144
     offset:           1220388
     media date:       fri oct  5 19:00:10 2012 (1349478010)
     dev written on:   2
     flags:            0x40  (tape encrypted)
     media descriptor:        ?
     expiration time:  sat oct  5 01:00:21 2019 (1570251621)
     mpx:              1
     retention_lvl:    7 years (14)
     try keep time:  wed dec 31 19:00:00 1969 (0)
     copy creation time:  sat oct  6 01:57:58 2012 (1349503078)
     data format:      undefined
     checkpoint:       0
     resume num:       0
     key tag:                  41f841dd750ef07e68cc5387629bb22d21933ca3a4ea204a01abbee2ba98cd44
     stl tag:          *null*
     copy on hold:     0
    copy number:       1
     fragment:         2
     kilobytes:        6423296
     remainder:        0
     media type:       media manager (2)
     density:          hcart3 (20)
     file num:         9
     id:               k14753
     host:             amarlp67
     block size:       262144
     offset:           1235772
     media date:       fri oct  5 19:00:10 2012 (1349478010)
     dev written on:   2
     flags:            0x40  (tape encrypted)
     media descriptor:        ?
     checkpoint:       0
     resume num:       0
     copy on hold:     0
    copy number:       1
     fragment:         3
     kilobytes:        3038464
     remainder:        0
     media type:       media manager (2)
     density:          hcart3 (20)
     file num:         10
     id:               k14753
     host:             amarlp67
     block size:       262144
     offset:           1538917
     media date:       fri oct  5 19:00:10 2012 (1349478010)
     dev written on:   2
     flags:            0x40  (tape encrypted)
     media descriptor:        ?
     checkpoint:       0

Etcetera, until a blank line. Every record has a variable number of fragments.

I'm open to any methodology to solve this, though the simplest and most elegant code would be the most efficient. Keep in mind that the source data is millions of rows long.

This code will generate CSV output from the bpimagelist command:

    echo "client_name, date1, date2, version, backupid, policy_name, client_type, proxy_client, creator, sched_label, sched_type, retention, backup_time, elapsed, expiration, compression, encryption, kbytes, num_files, copies, num_fragments, files_compressed, files_file, version, name1, options, primary, image_type, tir_info, tir_expiration, keywords, mpx, ext_security, raw, dump_lvl, fs_only, prev_bitime, bifull_time, obj_desc, requestid, backup_stat, backup_copy, prev_image, jobid, num_resumes, resume_expr, ff_size, pfi_type, image_attrib, ss_classification_id, ss_name, ss_completed, snap_time, slp_version[, remoteexpiration, origin_master_server, origin_master_guid, ir_enabled, client_charset, hold, indexing_status"
    # bpimagelist -l prints one line per image, prefixed with IMAGE; keep those lines and turn the spaces into commas
    bpimagelist -l | grep '^IMAGE' | sed -e 's/^IMAGE //' | tr ' ' ','
