Confronted with a heap of semicolon-separated text files which had to be merged and cleaned of unrelated lines and columns, I tried my luck in Excel and spent a lot of time doing it manually, but finally got fed up.
So I decided to use AWK on the task.
A for loop walks through the files in the folder and pipes each one into AWK.
AWK selects the non-empty observations and adds the name of the file as a classifier to the beginning of the line (the result is a repeated-measures dataset).
This is the code:
for CSV in `ls`
do
    cat $CSV | awk -F ";" '{
        if ($2 ~ /[0-9]+/) {print CSV , FS , $0;}}'
done
Remark: the -F ";" option sets the field separator, i.e. how AWK splits each line (record) of the file into columns (fields). The default is a single space, which makes AWK split on any run of spaces or tabs.
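To see the effect of -F in isolation, here is a minimal sketch. The helper name second_field and the sample record are made up for illustration and are not part of the original script.

```shell
# Print the second field of one record, splitting on ";".
# second_field is a hypothetical helper used only for this demo.
second_field() {
    printf '%s\n' "$1" | awk -F ";" '{ print $2 }'
}

second_field "plot1;42;oak"   # prints 42
```

Without -F ";" the whole sample record would be a single field, because it contains no spaces or tabs.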
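BUT: the shell variable CSV is not passed to AWK by default. Inside the single-quoted AWK program, CSV is just a new, empty AWK variable, so the file name never shows up in the output. The value has to be fed into AWK explicitly.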
Solution:
The
-v CATEGORY="$CSV"
option copies the value of the shell variable CSV into the AWK variable CATEGORY.
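The difference is easy to check on the command line. The following is only a small sketch: the file name data.csv and the two helper functions are assumptions made for the demo, and the brackets are there just to make an empty value visible.

```shell
CSV="data.csv"   # hypothetical file name, used only for this demo

# Without -v: CSV inside the quotes is an unset AWK variable (empty string).
without_v() { awk 'BEGIN { print "[" CSV "]" }'; }

# With -v: the shell expands "$CSV" and AWK receives the value as CATEGORY.
with_v() { awk -v CATEGORY="$CSV" 'BEGIN { print "[" CATEGORY "]" }'; }

without_v   # prints []
with_v      # prints [data.csv]
```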
This gives:
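for CSV in `ls`
do
    # -v copies the shell variable CSV into the AWK variable CATEGORY
    cat "$CSV" | awk -F ";" -v CATEGORY="$CSV" '{
        # keep only records whose second field contains a digit
        if ($2 ~ /[0-9]+/) {print CATEGORY , FS , $0;}}'
done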
... and it works.
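As a possible alternative, AWK can read the files itself and supply the name through its built-in FILENAME variable, which removes the need for the loop and for -v. This is a sketch under assumptions, not the script used above: the *.csv pattern, the helper name merge_csv and the two sample files are invented for the demo.

```shell
# Create two small sample files in a temporary folder (demo data only).
dir=$(mktemp -d)
printf 'id;value\nplot1;42\n\n' > "$dir/a.csv"
printf 'id;value\nplot2;7\n' > "$dir/b.csv"

# merge_csv is a hypothetical helper: it prints every record whose second
# field contains a digit, prefixed with the name of the file it came from.
merge_csv() {
    # FILENAME is AWK's built-in name of the file currently being read
    awk -F ";" '$2 ~ /[0-9]+/ { print FILENAME, FS, $0 }' "$@"
}

(cd "$dir" && merge_csv *.csv)
# prints:
# a.csv ; plot1;42
# b.csv ; plot2;7
```

FILENAME is only filled in when the files are given to AWK as arguments. With cat piping the data in, AWK reads standard input and has no file name to report, which is why the loop version needs -v.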
Hat tip: fpmurphy
Your use of the -v option made my script work like a charm. Thanks a lot.