Monday, March 22, 2010

Split a file into multiple files depending on a key value

Suppose you have a log file that collects records from several sources,
and you want to separate those records by a key value contained in each
record (the key might identify the source); records with equal keys
should go into the same file.
The program can, of course, also be used for files other than log files.

A nice task for a one-liner!

# splitlog.sh - split logfile into multiple logfiles
#
# The logfile named in the argument is split into a set of logfiles, one
# for each distinct key value found in the first column; each output file
# is named <logfile>-<key>.  If no argument is specified, stdin is read
# and the output files are named stdin-<key>.
#
# Usage: splitlog.sh [ logfile ]
#
# Janis Papanagnou, 2002-11-22

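# The expansion is quoted so that file names containing blanks survive;
# the output expression is parenthesized because an unparenthesized
# concatenation after ">" is not parsed the same way by every awk.
awk -v logfile="${1:-stdin}" '{ print > (logfile "-" $1) }' "$1"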

: '--- An example to illustrate...
A file "logfile" with keys A, B, C, and containing the lines, e.g.:
        A 489257 8957 38tgzg75ßhg g5hg 5gh27hg 75gh 5hg    0
        C 8 c83h5g 85gh 5hg5hg h 8h8gh t2h gtj2            1
        B 459 wef2 eruhg uiregn euignutibngtnb ioj         2
        B 489257 8957 38tgzg75ßhg g5hg 5gh27hg 75gh 5hg    3
        A 459 wef2 eruhg uiregn euignutibngtnb ioj         4

will be split into three files "logfile-A" (only lines with key A):
        A 489257 8957 38tgzg75ßhg g5hg 5gh27hg 75gh 5hg    0
        A 459 wef2 eruhg uiregn euignutibngtnb ioj         4
"logfile-B" (only lines with key B):
        B 459 wef2 eruhg uiregn euignutibngtnb ioj         2
        B 489257 8957 38tgzg75ßhg g5hg 5gh27hg 75gh 5hg    3
and "logfile-C" (only lines with key C):
        C 8 c83h5g 85gh 5hg5hg h 8h8gh t2h gtj2            1
----'
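
To try the one-liner without touching real logs, a small self-contained run along the following lines should do. It is only a sketch: the temporary directory and the five sample records are made up for illustration, and the awk call is written out inline (with the output name parenthesized) instead of calling splitlog.sh.

```shell
# Work in a scratch directory so no existing files are overwritten.
cd "$(mktemp -d)" || exit 1

# A small logfile with the key in the first column (made-up records).
printf '%s\n' 'A first' 'C second' 'B third' 'B fourth' 'A fifth' > logfile

# Same idea as splitlog.sh: each record goes to the file named after its key.
awk -v logfile=logfile '{ print > (logfile "-" $1) }' logfile

# One output file per key; within a file the records keep their input order.
ls logfile-*
cat logfile-B
```

The listing should show logfile-A, logfile-B and logfile-C, and logfile-B should hold the two B records in the order they appeared in the input.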
      
