Question
Suppose we have multiple .log files in a directory on a production Unix machine (SunOS). For example:
ls -tlr
total 0
-rw-r--r-- 1 21922 21922 0 Sep 10 13:15 file2017-01.log
-rw-r--r-- 1 21922 21922 0 Sep 10 13:15 file2016-02.log
-rw-r--r-- 1 21922 21922 0 Sep 10 13:15 todo2015-01.log
-rw-r--r-- 1 21922 21922 0 Sep 10 13:15 fix20150223.log
The goal is to use nawk to extract specific information from the logs (i.e. parse them) and "transform" them into .csv files, so that they can be loaded into Oracle tables afterwards. The nawk script has been tested and works like a charm, but how could I write a bash script that automates the following:
1) Take a list of given files in this path
2) Run nawk on each one (to extract the specific data/info from the log file)
3) Write the output for each file to its own .csv in another directory
4) Remove the .log files from this path
What concerns me is that the loadstamp/timestamp at the end of each file name is different. I have implemented a script that works only for the latest date (e.g. last month), but I want to load all the historical data, and I am a bit stuck.
To visualize, my desired/target output is this:
bash-4.4$ ls -tlr
total 0
-rw-r--r-- 1 21922 21922 0 Sep 10 13:15 file2017-01.csv
-rw-r--r-- 1 21922 21922 0 Sep 10 13:15 file2016-02.csv
-rw-r--r-- 1 21922 21922 0 Sep 10 13:15 todo2015-01.csv
-rw-r--r-- 1 21922 21922 0 Sep 10 13:15 fix20150223.csv
How could such a bash script be written? The load will only happen once, since the data is historical, as mentioned. Any help would be extremely useful.
Answer 1:
An implementation written for readability rather than terseness might look like:
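#!/usr/bin/env bash
for infile in *.log; do
    # Same name, with the .log suffix replaced by .csv
    outfile=${infile%.log}.csv
    if awk -f yourscript <"$infile" >"$outfile"; then
        # Conversion succeeded, so the original log can go
        rm -f -- "$infile"
    else
        echo "Processing of $infile failed" >&2
        # Do not leave a partial .csv behind
        rm -f -- "$outfile"
    fi
done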
To understand how this works, see:
- Globbing -- the mechanism by which *.log is replaced with a list of files with that extension.
- The Classic for Loop -- the for infile in syntax, used to iterate over the results of the glob above.
- Parameter expansion -- the ${infile%.log} syntax, used to expand the contents of the infile variable with any .log suffix pruned.
- Redirection -- the syntax used in <"$infile" and >"$outfile", opening stdin and stdout attached to the named files; or >&2, redirecting logs to stderr. (Thus, when we run awk, its stdin is connected to a .log file, and its stdout is connected to a .csv file.)
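The question also asks for the .csv files to be written to another directory, which the loop above does not do. A minimal sketch of one way to adapt it follows. It rests on assumptions: the function name convert_logs, its three arguments and the AWK_BIN variable are hypothetical names introduced here for illustration, not part of the original answer. It also skips the unexpanded *.log pattern that the glob leaves behind when nothing matches.

```shell
#!/usr/bin/env bash
# Sketch only: convert_logs, its arguments and AWK_BIN are hypothetical names.
# Usage: convert_logs SRC_DIR OUT_DIR AWK_SCRIPT
convert_logs() {
    local srcdir=$1 outdir=$2 script=$3
    local infile name outfile

    # Create the target directory if it does not exist yet.
    mkdir -p "$outdir" || return 1

    for infile in "$srcdir"/*.log; do
        # With no matches the glob stays literal; skip that non-existent name.
        [ -e "$infile" ] || continue

        name=${infile##*/}                 # strip the leading directory
        outfile=$outdir/${name%.log}.csv   # swap the .log suffix for .csv

        # AWK_BIN lets the caller pick nawk; it defaults to plain awk.
        if "${AWK_BIN:-awk}" -f "$script" <"$infile" >"$outfile"; then
            rm -f -- "$infile"             # converted: remove the log
        else
            echo "Processing of $infile failed" >&2
            rm -f -- "$outfile"            # do not leave a partial .csv
        fi
    done
}
```

With these assumptions, a call such as AWK_BIN=nawk convert_logs /path/to/logs /path/to/csv yourscript (both paths being placeholders) would process every .log file regardless of the date in its name, since the output name is derived from the input name rather than from a fixed timestamp.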
Source: https://stackoverflow.com/questions/46141216/how-can-i-iterate-over-log-files-process-them-through-awk-and-replace-with-ou