Consider the following simplified example:
my_prog | awk '...' > output.csv &
my_pid="$!"  # Gives the PID of awk instead of my_prog
sleep 10
kill "$my_pid"
Here is a solution without wrappers or temporary files. It only works for a background pipeline whose output is redirected away from the stdout of the containing script, as in your case, because the pipeline's stdout is borrowed to carry the PID. Suppose you want to do:
cmd1 | cmd2 | cmd3 >pipe_out &
# do something with PID of cmd2
If only bash could provide ${PIPEPID[n]}!! The replacement "hack" that I found is the following:
PID=$( { cmd1 | { cmd2 0<&4 & echo $! >&3 ; } 4<&0 | cmd3 >pipe_out & } 3>&1 | head -1 )
If needed, you can also close fd 3 (for each cmd*) and fd 4 (for cmd2) with 3>&- and 4<&-, respectively. If you do that, make sure that for cmd2 you close fd 4 only after you redirect fd 0 from it: redirections are processed left to right, so write cmd2 0<&4 4<&-.
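To make the hack concrete, here is a rough, runnable sketch with the extra descriptors closed. The stand-ins are only assumptions for illustration: a slow loop plays cmd1, tr plays cmd2, cat plays cmd3, and pipe_out is a temporary file from mktemp. The 4<&0 on the inner group saves the pipe coming from cmd1, because bash gives a backgrounded command /dev/null as stdin when job control is off; fd 3 carries the PID out to head.

```shell
#!/usr/bin/env bash
# Stand-ins (assumptions for illustration only):
#   cmd1 = a slow loop, cmd2 = tr, cmd3 = cat
pipe_out=$(mktemp)

PID=$( {
        { for i in 1 2; do echo "line $i"; sleep 1; done; } 3>&- |
        { tr a-z A-Z 0<&4 4<&- 3>&- & echo $! >&3 ; } 4<&0 |
        cat >"$pipe_out" 3>&- &
      } 3>&1 | head -1 )

echo "PID of cmd2: $PID"

# Signal 0 sends nothing; it only checks that the process exists.
if kill -0 "$PID" 2>/dev/null; then alive=yes; else alive=no; fi
echo "cmd2 still running: $alive"

# cmd2 is not a direct child of this shell, so `wait` cannot be used on it;
# for this sketch we simply sleep until the pipeline has drained.
sleep 3
cat "$pipe_out"
```

Note the order 0<&4 4<&- on the tr line: fd 0 is duplicated from fd 4 first, and only then is fd 4 closed.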