How can I use xargs to run a function in a command substitution for each match?

Question

While writing Bash functions for string replacements I have encountered a strange behaviour when using xargs. This is actually driving me mad currently as I cannot get it to work. Fortunately I have been able to nail it down to the following simple example:

Define a simple function which doubles every character of the given parameter:

function subs { echo $1 | sed -E "s/(.)/\1\1/g"; }

Call the function:

echo $(subs "ABC")

As expected the output is:

AABBCC

Now call the function using xargs:

echo "ABC" | xargs -I % echo $(subs "%")

Surprisingly the result now is:

ABCABC

It seems as if the sed command inside the function treats the whole string now as a single character. Why does this happen and how can it be prevented?

You might ask why I use xargs at all. This is a simplified example; the actual use case is much more complex.

In the original use case, I have a program which produces lots of output. I pipe the output through several greps to get the lines of interest. Afterwards, I pipe the lines to sed to extract the data I need from them. Because some of the transformations I need are too complex for regular expressions alone, I'd like to use a function for these. My original idea was to simply pipe into the function, but I couldn't get that to work and ended up with the xargs solution. The original idea looked something like this:

command | grep ... | grep ... | grep ... | sed ... | subs

BTW: I do not do this from the command line but from within a script. The function is defined in the very same script in which it is used.

I'm using Bash 3.2 (Mac OS X default), so fancy Bash 4.x stuff won't help me, sorry.

I'll be happy about everything which might shed some light on this topic.

Best regards

Frank

Answer 1:

If you really need to do this (and you probably don't, but we can't help without a more representative sample), a better-practice approach might look like:

subs() { sed -E "s/(.)/\1\1/g" <<<"$1"; }
export -f subs

echo "ABC" | xargs bash -c 'for arg; do subs "$arg"; done' _

Wrapping the call as echo "$(subs "$arg")" instead of calling subs "$arg" directly adds nothing but bugs. Consider what happens if one of your arguments is -n, and that is assuming a relatively tame echo: implementations are allowed to consume backslashes even without a -e argument and to do all manner of other surprising things. You could add the wrapper to the code above, but it slows your program down and makes it more prone to surprising behaviors; there's no point.
Running export -f subs exports your function to the environment, so it can be run by other instances of bash invoked as child processes. All programs invoked by xargs run outside your shell, so they can't see shell-local variables or functions.
Without -I (which is to say, in its default mode of operation), xargs appends arguments to the end of the command it's given. This permits a much more efficient usage mode: instead of invoking one command per line of input, it passes as many arguments as possible to the smallest possible number of subprocesses.
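To make the difference between the two modes concrete, here is a minimal sketch. It assumes bash is on PATH and that xargs supports -I (GNU and BSD xargs both do); the sample words and the variable names batched and per_line are made up for illustration.

```shell
# Default mode: all three input lines are appended to ONE bash invocation.
batched=$(printf '%s\n' one two three | xargs bash -c 'echo "$# argument(s)"' _)
printf '%s\n' "$batched"      # 3 argument(s)

# With -I, xargs starts one bash process per input line.
# The placeholder {} is passed as a separate argument here, not spliced into the code.
per_line=$(printf '%s\n' one two three | xargs -I{} bash -c 'echo "$# argument(s)"' _ {})
printf '%s\n' "$per_line"     # prints "1 argument(s)" three times
```

The trailing _ fills $0 of the inline script, so the real arguments start at $1.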

Passing input as trailing arguments also avoids major security bugs that can happen when using xargs -I in conjunction with bash -c '...' or sh -c '...'. If you ever use -I% sh -c '...%...', your filenames become part of your code and can be used in injection attacks on your system.
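The injection risk can be reproduced with a harmless payload. This is only a sketch under the same assumptions (bash on PATH, an xargs with -I); the payload string and the variable names are hypothetical. The safe variant keeps -I so that both commands see the same input line, but hands the placeholder to bash as a separate argument instead of splicing it into the code string.

```shell
# A hostile "filename" containing a command substitution (harmless here).
payload='$(echo INJECTED)'

# Unsafe: -I splices the input into the code string, so bash executes it.
unsafe=$(printf '%s\n' "$payload" | xargs -I{} bash -c 'echo got: {}')
printf '%s\n' "$unsafe"       # got: INJECTED

# Safer: the input arrives as positional parameter $1 and stays plain data.
safe=$(printf '%s\n' "$payload" | xargs -I{} bash -c 'printf "got: %s\n" "$1"' _ {})
printf '%s\n' "$safe"         # got: $(echo INJECTED)
```

The placeholder is {} rather than % because xargs replaces the placeholder in every argument, and a % placeholder would also rewrite the %s inside the printf format.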

Answer 2:

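That's because the construct $(subs "%") is expanded by your shell while it parses the pipeline, before xargs ever runs. subs "%" outputs %%, so xargs is actually invoked as xargs -I % echo %%, and replacing each % with ABC yields ABCABC.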
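The two steps can be separated to watch this happen. The sketch below reuses the doubling function from the question (written with printf instead of echo); the variable names expanded and result are only for illustration.

```shell
# subs doubles every character of its argument, as in the question.
subs() { printf '%s\n' "$1" | sed -E 's/(.)/\1\1/g'; }

# Step 1: the calling shell evaluates the command substitution first.
expanded=$(subs "%")
printf 'xargs is given: echo %s\n' "$expanded"    # xargs is given: echo %%

# Step 2: xargs -I % then replaces each % with the input line.
result=$(echo "ABC" | xargs -I % echo "$expanded")
printf '%s\n' "$result"                           # ABCABC
```

By the time xargs sees the command line, subs has already run once on the literal %, so it never sees ABC.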

Source: https://stackoverflow.com/questions/54768307/how-can-i-use-xargs-to-run-a-function-in-a-command-substitution-for-each-match

Tags

Linux

bash

function

sed

xargs