I was playing with permutation in a couple of programs and stumbled upon this little experiment:
Permutation method 1:
permute([], []).
I suspect what triggered this investigation was the discussion about tail-recursive sum/2 using an accumulator versus not. The sum/2 example is very cut-and-dried: one version does the arithmetic on the stack as the recursive calls return, the other uses an accumulator. However, like most things in the real world, the general truth is "it depends." For instance, compare the efficiency of methods 1 and 2 when both arguments are fully instantiated (method 1 first, then method 2):
?- time(permute([1,2,3,4,5,6,7,8,9], [1,2,3,4,5,6,7,8,9])).
% 18 inferences, 0.000 CPU in 0.000 seconds (66% CPU, 857143 Lips)
true ;
% 86,546 inferences, 0.022 CPU in 0.022 seconds (100% CPU, 3974193 Lips)
false.
?- time(permute([1,2,3,4,5,6,7,8,9], [1,2,3,4,5,6,7,8,9])).
% 18 inferences, 0.000 CPU in 0.000 seconds (62% CPU, 857143 Lips)
true ;
% 47 inferences, 0.000 CPU in 0.000 seconds (79% CPU, 940000 Lips)
false.
Method 1 beats method 2 when you're generating solutions (as in your tests), but method 2 beats method 1 when you're simply checking: exhausting the search takes 86,546 inferences with method 1 and only 47 with method 2. Looking at the code it's easy to see why: the first one has to re-permute the whole tail of the list, while the second one just has to try selecting out one item. In this case it may be easy to point to the generating case and say it's the more desirable one to optimize. Making that determination is simply one of the tradeoffs one must keep track of when dealing with Prolog. It's very difficult to write predicates that are all things to all people and always perform well; you must decide which are the "privileged paths" and which are not.
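To make the comparison concrete, here is a minimal sketch of what the two methods might look like. This is an assumption on my part: the bodies are reconstructed from the description above rather than copied from the original code, and the names permute1/2 and permute2/2 are hypothetical, chosen only so both versions can be loaded side by side.

```prolog
:- use_module(library(lists)).   % select/3

% Method 1 (assumed shape): permute the tail first, then insert the head.
% With both arguments bound, the recursive call still enumerates every
% permutation of the tail before select/3 gets to check anything.
permute1([], []).
permute1([X|Rest], L) :-
    permute1(Rest, L1),
    select(X, L, L1).

% Method 2 (assumed shape): select the next output element out of the
% input, then permute what is left. With both arguments bound, select/3
% only has to find one known element at each step.
permute2([], []).
permute2(L, [P|P1]) :-
    select(P, L, L1),
    permute2(L1, P1).
```

Wrapping either one in time/1, first with the second argument unbound and then with both arguments bound, is a quick way to see how much the instantiation pattern matters.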
I do vaguely recall someone recently showing an example of appending lists "during the return", where you take something that isn't or shouldn't be tail recursive and make it work thanks to unification, but I don't have the link handy. Hopefully whoever brought it up last time (Will?) will show up and share it.
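I can't say whether this is the example I'm thinking of, but the textbook definition of append/3 illustrates the general idea. The name my_append/3 is hypothetical, used here only to avoid clashing with the library predicate.

```prolog
% my_append(Xs, Ys, Zs): Zs is the list Xs followed by the list Ys.
my_append([], Ys, Ys).
my_append([X|Xs], Ys, [X|Zs]) :-
    % The output cell [X|Zs] is built by head unification *before* the
    % recursive call, with Zs still unbound. Nothing is left to do after
    % the call, so it is the last goal in the clause even though the
    % result looks like it is assembled "on the way back".
    my_append(Xs, Ys, Zs).
```

For example, `?- my_append([1,2], [3], Zs).` gives `Zs = [1,2,3]`. In a language without logic variables you would normally have to cons X onto the result after the recursive call returns, which is not a tail call.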
Great question, by the way. Your investigation method is valid; you'll just need to take other instantiation patterns into account as well. Speaking personally, I usually worry more about correctness and generality than about performance up front. If I see immediately how to use an accumulator I will, but otherwise I won't do it that way until I run into an actual need for better performance. Tail recursion is just one method for improving performance; frequently there are other things that need to be addressed as badly or worse.
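For reference, the two styles of sum/2 mentioned at the top look roughly like this. It is only a sketch, and the names sum_body/2 and sum_acc/2 are hypothetical, chosen so both versions can coexist in one file.

```prolog
% Body-recursive: the addition happens after the recursive call returns,
% so every element leaves a pending frame on the stack.
sum_body([], 0).
sum_body([X|Xs], Sum) :-
    sum_body(Xs, Rest),
    Sum is X + Rest.

% Tail-recursive: the running total is carried in an accumulator, and the
% recursive call is the last goal in the clause.
sum_acc(List, Sum) :-
    sum_acc(List, 0, Sum).

sum_acc([], Sum, Sum).
sum_acc([X|Xs], Acc0, Sum) :-
    Acc is Acc0 + X,
    sum_acc(Xs, Acc, Sum).
```

Both give `Sum = 10` for `[1,2,3,4]`; the difference only shows up in stack usage on long lists.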