问题
Following the example given in the chapter GeneralRec of Chlipala book, I'm trying to write the mergesort algorithm.
Here is my code
Require Import Nat.
Fixpoint insert (x:nat) (l: list nat) : list nat :=
match l with
| nil => x::nil
| y::l' => if leb x y then
x::l
else
y::(insert x l')
end.
Fixpoint merge (l1 l2 : list nat) : list nat :=
match l1 with
| nil => l2
| x::l1' => insert x (merge l1' l2)
end.
Fixpoint split (l : list nat) : list nat * list nat :=
match l with
| nil => (nil,nil)
| x::nil => (x::nil,nil)
| x::y::l' =>
let (ll,lr) := split l' in
(x::ll,y::lr)
end.
Definition lengthOrder (l1 l2 : list nat) :=
length l1 < length l2.
Theorem lengthOrder_wf : well_founded lengthOrder.
Admitted.
The problem is that it is not possible to write the mergeSort function with the command Fixpoint
since the function is not structurally decreasing :
Fixpoint mergeSort (l: list nat) : list nat :=
if leb (length l) 1 then l
else
let (ll,lr) := split l in
merge (mergeSort ll) (mergeSort lr).
Instead, one can use the command Program Fixpoint
or Definition
with the term Fix
(as in Chlipala book).
However, if I'm writing this
Definition mergeSort : list nat -> list nat.
refine (Fix lengthOrder_wf (fun (l: list nat) => list nat)
(fun (l : list nat) => (fun mergeSort : (forall ls : list nat, lengthOrder ls l -> list nat )=>
if leb (length l) 1 then
let (ll,lr) := split l in
merge (mergeSort ll _) (mergeSort lr _)
else
l))).
I'm getting impossible goals :
2 subgoals, subgoal 1 (ID 65)
l : list nat
mergeSort : forall ls : list nat, lengthOrder ls l -> list nat
ll, lr : list nat
============================
lengthOrder ll l
subgoal 2 (ID 66) is:
lengthOrder lr l
That is why Chlipala suggests to change the definition of mergeSort this way:
Definition mergeSort : list nat -> list nat.
refine (Fix lengthOrder_wf (fun _ => list nat)
(fun (ls : list nat)
(mergeSort : forall ls' : list nat, lengthOrder ls' ls -> list nat) =>
if Compare_dec.le_lt_dec 2 (length ls)
then let lss := split ls in
merge (mergeSort (fst lss) _) (mergeSort (snd lss) _)
else ls)).
that generates the following goals:
2 subgoals, subgoal 1 (ID 68)
ls : list nat
mergeSort : forall ls' : list nat, lengthOrder ls' ls -> list nat
l : 2 <= length ls
lss := split ls : list nat * list nat
============================
lengthOrder (fst lss) ls
subgoal 2 (ID 69) is:
lengthOrder (snd lss) ls
This new definition sounds like magic to me. So I wonder:
- Fom the first definition, is it still possible to proof the well-foudness of the function?
- Otherwise why the first definition cannot work?
- How a basic user can go from the first definition to the second easily?
回答1:
It's easy to see that you need to make two changes in order to get to A. Chlipala's solution.
1) When doing split
you somehow need to remember that ll
and lr
came from split, otherwise they would be some arbitrary lists, which cannot possibly be shorter than the original list l
.
The following piece of code fails to save that kind of information:
let (ll,lr) := split l in
merge (mergeSort ll _) (mergeSort lr _)
and, thus, needs to be replaced with
let lss := split ls in
merge (mergeSort (fst lss) _) (mergeSort (snd lss) _)
which keeps what we need.
The failure happens due to Coq's inability to remember that ll
and lr
come from split l
and that happens because let (ll,lr)
is just match
in disguise (see the manual, §2.2.3).
Recall that the aims of pattern-matching is to (loosely speaking)
- unpack the components of some value of an inductive datatype and bind them to some names (we'll need this in the 2nd part of my answer) and
- replace the original definition with its special cases in the corresponding pattern-match branches.
Now, observe that split l
does not occur anywhere in the goal or context before we pattern-match on it. We just arbitrarily introduce it into the definition. That's why pattern-matching doesn't give us anything -- we can't replace split l
with its "special case" ((ll,lr)
) in the goal or context, because there is no split l
anywhere.
There is an alternative way of doing this by using logical equality (=
):
(let (ll, lr) as s return (s = split l -> list nat) := split l in
fun split_eq => merge (mergeSort ll _) (mergeSort lr _)) eq_refl
This is analogous to using the remember
tactic. We've got rid of fst
and snd
, but it is a huge overkill and I wouldn't recommend it.
2) Another thing we need to prove is the fact that ll
and lr
are shorter than l
when 2 <= length l
.
Since an if
-expression is a match
in disguise as well (it works for any inductive datatype with exactly two constructors), we need some mechanism to remember that leb 2 (length l) = true
in the then
branch. Again, since we don't have leb
anywhere, this information gets lost.
There are at least two possible solutions to the problem:
- either we remember
leb 2 (length l)
as an equation (just as we did in the 1st part), or - we can use some comparison function with result type behaving like
bool
(so it can represent two alternatives), but it should also remember some additional information we need. Then we could pattern-match on the comparison result and extract the information, which, of course, in this case have to be a proof of2 <= length l
.
What we need is a type which is able to carry a proof of m <= n
in the case when leb m n
returns true
and a proof of, say, m > n
otherwise.
There is a type in the standard library that does exactly that! It's called sumbool:
Inductive sumbool (A B : Prop) : Set :=
left : A -> {A} + {B} | right : B -> {A} + {B}
{A} + {B}
is just a notation (syntactic sugar) for sumbool A B
.
Just as bool
, it has two constructors, but in addition it remembers a proof of either of two propositions A
and B
. Its advantage over bool
shows up when you do case analysis on it with if
: you get a proof of A
in the then
branch and a proof of B
in the else
branch. In other words, you get to use context you saved beforehand, whereas bool
doesn't carry any context (only in the mind of the programmer).
And we need exactly that! Well, not in the else
branch, but we would like to get 2 <= length l
in our then
branch. So, let us ask Coq if it already has a comparison function with the return type like that:
Search (_ -> _ -> {_ <= _} + {_}).
(*
output:
le_lt_dec: forall n m : nat, {n <= m} + {m < n}
le_le_S_dec: forall n m : nat, {n <= m} + {S m <= n}
le_ge_dec: forall n m : nat, {n <= m} + {n >= m}
le_gt_dec: forall n m : nat, {n <= m} + {n > m}
le_dec: forall n m : nat, {n <= m} + {~ n <= m}
*)
Any of the five results would do, because we need a proof only in one case.
Hence, we can replace if leb 2 (length l) then ...
with if le_lt_dec 2 (length l) ...
and get 2 <= length
in the proof context, which will let us finish the proof.
来源:https://stackoverflow.com/questions/42285235/writing-well-founded-programs-in-coq-using-fix-or-program-fixpoint