问题
I wrote a OCaml program for union find algorithm. This algorithm I wrote is not optimal and is the simplest version.
I put my OCaml code here because I am not sure whether this code is good enough or not (despite of the algorithm itself), although this code can run without errors.
This is the first time I wrote a complete working thing after I started to learn OCaml, so please help me by reviewing it.
Useful suggestions will help me improving my OCaml skills. Thanks
type union_find = {id_ary : int array; sz_ary : int array};;
let create_union n = {id_ary = Array.init n (fun i -> i);
sz_ary = Array.init n (fun i -> 1)};;
let union u p q =
let rec unionfy id_ary i =
let vp = id_ary.(p) in
let vq = id_ary.(q) in
if i < Array.length id_ary then begin
if i != q && id_ary.(i) = vp then id_ary.(i) <- vq;
unionfy id_ary (i + 1)
end
else print_string "end of union\n"
in
unionfy u.id_ary 0;;
let is_connected u p q = u.id_ary.(p) = u.id_ary.(q);;
First of all,
Am I creating the data structure of union (as in union find) correctly?
Should I include two arrays inside or is there any better way?
Second,
I am using array in this code, but array is mutable which is not that good for fp right?
Is there a way to avoid using array?
Finally,
Overall, is this piece of code good enough?
Anything can be improved?
P.S. I am not using OCaml's object oriented bit yet, as I haven't learnt to that part.
回答1:
A few stylistic points:
Not sure why unionfy takes id_ary as a parameter since it keeps it constant throughout
don't use Array.init with a constant function. Just use Array.make.
print_string "...\n" is equivalent to print_endline "..."
The following definition can be cleaned up with punning to
let union u p q = to: let union {id_ary; _} p q so that there are no extraneous references to u.
Same punning trick for let is_connected u p q = u.id_ary.(p) = u.id_ary.(q);;
This might be a personal choice but I would get rid of:
let vp = id_ary.(p) in
let vq = id_ary.(q) in
Or at least shove them above the recursive definition so that it's clear they are constant.
EDIT: corrected version
let union {id_ary;_} p q =
let (vp, vq) = (id_ary.(p), id_ary.(q)) in
let rec unionfy i =
if i < Array.length id_ary then begin
if i != q && id_ary.(i) = vp then id_ary.(i) <- vq;
unionfy (i + 1)
end else print_endline "end of union"
in unionfy 0;;
回答2:
Some comments on the code:
You don't seem to use sz_ary for anything.
Your code iterates through the whole array for each union operation. This is not correct for the standard (Tarjan) union-find. For a linear number of union operations your code produces a quadratic solution. Wikipedia has the standard algorithm: Disjoint-set Data Structure.
To answer your second question: as far as I know, union-find is one of the algorithms for which there's no known functional (immutable) solution with the same complexity as the best imperative solution. Since an array is simply a map from integers to values, you could always translate any array-based solution into an immutable one using maps. As far as I've been able to determine, this would match the best known solution in asymptotic complexity; i.e., it would add an extra factor of log n. Of course there would also be a constant factor that might be large enough to be a problem.
I've implemented union-find a few times in OCaml, and I've always chosen to do it using mutable data. However, I haven't used arrays. I have a record type for my basic objects, and I use a mutable field in each record to point to its parent object. To do path compression, you modify the parent pointer to point to the current root of the tree.
来源:https://stackoverflow.com/questions/14759932/simple-non-optimal-unionfind-in-ocaml