Constraint programming suitable for extracting OneToMany relationships from records

人盡茶涼 submitted on 2019-12-06 13:13:51

I'm not sure if I understand all the requirements of the problem, but here is a constraint programming model in MiniZinc (http://www.minizinc.org/). The full model is here: http://hakank.org/minizinc/one_to_many.mzn .

LATER NOTE: The first version of the project constraints was not correct. I have removed the incorrect code. See the edit history for the original answer.

enum mothers = {jane,claire,sophia};
enum children = {brian,stephen,emma,william,james,isabella};      

% decision variables

% who is the mother of this child?
array[children] of var mothers: x;


solve satisfy;

constraint
  % Every mother has at least one child
  forall(m in mothers) (
    exists(c in children) (
      x[c] = m
    )
  )
;

constraint
% NOTE: This is a more correct version of the project constraints.
% project 1
(
  ( x[brian] = jane /\ x[stephen] = claire) \/
  ( x[stephen] = jane /\ x[brian] = claire)
) 
/\
% project 2
(
  ( x[emma] = claire /\ x[william] = jane) \/
  ( x[william] = claire /\ x[emma] = jane) 
)
/\
% project 3
(
  ( x[william] = claire /\ x[james] = jane) \/
  ( x[james] = claire /\ x[william] = jane) 
)
/\
% project 4
( 
  ( x[brian] = jane /\ x[james] = sophia /\ x[isabella] = claire) \/
  ( x[james] = jane /\ x[brian] = sophia /\ x[isabella] = claire) \/
  ( x[james] = jane /\ x[isabella] = sophia /\ x[brian] = claire) \/
  ( x[brian] = jane /\ x[isabella] = sophia /\ x[james] = claire) \/
  ( x[isabella] = jane /\ x[brian] = sophia /\ x[james] = claire) \/
  ( x[isabella] = jane /\ x[james] = sophia /\ x[brian] = claire) 
)
/\

% project 4(sic!)
( x[brian] = claire) /\

% project 5
( x[emma] = jane)
;


output [
  "\(c): \(x[c])\n"
  | c in children
];

The unique solution is

brian: claire
stephen: jane
emma: jane
william: claire
james: jane
isabella: sophia
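
The uniqueness claim is easy to cross-check outside MiniZinc, since there are only 3^6 = 729 possible assignments. Below is a minimal brute-force sketch in Python. It is not part of the original model. The names MOTHERS, CHILDREN, PROJECTS, satisfies and all_solutions are my own, and the project data is copied from the constraints above.

```python
from itertools import product

MOTHERS = ["jane", "claire", "sophia"]
CHILDREN = ["brian", "stephen", "emma", "william", "james", "isabella"]

# (mothers, children) per project, copied from the MiniZinc constraints
PROJECTS = [
    (["jane", "claire"], ["brian", "stephen"]),
    (["claire", "jane"], ["emma", "william"]),
    (["claire", "jane"], ["william", "james"]),
    (["claire", "sophia", "jane"], ["brian", "james", "isabella"]),
    (["claire"], ["brian"]),
    (["jane"], ["emma"]),
]


def satisfies(x):
    """x maps each child to a mother; return True if all constraints hold."""
    # every mother has at least one child
    if set(x.values()) != set(MOTHERS):
        return False
    # in each project the children's mothers are exactly the project's
    # mothers, each used once (compared as sorted lists, i.e. multisets)
    return all(sorted(x[c] for c in cc) == sorted(mm) for mm, cc in PROJECTS)


def all_solutions():
    """Enumerate every child -> mother assignment and keep the valid ones."""
    found = []
    for combo in product(MOTHERS, repeat=len(CHILDREN)):
        x = dict(zip(CHILDREN, combo))
        if satisfies(x):
            found.append(x)
    return found


for solution in all_solutions():
    print(solution)
```

Enumeration only works because the instance is tiny; for anything larger the constraint model is the better tool.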

Edit2: Here is a more general solution. See http://hakank.org/minizinc/one_to_many.mzn for the complete model.

include "globals.mzn"; 

enum mothers = {jane,claire,sophia};
enum children = {brian,stephen,emma,william,james,isabella};      

% decision variables
% who is the mother of this child?
array[children] of var mothers: x;

% combine all the combinations of mothers and children in a project
predicate check(array[int] of mothers: mm, array[int] of children: cc) =
  let {
    int: n = length(mm);
    array[1..n] of var 1..n: y;
  } in
  all_different(y) /\
  forall(i in 1..n) (
     x[cc[i]] = mm[y[i]]
  )
;    

solve satisfy;

constraint
% Every mother has at least one child.
forall(m in mothers) (
  exists(c in children) (
    x[c] = m
  )
)
;


constraint
% project 1    
check([jane,claire], [brian,stephen]) /\
% project 2
check([claire,jane],[emma,william]) /\
% project 3
check([claire,jane],[william,james]) /\
% project 4
check([claire,sophia,jane],[brian,james,isabella]) /\
% project 4(sic!)
check([claire],[brian]) /\
% project 5
check([jane],[emma])
;

output [
 "\(c): \(x[c])\n"
 | c in children
];

This model uses the following predicate to ensure that all combinations of mothers and children are considered:

predicate check(array[int] of mothers: mm, array[int] of children: cc) =
   let {
     int: n = length(mm);
     array[1..n] of var 1..n: y;
  } in
  all_different(y) /\
  forall(i in 1..n) (
    x[cc[i]] = mm[y[i]]
  )
;    

It uses the global constraint all_different(y) to make y a permutation of 1..n, so that each mother in mm is used exactly once, and then assigns the i-th child in cc to the mother mm[y[i]].
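
To see what all_different(y) contributes, here is a small Python sketch of the same check for a fixed assignment x. It is only an illustration, not a translation of how the solver works: check searches over permutations y (the all_different case), while the hypothetical check_without_all_different lets the indices in y repeat.

```python
from itertools import permutations, product


def check(x, mm, cc):
    """True if some permutation y gives x[cc[i]] == mm[y[i]] for all i."""
    n = len(mm)
    return any(
        all(x[cc[i]] == mm[y[i]] for i in range(n))
        for y in permutations(range(n))
    )


def check_without_all_different(x, mm, cc):
    """Same test, but y may repeat indices (no all_different)."""
    n = len(mm)
    return any(
        all(x[cc[i]] == mm[y[i]] for i in range(n))
        for y in product(range(n), repeat=n)
    )


# Both children assigned to jane: this should violate project 1,
# because claire must also be the mother of one of them.
x = {"brian": "jane", "stephen": "jane"}
print(check(x, ["jane", "claire"], ["brian", "stephen"]))
print(check_without_all_different(x, ["jane", "claire"], ["brian", "stephen"]))
```

Without the permutation requirement the second function accepts an assignment where one mother takes both children of the project, which is exactly what the global constraint rules out.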

A bit off topic, but the SWI-Prolog manual says:

Plain Prolog can be regarded as CLP(H), where H stands for Herbrand terms. Over this domain, =/2 and dif/2 are the most important constraints that express, respectively, equality and disequality of terms.

I feel authorized to suggest a Prolog solution, more general than the algorithm you suggested (progressively reducing relations based on one-to-one relations):

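% SWI-Prolog autoloads these libraries; they are listed for clarity.
:- use_module(library(apply)).   % foldl/4
:- use_module(library(lists)).   % select/3, memberchk/2
:- use_module(library(pairs)).   % transpose_pairs/2, group_pairs_by_key/2
:- use_module(library(yall)).    % lambda expressions (>>)

% Thread the child-parent links through all projects,
% then regroup the result as Parent-[Children].
solve2(Projects,ParentsChildren) :-
    foldl([_-Ps-Cs,L,L1]>>try_links(Ps,Cs,L,L1),Projects,[],ChildrenParent),
    transpose_pairs(ChildrenParent,ParentsChildrenFlat),
    group_pairs_by_key(ParentsChildrenFlat,ParentsChildren).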

try_links([],[],Linked,Linked).
try_links(Ps,Cs,Linked,Linked2) :-
    select(P,Ps,Ps1),
    select(C,Cs,Cs1),
    link(C,P,Linked,Linked1),
    try_links(Ps1,Cs1,Linked1,Linked2).

link(C,P,Assigned,Assigned1) :-
    (   memberchk(C-Q,Assigned)
    ->  P==Q,
        Assigned1=Assigned
    ;   Assigned1=[C-P|Assigned]
    ).

This accepts data in a natural format, such as:

data(1,
    [1-[jane,claire]-[brian,stephen]
    ,2-[claire,jane]-[emma,william]
    ,3-[jane,claire]-[william,james]
    ,4-[jane,sophia,claire]-[brian,james,isabella]
    ,5-[claire]-[brian]
    ,6-[jane]-[emma]
    ]).
data(2,
    [1-[jane,claire]-[brian,stephen]
    ,2-[claire,jane]-[emma,william]
    ,3-[jane,claire]-[william,james]
    ,4-[jane,sophia,claire]-[brian,james,isabella]
    ,5-[claire]-[brian]
    ,6-[jane]-[emma]
    ,7-[sally,sandy]-[grace,miriam]
    ]).

?- data(2,Ps),solve2(Ps,S).
Ps = [1-[jane, claire]-[brian, stephen], 2-[claire, jane]-[emma, william], 3-[jane, claire]-[william, james], 4-[jane, sophia, claire]-[brian, james, isabella], 5-[claire]-[brian], 6-[jane]-[emma], 7-[...|...]-[grace|...]],
S = [claire-[william, brian], jane-[james, emma, stephen], sally-[grace], sandy-[miriam], sophia-[isabella]].
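
For readers who do not use Prolog, the same search can be sketched in Python. This is my own rough equivalent under some assumptions, not a line-by-line port: solutions and group_by_parent are hypothetical names, and unlike try_links/4 it fixes the parents in list order and only permutes the children, so each distinct mapping is produced once.

```python
def solutions(projects):
    """Yield every child -> parent mapping consistent with all projects."""

    def extend(idx, assigned):
        # all projects handled: report a copy of the current mapping
        if idx == len(projects):
            yield dict(assigned)
            return
        parents, children = projects[idx]
        yield from match(list(parents), list(children), idx, assigned)

    def match(parents, children, idx, assigned):
        if not parents:
            if not children:
                yield from extend(idx + 1, assigned)
            return
        p, rest = parents[0], parents[1:]
        for i, c in enumerate(children):
            others = children[:i] + children[i + 1:]
            known = assigned.get(c)
            if known is None:
                assigned[c] = p          # tentative link, like link/4
                yield from match(rest, others, idx, assigned)
                del assigned[c]          # undo on backtracking
            elif known == p:             # already linked to the same parent
                yield from match(rest, others, idx, assigned)

    yield from extend(0, {})


def group_by_parent(mapping):
    """Turn child -> parent into parent -> sorted list of children."""
    grouped = {}
    for child, parent in mapping.items():
        grouped.setdefault(parent, []).append(child)
    return {p: sorted(cs) for p, cs in grouped.items()}


DATA2 = [
    (["jane", "claire"], ["brian", "stephen"]),
    (["claire", "jane"], ["emma", "william"]),
    (["jane", "claire"], ["william", "james"]),
    (["jane", "sophia", "claire"], ["brian", "james", "isabella"]),
    (["claire"], ["brian"]),
    (["jane"], ["emma"]),
    (["sally", "sandy"], ["grace", "miriam"]),
]

for mapping in solutions(DATA2):
    print(group_by_parent(mapping))
```

Because project 7 is underdetermined, the generator yields more than one mapping for this data, whereas the Prolog query above shows only the first answer.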

This is my first CHR program, so I hope that someone will come and give me some advice on how to improve it.

My thinking is that you need to expand all the lists into facts. From there, if you know that a project has just one parent and one child, you can establish the parent relationship from that. Also, once you have a parent-child relationship, you can remove that set from the other facts in the other projects and reduce the cardinality of the problem by one. Eventually you will have figured out everything you can. The only difference between a completely determined dataset and an incompletely determined one is in how far that reduction can go. If it doesn't quite get there, it will leave around some facts so you can see which projects/parents/children are still creating ambiguity.

:- use_module(library(chr)).

:- chr_constraint project/3, project_parent/2, project_child/2, 
   project_parents/2, project_children/2, project_size/2, parent/2.

%% turn a project into a fact about its size plus 
%% facts for each parent and child in this project
project(N, Parents, Children) <=>
    length(Parents, Len),
    project_size(N, Len),
    project_parents(N, Parents),
    project_children(N, Children).

%% expand the list of parents for this project into a fact per parent per project
project_parents(_, []) <=> true.
project_parents(N, [Parent|Parents]) <=>
    project_parent(N, Parent),
    project_parents(N, Parents).

%% same for the children
project_children(_, []) <=> true.
project_children(N, [Child|Children]) <=>
    project_child(N, Child),
    project_children(N, Children).

%% a single parent-child combo on a project is exactly what we need
one_parent @ project_size(Project, 1), 
             project_parent(Project, Parent), 
             project_child(Project, Child) <=>
    parent(Parent, Child).

%% if I have a parent relationship for project of size N,
%% remove this parent and child from the project and decrease
%% the number of parents and children by one
parent_det @ parent(Parent, Child) \ project_size(Project, N), 
                                     project_parent(Project, Parent), 
                                     project_child(Project, Child) <=>
    succ(N0, N),
    project_size(Project, N0).

I ran this with your example by making a main/0 predicate to do it:

main :-
    project(1, [jane, claire], [brian, stephen]),
    project(2, [claire, jane], [emma, william]),
    project(3, [jane, claire], [william, james]),
    project(4, [jane, sophia, claire], [brian, james, isabella]),
    project(5, [claire], [brian]),
    project(6, [jane], [emma]).

This outputs:

parent(sophia, isabella),
parent(jane, james),
parent(claire, william),
parent(jane, emma),
parent(jane, stephen),
parent(claire, brian).

To demonstrate incomplete determination, I added a seventh project:

project(7, [sally,sandy], [grace,miriam]).

The program then outputs this:

project_parent(7, sandy),
project_parent(7, sally),
project_child(7, miriam),
project_child(7, grace),
project_size(7, 2),
parent(sophia, isabella),
parent(jane, james),
parent(claire, william),
parent(jane, emma),
parent(jane, stephen),
parent(claire, brian).

As you can see, any remaining project_size/2 tells you the cardinality of what is left to solve (project 7 still has two parent/child relationships to determine). You also get back exactly the parents and children that remain unresolved, along with every parent/2 relation that could be determined.
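
The reduction idea itself does not depend on CHR. As a rough illustration, here is the same fixpoint loop in Python. This is a sketch under my own assumptions, not a translation of the rule engine: reduce_projects is a hypothetical name, projects are passed as a dict from project number to (parents, children), and inconsistent input is not detected.

```python
def reduce_projects(projects):
    """Return (parent_of, remaining) after reducing until nothing changes.

    parent_of maps child -> parent for every determined relationship;
    remaining holds the unresolved part of each project.
    """
    remaining = {n: (set(ps), set(cs)) for n, (ps, cs) in projects.items()}
    parent_of = {}
    changed = True
    while changed:
        changed = False
        # like one_parent: a single parent and child fix that relationship
        for n, (ps, cs) in list(remaining.items()):
            if len(ps) == 1 and len(cs) == 1:
                (p,), (c,) = ps, cs
                parent_of[c] = p
                del remaining[n]
                changed = True
        # like parent_det: remove known pairs from projects containing both
        for c, p in parent_of.items():
            for ps, cs in remaining.values():
                if p in ps and c in cs:
                    ps.discard(p)
                    cs.discard(c)
                    changed = True
        # drop projects that have been emptied completely
        for n in [n for n, (ps, cs) in remaining.items() if not ps and not cs]:
            del remaining[n]
    return parent_of, remaining


PROJECTS = {
    1: (["jane", "claire"], ["brian", "stephen"]),
    2: (["claire", "jane"], ["emma", "william"]),
    3: (["jane", "claire"], ["william", "james"]),
    4: (["jane", "sophia", "claire"], ["brian", "james", "isabella"]),
    5: (["claire"], ["brian"]),
    6: (["jane"], ["emma"]),
    7: (["sally", "sandy"], ["grace", "miriam"]),
}

parent_of, remaining = reduce_projects(PROJECTS)
print(parent_of)
print(remaining)
```

Whatever is left in remaining plays the role of the leftover project_parent/2, project_child/2 and project_size/2 constraints in the CHR store.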

I'm pretty happy with this outcome but hopefully others can come and improve my code!

Edit: my code has a shortcoming, identified on the mailing list: certain inputs fail to converge even though the solution can be computed. For instance:

project(1,[jane,claire],[brian, stephan]),
project(2,[jane,emma],[stephan, jones]).

For more information, see Ian's solution, which uses set intersection to determine the mapping.
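
Ian's code is not reproduced here, so the following is only my own guess at how a set-intersection approach could handle the example above, written in Python. The function name narrow_candidates is hypothetical. It assumes that a child's parent must appear in every project the child appears in, and that within one project each parent has exactly one child.

```python
def narrow_candidates(projects):
    """Return child -> set of possible parents after narrowing."""
    candidates = {}
    # a child's parent must be in the parent list of every project it is in
    for parents, children in projects:
        for c in children:
            candidates[c] = candidates.get(c, set(parents)) & set(parents)
    # once a child is pinned to one parent, its project siblings lose that parent
    changed = True
    while changed:
        changed = False
        for parents, children in projects:
            for c in children:
                if len(candidates[c]) == 1:
                    (p,) = candidates[c]
                    for other in children:
                        if other != c and p in candidates[other]:
                            candidates[other].discard(p)
                            changed = True
    return candidates


EXAMPLE = [
    (["jane", "claire"], ["brian", "stephan"]),
    (["jane", "emma"], ["stephan", "jones"]),
]

print(narrow_candidates(EXAMPLE))
```

Here stephan appears in both projects and jane is the only parent common to both, which is the piece of information the size-based reduction never uses.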
