How to encode causal relations in Prolog (as a linear function)

Submitted by 限于喜欢 on 2019-12-24 16:12:17

Question


Suppose that two variables X and Y are causally and linearly related, so that an increase in X produces an increase in Y (e.g. travel distance for cars and their fuel consumption). Both X and Y are vectors of N observations (N individual cars in the example).

A way to represent such a relation is a simple linear equation Yi = a + bXi, which would describe the relation in the sample of N cases, where i = 1, 2, ..., N. Here a and b are constants, while Y and X are variables.

Do you have any suggestions how this could be represented in Prolog? My hunch is something like causes(cause(travelDistance), effect(fuelConsumption), a(0.5), b(1.23)).. What seems missing here, however, is code which states that the association specifically is between the ith value of X and the ith value of Y (a car's travel distance and that car's fuel consumption).

Any ideas? Thanks in advance!

/JC


Answer 1:


Forgive the fact that I'm answering, only to use a more appropriate format than comments, though this may not be the answer you're looking for at this point.

Unless I have misunderstood your question, I think the problem you describe here is ill-defined. My understanding is that you have a dataset of X and Y, which happen to follow a linear relationship, and you want to either 'infer' that X causes Y in the absence of any other information, or simply have a way to state that this is the case via a predicate. The problem is that a correlated dataset can never give you that information by itself.

If you want to establish causality from a dataset, you first need to describe what type of causality you're after and how it could be asserted and investigated. A dataset by itself can never tell you anything about causality if you don't know the ordering of events, or how alternatives behave.

I'm sure there are many models of causality out there, but I have only come across two used meaningfully in practice: the chronological model and the counterfactual model.

In the chronological model, if you are able to establish 'when' an event happens, then you can infer causality via a very simple "and X comes before Y" rule. E.g. if "X = travel" is deemed to take place before "Y = fuel-measurement", then you can establish causality using predicate logic, by showing that:

  • Whenever travel precedes fuel-measurement, the relationship is always necessarily linear
  • When fuel-measurement precedes travel, the relationship is not necessarily linear. (because if it were, then you're back to only being able to establish correlation rather than causality)
  • The closed world phenomenon applies (i.e. there is nothing else that contributes to fuel consumption in the absence of travel)

In the counterfactual model, you don't have any information about the chronology of the events, but what you do have is information on alternative events. Therefore causality of "X causes Y" is established by its counterfactual, i.e. if you can show that "Had X not happened, Y would not have happened either" (or equivalently ¬X implies ¬Y).

A complicating factor in the counterfactual model is that it allows for the concept of 'responsibility', i.e. if both X and ¬X can result in Y, then they are both said to be potential causes for Y. However in the context of a dataset you can probably get around this by saying "if for ALL events X, the outcome is Y, whereas it is not necessarily true that for ALL events ¬X the outcome is Y, then we can infer that X causes Y". So, in your specific example, you could set up a world such that

  • Fuel consumption can only occur either from a 'travel' event or from an alternative, mutually exclusive event that constitutes the non-travel hypothesis, e.g. say, 'siphoning'
  • Both the travel 'event' and the siphoning 'event' result in a physical measurement, e.g. distance traveled. (which, in our trivial example, would probably just be zero for the siphoning event).
  • In your dataset you have information on 'both' what event occurred (e.g. travel or siphoning) and information on fuel consumption and distance traveled for that instance.

You can then establish that 'travelling' as an event 'causes' fuel consumption in a linear model fashion with respect to the distance traveled, by showing that:

  • Whenever you have a 'travel' event, the distance traveled does indeed correspond to fuel consumption according to your linear model
  • Whenever you have a 'siphoning' event, the distance traveled does not 'necessarily' correspond to fuel consumption according to that model.
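
As a rough illustration of the two checks above, here is a minimal Prolog sketch. Everything in it is an assumption made for the example: the observation/4 layout, the data values, the model constants in model/2, and the predicate names fits_model/2 and causes_linearly/1. It also assumes a system that provides forall/2 (e.g. SWI-Prolog). Note that "not necessarily" is tightened here to "at least one observation of another kind does not fit the model", which is one possible reading rather than the only one.

```prolog
% observation/4: hypothetical records (Id, EventKind, DistanceKm, FuelLitres).
% All values are made up for illustration.
observation(1, travel,     50,  5.0).
observation(2, travel,    120, 12.0).
observation(3, siphoning,   0,  8.0).

% model/2: assumed constants a and b of Fuel = a + b * Distance.
model(0.0, 0.1).

% fits_model(+Distance, +Fuel): Fuel matches the linear model,
% up to a small tolerance to absorb floating-point rounding.
fits_model(Distance, Fuel) :-
    model(A, B),
    abs(Fuel - (A + B * Distance)) < 1.0e-6.

% causes_linearly(+Kind): Kind was observed, every observation of Kind
% fits the model, and at least one observation of a different kind
% does not fit it (the counterfactual part).
causes_linearly(Kind) :-
    once(observation(_, Kind, _, _)),
    forall(observation(_, Kind, D, F), fits_model(D, F)),
    once(( observation(_, Other, D2, F2),
           Other \== Kind,
           \+ fits_model(D2, F2) )).
```

With these facts, the query ?- causes_linearly(travel). succeeds, while ?- causes_linearly(siphoning). fails because the siphoning record does not fit the model.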

Update to address the comment: the question is not one of inferring causality, but how to represent causality under the assumption that causality has already been established in practice. In this case, the above points still apply, since you need to define more clearly which type of causality you are referring to before you can represent it.

For example, if we are talking about events that occur in strict chronological order, chronological causality might look something like this (a Prolog sketch; the constants in model/2 are illustrative and not taken from your data):

%%%%%%%%%%%%%%%%%%
%%% facts database
%%%%%%%%%%%%%%%%%%

% eventtype/1: defines type of event
eventtype(travel).
eventtype(fuel_measurement). % ... etc

% eventtime/2: defines timepoints by index and a record of actual time
eventtime(1, "12:02am").
eventtime(2, "12:03am"). % ... etc

% event/3: event type, time index, related measurement (km or litres)
event(travel,           1, 50).
event(fuel_measurement, 2, 5). % ... etc

% model/2: constants a and b of the linear model Y = a + b * X (assumed values)
model(0, 0.1).

%%%%%%%%%%%%%
%%% relations
%%%%%%%%%%%%%

% X immediately precedes Y if their time indices are consecutive
immediately_precedes( event(_, Xind, _), event(_, Yind, _) ) :-
  eventtime(Xind, _),
  eventtime(Yind, _),
  Yind =:= Xind + 1.

% Y's measurement matches a + b * X's measurement (within a small tolerance)
is_linearly_related( event(_, _, Xmeas), event(_, _, Ymeas) ) :-
  model(A, B),
  abs(Ymeas - (A + B * Xmeas)) < 1.0e-6.

% Note: forall/2 succeeds vacuously when no such pair of events exists
iscausal( eventtype(Xtype), eventtype(Ytype) ) :-
  eventtype(Xtype),
  eventtype(Ytype),
  forall(
    ( event(Xtype, Xtime, Xmeas),
      event(Ytype, Ytime, Ymeas),
      immediately_precedes( event(Xtype, Xtime, Xmeas), event(Ytype, Ytime, Ymeas) ) ),
    is_linearly_related( event(Xtype, Xtime, Xmeas), event(Ytype, Ytime, Ymeas) ) ).



Answer 2:


Based on your suggestions I think this code answers my original question. Thanks!

:- use_module(library(clpfd)).

causes(var(name(distance), value(Distance)),
       var(name(fuelConsumption), value(FuelConsumption))) :-
    FuelConsumption #= 5 + 2 * Distance.

And a sample query:

?- causes(var(name(N), value(V)), var(name(fuelConsumption), value(3))).

Which yields N = distance, V = -1.
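
The question also asked how to state that the relation holds between the ith value of X and the ith value of Y. One possible way, sketched below, is to store each car's distance as a fact and apply the same linear constraint per car. The predicate names distance/2 and fuel_consumption/2, the car identifiers, and the distances are all hypothetical; the constants 5 and 2 are simply reused from causes/2 above. Keep in mind that library(clpfd) reasons over integers only, so non-integer constants such as 0.5 and 1.23 would need scaling or a different constraint library.

```prolog
:- use_module(library(clpfd)).

% distance/2: hypothetical per-car observations (car identifier, distance).
distance(car1, 10).
distance(car2, 25).

% fuel_consumption(?Car, ?Fuel): Fuel follows the linear relation
% Fuel = 5 + 2 * Distance, using the distance recorded for that same Car,
% so each car's X value is paired with its own Y value.
fuel_consumption(Car, Fuel) :-
    distance(Car, Distance),
    Fuel #= 5 + 2 * Distance.
```

With these facts, ?- fuel_consumption(car2, Fuel). gives Fuel = 55, and ?- fuel_consumption(Car, 25). gives Car = car1.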



Source: https://stackoverflow.com/questions/51303305/how-to-encode-causal-relations-in-prolog-as-a-linear-function
