Specification of Multinomial model in Tensorflow Probability

纵然是瞬间, submitted on 2020-04-17 23:46:13

Question


I am experimenting with a mixed multinomial logit (discrete choice) model in TensorFlow Probability. The model should take as input a choice among 3 alternatives. The chosen alternative is given by CHOSEN (named CHOICE in the code below), a tensor of shape [# observations, 3]. I asked a related question earlier, but the code and the question have changed quite a bit since:

Tensorflow Probability Error: OperatorNotAllowedInGraphError: iterating over `tf.Tensor` is not allowed

Looking at the source code for Multinomial(), I expected to be able to pass CHOSEN as total_count and get the correct result, based on how total_count appears in the log-likelihood function. The model is a basic multinomial logit choice model (a softmax), where the logits are the systematic utilities of the alternatives. With # observations = 6768 and # alternatives = 3, I currently get:

ValueError: Dimensions must be equal, but are 3 and 6768 for '{{node Multinomial_1/sample/draw_sample/mul}} = Mul[T=DT_INT32](Multinomial_1/sample/draw_sample/ones_like, Multinomial_1/sample/Cast)' with input shapes: [3,6768], [6768,3].
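For reference, the two shapes in this message, [3,6768] and [6768,3], are consistent with the list [V1,V2,V3] being packed along a new leading axis while the choice tensor has alternatives on the trailing axis. The NumPy sketch below only reproduces that layout on toy arrays. The values and n_obs are hypothetical, and NumPy stands in for TensorFlow purely to show the shapes.

```python
import numpy as np

n_obs = 4  # hypothetical; the question has 6768 observations

# Three per-observation utility vectors, one per alternative.
V1 = np.zeros(n_obs)
V2 = np.ones(n_obs)
V3 = np.full(n_obs, 2.0)

# Converting a list of vectors puts the alternatives on the leading axis.
as_list = np.asarray([V1, V2, V3])

# Stacking on the last axis puts the alternatives on the trailing axis,
# matching a choice tensor of shape (n_obs, 3).
stacked = np.stack([V1, V2, V3], axis=-1)

print(as_list.shape, stacked.shape)
```

Stacking on the last axis keeps one row per observation, which is the layout a per-observation choice tensor would line up with.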

I tried transposing the total_count tensor and got:

ValueError: The two structures don't have the same sequence length. Input structure has length 5, while shallow structure has length 10.

I have the following joint distribution function (plus helper function and log_prob() call):

def mmnl_func():
  return tfd.JointDistributionSequential([
    tfd.Normal(loc=0., scale=1e5),  # mu_b_time
    tfd.HalfCauchy(loc=0., scale=5),  # sigma_b_time
    lambda sigma_b_time, mu_b_time: tfd.MultivariateNormalDiag(  # b_time
      loc=affine(tf.ones([num_idx]), mu_b_time[..., tf.newaxis]),
      scale_identity_multiplier=sigma_b_time),
    tfd.Normal(loc=0, scale=1e5),  # a_train
    tfd.Normal(loc=0, scale=1e5),  # a_car
    tfd.Normal(loc=0, scale=1e5),  # b_cost
    lambda b_cost, a_car, a_train, b_time: tfd.Deterministic(  # V1
      loc=affine(DATA[:, 0], tf.gather(b_time, IDX, axis=-1),
                 (a_train + b_cost * DATA[:, 1]))),
    lambda V1, b_cost, a_car, a_train, b_time: tfd.Deterministic(  # V2
      loc=affine(DATA[:, 2], tf.gather(b_time, IDX, axis=-1),
                 (b_cost * DATA[:, 3]))),
    lambda V2, V1, b_cost, a_car, a_train, b_time: tfd.Deterministic(  # V3
      loc=affine(DATA[:, 4], tf.gather(b_time, IDX, axis=-1),
                 (a_car + b_cost * DATA[:, 5]))),
    lambda V3, V2, V1: tfd.Multinomial(  # y
      total_count=CHOICE,
      logits=[V1,V2,V3])
  ])

@tf.function
def mmnl_log_prob(a_train, a_car, b_cost, mu_b_time, sigma_b_time):
  return mmnl_func().log_prob(
      [mu_b_time, sigma_b_time, a_train, a_car,b_cost])

@tf.function
def affine(x, kernel_diag, bias=tf.zeros([])):
  """`kernel_diag * x + bias` with broadcasting."""
  kernel_diag = tf.ones_like(x) * kernel_diag
  bias = tf.ones_like(x) * bias
  return x * kernel_diag + bias
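
To make the roles of affine and tf.gather(b_time, IDX, axis=-1) concrete, here is a NumPy sketch on toy data. It shows how per-individual time coefficients are expanded to observations, combined into the three utilities, and turned into a multinomial logit log-likelihood. All numbers and the names time, cost and asc are hypothetical. The sketch also assumes CHOICE is one-hot, which the question does not state explicitly.

```python
import numpy as np

# Hypothetical toy inputs: 4 observations from 2 individuals, 3 alternatives.
IDX = np.array([0, 0, 1, 1])               # individual index of each observation
time = np.array([[10., 20., 15.],
                 [12., 18., 30.],
                 [25., 20., 10.],
                 [30., 25., 12.]])         # travel time per alternative
cost = np.array([[2., 1., 3.],
                 [2., 1., 4.],
                 [3., 2., 1.],
                 [3., 2., 2.]])            # cost per alternative
CHOICE = np.array([[1., 0., 0.],
                   [0., 1., 0.],
                   [0., 0., 1.],
                   [0., 0., 1.]])          # assumed one-hot chosen alternative

b_time = np.array([-0.10, -0.05])          # one time coefficient per individual
b_cost = -0.2
asc = np.array([0.5, 0.0, -0.3])           # a_train, reference (0), a_car

# NumPy analogue of tf.gather(b_time, IDX, axis=-1): one coefficient per row.
b_time_obs = b_time[IDX]

# Systematic utilities, alternatives on the trailing axis: shape (4, 3).
V = asc + b_time_obs[:, None] * time + b_cost * cost

# Numerically stable log-softmax over the alternatives.
shifted = V - V.max(axis=1, keepdims=True)
log_p = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

# Log-likelihood: log-probability of the chosen alternative, summed over rows.
log_lik = (CHOICE * log_p).sum()
print(log_lik)
```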

Update

If I change:

logits=[V1,V2,V3]

to:

logits=tf.stack([V1,V2,V3],axis=1)

I get the error and traceback below. I am not sure how the inputs to log_prob() work or how they interact with the joint distribution function. The error seems to arise because I pass 5 inputs while log_prob() expects 10, one for each of the 10 components of the joint distribution.

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-26-bb6078bf0f7c> in <module>()
     40     return samples_nuts_, stats_nuts_
     41 
---> 42 samples_nuts, stats_nuts = nuts_sampler(initial_state)

8 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/func_graph.py in wrapper(*args, **kwargs)
    966           except Exception as e:  # pylint:disable=broad-except
    967             if hasattr(e, "ag_error_metadata"):
--> 968               raise e.ag_error_metadata.to_exception(e)
    969             else:
    970               raise

ValueError: in user code:

    <ipython-input-26-bb6078bf0f7c>:34 nuts_sampler  *
        samples_nuts_, stats_nuts_ = tfp.mcmc.sample_chain(
    <ipython-input-25-39abca09aae1>:28 mmnl_log_prob  *
        [mu_b_time, sigma_b_time, a_train, a_car,b_cost])
    /usr/local/lib/python3.6/dist-packages/tensorflow_probability/python/distributions/joint_distribution.py:443 log_prob  **
        return self._call_log_prob(value, **unmatched_kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow_probability/python/distributions/distribution.py:862 _call_log_prob
        value = _convert_to_tensor(value, name='value', dtype_hint=self.dtype)
    /usr/local/lib/python3.6/dist-packages/tensorflow_probability/python/distributions/distribution.py:172 _convert_to_tensor
        check_types=False)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/util/nest.py:1118 map_structure_up_to
        **kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/util/nest.py:1200 map_structure_with_tuple_paths_up_to
        expand_composites=expand_composites)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/util/nest.py:835 assert_shallow_structure
        input_length=len(input_tree), shallow_length=len(shallow_tree)))

    ValueError: The two structures don't have the same sequence length. Input structure has length 5, while shallow structure has length 10.

Source: https://stackoverflow.com/questions/61236004/specification-of-multinomial-model-in-tensorflow-probability
