How to find probability in R [closed]

Question

How could you use the prob argument in the sample command in R so that it covers conditional probability as well? For example, the letters of the alphabet: the probability of generating the letter 'b' depends on the previous letter generated.

Thanks!

Answer 1:

The question is still a bit difficult to answer; it would help if you specified an example of the conditional probability. See also here.

But if we assume that we have a, b, and c, and that a and b have a high probability of occurring after each other, we can build on the answer from here and modify the function as follows:

# n = number of elements
# sample_from = draw random numbers from this range
random_non_consecutive <- function(n=10,sample_from = seq(1,5))
{
  y <- rep(NA, n)
  prev <- -1 # change this if -1 is in your range, to e.g. max(sample_from)+1
  # equal weights for the first draw, named after the values they belong to
  probs <- rep(1, length(sample_from))
  names(probs) <- sample_from
  for(i in seq(n)){

    # your conditional probability rules should go here.
    if(prev=="a")
    {
      probs = c('a'=0,'b'=0.9,'c'=0.1)
    }
    if(prev=="b")
    {
      probs = c('a'=0.9,'b'=0,'c'=0.1)
    }
    if(prev=="c")
    {
      probs = c('a'=0.9,'b'=0.1 ,'c'=0)
    }

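    # exclude the previous value, then draw with the weights of the remaining candidates
    candidates <- setdiff(sample_from, prev)
    y[i] <- sample(candidates, 1, prob = probs[names(probs) %in% candidates])
    prev <- y[i]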
  }
  return(y)
}

In this case, a and b have a high probability of occurring after each other. And indeed:

random_non_consecutive(40,letters[1:3])
 [1] "c" "a" "b" "a" "b" "a" "b" "c" "b" "a" "b" "a" "b" "a" "b" "a" "b" "a" "b" "a" "b" "a" "b" "a" "c" "a" "b" "a" "b" "c" "a" "b" "a" "b" "a" "b" "a" "b" "a" "c"
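One way to check this more systematically is to count how often each letter follows each other letter. The sketch below does that for the 40 draws printed above; the variable names (y, lv, counts) are only illustrative and not part of the original function.

```r
# The 40 draws shown in the output above, copied verbatim.
y <- c("c", "a", "b", "a", "b", "a", "b", "c", "b", "a",
       "b", "a", "b", "a", "b", "a", "b", "a", "b", "a",
       "b", "a", "b", "a", "c", "a", "b", "a", "b", "c",
       "a", "b", "a", "b", "a", "b", "a", "b", "a", "c")

# Pair each element with its successor and count the transitions.
# Factors with fixed levels keep all three letters in the table,
# even where a transition never occurs.
lv <- c("a", "b", "c")
counts <- table(
  previous = factor(head(y, -1), levels = lv),
  current  = factor(tail(y, -1), levels = lv)
)
print(counts)

# Row-wise proportions give an empirical estimate of P(current | previous).
print(round(prop.table(counts, margin = 1), 2))
```

The diagonal of the table is zero because a letter never follows itself in this sample, which matches the zero weights in the rules. With only 39 transitions the proportions are a rough estimate, so a longer sample is needed to compare them meaningfully with 0.9 and 0.1.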

Hope this helps.

Answer 2:

If you are trying to string a series of random letters into a long character vector where each letter is conditioned on the previous one, then you cannot do it with a single call to sample: prob is a single vector of weights, so every draw would be sampled with the same weights. It can be done, however, as follows:

library(tibble)
library(dplyr)
library(magrittr)

letters <- c('a', 'b', 'c')
marg <- c(11.29, 4.68, 4.39)
cond <- tribble(
  ~letters, ~a,   ~b,   ~c,
  'a',      0.02, 2.05, 3.85,
  'b',      8.47, 0.90, 0.06,
  'c',     13.19, 0.02, 1.76)

n <- 1000
s <- character(n)  # pre-allocate the result; rep(NULL, n) is just NULL

s[1] <- sample(x = letters, size = 1, prob = marg)
for (i in (2:n)) {
  p <- cond %>%
    filter(letters == s[i-1]) %>%
    select(-letters)

  # unlist() turns the one-row tibble into a plain numeric weight vector
  s[i] <- sample(x = letters, size = 1, prob = unlist(p))
}

head(s)
# [1] "a" "c" "c" "a" "c" "a"

This is not an efficient way to approach the problem, as it takes several seconds to create a character vector of length 1000.
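Most of that time is likely spent in the filter/select calls inside the loop. A possible alternative, sketched here in base R, stores the same weights in a plain matrix and looks up the row for the previous letter by integer index. The helper name sample_chain and its arguments are hypothetical and not part of the original answer; states is used in place of letters to avoid masking R's built-in constant.

```r
# Hypothetical helper (not from the original answer).
# n      : length of the sequence to generate
# states : character vector of possible values
# init   : weights for the first draw
# trans  : weight matrix; row i holds the weights for the next state
#          given that the current state is states[i]
sample_chain <- function(n, states, init, trans) {
  k <- length(states)
  idx <- integer(n)
  idx[1] <- sample.int(k, 1, prob = init)
  # seq_len(n)[-1] is empty when n == 1, so the loop is skipped safely
  for (i in seq_len(n)[-1]) {
    idx[i] <- sample.int(k, 1, prob = trans[idx[i - 1], ])
  }
  states[idx]
}

# Same weights as in the tibble above, stored row by row.
states <- c("a", "b", "c")
marg <- c(11.29, 4.68, 4.39)
cond <- matrix(c( 0.02, 2.05, 3.85,
                  8.47, 0.90, 0.06,
                 13.19, 0.02, 1.76),
               nrow = 3, byrow = TRUE,
               dimnames = list(states, states))

set.seed(1)  # arbitrary seed, only to make the run repeatable
s <- sample_chain(1000, states, marg, cond)
print(head(s))
```

The weights do not need to sum to one, because sample and sample.int normalise prob internally. The loop itself remains, since each draw depends on the one before it.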

Source: https://stackoverflow.com/questions/48132250/how-to-find-probability-in-r

Tags

probability