Question
How could you use the prob argument of the sample function in R so that it covers conditional probability as well? For example, with the letters of the alphabet: the probability of generating the letter 'b' depends on the previous letter generated.
Thanks!
Answer 1:
The question is still a bit difficult to answer; it would help if you gave a concrete example of the conditional probabilities (see also here).
But if we assume that we have a, b, and c, and that a and b have a high probability of occurring after each other, we can build on the answer from here and modify the function as follows:
# n = number of elements
# sample_from = draw random numbers from this range
random_non_consecutive <- function(n = 10, sample_from = seq(1, 5))
{
  y = rep(NA, n)
  prev = -1 # change this if -1 is in your range, to e.g. max(sample_from)+1
  probs = rep(1, length(sample_from)); names(probs) = sample_from
  for (i in seq(n)) {
    # your conditional probability rules should go here.
    if (prev == "a")
    {
      probs = c('a' = 0, 'b' = 0.9, 'c' = 0.1)
    }
    if (prev == "b")
    {
      probs = c('a' = 0.9, 'b' = 0, 'c' = 0.1)
    }
    if (prev == "c")
    {
      probs = c('a' = 0.9, 'b' = 0.1, 'c' = 0)
    }
    y[i] = sample(setdiff(sample_from, prev), 1, prob = probs[names(probs) %in% setdiff(sample_from, prev)])
    prev = y[i]
  }
  return(y)
}
In this case, a and b have a high probability of occurring after each other. And indeed:
random_non_consecutive(40,letters[1:3])
[1] "c" "a" "b" "a" "b" "a" "b" "c" "b" "a" "b" "a" "b" "a" "b" "a" "b" "a" "b" "a" "b" "a" "b" "a" "c" "a" "b" "a" "b" "c" "a" "b" "a" "b" "a" "b" "a" "b" "a" "c"
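To check that claim rather than eyeball it, one option is to tabulate which letter follows which. The sketch below does this for the output printed above; the names y and trans are only illustrative, and the vector is copied by hand from that run rather than regenerated.

```r
# Output copied from the run above (40 letters).
y <- c("c","a","b","a","b","a","b","c","b","a",
       "b","a","b","a","b","a","b","a","b","a",
       "b","a","b","a","c","a","b","a","b","c",
       "a","b","a","b","a","b","a","b","a","c")

# Pair each letter with its successor and count the transitions:
# rows = previous letter, columns = next letter.
trans <- table(prev = head(y, -1), nxt = tail(y, -1))
print(trans)

# Row-wise proportions are a rough estimate of the conditional probabilities.
print(round(prop.table(trans, margin = 1), 2))
```

With only 39 transitions the proportions are noisy, but the zero diagonal (no letter follows itself) and the dominance of a-to-b and b-to-a pairs should be visible.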
Hope this helps.
Answer 2:
If you are trying to string a series of random letters into a long character vector where the current letter is conditioned on the previous one, then you cannot do it with a single call to sample, as prob is a vector of weights and every draw would be sampled with the same weight vector. It can be done, however, as follows:
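library(tibble)
library(dplyr)
library(magrittr)
letters <- c('a', 'b', 'c')  # note: masks the built-in letters vector
marg <- c(11.29, 4.68, 4.39)  # weights for the first letter
# row = previous letter, columns = weights for the next letter
cond <- tribble(
  ~letters, ~a, ~b, ~c,
  'a', 0.02, 2.05, 3.85,
  'b', 8.47, 0.90, 0.06,
  'c', 13.19, 0.02, 1.76)
n <- 1000
s <- character(n)  # preallocate; rep(NULL, n) would just be NULL
s[1] <- sample(x = letters, size = 1, prob = marg)
for (i in (2:n)) {
  p <- cond %>%
    filter(letters == s[i-1]) %>%
    select(-letters)
  # turn the one-row tibble into a plain numeric weight vector
  s[i] <- sample(x = letters, size = 1, prob = unlist(p))
}
head(s)
# [1] "a" "c" "c" "a" "c" "a"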
This is not an efficient way to approach the problem, as it takes several seconds to create a 1000-element character vector.
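Much of that time likely goes into running a dplyr filter and select on every iteration. A minimal base R sketch, assuming the same weights are stored in a plain matrix with the previous letter as the row name, could look like the following. The names states and sample_chain are made up for this illustration and do not come from the answer.

```r
# Same weights as above; sample() rescales them, so rows need not sum to 1.
states <- c("a", "b", "c")
marg <- c(a = 11.29, b = 4.68, c = 4.39)   # weights for the first letter
cond <- matrix(
  c( 0.02, 2.05, 3.85,
     8.47, 0.90, 0.06,
    13.19, 0.02, 1.76),
  nrow = 3, byrow = TRUE,
  dimnames = list(states, states)          # row = previous, column = next
)

# Hypothetical helper: draw a chain of length n, one letter at a time,
# looking up the weight row for the previous letter by name.
sample_chain <- function(n, states, marg, cond) {
  s <- character(n)
  s[1] <- sample(states, size = 1, prob = marg)
  for (i in seq_len(n)[-1]) {
    s[i] <- sample(states, size = 1, prob = cond[s[i - 1], ])
  }
  s
}

set.seed(1)  # only to make the run repeatable
s <- sample_chain(1000, states, marg, cond)
head(s)
```

The loop is still needed because each draw depends on the previous one, but indexing a matrix row by name is much cheaper than a data-frame pipeline per step.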
Source: https://stackoverflow.com/questions/48132250/how-to-find-probability-in-r