R simulations

GOAL: simulate data for testing.

Simulate rolling a die 4 times:

sample(1:6, 4, replace = TRUE)

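replace = TRUE puts each drawn value back after every roll (sampling with replacement), so the same value can come up more than once.
You can set the probability of each outcome with the prob argument.
E.g. flip a coin 100 times with 30% probability of tails (0) and 70% probability of heads (1):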

sample(c(0,1), 100, replace = TRUE, prob = c(0.3, 0.7))
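
A quick way to check that prob behaves as expected is to tabulate the simulated flips. A minimal sketch (the variable name flips is only an illustrative choice):

```r
flips <- sample(c(0, 1), 100, replace = TRUE, prob = c(0.3, 0.7))
table(flips)   # counts of tails (0) and heads (1)
mean(flips)    # share of heads, should land near 0.7
```

The exact counts change on every run because the draws are random.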

Each probability distribution has its own simulation function, named r followed by a short name for the distribution (rbinom, rnorm, rpois, ...).
E.g. binomial (the same coin as in the previous example):

rbinom(1, size = 100, prob = 0.7)   #number of heads in 100 flips, probability of heads 70%
rbinom(100, size = 1, prob = 0.7)   #outcomes (0 or 1) of 100 single flips
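
The two calls describe the same experiment at different levels of detail: summing 100 single flips gives a head count, which is the kind of value the first call returns directly. A small sketch (flips and heads are illustrative names):

```r
flips <- rbinom(100, size = 1, prob = 0.7)   # 100 single flips, each 0 or 1
heads <- sum(flips)                          # number of heads among them
heads                                        # a count between 0 and 100, like rbinom(1, size = 100, prob = 0.7)
```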

E.g. normal:

rnorm(10)   #10 random numbers from the standard normal (mean 0, sd 1)
rnorm(10, mean = 100, sd = 25)   #10 random numbers from a normal with mean 100 and sd 25
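
With only 10 values the sample mean and sd can be far from the parameters. A rough check, assuming a larger sample size of 10000 (an arbitrary choice) and an illustrative variable name x:

```r
x <- rnorm(10000, mean = 100, sd = 25)   # large sample so the estimates are stable
mean(x)   # close to 100
sd(x)     # close to 25
```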

replicate() repeats an operation n times.
E.g. simulate 100 groups of random numbers, each with 5 values drawn from a Poisson with mean 10:

replicate(100, rpois(5, 10))

#colMeans() gives the average of each column (one column per group):
colMeans(replicate(100, rpois(5, 10)))

#histogram: the group means are roughly normally distributed (central limit theorem)
hist(colMeans(replicate(100, rpois(5, 10))))
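
Each replicate() call above draws a fresh simulation, so the printed means and the histogram come from different random numbers. To work on a single simulation, store it first. A sketch (sims and group_means are illustrative names):

```r
sims <- replicate(100, rpois(5, 10))   # matrix: 5 rows (values) x 100 columns (groups)
dim(sims)                              # 5 100
group_means <- colMeans(sims)          # one mean per group
hist(group_means)                      # histogram of the 100 group means
```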

With set.seed() you can reproduce the same sample every time:

set.seed(125)   #125 is an arbitrary integer
sample(1:6, 4, replace = TRUE)

If you call set.seed(125) right before sample(), the results will always be the same.
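
The seed has to be reset before each draw that should match. A short sketch of the difference (first, second and third are illustrative names):

```r
set.seed(125)
first <- sample(1:6, 4, replace = TRUE)

set.seed(125)                              # reset the seed before drawing again
second <- sample(1:6, 4, replace = TRUE)
identical(first, second)                   # TRUE

third <- sample(1:6, 4, replace = TRUE)    # no reset: continues the random stream, so it usually differs
```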