To start a set of simulations in the AlphaSimR package
(Gaynor, Gorjanc, and Hickey 2021), four
steps must be implemented, as follows:
In this vignette, we will cover a few useful functions from AlphaSimR that will help to model a the breeding program. In addition, we will compare three pipelines over a few generations of selection.
Imagine that we got hired by a breeding company. In your first day, your team introduced you to the pipeline below:
Then, your manager mentioned that the you will implement changes in the pipeline and they gave you two options:
As there are no budget for implementing both, your task will be to choose each strategy is the best and why.
So, in the next topics, and following the 4 steps before mentioned, we will implement the pipeline and the scenarios to leverage evidence for a data-drive decision.
We will start by loading the packages.
# Loading packages
# install.packages("AlphaSimR") # Install if you do not have it
library(AlphaSimR)## Loading required package: R6
Let’s start the simulation itself by creating the base genome.
set.seed(52684)
# Creating the founding genome
founderGenomes = runMacs(nInd = 100, # Number of individuals that compose the genome
nChr = 10, # Number of chromosome pairs of the target species
segSites = 50, # number of segregation sites
species = "MAIZE") # We can use the base parameters and we have available in the package MAIZE, WHEAT, CATTLE, and GENERIC.
# Object created
founderGenomes## An object of class "MapPop"
## Ploidy: 2
## Individuals: 100
## Chromosomes: 10
## Loci: 500
With the founder genomes in perspective, we can proceed to add the characteristics of the target trait or traits to the simulation. AlphaSimR has a vignette that we recommend you take a look at link. It gives, thoroughly, an explanation on the traits of interest and how to interpret its effects.
We will start with a trait with only additive effects. For such, we have the following parameters:
set.seed(6985)
# Global simulation parameters from founder genomes.
SP = SimParam$new(founderGenomes)
# Additive trait
SP$addTraitA(nQtlPerChr = 20, # Number of QTL per chromosome
mean = 10, # Trait additive mean
var = 10) # Trait additive variance or varA
SP$setVarE(varE=20) # trait heritability
#
# varA 10
# h2 = ------------------- = ------------- = 0.33
# varA + varE/nreps 10 + 20
#
# QTL effects (for the traits)
SP$traits[[1]]@addEff## [1] -0.461665280 0.401970953 -0.162468487 0.102462278 -0.400742643
## [6] -0.277532766 -0.848304391 0.146805432 0.863171779 -0.016736397
## [11] -0.696456841 -0.038287308 0.383747681 -0.364386064 0.482625385
## [16] 0.214603293 0.278871869 0.368198182 -0.131123793 0.249230379
## [21] 0.442916834 -0.719218994 0.397263153 -0.307487543 0.217171343
## [26] -0.706694241 -0.047641081 -0.474597289 -0.295749820 0.729934755
## [31] -0.139512868 -0.501969770 0.145528117 0.844087847 0.511460312
## [36] 0.117810361 0.578736766 0.314206951 0.078945197 -0.001422712
## [41] 0.327342163 -0.254465172 0.020528076 -0.455401968 -0.081499700
## [46] 0.084264300 -0.022395934 0.457563053 -0.299016664 -0.006928243
## [51] -0.227359777 -0.319400423 -0.398409399 -0.052149974 -0.098296094
## [56] 0.273530298 1.155093323 0.066288692 -0.371035847 0.382634548
## [61] 0.400570776 0.045305685 -0.267556246 0.129471951 0.231982597
## [66] -0.520190901 -0.183675692 -0.553238070 0.187815156 -0.966631627
## [71] -0.550095604 -0.230195321 0.233429304 0.209879458 -0.286482424
## [76] -0.094343547 -0.260738929 -0.303447653 -0.037984336 -0.673131397
## [81] 0.537570019 0.235772621 0.710480986 0.307839374 -0.510079519
## [86] -0.656417466 -0.032579917 -0.019180429 0.215889572 0.199177047
## [91] -0.784688428 0.628416729 0.648299176 0.362468757 -0.008699142
## [96] -0.187075833 0.255910977 0.903406197 -1.053598567 -0.676564539
## [101] -0.834180487 0.886379779 -0.309572693 1.075586514 -0.720294771
## [106] 0.146918546 0.297383055 -0.481376470 0.186993125 -0.403558918
## [111] -0.117300799 -0.660296997 -0.332194465 0.615816408 0.254277603
## [116] 0.576829933 0.586953108 -0.410550839 0.507214170 0.072226681
## [121] -0.335943864 -0.777337079 0.851235589 0.067308575 0.230557604
## [126] 0.373773069 -0.082035295 0.437240692 -0.231714414 -0.282242309
## [131] -0.521099012 0.166914461 -0.431821654 0.006854767 -0.739023472
## [136] -0.195977320 0.001723814 0.004405849 0.391183317 -0.264211586
## [141] 0.175165737 0.180579075 -0.396543956 0.198599954 0.666161745
## [146] -0.426867869 0.073107332 1.024700914 -0.126605618 -0.172316699
## [151] 0.262498175 -0.492893522 0.208737381 0.303638980 -0.471008935
## [156] 0.747689623 -0.086393315 -0.366649436 0.366224632 0.260506182
## [161] -0.743601498 -0.214322527 -0.293096287 0.304135959 -0.443315461
## [166] 0.069288014 0.241518307 -0.057181138 -0.623430693 -0.345305256
## [171] -0.198929640 -0.356346590 -0.196815322 -0.403088367 -0.398741540
## [176] -0.295746876 0.392189742 -0.146491973 0.122058360 -0.541053558
## [181] 0.376497593 -0.452790977 0.045916966 -0.945953444 0.148775350
## [186] -0.093564627 0.108476833 -0.494405994 -0.235406109 0.037356424
## [191] 0.384697106 -0.323549286 0.556348711 0.057812811 -0.368637277
## [196] 0.336914430 -0.069067613 -0.390398499 -0.056477812 -0.128204974
# Generating base population
basePop = newPop(founderGenomes)
# Gen param
# genParam(basePop)
# Parents
Parents = basePopThe implementation of traits in AlphaSimR follows a biological model, which is responsible for converting into a genetic value each individual genotype before created. In a straightforward way, the genetic value is used to create the individuals’ phenotypes. The biological effects presented in AlphaSimR are:
A: additive effect
D: dominance effect
G: genotype by environment interaction effect
E: environmental effect
So, we can create traits with the combinations of those effects (assuming that all of them as, at least, additive) using the ADGE framework, as it follows:
# Traits that can be created in AlphaSimR:
SP$addTraitA()
SP$addTraitAD()
SP$addTraitADG()
SP$addTraitADEG()
SP$addTraitAG()
SP$addTraitAE()
SP$addTraitAEG()It is important to have in mind that for dominance effects (D) we set the mean of the dominance degree (0-1) and variance, whereas for genotype by environment effect (G) and environmental effect (E) we have just to adjust the variance.
First, at the beginning of each breeding cycle, we generate new crosses. We can implement random crosses as well as make oriented crosses by giving the list of crosses to make.
#------------- Year 1
F1pop = randCross(pop = Parents, nCrosses = 50, nProgeny = 2) # Crossing block
#------------- Year 2
DHpop = makeDH(pop=F1pop, nDH=10) # Doubled Haploids
#------------- Year 3
rowStage = setPheno(pop=DHpop, h2 = 0.1, reps = 1) # h2 = 0.1 (visual selection)
PrelimTrial = selectWithinFam(rowStage, nInd = 3, use = "pheno")
#------------- Year 4
PrelimTrial = setPheno(PrelimTrial, varE = 20, reps = 1) # h2 = 10/(10+20/1) = 0.33
AdvancTrial = selectInd(PrelimTrial, nInd = 50, use = "pheno") # Preliminary Trial
#------------- Year 5
AdvancTrial = setPheno(AdvancTrial, varE = 20, reps = 5) # h2 = 10/(10+20/5) = 0.71
EliteTrial = selectInd(AdvancTrial, nInd = 20, use = "pheno") # Advanced Trial
#------------- Year 6
EliteTrial = setPheno(EliteTrial, varE = 20, reps = 20) # h2 = 10/(10+20/20) = 0.90
Variety = selectInd(EliteTrial, nInd = 2, use = "pheno") # Elite Trial
#------------- Year 7
# Release varieties# Allocate vectors
nYears = 20 + 1 # +1 to store starting generation
meanGAll = numeric(nYears)
varGAll = numeric(nYears)
# Save the starting values
meanGAll[1] = meanG(basePop)
varGAll[1] = varG(basePop)
# From the same population
Parents = basePop
# Loop
for (i in 2:nYears){
set.seed(1)
#------------- Year 1
F1pop = randCross(pop = Parents, nCrosses = 50, nProgeny = 2) # Crossing block
#------------- Year 2
DHpop = makeDH(pop=F1pop, nDH = 10) # Doubled Haploids
#------------- Year 3
rowStage = setPheno(pop=DHpop, h2 = 0.1, reps = 1)
PrelimTrial = selectWithinFam(rowStage, nInd = 3, use = "pheno")
#------------- Year 4
PrelimTrial = setPheno(PrelimTrial, varE = 20, reps = 1)
AdvancTrial = selectInd(PrelimTrial, nInd = 50, use = "pheno") # Preliminary Trial
#------------- Year 5
AdvancTrial = setPheno(AdvancTrial, varE = 20, reps = 5)
EliteTrial = selectInd(AdvancTrial, nInd = 20, use = "pheno") # Advanced Trial
#------------- Year 6
EliteTrial = setPheno(EliteTrial, varE = 20, reps = 20)
Variety = selectInd(EliteTrial, nInd = 2, use = "pheno") # Elite Trial
#------------- Year 7
# Release varieties
# Updating the parents
Parents = AdvancTrial
# Track performance
meanGAll[i] = meanG(Parents)
varGAll[i] = varG(Parents)
}
# Track performance
meanG_Scen1 = meanGAll
varG_Scen1 = varGAll# Allocate vectors
nYears = 20 + 1 # +1 to store starting generation
meanGAll = numeric(nYears)
varGAll = numeric(nYears)
# Save the starting values
meanGAll[1] = meanG(basePop)
varGAll[1] = varG(basePop)
# From the same population
Parents = basePop
# Loop
for (i in 2:nYears){
set.seed(11)
#------------- Year 1
F1pop = randCross(pop = Parents, nCrosses = 70, nProgeny = 2) # Crossing block
#------------- Year 2
DHpop = makeDH(pop=F1pop, nDH=10) # Doubled Haploids
#------------- Year 3
rowStage = setPheno(pop=DHpop, h2 = 0.1, reps = 1)
PrelimTrial = selectWithinFam(rowStage, nInd = 3, use = "pheno")
#------------- Year 4
PrelimTrial = setPheno(PrelimTrial, varE = 20, reps = 1)
AdvancTrial = selectInd(PrelimTrial, nInd = 50, use = "pheno") # Preliminary Trial
#------------- Year 5
AdvancTrial = setPheno(AdvancTrial, varE = 20, reps = 5)
EliteTrial = selectInd(AdvancTrial, nInd = 20, use = "pheno") # Advanced Trial
#------------- Year 6
EliteTrial = setPheno(EliteTrial, varE = 20, reps = 20)
Variety = selectInd(EliteTrial, nInd = 2, use = "pheno") # Elite Trial
#------------- Year 7
# Release varieties
# Updating the parents
Parents = AdvancTrial
# Track performance
meanGAll[i] = meanG(Parents)
varGAll[i] = varG(Parents)
}
# Track performance
meanG_Scen2 = meanGAll
varG_Scen2 = varGAll# Allocate vectors
nYears = 20 + 1 # +1 to store starting generation
meanGAll = numeric(nYears)
varGAll = numeric(nYears)
# Save the starting values
meanGAll[1] = meanG(basePop)
varGAll[1] = varG(basePop)
# From the same population
Parents = basePop
# Loop
for (i in 2:nYears){
set.seed(111)
#------------- Year 1
F1pop = randCross(pop = Parents, nCrosses = 50, nProgeny = 2) # Crossing block
#------------- Year 2
DHpop = makeDH(pop=F1pop, nDH=14) # Doubled Haploids
#------------- Year 3
rowStage = setPheno(pop=DHpop, h2 = 0.1, reps = 1)
PrelimTrial = selectWithinFam(rowStage, nInd = 3, use = "pheno")
#------------- Year 4
PrelimTrial = setPheno(PrelimTrial, varE = 20, reps = 1)
AdvancTrial = selectInd(PrelimTrial, nInd = 50, use = "pheno") # Preliminary Trial
#------------- Year 5
AdvancTrial = setPheno(AdvancTrial, varE = 20, reps = 5)
EliteTrial = selectInd(AdvancTrial, nInd = 20, use = "pheno") # Advanced Trial
#------------- Year 6
EliteTrial = setPheno(EliteTrial, varE = 20, reps = 20)
Variety = selectInd(EliteTrial, nInd = 2, use = "pheno") # Elite Trial
#------------- Year 7
# Release varieties
# Updating the parents
Parents = AdvancTrial
# Track performance
meanGAll[i] = meanG(Parents)
varGAll[i] = varG(Parents)
}
# Track performance
meanG_Scen3 = meanGAll
varG_Scen3 = varGAll# Plot mean of genetic values over time
meanRanges = range(c(meanG_Scen1, meanG_Scen2, meanG_Scen3))
plot(x = 1:nYears, y = meanG_Scen1, type = "l", col = "black", lwd = 3,
xlab = "Years", ylab = "Genetic values", ylim = meanRanges)
lines(x = 1:nYears, y = meanG_Scen2, type = "l", col = "blue", lty = 2, lwd = 3)
lines(x = 1:nYears, y = meanG_Scen3, type = "l", col = "grey", lty = 2, lwd = 3)
legend(x = "topleft", legend = c('Scen1','Scen2','Scen3'), title = "Scenario",
lwd = 3, lty = c(1, 2,2), col = c("black", "blue", 'grey'), bty = "n")professor, University of Florida, mresende@ufl.edu↩︎
Post doc, University of Florida, deamorimpeixotom@ufl.edu↩︎