AlphaSimR: Practical implementation of genomic selection in AlphaSimR
In this vignette, we will present how to use AlphaSimR objects to deploy genomic selection (GS).
Genomic selection is a method that leverages both phenotype and genome data from historical individuals to develop a predictive model. This model is then applied to new individuals, utilizing their genome data alongside the prediction model to forecast their phenotype performance. The foundation of this predictive model lies in the inferred associations between variations in phenotypes and the corresponding variations along the genome. These associations are calculated for numerous positions in the genome, commonly referred to as genomic markers. The prevailing choice for such markers is often bi-allelic Single Nucleotide Polymorphisms (SNPs), a type of genetic variation that AlphaSimR emulates. Once these associations between phenotype and genome are established, it becomes possible to predict the performance of any individual genotyped for the selected set of SNP markers.
The first implementation will use one additive trait. Let’s create
the founder genome and also a base population and assume the trait
characteristics. In addition to the argument that we have been
describing, we need to use the function addSnpChip to add
the SNPs for each individual.
## Warning: package 'AlphaSimR' was built under R version 4.2.3
## Carregando pacotes exigidos: R6
# Founder haplotypes
founderPop = runMacs(nInd=1000, nChr=10, segSites=600)
# trait parameters
SP = SimParam$
new(founderPop)$
addTraitA(100)$
addSnpChip(500)$ # Number of SNPs per pair of chromosome
setVarE(h2=0.4)
# Split the founders into two population
pop1 = newPop(founderPop[1:500])
pop2 = newPop(founderPop[501:1000])
#SNPs
SNPop = pullSnpGeno(pop1)
SNPop[1:10,1:10]## 1_1 1_3 1_4 1_7 1_8 1_9 1_10 1_11 1_12 1_13
## 1 0 0 0 0 0 0 2 2 0 0
## 2 0 0 1 1 1 0 1 1 1 0
## 3 0 0 0 0 0 0 2 2 0 0
## 4 0 1 1 0 0 0 1 0 0 2
## 5 0 2 0 0 0 0 2 0 0 2
## 6 0 0 2 2 2 0 0 0 2 0
## 7 0 1 1 0 0 0 1 0 0 1
## 8 0 1 0 0 0 0 2 2 0 0
## 9 0 0 1 1 1 0 0 0 1 0
## 10 0 1 1 0 0 0 0 0 0 1
Now, we can create a large F1 population derived from the initial
individuals and fit a genomic selection model using the function
RRBLUP() from AlphaSimR. We may use the
phenotypes assumed out of setPheno() function and, together
with the marker’s information, create a model that will give us the
marker’s effects, as follows:
Now that we have a calibration, we can deploy genomic selection by using the marker effects to predict, together with the genotypes of the target population, the estimated breeding value of the individuals.
## est_GV_Trait1
## [1,] 0.1242688
## [2,] -1.2727601
## [3,] 0.7484424
## [4,] -1.2779063
## [5,] 0.9605354
## [6,] -0.2876188
We can measure how good was the prediction by comparing the actual true genetic value (gv) with the prediction (ebv) for each individual.
## est_GV_Trait1
## Trait1 0.8384834
## est_GV_Trait1
## Trait1 0.6007227
## est_GV_Trait1
## Trait1 0.7101821
In this part of the vignette, we will implement the genomic selection in a Maize pipeline. Two scenarios will be implemented after a fifteen-year Burn-In period. The first scenario will account only for truncated phenotypic selection. In the second scenario, we will implement genomic selection in the pipeline. So, in year 4, the individuals will be selected based on the estimated breeding values that came from a genomic model (ridge regression model). In addition, the parents will be selected based on ‘ebv’.
for(year in 1:burninYears){
p = P[year]
cat("Working on year:",year,"\n")
source("aux_files/UpdateParents.R") # Pick new parents based on last year's data
source("aux_files/UpdateTesters.R") # Pick new testers and hybrid parents
source("aux_files/AdvanceYearPS.R") # Advances yield trials by a year
source("aux_files/WriteRecordsGS.R") # Write records for GS predictions
source("aux_files/UpdateResults.R") # Track summary data
}
# Saving the info
save.image("results/BURNIN.RData")# 3.0 Loading the scenarios
load("results/BURNIN.RData")
for(year in (burninYears+1):(burninYears+futureYears)){
p = P[year]
# 3.1 Loop
cat("Working on year:",year,"\n")
source("aux_files/UpdateParents.R") # Pick new parents based on last year's data
source("aux_files/UpdateTesters.R") # Pick new testers and hybrid parents
source("aux_files/AdvanceYearPS.R") # Advances yield trials by a year
source("aux_files/UpdateResults.R") # Track summary data
}
# 3.2 Recording results
output1 = data.frame(scenario=rep("PS", 35), # Scenario name
MeanG_pop,
MeanA_pop,
VarG_pop,
stringsAsFactors=FALSE)
# 3.3 Saving the results as RDS
saveRDS(output1,"results/Results_PS.rds")# 3.0 Loading the scenarios
load("results/BURNIN.RData")
for(year in (burninYears+1):(burninYears+futureYears)){
p = P[year]
cat("Working on year:",year,"\n")
if(year == (burninYears+1)){
source('aux_files/fillGS.R')
}
source("aux_files/UpdateParents_GS.R") # Pick new parents based on last year's data
source("aux_files/UpdateTesters.R") # Pick new testers and hybrid parents
source("aux_files/AdvanceYearGS.R") # advance the populations
source("aux_files/WriteRecordsGS.R") # Track records for GS
source("aux_files/UpdateResults.R") # Track summary data
}
# 3.2 Recording results
output1 = data.frame(scenario=rep("GS", 35), # Scenario name
MeanG_pop,
MeanA_pop,
VarG_pop,
stringsAsFactors=FALSE)
# 3.3 Saving the results as RDS
saveRDS(output1,"results/Results_GS.rds")# Loading the results
Scenario1 = readRDS("results/Results_PS.rds")
Scenario2 = readRDS("results/Results_GS.rds")
# Plot hybrid mean of genetic values over time
meanRanges = range(c(Scenario1$MeanG_pop[15:35], Scenario2$MeanG_pop[15:35]))
plot(x = 15:35, y = Scenario1$MeanG_pop[15:35], type = "l", col = "black", lwd = 3,
xlab = "Year", ylab = "Mean of genetic values", ylim = meanRanges)
lines(x = 15:35, y = Scenario2$MeanG_pop[15:35], type = "l", col = "blue", lwd = 3)
legend(x = "topleft", legend = c('Pheno', 'GS'), title = "Scenarios",
lwd = 3, lty = c(1, 1), col = c("black", "blue"), bty = "n")University of Florida, mresende@ufl.edu↩︎
University of Florida, deamorimpeixotom@ufl.edu↩︎