Introduction

Mate selection is the process of choosing mating pairs or groups — that is, the simultaneous selection and mate allocation of individuals entering a breeding program. An extension of optimum contribution selection that delivers a practical crossing plan is to jointly optimize contributions and cross allocations (Kinghorn et al., 2009; Woolliams, 2015; Gorjanc et al., 2018).

In optimal cross selection, we combine two pieces of information for each candidate cross:

Criterion: a measure of cross performance (e.g., mid-parental value, total genetic value, or usefulness criterion).
relatedness: a measure between the parents, used to manage inbreeding and maintain diversity.

The goal is to find a crossing plan that maximizes genetic gain while constraining the rate of inbreeding to an acceptable level.

Managing Genetic Diversity

Genetic variation is crucial for long-term genetic progress. Unconstrained selection maximizes short-term gain but erodes diversity and increases inbreeding, reducing future response.

Animal breeders commonly target the rate of inbreeding per generation (\(\Delta F\)), with a typical guideline of \(\Delta F \approx 0.25\)–\(0.5\%\) per generation, equivalent to an effective population size (\(N_e\)) of 50–100.

Inbreeding (\(F\)) can be estimated via:

Relative decrease in heterozygosity or expected heterozygosity.
Runs of homozygosity (ROH).
Genomic relationship matrices (A: tracking IBD and G: tracking IBS).

In a breeding scenario, \(F\) and \(\Delta F\) relates as follows:

Key consideration: If \(\Delta F\) is controlled and kept constant, the future trajectory of inbreeding is similar regardless of the current level of \(F\). This is why breeders target \(\Delta F\) as the management variable, not \(F\) itself.

The optimal cross selection framework balances genetic gain against diversity loss, avoiding both the “risky” quadrant (high gain, low diversity) and the “undesirable” quadrant (low gain, low diversity).

Below, we will implement two algorithms to optimize crossing plan based on a criterion (total genetic value) and inbreeding (coming from a realized relationship matrix). The implementations are from SimpleMating and COMA packages. A base population of individuals will be created and individuals derived from that using AlphaSimR package.

Optimization of crosses

In this first exercise, we set up a simulated breeding population using AlphaSimR, compute the total genetic values for all possible crosses, and select an optimized mating plan.

Setting up the population

rm(list = ls())

# Loading packages
# install.packages("COMA") # if needed istall it
library(COMA)
library(AlphaSimR)
library(AGHmatrix)
library(SimpleMating)

set.seed(52684)

# Creating the founding genome
founderGenomes <- runMacs(nInd = 200,
                          nChr = 10,
                          segSites = 500)

# Global simulation parameters from founder genomes
SP <- SimParam$new(founderGenomes)

# Additive + dominance trait
SP$addTraitAD(nQtlPerChr = 200,
              mean = 10,
              var = 10,
              meanDD = 0.9,
              varDD = 0.3)

SP$addSnpChip(300)
SP$setVarE(varE = 20)

# Generating base population in HWE
Parents <- newPop(founderGenomes)

Creating population structure

We advance the population through 100 generations of selection to create linkage disequilibrium and population structure, mimicking a real breeding population.

ParentsPop <- Parents

for (year in 1:100) {
  set.seed(6985)
  ParentsPop <- setPheno(ParentsPop, h2 = 0.4)
  ParentsPop <- selectCross(pop = ParentsPop, nInd = 10, nCrosses = 150)
}

Checking inbreeding levels

We can compare inbreeding levels across different population states using the genomic relationship matrix. Note that the G matrix (IBS) from the doubled-haploid population indicates and F = 1.

# HWE (founder population)
relMat_HWE <- AGHmatrix::Gmatrix(pullSnpGeno(Parents))

## Initial data: 
##  Number of Individuals: 200 
##  Number of Markers: 3000 
## 
## Missing data check: 
##  Total SNPs: 3000 
##   0 SNPs dropped due to missing data threshold of 0.5 
##  Total of: 3000  SNPs 
## 
## MAF check: 
##  No SNPs with MAF below 0 
## 
## Heterozigosity data check: 
##  No SNPs with heterozygosity, missing threshold of =  0 
## 
## Summary check: 
##  Initial:  3000 SNPs 
##  Final:  3000  SNPs ( 0  SNPs removed) 
##  
## Completed! Time = 0.299  seconds

# After 100 generations of selection (no longer in HWE)
relMat_noHWE <- AGHmatrix::Gmatrix(pullSnpGeno(ParentsPop))

## Initial data: 
##  Number of Individuals: 150 
##  Number of Markers: 3000 
## 
## Missing data check: 
##  Total SNPs: 3000 
##   0 SNPs dropped due to missing data threshold of 0.5 
##  Total of: 3000  SNPs 
## 
## MAF check: 
##  No SNPs with MAF below 0 
## 
## Heterozigosity data check: 
##  No SNPs with heterozygosity, missing threshold of =  0 
## 
## Summary check: 
##  Initial:  3000 SNPs 
##  Final:  3000  SNPs ( 0  SNPs removed) 
##  
## Completed! Time = 0.132  seconds

# Doubled haploids from the selected population
ParentsPop_DH <- makeDH(ParentsPop, nDH = 2)
relMat_DH <- AGHmatrix::Gmatrix(pullSnpGeno(ParentsPop_DH))

## Initial data: 
##  Number of Individuals: 300 
##  Number of Markers: 3000 
## 
## Missing data check: 
##  Total SNPs: 3000 
##   0 SNPs dropped due to missing data threshold of 0.5 
##  Total of: 3000  SNPs 
## 
## MAF check: 
##  No SNPs with MAF below 0 
## 
## Heterozigosity data check: 
##  No SNPs with heterozygosity, missing threshold of =  0 
## 
## Summary check: 
##  Initial:  3000 SNPs 
##  Final:  3000  SNPs ( 0  SNPs removed) 
##  
## Completed! Time = 0.334  seconds

Computing Total Genetic Values

Using SimpleMating, we compute the total genetic value (TGV) for all possible half-diallel crosses.

# 1. Creating the mating plan (all pairwise crosses)
CrossPlan <- planCross(TargetPop = ParentsPop@id,
                       MateDesign = "half")

## Number of potential crosses generated: 11175

# 2. Relationship matrix
relMat <- AGHmatrix::Gmatrix(pullQtlGeno(ParentsPop))

## Initial data: 
##  Number of Individuals: 150 
##  Number of Markers: 2000 
## 
## Missing data check: 
##  Total SNPs: 2000 
##   0 SNPs dropped due to missing data threshold of 0.5 
##  Total of: 2000  SNPs 
## 
## MAF check: 
##  No SNPs with MAF below 0 
## 
## Heterozigosity data check: 
##  No SNPs with heterozygosity, missing threshold of =  0 
## 
## Summary check: 
##  Initial:  2000 SNPs 
##  Final:  2000  SNPs ( 0  SNPs removed) 
##  
## Completed! Time = 0.089  seconds

# 3. QTL genotypes
Markers <- pullQtlGeno(ParentsPop)

# 4. Marker effects
add_eff <- SP$traits[[1]]@addEff
dom_eff <- SP$traits[[1]]@domEff

# 5. Compute TGV for all crosses
ST_tgv <- getTGV(MatePlan = CrossPlan,
                 Markers = Markers,
                 addEff = add_eff,
                 domEff = dom_eff,
                 ploidy = 2,
                 K = relMat)

## TGVs predicted for 11175 crosses.

head(ST_tgv, 20)

##    Parent1 Parent2        Y           K
## 1    15051   15052 48.49128  0.27732908
## 2    15051   15053 48.76521  0.28877597
## 3    15051   15054 51.50141 -0.43192639
## 4    15051   15055 49.67416 -0.17181082
## 5    15051   15056 49.03153  0.02022586
## 6    15051   15057 47.73578  0.08679857
## 7    15051   15058 49.88101  0.08634672
## 8    15051   15059 49.18132 -0.14680840
## 9    15051   15060 48.70614  0.33230428
## 10   15051   15061 50.68784 -0.28296617
## 11   15051   15062 48.90147  0.02489498
## 12   15051   15063 48.50213  0.11511457
## 13   15051   15064 47.58813  0.23319830
## 14   15051   15065 49.23301  0.15517869
## 15   15051   15066 48.93509  0.10713187
## 16   15051   15067 50.12453 -0.20268731
## 17   15051   15068 48.83394  0.02203326
## 18   15051   15069 49.18187  0.01887030
## 19   15051   15070 49.17813  0.04824062
## 20   15051   15071 49.44040  0.12294665

plot(ST_tgv$Y, ST_tgv$K,
     xlab = "Total Genetic Value (Y)",
     ylab = "Coancestry (K)",
     main = "Cross Merit vs. Coancestry",
     pch = 16, col = rgb(0, 0, 1, 0.3))

Optimization with SimpleMating

The selectCrosses() function in SimpleMating uses an algorithm to select an optimal subset of crosses that maximizes the criterion while constraining relatedness (culling.pairwise.k).

MatingPlan <- selectCrosses(data = ST_tgv,
                            n.cross = 100,
                            max.cross = 4,
                            culling.pairwise.k = 0)

# Summary statistics
MatingPlan[[1]]

##   culling.pairwise.k target.Y   target.K
## 1                  0 52.03096 -0.5200644

# Selected mating plan
head(MatingPlan[[2]], 6)

##   Parent1 Parent2        Y          K
## 1   15167   15193 52.86967 -0.7126765
## 2   15089   15137 52.85122 -0.4552720
## 3   15167   15200 52.84035 -0.6905358
## 4   15165   15196 52.83736 -0.5909779
## 5   15111   15140 52.82395 -0.5700422
## 6   15059   15137 52.81030 -0.6584544

# Optimization plot
MatingPlan[[3]]

Optimization with COMA

COMA (Convex Optimization for Mate Allocation) uses a Lagrangian approach to optimize mate allocation. It directly constrains the rate of inbreeding (\(\Delta F\)) and finds optimal contribution proportions for each cross.

# 1. Coancestry matrix (halved relationship matrix)
KCoan <- relMat / 2

# 2. Compute TGV with coancestry matrix
OMAtgv <- getTGV(MatePlan = CrossPlan,
                 Markers = Markers,
                 addEff = add_eff,
                 domEff = dom_eff,
                 ploidy = 2,
                 K = KCoan)

## TGVs predicted for 11175 crosses.

OMAtgv$Parent1 <- as.character(OMAtgv$Parent1)
OMAtgv$Parent2 <- as.character(OMAtgv$Parent2)

# 3. Parent constraints
parentsID <- data.frame(id = ParentsPop@id,
                        min = 0,
                        max = 1)

# 4. Mating constraints
matingsID <- data.frame(parent1 = OMAtgv$Parent1,
                        parent2 = OMAtgv$Parent2,
                        merit = OMAtgv$Y,
                        min = 0,
                        max = 1)

# 5. Optimization (5% inbreeding increment constraint)
omaModel <- COMA::oma(dF = 0.025, # 0.25%
                      parents = parentsID,
                      matings = matingsID,
                      ploidy = 2,
                      K = KCoan,
                      dF.adapt = list(step = 0.001, max = 0.2),
                      base = "current")

# 6. Derive 100 crosses from optimal contributions
rep_times <- round(omaModel$om$value * 100)

MatingPlan_COMA <- data.frame(
  Parent1 = rep(omaModel$om$parent1, times = rep_times),
  Parent2 = rep(omaModel$om$parent2, times = rep_times))

head(MatingPlan_COMA, 6)

##   Parent1 Parent2
## 1   15053   15169
## 2   15054   15185
## 3   15061   15140
## 4   15061   15140
## 5   15061   15140
## 6   15061   15140

Breeding Program Simulation: GS vs. OCS

This section compares two breeding strategies over 10 years:

Genomic Selection (GS): Standard pipeline with random mating of selected parents.
Optimal Cross Selection (OCS): Same pipeline but with optimized mate allocation using SimpleMating.

Helper functions

# Inbreeding based on expected heterozygosity
inbCoef = function(Pop){
  
  Markers = pullQtlGeno(Pop)
  p = colMeans(Markers)/Pop@ploidy
  f = sum(p^2)/ncol(Markers)
  
  return(f)
}

# inbreeding based on IBS
calcFibs = function(Markers, refPop, ploidy = NULL){
  
  W_allele <- pullQtlGeno(refPop)
  W = Markers
  
  ##### Allele frequency coming from the reference population
  af = apply(W_allele, 2, function(x) mean(x, na.rm = T)/ploidy) # Allele frequency
  
  freq_p = ifelse(af < 0.5, af, 1-af) # Minor allele frequency
  P_allele <- matrix(rep(af,nrow(W)),ncol=ncol(W),byrow=T) # Freq P matrix
  
  twopq <- ploidy*(t(freq_p)%*%(1-freq_p)) # two*p*q
  
  Z = W - ploidy*P_allele
  Z[is.na(Z)] <- 0
  
  gMatrix <- (tcrossprod(Z,Z))/as.numeric(twopq) # G matrix
  
  return((mean(diag(gMatrix))-1)/(refPop@ploidy - 1))
  
  
}


# Estimation of inbreeding rating through generation
fdF <- function(x) {
  x <- pmin(x, 1)
  nx <- length(x)
  1 - ((1-x)/(1-x[1]))^(1/(1:nx))
}

Simulation setup

rm(list = ls()[!ls() %in% c("inbCoef", "calcFibs", "fdF")])

# Loading packages
library(AlphaSimR)
library(AGHmatrix)
library(SimpleMating)
library(ggplot2)
library(ggpubr)


set.seed(52684)
# Creating the founding genome
founderGenomes = runMacs(nInd = 200, # Number of individuals that compose the genome
                         nChr = 10,   # Number of chromosome pairs of the target species
                         segSites = 10000)  # number of segregation sites
# Global simulation parameters from founder genomes.
SP = SimParam$new(founderGenomes)

# Additive trait
SP$addTraitA(nQtlPerChr = 1000, # Number of QTL per chromosome
             mean = 100, # Trait additive mean
             var = 100) # Trait additive variance or varA
SP$addSnpChip(500)

SP$setVarE(h2=0.5)

# Generating base population
Parents = newPop(founderGenomes)
ParentsPop = Parents
refPop = Parents

Burn-in phase (10 years)

The burn-in establishes a realistic breeding population with a training set for genomic prediction.

for(year in 1:10){
  cat("Working on year:", year, "\n")

#------------- Year 1 
set.seed(6985)
Clones = randCross(pop = ParentsPop, nCrosses = 100) # Crossing block

#------------- Year 2
cloneStage = setPheno(pop = Clones, h2 = 0.1, reps = 1) # h2 = 0.1 (visual selection)
PrelimTrial = selectInd(cloneStage, nInd = 60, use = "pheno") 

#-------------  Year 3
PrelimTrial = setPheno(PrelimTrial, varE = 20, reps = 1) # h2 = 10/(10+20/1) = 0.33
AdvancTrial = selectInd(PrelimTrial, nInd = 50, use = "pheno") # Preliminary Trial

#------------- Year 4
AdvancTrial = setPheno(AdvancTrial, varE = 20, reps = 5) # h2 = 10/(10+20/5) = 0.71
EliteTrial = selectInd(AdvancTrial, nInd = 5, use = "pheno") # Advanced Trial

#------------- Year 5
EliteTrial = setPheno(EliteTrial, varE = 20, reps = 20) # h2 = 10/(10+20/20) = 0.90
Variety = selectInd(EliteTrial, nInd = 2, use = "pheno") # Elite Trial

# Updating the parents
Parents = mergePops(list(EliteTrial,Parents))
ParentsPop = selectInd(Parents, nInd = 5, use = "pheno") 

# calibrate the model
AdvancTrial@fixEff = as.integer(rep(paste0(year),nInd(AdvancTrial)))
EliteTrial@fixEff = as.integer(rep(paste0(year),nInd(EliteTrial)))
PrelimTrial@fixEff = as.integer(rep(paste0(year),nInd(PrelimTrial)))

if(year >= 5){
  
  if(!exists("trainPop") || is.null(trainPop)){
    trainPop <-c(AdvancTrial, EliteTrial, PrelimTrial)
  } else {
    trainPop <- mergePops(list(trainPop, c(AdvancTrial, EliteTrial, PrelimTrial)))
  }
  
}

}

## ----- 
burnInPop = Parents
trainPop_burnin = trainPop

Allocate tracking vectors

nYears <- 10 + 1
meanG_Scen1 <- meanG_Scen2 <- numeric(nYears)
fInb_Scen1 <- fInb_Scen2 <- numeric(nYears)
fIbs_Scen1 <- fIbs_Scen2 <- numeric(nYears)
genicA_Scen1 <- genicA_Scen2 <- numeric(nYears)

# Starting values
meanG_Scen1[1] <- meanG_Scen2[1] <- meanG(burnInPop)
genicA_Scen1[1] <- genicA_Scen2[1] <- genicVarA(burnInPop)
fIbs_Scen1[1] <- fIbs_Scen2[1] <- calcFibs(
  Markers = pullQtlGeno(burnInPop), refPop = refPop, ploidy = 2)
fInb_Scen1[1] <- fInb_Scen2[1] <- inbCoef(burnInPop)

Scenario 1 — Baseline Genomic Selection

Standard genomic selection pipeline: parents are selected on GEBVs and crossed randomly.

# From the same population
Parents = burnInPop
trainPop  = trainPop_burnin

# Loop
for (year in 2:nYears){
  cat("Working on year:", year, "\n")
  
  gsmodel = RRBLUP(trainPop , use = 'pheno', useReps = TRUE) # Calibrate the genomic model
  if(year == 2){ Parents = setEBV(Parents, gsmodel)}
  
  set.seed(11)  
  
  #------------- Year 1 
  Clones = randCross(pop = Parents, nCrosses = 100) # Crossing block
  Clones = setEBV(Clones, gsmodel)
  PrelimTrial = selectInd(Clones, nInd = 60, use = "ebv") 

  # #------------- Year 2
  # cloneStage = setPheno(pop = Clones, h2 = 0.1, reps = 1) # h2 = 0.1 (visual selection)
  # PrelimTrial = selectInd(cloneStage, nInd = 60, use = "pheno") 
  
  #-------------  Year 3
  PrelimTrial = setPheno(PrelimTrial, varE = 20, reps = 1)
  AdvancTrial = selectInd(PrelimTrial, nInd = 10, use = "pheno") # Preliminary Trial
  
  #------------- Year 4
  AdvancTrial = setPheno(AdvancTrial, varE = 20, reps = 5)
  EliteTrial = selectInd(AdvancTrial, nInd = 5, use = "pheno") # Advanced Trial
  
  #------------- Year 5
  EliteTrial = setPheno(EliteTrial, varE = 20, reps = 20)
  Variety = selectInd(EliteTrial, nInd = 2, use = "pheno") # Elite Trial
  
  #------------- Year 6    
  # Release varieties
  
  # Updating the parents
  Parents = selectInd(Clones, nInd = 60, use = "ebv")

  AdvancTrial@fixEff = as.integer(rep(paste0(year, 1L),nInd(AdvancTrial)))
  EliteTrial@fixEff = as.integer(rep(paste0(year, 1L),nInd(EliteTrial)))
  PrelimTrial@fixEff = as.integer(rep(paste0(year, 1L),nInd(PrelimTrial)))
  
  # Track performance
  meanG_Scen1[year] = meanG(Parents)
  genicA_Scen1[year] = genicVarA(Parents)
  fInb_Scen1[year] = inbCoef(Parents)
  fIbs_Scen1[year] = calcFibs(Markers = pullQtlGeno(Parents), refPop = refPop, ploidy = 2)
  
  # calibrate model
  if(year !=2){
    trainPop  = trainPop[-c(1:nInd(c(AdvancTrial, EliteTrial, PrelimTrial)))]
    trainPop  = mergePops(list(trainPop, c(AdvancTrial, EliteTrial, PrelimTrial)))
  }else{
    trainPop  = mergePops(list(trainPop, c(AdvancTrial, EliteTrial, PrelimTrial)))
  }
  
}

Scenario 2 — Optimal Cross Selection

Same pipeline, but crosses are optimized using SimpleMating to balance genetic gain and diversity.

# From the base population
Parents = burnInPop
trainPop  = trainPop_burnin

# Loop
for (year in 2:nYears){
  
  cat("Working on year:", year, "\n")
  gsmodel = RRBLUP(trainPop , use = 'pheno', useReps = TRUE)
  Parents = setEBV(Parents, gsmodel)
  
  set.seed(11)  
  #------------- Year 1 

  # 1. Criterion
  crit <- data.frame(Id = Parents@id,
                     Criterion = Parents@ebv)
  
  # 2. Creating relationship matrix
  relMat <- Gmatrix(pullSnpGeno(Parents)) 

  # 3.Mating Plan
  CrossPlan <- planCross(TargetPop = Parents@id)
  
  # 4.Single trait mid-parental value
  mpv <- getMPV(MatePlan = CrossPlan,
                     Criterion = crit,
                     K = relMat)
 
  # 5. Optimize
  matingPlan <- selectCrosses(mpv,
                              n.cross = 100,
                              max.cross = 4,
                              min.cross = 1,
                              culling.pairwise.k = 0)
 
  plan = as.matrix(matingPlan[[2]][, c(1,2)])
  
  # Cross
  Clones = makeCross(Parents, plan)
  
  # GS them
  Clones = setEBV(Clones, gsmodel)
  PrelimTrial = selectInd(Clones, nInd = 60, use = "ebv") 
  
  #-------------  Year 3
  PrelimTrial = setPheno(PrelimTrial, varE = 20, reps = 1)
  AdvancTrial = selectInd(PrelimTrial, nInd = 50, use = "pheno") # Preliminary Trial
  
  #------------- Year 4
  AdvancTrial = setPheno(AdvancTrial, varE = 20, reps = 5)
  EliteTrial = selectInd(AdvancTrial, nInd = 20, use = "pheno") # Advanced Trial
  
  #------------- Year 5
  EliteTrial = setPheno(EliteTrial, varE = 20, reps = 20)
  Variety = selectInd(EliteTrial, nInd = 2, use = "pheno") # Elite Trial
  
  #------------- Year 6    
  # Release varieties
  
  # Track performance
  meanG_Scen2[year] = meanG(Parents)
  genicA_Scen2[year] = genicVarA(Parents)
  fInb_Scen2[year] = inbCoef(Parents)
  fIbs_Scen2[year] = calcFibs(Markers = pullQtlGeno(Parents), refPop = refPop, ploidy = 2)
  
  # Updating the parents
  Parents = Clones 
 
  # calibrate model
  AdvancTrial@fixEff = as.integer(rep(paste0(year, 1L),nInd(AdvancTrial)))
  EliteTrial@fixEff = as.integer(rep(paste0(year, 1L),nInd(EliteTrial)))
  PrelimTrial@fixEff = as.integer(rep(paste0(year, 1L),nInd(PrelimTrial)))
  
  # calibrate model
  if(year !=2){
    trainPop  = trainPop[-c(1:nInd(c(AdvancTrial, EliteTrial, PrelimTrial)))]
    trainPop  = mergePops(list(trainPop, c(AdvancTrial, EliteTrial, PrelimTrial)))
  }else{
    trainPop  = mergePops(list(trainPop, c(AdvancTrial, EliteTrial, PrelimTrial)))
  }
}

Results — Comparing GS vs. OCS

Preparing data for visualization

###---- Scenario one
dfScen1 <- data.frame(scenario = "GenomicSel",
                  meanG = meanG_Scen1,
                  Fgmat = fIbs_Scen1,
                  Ffreq = fInb_Scen1,
                  GenicVarG = genicA_Scen1)
dfScen1$Year <- 0:10
dfScen1$dFgmat <- 100*fdF(dfScen1$Fgmat)
dfScen1$dFfreq <- 100*fdF(dfScen1$Ffreq)
dfScen1$diversity <- 1-sqrt(dfScen1$GenicVarG)/sqrt(dfScen1$GenicVarG[1])
dfScen1$gain <- dfScen1$meanG - dfScen1$meanG[1]

###---- Scenario two
dfScen2 <- data.frame(scenario = "OCS",
                      meanG = meanG_Scen2,
                      Fgmat = fIbs_Scen2,
                      Ffreq = fInb_Scen2,
                      GenicVarG = genicA_Scen2)
dfScen2$Year <- 0:10
dfScen2$dFgmat <- 100*fdF(dfScen2$Fgmat)
dfScen2$dFfreq <- 100*fdF(dfScen2$Ffreq)
dfScen2$diversity <- 1-sqrt(dfScen2$GenicVarG)/sqrt(dfScen2$GenicVarG[1])
dfScen2$gain <- dfScen2$meanG - dfScen2$meanG[1]

pdat <- rbind(dfScen1,dfScen2)

###---- Plot one
p1 <- ggplot(pdat, mapping=aes(x=Year, y=dFgmat, col=scenario)) +
  ylab("Genomic Inbreeding Rate (%)") +
  theme_bw() +
  geom_line(data=pdat, linewidth=1.1) +
  scale_color_manual(values = c("blue", "wheat"), name = NULL)

p2 <- ggplot(pdat, mapping=aes(x=Year, y=dFfreq, col=scenario)) +
  ylab("Genomic Inbreeding Rate (%)") +
  theme_bw() +
  geom_line(data=pdat, linewidth=1.1) +
  scale_color_manual(values = c("blue", "wheat"), name = NULL)


# Plot three
df_p3 <- merge(aggregate(gain~scenario+Year,data=pdat,FUN=mean),
              aggregate(diversity~scenario+Year,data=pdat,FUN=mean))
df_p3 <- df_p3[df_p3$Year==10,]; df_p3$x <- 0; df_p3$y <- 0


p3 <- ggplot() +
  theme_bw() + 
  geom_line(data=df_p3,
            mapping=aes(x=diversity, y=gain,
                        col=scenario),
            alpha=0.05) + 
  ylab("Genetic Gain") +
  xlab("Loss Genic SD") + 
  geom_segment(data=df_p3,
               arrow = arrow(length = unit(0.3, "cm")),
               mapping=aes(x=x, y=y,
                           xend=diversity, yend=gain,
                           col=scenario),
               linewidth=1.1) +
  scale_color_manual(values =  c("blue", "wheat"), name = NULL)


ggarrange(p1,p2,p3,nrow=1,common.legend = TRUE)

References

Akdemir, D., Sanchez, J. I., & Jannink, J.-L. (2016). Optimization of genomic selection training populations with a genetic algorithm. G3: Genes, Genomes, Genetics, 6(3), 721–731.
Allier, A., Teyssèdre, S., Lehermeier, C., Moreau, L., & Charcosset, A. (2019). Optimized breeding strategies to harness genetic resources with different performance levels. G3: Genes, Genomes, Genetics, 9(5), 1593–1605.
Falconer, D. S., & Mackay, T. F. C. (1996). Introduction to Quantitative Genetics (4th ed.). Longman, Harlow, UK.
Fernández-González, J., Metwally, S. M., & Isidro y Sánchez, J. (2026). MateR: a novel genomic mating framework. Genetics, iyag013.
Gorjanc, G., Gaynor, R. C., & Hickey, J. M. (2018). Optimal cross selection for long-term genetic gain in two-part programs with rapid recurrent genomic selection. Genetics, 210(4), 1241–1259.
Kinghorn, B. P. (1999). An algorithm for efficient constrained mate selection. Livestock Production Science, 59(2–3), 119–130.
Kinghorn, B. P., et al. (2009). Mate selection in livestock breeding programs: Advances and challenges. In: Proceedings of the Association for the Advancement of Animal Breeding and Genetics (AABG).
Lehermeier, C., Teyssèdre, S., & Schön, C.-C. (2017). Genomic prediction of within-family variance in biparental populations. Genetics, 207(4), 1631–1644.
Peixoto, M. A., Amadeu, R. R., Bhering, L. L., Ferrão, L. F. V., Munoz, P. R., & Resende Jr, M. F. (2025). SimpleMating: R‐package for prediction and optimization of breeding crosses using genomic selection. The plant genome, 18(1), e20533.
Wolfe, M. D., et al. (2021). Genomic mating in outbred species: predicting cross usefulness in cassava breeding. The Plant Genome, 14(1), e20099.
Woolliams, J. A. (2015). Optimal contribution selection and the management of genetic diversity. In: Proceedings of the 66th Annual Meeting of the European Federation of Animal Science (EAAP).

University of Florida, deamorimpeixotom@ufl.edu ↩︎

|

Mate Allocation of Breeding Crosses

Optimal Cross Selection — Theory and Simulations

Marco Antonio Peixoto, Associate researcher¹

Gainesville, 2026

Introduction

Managing Genetic Diversity

Optimization of crosses

Setting up the population

Creating population structure

Checking inbreeding levels

Computing Total Genetic Values

Optimization with SimpleMating

Optimization with COMA

Breeding Program Simulation: GS vs. OCS

Helper functions

Simulation setup

Burn-in phase (10 years)

Allocate tracking vectors

Scenario 1 — Baseline Genomic Selection

Scenario 2 — Optimal Cross Selection

Results — Comparing GS vs. OCS

Preparing data for visualization

References

| Mate Allocation of Breeding Crosses

Optimal Cross Selection — Theory and Simulations

Marco Antonio Peixoto, Associate researcher1

Gainesville, 2026

Introduction

Managing Genetic Diversity

Optimization of crosses

Setting up the population

Creating population structure

Checking inbreeding levels

Computing Total Genetic Values

Optimization with SimpleMating

Optimization with COMA

Breeding Program Simulation: GS vs. OCS

Helper functions

Simulation setup

Burn-in phase (10 years)

Allocate tracking vectors

Scenario 1 — Baseline Genomic Selection

Scenario 2 — Optimal Cross Selection

Results — Comparing GS vs. OCS

Preparing data for visualization

References

|

Mate Allocation of Breeding Crosses

Marco Antonio Peixoto, Associate researcher¹