|

Spatial analyses of field experiments
Spatial analyses represents an optimal tool for post-hoc correction of field trials information. Heterogeneous factors that were not full corrected by the plotting techniques (i.e. soil fertility, diseases, and water holding capacity) can cause some disturbance in the prediction of BLUP values and, at the end, in the selection of the genotypes.
Below, we implement a set of models that is known as spatial analyses. In addition to demonstrate each step of the analysis, we also cover several parameters of interest for estimation/prediction purposes.
The goodness-of-fit of a model to the data can be measured by different parameters. Below, we describe the most useful for spatial analyses. It is worth to mentioning that the spatial analysesâ objective is to select the most accurately and parsimoniously model that captures the spatial effects in the field. This will provides the most accurate and precise estimation of variance components and prediction of the genotypes values (Isik et al. 2018).
The heritability represents how much of the phenotypic variation came from a genetic source. We can measure it based on the the variance components estimated by the model. The simplest representation of heritability (\(h^2\)) follows the equation:
\[ h^2 = \frac{\sigma^2_g}{\sigma^2_g + \sigma^2_{res}} \]
where \(\sigma^2_g\) is represents the genotypic variance and \(\sigma^2_{res}\) represents the residual variance.
The prediction accuracy represents a measure of how close a prediction is to the real genotypic value of a genotype for a target trait. It can be estimated by:
\[ r_{\tilde{g}g} = cor(predicted.value,real.value) \]
One simple way to get an estimation of accuracy is using the square root of the heritability:
\[ r_{\tilde{g}g} = {\sqrt{h^2}} \]
In mixed models, one alternative to estimated the heritability is to use the prediction error variance (PEV) from the prediction. It can be easily estimated by:
\[ PEV = std.error^2 \]
where the \(std.error\) represents the standard error from the genotypes predictions. After, we can use the following equation (Mrode, 2014):
\[ r_{\tilde{g}g} = \sqrt{1-\frac{PEV}{\sigma^2_g}} \]
The accuracy values can vary from 0 to 1. Values close to 1 represents the best accuracy from a prediction of a target trait.
The likelihood information test (LRT - Wilks, 1938) is indicated to compared models. However, the comparison take into account that the models share the same fixed effects and one model (reduced model) has a subset of the random effects of the other model (full model). The LRT formula is given as follows:
\[ LRT = -2(logL_{full}-logL_{reduced}) \] where \(logL_{full}\) represents the log likelihood of the full model (i.e. all random effects), and \(logL_{reduced}\) represents the log likelihood of the reduced model (i.e. missing one random effect). The difference in-between models should be higher than 3.84 (\(\chi^2\) test, with 5% of probability and 1 degree of freedom). At this level (higher than 3.84) we consider that we have enough evidence to reject the null hypothesis, where the models are similar.
The most indicated parameters to measure the goodness-of-fit of one model to the data are Akaike Information Criteria (AIC - Akaike, 1974) and Schwarzâs Bayesian Information Criteria (BIC - Schwarz, 1978). They are indicated for models that are not nested and share the same fixed effects. The formula definition for AIC and BIC are the following:
\[ AIC = -2*logL + 2t \]
\[ BIC = -2*logL + 2t*log(v) \]
where LogL represents the log likelihood of a model, \(t\) represents the number of covariance parameters in the model, \(v\) is the residual degrees of freedom, where \(v = n-p\), being \(n\) the number of observations and \(p\) the number of parameters in fixed effect factors. If the difference in-between two models is superior to 2, we can consider that the model with lower value of AIC/BIC is the best fit model to the data. In addition, if the difference is lower than 2, we should choose the most parsimoniously model, in other words, the model with lower number of parameters.
When we have models with different number of fixed effects, the AIC/BIC criterion becomes prohibitive, once the log-likelihood came from REML estimates. One alternative to compare such kind of models is to use the Verbyla et al. (2019) correction for estimating the AIC/BIc corrected. With this correction, we will be able to compare those models. The calculation of the information criteria is an adaption of the code supplied in File S1 of Verbyla (2019) given in the function below (icREML). The corrected log-likelihood is calculated as:
\[ {loglik = log(REML) - log(|C|)/2}, \]
where C is log.determinant \((LogDet((X'H^{-1X})^{-1}))\) .
The AIC and BIC are calculated as:
\[ AIC = {- 2 * loglik + 2 * (rand.par + fixed.par)} \] \[ BIC = {- 2 * loglik + (fixed.par + rand.par) * log(n - r + fixed.par)}, \]
where rand.par represents the number of variances estimated by the full model , fixed.par represents the fixed degrees of freedom estimated by the full model, n is the number of observations, and r is the rank of the fixed effects design matrix.
## Function icREML from Verbyla et al. (2019)
# Author Ari Verbyla (ari.verbyla@csiro.au)
icREML <- function(fm, scale=1) {
if(!is.list(fm)) stop(" Models need to be in a list\n")
if(is.null(names(fm))) namesfm <- paste("fm", 1:length(fm))
else namesfm <- names(fm)
require(asreml)
asreml.options(Cfixed = TRUE, gammaPar=FALSE)
fm <- lapply(fm, function(el) {
if(is.null(el$Cfixed)) {
out <- update(el, maxit=1) }
else out <- el
out})
logl <- lapply(fm, function(el) el$loglik)
summ <- lapply(fm, function(el) summary(el, coef=TRUE)$coef.fixed)
which.X0 <- lapply(summ, function(el) !is.na(el[, "z.ratio"]))
p.0 <- lapply(which.X0, function(el) sum(el))
Cfixed <- lapply(fm, function(el) el$Cfixed)
logdet <- lapply(1:length(fm), function(el, Cfixed, which.X0, scale) {
log(prod(svd(as.matrix(scale*Cfixed[[el]][which.X0[[el]], which.X0[[el]]]))$d))
}, Cfixed, which.X0, scale)
vparam <- lapply(fm, function(el) summary(el)$varcomp)
q.0 <- lapply(vparam, function(el) sum(!(el$bound == "F" | el$bound == "B")))
b.0 <- lapply(vparam, function(el) sum(el$bound == "F" | el$bound == "B"))
logl <- lapply(1:length(fm), function(el, logl, logdet, p.0) {
logl[[el]] - logdet[[el]]/2}, logl, logdet,p.0)
aic <- unlist(lapply(1:length(fm), function(el, logl, p.0, q.0) {
-2*logl[[el]] + 2*(p.0[[el]] + q.0[[el]])}, logl, p.0, q.0))
bic <- unlist(lapply(1:length(fm), function(el, logl, p.0, q.0, fm) {
-2*logl[[el]] + log(fm[[el]]$nedf+p.0[[el]])*(p.0[[el]] + q.0[[el]])},
logl, p.0, q.0, fm))
results <- data.frame(model=namesfm, loglik = unlist(logl), p=unlist(p.0),
q=unlist(q.0), b = unlist(b.0), AIC = aic, BIC = bic, logdet=unlist(logdet))
row.names(results) <- 1:dim(results)[1]
invisible(results)
}## Online License checked out Mon Nov 24 15:24:36 2025
This dataset refers to an evaluation of 78 maize genotypes, in one location, and the trait measured was Yield.
| Factor | Levels |
|---|---|
| Location | One |
| Genotypes | 78 |
| Checks | 6 |
| Blocks | 3 |
| Ranges | 18 |
| Rows | 14 |
# Setting as factors
data$Row = data$Row %>% as.factor()
data$Range = data$Range %>% as.factor()
data$Block = data$Block %>% as.factor()
data$Genotype = data$Genotype %>% as.factor()
data$Check = data$Check %>% as.factor()Exploring the dataset by graphs
boxplot(Yield~Block,
data=data,
main="Trait performance over Blocks",
xlab="Block Number",
ylab="Yield (kg/ha)",
col="orange",
border="brown"
)Exploring the dataset by graphs
boxplot(Yield~Range,
data=data,
main="Trait performance over ranges",
xlab="Range Number",
ylab="Yield (kg/ha)",
col="orange",
border="brown"
)Exploring the dataset by graphs
boxplot(Yield~Row,
data=data,
main="Trait performance over rows",
xlab="Row number",
ylab="Yield (kg/ha)",
col="orange",
border="brown"
)For the spatial analyses we list in the table below the possible models for spatial analyses. We used the before mentioned parameters to choose the best model (AIC, BIC, accuracy, heritability). The given model in a matrix notation is as follow:
\[ y = Xb + Zg + e \]
where \(y\) is the vector of phenotypic data; \(b\) is the vector of checks (assumed to be fixed) plus the overall mean; \(g\) is the vector of genotype effects (assumed to be random), where \(g~N(0,\sigma^2_g)\); and \(e\) is the vector of residuals (random), where \(e~N(0,\sigma^2_{res})\). The letters \(X\) and \(Z\) refers to the incidence matrices for \(b\) and \(g\), respectively.
| Model | Action | Fitted model for residual |
|---|---|---|
| Mod1 | Only residual effect | \(\sigma^2_{res}I_râI_c\) |
| Mod2 | Includes AR1 for row | \(\sigma^2_{đ}â_r(đē_r)âI_c\) |
| Mod3 | Includes AR1 for range | \(\sigma^2_{đ}I_rââ_c(đē_c)\) |
| Mod4 | Includes AR1 for row/range | \(\sigma^2_{đ}â_r(đē_r)ââ_c(đē_c)\) |
| Mod5 | Includes AR1 for row/range and nugget effect | \(\sigma^2_{đ}â_r(đē_r)ââ_c(đē_c), \sigma^2_đŧ\) |
# Toy model
fm = asreml(fixed = Yield~Block + Check,
random = ~ Genotype,
na.action = na.method(x = "include"),
data = data)## ASReml Version 4.2 24/11/2025 15:24:37
## LogLik Sigma2 DF wall
## 1 -1758.211 764665.89 239 15:24:37
## 2 -1757.840 745540.39 239 15:24:37
## 3 -1757.558 723297.95 239 15:24:37
## 4 -1757.475 705501.14 239 15:24:37
## 5 -1757.475 704507.71 239 15:24:37
## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## âš Please use `linewidth` instead.
## âš The deprecated feature was likely used in the asreml package.
## Please report the issue to the authors.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
# Model 1 Same model, but with structure
mod1 = asreml(fixed = Yield~Block + Check,
random = ~ Genotype,
na.action = na.method(x = "include"),
residual = ~id(Row):id(Range),
data = data)## ASReml Version 4.2 24/11/2025 15:24:37
## LogLik Sigma2 DF wall
## 1 -1758.211 764665.89 239 15:24:37
## 2 -1757.840 745540.39 239 15:24:37
## 3 -1757.558 723297.95 239 15:24:37
## 4 -1757.475 705501.14 239 15:24:37
## 5 -1757.475 704507.71 239 15:24:37
Model outcomes
#---- Outcomes
# BLUPs for the genotypes
BLUPs<-summary(mod1,coef=T)$coef.random
BLUP.geno<-as.matrix(BLUPs[1:78,])
# variance components
summary(mod1)$varcomp## component std.error z.ratio bound %ch
## Genotype 152638.4 68632.40 2.224000 P 0
## Row:Range!R 704507.7 78298.04 8.997769 P 0
## Estimate SE
## H2 0.1780775 0.07367362
## [1] 0.4219923
# Definition via PEV (Predicted error variance)
mean.PEV<-mean(BLUP.geno[,2]^2)
sigma2g<-summary(mod1)$varcomp[1,1]
(Acc.PEV.mod1<-sqrt(1-mean.PEV/sigma2g))## [1] 0.6198571
# Model 2 - Trends in Rows
mod2 = asreml(fixed = Yield~Block + Check,
random = ~Genotype,
na.action = na.method(x = "include"),
residual = ~ar1(Row):id(Range),
data = data)## ASReml Version 4.2 24/11/2025 15:24:37
## LogLik Sigma2 DF wall
## 1 -1754.561 748982.22 239 15:24:37
## 2 -1752.926 726224.94 239 15:24:37
## 3 -1751.483 706478.13 239 15:24:37
## 4 -1750.862 699733.88 239 15:24:37
## 5 -1750.834 701926.22 239 15:24:37
## 6 -1750.833 702401.38 239 15:24:37
Model outcomes
#---- Outcomes
# BLUPs for the genotypes
BLUPs<-summary(mod2,coef=T)$coef.random
BLUP.geno<-as.matrix(BLUPs[1:78,])
# variance components
summary(mod2)$varcomp## component std.error z.ratio bound %ch
## Genotype 1.733852e+05 6.586114e+04 2.632587 P 0.1
## Row:Range!R 7.024014e+05 8.044066e+04 8.731920 P 0.0
## Row:Range!Row!cor 3.006683e-01 7.272461e-02 4.134341 U 0.2
## Estimate SE
## H2 0.1979766 0.06784472
## [1] 0.4449456
# Definition via PEV (Predicted error variance)
mean.PEV<-mean(BLUP.geno[,2]^2)
sigma2g<-summary(mod2)$varcomp[1,1]
(Acc.PEV.mod2<-sqrt(1-(mean.PEV/sigma2g)))## [1] 0.6715797
# Model 3
mod3 = asreml(fixed = Yield~Block + Check,
random = ~Genotype,
na.action = na.method(x = "include"),
residual = ~Row:ar1(Range),
data = data)## ASReml Version 4.2 24/11/2025 15:24:37
## LogLik Sigma2 DF wall
## 1 -1753.378 741532.76 239 15:24:37
## 2 -1750.778 714741.81 239 15:24:37
## 3 -1748.471 695384.48 239 15:24:37
## 4 -1747.619 694717.81 239 15:24:37
## 5 -1747.464 700825.27 239 15:24:37
## 6 -1747.458 702887.87 239 15:24:37
## 7 -1747.457 703240.78 239 15:24:37
Model output
#---- Outcomes
# BLUPs for the genotypes
BLUPs<-summary(mod3,coef=T)$coef.random
BLUP.geno<-as.matrix(BLUPs[1:78,])
# variance components
summary(mod3)$varcomp## component std.error z.ratio bound %ch
## Genotype 1.685575e+05 6.246899e+04 2.698259 P 0.0
## Row:Range!R 7.032408e+05 8.130975e+04 8.648910 P 0.0
## Row:Range!Range!cor 3.489894e-01 6.671926e-02 5.230715 U 0.1
## Estimate SE
## H2 0.1933446 0.06506777
## [1] 0.4397097
# Definition via PEV (Predicted error variance)
mean.PEV<-mean(BLUP.geno[,2]^2)
sigma2g<-summary(mod3)$varcomp[1,1]
(Acc.PEV.mod3<-sqrt(1-mean.PEV/sigma2g))## [1] 0.6765499
# Model 4
mod4 = asreml(fixed = Yield~Block + Check,
random = ~Genotype,
na.action = na.method(x = "include"),
residual = ~ar1(Row):ar1(Range),
data = data)## ASReml Version 4.2 24/11/2025 15:24:37
## LogLik Sigma2 DF wall
## 1 -1750.221 729345.30 239 15:24:37
## 2 -1747.591 701054.78 239 15:24:37
## 3 -1745.133 677512.51 239 15:24:37
## 4 -1743.910 671571.65 239 15:24:37
## 5 -1743.814 674922.82 239 15:24:37
## 6 -1743.807 675642.36 239 15:24:37
## 7 -1743.806 675814.98 239 15:24:37
#---- Outcomes
# BLUPs for the genotypes
BLUPs<-summary(mod4,coef=T)$coef.random
BLUP.geno<-as.matrix(BLUPs[1:78,])
# variance components
summary(mod4)$varcomp## component std.error z.ratio bound %ch
## Genotype 1.998635e+05 6.462991e+04 3.092430 P 0.1
## Row:Range!R 6.758150e+05 7.785376e+04 8.680570 P 0.0
## Row:Range!Row!cor 2.416316e-01 7.952655e-02 3.038377 U 0.2
## Row:Range!Range!cor 3.105714e-01 7.302496e-02 4.252948 U 0.1
## Estimate SE
## H2 0.2282384 0.06447421
## [1] 0.4777431
# Definition via PEV (Predicted error variance)
mean.PEV<-mean(BLUP.geno[,2]^2)
sigma2g<-summary(mod4)$varcomp[1,1]
(Acc.PEV.mod4<-sqrt(1-mean.PEV/sigma2g))## [1] 0.7211822
# Model 5
mod5 = asreml(fixed = Yield~Block + Check,
random = ~Genotype + idv(units),
na.action = na.method(x = "include"),
residual = ~ar1(Row):ar1(Range),
data = data)## ASReml Version 4.2 24/11/2025 15:24:37
## LogLik Sigma2 DF wall
## 1 -1750.804 668853.81 239 15:24:37
## 2 -1748.635 380878.77 239 15:24:37 ( 2 restrained)
## 3 -1739.354 291125.58 239 15:24:37 ( 1 restrained)
## 4 -1734.592 211785.41 239 15:24:37
## 5 -1733.940 264767.79 239 15:24:37
## 6 -1733.869 291729.85 239 15:24:37
## 7 -1733.851 306451.39 239 15:24:37
## 8 -1733.850 297084.96 239 15:24:37
## 9 -1733.850 300364.32 239 15:24:37
## 10 -1733.850 297061.24 239 15:24:37
## 11 -1733.849 298964.34 239 15:24:37
Model output
#---- Outcomes
# BLUPs for the genotypes
BLUPs<-summary(mod5,coef=T)$coef.random
BLUP.geno<-as.matrix(BLUPs[1:78,])
# variance components
summary(mod5)$varcomp## component std.error z.ratio bound %ch
## Genotype 1.199431e+05 5.377807e+04 2.230335 P 0.1
## units!units 4.664785e+05 6.593710e+04 7.074598 P 0.1
## Row:Range!R 2.989643e+05 1.457003e+05 2.051913 P 0.0
## Row:Range!Row!cor 7.743999e-01 1.643659e-01 4.711439 U 0.1
## Row:Range!Range!cor 8.813115e-01 7.370977e-02 11.956509 U 0.0
## Estimate SE
## H2 0.1354698 0.06132768
## [1] 0.3680623
# Definition via PEV (Predicted error variance)
mean.PEV<-mean(BLUP.geno[,2]^2)
sigma2g<-summary(mod5)$varcomp[1,1]
(Acc.PEV.mod5<-sqrt(1-mean.PEV/sigma2g))## [1] 0.6216511
h2 = data.frame(Model = c("Mod1","Mod2", "Mod3", "Mod4","Mod5"),
h2 = c(H2.mod1$Estimate[1],H2.mod2$Estimate[1],H2.mod3$Estimate[1],H2.mod4$Estimate[1],H2.mod5$Estimate[1]))
h2## Model h2
## 1 Mod1 0.1780775
## 2 Mod2 0.1979766
## 3 Mod3 0.1933446
## 4 Mod4 0.2282384
## 5 Mod5 0.1354698
Acc.h2 = data.frame(Model = c("Mod1","Mod2", "Mod3", "Mod4","Mod5"),
Acc.h2 = c(Acc.mod1,Acc.mod2,Acc.mod3,Acc.mod4,Acc.mod5))
Acc.h2## Model Acc.h2
## 1 Mod1 0.4219923
## 2 Mod2 0.4449456
## 3 Mod3 0.4397097
## 4 Mod4 0.4777431
## 5 Mod5 0.3680623
Acc.PEV = data.frame(Model = c("Mod1","Mod2", "Mod3", "Mod4","Mod5"),
Acc.PEV = c(Acc.PEV.mod1,Acc.PEV.mod2,Acc.PEV.mod3,Acc.PEV.mod4,Acc.PEV.mod5))
Acc.PEV## Model Acc.PEV
## 1 Mod1 0.6198571
## 2 Mod2 0.6715797
## 3 Mod3 0.6765499
## 4 Mod4 0.7211822
## 5 Mod5 0.6216511
mod.AIC = data.frame(Model = c("Mod1","Mod2", "Mod3", "Mod4","Mod5"),
AIC = c(aic.mod1,aic.mod2,aic.mod3,aic.mod4,aic.mod5))
mod.AIC## Model AIC
## 1 Mod1 3518.950
## 2 Mod2 3507.666
## 3 Mod3 3500.915
## 4 Mod4 3495.612
## 5 Mod5 3477.699
mod.BIC = data.frame(Model = c("Mod1","Mod2", "Mod3", "Mod4","Mod5"),
BIC = c(bic.mod1,bic.mod2,bic.mod3,bic.mod4,bic.mod5))
mod.BIC## Model BIC
## 1 Mod1 3525.902
## 2 Mod2 3518.095
## 3 Mod3 3511.344
## 4 Mod4 3509.518
## 5 Mod5 3495.081
out = data.frame(Model = c("Mod1","Mod2", "Mod3", "Mod4","Mod5"),
h2 = h2$h2,
Acc.h2 = Acc.h2$Acc.h2,
Acc.PEV = Acc.PEV$Acc.PEV,
AIC = mod.AIC$AIC,
BIC = mod.BIC$BIC)
out## Model h2 Acc.h2 Acc.PEV AIC BIC
## 1 Mod1 0.1780775 0.4219923 0.6198571 3518.950 3525.902
## 2 Mod2 0.1979766 0.4449456 0.6715797 3507.666 3518.095
## 3 Mod3 0.1933446 0.4397097 0.6765499 3500.915 3511.344
## 4 Mod4 0.2282384 0.4777431 0.7211822 3495.612 3509.518
## 5 Mod5 0.1354698 0.3680623 0.6216511 3477.699 3495.081
Based on AIC and BIC from those models, we can conclude that the goodness-of-fit came from the model 4, where we modeled AR1 for rows and AR1 for columns. The following comparisons will be based on this model.
After the modelling of the effects of spatial correlated errors, we can include the information of row and range as an covariate in the model using the function \(lin()\). In this case, we are accounting for an single linear trend for range and another for row effect. In addition, we check the possibility of inclusion of the row and range effects as random effects in the model. We described the models below:
| Model | Global/Extraneous | Natural | Random parameter | Fixed parameter |
|---|---|---|---|---|
| Mod5 | \(\sigma^2_{đ}â_r(đē_r)ââ_c(đē_c)\) | 4 | 21 | |
| Mod6 | \(\beta_r à R\) | \(\sigma^2_{đ}â_r(đē_r)ââ_c(đē_c)\) | 4 | 22 |
| Mod7 | \(\beta_r à R,\beta_c à C\) | \(\sigma^2_{đ}â_r(đē_r)ââ_c(đē_c)\) | 4 | 23 |
| Mod8 | \(\beta_r à R,\beta_c à C, \sigma^2_rI_r\) | \(\sigma^2_{đ}â_r(đē_r)ââ_c(đē_c)\) | 5 | 23 |
| Mod9 | \(\beta_r à R,\beta_c à C, \sigma^2_rI_r, \sigma^2_cI_c\) | \(\sigma^2_{đ}â_r(đē_r)ââ_c(đē_c)\) | 6 | 23 |
For comparison purposes, we will borrow the best residual model from the previous step to be the baseline structure for the next step.
# Model 6
mod6 = asreml(fixed = Yield~Block + Check + lin(Row),
random = ~Genotype,
na.action = na.method(x = "include"),
residual = ~ar1(Row):ar1(Range),
data = data)## ASReml Version 4.2 24/11/2025 15:24:37
## LogLik Sigma2 DF wall
## 1 -1733.881 653688.63 238 15:24:37
## 2 -1732.734 630009.54 238 15:24:37
## 3 -1731.773 605195.67 238 15:24:37
## 4 -1731.403 588098.50 238 15:24:37
## 5 -1731.389 587309.98 238 15:24:37
## 6 -1731.389 587280.38 238 15:24:37
#---- Outcomes
# BLUPs for the genotypes
BLUPs<-summary(mod6,coef=T)$coef.random
BLUP.geno<-as.matrix(BLUPs[1:78,])
# variance components
summary(mod6)$varcomp## component std.error z.ratio bound %ch
## Genotype 1.710435e+05 6.024437e+04 2.839162 P 0.1
## Row:Range!R 5.872804e+05 6.638806e+04 8.846174 P 0.0
## Row:Range!Row!cor 1.442504e-01 8.198674e-02 1.759436 U 0.3
## Row:Range!Range!cor 2.060071e-01 7.685306e-02 2.680533 U 0.1
## Estimate SE
## H2 0.2255547 0.06969605
## [1] 0.474926
# Definition via PEV (Predicted error variance)
mean.PEV<-mean(BLUP.geno[,2]^2)
sigma2g<-summary(mod6)$varcomp[1,1]
(Acc.PEV.mod6<-sqrt(1-mean.PEV/sigma2g))## [1] 0.6929692
# Model 7
mod7 = asreml(fixed = Yield~Block+Check+lin(Row)+lin(Range),
random = ~Genotype,
na.action = na.method(x = "include"),
residual = ~ar1(Row):ar1(Range),
data = data)## ASReml Version 4.2 24/11/2025 15:24:38
## LogLik Sigma2 DF wall
## 1 -1729.958 654411.38 237 15:24:38
## 2 -1728.917 632418.04 237 15:24:38
## 3 -1728.031 609197.12 237 15:24:38
## 4 -1727.678 593082.30 237 15:24:38
## 5 -1727.664 592218.41 237 15:24:38
## 6 -1727.663 592155.47 237 15:24:38
#---- Outcomes
# BLUPs for the genotypes
BLUPs<-summary(mod7,coef=T)$coef.random
BLUP.geno<-as.matrix(BLUPs[1:78,])
# variance components
summary(mod7)$varcomp## component std.error z.ratio bound %ch
## Genotype 1.672210e+05 6.026109e+04 2.774941 P 0.1
## Row:Range!R 5.921555e+05 6.717669e+04 8.814895 P 0.0
## Row:Range!Row!cor 1.487639e-01 8.203234e-02 1.813479 U 0.3
## Row:Range!Range!cor 2.042100e-01 7.677841e-02 2.659733 U 0.1
## Estimate SE
## H2 0.2202083 0.06998301
## [1] 0.4692635
# Definition via PEV (Predicted error variance)
mean.PEV<-mean(BLUP.geno[,2]^2)
sigma2g<-summary(mod7)$varcomp[1,1]
(Acc.PEV.mod7<-sqrt(1-mean.PEV/sigma2g))## [1] 0.6869388
# Model
mod8 = asreml(fixed = Yield~Block+Check+lin(Row)+lin(Range),
random = ~Genotype + Row,
na.action = na.method(x = "include"),
residual = ~ar1(Row):ar1(Range),
data = data)## ASReml Version 4.2 24/11/2025 15:24:38
## LogLik Sigma2 DF wall
## 1 -1730.543 629043.56 237 15:24:38
## 2 -1728.924 612205.57 237 15:24:38
## 3 -1727.842 593490.36 237 15:24:38
## 4 -1727.492 578823.59 237 15:24:38
## 5 -1727.477 577445.20 237 15:24:38
## 6 -1727.477 577257.86 237 15:24:38
#---- Outcomes
# BLUPs for the genotypes
BLUPs<-summary(mod8,coef=T)$coef.random
BLUP.geno<-as.matrix(BLUPs[1:78,])
# variance components
summary(mod8)$varcomp## component std.error z.ratio bound %ch
## Row 1.434870e+04 2.676769e+04 0.5360457 P 0.1
## Genotype 1.705439e+05 6.065430e+04 2.8117360 P 0.2
## Row:Range!R 5.772579e+05 6.872706e+04 8.3992800 P 0.0
## Row:Range!Row!cor 1.597616e-01 8.443777e-02 1.8920635 U 0.3
## Row:Range!Range!cor 1.824142e-01 8.524195e-02 2.1399585 U 0.1
## Estimate SE
## H2 0.2237667 0.06986698
## [1] 0.4730398
# Definition via PEV (Predicted error variance)
mean.PEV<-mean(BLUP.geno[,2]^2)
sigma2g<-summary(mod8)$varcomp[2,1]
(Acc.PEV.mod8<-sqrt(1-mean.PEV/sigma2g))## [1] 0.6916815
# Model
mod9 = asreml(fixed = Yield~Block+Check+lin(Row)+lin(Range),
random = ~Genotype + Row + Range,
na.action = na.method(x = "include"),
residual = ~ar1(Row):ar1(Range),
data = data)## ASReml Version 4.2 24/11/2025 15:24:38
## LogLik Sigma2 DF wall
## 1 -1733.571 617561.33 237 15:24:38 ( 1 restrained)
## 2 -1729.082 609433.09 237 15:24:38 ( 1 restrained)
## 3 -1727.831 592638.87 237 15:24:38 ( 1 restrained)
## 4 -1727.491 578689.89 237 15:24:38 ( 1 restrained)
## 5 -1727.477 577432.27 237 15:24:38 ( 1 restrained)
## 6 -1727.477 577254.74 237 15:24:38
#---- Outcomes
# BLUPs for the genotypes
BLUPs<-summary(mod9,coef=T)$coef.random
BLUP.geno<-as.matrix(BLUPs[1:78,])
# variance components
summary(mod9)$varcomp## component std.error z.ratio bound %ch
## Row 1.434893e+04 2.676845e+04 0.5360388 P 0.1
## Range 5.841408e-02 NA NA B NA
## Genotype 1.705472e+05 6.065713e+04 2.8116600 P 0.1
## Row:Range!R 5.772547e+05 6.872724e+04 8.3992133 P 0.0
## Row:Range!Row!cor 1.597693e-01 8.443868e-02 1.8921343 U 0.3
## Row:Range!Range!cor 1.824183e-01 8.524316e-02 2.1399763 U 0.1
## Estimate SE
## H2 0.2237709 0.06986924
## [1] 0.4730443
# Definition via PEV (Predicted error variance)
mean.PEV<-mean(BLUP.geno[,2]^2)
sigma2g<-summary(mod9)$varcomp[3,1]
(Acc.PEV.mod9<-sqrt(1-mean.PEV/sigma2g))## [1] 0.6916728
## Heritabilities
h2 = data.frame(Model = c("Mod1","Mod2", "Mod3", "Mod4","Mod5","Mod6","Mod7", "Mod8", "Mod9"),
h2 = c(H2.mod1$Estimate[1],H2.mod2$Estimate[1],H2.mod3$Estimate[1],
H2.mod4$Estimate[1],H2.mod5$Estimate[1],H2.mod6$Estimate[1],
H2.mod7$Estimate[1],H2.mod8$Estimate[1],H2.mod9$Estimate[1]))
h2## Model h2
## 1 Mod1 0.1780775
## 2 Mod2 0.1979766
## 3 Mod3 0.1933446
## 4 Mod4 0.2282384
## 5 Mod5 0.1354698
## 6 Mod6 0.2255547
## 7 Mod7 0.2202083
## 8 Mod8 0.2237667
## 9 Mod9 0.2237709
Acc.h2 = data.frame(Model = c("Mod1","Mod2", "Mod3", "Mod4","Mod5","Mod6","Mod7", "Mod8", "Mod9"),
Acc.h2 = c(Acc.mod1,Acc.mod2,Acc.mod3,Acc.mod4,Acc.mod5,Acc.mod6,Acc.mod7,Acc.mod8,Acc.mod9))
Acc.h2## Model Acc.h2
## 1 Mod1 0.4219923
## 2 Mod2 0.4449456
## 3 Mod3 0.4397097
## 4 Mod4 0.4777431
## 5 Mod5 0.3680623
## 6 Mod6 0.4749260
## 7 Mod7 0.4692635
## 8 Mod8 0.4730398
## 9 Mod9 0.4730443
Acc.PEV = data.frame(Model = c("Mod1","Mod2", "Mod3", "Mod4","Mod5","Mod6","Mod7", "Mod8", "Mod9"),
Acc.PEV = c(Acc.PEV.mod1,Acc.PEV.mod2,Acc.PEV.mod3,Acc.PEV.mod4,Acc.PEV.mod5,Acc.PEV.mod6,Acc.PEV.mod7,Acc.PEV.mod8,Acc.PEV.mod9))
Acc.PEV## Model Acc.PEV
## 1 Mod1 0.6198571
## 2 Mod2 0.6715797
## 3 Mod3 0.6765499
## 4 Mod4 0.7211822
## 5 Mod5 0.6216511
## 6 Mod6 0.6929692
## 7 Mod7 0.6869388
## 8 Mod8 0.6916815
## 9 Mod9 0.6916728
AICBIC = list(mod1,mod2,mod3,mod4,mod5, mod6, mod7, mod8, mod9)
icREML <- function(fm, scale=1) {
if(!is.list(fm)) stop(" Models need to be in a list\n")
if(is.null(names(fm))) namesfm <- paste("fm", 1:length(fm))
else namesfm <- names(fm)
require(asreml)
asreml.options(Cfixed = TRUE, gammaPar=FALSE)
fm <- lapply(fm, function(el) {
if(is.null(el$Cfixed)) {
out <- update(el, maxit=1) }
else out <- el
out})
logl <- lapply(fm, function(el) el$loglik)
summ <- lapply(fm, function(el) summary(el, coef=TRUE)$coef.fixed)
which.X0 <- lapply(summ, function(el) !is.na(el[, "z.ratio"]))
p.0 <- lapply(which.X0, function(el) sum(el))
Cfixed <- lapply(fm, function(el) el$Cfixed)
logdet <- lapply(1:length(fm), function(el, Cfixed, which.X0, scale) {
log(prod(svd(as.matrix(scale*Cfixed[[el]][which.X0[[el]], which.X0[[el]]]))$d))
}, Cfixed, which.X0, scale)
vparam <- lapply(fm, function(el) summary(el)$varcomp)
q.0 <- lapply(vparam, function(el) sum(!(el$bound == "F" | el$bound == "B")))
b.0 <- lapply(vparam, function(el) sum(el$bound == "F" | el$bound == "B"))
logl <- lapply(1:length(fm), function(el, logl, logdet, p.0) {
logl[[el]] - logdet[[el]]/2}, logl, logdet,p.0)
aic <- unlist(lapply(1:length(fm), function(el, logl, p.0, q.0) {
-2*logl[[el]] + 2*(p.0[[el]] + q.0[[el]])}, logl, p.0, q.0))
bic <- unlist(lapply(1:length(fm), function(el, logl, p.0, q.0, fm) {
-2*logl[[el]] + log(fm[[el]]$nedf+p.0[[el]])*(p.0[[el]] + q.0[[el]])},
logl, p.0, q.0, fm))
results <- data.frame(model=namesfm, loglik = unlist(logl), p=unlist(p.0),
q=unlist(q.0), b = unlist(b.0), AIC = aic, BIC = bic, logdet=unlist(logdet))
row.names(results) <- 1:dim(results)[1]
invisible(results)
}
est_AICBIC = icREML(AICBIC)## ASReml Version 4.2 24/11/2025 15:24:38
## LogLik Sigma2 DF wall
## 1 -1757.475 704499.27 239 15:24:38
## Warning in asreml(fixed = Yield ~ Block + Check, random = ~Genotype, residual =
## ~id(Row):id(Range), : Log-likelihood not converged
## ASReml Version 4.2 24/11/2025 15:24:38
## LogLik Sigma2 DF wall
## 1 -1750.833 702490.41 239 15:24:38
## Warning in asreml(fixed = Yield ~ Block + Check, random = ~Genotype, residual =
## ~ar1(Row):id(Range), : Log-likelihood not converged
## ASReml Version 4.2 24/11/2025 15:24:38
## LogLik Sigma2 DF wall
## 1 -1747.457 703307.57 239 15:24:38
## Warning in asreml(fixed = Yield ~ Block + Check, random = ~Genotype, residual =
## ~Row:ar1(Range), : Log-likelihood not converged
## ASReml Version 4.2 24/11/2025 15:24:38
## LogLik Sigma2 DF wall
## 1 -1743.806 675861.30 239 15:24:38
## Warning in asreml(fixed = Yield ~ Block + Check, random = ~Genotype, residual =
## ~ar1(Row):ar1(Range), : Log-likelihood not converged
## ASReml Version 4.2 24/11/2025 15:24:38
## LogLik Sigma2 DF wall
## 1 -1733.849 298672.03 239 15:24:38
## Warning in asreml(fixed = Yield ~ Block + Check, random = ~Genotype +
## idv(units), : Log-likelihood not converged
## ASReml Version 4.2 24/11/2025 15:24:38
## LogLik Sigma2 DF wall
## 1 -1731.389 587268.48 238 15:24:38
## Warning in asreml(fixed = Yield ~ Block + Check + lin(Row), random = ~Genotype,
## : Log-likelihood not converged
## ASReml Version 4.2 24/11/2025 15:24:38
## LogLik Sigma2 DF wall
## 1 -1727.663 592136.90 237 15:24:38
## Warning in asreml(fixed = Yield ~ Block + Check + lin(Row) + lin(Range), :
## Log-likelihood not converged
## ASReml Version 4.2 24/11/2025 15:24:38
## LogLik Sigma2 DF wall
## 1 -1727.477 577217.77 237 15:24:38
## Warning in asreml(fixed = Yield ~ Block + Check + lin(Row) + lin(Range), :
## Log-likelihood not converged
## ASReml Version 4.2 24/11/2025 15:24:38
## LogLik Sigma2 DF wall
## 1 -1727.477 577217.09 237 15:24:38
## Warning in asreml(fixed = Yield ~ Block + Check + lin(Row) + lin(Range), :
## Log-likelihood not converged
output = data.frame(Model = c("Mod1","Mod2", "Mod3", "Mod4","Mod5","Mod6","Mod7", "Mod8", "Mod9"),
h2 = round(h2$h2,3),
Acc.h2 = round(Acc.h2$Acc.h2,3),
Acc.PEV = round(Acc.PEV$Acc.PEV,3),
AIC = est_AICBIC$AIC,
BIC = est_AICBIC$BIC)
output## Model h2 Acc.h2 Acc.PEV AIC BIC
## 1 Mod1 0.178 0.422 0.620 3638.877 3677.524
## 2 Mod2 0.198 0.445 0.672 3628.146 3670.307
## 3 Mod3 0.193 0.440 0.677 3620.966 3663.127
## 4 Mod4 0.228 0.478 0.721 3616.150 3661.824
## 5 Mod5 0.135 0.368 0.622 3601.421 3650.609
## 6 Mod6 0.226 0.475 0.693 3597.763 3646.951
## 7 Mod7 0.220 0.469 0.687 3599.557 3652.259
## 8 Mod8 0.224 0.473 0.692 3601.364 3657.579
## 9 Mod9 0.224 0.473 0.692 3601.364 3657.579
Akaike H (1974) A new look at the statistical model identification. IEEE Trans Automat Contr 19:716â723. https://doi.org/10.1109/TAC.1974.1100705
Coelho, I., Peixoto, M. A., Marcal, T. D. S., Bernardeli, A., Alves, R., Lima, R. O., & Bhering, L. L. (2021). Accounting for spatial trends in multi-environment diallel analysis in maize breeding. PloS one, 16(10), e0258473.
Cullis, B.R., A.B. Smith, and N.E. Coombes. 2006. On the design of early generation variety trials with correlated data. J. Agric. Biol. Environ. Stat. 11(4): 381â393. doi: 10.1198/108571106X15444
Werner, C. R., Gemenet, D. C., & Tolhurst, D. J. (2024). FieldSimR: an R package for simulating plot data in multi-environment field trials. Frontiers in plant science, 15, 1330574.
Gilmour, A.R.,Cullis, B.R. & Verbyla, A.P. (1997). Accounting for natural and extraneous variationin the analysis of field experiments.Journal of Agricultural, Biological and Environmental Statistics2,269â293
Isik et al., 2017. Genetic Data Analysis for Plant and Animal Breeding.
Mrode, R. A. (2014). Linear models for the prediction of animal breeding values. Cabi.
Schwarz G. Estimating the dimension of a model. Ann Stat. 1978; 6: 461â464.
Verbyla, A. P. (2019). A note on model selection using information criteria for general linear models estimated using REML. Australian & New Zealand Journal of Statistics, 61, 39â50. doi: 10.1111/anzs.12254.
Wilks SS. The Large-Sample Distribution of the Likelihood Ratio for Testing Composite Hypotheses. Ann Math Stat. 1938; 9: 60â62. https://doi.org/10.1214/aoms/1177732360
University of Florida, mresende@ufl.eduâŠī¸
University of Florida, deamorimpeixotom@ufl.eduâŠī¸