Resemblance Between Individuals

Genetic Variation and Genetic Progress

Genetic variation is the cornerstone of any successful breeding program. Without it, selection cannot produce genetic progress. The relationship between genetic variation at generation \(t\) and inbreeding is described by:

\[\sigma^2_t = \left(1 - (F_t - F_{base})\right) \cdot \sigma^2_{base}\]

Where:

\(\sigma^2_t\) = additive genetic variance at generation \(t\)
\(\sigma^2_{base}\) = additive genetic variance in the base (reference) population
\(F_t\) = inbreeding coefficient at generation \(t\)
\(F_{base}\) = inbreeding coefficient of the base population

Interpretation: As inbreeding accumulates (i.e., \(F_t - F_{base}\) increases), additive genetic variance erodes proportionally. This directly reduces the response to selection, making inbreeding management a central concern in long-term breeding programs.
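
As a minimal sketch of the formula above, the following R snippet computes how much of the base additive variance remains at a few inbreeding levels. The helper name `remaining_variance` and the example values are illustrative assumptions, not part of the original material.

```r
# Additive variance remaining at generation t, given the inbreeding accumulated
# since the base population (hypothetical helper, for illustration only).
remaining_variance <- function(F_t, F_base = 0, var_base = 1) {
  (1 - (F_t - F_base)) * var_base
}

F_levels <- c(0, 0.05, 0.10, 0.25)   # assumed example values
data.frame(
  F_t      = F_levels,
  var_left = remaining_variance(F_levels)  # fraction of the base variance
)
```

With `var_base = 1` the result reads directly as a fraction, so an accumulated \(F_t - F_{base}\) of 0.25 leaves 75% of the base additive variance.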

Measuring and Managing Inbreeding

Increment of Inbreeding per Generation (\(\Delta F\))

Animal breeders typically target the rate of inbreeding per generation (\(\Delta F\)) rather than the absolute level of \(F\), because \(\Delta F\) reflects the process driving inbreeding rather than the accumulated state.

The inbreeding increment between consecutive generations is:

\[\Delta F = \frac{F_t - F_{t-1}}{1 - F_{t-1}}\]

This formulation expresses the new inbreeding gained at generation \(t\) relative to the remaining non-autozygous fraction \((1 - F_{t-1})\), making it a true per-generation rate rather than a simple difference.
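
The calculation can be sketched in R for a vector of mean inbreeding coefficients, one per consecutive generation. The function name `delta_F_series` and the trajectory are assumptions chosen for illustration.

```r
# Per-generation rate of inbreeding from mean F values of consecutive
# generations (hypothetical helper).
delta_F_series <- function(F_vec) {
  F_prev <- head(F_vec, -1)   # F at generation t - 1
  F_curr <- tail(F_vec, -1)   # F at generation t
  (F_curr - F_prev) / (1 - F_prev)
}

# Illustrative trajectory built with a constant rate of 1% per generation
F_traj <- 1 - (1 - 0.01)^(0:5)
delta_F_series(F_traj)   # recovers the constant rate used to build F_traj
```

Note that the simple differences `diff(F_traj)` shrink over time, while the rate relative to \((1 - F_{t-1})\) stays constant.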

Typical Guidelines

Parameter	Recommended Range
\(\Delta F\) per generation	\(\approx 0.5\%\) to \(1\%\)
Effective population size (\(N_e\))	\(\approx 50\) to \(100\)

These thresholds balance the need for genetic gain through selection against the long-term cost of inbreeding depression and loss of genetic diversity. Plant breeders face the same problem without a specific guideline, since the management of inbreeding and diversity follows different patterns in plants.

Effective Population Size (\(N_e\))

The effective population size \(N_e\) summarizes the rate of inbreeding in a single intuitive parameter — the size of an idealized random-mating population that would accumulate inbreeding at the same rate as the actual population:

\[N_e = \frac{1}{2\Delta F}\]

A larger \(N_e\) implies slower inbreeding accumulation and better long-term sustainability of the breeding program.
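
A small R sketch of the conversion in both directions is given here. The helper names are hypothetical and the input values are examples only.

```r
# Convert between the rate of inbreeding and effective population size
# (hypothetical helper names).
ne_from_delta_F <- function(delta_F) 1 / (2 * delta_F)
delta_F_from_ne <- function(Ne) 1 / (2 * Ne)

ne_from_delta_F(c(0.005, 0.01))   # 100 and 50
delta_F_from_ne(c(50, 100))       # 0.01 and 0.005
```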

Cumulative Inbreeding Rate Over \(t\) Generations

When evaluating inbreeding accumulated over multiple generations (e.g., comparing two time points separated by \(t\) generations), the per-generation \(\Delta F\) can be back-calculated as:

\[\Delta F_t = 1 - \left(\frac{1 - F_t}{1 - F_0}\right)^{1/t}\]

Where:

\(F_0\) = inbreeding at the initial generation
\(F_t\) = inbreeding at generation \(t\)
\(t\) = number of generations elapsed
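
The expression can be recovered from the per-generation definition of \(\Delta F\), under the assumption that \(\Delta F\) is constant over the interval. Rearranging the definition gives a recursion for the non-inbred fraction:

\[1 - F_t = (1 - \Delta F)(1 - F_{t-1})\]

Applying it \(t\) times from generation 0:

\[1 - F_t = (1 - \Delta F)^t (1 - F_0)\]

Solving for \(\Delta F\) returns the back-calculation formula, and setting \(F_0 = 0\) gives the trajectory \(F_t = 1 - (1 - \Delta F)^t\).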

This is particularly useful when generational boundaries are not clearly defined (e.g., overlapping generations) or when comparing across programs with different generation lengths.
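
The back-calculation can be sketched in R as follows. The function name `delta_F_over_t` and the inbreeding values are assumptions for illustration.

```r
# Average per-generation rate of inbreeding between two time points
# separated by t generations (hypothetical helper).
delta_F_over_t <- function(F_0, F_t, t) {
  1 - ((1 - F_t) / (1 - F_0))^(1 / t)
}

# Illustrative values: F rises from 0.02 to 0.10 over 10 generations
dF <- delta_F_over_t(F_0 = 0.02, F_t = 0.10, t = 10)
dF

# Round trip: compounding dF for 10 generations from F_0 returns F_t
F_check <- 1 - (1 - 0.02) * (1 - dF)^10
F_check
```

The round trip is a quick sanity check that the back-calculated rate is consistent with the two observed inbreeding levels.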

# Demonstrate F accumulation over generations under constant delta F

generations <- 0:50
delta_F_values <- c(0.0025, 0.005, 0.01, 0.02)  # 0.25%, 0.5%, 1%, 2%
colors <- c("steelblue", "forestgreen", "orange", "tomato")

plot(
  NULL,
  xlim = c(0, 50), ylim = c(0, 1),
  xlab = "Generation",
  ylab = expression(F[t] ~ "(Inbreeding coefficient)"),
  main = expression("Inbreeding Accumulation Under Constant " * Delta * F)
)

for (k in seq_along(delta_F_values)) {
  dF  <- delta_F_values[k]
  F_t <- 1 - (1 - dF)^generations
  lines(generations, F_t, col = colors[k], lwd = 2)
}

legend(
  "topleft",
  legend = paste0("\u0394F = ", delta_F_values * 100, "% (Ne \u2248 ",
                  round(1 / (2 * delta_F_values)), ")"),
  col    = colors,
  lwd    = 2,
  bty    = "n"
)
# Reference line at F = 0.25 (the level expected in offspring of a full-sib mating)
abline(h = 0.25, lty = 2, col = "grey50")
text(51, 0.26, "F = 0.25", adj = 1, cex = 0.8, col = "grey40")

Inbreeding trajectory over generations for different delta F values

Estimating the Inbreeding Coefficient (\(F\))

Definition and Individual-Level Estimation

The inbreeding coefficient \(F\) at generation \(t\) is formally defined as the probability that two alleles at a given locus within an individual are identical by descent (IBD) — that is, they are copies of the same ancestral allele traced through the pedigree.

At the individual level, \(f_i\) is estimated from a relationship matrix as:

\[f_i = \frac{K_{ii} - 1}{\emptyset - 1}\]

This is the familiar diag(G) - 1 formulation. For a non-inbred individual in Hardy-Weinberg equilibrium, \(K_{ii} = 1\) and \(f_i = 0\). For a doubled haploid, completely homozygous at every locus, \(K_{ii} = 2\) and \(f_i = 1\), correctly capturing complete homozygosity. For polyploids (\(\emptyset > 2\)), the denominator \((\emptyset - 1)\) rescales the diagonal appropriately for the higher ploidy level, ensuring \(f_i\) remains on the \([0, 1]\) scale.

And the population-level inbreeding coefficient at generation \(t\) is the average across all \(n\) individuals:

\[F_t = \frac{\sum f_i}{n}\]

Where:

\(K_{ii}\) = diagonal element of the relationship matrix \(K\) (either pedigree-based A or genomic G) for individual \(i\)
\(\emptyset\) = ploidy of the individual (e.g., \(\emptyset = 2\) for diploids)
\(n\) = number of individuals in the population
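
A minimal R sketch of both estimators, using a toy diploid relationship matrix, is given here. The helper name `inbreeding_from_K` and the matrix values are assumptions; only the diagonal matters for \(f_i\), and the off-diagonal entries are placeholders.

```r
# Individual inbreeding from the diagonal of a relationship matrix K
# (pedigree-based A or genomic G). Helper name is hypothetical.
inbreeding_from_K <- function(K, ploidy = 2) {
  (diag(K) - 1) / (ploidy - 1)
}

# Toy diploid example: a non-inbred individual, the offspring of a
# full-sib mating, and a doubled haploid
K <- matrix(
  c(1.00, 0.50, 0.00,
    0.50, 1.25, 0.00,
    0.00, 0.00, 2.00),
  nrow = 3, byrow = TRUE
)

f_i <- inbreeding_from_K(K)
f_i               # 0.00 0.25 1.00
F_pop <- mean(f_i)   # population-level F_t
F_pop
```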

Note on the matrix \(K\): When base-population allele frequencies are used (rather than current ones), this estimator approximates IBD-based inbreeding, because deviations from base-population heterozygosity reflect the cumulative effect of inbreeding since the base generation. Using founder allele frequencies when building \(G\) therefore brings it closer to a true IBD measure.
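
To make the role of the allele frequencies concrete, the sketch below builds a genomic relationship matrix with user-supplied frequencies, using the common centered-and-scaled construction (VanRaden's first method) as an assumed choice; the text above does not prescribe a specific construction. The function name `build_G`, the simulated genotypes, and the frequencies are all hypothetical.

```r
# Genomic relationship matrix with user-supplied allele frequencies
# (e.g., base-population frequencies instead of current ones).
# M: individuals x markers matrix coded 0/1/2 (diploid)
# p: frequency of the counted allele at each marker
build_G <- function(M, p) {
  Z <- sweep(M, 2, 2 * p)                   # centre each marker by 2p
  tcrossprod(Z) / (2 * sum(p * (1 - p)))    # scale by expected heterozygosity
}

set.seed(1)
p_base <- runif(200, 0.1, 0.9)                       # assumed founder frequencies
M <- sapply(p_base, function(p) rbinom(10, 2, p))    # 10 simulated individuals
G <- build_G(M, p_base)

f_i <- diag(G) - 1   # diploid case: ploidy - 1 = 1
mean(f_i)
```

Because the genotypes here are simulated in Hardy-Weinberg proportions at the same frequencies used for centering, the mean of `f_i` is expected to be close to zero; genotypes that have drifted away from `p_base` would push it upward.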

Beyond pedigree-based estimation, \(F\) can be computed from genomic data using several approaches:

Individual Inbreeding as Excess Homozygosity

A practical interpretation of \(f_i\): compared to a non-inbred individual from the same population, how much more homozygous is this individual? The corresponding homozygosity-based measure is computed from allele frequencies as:

\[F_t = \frac{1}{m} \sum_{i=1}^{m} \sum_{j} p_{ij}^2\]

Where \(p_{ij}\) is the frequency of allele \(j\) at marker \(i\), and \(m\) is the number of markers.

This formula computes the probability of identity in state (IBS) averaged across all markers, anchored to the allele frequencies used.

Relative Decrease in (Expected) Heterozygosity

The inbreeding coefficient is formally equivalent to the relative reduction in expected heterozygosity compared to a non-inbred reference:

\[F_t = 1 - \frac{H_t}{H_0}, \qquad H_t = 1 - \frac{1}{m} \sum_i \sum_j p_{ij}^2\]

Where:

\(H_0 = 2pq\) = expected heterozygosity under Hardy-Weinberg equilibrium (base population)
\(H_t\) = observed/expected heterozygosity at generation \(t\)
\(p_{ij}\) = frequency of the \(j\)-th allele at the \(i\)-th marker
\(m\) = number of markers

Key insight: The relative (not absolute) decrease is used because it normalizes against baseline diversity, making \(F\) a descriptor of the inbreeding process rather than a snapshot of current heterozygosity. This is especially important when comparing populations or species with different levels of ancestral diversity.
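
The relative decrease can be sketched in R from allele frequencies at two time points. The helper name `expected_het` and the frequencies are assumptions for illustration; markers are taken to be biallelic.

```r
# Expected heterozygosity averaged over markers, from a markers x alleles
# matrix of allele frequencies (each row sums to 1). Hypothetical helper.
expected_het <- function(freq) {
  mean(1 - rowSums(freq^2))
}

# Illustrative biallelic frequencies at three markers
p_base <- c(0.5, 0.3, 0.2)   # base population
p_now  <- c(0.7, 0.1, 0.2)   # generation t
H_0 <- expected_het(cbind(p_base, 1 - p_base))
H_t <- expected_het(cbind(p_now,  1 - p_now))

F_t <- 1 - H_t / H_0   # relative loss of expected heterozygosity
F_t
```

For biallelic markers, `1 - rowSums(freq^2)` reduces to \(2pq\), so `H_0` matches the Hardy-Weinberg expectation given above.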

# Illustrate f_i = (K_ii - 1) / (ploidy - 1) across ploidy levels

K_diag  <- seq(1, 3, by = 0.01)   # diagonal of relationship matrix
ploidy  <- c(2, 4, 6)              # diploid, tetraploid, hexaploid
colors  <- c("steelblue", "forestgreen", "tomato")

plot(
  NULL,
  xlim = c(1, 3), ylim = c(0, 1),
  xlab = expression(K[ii] ~ "(Diagonal of relationship matrix)"),
  ylab = expression(f[i] ~ "(Individual inbreeding coefficient)"),
  main = expression(f[i] == frac(K[ii] - 1, "\u00d8" - 1) ~
                      "across ploidy levels")
)

for (k in seq_along(ploidy)) {
  fi <- (K_diag - 1) / (ploidy[k] - 1)
  fi[fi > 1] <- NA   # f_i is bounded at 1, so values beyond the bound are not drawn
  lines(K_diag, fi, col = colors[k], lwd = 2)
}

legend(
  "topleft",
  legend = paste0("Ploidy = ", ploidy),
  col    = colors,
  lwd    = 2,
  bty    = "n"
)
abline(h = 1, lty = 2, col = "grey60")
abline(h = 0, lty = 2, col = "grey60")

Individual inbreeding f_i for different ploidy levels as K_ii varies

University of Florida, deamorimpeixotom@ufl.edu
