Empirical assessment of a genomic breeding strategy in perennial ryegrass

In genomic selection (GS), DNA markers and trait data are integrated in a model that then predicts genomic-estimated breeding values (GEBVs) for individuals from marker information alone, improving breeding efficiency. This study assessed a genomic breeding strategy (APWFGS) for improving dry matter yield (DMY) in perennial ryegrass. In APWFGS, the best-performing half-sibling families (HS) are identified using phenotypic data and GS is used to select the best individuals within HS. Four selections were made from each of three breeding populations: Base (random sample of plants from all HS), HSP (random sample from the six phenotypically best HS), APWFGS and APWFGS-L (top or bottom 5% of plants, respectively, selected by GEBV from the six HS). Plants from within the selections were polycrossed, creating 12 experimental synthetics that were evaluated for DMY (n=7 harvests) at two locations over 18 months. In each population, mean DMY across locations and harvests showed a trend of APWFGS > HSP > Base, with APWFGS-L closest to Base performance. When averaged across populations, APWFGS increased DMY by 43% (P<0.05) compared to Base, more than twice the level of improvement achieved with HSP. These results showed that APWFGS can substantially improve selection response for a genetically complex trait from a single breeding cycle.


Introduction
Perennial ryegrass (Lolium perenne L.) is the most important source of nutrition for ruminant livestock grown on New Zealand farms and contributes nearly $14.6B to annual GDP (NZIER 2016). Major goals for genetic improvement in perennial ryegrass are annual and seasonal dry matter yield (DMY) (Williams et al., 2007; Lee et al., 2012), with persistence, nutritive quality, disease and pest resistance and symbiont compatibility also targeted. Significant improvements in DMY have been achieved through plant breeding, but the rate of genetic gain (∆G) for this genetically complex trait has been moderate, estimated at only 0.3-0.7% per annum (Van Wijk and Reheul 1990; Easton et al., 2002; Sampoux et al., 2010). Genomic selection (GS), which is already implemented in livestock species (Meuwissen et al., 2016) and is under adoption in major economic plant species (Crossa et al., 2014; Lin et al., 2014), is a promising approach for improving ∆G for complex traits, such as DMY, in forages (Hayes et al., 2013).
To undertake GS, a set of individuals or families, referred to as a training set, is genotyped using tens to hundreds of thousands of genome-wide DNA markers (single nucleotide polymorphisms, SNP) and phenotyped for the trait of interest. The genotypic and phenotypic data are combined to derive a statistical model, referred to as a genomic prediction model. The genomic prediction model is subsequently used to predict the trait value, or genomic-estimated breeding value (GEBV), of new individuals by entering their SNP genotype information into the model. Superior plants can therefore be selected on GEBVs without recourse to expensive or long-term phenotyping, and then recombined by crossing to create the next breeding generation or a new cultivar.
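As an illustration of the train-then-predict workflow described above, the following is a minimal sketch of genomic prediction by GBLUP. It assumes simulated 0/1/2 SNP genotypes, a VanRaden-type genomic relationship matrix and a known heritability. The function names, the simulated data and the choice of relationship matrix are assumptions made for illustration only; they are not the models, data or software used in this study.

```python
import numpy as np

def vanraden_grm(M):
    """Genomic relationship matrix from an (individuals x SNPs) matrix coded 0/1/2."""
    p = M.mean(axis=0) / 2.0                 # allele frequency of each SNP
    Z = M - 2.0 * p                          # centre each SNP column
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

def gblup_predict(G, y_train, train_idx, cand_idx, h2):
    """Predict GEBVs of unphenotyped candidates from training-set phenotypes."""
    lam = (1.0 - h2) / h2                    # residual-to-genetic variance ratio
    G_tt = G[np.ix_(train_idx, train_idx)]   # training x training relationships
    G_ct = G[np.ix_(cand_idx, train_idx)]    # candidate x training relationships
    mu = y_train.mean()
    alpha = np.linalg.solve(G_tt + lam * np.eye(len(train_idx)), y_train - mu)
    return G_ct @ alpha                      # GEBVs as deviations from the training mean

# Simulated example (all values hypothetical).
rng = np.random.default_rng(0)
n_train, n_cand, n_snp = 60, 20, 500
M = rng.binomial(2, 0.3, size=(n_train + n_cand, n_snp)).astype(float)
true_bv = M @ rng.normal(0.0, 0.1, n_snp)
y = true_bv[:n_train] + rng.normal(0.0, true_bv.std(), n_train)

G = vanraden_grm(M)
train_idx = np.arange(n_train)
cand_idx = np.arange(n_train, n_train + n_cand)
gebv = gblup_predict(G, y, train_idx, cand_idx, h2=0.5)
```

The candidates contribute genotypes only; their GEBVs come entirely from their genomic relationships with the phenotyped training set, which is why prediction weakens as that relatedness declines.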
A challenge, once genomic prediction models are developed, is their practical implementation in GS breeding schemes. Faville et al. (2020) demonstrated that application of genomic prediction models for DMY or heading date in perennial ryegrass could improve these traits. However, the efficacy of GS generally diminished when applied in selection populations that were less related to the original training set. This suggested that the use of GS to reduce generation interval (speeding up breeding and improving ∆G by completing more selection cycles per unit time) is likely to have diminishing returns, as the level of genetic relatedness between the training set and the selection population is expected to decline over multiple selection cycles. Based on the relatively small training sets currently available in forages (n = 200-1000), computer simulation estimated a 60-72% reduction in predictive ability for a perennial ryegrass DMY genomic prediction model between the first and second cycles of selection.
Within-cycle applications of GS, which aim to improve ∆G by increasing selection accuracy within a single breeding cycle, are therefore appealing. Modelling predicts a substantial increase in ∆G for DMY from a within-cycle strategy called among (by phenotype)-and-within (by genomic selection) half-sibling family selection (APWFGS; Barrett et al., 2021). Here, the breeder uses DMY phenotypic data from sown plot trials to identify the best-performing half-sibling families (HS) (among-family selection) and then GS is applied to select the best individuals from within those top HS (within-family selection; Figure 1). The latter component is achieved by growing seedlings from saved seed of the HS, then genotyping and generating GEBVs for the seedlings. In Figure 1, following the first breeding cycle there are options to implement a second cycle of GS, to advance selected individuals to cultivar development, or to return to commence a new cycle. A second GS cycle would be based on application of the same GS model, without further training from field data, enabling selection of another generation of elite parental seedlings based on GEBVs in the space of one year. Integration of additional traits for GS is also possible, by phenotyping the traits either on HS family plots or directly on the parental generation, as single plants.
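The contrast between shortening the generation interval and raising within-cycle accuracy can be summarised with the standard breeder's equation. The symbols below are generic textbook notation, introduced here for illustration rather than taken from this paper:

```latex
\Delta G = \frac{i \, r \, \sigma_A}{L}
```

Here $i$ is the selection intensity, $r$ the selection accuracy (the correlation between the selection criterion and the true breeding value), $\sigma_A$ the additive genetic standard deviation available to selection, and $L$ the length of the breeding cycle in years. Reducing $L$ raises ∆G only while $r$ holds up across cycles, whereas a within-cycle strategy aims to raise $r$, and the share of $\sigma_A$ exposed to selection, at a fixed $L$.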
The objective of the following work was to assess the potential of the APWFGS strategy for improving DMY in perennial ryegrass, by applying existing DMY genomic prediction models in commercial selection populations and comparing progeny against those from a conventional breeding approach, phenotypic half-sibling family selection (HSP). The HSP system differed from APWFGS only in that, instead of selecting individuals from within the best-performing HS by GEBV, a random sample of individuals was taken.

Figure 1
An among-and-within half-sibling (HS) family breeding scheme for dry matter yield (DMY) in perennial ryegrass that bases the among-family selection on phenotypic (P) data from field trials and uses genomic selection (GS) for the within-family selection component. APWFGS = among (P)-and-within (GS) family selection; GBS = genotyping-by-sequencing; SNP = single nucleotide polymorphism marker; BLUP = best linear unbiased predictor; GEBV = genomic estimated breeding value.

Plant material
Selection was undertaken in three Grasslands Innovation Ltd perennial ryegrass selection populations (SelPop I, SelPop III and SelPop V), which consisted of 96, 115 and 106 HS, respectively, that had been evaluated previously for DMY in multi-year, multi-environment plot trials. These HS were the progeny from the parental generation that had been used to train the DMY GS models in that study. The six highest-ranking HS were identified in each population based on their DMY phenotypic values, which were an index of mean DMY calculated across seasons, years and locations, as detailed in Faville et al. (2020). Forty seeds (selection candidates) were randomly sampled from stored seed of each of the six selected HS, then germinated and grown under standard greenhouse conditions for 4 weeks, until two tillers had emerged.

Genotyping-by-sequencing and genomic selection
Approximately 100 mg of leaf tissue per seedling was sampled and DNA extraction completed using the method of Anderson et al. (2018). DNA samples were used to develop GBS libraries for the selection candidates. For methodological details of GBS library development, sequencing of GBS libraries, bioinformatic data processing, SNP genotype calling and genomic relationship matrix (GRM) development, see Faville et al. (2018). GBS data generated for selection candidates were merged with data from the original GS training set, SNP genotypes were determined, and these were used to estimate a GRM covering the selection candidates plus the training set individuals (Dodds et al., 2015). DMY GEBVs for each of the 720 selection candidates were then derived by Genomic Best Linear Unbiased Prediction (GBLUP), as described in Faville et al. (2020). Briefly, five environment-specific DMY GEBVs were estimated for each selection candidate using the genomic prediction models developed previously by Faville et al. (2018): Waikato standard management; Waikato severe summer grazing management; Manawatu standard management; Manawatu severe summer grazing management; and Canterbury standard management (Table 1). Each environment-specific prediction model was trained using a multi-population training set composed of small numbers of HS from five breeding populations, including SelPop I, SelPop III and SelPop V. The five GEBVs for an individual were then entered into a weighted index to generate a single GEBV, on which selections were based.
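The index step can be sketched as follows. The weights are not given in this text, so equal weights are used as a placeholder, and the per-environment standardisation is likewise an assumption; the function name `index_gebv` and the simulated GEBVs are hypothetical.

```python
import numpy as np

# Environment labels follow the five prediction models named in the text.
ENVIRONMENTS = [
    "Waikato standard",
    "Waikato severe summer grazing",
    "Manawatu standard",
    "Manawatu severe summer grazing",
    "Canterbury standard",
]

def index_gebv(gebv_by_env, weights):
    """Combine a (candidates x environments) GEBV matrix into one index per candidate."""
    gebv_by_env = np.asarray(gebv_by_env, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Assumption: standardise each environment so the models contribute on a common scale.
    z = (gebv_by_env - gebv_by_env.mean(axis=0)) / gebv_by_env.std(axis=0)
    return z @ (weights / weights.sum())     # weighted mean of standardised GEBVs

# Simulated GEBVs for 40 seedlings from one HS family (hypothetical values).
rng = np.random.default_rng(1)
gebv_by_env = rng.normal(size=(40, len(ENVIRONMENTS)))
equal_weights = np.ones(len(ENVIRONMENTS))   # placeholder weights, not those of the study
index = index_gebv(gebv_by_env, equal_weights)
```

Unequal weights would let a breeder emphasise particular target environments or managements while still ranking candidates on a single value.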
The resulting single DMY GEBV for each individual provided a measure of across-environment performance. For each selection population, an additional 30 seeds were randomly sampled from across all HS and propagated as described above; these plants were not genotyped. Four selection groups (SG) were made in each selection population (Base, HSP, APWFGS and APWFGS-L), giving 12 SGs in total.

Table 1 Predictive ability (r_A, the Pearson correlation coefficient between predicted and observed trait values) and narrow-sense heritability (h²_n) for five environment-specific dry matter yield genomic prediction models in each of the three selection populations (SelPop I, III and V).

Population development
Individuals selected within each of the 12 SGs were grown to maturity and polycrossed under isolation (no mixing between SGs) during spring 2017 at AgResearch Grasslands, Palmerston North. Syn-1 generation seed was harvested from individual plants within each SG polycross and a balanced bulk for the SG was created by combining equal seed quantities from each plant. This resulted in 12 Syn-1 experimental synthetic populations, one for each of the SGs.

Field evaluation of SG synthetic populations
Field trials were conducted at AgResearch Ruakura in Waikato (37.78°S, 175.32°E) and AgResearch Grasslands in Manawatu (40.21°S, 175.37°E) from May 2018. Each of the SG synthetics, along with two control cultivars, was direct-drilled as 2 m rows (0.6 g seed per row) with 30 cm spacing between rows and 40 cm gaps at the ends of the rows, in a row-column design with three replicates. Soil fertility levels were adjusted to ensure nutrients did not limit plant growth. Nitrogen was applied (15-30 kg N/ha) at each defoliation. Superphosphate fertiliser (8.8 kg P/ha) was applied in late autumn each year. Trials were defoliated by sheep grazing whenever they reached the two- to three-leaf stage of development, except when DMY harvests were taken, in which case the plants were defoliated manually. Between November 2018 and May 2020, seven seasonal DMY harvests were completed at each site by manually cutting, drying and weighing herbage to determine grams of DMY per 2 m row, as described by .

Statistical analysis
Mean DMY for each SG across all harvests and locations was determined. Data were analysed by a linear mixed model, using the residual maximum likelihood (REML) option of the variance component analysis procedure in DeltaGen software v0.03 (Jahufer and Luo 2018). Repeated checks, location, harvest date and SG were treated as fixed effects; replicates, rows and columns were treated as random effects; and the model included SG-by-year, SG-by-location and SG-by-season interaction effects. The final DMY values for each SG synthetic were generated as best linear unbiased estimators (BLUEs).

Results and Discussion
This study is the first empirical assessment of a genomics-driven breeding strategy in perennial ryegrass, namely an among-and-within-HS selection method (APWFGS) that leveraged phenotypic data for the among-family selection component and GS for the within-family selection component. In forage breeding, among-and-within-family selection methods enable utilisation of 100% of the additive (heritable) genetic variation within a population (Casler and Brummer 2008), but historically this has been difficult to implement, particularly for traits such as DMY.
Commonly applied HS breeding approaches, such as HSP, use only the 25% of additive genetic variation that occurs among HS (Falconer 1989), usually based on phenotypic assessment of HS as sown plots in multiple locations over several years. The remaining 75% of the total additive genetic variation in a population occurs within the HS, but this is generally inaccessible to breeders for sward traits such as DMY. This is because within-family selection typically relies on measuring traits in single plants randomly sampled from within the HS. However, there is negligible correlation between single-plant and sward DMY (Lazenby and Rogers 1964; Hayward and Vivero 1984), so single-plant DMY cannot be used to reliably select for sward DMY within families. Genomic selection makes it possible to apply meaningful within-HS selection pressure for sward DMY to single plants, by using genomic prediction models trained on sown row or plot DMY data. Esfandyari et al. (2020) showed that the ability to accurately select single plants for sward traits, by using sward trait GEBVs, substantially increased ∆G.
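The 25% and 75% figures follow from the covariance among half-sibs. As a short worked statement, assuming non-inbred, randomly mated parents and ignoring epistasis (the symbols are standard quantitative-genetics notation, introduced here for illustration):

```latex
\sigma^2_{\text{among HS}} = \operatorname{Cov}(\text{HS}) = \tfrac{1}{4}\,\sigma^2_A,
\qquad
\sigma^2_{A,\ \text{within HS}} = \sigma^2_A - \tfrac{1}{4}\,\sigma^2_A = \tfrac{3}{4}\,\sigma^2_A
```

Selection among HS alone therefore acts on one quarter of the additive variance $\sigma^2_A$, and accurate within-family selection is what opens up the remaining three quarters.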

Selection
With reference to Figure 1, the first steps of the breeding scheme (HS family generation, phenotypic evaluation of HS, genotyping parents and GS model training) were achieved previously as described in Faville et al. (2018), enabling selection of the six top-ranked HS from SelPop I, III and V. Random selection of two individuals per HS followed and these were used to generate HSP SG synthetics for each selection population at among-family selection pressures of 6%, 5% and 5% for SelPop I, III and V, respectively. Genotypes were successfully called at 777k SNP loci for 701 individuals sampled from the selected HS (235 individuals from each of SelPop III and V and 231 from SelPop I) plus 566 training set individuals. This allowed generation of a DMY GEBV for mean performance across environments for the selection population individuals. Ranking of the plants by DMY GEBV enabled selection of the two top-ranked and two bottom-ranked individuals within each HS sample from SelPop I, III and V. This provided a within-family selection pressure of 5% in each selection population. The selected individuals were used to generate APWFGS and APWFGS-L SG synthetics, respectively, for each selection population. Overall, for these SGs the among- and within-family selection pressures were 5-6% and 5%, respectively.
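The selection pressures and the within-family ranking step can be reproduced with a short sketch. The family counts, the six selected HS, the 40 seedlings sampled per HS and the two plants selected per HS come from the text; the function names and the example GEBVs are hypothetical. The computed among-family pressures (about 6.3%, 5.2% and 5.7%) sit within the 5-6% range quoted above, and the within-family figure uses the nominal 40 seedlings per family rather than the number successfully genotyped.

```python
def selection_pressure(n_selected, n_candidates):
    """Proportion of candidates retained."""
    return n_selected / n_candidates

# Number of HS families evaluated in each selection population (from the text).
HS_PER_POPULATION = {"SelPop I": 96, "SelPop III": 115, "SelPop V": 106}

# Six HS selected per population; two plants selected from 40 seedlings per HS.
among = {pop: selection_pressure(6, n_hs) for pop, n_hs in HS_PER_POPULATION.items()}
within = selection_pressure(2, 40)

def select_within_family(gebv_by_plant, n_select=2):
    """Return (top, bottom) plant IDs after ranking one HS family by GEBV."""
    ranked = sorted(gebv_by_plant, key=gebv_by_plant.get, reverse=True)
    return ranked[:n_select], ranked[-n_select:]

# Hypothetical index GEBVs for five seedlings of one family.
example_family = {
    "plant_a": 0.8,
    "plant_b": -0.3,
    "plant_c": 1.4,
    "plant_d": 0.1,
    "plant_e": -1.2,
}
top, bottom = select_within_family(example_family)

for pop, value in among.items():
    print(f"{pop}: among-family selection pressure = {value:.1%}")
print(f"within-family selection pressure = {within:.1%}")
```

In this sketch the `top` plants would enter the APWFGS synthetic and the `bottom` plants the divergent APWFGS-L synthetic.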

Field evaluation of SG synthetics
Evaluation of the 12 SG synthetics at two locations over 18 months generated DMY data from seven seasonal harvests at Manawatu and Waikato. The expectation from simulation studies was that the GS breeding system, APWFGS, should deliver improved DMY outcomes compared with the conventional phenotypic selection approach, HSP. That expectation was supported by the selection responses observed in the current study.
The magnitude of selection response differed by population, but in each of SelPop I, SelPop III and SelPop V, DMY across locations and harvests showed a consistent trend of APWFGS > HSP > Base (Table 2), where the latter represented the source population. Greater selection responses in SelPop I and SelPop V may have been influenced by higher predictive abilities (r_A) for some of the DMY prediction models used (Table 1), particularly Wai SEV and Man SEV, but because the models were used as part of an index, it was not possible to assess that definitively. When averaged across all selection populations (All SelPop), applying HSP improved DMY by 18% compared to the Base control (Table 2, Figure 2), although this difference was not significant. In contrast, APWFGS increased DMY by 43% (P<0.05) relative to the Base population, more than double the level of improvement achieved using the conventional method.
These relative differences corresponded closely to those from a simulation study reported by Barrett et al. (2021), which showed that application of 5% among- and 5% within-HS selection pressure in APWFGS doubled ∆G relative to HSP in a perennial ryegrass selection population. Similarly, modelling by Esfandyari et al. (2020) showed that ∆G increased in GS schemes because it enabled more accurate selection of single plants for sward traits by using GEBVs.

Figure 2
Relative performance of selection group Syn-1 synthetics (All SelPop in Table 2) for dry matter yield (DMY) averaged across three selection populations (SelPop I, III, V) and multiple harvests (n=7) at two locations. Data are best linear unbiased estimators (BLUEs); error bars are SE. APWFGS = among (P)-and-within (GS) half-sibling family selection (P = phenotypic, GS = genomic selection), selecting for high DMY; APWFGS-L = among (P)-and-within (GS) half-sibling family selection, selecting for low DMY; HSP = half-sibling family selection; Base = source population.

Table 2 (excerpt): DMY relative to Base (%) for APWFGS: +68 (SelPop I), +13 (SelPop III), +50 (SelPop V), +43 (All SelPop).

Faville et al., Empirical assessment of a genomic breeding strategy in perennial ryegrass
Data are BLUEs (±SE) for DMY across multiple seasonal harvests (n=7) and two evaluation environments. DMY values with different letters within a column (SelPop) are significantly different (P<0.05), supported by the least significant difference (LSD 0.05). APWFGS = among (P)-and-within (GS) half-sibling family selection (P = phenotypic, GS = genomic selection), selecting for high DMY; APWFGS-L = among (P)-and-within (GS) half-sibling family selection, selecting for low DMY; HSP = half-sibling family selection; Base = source population.
The impact of APWFGS was exemplified further by comparison with a divergent selection, APWFGS-L, which was based on selection of the two lowest-ranking individuals per HS as opposed to the two best plants. In all selection populations except SelPop I, DMY performance of the APWFGS-L SG was significantly (P<0.05) lower than that of APWFGS (Table 2). When averaged across all selection populations (All SelPop), there was a 39% DMY differential between these SGs (Table 2, Figure 2) and, in most cases, the DMY of APWFGS-L was close to that of the unimproved Base population. This result indicated that a high level of additive genetic variation for DMY existed within HS and illustrated the positive impact of being able to identify and eliminate poor candidates, and hence undesirable alleles, from the breeding programme.
Due to external factors, the field evaluation period was shorter than the three years typically employed for DMY assessment of perennial ryegrass (Easton et al., 2001). However, previous studies based on plots showed that the ranking of trial entries was reliably consistent between first-year performance and DMY in later years (Easton et al., 2001; Chapman et al., 2015). Stability of diploid synthetics, according to the Hardy-Weinberg rule, is achieved from the Syn-2 generation onwards, provided mating is completely at random and there is no selection pressure (Allard 1960). The SG synthetics evaluated in the current study were a Syn-1 generation, so it was possible that the results were influenced by non-additive genetic effects, notably heterosis. This was mitigated by ensuring that selection sizes within HS were balanced, using the same number of Syn-0 individuals from each HS (two per HS; n = 12 per SG) for each of the APWFGS, HSP and APWFGS-L SGs. However, the Base selections used 30 randomly sampled Syn-0 parents, to enable capture of the overall within-population genetic diversity (Kubik et al., 2001). In perennial ryegrass, a slight decrease in yield from Syn-1 to Syn-2 is often observed and the extent of decline depends on the number of parents in the Syn-0 and the inbreeding coefficient of these parents (Wright 1922; Allard 1960; Becker 1988). Due to the discrepancies in Syn-0 parent numbers, a smaller decline in performance can be expected in the Syn-1 Base population than in the nine other Syn-1 SGs when moving to Syn-2. However, given that (i) the number of parents in the nine selected Syn-0 SG populations was still quite high, (ii) the Syn-0 parents were unlikely to be inbred, and (iii) the ratio of non-additive to additive genetic variance in ryegrass is thought to be small (Breese and Hayward 1972), any upward bias in performance of the Syn-1 SG populations was most likely negligible.
Future development and assessment of Syn-2 generations from the 12 SGs, over a longer evaluation period, is required to validate the current results and enable fair comparison against industry cultivars.
The five genomic prediction models used in this study had low to moderate predictive ability, r_A. Additional improvement in selection response from the APWFGS breeding strategy may be expected by increasing the r_A of the genomic prediction models used. For example, modelling by  estimated a 50% increase in ∆G for DMY by improving r_A from 0.27 to 0.50 at a fixed selection pressure. Utilising larger training sets to develop the prediction models is one way to achieve higher r_A, and Esfandyari et al. (2020) demonstrated that, in a long-term breeding programme, r_A could be increased by enlarging the training set through recruitment of data from multiple breeding cycles started in consecutive years. That approach is compatible with the breeding system described here. Greater gains would also be expected through the application of higher selection pressures than those tested in the current study.

Conclusions
The results provided empirical evidence that a GS breeding approach (APWFGS), when applied as an adjunct to a conventional breeding strategy (HSP), considerably improved selection response for a genetically complex trait from a single breeding cycle. This was achieved not by reducing generation interval, but by improving selection accuracy and more fully exploiting within-family additive genetic variation. It can be recommended that the seed industry adopt genomic breeding, supported by continued research to improve efficiency and accuracy, as well as on-farm evaluation to monitor real-world impact.