REPEATABILITY ANALYSIS OF GUAVA FRUIT AND LEAF CHARACTERISTICS

ABSTRACT: Psidium guajava L. (guava) is an important species that presents high genetic variability due to its mixed reproductive system, which is desired in breeding programs. Repeatability is an important tool for the selection of genotypes in pre-breeding studies. When genetic variability is present, knowing the number of samples to be used in repeatability studies is indispensable. This study aims to determine the number of measurements necessary to optimize resources while maintaining the reliability of the results for the variables evaluated in P. guajava. The experiment was carried out with 79 P. guajava genotypes collected in three Brazilian states: Espírito Santo, São Paulo, and Minas Gerais. The following characteristics were evaluated: young leaf length and width; developed leaf length and width; fruit length; fruit diameter and fruit cavity diameter; and fruit weight and pulp weight. For the evaluated characteristics, we determined the deviance, the permanent phenotypic and temporary environment variances, the coefficients of repeatability and determination, the accuracy, and the estimated number of measurements required. We established that four measurements are sufficient to estimate the coefficient of repeatability with a reliability of 80% for developed leaf width, pulp weight, fruit diameter, and fruit cavity diameter.


INTRODUCTION
Psidium guajava L. (guava, Myrtaceae), native to the American continent, has a wide-ranging origin, from Mexico to Peru and Brazil (PEREIRA; KAVATI, 2011). This species has great importance in traditional medicine because of the presence of extracts and metabolites in its leaves and fruits (GUTIÉRREZ et al., 2008) and is popular in many tropical and subtropical countries (RODRÍGUEZ et al., 2010). The leaves of P. guajava act as antimicrobial agents and are also used in the treatment of coughs (JAIARJ et al., 1999). The guava fruit is rich in vitamin C and has high nutritional value, containing essential amino acids, dietary fiber, pectin, and antioxidants (GUTIÉRREZ et al., 2008).
Psidium guajava presents high genetic variability due to its mixed reproductive system (ALVES; FREITAS, 2007), from the germination of seedlings from seeds of heterozygous parent plants (PESSANHA et al., 2011) to its ability to adapt to different soil and climatic conditions. While high genetic variability is desired in breeding programs, it is important to note the number of samples showing such variability (MANFIO et al., 2011) in order to optimize cost, time, and labor, without the loss of selective efficiency in the later stages (CHIA et al., 2009). Thus, the coefficient of repeatability that determines the appropriate number of samples should always be estimated (CRUZ et al., 2012).
Repeatability is an indispensable tool in the selection of genotypes in pre-breeding as it estimates the maximum value that heritability can reach by expressing the proportion of the phenotypic variance attributed to the genetic differences, compounded with the permanent effects that act on the variety (FERREIRA et al., 2010).
For this reason, our goal was to determine the number of necessary measures, optimize resources, and maintain the reliability of the results for nine characteristics analyzed in P. guajava.

MATERIAL AND METHODS
We studied the genetic material from 79 P. guajava genotypes collected in three different Brazilian states (Espírito Santo, Minas Gerais, and São Paulo). Of the evaluated samples, 61 were from the Southern and Caparaó regions in the state of Espírito Santo (Southern region: Cachoeiro de Itapemirim, Jerônimo Monteiro, Mimoso do Sul, and Muqui; Caparaó region: Alegre and Guaçuí); ten were from the Alta Paulista region in the state of São Paulo (Arco-Íris, Herculândia, and Tupã); and eight were from the Caparaó region in the state of Minas Gerais (Caparaó). The climate of the state of Espírito Santo is classified as Aw (except for the city of Guaçuí, which is classified as Cfa), that of Minas Gerais as Cwb, and that of São Paulo as Cfa (KÖPPEN, 1928).
Five leaves and five fruits were evaluated in each genotype for the following characteristics: young leaf length and width (YLL and YLW); developed leaf length and width (DLL and DLW); fruit length (FL); fruit diameter and fruit cavity diameter (FD and FCD), measured using a digital caliper with a precision of 0.01 mm; and fruit weight and pulp weight (FW and PW), measured using an electronic scale with a precision of 0.01 g.
The evaluated characteristics were analyzed for deviance, permanent phenotypic variance ($\hat{\sigma}_p^2$), variance of the temporary environment ($\hat{\sigma}_e^2$), coefficients of repeatability ($r$) and determination ($R^2$), and the predicted number of measurements required ($\eta_0$) for an adequate estimation of the individual's real value.
The deviance analysis was performed through the restricted maximum likelihood method (REML) using the basic repeatability model that assumes no design, which can be written in matrix form as equation [1]:

$$y = Xm + Zp + \varepsilon \qquad [1]$$

with $\operatorname{Var}(y) = V = ZPZ^{\top} + R$, $P = I\sigma_p^2$, and $R = I\sigma_e^2$,

where: $y$ denotes the vector of the variable to be analyzed; $m$ denotes the vector of measurement effects assumed as fixed and added to the overall average; $p$ denotes the vector of permanent phenotypic effects assumed as random; $\varepsilon$ denotes the vector of random errors; $X$ is the incidence matrix for the fixed effects; $Z$ is the incidence matrix for the permanent phenotypic effects; $V$ denotes the variance and covariance matrix of $y$; $P$ denotes the variance and covariance matrix for the permanent phenotypic effects; $R$ denotes the residual variance and covariance matrix; $\sigma_p^2$ denotes the variance of the permanent phenotypic values (genetic variation + permanent environment variation); and $\sigma_e^2$ denotes the residual variance due to the temporary environment.
Based on this model, the deviance is given by $D = -2\ln(L)$, with

$$\ln(L) = -\tfrac{1}{2}\ln\left|X^{\top}V^{-1}X\right| - \tfrac{1}{2}\ln\left|V\right| - \tfrac{1}{2}\,(y - Xm)^{\top}V^{-1}(y - Xm)$$

where: $\ln(L)$ denotes the maximum point of the logarithmic function of the restricted maximum likelihood (PATTERSON; THOMPSON, 1971); $y$ denotes the vector of the variable to be analyzed; $m$ denotes the vector of measurement effects assumed as fixed and added to the overall average; $X$ is the incidence matrix for the fixed effects; and $V$ denotes the variance and covariance matrix of $y$.
The estimates of the likelihood ratio test (LRT) statistic for the variables in the study were obtained by [7] (RESENDE, 2002a):

$$LRT = -2\left[\ln(\hat{L}_{r}) - \ln(\hat{L}_{c})\right] \qquad [7]$$

where: $\hat{L}_{r}$ denotes the estimate of the maximum point of the restricted likelihood function for the reduced model (without the permanent phenotypic effects) and $\hat{L}_{c}$ denotes the estimate of the maximum point of the restricted likelihood function for the complete model (with the permanent phenotypic effects).
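As an illustration of the likelihood ratio test described above, the following minimal sketch computes the LRT as the difference between the deviances of the reduced and complete models and compares it with the χ² distribution with 1 degree of freedom. The function names and the deviance values are hypothetical and are not taken from Table 1; the critical values 3.84 (5%) and 6.63 (1%) are the usual tabulated χ² values for 1 degree of freedom.

```python
import math


def lrt_statistic(deviance_reduced, deviance_complete):
    """LRT as the difference between the deviances (-2 ln L) of the two models."""
    return deviance_reduced - deviance_complete


def chi2_upper_tail_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    if x <= 0:
        return 1.0
    # For 1 df, P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2.0))


# Hypothetical deviances, for illustration only (not values from the study).
lrt = lrt_statistic(deviance_reduced=1520.4, deviance_complete=1498.1)
p_value = chi2_upper_tail_1df(lrt)
significant_5pct = lrt > 3.84  # tabulated chi-square value, 1 df, 5%
significant_1pct = lrt > 6.63  # tabulated chi-square value, 1 df, 1%
print(f"LRT = {lrt:.2f}, p = {p_value:.3g}")
```

A significant LRT indicates that removing the permanent phenotypic effects worsens the fit, which is how the presence of variability among genotypes is tested.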
The methods adopted to estimate the coefficients of repeatability were restricted maximum likelihood (REML) using the basic repeatability model that assumes no design (RESENDE, 2002a), principal components based on the correlation (PCC) and covariance matrices (PCCV) (ABEYWARDENA, 1972), and structural analysis based on the correlation (SAC) and covariance matrices (SACV) (MANSOUR et al., 1981) among the repeated measures.
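Based on the adopted model [1], the coefficient of repeatability is estimated by [8] (RESENDE, 2002b):

$$r = \frac{\hat{\sigma}_p^2}{\hat{\sigma}_p^2 + \hat{\sigma}_e^2} \qquad [8]$$

$$\hat{\sigma}_e^2 = \frac{y^{\top}y - \hat{m}^{\top}X^{\top}y - \hat{p}^{\top}Z^{\top}y}{N - \rho(X)} \qquad [9]$$

$$\hat{\sigma}_p^2 = \frac{\hat{p}^{\top}\hat{p} + \hat{\sigma}_e^2\,\operatorname{tr}(C^{22})}{S} \qquad [10]$$

$$C^{-} = \begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix}^{-} = \begin{bmatrix} C^{11} & C^{12} \\ C^{21} & C^{22} \end{bmatrix} \qquad [11]$$

where: $\hat{\sigma}_e^2$ denotes the estimate of the residual variance, which can be obtained by the iterative estimator shown in [9], where $N$ is the total number of data and $\rho(X)$ is the rank of the matrix $X$; $\hat{\sigma}_p^2$ denotes the estimate of the variance of the permanent phenotypic values (genetic variation + permanent environment variation), which can be obtained by the iterative estimator indicated in [10], where tr represents the matrix trace operation and $S$ the number of columns of matrix $Z$.
In addition, this estimator depends on $C^{22}$ [11], which is the sub-matrix of the generalized inverse of the coefficient matrix of the mixed model equations.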
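The REML iterations themselves are not reproduced here. As a simpler stand-in, the sketch below estimates the same two variance components with the analysis-of-variance (method of moments) estimator for a balanced data set, treating measurements as fixed effects and genotypes as random effects. This is a swapped-in technique, not the REML procedure used in the study; for balanced data the two approaches generally agree when the moment estimates are non-negative. The function name and the data are hypothetical.

```python
def anova_repeatability(data):
    """Moment estimates of the variance components and repeatability.

    data[i][j] is the j-th measurement on genotype i; the layout must be
    balanced (same number of measurements for every genotype).
    Returns (permanent phenotypic variance, temporary environment variance, r).
    """
    g = len(data)      # number of genotypes
    n = len(data[0])   # number of measurements per genotype
    grand = sum(sum(row) for row in data) / (g * n)
    row_means = [sum(row) / n for row in data]
    col_means = [sum(data[i][j] for i in range(g)) / g for j in range(n)]

    ss_genotypes = n * sum((m - grand) ** 2 for m in row_means)
    ss_residual = sum(
        (data[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(g)
        for j in range(n)
    )
    ms_genotypes = ss_genotypes / (g - 1)
    ms_residual = ss_residual / ((g - 1) * (n - 1))

    var_e = ms_residual
    # Negative moment estimates are truncated at zero.
    var_p = max((ms_genotypes - ms_residual) / n, 0.0)
    total = var_p + var_e
    r = var_p / total if total > 0 else float("nan")
    return var_p, var_e, r


# Hypothetical leaf widths (cm): 4 genotypes x 5 measurements, illustration only.
widths = [
    [5.1, 5.3, 5.0, 5.2, 5.4],
    [6.0, 6.2, 5.9, 6.1, 6.3],
    [4.2, 4.5, 4.1, 4.4, 4.3],
    [5.6, 5.5, 5.8, 5.7, 5.9],
]
var_p, var_e, r = anova_repeatability(widths)
print(f"var_p = {var_p:.4f}, var_e = {var_e:.4f}, r = {r:.3f}")
```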
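In the PCC method, the coefficient of repeatability is estimated by [12] (RUTLEDGE, 1974):

$$r = \frac{\hat{\lambda}_1 - 1}{\eta - 1} \qquad [12]$$

$$\hat{\mathcal{R}} = \left[\hat{r}_{jj'}\right], \quad j, j' = 1, \dots, \eta \qquad [13]$$

$$\left|\hat{\mathcal{R}} - \hat{\lambda} I\right| = 0 \qquad [14]$$

where: $\eta$ denotes the number of measurements carried out in the experiment; $\hat{\lambda}_1$ denotes the estimate of the highest eigenvalue associated with the estimate of the correlation matrix between the repeated measures ($\hat{\mathcal{R}}$) [13], and this estimate is obtained by solving the equation shown in [14].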
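The PCCV method uses the calculation in [15] to estimate the coefficient of repeatability (MORRISON, 1976):

$$r = \frac{\hat{\lambda}_1 - \hat{\sigma}_Y^2}{\hat{\sigma}_Y^2\,(\eta - 1)} \qquad [15]$$

$$\hat{\sigma}_Y^2 = \hat{\sigma}_e^2 + \hat{\sigma}_p^2 \qquad [16]$$

$$\hat{\Gamma} = \left[\hat{\sigma}_{jj'}\right], \quad j, j' = 1, \dots, \eta \qquad [17]$$

$$\left|\hat{\Gamma} - \hat{\lambda} I\right| = 0 \qquad [18]$$

where: $\eta$ denotes the number of measurements carried out in the experiment; $\hat{\sigma}_Y^2$ [16] denotes the estimator for the sum of the residual variance with the permanent environment variance; $\hat{\lambda}_1$ denotes the estimate of the highest eigenvalue associated with the estimate of the covariance matrix between repeated measures ($\hat{\Gamma}$) [17], and this estimate is obtained by solving the equation shown in [18].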
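The two principal components estimators can be sketched with the standard library alone. The sketch below builds the covariance and correlation matrices between repeated measures, obtains the highest eigenvalue by power iteration, and applies the usual textbook forms of the estimators: (λ₁ − 1)/(η − 1) for the correlation matrix and (λ₁ − σ²_Y)/[σ²_Y(η − 1)] for the covariance matrix. Taking σ²_Y as the mean of the diagonal variances is an assumption of this sketch, and all function names and data are hypothetical.

```python
import math


def covariance_matrix(data):
    """Sample covariance matrix between measurements (columns of data)."""
    g, n = len(data), len(data[0])
    means = [sum(row[j] for row in data) / g for j in range(n)]
    return [
        [sum((row[j] - means[j]) * (row[k] - means[k]) for row in data) / (g - 1)
         for k in range(n)]
        for j in range(n)
    ]


def correlation_matrix(cov):
    """Correlation matrix derived from a covariance matrix."""
    n = len(cov)
    sd = [math.sqrt(cov[j][j]) for j in range(n)]
    return [[cov[j][k] / (sd[j] * sd[k]) for k in range(n)] for j in range(n)]


def largest_eigenvalue(matrix, iterations=1000, tol=1e-12):
    """Highest eigenvalue of a symmetric positive semi-definite matrix.

    Power iteration started from the uniform vector, which suits matrices
    of positively correlated measurements.
    """
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    lam = 0.0
    for _ in range(iterations):
        w = [sum(matrix[j][k] * v[k] for k in range(n)) for j in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
        # Rayleigh quotient of the normalized vector.
        new_lam = sum(
            v[j] * sum(matrix[j][k] * v[k] for k in range(n)) for j in range(n)
        )
        if abs(new_lam - lam) < tol:
            return new_lam
        lam = new_lam
    return lam


def pcc_repeatability(data):
    """Principal components estimator based on the correlation matrix."""
    n = len(data[0])
    lam = largest_eigenvalue(correlation_matrix(covariance_matrix(data)))
    return (lam - 1.0) / (n - 1)


def pccv_repeatability(data):
    """Principal components estimator based on the covariance matrix."""
    n = len(data[0])
    cov = covariance_matrix(data)
    var_y = sum(cov[j][j] for j in range(n)) / n  # assumption: mean variance
    lam = largest_eigenvalue(cov)
    return (lam - var_y) / (var_y * (n - 1))


# Hypothetical fruit diameters (mm): 4 genotypes x 3 measurements.
diameters = [
    [61.2, 60.8, 62.0],
    [70.5, 71.1, 69.8],
    [55.3, 56.0, 54.9],
    [66.4, 65.7, 67.0],
]
print(round(pcc_repeatability(diameters), 3), round(pccv_repeatability(diameters), 3))
```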
The coefficient of repeatability is estimated by the SAC method using [19] (MANSOUR et al., 1981):

$$r = \frac{\hat{\alpha}^{\top}\hat{\mathcal{R}}\,\hat{\alpha} - 1}{\eta - 1} = \frac{2}{\eta(\eta - 1)}\sum_{j<j'}\hat{r}_{jj'} \qquad [19]$$

where: $\hat{\alpha}$ denotes the eigenvector associated with the highest eigenvalue of the estimate of the correlation matrix between repeated measures ($\hat{\mathcal{R}}$); $\eta$ denotes the number of measurements carried out in the experiment; and $\hat{r}_{jj'}$ represents the estimates of the correlations between repeated measures.
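For the structural analysis based on the correlation matrix, a minimal sketch is given below. It assumes the usual textbook simplification in which the eigenvector has equal elements (1/√η), so that the estimator reduces to the mean of the correlations between all pairs of repeated measures. The function names and the data are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def sac_repeatability(data):
    """Mean correlation over all pairs of measurements (columns of data)."""
    n = len(data[0])
    columns = [[row[j] for row in data] for j in range(n)]
    pairs = list(combinations(range(n), 2))  # eta * (eta - 1) / 2 pairs
    return sum(pearson(columns[j], columns[k]) for j, k in pairs) / len(pairs)


# Hypothetical pulp weights (g): 4 genotypes x 3 measurements.
pulp = [
    [88.0, 91.5, 86.2],
    [120.4, 118.9, 123.0],
    [75.1, 77.8, 73.5],
    [101.3, 99.0, 104.2],
]
print(round(sac_repeatability(pulp), 3))
```

The covariance-based variant follows the same pattern, with the covariances between pairs of measurements averaged and divided by the estimate of the total variance.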
The SACV method uses the calculation in [20] to estimate the coefficient of repeatability (MANSOUR et al., 1981):

$$r = \frac{\hat{\alpha}^{\top}\hat{\Gamma}\,\hat{\alpha} - \hat{\sigma}_Y^2}{\hat{\sigma}_Y^2\,(\eta - 1)} = \frac{2}{\eta(\eta - 1)\,\hat{\sigma}_Y^2}\sum_{j<j'}\hat{\sigma}_{jj'} \qquad [20]$$

where: $\hat{\alpha}$ is the eigenvector associated with the highest eigenvalue of the estimate of the covariance matrix between repeated measures ($\hat{\Gamma}$); $\hat{\sigma}_Y^2$ denotes the estimator for the sum of the residual variance with the permanent environment variance; $\eta$ denotes the number of measurements carried out in the experiment; and $\hat{\sigma}_{jj'}$ represents the covariance estimates between repeated measurements.
The estimate of the coefficient of determination ($R^2$), which represents the certainty of the prediction of the individuals' real value for the analyzed variables based on the performed measurements, was obtained by [21] (CRUZ et al., 2012):

$$R^2 = \frac{\eta\, r}{1 + r(\eta - 1)} \qquad [21]$$

where: $\eta$ denotes the number of measurements carried out in the experiment and $r$ denotes the estimate of the coefficient of repeatability.
The number of measurements required ($\eta_0$) for the prediction of the individuals' real value for the analyzed variables was obtained by equation [22] (CRUZ et al., 2012):

$$\eta_0 = \frac{R^2\,(1 - r)}{(1 - R^2)\, r} \qquad [22]$$

where: $R^2$ is the coefficient of determination and $r$ is the estimate of the coefficient of repeatability.
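The relationship between the coefficient of repeatability, the coefficient of determination, and the number of measurements required can be illustrated with a short sketch. It uses the standard expressions R² = ηr / [1 + r(η − 1)] and η₀ = R²(1 − r) / [(1 − R²) r]; the function names are hypothetical, and the value r = 0.66 is the joint-analysis estimate reported in the text for DLL and FD. Rounding η₀ up to the next whole measurement is a choice made for this sketch and may differ from the rounding used in the tables.

```python
import math


def determination(r, n_measurements):
    """Coefficient of determination obtained with n_measurements."""
    return n_measurements * r / (1.0 + r * (n_measurements - 1))


def measurements_required(r, target_r2):
    """Number of measurements needed to reach a target determination."""
    return target_r2 * (1.0 - r) / ((1.0 - target_r2) * r)


r = 0.66  # joint-analysis estimate reported for DLL and FD
r2_five = determination(r, 5)  # five leaves or fruits per genotype
print(f"R2 with five measurements: {r2_five:.3f}")

for target in (0.80, 0.85, 0.90):
    eta0 = measurements_required(r, target)
    # Round up, since a fraction of a measurement cannot be taken.
    print(f"target {target:.2f}: eta0 = {eta0:.2f} -> {math.ceil(eta0)}")
```

The two functions are inverses of each other: applying `determination` to the output of `measurements_required` returns the target value, which is a convenient check on any implementation.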
All analyses present in this study were performed using the R program (R, 2013).

RESULTS AND DISCUSSION
The difference between the deviances of the reduced and complete models, known as the likelihood ratio test (LRT), was significant for the characteristics YLW, DLL, DLW, FW, PW, FL, FD, and FCD at the 5% probability level (χ² test) in the joint and individual analyses of the genotypes from Espírito Santo (ES), São Paulo (SP), and Minas Gerais (MG) (Table 1).
The existence of a significant difference demonstrates the presence of genetic variability in the studied populations of P. guajava (Table 1). According to Resende (2002a), the deviance analysis (ANADEV) procedure, using the χ² statistic to test the LRT, is scientifically recommended for the random effects of the model, as it can prove the existence of variability in these effects.
The YLL variable was discarded from the joint and individual repeatability analyses of the genotypes from Espírito Santo, São Paulo, and Minas Gerais as it did not present significant difference by the analysis of deviance (Table 1).
The two developed leaf variables (DLL and DLW) and the five fruit variables (FW, PW, FL, FD, and FCD) presented a permanent phenotypic variance ($\hat{\sigma}_p^2$) higher than the temporary environment variance ($\hat{\sigma}_e^2$), unlike the variable YLW, in the joint and individual analyses (Table 1).
These results are important in breeding programs as the lessened influence of the temporary environment on the expression of these characteristics can provide genetic gains.
The methods used for estimating the coefficient of repeatability ($r$) in the joint repeatability analysis showed close values for DLL, DLW, FL, and FD (0.66, 0.60, 0.62, and 0.66, respectively) (Table 2). This measure refers to the proportion of the total variance that is explained by the variations provided by the genotype relative to the environment (CRUZ; REGAZZI, 2012) and also refers to the upper limit of the coefficient of heritability (FALCONER, 1987).
In the joint and individual repeatability analyses of the genotypes from Espírito Santo, São Paulo, and Minas Gerais, the REML and SACV methods gave similar estimates for the same characteristic, as did the PCC and SAC methods (Tables 2, 3, 4, and 5). This suggests that, since the two methods in each pair give equivalent results, either REML or SACV may be chosen to study the remaining characteristics. The same applies to the PCC and SAC methods.
The YLW and FD characteristics resulted in $r$ lower than 0.5 and higher than 0.6, respectively, in all methods applied for the joint and individual analyses of the genotypes from Espírito Santo, São Paulo, and Minas Gerais. When high repeatability estimates are obtained for a certain characteristic, it is feasible to predict the real value of the individual by using a relatively small number of measurements; when the repeatability is low, the inverse occurs (CARGNELUTTI; CASTILHOS, 2004). According to Resende et al. (2002b), coefficients of repeatability above 0.6 are considered high.
In general, the evaluated variables presented an $R^2$ above 80% and an accuracy above 0.9, except for YLW. For the FW variable in Minas Gerais, the REML and SACV methods resulted in high accuracy (Tables 2, 3, 4, and 5). According to Resende et al. (2008), an accuracy in the range 0.9 ≤ accuracy ≤ 1 is considered very high, and an accuracy in the range 0.7 ≤ accuracy < 0.9 is considered high.
In the joint analysis, seven characteristics showed the same number of required measurements ($\eta_0$) across the applied methods (REML, PCC, PCCV, SAC, and SACV) at reliabilities of 80%, 85%, and 90%; the exceptions were FL at 85% and FL and FCD at 90% (Table 2). Corroborating these results, Manfio et al. (2011) observed the same synchrony between the different methods tested when working with macaúba fruits (Acrocomia aculeata).
According to the individual analysis for the state of Espírito Santo, seven characteristics showed the same $\eta_0$ across the applied methods (REML, PCC, PCCV, SAC, and SACV) at reliabilities of 80%, 85%, and 90%, except for FCD at 85% and PW and FL at 90% (Table 3). In São Paulo, at 80%, 85%, and 90%, three characteristics (DLL, DLW, and DD) presented constant values between the adopted methods (Table 4).
AS: all the states together; SV: sources of variation; PFE: permanent phenotypic effect (genetic causes + permanent environment); CM: complete model; LRT: likelihood ratio test; $\hat{\sigma}_p^2$: permanent phenotypic variance; $\hat{\sigma}_e^2$: temporary environment variance; +: deviance of the adjusted model without the referred effect; ++: deviance of the completely adjusted model. **, *: significant by the χ² test, with 1 degree of freedom, at the 1% and 5% probability levels, respectively.
Table 2. Estimates of the coefficient of repeatability ($r$), coefficient of determination ($R^2$), accuracy, and number of necessary measurements ($\eta_0$) for the characteristics: young leaf width (YLW), developed leaf length and width (DLL and DLW), fruit and pulp weight (FW and PW), fruit length (FL), and fruit and fruit cavity diameter (FD and FCD). Joint analysis of 79 P. guajava accessions.
According to Manfio et al. (2011), the synchrony between the different tested methodologies provides a greater reliability to the results; according to Ferreira et al. (2005), this synchrony also refers to a good genetic control of these characteristics.
In general, for a reliability of 80%, it would not be necessary to evaluate all five replicates for four characteristics (DLW, PW, FD, and FCD); for DLW, PW, and FCD, two to four repetitions would suffice. In fact, for the characteristic PW, two measurements would be enough (Tables 2, 3, 4, and 5). According to Chia et al. (2009), within the precision levels acceptable to the researcher, a reduction in the evaluation period and in the number of measurements must be sought to save resources and time.
Thus, the REML and SACV methods presented the same coefficient of repeatability in the joint and individual analyses of the genotypes from Espírito Santo, São Paulo, and Minas Gerais for all variables. The YLW characteristic presented a low coefficient of repeatability. The individual analyses of genotypes from São Paulo and Minas Gerais presented a higher effect of the temporary environment. The FCD characteristic presented a very high coefficient of repeatability, coefficient of determination, and accuracy in the joint and individual analyses of genotypes from Espírito Santo, São Paulo, and Minas Gerais. In conclusion, the number of measurements that optimizes time and labor while ensuring the minimum reliability required for the estimation (80%) is four, and the characteristics to be measured would be DLW, PW, FD, and FCD.
Council of Scientific and Technological Development (CNPq) for the financial support.