Alternative approach for precise and accurate Student's t critical values and application in geosciences

We applied an alternative Monte Carlo simulation approach to obtain precise and accurate critical values (with 3 to 8 decimal places) for the Student's t test for normal samples with degrees of freedom up to 6000, for two-sided confidence levels of 50% to 99.9% and, correspondingly, for one-sided confidence levels of 75% to 99.95%. As an innovation relative to the existing literature, the precision estimate of each critical value is also reported individually. Prior to our work, critical values with only 2 or 3 decimal places were available in tables in the published literature. Twenty-eight regression models were evaluated to obtain the best regression fit of the tabulated data. All conventional polynomial regressions (from quadratic up to 8th order) failed for this purpose. New equations based on double or triple natural logarithm-transformations are proposed for all non-tabulated degrees of freedom, including fractional values if required, as well as for probability computations. More importantly, we suggest that these kinds of log-transformations are likely to be useful in all cases where conventional polynomial regressions fail to perform satisfactorily. To demonstrate the utility of these new critical values and best-fit equations, we provide specific examples of applications in geochemistry, chemistry and medicine. The probability estimates obtained from the present explicit approach are consistent with those from commercial and freely available software. Additionally, a more extensive application of our t values is given for processing inter-laboratory data for the geochemical reference materials granites G-1 and G-2 from the U.S.A., as well as for evaluating geochemical data for basic rocks from the Canary and Azores Islands.


Introduction
The Student's t significance test is among the most widely used statistical methods for comparing the means of two statistical samples (e.g., univariate data arrays in geosciences), as documented in several books (e.g., on geography by Ebdon, 1988; petroleum research and geosciences by Jensen et al., 1997; analytical chemistry or chemometrics by Otto, 1999 and Miller and Miller, 2005; biology and geology by Blaesild and Granfeldt, 2003; physical sciences by Bevington and Robinson, 2003; geochemistry or geochemometrics by Verma, 2005; criminology and justice by Walker and Maddan, 2005; all sciences by Kenji, 2006).
For application of the t test, critical values or percentage points are required for the pertinent degrees of freedom and the chosen confidence or significance level, generally 95% or 0.05 (Miller and Miller, 2005) or 99% or 0.01 (Verma, 1997, 2005, 2009). Because statistical tests, including the Student's t test, are applied at a pre-established confidence level, the accuracy and precision of critical values are of utmost importance, especially when the test probability estimates for two samples are close to the chosen level for the "decision" of hypotheses, as illustrated in the present work.
Similarly, the significance of the linear correlation of two variables or vectors can be tested by transforming Pearson's linear correlation coefficient r to a Student's t value using the equation

$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}}$$

where n is the number of sample pairs being regressed (Fisher, 1970; Miller and Miller, 2005). The absolute value of t calculated from the above equation is compared with the corresponding critical value of t (for (n − 2) degrees of freedom) at a chosen confidence level from the same Student's t tables. Although the parameter r can be used directly for testing the statistical significance of linear correlations, the limited nature of the tabulated critical values of r (e.g., Bevington and Robinson, 2003; Verma, 2005), for example their scarceness for n > 100, makes the application of the t test more appropriate and versatile.
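The transformation of r to t described above, t = r·sqrt(n − 2)/sqrt(1 − r²), can be sketched in a few lines of Python. This is only an illustration; the function name `pearson_r_to_t` and the numerical inputs are assumptions made for this example and do not come from the original work.

```python
import math


def pearson_r_to_t(r: float, n: int) -> float:
    """Return |t| (with n - 2 degrees of freedom) for Pearson's r from n pairs."""
    if n < 3:
        raise ValueError("at least 3 sample pairs are needed")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    # t = r * sqrt(n - 2) / sqrt(1 - r^2); the absolute value is what gets
    # compared with the tabulated critical value.
    return abs(r) * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Hypothetical input: r = 0.5 obtained from 27 sample pairs (25 degrees of freedom).
t_value = pearson_r_to_t(0.5, 27)
print(f"|t| = {t_value:.4f} for 25 degrees of freedom")
```

The resulting |t| would then be compared with the critical value for (n − 2) degrees of freedom at the chosen confidence level.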
With the availability of modern analytical techniques, it is now possible to generate analytical data with greater precision than was possible in the past. The freely available software R (R Development Core Team, 2009) can be used to generate precise critical values to more decimal places than the two or three currently available in the tables of standard textbooks. Nevertheless, all currently available t values, including those in the software R, have traditionally been calculated from the consideration of Student's t distribution. According to sampling theory, Student's t value represents the critical difference at a given confidence level between two small or finite samples drawn from a normal (Gaussian) distribution.
In the present paper, we used this alternative Monte Carlo type approach to simulate new precise and accurate critical values for the t test. Because such values could not be generated in a reasonable time for all sample sizes, we resorted to best-fit polynomial equations based on double and triple natural logarithm-transformations for interpolation or extrapolation of critical values as well as for probability estimates. Our results are fully consistent with the traditional approach, but our approach is more explicit, especially for probability calculations. We also discuss the application of Student's t test to three case studies that highlight the importance of precise and accurate t values. Additional examples provide a detailed account of arriving at central tendency and dispersion parameters for chemical variables in the geochemical reference materials granites G-2 and G-1 from the U.S.A., as well as a comparison of geochemical compositions of basic rocks from the Canary and Azores Islands.

Monte Carlo simulation of Student's t critical values
This procedure has recently been used for generating precise and accurate critical values of 33 discordancy test variants (Verma and Quiroz-Ruiz, 2006a, 2006b, 2008, 2011; Verma et al., 2008), which have been useful for overall efficiency evaluation of these tests (González-Ramírez et al., 2009; Verma et al., 2009) as well as for application of discordancy tests to experimental data. Other applications of our Monte Carlo procedure deal with the evaluation of nuclear reactor performance (Espinosa-Paredes et al., 2010) and with the evaluation of error propagation in ternary diagrams (Verma, 2012).
Our modified Monte Carlo type simulation procedure can be summarised in the following six steps:
1) Generating random numbers uniformly distributed in the space (0, 1), i.e., samples from a uniform U(0, 1) distribution: The Mersenne Twister algorithm of Matsumoto and Nishimura (1998) was employed, because this widely used generator has a very long period ($2^{19937} - 1$), which is a highly desirable property for such applications (Law and Kelton, 2000). Thus, a total of 20 different and independent streams were generated, each consisting of at least 100,000,000 random numbers (IID U(0, 1)). In this way, more than 2,000,000,000 random numbers of 64 bits were generated.
2) Testing whether the random numbers resemble independent and identically distributed (IID) U(0, 1) random variates: Each stream was tested for randomness using the two- and three-dimensional plot method (Marsaglia, 1968; Law and Kelton, 2000). The simulated data uniformly filled the (0, 1) space, as required by this randomness test, in both two and three dimensions. Another test for randomness was also applied, which checks how many individual numbers are actually repeated in a given stream of random numbers; if such repeat-numbers are few, the simulated random numbers can be safely used for further applications. On average, only around one number out of 100,000 in individual streams of IID U(0, 1) was repeated. Between two streams, the repeat-numbers were, on average, around three in 200,000 combined numbers, amounting to about 150 in the combined total of 10,000,000 numbers for two streams. Thus, because the repeat-numbers were so few, all 20 streams were considered appropriate for further work.
3) Converting the random numbers to continuous random variates for a normal distribution N(0, 1): The polar method (Marsaglia and Bray, 1964) was employed instead of the somewhat slower trigonometric method (Box and Muller, 1958). No faster scheme, such as the algorithm proposed by Kinderman and Ramage (1976), was explored, because the polar method was fast enough for our purpose; furthermore, this method uses two independent streams of random numbers for generating one stream of normal random variates, which we considered an asset for our work. Two parallel streams of random numbers (R₁ and R₂) were used for generating one set or stream of IID N(0, 1) normal random variates. Thus, by dividing the 20 different streams of IID U(0, 1) into two sets of 10 streams, 100 sets or streams of N(0, 1) were obtained, each of size ~100,000,000 or more. These N(0, 1) streams were used for the simulation of critical values. The simulated data were graphically examined for normality (Verma, 2005). Practically no repeat-numbers were found in tests with 100,000 numbers in these sets of random normal variates. Therefore, the data were considered of high quality to represent a normal distribution, and could be safely used for further applications.
4) Establishing the best simulation sizes: In order to determine the best simulation sizes, the mean critical values for 60% to 99.9% (two-sided confidence levels) and their respective standard error estimates were simulated for degrees of freedom (ν) of 1, 2, 5, 10, 20, and 30, for 13 different simulation sizes between 10,000 and 100,000,000, using only 10 independent streams of IID N(0, 1) normal random variates. Representative results for ν = 20 are summarised in Figure 1, in which the mean critical values are shown by open circles and the standard errors by vertical error bars. Figure 1 shows that the critical values tend to stabilise and the standard errors decrease sharply as the simulation size increases from 10,000 to 100,000,000. Therefore, for all final reports the simulation sizes were set at 100,000,000 for all degrees of freedom.

Figure 1. Evaluation of the effects of simulation sizes from 10,000 to 100,000,000 on the simulated critical values (mean and its standard error) of Student's t for different two-sided confidence levels (CL) and for degrees of freedom (ν) of 20. a) CL of 60%; b) CL of 80%; c) CL of 90%; d) CL of 95%; e) CL of 98%; f) CL of 99%; g) CL of 99.8%; and h) CL of 99.9%.

5) Computing the t statistic for two samples: The degrees of freedom are ν = n_x + n_y − 2, where n_x and n_y are, respectively, the sample sizes for the two statistical samples under consideration. The test statistic is given by the following equation (Verma, 2005):

$$t = \frac{\left|\bar{x}-\bar{y}\right|}{s\sqrt{\dfrac{1}{n_x}+\dfrac{1}{n_y}}}$$

where $\left|\bar{x}-\bar{y}\right|$ is the absolute difference between the two mean values, and s is the combined standard deviation of the two samples or data arrays. The parameter s was calculated as follows:

$$s = \sqrt{\frac{\sum_{i=1}^{n_x}(x_i-\bar{x})^{2}+\sum_{j=1}^{n_y}(y_j-\bar{y})^{2}}{n_x+n_y-2}}$$

As an example, for ν = 10, n_x can vary from 1 to 6, with the corresponding n_y varying from 11 to 6, thus obtaining 6 combinations which, when multiplied by the total number (100) of N(0, 1) streams, can provide 600 possible results of this t-statistic (ν = 10). For smaller values of ν, there will be fewer such combinations, and vice versa. As another example, for ν = 20, n_x can vary from 1 to 11, with the corresponding n_y varying from 21 to 11, thus obtaining 11 combinations which, when multiplied by the total number (100) of N(0, 1) streams used in the present simulations, can provide 1100 possible results of this t-statistic (ν = 20). Each set of calculations was carried out 100,000,000 times (as determined in this study; Figure 1).
6) Inferring critical values and evaluating their reliability: Critical values (percentage points) were computed for each of the possible sets of 100,000,000 simulated test statistic values for sample sizes of 1(1)30(5)100(10)200(50)400(100)1000(200)2000(1000)6000. For example, for ν = 10, 600 such sets were used. Each set of 100,000,000 t-statistic results was arranged from low to high values, and critical values or percentage points were extracted for a total of 11 confidence levels (both two-sided and one-sided) from 50% to 99.9%. These were: two-sided confidence levels of 50%, 60%, 70%, 80%, 90%, 95%, 98%, 99%, 99.5%, 99.8%, and 99.9%, i.e., significance levels α = 0.50, 0.40, 0.30, 0.20, 0.10, 0.05, 0.02, 0.01, 0.005, 0.002, and 0.001, and, correspondingly, one-sided confidence levels of 75% to 99.95% with significance levels α of 0.25 to 0.0005. The final overall mean (central tendency) as well as the standard deviation and standard error of the mean (dispersion) parameters for Student's t were computed from these sets of values. The standard error and the corresponding mean values were rounded following the flexible rules put forth by Bevington and Robinson (2003) and Verma (2005).
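The six steps above can be sketched, at a drastically reduced scale, with the Python standard library, whose `random.Random` class is a Mersenne Twister generator. This is an illustrative sketch and not the authors' program: the function names, the seeds, the pooled (sum-of-squares) form of the combined standard deviation, and the simulation size of 40,000 (instead of 100,000,000) are all assumptions made here for illustration.

```python
import math
import random


def polar_normal_pair(rng1, rng2):
    """Polar method (Marsaglia and Bray, 1964): two N(0, 1) variates drawn
    from two independent U(0, 1) streams."""
    while True:
        v1 = 2.0 * rng1.random() - 1.0
        v2 = 2.0 * rng2.random() - 1.0
        w = v1 * v1 + v2 * v2
        if 0.0 < w < 1.0:  # accept only points inside the unit circle
            factor = math.sqrt(-2.0 * math.log(w) / w)
            return v1 * factor, v2 * factor


def normal_sample(size, rng1, rng2):
    """Return `size` IID N(0, 1) variates."""
    values = []
    while len(values) < size:
        values.extend(polar_normal_pair(rng1, rng2))
    return values[:size]


def pooled_t(x, y):
    """|mean(x) - mean(y)| scaled by the combined standard deviation."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    ss = sum((v - mx) ** 2 for v in x) + sum((v - my) ** 2 for v in y)
    s = math.sqrt(ss / (nx + ny - 2))
    return abs(mx - my) / (s * math.sqrt(1.0 / nx + 1.0 / ny))


def simulate_critical_value(nx, ny, two_sided_cl, replicates, seed1=1, seed2=2):
    """Empirical percentage point of the simulated t statistic."""
    rng1, rng2 = random.Random(seed1), random.Random(seed2)  # Mersenne Twister
    stats = sorted(
        pooled_t(normal_sample(nx, rng1, rng2), normal_sample(ny, rng1, rng2))
        for _ in range(replicates)
    )
    index = min(replicates - 1, int(two_sided_cl * replicates))
    return stats[index]


# One (n_x, n_y) combination for 10 degrees of freedom, 95% two-sided level.
cv = simulate_critical_value(6, 6, 0.95, 40_000)
print(f"simulated critical value: {cv:.3f} (textbook tables give 2.228)")
```

At this small simulation size the result is expected to agree with the tabulated value only to about one or two decimal places, which illustrates why much larger simulation sizes and many independent streams are needed for higher precision.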

Polynomial fits for Student's t critical values
When a tabulated critical value is missing for a given ν and α, as was the case in the present work, interpolation or extrapolation of the available critical values is required. There is no clear indication of how this was done in the past, except that, in an attempt to generate the best interpolated critical values, natural logarithm-transformation of ν was proposed (Verma, 2009) as a means for obtaining highly precise interpolations using Statistica© software. We used the highly precise critical values generated in this work to test 28 different regression models for obtaining both the best-fit interpolation and extrapolation equations. These models ranged from simple polynomial regressions (quadratic to 8th order, i.e., without natural logarithm-transformation of ν) to quadratic to 8th order regressions on the single (ln(ν)), double (ln(ln(ν))) and triple (ln(ln(ln(ν)))) natural logarithm-transformed ν. The best-fit equations were obtained from the combined criteria of four different fitting quality parameters: (i) the multiple-correlation coefficient (R²; Fig. 2); (ii) the averaged sum of the squared residuals, SSR/N, where N is the total number of ν values used in the regression model (Fig. 3); (iii) the averaged sum of the squared residuals of interpolation, (SSR/N)_int, where N is the total number of ν values not used in obtaining the regression model that lie within the range of all ν values of the regression model (Fig. 4); and (iv) the averaged sum of the squared residuals of extrapolation, (SSR/N)_ext, where N is the total number of ν values that lie outside the range of all ν values of the regression model (Fig. 5).
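The effect of the logarithm-transformation on the quality of a polynomial fit can be illustrated with a minimal sketch. The assumptions are stated plainly: the eight critical values below are the standard 3-decimal textbook values for the two-sided 95% confidence level (not the simulated values of this work), only a quadratic is fitted, and the helper names `polyfit` and `mean_squared_residual` are introduced here for illustration.

```python
import math


def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients (lowest order first), obtained by
    solving the normal equations with Gaussian elimination."""
    m = degree + 1
    # Augmented matrix [A^T A | A^T y] for the Vandermonde matrix A.
    a = [
        [sum(x ** (i + j) for x in xs) for j in range(m)]
        + [sum(y * x ** i for x, y in zip(xs, ys))]
        for i in range(m)
    ]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(col + 1, m):
            f = a[row][col] / a[col][col]
            for k in range(col, m + 1):
                a[row][k] -= f * a[col][k]
    coef = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(a[i][j] * coef[j] for j in range(i + 1, m))
        coef[i] = (a[i][m] - tail) / a[i][i]
    return coef


def mean_squared_residual(xs, ys, coef):
    """Averaged sum of squared residuals (SSR/N) of a fitted polynomial."""
    fitted = [sum(c * x ** k for k, c in enumerate(coef)) for x in xs]
    return sum((y - f) ** 2 for y, f in zip(ys, fitted)) / len(xs)


# Standard textbook critical values, two-sided 95% confidence level.
NU = [3, 4, 5, 10, 20, 30, 60, 120]
T95 = [3.182, 2.776, 2.571, 2.228, 2.086, 2.042, 2.000, 1.980]

plain_x = [float(v) for v in NU]                 # no transformation
ll_x = [math.log(math.log(v)) for v in NU]       # double natural logarithm

msr_plain = mean_squared_residual(plain_x, T95, polyfit(plain_x, T95, 2))
msr_ll = mean_squared_residual(ll_x, T95, polyfit(ll_x, T95, 2))
print(f"SSR/N, quadratic in nu:         {msr_plain:.6f}")
print(f"SSR/N, quadratic in ln(ln(nu)): {msr_ll:.6f}")
```

Note that ln(ln(ln(ν))) is defined only for ν > e, so a triple-logarithm fit of this kind can start no lower than ν = 3.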

Polynomial fits for Student's t probability calculations
The simulated critical values for Student's t were used to propose an explicit method, based on best-fit equations, for computing the probability or confidence level of Student's t test corresponding to any two statistical samples. Such probabilities can be calculated through commercial or freely available software packages (e.g., see Miller and Miller, 2005; Efstathiou, 2006), but the exact procedure is largely unknown. We present new "best-fit" equations based on these highly precise and accurate critical values for computing probabilities or confidence levels from them. Thus, the performance of these equations can be compared with that of the existing commercial software packages. The advantage is that our method is explicit and calculates the confidence level (%) that corresponds to the calculated t value for any two statistical samples.

Table 1. Abridged form of the table of simulated critical values for Student's t test. The abbreviations are as follows: cv ts t 50, critical value of t for the two-sided (ts) 50% confidence level; cv os t 75, critical value of t for the one-sided (os) 75% confidence level; similar symbols are used for the other columns. The more frequently used confidence levels are marked in boldface.

Table 2. Standard error values for the simulated critical values for Student's t test. The abbreviations are as follows: cv ts t 50, critical value of t for the two-sided (ts) 50% confidence level; cv os t 75, critical value of t for the one-sided (os) 75% confidence level; similar symbols are used for the other columns. The more frequently used confidence levels are marked in boldface.
Table 3. Evaluation of best-fit critical value equations obtained from 65 critical values of the Student's t distribution for degrees of freedom (ν) from 3 to 2000 and the 99% confidence level. R²: multiple-correlation coefficient; (*): next to the best fit (i.e., second best fit) according to this particular criterion; [**]: third best fit according to this particular criterion; {***}: fourth best fit according to this particular criterion. For an explanation of the ll and lll functions, see Table 4.

Critical Values
The results of our simulated critical values are presented in abridged form in Table 1. The respective standard errors are summarised in Table 2. All 76 critical values and their standard errors (Tables ES1 and ES2, respectively) are available on request from any of the authors (spv@ier.unam.mx or recrh@live.com). The actually simulated 67 critical values include the following values of ν that were used for evaluating the different regression models: 1(1)30(5)100(10)160(20)200(50)400(100)1000(200)2000. In this nomenclature, the numbers in parentheses refer to the step-size or increment, and the numbers before and after these parentheses are the initial and final ν, respectively, for each increment. The trends of these new critical values are shown graphically in Figure 6, which highlights their non-linear nature in this bivariate plot. Similarly, new critical values were also simulated for arbitrarily chosen ν = 105, 220, 380, 860, and 1100, used for testing the proposed equations for interpolation (see one such equation as inset in Figure 6), as well as for ν = 3000, 4000, 5000, and 6000, used for testing the extrapolation equations. Thus, for each of the 11 confidence levels, a total of 76 critical values along with their respective standard errors were simulated (Tables ES1 and ES2).
Our critical values for Student's t are consistent with those estimated by software R following standard methods (Fig. 7). Whereas the use of software R requires some programming work, our values are readily available in tabulated form as well as in electronic files. The confidence levels of 95% and 99%, most used in science and engineering applications, as well as the extreme confidence levels of 50% and 99.9%, are shown in Figure 7. For 50%, 95% and 99%, the critical values of the present work agree with those calculated by software R within about 0.002%. For the extremely high confidence level of 99.9%, these differences reach higher values but are mostly within 0.005%. Our new critical values (Table 1 or ES1) are individually characterised by their standard error estimates (Table 2 or ES2). Overall, we can conclude that the alternative approach of Monte Carlo simulation gives t critical values consistent with those obtained from the Student's t distribution (software R).
In other words, we have empirically confirmed through high precision Monte Carlo simulation that small-size sampling from a normal distribution is represented by the Student's t distribution.

Critical value equations
The results of the 28 regression models are summarised in Figures 2-5. The four best models from each of the four criteria (Figs. 2-5) are presented in Table 3 for the 99% confidence level (both two-sided and one-sided), whereas complete information for all confidence levels is given in Table ES3 (available from any of the authors), in which the best interpolation and extrapolation models are highlighted. None of the simpler polynomial regressions (a total of 7 models) was satisfactory (see models identified as q to p8 in Figures 2-5). The new methodology of natural logarithm- (ln-) transformation of ν provided better models (see the remaining 21 models in Figures 2-5), although the single logarithm-transformation (ln(ν)) was not satisfactory (see models lq to l8). Generally, the double and triple logarithm-transformations with the 4th to 8th order polynomials were the best models (see ll4 to ll8 and lll4 to lll8 in Table 3 or ES3 and Figures 2-5).

Fig. 5. Averaged sum of the squared residuals of extrapolation ((SSR/N)_ext, where N = 4, the four degrees of freedom being 3000, 4000, 5000, and 6000 listed in the second part of Table ES1; the t values for these degrees of freedom were not used to obtain the regression model) for the fit of 65 critical values of Student's t for different two-sided confidence levels (CL) and for degrees of freedom (ν) from 3 to 2000; the regression models are the same as in Figure 2. See Tables ES3 and ES4 for more details. a) CL of 50%; b) CL of 60%; c) CL of 70%; d) CL of 80%; e) CL of 90%; f) CL of 95%; g) CL of 98%; h) CL of 99%; i) CL of 99.5%; j) CL of 99.8%; and k) CL of 99.9%.
The best equations are presented for the 99% confidence level in Table 4 (complete information for all confidence levels is provided in Table ES4, available from any of the authors). As expected, the interpolation equations provide better estimates (lower errors) than the extrapolation equations. Nevertheless, both sets of equations can be used for computing the critical values for all those ν not included in Table ES1.
Recently, Verma (2012) has shown graphically how the log-transformation of the x-axis (degrees of freedom) provides "smoothing" of these curves, thus enabling a better fit to the data in the log-transformed space. Double or triple log-transformation can make these curves smoother than the simple log-transformation. Such transformations were successfully used by Verma and Quiroz-Ruiz (2008) for polynomial fits of critical values of discordancy tests. We suggest that log-transformations provide an efficient means for obtaining "best-fit" equations in other applications in which polynomial fits without transformation fail to perform satisfactorily. We emphasise that this should be an important application of our procedure in many scientific and engineering fields.
The version of Student's t test for unequal variances (Ebdon, 1988; Jensen et al., 1997; Otto, 1999; Miller and Miller, 2005; Verma, 2005; Kenji, 2006) would also be applied more objectively if precise critical values could be estimated for non-integer ν. For such applications, the calculated ν is nearly always a non-integer number, and most textbooks (e.g., Ebdon, 1988; Verma, 2005) suggest truncating ν to an integer. We propose that it would be better to maintain the non-integer ν and estimate the corresponding critical value. This is certainly possible with the critical value equations (Table ES4). The freely available software R also does this job, but it needs a certain amount of programming work and the procedure is not as explicit as in this work.
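To illustrate why the unequal-variance test usually yields non-integer degrees of freedom, the following sketch uses the widely quoted Welch-Satterthwaite expression. This specific formula is an assumption here, since the text does not state which expression is meant, and the textbooks cited may use a slightly different variant; the function name and the input values are hypothetical.

```python
def welch_degrees_of_freedom(var_x, n_x, var_y, n_y):
    """Welch-Satterthwaite degrees of freedom for two samples with sample
    variances var_x, var_y and sizes n_x, n_y (assumed formula)."""
    a = var_x / n_x
    b = var_y / n_y
    return (a + b) ** 2 / (a * a / (n_x - 1) + b * b / (n_y - 1))


# Hypothetical samples: variances 4.0 and 1.0, ten measurements each.
nu = welch_degrees_of_freedom(4.0, 10, 1.0, 10)
print(f"non-integer degrees of freedom: {nu:.4f}")
print(f"truncated value used with conventional tables: {int(nu)}")
```

Keeping the fractional value instead of the truncated one requires a critical value for non-integer degrees of freedom, which is what an interpolation equation can supply.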

Student's t confidence level calculations for two statistical samples
As an innovation, we report an explicit method to estimate the "critical" confidence level of Student's t test corresponding to any two statistical samples. Although such probability estimates can also be obtained from conventional software, the method of these calculations is not stated. Using log-transformation of critical values (Table ES1), we fitted "best" equations to our Student's t critical values; these equations for a few degrees of freedom are summarised in Table 5 (equations for all simulated degrees of freedom are listed in Table ES5, available from any of the authors). Confidence levels of Student's t test can be easily calculated by substituting the calculated t value for t_calc in the appropriate equation proposed for the given degrees of freedom of the two statistical samples.
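For an independent check of such confidence level estimates, the traditional distribution-based route can be sketched as follows. This is not the best-fit-equation method of this work (the equations of Tables 5 and ES5 are not reproduced here); it integrates the Student's t probability density numerically with Simpson's rule, and the function names and the number of integration intervals are choices made for this illustration.

```python
import math


def t_pdf(x, nu):
    """Probability density of Student's t with nu degrees of freedom
    (nu may be non-integer)."""
    log_norm = (
        math.lgamma((nu + 1.0) / 2.0)
        - math.lgamma(nu / 2.0)
        - 0.5 * math.log(nu * math.pi)
    )
    return math.exp(log_norm - (nu + 1.0) / 2.0 * math.log1p(x * x / nu))


def two_sided_confidence_level(t_calc, nu, intervals=2000):
    """Percentage of the t distribution lying within +/- |t_calc|,
    by Simpson's rule (`intervals` must be even)."""
    t = abs(t_calc)
    if t == 0.0:
        return 0.0
    h = t / intervals
    total = t_pdf(0.0, nu) + t_pdf(t, nu)
    for i in range(1, intervals):
        total += (4.0 if i % 2 else 2.0) * t_pdf(i * h, nu)
    # Integral over (0, t) is total * h / 3; the density is symmetric.
    return 100.0 * 2.0 * total * h / 3.0


# Textbook critical values should map back to their confidence levels.
print(f"{two_sided_confidence_level(2.228, 10):.3f}%")  # tabulated for 95%
print(f"{two_sided_confidence_level(2.845, 20):.3f}%")  # tabulated for 99%
```

Because the textbook critical values are rounded to three decimals, the recovered confidence levels are expected to match 95% and 99% only approximately.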
The applications to geochemical reference materials granites G-1 and G-2 from U.S.A. and basic rocks from Canary and Azores Islands presented in the next section will further clarify our proposed method.

Applications
If the calculated t value (t_calc) for two statistical samples differs widely from the critical value at a given confidence or significance level and the required degrees of freedom, the statistical interpretation of Student's t test will not depend on the critical value table used. If, however, t_calc is close to the tabulated critical value, the precision and accuracy of the critical value will largely determine the final interpretation and the decision between the null and alternate hypotheses (H₀ and H₁, respectively), i.e., either H₀ will be accepted and correspondingly H₁ will be rejected, or H₁ will be accepted and correspondingly H₀ will be rejected. We illustrate the importance of our precise critical values through a series of carefully chosen examples. For testing these hypotheses (H₀ and H₁), the strict 99% confidence level (two-sided) will be used. Other confidence levels, such as 95%, could likewise be used for such applications, if desired.
The above interpretation cannot be directly compared with any commercial or freely available software, because the latter only provides the probability estimate (or significance level) corresponding to the t_calc of the two statistical samples under evaluation. Therefore, we also computed such probability estimates from the best-fitted equations (Table 5 or ES5) and compared them with independent estimates obtained from different software: Statistica©, R, and Excel.
Further, for the applications presented here, only crude chemical compositions are evaluated.Nevertheless, the applications can be easily extended to log-transformed data (Aitchison, 1986;Egozcue et al., 2003;Agrawal and Verma, 2007) in order to comply with the coherent statistical treatment of compositional data.

Geochemistry
The Sr-isotopic composition (⁸⁷Sr/⁸⁶Sr) of rocks provides constraints on geological processes (Faure, 1986, 2001). Therefore, data quality plays an essential role in quantifying the relative importance of these processes. In this context, an example of ⁸⁷Sr/⁸⁶Sr measured in the geochemical reference material JA-1 (andesite from Hakone volcano) from the Geological Survey of Japan (GSJ; Internet address http://riodb02.ibase.aist.go.jp/geostand/) will be used. Let us assume that two independent trials or experiments involving two laboratories (LabA and LabB) were carried out. We further assume that each of these laboratories obtained 11 different measurements on this sample in each trial. For this purpose, we simulated, using our Monte Carlo procedure, fairly realistic data (Table 6) on this sample in the light of the actual measurements compiled by the GSJ. We would like to evaluate for each trial the null hypothesis (H₀) that the two statistical samples of ⁸⁷Sr/⁸⁶Sr (measurements from LabA and LabB) were drawn from the same population, i.e., there is no significant difference between them, and the alternate hypothesis (H₁) that the two statistical samples of ⁸⁷Sr/⁸⁶Sr were not drawn from the same population, i.e., there is a significant difference between them.
The partial calculations as well as the calculated t and several tabulated critical values are summarised in Table 6. Because, in both trials, the calculated t values are less than the tabulated critical value of Miller and Miller (2005), the null hypothesis H₀ will be accepted and, as a consequence, the alternate hypothesis H₁ will be rejected, i.e., there is no significant difference between these sets of data from the two laboratories. Note that the t test is applied at the strict 99% confidence level. If the critical values tabulated by Verma (2005) were used, the interpretation would be just the opposite, i.e., for both trials, there is a significant difference between these sets of data from the two laboratories. However, if the statistical inference were drawn from the new precise critical values for Student's t test, for Trial 1 H₀ would be accepted and H₁ rejected, whereas for Trial 2 H₀ would be rejected and H₁ accepted (Table 6). We therefore conclude that it is safer to use more precise and accurate critical values to draw statistical inferences.
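The sensitivity of the decision to the number of decimal places of the critical value can be shown with a minimal sketch. The calculated t value below is hypothetical and is not taken from Table 6; the two critical values are the standard textbook value for 20 degrees of freedom at the two-sided 99% confidence level (2.845) and the same value rounded to two decimals (2.85).

```python
def decide(t_calc, t_crit):
    """Return the test decision for a calculated t and a critical value."""
    return "reject H0" if t_calc > t_crit else "accept H0"


T_CALC = 2.847  # hypothetical calculated t, close to the critical value

# Same critical value (nu = 20, two-sided 99%) at two tabulation precisions.
CRITICAL_VALUES = {"2 decimals": 2.85, "3 decimals": 2.845}

for label, t_crit in CRITICAL_VALUES.items():
    print(f"critical value with {label} ({t_crit}): {decide(T_CALC, t_crit)}")
```

With these numbers the two tabulation precisions lead to opposite decisions, which is the situation that more precise critical values are meant to avoid.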
In an analogous manner, we now compare the performance of our work with Statistica©, R, and Excel results (Table 6). For Trial 1, the probability estimates are all >0.01, except for R (=0.01). Because the hypotheses H₀ and H₁ are being evaluated at the 99% confidence level or, equivalently, at the 0.01 significance level, the probability corresponding to the set of samples represented by Trial 1 should be >0.01 for H₀ to be accepted. Therefore, for Trial 1, Statistica©, Excel and our work suggest that H₀ is accepted and, correspondingly, H₁ is rejected. For Trial 2, on the other hand, because the probability estimates from all packages and this work are <0.01 (Table 6), the interpretation would be that H₁ is accepted and, correspondingly, H₀ is rejected.

Chemistry
Our second example concerns simulated data for paracetamol concentration in tablets (Miller and Miller, 2005). We envision the experiment either by using two analytical methods (Trial 1) or by two analysts using the same method (Trial 2); the results are summarised in Table 7. Similar to the geochemical application, the inference at 99% confidence level will depend on the critical values used for evaluating these experiments. For Trial 1 (results of two analytical methods; Table 7), using any of the literature critical values (Miller and Miller, 2005; Verma, 2005, 2009), H0 will be accepted and H1 will be rejected, whereas in the light of the new interpolated critical value obtained from the best-fit equation (Table 4 or ES4), H0 will be rejected and H1 will be accepted. On the other hand, for Trial 2 (results of two analysts; Table 7), H0 will be accepted and H1 will be rejected according to the literature critical values (Miller and Miller, 2005; Verma, 2005), but H0 will be rejected and H1 will be accepted from Verma (2009) as well as from the present work. The presently available precise critical values and best-fit equations could therefore be advantageously used in all future applications in chemistry.
In terms of probability estimates from this work and their comparison with Statistica©, R and Excel (Table 7), all results are consistent to infer that H1 is accepted and, correspondingly, H0 is rejected, because all probabilities are <0.01.

Fig. 6.- Critical values (t_cv) for different confidence levels (CL) from 50% to 99.9% as a function of the degrees of freedom. The two most important curves, for 95% and 99% confidence levels, are highlighted in red and green, respectively. As an example, the best polynomial double natural logarithm-transformed interpolation equation for two-sided 95% confidence level is shown as an inset in this figure. This example equation is provided here to make clear under what circumstances such natural logarithm (specifically triple logarithm) transformation may be useful for curve fitting.

was applied to these data (Table 9) at the same 95% confidence level as done for DODESSYS. The null hypothesis (H0) was that this work provided statistically similar standard deviation (from F test) and mean values (t test) to the literature values, whereas the alternate hypothesis (H1) was that this work provided lower or higher standard deviation and lower or higher mean, respectively, from the F and t tests. The following elements showed significantly lower standard deviation for this work as compared to the literature values: Si, Fe, K, Nd, Sm, Ho, Tm, Yb, Co, Li, and Sr. None of the elements showed statistically significantly higher standard deviation for this work than that reported by Gladney et al. (1991). This implies that the multiple test method of Verma (1997) practiced here performs better (provides less dispersion) than the two standard deviation method used by Gladney et al. (1991). The application of t test showed that none of the mean values obtained in the present work was significantly lower or higher than the literature values at 95% confidence level.

Geochemical reference material granite G-2 from U.S.A.
In order to establish central tendency (mean) and dispersion (standard deviation, standard error of the mean, or confidence interval) parameters, the data from different analytical methods should first be evaluated from significance tests (Verma, 1998). However, the application of F and t tests requires that the user assures that the individual groups of data have been drawn from normal populations (see Jensen et al., 1997). Objective ways to achieve this goal can be found in Barnett and Lewis (1994), Verma et al. (2009), or González-Ramírez et al. (2009).
For geochemical data from two different analytical methods and at the chosen confidence level, either the null hypothesis H0 (both sets of statistical samples drawn from the same normal population) is accepted and, correspondingly, the alternate hypothesis H1 is rejected, or H0 is rejected and, correspondingly, H1 is accepted. Depending on this result, the data from the two methods can or cannot be combined for arriving at pooled statistical parameters (Verma, 1998). In other words, in the first result of significance tests (H0 accepted) the data from the two methods can be combined to calculate the central tendency and dispersion parameters. In the second result (H1 accepted), the identity of the one or more analytical methods that differ from the remaining groups will have to be maintained, whereas the similar method groups could be combined. The statistical parameters will then be calculated individually for them.
A computer program was written in Java that is much more efficient than the available software for statistical

Medicine
Our third example deals with monitoring the change in glucose levels of a group of patients with schizophrenia or schizoaffective disorder over several weeks of treatment with an antipsychotic medication (Lindenmayer et al., 2003). We present just one set of simulated data in Table 8. The application of Student's t test at 99% confidence level shows that H0 is accepted and H1 is rejected when the literature critical values are used, but the opposite is the case when the presently simulated precise critical value from the best-fit equation (Table 4 or ES4) is used. Thus, the medical treatment will be considered effective only from the present critical values. The same conclusion of successful medical treatment (H0 rejected and H1 accepted) is also consistently reached from the probability estimates (Table 8).

Geochemical reference material granite G-1 from U.S.A.
In the field of geochemistry, the most important use of Student's t test could be related to the evaluation of data quality of traditionally available reference materials (RM; see Verma, 2012). Geochemical data for major- and trace-elements in RM are generally obtained from the application of different analytical methods in laboratories worldwide (e.g., Gladney et al., 1991, 1992; Imai et al., 1995; Verma, 1997, 1998; Velasco-Tapia et al., 2001; Marroquín-Guerra et al., 2009; Pandarinath, 2009a).
The geochemical data for granite G-1 were compiled from a published report by Gladney et al. (1991). These authors applied the so-called "two standard deviation method" to eliminate discordant observations and also reported their "recommended" or "consensus" values for different elements. This particular granite sample is one of the first international geochemical reference materials proposed long ago by the United States Geological Survey (e.g., see Flanagan, 1967). The compiled data were first processed by applying all thirty-three discordancy tests at 95% confidence level through DODESSYS software (Verma and Díaz-González, 2012). This confidence level was chosen to make the multiple test method of Verma (1997) used here comparable with the "two standard deviation method" practiced by Gladney et al. (1991). The statistical results from the outlier-free data of G-1 (for the same chemical elements as for G-2) are summarised in Table 9. Also included for comparison are the consensus values from Gladney et al. (1991). Elements with at least five valid observations were reported.
With the aim of objectively comparing these two sets of results, one-sided t test, in combination with Fisher F test (Verma, 2005; Cruz-Huicochea and Verma, 2013)

content (H2O+ and H2O-), fourteen rare earth elements from La to Lu, twenty-four commonly measured trace-elements from B to Zr and nineteen other trace-elements from Ag to W (Table 10). These values for sixty-nine chemical parameters, including the lower and upper 99% confidence limits of the mean, will be useful for calibrating analytical instruments and evaluating data quality of individual laboratories. The application of the t test has thus served to arrive at reliable statistical estimates (Table 10) for granite G-2.
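The lower and upper confidence limits of the mean referred to here follow from the critical value as mean ± t_cv·s/√n. The sketch below is a minimal illustration with a hypothetical function name and hypothetical concentration data (not taken from Table 10); the critical value 4.604 is the conventional tabulated value for 4 degrees of freedom at the two-sided 99% confidence level, for which a more precise value may be substituted.

```python
import math
from statistics import mean, stdev


def confidence_limits(values, t_crit):
    """Return (lower, upper) limits of the mean: mean -/+ t_crit * s / sqrt(n)."""
    n = len(values)
    half_width = t_crit * stdev(values) / math.sqrt(n)
    centre = mean(values)
    return centre - half_width, centre + half_width


# Hypothetical concentrations (%m/m) from five determinations; illustrative only.
concentrations = [69.10, 69.25, 69.18, 69.30, 69.12]
T_CRIT_99 = 4.604  # tabulated value for dof = n - 1 = 4, two-sided 99%

lower, upper = confidence_limits(concentrations, T_CRIT_99)
print(f"99% confidence limits of the mean: {lower:.3f} to {upper:.3f}")
```

The width of the interval scales directly with the critical value, so the number of decimal places in the critical value propagates into the reported limits.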

Comparison of geochemical reference materials G-1 and G-2 from U.S.A.
The granite standard G-2, collected a few km away from the site of G-1, was proposed to replace the already exhausted supply of G-1. We considered it interesting to evaluate the hypothesis that there are no significant differences between the chemical composition of the two standards (Table 11) by applying Student t test (two-sided) at 99% confidence level (see Verma, 2005 for more details). We also evaluated if the new standard G-2 showed higher or lower concentrations than the older standard G-1 by applying Student t test (one-sided) at 99% confidence level. However, the complete chemical data for each standard were newly processed for discordant outliers by applying all single as well as multiple outlier discordancy tests (Verma and Díaz-González, 2012) and the t test was applied to such discordant outlier-free data sets for each element. The results are summarised in Table 11.
The following elements showed significant differences between G-1 and G-2 at 99% confidence level (see H0 "false" in the column "Two-sided" in Table 11; this word could as well be "rejected" instead of false): all major elements (Si, Ti, Al, Fe, Mn, Mg, Ca, Na, K, and P); H2O+ and H2O-; rare earth elements La, Sm, Eu, Gd, and Lu; most other commonly measured trace elements (B, Ba, Be, Co, Cr, Cs, Cu, Ga, Hf, Li, Nb, Ni, Pb, Rb, Sb, Sr, Ta, Th, U, V, Zn, and Zr); and several less frequently measured trace elements (As, Au, Bi, F, Hg, In, Mo, Sn, Tl, and W). Consequently, the alternate hypothesis (H1) of the existence of a statistically significant difference would be true or accepted for these elements. On the contrary, the following elements did not show significant differences between G-1 and G-2 (see H0 "true" in the column "Two-sided" in Table 11; this word could as well be "accepted" instead of true): rare earth elements Ce, Pr, Nd, Tb, Dy, Ho, Er, Tm, and Yb; and trace elements Y, Ag, Br, C, Cd, Cl, Ge, Ir, S, and Se.
Similarly, the same elements also showed significantly higher or lower concentration values (see H0 "false" in the column "One-sided" in Table 11); Y is also added to

processing of multi-variate geochemical data, especially those arising from inter-laboratory trials. An updated version of this program (UDASYS, Univariate Data Analysis SYStem; Verma et al., 2013), with all discordancy and significance tests and capable of efficiently processing extensive experimental databases, is now available from the authors.
We used the current preliminary version of this program to process an unpublished compilation (by S.P. Verma and R. González-Ramírez) of geochemical data for granite G-2 (a reference material from the United States Geological Survey, U.S.A.). Prior to the application of t test at 99% confidence level, F test was applied at 99% confidence level to all data sets in this file to determine the type of t-statistics applicable for each pair of groups or statistical samples. Although application of ANOVA could be a better procedure to statistically handle these extensive geochemical data (e.g., Jensen et al., 1997), we highlight the application of t test to all possible groups of data (or pairs of statistical samples). For the application of t test, the implicit assumption is that both samples be drawn from normal population(s), which was tested and its validity assured through software DODESSYS (Verma and Díaz-González, 2012) by applying all single-outlier type tests (Verma et al., 2009) at 99% confidence level. The method grouping was the same as that proposed by Velasco-Tapia et al. (2001).
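The use of the F test to select the type of t statistic can be sketched as follows. This is a minimal illustration and not the Java program described here: the function names and data are hypothetical, and the unequal-variance form is assumed to be the Welch statistic with the Welch-Satterthwaite degrees of freedom, which are generally fractional. The F critical value must be supplied by the user; 99.0 is the upper 1% point of F for (2, 2) degrees of freedom.

```python
import math
from statistics import mean, variance


def f_ratio(sample_a, sample_b):
    """Variance ratio F >= 1 (larger sample variance in the numerator)."""
    var_a, var_b = variance(sample_a), variance(sample_b)
    return max(var_a, var_b) / min(var_a, var_b)


def variances_differ(sample_a, sample_b, f_crit):
    """True when the F test rejects equality of variances at the chosen level."""
    return f_ratio(sample_a, sample_b) > f_crit


def welch_t(sample_a, sample_b):
    """Return (|t|, fractional dof) for the unequal-variance (Welch) form."""
    term_a = variance(sample_a) / len(sample_a)
    term_b = variance(sample_b) / len(sample_b)
    t = abs(mean(sample_a) - mean(sample_b)) / math.sqrt(term_a + term_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    dof = (term_a + term_b) ** 2 / (
        term_a ** 2 / (len(sample_a) - 1) + term_b ** 2 / (len(sample_b) - 1))
    return t, dof


# Hypothetical results for one element from two method groups (illustrative only).
group_a = [10.1, 10.3, 10.2]
group_b = [10.0, 10.9, 10.4]
F_CRIT = 99.0  # upper 1% point of F for (2, 2) degrees of freedom

if variances_differ(group_a, group_b, F_CRIT):
    t_calc, dof = welch_t(group_a, group_b)
    print(f"unequal variances: t = {t_calc:.3f}, dof = {dof:.2f}")
else:
    print("variances not significantly different: use the pooled t statistic")
```

When the unequal-variance form applies, the resulting degrees of freedom are usually not integers, which is one situation where critical values for fractional degrees of freedom are needed.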
As a result of the application of the t test to all pairs of data for G-2, no statistically significant differences were observed at 99% confidence level for six major-elements (Si, Ti, Al, Mn, H2O+, and H2O-, all expressed as %m/m), nine rare earth elements (Pr, Nd, Sm, Gd, Dy, Ho, Er, Yb, and Lu), eighteen more common trace-elements (B, Be, Cr, Cu, Ga, Hf, Li, Nb, Pb, Rb, Sb, Sr, Ta, U, V, Y, Zn, and Zr) and ten other trace-elements (Ag, As, Bi, Br, C, Hg, Mo, S, Se, and W). Confidence levels were also individually calculated for all pairs of method groups. One pair of method groups showed differences in their mean values for two major-elements (Fe and Mg), five rare earth elements (La, Ce, Eu, Tb, and Tm), and five trace-elements (Cs, Sc, Cd, Cl, and F), whereas two or more groups of data showed differences for the remaining four major-elements (Ca, Na, K, and P) and six trace-elements (Ba, Co, Ni, Th, Ge, and Tl). The data obtained from the method groups showing significant differences from the remaining methods were left out before applying discordancy tests to the combined data and computing statistical parameters.
The applications of discordancy and significance tests as explained above enabled us to compute statistical parameters for geochemical data of G-2 (Table 10). These data include both central tendency and dispersion parameters for all ten major-elements from Si to P, water

lands: (MgO)adj, (Fo)norm, (Ol)norm, Mg_value, Cr, and Pb. Once such similarities and differences are established on a strictly statistical basis, geological and petrogenetic reasons can then be explored to explain them.

Other scientific and engineering fields
Although we have formulated only a few examples to limit the length of this paper, numerous such cases can be built from all other areas of science and engineering where quantitative data are interpreted for their statistical significance. A few areas where these critical values will be useful are: agriculture, astronomy, biology, biomedicine, biotechnology, criminology and justice, environmental and pollution research, food science and technology, geochronology, meteorology, nuclear science, palaeontology, petroleum research, quality assurance and assessment programs, soil science, structural geology, water research, and zoology.
The correct procedure would be to fulfil the requirement that the statistical samples should have been drawn from normal populations without any statistical contamination, which should be ascertained through discordancy tests (Barnett and Lewis, 1994; Verma, 1997, 2005, 2012; Verma et al., 1998) before the application of Student's t test. The new software DODESSYS (Verma and Díaz-González, 2012) should prove useful for this purpose.

Conclusions
New highly precise and accurate critical values have been generated for Student's t test. Best-fit regression equations based on double or triple natural-logarithm transformations of degrees of freedom have also been proposed for computing critical values for other degrees of freedom not tabulated in the present work, including fractional degrees of freedom. These critical values agree with those provided by software R. Although only a few examples highlight the importance of new critical values for inferring the validity of the null or alternate hypothesis, this work should be useful for many other scientific and engineering fields. Application of significance and discordancy tests to geochemical reference materials G-1

this list. Consequently, as for the significant differences, for all elements listed except Y (see H0 "true" in the column "One-sided" in Table 11), the one set of data is not higher or lower than the other set.
Therefore, we can safely conclude that the more recently prepared granite reference material G-2 has significantly different chemical composition than the earlier granite material G-1 although both were sampled from nearby localities in the same intrusive body.

Basic rocks from Canary and Azores Islands
Both groups of islands in the Atlantic Ocean probably originated by similar tectonic processes and, therefore, chemical compositions of similar magma types from these islands are likely to be similar. It may be interesting to explore the application of significance tests (F and t) to geochemical data from these islands.
Most major elements and normative minerals, rare earth elements Eu-Lu, and several trace elements did not show significant differences between the Canary and Azores Islands (see "true" in the H0 "Two-sided" column of Table 12). The elements or normative mineral parameters that showed significant differences at 99% confidence level (see "false" in the H0 "Two-sided" column of Table 12) were as follows: major elements (MnO)adj, (Na2O)adj, and (P2O5)adj; normative minerals (Ne)norm, (Fs)norm, (Fo)norm, (Ol)norm, and (Ap)norm; Mg_value; rare earth elements La, Ce, Pr, Nd, and Sm; and trace elements Cr, Nb, Pb, Sr, Ta, Th, U, Y, Zn, and Zr.
Basic rocks from the Canary Islands showed significantly higher concentrations than the Azores Islands for the following elements (see "false" in the H0 "One-sided" column and the mean concentrations in Table 12): (MnO)adj, (Na2O)adj, (P2O5)adj, (Ne)norm, (Fs)norm, and (Ap)norm; rare earth elements La, Ce, Pr, Nd, and Sm; and trace elements Nb, Sr, Ta, Th, U, Y, Zn, and Zr. On the other hand, a few elements in basic rocks of the Canary Islands showed lower concentrations than the Azores Is-

Fig. 3.- Evaluation of the quality of 28 regression models (polynomial fits) in terms of the averaged sum of the squared residuals (SSR/N, where N=65, being the same degrees of freedom as used in the regression model) for the fitting of 65 critical values of Student t for different two-sided confidence levels (CL) for the degrees of freedom (ν) from 3 to 2000; the regression models are the same as in Figure 2. See Tables 3 or ES3 and 4 or ES4 for more details. a) CL of 50%; b) CL of 60%; c) CL of 70%; d) CL of 80%; e) CL of 90%; f) CL of 95%; g) CL of 98%; h) CL of 99%; i) CL of 99.5%; j) CL of 99.8%; and k) CL of 99.9%.

Fig. 4.- Evaluation, for interpolation (int) purposes, of the quality of 28 regression models in terms of the averaged sum of the squared residuals ((SSR/N)int, where N=5, being the five degrees of freedom 105, 220, 380, 860, and 1100 listed in the second part of Table ES1; the t values corresponding to these degrees of freedom were not used in obtaining the regression model) for the fitting of 65 critical values of Student t for different two-sided confidence levels (CL) for the degrees of freedom (ν) from 3 to 2000; the regression models are the same as in Figure 2. See Tables 3 or ES3 and 4 or ES4 for more details. a) CL of 50%; b) CL of 60%; c) CL of 70%; d) CL of 80%; e) CL of 90%; f) CL of 95%; g) CL of 98%; h) CL of 99%; i) CL of 99.5%; j) CL of 99.8%; and k) CL of 99.9%.

Fig. 5.- Evaluation, for extrapolation (ext) purposes, of the quality of 28 regression models in terms of the averaged sum of the squared residuals ((SSR/N)ext, where N=4, being the four degrees of freedom 3000, 4000, 5000, and 6000 listed in the second part of Table ES1; the t values corresponding to these degrees of freedom were not used in obtaining the regression model) for the fitting of 65 critical values of Student t for different two-sided confidence levels (CL) for the degrees of freedom (ν) from 3 to 2000; the regression models are the same as in Figure 2. See Tables ES3 and ES4 for more details. a) CL of 50%; b) CL of 60%; c) CL of 70%; d) CL of 80%; e) CL of 90%; f) CL of 95%; g) CL of 98%; h) CL of 99%; i) CL of 99.5%; j) CL of 99.8%; and k) CL of 99.9%.

Fig. 7.- Comparison of Student t critical values obtained in this work from the alternative approach of Monte Carlo simulation with those calculated from software R. Note the small percentage differences between the two estimates, generally less than 0.002%.

Table 3
fit according to this particular criterion. For the explanation of the functions ll and lll, see Table 4.

Table 2.- Standard error values for simulated critical values of the Student t test. The abbreviations are as follows: cv ts

Table 4.- The best-fit critical value equations of Student t distribution for 99% confidence level.

Table 1.- Abridged form of simulated critical value table of Student t test. The abbreviations are as follows: cv ts t 50 - critical value of t for two-sided (ts) 50% confidence level; cv os t 75 - critical value of t for one-sided (os) 75% confidence level; and similar symbols are used for other columns. The more frequently used confidence levels are marked in boldface.


Table 6.- Simulated 87Sr/86Sr values in geochemical reference material JA-1 and application of Student's t test based on different critical values.

Table 7.- Simulated paracetamol concentration in tablets by two methods or by two analysts and application of Student's t test based on different critical values.
H0: there is no statistically significant difference between paracetamol from the two methods (MetA and MetB) or two analysts (AnalA and AnalB). H1: there is a statistically significant difference between paracetamol from the two methods (MetA and MetB) or two analysts (AnalA and AnalB).

Table 8.- Simulated glucose levels of a group of patients before and after medication and application of Student's t test based on different critical values.

Table 11.- Statistical parameters of element concentrations in G-1 and G-2 and application of Student t test to evaluate similarities and differences between them.

Table 12.- Statistical parameters of element concentrations in basic rocks from the Canary and Azores Islands and application of Student t test to evaluate similarities and differences between them.