Regression on Rankits

It is possible to replace the probabilities with rankits (also called normal order scores or order statistics) and then to use regression to fit a line to the probability plot (Gilliom and Helsel, 1986; Hashimoto and Trussell, 1983; Travis and Land, 1990). This is equivalent to rescaling the graph in terms of standard deviations instead of probabilities. If the data are normally distributed, or have been transformed to make them normal, the probabilities (p) are converted to rankits (normal order...
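The conversion and fit described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the plotting-position formula (Blom's, with a = 3/8), the function names, and the data values are all assumptions made for the example.

```python
from statistics import NormalDist, mean

def rankits(n, a=0.375):
    """Approximate normal order scores for ranks 1..n.

    Assumption: plotting position p_i = (i - a) / (n + 1 - 2a), with
    a = 3/8 (Blom's choice). Other plotting positions are also in use.
    """
    nd = NormalDist()
    return [nd.inv_cdf((i - a) / (n + 1 - 2 * a)) for i in range(1, n + 1)]

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

# Hypothetical data, for illustration only.
y = sorted([6.1, 5.4, 7.3, 6.8, 5.9, 6.4, 7.0, 6.6])
z = rankits(len(y))
intercept, slope = fit_line(z, y)
print(f"estimated mean = {intercept:.2f}, estimated std. dev. = {slope:.2f}")
```

Because the rankit axis is in units of standard deviations, the intercept of the fitted line estimates the mean and the slope estimates the standard deviation.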

Method Detection Limit: General Concepts

The method detection limit (MDL) is much more a statistical than a chemical concept. Without a precise statistical definition, one cannot determine a scientifically defensible value for the limit of detection, expect different laboratories to be consistent in how they determine the limit of detection, or be scientifically honest about declaring that a substance has (or has not) been detected. Beyond the statistical definition there must be a clear set of operational rules for how this...

Cohen's Maximum Likelihood Estimator Method

There are several methods to estimate the mean of a sample of censored data. Comparative studies show that none is always superior, so we have chosen to present Cohen's maximum likelihood method (Cohen, 1959, 1961; Gilliom and Helsel, 1986; Haas and Scheff, 1990). It is easy to compute for samples from a normally distributed parent population or from a distribution that can be made normal by a logarithmic transformation. A sample of n observations has measured values of the variable only at y >...

Stratified Sampling

Figure 23.4 shows three ways that sampling might be arranged in an area. Random sampling and systematic sampling do not take account of any special features of the site, such as different soil types or different levels of contamination. Stratified sampling is used when the study area exists in two or more distinct strata, classes, or conditions (Gilbert, 1987; Mendenhall et al., 1971). Often, each class or stratum has a different inherent variability. In Figure 23.4, samples are proportionally...
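One common way to arrange a stratified sample is proportional allocation, sketched below. The strata names, their sizes, the total sample size, and the function name are hypothetical; the rounding rule (largest remainders) is one reasonable choice among several.

```python
import random

def proportional_allocation(stratum_sizes, n_total):
    """Split n_total samples across strata in proportion to stratum size."""
    total = sum(stratum_sizes.values())
    raw = {k: n_total * v / total for k, v in stratum_sizes.items()}
    alloc = {k: int(r) for k, r in raw.items()}
    # Give any leftover samples to the strata with the largest remainders.
    leftover = n_total - sum(alloc.values())
    by_remainder = sorted(raw, key=lambda k: raw[k] - alloc[k], reverse=True)
    for k in by_remainder[:leftover]:
        alloc[k] += 1
    return alloc

# Hypothetical strata: number of grid cells of each soil type at a site.
strata = {"sand": 50, "silt": 30, "clay": 20}
alloc = proportional_allocation(strata, 20)

# Within each stratum, choose grid cells at random (seeded for repeatability).
rng = random.Random(1)
picks = {k: sorted(rng.sample(range(strata[k]), alloc[k])) for k in strata}
print(alloc)
```

If the strata differ greatly in variability, an allocation weighted by each stratum's standard deviation may be preferable to a purely proportional one.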

Fractional Factorial Experimental Designs

KEY WORDS: alias structure, confounding, defining relation, dissolved oxygen, factorial design, fractional factorial design, half-fraction, interaction, main effect, reference distribution, replication, ruggedness testing, t distribution, variance. Two-level factorial experimental designs are very efficient, but the number of runs grows exponentially as the number of factors increases. Usually your budget cannot support 128 or 256 runs. Even if it could, you would not want to commit your entire...
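As a small illustration of how a half-fraction cuts the number of runs, the sketch below builds a two-level design in which the last factor is set equal to the product of the others. The function name and the choice of four factors are assumptions made for the example.

```python
from itertools import product

def half_fraction(k):
    """Build a 2^(k-1) half-fraction of a two-level design in k factors.

    The first k-1 factors form a full factorial; the last factor is the
    product of the others, which gives the defining relation I = 12...k.
    """
    runs = []
    for levels in product((-1, 1), repeat=k - 1):
        last = 1
        for v in levels:
            last *= v
        runs.append(levels + (last,))
    return runs

# Four factors in 8 runs instead of 16.
design = half_fraction(4)
for run in design:
    print(run)
```

With this construction each main effect is confounded with a three-factor interaction (for example, the first factor's column equals the product of the other three), which is the price paid for halving the run count.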

Make the Original Data Record a Plot

Because the best way to display data is in a plot, it makes little sense to make the primary data record a table of values. Instead, plot the data directly on a digidot plot, which is Hunter's (1988) innovative combination of a time-sequence plot with a stem-and-leaf plot (Tukey, 1977) and is extremely useful for a modest-sized collection of data. The graph is illustrated in Figure 3.1 for a time series of 36 hourly observations (time, in hours, is measured from left to right). FIGURE 3.1...

Case Study: Compaction of Fly Ash

There was a proposal to use pozzolanic fly ash from a large coal-fired electric generating plant to build impermeable liners for storage lagoons and landfills. Pozzolanic fly ash reacts with water and sets into a rock-like material. With proper compaction this material can be made very impermeable. A typical criterion is that the liner must have a permeability of no more than $10^{-7}$ cm/sec. This is easily achieved using small quantities of fly ash in the laboratory, but in the field there are...

Random and Systematic Errors

The titration example oversimplifies the accumulation of random errors in titrations. It is worth a more complete examination in order to clarify what is meant by multiple sources of variation and additive errors. Making a volumetric titration, as one does to measure alkalinity, involves a number of steps: 1. Making up a standard solution of one of the reactants. This involves (a) weighing some solid material, (b) transferring the solid material to a standard volumetric flask, (c) weighing the...

The Box-Cox Power Transformations

A power transformation model developed by Box and Cox (1964) can, so far as possible, satisfy the conditions of normality and constant variance simultaneously. The method is applicable for almost any kind of statistical model and any kind of transformation. The transformed value of the original variable y is $y^{(\lambda)} = \dfrac{y^{\lambda} - 1}{\lambda\, \bar{y}_g^{\lambda - 1}}$, where $\bar{y}_g$ is the geometric mean of the original data series, and $\lambda$ expresses the power of the transformation. The geometric mean is obtained by averaging ln(y) and taking the exponential...
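A minimal sketch of the transformation follows. It assumes the scaled form of the Box-Cox transformation, (y^λ − 1) / (λ·g^(λ−1)) with g the geometric mean, and the usual limiting form g·ln(y) at λ = 0; the function names and data are hypothetical.

```python
import math

def geometric_mean(y):
    """Geometric mean: average ln(y), then take the exponential."""
    return math.exp(sum(math.log(v) for v in y) / len(y))

def box_cox(y, lam):
    """Scaled Box-Cox transformation of positive data (assumed form).

    Dividing by g**(lam - 1) keeps the transformed values in comparable
    units, so results for different lam can be compared directly.
    """
    g = geometric_mean(y)
    if lam == 0:
        return [g * math.log(v) for v in y]  # limiting case as lam -> 0
    return [(v ** lam - 1) / (lam * g ** (lam - 1)) for v in y]

# Hypothetical positive data, for illustration only.
data = [1.2, 3.4, 0.8, 7.9, 2.5, 15.0]
for lam in (1, 0.5, 0, -0.5):
    print(lam, [round(v, 2) for v in box_cox(data, lam)])
```

With λ = 1 the data are only shifted by one unit, λ = 0.5 corresponds to a square-root transformation, and λ = 0 to a logarithmic one.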

Transformations for Linearization

Transformations are sometimes used to obtain a straight-line relationship between two variables. This may involve, for example, using reciprocals, ratios, or logarithms. The left-hand panel of Figure 7.1 shows the exponential growth of bacteria. Notice that the variance (spread) of the counts increases as the population density increases. The right-hand panel shows that the data can be described by a straight line when plotted on a log scale. Plotting on a log scale is equivalent to making a...
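The effect of the log transformation can be sketched with a straight-line fit to the logarithms of the counts. The times and counts below are hypothetical values chosen to resemble exponential growth; they are not the data behind Figure 7.1.

```python
import math
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

# Hypothetical bacterial counts at hourly intervals.
hours = [0, 1, 2, 3, 4, 5]
counts = [100, 210, 390, 820, 1580, 3300]

# Fit a straight line to log10(count) instead of the raw counts.
log_counts = [math.log10(c) for c in counts]
intercept, slope = fit_line(hours, log_counts)

# Back-transform the fitted line to the original scale.
fitted = [10 ** (intercept + slope * t) for t in hours]
doubling_time = math.log10(2) / slope
print(f"slope = {slope:.3f} log10 units per hour")
print(f"doubling time = {doubling_time:.2f} hours")
```

The slope on the log scale is a growth rate, so quantities such as the doubling time follow directly from the fitted line.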

References

On the Distribution of a Positive Random Variable Having a Discrete Probability Mass at the Origin, J. Am. Stat. Assoc., 50, 901-908.
Aitchison, J. and J. A. Brown (1969). The Lognormal Distribution, Cambridge, England, Cambridge University Press.
Berthouex, P. M. and L. C. Brown (1994). Statistics for Environmental Engineers, Boca Raton, FL, Lewis Publishers.
Blom, G. (1958). Statistical Estimates and Transformed Beta Variables, New York, John Wiley.
Cohen, A. C., Jr....

Exercises

Evaluate an irrigation system that uses recycled water to grow cucumbers and eggplant. Some field test data are given in the table below. Irrigation water was applied in two ways: sprinkle and drip. Evaluate the yield, yield per gallon, and biomass production. 27.2 Water Pipe Corrosion. Students at Tufts University collected the following data to investigate the concentration of iron in drinking water as a means of inferring water pipe corrosion. (a) Estimate the...

Probability Plots

A probability plot is not needed to interpret the data in Table 5.1 because the time series plot and dot diagrams expose the important characteristics of the data. It is instructive, nevertheless, to use these data to illustrate how a probability plot is constructed, how its shape is related to the shape of the frequency distribution, and how it could be misused to estimate population characteristics. The probability plot, or cumulative frequency distribution, shown in Figure 5.4 was...

Constructing an External Reference Distribution

The first 130 observations in Figure 6.1 show the natural background pH in a stream. Table 6.1 lists the data. Suppose that a new effluent has been discharged to the stream and someone suggests it is depressing the stream pH. A survey to check this has provided ten additional consecutive measurements: 6.66, 6.63, 6.82, 6.84, 6.70, 6.74, 6.76, 6.81, 6.77, and 6.67. Their average is 6.74. We wish to judge whether this group of observations differs from past observations. These ten values are...
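One way to make such a judgment is to compare the new average with the averages of every run of ten consecutive background observations. The sketch below illustrates the mechanics only: Table 6.1 is not reproduced here, so the background series is a simulated stand-in, and the function names are hypothetical.

```python
import random

def moving_averages(series, window):
    """Averages of every run of `window` consecutive observations."""
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]

def fraction_at_or_below(reference, value):
    """Share of reference values that are no larger than `value`."""
    return sum(r <= value for r in reference) / len(reference)

# The ten new measurements quoted in the text.
new = [6.66, 6.63, 6.82, 6.84, 6.70, 6.74, 6.76, 6.81, 6.77, 6.67]
new_avg = sum(new) / len(new)

# Stand-in for the 130 background values (simulated, NOT the real data).
rng = random.Random(0)
background = [round(rng.gauss(6.8, 0.1), 2) for _ in range(130)]

# 130 observations give 121 overlapping averages of ten.
reference = moving_averages(background, len(new))
share = fraction_at_or_below(reference, new_avg)
print(f"new average = {new_avg:.2f}")
print(f"{share:.1%} of {len(reference)} reference averages are this low or lower")
```

Using consecutive runs from the historical record means the reference averages carry whatever serial correlation the stream data have, which a textbook t-test would ignore. A small share would suggest the new average is unusually low relative to past behavior.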

Parametric Estimates of Quantiles

If we know or are willing to assume the population distribution, we can use a parametric method. Parametric quantile (percentile) estimation will be discussed initially in terms of the normal distribution. The same methods can be used on nonnormally distributed data after transformation to make them approximately normal. This is convenient because the properties of the normal distribution are known and accessible in tables. FIGURE 8.1 Correspondence of percentiles on the lognormal and normal...
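A parametric quantile estimate under a normal model can be sketched as the sample mean plus a standard normal deviate times the sample standard deviation; for lognormal data the same calculation is done on the logarithms and the result is back-transformed. The data and function names below are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def normal_quantile(data, p):
    """Estimate the p-th quantile assuming a normal population."""
    z_p = NormalDist().inv_cdf(p)  # standard normal deviate for p
    return mean(data) + z_p * stdev(data)

def lognormal_quantile(data, p):
    """Estimate the p-th quantile assuming a lognormal population.

    The quantile is computed on the log scale and back-transformed.
    """
    logs = [math.log(v) for v in data]
    return math.exp(normal_quantile(logs, p))

# Hypothetical right-skewed concentrations, for illustration only.
conc = [1.1, 2.3, 0.9, 4.8, 1.7, 3.2, 12.5, 2.0, 6.4, 1.4]
for p in (0.50, 0.90, 0.99):
    print(f"p = {p:.2f}: normal {normal_quantile(conc, p):.2f}, "
          f"lognormal {lognormal_quantile(conc, p):.2f}")
```

Percentiles keep their rank under the log transformation, which is why the back-transformed median equals the geometric mean of the data rather than the arithmetic mean.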