Introduction

Researchers use experiments to answer questions. An experiment is characterized by the treatments and experimental units to be used, the way treatments are assigned to units, and the responses that are measured.
Advantages of experiments:

Experiments help us answer questions, but there are also non-experimental techniques. Consider that:

1. Experiments allow us to set up a direct comparison between the treatments of interest.
2. We can design experiments to minimize any bias in the comparison.
3. We can design experiments so that the error in the comparison is small.
4. Most important, we are in control of experiments, and having that control allows us to make stronger inferences about the nature of the differences that we see in the experiment.
Components of an Experiment

An experiment has treatments, experimental units, responses, and a method to assign treatments to units. Treatments, units, and assignment method specify the experimental design. A good experimental design must
• Avoid systematic error
• Be precise
• Allow estimation of error
• Have broad validity.
Treatments are the different procedures we want to compare. These could be different kinds or amounts of fertilizer in horticulture, or different long-distance rate structures in marketing.

Experimental units are the things to which we apply the treatments. These could be plots of land receiving fertilizer, or groups of customers receiving a particular rate structure.
Responses are outcomes that we observe after applying a treatment to an experimental unit. That is, the response is what we measure to judge what happened in the experiment.
Randomization is the use of a known, understood probabilistic mechanism for the assignment of treatments to units.
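A minimal sketch of such a mechanism: randomly assigning three treatments to twelve units via a random permutation (the treatment names, unit labels, and counts below are hypothetical):

```python
import random

# Sketch of randomization: assign 3 treatments to 12 units,
# 4 units per treatment, using a known probabilistic mechanism.
units = list(range(1, 13))                 # hypothetical unit labels 1..12
treatments = ["A", "B", "C"] * 4           # each treatment appears 4 times

random.seed(42)                            # fixed seed so the run is reproducible
random.shuffle(units)                      # random permutation of the units

assignment = dict(zip(units, treatments))
for unit in sorted(assignment):
    print(f"unit {unit:2d} -> treatment {assignment[unit]}")
```

Every unit has a known, equal chance of receiving each treatment, which is what lets us attach probability statements to the comparison later.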
Experimental error is the random variation present in all experimental results. Different experimental units will give different responses to the same treatment, and it is often true that applying the same treatment over and over again to the same unit will result in different responses in different trials. Experimental error does not refer to conducting the wrong experiment or dropping test tubes.
Measurement units are the actual objects on which the response is measured; these may differ from the experimental units. For example, consider the effect of different fertilizers on the nitrogen content of horticultural plants: fertilizer is applied to whole plots (the experimental units), but nitrogen content is measured on individual plants, which are the measurement units.
Factors combine to form treatments. For example, the baking treatment for a cake involves a given time at a given temperature. Individual settings for each factor are called levels of the factor.
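The factor-level structure can be sketched as a cross product of the levels; the temperature and time values below are hypothetical stand-ins for the cake example:

```python
from itertools import product

# Treatments as factor-level combinations: crossing baking temperatures
# with baking times gives one treatment per (temperature, time) pair.
temperatures = [325, 350]        # hypothetical levels of the temperature factor
times = [30, 35]                 # hypothetical levels of the time factor (minutes)

treatments = list(product(temperatures, times))
print(treatments)   # 4 treatments: every temperature paired with every time
```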
CRD (Completely Randomized Design)

Random effects are an approach to designing experiments and modeling data. Random effects are appropriate when the treatments are random samples from a population of potential treatments. They are also useful for random subsampling from populations.
ANOVA for the one-factor model:

Source      DF     EMS
Treatments  g − 1  σ² + nσ²_T
Error       N − g  σ²

(σ²_T denotes the treatment variance component in the random-effects model.)
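As a sketch of how the quantities in this table are computed, the sums of squares and mean squares for a balanced one-factor design can be calculated directly (the data below are hypothetical):

```python
import numpy as np

# One-factor ANOVA by hand for balanced data: g = 3 treatments, n = 4 units each.
data = {
    "A": [21.0, 23.5, 20.1, 22.4],
    "B": [25.2, 26.8, 24.9, 27.1],
    "C": [19.8, 18.5, 21.0, 20.2],
}

g = len(data)                          # number of treatments
n = len(next(iter(data.values())))     # units per treatment
N = g * n
all_obs = np.concatenate([np.array(v) for v in data.values()])
grand_mean = all_obs.mean()

# Sums of squares
ss_trt = sum(n * (np.mean(v) - grand_mean) ** 2 for v in data.values())
ss_err = sum(((np.array(v) - np.mean(v)) ** 2).sum() for v in data.values())

ms_trt = ss_trt / (g - 1)   # estimates sigma^2 + n * sigma^2_T under random effects
ms_err = ss_err / (N - g)   # estimates sigma^2
f_stat = ms_trt / ms_err

print(f"F({g - 1}, {N - g}) = {f_stat:.2f}")
```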
ANOVA for a two-factor model:

Source  DF                  EMS (random effects)
A       a − 1               σ² + nσ²_AB + bnσ²_A
B       b − 1               σ² + nσ²_AB + anσ²_B
AB      (a − 1)(b − 1)      σ² + nσ²_AB
Error   N − ab = ab(n − 1)  σ²
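The degrees-of-freedom identity N − ab = ab(n − 1) for a balanced design can be checked numerically (the level counts below are hypothetical):

```python
# Degrees of freedom for a balanced two-factor design, illustrating
# that N - ab = ab(n - 1) when every cell has n replicates.
a, b, n = 3, 4, 5          # hypothetical levels of A, levels of B, replicates
N = a * b * n              # total observations

df_A = a - 1
df_B = b - 1
df_AB = (a - 1) * (b - 1)
df_error = N - a * b

assert df_error == a * b * (n - 1)
assert df_A + df_B + df_AB + df_error == N - 1  # the DFs partition the total
print(df_A, df_B, df_AB, df_error)
```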
Randomized Complete Block Design (RCBD)

The Randomized Complete Block Design (RCBD) is the basic blocking design.

Why and when to use the RCBD

The RCBD is an effective design when there is a single source of extraneous variation in the responses that we can identify ahead of time and use to partition the units into blocks. Blocking is done at the time of randomization; you can't construct blocks after the experiment has been run.
Analysis for the RCBD

Once we have the correct model, we do point estimates, confidence intervals, multiple comparisons, testing, residual analysis, and so on, in the same way as for the CRD.
Example:

The ANOVA table follows:

Source      DF   SS       MS       F-value   p-value
Blocks      4    686.4    171.60
Treatments  2    432.03   216.02   12.2      .0037
Error       8    141.8    17.725

The treatment F-test is significant at the 5% level.
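The F-value and p-value in this table can be reproduced from the mean squares; a minimal check using SciPy's F distribution:

```python
from scipy.stats import f

# Recompute the F statistic and p-value from the RCBD ANOVA table above.
ms_treatments = 216.02   # 432.03 / 2
ms_error = 17.725        # 141.8 / 8
f_stat = ms_treatments / ms_error

# Treatments have 2 df; error has 8 df.
p_value = f.sf(f_stat, 2, 8)

print(f"F = {f_stat:.2f}, p = {p_value:.4f}")   # F ≈ 12.19, p ≈ 0.0037
```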
If we use an RCBD and a CRD to test the same treatments, with both designs having the same total size N and both using the same population of units, the efficiency of the RCBD relative to the CRD is the factor by which the sample size of the CRD would need to be increased to obtain the same information as the RCBD.
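A common estimate of this relative efficiency uses the block and error mean squares; a sketch using the standard formula for comparing an RCBD to a CRD, with the values from the example ANOVA table above:

```python
# Estimated relative efficiency of the RCBD versus a CRD, based on the
# block and error mean squares (a standard textbook formula).
r, t = 5, 3              # blocks and treatments, from the example above
ms_block = 171.60
ms_error = 17.725

re = ((r - 1) * ms_block + r * (t - 1) * ms_error) / ((r * t - 1) * ms_error)
print(f"Relative efficiency ≈ {re:.2f}")
# A CRD would need roughly re times as many units for the same precision.
```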
Comparison of two means

Among the most commonly used statistical significance tests applied to small data sets (samples drawn from populations) is the family of Student's t-tests. One of these tests is used for the comparison of two means and is commonly applied in many situations. Typical examples are:
Example 1: Comparison of analytical results obtained with the same
method on samples A and B, in order to confirm whether both samples contain the
same percentage of the measured analyte or not.
Example 2: Comparison of analytical results obtained with two
different methods A and B on the same sample, in order to confirm whether both
methods provide similar analytical results or not.
General aspects of significance tests
The outcome of these tests is the acceptance or rejection of the null
hypothesis (H0). The null hypothesis generally states that:
"Any differences, discrepancies, or suspiciously outlying results are
purely due to random and not systematic errors". The alternative hypothesis (Ha)
states exactly the opposite.
The null hypothesis for the aforementioned
examples is:
The means
are the same, i.e. in Example 1: both samples contain the same percentage of the
analyte; in Example 2: both
methods provide the same analytical results. The differences observed (if any)
are purely due to random errors.
The alternative hypothesis is:
The means
are significantly different, i.e. in Example 1: each sample contains a
different percentage of the analyte; in Example
2: the methods provide
different analytical results (so
at least one method yields systematic analytical errors).
Student's t-test for the comparison of two means
This test assumes: (a) a normal distribution for the populations of the random errors, and (b) no significant difference between the standard deviations of the two populations.
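A minimal sketch of this two-sample t-test using SciPy, with equal variances assumed as stated above (the measurement values are hypothetical):

```python
from scipy.stats import ttest_ind

# Two-sample Student's t-test, as in Example 2: comparing analyte
# percentages measured by two methods on the same sample.
method_a = [10.2, 10.4, 10.1, 10.3, 10.2]   # hypothetical replicates, method A
method_b = [10.6, 10.8, 10.5, 10.7, 10.6]   # hypothetical replicates, method B

t_stat, p_value = ttest_ind(method_a, method_b, equal_var=True)
if p_value < 0.05:
    print(f"Reject H0 (p = {p_value:.4f}): the means differ significantly.")
else:
    print(f"Fail to reject H0 (p = {p_value:.4f}).")
```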
Correlation and Regression Analysis
Regression
analysis involves identifying the relationship between a dependent variable and
one or more independent variables. A model of the relationship is hypothesized,
and estimates of the parameter values are used to develop an estimated
regression equation. Various tests are then employed to determine if the model
is satisfactory. If the model is deemed satisfactory, the estimated regression
equation can be used to predict the value of the dependent variable given
values for the independent variables.
Regression model

In simple linear regression, the model used to describe the relationship between a single dependent variable y and a single independent variable x is y = a0 + a1x + ε. Here a0 and a1 are referred to as the model parameters, and ε is a probabilistic error term that accounts for the variability in y that cannot be explained by the linear relationship with x. If the error term were not present, the model would be deterministic; in that case, knowledge of the value of x would be sufficient to determine the value of y.
Least squares method
Either a simple or multiple regression model is
initially posed as a hypothesis concerning the relationship among the dependent
and independent variables. The least squares method is the most widely used
procedure for developing estimates of the model parameters.
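The least squares estimates of a0 and a1 for the simple linear model have closed-form expressions; a minimal sketch with hypothetical data:

```python
import numpy as np

# Least squares estimates of the simple-regression parameters a0 and a1,
# computed from the usual closed-form formulas.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # hypothetical independent variable
y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])   # hypothetical dependent variable

a1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a0 = y.mean() - a1 * x.mean()

y_hat = a0 + a1 * x              # estimated regression equation
residuals = y - y_hat            # variability the line cannot explain

print(f"y = {a0:.3f} + {a1:.3f} x")
```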
Correlation
Correlation and regression analysis are related in the
sense that both deal with relationships among variables. The correlation
coefficient is a measure of linear association between two variables. Values of
the correlation coefficient are always between -1 and +1. A correlation
coefficient of +1 indicates that two variables are perfectly related in a
positive linear sense, a correlation coefficient of -1 indicates that two
variables are perfectly related in a negative linear sense, and a correlation
coefficient of 0 indicates that there is no linear relationship between the two
variables. For simple linear regression, the sample correlation coefficient is the square root of the coefficient of determination, with the sign of the correlation coefficient being the same as the sign of a1, the estimated coefficient of x in the regression equation.
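A short numerical check of these properties, using hypothetical data:

```python
import numpy as np

# Correlation coefficient and its relation to the coefficient of
# determination (r^2) in simple linear regression.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # hypothetical data
y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])

r = np.corrcoef(x, y)[0, 1]      # sample correlation coefficient
r_squared = r ** 2               # coefficient of determination

print(f"r = {r:.4f}, r^2 = {r_squared:.4f}")
# r lies between -1 and +1; its sign matches the sign of the fitted slope.
```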
Neither regression nor correlation analyses can be
interpreted as establishing cause-and-effect relationships. They can indicate
only how or to what extent variables are associated with each other. The
correlation coefficient measures only the degree of linear association between
two variables. Any conclusions about a cause-and-effect relationship must be
based on the judgment of the analyst.
Presented and prepared by:
Md. Rafiqul Islam Shuvo
B.Sc.Ag. (Hons.), PSTU
MS in Horticulture, BAU
www-agricultureinfo.blogspot.com
shuvo_ag10@yahoo.com