Biostatistics
Definition of Statistics:
Statistics is a field of study concerned with (1) the collection, organization, summarization, and analysis of data; and (2) the drawing of inferences about a body of data when only a part of the data is observed.
Definition of Biostatistics:
When the data analyzed are derived
from the biological sciences and medicine, we use the term biostatistics to
distinguish this particular application of statistical tools and concepts.
Definition of Descriptive Statistics:
Descriptive statistics are coefficients that summarize a given set of data, which may represent either an entire population or a sample of it.
They are divided into:
a) Measures of central tendency.
b) Measures of dispersion.
A measure of central tendency is a single value that describes how a group of data clusters around a central value. The mean, median, and mode are the usual examples, and the name itself suggests the definition.
Or:
In each of the measures of central tendency, of which we discuss three, we have a single value that is considered to be typical of the set of data as a whole.
Measures of central tendency convey information regarding the average value of
a set of values. As we will see, the word average can be defined in different
ways.
First: Arithmetic mean:
The arithmetic mean is a descriptive measure (a single number that summarizes the data) and is what most people have in mind when they say “average”.
To calculate the mean in general:
The mean is obtained by adding all the values in a population or sample and dividing by the number of values that are added.
To calculate the sample mean:
1) Let X represent the random variable of interest, such as age.
2) Let xi represent a typical value of the random variable.
3) Let n represent the number of values in the sample.
4) Σ is the summation symbol; in the sample mean formula it means adding all values of the variable from the first to the last.
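Ex:
Suppose the following ages were selected randomly from a population and we want to calculate the mean of this sample:
(43, 66, 61, 64, 65, 38, 59, 57, 57, 50) years.
Then we apply the rule of the sample mean:
$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{43 + 66 + 61 + 64 + 65 + 38 + 59 + 57 + 57 + 50}{10} = \frac{560}{10} = 56$ years.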
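The sample mean calculation for the ten ages above can be sketched in Python. This is only a minimal illustration; the names `ages` and `sample_mean` are ours and do not come from any particular library.

```python
# Ages (in years) from the worked example above.
ages = [43, 66, 61, 64, 65, 38, 59, 57, 57, 50]


def sample_mean(values):
    """Add all the values and divide by the number of values."""
    if not values:
        raise ValueError("the mean of an empty sample is undefined")
    return sum(values) / len(values)


mean_age = sample_mean(ages)
print(mean_age)  # 56.0
```

The function simply mirrors the verbal rule: the sum of the values divided by n.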
Properties of Mean:
1) Uniqueness. For a given set of data there is one and only one arithmetic mean.
2) Simplicity. The arithmetic mean is easily understood and easy to compute.
3) It can be affected by outliers. An outlier is an extreme value in a set of data that is much higher or lower than the other numbers.
Since each and
every value in a set of data enters into the computation of the mean, it is affected
by each value. Extreme values, therefore, have an influence on the mean and, in
some cases, can so distort it that it becomes undesirable as a measure of
central tendency.
For example:
Suppose we want to calculate the average charge of 5 physicians whose charges are L.E 75, L.E 80, L.E 75, L.E 80, and L.E 400.
The average is L.E 142, a misleading value because it is not representative of the set of data as a whole: four of the five charges are L.E 80 or less.
4) It can be used to calculate the variance.
5) It acts as a balancing point: the total distance from the mean to the data points below the mean is equal to the total distance from the mean to the data points above the mean.
For example, for the ages above the mean = 56:
Total distance from the mean to the values below it = 18 + 13 + 6 = 37
Total distance from the mean to the values above it = 1 + 1 + 3 + 5 + 9 + 8 + 10 = 37
NOTE: age 57 is repeated twice, so its distance from the mean is written twice.
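Two of the properties above, sensitivity to outliers and the balancing point, can be checked numerically. The sketch below reuses the physician charges and the ages from the examples. Comparing the mean with and without the L.E 400 charge is our own illustration, and the variable names are assumptions.

```python
# Physician charges (L.E) and ages (years) from the examples above.
charges = [75, 80, 75, 80, 400]
ages = [43, 66, 61, 64, 65, 38, 59, 57, 57, 50]


def mean(values):
    """Arithmetic mean: sum of the values divided by their count."""
    return sum(values) / len(values)


# Outlier effect: compare the mean with and without the extreme charge.
mean_with_outlier = mean(charges)
mean_without_outlier = mean([c for c in charges if c != 400])

# Balancing point: the distances below and above the mean add up to the same total.
m = mean(ages)
distance_below = sum(m - x for x in ages if x < m)
distance_above = sum(x - m for x in ages if x > m)

print(mean_with_outlier, mean_without_outlier)  # 142.0 77.5
print(distance_below, distance_above)           # 37.0 37.0
```

A single extreme charge nearly doubles the mean, while the two distance totals for the ages match exactly.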
Second: Median:
The median of a
finite set of values is that value which divides the set into two equal
parts such that the number of values equal to or greater than the median is equal
to the number of values equal to or less than the median.
If the number
of values is odd, the median will be the middle
value when all values have been
arranged in order of magnitude.
When the number
of values is even, there is no single middle value. Instead there are two
middle values. In this case the median is taken to be the mean of these two
middle values, when all values have been arranged in the order of their
magnitudes.
So, the median of a set of data can be calculated by:
1) Ordering the values in ascending or descending order.
2) Using the following rule to determine the position of the median in the ordered sequence:
$\frac{n + 1}{2}$
where n is the number of values in the data.
Ex:
To find the median of the following numbers, we first order the values in ascending or descending order and then locate the median.
10, 2, 5, 4, 8, 7, 9
Order:
2, 4, 5, 7, 8, 9, 10
Number of values = 7, so the median is the 4th ordered value, which is 7.
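The ordering-and-middle-value procedure can be sketched in Python as follows. The function name `median` is ours, and the sketch handles both the odd case (one middle value) and the even case (mean of the two middle values) described above.

```python
def median(values):
    """Middle value of the ordered data; mean of the two middle values when n is even."""
    if not values:
        raise ValueError("the median of an empty data set is undefined")
    ordered = sorted(values)   # step 1: order the values
    n = len(ordered)
    middle = n // 2            # zero-based index of the upper middle value
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([10, 2, 5, 4, 8, 7, 9]))  # 7
```

For the ten ages used earlier, the two middle ordered values are 57 and 59, so the same function returns 58.0.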
Properties of Median:
1) Uniqueness. As is true with the mean, there is only one median for a given set of data.
2) Simplicity. The median is easy to calculate.
3) It is not as drastically affected by extreme values as is the mean.
Third: Mode:
The mode of a set of values is that value which occurs most frequently.
Properties of Mode:
1) Can be used for qualitative or quantitative data.
2) May not be unique: a set of values can have more than one mode.
3) Not affected by extreme values.
4) May not exist: if all the values are different, there is no mode.
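A short Python sketch can illustrate the properties of the mode listed above. The function name `modes` and the blood-group sample are hypothetical, introduced only for illustration. The sketch treats a data set whose values are all different as having no mode.

```python
from collections import Counter


def modes(values):
    """Return every most frequent value; an empty list means no mode exists."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    # If every value occurs exactly once (and there are several values), there is no mode.
    if highest == 1 and len(counts) > 1:
        return []
    return [value for value, count in counts.items() if count == highest]


ages = [43, 66, 61, 64, 65, 38, 59, 57, 57, 50]
print(modes(ages))                  # [57]
print(modes([1, 1, 2, 2, 3]))       # [1, 2]  (not unique)
print(modes([1, 2, 3]))             # []      (does not exist)
print(modes(["A", "O", "A", "B"]))  # ['A']   (qualitative data)
```

Returning a list keeps the "may not be unique" and "may not exist" cases explicit instead of forcing a single answer.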
Measures of Dispersion:
The dispersion of a set of observations refers to the variety that they exhibit. A measure of dispersion conveys information regarding the amount of variability present in a set of data. If all the values are the same, there is no dispersion; if they are not all the same, dispersion is present in the data. The amount of dispersion may be small when the values, though different, are close together.
First: Range:
One way to measure the variation in a set of values is to compute the range. The range is the difference between the largest and smallest value in a set of observations. If we denote the range by R, the largest value by $x_L$, and the smallest value by $x_S$, we compute the range as follows:
$R = x_L - x_S$
Ex: to calculate the range of the ages from the previous example:
Range = 66 - 38 = 28 years.
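In Python the range of the ages reduces to one subtraction. The name `value_range` is ours, chosen to avoid clashing with Python's built-in `range`.

```python
ages = [43, 66, 61, 64, 65, 38, 59, 57, 57, 50]


def value_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)


print(value_range(ages))  # 28
```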
Second: Variance:
It measures how far a set of data is spread out from its average value.
To calculate the sample variance, follow these steps:
1) Get the mean of the values.
2) Subtract the mean from each value.
3) Square each result of the subtraction.
4) Add all the squared results from the previous step.
5) Divide the final sum by (number of values in the sample - 1).
Final equation:
$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
Ex: from the example of the ages of people.
Mean = 56.
By applying the equation:
$S^2 = \frac{(43 - 56)^2 + (66 - 56)^2 + \dots + (50 - 56)^2}{10 - 1} = \frac{810}{9} = 90$
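The five steps for the sample variance can be followed one by one in code. This is a minimal sketch with a function name of our own choosing (`sample_variance`), applied to the ages from the example.

```python
ages = [43, 66, 61, 64, 65, 38, 59, 57, 57, 50]


def sample_variance(values):
    """Sample variance computed with the five steps listed above."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are needed for the sample variance")
    mean = sum(values) / n                   # step 1: get the mean
    deviations = [x - mean for x in values]  # step 2: subtract the mean from each value
    squared = [d ** 2 for d in deviations]   # step 3: square each difference
    total = sum(squared)                     # step 4: add the squared differences
    return total / (n - 1)                   # step 5: divide by n - 1


print(sample_variance(ages))  # 90.0
```

For the ages, the squared differences add up to 810, and dividing by 9 gives 90.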
Third: Standard Deviation:
The variance
represents squared units and, therefore, is not an appropriate measure of
dispersion when we wish to express this concept in terms of the original units.
To obtain a measure of dispersion in original units, we merely take the square
root of the variance. The result is called the standard deviation.
General formula:
$S = \sqrt{S^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$
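As a sketch, the standard deviation of the ages can be computed by taking the square root of the sample variance. The helper name `sample_std` is ours; the cross-check uses `statistics.stdev` from the Python standard library, which also computes the sample standard deviation.

```python
import math
import statistics

ages = [43, 66, 61, 64, 65, 38, 59, 57, 57, 50]


def sample_std(values):
    """Square root of the sample variance (divisor n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are needed")
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    return math.sqrt(variance)


s = sample_std(ages)
print(round(s, 2))  # 9.49

# Cross-check against the standard library implementation.
print(math.isclose(s, statistics.stdev(ages)))
```

The result is in years, the original unit of the data, whereas the variance is in squared years.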
Properties of Standard deviation and Variance:
1) Used to determine the spread of data, and to compare two or more data sets to determine which is more variable.
2) Used to determine the consistency of a variable.
3) Used to determine the number of data values that fall within a specified interval in a distribution.
4) Are used quite often in inferential statistics.
5) Are sensitive to extreme values.
The coefficient of variation is used when one desires to compare the dispersion in two sets of data.
The general formula:
$C.V. = \frac{S}{\bar{x}} \times 100\%$
where S is the standard deviation and $\bar{x}$ is the mean.
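For example: Suppose two samples of human males yield the following results:
Sample            Mean    Standard deviation
25-year-olds      145     10
11-year-olds       80     10
To know which is more variable, we calculate the coefficient of variation for each sample:
For the 25-year-olds: C.V. = 10/145 × 100 = 6.9%
For the 11-year-olds: C.V. = 10/80 × 100 = 12.5%
So the variation in the 11-year-old sample is greater than that in the 25-year-old sample, even though the two standard deviations are equal.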
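The comparison of the two samples can be reproduced with a small helper. The function name `coefficient_of_variation` is ours, and the inputs are the means and standard deviations used in the calculation above.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation expressed as a percentage of the mean."""
    if mean == 0:
        raise ValueError("the coefficient of variation is undefined for a mean of zero")
    return std_dev / mean * 100


cv_25 = coefficient_of_variation(10, 145)  # 25-year-old sample
cv_11 = coefficient_of_variation(10, 80)   # 11-year-old sample

print(round(cv_25, 1), round(cv_11, 1))  # 6.9 12.5
print(cv_11 > cv_25)                     # True
```

Because the result is a percentage with no units, it lets two samples with different means be compared directly.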
Stay tuned for the next topic!