BioStatistics


Definition of Statistics:
Statistics is a field of study concerned with (1) the collection, organization, summarization, and analysis of data; and (2) the drawing of inferences about a body of data when only a part of the data is observed.
Definition of Biostatistics:
When the data analyzed are derived from the biological sciences and medicine, we use the term biostatistics to distinguish this particular application of statistical tools and concepts.
Definition of Descriptive Statistics:
Descriptive statistics are coefficients that summarize a given set of data, which may represent either an entire population or a sample of it.
They are divided into:
a)     Measures of central tendency.
b)     Measures of dispersion.
A measure of central tendency is a single value that describes how a group of data clusters around a central value. The three measures discussed here are the mean, the median, and the mode; as the name suggests, each one locates the center of the data.
Or:
In each of the measures of central tendency, of which we discuss three, we have a single value that is considered to be typical of the set of data as a whole. Measures of central tendency convey information regarding the average value of a set of values. As we will see, the word average can be defined in different ways.
First: Arithmetic mean:
It is a descriptive measure (a single number that summarizes the data) and is what most people have in mind when they say “average”.
To calculate the mean in general:
The mean is obtained by adding all the values in a population or sample and dividing by the number of values that are added.
To calculate Sample Mean:
1)     Let X represent a random variable, such as age.
2)     Let xi represent a typical value of the random variable.
3)     Let n represent the number of values in the sample.
4)     Σ is the summation symbol; in the equation below it means adding all values of the variable from the first to the last.

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

Ex:
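Suppose the following ages were selected at random from a population, and we want to calculate the mean of this sample:

(43, 66, 61, 64, 65, 38, 59, 57, 57, 50) years.
Applying the rule for the sample mean:
$$\bar{x} = \frac{43 + 66 + 61 + 64 + 65 + 38 + 59 + 57 + 57 + 50}{10} = \frac{560}{10} = 56 \text{ years}$$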
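The same calculation can be sketched in a few lines of Python. This is only an illustration; the function name sample_mean is ours, not part of any library.

```python
# Ages (in years) from the example above.
ages = [43, 66, 61, 64, 65, 38, 59, 57, 57, 50]


def sample_mean(values):
    """Return the arithmetic mean: the sum of the values divided by their count."""
    return sum(values) / len(values)


mean_age = sample_mean(ages)
print(mean_age)  # 56.0
```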
Properties of Mean:
1)     Uniqueness. For a given set of data there is one and only one arithmetic mean.
2)     Simplicity. The arithmetic mean is easily understood and easy to compute.
3)     Sensitivity to outliers. The mean can be affected by outliers, which are extreme values much higher or lower than the other values in the set of data.

Since each and every value in a set of data enters into the computation of the mean, it is affected by each value. Extreme values, therefore, have an influence on the mean and, in some cases, can so distort it that it becomes undesirable as a measure of central tendency.
For example:
Suppose we want to calculate the average fee charged by 5 physicians, whose fees are L.E 75, L.E 80, L.E 75, L.E 80, and L.E 400.
The mean is L.E 142, a value that does not represent the set of data as a whole, since four of the five fees are L.E 80 or less.
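A rough way to see how much the single extreme fee pulls the mean is to compute it with and without that value. The sketch below does this; dropping the largest value is only for illustration here, not a recommended way of handling outliers.

```python
# Physician fees in L.E, from the example above.
fees = [75, 80, 75, 80, 400]

mean_all = sum(fees) / len(fees)

# Drop the extreme fee (L.E 400) to see how far it pulled the mean upward.
largest = max(fees)
typical_fees = [fee for fee in fees if fee != largest]
mean_without_outlier = sum(typical_fees) / len(typical_fees)

print(mean_all)              # 142.0
print(mean_without_outlier)  # 77.5
```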

4)     It is used in calculating the variance.
5)     It acts as a balancing point: the total distance from the mean to the data points below the mean is equal to the total distance from the mean to the data points above the mean.


For example, for the ages sample above, the mean is 56:
Total distance from the mean to the values below it = 18 + 13 + 6 = 37
Total distance from the mean to the values above it = 1 + 1 + 3 + 5 + 9 + 8 + 10 = 37
NOTE: the age 57 appears twice, so its distance from the mean is counted twice.
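The balancing-point property can be checked directly on the ages sample. The following is a minimal sketch; the variable names are ours.

```python
ages = [43, 66, 61, 64, 65, 38, 59, 57, 57, 50]
mean_age = sum(ages) / len(ages)

# Total distance from the mean to the values below it and above it.
distance_below = sum(mean_age - x for x in ages if x < mean_age)
distance_above = sum(x - mean_age for x in ages if x > mean_age)

print(distance_below, distance_above)  # 37.0 37.0
```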

 Second: Median:
The median of a finite set of values is that value which divides the set into two equal parts such that the number of values equal to or greater than the median is equal to the number of values equal to or less than the median.
If the number of values is odd, the median will be the middle value when all values have been arranged in order of magnitude.
When the number of values is even, there is no single middle value. Instead there are two middle values. In this case the median is taken to be the mean of these two middle values, when all values have been arranged in the order of their magnitudes.

So,

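The median of a set of data can be calculated as follows:
1)     Order the values in ascending or descending order.
2)     Use the following rule to determine the position of the median in the ordered sequence:
$$\text{median} = \left(\frac{n+1}{2}\right)^{\text{th}} \text{ ordered value}$$
where n is the number of values in the data. When n is even, this position falls halfway between the two middle values, so the median is their mean.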
Ex:
To find the median of the following numbers, we first order the values in ascending or descending order and then locate the middle value.
10, 2, 5, 4, 8, 7, 9

Order: 2, 4, 5, 7, 8, 9, 10
Number of values = 7, so the median is the (7 + 1)/2 = 4th ordered value, which is 7.
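As a sketch, the procedure can be written as a small Python function that takes the middle value when the number of values is odd and the mean of the two middle values when it is even. The function name median is ours, and it assumes a non-empty list. The ages sample is reused here only to illustrate the even case.

```python
def median(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)th ordered value, which is index n // 2
        # because Python counts positions from 0.
        return ordered[middle]
    # Even n: the mean of the two middle values.
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([10, 2, 5, 4, 8, 7, 9]))  # 7

ages = [43, 66, 61, 64, 65, 38, 59, 57, 57, 50]
print(median(ages))  # 58.0
```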

 Properties of Median:
1)     Uniqueness. As is true with the mean, there is only one median for a given set of data.
2)     Simplicity. The median is easy to calculate.
3)     It is not as drastically affected by extreme values as is the mean.

Third: Mode:
The mode of a set of values is that value which occurs most frequently.
Properties of Mode:
1)     Can be used for qualitative or quantitative data.
2)     May not be unique.
3)     Not affected by extreme values.
4)     May not exist.
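These properties can be illustrated with a short Python sketch. The function name modes is ours, and the rule "no mode when no value repeats" is an assumption made for this example. Apart from the ages sample, the input lists are made up for illustration.

```python
from collections import Counter


def modes(values):
    """Return every most frequent value, or [] if no value repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        # Every value occurs exactly once: treated here as "no mode".
        return []
    return [value for value, count in counts.items() if count == highest]


print(modes([43, 66, 61, 64, 65, 38, 59, 57, 57, 50]))  # [57]
print(modes([1, 1, 2, 2, 3]))       # [1, 2]  (not unique)
print(modes([1, 2, 3]))             # []      (does not exist)
print(modes(["A", "B", "B", "O"]))  # ['B']   (qualitative data)
```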

Measures of Dispersion:
The dispersion of a set of observations refers to the variety that they exhibit. A measure of dispersion conveys information regarding the amount of variability present in a set of data. If all the values are the same, there is no dispersion; if they are not all the same, dispersion is present in the data. The amount of dispersion may be small when the values, though different, are close together.

First: Range:
One way to measure the variation in a set of values is to compute the range. The range is the difference between the largest and smallest values in a set of observations. If we denote the range by R, the largest value by xL, and the smallest value by xS, we compute the range as follows:
R = xL - xS
Ex: for the ages in the previous example:
Range = 66 - 38 = 28 years.
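In Python, the range of the ages sample can be sketched with the built-in max and min functions. The variable is called value_range here only to avoid clashing with Python's built-in range.

```python
ages = [43, 66, 61, 64, 65, 38, 59, 57, 57, 50]

# R = largest value - smallest value
value_range = max(ages) - min(ages)
print(value_range)  # 28
```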

Second: Variance:
The variance measures how far the values in a set of data are spread out from their mean.
To calculate the sample variance, follow these steps:
1)     Get the mean of the values.
2)     Subtract the mean from each value.
3)     Square each result of the subtraction.
4)     Add all the squared results from the previous step.
5)     Divide the total by (number of values in the sample - 1).
Final equation:

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

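Ex: using the ages sample from the earlier example.

Mean = 56.
By applying the equation:
$$s^2 = \frac{(43-56)^2 + (66-56)^2 + \dots + (50-56)^2}{10 - 1} = \frac{810}{9} = 90 \text{ years}^2$$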
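The five steps can be followed one by one in a short Python sketch. The function name sample_variance is ours, and it assumes at least two values.

```python
ages = [43, 66, 61, 64, 65, 38, 59, 57, 57, 50]


def sample_variance(values):
    """Return the sample variance, dividing by n - 1."""
    n = len(values)
    mean = sum(values) / n                                   # step 1
    squared_deviations = [(x - mean) ** 2 for x in values]   # steps 2 and 3
    total = sum(squared_deviations)                          # step 4
    return total / (n - 1)                                   # step 5


print(sample_variance(ages))  # 90.0
```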

Third: Standard Deviation:
The variance represents squared units and, therefore, is not an appropriate measure of dispersion when we wish to express this concept in terms of the original units. To obtain a measure of dispersion in original units, we merely take the square root of the variance. The result is called the standard deviation.
General formula:

$$s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

Properties of Standard deviation and Variance:
1)     Used to determine the spread of data, and to compare two or more sets of data to determine which is more variable.
2)     Used to determine the consistency of a variable.
3)     Used to determine the number of data values that fall within a specified interval in a distribution.
4)     Are used quite often in inferential statistics.
5)     Are sensitive to extreme values.
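To make the standard deviation concrete, here is a minimal Python sketch that takes the square root of the sample variance of the ages used earlier. The function name sample_std_dev is ours; the result is also compared with statistics.stdev from the Python standard library, which uses the same n - 1 divisor.

```python
import math
import statistics

ages = [43, 66, 61, 64, 65, 38, 59, 57, 57, 50]


def sample_std_dev(values):
    """Return the sample standard deviation (square root of the sample variance)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    return math.sqrt(variance)


s = sample_std_dev(ages)
print(round(s, 2))  # 9.49 (years, the original units)

# Cross-check against the standard library.
print(math.isclose(s, statistics.stdev(ages)))  # True
```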

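The coefficient of variation (C.V.) is used when one wishes to compare the dispersion in two sets of data. It expresses the standard deviation as a percentage of the mean.
The general formula:
$$C.V. = \frac{s}{\bar{x}} \times 100\%$$
where s is the standard deviation and x̅ is the mean.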


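For example: Suppose two samples of human males yield the following results:
25-year-olds: mean = 145, standard deviation = 10
11-year-olds: mean = 80, standard deviation = 10
To know which sample is more variable, we calculate the coefficient of variation for each:
For the 25-year-olds: C.V. = 10/145 × 100 = 6.9%
For the 11-year-olds: C.V. = 10/80 × 100 = 12.5%
The variation in the 11-year-old sample is therefore greater than that in the 25-year-old sample.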
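The comparison can be sketched in Python as follows. The function name coefficient_of_variation is ours, and the inputs are the standard deviations and means used in the calculation above.

```python
def coefficient_of_variation(std_dev, mean):
    """Return the coefficient of variation as a percentage of the mean."""
    return std_dev / mean * 100


cv_25 = coefficient_of_variation(10, 145)  # 25-year-olds
cv_11 = coefficient_of_variation(10, 80)   # 11-year-olds

print(round(cv_25, 1))  # 6.9
print(round(cv_11, 1))  # 12.5
print(cv_11 > cv_25)    # True: the 11-year-old sample is relatively more variable
```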

Stay tuned for our next topic!













