Health Information Systems for Low-Income Countries: An Overview
Appendix 5 - Descriptive statistics
Paul Fisher
Why use statistics?
Populations require health care decisions.
Health care decisions require health care information.
Health statistics are a kind of health information.
Natural variation exists among the individuals that make up a population.
Natural variation may mask differences between populations.
Tools are needed to reveal true differences.
The first step of any analysis is to describe your data.
The type of description depends on the nature of the data.

Shape of a Distribution
The shape of a distribution is its configuration of points when plotted on a graph.
The data's symmetry, modality, and kurtosis describe the shape.
Useful graphs for describing a distribution:

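A histogram is one graph commonly used to inspect shape (this choice is an assumption; no specific graph is named here). The sketch below prints a plain-text histogram, in which symmetry and the number of peaks (modality) can be read by eye, and computes moment-based skewness and excess kurtosis. The function names and the two small data sets are hypothetical, chosen only for illustration.

```python
from collections import Counter


def text_histogram(values, bin_width):
    """Return one text line per occupied bin, e.g. '  2-3   | *****'."""
    # Assign each integer value to the lower edge of its bin.
    counts = Counter((v // bin_width) * bin_width for v in values)
    lines = []
    for lower in sorted(counts):
        upper = lower + bin_width - 1
        lines.append(f"{lower:>3}-{upper:<3} | {'*' * counts[lower]}")
    return lines


def central_moment(values, k):
    """k-th central moment: the mean of (x - mean)**k."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** k for v in values) / len(values)


def skewness(values):
    """Zero for a symmetric distribution, positive for a long right tail."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Kurtosis relative to the normal distribution (which scores 0)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3


# Hypothetical data sets: one symmetric, one with a long right tail.
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
right_skewed = [1, 1, 1, 2, 2, 3, 4, 6, 10]

for line in text_histogram(symmetric, bin_width=2):
    print(line)
print("skewness (symmetric):", skewness(symmetric))
print("skewness (right-skewed):", skewness(right_skewed))
print("excess kurtosis (symmetric):", excess_kurtosis(symmetric))
```

The moment-based definitions used here are only one convention; statistical packages often apply small-sample corrections, so their results may differ slightly.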
Location of a Distribution
The location of a distribution is the position of its values when plotted on a graph.
The distribution's centre summarizes the location.
Common measures of location are:

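The mean, the median, and the mode are three widely used measures of location. A minimal sketch using Python's standard `statistics` module follows; the clinic-visit counts are made-up illustrative data, not taken from any survey.

```python
import statistics

# Hypothetical data: number of clinic visits per person in a small sample.
visits = [1, 2, 2, 3, 4, 4, 4, 5, 20]

mean_visits = statistics.mean(visits)      # arithmetic average
median_visits = statistics.median(visits)  # middle value of the sorted data
mode_visits = statistics.mode(visits)      # most frequent value

print("mean:", mean_visits)
print("median:", median_visits)
print("mode:", mode_visits)
```

In this example the single large value (20) pulls the mean above the median, which is one reason the median is often preferred for skewed data.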
Dispersion of a Distribution
The dispersion of a distribution is its spread (variability) around some central point.
The dispersion is as important as the location of a distribution.
Common measures of dispersion are:

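The range, the interquartile range, the variance, and the standard deviation are frequently used measures of dispersion. The sketch below, which assumes Python 3.8 or later for `statistics.quantiles`, compares two hypothetical samples that share the same mean but differ in spread. The district names and values are invented for illustration.

```python
import statistics

# Two hypothetical samples with the same mean (50) but different spread.
district_a = [48, 49, 50, 51, 52]
district_b = [30, 40, 50, 60, 70]


def describe_spread(values):
    """Return several common measures of dispersion for a sample."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    return {
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "variance": statistics.variance(values),  # divides by n - 1
        "sd": statistics.stdev(values),           # square root of the variance
    }


spread_a = describe_spread(district_a)
spread_b = describe_spread(district_b)
print("district A:", spread_a)
print("district B:", spread_b)
```

Reporting only the mean would make the two districts look identical, which illustrates why dispersion matters as much as location.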
Association between Variables
Association refers to the degree to which values "go together".
If the values of two variables tend to move in the same direction, a positive association is said to exist.
If the values of two variables tend to move in opposite directions, a negative association is said to exist.
If there is no association, variables are said to be independent.
Examples of measures of association include:

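One widely used measure for two quantitative variables is Pearson's correlation coefficient r, which ranges from -1 (perfect negative linear association) to +1 (perfect positive linear association). The sketch below computes it from first principles; the function name and the small data sets are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y_same_direction = [2, 4, 6, 8, 10]      # rises with x
y_opposite_direction = [10, 8, 6, 4, 2]  # falls as x rises
y_no_linear_trend = [3, 1, 4, 1, 3]      # no linear trend with x

print("positive:", pearson_r(x, y_same_direction))
print("negative:", pearson_r(x, y_opposite_direction))
print("none:", pearson_r(x, y_no_linear_trend))
```

Note that r measures linear association only: a value near zero rules out a linear trend but does not by itself establish that two variables are independent.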
Inference is the capacity to say something about a population based on examination of a sample or samples.
Regardless of the inferential method used, it is important to keep clearly in mind the distinction between:
the parameters (numerical summaries of the population) being inferred, and
the statistics (numerical summaries of the sample) used to infer them.
Parameters and Statistics
Different symbols are used to represent sample statistics and population parameters. For example:
p̂ ("p hat") may be used to represent a sample proportion (statistic)
p may be used to represent a population proportion (parameter)
x̄ ("x bar") may be used to represent a sample mean (statistic)
µ may be used to represent a population mean (parameter)
A statistic is a number that can be computed from data, involving no unknown parameters. As a function of a random sample, a statistic is a random variable. Statistics are used to estimate parameters, and to test hypotheses.
A parameter is a numerical property of a population, such as its mean.
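The statement that a statistic is a random variable, while a parameter is a fixed property of the population, can be illustrated by simulation. The sketch below assumes a small hypothetical population (the integers 1 to 100) whose mean µ is known, and draws repeated random samples from it. The seed value and sample sizes are arbitrary choices.

```python
import random
import statistics

# Hypothetical population: the integers 1..100, so the parameter mu is 50.5.
population = list(range(1, 101))
mu = statistics.mean(population)

rng = random.Random(2005)  # fixed seed so the run is repeatable


def sample_mean(n):
    """Draw a simple random sample of size n and return its mean (x-bar)."""
    return statistics.mean(rng.sample(population, n))


# The parameter stays fixed; the statistic changes from sample to sample.
sample_means = [sample_mean(10) for _ in range(1000)]

print("population mean (parameter):", mu)
print("first five sample means (statistics):", sample_means[:5])
print("average of all sample means:", statistics.mean(sample_means))
```

Individual sample means scatter around µ, but their average across many samples settles close to it, which is why the sample mean is used to estimate the population mean.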
A variable is a numerical value or a characteristic that can differ from individual to individual.
Types of variables are:
Qualitative Variables
Categorical Variables
Ordinal Variables
Random Variables
Quantitative Variables
Discrete Variables
Continuous Variables
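Because the type of description depends on the nature of the data, the variable type determines which summary is appropriate. The sketch below is one possible arrangement, not a standard API: the `describe` function, its `kind` labels, and the example data are all hypothetical. It assumes categorical data are summarized by counts, ordinal data by the median category, and quantitative data by the mean and standard deviation.

```python
import statistics
from collections import Counter


def describe(values, kind, levels=None):
    """Summarize values according to the (hypothetical) kind label."""
    if kind == "categorical":
        # Unordered categories: a frequency table is the natural summary.
        return {"counts": dict(Counter(values))}
    if kind == "ordinal":
        # Ordered categories: take the median position within `levels`.
        positions = [levels.index(v) for v in values]
        return {"median": levels[statistics.median_low(positions)]}
    if kind == "quantitative":
        # Numeric values: summarize location and dispersion.
        return {"mean": statistics.mean(values), "sd": statistics.stdev(values)}
    raise ValueError(f"unknown kind: {kind}")


# Hypothetical example data.
sex = ["F", "M", "F", "F", "M"]
severity = ["mild", "severe", "moderate", "moderate", "mild"]
weights_kg = [3.1, 2.9, 3.4, 3.6, 3.0]

print(describe(sex, "categorical"))
print(describe(severity, "ordinal", levels=["mild", "moderate", "severe"]))
print(describe(weights_kg, "quantitative"))
```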
A population is a collection of units being studied and about which information is desired. Units can be people, programs, institutions, time periods, procedures, etc.
Much of statistics is concerned with inferring the characteristics of an entire population (parameters) from the characteristics of a random sample of units from the population.
Population parameters are:
A sample is a subset (size = n) of the population (size = N) selected for study.
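Selecting a sample of size n from a population of size N can be sketched with Python's standard `random` module. The household identifiers, the sizes N = 500 and n = 25, and the function name are hypothetical; the sketch assumes simple random sampling without replacement, which is only one of several possible sampling designs.

```python
import random

# Hypothetical population of N = 500 household identifiers.
N = 500
population = [f"household-{i:03d}" for i in range(1, N + 1)]


def draw_simple_random_sample(units, n, seed=None):
    """Select n distinct units, each subset of size n being equally likely."""
    rng = random.Random(seed)  # a seed makes the selection reproducible
    return rng.sample(units, n)


sample = draw_simple_random_sample(population, n=25, seed=1)

print("population size N:", len(population))
print("sample size n:", len(sample))
print("first three sampled units:", sample[:3])
```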
Sample Statistics
Median
Standard Deviation
Variance
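These sample statistics can be computed directly from their usual textbook definitions. The sketch below assumes the conventional forms: the sample mean is the sum of the values divided by n, the sample variance divides the sum of squared deviations by n - 1, and the standard deviation is the square root of the variance. The function names and the data are hypothetical.

```python
import math


def sample_mean(xs):
    """Sum of the observations divided by the sample size n."""
    return sum(xs) / len(xs)


def sample_median(xs):
    """Middle value of the sorted data (average of the two middle values if n is even)."""
    ordered = sorted(xs)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def sample_variance(xs):
    """Sum of squared deviations from the mean, divided by n - 1."""
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def sample_sd(xs):
    """Square root of the sample variance."""
    return math.sqrt(sample_variance(xs))


# Hypothetical sample of n = 8 observations.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print("mean:", sample_mean(data))
print("median:", sample_median(data))
print("variance:", sample_variance(data))
print("standard deviation:", sample_sd(data))
```

Dividing by n - 1 rather than n is the usual choice when the sample is used to estimate the population variance; dividing by N applies when the data are the whole population.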
© 2005 Canadian Society for International Health and the Contributors. Last update: 2005-06-28