Introduction to Statistics for Nursing Students

1. Definition and Use of Statistics

What is Statistics?

Statistics is a branch of mathematics concerned with collecting, analyzing, interpreting, and presenting data to describe patterns, make predictions, and draw meaningful conclusions.

Why Statistics Matter in Nursing

Clinical Practice

Interpreting patient vital signs and lab values
Understanding medication efficacy rates
Evaluating treatment outcomes
Monitoring disease prevalence

Research

Designing nursing research studies
Analyzing research findings
Interpreting published research
Contributing to evidence-based practice

Mnemonic: “CARE”

Collect data that matters

Analyze data accurately

Report findings clearly

Evaluate implications for practice

Example: Statistics in Action

A nurse collects blood pressure readings from 50 patients before and after implementing a new relaxation technique. Using statistical analysis, the nurse can determine if the technique significantly reduces blood pressure and make evidence-based recommendations for practice.

2. Scales of Measurement

In statistics, data is classified into four measurement scales, each with different properties and appropriate statistical methods.

Nominal Scale

Definition: Categorizes data without any order or ranking.

Properties: Categories are mutually exclusive with no numerical significance.

Nursing Examples:

Gender (male/female)
Blood type (A, B, AB, O)
Diagnosis codes
Hospital unit categories

Appropriate Statistics: Mode, frequency counts, percentages, chi-square test

Ordinal Scale

Definition: Categories with a clear order or ranking, but without equal intervals.

Properties: Ordered categories, but differences between values aren’t consistent.

Nursing Examples:

Pain scales (0-10)
Pressure ulcer staging (I-IV)
Likert scales (strongly disagree to strongly agree)
Triage categories (emergent, urgent, non-urgent)

Appropriate Statistics: Median, mode, percentiles, Spearman correlation

Interval Scale

Definition: Ordered values with equal intervals but no true zero point.

Properties: Equal spacing between values, but ratios aren’t meaningful.

Nursing Examples:

Temperature in Celsius or Fahrenheit
Calendar dates
IQ scores
Some mental health assessment scores

Appropriate Statistics: Mean, median, mode, standard deviation, Pearson correlation

Ratio Scale

Definition: Ordered values with equal intervals and a true zero point.

Properties: Highest level of measurement; ratios are meaningful.

Nursing Examples:

Blood pressure readings
Weight, height
Lab values (hemoglobin, glucose)
Time measurements (length of hospital stay)

Appropriate Statistics: All statistical measures, including geometric mean and coefficient of variation

Mnemonic: “NOIR”

Nominal (Names) – Categories without order

Ordinal (Order) – Ranked but unequal intervals

Interval (Intervals) – Equal gaps, no true zero

Ratio (Ratios) – Equal gaps with true zero

Example: Identifying Measurement Scales

A nurse researcher is studying patient recovery after surgery and collects the following data:

Nominal: Surgical procedure type (appendectomy, cholecystectomy, etc.)
Ordinal: Post-operative pain levels (mild, moderate, severe)
Interval: Body temperature (°C)
Ratio: Length of hospital stay in days

Understanding these scales helps the researcher select appropriate statistical tests for analysis.
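
The link between a scale and its statistics can be sketched as a simple lookup. This is only an illustration: the dictionary copies the "Appropriate Statistics" lists from this section, and the helper name stats_for is hypothetical, not part of any statistics package.

```python
# Appropriate statistics per measurement scale, copied from the lists in this section.
APPROPRIATE_STATS = {
    "nominal": ["mode", "frequency counts", "percentages", "chi-square test"],
    "ordinal": ["median", "mode", "percentiles", "Spearman correlation"],
    "interval": ["mean", "median", "mode", "standard deviation", "Pearson correlation"],
    "ratio": ["all statistical measures, including geometric mean and coefficient of variation"],
}


def stats_for(scale: str) -> list[str]:
    """Return the statistics listed for a scale (hypothetical helper)."""
    return APPROPRIATE_STATS[scale.lower()]


# Three of the researcher's variables from the example above.
variables = {
    "surgical procedure type": "nominal",
    "post-operative pain level": "ordinal",
    "length of hospital stay (days)": "ratio",
}
for name, scale in variables.items():
    print(f"{name} ({scale}): {', '.join(stats_for(scale))}")
```

A higher scale can also use the statistics of the scales below it, so the lookup shows the typical choices rather than a strict limit.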

3. Frequency Distribution and Graphical Presentation

What is a Frequency Distribution?

A frequency distribution is an organized tabulation of individual data values showing the frequency (count) or percentage of observations in each data category or interval.

Types of Frequency Distributions

Simple Frequency Distribution

Shows the number of occurrences for each data value.

Example: Blood Types of 50 Patients

Blood Type	Frequency
A	20
B	12
AB	5
O	13

Relative Frequency Distribution

Shows each value's share of the total number of observations, as a proportion or percentage.

Example: Blood Types of 50 Patients

Blood Type	Frequency	Relative Frequency
A	20	40%
B	12	24%
AB	5	10%
O	13	26%
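
As a minimal sketch, the relative frequencies in the table can be recomputed in Python. The variable names are illustrative, and the 50 observations are rebuilt from the counts in the table rather than taken from real patient records.

```python
from collections import Counter

# Rebuild the 50 blood-type observations from the counts in the table above.
observations = ["A"] * 20 + ["B"] * 12 + ["AB"] * 5 + ["O"] * 13

frequency = Counter(observations)  # simple frequency distribution
total = sum(frequency.values())    # 50 patients

# Relative frequency = count / total, expressed as a percentage.
relative = {blood_type: 100 * count / total for blood_type, count in frequency.items()}

for blood_type in ["A", "B", "AB", "O"]:
    print(f"{blood_type}\t{frequency[blood_type]}\t{relative[blood_type]:.0f}%")
```

The percentages should always add up to 100%, which is a quick check that no observation was missed.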

Cumulative Frequency Distribution

Shows the accumulation of frequencies up to each data value.

Example: Patient Ages in Hospital Ward

Age Group	Frequency	Cumulative Frequency
18-30	8	8
31-45	12	20
46-60	15	35
61-75	10	45
76-90	5	50
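
The cumulative column is a running total of the frequency column. A short sketch, assuming the frequencies are already listed in age order:

```python
from itertools import accumulate

age_groups = ["18-30", "31-45", "46-60", "61-75", "76-90"]
frequencies = [8, 12, 15, 10, 5]

# accumulate() yields running totals: 8, 8 + 12, 8 + 12 + 15, ...
cumulative = list(accumulate(frequencies))

for group, freq, cum in zip(age_groups, frequencies, cumulative):
    print(f"{group}\t{freq}\t{cum}")
```

The last cumulative value equals the total number of patients (50), which is a useful check on the table.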

Grouped Frequency Distribution

Groups continuous data into intervals or classes.

Example: Systolic Blood Pressure Readings

Blood Pressure (mmHg)	Frequency
90-109	5
110-129	18
130-149	14
150-169	8
170-189	5
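
Grouping can be sketched as counting how many values fall into each class. The helper group_into_classes is hypothetical and assumes whole-number readings and classes of equal width. The raw readings behind the table above are not given, so the sketch uses the nine systolic readings from the worked example in Section 4 of this guide.

```python
def group_into_classes(values, lower, width, n_classes):
    """Count how many values fall in each class of equal width (hypothetical helper)."""
    counts = [0] * n_classes
    for value in values:
        index = (value - lower) // width  # which class the value belongs to
        if 0 <= index < n_classes:        # ignore values outside the table
            counts[index] += 1
    labels = [
        f"{lower + i * width}-{lower + (i + 1) * width - 1}" for i in range(n_classes)
    ]
    return dict(zip(labels, counts))


# Nine systolic readings (mmHg) from the worked example in Section 4.
readings = [118, 124, 136, 128, 142, 118, 132, 145, 128]
grouped = group_into_classes(readings, lower=90, width=20, n_classes=5)

for label, count in grouped.items():
    print(f"{label}\t{count}")
```

Classes must not overlap, so each reading is counted exactly once.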

Graphical Presentation of Data

[Figure: Types of frequency distributions in healthcare statistics]

Bar Graph

Best for: Categorical (nominal and ordinal) data

Nursing Applications:

Comparing patient demographics
Displaying medication frequencies
Showing disease incidence by department

Histogram

Best for: Continuous (interval and ratio) data

Nursing Applications:

Displaying distribution of patient ages
Showing distribution of lab values
Analyzing length of hospital stays

Pie Chart

Best for: Showing proportions of a whole

Nursing Applications:

Allocation of nursing time to different tasks
Distribution of patient diagnoses
Budget allocation in healthcare facilities

Line Graph

Best for: Showing trends over time

Nursing Applications:

Tracking vital signs over time
Monitoring infection rates
Following patient improvement during treatment

Box Plot

Best for: Showing distribution and identifying outliers

Nursing Applications:

Comparing lab values between patient groups
Analyzing pain scores across treatments
Studying distribution of hospital readmission times

Scatter Plot

Best for: Showing relationships between two variables

Nursing Applications:

Examining relationship between BMI and blood pressure
Studying correlation between stress and sleep quality
Analyzing association between age and recovery time

Mnemonic: “GRAPHS”

Group data meaningfully

Represent visually with suitable charts

Analyze patterns and trends

Present findings clearly

Highlight key insights

Support with appropriate statistics

Example: Choosing the Right Graph

A nurse manager wants to analyze data about pressure ulcer incidence in different hospital units:

Bar graph: To compare pressure ulcer rates between different units
Pie chart: To show distribution of pressure ulcer stages
Line graph: To track pressure ulcer rates over the past 12 months
Scatter plot: To examine relationship between length of stay and pressure ulcer development

Each graph type reveals different insights from the same dataset.

4. Mean, Median, Mode, and Standard Deviation

Measures of Central Tendency

Measures of central tendency describe the center or typical value of a dataset. The three main measures are mean, median, and mode.

Mean

Definition: The arithmetic average of all values.

Mean (x̄) = Σx / n

Where:

Σx = sum of all values
n = number of values

Best used when: Data is normally distributed without extreme outliers.

Nursing Example: Average heart rate of patients in a unit.

Median

Definition: The middle value when data is arranged in order.

How to find:

Arrange data in ascending order
If n is odd, median is the middle value
If n is even, median is average of two middle values

Best used when: Data has outliers or is skewed.

Nursing Example: Median length of hospital stay.

Mode

Definition: The most frequently occurring value(s).

Properties:

Data can have one mode (unimodal)
Data can have two modes (bimodal)
Data can have more than two modes (multimodal)
Data can have no mode

Best used when: Identifying the most common category or value.

Nursing Example: Most common chief complaint in ER.

Example: Calculating Mean, Median, and Mode

A nurse collected systolic blood pressure readings (mmHg) from 9 patients:

118, 124, 136, 128, 142, 118, 132, 145, 128

Mean:

Sum = 118 + 124 + 136 + 128 + 142 + 118 + 132 + 145 + 128 = 1171

Mean = 1171 ÷ 9 = 130.1 mmHg

Median:

Ordered: 118, 118, 124, 128, 128, 132, 136, 142, 145

Median = 128 mmHg (5th value)

Mode:

118 appears twice

128 appears twice

Mode = 118 and 128 mmHg (bimodal)
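
The same three measures can be checked with Python's built-in statistics module. This is a sketch for checking hand calculations, not a required method.

```python
import statistics

readings = [118, 124, 136, 128, 142, 118, 132, 145, 128]  # systolic BP, mmHg

mean = statistics.mean(readings)        # sum of values / number of values
median = statistics.median(readings)    # middle value of the sorted data
modes = statistics.multimode(readings)  # every value tied for most frequent

print(f"Mean:   {mean:.1f} mmHg")
print(f"Median: {median} mmHg")
print(f"Modes:  {modes}")
```

multimode returns a list, so it handles bimodal data such as this example, where both 118 and 128 appear twice.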

Measure of Dispersion: Standard Deviation

Standard Deviation

Standard deviation (SD) measures how spread out the values in a dataset are from the mean. It indicates the typical distance between each data point and the mean.

Standard Deviation (σ) = √[(Σ(x – x̄)²) / n]

Where:

x = each individual value
x̄ = mean of all values
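n = number of values (this formula treats the data as a complete population; when the data are a sample, the sum is divided by n – 1 instead of n)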
Σ = sum of

[Figure: Normal distribution curve with mean and standard deviations marked]

Interpreting Standard Deviation

Small SD: Data points are close to the mean (less variability)
Large SD: Data points are spread out from the mean (more variability)
In a normal distribution, approximately:
- 68% of data falls within ±1 SD of the mean
- 95% of data falls within ±2 SD of the mean
- 99.7% of data falls within ±3 SD of the mean

Nursing Applications

Understanding lab reference ranges (typically mean ±2 SD)
Identifying abnormal vital signs
Comparing variability between patient groups
Evaluating consistency of clinical measurements
Interpreting research findings

Example: Standard Deviation in Practice

A hospital unit measures the time (in minutes) it takes to administer medications to patients:

8, 12, 9, 15, 10

Step 1: Calculate the mean

Mean = (8 + 12 + 9 + 15 + 10) ÷ 5 = 54 ÷ 5 = 10.8 minutes

Step 2: Calculate deviations from the mean and square them

Value (x)	Deviation (x – x̄)	(x – x̄)²
8	8 – 10.8 = -2.8	7.84
12	12 – 10.8 = 1.2	1.44
9	9 – 10.8 = -1.8	3.24
15	15 – 10.8 = 4.2	17.64
10	10 – 10.8 = -0.8	0.64
Sum of squared deviations:		30.8

Step 3: Calculate standard deviation

SD = √(30.8 ÷ 5) = √6.16 = 2.48 minutes

This means that the medication administration times typically vary by about ±2.48 minutes from the mean time of 10.8 minutes.
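
The three steps above can be sketched in Python. The code follows the formula used in this guide (dividing by n) and, for comparison, also shows the version that divides by n – 1, which is the usual choice when the data are a sample from a larger group. Variable names are illustrative.

```python
import statistics

times = [8, 12, 9, 15, 10]  # minutes, from the example above

# Step 1: mean
mean = statistics.mean(times)

# Step 2: squared deviations from the mean
squared_deviations = [(x - mean) ** 2 for x in times]

# Step 3: square root of the average squared deviation
population_sd = (sum(squared_deviations) / len(times)) ** 0.5    # divide by n
sample_sd = (sum(squared_deviations) / (len(times) - 1)) ** 0.5  # divide by n - 1

print(f"mean = {mean:.1f} min")
print(f"population SD = {population_sd:.2f} min")
print(f"sample SD = {sample_sd:.2f} min")
```

The n – 1 version comes out slightly larger (about 2.77 minutes here). The statistics module provides both as pstdev (population) and stdev (sample).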

Mnemonic: “MMM-SD”

Mean for normal distributions without outliers

Median for skewed data or when outliers present

Mode for most common category

Standard Deviation for understanding variation

5. Normal Probability and Tests of Significance

Normal Probability Distribution

The normal distribution (also called the Gaussian distribution or bell curve) is a continuous probability distribution that is symmetrical around its mean. Many biological and health measurements approximately follow this distribution.

Properties of the Normal Distribution

Key Characteristics

Bell-shaped and symmetrical around the mean
Mean, median, and mode are all equal
Defined by two parameters: mean (μ) and standard deviation (σ)
Total area under the curve equals 1 (100% probability)
Extends infinitely in both directions, with the curve approaching but never reaching zero

The 68-95-99.7 Rule

68% of data falls within ±1 SD of the mean
95% of data falls within ±2 SD of the mean
99.7% of data falls within ±3 SD of the mean

This rule is essential for interpreting lab values, vital signs, and other health measurements.

Example: Normal Distribution in Nursing

Hemoglobin levels in adult women approximately follow a normal distribution with a mean (μ) of 14 g/dL and a standard deviation (σ) of 1 g/dL. Using the 68-95-99.7 rule:

68% of women have hemoglobin between 13-15 g/dL (14 ± 1)
95% of women have hemoglobin between 12-16 g/dL (14 ± 2)
99.7% of women have hemoglobin between 11-17 g/dL (14 ± 3)

Values outside the 95% range (below 12 or above 16) might be considered clinically significant and warrant further investigation.
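
The ranges in this example can be checked with NormalDist from Python's statistics module. This is a sketch that assumes hemoglobin is exactly normal with the mean and SD given above, which real data only approximate.

```python
from statistics import NormalDist

hemoglobin = NormalDist(mu=14, sigma=1)  # g/dL, parameters from the example above

shares = {}
for k in (1, 2, 3):
    low = hemoglobin.mean - k * hemoglobin.stdev
    high = hemoglobin.mean + k * hemoglobin.stdev
    # Area under the curve between the two limits
    shares[k] = hemoglobin.cdf(high) - hemoglobin.cdf(low)
    print(f"{low:.0f}-{high:.0f} g/dL (within {k} SD): {shares[k]:.1%}")
```

The exact areas are about 68.3%, 95.4%, and 99.7%, which is why the rule is quoted with rounded figures.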

Tests of Significance

Hypothesis Testing and Statistical Significance

Statistical significance testing helps determine whether observed results are likely due to chance or represent a real effect. This process involves formulating and testing hypotheses.

Hypothesis Testing Process

State hypotheses:
- Null hypothesis (H₀): No effect or relationship
- Alternative hypothesis (H₁): An effect or relationship exists
Set significance level: Usually α = 0.05
Select appropriate test: Based on data type and research question
Calculate test statistic and p-value
Make decision: Reject or fail to reject null hypothesis
Interpret results: Clinical significance vs. statistical significance

Common Statistical Tests

Test	When to Use
t-test	Compare means of two groups
ANOVA	Compare means of three or more groups
Chi-square	Compare proportions/categorical data
Pearson’s r	Measure linear correlation
Mann-Whitney U	Compare two groups (non-parametric)
Wilcoxon	Compare paired observations (non-parametric)

P-value Interpretation

A p-value is the probability of observing results at least as extreme as the current results if the null hypothesis were true.

p < 0.05: Results are statistically significant; reject null hypothesis
p ≥ 0.05: Results are not statistically significant; fail to reject null hypothesis

Important: Statistical significance does not always equal clinical significance. A statistically significant result may have little practical importance in patient care.

Example: Hypothesis Testing in Nursing Research

A nurse researcher wants to test if a new pain management protocol reduces post-operative pain scores compared to standard care.

H₀: No difference in pain scores between new protocol and standard care
H₁: New protocol results in lower pain scores than standard care
Test: Independent samples t-test
Results: Mean pain score (standard care) = 6.8, Mean pain score (new protocol) = 5.3, p = 0.028
Conclusion: Since p < 0.05, the researcher rejects the null hypothesis and concludes that the new protocol significantly reduces pain scores compared to standard care.
Clinical significance: A reduction of 1.5 points on a 10-point pain scale may be clinically meaningful for patients and influence practice.
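
The decision step in this example can be sketched as a small function. The name decide is hypothetical, and the sketch only applies the p < α rule to figures already reported above. It does not run the t-test itself, which would need the raw pain scores.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the decision rule: reject H0 only when p < alpha (hypothetical helper)."""
    if p_value < alpha:
        return "statistically significant: reject the null hypothesis"
    return "not statistically significant: fail to reject the null hypothesis"


# Figures reported in the example above.
mean_standard, mean_new, p_value = 6.8, 5.3, 0.028

print(decide(p_value))
print(f"Observed difference: {mean_standard - mean_new:.1f} points")
```

The function only answers the statistical question. Whether a 1.5-point difference matters to patients is a separate clinical judgment.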

Mnemonic: “NURSE”

Null hypothesis statement

Understand what test to use

Run statistical analysis

Significant or not? Check p-value

Evaluate clinical importance

6. Coefficient of Correlation

What is Correlation?

Correlation measures the strength and direction of the linear relationship between two variables. The correlation coefficient (r) quantifies this relationship.

[Figure: Correlation coefficient scatter plots showing positive, negative, and no correlation]

Properties of Correlation Coefficient

Key Characteristics

Values range from -1 to +1
+1 indicates perfect positive correlation
-1 indicates perfect negative correlation
0 indicates no linear correlation
Correlation does not imply causation
Measures only linear relationships

Interpreting Correlation Strength

Correlation Value	Interpretation
0.00 – 0.19	Very weak
0.20 – 0.39	Weak
0.40 – 0.59	Moderate
0.60 – 0.79	Strong
0.80 – 1.00	Very strong

Note: Same scale applies to negative values.
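
The table can be sketched as a lookup function. The name describe_strength is hypothetical. The cut-offs are the ones in the table above, treated as continuous bands so that a value such as 0.195 still gets a label. Other textbooks use slightly different cut-offs.

```python
def describe_strength(r: float) -> str:
    """Label |r| using the bands in the table above (hypothetical helper)."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)  # the same bands apply to negative values
    if size < 0.20:
        return "very weak"
    if size < 0.40:
        return "weak"
    if size < 0.60:
        return "moderate"
    if size < 0.80:
        return "strong"
    return "very strong"


for r in (0.15, -0.45, 0.72, -0.85):
    direction = "positive" if r > 0 else "negative"
    print(f"r = {r:+.2f}: {describe_strength(r)} {direction} correlation")
```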

Types of Correlation Coefficients

Pearson’s Correlation Coefficient (r)

Measures linear relationship between two continuous variables
Assumes normal distribution and linear relationship
Formula:
r = Σ[(x – x̄)(y – ȳ)] / √[Σ(x – x̄)² Σ(y – ȳ)²]
Nursing Example: Correlation between BMI and blood pressure

Spearman’s Rank Correlation (rho)

Non-parametric alternative to Pearson’s
Used when data is ordinal or does not meet assumptions for Pearson’s
Measures monotonic relationships (when variables tend to change together, but not necessarily at a constant rate)
Nursing Example: Correlation between pain scale ratings and medication dosage
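
Spearman's rho can be sketched as Pearson's r calculated on ranks instead of raw values. The helper names below are hypothetical, tied values receive the average of their ranks, and the numbers are purely illustrative (not patient data): y always rises with x, but not at a constant rate.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1                # extend over a run of tied values
        shared = (i + j) / 2 + 1  # average of positions i..j, counted from 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson's r from deviations around the two means."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Illustrative numbers only: a monotonic but non-linear relationship.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(f"Pearson r = {pearson(x, y):.3f}, Spearman rho = {spearman(x, y):.3f}")
```

Because the ranks of x and y agree perfectly, rho equals 1, while Pearson's r is slightly below 1 because the points do not lie on a straight line.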

Example: Correlation in Nursing Research

A nurse researcher collected data on hours of sleep (x) and anxiety scores (y) from 10 patients:

Patient	Hours of Sleep (x)	Anxiety Score (y)
1	4	9
2	5	7
3	6	6
4	7	5
5	8	4
6	3	10
7	9	3
8	5	8
9	7	4
10	6	5

Calculating Pearson’s correlation coefficient gives r ≈ -0.97 (Σ(x – x̄)(y – ȳ) = -37, Σ(x – x̄)² = 30, Σ(y – ȳ)² = 48.9, so r = -37 / √(30 × 48.9))

Interpretation:

The negative sign indicates an inverse relationship: as hours of sleep increase, anxiety scores decrease
The magnitude (0.97) indicates a very strong correlation
This suggests that sleep and anxiety are strongly related in this patient group
Note: This doesn’t prove that lack of sleep causes anxiety or vice versa (correlation ≠ causation)
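
As a check on the hand calculation, Pearson's r can be recomputed directly from the ten pairs in the table. This is a minimal sketch of the formula given earlier in this section. The function name pearson_r is illustrative.

```python
hours_of_sleep = [4, 5, 6, 7, 8, 3, 9, 5, 7, 6]
anxiety_score = [9, 7, 6, 5, 4, 10, 3, 8, 4, 5]


def pearson_r(x, y):
    """r = sum of (x - mean_x)(y - mean_y), divided by the square root of
    (sum of squared x deviations) * (sum of squared y deviations)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / (sum_xx * sum_yy) ** 0.5


r = pearson_r(hours_of_sleep, anxiety_score)
print(f"r = {r:.2f}")
```

With only 10 patients, the value should be read with caution, since correlations from small samples can change a lot when a single point is added or removed.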

Important Considerations

Correlation does not imply causation: Two variables may be correlated because they are both influenced by a third variable
Outliers can significantly affect correlation: Always visualize data with a scatter plot
Correlation only measures linear relationships: Two variables may have a strong non-linear relationship even if r is close to 0
Sample size matters: Correlations from small samples may not be reliable

Mnemonic: “CORDS”

Correlation value (-1 to +1)

Observe the direction (positive or negative)

Review the strength (weak, moderate, strong)

Don’t assume causation

Scatter plot to visualize

7. Statistical Packages and Applications

Statistical Software in Nursing Research

Statistical packages are specialized software designed to perform statistical analyses efficiently and accurately. These tools are essential for managing and analyzing data in nursing research.

Common Statistical Packages

SPSS

Statistical Package for the Social Sciences

Most widely used in nursing research
User-friendly point-and-click interface
Comprehensive statistical capabilities
Excellent for survey data analysis
Powerful data visualization tools

R

Open-source Statistical Software

Free and open-source
Highly flexible with extensive packages
Superior graphics capabilities
Powerful for advanced statistical methods
Steeper learning curve (programming-based)

SAS

Statistical Analysis System

Enterprise-level statistics software
Excellent for large datasets
Highly reliable and validated
Comprehensive data management tools
Common in healthcare organizations

Microsoft Excel

Spreadsheet with Statistical Functions

Widely available and accessible
Good for basic statistics
Suitable for small datasets
Built-in data visualization tools
Limited advanced statistical capabilities

Stata

Integrated Statistical Software

Balance of usability and power
Strong in epidemiological research
Excellent documentation
Good data management capabilities
Popular in healthcare research

EpiInfo

CDC-developed Software

Free software from CDC
Designed for epidemiology research
Easy to learn and use
Good for survey and questionnaire design
Limited advanced statistics

Applications in Nursing

Research Applications

Analyzing clinical trial data
Processing survey responses
Determining intervention effectiveness
Exploring relationships between variables
Testing nursing theories
Meta-analysis of existing research

Clinical Applications

Quality improvement initiatives
Patient outcome tracking
Performance monitoring
Resource utilization analysis
Risk assessment modeling
Decision support systems

Example: Statistical Software in Action

A nurse researcher is studying the impact of a new discharge education protocol on readmission rates for heart failure patients.

Research Process:

Data collection from medical records
Entering data into SPSS
Data cleaning and validation
Descriptive statistics generation
Comparative analysis (t-test)
Creating visualization for findings
Interpreting and reporting results

Statistical Tools Used:

SPSS for primary analysis
Excel for initial data management
Descriptive statistics: mean, median, standard deviation
Inferential statistics: independent samples t-test
Kaplan-Meier survival curve for readmission timing
Bar graphs and line charts for visual presentation

Tips for Choosing Statistical Software

Consider your needs: Research goals, sample size, complexity of analysis
Evaluate your skills: Some programs require more statistical and technical knowledge
Check availability: Many institutions provide licenses for specific software
Assess support resources: Training, documentation, and community help
Consider future needs: Software that can grow with your research skills

Mnemonic: “STATS”

Select appropriate software for your needs

Train yourself properly before analysis

Analyze data with the right statistical tests

Thoroughly document your process

Share results with clear visualizations
