Section 5: Data Analysis

Step 4: Confirm your data analysis methods

You and your partners should agree on and confirm the "recipes" to be used. A recipe specifies the finished product, the ingredients, and the directions. Similarly, a data analysis method identifies the goal or objective, the set of variables or themes, and the step-by-step procedure for carrying out the analysis.

Your analytical methods may be used to describe, compare, or predict variables or themes, or the relationships among them, as proposed in the goals, objectives, or evaluation questions.

A. Describe variables or themes

For quantitative data, there are three primary ways to describe variables:

  • Descriptive statistics summarize the distribution of the data using measures of central tendency and spread. Measures of central tendency include the mean, median, and mode. Measures of spread include range, variance, and standard deviation. Descriptive statistics can be easily summarized in tables or graphs.
  • Frequencies are counts of occurrences for a particular variable or theme. Frequencies may summarize several data points, or cases, at once and are often reported as a percent. Displaying data across different variables in cross tabulations can further illustrate the data distribution.
  • Geocoded data use geographic coordinates (latitude and longitude) to spatially orient the data. These data are often shown in maps.
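The descriptive statistics, frequencies, and cross tabulations described above can be sketched with Python's standard library. All data below (walking minutes, travel modes, income groups) are made up for illustration and are not drawn from any evaluation.

```python
import statistics
from collections import Counter

# Hypothetical data: minutes of daily walking reported by eight respondents.
minutes = [10, 20, 20, 30, 35, 40, 45, 60]

summary = {
    # Measures of central tendency
    "mean": statistics.mean(minutes),
    "median": statistics.median(minutes),
    "mode": statistics.mode(minutes),
    # Measures of spread
    "range": max(minutes) - min(minutes),
    "variance": statistics.variance(minutes),  # sample variance
    "std_dev": statistics.stdev(minutes),      # sample standard deviation
}

# Hypothetical categorical data: each respondent's usual travel mode
# and income group, listed in the same respondent order.
modes = ["walk", "car", "walk", "bus", "car", "walk", "bike", "walk"]
income = ["lower", "higher", "lower", "lower", "higher", "higher", "lower", "lower"]

# Frequencies: counts per category, also expressed as percents.
counts = Counter(modes)
percents = {mode: 100 * n / len(modes) for mode, n in counts.items()}

# Cross tabulation: counts for each (income group, travel mode) pair.
crosstab = Counter(zip(income, modes))

print(summary)
print(percents)
print(crosstab)
```

The sample (rather than population) variance and standard deviation are used here on the assumption that the respondents are a sample from a larger group.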

For qualitative data, there are two main approaches to identifying and describing themes, or coding data from qualitative sources:

  • Deductive data analysis involves interpretive techniques designed to extract and summarize different themes and the frequency of those themes in a dataset (e.g., transcripts from an interview or focus group, elements of a policy identified through policy analysis) using preset codes derived from the intervention goals and objectives or the evaluation questions.
  • Inductive data analysis uses focused coding procedures to identify indigenous or emergent themes, or ideas and concepts derived from the data; themes are inductively organized into categories, or sensitizing concepts, to create new themes or codes or to replicate existing themes or codes.2,3

For deductive qualitative data analysis, qualitative analysts frequently use software applications (e.g., ATLAS.ti, NVivo) to help organize and code the data.
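As a rough illustration of deductive coding with preset codes, the sketch below tags interview excerpts using a small codebook and counts how often each theme appears. The codebook, keywords, and excerpts are hypothetical, and keyword matching is a deliberate simplification: real coding is interpretive and is normally done by trained analysts, not by string search.

```python
from collections import Counter

# Hypothetical preset codes, each with keywords that signal the theme.
codebook = {
    "safety": ["unsafe", "traffic", "crosswalk"],
    "access": ["distance", "far", "sidewalk"],
}

# Hypothetical excerpts from interview transcripts.
excerpts = [
    "The traffic near the school feels unsafe.",
    "There is no sidewalk and the park is too far.",
    "We need a crosswalk by the bus stop.",
]

def apply_codes(text, codebook):
    """Return the set of codes whose keywords appear in the text."""
    lowered = text.lower()
    return {
        code
        for code, keywords in codebook.items()
        if any(keyword in lowered for keyword in keywords)
    }

# Count each theme once per excerpt in which it appears.
theme_counts = Counter(
    code for excerpt in excerpts for code in apply_codes(excerpt, codebook)
)
print(theme_counts)
```

Substring matching will also catch unintended words (for example, "far" inside "farm"), which is one reason human review of every coded excerpt remains necessary.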

B. Compare or predict variables or relationships among variables

Most data analysis methods used for comparison or prediction fall into the universe of inferential statistics.

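In contrast to descriptive statistics, which represent observed properties of the data, inferential statistics are used to make inferences about a population based on observations in a sample and on the assumptions of the model used. Parametric models assume the data follow a particular distribution (commonly applied to continuous data), while non-parametric models make fewer distributional assumptions (commonly applied to categorical or ranked data).

These models are used to assess how likely it is that a result seen in the sample arose by chance alone rather than reflecting a true pattern in the population. This is expressed as the p-value: the probability of observing a result at least as extreme as the one in the sample if there were no true effect. The p-value is compared with a chosen level of significance, and p < .05 is a commonly used threshold.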

Simple inferential statistical analysis methods are commonly used to compare variables to one another or to compare variables by different samples.

  • Correlations (parametric) are used to determine the degree of correspondence between two variables; these analyses do not assess causality (i.e., whether one variable causes the other to occur). Correlations are used to determine whether the correspondence between or among variables is strong or weak and positive or negative.
  • T-tests (parametric) or Chi-Square Analyses (non-parametric) are used to determine whether there is a significant difference between two groups on a specified variable. T-tests compare the means of the two groups and Chi-Square Analyses compare proportions for the two groups.
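The test statistics behind the correlation, t-test, and chi-square analyses described above can be sketched in plain Python. This is a minimal illustration, not a replacement for a statistics package: it computes the statistics only, and the p-values would normally come from statistical software. The t statistic shown is Welch's variant, which does not assume equal variances in the two groups; that choice is an assumption of this sketch.

```python
import math
import statistics

def pearson_r(x, y):
    """Pearson correlation: direction and strength of a linear relationship."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def welch_t(a, b):
    """Welch's t statistic: difference in group means over its standard error."""
    va, vb = statistics.variance(a), statistics.variance(b)
    standard_error = math.sqrt(va / len(a) + vb / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error

def chi_square(table):
    """Chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical 2x2 table: rows are exposed/unexposed to a campaign,
# columns are whether the respondent reported walking to work.
table = [[20, 10], [10, 20]]
print(chi_square(table))
```

For a 2x2 table (one degree of freedom), a chi-square statistic above about 3.84 corresponds to p < .05.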

More complex comparisons examine relationships between independent and dependent variables.

Again, dependent variables typically reflect the outputs as well as short-term, intermediate, and long-term outcomes from the logic model.

Independent variables represent all the other factors (i.e., populations and samples, intervention, contextual conditions) that may explain any changes in the dependent variables over time.

  • Analysis of Variance (ANOVA; parametric) assesses relationships of independent variables to dependent variables by examining differences in means among populations or evaluation samples or subsamples (e.g., higher or lower income groups, exposed or unexposed to pedestrian safety campaign messages, residents living in a mixed-use versus traditional development, pre- versus post-intervention). Relationships involving multiple independent variables (e.g., two-way ANOVA) or multiple dependent variables (Multivariate Analysis of Variance, MANOVA) can also be assessed through this analysis method.
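A one-way ANOVA compares the variation between group means with the variation within groups. The sketch below computes the resulting F statistic in plain Python; the three groups are hypothetical scores, and obtaining a p-value from F would require statistical software.

```python
import statistics

def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA across two or more groups."""
    all_values = [value for group in groups for value in group]
    grand_mean = statistics.mean(all_values)
    k = len(groups)      # number of groups
    n = len(all_values)  # total number of observations

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(group) * (statistics.mean(group) - grand_mean) ** 2
        for group in groups
    )
    # Variation of observations around their own group mean.
    ss_within = sum(
        (value - statistics.mean(group)) ** 2
        for group in groups
        for value in group
    )
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

# Hypothetical scores for three neighborhoods.
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_stat = one_way_anova_f(groups)
print(f_stat)
```

A larger F means the group means differ more than the spread within groups would suggest.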

Simple and complex analysis methods are also designed to predict relationships among variables or to incorporate mixed-methods analysis, or triangulation, reflecting a range of different data sources.

  • Regression analysis is used to determine whether one variable (predictor) can be used to predict the outcome of another variable (criterion). The output from a regression analysis identifies statistically significant predictors of the criterion as well as the amount of variation in the criterion that is explained by the set of predictors.
  • Health impact assessment (HIA) is a process to evaluate the potential health effects of a plan, project, or policy before it is built or implemented; the major steps include screening, scoping, assessment, recommendations, reporting, and monitoring and evaluation.4
  • Spatial Analysis uses geometric (points, lines, shapes) and geographic (latitude and longitude coordinates) data to analyze variables and their connection to places.
  • Model building is used to go beyond simple comparisons and predictions with a small set of variables to examine complex relationships among a large set of variables. Some prominent analytical methods include:
    • Hierarchical linear modeling is designed to work with nested data (e.g., analyzing people within organizations within communities).
    • Structural equation modeling identifies direct and indirect pathways among independent and dependent variables and tests these relationships as well as the fit of the entire model of paths and relationships.
    • Economic modeling includes cost-effectiveness analysis, cost-benefit analysis, and cost utility analysis.
    • Systems modeling includes Markov decision-choice models, agent-based models, system dynamics models, and theoretical mathematical models.
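For the regression analysis described in the list above, the simplest case is a single predictor and a single criterion. The sketch below fits a least-squares line and reports R-squared, the share of variation in the criterion explained by the predictor. The data are hypothetical, and significance testing of the predictor is left to statistical software.

```python
import statistics

def simple_linear_regression(x, y):
    """Least-squares fit of y = intercept + slope * x, plus R-squared."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx

    # R-squared: 1 minus unexplained variation over total variation.
    ss_residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_total = sum((b - my) ** 2 for b in y)
    r_squared = 1 - ss_residual / ss_total
    return slope, intercept, r_squared

# Hypothetical data: miles of sidewalk (predictor) and
# weekly walking trips per resident (criterion).
sidewalk_miles = [1, 2, 3, 4]
walking_trips = [3, 5, 7, 9]
slope, intercept, r_squared = simple_linear_regression(sidewalk_miles, walking_trips)
print(slope, intercept, r_squared)
```

With several predictors, the same idea extends to multiple regression, which is usually run in a statistics package rather than by hand.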


  1. Patton M. Qualitative research and evaluation methods. 3rd ed. Thousand Oaks, CA: Sage; 2002.
  2. Bowen G. Grounded theory and sensitizing concepts. International Journal of Qualitative Methods 2006;5(3).
