Propensity Score Analysis

Learn how propensity scores work and when to use them, then run your own analysis: upload a test/control dataset and see how balanced your groups are.

Learn

1. What are propensity scores?

A propensity score is the probability that a person was assigned to the treatment (test) group, given their observed characteristics. In plain English: it's a single number that condenses what you know about a respondent (their age, attitudes, behaviours) into how likely they were to end up in the test group.

Introduced by Rosenbaum & Rubin (1983), propensity scores are a cornerstone of causal inference. They let you compare apples to apples: instead of directly comparing test and control groups (which may differ systematically), you compare people with similar propensity scores.
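Written as a formula, the definition above looks like this. The symbols are our own notation for illustration (T for the group indicator, X for the observed characteristics), not something the tool exposes:

```latex
e(x) = \Pr(T = 1 \mid X = x), \qquad 0 \le e(x) \le 1
```

Here T = 1 means the person is in the test group and T = 0 means control. Two respondents with the same value of e(x) were equally likely to be in the test group, as far as their observed characteristics go.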

2. When should you use them?

Propensity scores are most valuable when you cannot randomise perfectly, which is most of the time in marketing and consumer research. Use them when:

• Your test and control groups weren't randomly assigned (observational data)
• You suspect the groups differ on key background characteristics
• You want to check balance after an experiment ran
• You're weighting or matching respondents to create a fairer comparison

They're commonly applied in media mix modelling, brand lift studies, A/B test diagnostics, and causal attribution.

3. How are they calculated?

The standard approach is logistic regression:

1. Combine your test and control groups into one dataset
2. Create a binary outcome: 1 = Test, 0 = Control
3. Use your pre-treatment variables (demographics, attitudes, behaviours) as predictors
4. Fit the model; the predicted probability for each row is its propensity score

Once you have scores, you assess balance by comparing the score distributions across test and control. Good overlap = well-balanced groups. Poor overlap = systematic differences that could bias your results.

The Standardized Mean Difference (SMD) is the standard metric: the difference between the test and control means of a variable, divided by the pooled standard deviation. SMD < 0.10 is considered well balanced; SMD > 0.25 signals concern.
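The steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the tool's actual implementation: the data is synthetic, the feature names (`age_z`, `usage_z`) are made up, and the logistic regression is fitted with plain gradient descent on already-standardized features. In practice you would more likely use a statistics library.

```python
import math
import random
from statistics import mean, variance

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit logistic regression by batch gradient descent.

    Returns (intercept, weights). Assumes features are roughly standardized.
    """
    n, k = len(X), len(X[0])
    b, w = 0.0, [0.0] * k
    for _ in range(epochs):
        grad_b, grad_w = 0.0, [0.0] * k
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(k):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
    return b, w

def predict(X, b, w):
    """Predicted probability of being in the test group: the propensity score."""
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]

def smd(test_values, control_values):
    """Absolute standardized mean difference with a pooled standard deviation."""
    pooled_sd = math.sqrt((variance(test_values) + variance(control_values)) / 2)
    return abs(mean(test_values) - mean(control_values)) / pooled_sd

# Steps 1-3: one combined dataset, binary outcome (1 = Test, 0 = Control),
# pre-treatment predictors. Synthetic data: assignment depends on age_z only.
random.seed(0)
feature_names = ["age_z", "usage_z"]  # hypothetical standardized features
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(400)]
y = [1 if random.random() < sigmoid(1.0 * age_z) else 0 for age_z, _ in X]

# Step 4: fit the model; each predicted probability is a propensity score.
b, w = fit_logistic(X, y)
scores = predict(X, b, w)

# Balance check: SMD per feature between test and control.
smd_by_feature = {}
for j, name in enumerate(feature_names):
    test_vals = [x[j] for x, g in zip(X, y) if g == 1]
    ctrl_vals = [x[j] for x, g in zip(X, y) if g == 0]
    smd_by_feature[name] = smd(test_vals, ctrl_vals)
    print(f"{name}: SMD = {smd_by_feature[name]:.3f}")
```

Because assignment in this made-up dataset depends on `age_z`, that feature should show a large SMD, while `usage_z`, which plays no role in assignment, should show a much smaller one.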

Try it

Upload a CSV with test and control groups, or use the sample dataset. The tool will calculate propensity scores and show you how well your groups are balanced.

Upload your data

Needs a group column (test / control) and numeric feature columns.
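As a rough illustration of that layout, the sketch below parses a small CSV into group labels and numeric features. The column name `group`, the feature columns `age` and `weekly_visits`, and the helper `load_groups` are all hypothetical; the tool's exact expectations beyond "a group column (test / control) and numeric feature columns" are not specified here.

```python
import csv
import io

# Made-up example file: one group column plus numeric feature columns.
sample_csv = """group,age,weekly_visits
test,34,5
control,41,2
test,29,7
control,38,3
"""

def load_groups(text, group_col="group"):
    """Split CSV text into 0/1 group labels and per-row numeric features."""
    reader = csv.DictReader(io.StringIO(text))
    labels, features = [], []
    for row in reader:
        group = row.pop(group_col).strip().lower()
        if group not in ("test", "control"):
            raise ValueError(f"unexpected group value: {group!r}")
        labels.append(1 if group == "test" else 0)
        # float() raises ValueError if a feature column is not numeric
        features.append({name: float(value) for name, value in row.items()})
    return labels, features

labels, features = load_groups(sample_csv)
```

A non-numeric feature value or an unrecognised group label raises a `ValueError` here, which is one simple way to catch a malformed file before any modelling.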