match module
This module implements several variants of matching: one-to-one matching, one-to-many matching, with or without a caliper, and with or without replacement. Variants of the methods are examined in Austin (2014).
Austin, P. C. (2014), A comparison of 12 algorithms for matching on the propensity score. Statistics in Medicine, 33: 1057-1069.
-
class pscore_match.match.Match(groups, propensity)
Parameters: - groups (array-like) – treatment assignments, must be 2 groups
- propensity (array-like) – object containing propensity scores for each observation. Propensity and groups should be in the same order (matching indices)
-
create(method='one-to-one', **kwargs)
Parameters: - method (string) – ‘one-to-one’ (default) or ‘many-to-one’
- caliper_scale (string) – “propensity” (default) if caliper is a maximum difference in propensity scores, “logit” if caliper is a maximum SD of logit propensity, or “none” for no caliper
- caliper (float) – specifies maximum distance (difference in propensity scores or SD of logit propensity)
- replace (bool) – should individuals from the larger group be allowed to match multiple individuals in the smaller group? (default is False)
Returns: A series containing the individuals in the control group matched to the treatment group. Note that with caliper matching, some treated individuals may be left without a match.
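The options above can be illustrated with a minimal sketch of greedy nearest-neighbour matching. This is not the implementation behind create; the function name greedy_match, the coding of treated units as 1 and controls as 0, and the plain-dict return value are all assumptions made for the example.

```python
import math
import statistics


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def greedy_match(groups, propensity, caliper=0.05,
                 caliper_scale="propensity", replace=False):
    """Hypothetical greedy one-to-one matcher, for illustration only.

    Returns a dict mapping each matched treated index to a control index.
    """
    if caliper_scale == "logit":
        # Distances are measured on the logit scale and the caliper is
        # read as a multiple of the SD of the logit propensity.
        scores = [logit(p) for p in propensity]
        max_dist = caliper * statistics.stdev(scores)
    elif caliper_scale == "propensity":
        scores = list(propensity)
        max_dist = caliper
    elif caliper_scale == "none":
        scores = list(propensity)
        max_dist = math.inf
    else:
        raise ValueError("unknown caliper_scale: %r" % caliper_scale)

    # Assumption: 1 marks treated units, 0 marks controls.
    treated = [i for i, g in enumerate(groups) if g == 1]
    available = [i for i, g in enumerate(groups) if g == 0]

    pairs = {}
    for t in treated:
        if not available:
            break
        # Closest control that is still available.
        best = min(available, key=lambda c: abs(scores[c] - scores[t]))
        if abs(scores[best] - scores[t]) <= max_dist:
            pairs[t] = best
            if not replace:
                # Without replacement, a control can be used only once.
                available.remove(best)
    return pairs


groups = [1, 1, 1, 0, 0, 0, 0]
propensity = [0.80, 0.55, 0.30, 0.78, 0.50, 0.52, 0.10]
matches = greedy_match(groups, propensity, caliper=0.05)
print(matches)
```

In this toy data the treated unit with propensity 0.30 has no control within the caliper, so it is left out of the result, which is the situation the note on caliper matching describes.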
-
plot_balance(covariates, test=['t', 'rank'], filename='balance-plot', **kwargs)
Plot the p-values for covariate balance before and after matching
Parameters: - matches (Match) – Match class object with matches already fit
- covariates (DataFrame) – DataFrame with all observations and one covariate per column.
- test (array-like or str) – Statistical test to compare treatment and control covariate distributions. Options are ‘t’ for a two sample t-test or ‘rank’ for Wilcoxon rank sum test
- filename (str) – Optional, name of file to save plot in. Default ‘balance-plot’
- kwargs (dict) – Keyword arguments to pass into plotly.offline.plot
Return type: None
Notes
Creates a file with given filename
-
pscore_match.match.rank_test(covariates, groups)
Wilcoxon rank sum test for the distribution of treatment and control covariates.
Parameters: - covariates (DataFrame) – Dataframe with one covariate per column. If matches are with replacement, then duplicates should be included as additional rows.
- groups (array-like) – treatment assignments, must be 2 groups
Returns: A list of p-values, one for each column in covariates
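As a rough sketch of what a per-column rank sum test computes, the example below uses the two-sided normal approximation with no tie or continuity correction. It is not the code behind rank_test: the helper names are hypothetical, and a plain dict of lists stands in for the covariates DataFrame.

```python
from statistics import NormalDist


def midranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        shared = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def rank_sum_pvalue(x, y):
    """Two-sided Wilcoxon rank sum p-value (normal approximation)."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    w = sum(ranks[:n1])                      # rank sum of the first sample
    mean = n1 * (n1 + n2 + 1) / 2.0          # expected rank sum under H0
    sd = (n1 * n2 * (n1 + n2 + 1) / 12.0) ** 0.5
    z = (w - mean) / sd
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def rank_pvalues(covariates, groups):
    """One p-value per covariate column (hypothetical helper)."""
    labels = sorted(set(groups))
    if len(labels) != 2:
        raise ValueError("groups must contain exactly 2 groups")
    pvalues = []
    for column in covariates.values():
        x = [v for v, g in zip(column, groups) if g == labels[0]]
        y = [v for v, g in zip(column, groups) if g == labels[1]]
        pvalues.append(rank_sum_pvalue(x, y))
    return pvalues


groups = [1, 1, 1, 1, 0, 0, 0, 0]
covariates = {
    "age": [61, 64, 58, 66, 41, 39, 45, 43],   # clearly imbalanced
    "bmi": [24, 27, 25, 28, 26, 23, 29, 22],   # roughly balanced
}
pvalues = rank_pvalues(covariates, groups)
print(pvalues)
```

A small p-value for a column suggests that the covariate is distributed differently in the two groups, which is what a balance check before and after matching looks for.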
-
pscore_match.match.t_test(covariates, groups)
Two sample t test for the distribution of treatment and control covariates
Parameters: - covariates (DataFrame) – Dataframe with one covariate per column. If matches are with replacement, then duplicates should be included as additional rows.
- groups (array-like) – treatment assignments, must be 2 groups
Returns: A list of p-values, one for each column in covariates
-
pscore_match.match.whichMatched(matches, data, show_duplicates=True)
Simple function to convert the output of Match to a DataFrame of all matched observations
Parameters: - matches (Match) – Match class object with matches already fit
- data (DataFrame) – Dataframe with unique rows, for which we want to create new matched data. This may be a dataframe of covariates, treatment, outcome, or any combination.
- show_duplicates (bool) – Should repeated matches be included as multiple rows? Default is True. If False, then duplicates appear as one row but a column of weights is added.
Returns: DataFrame containing only the treatment group and matched controls, with the same columns as input data
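The two output shapes controlled by show_duplicates can be sketched as follows. This is an illustration rather than the library's code: a dict of row dicts stands in for the DataFrame, matched_rows is a hypothetical helper, and both the column name "weight" and the use of raw match counts as weights are assumptions.

```python
from collections import Counter


def matched_rows(data, matched_ids, show_duplicates=True):
    """Hypothetical helper that builds a matched data set.

    data        : dict mapping a unique row id to a dict of column values
    matched_ids : row ids in the matched sample; an id appears more than
                  once when a control is reused (matching with replacement)
    """
    if show_duplicates:
        # One output row per appearance, so reused controls are repeated.
        return [dict(data[i]) for i in matched_ids]
    # One output row per distinct id, with the number of appearances
    # stored in an extra "weight" column (assumed name and scaling).
    counts = Counter(matched_ids)
    return [dict(data[i], weight=n) for i, n in counts.items()]


data = {
    0: {"treated": 1, "age": 61},
    1: {"treated": 1, "age": 58},
    2: {"treated": 0, "age": 60},
    3: {"treated": 0, "age": 40},   # never matched, so never returned
}
matched_ids = [0, 1, 2, 2]          # control 2 is matched to both treated rows

long_form = matched_rows(data, matched_ids)
weighted = matched_rows(data, matched_ids, show_duplicates=False)
print(len(long_form), len(weighted))
```

Under these assumptions the two forms carry the same information: the weights in the compact form sum to the number of rows in the long form.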