
Summarize units that rank consistently high across models
Source: R/model_agreement.R
summarize_top_units.Rd
Aggregates lower-level rows to a chosen unit level, ranks units within each model, and summarizes which units most consistently appear near the top across models. This is useful for questions such as "Which books consistently have the strongest effects across models?"
Usage
summarize_top_units(
  data,
  outcome = "mean_outcome",
  item_by = "book_id",
  rank_within = NULL,
  model_col = "model",
  top_n = 3,
  higher_is_better = TRUE,
  standardize = c("z", "none", "minmax", "max"),
  include_ranks = FALSE,
  drop_missing = TRUE
)
Arguments
- data
A data frame with one row per model-by-unit combination.
- outcome
Character string naming the score column (default "mean_outcome").
- item_by
Character vector identifying the items to rank, e.g. "book" or "book_id".
- rank_within
Optional character vector defining separate ranking contexts, e.g. "party" to rank books separately within party.
- model_col
Character string naming the model column (default "model").
- top_n
Integer. Number of top-ranked items to count for each model.
- higher_is_better
Logical. If TRUE (default), larger outcome values receive better ranks. If FALSE, smaller values receive better ranks.
- standardize
Character. How to standardize item scores within each model before computing cross-model mean scores. "z" (default) centers and scales scores within model; "none" keeps raw scores; "minmax" rescales scores within model to 0–1; "max" divides scores within model by that model's maximum absolute score. Ranks are unchanged by monotonic standardization, but mean_score and point sizes in plot_top_units() use the standardized scores.
- include_ranks
Logical. If TRUE, return a list with both the summary table and the model-level ranks. If FALSE (default), return only the summary table.
- drop_missing
Logical. Whether to drop rows with missing model, item, or ranking-context identifiers before aggregating (default TRUE).
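The four standardize options can be illustrated in base R. This is a minimal sketch of the transformations as described above, not the package's internal code: standardize_scores() is a hypothetical helper, and the scores are made-up values for a single model.

```r
# Made-up scores for four items within one model (illustration only).
scores <- c(book_a = 2.0, book_b = -1.0, book_c = 4.0, book_d = 0.5)

# Hypothetical helper mirroring the documented options; the package's own
# implementation may differ (e.g. in how it handles constant scores).
standardize_scores <- function(x, method = c("z", "none", "minmax", "max")) {
  method <- match.arg(method)
  switch(
    method,
    z      = (x - mean(x, na.rm = TRUE)) / stats::sd(x, na.rm = TRUE),
    none   = x,
    minmax = (x - min(x, na.rm = TRUE)) /
      (max(x, na.rm = TRUE) - min(x, na.rm = TRUE)),
    max    = x / max(abs(x), na.rm = TRUE)
  )
}

methods <- c("z", "none", "minmax", "max")

# One column per method, one row per item.
scaled <- sapply(methods, function(m) standardize_scores(scores, m))
print(round(scaled, 3))

# Rank 1 = largest score, i.e. higher_is_better = TRUE.
ranks <- apply(scaled, 2, function(x) rank(-x, ties.method = "min"))
print(ranks)
```

Every column of ranks is identical because each transformation preserves the ordering of scores within the model; only the scale of the values that feed mean_score changes. In this sketch, "z" and "minmax" are undefined when all scores in a model are equal, a case the real function may treat differently.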
Value
A tibble, or a list with summary and ranks when include_ranks = TRUE.
The summary table contains:
- rank_within columns
Optional grouping columns used to define separate ranking contexts, such as party.
- item_by columns
The ranked item identifiers, such as book.
- mean_score
Mean outcome score for the item across models.
- score_scale
The score standardization method used for mean_score.
- mean_rank
Average rank of the item across models. Lower values indicate more consistently high-ranked items when higher_is_better = TRUE.
- overall_mean_rank
When rank_within is supplied, the item's average rank computed without those ranking contexts. This preserves a common item order for subgroup displays.
- median_rank
Median rank of the item across models.
- top_n_models
Number of models that ranked the item within the top top_n items in its ranking context. For example, if top_n = 3 and top_n_models = 4, then 4 models placed that item in their top 3.
- n_models
Number of models with non-missing ranks for the item.
- top_n
The top-N threshold used to compute top_n_models.
- top_n_label
Compact display label combining top_n_models and n_models, such as "4/5".
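To make the rank-based summary columns concrete, the following base R sketch computes them from a small table of per-model ranks. It assumes the ranks are already available and uses made-up data with a top_n of 2; it illustrates the column definitions above and is not the package's implementation.

```r
# Made-up per-model ranks for four books across five models.
ranks <- data.frame(
  model   = rep(c("m1", "m2", "m3", "m4", "m5"), each = 4),
  book_id = rep(c("A", "B", "C", "D"), times = 5),
  rank    = c(1, 2, 3, 4,
              2, 1, 4, 3,
              1, 3, 2, 4,
              3, 1, 2, 4,
              1, 2, 4, 3)
)
top_n <- 2  # threshold chosen for this illustration

# Collect each book's ranks across models.
by_item <- split(ranks$rank, ranks$book_id)

summary_tbl <- data.frame(
  book_id      = names(by_item),
  mean_rank    = vapply(by_item, mean, numeric(1), na.rm = TRUE),
  median_rank  = vapply(by_item, median, numeric(1), na.rm = TRUE),
  # Number of models that placed the book within the top `top_n`.
  top_n_models = vapply(by_item, function(r) sum(r <= top_n, na.rm = TRUE),
                        integer(1)),
  # Number of models with a non-missing rank for the book.
  n_models     = vapply(by_item, function(r) sum(!is.na(r)), integer(1)),
  row.names    = NULL
)
summary_tbl$top_n <- top_n
summary_tbl$top_n_label <- paste0(summary_tbl$top_n_models, "/",
                                  summary_tbl$n_models)

# Most consistently high-ranked books first.
summary_tbl <- summary_tbl[order(summary_tbl$mean_rank), ]
print(summary_tbl)
```

Here book A is ranked 1, 2, 1, 3, 1 across the five models, so its mean_rank is 1.6 and four of five models place it in the top 2, giving a top_n_label of "4/5".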
When include_ranks = TRUE, the ranks table contains one row per
model-by-item combination, including score, rank, and top_n.
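A ranks table of this shape could be built along the following lines. This is a sketch in base R with made-up scores. The tie-breaking rule (ties.method = "min"), the in_top_n flag, and storing top_n as a constant threshold column are assumptions for illustration, not confirmed behavior of summarize_top_units().

```r
# Made-up scores: one row per model-by-item combination.
scores <- data.frame(
  model   = rep(c("m1", "m2"), each = 3),
  book_id = rep(c("A", "B", "C"), times = 2),
  score   = c(0.9, 0.4, 0.7,
              0.2, 0.8, 0.5)
)
higher_is_better <- TRUE
top_n <- 2

# rank() gives rank 1 to the smallest value, so flip the sign when larger
# scores should receive better (lower) ranks.
direction <- if (higher_is_better) -1 else 1

# Rank items separately within each model; missing scores stay NA.
scores$rank <- ave(
  direction * scores$score, scores$model,
  FUN = function(x) rank(x, ties.method = "min", na.last = "keep")
)

scores$top_n <- top_n                       # threshold (assumed layout)
scores$in_top_n <- scores$rank <= top_n     # hypothetical helper column
print(scores)
```

Passing an additional grouping vector to ave(), for example a party column, would rank items separately within each ranking context, which is the role rank_within plays in the function.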