In Survivor, is entering the merge down on numbers a disadvantage?

In Survivor it is usually seen as beneficial to enter the merge with the numbers. If you have the majority, you have a good chance of voting out those in the minority and making it further in the game. Makes sense, but often things don’t go according to plan.

I was curious about this because, in a 3 tribe setup, no single tribe has the majority at merge. This could be the reason for the standard 3 tribe set up in the new era.

Here is a Bayesian analysis to estimate the probability of making the final 2 or 3 and winning the season for those that entered the merge down on numbers.

TL;DR

Do those in the smallest tribe at merge have a lower chance of making the final 2 or 3?

  • Yes – slightly lower, by approximately 14%.
  • Out of 34 relevant seasons, there have been 27 players that have made the final. We would expect 33.

Do those in the smallest tribe at merge have a lower chance of winning the season given they made the final?

  • No – They may actually have a slightly higher probability than those with the numbers, approximately 13% higher.
  • Out of 20 relevant seasons, there have been 13 cases. We would expect 11.

The takeaway is that those that enter the merge without the numbers face a harder time making it to the final. But those that do are really strong players and actually have an edge on winning the season.

There are many other factors that could affect their chances of making the final 2 or 3 and/or winning, but this simplifies the problem down to testing whether their chances differ from equal chance. Entering the merge down on the numbers ultimately affects the other key features, so it’s still a useful view.

All data is available in the survivoR R package.

Data setup and considerations

I’ll make the following considerations when setting up the data:

  • I’ll be using data from the US, AU, SA, and NZ versions.
  • I’ll only be including seasons that started with 2 tribes. When there are 3+ tribes it levels the playing field at the merge. Essentially two tribes need to unite in order to become the majority.
  • Seasons where the tribes enter the merge with the same numbers are removed, since there is no advantage/disadvantage to estimate.
  • Tribe swaps haven’t been included in the analysis but they would have an impact on balancing out the tribes, and the chances of winning at the merge. Important to keep in mind.
  • For the ‘winner’ model, I’ll only be considering cases where there are exactly 1 or 2 players from the smallest tribe in the final. This is because if there are 0 there is a 0% chance they will win the season, and if there are 3 there is a 100% chance, so we already know the outcome.

Summary of seasons

To start, I’ve listed the seasons which meet the conditions above. Those underlined are those who were in the smallest tribe at merge. It’s interesting that those from the smallest tribe won more often in the early seasons of Survivor but there has only been one in the later seasons. Kind of wild.

Model

Probability of making the final

Firstly, I’ll look at the probability of making the final 2/3. Since 0, 1, 2, or 3 players from the smallest tribe can make it to the final, depending on how many made it to the merge, the likelihood is a binomial. This isn’t the most precise model. Technically it’s a hypergeometric, since the finalists are drawn without replacement from the players at the merge, but the binomial is a good approximation and does what I need it to do.
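
To get a feel for how close the two distributions are, here is a minimal sketch in base R. The season in it is made up for illustration (10 players at the merge, 4 from the smallest tribe, a final 3) and is not taken from the data.

```r
# Hypothetical season, NOT from survivoR: 10 players at the merge,
# 4 of them from the smallest tribe, and a final 3.
n_merge    <- 10
n_smallest <- 4
n_final    <- 3
x_vals     <- 0:n_final   # possible number of finalists from the smallest tribe

# Exact under equal chance: finalists drawn without replacement
p_hyper <- dhyper(x_vals, m = n_smallest, n = n_merge - n_smallest, k = n_final)

# Binomial approximation: independent draws with probability 4/10
p_binom <- dbinom(x_vals, size = n_final, prob = n_smallest / n_merge)

# Side-by-side comparison of the two probability mass functions
round(rbind(x_vals, p_hyper, p_binom), 3)

# Expected number of finalists from the smallest tribe under each distribution
exp_hyper <- sum(x_vals * p_hyper)
exp_binom <- sum(x_vals * p_binom)
```

Both distributions have the same expected count (the number of finalists times the smallest tribe's share of the merge). The binomial just spreads a little more probability into the tails.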

A key consideration for the model is that each season can have anywhere from 16 to 20 players and the merge can happen at different stages. It can occur when there are 10 players left, giving each player an equal probability of 1/10, or when there are 13 players left, giving an equal probability of 1/13.

In which case I’ll specify the model as,

    \[\theta_i = \frac{\text{no. in smallest tribe at merge}}{\text{no. at merge}}\]

    \[\begin{array}{ll} y_i & \sim Binomial(n_i, \beta\theta_i) \\ \beta & \sim N(1, 0.2)\end{array}\]

Here y_i is the number of players from the smallest tribe that made the final in season i and n_i is the number of finalists. Under this formulation \beta is estimated across all seasons. If the median is 1, i.e. there is no adjustment to \theta, then there is no advantage or disadvantage to being in the smallest tribe at the merge. If \beta < 1 then there is a disadvantage to being in the smallest tribe at the merge.

I’m choosing a prior of N(1, 0.2) because I honestly don’t know if it will be above or below one but don’t expect it to stray too much. If there is a difference I want to be sure about it.
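
As a rough illustration of what this prior implies, the sketch below computes its central 95% interval in base R. It assumes the 0.2 is a standard deviation, and the variable names are only for illustration.

```r
# Prior on beta: normal with mean 1 and sd 0.2 (sd is an assumption here)
prior_mean <- 1
prior_sd   <- 0.2

# Central 95% interval of the prior
prior_interval <- qnorm(c(0.025, 0.975), mean = prior_mean, sd = prior_sd)

# Prior mass below 0, i.e. values of beta that could never be a valid multiplier
mass_below_zero <- pnorm(0, mean = prior_mean, sd = prior_sd)

round(prior_interval, 2)
mass_below_zero
```

So before seeing any data, most of the prior weight sits on an adjustment of roughly 40% either side of equal chance, and essentially none of it is on negative values.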

This model is fit in Stan.

library(survivoR)
library(tidyverse)
library(rstan)

add_tribe <- function(df, .tribe_status = NULL) {
  if(is.null(.tribe_status)) {
    out <- df |>
      left_join(
        survivoR::tribe_mapping |>
          distinct(version_season, episode, castaway_id, tribe),
        by = c("version_season", "episode", "castaway_id")
      )
  } else {
    out <- df |>
      left_join(
        survivoR::tribe_mapping |>
          filter(tribe_status == .tribe_status) |>
          distinct(version_season, castaway_id, tribe),
        by = c("version_season", "castaway_id")
      )
  }
  out
}

df <- survivoR::boot_mapping |>
  filter(
    tribe_status == "Merged",
    version_season != "SA05"
  ) |>
  group_by(version_season) |>
  slice_max(final_n) |>
  select(-tribe) |>
  add_tribe("Original") |>
  count(version_season, tribe) |>
  group_by(version_season) |>
  mutate(
    n_merge = sum(n),
    n_tribes = n(),
    rank = rank(n, ties.method = "min"),
    n_min = sum(rank == 1),
    size = ifelse(rank == 1, "n_smallest", "n_rest")
    ) |>
  filter(n_min == 1, n_tribes == 2) |>
  ungroup()

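# df_finalists (finalists per season and original tribe, counted in n_f3) and
# n_finalists (number of finalists per season) are assumed to be built beforehand.
df_f3 <- df |>
  filter(size == "n_smallest") |>
  left_join(df_finalists, by = c("version_season", "tribe")) |>
  left_join(n_finalists, by = "version_season") |>
  mutate(
    n_f3 = replace_na(n_f3, 0),  # no finalists from the smallest tribe
    p = n/n_merge,               # smallest tribe's share of the merge
    # expected number of finalists from the smallest tribe under equal chance
    exp_val = map2_dbl(p, n_finalists, ~sum(dbinom(0:.y, .y, .x)*0:.y))
  )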

stan_dat_f3 <- list(
  N = nrow(df_f3),
  y = df_f3$n_f3,
  p = df_f3$p,
  n = df_f3$n_finalists
)

mod_f3 <- stan(file = "scripts/stan/smallest-tribe-f3.stan", data = stan_dat_f3)

The following code is contained in smallest-tribe-f3.stan.

data {
  int<lower=0> N;
  array[N] int<lower=0, upper=3> y;
  array[N] real<lower=0, upper=1> p;
  array[N] int<lower=0> n;
}

parameters {
  real<lower=0> beta;
}

transformed parameters {
  array[N] real<lower=0, upper=1> kappa;
  for(k in 1:N) {
    kappa[k] = beta * p[k];
  }
}

model {
  beta ~ normal(1, 0.2);
  for(k in 1:N) {
    y[k] ~ binomial(n[k], kappa[k]);
  }
}
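
For intuition, the same kind of posterior can be roughly approximated on a grid in base R instead of sampling with Stan. This is a minimal sketch of a swapped-in technique, not the method used for the results in this post, and the three seasons in it are made-up numbers rather than survivoR data.

```r
# Hypothetical data, NOT taken from survivoR: three made-up seasons
y     <- c(1, 0, 2)          # finalists from the smallest tribe
n     <- c(3, 2, 3)          # number of finalists
theta <- c(0.4, 0.3, 0.45)   # smallest tribe's share of players at the merge

# Grid of candidate beta values (kept small enough that beta * theta < 1)
beta_grid <- seq(0.01, 2, by = 0.01)

# Unnormalised log posterior: normal(1, 0.2) prior plus binomial likelihood
log_post <- sapply(beta_grid, function(b) {
  dnorm(b, mean = 1, sd = 0.2, log = TRUE) +
    sum(dbinom(y, size = n, prob = b * theta, log = TRUE))
})

# Normalise so the grid probabilities sum to 1
post <- exp(log_post - max(log_post))
post <- post / sum(post)

# Posterior median: first grid point where the cumulative probability reaches 0.5
beta_median <- beta_grid[which(cumsum(post) >= 0.5)[1]]
beta_median
```

With so few made-up seasons the prior dominates, so the numbers from this sketch say nothing about the real results and are only there to show the mechanics.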

Expressed as a percentage change from 1 (i.e. \beta - 1), the median value of \beta is -14%. The distribution shows it’s different from 0%, although the evidence is not super strong: 10% of the posterior lies above 0%. This suggests there is somewhat of a disadvantage in making the final 3 for players entering the merge in the smallest tribe. This is in line with what we would intuitively expect.
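
A summary like this can be computed directly from posterior draws of \beta. Below is a minimal sketch: summarise_beta is a hypothetical helper, and the draws are simulated stand-ins rather than the fitted posterior.

```r
# Hypothetical helper: summarise draws of beta as a change relative to 1
summarise_beta <- function(beta_draws) {
  pct_change <- beta_draws - 1
  c(
    median_pct_change = median(pct_change),   # e.g. -0.14 means 14% lower
    prob_above_zero   = mean(pct_change > 0)  # share of draws above equal chance
  )
}

# Simulated stand-in draws, NOT the fitted posterior from this post
set.seed(1)
fake_draws <- rnorm(4000, mean = 0.9, sd = 0.1)
summarise_beta(fake_draws)

# With a fitted rstan model, the draws could be pulled out with something like:
# beta_draws <- rstan::extract(mod_f3, pars = "beta")$beta
```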

Probability of winning

Next, I’ll look at the probability of winning the season for a player in the smallest tribe given they made the final 3. Here the probability is defined as,

    \[\theta_i = \frac{\text{no. in smallest tribe in the final}}{\text{no. of finalists}}\]

    \[\begin{array}{ll} y_i & \sim Bernoulli(\beta\theta_i) \\ \beta & \sim N(1, 0.2)\end{array}\]

df_won <- df |>
  filter(size == "n_smallest") |>
  left_join(
    survivoR::boot_mapping |>
      filter(final_n == 3) |>
      distinct(version_season, castaway_id) |>
      add_tribe("Original") |>
      count(version_season, tribe, name = "n_f3"),
    by = c("version_season", "tribe")
  ) |>
  left_join(
    survivoR::castaways |>
      filter(result == "Sole Survivor") |>
      distinct(version_season, tribe = original_tribe) |>
      mutate(won = 1),
    by = c("version_season", "tribe")
  ) |>
  mutate(
    won = replace_na(won, 0),
    n_f3 = replace_na(n_f3, 0),
    p = n/n_merge,
    p_f3 = n_f3/3
    )

df_winner <- df_won |>
  filter(n_f3 > 0 & n_f3 < 3)

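# Use the filtered seasons (1 or 2 finalists from the smallest tribe);
# theta is the smallest tribe's share of the final, as defined in the model
stan_dat_tribe <- list(
  N = nrow(df_winner),
  y = df_winner$won,
  theta = df_winner$p_f3
)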

mod <- stan(file = "scripts/stan/smallest-tribe.stan", data = stan_dat_tribe)

The following code is contained in smallest-tribe.stan.

data {
  int<lower=0> N;
  array[N] int<lower=0, upper=1> y;
  array[N] real<lower=0, upper=1> theta;
}

parameters {
  real<lower=0> beta;
}

transformed parameters {
  array[N] real<lower=0, upper=1> kappa;
  for(k in 1:N) {
    kappa[k] = beta * theta[k];
  }
}

model {
  beta ~ normal(1, 0.2);
  y ~ bernoulli(kappa);
}

Expressed as a percentage change from 1, the median of \beta is 8%, and 79% of the posterior lies above 0%. This is some evidence that they have a slightly better chance, which is interesting, although it is only weak and I wouldn’t write home about it. It suggests that while they might have a tougher time making it to the final, if they do they stand a good chance of winning.

Final thoughts

It’s interesting that those that enter the merge down on the numbers have a tougher time making it to the final 2/3. However, those that do stand a good chance of winning, slightly better than those in the majority. Fair to say that those that make it to the final 3 are top players.

This could be the reason for the 3 tribe setup: to help balance the power at the merge. While the tribe swap would play a big part here, the 3 tribe setup means no tribe enters the merge with the majority, which encourages players to use the social game to build one.

There are obviously a lot of other things that would influence the probability of winning/making the final that haven’t been considered. But entering the merge down on the numbers would be correlated with many of the other influential features.
