
The post Ignoring the IID assumption isn’t a great idea appeared first on Dan Oehm | Gradient Descending.

The IID assumption (independent and identically distributed) is pretty important. Ignoring it can lead you to make incorrect conclusions (usually through pseudoreplication). Here’s a quick example.

You have 50 bags, each filled with 1 red and 9 green balls. You randomly draw 1 ball from each bag and record the colour. You draw the red ball 5 times and a green ball 45 times. Let’s put that into a 2×2 table.

```
x1 <- matrix(
c(405, 45, 45, 5),
nrow = 2,
dimnames = list(
c("Green", "Red"),
c("Not drawn", "Drawn")
)
)
> x1
      Not drawn Drawn
Green       405    45
Red          45     5
```

I’ll run a chi-squared test to see if there is a difference between drawing a green ball and drawing a red ball. Maybe the red balls are bigger, rougher, lighter, or stickier; something that makes them more likely to be drawn from the bag than the green balls.

```
> chisq.test(x1)
Pearson's Chi-squared test
data: x1
X-squared = 0, df = 1, p-value = 1
```

Nope. The p-value is 1 because this is exactly what we would expect to draw if the balls were identical apart from colour.
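You can see why directly from the expected counts: under independence they equal the observed counts exactly.

```r
# expected counts under independence match the observed table exactly
chisq.test(x1)$expected
#       Not drawn Drawn
# Green       405    45
# Red          45     5
```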

Now suppose you have 40 bags with 3 red balls and 1 green ball. It’s more likely you’ll draw red balls in this case. You draw 30 red balls and 10 green balls. Let’s put it into a 2×2 table and run the chi-squared test again.

```
x2 <- matrix(
c(30, 90, 10, 30),
nrow = 2,
dimnames = list(
c("Green", "Red"),
c("Not drawn", "Drawn")
)
)
> x2
      Not drawn Drawn
Green        30    10
Red          90    30
> chisq.test(x2)
Pearson's Chi-squared test
data: x2
X-squared = 0, df = 1, p-value = 1
```

Look at that: the p-value is 1; we drew exactly what was expected. We would conclude that the red and green balls are no different and that red balls are not more likely to be drawn than green balls.

Because we’re lazy, let’s combine the draws and run the chi-squared test again.

```
> x1 + x2
      Not drawn Drawn
Green       435    55
Red         135    35
> chisq.test(x1+x2)
Pearson's Chi-squared test with Yates' continuity correction
data: x1 + x2
X-squared = 8.6183, df = 1, p-value = 0.003328
```

Now, seemingly by magic, there is a difference. We might conclude that there is a difference between the balls and that the red ones are more likely to be drawn.

We know this isn’t correct though; we set this up so that each test was not significant and we drew the expected number of red balls. So why is this now showing a strong association?

It’s because the observations are not IID. The observation is the colour of the single ball drawn from each bag, not all 10 (or 4) balls in it. If one ball is drawn, none of the others can be.

The probability of drawing a red ball is 1/10 in the first set and 3/4 in the second. They are not the same, and the model needs to be set up in a way that accounts for that.

This problem can be structured as a regression problem and you’ll get the same result: in isolation the first and second examples are not significant, but pooled together they are.
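As a sketch of that regression framing (my reconstruction, not code from the post): rebuild ball-level data from the two 2×2 tables and fit a logistic regression of drawn ~ colour, with and without pooling.

```r
# reconstruct ball-level data from the two 2x2 tables above
balls <- function(green_nd, green_d, red_nd, red_d) {
  data.frame(
    colour = rep(c("green", "red"), c(green_nd + green_d, red_nd + red_d)),
    drawn  = c(rep(0:1, c(green_nd, green_d)), rep(0:1, c(red_nd, red_d)))
  )
}
set1 <- balls(405, 45, 45, 5)
set2 <- balls(30, 10, 90, 30)

# in isolation, colour explains nothing (coefficient 0, p = 1)
coef(summary(glm(drawn ~ colour, family = binomial, data = set1)))
coef(summary(glm(drawn ~ colour, family = binomial, data = set2)))

# pooled, colour looks significant, for the same spurious reason as the
# combined chi-squared test
coef(summary(glm(drawn ~ colour, family = binomial, data = rbind(set1, set2))))
```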

It seems fairly benign, just poor stats, but imagine replacing the bags with job vacancies and the colour of the balls with some demographic variable of applicants, e.g. age, gender, race, ethnicity, etc. Suddenly you find yourself in a situation.

I’m sure you could think of other examples where this would be a problem. Unfortunately, I’ve seen this on more than one occasion in the real world. Most recently regarding first boots in Survivor. You can check out that post if you wish (you will see the same example though).

In this case, the observation isn’t the balls in the bag but the one that is drawn. Assuming we know the contents of the bag, the correct way to test for an association between colour and being drawn is a goodness-of-fit test within each set:

- The observed number of red balls is 5 in the first set and 30 in the second.
- The expected value is 50 × 1/10 = 5 in the first and 40 × 3/4 = 30 in the second.
- The test statistic is X² = Σ(O − E)²/E = 0 in both sets.

And we have observed exactly what was expected, ergo, no further action.
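In R, this is a one-sample goodness-of-fit test against the known bag proportions (a sketch):

```r
# observed draws vs the known bag composition in each set
chisq.test(c(green = 45, red = 5),  p = c(9, 1) / 10)  # X-squared = 0, p = 1
chisq.test(c(green = 10, red = 30), p = c(1, 3) / 4)   # X-squared = 0, p = 1
```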

The takeaway is: be careful how you analyse data, and keep it in mind when reading others’ analyses.



The post {alone} v0.4 is now available appeared first on Dan Oehm | Gradient Descending.

Alone Australia season 2 has finished and is now available in the package and ready for analysis. As per usual, install via Git or CRAN.

```
# GitHub
devtools::install_github("doehm/alone")

# CRAN
install.packages("alone")
```

Any issues please raise them via Git.

It was another great season. Season 1 started off rough with a few early taps, but those in season 2 hung around a little longer, which shows in the survival chart. While the average number of days lasted is higher for season 2, Gina still holds the record for lasting 67 days.

Over the next few weeks, I’ll update my analysis comparing the US and AU versions.

Two key pieces of information missing from the data are the full names of the contestants and the loadouts for AU. I haven’t been able to find this data anywhere. If you do come across it, let me know and I can add it to the package.

Alone season 11 has just started and will be available after the season. In the meantime, you can see the results and grab what data is available from Google Sheets.



The post The Sanctuary: Stats and data from {survivoR} appeared first on Dan Oehm | Gradient Descending.

I wanted a space to throw all my tables and charts made using the {survivoR} R package into, so I started The Sanctuary (built with Quarto, of course). It has interactive tables about the castaways, challenges, voting history, confessionals, episode details, and a bunch more.

While there is a lot of data out there about Survivor, it’s rarely all in one place. This provides a view of castaways across seasons and various stats. There won’t be a lot of explanation or in-depth analysis, just a truckload of data, tables, and charts to explore. Longer-form posts will remain on the blog.

The Sanctuary is updated regularly during seasons and whenever new data hits Git. It’s the companion for the {survivoR} package.

I won’t share too much here, I’ll let you explore for yourself, but here is an idea of what you can find.

The score is a measure of how many Tribal Councils they survived, the difficulty of surviving the vote (e.g. surviving a Tribal with 4 people is harder than surviving one with 12), and how many votes they copped. Denise, Sandra, and Stephanie take out the top 3 spots.

The challenge score is a measure of challenge success. For individual immunity challenges, Ozzy takes out the top spot, followed by Brad Culpepper in season 34 Game Changers and Mike Holloway in season 30 Worlds Apart.

The highest rated season based on IMDb ratings is season 31 Cambodia, the second chance season, followed by season 40 Winners at War and Season 20 Heroes vs Villains. The top 3 are all returnee seasons. The 4th highest rated season is season 7 Pearl Islands, which is also the highest rated all newbie season.

Russell Hantz still holds the record for the most confessionals in a season, which is going to be hard to beat. Rob Cesternino and Colby Donaldson round out the top 3.

Anyway, expect more things as time goes on.



The post {survivoR} 2.3.3 is now available appeared first on Dan Oehm | Gradient Descending.

Wrapping up season 46 and time for another release of survivoR. A few new things in this release, including two new datasets.

Install from Git or CRAN:

```
# CRAN
install.packages("survivoR")

# GitHub
devtools::install_github("doehm/survivoR")
```

As usual, if you find any issues, raise them on GitHub (survivoR issues).

For non-R users, it’s free to download from Google Sheets.

If you just want the stats, you can head over to The Sanctuary.

- New seasons:
  - US46
- New datasets added:
  - `episode_summary` – the summary of the episode from Wikipedia
  - `challenge_summary` – a summarised version of `challenge_results` for easy analysis
- New fields added:
  - `team` on `challenge_results` – identifying the team that the castaways were on during the challenge

I have included the episode summary extracts from Wikipedia that detail the events of each episode. They usually include pre-challenge events and discussions of strategy, the challenge description and results, strategy discussions among the tribe heading to Tribal Council, and the result. It may be interesting for NLP-type applications.

```
> episode_summary
# A tibble: 647 × 4
version version_season episode episode_summary
<chr> <chr> <dbl> <chr>
1 US US01 1 "The two tribes paddled their way to their respective beaches on a raft with meager supplies. Upon ar…
2 US US01 2 "Following their Tribal Council, Tagi found its fish traps still empty. Disappointed that Rudy was no…
3 US US01 3 "At the Tagi tribe, Stacey still wanted to get rid of Rudy and tried to create a girl alliance to do …
4 US US01 4 "At Pagong, Ramona started to feel better after having been sick and tried to begin pulling her weigh…
5 US US01 5 "At Tagi, Dirk and Sean were still trying to fish instead of helping around camp, but to no avail. Su…
6 US US01 6 "Both tribes were wondering what the merge was going to be like. Tagi was afraid due to their numeric…
7 US US01 7 "The day after Pagong voted Joel out, one person from each tribe went to the opposite tribe's camp an…
8 US US01 8 "At camp, the remaining members of the former Pagong tribe felt vulnerable because the Tagi tribe had…
9 US US01 9 "While Richard was catching fish, the other players began to realize that nobody voted him out becaus…
10 US US01 10 "Some people were happy that Jenna was voted out because she was getting on everyone's nerves. Everyo…
# 637 more rows
# Use `print(n = ...)` to see more rows
```
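As a quick, hypothetical sketch of an NLP-style starting point (not from the package docs; it only assumes the `episode_summary` columns printed above):

```r
library(dplyr)

# average summary length (in words) per season
survivoR::episode_summary |>
  mutate(n_words = lengths(strsplit(episode_summary, "\\s+"))) |>
  group_by(version_season) |>
  summarise(avg_words = mean(n_words), .groups = "drop")
```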

When I was making some charts for The Sanctuary, specifically the challenge score, I realised it was quite difficult to summarise the `challenge_results` table by the different types of challenges, e.g. individual immunity. There are a few edge cases with combined challenges, e.g. Team / Individual Immunity and Reward challenges, where a team wins reward and the last person standing on each team wins immunity. So there are 3 winning outcomes: reward only, immunity only, and immunity and reward for the last person standing.

To make it easier to summarise I created `challenge_summary`. It looks like this…

```
> challenge_summary
# A tibble: 50,428 × 12
category version_season challenge_id challenge_type outcome_type tribe castaway_id castaway n_entities n_winners n_in_team won
<chr> <chr> <dbl> <chr> <chr> <chr> <chr> <chr> <int> <int> <int> <dbl>
1 All US01 1 Immunity and Reward Tribal Pagong US0002 B.B. 2 1 8 1
2 All US01 1 Immunity and Reward Tribal Pagong US0004 Ramona 2 1 8 1
3 All US01 1 Immunity and Reward Tribal Pagong US0006 Joel 2 1 8 1
4 All US01 1 Immunity and Reward Tribal Pagong US0007 Gretchen 2 1 8 1
5 All US01 1 Immunity and Reward Tribal Pagong US0008 Greg 2 1 8 1
6 All US01 1 Immunity and Reward Tribal Pagong US0009 Jenna 2 1 8 1
7 All US01 1 Immunity and Reward Tribal Pagong US0010 Gervase 2 1 8 1
8 All US01 1 Immunity and Reward Tribal Pagong US0011 Colleen 2 1 8 1
9 All US01 1 Immunity and Reward Tribal Tagi US0001 Sonja 2 1 8 0
10 All US01 1 Immunity and Reward Tribal Tagi US0003 Stacey 2 1 8 0
# 50,418 more rows
# Use `print(n = ...)` to see more rows
```

The other challenge datasets can easily be joined to this table. `challenge_summary` is not MECE; for example, the category contains ‘All’, ‘Individual’, ‘Individual Reward’, and ‘Individual Immunity’, to name a few. The results are counted separately for each category. You will need to filter for the right category before using the table.
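For example, a sketch of pulling individual immunity win counts (column names as shown in the printed tibble above; assumes the package is installed):

```r
library(dplyr)

# wins and attempts in individual immunity challenges, per castaway
survivoR::challenge_summary |>
  filter(category == "Individual Immunity") |>
  group_by(castaway_id, castaway) |>
  summarise(wins = sum(won), challenges = n(), .groups = "drop") |>
  arrange(desc(wins))
```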

Not every castaway is counted in every category. If they didn’t make it to the merge they didn’t compete in an individual challenge (except in some edge cases). Rather than their record being 0, they are simply not featured in that category. See GitHub for more details.



The post Racial Bias in Survivor: Are BIPOC Players Disproportionately Voted Out First? (part 2) appeared first on Dan Oehm | Gradient Descending.

A while back, I looked into whether or not BIPOC players are disproportionately voted out first. I didn’t find a lot of evidence to support this claim, despite how it may seem.

However, I did find that female players were disproportionately voted out first. BIPOC women were as well, although that is likely due more to gender than race/ethnicity. At the merge, it flips and men are more likely to be voted out.

I refreshed the analysis after Season 46 made the merge to see if there had been a change. I’ve expanded it to test the following points:

- Are BIPOC players disproportionately voted out of their original tribe first?
- Are women disproportionately voted out of their original tribe first?
- Are BIPOC women disproportionately voted out of their original tribe first?
- Are white women disproportionately voted out of their original tribe first?

I take a very statistical view here but I think that’s needed to cut through perceptions and confirmation bias.

The number of first boots is over expectation for both BIPOC and Female cohorts, but far more so for women.

| Cohort | Expected | Actual | Difference |
|---|---|---|---|
| BIPOC | 29 | 33 | 4 |
| Female | 45 | 55 | 10 |
| Male, BIPOC | 12 | 9 | -3 |
| Male, White | 31 | 24 | -7 |
| Female, White | 28 | 31 | 3 |
| Female, BIPOC | 17 | 24 | 7 |

The model estimates if there is an increase in the probability of being voted out for a certain cohort. When there are equal numbers of BIPOC and other players in the tribe at the first Tribal Council, above 50% means a positive bias. For the splits e.g. BIPOC women, I have assumed 25%. The bands indicate statistical variation and represent the 50%, 80%, and 95% credible intervals.

Both BIPOC and female cohorts have a positive bias. The interval for the BIPOC cohort fairly comfortably contains 50% within the 80% CI, so maybe there is some weak evidence there, but it’s not particularly strong.

The bias estimate for women is clearly higher than 50% so it’s pretty obvious there is a bias here. Women are more likely to be voted out of their tribe first.

So, are BIPOC players disproportionately voted out first? Not really.

If there was no bias and the vote was perfectly random, then out of the 82 Tribal Councils with at least 1 BIPOC player we would expect 29 BIPOC players to be voted out first; we have observed 33.

The bias is estimated to be a +6% chance of being voted out first, but it’s not a huge amount, the 95% CI is (-6%, 17%). The claim that BIPOC players are disproportionately voted out first is not supported by the data.

To put this into perspective a little more, let’s say there are 8 people on the tribe, 4 of whom are BIPOC players (50%). The average bias is +6%, so the probability a BIPOC player will be voted out is 56%. Each BIPOC player therefore has a 14% (56%/4) chance of being voted out, where equal chance is 12.5%: just a +1.5% increase per person.
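Spelling out that arithmetic:

```r
p_cohort <- 0.50 + 0.06  # P(first boot is a BIPOC player) with the +6% bias
p_cohort / 4             # per-player chance: 0.14
1 / 8                    # equal chance for any of the 8 players: 0.125
```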

Are women disproportionately voted out first? Yes.

If there was no bias we would expect 45 first boots to be female, but we have seen 55. If it were completely random, the probability of seeing 55 or more is about 1%.

The bias is estimated to be a +13% chance of being voted out first and is significant in this case. The 95% CI is (3%, 24%).

Are BIPOC women disproportionately voted out first? Yes, but likely due more to gender than race/ethnicity.

If there was no bias we could expect to see 17 but there have been 24. The probability of seeing 24 is about 2%. The bias is estimated to be a +11% chance of being voted out first with a 95% CI of (0%, 24%). It’s wide due to the lower numbers, but from the above, we can say that it’s primarily due to gender not race.

And white women? Actually, no.

We would expect to see 28 but there have been 31. The bias is estimated to be +4% of being voted out first with a 95% CI of (-5%, 13%). This is pretty interesting: gender is clearly the stronger factor (out of gender and race/ethnicity), but particularly from S42 the first boots have been primarily BIPOC women. This suggests that BIPOC women are contributing more than their fair share of first boots.

These results are similar to what I found last time. The number of BIPOC first boots is above expectation, but only by 4; that may seem like a considerable increase, but it could also happen randomly. After many more seasons something may emerge, but at this stage claiming that BIPOC players are voted out first isn’t supported by the data. Although, I don’t think it’s as clear-cut as that.

Gender bias is present and more of a factor than race/ethnicity. The data shows that women are more likely to be voted out of the tribe first with a +13% increase in probability, and easily above equal probability.

What’s important here is that while women are disproportionately voted out, BIPOC women make up most of that imbalance. When modelling white women and BIPOC women independently, white women have a smaller bias and, to be fair, one within reasonable variation. The bias for BIPOC women is about 2.5x higher.

There could be other points for consideration. There was a post comparing votes for BIPOC players and other players. I’ll look into that next to see if it holds water (I have a lot to say about that post, to be honest). In a later post, I’ll also consider age as a contributing factor.

Again, I’m not saying that (subconscious) bias doesn’t exist towards BIPOC players in Survivor, just that it’s not measurable in the data point of who is voted out first.

How I arrived at this position is important, so we get into the weeds a bit (by a bit I mean a lot) here about how I conducted the analysis. It’s important because I consider a few things that are often overlooked.

I looked at it in 3 ways:

- Bayesian model
- Simulation model
- Regression model (there are issues with this though. I’ll explain.)

All code and results are contained in the post so you can reproduce the analysis.

There are quite a few things to consider to set up the data correctly:

- Who is considered BIPOC? In the survivoR package, I record anyone as BIPOC if they are listed as African-American/Canadian, Asian-American/Canadian, Latin-American, or Native American on the Survivor Wiki. I don’t make assumptions about someone’s identity.
- I only consider the first time an original tribe goes to Tribal and votes. Some original tribes didn’t attend Tribal before the merge or a tribe swap; they are removed from the analysis because I want to control for other sources of variation. This leaves 88 Tribal Councils, still a good amount for this analysis.
- I only consider the first true vote out. That means if a player quits in the first Tribal, e.g. Hannah in S45, or when Jonny Fairplay asked to be voted out in S16, I don’t consider that the first Tribal; similarly, I remove medically evacuated players. I want to make sure I only include the first true *target* for each tribe.
- To calculate the probability of someone being voted out I consider only those eligible to be voted out. For example, if someone has individual immunity for the first Tribal, or safety without power, they are removed from the analysis.
- I ensure the makeup of the tribe is accounted for. This is the single most important consideration and what other analyses overlook. More below in the analysis.
- I have included players who have played multiple times. Even though in seasons where there are returning players and newbies, returning players already have a target on their back. This is mostly because it would remove too many tribals.

The tricky thing with this problem is that every first tribe has a different number of BIPOC / other players. The proportion ranges from 0 to 1 in the 88 tribes that attend Tribal Council across the 46 seasons. On average, 25% of the castaways in a tribe are BIPOC. If the proportion were the same at every Tribal there wouldn’t be an issue, but it isn’t, and that can’t be ignored. More below.

I want to directly estimate the bias with this model. If bias is present and BIPOC players are more likely to be voted out, the probability should be some factor above equal probability.

The way I’ve formulated it is as follows:

$$y_i \sim \mathrm{Bernoulli}(\kappa_i)$$
$$\kappa_i = \mathrm{logit}^{-1}\left(\mathrm{logit}(p_i) + \beta\right)$$

where $\kappa_i$ is the probability of a BIPOC player being voted out first at Tribal Council $i$, and $p_i$ is the proportion of eligible BIPOC players in that tribe. The model takes the log odds of the probability and adds $\beta$, the bias term. $\beta$ is a hierarchical term across all the seasons, whereas $p_i$ differs for each Tribal.

Under this model, we can let $\beta$ be normally distributed and unconstrained to estimate the bias. I’ve used a prior of $\beta \sim \mathrm{N}(\mu_0, 1.5)$, with $\mu_0 = \pm 0.5$ depending on the cohort, as in the code below. On the log scale it’s hard to interpret, but essentially if $\beta = 0.5$ this would equate to a bias of +0.12 on an even split, which I think is fair.
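A quick sanity check of that +0.12 figure (reusing the inverse-log-odds helper that also appears in the analysis code):

```r
# inverse log odds (logistic) function
log_odds_inv <- function(x) 1 / (1 + exp(-x))

# beta = 0.5 applied to an even split (p = 0.5, log odds 0)
log_odds_inv(0 + 0.5) - 0.5  # ~ +0.12
```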

Code: Bayesian data analysis

```
# set up ------------------------------------------------------------------
# packages used throughout: tidyverse (dplyr, stringr, tidyr, purrr) and glue
library(tidyverse)
library(glue)

no_quitters <- survivoR::castaways |>
filter(
version == "US",
str_detect(result, "voted"),
!(castaway == "Jonny Fairplay" & version_season == "US16")
) |>
distinct(version_season, castaway, castaway_id)
demogs <- survivoR::castaway_details |>
select(castaway_id, gender, bipoc, race, ethnicity)
tribe_size <- survivoR::boot_mapping |>
filter(order == 0) |>
count(version_season, tribe)
log_odds <- function(x) {
log(x/(1-x))
}
log_odds_inv <- function(x) {
1/(1+exp(-x))
}
p_adj <- function(p, bias) {
log_odds_inv(log_odds(p)+bias)
}
summarise_quantiles <- function(df, x) {
x <- enquo(x)
df |>
summarise(
q2.5 = quantile(!!x, 0.025),
q10 = quantile(!!x, 0.1),
q25 = quantile(!!x, 0.25),
q50 = quantile(!!x, 0.5),
q75 = quantile(!!x, 0.75),
q90 = quantile(!!x, 0.90),
q97.5 = quantile(!!x, 0.975),
mean = mean(!!x),
sd = sd(!!x)
)
}
levels <- c("bipoc", "not_bipoc", "female", "male", "female_bipoc", "male_bipoc",
"female_not_bipoc", "male_not_bipoc")
df_labs <- tribble(
~var, ~lab_text,
"bipoc", "BIPOC",
"female", "Female",
"female_bipoc", "Female, BIPOC",
"female_not_bipoc", "Female, White",
"male_bipoc", "Male, BIPOC",
"male_not_bipoc", "Male, White"
) |>
mutate(var = factor(var, levels = levels))
# first boots -------------------------------------------------------------
# voted out data frame
df_voted_out <- survivoR::vote_history |>
filter(
version == "US",
tribe_status == "Original"
) |>
distinct(version_season, voted_out, voted_out_id, order, tribe, tribe_status) |>
semi_join(no_quitters, by = c("version_season", "voted_out_id" = "castaway_id")) |>
group_by(version_season, tribe) |>
slice_min(order) |>
left_join(demogs, by = c("voted_out_id" = "castaway_id")) |>
mutate(
bipoc = replace_na(bipoc, FALSE),
not_bipoc = !bipoc,
female = gender == "Female",
male = gender == "Male",
female_bipoc = gender == "Female" & bipoc,
male_bipoc = gender == "Male" & bipoc,
female_not_bipoc = gender == "Female" & !bipoc,
male_not_bipoc = gender == "Male" & !bipoc
) |>
ungroup()
# expected data frame
df_expected <- survivoR::vote_history |>
filter(
version == "US",
tribe_status == "Original",
is.na(immunity) | immunity == "Hidden"
) |>
distinct(version_season, castaway, castaway_id, order, tribe, tribe_status) |>
group_by(version_season, tribe) |>
slice_min(order) |>
left_join(demogs, by = "castaway_id") |>
mutate(
bipoc = replace_na(bipoc, FALSE),
female = gender == "Female",
male = gender == "Male",
female_bipoc = gender == "Female" & bipoc,
male_bipoc = gender == "Male" & bipoc,
female_not_bipoc = gender == "Female" & !bipoc,
male_not_bipoc = gender == "Male" & !bipoc
) |>
group_by(version_season, order, tribe) |>
summarise(
n = n(),
n_bipoc = sum(bipoc),
n_not_bipoc = sum(!bipoc),
n_female = sum(female),
n_male = sum(male),
n_female_bipoc = sum(female_bipoc),
n_male_bipoc = sum(male_bipoc),
n_female_not_bipoc = sum(female_not_bipoc),
n_male_not_bipoc = sum(male_not_bipoc),
.groups = "drop"
) |>
mutate(
p_bipoc = n_bipoc/n,
p_not_bipoc = n_not_bipoc/n,
p_female = n_female/n,
p_male = n_male/n,
p_female_bipoc = n_female_bipoc/n,
p_male_bipoc = n_male_bipoc/n,
p_female_not_bipoc = n_female_not_bipoc/n,
p_male_not_bipoc = n_male_not_bipoc/n
) |>
ungroup()
# summary -----------------------------------------------------------------
# observed
df_obs <- df_voted_out |>
summarise(
bipoc = sum(bipoc),
not_bipoc = sum(not_bipoc),
female = sum(female),
male = sum(male),
female_bipoc = sum(female_bipoc),
male_bipoc = sum(male_bipoc),
female_not_bipoc = sum(female_not_bipoc),
male_not_bipoc = sum(male_not_bipoc)
) |>
pivot_longer(everything(), names_to = "var", values_to = "observed") |>
left_join(
df_expected |>
select(starts_with("p")) |>
summarise_all(~round(sum(.x))) |>
pivot_longer(everything(), names_to = "var", values_to = "expected") |>
mutate(var = str_remove(var, "p_")),
by = "var"
) |>
mutate(
var = factor(var, levels = levels),
res = observed-expected
)
# bayes model -------------------------------------------------------------
library(rstan)
library(tidybayes)
stan_dat <- df_voted_out |>
left_join(
df_expected |>
select(version_season, tribe, p_bipoc, p_not_bipoc, p_female, p_male, p_female_bipoc,
p_male_bipoc, p_female_not_bipoc, p_male_not_bipoc),
by = c("version_season", "tribe")
) |>
transmute(
version_season,
tribe,
y_bipoc = as.numeric(bipoc),
y_not_bipoc = as.numeric(!bipoc),
y_female = as.numeric(gender == "Female"),
y_male = as.numeric(gender == "Male"),
y_female_bipoc = as.numeric(female_bipoc),
y_male_bipoc = as.numeric(male_bipoc),
y_female_not_bipoc = as.numeric(female_not_bipoc),
y_male_not_bipoc = as.numeric(male_not_bipoc),
p_bipoc,
p_not_bipoc,
p_female,
p_male,
p_female_bipoc,
p_male_bipoc,
p_female_not_bipoc,
p_male_not_bipoc
)
stan_dat <- stan_dat |>
select(-starts_with("p")) |>
pivot_longer(starts_with("y"), names_to = "var", values_to = "y") |>
mutate(var = str_remove(var, "y_")) |>
left_join(
stan_dat |>
select(-starts_with("y")) |>
pivot_longer(starts_with("p"), names_to = "var", values_to = "p") |>
mutate(var = str_remove(var, "p_")),
by = c("version_season", "tribe", "var")
) |>
mutate(
log_odds = log(p/(1-p)),
var = factor(var, levels = levels),
mu0 = case_when(
var %in% c("bipoc", "female", "female_not_bipoc", "female_bipoc") ~ 0.5,
var %in% c("not_bipoc", "male", "male_not_bipoc", "male_bipoc") ~ -0.5,
TRUE ~ 0
)
)
stan_code <- "data {
int<lower=0> N;
array[N] int<lower=0, upper=1> y;
array[N] real<lower=0, upper=1> p;
array[N] real log_odds;
real mu0;
}
parameters {
real beta;
}
transformed parameters {
array[N] real<lower=0, upper=1> kappa;
for(k in 1:N) {
kappa[k] = 1/(1+exp(-(log_odds[k] + beta)));
}
}
model {
beta ~ normal(mu0, 1.5);
y ~ bernoulli(kappa);
}"
# compile one for faster fitting
dat <- stan_dat |>
filter(
var == "female",
p > 0,
p < 1
) |>
as.list()
dat$mu0 <- unique(dat$mu0)
dat$N <- length(dat$y)
mod_stan <- stan(
model_code = stan_code,
data = dat
)
# fit models
df_bias <- map_dfr(levels, ~{
dat <- stan_dat |>
filter(
var == .x,
p > 0,
p < 1
) |>
as.list()
dat$mu0 <- unique(dat$mu0)
dat$N <- length(dat$y)
mod_stan <- stan(
model_code = stan_code,
data = dat
)
tibble(
var = .x,
bias = rstan::extract(mod_stan, "beta")$beta
)
}) |>
mutate(var = factor(var, levels = levels)) |>
left_join(
stan_dat |>
group_by(var) |>
summarise(median = median(p)),
by = "var"
)
df_bias_summary <- df_bias |>
group_by(var) |>
summarise_quantiles(bias) |>
mutate(
lab = snakecase::to_title_case(levels) |>
str_replace("Bipoc", "BIPOC")
)
df_bias_summary_p <- df_bias |>
mutate(
p0 = ifelse(var %in% c("female_bipoc", "male_bipoc", "female_not_bipoc", "male_not_bipoc"), 0.25, 0.5),
p = log_odds_inv(log_odds(p0)+bias),
) |>
group_by(var, p0) |>
summarise_quantiles(p) |>
mutate(
pct = glue("{ifelse(q50<p0, '', '+')}{100*round(q50-p0, 2)}%"),
)
```

```
> df_bias_summary
# A tibble: 8 × 11
var q2.5 q10 q25 q50 q75 q90 q97.5 mean sd lab
<fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 bipoc -0.245 -0.0708 0.0768 0.236 0.390 0.533 0.692 0.234 0.238
2 not_bipoc -0.687 -0.532 -0.384 -0.222 -0.0525 0.0908 0.268 -0.218 0.244
3 female 0.0987 0.247 0.369 0.525 0.691 0.832 0.997 0.533 0.232
4 male -0.967 -0.817 -0.678 -0.525 -0.374 -0.237 -0.0831 -0.526 0.226
5 female_bipoc 0.00560 0.178 0.342 0.513 0.699 0.861 1.04 0.517 0.263
6 male_bipoc -1.21 -0.929 -0.698 -0.443 -0.204 -0.00400 0.211 -0.458 0.366
7 female_not_bipoc -0.288 -0.127 0.0177 0.183 0.341 0.482 0.627 0.179 0.238
8 male_not_bipoc -0.932 -0.755 -0.588 -0.413 -0.240 -0.0944 0.0718 -0.416 0.258
```

For BIPOC players, the bias term CI includes 0 fairly comfortably, so I’d say they are not voted out first any more than other players; at least there’s not enough evidence to confirm that they are. It is also lower than my expectations.

Women are voted out first more often than male players. The 95% CI doesn’t include 0 and is clearly above it. BIPOC women are similar. From this it should be clear that it’s due more to gender than race/ethnicity.

The Bayesian analysis showed us what we need to know, but I wanted to look at this another way as well. I’ve also fit a simulation model and looked at the probability distribution. If it was completely random how many BIPOC players can we expect to be voted out first?

I took 4,000 random draws from each of the first Tribal Councils and counted how many times a BIPOC, female, or female BIPOC castaway was voted out. Below are the probability distributions under the assumption of perfect randomness. Each bar represents the likelihood of observing that many boots out of the 88 Tribal councils.

For example, there have been 33 BIPOC players booted from the first Tribal Council, and under perfect randomness, we would expect 29. But the distribution shows we can reasonably expect somewhere between 22-37, so 33 is on the upper end but isn’t particularly unusual.

We have seen 55 female castaways booted first where we would expect 45. This is right in the tail of the distribution. There’s only a 1% chance that we should see 55 or more female first boots, which means there’s probably something here: a preference to vote women out first.

Code: Simulation

```
# number of sims
n_sims <- 4000
levels <- c("bipoc", "female", "female_bipoc", "female_not_bipoc")
df_sim0 <- map_dfr(1:n_sims, ~{
df_expected |>
mutate(sim = .x)
}) |>
mutate(
bipoc = rbernoulli(n(), p_bipoc),
female = rbernoulli(n(), p_female),
female_bipoc = rbernoulli(n(), p_female_bipoc),
female_not_bipoc = rbernoulli(n(), p_female_not_bipoc),
male_bipoc = rbernoulli(n(), p_male_bipoc),
male_not_bipoc = rbernoulli(n(), p_male_not_bipoc)
) |>
group_by(sim) |>
summarise(
bipoc = sum(bipoc),
female = sum(female),
female_bipoc = sum(female_bipoc),
female_not_bipoc = sum(female_not_bipoc),
male_bipoc = sum(male_bipoc),
male_not_bipoc = sum(male_not_bipoc)
) |>
pivot_longer(-sim, names_to = "var", values_to = "y") |>
filter(var %in% levels)
df_ci <- df_sim0 |>
group_by(var) |>
summarise_quantiles(y)
df_sim <- df_sim0 |>
count(var, y)
```

I’ve only included those with a positive bias in the chart.

The final way I’ll look at this is by fitting a basic regression model. This is, to be honest, a bad model choice, for reasons I’ll explain.

For this model to work, the model data frame needs to be at the person level. The response is 1 if the person was voted out and 0 otherwise. The predictors are BIPOC (yes, no) and gender (male, female).

The issue with this model is that each observation is assumed to be independent, meaning whether or not a person is voted out depends only on that person’s characteristics, independently of everyone else. But that doesn’t hold. Only one person is eliminated per Tribal Council, so whoever is voted out first means all the others can’t be. Independence only holds between Tribal Councils.
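The dependence is easy to see in a toy simulation (made-up data, not the post's): when exactly one of k tribemates is booted, the person-level outcomes are negatively correlated rather than independent, with theoretical correlation -1/(k-1).

```r
# Toy illustration: with exactly one boot per 6-person council, any two
# tribemates' outcomes are negatively correlated by construction.
set.seed(42)
n_councils <- 5000
tribe_size <- 6

# each row is a council; exactly one member gets a 1 (voted out)
boots <- t(replicate(n_councils, {
  out <- rep(0, tribe_size)
  out[sample(tribe_size, 1)] <- 1
  out
}))

# theoretical correlation between two tribemates is -1/(tribe_size - 1) = -0.2
cor(boots[, 1], boots[, 2])
```

The person-level GLM assumes this correlation is zero, which is exactly the violation described above.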

That’s really important to understand, because if you were to predict who will be voted out, the model may spit out multiple people going home, which is dumb. It’s also important to consider what that means for the coefficients. The dependence changes the variance based on the proportions within each tribe, and the effect is averaged across the seasons. This could be misleading.

You need to be really careful when interpreting the output under these conditions. I’m doing this anyway because I’m mainly interested in whether it’s drastically different from the above.

The convenient thing about the regression model is that it lets us compare the coefficients for gender and BIPOC status. Even with the issues of dependence, we can compare their magnitudes to see which has the stronger influence. You still have to be careful though.

Given the analysis above, gender should be the stronger predictor. If that’s true, I rest my case.

Code: Regression

```
# regression --------------------------------------------------------------
diverse_tribes <- df_expected |>
  filter(
    p_bipoc > 0,
    p_bipoc < 1
  ) |>
  distinct(version_season, tribe)

df_mod <- survivoR::vote_history |>
  semi_join(diverse_tribes, by = c("version_season", "tribe")) |>
  filter(
    version == "US",
    tribe_status == "Original",
    is.na(immunity) | immunity == "Hidden"
  ) |>
  distinct(version_season, castaway, castaway_id, voted_out, voted_out_id, order, tribe, tribe_status) |>
  semi_join(no_quitters, by = c("version_season", "voted_out_id" = "castaway_id")) |>
  mutate(voted_out = as.numeric(voted_out == castaway)) |>
  group_by(version_season, tribe) |>
  slice_min(order) |>
  left_join(demogs, by = "castaway_id") |>
  mutate(bipoc = replace_na(bipoc, FALSE)) |>
  filter(gender != "Non-binary")

mod <- glm(voted_out ~ gender + bipoc, data = df_mod, family = binomial(link = "logit"))
summary(mod)
```

```
> summary(mod)

Call:
glm(formula = voted_out ~ gender + bipoc, family = binomial(link = "logit"),
    data = df_mod)

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -1.7650     0.1816  -9.721   <2e-16 ***
genderMale   -0.5869     0.2454  -2.392   0.0168 *
bipocTRUE     0.2285     0.2473   0.924   0.3555
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 494.06  on 654  degrees of freedom
Residual deviance: 486.77  on 652  degrees of freedom
AIC: 492.77

Number of Fisher Scoring iterations: 5
```

I rest my case.

Gender is clearly more influential than BIPOC status. If I were a frequentist, I’d be removing BIPOC from the model as it’s not significant.
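For interpretation, the log-odds coefficients from the summary above can be converted to odds ratios (the values below are copied from that output):

```r
# Convert the fitted log-odds to odds ratios for interpretation.
# Estimates copied from the model summary above.
or <- exp(c(genderMale = -0.5869, bipocTRUE = 0.2285))
round(or, 2)
# genderMale ~ 0.56: men have roughly 44% lower odds of being the first boot.
# bipocTRUE ~ 1.26: a smaller bump, and not significant.
```

Keeping the coefficients on the odds-ratio scale makes them directly comparable with the paper discussed below.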

I didn’t want to talk about this, but here we go. The paper titled ‘Surviving Racism and Sexism: What Votes in the Television Program Survivor Reveal about Discrimination‘ came out after my original post. The analysis looks at whether BIPOC and female contestants are disproportionately voted out, as well as trends at other stages of the game.

It shows that women are disproportionately voted out first, as I’ve shown above. However, it claims that BIPOC players are also more likely to be the target and disproportionately voted out: *“Compared to White contestants, BIPOC contestants had 51% higher odds of being voted out of their tribe first, χ²(1, N=731)=4.59, p=.032, OR=1.51, 95% CI [1.03–2.19]”*. This is counter to what I have shown in the analysis.

They have used a logistic regression model at the person level for all 731 castaways in seasons 1-40. As I’ve shown above, this is not a good model for the problem. And even allowing for the issues of fitting a regression model to this data, I don’t see anything close to 51% higher odds.

I suspect there are also differences in how the data was set up. There’s not a lot of discussion about the data considerations before modelling as I’ve done, e.g. only using the original tribes and removing ineligible castaways.

They have used the Survivor Wiki for labelling race/ethnicity, so that should be consistent, but there could be differences in which race/ethnicities are included.

I’ve curated the data to only those eligible to be voted out and the first true vote for a tribe using 46 seasons. Some tribes/castaways are removed because they didn’t go to Tribal Council before a swap. This leaves 655 castaways over 46 seasons.

They have only kept diverse tribes (although N=731 from above, so I’m not so sure about that). A tribe that consists entirely of BIPOC players, like the Manihiki tribe in Cook Islands, only has one choice so is removed. This is an important consideration and such tribes should be removed, but the same logic should extend to all tribe makeups. The probability of voting out a BIPOC player when there are 5/6 in a tribe is much higher than when there is only 1/10.

This imbalance alters the model outcome. I believe it is the heart of the issue and why the paper probably reached some incorrect conclusions. I’ll explain.

To demonstrate why this is important, I’ll make up a toy example.

Let’s assume 50 tribes went to Tribal Council. Each tribe has 1 BIPOC and 9 white players. In total, there are 500 players – 50 BIPOC and 450 white.

Let’s also assume there is no bias and everyone has a 1/10 chance of being voted out. Then we expect to see 5 BIPOC players and 45 white players voted out first. We’ll put that into a 2×2.

```
x1 <- matrix(c(405, 45, 45, 5), nrow = 2, dimnames = list(c("White", "BIPOC"), c("No", "Yes")))
> x1
No Yes
White 405 45
BIPOC 45 5
```

I’ll fit a Chi-squared test to see if there is an association between race and being voted out first.

```
> chisq.test(x1)
Pearson's Chi-squared test
data: x1
X-squared = 0, df = 1, p-value = 1
```

The p-value is 1 because we’ve assumed equal probability of being voted out first. Makes perfect sense.

Let’s choose another example, 40 Tribals, and each tribe has 3 BIPOC players and 1 White player. That’s 120 BIPOC players and 40 White players in total. Again, let’s assume no bias and equal probability of being voted out of 1/4. Then we expect 30 BIPOC players and 10 white players voted out. I’ll put that into a 2×2 and run a Chi-squared test.

```
x2 <- matrix(c(30, 90, 10, 30), nrow = 2, dimnames = list(c("White", "BIPOC"), c("No", "Yes")))
> x2
No Yes
White 30 10
BIPOC 90 30
> chisq.test(x2)
Pearson's Chi-squared test
data: x2
X-squared = 0, df = 1, p-value = 1
```

No surprises to anyone, there’s no association.

Now what if we joined them together, i.e. pooled the two tables? There would be 90 Tribal Councils, 490 White players, and 170 BIPOC players, with 55 and 35 voted out respectively. Again, I’ll run a Chi-squared test for an association.

```
> x <- x1 + x2
> x
No Yes
White 435 55
BIPOC 135 35
> chisq.test(x)
Pearson's Chi-squared test with Yates' continuity correction
data: x
X-squared = 8.6183, df = 1, p-value = 0.003328
```

Now the test is highly significant! We would confidently conclude that there IS an association between race and being voted out first.

But we know this isn’t correct because we specifically assumed equal probability of being voted out. We set the data up with this exact property and the individual tests produced a p-value of 1.

So, why does combining them suddenly show an association when there is none? It’s because each observation is assumed to be IID: independent and identically distributed. But the observations are not independent. In a single Tribal Council, if one person gets voted out it means the others can’t be. There is a dependency within each Tribal Council, so the makeup of the tribe matters. The model doesn’t understand this though.

Each Tribal Council IS independent, since there is no interaction between the votes in one Tribal and the votes in another. That’s what the observation should be: the Tribal Council, not the player.
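A stratified test (not used in the original analysis) makes the same point from the other direction: the Cochran-Mantel-Haenszel test compares voted-out status within each tribe-makeup stratum before pooling, and correctly finds nothing in the combined toy data.

```r
# The same two toy tables as above, kept as separate strata.
x1 <- matrix(c(405, 45, 45, 5), nrow = 2,
             dimnames = list(c("White", "BIPOC"), c("No", "Yes")))
x2 <- matrix(c(30, 90, 10, 30), nrow = 2,
             dimnames = list(c("White", "BIPOC"), c("No", "Yes")))

# stack into a 2x2x2 array: one 2x2 slice per tribe makeup
x_strat <- array(c(x1, x2), dim = c(2, 2, 2))

# both within-stratum odds ratios are exactly 1, so the p-value is 1
mantelhaen.test(x_strat, correct = FALSE)
```

Stratifying by tribe makeup is a simple way to respect the composition of each group while still using all the data.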

This is accounted for in the Bayesian model and the simulation, but not the regression model since it is at the person level.

I hope this makes sense, because by ignoring this property you could be concluding there’s an association when there is none, which is what I think may have happened.

To call a spade a spade, a group of people go to Tribal Council to vote someone out. Bias enters the game when humans group together and make decisions about who to vote out, which is the very essence of the game. Maybe, that’s what needs to change? To make the game fairer perhaps more elements of chance need to be introduced. Perhaps they need to remove players’ votes, force people to rely on their social game and make true, meaningful connections.

I can’t imagine the fandom getting behind any major change though. They are a pretty conservative bunch and resist almost any change made to the game – final 4 fire, the 3 tribe set up, 26 days, new advantages, more than one hidden immunity idol, moral dilemmas, Summit journeys, the rice negotiation, beware advantages, fake idol kits, shot in the dark…. pretty much everything. Now in the new era, players can lose their vote and they fucking hate it.

So, it’s pretty funny reading unhinged posts like ‘Man who manipulates Survivor’s game cannot imagine adjusting to make it fair‘ because a) maybe production is adjusting it? And b) I can’t imagine any change aimed at making it ‘fairer’ that would receive unanimous approval, particularly when that would probably mean removing human decisions or introducing a new mechanic. The sentiment tends to be ‘if it ain’t broke, don’t fix it, except when the person I liked gets voted out’.

The post Racial Bias in Survivor: Are BIPOC Players Disproportionately Voted Out First? (part 2) appeared first on Dan Oehm | Gradient Descending.

]]>If you’re looking for something a little different, ggbrick creates a ‘waffle’ style chart with the aesthetic of a brick […]

The post ggbrick is now on CRAN appeared first on Dan Oehm | Gradient Descending.

]]>If you’re looking for something a little different, `ggbrick` creates a ‘waffle’ style chart with the aesthetic of a brick wall. The usage is similar to `geom_col` where you supply counts as the height of the bar and a `fill` for a stacked bar. Each whole brick represents 1 unit. Two half bricks equal one whole brick.

It has been available on Git for a while, but recently I’ve made some changes and it now has CRAN’s tick of approval.

```
install.packages("ggbrick")
```

There are two main geoms included:

- `geom_brick()`: To make the brick wall-style waffle chart.
- `geom_waffle()`: To make a regular-style waffle chart.

Use `geom_brick()` the same way you would use `geom_col()`.

```
library(dplyr)
library(ggplot2)
library(ggbrick)

# basic usage
mpg |>
  count(class, drv) |>
  ggplot() +
  geom_brick(aes(class, n, fill = drv)) +
  coord_brick()
```

`coord_brick()` is included to maintain the aspect ratio of the bricks. It is similar to `coord_fixed()`; in fact, it is just a wrapper for `coord_fixed()` with a parameterised aspect ratio based on the number of bricks. The default number of bricks per layer is 4. To change the width of the line outlining the brick, use the `linewidth` parameter as normal.

To change the number of bricks per layer, specify the `bricks_per_layer` parameter in the geom and coord functions.

```
mpg |>
  count(class, drv) |>
  ggplot() +
  geom_brick(aes(class, n, fill = drv), bricks_per_layer = 6) +
  coord_brick(6)
```

You can change the width of the columns similar to `geom_col()` to add more space between the bars. To maintain the aspect ratio you also need to set the width in `coord_brick()`.

```
mpg |>
  count(class, drv) |>
  ggplot() +
  geom_brick(aes(class, n, fill = drv), width = 0.5) +
  coord_brick(width = 0.5)
```

To get more space between each brick use the `gap` parameter.

```
mpg |>
  count(class, drv) |>
  ggplot() +
  geom_brick(aes(class, n, fill = drv), gap = 0.04) +
  coord_brick()
```

For no gap set `gap = 0` or use the shorthand `geom_brick0()`.

```
mpg |>
  count(class, drv) |>
  ggplot() +
  geom_brick0(aes(class, n, fill = drv)) +
  coord_brick()
```

For fun, I’ve included a parameter to randomise the fill of the bricks or add a small amount of variation at the join between two groups. The proportions are maintained; it’s just designed to give a different visual.

```
mpg |>
  count(class, drv) |>
  ggplot() +
  geom_brick(aes(class, n, fill = drv), type = "soft_random") +
  coord_brick()
```

```
mpg |>
  count(class, drv) |>
  ggplot() +
  geom_brick(aes(class, n, fill = drv), type = "random") +
  coord_brick()
```

`geom_waffle()` has the same functionality as `geom_brick()` but the bricks are square, giving a standard waffle chart. I added this so you can make a normal waffle chart in the same way you would use `geom_col()`. It requires `coord_waffle()` to maintain the aspect ratio.

```
mpg |>
  count(class, drv) |>
  ggplot() +
  geom_waffle(aes(class, n, fill = drv)) +
  coord_waffle()
```

```
mpg |>
  count(class, drv) |>
  ggplot() +
  geom_waffle0(aes(class, n, fill = drv), bricks_per_layer = 6) +
  coord_waffle(6)
```

You may want to flip the coords when using `geom_waffle()`. To do so you’ll need to use `coord_flip()` and `theme(aspect.ratio = <number>)`. I haven’t made a `coord_waffle_flip()` yet!

```
mpg |>
  count(class, drv) |>
  ggplot() +
  geom_waffle0(aes(class, n, fill = drv)) +
  coord_flip() +
  theme(aspect.ratio = 1.8)
```
```

I think `geom_brick()` pairs with `with_shadow()` and `with_inner_glow()` from {ggfx} pretty well!

```
library(tidyverse)
library(ggbrick)
library(ggfx)
library(ggtext)    # element_markdown()
library(showtext)  # font_add_google()

font_add_google("Karla", "karla")
showtext_auto()

ft <- "karla"
txt <- "grey10"
bg <- "white"

survivoR::challenge_results |>
  filter(
    version == "US",
    outcome_type == "Individual",
    result == "Won"
  ) |>
  left_join(
    survivoR::castaway_details |>
      select(castaway_id, gender, bipoc),
    by = "castaway_id"
  ) |>
  left_join(
    survivoR::challenge_description |>
      mutate(type = ifelse(race, "Race", "Endurance")) |>
      select(version_season, challenge_id, type),
    by = c("version_season", "challenge_id")
  ) |>
  count(type, gender) |>
  drop_na() |>
  ggplot() +
  with_shadow(
    with_inner_glow(
      geom_brick(aes(type, n, fill = gender), linewidth = 0.1, bricks_per_layer = 6)
    ),
    x_offset = 4,
    y_offset = 4
  ) +
  coord_brick(6) +
  scale_fill_manual(values = blue_pink[c(5, 1, 4)]) +  # blue_pink: palette from my setup script
  labs(
    title = toupper("Survivor Challenges"),
    subtitle = "Approximately a third of races and half of endurance challenges\nare won by women.",
    fill = "Gender",
    caption = "Individual challenges only. The different proportions of men and women at merge hasn't been taken into consideration."
  ) +
  theme_void() +
  theme(
    text = element_text(family = ft, colour = txt, lineheight = 0.3, size = 32),
    plot.background = element_rect(fill = bg, colour = bg),
    plot.title = element_markdown(size = 128, colour = txt, hjust = 0.5, margin = margin(b = 10)),
    plot.subtitle = element_text(hjust = 0.5, size = 48, margin = margin(b = 30)),
    plot.caption = element_text(size = 24, hjust = 0, margin = margin(t = 20)),
    axis.text = element_text(vjust = 0),
    axis.title.y = element_blank(),
    plot.margin = margin(t = 30, b = 10, l = 30, r = 30),
    legend.position = "top"
  )
```

The post ggbrick is now on CRAN appeared first on Dan Oehm | Gradient Descending.

]]>The Survivor Auction is classic. Seeing hungry people bid, win, and then binge on the food they just purchased is pretty […]

The post Survivor Auction analysis: Should you bid on the first covered item? appeared first on Dan Oehm | Gradient Descending.

]]>The Survivor Auction is classic. Seeing hungry people bid, win, and then binge on the food they just purchased is pretty good stuff.

It’s not always great. Remember bat soup? A covered item could be the meal of your dreams or it could be bat soup. So, when a covered item comes out should you bid on it?

I’ll be running a simulation model to determine if bidding on the first covered item is a good idea or a terrible one.

Before that, I’ll give a rundown of the data and show some summary stats of Survivor Auctions over the years.

All data is available in the {survivoR} R package and code at the bottom of the post.

Should you bid on the first covered item?

It’s best not to.

Should you bid on any?

Probably not.

But this is Survivor, and people want to take risks. So if they’re going to bid on one, which one should it be?

Go with the second one.

I’ll use the `survivor_auction` and `auction_details` datasets. A couple of things to note:

- I’ll only use the US version data, but it could be expanded to other versions.
- The auctions have changed and evolved over the years, for example, Season 5 was done early and the tribe bid together, and people could pool money in earlier seasons. That was restricted in later seasons. Fortunately, none of those things should complicate what we’re looking at here.
- What is considered a ‘bad item’ is subjective. You could argue that ‘rice and water’ isn’t a bad item but in the spirit of the Survivor Auction players wouldn’t be happy with that when they were hoping for a burger and chips or some sort of protein.

A borderline case for me was the skewer of chicken hearts Stephen purchased in Tocantins. It’s probably not what he hoped for but it’s far from bat soup, and you can buy them at the supermarket. I’ve recorded it as ‘food and drink’ rather than a ‘bad item’ but could be convinced otherwise. If the skewer had been regular chicken meat it’d be fine.

Another case was Will purchasing his removal from the auction at the very beginning, which seems bad, but back at camp he found the location of hidden rations. The result was good even though he couldn’t participate in the rest of the auction. I’ve categorised this as ‘food and drink’.
- Where an item is for multiple people it is still considered one item. For example, letters from home are a common item. Usually, one person wins the bid and then it’s opened up to everyone. I count this as one item.
- In the case where a covered item is purchased and then auctioned off again to another player, e.g. Austin buying the giant fish eyes, it is counted as one item even though it was auctioned twice. The second time it was uncovered though.
- Occasionally they are given the option to switch to an alternative covered item. One of them is likely bad and the other isn’t. So far most have refused but Erik in Micronesia switched and got the good item. I’ve ignored this for the moment for data reasons but it is worth looking into.

Survivor Auctions aren’t super clean from a data point of view. There aren’t strict rules or the rules have changed and it’s a big collection of edge cases. But I think the way I’ve structured it makes sense.

Whether or not you should bid on an item depends on a few points of uncertainty:

- The number of players attending the Survivor Auction
- The number of items at the Survivor Auction
- The number of covered items
- The number of ‘bad items’

These points affect the chances of winning bat soup. Only the first of them is known to the player; the other three are unknown.

We need to understand how each of these varies from season to season to know if it’s a good idea to bid or not.

The first auction was held in Season 2 and there have been 17 in total with its return in Season 45. It is held at different stages of the game. The number of players at the Survivor Auction ranges from 6 to 12, and on average 8 people.

Code

```
# set up data frame
df <- survivoR::auction_details |>
  filter(
    version == "US",
    auction_num == 1
  ) |>
  distinct(version_season, item, item_description, category, covered) |>
  group_by(version_season) |>
  summarise(
    n_items = n(),
    n_covered = sum(covered),
    n_bad = sum(category == "Bad item"),
    pos_first_bad = cumsum(covered)[which(category == "Bad item")[1]]
  ) |>
  left_join(
    survivoR::survivor_auction |>
      count(version_season, name = "n_cast"),
    by = "version_season"
  ) |>
  mutate(pos_first_bad = replace_na(pos_first_bad, 99))

# number of castaways
df_n_cast <- df |>
  count(n_cast)
```

| Number of castaways at the auction | Number of seasons |
|---|---|
| 6 | 2 |
| 7 | 6 |
| 8 | 3 |
| 9 | 4 |
| 10 | 1 |
| 12 | 1 |

How many people are at the Survivor Auction could help to estimate how many items, and therefore covered items, there may be. More people meaning more items seems like a reasonable assumption. (Spoiler: it doesn’t matter.)

The number of items at each auction varies from 5 to 12 items, and on average 8 items.

Code

```
# number of items
df_n_items <- df |>
  count(n_items)

# categories
df_category <- survivoR::auction_details |>
  filter(
    version == "US",
    auction_num == 1
  ) |>
  distinct(version_season, item, item_description, category, covered) |>
  count(category)
```

| Number of items up for grabs | Number of seasons |
|---|---|
| 5 | 2 |
| 6 | 4 |
| 7 | 2 |
| 8 | 4 |
| 9 | 1 |
| 10 | 2 |
| 11 | 1 |
| 12 | 1 |

I’ve binned the items into 5 main categories. Without too much surprise the majority are food and drink.

| Category | Number of items |
|---|---|
| Food and drink | 95 |
| Comfort | 8 |
| Advantage | 13 |
| Letter or message from home | 7 |
| Bad item | 9 |

Every auction includes covered items where the player doesn’t know what they’re bidding on. The number of covered items varies from 1 to 5, and on average 3 items are covered.

Code

```
# number of covered items
df_n_covered <- df |>
  count(n_covered)

# categories
df_category_covered <- survivoR::auction_details |>
  filter(
    version == "US",
    auction_num == 1,
    covered
  ) |>
  distinct(version_season, item, item_description, category) |>
  count(category)
```

| Number of covered items | Number of seasons |
|---|---|
| 1 | 4 |
| 2 | 4 |
| 3 | 4 |
| 4 | 3 |
| 5 | 2 |

The majority of covered items are food and drink as well, a few are advantages and 9 are bad items.

| Category | Number of items |
|---|---|
| Food and drink | 34 |
| Advantage | 3 |
| Bad item | 9 |

Here are the 9 bad items that have been purchased over the years.

Code

```
# number of bad items
df_bad <- df |>
  count(n_bad) |>
  mutate(p = n/sum(n))

survivoR::auction_details |>
  group_by(version_season) |>
  filter(
    category == "Bad item",
    auction_num == 1,
    version == "US"
  ) |>
  select(season_name, item, item_description, castaway, cost)

# position
df |>
  filter(pos_first_bad < 99) |>
  count(pos_first_bad) |>
  ungroup() |>
  mutate(p = n/sum(n))
```

| Season | Item number | Description | Castaway | Cost | The nth covered item |
|---|---|---|---|---|---|
| S2 The Australian Outback | 11 | Glass of river water | Amber | $200 | 1st |
| S5 Thailand | 3 | Baked grubs | Sook Jai | $80 | 1st |
| S6 The Amazon | 2 | Manioc | Alex | $240 | 1st |
| S13 Cook Islands | 6 | Sea cucumber | Sundra | $140 | 3rd |
| S16 Micronesia | 3 | Fruit bat soup | Natalie | $240 | 3rd |
| S19 Samoa | 2 | Sea noodles and slug guts with parmesan cheese | Shambo | $240 | 1st |
| S26 Caramoan | 8 | Pig brain | Brenda | $300 | 4th |
| S28 Cagayan | 4 | Rice and water | Trish | $60 | 3rd |
| S45 45 | 5 | Two giant fish eyes | Katurah | $480 | 2nd |

There has been a maximum of 1 bad item purchased in any given season. On average, 1 in every 2 seasons includes a bad item. Honestly, not as many as I remember.

| Number of ‘bad items’ | Number of seasons | Percentage |
|---|---|---|
| 0 | 8 | 47% |
| 1 | 9 | 53% |

The bad item is revealed at different positions, i.e. as the first, second, etc., covered item.

| Position of ‘bad item’ | Number of seasons | Percentage |
|---|---|---|
| First covered item | 4 | 44% |
| Second | 1 | 11% |
| Third | 3 | 33% |
| Fourth | 1 | 11% |
| Fifth | 0 | 0% |

Out of the 9 seasons that had a bad item, 4 were revealed as the first covered item and the other 5 across the second, third, and fourth covered items. There has only been a bad item under the second covered item on one occasion, and it was in 45. If this was all we were going by, choosing the second item may be the way to go.

There has also only been one bad item under the 4th covered item and never under the 5th. There have only been 5 covered items on 2 occasions though, so I’m not saying it won’t happen in the future.

I’ll be fitting a Bayesian simulation model to estimate the probability of which covered item holds the bad one, if any.

Each iteration of the simulation is done by the following:

- Draw the number of covered items in the auction.
- Draw the number of ‘bad items’ in the season.
- Draw the position of the ‘bad item’.

The number of covered items is drawn from a Dirichlet-Multinomial distribution using a non-informative prior,

$$\theta \sim \text{Dirichlet}(n + 1)$$

where $n$ is the vector of observed counts of seasons with each number of covered items (table 3).

This will output a vector of probabilities from which a single number is drawn for each iteration.

The number of ‘bad items’ is drawn from a Beta-Bernoulli distribution using a non-informative prior,

$$\gamma \sim \text{Beta}(a + 1, b + 1)$$

where $a$ and $b$ are the 8 and 9 from the table above. For each iteration, the number of ‘bad items’ is drawn from a Bernoulli distribution using $\gamma$. I’m restricting the simulation to have only one bad item but it could be expanded to more using a binomial.

In S16 Micronesia, Erik purchased an item and was offered a switch with another item. He switched and got Nachos instead of Jarred Octopus. We know there were two in that season, but both weren’t won. An edge case I’m willing to ignore right now.

Similar to step 1, the position of the bad item is drawn from a Dirichlet-Multinomial using the draws from steps 1 and 2,

$$\pi \sim \text{Dirichlet}(m + 1)$$

where $m$ is the vector of frequencies with which the bad item appeared at each position (table 6).

I’ll run 40,000 simulations. Each simulation can be considered a season.

Simulation code

```
library(tidyverse)
library(dirmult)

# set main data frame
df0 <- survivoR::auction_details |>
  filter(
    version == "US",
    auction_num == 1
  ) |>
  distinct(version_season, item, item_description, category, covered) |>
  group_by(version_season) |>
  summarise(
    n_items = n(),
    n_covered = sum(covered),
    n_bad = sum(category == "Bad item"),
    pos_first_bad = cumsum(covered)[which(category == "Bad item")[1]]
  ) |>
  left_join(
    survivoR::survivor_auction |>
      count(version_season, name = "n_cast"),
    by = "version_season"
  ) |>
  mutate(pos_first_bad = replace_na(pos_first_bad, 99))

# parameter to run the sim after a certain number of items are revealed
.after <- 0

# data
df <- df0 |>
  filter(pos_first_bad > .after & n_covered > .after)

# number of items
df_n_items <- df |>
  count(n_items)

# number of covered items
df_n_covered <- df |>
  count(n_covered)

# number of bad items
df_bad <- df |>
  count(n_bad) |>
  mutate(p = n/sum(n))

# position of first bad
df_pos_first_bad <- df |>
  filter(pos_first_bad < 99) |>
  drop_na() |>
  count(n_covered, pos_first_bad) |>
  group_by(n_covered) |>
  mutate(p = n/sum(n))

# simulation
n_covered <- table(df$n_covered)
n_bad <- table(df$n_bad)

# vector of freqs for position simulation
n_pos_obs <- map((.after+1):5, ~{
  i <- df_pos_first_bad |>
    filter(n_covered == .x) |>
    pull(pos_first_bad)
  n <- df_pos_first_bad |>
    filter(n_covered == .x) |>
    pull(n)
  obs <- rep(1, .x-.after)
  obs[i-.after] <- obs[i-.after] + n
  obs
})

# number of sims
n_sims <- 40000

# draw the probabilities
# theta
p_draws_n_covered <- rdirichlet(n = n_sims, alpha = n_covered+1)
# gamma
p_draws_n_bad <- rbeta(n_sims, n_bad[1]+1, n_bad[2]+1)

# draw the values
# y_covered
draws_n_covered <- apply(p_draws_n_covered, 1, function(x) sample((.after+1):5, 1, prob = x))
# y_bad
draws_n_bad <- rbinom(n = n_sims, 1, prob = p_draws_n_bad)

# sample position
pos <- rep(0, n_sims)

# run loop
fixed_n <- FALSE
i <- 3
equal_prob <- FALSE
for(k in 1:n_sims) {
  if(draws_n_bad[k] == 1) {
    if(!fixed_n) {
      i <- draws_n_covered[k]-.after
    }
    if(equal_prob) {
      pos[k] <- sample(1:i, 1, prob = rep(1/i, i))
    } else {
      p_k <- rdirichlet(1, alpha = n_pos_obs[[i]])
      pos[k] <- sample(1:i, 1, prob = p_k)
    }
  }
}

# get the probs
table(pos)/n_sims
```

The vector pos holds all the simulated positions of the ‘bad item’. I’ve coded position ‘0’ for the cases where there wasn’t a bad item, or interpreted as the proportion of seasons that didn’t have a bad item. This sits at 53% (9/17) which makes sense.

The sum of the positions 1 to 5 is 47% or the probability that there will be a ‘bad item’. The next part is interpreting the probabilities at each position.

The probability that the first covered item will be a ‘bad item’ is 25%, the highest of all positions, so you may think that’s the worst one to bid on. The reason the later positions are low is that the season may only have 1 covered item, or 2, or 3, etc. This probability also accounts for the unknown number of covered items.

We also have to consider how this plays out. Let’s assume you don’t bid on the first item. It turns out to be a good item, and then another covered item is presented. Should you bid on this one?

In this case, the probabilities change. We know with certainty that the first item was good, so the probability that it is bad goes to 0. We also know with certainty that at least 2 covered items are in the auction. So what are the probabilities now?

Thinking ahead briefly we could extend this to the 3rd item, 4th, etc. The table below details all the scenarios.

| After ‘n’ covered items are revealed | No bad item | 1st covered item | 2nd | 3rd | 4th | 5th |
|---|---|---|---|---|---|---|
| 0 | 53% | 25% | 9% | 9% | 3% | 1% |
| 1 | 50% | 23% | 18% | 7% | 2% | |
| 2 | 56% | 30% | 12% | 3% | | |
| 3 | 40% | 52% | 8% | | | |
| 4 | 98% | 2% | | | | |

Based on this, skipping the first one and bidding on the second is probably the best, but it’s much of a muchness.

If the number of covered items is known (or we simply make an assumption), it changes slightly.

Let’s assume there are 3 covered items since there are on average 3. The table is now…

| After ‘n’ covered items are revealed | No bad item | 1st covered item | 2nd | 3rd |
|---|---|---|---|---|
| 0 | 53% | 15% | 8% | 24% |
| 1 | 50% | 13% | 38% | |
| 2 | 55% | 45% | | |

In this case, go with the second one, maybe the first, but don’t go with the third.

The best strategy is to not bid on the covered items.

But that’s boring, huh? Ok, bid on the second one instead.

If the first one is the bad item then, going by our assumptions and past seasons, there is only one bad item per auction, so you should be safe from there. Of course, there could still be another; it just hasn’t happened before.

Have you ever rolled a standard 6-sided die a handful of times and tallied up the results? You’re likely to roll any given number 0, 1, or 2 times, but it could be more. If you estimate the probabilities from those results you’ll just get nonsense. We know the chance of rolling a given number is 1/6, not 0/6 or 2/6 or whatever.
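The die analogy is quick to check. With only 17 rolls (one per auction season), the empirical frequencies bounce all over the place even though each face truly has probability 1/6:

```r
# 17 rolls of a fair die: the empirical frequencies are noisy estimates of 1/6.
set.seed(1)
rolls <- sample(1:6, 17, replace = TRUE)
round(table(factor(rolls, levels = 1:6)) / 17, 2)
```

Different seeds give wildly different tables, which is the point: 17 observations is a tiny sample for estimating per-position probabilities.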

That’s kind of what’s happening here. If we get to season 40k we’ll probably find that the number of covered items and the positions of the bad items balance out. I could be wrong; there could be subtle trends we pick up on. I’ll follow up after season 40k.

The best strategy at the Survivor Auction (in my opinion) is to just slam down 500 bucks or however many bucks you’ve got on the first food item you see.

Anyway, that was a fun way to say it doesn’t really matter that much! At the very least it should give you a good idea of how you can use the `auction_details` data set.

The post Survivor Auction analysis: Should you bid on the first covered item? appeared first on Dan Oehm | Gradient Descending.

]]>This one rounds out another full year of Tidy Tuesday. A quick one looking at the number of lines of […]

The post Tidy Tuesday week 52: R Package Structure appeared first on Dan Oehm | Gradient Descending.

]]>This one rounds out another full year of Tidy Tuesday. A quick one looking at the number of lines of code by the major version number. On average, the higher the version number the more lines of code. Surprise, surprise!

Code Github

The post Tidy Tuesday week 52: R Package Structure appeared first on Dan Oehm | Gradient Descending.


Catching up on the lead-up to the holiday break. This was a quick one looking at the ratings of The Simpsons holiday episodes. It’s simple, but I like the use of the yellow, the blue, and the Simpsons font.

Code Github

The post Tidy Tuesday week 51: Holiday Episodes appeared first on Dan Oehm | Gradient Descending.


Approaching the holiday season, we’re looking at Christmas / holiday movies. The list is limited to movies that include the words Christmas, Holiday, Hanukkah, or Kwanzaa in the title, so it doesn’t include classic ‘Christmas’ movies like Die Hard or Home Alone, much to my disappointment.

I attempted to bring the Christmas theme to life by plotting the 20 most popular movies on the list as baubles on a Christmas Tree. The higher up the tree, the higher the IMDb rating. The ‘x’ variable is simply random. I tried to use a quantitative variable but nothing really worked. So in the interest of time, this is what you get. I like it though.

Code Github. The tree was generated with Midjourney.

The post Tidy Tuesday week 50: Holiday Movies appeared first on Dan Oehm | Gradient Descending.


This week we’re looking at life expectancy across the globe. I wanted to look at the data holistically across each country and then focus on Australia. This would be great for an interactive chart built in Shiny but that’s for another day.

The first chart shows how overall life expectancy has increased over the years 1970-2020 as well as the rank change for each country. It also highlights events that have impacted life expectancy in some countries e.g. Rwanda in the 90s.

The second highlights Australia, whose life expectancy for children born in a given year has increased from 71 to 84, and whose world ranking has climbed from 33rd to 5th.

Code Github

The post Tidy Tuesday week 49: Life Expectancy appeared first on Dan Oehm | Gradient Descending.


This week we’re looking at Doctor Who episode data compiled by Jonathan Kitt. This was a quick one charting the relationship between the episode rating and the number of viewers.

- The images were cropped using {cropcircles}
- The colour palette was created using {eyedroppeR}
- To get the facets in the right place I hacked it by adding -1 and 0 to the season numbers, then placed the logo in facet -1 and the subtitle text in facet 0.

The hardest part was finding who played the doctor in each season.

Code Github

The post Tidy Tuesday week 48: Doctor Who appeared first on Dan Oehm | Gradient Descending.


This week I played around with {ggpattern}. Very cool package.

I looked at the number of events that were held in person and those held online. To add more to the theme I created two tiles in Midjourney: 1) people attending a conference in person, and 2) people on a video conference.

Overall I like it but I do think it is a bit busy!

Code Github

The post Tidy Tuesday week 47: R-ladies chapter events appeared first on Dan Oehm | Gradient Descending.


Looking at Diwali sales this week. Diwali is all about lights so I decided to represent the data in a thematic way.

I made a chart that is similar to a tile chart but with light globes instead. For an extra touch, I represented the number of purchases made by men and women in the filament of the globe.

To create the globes, the easiest approach was to construct a data frame with the coordinates to draw the globe and use `geom_bspline0` from {ggforce}. I could have manually put in the coordinates; instead, I mapped out the locations of the points in Google Sheets and then read them into R. This made it easier because I could visualise what the globe was going to look like and tweak it if needed.

I did something similar for the mount for the globe.

To represent the total sales by zone and category, I treated sales as the weight of the globe: more sales mean a heavier globe, which hangs lower on the line. I initially also tried making the globe larger, but it didn’t look that good.

Getting the filament and numbers right was a bit fiddly but time well spent. I like how it turned out.

Code Github

The post Tidy Tuesday week 46: Diwali appeared first on Dan Oehm | Gradient Descending.


Confessional counts are highly scrutinised by the Survivor fandom. Fans don’t like seeing their favourites getting “purpled”. My feeling is that a perfectly balanced confessional sheet, where every player gets the same amount, comes at the expense of storytelling, and some variation should be expected. The story elements of the game, like a tribe losing immunity and going to Tribal Council, will be the focus after the immunity challenge as they talk strategy, so ultimately they will get more confessionals for the episode. Makes sense. In the post-merge part of the game everyone goes to tribal, so there are likely other factors influencing who gets the confessionals.

I’ll be testing the hypothesis that a castaway gets more confessionals given the following events:

- The castaway goes to **Tribal Council** in the pre-merge part of the game. This is where the strategy comes into play. It makes sense that the tribe going to Tribal Council gets the full focus after the immunity challenge.
- The castaway **wins the reward** challenge. Often when the tribe or individual wins a reward they celebrate with food and drinks, and they get more confessionals telling you how great it is.
- The castaway is **chosen to participate in the reward**. Post-merge, winners often bring someone along with them and are separated from the others. This is likely similar to winning the reward.
- The castaway **finds an advantage**. If a player finds an advantage there is usually a long scene where they replay the events. Makes sense. It would be strange to edit out the effort that went into finding it; it would trivialize the whole process and a moment that could change the game.

All data and code are available on Github.

- Data: survivoR R package
- Code: Confessionals and tribals

Do castaways get more confessionals for the episode when they…

- Visit tribal council? **Yes**.
- Win reward? **Yes**, post-merge but not pre-merge.
- Chosen to participate in the reward? **Yes**.
- Find an advantage? **Yes**.

So, should we expect balanced confessionals in Survivor? No, we shouldn’t expect them to be. A decision has been made to focus on strategy and storytelling with the little time available to cram everything in. You might want them to be perfectly balanced from a philosophical point of view, but shouldn’t expect them to be.

There are a few things to consider when setting up the data and the model.

- I’m going to test if the castaway gets more confessionals in the pre- and post-merge parts of the game. Given that each player goes to Tribal Council post-merge and the game is quite different, I’ll fit two separate models.
- I will be using 44 seasons of the US version of Survivor. It could be done for the other versions as well; however, they come with their own nuances, and incorporating them into one model would increase the variation. It may be interesting, but for this analysis I will focus on the US only.
- Episodes vary in length, so I will standardize the number of confessionals per person to 60 minutes before modeling. They will be converted back for comparison after modeling.
- I’ll use counts rather than time since I only have the timing for a few seasons. (If you’d like to help get the times for past seasons please get in touch!) It would be a worthwhile investigation for a second post, using the results from this one as prior information.
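The standardization step is simple. Here is a sketch with a hypothetical data frame (the column names `confessionals` and `episode_length` are made up for illustration):

```r
# Hypothetical episode-level data
df <- data.frame(
  castaway       = c("A", "B"),
  confessionals  = c(6, 6),
  episode_length = c(90, 60)   # minutes
)

# Standardize counts to confessionals per 60 minutes
df$y <- df$confessionals * 60 / df$episode_length
df$y   # 4 for the 90-minute episode, 6 for the 60-minute one

# Converting back to the episode scale simply reverses the step
df$confessionals_back <- df$y * df$episode_length / 60
```

So a castaway with 6 confessionals in a 90-minute episode counts the same as one with 4 in a standard hour.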

I want to quickly examine how confessionals are distributed to inform the model choice and prime my expectations.

The mean number of confessionals for each chart is:

- 2.8 confessionals per person per episode
- 2.4 pre-merge
- 3.8 post-merge
- 33 confessionals per episode
- 451 confessionals per season

Confessionals at the episode x person level have a very Gamma-like distribution, which should be reflected in the model choice.

The tail in the episode chart shows the effect of the longer episodes which is the motivation for standardizing the counts to 60 minutes.

To test the hypothesis I’ll fit a Bayesian Gamma regression model with a log link, where the response is

y = confessional count / episode length × 60

Gamma distributions don’t like 0’s very much, so I’ll add 1 to the response to fit the model and subtract 1 when making predictions. As long as the model fits the data well you’ll still make the same conclusions.
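The reason for the shift is that a Gamma density with shape > 1 has zero density at 0, so an unshifted zero count would have zero likelihood. A quick base-R check (the shape value 2.5 is just an illustrative number near the fitted estimates):

```r
# Density at 0 is exactly 0 for shape > 1, i.e. -Inf on the log scale
dgamma(0, shape = 2.5, rate = 1)               # 0
dgamma(0, shape = 2.5, rate = 1, log = TRUE)   # -Inf

# Hypothetical standardized counts: shift up by 1 before fitting...
y     <- c(0, 2, 5)
y_fit <- y + 1
# ...and subtract 1 from any predictions afterwards
```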

The priors on the coefficients for tribal council, winning a reward, chosen for reward, and finding an advantage are centered around 0 so if there is a genuine difference we should pick it up.

The number of players left in the game needs to be controlled for, since the more players there are, the fewer confessionals each individual is likely to receive given the finite time of the episode. This will be one of the predictors.

The model formula is:

```
y ~ tribal_council + reward + found_adv + n_cast
```

The model is fit with the {brms} package in R.

```
prior_b0 <- prior(normal(2.4, 0.75), class = "Intercept")
prior_b1 <- prior(normal(0, 2), class = "b")
priors <- c(prior_b0, prior_b1)

mod <- brm(
  y ~ tribal_council + reward + found_adv + n_cast,
  data = df_pre_merge,
  family = "gamma",
  prior = priors
)
```

Output

```
Family: gamma
Links: mu = log; shape = identity
Formula: y ~ tribal_council + reward + found_adv + n_cast
Data: df_pre_merge (Number of observations: 4368)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept 1.94 0.06 1.82 2.06 1.00 6288 3243
tribal_council 0.33 0.02 0.29 0.37 1.00 5313 2610
reward -0.05 0.02 -0.09 -0.01 1.00 5586 2729
found_adv 0.58 0.06 0.47 0.70 1.00 6391 2815
n_cast -0.05 0.00 -0.05 -0.04 1.00 6091 3160
Family Specific Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
shape 2.65 0.05 2.55 2.75 1.00 5692 3074
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
```

The posterior predictive check looks very neat, confirming the Gamma was a good choice.

The model uses a log link so the coefficients can’t be directly interpreted, but it does show a strong difference between those who go to tribal council and those who don’t, and for those who find an advantage. Those who won the reward challenge were no different from those who lost. The number of castaways left in the game was significant, meaning each additional castaway receives proportionally fewer confessionals.
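Because of the log link, exponentiating a coefficient gives a multiplicative effect on the expected response. A quick sketch using the point estimates from the output above (keep in mind the model is fit on the shifted response y + 1, so the percentages are only approximate on the raw confessional scale):

```r
# Posterior means of the pre-merge coefficients (log scale, from the output)
coefs <- c(tribal_council = 0.33, reward = -0.05,
           found_adv = 0.58, n_cast = -0.05)

# Multiplicative effects implied by the log link
round(exp(coefs), 2)
# tribal_council ~1.39: roughly 39% more confessionals per hour
# found_adv      ~1.79: roughly 79% more
# n_cast         ~0.95: ~5% fewer per additional castaway left in the game
```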

Parameter | Summary | Is there a difference? | 95% prediction interval |
---|---|---|---|
Found an advantage | Castaways that find advantages / hidden immunity idols tend to get, on average, 2-3 more confessionals as they replay the events. | Yes | (1.8, 3.1) |
Attended Tribal Council | Castaways that attend tribal council get, on average, 1 more confessional than others. | Yes | (1.1, 1.3) |
Won the reward challenge | There is some evidence that the tribe that wins the reward challenge receives slightly fewer confessionals. | No | (-0.27, 0) |
Number of players (control) | The more castaways in the game the fewer confessionals each castaway is likely to get, ~0.14 per hour. | Yes | (0.12, 0.16) |

Note that the difference in confessionals per person gets larger when there are fewer players in the game.

Post-merge everyone goes to Tribal Council and the nature of the game changes. The model I’ll be fitting here is as follows:

```
prior_b0 <- prior(normal(3.8, 1), class = "Intercept")
prior_b1 <- prior(normal(0, 2), class = "b")
priors <- c(prior_b0, prior_b1)

mod <- brm(
  y ~ reward + chosen_for_reward + found_adv + n_cast,
  data = df_post_merge,
  family = "gamma",
  prior = priors
)
```

Note the prior for the intercept has been updated to something more relevant for the post-merge stage of the game.

Output:

```
Family: gamma
Links: mu = log; shape = identity
Formula: y ~ reward + chosen_for_reward + found_adv + n_cast
Data: df_post_merge (Number of observations: 2362)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept 2.08 0.04 2.01 2.17 1.00 5781 3067
reward 0.12 0.03 0.06 0.19 1.00 4980 3041
chosen_for_reward 0.11 0.05 0.01 0.21 1.00 5469 3136
found_adv 0.56 0.07 0.42 0.71 1.00 4403 2652
n_cast -0.04 0.00 -0.05 -0.04 1.00 5700 3306
Family Specific Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
shape 2.53 0.07 2.39 2.66 1.00 5338 2707
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
```

Again, the posterior predictive check looks great, confirming a good fit.

All of the coefficients show a clear difference in means in the post-merge stage. Unlike the pre-merge stage, winning the reward challenge is associated with increased confessionals, which makes sense. Being chosen for the reward yields a similar bump. However, both are dominated by finding an advantage, which is associated with 3-6 more confessionals on average. My theory is that in the later stage of the game advantages become more important, and a well-played advantage can turn the game on its head, so there is a lot of focus on the advantages.

Parameter | Summary | Is there a difference? | Prediction interval |
---|---|---|---|
Found an advantage | Castaways that find advantages / hidden immunity idols tend to get, on average, 3-6 more confessionals as they replay the events. | Yes | (3, 6) |
Won the reward challenge | If the castaway wins the reward challenge they receive, on average, ~1 more confessional than others as they enjoy their feast. | Yes | (0.4, 1.1) |
Chosen to participate in the reward | Similar to winning the reward, if they are chosen to participate they receive 0-1 more confessionals. | Yes | (0.1, 1.3) |
Number of players (control) | The more castaways in the game the fewer confessionals each castaway is likely to get, ~0.12 per hour. | Yes | (0.2, 0.3) |

With these models, we can adjust the confessional count for a season, and see who truly received more confessionals than other players when the story effects of the game are removed.

I’ll test this on Season 44. The methodology is:

- Estimate the expected number of confessionals per 60 mins with the above models for the pre- and post-merge stages of the game, by episode and castaway.
- Scale the number of confessionals to the length of the episode.
- Aggregate to find the total expected and observed confessionals per castaway for the season.
- Calculate the index relative to the expected: index = observed / expected − 1.
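The index calculation itself is just observed over expected. A sketch with hypothetical totals (the numbers below are made up, not from season 44):

```r
# Hypothetical season totals for one castaway
observed <- 14   # confessionals actually received
expected <- 10   # sum of model-based expectations across episodes

index <- observed / expected - 1
sprintf("%+.0f%%", index * 100)   # "+40%"
```

A positive index means the castaway received more confessionals than the models predicted given their tribals, rewards, and advantages; a negative index means they were under-edited relative to expectation.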

Yam Yam received 35 more confessionals than expected and Lauren received 21 fewer than expected on the extremes. Overall most castaways received more confessionals than expected with respect to other seasons. It’s really just Heidi and Lauren who didn’t receive the confessionals expected given their time in the game.

The table below compares the calculations assuming equal distribution with the model-based adjustment. In some ways, the model-adjusted index looks more extreme: under the assumption of equal distribution Yam Yam’s index is +56%, but the adjusted index is +72%. Tika got slammed in the pre-merge and Yam Yam didn’t find any advantages; with those things factored in, he received 72% more. Conclusion – the camera loved him. But overall it’s a better representation.

There are some big differences between the indexes for some castaways, Matt for example. They really doubled down on the showmance, which would explain it.

The model-based adjustment is better for seeing which castaways are getting more confessionals than others once the story and game elements are factored in. It is also useful for comparing trends across seasons. For example, castaways in one season could have received more confessionals than similar castaways from other seasons, in which case all castaways for that season would index proportionally higher. In season 44, only 5 players received less than expected, showing that overall it was a good season from a confessionals-per-person point of view.

Should we expect balanced confessionals in Survivor? Nah. There’s a clear difference between those that go to tribal, find an advantage, and win reward challenges or participate in them. The editors have chosen to focus on strategy and storytelling, over faffing around on the beach with the little time they have to cram everything in (not to say that faffing around on the beach isn’t fun or needed to understand their personalities and tribe connections).

If everyone went to tribal the same amount of times and found the same amount of advantages and won/went on the same number of rewards, the expectation would be the same for everyone. But would we actually expect to see the confessional counts, as arbitrary as they are, equal for everyone? I don’t think so. There’s still an underlying narrative for the episode and the season which would be diluted if they were.

This is my opinion more than evidence backed by data, but I think we can infer that if the “purpled” players were pivotal to the story beats of the season they wouldn’t be “purpled”. We should be reassured they’re telling the best story they can.

Finally, the differences in the counts associated with tribals and so on are estimated from the observed counts across the 44 seasons. They measure the editing decisions made by production. If for philosophical reasons you don’t agree those going to tribal pre-merge or those who find advantages should receive more confessionals, then, oh well. Since those differences exist I don’t think we should expect perfect balance.

There is a lot of code for this analysis, and it is available in full on Github.

I’ve visualised season 44 in a slightly different way, showing how many confessionals each castaway received against their expected values. With all castaways on the same scale it pulls things into perspective. It’s important to keep in mind how long each castaway was in the game, though that isn’t shown here.

The post Should we expect balanced confessional counts in Survivor? appeared first on Dan Oehm | Gradient Descending.
