Covid Vaccine
December 2020 vaccine distribution and administration began. I started trapping the daily spreadsheet from the state health department that tracked progress. This blog entry is really for prototyping some of the data cleanup and displays that I will incorporate into my shiny app.
Let’s take a look at data issues.
df %>%
filter(!is.na(Pct_given)) %>%
ggplot(aes(x=Pct_given)) +
geom_histogram()+
labs(x="Percent Distributed Administered",
title="Distribution of Administered Vaccine")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Well, that’s irritating. Some folks are giving shots they don’t have.
# Negative doses
df %>%
group_by(County) %>%
mutate(min_admin=min(Daily_adm)) %>%
ungroup() %>%
arrange(County) %>%
filter(min_admin<0) %>%
ggplot(aes(x=Date, y=Doses_Admin, color=County)) +
geom_line()+
labs(x="Date", y="Cumulative Doses",
title="Administered Vaccine")
So we have a few counties where the cumulative numbers have decreased. That is disappointing.
# how fair is the distribution?
df %>%
group_by(County) %>%
mutate(fairness=10000*Doses_Distrib/Pop_Adult) %>%
summarise(fairness=max(fairness)) %>%
ggplot(aes(x=fairness)) +
geom_histogram()+
labs(x="Per Capita Doses Distributed",
title="Distribution of Administered Vaccine")
## `summarise()` ungrouping output (override with `.groups` argument)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 1 rows containing non-finite values (stat_bin).
df %>%
filter(County!="Texas") %>%
group_by(County) %>%
mutate(fairness=10000*Doses_Distrib/Pop_Adult) %>%
summarise(fairness=max(fairness), Pop_Adult=max(Pop_Adult)) %>%
ggplot(aes(x=fairness, y=Pop_Adult)) +
geom_point() +
labs(x="Per Capita Doses Distributed (Doses per 10,000)",
y="County Population",
title="Distribution of Administered Vaccine")
## `summarise()` ungrouping output (override with `.groups` argument)
## Warning: Removed 1 rows containing missing values (geom_point).
df %>%
filter(County!="Texas") %>%
group_by(County) %>%
summarise(Doses_Distrib=max(Doses_Distrib), Pop_Adult=max(Pop_Adult)) %>%
ggplot(aes(x=Doses_Distrib, y=Pop_Adult)) +
geom_point() +
scale_x_log10() +
scale_y_log10() +
labs(x="Log Doses Distributed",
y="Log County Population",
title="Distribution of Administered Vaccine")
## `summarise()` ungrouping output (override with `.groups` argument)
## Warning: Transformation introduced infinite values in continuous x-axis
## Warning: Removed 1 rows containing missing values (geom_point).
Looks pretty fair. At low numbers can see the effect of minimum vaccine lot size.
# % over time
df %>%
group_by(County) %>%
mutate(Pct_Coverage=100*Doses_Admin/Pop_Adult) %>%
ungroup() %>%
filter(Pct_Coverage<20) %>%
ggplot(aes(x=Date, y=Pct_Coverage, color=County)) +
geom_line() +
theme(legend.position = "none") +
labs(x="Date",
y="Percent Coverage",
title="Percent of each County Vaccinated")
Cottle county jumps to almost 60%, the rest are below 6%.
# How to best calculate the date when everyone is vaccinated?
# Calculate an average inoculation rate over the last 7 days
this_day <- lubridate::today()
df %>%
filter(Date>last(Date)-7) %>%
group_by(County) %>%
summarize(Inoculation_rate=mean(Daily_adm, na.rm=TRUE),
Pop_Adult=last(Pop_Adult),
People_one_dose=last(People_one_dose)) %>%
mutate(Finish_date=as_date(this_day+((Pop_Adult-People_one_dose)/Inoculation_rate+28))) %>%
mutate(DaysFromNow=((Pop_Adult-People_one_dose)/Inoculation_rate+28)) %>%
ggplot(aes(x=DaysFromNow/7)) +
geom_histogram() +
labs(x="Weeks from Today",
title="Weeks from present to reach full vaccination by county",
subtitle="Rate derived from last 7 days")
## `summarise()` ungrouping output (override with `.groups` argument)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 3 rows containing non-finite values (stat_bin).