The {simulist} R package can generate line list data (sim_linelist()), contact tracing data (sim_contacts()), or both (sim_outbreak()). By default the line list produced by sim_linelist() and sim_outbreak() contains 13 columns. Some post-simulation data wrangling may be needed before the simulated epidemiological case data can be used in certain applications. This vignette demonstrates some common data wrangling tasks that may be performed on simulated line list or contact tracing data.
This vignette provides data wrangling examples using both functions available in the R language (commonly called “base R”) as well as using tidyverse R packages, which are commonly applied to data science tasks in R. The tidyverse examples are shown by default, but select the “Base R” tab to see the equivalent functionality using base R. There are many other tools for wrangling data in R which are not covered by this vignette (e.g. {data.table}).
See these great resources for more information on general data wrangling in R:
To simulate an outbreak we will use the sim_outbreak() function from the {simulist} R package. If you are unfamiliar with the {simulist} package or the sim_outbreak() function, the Get Started vignette is a great place to start.
First we load in some data that is required for the outbreak simulation. Data on epidemiological parameters and distributions are read from the {epiparameter} R package.
# create contact distribution (not available from {epiparameter} database)
contact_distribution <- epiparameter(
disease = "COVID-19",
epi_name = "contact distribution",
prob_distribution = create_prob_distribution(
prob_distribution = "pois",
prob_distribution_params = c(mean = 2)
)
)
#> Citation cannot be created as author, year, journal or title is missing
# create infectious period (not available from {epiparameter} database)
infectious_period <- epiparameter(
disease = "COVID-19",
epi_name = "infectious period",
prob_distribution = create_prob_distribution(
prob_distribution = "gamma",
prob_distribution_params = c(shape = 1, scale = 1)
)
)
#> Citation cannot be created as author, year, journal or title is missing
# get onset to hospital admission from {epiparameter} database
onset_to_hosp <- epiparameter_db(
disease = "COVID-19",
epi_name = "onset to hospitalisation",
single_epiparameter = TRUE
)
#> Using Linton N, Kobayashi T, Yang Y, Hayashi K, Akhmetzhanov A, Jung S, Yuan
#> B, Kinoshita R, Nishiura H (2020). "Incubation Period and Other
#> Epidemiological Characteristics of 2019 Novel Coronavirus Infections
#> with Right Truncation: A Statistical Analysis of Publicly Available
#> Case Data." _Journal of Clinical Medicine_. doi:10.3390/jcm9020538
#> <https://doi.org/10.3390/jcm9020538>..
#> To retrieve the citation use the 'get_citation' function
# get onset to death from {epiparameter} database
onset_to_death <- epiparameter_db(
disease = "COVID-19",
epi_name = "onset to death",
single_epiparameter = TRUE
)
#> Using Linton N, Kobayashi T, Yang Y, Hayashi K, Akhmetzhanov A, Jung S, Yuan
#> B, Kinoshita R, Nishiura H (2020). "Incubation Period and Other
#> Epidemiological Characteristics of 2019 Novel Coronavirus Infections
#> with Right Truncation: A Statistical Analysis of Publicly Available
#> Case Data." _Journal of Clinical Medicine_. doi:10.3390/jcm9020538
#> <https://doi.org/10.3390/jcm9020538>..
#> To retrieve the citation use the 'get_citation' function
The seed is set to ensure the output of the vignette is consistent. When using {simulist}, setting the seed is not required unless you need to simulate the same line list multiple times.
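As a minimal sketch of why the seed matters, the base R snippet below draws the same random numbers twice by resetting the seed. The seed value 123 and the use of rbinom() are illustrative assumptions and are not taken from the simulation code in this vignette.

```r
# set the seed, then draw five Bernoulli samples
set.seed(123)
first_draw <- rbinom(n = 5, size = 1, prob = 0.5)

# resetting the same seed reproduces exactly the same draws
set.seed(123)
second_draw <- rbinom(n = 5, size = 1, prob = 0.5)

# compare the two sets of draws
identical(first_draw, second_draw)
```

The same principle applies to any function that relies on R's random number generator, including the simulation functions in {simulist}.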
The event date columns in simulated line lists are stored as double-precision numbers, meaning they hold the exact event times. <Date> objects are usually stored as whole numbers of days, as explained in ?Dates, and the print() function for <Date>s does not show that a value may be part way through a day.
Here we show this by printing the date of symptom onset for the simulated data, and then calling unclass() on it to show how it is stored internally.
linelist$date_onset
#> [1] "2023-01-01" "2023-01-01" "2023-01-01" "2023-01-01" "2023-01-02"
#> [6] "2023-01-01" "2023-01-01" "2023-01-01" "2023-01-01" "2023-01-01"
#> [11] "2023-01-02" "2023-01-02" "2023-01-02"
unclass(linelist$date_onset)
#> [1] 19358.00 19358.24 19358.04 19358.38 19359.16 19358.85 19358.14 19358.70
#> [9] 19358.41 19358.97 19359.40 19359.47 19359.51
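The same behaviour can be reproduced without simulating any data. The sketch below builds a <Date> by hand from a fractional number of days since 1970-01-01 (the value 19358.85 appears in the output above); the object names are illustrative.

```r
# a <Date> holding a fractional number of days since 1970-01-01
onset <- structure(19358.85, class = "Date")

# format() (which print() relies on) shows only the calendar day
onset_label <- format(onset)

# the fraction of the day is still stored in the underlying number
onset_stored <- unclass(onset)

onset_label
onset_stored
```

Because the printed label hides the fractional part, two dates that print identically are not necessarily equal.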
The censor_linelist() function can be used after sim_linelist() to censor the event dates to a given precision. Here we show censoring the dates to daily and weekly intervals. The daily censored dates will look the same as before, but any value after the decimal point is set to zero. The weekly censored dates will be printed differently.
daily_cens_linelist <- censor_linelist(linelist, interval = "daily")
head(daily_cens_linelist)
#> id case_name case_type sex age date_onset date_reporting
#> 1 1 Fabian Mrazik confirmed m 90 2023-01-01 2023-01-01
#> 2 3 Ashley Martinez confirmed f 71 2023-01-01 2023-01-01
#> 3 4 Tia Vu probable f 48 2023-01-01 2023-01-01
#> 4 5 Abdul Majeed el-Saleh confirmed m 77 2023-01-01 2023-01-01
#> 5 6 Courtney Flood suspected f 83 2023-01-02 2023-01-02
#> 6 7 Joseph Jiron suspected m 56 2023-01-01 2023-01-01
#> date_admission outcome date_outcome date_first_contact date_last_contact
#> 1 <NA> recovered <NA> <NA> <NA>
#> 2 2023-01-08 died 2023-01-10 2022-12-26 2023-01-06
#> 3 <NA> recovered <NA> 2022-12-30 2023-01-05
#> 4 <NA> recovered <NA> 2022-12-31 2023-01-08
#> 5 <NA> recovered <NA> 2022-12-26 2023-01-04
#> 6 <NA> recovered <NA> 2022-12-28 2023-01-03
#> ct_value
#> 1 21.9
#> 2 22.7
#> 3 NA
#> 4 27.4
#> 5 NA
#> 6 NA
weekly_cens_linelist <- censor_linelist(linelist, interval = "weekly")
head(weekly_cens_linelist)
#> id case_name case_type sex age date_onset date_reporting
#> 1 1 Fabian Mrazik confirmed m 90 2022-W52 2022-W52
#> 2 3 Ashley Martinez confirmed f 71 2022-W52 2022-W52
#> 3 4 Tia Vu probable f 48 2022-W52 2022-W52
#> 4 5 Abdul Majeed el-Saleh confirmed m 77 2022-W52 2022-W52
#> 5 6 Courtney Flood suspected f 83 2023-W01 2023-W01
#> 6 7 Joseph Jiron suspected m 56 2022-W52 2022-W52
#> date_admission outcome date_outcome date_first_contact date_last_contact
#> 1 <NA> recovered <NA> <NA> <NA>
#> 2 2023-W01 died 2023-W02 2022-W52 2023-W01
#> 3 <NA> recovered <NA> 2022-W52 2023-W01
#> 4 <NA> recovered <NA> 2022-W52 2023-W01
#> 5 <NA> recovered <NA> 2022-W52 2023-W01
#> 6 <NA> recovered <NA> 2022-W52 2023-W01
#> ct_value
#> 1 21.9
#> 2 22.7
#> 3 NA
#> 4 27.4
#> 5 NA
#> 6 NA
See ?censor_linelist for more information on how to use this function.
Using censor_linelist() avoids common mistakes when working with <Date> objects. For example, rounding a date that is more than halfway through a day will mistakenly result in the next day, as the output of round() shows here.
linelist$date_onset
#> [1] "2023-01-01" "2023-01-01" "2023-01-01" "2023-01-01" "2023-01-02"
#> [6] "2023-01-01" "2023-01-01" "2023-01-01" "2023-01-01" "2023-01-01"
#> [11] "2023-01-02" "2023-01-02" "2023-01-02"
round(linelist$date_onset)
#> [1] "2023-01-01" "2023-01-01" "2023-01-01" "2023-01-01" "2023-01-02"
#> [6] "2023-01-02" "2023-01-01" "2023-01-02" "2023-01-01" "2023-01-02"
#> [11] "2023-01-02" "2023-01-02" "2023-01-03"
daily_cens_linelist$date_onset
#> [1] "2023-01-01" "2023-01-01" "2023-01-01" "2023-01-01" "2023-01-02"
#> [6] "2023-01-01" "2023-01-01" "2023-01-01" "2023-01-01" "2023-01-01"
#> [11] "2023-01-02" "2023-01-02" "2023-01-02"
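For comparison, the sketch below contrasts round() with taking the floor of the underlying number, which keeps each event on the day it occurred. It only illustrates the pitfall with hand-made dates and is not a description of how censor_linelist() is implemented.

```r
# two event times on 2023-01-01: one before midday, one after
onset <- structure(c(19358.24, 19358.85), class = "Date")

# round() pushes the afternoon event onto the next day
rounded <- round(onset)

# floor() is not defined for <Date> objects, so take the floor of the
# underlying number of days and convert the result back to a <Date>
floored <- as.Date(floor(unclass(onset)), origin = "1970-01-01")

format(rounded)
format(floored)
```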
The censored line list dates can be used with methods that account for censoring when fitting delay distributions such as {primarycensored}.
In this section we’ll show how case line lists and contact tracing data sets can be subset to represent under-reporting, a common feature of real-world outbreak data, especially in resource-limited settings.
In the line list each case is unlinked (i.e. information on each row is independent of information on every other row). This means we can remove rows in the line list without having to augment any remaining rows. We assume for this example that the probability of being reported, and thus included in the line list, is independent of case type, sex, age and the phase of the outbreak.
For this example we’ll assume the case reporting probability in the line list is 50%.
linelist %>%
filter(as.logical(rbinom(n(), size = 1, prob = 0.5)))
#> id case_name case_type sex age date_onset date_reporting date_admission
#> 1 1 Fabian Mrazik confirmed m 90 2023-01-01 2023-01-01 <NA>
#> 2 3 Ashley Martinez confirmed f 71 2023-01-01 2023-01-01 2023-01-08
#> 3 4 Tia Vu probable f 48 2023-01-01 2023-01-01 <NA>
#> 4 6 Courtney Flood suspected f 83 2023-01-02 2023-01-02 <NA>
#> 5 7 Joseph Jiron suspected m 56 2023-01-01 2023-01-01 <NA>
#> 6 8 Kevin Liddle suspected m 39 2023-01-01 2023-01-01 <NA>
#> 7 21 Katlyn Nelson probable f 36 2023-01-02 2023-01-02 <NA>
#> outcome date_outcome date_first_contact date_last_contact ct_value
#> 1 recovered <NA> <NA> <NA> 21.9
#> 2 died 2023-01-10 2022-12-26 2023-01-06 22.7
#> 3 recovered <NA> 2022-12-30 2023-01-05 NA
#> 4 recovered <NA> 2022-12-26 2023-01-04 NA
#> 5 recovered <NA> 2022-12-28 2023-01-03 NA
#> 6 recovered <NA> 2022-12-31 2023-01-03 NA
#> 7 recovered <NA> 2023-01-01 2023-01-03 NA
idx <- as.logical(rbinom(n = nrow(linelist), size = 1, prob = 0.5))
linelist[idx, ]
#> id case_name case_type sex age date_onset date_reporting
#> 2 3 Ashley Martinez confirmed f 71 2023-01-01 2023-01-01
#> 3 4 Tia Vu probable f 48 2023-01-01 2023-01-01
#> 5 6 Courtney Flood suspected f 83 2023-01-02 2023-01-02
#> 8 9 Rutaiba el-Raad confirmed f 68 2023-01-01 2023-01-01
#> 9 10 Jaime Middleton suspected m 1 2023-01-01 2023-01-01
#> 10 14 Emily Fyffe confirmed f 16 2023-01-01 2023-01-01
#> 12 21 Katlyn Nelson probable f 36 2023-01-02 2023-01-02
#> 13 24 Nicholas Rentie suspected m 49 2023-01-02 2023-01-02
#> date_admission outcome date_outcome date_first_contact date_last_contact
#> 2 2023-01-08 died 2023-01-10 2022-12-26 2023-01-06
#> 3 <NA> recovered <NA> 2022-12-30 2023-01-05
#> 5 <NA> recovered <NA> 2022-12-26 2023-01-04
#> 8 <NA> recovered <NA> 2022-12-29 2023-01-01
#> 9 <NA> recovered <NA> 2022-12-26 2023-01-02
#> 10 2023-01-02 recovered <NA> 2022-12-30 2023-01-02
#> 12 <NA> recovered <NA> 2023-01-01 2023-01-03
#> 13 <NA> recovered <NA> 2022-12-28 2023-01-04
#> ct_value
#> 2 22.7
#> 3 NA
#> 5 NA
#> 8 24.2
#> 9 NA
#> 10 21.3
#> 12 NA
#> 13 NA
The above example randomly samples rows in the line list using the reporting probability, resulting in a different number of cases being kept each time the code is run. To subset the line list data and get the same number of rows (i.e. cases) returned each time, slice_sample() can be used instead.
linelist %>%
dplyr::slice_sample(prop = 0.5) %>%
dplyr::arrange(id)
#> id case_name case_type sex age date_onset date_reporting
#> 1 1 Fabian Mrazik confirmed m 90 2023-01-01 2023-01-01
#> 2 4 Tia Vu probable f 48 2023-01-01 2023-01-01
#> 3 5 Abdul Majeed el-Saleh confirmed m 77 2023-01-01 2023-01-01
#> 4 6 Courtney Flood suspected f 83 2023-01-02 2023-01-02
#> 5 21 Katlyn Nelson probable f 36 2023-01-02 2023-01-02
#> 6 24 Nicholas Rentie suspected m 49 2023-01-02 2023-01-02
#> date_admission outcome date_outcome date_first_contact date_last_contact
#> 1 <NA> recovered <NA> <NA> <NA>
#> 2 <NA> recovered <NA> 2022-12-30 2023-01-05
#> 3 <NA> recovered <NA> 2022-12-31 2023-01-08
#> 4 <NA> recovered <NA> 2022-12-26 2023-01-04
#> 5 <NA> recovered <NA> 2023-01-01 2023-01-03
#> 6 <NA> recovered <NA> 2022-12-28 2023-01-04
#> ct_value
#> 1 21.9
#> 2 NA
#> 3 27.4
#> 4 NA
#> 5 NA
#> 6 NA
slice_sample() can reorder rows, so we order by ID to keep the cases in order of symptom onset date.
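A fixed-size subset can also be taken in base R. The sketch below is one possible approach, using a small hand-made data frame (toy_linelist is a hypothetical stand-in for the simulated line list): it samples row indices without replacement and sorts them so the original row order is preserved.

```r
set.seed(1)

# hypothetical stand-in for a simulated line list
toy_linelist <- data.frame(
  id = c(1, 3, 4, 5, 6, 7),
  case_type = c(
    "confirmed", "confirmed", "probable",
    "confirmed", "suspected", "suspected"
  ),
  stringsAsFactors = FALSE
)

# number of rows to keep for a 50% reporting proportion (rounded down)
n_keep <- floor(0.5 * nrow(toy_linelist))

# sample row indices without replacement and sort to keep the row order
keep <- sort(sample.int(nrow(toy_linelist), size = n_keep))
toy_subset <- toy_linelist[keep, ]
toy_subset
```

Sorting the sampled indices plays the same role as ordering by ID after slice_sample().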
Next we turn to under-reporting in contact tracing data. Unlike line list data, contact tracing data is linked: the direction of contact, and possibly transmission, is recorded in the $from and $to columns.
For this example we will assume that the contact tracing under-reporting applies to both infections and contacts that were not infected. However, the same method could be applied for under-reporting of the transmission chain by first subsetting to infections only (see the vis-linelist.Rmd vignette for an example).
We plot the full contact network so it can be compared to the contact networks with under-reporting plotted below.
epicontacts <- make_epicontacts(
linelist = linelist,
contacts = contacts,
id = "case_name",
from = "from",
to = "to",
directed = TRUE
)
plot(epicontacts)
First we randomly sample who is not reported in the outbreak data. For this example we assume the pool of people that can be unreported is everyone in the contact network (infections and contacts), and assume a 50% reporting probability.
all_contacts <- unique(c(contacts$from, contacts$to))
not_reported <- sample(x = all_contacts, size = 0.5 * length(all_contacts))
not_reported
#> [1] "Katlyn Nelson" "Jin Fu" "Breanna Hofbauer"
#> [4] "Ashley Martinez" "Emily Abo" "Forrest Anderson"
#> [7] "Miguel Oyebi" "Shabeeba el-Younes" "Sarah Bridwell"
#> [10] "Rutaiba el-Raad" "Yvonne Howard" "Kevin Liddle"
#> [13] "Nicholas Rentie"
Next we subset the contact tracing data by removing infectees that are not reported. Because the contact tracing data is linked across rows, we also need to set $from to NA for any secondary infections or contacts caused by an unreported individual.
# make copy of contact tracing data for under-reporting
contacts_ur <- contacts
for (person in not_reported) {
contacts_ur <- contacts_ur[contacts_ur$to != person, ]
contacts_ur[contacts_ur$from %in% person, "from"] <- NA
}
head(contacts_ur)
#> from to age sex date_first_contact
#> 3 Fabian Mrazik Tia Vu 48 f 2022-12-30
#> 4 Fabian Mrazik Abdul Majeed el-Saleh 77 m 2022-12-31
#> 5 <NA> Courtney Flood 83 f 2022-12-26
#> 6 <NA> Joseph Jiron 56 m 2022-12-28
#> 9 Abdul Majeed el-Saleh Jaime Middleton 1 m 2022-12-26
#> 13 <NA> Emily Fyffe 16 f 2022-12-30
#> date_last_contact was_case status
#> 3 2023-01-05 TRUE case
#> 4 2023-01-08 TRUE case
#> 5 2023-01-04 TRUE case
#> 6 2023-01-03 TRUE case
#> 9 2023-01-02 TRUE case
#> 13 2023-01-02 TRUE case
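The loop above can also be expressed with vectorised subsetting. The sketch below applies the same two steps to a small hand-made contact table; toy_contacts and the single-letter names are hypothetical and only serve to make the result easy to check by eye.

```r
# hypothetical contact table: A infects B and C, B infects D and E,
# C infects F
toy_contacts <- data.frame(
  from = c("A", "A", "B", "B", "C"),
  to = c("B", "C", "D", "E", "F"),
  stringsAsFactors = FALSE
)
not_reported <- c("B", "F")

# step 1: drop rows where the unreported person is the contact ($to)
toy_ur <- toy_contacts[!toy_contacts$to %in% not_reported, ]

# step 2: blank out the source ($from) where it is an unreported person
toy_ur$from[toy_ur$from %in% not_reported] <- NA

toy_ur
```

In this toy network D and E remain in the data, but their source is now unknown because B was not reported.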
We can plot this new contact network with {epicontacts}. We’ll need to subset the line list to have the same unreported cases.
linelist_ur <- linelist[!linelist$case_name %in% not_reported, ]
epicontacts <- make_epicontacts(
linelist = linelist_ur,
contacts = contacts_ur,
id = "case_name",
from = "from",
to = "to",
directed = TRUE
)
plot(epicontacts)
The above example can be thought of as resulting from incomplete recording or recall of contacts. A second method for under-reporting of contact tracing data is to assume that if a case is unreported then all of the cases and contacts stemming from the unreported case are lost.
For this example we’ll sample a single individual not to report and then prune all cases and contacts from that individual in the network.
all_contacts <- unique(c(contacts$from, contacts$to))
not_reported <- sample(x = all_contacts, size = 1)
not_reported
#> [1] "Abdul Majeed el-Saleh"
Then we can recursively prune all cases and contacts that result from this individual (there may be none if the person had no secondary cases or contacts).
# make copy of contact tracing data for under-reporting
contacts_ur <- contacts
while (length(not_reported) > 0) {
contacts_ur <- contacts_ur[!contacts_ur$to %in% not_reported, ]
not_reported_ <- contacts_ur$to[contacts_ur$from %in% not_reported]
contacts_ur <- contacts_ur[!contacts_ur$from %in% not_reported, ]
not_reported <- not_reported_
}
head(contacts_ur)
#> from to age sex date_first_contact date_last_contact
#> 1 Fabian Mrazik Yvonne Howard 9 f 2022-12-31 2023-01-05
#> 2 Fabian Mrazik Ashley Martinez 71 f 2022-12-26 2023-01-06
#> 3 Fabian Mrazik Tia Vu 48 f 2022-12-30 2023-01-05
#> 5 Ashley Martinez Courtney Flood 83 f 2022-12-26 2023-01-04
#> 6 Ashley Martinez Joseph Jiron 56 m 2022-12-28 2023-01-03
#> 7 Tia Vu Kevin Liddle 39 m 2022-12-31 2023-01-03
#> was_case status
#> 1 FALSE under_followup
#> 2 TRUE case
#> 3 TRUE case
#> 5 TRUE case
#> 6 TRUE case
#> 7 TRUE case
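To make the pruning logic easier to follow, the sketch below wraps the same loop in a function and applies it to a small hand-made network. The function name prune_descendants() and the toy data are hypothetical and are not part of {simulist}.

```r
# remove an unreported individual and everyone downstream of them
prune_descendants <- function(contacts, not_reported) {
  while (length(not_reported) > 0) {
    # drop rows where an unreported person is the contact
    contacts <- contacts[!contacts$to %in% not_reported, ]
    # people contacted by the unreported are unreported in the next round
    next_generation <- contacts$to[contacts$from %in% not_reported]
    # drop rows where an unreported person is the source
    contacts <- contacts[!contacts$from %in% not_reported, ]
    not_reported <- next_generation
  }
  contacts
}

# hypothetical network: A -> B, A -> C, B -> D, B -> E, D -> G, C -> F
toy_contacts <- data.frame(
  from = c("A", "A", "B", "B", "D", "C"),
  to = c("B", "C", "D", "E", "G", "F"),
  stringsAsFactors = FALSE
)

pruned <- prune_descendants(toy_contacts, not_reported = "B")
pruned
```

Removing B also removes D, E and G, leaving only the branch through C.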
Just as above we can plot the new contact network using {epicontacts}.
# subset line list to match under-reporting in contact tracing data
# (use the pruned contacts so that pruned individuals are dropped)
linelist_ur <- linelist[linelist$case_name %in% c(contacts_ur$from, contacts_ur$to), ]
epicontacts <- make_epicontacts(
linelist = linelist_ur,
contacts = contacts_ur,
id = "case_name",
from = "from",
to = "to",
directed = TRUE
)
plot(epicontacts)
More complex under-reporting can depend on covariates in the line list and contact tracing data, such as $case_type in the line list, with suspected cases most likely to go unreported, or $status in the contact tracing data, with unknown or lost_to_followup contacts more likely to be under-reported.
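As a minimal sketch of covariate-dependent under-reporting, the example below gives each case a reporting probability that depends on its $case_type. The probabilities in report_prob and the toy_linelist data frame are hypothetical values chosen for illustration only.

```r
set.seed(1)

# hypothetical stand-in for a simulated line list
toy_linelist <- data.frame(
  id = 1:6,
  case_type = c(
    "confirmed", "probable", "suspected",
    "confirmed", "suspected", "probable"
  ),
  stringsAsFactors = FALSE
)

# assumed reporting probability for each case type
report_prob <- c(confirmed = 0.9, probable = 0.6, suspected = 0.3)

# look up the reporting probability for every row by case type
prob_per_case <- unname(report_prob[toy_linelist$case_type])

# one Bernoulli draw per case decides whether it is reported
reported <- as.logical(
  rbinom(n = nrow(toy_linelist), size = 1, prob = prob_per_case)
)
toy_linelist[reported, ]
```

The same lookup pattern could be applied to $status in contact tracing data.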
Not every column in the simulated line list may be required for the use case at hand. In this example we will remove the $ct_value column. For instance, we might want to simulate an outbreak for which no laboratory testing (e.g. polymerase chain reaction (PCR) testing) was available, so a cycle threshold (Ct) value would not be known for confirmed cases.
# remove column by name
linelist %>% # nolint one_call_pipe_linter
select(!ct_value)
#> id case_name case_type sex age date_onset date_reporting
#> 1 1 Fabian Mrazik confirmed m 90 2023-01-01 2023-01-01
#> 2 3 Ashley Martinez confirmed f 71 2023-01-01 2023-01-01
#> 3 4 Tia Vu probable f 48 2023-01-01 2023-01-01
#> 4 5 Abdul Majeed el-Saleh confirmed m 77 2023-01-01 2023-01-01
#> 5 6 Courtney Flood suspected f 83 2023-01-02 2023-01-02
#> 6 7 Joseph Jiron suspected m 56 2023-01-01 2023-01-01
#> 7 8 Kevin Liddle suspected m 39 2023-01-01 2023-01-01
#> 8 9 Rutaiba el-Raad confirmed f 68 2023-01-01 2023-01-01
#> 9 10 Jaime Middleton suspected m 1 2023-01-01 2023-01-01
#> 10 14 Emily Fyffe confirmed f 16 2023-01-01 2023-01-01
#> 11 16 Miguel Oyebi confirmed m 54 2023-01-02 2023-01-02
#> 12 21 Katlyn Nelson probable f 36 2023-01-02 2023-01-02
#> 13 24 Nicholas Rentie suspected m 49 2023-01-02 2023-01-02
#> date_admission outcome date_outcome date_first_contact date_last_contact
#> 1 <NA> recovered <NA> <NA> <NA>
#> 2 2023-01-08 died 2023-01-10 2022-12-26 2023-01-06
#> 3 <NA> recovered <NA> 2022-12-30 2023-01-05
#> 4 <NA> recovered <NA> 2022-12-31 2023-01-08
#> 5 <NA> recovered <NA> 2022-12-26 2023-01-04
#> 6 <NA> recovered <NA> 2022-12-28 2023-01-03
#> 7 <NA> recovered <NA> 2022-12-31 2023-01-03
#> 8 <NA> recovered <NA> 2022-12-29 2023-01-01
#> 9 <NA> recovered <NA> 2022-12-26 2023-01-02
#> 10 2023-01-02 recovered <NA> 2022-12-30 2023-01-02
#> 11 2023-01-05 recovered <NA> 2022-12-30 2023-01-05
#> 12 <NA> recovered <NA> 2023-01-01 2023-01-03
#> 13 <NA> recovered <NA> 2022-12-28 2023-01-04
# remove column by numeric column indexing
# ct_value is column 13 (the last column)
linelist[, -13]
#> id case_name case_type sex age date_onset date_reporting
#> 1 1 Fabian Mrazik confirmed m 90 2023-01-01 2023-01-01
#> 2 3 Ashley Martinez confirmed f 71 2023-01-01 2023-01-01
#> 3 4 Tia Vu probable f 48 2023-01-01 2023-01-01
#> 4 5 Abdul Majeed el-Saleh confirmed m 77 2023-01-01 2023-01-01
#> 5 6 Courtney Flood suspected f 83 2023-01-02 2023-01-02
#> 6 7 Joseph Jiron suspected m 56 2023-01-01 2023-01-01
#> 7 8 Kevin Liddle suspected m 39 2023-01-01 2023-01-01
#> 8 9 Rutaiba el-Raad confirmed f 68 2023-01-01 2023-01-01
#> 9 10 Jaime Middleton suspected m 1 2023-01-01 2023-01-01
#> 10 14 Emily Fyffe confirmed f 16 2023-01-01 2023-01-01
#> 11 16 Miguel Oyebi confirmed m 54 2023-01-02 2023-01-02
#> 12 21 Katlyn Nelson probable f 36 2023-01-02 2023-01-02
#> 13 24 Nicholas Rentie suspected m 49 2023-01-02 2023-01-02
#> date_admission outcome date_outcome date_first_contact date_last_contact
#> 1 <NA> recovered <NA> <NA> <NA>
#> 2 2023-01-08 died 2023-01-10 2022-12-26 2023-01-06
#> 3 <NA> recovered <NA> 2022-12-30 2023-01-05
#> 4 <NA> recovered <NA> 2022-12-31 2023-01-08
#> 5 <NA> recovered <NA> 2022-12-26 2023-01-04
#> 6 <NA> recovered <NA> 2022-12-28 2023-01-03
#> 7 <NA> recovered <NA> 2022-12-31 2023-01-03
#> 8 <NA> recovered <NA> 2022-12-29 2023-01-01
#> 9 <NA> recovered <NA> 2022-12-26 2023-01-02
#> 10 2023-01-02 recovered <NA> 2022-12-30 2023-01-02
#> 11 2023-01-05 recovered <NA> 2022-12-30 2023-01-05
#> 12 <NA> recovered <NA> 2023-01-01 2023-01-03
#> 13 <NA> recovered <NA> 2022-12-28 2023-01-04
# remove column by column name
linelist[, colnames(linelist) != "ct_value"]
#> id case_name case_type sex age date_onset date_reporting
#> 1 1 Fabian Mrazik confirmed m 90 2023-01-01 2023-01-01
#> 2 3 Ashley Martinez confirmed f 71 2023-01-01 2023-01-01
#> 3 4 Tia Vu probable f 48 2023-01-01 2023-01-01
#> 4 5 Abdul Majeed el-Saleh confirmed m 77 2023-01-01 2023-01-01
#> 5 6 Courtney Flood suspected f 83 2023-01-02 2023-01-02
#> 6 7 Joseph Jiron suspected m 56 2023-01-01 2023-01-01
#> 7 8 Kevin Liddle suspected m 39 2023-01-01 2023-01-01
#> 8 9 Rutaiba el-Raad confirmed f 68 2023-01-01 2023-01-01
#> 9 10 Jaime Middleton suspected m 1 2023-01-01 2023-01-01
#> 10 14 Emily Fyffe confirmed f 16 2023-01-01 2023-01-01
#> 11 16 Miguel Oyebi confirmed m 54 2023-01-02 2023-01-02
#> 12 21 Katlyn Nelson probable f 36 2023-01-02 2023-01-02
#> 13 24 Nicholas Rentie suspected m 49 2023-01-02 2023-01-02
#> date_admission outcome date_outcome date_first_contact date_last_contact
#> 1 <NA> recovered <NA> <NA> <NA>
#> 2 2023-01-08 died 2023-01-10 2022-12-26 2023-01-06
#> 3 <NA> recovered <NA> 2022-12-30 2023-01-05
#> 4 <NA> recovered <NA> 2022-12-31 2023-01-08
#> 5 <NA> recovered <NA> 2022-12-26 2023-01-04
#> 6 <NA> recovered <NA> 2022-12-28 2023-01-03
#> 7 <NA> recovered <NA> 2022-12-31 2023-01-03
#> 8 <NA> recovered <NA> 2022-12-29 2023-01-01
#> 9 <NA> recovered <NA> 2022-12-26 2023-01-02
#> 10 2023-01-02 recovered <NA> 2022-12-30 2023-01-02
#> 11 2023-01-05 recovered <NA> 2022-12-30 2023-01-05
#> 12 <NA> recovered <NA> 2023-01-01 2023-01-03
#> 13 <NA> recovered <NA> 2022-12-28 2023-01-04
# remove column by assigning it to NULL
linelist$ct_value <- NULL
linelist
#> id case_name case_type sex age date_onset date_reporting
#> 1 1 Fabian Mrazik confirmed m 90 2023-01-01 2023-01-01
#> 2 3 Ashley Martinez confirmed f 71 2023-01-01 2023-01-01
#> 3 4 Tia Vu probable f 48 2023-01-01 2023-01-01
#> 4 5 Abdul Majeed el-Saleh confirmed m 77 2023-01-01 2023-01-01
#> 5 6 Courtney Flood suspected f 83 2023-01-02 2023-01-02
#> 6 7 Joseph Jiron suspected m 56 2023-01-01 2023-01-01
#> 7 8 Kevin Liddle suspected m 39 2023-01-01 2023-01-01
#> 8 9 Rutaiba el-Raad confirmed f 68 2023-01-01 2023-01-01
#> 9 10 Jaime Middleton suspected m 1 2023-01-01 2023-01-01
#> 10 14 Emily Fyffe confirmed f 16 2023-01-01 2023-01-01
#> 11 16 Miguel Oyebi confirmed m 54 2023-01-02 2023-01-02
#> 12 21 Katlyn Nelson probable f 36 2023-01-02 2023-01-02
#> 13 24 Nicholas Rentie suspected m 49 2023-01-02 2023-01-02
#> date_admission outcome date_outcome date_first_contact date_last_contact
#> 1 <NA> recovered <NA> <NA> <NA>
#> 2 2023-01-08 died 2023-01-10 2022-12-26 2023-01-06
#> 3 <NA> recovered <NA> 2022-12-30 2023-01-05
#> 4 <NA> recovered <NA> 2022-12-31 2023-01-08
#> 5 <NA> recovered <NA> 2022-12-26 2023-01-04
#> 6 <NA> recovered <NA> 2022-12-28 2023-01-03
#> 7 <NA> recovered <NA> 2022-12-31 2023-01-03
#> 8 <NA> recovered <NA> 2022-12-29 2023-01-01
#> 9 <NA> recovered <NA> 2022-12-26 2023-01-02
#> 10 2023-01-02 recovered <NA> 2022-12-30 2023-01-02
#> 11 2023-01-05 recovered <NA> 2022-12-30 2023-01-05
#> 12 <NA> recovered <NA> 2023-01-01 2023-01-03
#> 13 <NA> recovered <NA> 2022-12-28 2023-01-04