% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/cv_estimate_rt.R
\name{cv_estimate_rt}
\alias{cv_estimate_rt}
\title{Leave-kth-out cross validation for choosing an optimal parameter lambda}
\usage{
cv_estimate_rt(
  observed_counts,
  korder = 3L,
  dist_gamma = c(2.5, 2.5),
  nfold = 3L,
  error_measure = c("deviance", "mse", "mae"),
  x = 1:n,
  lambda = NULL,
  maxiter = 1000000L,
  delay_distn = NULL,
  delay_distn_periodicity = NULL,
  regular_splits = FALSE,
  invert_splits = FALSE,
  ...
)
}
\arguments{
\item{observed_counts}{vector of the observed daily infection counts}

\item{korder}{Integer. Degree of the piecewise polynomial curve to be
estimated. For example, \code{korder = 0} corresponds to a piecewise constant
curve.}

\item{dist_gamma}{Vector of length 2. These are the shape and scale for the
assumed serial interval distribution. Roughly, this distribution describes
the probability of an infectious individual infecting someone else after
some period of time after having become infectious.
As in most literature, we assume that this interval follows a gamma
distribution with some shape and scale.}

\item{nfold}{Integer. The number of folds used for leave-kth-out
cross validation. For leave-kth-out cross validation, every kth element of
\code{observed_counts} and its corresponding position (evenly or unevenly
spaced) are placed into the same fold. The first and last elements of
\code{observed_counts} are not assigned to any fold. The smallest allowable
value is \code{nfold = 2}.}

\item{error_measure}{Metric used to calculate cross validation scores.
Must be one of \code{mse}, \code{mae}, or \code{deviance}.
\code{mse} calculates the mean squared error; \code{mae} calculates the mean
absolute error; \code{deviance} calculates the deviance.}

\item{x}{a vector of positions at which the counts have been observed. In an
ideal case, we would observe data at regular intervals (e.g. daily or
weekly) but this may not always be the case. May be numeric or Date.}

\item{lambda}{Vector. A user supplied sequence of tuning parameters which
determines the balance between data fidelity and
smoothness of the estimated Rt; larger \code{lambda} results in a smoother
estimate. The default, \code{NULL}
results in an automatic computation based on \code{nlambda}, the largest value
of \code{lambda} that would result in a maximally smooth estimate, and \code{lambda_min_ratio}.
Supplying a value of \code{lambda} overrides
this behaviour. It is likely better to supply a
decreasing sequence of \code{lambda} values than a single (small) value. If
supplied, the user-defined \code{lambda} sequence is automatically sorted in
decreasing order.}

\item{maxiter}{Integer. Maximum number of iterations for the estimation
algorithm.}

\item{delay_distn}{In the case of a non-gamma delay distribution,
a vector or matrix (or \code{Matrix::Matrix()}) of delay probabilities may be
passed here. A vector will be coerced
to sum to 1, and padded with 0 in the right tail if necessary. A
time-varying delay matrix must be lower-triangular, and each of its rows will be
silently coerced to sum to 1. See also \code{vignette("delay-distributions")}.}

\item{delay_distn_periodicity}{Controls the relationship between the spacing
of the computed delay distribution and the spacing of \code{x}. In the default
case, \code{x} would be regular on the sequence \code{1:length(observed_counts)},
and this would
be 1. But if \code{x} is a \code{Date} object or spaced irregularly, the relationship
becomes more complicated. For example, weekly data when \code{x} is a date in
the form \code{YYYY-MM-DD} requires specifying \code{delay_distn_periodicity = "1 week"}.
Or if \code{observed_counts} were reported on Monday, Wednesday, and Friday,
then \code{delay_distn_periodicity = "1 day"} would be most appropriate.}

\item{regular_splits}{Logical.
If \code{TRUE}, the folds for k-fold cross-validation are chosen by placing
every kth point into the same fold. The first and last points are not
included in any fold and are always included in building the predictive
model. As an example, with 15 data points and \code{nfold = 4}, the points are
assigned to folds in the following way:
\deqn{
  0 \; 1 \; 2 \; 3 \; 4 \; 1 \; 2 \; 3 \;  4 \; 1 \; 2 \; 3 \; 4 \; 1 \; 0
  }{0 1 2 3 4 1 2 3 4 1 2 3 4 1 0} where 0 indicates no assignment.
Therefore, the folds are not random and running \code{cv_estimate_rt()} twice
will give the same result.}

\item{invert_splits}{Logical.
Typical K-fold CV would use K-1 folds for the training
set while reserving 1 fold for evaluation (repeating the split K times).
Setting this to \code{TRUE} inverts this process, using a much smaller training
set with a larger evaluation set. This tends to result in larger values
of \code{lambda} that minimize CV.}

\item{...}{Additional arguments passed to \code{estimate_rt()}.}
}
\value{
An object with S3 class \code{"cv_poisson_rt"}. Among the list components:
\itemize{
\item \code{full_fit} An object with S3 class \code{"poisson_rt"}, fitted with all
\code{observed_counts} and \code{lambda}
\item \code{cv_scores} leave-kth-out cross validation scores
\item \code{cv_se} leave-kth-out cross validation standard error
\item \code{lambda.min} lambda which achieved the optimal cross validation score
\item \code{lambda.1se} lambda whose cross validation score is within one
standard error of the optimal score.
\item \code{lambda} the value of \code{lambda} used in the algorithm.
}
}
\description{
Leave-kth-out cross validation for choosing an optimal parameter lambda
}
\examples{
y <- c(1, rpois(100, dnorm(1:100, 50, 15) * 500 + 1))
cv <- cv_estimate_rt(y, korder = 3, nfold = 3, nsol = 30)
cv
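
# A minimal sketch (illustration only, not the package's internal code) of the
# deterministic fold assignment described for `regular_splits = TRUE`:
# with 15 points and 4 folds, the interior points cycle through folds 1 to 4,
# while the first and last points get 0 (no fold). The names `n_points`,
# `n_folds`, and `fold_id` are hypothetical and used only for this example.
n_points <- 15
n_folds <- 4
fold_id <- c(0, rep_len(seq_len(n_folds), n_points - 2), 0)
fold_id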
}
