% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/simulate_counts.R
\name{simulate_counts}
\alias{simulate_counts}
\title{simulate_counts}
\usage{
simulate_counts(
  count_matrix,
  damage_proportion,
  annotated_celltypes = FALSE,
  target_damage = c(0.1, 0.8),
  damage_distribution = "right_skewed",
  distribution_steepness = "moderate",
  beta_shape_parameters = NULL,
  ribosome_penalty = 0.001,
  generate_plot = TRUE,
  palette = c("grey", "#7023FD", "#E60006"),
  plot_ribosomal_penalty = FALSE,
  display_plot = TRUE,
  seed = NULL,
  organism = "Hsap"
)
}
\arguments{
\item{count_matrix}{Matrix or dgCMatrix containing the counts from
single cell RNA sequencing data.}

\item{damage_proportion}{Numeric describing what proportion
of the input data should be altered to resemble damaged data.
\itemize{
\item Must range between 0 and 1.
}}

\item{annotated_celltypes}{Boolean specifying whether input matrix has
cell type information stored.
\itemize{
\item Default is FALSE
}}

\item{target_damage}{Numeric vector specifying the lower and upper bounds of
the level of damage that will be introduced.

Here, damage refers to the amount of cytoplasmic RNA lost by a cell where
values closer to 1 indicate more loss and therefore more heavily damaged
cells.
\itemize{
\item Default is c(0.1, 0.8)
}}

\item{damage_distribution}{String specifying whether the distribution of
damage levels among the damaged cells should be shifted towards the
upper or lower range of damage specified in 'target_damage' or follow
a symmetric distribution between them. There are three valid options:
\itemize{
\item "right_skewed"
\item "left_skewed"
\item "symmetric"
\item Default is "right_skewed"
}}

\item{distribution_steepness}{String specifying how concentrated the
damaged cells are about the mean of the target distribution specified in
'target_damage'. Here, an increase in steepness manifests as more
apparent skewness. There are three valid options:
\itemize{
\item "shallow"
\item "moderate"
\item "steep"
\item Default is "moderate"
}}

\item{beta_shape_parameters}{Numeric vector that allows the shape
parameters of the beta distribution to be defined explicitly. This offers
greater flexibility than allowed by the 'damage_distribution' and
'distribution_steepness' parameters and will override the defaults they
offer.
\itemize{
\item Default is 'NULL'
}}

\item{ribosome_penalty}{Numeric specifying the factor by which the
probability of losing a transcript from a ribosomal gene is multiplied.
Here, values closer to 0 represent a greater penalty.
\itemize{
\item Default is 0.001.
}}

\item{generate_plot}{Boolean specifying whether the QC plot should
be generated. QC plots are generated by default, as we recommend
verifying that the perturbed data retains characteristics of true
single cell data.
\itemize{
\item Default is TRUE.
}}

\item{palette}{Character vector containing three colours to create the
continuous palette for damaged cells.
\itemize{
\item Default is c("grey", "#7023FD", "#E60006").
}}

\item{plot_ribosomal_penalty}{Boolean specifying whether the output QC plot
should focus on only the ribosomal proportion or contain additional QC
information. If TRUE, this can be useful for visualising the impact of
the ribosomal penalty parameter.
\itemize{
\item Default is FALSE.
}}

\item{display_plot}{Boolean specifying whether the output QC plot should
be displayed in the global environment. This is only relevant
when 'generate_plot' is TRUE.
\itemize{
\item Default is TRUE.
}}

\item{seed}{Numeric specifying the random seed to ensure reproducibility of
the function's output. Setting a seed ensures that the random sampling
and perturbation processes produce the same results when the function
is run multiple times with the same input data and parameters.
\itemize{
\item Default is NULL.
}}

\item{organism}{String specifying the organism of origin of the input
data where there are two standard options,
\itemize{
\item "Hsap"
\item "Mmus"
}

If a user wishes to use a non-standard organism they must input a list
containing strings for the patterns to match mitochondrial and ribosomal
genes of the organism. If available, nuclear-encoded genes that are likely
retained in the nucleus, such as in nuclear speckles, must also
be specified. An example for humans is below,
\itemize{
\item organism = list(mito_pattern = "^MT-",
ribo_pattern = "^(RPS|RPL)",
nuclear = c("NEAT1", "XIST", "MALAT1"))
\item Default is "Hsap"
}}
}
\value{
A list containing the altered count matrix, a data frame with summary
statistics, and, if specified, a 'ggplot2' object of the quality control
metrics of the alteration.
}
\description{
Function to simulate damaged cells by perturbing the gene expression of
existing cells.
}
\details{
'DamageDetective' models damage in single-cell RNA sequencing data as the
loss of cytoplasmic RNA, where cells experiencing greater RNA loss are
assumed to be more extensively damaged, while those with minimal loss are
considered largely intact. The perturbation process introduces RNA loss into
existing cells and is controlled by three key parameters: the \strong{target
proportion of damage}, which specifies the fraction of cells to be
perturbed; the \strong{target level of damage}, which defines the extent of RNA
loss across cells; and the \strong{target distribution of damage}, which
determines how the different levels of RNA loss are distributed across
cells.

Based on these parameters, cells are randomly selected and assigned a target
proportion of RNA loss. The total number of transcripts to be removed is
determined, and perturbation is applied through weighted sampling without
replacement from cytoplasmic gene counts. Here, the probability of
transcript loss is determined by gene abundance, with highly expressed genes
more likely to lose transcripts. Once the target RNA loss is reached, the
cell's expression profile is updated, and the process repeats for all
selected cells.
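
As an illustration only (a minimal sketch, not the package's internal
implementation), abundance-weighted transcript loss for a single cell can be
mimicked in base R. The gene counts, the loss target of 0.4, and all variable
names below are hypothetical; the ribosomal pattern and the penalty of 0.001
mirror the human pattern and the 'ribosome_penalty' default shown in the usage.
\preformatted{
set.seed(7)

# Hypothetical transcript counts for one cell
counts <- c(ACTB = 50, GAPDH = 30, RPL3 = 15, RPS6 = 4, CD3E = 1)
target_loss <- 0.4  # assumed proportion of transcripts to remove

# One entry per transcript, labelled by its gene of origin
transcripts <- rep(names(counts), times = counts)
n_remove <- round(target_loss * length(transcripts))

# Equal base weight per transcript, so abundant genes are more likely to
# lose transcripts; ribosomal transcripts are down-weighted by the penalty
is_ribo <- grepl("^(RPS|RPL)", transcripts)
weights <- ifelse(is_ribo, 0.001, 1)

# Weighted sampling without replacement of the transcripts to remove
lost <- sample(transcripts, size = n_remove, replace = FALSE, prob = weights)
lost_per_gene <- table(factor(lost, levels = names(counts)))

# Updated expression profile of the cell
damaged <- counts - as.vector(lost_per_gene)
}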
}
\examples{
data("test_counts", package = "DamageDetective")

simulated_damage <- simulate_counts(
  count_matrix = test_counts,
  damage_proportion = 0.1,
  ribosome_penalty = 0.01,
  target_damage = c(0.5, 0.9),
  generate_plot = FALSE,
  seed = 7
)
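
# A further sketch using only arguments documented above: request a steep,
# left-skewed distribution of damage levels, then inspect the top-level
# structure of the returned list. The names of the list elements are not
# assumed here, hence the generic str() call.
left_skewed_damage <- simulate_counts(
  count_matrix = test_counts,
  damage_proportion = 0.1,
  target_damage = c(0.5, 0.9),
  damage_distribution = "left_skewed",
  distribution_steepness = "steep",
  generate_plot = FALSE,
  seed = 7
)
str(left_skewed_damage, max.level = 1)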
}
