auto_rate()'s linear method.
R/sim_data.R, sim_data.Rd
Generate data of size len that is coerced to mimic common respirometry
data. This is an internal function not meant for public use. We may modify
this function at any time, which may irreversibly change the outputs. Do NOT
use this for work! This function was first created to test auto_rate()
using the other internal function, test_lin(), but we decided to publish
the code as it is an effective (and visually appealing) tool for teaching
purposes. This function is by no means comprehensive and we encourage users
to generate data that suit their own unique situations.
sim_data(len = 300, type = "default", sd = 0.05, preview = TRUE)
Argument | Description |
---|---|
len | numeric. Defaults to 300. Number of observations in the dataset. |
type | character. The kind of data the function should generate. Available options: "default", "corrupted" and "segmented". |
sd | numeric. Defaults to 0.05. The amount of noise to add, expressed as the standard deviation of the randomly generated noise applied across the entire data set. |
preview | logical. Defaults to TRUE. Plots the generated data as an xy plot. |
A list containing the data frame, the slope, and the length of the
linear section of the data, to be used for analysis in the function
test_lin().
sim_data() creates 3 types of data that we think are common in respirometry
or oxygen flux data. The data types can be selected using the type
argument:
"default": data is made up of a linear segment of known length and slope,
with a non-linear segment, generated by a sine or cosine function depending
on whether the slope is positive or negative, appended to the beginning of
the data. The shape of the dataset is designed to mimic the many real
datasets in which the initial sections are non-linear. Here the slope
is randomly generated using rnorm(1, 0, 0.025), the length of the initial
segment is randomly generated using floor(abs(rnorm(1, .25*len, .05*len))),
where len is the total number of observations in the data, and the
amplitude of the segment is also randomly generated using rnorm(1, .8, .05).
"corrupted": same as "default", but "corrupted" data is inserted
randomly at any point in the linear segment. The data corruption is chosen as
a sudden dip in the reading, which then recovers. This event mimics the
equipment interference that sometimes happens, but which does not necessarily
invalidate the dataset if the corrupted section is omitted from analysis. The
dip is generated by a cosine function with a fixed amplitude of 1, and its
length is randomly generated.
"segmented": same as "default", but the data is modified to contain two
linear segments. The slope of the second linear segment is randomly picked to
be between 0.5 and 0.6 times that of the first linear segment. Its length is
also randomly generated but is always shorter than the first segment.
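
The random draws described for the "default" type can be tried out in a few lines of R. This is only a sketch, not the package's implementation: the three rnorm() expressions are taken from the text above, while the helper name sketch_default(), the quarter-wave shape of the initial segment, the way the two segments are joined, and the names of the returned list elements are all assumptions made for illustration.

```r
# Hypothetical helper (not part of the package): builds a "default"-like series.
sketch_default <- function(len = 300) {
  # Random parameters, using the expressions quoted in the text
  slope    <- rnorm(1, 0, 0.025)
  len_init <- floor(abs(rnorm(1, 0.25 * len, 0.05 * len)))
  amp      <- rnorm(1, 0.8, 0.05)

  # Assumption: clamp the initial length so both segments are non-empty
  len_init <- min(max(len_init, 2), len - 2)
  len_lin  <- len - len_init

  # Assumption: the non-linear start is a quarter wave that ends at zero,
  # using cosine for negative slopes and sine for non-negative slopes
  theta  <- seq(0, pi / 2, length.out = len_init)
  y_init <- if (slope < 0) amp * cos(theta) else amp * sin(theta - pi / 2)

  # Linear segment of known length and slope, appended after the curve
  y_lin <- slope * seq_len(len_lin)

  list(
    df      = data.frame(x = seq_len(len), y = c(y_init, y_lin)),
    slope   = slope,
    len_lin = len_lin
  )
}

set.seed(42)
sim <- sketch_default(len = 300)
head(sim$df)
```

Because the slope and the length of the linear section are returned alongside the data, the output of any rate-detection routine can be compared against known values, which is the purpose described for sim_data() itself.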
Normally-distributed noise is added to the dataset to add variation. The
amount of noise can be modified using the sd argument.
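
As a rough illustration of how a noise argument like sd can act, the sketch below adds zero-mean, normally distributed noise to a clean series. The helper add_noise() is hypothetical, and it applies sd directly as the standard deviation of the noise; whether the package scales sd relative to the data is not confirmed here.

```r
# Hypothetical helper: add zero-mean, normally distributed noise to a series.
add_noise <- function(y, sd = 0.05) {
  y + rnorm(length(y), mean = 0, sd = sd)
}

clean <- seq(1, 0, length.out = 200)  # a noise-free linear trace

set.seed(1)
noisy <- add_noise(clean, sd = 0.05)

# With sd = 0 every draw is exactly 0, so the series comes back unchanged
identical(add_noise(clean, sd = 0), clean)
```

This mirrors the sim_data(sd = 0) example, where setting the noise to zero yields "perfect" data.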
# Generate data of length 200
sim_data(len = 200)
#> Error in sim_data(len = 200): could not find function "sim_data"
# Generate data that contains a "corruption"
sim_data(type = "corrupted")
#> Error in sim_data(type = "corrupted"): could not find function "sim_data"
# Generate noisy data
sim_data(type = "segmented", sd = .2)
#> Error in sim_data(type = "segmented", sd = 0.2): could not find function "sim_data"
# Generate "perfect" non-noisy data
sim_data(sd = 0)
#> Error in sim_data(sd = 0): could not find function "sim_data"