inspect() scans and subsets a data.frame object for errors that may affect the use of various functions in respR. By default, the function scans only the first 2 columns of a data frame and assumes that the first column is time data. A plot of the data is also produced, including a rolling regression plot using a width of floor(0.1 * nrow([data frame])) for a quick visual inspection of the rate pattern (or stability) of the data. Note that rates for oxygen uptake are returned as negative and plotted on a reversed axis, so higher oxygen uptake rates (more negative values) appear higher on the rate plot.
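The rolling regression behind the rate plot can be pictured with a minimal base R sketch. This is not respR's own implementation: rolling_slope() is a hypothetical helper and the data are simulated. Only the window width, floor(0.1 * nrow(df)), comes from the description above.

```r
# Hypothetical helper (not part of respR): slope of a linear regression
# fitted to each consecutive window of `width` rows.
rolling_slope <- function(time, oxygen, width) {
  starts <- seq_len(length(time) - width + 1)
  vapply(starts, function(i) {
    idx <- i:(i + width - 1)
    unname(coef(lm(oxygen[idx] ~ time[idx]))[2])
  }, numeric(1))
}

# Simulated data (an assumption for illustration): oxygen declining
# at 0.05 units per time step, with a small amount of noise.
set.seed(1)
df <- data.frame(time = 1:200)
df$oxygen <- 100 - 0.05 * df$time + rnorm(200, sd = 0.01)

width <- floor(0.1 * nrow(df))  # 10% of the rows, here 20
rates <- rolling_slope(df$time, df$oxygen, width)
```

Each element of rates is the slope of one window, so oxygen uptake shows up as negative values. Passing ylim = rev(range(rates)) to plot() would draw them on a reversed axis.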
inspect(df, time = NULL, oxygen = NULL, plot = TRUE)
Argument | Description |
---|---|
df | data.frame object. Accepts any object of class data.frame. |
time | numeric vector. Defaults to NULL. This specifies the column number(s) of the time data to subset. |
oxygen | numeric vector. Defaults to NULL. This specifies the column number(s) of the oxygen data to subset. |
plot | logical. Defaults to TRUE. Plots the data for quick visual diagnosis. Works only when the subset data frame contains exactly 2 columns. |
A list object of class inspect.
Time columns are checked for NA/NaN values, sequential time, duplicate time and evenly-spaced time data. Oxygen columns are simply checked for NA/NaN data. Once data checks are complete, the function produces a list object which may be directly loaded into calc_rate(), calc_rate.bg(), calc_rate.ft(), and auto_rate() for further analyses.
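The checks listed above can be approximated in base R. The sketch below is an illustration only: check_time() is a hypothetical helper, and the exact rules respR applies (for example, how it defines "sequential") are assumptions here.

```r
# Hypothetical helper (not part of respR): returns the positions of
# problem values in a time vector, plus a flag for uneven spacing.
check_time <- function(time) {
  steps <- diff(time)
  list(
    na_nan         = which(is.na(time)),        # is.na() is TRUE for NaN too
    non_sequential = which(steps < 0) + 1L,     # value smaller than the previous one
    duplicated     = which(duplicated(time) & !is.na(time)),
    uneven         = length(unique(round(steps[!is.na(steps)], 9))) > 1
  )
}

# Made-up time data with a duplicate (position 4) and a
# backwards step (position 7).
time <- c(0, 1, 2, 2, 3, 5, 4, 6)
res <- check_time(time)

# Oxygen columns only need the NA/NaN check.
oxygen <- c(8.1, NA, 7.9, NaN)
oxygen_na <- which(is.na(oxygen))
```

Returning positions rather than a simple pass/fail flag makes it possible to locate the offending rows in the original data frame.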
If you wish to scan more than two columns, you can do so by specifying the time and oxygen arguments to select specific columns of a large data frame. In that case, however, the function will not produce a plot. This makes it possible to inspect flowthrough respirometry data, which usually contains oxygen values for inflow and outflow, by specifying a vector of column numbers, e.g. oxygen = c(2,3).
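As a rough picture of what selecting columns by number means, here is a base R sketch. The data frame ft and its column names are invented for illustration, and the subsetting shown is an assumption about the behaviour described above, not respR's code.

```r
# Invented flowthrough-style data: one time column, inflow and
# outflow oxygen columns, and an unrelated extra column.
ft <- data.frame(
  time    = 0:4,
  inflow  = c(8.00, 8.01, 7.99, 8.00, 8.02),
  outflow = c(7.50, 7.48, 7.47, 7.49, 7.46),
  temp    = c(15.0, 15.1, 15.0, 15.0, 15.1)
)

time   <- 1        # column number of the time data
oxygen <- c(2, 3)  # column numbers of the oxygen data

# Keep only the requested columns, time first.
subset_df <- ft[, c(time, oxygen)]

# A plot is only described for the two-column case.
can_plot <- ncol(subset_df) == 2
```

With three columns selected, can_plot is FALSE, which matches the note that no plot is produced when more than two columns are scanned.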
Most of these checks are for exploratory purposes only; they help diagnose potential issues with the data. For instance, very long experiments could have had sensor dropouts the user is completely unaware of. Some warnings do not indicate a problem at all. For instance, an uneven time warning can result from using decimalised minutes, which is a completely valid time metric.
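To see why decimalised minutes can trigger an uneven time warning, consider the sketch below. It assumes readings taken every second and recorded as minutes rounded to two decimal places; that recording format is an assumption for illustration.

```r
# Readings taken exactly once per second...
seconds <- 0:10

# ...but stored as decimal minutes rounded to 2 decimal places.
minutes <- round(seconds / 60, 2)

# The intervals between stored values are no longer all equal.
steps <- sort(unique(round(diff(minutes), 2)))
```

Here steps contains two distinct interval lengths even though the underlying sampling was perfectly regular, so the warning reflects rounding rather than a fault in the data.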
If some of these checks fail, this should generally not hinder analysis of the data. respR has been coded to rely on linear regression on exact data values, and does not make assumptions about data spacing. Therefore issues such as missing or NA/NaN values, duplicate values, or uneven time spacing should not cause erroneous results, as long as they do not occur over large regions of the data. The only major potential issue is time data that are not sequential. This could cause unpredictable results and incorrect rates to be returned.
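The point about regression on exact data values can be illustrated with base R's lm(), used here as a stand-in for respR's internals (an assumption). The data are made up: unevenly spaced times, one duplicated time value and one missing oxygen reading.

```r
# Unevenly spaced time values, including a duplicate (2.5).
time <- c(0, 1, 2.5, 2.5, 4, 7, 7.2, 10)

# Oxygen following an exact linear decline of 0.02 per time unit.
oxygen <- 10 - 0.02 * time

# Simulate a sensor dropout.
oxygen[5] <- NA

# lm() drops the incomplete row by default and fits the rest.
fit  <- lm(oxygen ~ time)
rate <- unname(coef(fit)[2])
```

The fitted rate is still -0.02 (up to floating-point precision), because the regression uses the actual time values rather than assuming a fixed interval between rows.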
inspect(sardine.rd)
inspect(urchins.rd)
inspect(urchins.rd, time = 1, oxygen = 4)
x <- inspect(urchins.rd, time = 1, oxygen = c(2:12))
print(x)
x$list$time.min # check position of errors in data frame
x <- inspect(flowthrough.rd, 1, c(2, 3))
x