Importing your data • respR

We designed respR to be a universal, end-to-end solution for analysing data and reporting analyses from any and all aquatic respirometry experiments, regardless of the equipment used. Therefore, it is system-agnostic; the data need only be put into a very simple structure (time against oxygen in any units) for a full analysis to be conducted. Indeed, the entire package (with the exception of the final conversion step in convert_rate) considers data to be unitless, so non-aquatic respirometry data, or any other time-series data, can be explored and analysed using respR.

Generic R data frame type objects, including vectors and objects of class data.frame, data.table and tibble, are recognised. The only structural requirement is that time~O₂ data be in a specific form: paired values of numeric time-elapsed (in seconds, minutes or hours) and oxygen amount (in any common unit). Every respirometry system, to our knowledge, allows data to be exported in such a format, or at least in a structure from which it is easy to parse to this format. Two functions are provided to assist with bringing in and formatting your data correctly.

`import_file()`

Most systems allow data to be exported in easily readable formats (e.g. .csv, .txt) which contain the numeric time and O2 data respR requires. These files are usually easily imported into R using generic functions such as read.csv() and the relevant columns specified when used in respR functions, or extracted into separate data frames.

Many systems, however, have raw output files with redundant information, or a structure that confuses importing functions. For example, Loligo Systems AutoResp and Witrox files have several rows of metadata above the columns of raw data, which causes importing problems in read.csv(). These files can be altered in Excel or other spreadsheet software to fix these issues, but respR allows many raw data files from various systems to be imported without modification.
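For a file that has to be read manually, the metadata problem can often be worked around in base R. The following is a minimal sketch and not part of respR: it writes a small made-up file (the metadata lines, column names and values are all hypothetical) and then uses the skip argument of read.csv() so that reading starts at the column header row.

```r
# Hypothetical raw file: three metadata lines, then a header row and data.
raw_lines <- c(
  "Device: example logger",
  "Firmware: 1.0",
  "Operator: A. Person",
  "time_s,o2_mgL",
  "0,10.05",
  "1,10.02",
  "2,10.01"
)
tmp <- tempfile(fileext = ".csv")
writeLines(raw_lines, tmp)

# skip = 3 ignores the metadata lines, so the fourth line is read as the header.
dat <- read.csv(tmp, skip = 3)
str(dat)
```

The number of lines to skip differs between systems and software versions, so it is worth opening the file in a text editor first to count them.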

The import_file() function uses pattern recognition to identify the originating system of the file. It also automatically recognises the format of any date-time data and uses it to create a new numeric time column, if one does not already exist.

Here’s an example of importing a Witrox raw data file (from the current working directory, otherwise any external file can be specified with a filepath):

import_file("Witrox_eg.txt")
#> 
#> # import_file # -------------------------
#> Loligo AutoResp/Witrox file detected
#> -----------------------------------------
#>       Date_Time_DDMMYYYY_HHMMSS Time_stamp_code Barometric_pressure_hPa
#>    1:      5/11/2017 9:24:04 AM      3577364644                    1013
#>    2:      5/11/2017 9:24:05 AM      3577364645                    1013
#>    3:      5/11/2017 9:24:06 AM      3577364646                    1013
#>    4:      5/11/2017 9:24:07 AM      3577364647                    1013
#>    5:      5/11/2017 9:24:08 AM      3577364648                    1013
#>   ---                                                                  
#> 6812:     5/11/2017 11:17:35 AM      3577371455                    1013
#> 6813:     5/11/2017 11:17:36 AM      3577371456                    1013
#> 6814:     5/11/2017 11:17:37 AM      3577371457                    1013
#> 6815:     5/11/2017 11:17:38 AM      3577371458                    1013
#> 6816:     5/11/2017 11:17:39 AM      3577371459                    1013
#>       SDWA0003000060_CH_1_phase_rU SDWA0003000060_CH_1_temp_C
#>    1:                        29.58                      14.01
#>    2:                        29.58                      14.11
#>    3:                        29.57                      14.14
#>    4:                        29.59                      14.06
#>    5:                        29.58                      14.07
#>   ---                                                        
#> 6812:                        30.47                      13.58
#> 6813:                        30.49                      13.67
#> 6814:                        30.47                      13.47
#> 6815:                        30.49                      13.50
#> 6816:                        30.47                      13.53
#>       SDWA0003000060_CH_1_O2_mg/L
#>    1:                      10.056
#>    2:                      10.015
#>    3:                      10.012
#>    4:                      10.027
#>    5:                      10.032
#>   ---                            
#> 6812:                       9.454
#> 6813:                       9.403
#> 6814:                       9.497
#> 6815:                       9.469
#> 6816:                       9.474

As we can see, the function automatically recognises that this is a Witrox file, removes redundant information, and renames the columns, removing spaces from the names.

This function requires only a single input, the path to the file (one other option, export = TRUE, allows the imported data to be exported to a .csv file). Everything else is handled automatically. This contrasts with other packages where numerous options such as the delimiter character, originating hardware, and specific date format must be specified, which we have found leads to substantial usability issues (see A comparison of respR with other R packages).

After importing, the result can be saved as a data frame and passed to the rest of the respR functions for processing, all while leaving the raw data file unmodified.

This function supports several systems at present (Firesting, Pyro, PreSens, MiniDOT, Loligo Witrox, Vernier and more). However, it is still in development; some files may fail to import because of structural or version differences we have not encountered. We would encourage users to send us sample files for testing, especially any they have problems with, or from systems we do not yet support.

`format_time()`

For file types that are not yet supported, or if you have already imported your data by other means, the format_time() function can parse date-time columns to numeric time-elapsed, if the imported data do not already contain it.

Here’s an example of a 2 column data frame with date-time data and oxygen.

head(data, n = 5)
#>               Date_Time O2_mg/L
#> 1: 5/11/2017 9:24:04 AM  10.056
#> 2: 5/11/2017 9:24:05 AM  10.015
#> 3: 5/11/2017 9:24:06 AM  10.012
#> 4: 5/11/2017 9:24:07 AM  10.027
#> 5: 5/11/2017 9:24:08 AM  10.032

We can use format_time() to parse these data to numeric time (internally, format_time() uses functionality from the lubridate package). The date-times can be passed either as a vector (for example, so that the result can be appended to the original data) or as a data frame. By default, the function assumes the date-time data are in the first column (i.e. time = 1), but this can be overridden by changing the time argument to specify the index of the column containing the date-time data. The resulting data frame will be identical (including column names), except that a new column called time_num, containing the converted numeric time, is added as the last column. We also need to specify the format of the date-times (see ?format_time for further info):

## Pass as vector
data_2 <- format_time(data[[1]], format = "dmyHMS")
head(data_2)
#> [1] 1 2 3 4 5 6

## Pass as data frame
data_3 <- format_time(data, format = "dmyHMS")
head(data_3)
#>               Date_Time O2_mg/L time_num
#> 1: 5/11/2017 9:24:04 AM  10.056        1
#> 2: 5/11/2017 9:24:05 AM  10.015        2
#> 3: 5/11/2017 9:24:06 AM  10.012        3
#> 4: 5/11/2017 9:24:07 AM  10.027        4
#> 5: 5/11/2017 9:24:08 AM  10.032        5
#> 6: 5/11/2017 9:24:09 AM  10.073        6
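To make the conversion concrete, here is a rough base R equivalent of the calculation. It is only an illustrative sketch and not how format_time() is implemented (which, as noted, relies on lubridate). The date-times are the first three from the example data, read as day/month/year, and the offset of 1 mimics the output shown, in which the first row has a time of 1. Parsing of AM/PM with %p assumes an English-style locale.

```r
# Date-times copied from the example data, read as day/month/year, 12-hour clock.
dt <- c("5/11/2017 9:24:04 AM", "5/11/2017 9:24:05 AM", "5/11/2017 9:24:06 AM")
parsed <- as.POSIXct(dt, format = "%d/%m/%Y %I:%M:%S %p", tz = "UTC")

# Seconds elapsed since the first record, shifted so the series starts at 1.
time_num <- as.numeric(difftime(parsed, parsed[1], units = "secs")) + 1
time_num
```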

By default, the new numeric time-elapsed data will start at 1, as in the output above, but we can override this. This could be useful if data are split into separate files and you want to append the start of one onto the end of another, or if you simply want to link a specific numeric time value to the start of the experiment.

## as data frame
data_4 <- format_time(data, format = "dmyHMS", start = 1000)
head(data_4)
#>               Date_Time O2_mg/L time_num
#> 1: 5/11/2017 9:24:04 AM  10.056     1000
#> 2: 5/11/2017 9:24:05 AM  10.015     1001
#> 3: 5/11/2017 9:24:06 AM  10.012     1002
#> 4: 5/11/2017 9:24:07 AM  10.027     1003
#> 5: 5/11/2017 9:24:08 AM  10.032     1004
#> 6: 5/11/2017 9:24:09 AM  10.073     1005
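For the split-file case, a suitable start value for the second file can be worked out from the gap between the two recordings. This is a small base R sketch under assumed values: the two date-times and the final time_num of the first file are hypothetical placeholders for whatever your own files contain.

```r
# Hypothetical: last date-time in file 1 and first date-time in file 2.
end_1   <- as.POSIXct("2017-11-05 11:17:39", tz = "UTC")
start_2 <- as.POSIXct("2017-11-05 11:20:00", tz = "UTC")

# Final time_num value obtained when formatting file 1 (assumed here).
last_time_num_1 <- 6816

# Gap between the recordings in seconds, added to the end of file 1.
gap <- as.numeric(difftime(start_2, end_1, units = "secs"))
start_for_file_2 <- last_time_num_1 + gap
start_for_file_2
```

The resulting value could then be supplied as the start input when formatting the second file, so that both files share one continuous time scale.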

Note, numeric time data will always output in seconds regardless of the input format.
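If you prefer to work in minutes or hours, the seconds can be converted by simple division. A brief sketch with made-up values (the vector below is hypothetical, not output from respR):

```r
# Hypothetical numeric times in seconds.
time_sec <- c(0, 600, 1800, 3600)

time_min <- time_sec / 60    # minutes
time_hr  <- time_sec / 3600  # hours

data.frame(time_sec, time_min, time_hr)
```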

Dealing with timed events or notes

What if there are important notes or events associated with specific times in your experiment? For example, flushing of chambers, imposing a new swimming speed, changing the temperature, noting a response, etc. Resetting the times by formatting the time data may make these difficult to associate with certain stages of the analysis. This is easily dealt with by formatting the times of the events in the same way you formatted the data. You only need to make sure at least one event is associated with the same start time you used for the experimental data.

Here’s an example of some experimental notes (in some systems such notes can be entered in the software, and so may be included in output files, or they could be copied from a lab book into a .csv file and imported).

exp_notes
#>                times                    events
#> 1  8/17/2016 9:42:02          Experiment start
#> 2  8/17/2016 9:52:02        Flush period start
#> 3  8/17/2016 9:54:34          Flush period end
#> 4 8/17/2016 10:19:02  Specimen acting normally
#> 5 8/17/2016 12:04:54             Went to lunch
#> 6 8/17/2016 14:31:22 Swim speed set to 20 cm/s

format_time(exp_notes, format = "mdyHMS")
#>                times                    events time_num
#> 1  8/17/2016 9:42:02          Experiment start        1
#> 2  8/17/2016 9:52:02        Flush period start      601
#> 3  8/17/2016 9:54:34          Flush period end      753
#> 4 8/17/2016 10:19:02  Specimen acting normally     2221
#> 5 8/17/2016 12:04:54             Went to lunch     8573
#> 6 8/17/2016 14:31:22 Swim speed set to 20 cm/s    17361

Such notes do not even have to be in the same date-time format, or even at the same precision, depending on how accurately you need to know when events occurred. The important factors are associating at least one event with the same start time used to format the experimental time data, and using the correct format setting.

exp_notes
#>   times                    events
#> 1  9:42          Experiment start
#> 2  9:52        Flush period start
#> 3  9:54          Flush period end
#> 4 10:19  Specimen acting normally
#> 5 12:04             Went to lunch
#> 6 14:31 Swim speed set to 20 cm/s

format_time(exp_notes, format = "HM")
#>   times                    events time_num
#> 1  9:42          Experiment start        1
#> 2  9:52        Flush period start      601
#> 3  9:54          Flush period end      721
#> 4 10:19  Specimen acting normally     2221
#> 5 12:04             Went to lunch     8521
#> 6 14:31 Swim speed set to 20 cm/s    17341
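Once the data and the notes share the same numeric time scale, event times can be used to pick out stages of the experiment. The following base R sketch uses a made-up oxygen series (the data frame, its column names and its values are hypothetical); the flush start and end times are those from the output above.

```r
# Hypothetical oxygen record at 1-second intervals on the same time scale as the notes.
dat <- data.frame(time_num = 1:1000,
                  o2 = seq(10, 9, length.out = 1000))

# Event times taken from the formatted notes.
flush_start <- 601
flush_end   <- 721

# Flag rows recorded during the flush, then drop them.
flush_rows <- dat$time_num >= flush_start & dat$time_num <= flush_end
dat_no_flush <- dat[!flush_rows, ]
nrow(dat_no_flush)
```

The same logical indexing could be used to keep only the period after a particular event, such as a change in swimming speed.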

Next steps

Once your data are in this paired, numeric time~O2 form, they can be passed to inspect() or other functions for analysis.