Create a frequency analysis based on a custom interval

The table accepts daily data, discrete data, or both. The median of each year/month is calculated first; the monthly stats are then calculated from those medians. Percentiles are calculated using the quantile function with "type=6".
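The two-step calculation described above can be sketched in base R. This is a minimal illustration on synthetic daily data, not the package's internal code: the data frame `daily`, the helper columns `year` and `month`, and the use of `aggregate` are assumptions made for the example. The output column names mirror the table shown in the Examples section.

```r
# Synthetic daily values (assumption: columns named "time" and "value").
set.seed(1)
daily <- data.frame(
  time = seq(as.Date("2000-01-01"), as.Date("2009-12-31"), by = "day")
)
daily$value <- -20 + rnorm(nrow(daily))

# Step 1: one median per year/month combination.
daily$year <- as.integer(format(daily$time, "%Y"))
daily$month <- as.integer(format(daily$time, "%m"))
year_month_med <- aggregate(value ~ year + month, data = daily, FUN = median)

# Step 2: per-month statistics computed from those medians,
# with percentiles taken using quantile(type = 6).
probs <- c(0.05, 0.10, 0.25, 0.50, 0.75, 0.90, 0.95)
by_month <- split(year_month_med, year_month_med$month)
monthly <- do.call(rbind, lapply(by_month, function(d) {
  q <- quantile(d$value, probs = probs, type = 6, names = FALSE)
  data.frame(
    month = d$month[1],
    p05 = q[1], p10 = q[2], p25 = q[3], p50 = q[4],
    p75 = q[5], p90 = q[6], p95 = q[7],
    nYears = length(unique(d$year)),
    minMed = min(d$value),
    maxMed = max(d$value)
  )
}))
head(monthly)
```

Because the percentiles are taken over yearly medians rather than raw daily values, a year with dense daily data carries the same weight as any other year that contributes a median.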

Usage

stats_by_interval(
  interval,
  gw_level_dv,
  gwl_data,
  parameter_cd = NA,
  date_col = NA,
  value_col = NA,
  approved_col = NA,
  flip = FALSE
)

Arguments

gw_level_dv: data frame, daily groundwater level data. Often obtained from read_waterdata_daily. Use NULL for no daily data.
gwl_data: data frame returned from read_waterdata_field_measurements, or data frame with date, value, and approval columns. Use NULL for no discrete data.
parameter_cd: Can be used to filter data if the data frame has a "parameter_code" column. The default is NA, which will not do any filtering. If the gwl_data and gw_level_dv need different parameter code filtering, use a vector of 2 parameter codes. The first one will filter the gw_level_dv data frame, the second will filter the gwl_data data frame.
date_col: the names of the time columns. The first value is associated with the gw_level_dv input, and the second value is associated with the gwl_data input. The default is c("time", "time").
value_col: the names of the value columns. The first value is associated with the gw_level_dv input, and the second value is associated with the gwl_data input. The default is c("value", "value").
approved_col: the name of the column to get provisional/approved status. The first value is associated with the gw_level_dv input, and the second value is associated with the gwl_data input. The default is c("approval_status", "approval_status"). It is expected that these columns will have only "Approved" or "Provisional".
flip: logical. If TRUE, flips labels so that the lower numbers are in the higher percentages. Default is FALSE.
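When the discrete data do not come from read_waterdata_field_measurements, it can help to check the column names and approval values before building the two-element vectors described above. The following is a rough sketch in base R only. The data frame `discrete` and its column names `sample_date`, `level_ft`, and `status` are hypothetical, and the check itself is not part of the package.

```r
# Hypothetical discrete data frame with non-default column names.
discrete <- data.frame(
  sample_date = as.Date(c("2020-01-15", "2020-02-15", "2020-03-15")),
  level_ft = c(-20.1, -21.4, -22.8),
  status = c("Approved", "Approved", "Provisional")
)

# Each *_col argument takes two names: the first for the daily data
# (gw_level_dv), the second for the discrete data (gwl_data).
date_col <- c("time", "sample_date")
value_col <- c("value", "level_ft")
approved_col <- c("approval_status", "status")

# Confirm the discrete columns exist and that the status column holds
# only the two expected values.
needed <- c(date_col[2], value_col[2], approved_col[2])
missing_cols <- setdiff(needed, names(discrete))
unexpected <- setdiff(
  unique(discrete[[approved_col[2]]]),
  c("Approved", "Provisional")
)
stopifnot(length(missing_cols) == 0, length(unexpected) == 0)
```

The three vectors could then be passed as the date_col, value_col, and approved_col arguments, with the daily-data names in the first position.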

Examples


site <- "USGS-263819081585801"
p_code_dv <- "62610"
statCd <- "00001"
# gw_level_dv <- dataRetrieval::read_waterdata_daily(monitoring_location_id = site,
#                                                    parameter_code = p_code_dv,
#                                                    statistic_id = statCd)
gw_level_dv <- L2701_example_data$Daily
monthly_frequency <- stats_by_interval("month",
                                       gw_level_dv,
                                       NULL)
head(monthly_frequency)
#> # A tibble: 6 × 11
#>   month   p05   p10   p25   p50   p75   p90    p95 nYears minMed maxMed
#>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>  <int>  <dbl>  <dbl>
#> 1     1 -37.8 -32.2 -29.1 -20.2 -16.9 -10.9  -7.15     46  -41.8  -5.68
#> 2     2 -40.4 -32.5 -28.8 -21.4 -17.7 -12.3  -7.64     47  -42.8  -7.25
#> 3     3 -40.8 -35.5 -29.9 -22.8 -18.8 -12.1  -7.81     47  -45.1  -6.87
#> 4     4 -44.6 -39.0 -31.3 -25.3 -21.7 -15.0 -10.4      46  -48.5  -9.34
#> 5     5 -45.7 -39.8 -33.8 -27.0 -23.2 -17.6 -14.2      46  -49.4 -10.0 
#> 6     6 -41.6 -37.7 -32.1 -26.6 -20.3 -15.6 -13.7      45  -42.8 -11.0 
# gwl_data <- dataRetrieval::read_waterdata_field_measurements(monitoring_location_id = site)
gwl_data <- L2701_example_data$Discrete
gwl_data <- gwl_data[gwl_data$parameter_code == p_code_dv, ]

monthly_frequency2 <- stats_by_interval("month",
                                        gw_level_dv,
                                        gwl_data)

annual_frequency2 <- stats_by_interval("year",
                                       gw_level_dv,
                                       gwl_data)

week_frequency2 <- stats_by_interval("week",
                                     gw_level_dv,
                                     gwl_data)