Title: | Seasonal Adjustment of Weekly Data |
---|---|
Description: | Perform seasonal adjustment of weekly data. The package provides a user-friendly interface for computing seasonally adjusted estimates of weekly data and includes functions for the creation of country-specific prior adjustment variables, as well as diagnostic tools to assess the quality of the adjustments. The method is described in more detail in Ginker (2023) <doi:10.13140/RG.2.2.12221.44000>. |
Authors: | Tim Ginker [aut, cre, cph] |
Maintainer: | Tim Ginker <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.1.3 |
Built: | 2025-02-05 13:26:51 UTC |
Source: | https://github.com/timginker/boiwsa |
Performs seasonal adjustment of weekly data. For more details on the usage of this function, see the paper or the examples on GitHub.
boiwsa( x, dates, r = 0.8, auto.ao.search = TRUE, out.threshold = 3.8, ao.list = NULL, my.k_l = NULL, H = NULL, ic = "aicc", method = "additive" )
x |
Input time series as a numeric vector |
dates |
a vector of class "Date", containing the data dates |
r |
Defines the rate of decay of the weights. Should be between zero and one. The default is 0.8. |
auto.ao.search |
Boolean. Search for additive outliers |
out.threshold |
t-statistic threshold used in the outlier search. The default is 3.8. |
ao.list |
Vector with user specified additive outliers in a date format |
my.k_l |
Numeric vector defining the number of yearly and monthly trigonometric variables. If NULL, the numbers are chosen automatically using the information criterion. The search range is 0:36 and 0:12, with a step size of 6, for the yearly and monthly variables, respectively. |
H |
Matrix with holiday- and trading day factors |
ic |
Information criterion used in the automatic search for the number of trigonometric regressors. There are three options: aic, aicc and bic. The default is aicc. |
method |
Decomposition type: additive or multiplicative |
sa Seasonally adjusted series
my.k_l Number of trigonometric variables used to model the seasonal pattern
sf Estimated seasonal effects
hol.factors Estimated holiday effects
out.factors Estimated outlier effects
beta Regression coefficients for the last year
m lm object. Unweighted OLS regression on the full sample
Tim Ginker
# Not run
# Seasonal adjustment of weekly US gasoline production
data("gasoline.data")
res <- boiwsa(x = gasoline.data$y, dates = gasoline.data$date)
Israeli working dates
dates_il
A data frame with 21550 rows and 4 variables:
DATE_VALUE
Date
ISR_WORKING_DAY_PART
1: full working day, 0.5: half working day, 0: holiday
JEWISH_FULL_DATE
Jewish date
DATE_WEEK_NUMBER
Weekday
Personal
Searches through the model space to identify the best number of trigonometric variables, with the lowest AIC, AICc or BIC value.
find_opt( x, dates, H = NULL, AO = NULL, method = "additive", l.max = 12, k.max = 42, by = 6 )
x |
Numeric vector. Time series to seasonally adjust |
dates |
a vector of class "Date", containing the data dates |
H |
(optional) Matrix with holiday and trading day variables |
AO |
(optional) Matrix with additive outlier variables |
method |
Decomposition method: "additive" or "multiplicative". The default is "additive". |
l.max |
Maximum number of monthly cycle variables to search over. The default is 12. |
k.max |
Maximum number of yearly cycle variables to search over. The default is 42. |
by |
Step size of the search. The default is 6. |
A list with the optimal number of (yearly and monthly) Fourier variables according to AIC, AICc and BIC.
data(gasoline.data)
res <- find_opt(x = gasoline.data$y, dates = gasoline.data$date)
print(res)
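The idea of searching a grid of trigonometric-term counts and keeping the combination with the lowest information criterion can be illustrated with a minimal base R sketch. It does not call find_opt() and is not the package's implementation: the synthetic series, the helper names trig_terms() and aicc(), the day-of-year and day-of-month fractions, and the reduced grid (k up to 12, l up to 6) are all assumptions made for illustration.

```r
# Synthetic weekly series with a single yearly harmonic (illustrative data only)
set.seed(1)
dates <- seq.Date(as.Date("2015-01-03"), by = "week", length.out = 520)
yday_frac <- as.numeric(format(dates, "%j")) / 365.25  # approximate position in the year
mday_frac <- as.numeric(format(dates, "%d")) / 31      # approximate position in the month
y <- 10 + 2 * sin(2 * pi * yday_frac) + rnorm(length(dates), sd = 0.5)

# Hypothetical helper: n sine/cosine pairs for a given cycle fraction
trig_terms <- function(frac, n) {
  if (n == 0) return(NULL)
  do.call(cbind, lapply(seq_len(n), function(j) {
    cbind(sin(2 * pi * j * frac), cos(2 * pi * j * frac))
  }))
}

# Hypothetical helper: small-sample corrected AIC of a fitted lm
aicc <- function(fit) {
  n <- nobs(fit)
  p <- attr(logLik(fit), "df")
  AIC(fit) + 2 * p * (p + 1) / (n - p - 1)
}

# Grid of candidate (yearly, monthly) pair counts, stepping by 6
grid <- expand.grid(k = seq(0, 12, by = 6), l = seq(0, 6, by = 6))
grid$aicc <- mapply(function(k, l) {
  X <- cbind(trig_terms(yday_frac, k), trig_terms(mday_frac, l))
  fit <- if (is.null(X)) lm(y ~ 1) else lm(y ~ X)
  aicc(fit)
}, grid$k, grid$l)

# Combination with the lowest AICc
best <- grid[which.min(grid$aicc), ]
print(best)
```

Replacing aicc() with AIC() or BIC() in the mapply() call gives the corresponding search under the other two criteria.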
Searches for additive outliers using the method described in Appendix C of Findley et al. (1998). If the number of trigonometric variables is not specified, the function searches the model space automatically for the number of trigonometric variables with the lowest AIC, AICc or BIC value.
find_outliers( x, dates, out.tolerance = 3.8, my.AO.list = NULL, H = NULL, my.k_l = NULL, method = "additive" )
x |
Numeric vector. Time series to seasonally adjust |
dates |
a vector of class "Date", containing the data dates |
out.tolerance |
t-stat threshold for outliers (see Findley et al., 1998) |
my.AO.list |
(optional) Vector with user defined additive outlier variables |
H |
(optional) Matrix with holiday and trading day variables |
my.k_l |
(optional) Vector with the number of Fourier terms used to capture the yearly and monthly cycles. If NULL, an automatic search is performed using the AICc criterion. |
method |
Decomposition method: "additive" or "multiplicative". The default is "additive". |
my.k_l
ao list of AO dates
Findley, D.F., Monsell, B.C., Bell, W.R., Otto, M.C. and Chen, B.C. (1998). New capabilities and methods of the X-12-ARIMA seasonal-adjustment program. Journal of Business & Economic Statistics, 16(2), pp. 127-152.
# Not run:
# Searching for additive outliers in Gasoline data
data(gasoline.data)
ao_list <- find_outliers(x = gasoline.data$y, dates = gasoline.data$date)
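To give a feel for what a t-statistic threshold does in an additive outlier search, here is a deliberately simplified forward-selection loop in base R. It is not the procedure of Findley et al. (1998) and not the package's code: the synthetic data, the planted outlier at position 50, and the variable names are assumptions made for illustration.

```r
# Synthetic series with one planted additive outlier (illustrative data only)
set.seed(42)
n <- 150
y <- rnorm(n)
y[50] <- y[50] + 10

threshold <- 3.8        # same value as the documented default
found <- integer(0)     # positions accepted as outliers so far

repeat {
  candidates <- setdiff(seq_len(n), found)
  # |t| of a dummy at each candidate position, given the dummies already accepted
  tstats <- sapply(candidates, function(t) {
    D <- diag(n)[, c(found, t), drop = FALSE]
    cf <- summary(lm(y ~ D))$coefficients
    abs(cf[nrow(cf), "t value"])   # last row belongs to the candidate dummy
  })
  if (max(tstats) < threshold) break
  found <- c(found, candidates[which.max(tstats)])
}

print(found)
```

Lowering the threshold makes the loop accept more points as outliers; raising it makes the search more conservative.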
Creates sine and cosine variables to capture intramonthly and intrayearly cycles.
fourier_vars(k = 1, l = 1, dates)
k |
Number of pairs of the yearly cycle trigonometric variables |
l |
Number of pairs of the monthly cycle trigonometric variables |
dates |
Vector of dates in a date format |
Matrix with Fourier variables
# create a vector of dates
dates <- seq.Date(from = as.Date("2023-01-02"), by = "weeks", length.out = 100)
# Create a matrix with 20 yearly and 6 monthly pairs of sine and cosine variables
X <- fourier_vars(k = 20, l = 6, dates = dates)
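As a rough illustration of what pairs of sine and cosine variables for yearly and monthly cycles look like, the following base R sketch builds such a matrix by hand. It is a sketch under stated assumptions, not the code behind fourier_vars(): the use of day-of-year and day-of-month fractions as cycle positions, the helper name trig_pairs(), and the column names are all hypothetical.

```r
dates <- seq.Date(from = as.Date("2023-01-02"), by = "weeks", length.out = 100)
k <- 2  # yearly pairs
l <- 1  # monthly pairs

# Position within the year (assumed: day of year / days in that year)
year <- as.numeric(format(dates, "%Y"))
leap <- (year %% 4 == 0 & year %% 100 != 0) | year %% 400 == 0
yday_frac <- as.numeric(format(dates, "%j")) / ifelse(leap, 366, 365)

# Position within the month (assumed: day of month / days in that month)
first_of_month <- as.Date(format(dates, "%Y-%m-01"))
last_of_month <- as.Date(format(first_of_month + 31, "%Y-%m-01")) - 1
mday_frac <- as.numeric(format(dates, "%d")) / as.numeric(format(last_of_month, "%d"))

# Hypothetical helper: n sine/cosine pairs for a cycle fraction
trig_pairs <- function(frac, n, prefix) {
  out <- do.call(cbind, lapply(seq_len(n), function(j) {
    cbind(sin(2 * pi * j * frac), cos(2 * pi * j * frac))
  }))
  colnames(out) <- paste0(prefix, rep(c("sin", "cos"), times = n),
                          rep(seq_len(n), each = 2))
  out
}

X_sketch <- cbind(trig_pairs(yday_frac, k, "year_"),
                  trig_pairs(mday_frac, l, "month_"))
dim(X_sketch)  # one row per date, two columns per pair
```

Each additional pair adds one higher-frequency harmonic, so larger k and l allow a more flexible seasonal pattern at the cost of more regressors.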
Weekly data beginning 2 February 1991, ending 20 January 2017. Units are "million barrels per day".
gasoline.data
A data frame with 1355 rows and 2 columns:
date in a date format
gasoline consumption
Originally from the US Energy Information Administration. Copied from the fpp2 package.
Can be used to generate moving holiday regressors for the U.S. holidays of Easter, Labor Day, and Thanksgiving, or for Israeli Rosh Hashanah and Pesach. The variables are computed using the Easter formula in Table 2 of Findley et al. (1998). Uses calendar centring to avoid bias.
genhol(dates, holiday.dates, start = 7, end = 7)
dates |
a vector of class "Date", containing the data dates |
holiday.dates |
a vector of class "Date", containing the occurrences of the holiday. It can be generated with as.Date(). |
start |
integer, shifts backwards the start point of the holiday. Use negative values if start is after the specified date. |
end |
integer, shifts end point of the holiday. Use negative values if end is before the specified date. |
a matrix with holiday variables that can be used as a user defined variable in boiwsa().
Findley, D.F., Monsell, B.C., Bell, W.R., Otto, M.C. and Chen, B.C. (1998). New capabilities and methods of the X-12-ARIMA seasonal-adjustment program. Journal of Business & Economic Statistics, 16(2), pp. 127-152.
# Creating moving holiday variable for Israeli Rosh Hashanah
data(gasoline.data)
data(holiday_dates_il) # dates of Israeli Rosh Hashanah and Pesach
movehol <- genhol(gasoline.data$date, holiday.dates = holiday_dates_il$rosh)
Rosh Hashanah and Pesach dates
holiday_dates_il
A data frame with 51 rows and 3 variables:
year
Year
rosh
Rosh Hashanah date
pesah
Pesach date
Personal
Weekly data beginning 11 January 2014, ending 4 January 2020.
lbm
A data frame with 313 rows and 2 columns:
date in a date format
number of initial registrations
Internal
Creates a matrix with additive outlier variables. Uses the original data dates and the user specified outlier dates.
my_ao(dates, out.list)
dates |
Vector of dates in a date format |
out.list |
Vector of outlier dates in a date format |
AO matrix with outlier variables
# create a sequence of dates
dates <- seq.Date(from = as.Date("2023-01-02"), by = "weeks", length.out = 100)
# create a vector of outlier dates
my_ao_dates <- as.Date(c("2023-01-02", "2023-01-03"))
# create a matrix of AO variables
my_ao(dates = dates, out.list = my_ao_dates)
# as you can see there is only one column corresponding to 2023-01-02,
# the second date is ignored because it is not present in the dates vector
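The behaviour described in this example (one indicator column per outlier date that actually occurs in the data dates) can be mimicked in a few lines of base R. This is only a sketch of the idea, not the body of my_ao(); the object names and the column naming scheme are assumptions.

```r
dates <- seq.Date(from = as.Date("2023-01-02"), by = "weeks", length.out = 100)
out_dates <- as.Date(c("2023-01-02", "2023-01-03"))

# Keep only outlier dates that are present in the data dates
kept <- out_dates[out_dates %in% dates]

# One 0/1 column per retained date: 1 on the outlier date, 0 elsewhere
AO_sketch <- vapply(
  seq_along(kept),
  function(i) as.numeric(dates == kept[i]),
  numeric(length(dates))
)
colnames(AO_sketch) <- paste("AO", format(kept))  # hypothetical column names

dim(AO_sketch)
colSums(AO_sketch)
```

Because 2023-01-03 is not a Monday in the weekly sequence, it is dropped, and the resulting matrix has a single column with a single 1 in the first row.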
Creates a dummy moving holiday variable for the weekly number of initial registrations at the Employment Service in Israel.
my_rosh(dates, holiday.dates, start = -11, end = 12)
dates |
a vector of class "Date", containing the data dates |
holiday.dates |
a vector of class "Date", containing the occurrences of the holiday. It can be generated with as.Date(). |
start |
-11 for rosh, 3 for pesach |
end |
12 for rosh, -1 for pesach |
rosh holiday variable
# Creating moving holiday dummy variable for Israeli Rosh Hashanah
data(gasoline.data)
data(holiday_dates_il) # dates of Israeli Rosh Hashanah and Pesach
movehol <- my_rosh(gasoline.data$date, holiday.dates = holiday_dates_il$rosh)
AR spectrum of the (detrended) original and seasonally adjusted data. Computed using stats::spec.ar() with order set to 60.
plot_spec(x)
x |
boiwsa results |
AR spectrum plot
# Not run
# Seasonal adjustment of weekly US gasoline production
res <- boiwsa(x = gasoline.data$y, dates = gasoline.data$date)
plot_spec(res)
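For readers who want to see the underlying computation, the sketch below estimates an AR spectrum with stats::spec.ar() and order 60 on a synthetic weekly series. It is not the code of plot_spec(): the simulated data and the use of first differences as the detrending step are assumptions, since the detrending method is not specified here.

```r
# Synthetic weekly series: linear trend + yearly cycle + noise (illustrative only)
set.seed(7)
n <- 400
t <- seq_len(n)
x <- 0.02 * t + sin(2 * pi * t / 52.18) + rnorm(n, sd = 0.3)

# Assumed detrending step: first differences
dx <- diff(x)

# AR spectrum with the order fixed at 60, without drawing a plot
sp <- stats::spec.ar(dx, order = 60, plot = FALSE)

# Frequency (in cycles per observation) at which the estimated spectrum peaks
peak_freq <- sp$freq[which.max(sp$spec)]
print(peak_freq)
```

Calling plot(sp) on the returned object draws the spectrum. Comparing such a spectrum before and after adjustment is a common way to check whether peaks at seasonal frequencies have been removed.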
S3 method for objects of class "boiwsa". Produces a ggplot object of seasonally decomposed time series.
## S3 method for class 'boiwsa'
plot(x, ...)
x |
Result of boiwsa |
... |
Additional arguments (currently not used). |
S3 method for 'boiwsa' class. Returns forecasts and other information using a combination of nonseasonal auto.arima and estimates from boiwsa.
## S3 method for class 'boiwsa'
predict(object, ...)
object |
An object of class |
... |
Additional arguments:
|
A list containing the forecast values and ARIMA fit.
This is the generic print function.
print(x, ...)
x |
An object to print. |
... |
Additional arguments (currently not used). |
S3 method for objects of class "boiwsa". Prints a short model summary including the number of trigonometric terms and the position of outliers.
## S3 method for class 'boiwsa'
print(x, ...)
x |
Result of |
... |
Additional arguments (currently not used). |
Aggregates the count of full working days within a week and normalizes it.
simple_td(dates, df.td)
dates |
a vector of class "Date", containing the data dates |
df.td |
Data frame with working days. It should consist of 2 columns named "date" and "WORKING_DAY_PART". The date column should be of class "Date". WORKING_DAY_PART should be similar to ISR_WORKING_DAY_PART in dates_il. |
matrix with trading day variables
library(dplyr)
data(dates_il)
data(gasoline.data)
# keep the date and working-day columns and rename them as simple_td() expects
df.td <- dates_il %>%
  dplyr::select(DATE_VALUE, ISR_WORKING_DAY_PART) %>%
  `colnames<-`(c("date", "WORKING_DAY_PART")) %>%
  dplyr::mutate(date = as.Date(date))
td <- simple_td(dates = gasoline.data$date, df.td = df.td)
This is the generic summary function.
summary(object, ...)
object |
An object to summarize. |
... |
Additional arguments (currently not used). |
S3 method for objects of class "boiwsa". Prints the regression summary output.
## S3 method for class 'boiwsa'
summary(object, ...)
object |
An object of class |
... |
Additional arguments (currently not used). |