Forecsting Optimization and FB prophet Hyperparameter Tuning
Introduction
Time series analysis is an important topic in machine learning that is present in almost every industry and a problem that every data scientist will face at some point in their career. Aside from the fundamentals of the working with dates and plotting, there are other approaches and models for characterising the underlying dynamic of a time series such as space state models (Arima, Armas, etc.). This article builds on the previous post on Facebook Prophet library, this latter is used to forecast time series data and fits nonlinear trends with yearly, monthly, and daily seasonality, as well as holiday effects. It works best with time series with substantial seasonal effects and historical data from multiple seasons. Prophet is tolerant of missing data and trend alterations, and it usually handles outliers well. However, the model has several parameters to tweak to acquire the best fit. A grid search (random search or Cartesian search) is the traditional approach of determining the optimal combination of parameters, although this can be timeconsuming, especially when crossvalidating models with many folds.
Basic Prophet model
To begin, we must make sure that our date is in datetime format, and the columns must be renamed to the Prophetrequired names of ds and y. We can then build a Prophet object to fit our data into the model, generate future predictions, visualize the forecast, and perform cross validation to check how well the model fits.
df = read.csv("https://raw.githubusercontent.com/KamranAfzali/StatisticalSupplementaryMaterials/master/DHS_Daily_Report.csv")
#Transform the Date Variable
df$Date = strptime(x = df$Date,
format = "%m/%d/%Y")
df$Date = as.Date(df$Date)
df = df %>% select(Date,
"Total.Individuals.in.Shelter",
Easter,
Thanksgiving,
Christmas)
#Change variable name
colnames(df)[1] = "ds"
colnames(df)[2] = "y"
head(df)%>%kableExtra::kable()
ds  y  Easter  Thanksgiving  Christmas 

20130821  49673  0  0  0 
20130822  49690  0  0  0 
20130823  49548  0  0  0 
20130824  49617  0  0  0 
20130825  49858  0  0  0 
20130826  49877  0  0  0 
easter_dates = subset(df, df$Easter == 1)
easter_dates = easter_dates$ds
easter = tibble(holiday = 'easter',
ds = easter_dates,
lower_window = 4,
upper_window = +2)
#thanksgiving
thanksgiving_dates = subset(df, df$Thanksgiving == 1)
thanksgiving_dates = thanksgiving_dates$ds
thanksgiving = tibble(holiday = 'thanksgiving',
ds = thanksgiving_dates,
lower_window = 3,
upper_window = 1)
#merge holidays
holidays = bind_rows(easter, thanksgiving)
head(holidays)%>%kableExtra::kable()
holiday  ds  lower_window  upper_window 

easter  20140420  4  2 
easter  20150405  4  2 
easter  20160327  4  2 
easter  20170416  4  2 
easter  20180401  4  2 
easter  20190421  4  2 
Crossvalidation plan
A kfold crossvalidation divides the training data into k groups of roughly equal size (referred to as “folds”) hence the number of resamples in simple kfold crossvalidation is equal to k. The training data was resampled to include k1 of the folds, while the test set included the last (kth) fold. Prophet includes a builtin cross validation feature this function will take your data and train the model over the specified time period. It will then forecast a time frame that you specify.
Grid search specification
We use the expand grid function because it enables the use of a random yet systematic procedure, therefore, it is important to set the seed to fix the grid order. There will be k predictions per fold due to the kfold crossvalidation, and the mean of the performance metric (RMSE, Rsquared, etc.) will be calculated. Here are some parameters that you can take into consideration.
Growth
This parameter is easiest to understand and implement, linear growth assumes no upbound where exponential growth slows down (or accelerates). The tricky part of this parameter is to specify the upper and lower bounds of the prediction when it comes to exponential growth, these upper and lower bounds can change over time or fixed values can be assigned.
Holidays
A holiday is a period during which the day has the same impact each year,the tricky part happens when you’re not modeling daily data. For instance, it is a mistakes to enter the holidays as a daily parameter when trying to model an hourly timestamp. Another parameter that handles holidays is holidays_prior_scale that determines the magnitude of the holidays effect on the forecasts.
Changepoints
A changing point is a point in the data where a sudden and abrupt change in the trend occurs, or the point of change is the time frame in which this major change occurred. There are four changepoint hyperparameters: changepoints, n_changepoints, changepoint_range, and changepoint_prior_scale. Changepoint parameters are used to specify the date of the changepoint rather than letting the Prophet determine the date of the changepoint. If you specify your own changes, Prophet will not estimate any further changes.
Seasonalities
It is these parameters that make the Prophet standout, as changing a few values can make a considerable improvement and give great insights. The first major parameter is seasonity_mode, which can be additive or multiplicative, there is also a parameter seasonity_prior_scale to make the seasonality more flexible. The other parameter you can tweak is the number of Fourier components (fourier_order) of each seasonality
prophetGrid = expand.grid(changepoint_prior_scale = c(0.05, 0.1),
seasonality_prior_scale = c(5, 10, 15),
holidays_prior_scale = c(5,10),
seasonality.mode = c('multiplicative', 'additive'))
head(prophetGrid)%>%kableExtra::kable()
changepoint_prior_scale  seasonality_prior_scale  holidays_prior_scale  seasonality.mode 

0.05  5  5  multiplicative 
0.10  5  5  multiplicative 
0.05  10  5  multiplicative 
0.10  10  5  multiplicative 
0.05  15  5  multiplicative 
0.10  15  5  multiplicative 
results = vector(mode = 'numeric',
length = nrow(prophetGrid))
Hyperparameter Tuning endtoend process
The barebone model used all the default parameters because no parameters were specified. Check out my earlier post to learn about Prophet’s default parameters. We create a parameter grid with all of the parameters and values we wish to cycle over, calculate the mean value of the performance matrix, and find the optimum parameter combination in terms of MAPE.

We’ll use kfold crossvalidation to generate a crossvalidation plan.

Define the tunable model specification: specify the tunable parameters clearly.

Create a parameter grid workflow in which we add model specification from the grid.

Check that the grid has all selected tunable parameters and ranges set correctly.

Define the search grid: the type of grid and its size must be specified.

Choose the best model: the one with the lowest RMSE or the highest Rsquared, which may or may not be the same model.

Provide the tuned methodology to refit the best model(s) using the test data.
#Parameter tuning
for (i in 1:nrow(prophetGrid)) {
#Fetch parameters
parameters = prophetGrid[i, ]
#Build the model
m = prophet(yearly.seasonality = TRUE,
weekly.seasonality = TRUE,
daily.seasonality = FALSE,
holidays = holidays,
seasonality.mode = parameters$seasonality.mode,
seasonality.prior.scale = parameters$seasonality_prior_scale,
holidays.prior.scale = parameters$holidays_prior_scale,
changepoint.prior.scale = parameters$changepoint_prior_scale)
m = add_regressor(m, 'Christmas')
m = fit.prophet(m, df)
#Crossvalidation
df.cv = cross_validation(model = m,
horizon = 31,
units = 'days',
period = 14,
initial = 2400)
#store the results
results[i] = accuracy(df.cv$yhat, df.cv$y)[1, 2]
print(i)
}
prophetGrid = cbind(prophetGrid, results)
prophetGrid%>%kableExtra::kable()
changepoint_prior_scale  seasonality_prior_scale  holidays_prior_scale  seasonality.mode  results 

0.05  5  5  multiplicative  1551.764 
0.10  5  5  multiplicative  1554.646 
0.05  10  5  multiplicative  1553.507 
0.10  10  5  multiplicative  1557.264 
0.05  15  5  multiplicative  1551.514 
0.10  15  5  multiplicative  1556.042 
0.05  5  10  multiplicative  1550.701 
0.10  5  10  multiplicative  1557.542 
0.05  10  10  multiplicative  1552.467 
0.10  10  10  multiplicative  1557.381 
best_params = prophetGrid[prophetGrid$results == min(results), ]
best_params
changepoint_prior_scale  seasonality_prior_scale  holidays_prior_scale  seasonality.mode  results  

7  0.05  5  10  multiplicative  1550.701 