
# Time series prediction (V) -- prophet model

2021-07-28 15:59

Welcome to my personal blog to read the original: https://xkw168.github.io/2019/05/20/ Time series prediction - five prophet model. HTML

(1) Data preprocessing

(2) AR model (Autoregressive model)

(3) XGBoost model

(4) LSTM model

(5) Prophet model

Model principle

The Prophet model is an open-source model from Facebook designed for large-scale time series analysis. It is based on an additive model in which a nonlinear trend is fitted together with yearly, weekly, and daily periodicity and the influence of holidays. The details can be found here. The model works best on data with strong periodicity and several cycles of history, and it copes well with missing values, trend shifts, and outliers. Prophet adopts a distinctive strategy (as shown in the original figure): while the whole process can be fully automated when necessary, data analysts can add their own judgment to the prediction through a set of key model parameters and options.

The model decomposes the series as

$$y(t) = g(t) + s(t) + h(t) + \epsilon_t$$

$g(t)$ is the trend term, used to fit the aperiodic changes; $s(t)$ represents periodic change (such as seasonal variation); $h(t)$ represents the impact of holidays (usually a special effect at certain points in time); the error term $\epsilon_t$ covers the changes that the model does not capture.

Treating the forecast as a curve-fitting problem has several advantages:

Expandable: curve fitting can easily accommodate seasonality with multiple periods, and it can be applied to many types of data.

Data flexibility: unlike the ARIMA model, curve fitting does not need equally spaced observations, so no special operations on the data (such as interpolation) are required.

Fast: compared with traditional model training, curve fitting is faster, which helps data scientists iterate.

Easy-to-explain variables: most parameters of the model have a clear physical meaning, so people with some data analysis experience can quickly turn their background knowledge into new parameters and introduce them into the model.

Model installation

Installing the Prophet model with conda is recommended:

conda install -c conda-forge fbprophet

Model implementation

Note: the Prophet model restricts its input. It accepts a DataFrame in which the time column is named ds and the observation column is named y.
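As a small sketch of that input format, the snippet below reshapes a made-up table into the ds / y layout using pandas only. The raw column names DATE and MAX and all the values are hypothetical, chosen just for illustration.

```python
import pandas as pd

# Hypothetical raw data: column names and values are made up for illustration.
raw = pd.DataFrame({
    "DATE": ["2019-01-01", "2019-01-02", "2019-01-03"],
    "MAX": ["21.5", "n/a", "19.0"],  # the middle reading cannot be parsed
})

# Rename to the column names Prophet expects, then fix the dtypes.
prepared = raw.rename(columns={"DATE": "ds", "MAX": "y"})
prepared["ds"] = pd.to_datetime(prepared["ds"])
# errors="coerce" turns unparseable values into NaN instead of raising.
prepared["y"] = pd.to_numeric(prepared["y"], errors="coerce")

print(prepared.dtypes)
```

The unparseable reading ends up as a missing value in y rather than stopping the script, which fits the model's tolerance for missing values mentioned earlier.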

    import os

    import fbprophet  # from version 1.0 onward the package is published as `prophet`
    import pandas as pd
    from pandas.plotting import register_matplotlib_converters


    def prophet_predict_fb(observed_data, x_name="ds", y_name="y", forecast_cnt=365, frep="D", file_name=""):
        """
        Predict a time series with the fbprophet library.
        :param observed_data: time series data (DataFrame format)
        (two columns, one is time in YYYY-MM-DD or YYYY-MM-DD HH:MM:SS format and the other is numeric data)
        :param x_name: x column name (time data), usually DATE
        :param y_name: y column name (numeric data), e.g. HMD, MAX...
        :param forecast_cnt: how many points need to be predicted
        :param frep: the frequency/period of prediction
        :param file_name: base name of the saved figures; nothing is plotted when it is empty
        :return: the forecast DataFrame, or None if a required column is missing
        """
        def check_parameter_validity():
            if x_name not in observed_data.keys():
                raise KeyError("observed_data doesn't have a column named %s" % x_name)
            if y_name not in observed_data.keys():
                raise KeyError("observed_data doesn't have a column named %s" % y_name)

        try:
            check_parameter_validity()
        except KeyError as e:
            print("key error: %s" % str(e))
            return None

        # Prophet requires the columns to be named ds (time) and y (observation)
        observed_data = observed_data.rename(columns={x_name: "ds", y_name: "y"})
        observed_data["ds"] = pd.to_datetime(observed_data["ds"])
        observed_data["y"] = pd.to_numeric(observed_data["y"], downcast='float', errors='coerce')

        df2_pro = fbprophet.Prophet(changepoint_prior_scale=0.1)
        df2_pro.fit(observed_data)
        # extend the observed dates by forecast_cnt periods and predict over the whole range
        future_date = df2_pro.make_future_dataframe(periods=forecast_cnt, freq=frep)
        df2_forecast = df2_pro.predict(future_date)

        # register a datetime converter for matplotlib
        register_matplotlib_converters()
        if file_name:
            os.makedirs('./result', exist_ok=True)  # savefig fails if the folder is missing
            fig1 = df2_pro.plot(df2_forecast, xlabel=x_name, ylabel=y_name)
            fig1.show()
            fig1.savefig('./result/%s.png' % file_name)
            fig2 = df2_pro.plot_components(df2_forecast)
            fig2.show()
            fig2.savefig('./result/%s.png' % str(file_name + "1"))
        return df2_forecast

Key parameters

The model is well encapsulated, so few parameters need to be changed.

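changepoint_prior_scale: controls how flexible the trend is. The higher the value, the more readily the trend term changes (and the easier it is to overfit); the lower the value, the smoother and stiffer the trend. The library default is 0.05.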
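To give a feel for what this parameter regulates: in Prophet's default piecewise-linear trend, the slope may change by some adjustment at each changepoint, and changepoint_prior_scale is the scale of the prior placed on those adjustments. The sketch below only evaluates such a trend for given adjustments in plain Python; it does not fit anything, and the function name and sample numbers are hypothetical.

```python
def piecewise_linear_trend(t, k, m, changepoints, deltas):
    """Evaluate a continuous piecewise-linear trend at time t.

    k is the base slope and m the base offset. At every changepoint s_j
    that has already passed, the slope changes by delta_j and the offset
    is corrected by -s_j * delta_j so the line stays continuous.
    """
    slope = k
    offset = m
    for s_j, delta_j in zip(changepoints, deltas):
        if t >= s_j:
            slope += delta_j
            offset -= s_j * delta_j
    return slope * t + offset


changepoints = [10.0, 20.0]
small_adjustments = [0.01, -0.01]  # trend stays close to a straight line
large_adjustments = [0.5, -1.0]    # trend bends sharply at the changepoints

for t in (0.0, 10.0, 20.0, 30.0):
    stiff = piecewise_linear_trend(t, 1.0, 0.0, changepoints, small_adjustments)
    flexible = piecewise_linear_trend(t, 1.0, 0.0, changepoints, large_adjustments)
    print(t, stiff, flexible)
```

With small adjustments the curve is almost the straight line k * t + m, while large adjustments let it follow abrupt changes in the data, at the risk of chasing noise.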

Note that this model is best suited to analyzing data with obvious periodicity.
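As a rough illustration of the additive form y(t) = g(t) + s(t) + h(t) + error and of why clear periodicity helps, the sketch below builds a synthetic daily series from a trend, a weekly pattern, and two holiday spikes, then recovers the weekly pattern by averaging the detrended values per weekday. Every name and number is made up, and the recovery step assumes the trend is already known, which is far simpler than what Prophet actually does.

```python
import random

WEEKLY = [2.0, 1.0, 0.0, -1.0, -2.0, -1.0, 1.0]  # hypothetical weekly effect, sums to zero
HOLIDAYS = {50, 200}                             # hypothetical holiday day indices


def trend(t):
    """g(t): a slow linear trend."""
    return 10.0 + 0.05 * t


def seasonality(t):
    """s(t): the weekly effect for day index t."""
    return WEEKLY[t % 7]


def holiday(t):
    """h(t): a one-day bump on holidays, zero otherwise."""
    return 5.0 if t in HOLIDAYS else 0.0


def simulate(n_days, noise_sd=0.1, seed=0):
    """Generate y(t) = g(t) + s(t) + h(t) + Gaussian noise."""
    rng = random.Random(seed)
    return [trend(t) + seasonality(t) + holiday(t) + rng.gauss(0.0, noise_sd)
            for t in range(n_days)]


def estimate_weekly_pattern(y):
    """Subtract the (assumed known) trend and average what is left per weekday."""
    sums = [0.0] * 7
    counts = [0] * 7
    for t, value in enumerate(y):
        if t in HOLIDAYS:
            continue  # holiday spikes would bias the weekday average
        sums[t % 7] += value - trend(t)
        counts[t % 7] += 1
    return [s / c for s, c in zip(sums, counts)]


y = simulate(364)
estimated = estimate_weekly_pattern(y)
print([round(v, 2) for v in estimated])
```

Because the weekly pattern repeats many times in the series, averaging over the repeats suppresses the noise; with only one or two cycles of data the estimate would be much less reliable.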
