Tecnología y Ciencias del Agua, vol. VIII, no. 2, March-April 2017, pp. 51-60
Kan et al., Daily streamflow simulation based on improved machine learning method
ISSN 2007-2422
Non-updating RR simulation of the PKEK model
The non-updating modeling approach of the PKEK model is similar to that of the previously proposed NU-PEK model. The difference is the addition of the K-means clustering algorithm. The modeling approach of the PKEK model is as follows:
$$X_t^{(Q\_SIM)} = \left( Q_{t-1}^{(SIM)},\ Q_{t-2}^{(SIM)},\ \ldots,\ Q_{t-n_Q}^{(SIM)} \right)^T \tag{1}$$

$$X_t^{(SWCR)} = \left( SWCRs_t^{(1)},\ SWCRs_t^{(2)},\ \ldots,\ SWCRs_t^{(n_P)} \right)^T \tag{2}$$

$$X_t^{(S)} = \left( IVS_{Q\_SIM}\left( X_t^{(Q\_SIM)} \right),\ IVS_{SWCR}\left( X_t^{(SWCR)} \right) \right)^T \tag{3}$$

$$Q_t^{(EST)} = F_{EBPNN}\left( F_{K\text{-means}}\left( X_t^{(S)} \right) \right) \tag{4}$$

$$E_t^{(EST)} = F_{KNN}\left( Q_t^{(EST)},\ F_{K\text{-means}}\left( X_t^{(S)} \right) \right) \tag{5}$$

$$Q_t^{(SIM)} = Q_t^{(EST)} + E_t^{(EST)} \tag{6}$$
where $X_t^{(Q\_SIM)}$ denotes the candidate simulated antecedent discharge (SAD) input vector; $X_t^{(SWCR)}$ denotes the candidate sliding window cumulative rainfall (SWCR) input vector; $Q_{t-i}^{(SIM)}$ denotes the SAD at lag $i$, $i = 1, 2, \ldots, n_Q$; $SWCRs_t^{(i)}$ denotes the SWCR with sliding window width $i$, $i = 1, 2, \ldots, n_P$; $n_Q$ and $n_P$ denote the orders of the SAD and the SWCR, respectively; $IVS_{Q\_SIM}$ denotes the PMI-based IVS for the candidate SAD input vector; $IVS_{SWCR}$ denotes the PMI-based IVS for the candidate SWCR input vector; $X_t^{(S)}$ denotes the selected input vector; $Q_t^{(EST)}$ denotes the estimated output discharge associated with $X_t^{(S)}$; $E_t^{(EST)}$ denotes the estimated output discharge error associated with $X_t^{(S)}$; $Q_t^{(SIM)}$ denotes the simulated output discharge, which is the final prediction of the output discharge; $F_{EBPNN}$ denotes the EBPNN discharge estimation; $F_{KNN}$ denotes the KNN discharge error estimation; and $F_{K\text{-means}}$ denotes the K-means clustering algorithm. The non-updating simulation procedure of the PKEK model for each flood event is almost the same as that of the NU-PEK model. The difference is that, after the PMI-based separate IVS, the K-means clustering algorithm is executed to classify the input vector and feed it to the PEK module of its class, which calculates the forecasted discharge value.
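The recursion in Eqs. (1)-(6) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the names `swcr`, `simulate_event`, `sad_idx`, `swcr_idx`, `assign_cluster`, `estimators` and `error_models` are hypothetical, the SWCR window is assumed to end at day t, the PMI-based IVS is reduced to a fixed list of selected indices, and the EBPNN and KNN components are replaced by arbitrary callables, one pair per cluster.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]


def swcr(rain: Sequence[float], t: int, width: int) -> float:
    """Cumulative rainfall over the `width` days ending at day t.

    Assumption: the window includes day t; the text does not fix this detail.
    """
    start = max(0, t - width + 1)
    return sum(rain[start:t + 1])


def simulate_event(
    rain: Sequence[float],
    q_warmup: Sequence[float],
    n_q: int,
    n_p: int,
    sad_idx: Sequence[int],
    swcr_idx: Sequence[int],
    assign_cluster: Callable[[Vector], int],
    estimators: Dict[int, Callable[[Vector], float]],
    error_models: Dict[int, Callable[[float, Vector], float]],
) -> List[float]:
    """Non-updating simulation of one flood event.

    Simulated discharge, not observed discharge, is fed back as the
    antecedent discharge. `q_warmup` supplies the first values of the
    series and must contain at least n_q entries.
    """
    if len(q_warmup) < n_q:
        raise ValueError("q_warmup must contain at least n_q values")
    q_sim = list(q_warmup)
    for t in range(len(q_sim), len(rain)):
        # Eq. (1): candidate SAD vector built from earlier simulated values.
        x_q = [q_sim[t - i] for i in range(1, n_q + 1)]
        # Eq. (2): candidate SWCR vector for window widths 1..n_p.
        x_p = [swcr(rain, t, i) for i in range(1, n_p + 1)]
        # Eq. (3): selected inputs (stand-in for the PMI-based IVS).
        x_s = [x_q[j] for j in sad_idx] + [x_p[j] for j in swcr_idx]
        # K-means step: pick the class, then use that class's modules.
        k = assign_cluster(x_s)
        q_est = estimators[k](x_s)            # Eq. (4)
        e_est = error_models[k](q_est, x_s)   # Eq. (5)
        q_sim.append(q_est + e_est)           # Eq. (6)
    return q_sim
```

Passing the estimators and error models as dictionaries keyed by class label mirrors the idea that each class has its own PEK module; any trained regressor with a matching call signature could be plugged in.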
Calibration of the PKEK model
The calibration of the PKEK model is almost the same as that of the NU-PEK model. The difference is the addition of the calibration of the K-means clustering method. The K-means algorithm is calibrated by the iterative method, and this calibration process is repeated 500 times with different initial parameters to avoid the local minimum problem. The optimal number of classes is determined by the Silhouette value. In this paper, approximately 70% of the flood events are used to create the calibration data and the remaining events are used to create the validation data.
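The K-means calibration described above can be illustrated with a small pure-Python sketch. Several details are assumptions rather than statements from the paper: the "iterative method" is taken to be Lloyd's algorithm, the best of the restarts is the one with the lowest within-cluster sum of squared distances, the Silhouette value is the mean silhouette coefficient over all points, and the candidate class counts are supplied by the caller. The function names (`kmeans_once`, `kmeans_restarts`, `silhouette`, `choose_k`) are hypothetical.

```python
import math
import random
from typing import Dict, List, Optional, Sequence, Tuple

Point = Sequence[float]


def dist(a: Point, b: Point) -> float:
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans_once(points: Sequence[Point], k: int, rng: random.Random,
                max_iter: int = 100) -> Tuple[List[int], float]:
    """One run of Lloyd's algorithm from randomly sampled initial centers."""
    centers = [list(p) for p in rng.sample(list(points), k)]
    labels: Optional[List[int]] = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: dist(p, centers[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    sse = sum(dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))
    return labels, sse


def kmeans_restarts(points: Sequence[Point], k: int, n_restarts: int = 500,
                    seed: int = 0) -> List[int]:
    """Repeat K-means from different initial centers; keep the lowest SSE."""
    rng = random.Random(seed)
    runs = [kmeans_once(points, k, rng) for _ in range(n_restarts)]
    return min(runs, key=lambda run: run[1])[0]


def silhouette(points: Sequence[Point], labels: Sequence[int]) -> float:
    """Mean silhouette coefficient; singleton clusters contribute 0."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for i, p in enumerate(points):
        own = [dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:
            continue
        a = sum(own) / len(own)
        b = min(
            sum(dist(p, q) for q, lab in zip(points, labels) if lab == c)
            / sum(1 for lab in labels if lab == c)
            for c in clusters if c != labels[i]
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def choose_k(points: Sequence[Point], k_values: Sequence[int],
             n_restarts: int = 500, seed: int = 0) -> Tuple[int, Dict[int, float]]:
    """Pick the class count with the highest mean silhouette value."""
    scores = {k: silhouette(points, kmeans_restarts(points, k, n_restarts, seed))
              for k in k_values}
    return max(scores, key=scores.get), scores
```

A call such as `choose_k(points, range(2, 6))` returns the selected number of classes together with the silhouette value for each candidate. Library implementations of K-means and the silhouette score would normally be preferred for real data; the sketch is written out only to make the restart-and-select logic explicit.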
Results and discussion
Model structure and parameter calibration
Input variable selection
Considering that we are studying daily simulation, we set the orders of the SWCR and the SAD (i.e., $n_P$ and $n_Q$) equal to 10 and assume that 10 is large enough. This means that rainfall becomes runoff with a lag of at most 10 days. After the IVS, we found that the lag times of most selected SADs are less than 10. Therefore, we