Training subset selection in Hourly Ontario Energy Price forecasting using time series clustering-based stratification


Karol Lina Lopez, Christian Gagné, Germán Castellanos-Dominguez and Mauricio Orozco-Alzate


Abstract - Training a learning-based forecasting method to a satisfactory level of performance often requires a large dataset. Indeed, any data-driven method needs examples that adequately represent the phenomenon being modeled in order to work properly, which often implies using large datasets to ensure that the phenomenon of interest is properly sampled. However, learning from time series composed of too many samples can also be a problem, given that the computational requirements of learning algorithms can easily grow polynomially with the training set size. To identify representative examples in a dataset, we propose a methodology that uses clustering-based stratification of time series to select a training data subset. The principle for constructing a representative sample set with this method consists in selecting heterogeneous instances from all the clusters composing the dataset. The results show that, with a small number of training examples obtained through the proposed clustering-based stratification, we can preserve the performance and improve the stability of models such as artificial neural networks and support vector regression, while training at a much lower computational cost. We illustrate the methodology by forecasting the one-step-ahead Hourly Ontario Energy Price (HOEP).
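
The selection principle described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the clustering algorithm, the distance measure, or how many instances are drawn per cluster. The sketch therefore assumes plain k-means with Euclidean distance on fixed-length sliding windows and proportional sampling from each cluster. The function names (`sliding_windows`, `kmeans`, `stratified_subset`), the synthetic price-like series, and all parameter values are hypothetical.

```python
import random
from math import dist


def sliding_windows(series, width):
    """Cut a series into overlapping fixed-length windows (one instance each)."""
    return [series[i:i + width] for i in range(len(series) - width + 1)]


def kmeans(windows, k, iters=20, seed=0):
    """Plain Lloyd's k-means (assumed here); returns one cluster label per window."""
    rng = random.Random(seed)
    centroids = rng.sample(windows, k)
    labels = [0] * len(windows)
    for _ in range(iters):
        # Assign every window to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(w, centroids[c]))
                  for w in windows]
        # Move each centroid to the mean of its members (skip empty clusters).
        for c in range(k):
            members = [w for w, lab in zip(windows, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def stratified_subset(labels, fraction, seed=0):
    """Draw the same fraction of indices from every cluster, at least one each."""
    rng = random.Random(seed)
    by_cluster = {}
    for index, lab in enumerate(labels):
        by_cluster.setdefault(lab, []).append(index)
    chosen = []
    for members in by_cluster.values():
        n = max(1, round(fraction * len(members)))
        chosen.extend(rng.sample(members, n))
    return sorted(chosen)


# Synthetic hourly series with a daily pattern (a stand-in, not real HOEP data).
noise = random.Random(42)
series = [30 + (10 if 8 <= t % 24 < 20 else 0) + noise.gauss(0, 1)
          for t in range(24 * 30)]

windows = sliding_windows(series, width=6)
labels = kmeans(windows, k=4)
subset = stratified_subset(labels, fraction=0.1)
training_windows = [windows[i] for i in subset]
```

Because every cluster contributes instances, the reduced training set keeps at least one example of each pattern found by the clustering, which is the property the abstract relies on. A forecasting model would then be trained on `training_windows` instead of the full set of windows.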


Bibtex:

@article{Lopez1085,
    author    = { Karol Lina Lopez and Christian Gagné and Germán Castellanos-Dominguez and Mauricio Orozco-Alzate },
    title     = { Training subset selection in Hourly Ontario Energy Price forecasting using time series clustering-based stratification },
    volume    = { 156 },
    number    = { 25 May 2015 },
    pages     = { 268--279 },
    year      = { 2015 },
    journal   = { Neurocomputing },
    keywords  = { Stratification; Data selection; Stratified sampling; Forecasting models; Hourly Ontario Energy Price },
    web       = { http://dx.doi.org/10.1016/j.neucom.2014.12.052 }
}

Last modification: 2015/03/10 by cgagne

     
   
   

©2002-. Computer Vision and Systems Laboratory. All rights reserved