18 results for classification and regression trees
Abstract:
The multi-dimensional classification problem is a generalisation of the recently popularised task of multi-label classification, where each data instance is associated with multiple class variables. Relatively little research has been carried out specifically on multi-dimensional classification and, although one of the core goals is similar (modelling dependencies among classes), there are important differences, namely a higher number of possible classifications. In this paper we present a method for multi-dimensional classification, drawing on the most relevant multi-label research and combining it with important novel developments. Using a fast method to model the conditional dependence between class variables, we form super-class partitions and use them to build multi-dimensional learners, learning each super-class as an ordinary class and thus explicitly modelling class dependencies. Additionally, we present a mechanism to deal with the many class values inherent to super-classes, and thus make learning efficient. To investigate the effectiveness of this approach we carry out an empirical evaluation on a range of multi-dimensional datasets, under different evaluation metrics, and in comparison with high-performing existing multi-dimensional approaches from the literature. Analysis of the results shows that our approach offers important performance gains over competing methods, while also exhibiting tractable running time.
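The super-class idea described above can be illustrated with a minimal sketch. It assumes the partition of class variables is already given (the abstract's dependence-modelling step is not reproduced), and the frequency-based pruning in `prune_rare_values` is only a hypothetical stand-in for the unspecified mechanism that handles the many super-class values. All function names and the toy data are illustrative.

```python
from collections import Counter

def to_super_class(rows, partition):
    """Collapse the class variables indexed by `partition` into one tuple-valued class."""
    return [tuple(row[i] for i in partition) for row in rows]

def prune_rare_values(super_labels, min_count=2):
    """Keep super-class values seen at least `min_count` times.

    Assumption: a simple frequency cut-off, not the mechanism from the paper.
    """
    counts = Counter(super_labels)
    return {value for value, n in counts.items() if n >= min_count}

# Toy data: each row holds three class variables (Y0, Y1, Y2).
Y = [
    ("a", "x", 0),
    ("a", "x", 1),
    ("b", "y", 0),
    ("a", "x", 0),
    ("b", "x", 1),
]

# Assumed partition: Y0 and Y1 are treated as dependent, Y2 stands alone.
partitions = [(0, 1), (2,)]

# One target column per super-class; each can be fed to an ordinary
# single-target classifier.
super_targets = [to_super_class(Y, p) for p in partitions]

# Values of the first super-class that are frequent enough to keep.
kept = prune_rare_values(super_targets[0], min_count=2)
```

Each entry of `super_targets` is then an ordinary single-target problem, so any standard classifier could be trained on it; joint values of Y0 and Y1 are predicted together, which is what makes the dependency explicit.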
Abstract:
In order to implement accurate models for wind power ramp forecasting, ramps first need to be characterised. This issue has typically been addressed by performing binary ramp/non-ramp classifications based on ad hoc thresholds. However, recent works question this approach. This paper presents the ramp function, an innovative wavelet-based tool which detects and characterises ramp events in wind power time series. The underlying idea is to assess a continuous index related to the ramp intensity at each time step, which is obtained by considering large power output gradients evaluated under different time scales (up to typical ramp durations). The ramp function overcomes some of the drawbacks of the aforementioned binary classification and permits forecasters to easily reveal specific features of the ramp behaviour observed at a wind farm. As an example, the daily profiles of the ramp-up and ramp-down intensities are obtained for the case of a wind farm located in Spain.
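To make the notion of a continuous, multi-scale ramp index concrete, here is a simplified sketch. It is not the wavelet-based ramp function of the paper: it merely averages backward power gradients over several time scales, and the function name, the `max_scale` parameter and the normalised power values are all illustrative assumptions.

```python
def ramp_index(power, max_scale):
    """Signed multi-scale gradient index.

    Assumption: a plain average of backward differences over scales
    1..max_scale, used here instead of the paper's wavelet transform.
    """
    n = len(power)
    index = [0.0] * n
    for t in range(n):
        total = 0.0
        used = 0
        for s in range(1, max_scale + 1):
            if t - s >= 0:
                # Average rate of change over the last `s` time steps.
                total += (power[t] - power[t - s]) / s
                used += 1
        index[t] = total / used if used else 0.0
    return index

# Synthetic normalised power output: a ramp-up followed by a ramp-down.
power = [0.10, 0.10, 0.15, 0.40, 0.70, 0.75, 0.74, 0.50, 0.20, 0.18]

idx = ramp_index(power, max_scale=3)

# Time steps with the strongest upward and downward ramp intensity.
strongest_up = max(range(len(idx)), key=lambda t: idx[t])
strongest_down = min(range(len(idx)), key=lambda t: idx[t])
```

Unlike a binary ramp/non-ramp label, the index takes a value at every time step, so positive and negative intensities could be averaged by hour of day to build daily ramp-up and ramp-down profiles.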
Abstract:
In the present paper, 1-year PM10 and PM2.5 data from roadside and urban background monitoring stations in Athens (Greece), Madrid (Spain) and London (UK) are analysed in relation to other air pollutants (NO, NO2, NOx, CO, O3 and SO2) and several meteorological parameters (wind velocity, temperature, relative humidity, precipitation, solar radiation and atmospheric pressure), in order to investigate the sources and factors affecting particulate pollution in large European cities. Principal component and regression analyses are used to quantify the contribution of both combustion and non-combustion sources to the PM10 and PM2.5 levels observed. The analysis reveals that the EU-legislated PM10 and PM2.5 limit values are frequently breached, forming a potential public health hazard in the areas studied. The seasonal variability patterns of particulates vary among cities and sites, with Athens and Madrid presenting higher PM10 concentrations during the warm period, suggesting a larger relative contribution of secondary and natural particles during hot and dry days. It is estimated that the contribution of non-combustion sources varies substantially among cities, sites and seasons and ranges between 38-67% and 40-62% in London, 26-50% and 20-62% in Athens, and 31-58% and 33-68% in Madrid, for both PM10 and PM2.5. Higher contributions from non-combustion sources are found at urban background sites in all three cities, whereas at the traffic sites the seasonal differences are smaller. In addition, the non-combustion fraction of both particle metrics is higher during the warm season at all sites. On the whole, the analysis provides evidence of the substantial impact of non-combustion sources on local air quality in all three cities.
While vehicular exhaust emissions carry a large part of the risk posed to human health by particle exposure, it is most likely that mitigation measures designed for their reduction will have a major effect only at traffic sites, and additional measures will be necessary for the control of background levels. However, efforts in mitigation strategies should always focus on optimal health effects.