853 results for heterogeneous regressions algorithms
Abstract:
Pocket Data Mining (PDM) describes the full process of analysing data streams in mobile ad hoc distributed environments. Advances in mobile devices such as smartphones and tablet computers have made it possible for a wide range of applications to run in such an environment. In this paper, we propose the adoption of data stream classification techniques for PDM. A thorough experimental study demonstrates that running heterogeneous (different) or homogeneous (similar) data stream classification techniques over vertically partitioned data (data partitioned according to the feature space) yields performance comparable to batch and centralised learning techniques.
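The vertical-partitioning idea can be sketched as follows. This is an illustration only, not the paper's implementation; the helper names are hypothetical. Feature indices are dealt out round-robin to devices, and per-device class predictions are combined by majority vote:

```python
from collections import Counter

def vertical_partition(n_features, n_nodes):
    """Assign each feature index to one mobile device, round-robin,
    so every device sees only a slice of the feature space."""
    return [list(range(i, n_features, n_nodes)) for i in range(n_nodes)]

def combine_predictions(per_device_labels):
    """Majority vote over the class labels predicted by the devices."""
    return Counter(per_device_labels).most_common(1)[0][0]
```

In a PDM deployment each device would train its own stream classifier on its feature slice, and only the predicted labels would travel over the ad hoc network.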
Abstract:
In order to assist in comparing the computational techniques used in different models, the authors propose a standardized set of one-dimensional numerical experiments that could be completed for each model. The results of these experiments, with a simplified form of the computational representation for advection, diffusion, pressure gradient term, Coriolis term, and filter used in the models, should be reported in the peer-reviewed literature. Specific recommendations are described in this paper.
Abstract:
We discuss the modeling of dielectric responses for an electromagnetically excited network of capacitors and resistors using a systems identification framework. Standard models that assume integral order dynamics are augmented to incorporate fractional order dynamics. This enables us to relate more faithfully the modeled responses to those reported in the Dielectrics literature.
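One standard way to realise fractional-order dynamics numerically, offered here as an illustration rather than the authors' identification framework, is the Grünwald-Letnikov discretisation of the alpha-order derivative:

```python
def gl_derivative(f, alpha, h):
    """Grünwald-Letnikov estimate of the alpha-order derivative at the
    last sample of f, for samples spaced h apart.
    Uses the recursive weights w_0 = 1, w_k = w_{k-1} * (k - 1 - alpha) / k,
    i.e. w_k = (-1)^k * binom(alpha, k)."""
    n = len(f) - 1
    w, weights = 1.0, []
    for k in range(n + 1):
        if k > 0:
            w *= (k - 1 - alpha) / k
        weights.append(w)
    return sum(weights[k] * f[n - k] for k in range(n + 1)) / h ** alpha
```

For alpha = 1 this collapses to the backward difference (f[n] - f[n-1]) / h, and for alpha = 2 to the second difference, so integral-order models are recovered as special cases.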
Abstract:
With the fast development of the Internet, wireless communications and semiconductor devices, home networking has received significant attention. Consumer products can collect and transmit various types of data in the home environment. Typical consumer sensors are often equipped with tiny, irreplaceable batteries, so it is of the utmost importance to design energy-efficient algorithms that prolong the home network lifetime and reduce the number of devices going to landfill. Sink mobility is an important technique for improving home network performance, including energy consumption, lifetime and end-to-end delay; it can also largely mitigate the hot spots near the sink node. Since selecting the optimal moving trajectory for the sink node(s) is an NP-hard problem, jointly optimizing routing algorithms with the mobile sink moving strategy is a significant and challenging research issue. The influence of multiple static sink nodes on energy consumption under different scale networks is first studied, and an Energy-efficient Multi-sink Clustering Algorithm (EMCA) is proposed and tested. Then, the influence of mobile sink velocity, position and number on network performance is studied and a Mobile-sink based Energy-efficient Clustering Algorithm (MECA) is proposed. Simulation results validate the performance of the two proposed algorithms, which can be deployed in a consumer home network environment.
Abstract:
The variability of results from different automated methods of detection and tracking of extratropical cyclones is assessed in order to identify uncertainties related to the choice of method. Fifteen international teams applied their own algorithms to the same dataset, the period 1989–2009 of interim European Centre for Medium-Range Weather Forecasts (ECMWF) Re-Analysis (ERA-Interim) data. This experiment is part of the community project Intercomparison of Mid Latitude Storm Diagnostics (IMILAST; see www.proclim.ch/imilast/index.html). The spread of results for cyclone frequency, intensity, life cycle, and track location is presented to illustrate the impact of using different methods. Globally, methods agree well for geographical distribution in large oceanic regions, interannual variability of cyclone numbers, geographical patterns of strong trends, and distribution shape for many life cycle characteristics. In contrast, the largest disparities exist for the total numbers of cyclones, the detection of weak cyclones, and distribution in some densely populated regions. Consistency between methods is better for strong cyclones than for shallow ones. Two case studies of relatively large, intense cyclones reveal that the identification of the most intense part of the life cycle of these events is robust between methods, but considerable differences exist during the development and the dissolution phases.
Abstract:
We present an efficient graph-based algorithm for quantifying the similarity of household-level energy use profiles, using a notion of similarity that allows for small time-shifts when comparing profiles. Experimental results on a real smart meter data set demonstrate that in cases of practical interest our technique is far faster than the existing method for computing the same similarity measure. Having a fast algorithm for measuring profile similarity improves the efficiency of tasks such as clustering of customers and cross-validation of forecasting methods using historical data. Furthermore, we apply a generalisation of our algorithm to produce substantially better household-level energy use forecasts from historical smart meter data.
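The paper's graph-based method is not reproduced here, but the similarity notion itself can be sketched naively: each reading in one profile is allowed to match the other profile within a window of plus or minus w time steps. This runs in O(n·w) per comparison, which is the kind of cost a faster algorithm would aim to beat:

```python
def shift_tolerant_distance(a, b, w=2):
    """Distance between two equal-length load profiles where each point of a
    may be matched against b within a +/- w step window; 0 means a can be
    reproduced from b using only small time-shifts."""
    n = len(a)
    total = 0.0
    for i, x in enumerate(a):
        lo, hi = max(0, i - w), min(n, i + w + 1)
        total += min(abs(x - y) for y in b[lo:hi])
    return total
```

With w = 0 this reduces to a plain pointwise L1 distance; increasing w makes the measure forgiving of, say, a dishwasher cycle starting one half-hour slot later.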
Abstract:
Historic geomagnetic activity observations have been used to reveal centennial variations in the open solar flux and the near-Earth heliospheric conditions (the interplanetary magnetic field and the solar wind speed). The various methods are in very good agreement for the past 135 years when there were sufficient reliable magnetic observatories in operation to eliminate problems due to site-specific errors and calibration drifts. This review underlines the physical principles that allow these reconstructions to be made, as well as the details of the various algorithms employed and the results obtained. Discussion is included of: the importance of the averaging timescale; the key differences between “range” and “interdiurnal variability” geomagnetic data; the need to distinguish source field sector structure from heliospherically-imposed field structure; the importance of ensuring that regressions used are statistically robust; and uncertainty analysis. The reconstructions are exceedingly useful as they provide calibration between the in-situ spacecraft measurements from the past five decades and the millennial records of heliospheric behaviour deduced from measured abundances of cosmogenic radionuclides found in terrestrial reservoirs. Continuity of open solar flux, using sunspot number to quantify the emergence rate, is the basis of a number of models that have been very successful in reproducing the variation derived from geomagnetic activity. These models allow us to extend the reconstructions back to before the development of the magnetometer and to cover the Maunder minimum. Allied to the radionuclide data, the models are revealing much about how the Sun and heliosphere behaved outside of grand solar maxima and are providing a means of predicting how solar activity is likely to evolve now that the recent grand maximum (that had prevailed throughout the space age) has come to an end.
Abstract:
Purpose – Commercial real estate is a highly specific asset: heterogeneous, indivisible and with less information transparency than most other commonly held investment assets. These attributes encourage the use of intermediaries during asset acquisition and disposal. However, there are few attempts to explain the use of different brokerage models (with differing costs) in different markets. This study aims to address this gap.
Design/methodology/approach – The study analyses 9,338 real estate transactions in London and New York City from 2001 to 2011. Data are provided by Real Capital Analytics and cover over $450 billion of investments in this period. Brokerage trends in the two cities are compared and probit regressions are used to test whether the decision to transact with broker representation varies with investor or asset characteristics.
Findings – Results indicate greater use of brokerage in London, especially by purchasers. This persists when data are disaggregated by sector, time or investor type, pointing to the role of local market culture and institutions in shaping brokerage models and transaction costs. Within each city, the nature of the investors involved seems to be a more significant influence on broker use than the characteristics of the assets being traded.
Originality/value – Brokerage costs are the single largest non-tax charge to an investor when trading commercial real estate, yet there is little research in this area. This study examines the role of brokers and provides empirical evidence on factors that influence the use and mode of brokerage in two major investment destinations.
Abstract:
Recent literature has suggested that macroeconomic forecasters may have asymmetric loss functions, and that there may be heterogeneity across forecasters in the degree to which they weigh under- and over-predictions. Using an individual-level analysis that exploits the Survey of Professional Forecasters respondents' histogram forecasts, we find little evidence of asymmetric loss for the inflation forecasters.
Abstract:
We present a Bayesian image classification scheme for discriminating cloud, clear and sea-ice observations at high latitudes to improve identification of areas of clear-sky over ice-free ocean for SST retrieval. We validate the image classification against a manually classified dataset using Advanced Along Track Scanning Radiometer (AATSR) data. A three-way classification scheme using a near-infrared textural feature improves classifier accuracy by 9.9 % over the nadir-only version of the cloud clearing used in the ATSR Reprocessing for Climate (ARC) project in high latitude regions. The three-way classification gives similar numbers of cloud and ice scenes misclassified as clear, but significantly more clear-sky cases are correctly identified (89.9 % compared with 65 % for ARC). We also demonstrate the potential of a Bayesian image classifier including information from the 0.6 micron channel to be used in sea-ice extent and ice surface temperature retrieval, with 77.7 % of ice scenes correctly identified and an overall classifier accuracy of 96 %.
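The essence of such a Bayesian scheme can be sketched with per-class Gaussian likelihoods over the image features. The class statistics and priors below are made-up placeholders, not AATSR values, and the naive independence assumption is a simplification of the textural features the paper actually uses:

```python
import math

def gauss_pdf(x, mu, var):
    """Univariate normal density."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def classify_pixel(features, classes):
    """Return the class maximising prior * product of per-feature
    Gaussian likelihoods (naive independence assumption).
    classes: {name: (prior, [(mu, var) per feature])}."""
    best, best_score = None, -1.0
    for name, (prior, stats) in classes.items():
        score = prior
        for x, (mu, var) in zip(features, stats):
            score *= gauss_pdf(x, mu, var)
        if score > best_score:
            best, best_score = name, score
    return best
```

For example, with a single brightness-temperature feature and illustrative class statistics, a warm pixel is labelled "clear" and a cold one "cloud".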
Abstract:
Climate data are used in a number of applications including climate risk management and adaptation to climate change. However, the availability of climate data, particularly throughout rural Africa, is very limited. Available weather stations are unevenly distributed and mainly located along main roads in cities and towns. This imposes severe limitations on the availability of climate information and services for the rural community where, arguably, these services are needed most. Weather station data also suffer from gaps in the time series. Satellite proxies, particularly satellite rainfall estimates, have been used as alternatives because of their availability even over remote parts of the world. However, satellite rainfall estimates also suffer from a number of critical shortcomings, including heterogeneous time series, short periods of observation, and poor accuracy, particularly at higher temporal and spatial resolutions. An attempt is made here to alleviate these problems by combining station measurements with the complete spatial coverage of satellite rainfall estimates. Rain gauge observations are merged with a locally calibrated version of the TAMSAT satellite rainfall estimates to produce over 30 years (1983 to date) of rainfall estimates over Ethiopia at a spatial resolution of 10 km and a ten-daily time scale. This involves quality control of rain gauge data, generating a locally calibrated version of the TAMSAT rainfall estimates, and combining these with rain gauge observations from the national station network. The infrared-only satellite rainfall estimates produced using a relatively simple TAMSAT algorithm performed as well as, or even better than, other satellite rainfall products that use passive microwave inputs and more sophisticated algorithms.
There is no substantial difference between the gridded-gauge and combined gauge-satellite products over the test area in Ethiopia, which has a dense station network; however, the combined product exhibits better quality over parts of the country where stations are sparsely distributed.
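A heavily simplified, one-dimensional sketch of the merging step is given below. Inverse-distance weighting stands in for the geostatistical interpolation a production scheme would use, and the function name is illustrative:

```python
def merge_gauge_satellite(sat, gauges, p=2.0):
    """Adjust a 1-D satellite rainfall field using gauge-minus-satellite
    differences, spread between gauges by inverse-distance weighting.
    sat: list of satellite estimates per grid cell.
    gauges: {cell_index: gauge_reading}."""
    merged = []
    for i, s in enumerate(sat):
        if i in gauges:                      # at a gauge cell, trust the gauge
            merged.append(gauges[i])
            continue
        num = den = 0.0
        for j, g in gauges.items():
            w = 1.0 / abs(i - j) ** p
            num += w * (g - s)
            den += w
        bias = num / den if den else 0.0     # no gauges: keep satellite value
        merged.append(s + bias)
    return merged
```

The merged field reproduces the gauges exactly where they exist and relaxes back to the (bias-adjusted) satellite estimate with distance from them, which is the behaviour the abstract describes for sparsely gauged regions.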
Abstract:
A class of identification algorithms is introduced for Gaussian process (GP) models. The fundamental approach is to propose a new kernel function which leads to a covariance matrix with low rank, a property that is then exploited for computational efficiency in both model parameter estimation and model prediction. The objectives of maximizing the marginal likelihood, or of minimizing the Kullback–Leibler (K–L) divergence between the estimated output probability density function (pdf) and the true pdf, are used as the respective cost functions. For each cost function, an efficient coordinate descent algorithm is proposed that estimates the kernel parameters using a one-dimensional derivative-free search, and the noise variance using a fast gradient descent algorithm. Numerical examples are included to demonstrate the effectiveness of the new identification approaches.
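The optimisation loop described above, coordinate descent in which each parameter is updated by a one-dimensional derivative-free search, can be sketched generically. Ternary search stands in for whatever 1-D method the authors use, and the function names are illustrative:

```python
def ternary_min(f, lo, hi, iters=80):
    """Derivative-free minimiser of a 1-D function assumed unimodal on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

def coordinate_descent(cost, x0, bounds, sweeps=5):
    """Minimise cost(x) by cycling through the coordinates of x, optimising
    each in turn with the 1-D search while the others are held fixed."""
    x = list(x0)
    for _ in range(sweeps):
        for i, (lo, hi) in enumerate(bounds):
            x[i] = ternary_min(lambda v: cost(x[:i] + [v] + x[i + 1:]), lo, hi)
    return x
```

In the GP setting, cost would evaluate the negative marginal likelihood (or the K–L cost) and each coordinate would be one kernel parameter; the appeal of the derivative-free inner search is that no gradient of the cost with respect to the kernel parameters is needed.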
Abstract:
Satellite data are increasingly used to provide observation-based estimates of the effects of aerosols on climate. The Aerosol-cci project, part of the European Space Agency's Climate Change Initiative (CCI), was designed to provide essential climate variables for aerosols from satellite data. Eight algorithms, developed for the retrieval of aerosol properties using data from AATSR (4), MERIS (3) and POLDER, were evaluated to determine their suitability for climate studies. The primary result from each of these algorithms is the aerosol optical depth (AOD) at several wavelengths, together with the Ångström exponent (AE) which describes the spectral variation of the AOD for a given wavelength pair. Other aerosol parameters that can be retrieved from satellite observations are not considered in this paper. The AOD and AE (AE only for Level 2) were evaluated against independent collocated observations from the ground-based AERONET sun photometer network and against "reference" satellite data provided by MODIS and MISR. Tools used for the evaluation were developed for daily products as produced by the retrieval with a spatial resolution of 10 × 10 km² (Level 2) and daily or monthly aggregates (Level 3). These tools include statistics for L2 and L3 products compared with AERONET, as well as scoring based on spatial and temporal correlations. In this paper we describe their use in a round robin (RR) evaluation of four months of data, one month for each season in 2008. The amount of data was restricted to only four months because of the large effort made to improve the algorithms, and to evaluate the improvement and current status before larger data sets are processed. Evaluation criteria are discussed. Results presented show the current status of the European aerosol algorithms in comparison to both AERONET and MODIS and MISR data.
The comparison leads to a preliminary conclusion that the scores are similar, including those for the references, but the coverage of AATSR needs to be enhanced and further improvements are possible for most algorithms. None of the algorithms, including the references, outperforms all others everywhere. AATSR data can be used for the retrieval of AOD and AE over land and ocean. PARASOL and one of the MERIS algorithms have been evaluated over ocean only and both algorithms provide good results.