50 results for Model-based Categorical Sequence Clustering
Abstract:
A neurofuzzy classifier identification algorithm is introduced for two-class problems. The initial fuzzy base construction is based on fuzzy clustering utilizing a Gaussian mixture model (GMM) and the analysis of variance (ANOVA) decomposition. The expectation maximization (EM) algorithm is applied to determine the parameters of the fuzzy membership functions. Then the neurofuzzy model is identified via the supervised subspace orthogonal least squares (OLS) algorithm. Finally, a logistic regression model is applied to produce the class probability. The effectiveness of the proposed neurofuzzy classifier has been demonstrated using a real data set.
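The EM step mentioned above can be illustrated with a minimal sketch. It fits a two-component, one-dimensional Gaussian mixture and reads the normalised component responsibilities as fuzzy membership degrees. The function names, the initialisation, and the synthetic data are assumptions made for illustration; the paper's ANOVA decomposition, OLS identification, and logistic regression stages are not reproduced here.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_em(data, n_iter=100):
    """Fit a two-component 1-D Gaussian mixture with plain EM (illustrative only)."""
    # Crude initialisation (an assumption): means at the data extremes,
    # both variances equal to the overall sample variance.
    means = [min(data), max(data)]
    overall_mean = sum(data) / len(data)
    overall_var = sum((x - overall_mean) ** 2 for x in data) / len(data)
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]

    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            dens = [weights[k] * gaussian_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances from the responsibilities.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var_k, 1e-6)  # guard against a collapsing component
    return weights, means, variances


def memberships(x, weights, means, variances):
    """Normalised membership degree of x in each mixture component."""
    dens = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(dens)
    return [d / total for d in dens]


# Synthetic data (hypothetical): two well-separated clusters.
random.seed(0)
sample = [random.gauss(-2.0, 0.5) for _ in range(200)] + [random.gauss(3.0, 0.8) for _ in range(200)]
weights, means, variances = fit_gmm_em(sample)
```

Calling `memberships(-2.0, weights, means, variances)` then returns a pair of degrees that sum to one, with most of the mass on the component centred near -2.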
Abstract:
With the fast development of the Internet, wireless communications and semiconductor devices, home networking has received significant attention. Consumer products can collect and transmit various types of data in the home environment. Typical consumer sensors are often equipped with tiny, irreplaceable batteries, and it is therefore of the utmost importance to design energy-efficient algorithms to prolong the home network lifetime and reduce the number of devices going to landfill. Sink mobility is an important technique to improve home network performance, including energy consumption, lifetime and end-to-end delay. It can also largely mitigate the hot spots near the sink node. The selection of an optimal moving trajectory for the sink node(s) is an NP-hard problem, and jointly optimizing routing algorithms with the mobile sink moving strategy is a significant and challenging research issue. The influence of multiple static sink nodes on energy consumption under different scale networks is first studied, and an Energy-efficient Multi-sink Clustering Algorithm (EMCA) is proposed and tested. Then, the influence of mobile sink velocity, position and number on network performance is studied, and a Mobile-sink based Energy-efficient Clustering Algorithm (MECA) is proposed. Simulation results validate the performance of the two proposed algorithms, which can be deployed in a consumer home network environment.
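The intuition behind using several sinks can be sketched with a simple nearest-sink assignment. This is not EMCA or MECA, whose details the abstract does not give. It is a hypothetical illustration that treats node-to-sink distance as a rough proxy for transmission energy, with made-up node and sink positions.

```python
import math
import random


def nearest_sink_clusters(nodes, sinks):
    """Group node indices by the index of their closest sink."""
    clusters = {i: [] for i in range(len(sinks))}
    for n_idx, node in enumerate(nodes):
        closest = min(range(len(sinks)), key=lambda s: math.dist(node, sinks[s]))
        clusters[closest].append(n_idx)
    return clusters


def mean_distance_to_sink(nodes, sinks):
    """Average distance from each node to its closest sink (energy proxy, an assumption)."""
    return sum(min(math.dist(node, s) for s in sinks) for node in nodes) / len(nodes)


# Hypothetical 100 m x 100 m deployment with 50 randomly placed sensor nodes.
random.seed(1)
nodes = [(random.uniform(0, 100), random.uniform(0, 100)) for _ in range(50)]
one_sink = [(50.0, 50.0)]
three_sinks = one_sink + [(25.0, 25.0), (75.0, 75.0)]

clusters = nearest_sink_clusters(nodes, three_sinks)
single = mean_distance_to_sink(nodes, one_sink)
multi = mean_distance_to_sink(nodes, three_sinks)
```

Because the three-sink set contains the original sink, `multi` can never exceed `single`. A real protocol would also need to account for routing, residual energy, and sink movement, which this sketch ignores.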
Abstract:
Aerosols affect the Earth's energy budget directly by scattering and absorbing radiation and indirectly by acting as cloud condensation nuclei and, thereby, affecting cloud properties. However, large uncertainties exist in current estimates of aerosol forcing because of incomplete knowledge concerning the distribution and the physical and chemical properties of aerosols as well as aerosol-cloud interactions. In recent years, a great deal of effort has gone into improving measurements and datasets. It is thus feasible to shift the estimates of aerosol forcing from largely model-based to increasingly measurement-based. Our goal is to assess current observational capabilities and identify uncertainties in the aerosol direct forcing through comparisons of different methods with independent sources of uncertainties. Here we assess the aerosol optical depth (τ), direct radiative effect (DRE) by natural and anthropogenic aerosols, and direct climate forcing (DCF) by anthropogenic aerosols, focusing on satellite and ground-based measurements supplemented by global chemical transport model (CTM) simulations. The multi-spectral MODIS measures global distributions of aerosol optical depth (τ) on a daily scale, with a high accuracy of ±0.03±0.05τ over ocean. The annual average τ is about 0.14 over global ocean, of which about 21%±7% is contributed by human activities, as estimated by MODIS fine-mode fraction. The multi-angle MISR derives an annual average AOD of 0.23 over global land with an uncertainty of ~20% or ±0.05. These high-accuracy aerosol products and broadband flux measurements from CERES make it feasible to obtain observational constraints for the aerosol direct effect, especially over the global ocean. A number of measurement-based approaches estimate the clear-sky DRE (on solar radiation) at the top-of-atmosphere (TOA) to be about -5.5±0.2 W m⁻² (median ± standard error from various methods) over the global ocean. 
Accounting for thin cirrus contamination of the satellite-derived aerosol field will reduce the TOA DRE to -5.0 W m⁻². Because of a lack of measurements of aerosol absorption and difficulty in characterizing land surface reflection, estimates of DRE over land and at the ocean surface are currently realized through a combination of satellite retrievals, surface measurements, and model simulations, and are less constrained. Over the oceans the surface DRE is estimated to be -8.8±0.7 W m⁻². Over land, an integration of satellite retrievals and model simulations derives a DRE of -4.9±0.7 W m⁻² and -11.8±1.9 W m⁻² at the TOA and surface, respectively. CTM simulations derive a wide range of DRE estimates that on average are smaller than the measurement-based DRE by about 30-40%, even after accounting for thin cirrus and cloud contamination. A number of issues remain. Current estimates of the aerosol direct effect over land are poorly constrained. Uncertainties of DRE estimates are also larger on regional scales than on a global scale and large discrepancies exist between different approaches. The characterization of aerosol absorption and vertical distribution remains challenging. The aerosol direct effect in the thermal infrared range and in cloudy conditions remains relatively unexplored and quite uncertain, because of a lack of global systematic aerosol vertical profile measurements. A coordinated research strategy needs to be developed for integration and assimilation of satellite measurements into models to constrain model simulations. Enhanced measurement capabilities in the next few years and high-level scientific cooperation will further advance our knowledge.
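The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated above. Reading "21%±7%" as an absolute range of 14% to 28% is an assumption, and the variable names are illustrative.

```python
# Values taken from the abstract above.
ocean_tau = 0.14             # annual average optical depth over global ocean
anthro_fraction = 0.21       # share attributed to human activities
anthro_fraction_err = 0.07   # assumed to be absolute percentage points

# Anthropogenic optical depth over ocean and its implied range.
anthro_tau = ocean_tau * anthro_fraction
anthro_tau_low = ocean_tau * (anthro_fraction - anthro_fraction_err)
anthro_tau_high = ocean_tau * (anthro_fraction + anthro_fraction_err)

# Relative reduction in TOA DRE magnitude after the thin-cirrus correction.
toa_dre = -5.5                   # clear-sky TOA DRE over ocean
toa_dre_cirrus_corrected = -5.0  # after accounting for thin cirrus
relative_reduction = (abs(toa_dre) - abs(toa_dre_cirrus_corrected)) / abs(toa_dre)

# The ~20% relative uncertainty on the land AOD expressed as an absolute value.
land_tau = 0.23
land_tau_abs_uncertainty = 0.20 * land_tau

print(f"anthropogenic tau over ocean: {anthro_tau:.3f} "
      f"({anthro_tau_low:.3f} to {anthro_tau_high:.3f})")
print(f"cirrus correction reduces |DRE| by {relative_reduction:.1%}")
print(f"land AOD absolute uncertainty: {land_tau_abs_uncertainty:.3f}")
```

Under these readings the anthropogenic optical depth over ocean is roughly 0.03, the cirrus correction lowers the DRE magnitude by about 9%, and 20% of 0.23 is about 0.05, which matches the absolute uncertainty quoted for MISR.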
Abstract:
Classical regression methods take vectors as covariates and estimate the corresponding vectors of regression parameters. When addressing regression problems on covariates of more complex form such as multi-dimensional arrays (i.e. tensors), traditional computational models can be severely compromised by ultrahigh dimensionality as well as complex structure. By exploiting the special structure of tensor covariates, the tensor regression model provides a promising solution to reduce the model’s dimensionality to a manageable level, thus leading to efficient estimation. Most of the existing tensor-based methods independently estimate each individual regression problem based on tensor decomposition which allows the simultaneous projections of an input tensor to more than one direction along each mode. As a matter of fact, multi-dimensional data are collected under the same or very similar conditions, so that data share some common latent components but can also have their own independent parameters for each regression task. Therefore, it is beneficial to analyse regression parameters among all the regressions in a linked way. In this paper, we propose a tensor regression model based on Tucker Decomposition, which identifies not only the common components of parameters across all the regression tasks, but also independent factors contributing to each particular regression task simultaneously. Under this paradigm, the number of independent parameters along each mode is constrained by a sparsity-preserving regulariser. Linked multiway parameter analysis and sparsity modeling further reduce the total number of parameters, with lower memory cost than their tensor-based counterparts. The effectiveness of the new method is demonstrated on real data sets.
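The dimensionality reduction described above can be made concrete by counting parameters. The sketch below compares an unstructured coefficient tensor with a standard Tucker-factorised one (a core tensor plus one factor matrix per mode). The dimensions and ranks are hypothetical, and the paper's linked, multi-task parameter sharing and sparsity regulariser are not modelled here.

```python
from math import prod


def full_param_count(dims):
    """Number of free coefficients in an unstructured coefficient tensor."""
    return prod(dims)


def tucker_param_count(dims, ranks):
    """Coefficients in a Tucker form: core of size prod(ranks) plus a dims[k] x ranks[k] factor per mode."""
    if len(dims) != len(ranks):
        raise ValueError("dims and ranks must have the same number of modes")
    core = prod(ranks)
    factors = sum(d * r for d, r in zip(dims, ranks))
    return core + factors


# Hypothetical example: a 64 x 64 x 64 covariate tensor with Tucker ranks (3, 3, 3).
dims = (64, 64, 64)
ranks = (3, 3, 3)
full = full_param_count(dims)             # 262144
reduced = tucker_param_count(dims, ranks)  # 27 + 3 * 192 = 603
```

The count ignores the rotational non-uniqueness of the Tucker form, so it is an upper bound on the number of truly independent parameters.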
Abstract:
Background: The amount and structure of genetic diversity in dessert apple germplasm conserved at a European level is mostly unknown, since all diversity studies conducted in Europe until now have been performed on regional or national collections. Here, we applied a common set of 16 SSR markers to genotype more than 2,400 accessions across 14 collections representing three broad European geographic regions (North+East, West and South) with the aim of analyzing the extent, distribution and structure of variation in the apple genetic resources in Europe. Results: A Bayesian model-based clustering approach showed that diversity was organized in three groups, although these were only moderately differentiated (FST=0.031). A nested Bayesian clustering approach allowed identification of subgroups which revealed internal patterns of substructure within the groups, allowing a finer delineation of the variation into eight subgroups (FST=0.044). The first level of stratification revealed an asymmetric division of the germplasm among the three groups, and a clear association was found with the geographical regions of origin of the cultivars. The substructure revealed clear partitioning of genetic groups among countries, but also interesting associations between subgroups and breeding purposes of recent cultivars or particular usage such as cider production. Additional parentage analyses allowed us to identify both putative parents of more than 40 old and/or local cultivars, giving interesting insights into the pedigree of some emblematic cultivars. Conclusions: The variation found at group and sub-group levels may reflect a combination of historical processes of migration/selection and adaptive factors to diverse agricultural environments that, together with genetic drift, have resulted in extensive genetic variation but limited population structure. 
The European dessert apple germplasm represents an important source of genetic diversity with a strong historical and patrimonial value. The present work thus constitutes a decisive step in the field of conservation genetics. Moreover, the obtained data can be used for defining a European apple core collection useful for further identification of genomic regions associated with commercially important horticultural traits in apple through genome-wide association studies.
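To give a sense of scale for the FST values reported above, the following sketch computes FST as (HT - HS) / HT for a single biallelic locus across equally sized subpopulations. This is a textbook simplification and an assumption on our part: SSR markers are multi-allelic, and the estimator actually used in the study is not stated in the abstract.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1 - p) at a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_biallelic(subpop_freqs):
    """FST = (HT - HS) / HT for equally weighted subpopulations (illustrative only)."""
    mean_p = sum(subpop_freqs) / len(subpop_freqs)
    h_t = expected_heterozygosity(mean_p)  # heterozygosity of the pooled population
    if h_t == 0.0:
        return 0.0  # locus is fixed everywhere, so there is no variation to partition
    # Mean within-subpopulation heterozygosity.
    h_s = sum(expected_heterozygosity(p) for p in subpop_freqs) / len(subpop_freqs)
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies in two groups.
weak_structure = fst_biallelic([0.4, 0.6])   # about 0.04
no_structure = fst_biallelic([0.5, 0.5])     # exactly 0.0
```

In this toy case, allele frequencies of 0.4 and 0.6 already give an FST near 0.04, the same order as the values reported for the groups and subgroups, which illustrates how modest the differentiation is.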