756 results for Grid-based clustering approach


Relevance: 100.00%

Abstract:

Clustering methods are increasingly being applied to residential smart meter data, providing a number of important opportunities for distribution network operators (DNOs) to manage and plan the low voltage networks. Clustering has a number of potential advantages for DNOs, including identifying suitable candidates for demand response and improving energy profile modelling. However, due to the high stochasticity and irregularity of household-level demand, detailed analytics are required to define appropriate attributes to cluster on. In this paper we present an in-depth analysis of customer smart meter data to better understand peak demand and the major sources of variability in customer behaviour. We find four key time periods in which the data should be analysed and use these periods to form relevant attributes for our clustering. We present a finite mixture model based clustering in which we discover 10 distinct behaviour groups describing customers in terms of their demand and its variability. Finally, using an existing bootstrapping technique, we show that the clustering is reliable. To the authors' knowledge, this is the first time in the power systems literature that the sample robustness of the clustering has been tested.
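
The workflow this abstract describes can be illustrated with a minimal sketch: scikit-learn's GaussianMixture as the finite mixture model and a crude pairwise-agreement bootstrap as the stability check. The attribute layout, the component count and the agreement score are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Illustrative attributes: demand level and variability in four key
# time periods (8 features per customer) -- an assumed layout.
X = rng.gamma(shape=2.0, scale=1.0, size=(500, 8))

# Finite mixture model clustering into 10 behaviour groups.
gmm = GaussianMixture(n_components=10, covariance_type="diag", random_state=0)
labels = gmm.fit_predict(X)

# Crude bootstrap check of cluster stability: refit on resampled
# customers and compare assignments on the original data.
agreements = []
for _ in range(20):
    idx = rng.integers(0, len(X), size=len(X))
    boot = GaussianMixture(n_components=10, covariance_type="diag",
                           random_state=0).fit(X[idx])
    # Fraction of customer pairs kept together/apart by both fits
    # (a simple Rand-index-style agreement score).
    a, b = labels, boot.predict(X)
    same_a = a[:, None] == a[None, :]
    same_b = b[:, None] == b[None, :]
    agreements.append((same_a == same_b).mean())

print("mean pairwise agreement over bootstraps:", np.mean(agreements))
```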

Relevance: 100.00%

Abstract:

The development of strategies for structural health monitoring (SHM) has become increasingly important because of the necessity of preventing undesirable damage. This paper describes an approach to this problem using vibration data. It involves a three-stage process: reduction of the time-series data using principal component analysis (PCA), the development of a data-based auto-regressive moving average (ARMA) model from data measured on an undamaged structure, and classification of whether or not the structure is damaged using a fuzzy clustering approach. The approach is applied to data from a benchmark structure from Los Alamos National Laboratory, USA. Two fuzzy clustering algorithms are compared: the fuzzy c-means (FCM) and Gustafson-Kessel (GK) algorithms. It is shown that while both fuzzy clustering algorithms are effective, the GK algorithm marginally outperforms the FCM algorithm.
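
The clustering stage can be sketched with a from-scratch fuzzy c-means on PCA-reduced features; the GK variant would additionally adapt a covariance-based distance per cluster. The synthetic data and parameter choices below are illustrative assumptions, not the paper's code.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)

# Placeholder vibration features: rows = test cases, columns = samples.
signals = rng.normal(size=(60, 256))
features = PCA(n_components=5).fit_transform(signals)  # dimension reduction

def fuzzy_c_means(X, c=2, m=2.0, n_iter=100, eps=1e-6):
    """Plain fuzzy c-means; GK adds a cluster-specific metric."""
    p = 2.0 / (m - 1.0)
    U = rng.dirichlet(np.ones(c), size=len(X))     # fuzzy memberships
    for _ in range(n_iter):
        W = U ** m
        centers = W.T @ X / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + eps
        U_new = 1.0 / (d ** p * (d ** -p).sum(axis=1, keepdims=True))
        if np.abs(U_new - U).max() < eps:
            break
        U = U_new
    return centers, U

centers, U = fuzzy_c_means(features, c=2)  # damaged vs undamaged
print("hard labels:", U.argmax(axis=1)[:10])
```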

Relevance: 100.00%

Abstract:

Background: The genome-wide identification of both morbid genes, i.e., those genes whose mutations cause hereditary human diseases, and druggable genes, i.e., genes coding for proteins whose modulation by small molecules elicits phenotypic effects, requires experimental approaches that are time-consuming and laborious. Thus, a computational approach that could accurately predict such genes on a genome-wide scale would be invaluable for accelerating the pace of discovery of causal relationships between genes and diseases, as well as the determination of the druggability of gene products. Results: In this paper we propose a machine learning-based computational approach to predict morbid and druggable genes on a genome-wide scale. For this purpose, we constructed a decision tree-based meta-classifier and trained it on datasets containing, for each morbid and druggable gene, network topological features, tissue expression profile and subcellular localization data as learning attributes. This meta-classifier correctly recovered 65% of known morbid genes with a precision of 66%, and correctly recovered 78% of known druggable genes with a precision of 75%. It was then used to assign morbidity and druggability scores to genes not known to be morbid and druggable, and we showed a good match between these scores and literature data. Finally, we generated decision trees by training the J48 algorithm on the morbidity and druggability datasets to discover cellular rules for morbidity and druggability; among the rules, we found that the number of regulating transcription factors and plasma membrane localization are the most important factors for morbidity and druggability, respectively. Conclusions: We were able to demonstrate that network topological features, along with tissue expression profile and subcellular localization, can reliably predict human morbid and druggable genes on a genome-wide scale. Moreover, by constructing decision trees based on these data, we could discover cellular rules governing morbidity and druggability.
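
A hedged sketch of the tree-based step, with scikit-learn's CART-style DecisionTreeClassifier standing in for the J48 (C4.5) learner; the attributes and labels are synthetic placeholders, not the paper's datasets.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import precision_score, recall_score

rng = np.random.default_rng(2)

# Placeholder learning attributes per gene: network topology, tissue
# expression breadth, and a plasma-membrane flag (assumed encoding).
X = np.column_stack([
    rng.poisson(5, 1000),        # number of regulating transcription factors
    rng.random(1000),            # betweenness centrality
    rng.integers(1, 40, 1000),   # tissues with detectable expression
    rng.integers(0, 2, 1000),    # plasma membrane localization (0/1)
])
y = rng.integers(0, 2, 1000)     # 1 = known druggable (placeholder labels)

tree = DecisionTreeClassifier(max_depth=4, random_state=0)
pred = cross_val_predict(tree, X, y, cv=5)
print("recall:", recall_score(y, pred), "precision:", precision_score(y, pred))

# Inspect the learned rules, analogous to the paper's cellular rules.
tree.fit(X, y)
print(export_text(tree, feature_names=["n_TFs", "betweenness",
                                       "n_tissues", "plasma_membrane"]))
```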

Relevance: 100.00%

Abstract:

In this paper we deal with the problem of boosting the Optimum-Path Forest (OPF) clustering approach using evolutionary optimization techniques. As the OPF classifier performs an exhaustive search to find the size of each sample's neighborhood that reaches the minimum graph cut (used as a quality measure), we compared several optimization techniques that can obtain graph-cut values close to the ones obtained by brute force. Experiments on two public datasets in the context of unsupervised network intrusion detection showed that the evolutionary optimization techniques can find suitable values for the neighborhood size faster than the exhaustive search. Additionally, we showed that it is not necessary to employ many agents for this task, since the neighborhood size is defined over discrete values, which constrains the set of possible solutions to a few candidates.
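
The core idea, searching discrete neighborhood sizes for a minimum cut value instead of trying them all, can be sketched with a toy objective. The cut measure below is a simple k-NN edge-weight proxy, not the actual OPF graph cut, and the evolutionary scheme is a deliberately minimal assumption.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.neighbors import kneighbors_graph

X, _ = make_blobs(n_samples=300, centers=3, random_state=3)

def graph_cut(k):
    """Toy stand-in for the OPF graph-cut quality measure: average
    weight of k-NN edges, to be minimized over the discrete k."""
    A = kneighbors_graph(X, n_neighbors=int(k), mode="distance")
    return A.sum() / (k * len(X))

k_max = 30

# Exhaustive (brute force) search over the discrete neighborhood sizes.
brute = min(range(1, k_max + 1), key=graph_cut)

# A minimal evolutionary search: a small population of candidate k
# values, mutated by small steps, keeping the best half each generation.
rng = np.random.default_rng(3)
pop = rng.integers(1, k_max + 1, size=6)
for _ in range(10):
    scored = sorted(pop, key=graph_cut)[:3]                  # selection
    children = [np.clip(k + rng.integers(-2, 3), 1, k_max) for k in scored]
    pop = np.array(list(scored) + children)                  # survivors + mutants

print("brute force k:", brute, "evolutionary k:", min(pop, key=graph_cut))
```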

Relevance: 100.00%

Abstract:

We consider a fully model-based approach for the analysis of distance sampling data. Distance sampling has been widely used to estimate the abundance (or density) of animals or plants in a spatially explicit study area, but there is no readily available method for making statistical inference on the relationships between abundance and environmental covariates. Spatial Poisson process likelihoods can be used to estimate detection and intensity parameters simultaneously by modeling distance sampling data as a thinned spatial point process. A model-based spatial approach to distance sampling data has three main benefits: it allows complex and opportunistic transect designs to be employed, it allows estimation of abundance in small subregions, and it provides a framework to assess the effects of habitat or experimental manipulation on density. We demonstrate the model-based methodology with a small simulation study and an analysis of the Dubbo weed data set, and we also propose a simple ad hoc method for handling overdispersion. The simulation study showed that the model-based approach compared favorably to conventional distance sampling methods for abundance estimation, and the overdispersion correction performed adequately when the number of transects was high. Analysis of the Dubbo data set indicated a transect effect on abundance via Akaike's information criterion model selection. Further goodness-of-fit analysis, however, indicated some potential confounding of intensity with the detection function.
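
The detection ingredient of such a model is commonly a half-normal function. Below is a minimal sketch of fitting it by maximum likelihood on simulated perpendicular distances; this is the classical building block only, not the paper's full thinned-point-process model, and all numbers are assumptions.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

rng = np.random.default_rng(4)

# Simulate perpendicular distances: animals uniform on [0, w], detected
# with half-normal probability g(x) = exp(-x^2 / (2 sigma^2)).
w, sigma_true = 50.0, 12.0
x = rng.uniform(0, w, size=5000)
detected = x[rng.random(5000) < np.exp(-x**2 / (2 * sigma_true**2))]

def neg_log_lik(sigma):
    # Density of observed distances: f(x) = g(x) / integral_0^w g(u) du,
    # where the integral equals sigma*sqrt(2*pi)*(Phi(w/sigma) - 0.5).
    g = np.exp(-detected**2 / (2 * sigma**2))
    mu = sigma * np.sqrt(2 * np.pi) * (norm.cdf(w / sigma) - 0.5)
    return -np.sum(np.log(g / mu))

res = minimize_scalar(neg_log_lik, bounds=(1.0, w), method="bounded")
sigma_hat = res.x
mu_hat = sigma_hat * np.sqrt(2 * np.pi) * (norm.cdf(w / sigma_hat) - 0.5)
# mu_hat is the effective strip half-width; average detection probability
# is mu_hat / w, giving the Horvitz-Thompson-style abundance estimate.
print(f"sigma_hat = {sigma_hat:.2f} (true {sigma_true})")
print(f"N_hat = {len(detected) * w / mu_hat:.0f} (true 5000)")
```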

Relevance: 100.00%

Abstract:

BACKGROUND The copy number variation (CNV) in beta-defensin genes (DEFB) on human chromosome 8p23 has been proposed to contribute to the phenotypic differences in inflammatory diseases. However, determination of the exact DEFB copy number (CN) is a major challenge in association studies. Quantitative real-time PCR (qPCR), paralog ratio tests (PRT) and multiplex ligation-dependent probe amplification (MLPA) have been extensively used to determine DEFB CN in different laboratories, but inter-method inconsistencies were observed frequently. In this study we asked which of the three methods is superior for DEFB CN determination. RESULTS We developed a clustering approach for MLPA and PRT to statistically correlate data from a single experiment. We then compared qPCR, a newly designed PRT and MLPA for DEFB CN determination in 285 DNA samples. We found that MLPA had the best convergence and clustering of the raw data and the highest call rate. In addition, the concordance rates between MLPA or PRT and qPCR (32.12% and 37.99%, respectively) were unacceptably low, with qPCR underestimating the CN. The concordance rate between MLPA and PRT (90.52%) was high, but PRT systematically underestimated the CN by one in a subset of samples. In these samples, a sequence variant that caused complete PCR dropout of the respective DEFB cluster copies was found in one primer binding site of one of the targeted paralogous pseudogenes. CONCLUSION MLPA is superior to PRT, and even more so to qPCR, for DEFB CN determination. Although the applied PRT provides reliable results in most cases, such a test is particularly sensitive to low-frequency sequence variants, which preferentially accumulate in loci like pseudogenes that are most likely not under selective pressure. In the light of the superior performance of multiplex assays, the drawbacks of single PRTs could be overcome by combining more test markers.
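
The clustering idea, measured ratios concentrating around integer copy numbers, can be sketched with a one-dimensional Gaussian mixture. The simulated ratios, noise level and CN range below are illustrative assumptions, not the study's data or exact method.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(5)

# Simulated normalized measurement ratios for 285 samples with true DEFB
# copy numbers 2..7 (ratio = CN / reference CN of 4, plus assay noise).
true_cn = rng.choice([2, 3, 4, 5, 6, 7], p=[.05, .25, .4, .2, .07, .03],
                     size=285)
ratios = true_cn / 4 + rng.normal(0, 0.04, size=285)

# One mixture component per candidate integer CN; means are initialized
# at the expected ratios so, in this well-separated setting, components
# stay aligned with copy numbers after EM.
cands = np.array([2, 3, 4, 5, 6, 7])
gm = GaussianMixture(n_components=6, means_init=(cands / 4)[:, None],
                     random_state=0).fit(ratios[:, None])
called_cn = cands[gm.predict(ratios[:, None])]
print("call concordance with truth:", (called_cn == true_cn).mean())

# Low posterior probabilities flag ambiguous calls, analogous to the
# convergence/clustering quality criteria discussed in the abstract.
post = gm.predict_proba(ratios[:, None]).max(axis=1)
print("fraction of confident calls (>0.95):", (post > 0.95).mean())
```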

Relevance: 100.00%

Abstract:

Ecological regions are increasingly used as a spatial unit for planning and environmental management. It is important to define these regions in a scientifically defensible way to justify any decisions made on the basis that they are representative of broad environmental assets. The paper describes a methodology and tool to identify cohesive bioregions. The methodology applies an elicitation process to obtain geographical descriptions for bioregions, each of which is transformed into a Normal density estimate on the environmental variables within that region. This prior information is balanced with a data classification of environmental datasets using a Bayesian statistical modelling approach to objectively map ecological regions. The method is called model-based clustering, as it fits a Normal mixture model to the clusters associated with regions, and it addresses the uncertainty in environmental datasets that arises from overlapping clusters.
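
A rough sketch of the idea: elicited region descriptions supply starting means for a Normal mixture over environmental variables, and posterior membership probabilities carry the cluster-overlap uncertainty. scikit-learn's GaussianMixture only uses the elicited means to initialize EM, so this is a stand-in for the fully Bayesian balancing described above; all values are assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(6)

# Placeholder environmental variables (e.g. temperature, rainfall) at
# sites belonging to three loosely separated regions.
X = np.vstack([rng.normal([18, 400], [2, 60], (200, 2)),
               rng.normal([24, 900], [2, 80], (200, 2)),
               rng.normal([28, 300], [2, 50], (200, 2))])

# Elicited geographical descriptions, translated into Normal means for
# each bioregion (illustrative numbers).
elicited_means = np.array([[18.0, 420.0], [25.0, 880.0], [27.0, 320.0]])

gm = GaussianMixture(n_components=3, means_init=elicited_means,
                     random_state=0).fit(X)
regions = gm.predict(X)

# Membership probabilities express the uncertainty caused by
# overlapping clusters, as discussed in the abstract.
print(gm.predict_proba(X[:3]).round(2))
```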

Relevance: 100.00%

Abstract:

Grid computing is an emerging technology that provides high-performance computing capability and collaboration mechanisms for solving complex collaborative problems using existing resources. In this paper, a grid computing based framework is proposed for probabilistic power system reliability and security analysis. The suggested name of this computing grid is the Reliability and Security Grid (RSA-Grid). The architecture of this grid is then presented. A prototype system has been built for further development of grid-based services for power system reliability and security assessment based on probabilistic techniques, which require high-performance computing and large amounts of memory. Preliminary results based on the prototype of this grid show that RSA-Grid can provide comprehensive assessment results for real power systems efficiently and economically.
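
Why probabilistic reliability assessment maps well onto a computing grid can be sketched with independent Monte Carlo batches. Here Python's multiprocessing on one machine stands in for grid nodes, and the generation system is a deliberately tiny toy of my own devising, not the RSA-Grid services.

```python
import numpy as np
from multiprocessing import Pool

# Toy system: 12 generating units of 100 MW with forced outage rate 0.05,
# serving a 1000 MW load. LOLP = P(available capacity < load).
N_UNITS, CAP, FOR_RATE, LOAD = 12, 100.0, 0.05, 1000.0

def mc_batch(args):
    """One independent Monte Carlo batch; on a computing grid each
    batch would be dispatched to a different node."""
    seed, n = args
    rng = np.random.default_rng(seed)
    up = rng.random((n, N_UNITS)) > FOR_RATE       # unit availability draws
    return (up.sum(axis=1) * CAP < LOAD).mean()    # loss-of-load frequency

if __name__ == "__main__":
    batches = [(seed, 200_000) for seed in range(8)]
    with Pool(4) as pool:                          # 4 local "grid workers"
        estimates = pool.map(mc_batch, batches)
    print("LOLP estimate:", np.mean(estimates))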

Relevance: 100.00%

Abstract:

The purpose of this paper is to delineate a green supply chain (GSC) performance measurement framework using an intra-organisational collaborative decision-making (CDM) approach. A fuzzy analytic network process (ANP)-based green balanced scorecard (GrBSc) has been used within the CDM approach to assist in arriving at a consistent, accurate and timely data flow across all cross-functional areas of a business. A green causal relationship is established and linked to the fuzzy ANP approach. The causal relationship involves organisational commitment, eco-design, GSC process, social performance and sustainable performance constructs. Sub-constructs and sub-sub-constructs are also identified and linked to the causal relationship to form a network. The fuzzy ANP approach suitably handles the vagueness of the linguistic information in the CDM approach. The CDM approach is implemented in a UK-based carpet-manufacturing firm. The performance measurement approach, in addition to traditional financial performance and accounting measures, aids firms' decision-making with regard to overall organisational goals. The implemented approach assists the firm in identifying further requirements for collaborative data across the supply chain and for information about customers and markets. Overall, the CDM-based GrBSc approach assists managers in deciding whether suppliers' performances meet industry and environmental standards with effective human resources.
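
The fuzzy ANP machinery can be sketched in miniature: triangular fuzzy pairwise judgements are defuzzified, a priority vector is taken from the principal eigenvector, and a supermatrix is raised to powers to obtain limit priorities. The judgement values and the three-construct network are invented for illustration, not the paper's elicited data.

```python
import numpy as np

# Triangular fuzzy pairwise comparisons (l, m, u) among three constructs
# (organisational commitment, eco-design, GSC process) -- assumed values.
fuzzy = np.array([
    [[1, 1, 1],       [2, 3, 4],       [1, 2, 3]],
    [[1/4, 1/3, 1/2], [1, 1, 1],       [1, 2, 3]],
    [[1/3, 1/2, 1],   [1/3, 1/2, 1],   [1, 1, 1]],
])

# Centroid defuzzification of each triangular number, then the
# principal-eigenvector priority weights (standard AHP/ANP step).
crisp = fuzzy.mean(axis=2)
vals, vecs = np.linalg.eig(crisp)
w = np.abs(vecs[:, vals.real.argmax()].real)
w = w / w.sum()
print("local priority weights:", w.round(3))

# In a full ANP, local weights like these fill the columns of a
# supermatrix, which is raised to powers until it converges to the
# limit (global) priorities. Toy column-stochastic supermatrix:
super_m = np.column_stack([w, w, w])
limit = np.linalg.matrix_power(super_m, 50)
print("limit priorities:", limit[:, 0].round(3))
```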

Relevance: 100.00%

Abstract:

Most research on emotion detection in written text has focused on detecting explicit expressions of emotion. In this paper, we present a rule-based pipeline approach, based on the OCC model, for detecting implicit emotions in written text that contains no emotion-bearing words. We have evaluated our approach on three different datasets with five emotion categories. Our results show that the proposed approach consistently outperforms the lexicon-matching method across all three datasets by a large margin of 17–30% in F-measure and gives competitive performance compared to a supervised classifier. In particular, when dealing with formal text that follows grammatical rules strictly, our approach gives an average F-measure of 82.7% on "Happy", "Angry-Disgust" and "Sad", even outperforming the supervised baseline by nearly 17% in F-measure. Our preliminary results show the feasibility of the approach for the task of implicit emotion detection in written text.
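
A toy sketch of a rule-based pipeline in this spirit: emotions are inferred from OCC-style appraisals of events (desirability of an outcome, blameworthiness of an agent) rather than from emotion-bearing words. The rules and word lists are invented for illustration and are far simpler than the paper's pipeline.

```python
import re

# Appraisal lexicons for events -- illustrative assumptions only.
DESIRABLE = {"passed", "won", "promoted", "recovered"}
UNDESIRABLE = {"failed", "lost", "fired", "crashed"}
BLAMEWORTHY_AGENT = re.compile(r"\b(he|she|they)\b.*\b(broke|ruined|deleted)\b")

def detect_emotion(sentence: str) -> str:
    text = sentence.lower()
    tokens = set(re.findall(r"[a-z']+", text))
    if BLAMEWORTHY_AGENT.search(text):
        return "Angry-Disgust"   # another agent's action appraised as blameworthy
    if tokens & DESIRABLE:
        return "Happy"           # desirable event for the subject
    if tokens & UNDESIRABLE:
        return "Sad"             # undesirable event for the subject
    return "None"

for s in ["I passed my driving test this morning.",
          "They deleted all my holiday photos.",
          "My team lost the final."]:
    print(s, "->", detect_emotion(s))
```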

Relevance: 100.00%

Abstract:

Conceptual database design is an unusually difficult and error-prone task for novice designers. This study examined how two training approaches, rule-based and pattern-based, might improve performance on database design tasks. A rule-based approach prescribes a sequence of rules for modeling conceptual constructs and the actions to be taken at various stages while developing a conceptual model. A pattern-based approach presents data modeling structures that occur frequently in practice and prescribes guidelines on how to recognize and use these structures. This study describes the conceptual framework, experimental design, and results of a laboratory experiment that employed novice designers to compare the effectiveness of the two training approaches (between-subjects) at three levels of task complexity (within-subjects). Results indicate an interaction effect between treatment and task complexity. The rule-based approach was significantly better in the low-complexity and high-complexity cases; there was no statistical difference in the medium-complexity case. Designer performance fell significantly as complexity increased. Overall, though the rule-based approach was not significantly superior to the pattern-based approach in all instances, it outperformed the pattern-based approach at two of the three complexity levels. The primary contributions of the study are (1) the operationalization of the complexity construct to a degree not addressed in previous studies; (2) the development of a pattern-based instructional approach to database design; and (3) the finding that the effectiveness of a particular training approach may depend on the complexity of the task.

Relevance: 100.00%

Abstract:

Numerous works have been conducted on modelling basic compliant elements such as wire beams, and closed-form analytical models of most basic compliant elements are well developed. However, modelling complex compliant mechanisms remains challenging. This paper proposes a constraint-force-based (CFB) modelling approach for compliant mechanisms, with a particular emphasis on complex compliant mechanisms. The proposed CFB modelling approach can be regarded as an improved free-body-diagram (FBD) based modelling approach, and can be extended to a development of the screw-theory-based design approach. A compliant mechanism can be decomposed into rigid stages and compliant modules. A compliant module offers elastic forces due to its deformation; such elastic forces are regarded as variable constraint forces in the CFB modelling approach. Additionally, the CFB modelling approach defines external forces applied to a compliant mechanism as constant constraint forces. If a compliant mechanism is in static equilibrium, all of its rigid stages are also in static equilibrium under the influence of the variable and constant constraint forces. Therefore, the constraint-force equilibrium equations for all the rigid stages can be obtained, and the analytical model of the compliant mechanism can be derived from these equations. The CFB modelling approach can model a compliant mechanism linearly or nonlinearly, can obtain the displacements of any points on the rigid stages, and allows external forces to be exerted at any positions on the rigid stages. Compared with the FBD based modelling approach, the CFB modelling approach does not need to identify the possible deformed configuration of a complex compliant mechanism in order to obtain the geometric compatibility conditions and the force equilibrium equations. Additionally, the mathematical expressions in the CFB approach have an easily understood physical meaning. Using the CFB modelling approach, the variable constraint forces of three compliant modules, a wire beam, a four-beam compliant module and an eight-beam compliant module, are derived in this paper. Based on these variable constraint forces, the linear and nonlinear models of a decoupled XYZ compliant parallel mechanism are derived and verified by FEA simulations and experimental tests.
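
The linear CFB step can be sketched for a single rigid stage: each compliant module is idealized as a linear spring whose elastic reaction is a variable constraint force, the applied load is the constant constraint force, and stage equilibrium yields a linear system. The stiffnesses and geometry are illustrative assumptions, not the paper's derived modules.

```python
import numpy as np

# A planar rigid stage supported by three compliant modules, each
# idealized as a linear spring along its own axis.
k = np.array([2000.0, 1500.0, 1800.0])          # module stiffnesses (N/m)
dirs = np.array([[1.0, 0.0],                    # unit axis of each module
                 [0.0, 1.0],
                 [np.sqrt(0.5), np.sqrt(0.5)]])

# Stage stiffness K = sum_i k_i * n_i n_i^T. Equilibrium under the
# constant constraint force F reads K x = F: the variable constraint
# forces of the modules balance the applied load.
K = sum(ki * np.outer(n, n) for ki, n in zip(k, dirs))
F = np.array([10.0, -5.0])                      # external load (N)
x = np.linalg.solve(K, F)                       # stage displacement (m)

# Recover each module's variable constraint force from its deformation.
forces = [-ki * (n @ x) * n for ki, n in zip(k, dirs)]
print("displacement:", x)
print("equilibrium residual (should be ~0):", F + sum(forces))
```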

Relevance: 100.00%

Abstract:

Inter-subject parcellation of functional Magnetic Resonance Imaging (fMRI) data based on a standard General Linear Model (GLM) and spectral clustering was recently proposed as a means to alleviate the issues associated with spatial normalization in fMRI. However, for all its appeal, a GLM-based parcellation approach introduces its own biases, in the form of a priori knowledge about the shape of the Hemodynamic Response Function (HRF) and task-related signal changes, or about the subject's behaviour during the task. In this paper, we introduce a data-driven version of the spectral clustering parcellation, based on Independent Component Analysis (ICA) and Partial Least Squares (PLS) instead of the GLM. First, a number of independent components are automatically selected. Seed voxels are then obtained from the associated ICA maps, and we compute the PLS latent variables between the fMRI signal of the seed voxels (which covers regional variations of the HRF) and the principal components of the signal across all voxels. Finally, we parcellate all subjects' data with a spectral clustering of the PLS latent variables. We present results of applying the proposed method to both single-subject and multi-subject fMRI datasets. Preliminary experimental results, evaluated with the intra-parcel variance of GLM t-values and PLS-derived t-values, indicate that this data-driven approach improves parcellation accuracy over GLM-based techniques.
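
A compact sketch of the ICA-PLS-spectral-clustering pipeline on synthetic data, using scikit-learn components; the seed selection and feature construction are simplified assumptions rather than the paper's exact procedure.

```python
import numpy as np
from sklearn.decomposition import FastICA, PCA
from sklearn.cross_decomposition import PLSRegression
from sklearn.cluster import SpectralClustering

rng = np.random.default_rng(7)

# Placeholder fMRI data: time points x voxels.
n_time, n_vox = 120, 500
data = rng.normal(size=(n_time, n_vox))

# 1) ICA on the time courses; pick components (here simply the first
#    two) and take each map's strongest-loading voxel as a seed.
ica = FastICA(n_components=5, random_state=0, max_iter=500)
ica.fit(data)
seeds = np.abs(ica.components_[:2]).argmax(axis=1)

# 2) PLS between the principal components of the whole-brain signal and
#    the seed time courses (which carry regional HRF variation).
pcs = PCA(n_components=10).fit_transform(data)
pls = PLSRegression(n_components=3).fit(pcs, data[:, seeds])
latent_tc = pls.x_weights_.T @ pcs.T          # latent variables x time

# 3) Project every voxel onto the PLS latent time courses and apply
#    spectral clustering in that latent space.
features = data.T @ latent_tc.T               # voxels x latent variables
labels = SpectralClustering(n_clusters=4, random_state=0,
                            affinity="nearest_neighbors").fit_predict(features)
print("parcel sizes:", np.bincount(labels))
```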

Relevance: 100.00%

Abstract:

Maintenance of transport infrastructure assets is widely advocated as key to minimizing the current and future costs of the transportation network. While effective maintenance decisions are often a result of engineering skills and practical knowledge, efficient decisions must also account for the net result over an asset's life-cycle. One essential aspect of the long-term perspective on transport infrastructure maintenance is to proactively estimate maintenance needs. In dealing with immediate maintenance actions, support tools that can prioritize potential maintenance candidates are important for an efficient maintenance strategy. This dissertation consists of five individual research papers presenting a microdata analysis approach to transport infrastructure maintenance. Microdata analysis is a multidisciplinary field in which large quantities of data are collected, analyzed, and interpreted to improve decision-making. Increased access to transport infrastructure data enables a deeper understanding of causal effects and the possibility of predicting future outcomes. The microdata analysis approach covers the complete process from data collection to actual decisions and is therefore well suited to the task of improving efficiency in transport infrastructure maintenance. Statistical modeling was the selected analysis method in this dissertation and provided solutions to the problems presented in each of the five papers. In Paper I, a time-to-event model was used to estimate remaining road pavement lifetimes in Sweden. In Paper II, an extension of the model in Paper I assessed the impact of latent variables on road lifetimes, revealing the sections in a road network that are weaker due to, e.g., subsoil conditions or undetected heavy traffic. The study in Paper III incorporated a probabilistic parametric distribution as a representation of road lifetimes into an equation for the marginal cost of road wear. Differentiated road-wear marginal costs for heavy and light vehicles are an important basis for decisions regarding vehicle miles traveled (VMT) taxation policies. In Paper IV, a distribution-based clustering method was used to distinguish between road segments that are deteriorating and road segments whose condition is stationary. Within railway networks, temporary speed restrictions are often imposed because of maintenance and must be addressed in order to maintain punctuality. The study in Paper V evaluated the empirical effect of speed restrictions on running time on a Norwegian railway line using a generalized linear mixed model.
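
The Paper I ingredient, a time-to-event model for pavement lifetimes, can be sketched as a right-censored Weibull model fitted by maximum likelihood. The data are simulated and the censoring scheme is an illustrative assumption, not the dissertation's Swedish dataset.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(8)

# Simulated pavement lifetimes (years); an inspection horizon
# right-censors sections still in good condition (event = 0).
true_shape, true_scale = 2.5, 15.0
life = true_scale * rng.weibull(true_shape, size=400)
horizon = rng.uniform(5, 25, size=400)
t = np.minimum(life, horizon)
event = (life <= horizon).astype(float)   # 1 = failure observed

def neg_log_lik(params):
    k, lam = np.exp(params)               # log-parameterized for positivity
    log_pdf = np.log(k / lam) + (k - 1) * np.log(t / lam) - (t / lam) ** k
    log_surv = -(t / lam) ** k            # log S(t) for censored sections
    return -np.sum(event * log_pdf + (1 - event) * log_surv)

res = minimize(neg_log_lik, x0=np.log([1.0, 10.0]), method="Nelder-Mead")
k_hat, lam_hat = np.exp(res.x)
print(f"shape {k_hat:.2f} (true {true_shape}), "
      f"scale {lam_hat:.1f} (true {true_scale})")
```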

Relevance: 100.00%

Abstract:

Effective and efficient implementation of intelligent and/or recently emerged networked manufacturing systems requires enterprise-level integration. Networked manufacturing offers several advantages in the current competitive atmosphere: it shortens the manufacturing cycle time and maintains production flexibility, thereby achieving several feasible process plans. The first step in this direction is to integrate manufacturing functions such as process planning and scheduling for multiple jobs in a network-based manufacturing system. It is difficult to determine a proper plan that meets conflicting objectives simultaneously. This paper describes a mobile-agent based negotiation approach to integrate manufacturing functions in a distributed manner, and presents its fundamental framework and functions. Moreover, an ontology has been constructed using the Protégé software, which can convert knowledge into Extensible Markup Language (XML) schemas of Web Ontology Language (OWL) documents. The generated XML schemas have been used to transfer information throughout the manufacturing network for the intelligent, interoperable integration of product data models and manufacturing resources. To validate the feasibility of the proposed approach, an illustrative example with varied production environments, including production demand fluctuations, is presented, and the approach's performance and effectiveness are compared with the evolutionary Hybrid Dynamic-DNA (HD-DNA) algorithm. The results show that the proposed scheme is very effective and reasonably acceptable for the integration of manufacturing functions.
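
The negotiation idea can be sketched as a toy contract-net-style exchange: job agents announce tasks, machine agents bid with a cost that reflects their rate and current load, and the best bid wins. The names and cost rules are invented for illustration, not the paper's protocol.

```python
# Toy mobile-agent-style negotiation for integrating process planning
# and scheduling: jobs are awarded to machines via competitive bidding.
machines = {"M1": {"rate": 1.0, "load": 0.0},
            "M2": {"rate": 1.3, "load": 0.0},
            "M3": {"rate": 0.9, "load": 0.0}}
jobs = [("J1", 4.0), ("J2", 6.0), ("J3", 2.0), ("J4", 5.0)]

def bid(machine, proc_time):
    # Bid = processing cost + penalty for queueing behind current load.
    m = machines[machine]
    return m["rate"] * proc_time + 0.5 * m["load"]

schedule = {}
for job, proc_time in jobs:                           # job agent announces task
    bids = {m: bid(m, proc_time) for m in machines}   # machine agents reply
    winner = min(bids, key=bids.get)                  # award to the best bid
    machines[winner]["load"] += proc_time             # winner updates its load
    schedule[job] = winner

print(schedule)
print({m: machines[m]["load"] for m in machines})
```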