948 resultados para Hierarchical partitioning analysis


Relevância:

30.00% 30.00%

Publicador:

Resumo:

Objective: Recently, much research has been proposed using nature inspired algorithms to perform complex machine learning tasks. Ant colony optimization (ACO) is one such algorithm based on swarm intelligence and is derived from a model inspired by the collective foraging behavior of ants. Taking advantage of the ACO in traits such as self-organization and robustness, this paper investigates ant-based algorithms for gene expression data clustering and associative classification. Methods and material: An ant-based clustering (Ant-C) and an ant-based association rule mining (Ant-ARM) algorithms are proposed for gene expression data analysis. The proposed algorithms make use of the natural behavior of ants such as cooperation and adaptation to allow for a flexible robust search for a good candidate solution. Results: Ant-C has been tested on the three datasets selected from the Stanford Genomic Resource Database and achieved relatively high accuracy compared to other classical clustering methods. Ant-ARM has been tested on the acute lymphoblastic leukemia (ALL)/acute myeloid leukemia (AML) dataset and generated about 30 classification rules with high accuracy. Conclusions: Ant-C can generate optimal number of clusters without incorporating any other algorithms such as K-means or agglomerative hierarchical clustering. For associative classification, while a few of the well-known algorithms such as Apriori, FP-growth and Magnum Opus are unable to mine any association rules from the ALL/AML dataset within a reasonable period of time, Ant-ARM is able to extract associative classification rules.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Web document cluster analysis plays an important role in information retrieval by organizing large amounts of documents into a small number of meaningful clusters. Traditional web document clustering is based on the Vector Space Model (VSM), which takes into account only two-level (document and term) knowledge granularity but ignores the bridging paragraph granularity. However, this two-level granularity may lead to unsatisfactory clustering results with “false correlation”. In order to deal with the problem, a Hierarchical Representation Model with Multi-granularity (HRMM), which consists of five-layer representation of data and a twophase clustering process is proposed based on granular computing and article structure theory. To deal with the zero-valued similarity problemresulted from the sparse term-paragraphmatrix, an ontology based strategy and a tolerance-rough-set based strategy are introduced into HRMM. By using granular computing, structural knowledge hidden in documents can be more efficiently and effectively captured in HRMM and thus web document clusters with higher quality can be generated. Extensive experiments show that HRMM, HRMM with tolerancerough-set strategy, and HRMM with ontology all outperform VSM and a representative non VSM-based algorithm, WFP, significantly in terms of the F-Score.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper a Hierarchical Analytical Network Process (HANP) model is demonstrated for evaluating alternative technologies for generating electricity from MSW in India. The technological alternatives and evaluation criteria for the HANP study are characterised by reviewing the literature and consulting experts in the field of waste management. Technologies reviewed in the context of India include landfill, anaerobic digestion, incineration, pelletisation and gasification. To investigate the sensitivity of the result, we examine variations in expert opinions and carry out an Analytical Hierarchy Process (AHP) analysis for comparison. We find that anaerobic digestion is the preferred technology for generating electricity from MSW in India. Gasification is indicated as the preferred technology in an AHP model due to the exclusion of criteria dependencies and in an HANP analysis when placing a high priority on net output and retention time. We conclude that HANP successfully provides a structured framework for recommending which technologies to pursue in India, and the adoption of such tools is critical at a time when key investments in infrastructure are being made. Therefore the presented methodology is thought to have a wider potential for investors, policy makers, researchers and plant developers in India and elsewhere. © 2013 Elsevier Ltd. All rights reserved.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Biological experiments often produce enormous amount of data, which are usually analyzed by data clustering. Cluster analysis refers to statistical methods that are used to assign data with similar properties into several smaller, more meaningful groups. Two commonly used clustering techniques are introduced in the following section: principal component analysis (PCA) and hierarchical clustering. PCA calculates the variance between variables and groups them into a few uncorrelated groups or principal components (PCs) that are orthogonal to each other. Hierarchical clustering is carried out by separating data into many clusters and merging similar clusters together. Here, we use an example of human leukocyte antigen (HLA) supertype classification to demonstrate the usage of the two methods. Two programs, Generating Optimal Linear Partial Least Square Estimations (GOLPE) and Sybyl, are used for PCA and hierarchical clustering, respectively. However, the reader should bear in mind that the methods have been incorporated into other software as well, such as SIMCA, statistiXL, and R.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A thermodynamic analysis which is capable of estimating the austenite/ferrite equilibria in duplex stainless steels has been carried out using the sublattice thermodynamic model. The partitioning of alloying elements between the austenite and ferrite phases has been calculated as a function of temperature. The results showed that chromium partitioning was not influenced significantly by the temperature. The molybdenum, on the other hand, was found to partition preferentially into ferrite phase as the temperature decreases. A strong partitioning of nickel into the austenite was observed to decrease gradually with increasing temperature. Among the alloying elements, average nitrogen concentration was found to have the most profound effect on the phase balance and the partitioning of nitrogen into the austenite. The partitioning coefficient of nitrogen (the ratio of the mole fraction of nitrogen in the austenite to that in the ferrite) was found to be as high as 7.0 around 1300 K. Consequently, the volume fraction of austenite was influenced by relatively small additions of nitrogen. The results are compared with the experimentally observed data in a duplex stainless steel weld metal in conjunction with the solid state δ → δ + γ phase transformation. Particular attention was given to the morphological instability of grain boundary austenite allotriomorphs. A compariso between the experimental results and calculations indicated that the instability associated with irregular austenite perturbations results from the high degree of undercooling. The results suggest that the model can be used successfully to understand the development of the microstructure in duplex stainless steel weld metals.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The purpose of this paper is to explain the notion of clustering and a concrete clustering method- agglomerative hierarchical clustering algorithm. It shows how a data mining method like clustering can be applied to the analysis of stocks, traded on the Bulgarian Stock Exchange in order to identify similar temporal behavior of the traded stocks. This problem is solved with the aid of a data mining tool that is called XLMiner™ for Microsoft Excel Office.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Systems analysis (SA) is widely used in complex and vague problem solving. Initial stages of SA are analysis of problems and purposes to obtain problems/purposes of smaller complexity and vagueness that are combined into hierarchical structures of problems(SP)/purposes(PS). Managers have to be sure the PS and the purpose realizing system (PRS) that can achieve the PS-purposes are adequate to the problem to be solved. However, usually SP/PS are not substantiated well enough, because their development is based on a collective expertise in which logic of natural language and expert estimation methods are used. That is why scientific foundations of SA are not supposed to have been completely formed. The structure-and-purpose approach to SA based on a logic-and-linguistic simulation of problems/purposes analysis is a step towards formalization of the initial stages of SA to improve adequacy of their results, and also towards increasing quality of SA as a whole. Managers of industrial organizing systems using the approach eliminate logical errors in SP/PS at early stages of planning and so they will be able to find better decisions of complex and vague problems.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The principal feature of ontology, which is developed for a text processing, is wider knowledge representation of an external world due to introduction of three-level hierarchy. It allows to improve semantic interpretation of natural language texts.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We introduce a novel algorithm for medial surfaces extraction that is based on the density-corrected Hamiltonian analysis. The approach extracts the skeleton directly from a triangulated mesh and adopts an adaptive octree-based approach in which only skeletal voxels are refined to a lower level of the hierarchy, resulting in robust and accurate skeletons at extremely high resolution. The quality of the extracted medial surfaces is confirmed by an extensive set of experiments. © 2012 IEEE.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Herein, we demonstrate a template-free and eco-friendly strategy to synthesize hierarchical Ag3PO4 microcrystals with sharp corners and edges via silver–ammine complex at room temperature. The as-synthesized hierarchical Ag3PO4 microcrystals were characterized by X-ray diffraction, field-emission scanning electron microscope (FESEM), UV–vis diffuse reflectance spectroscopy (UV–vis DRS), BET surface area analyzer, and photoluminescence analysis (PL). Our results clearly indicated that the as-synthesized Ag3PO4 microcrystals possess a hierarchical structure with sharp corners and edges. More attractively, the adsorption ability and visible light photocatalytic activity of the as-synthesized hierarchical Ag3PO4 is much higher than that of conventional Ag3PO4.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This study explores factors related to the prompt difficulty in Automated Essay Scoring. The sample was composed of 6,924 students. For each student, there were 1-4 essays, across 20 different writing prompts, for a total of 20,243 essays. E-rater® v.2 essay scoring engine developed by the Educational Testing Service was used to score the essays. The scoring engine employs a statistical model that incorporates 10 predictors associated with writing characteristics of which 8 were used. The Rasch partial credit analysis was applied to the scores to determine the difficulty levels of prompts. In addition, the scores were used as outcomes in the series of hierarchical linear models (HLM) in which students and prompts constituted the cross-classification levels. This methodology was used to explore the partitioning of the essay score variance.^ The results indicated significant differences in prompt difficulty levels due to genre. Descriptive prompts, as a group, were found to be more difficult than the persuasive prompts. In addition, the essay score variance was partitioned between students and prompts. The amount of the essay score variance that lies between prompts was found to be relatively small (4 to 7 percent). When the essay-level, student-level-and prompt-level predictors were included in the model, it was able to explain almost all variance that lies between prompts. Since in most high-stakes writing assessments only 1-2 prompts per students are used, the essay score variance that lies between prompts represents an undesirable or "noise" variation. Identifying factors associated with this "noise" variance may prove to be important for prompt writing and for constructing Automated Essay Scoring mechanisms for weighting prompt difficulty when assigning essay score.^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The purpose of this study was to better understand the study behaviors and habits of university undergraduate students. It was designed to determine whether undergraduate students could be grouped based on their self-reported study behaviors and if any grouping system could be determined, whether group membership was related to students’ academic achievement. A total of 152 undergraduate students voluntarily participated in the current study by completing the Study Behavior Inventory instrument. All participants were enrolled in fall semester of 2010 at Florida International University. The Q factor analysis technique using principal components extraction and a varimax rotation was used in order to examine the participants in relation to each other and to detect a pattern of intercorrelations among participants based on their self-reported study behaviors. The Q factor analysis yielded a two factor structure representing two distinct student types among participants regarding their study behaviors. The first student type (i.e., Factor 1) describes proactive learners who organize both their study materials and study time well. Type 1 students are labeled “Proactive Learners with Well-Organized Study Behaviors”. The second type (i.e., Factor 2) represents students who are poorly organized as well as being very likely to procrastinate. Type 2 students are labeled Disorganized Procrastinators. Hierarchical linear regression was employed to examine the relationship between student type and academic achievement as measured by current grade point averages (GPAs). The results showed significant differences in GPAs between Type 1 and Type 2 students at the .05 significance level. Furthermore, student type was found to be a significant predictor of academic achievement beyond and above students’ attribute variables including sex, age, major, and enrollment status. The study has several implications for educational researchers, practitioners, and policy makers in terms of improving college students' learning behaviors and outcomes.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Petri Nets are a formal, graphical and executable modeling technique for the specification and analysis of concurrent and distributed systems and have been widely applied in computer science and many other engineering disciplines. Low level Petri nets are simple and useful for modeling control flows but not powerful enough to define data and system functionality. High level Petri nets (HLPNs) have been developed to support data and functionality definitions, such as using complex structured data as tokens and algebraic expressions as transition formulas. Compared to low level Petri nets, HLPNs result in compact system models that are easier to be understood. Therefore, HLPNs are more useful in modeling complex systems. ^ There are two issues in using HLPNs—modeling and analysis. Modeling concerns the abstracting and representing the systems under consideration using HLPNs, and analysis deals with effective ways study the behaviors and properties of the resulting HLPN models. In this dissertation, several modeling and analysis techniques for HLPNs are studied, which are integrated into a framework that is supported by a tool. ^ For modeling, this framework integrates two formal languages: a type of HLPNs called Predicate Transition Net (PrT Net) is used to model a system's behavior and a first-order linear time temporal logic (FOLTL) to specify the system's properties. The main contribution of this dissertation with regard to modeling is to develop a software tool to support the formal modeling capabilities in this framework. ^ For analysis, this framework combines three complementary techniques, simulation, explicit state model checking and bounded model checking (BMC). Simulation is a straightforward and speedy method, but only covers some execution paths in a HLPN model. Explicit state model checking covers all the execution paths but suffers from the state explosion problem. BMC is a tradeoff as it provides a certain level of coverage while more efficient than explicit state model checking. The main contribution of this dissertation with regard to analysis is adapting BMC to analyze HLPN models and integrating the three complementary analysis techniques in a software tool to support the formal analysis capabilities in this framework. ^ The SAMTools developed for this framework in this dissertation integrates three tools: PIPE+ for HLPNs behavioral modeling and simulation, SAMAT for hierarchical structural modeling and property specification, and PIPE+Verifier for behavioral verification.^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Produced water constitutes the largest volume of waste from offshore oil and gas operations and is composed of a wide range of organic and inorganic compounds. Although treatment processes have to meet strict oil in water regulations, the definition of “oil” is a function of the analysis process and may include aliphatic hydrocarbons which have limited environmental impact due to degradability whilst ignoring problematic dissolved petroleum species. This thesis presents the partitioning behavior of oil in produced water as a function of temperature and salinity to identify compounds of environmental concern. Phenol, p-cresol, and 4-tert-butylphenol were studied because of their xenoestrogenic power; other compounds studied are polycyclic aromatic hydrocarbon PAHs which include naphthalene, fluorene, phenanthrene, and pyrene. Partitioning experiments were carried out in an Innova incubator for 48 hours, temperature was varied from 4゚C to 70゚C, and two salinity levels of 46.8‰ and 66.8‰ were studied. Results obtained showed that the dispersed oil concentration in the water reduces with settling time and equilibrium was attained at 48 h settling time. Polycyclic aromatic hydrocarbons (PAHs) partitions based on dispersed oil concentration whereas phenols are not significantly affected by dispersed oil concentration. Higher temperature favors partitioning of PAHs into the water phase. Salinity has negligible effect on partitioning pattern of phenols and PAHs studied. Simulation results obtained from the Aspen HYSYS model shows that temperature and oil droplet distribution greatly influences the efficiency of produced water treatment system.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The known moss flora of Terra Nova National Park, eastern Newfoundland, comp~ises 210 species. Eighty-two percent of the moss species occurring in Terra Nova are widespread or widespread-sporadic in Newfoundland. Other Newfoundland distributional elements present in the Terra Nova moss flora are the northwestern, southern, southeastern, and disjunct elements, but four of the mosses occurring in Terra Nova appear to belong to a previously unrecognized northeastern element of the Newfoundland flora. The majority (70.9%) of Terra Nova's mosses are of boreal affinity and are widely distributed in the North American coniferous forest belt. An additional 10.5 percent of the Terra Nova mosses are cosmopolitan while 9.5 percent are temperate and 4.8 percent are arctic-montane species. The remaining 4.3 percent of the mosses are of montane affinity, and disjunct between eastern and western North America. In Terra Nova, temperate species at their northern limit are concentrated in balsam fir stands, while arctic-montane species are restricted to exposed cliffs, scree slopes, and coastal exposures. Montane species are largely confined to exposed or freshwater habitats. Inability to tolerate high summer temperatures limits the distributions of both arctic-montane and montane species. In Terra Nova, species of differing phytogeographic affinities co-occur on cliffs and scree slopes. The microhabitat relationships of five selected species from such habitats were evaluated by Discriminant Functions Analysis and Multiple Regression Analysis. The five mosses have distinct and different microhabitats on cliffs and scree slopes in Terra Nova, and abundance of all but one is associated with variation in at least one microhabitat variable. Micro-distribution of Grimmia torquata, an arctic-montane species at its southern limit, appears to be deterJ]lined by sensitivity to high summer temperatures. Both southern mosses at their northern limit (Aulacomnium androgynum, Isothecium myosuroides) appear to be limited by water availability and, possibly, by low winter temperatures. The two species whose distributions extend both north and south or the study area (Encalypta procera, Eurhynchium pulchellum) show no clear relationship with microclimate. Dispersal factors have played a significant role in the development of the Terra Nova moss flora. Compared to the most likely colonizing source (i .e. the rest of the island of Newfoundland), species with small diaspores have colonized the study area to a proportionately much greater extent than have species with large diaspores. Hierarchical log-linear analysis indicates that this is so for all affinity groups present in Terra Nova. The apparent dispersal effects emphasize the comparatively recent glaciation of the area, and may also have been enhanced by anthropogenic influences. The restriction of some species to specific habitats, or to narrowly defined microhabitats, appears to strengthen selection for easily dispersed taxa.