998 results for Evolutionary clustering


Relevance:

20.00%

Publisher:

Abstract:

Digital collections are growing exponentially in size as the information age takes a firm grip on all aspects of society. As a result, Information Retrieval (IR) has become an increasingly important area of research. It promises to provide new and more effective ways for users to find information relevant to their search intentions. Document clustering is one of the many tools in the IR toolbox and is far from being perfected. It groups documents that share common features. This grouping allows a user to quickly identify relevant information. If these groups are misleading then valuable information can accidentally be ignored. Therefore, the study and analysis of the quality of document clustering is important. With more and more digital information available, the performance of these algorithms is also of interest. An algorithm with a time complexity of O(n^2) can quickly become impractical when clustering a corpus containing millions of documents. Therefore, the investigation of algorithms and data structures to perform clustering in an efficient manner is vital to its success as an IR tool. Document classification is another tool frequently used in the IR field. It predicts categories of new documents based on an existing database of (document, category) pairs. Support Vector Machines (SVM) have been found to be effective when classifying text documents. As the algorithms for classification are both efficient and of high quality, the largest gains can be made from improvements to representation. Document representations are vital for both clustering and classification. Representations exploit the content and structure of documents. Dimensionality reduction can improve the effectiveness of existing representations in terms of quality and run-time performance. Research into these areas is another way to improve the efficiency and quality of clustering and classification results. Evaluating document clustering is a difficult task. 
Intrinsic measures of quality such as distortion only indicate how well an algorithm minimised a similarity function in a particular vector space. Intrinsic comparisons are inherently limited by the given representation and are not comparable between different representations. Extrinsic measures of quality compare a clustering solution to a “ground truth” solution. This allows comparison between different approaches. As the “ground truth” is created by humans, it can suffer from the fact that not every human interprets a topic in the same manner. Whether a document belongs to a particular topic or not can be subjective.
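The distinction between intrinsic and extrinsic measures can be illustrated with a minimal sketch. It assumes distortion is taken as the sum of squared Euclidean distances to cluster centroids, and it uses purity as one common extrinsic measure; the abstract names neither formula, and the function names `distortion` and `purity` are hypothetical.

```python
from collections import Counter


def distortion(points, labels):
    """Intrinsic: sum of squared Euclidean distances from each point to its cluster centroid.

    Assumed definition for illustration; the value depends on the vector space
    used, so it is not comparable across different representations.
    """
    clusters = {}
    for point, cluster in zip(points, labels):
        clusters.setdefault(cluster, []).append(point)
    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        centroid = [sum(m[i] for m in members) / len(members) for i in range(dim)]
        total += sum(
            sum((m[i] - centroid[i]) ** 2 for i in range(dim)) for m in members
        )
    return total


def purity(labels, truth):
    """Extrinsic: fraction of documents in the majority ground-truth class of their cluster."""
    by_cluster = {}
    for cluster, category in zip(labels, truth):
        by_cluster.setdefault(cluster, Counter())[category] += 1
    return sum(max(counts.values()) for counts in by_cluster.values()) / len(labels)


# Toy, made-up data: four 2-D "documents" in two clusters.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
labels = [0, 0, 1, 1]
truth = ["a", "a", "b", "a"]  # hypothetical human-assigned categories

print(distortion(points, labels))
print(purity(labels, truth))
```

Distortion needs only the vectors and the cluster assignment, whereas purity needs the human-assigned categories, which is why only the latter can be compared across representations.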

Relevance:

20.00%

Publisher:

Abstract:

China has made great progress in constructing comprehensive legislative and judicial infrastructures to protect intellectual property rights (IPRs). But levels of enforcement remain low. Estimates suggest that 90% of film and music products consumed in China are ‘pirated’ and that in 2009, 81% of the infringing goods seized at the US border originated from China. Despite heavy criticism over its failure to enforce IPRs, key areas of China’s creative industries, including film, mobile-music, fashion and animation, are developing rapidly. This paper explores how the rapid expansion of China’s creative economy might be reconciled with conceptual approaches that view the creative industries (CIs) in terms of creativity inputs and IP outputs. It argues that an evolutionary understanding of copyright’s role in creative innovation might better explain China’s experiences and provide more general insights into the nature of the creative industries and the policies most likely to promote growth in this sector of the economy.

Relevance:

20.00%

Publisher:

Abstract:

Software used by architectural and industrial designers has moved from being a tool for drafting towards use in verification, simulation, project management and remote project sharing. In more advanced models, parameters for the designed object can be adjusted so a family of variations can be produced rapidly. With advances in computer aided design technology, numerous design options can now be generated and analyzed in real time. However, the use of digital tools to support design as an activity is still at an early stage and has largely been limited in functionality with regard to the design process. To date, major CAD vendors have not developed an integrated tool that is able to both leverage specialized design knowledge from various discipline domains (known as expert knowledge systems) and support the creation of design alternatives that satisfy different forms of constraints. We propose that evolutionary computing and machine learning be linked with parametric design techniques to record and respond to a designer’s own way of working and design history. It is expected that this will lead to results that impact on future work on design support systems (ergonomics and interface) as well as implicit constraint and problem definition for problems that are difficult to quantify.

Relevance:

20.00%

Publisher:

Abstract:

This paper explores design thinking from the perspective of designing new forms of interaction to engage people in community change initiatives. A case study of an agile ridesharing system is presented. We describe the fundamental premise of the design approach taken: deploying simple interactive prototypes for use by communities in order to test the design hypothesis, evolve the design in use and grow the community of participants. Real-time use data and feedback from participants influence our understanding of the design approach and feed into the gradual evolution of the prototype while it continues to be used. We then reflect upon this form of evolutionary distributed design thinking. In contrast to the conventional IT wisdom of building systems to automate ride matching and fare calculation using structured forms, our initial phase of design revealed a preference for informal messaging, negotiation and caution in the sharing of specific location information.

Relevance:

20.00%

Publisher:

Abstract:

Computational Fluid Dynamics (CFD) has become an important tool in optimization and has seen success in many real-world applications. Most important among these is the optimisation of aerodynamic surfaces, which has become Multi-Objective (MO) and Multidisciplinary (MDO) in nature. Most of these optimisations have been carried out for a given set of input parameters such as free stream Mach number and angle of attack. One cannot ignore the fact that in aerospace engineering one frequently deals with situations where the design input parameters and flight/flow conditions have some amount of uncertainty attached to them. When the optimisation is carried out for fixed values of design variables and parameters, however, one arrives at an optimised solution that results in good performance at the design condition but poor drag or lift-to-drag ratio at slightly off-design conditions. The challenge is still to develop a robust design that accounts for uncertainty in the design in aerospace applications. In this paper this issue is taken up and an attempt is made to prevent the fluctuation of objective performance by using robust design technique or Uncertainty.

Relevance:

20.00%

Publisher:

Abstract:

Lactobacillus reuteri BR11 possesses an abundant cystine uptake (Cyu) ABC-transporter that was previously found to be involved in a novel mechanism of oxidative defence mediated by cystine. The current study aimed to elucidate this mechanism with a focus on the role of the co-transcribed cystathionine γ-lyase (Cgl). Growth studies of wild-type L. reuteri BR11 and mutants inactivated in cgl and the cystine-binding protein encoding gene cyuC showed that in contrast to the Cyu transporter, whose inactivation led to growth arrest in aerated cultures, Cgl is not crucial for oxidative defence. However, the role of Cgl in oxidative defence became apparent in the presence of severe oxidative damage and cysteine deprivation. Cysteine was found to be protective against oxidative stress, and the action of Cgl in both cysteine biosynthesis and degradation poses a seemingly futile pathway that deprives the intracellular cysteine pool. To further characterise the relationship between Cgl activity and cysteine and their roles in oxidative defence, enzymatic assays were performed on purified Cgl, and intracellular concentrations of cysteine, cystathionine and methionine were determined. Cgl was highly active towards cystine and cystathionine and less active towards cysteine in vitro, suggesting the main function of Cgl to be cysteine biosynthesis. Cysteine was found at high concentrations in the cell, but the levels were not significantly affected by inactivation of cgl or growth under aerobic conditions. It was concluded that both anabolic and catabolic activities of Cgl towards cysteine contribute to oxidative defence, the former by maintaining an intracellular reservoir of thiol analogous to glutathione, and the latter by producing H2S which is readily secreted, thus creating a reducing extracellular environment. The significance of the Cyu transporter to the physiology of L. reuteri BR11 prompted a phylogenetic study to determine its presence in bacteria. 
The closest-matching orthologs of the Cyu transporter are limited to several species of Lactobacillus and Leuconostoc. Outside the Lactobacillales order, the closest matching orthologs belong to Proteobacteria, and there are more orthologs in Proteobacteria than in non-Lactobacillales Firmicutes, suggesting that the Cyu transporter locus was present in the ancestor of the Proteobacteria and Firmicutes, and over evolutionary time has been lost or has diverged in many Firmicutes. The clustering of the Cyu transporter locus with a gene encoding a Cgl family protein is even rarer. It was only found in L. reuteri, Lactobacillus vaginalis, Weissella paramesenteroides, the Lactobacillus casei group, and several Campylobacter sp. An accompanying phylogenetic study of L. reuteri BR11 using multi-locus sequence analysis showed that L. reuteri BR11 had diverged from more than 100 strains of L. reuteri isolated from various hosts and geographical locations. However, comparison with other Lactobacillus species supported the current classification of BR11 as L. reuteri. The most closely related species to L. reuteri is L. vaginalis or Lactobacillus antri, depending on the housekeeping gene used for analysis. The close evolutionary relationship of L. vaginalis to L. reuteri and the high degree of sequence identity between the cgl-cyuABC loci in both species suggest that the Cyu system is highly likely to perform similar functions in L. vaginalis. In search of other genes that function in oxidative defence, a number of mutants were constructed that were inactivated in genes known to confer increased resistance to oxidative stress in other bacteria. The genes targeted were ahpC (peroxidase component of the alkyl hydroperoxide reductase system), tpx (thiol peroxidase), osmC (osmotically induced protein C), mntH (Mn2+/Fe2+ transporter), gshA (γ-glutamylcysteine synthetase) and msrA (methionine sulfoxide reductase). 
The ahpC and mntH mutants had slightly lower minimum inhibitory concentrations of organic peroxides, suggesting these genes might be involved in resistance to organic peroxides in L. reuteri. However, none of the mutants exhibited growth defects in aerated cultures, in stark contrast to the cyuC mutant. This may be due to compensatory functions of other genes, a hypothesis which cannot be tested until a robust protocol for constructing markerless multiple gene deletion mutants in L. reuteri is developed. These results highlight the importance of the Cyu transporter in oxidative defence and provide a foundation for extending the research of this system in other bacteria.

Relevance:

20.00%

Publisher:

Abstract:

The main objective of this paper is to detail the development of a feasible hardware design based on Evolutionary Algorithms (EAs) to determine flight path planning for Unmanned Aerial Vehicles (UAVs) navigating terrain with obstacle boundaries. The design architecture includes the hardware implementation of Light Detection And Ranging (LiDAR) terrain and EA population memories within the hardware, as well as the EA search and evaluation algorithms used in the optimizing stage of path planning. A synthesisable Very-high-speed integrated circuit Hardware Description Language (VHDL) implementation of the design was developed, for realisation on a Field Programmable Gate Array (FPGA) platform. Simulation results show significant speedup compared with an equivalent software implementation written in C++, suggesting that the present approach is well suited for UAV real-time path planning applications.

Relevance:

20.00%

Publisher:

Abstract:

A hierarchical structure is used to represent the content of semi-structured documents such as XML and XHTML. The traditional Vector Space Model (VSM) is not sufficient to represent both the structure and the content of such web documents. Hence, in this paper, we introduce a novel method of representing XML documents in a Tensor Space Model (TSM) and then utilize it for clustering. Empirical analysis shows that the proposed method is scalable for a real-life dataset and that the factorized matrices produced by the proposed method help to improve the quality of clusters, owing to a document representation enriched with both structure and content information.

Relevance:

20.00%

Publisher:

Abstract:

This paper argues for a model of adaptive design for sustainable architecture within a framework of entropy evolution. The spectrum of sustainable architecture consists of efficient use of energy and material resources in the life-cycle of buildings, active involvement of the occupants in micro-climate control within the building, and the natural environment as the physical context. The interactions amongst all the parameters compose a complex system of sustainable architecture design, for which conventional linear and fragmented design technologies are insufficient to indicate holistic and ongoing environmental performance. The latest interpretation of the Second Law of Thermodynamics states a microscopic formulation of an entropy evolution of complex open systems. It provides a design framework for an adaptive system that evolves towards optimization in open systems; here, the adaptive system evolves to optimize building environmental performance. The paper concludes that adaptive modelling in entropy evolution is a design alternative for sustainable architecture.

Relevance:

20.00%

Publisher:

Abstract:

Ross River virus (RRV) is a mosquito-borne member of the genus Alphavirus that causes epidemic polyarthritis in humans, costing the Australian health system at least US$10 million annually. Recent progress in RRV vaccine development requires accurate assessment of RRV genetic diversity and evolution, particularly as they may affect the utility of future vaccination. In this study, we provide novel RRV genome sequences and investigate the evolutionary dynamics of RRV from time-structured E2 gene datasets. Our analysis indicates that, although RRV evolves at a similar rate to other alphaviruses (mean evolutionary rate of approx. 8 × 10^-4 nucleotide substitutions per site per year), the relative genetic diversity of RRV has been continuously low through time, possibly as a result of purifying selection imposed by replication in a wide range of natural host and vector species. Together, these findings suggest that vaccination against RRV is unlikely to result in the rapid antigenic evolution that could compromise the future efficacy of current RRV vaccines.

Relevance:

20.00%

Publisher:

Abstract:

This paper presents an overview of the experiments conducted using the Hybrid Clustering of XML documents using Constraints (HCXC) method for the clustering task in the INEX 2009 XML Mining track. This technique utilises frequent subtrees generated from the structure to extract the content for clustering the XML documents. It also presents an experimental study using several data representations: structure-only, content-only, and both the structure and the content of XML documents. Unlike previous years, this year the XML documents were marked up using Wiki tags and contain categories derived using the YAGO ontology. This paper also presents the results of studying the effect of these tags on XML clustering using the HCXC method.

Relevance:

20.00%

Publisher:

Abstract:

The XML Document Mining track was launched for exploring two main ideas: (1) identifying key problems and new challenges of the emerging field of mining semi-structured documents, and (2) studying and assessing the potential of Machine Learning (ML) techniques for dealing with generic ML tasks in the structured domain, i.e., classification and clustering of semi-structured documents. This track has run for six editions during INEX 2005, 2006, 2007, 2008, 2009 and 2010. The first five editions have been summarized in previous editions and we focus here on the 2010 edition. INEX 2010 included two tasks in the XML Mining track: (1) unsupervised clustering task and (2) semi-supervised classification task where documents are organized in a graph. The clustering task requires the participants to group the documents into clusters without any knowledge of category labels using an unsupervised learning algorithm. On the other hand, the classification task requires the participants to label the documents in the dataset into known categories using a supervised learning algorithm and a training set. This report gives the details of clustering and classification tasks.

Relevance:

20.00%

Publisher:

Abstract:

Background: Waist circumference has been identified as a valuable predictor of cardiovascular risk in children. The development of waist circumference percentiles and cut-offs for various ethnic groups is necessary because of differences in body composition. The purpose of this study was to develop waist circumference percentiles for Chinese children and to explore optimal waist circumference cut-off values for predicting the clustering of cardiovascular risk factors in this population.
Methods: Height, weight, and waist circumference were measured in 5529 children (2830 boys and 2699 girls) aged 6-12 years randomly selected from southern and northern China. Blood pressure, fasting triglycerides, low-density lipoprotein cholesterol, high-density lipoprotein cholesterol, and glucose were obtained in a subsample (n = 1845). Smoothed percentile curves were produced using the LMS method. Receiver-operating characteristic analysis was used to derive the optimal age- and gender-specific waist circumference thresholds for predicting the clustering of cardiovascular risk factors.
Results: Gender-specific waist circumference percentiles were constructed. The waist circumference thresholds were at the 90th and 84th percentiles for Chinese boys and girls respectively, with sensitivity and specificity ranging from 67% to 83%. The odds ratio of a clustering of cardiovascular risk factors among boys and girls with values above the cut-off points was 10.349 (95% confidence interval 4.466 to 23.979) and 8.084 (95% confidence interval 3.147 to 20.767), respectively, compared with their counterparts.
Conclusions: Percentile curves for waist circumference of Chinese children are provided. The cut-off point for waist circumference to predict the clustering of cardiovascular risk factors is at the 90th and 84th percentiles for Chinese boys and girls, respectively.

Relevance:

20.00%

Publisher:

Abstract:

The traditional Vector Space Model (VSM) is not able to represent both the structure and the content of XML documents. This paper introduces a novel method of representing XML documents in a Tensor Space Model (TSM) and then utilizing it for clustering. Empirical analysis shows that the proposed method is scalable for large-sized datasets; as well, the factorized matrices produced from the proposed method help to improve the quality of clusters through the enriched document representation of both structure and content information.