850 resultados para Wavelet Packet and Support Vector Machine


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Background. The factors behind the reemergence of severe, invasive group A streptococcal (GAS) diseases are unclear, but it could be caused by altered genetic endowment in these organisms. However, data from previous studies assessing the association between single genetic factors and invasive disease are often conflicting, suggesting that other, as-yet unidentified factors are necessary for the development of this class of disease. Methods. In this study, we used a targeted GAS virulence microarray containing 226 GAS genes to determine the virulence gene repertoires of 68 GAS isolates (42 associated with invasive disease and 28 associated with noninvasive disease) collected in a defined geographic location during a contiguous time period. We then employed 3 advanced machine learning methods (genetic algorithm neural network, support vector machines, and classification trees) to identify genes with an increased association with invasive disease. Results. Virulence gene profiles of individual GAS isolates varied extensively among these geographically and temporally related strains. Using genetic algorithm neural network analysis, we identified 3 genes with a marginal overrepresentation in invasive disease isolates. Significantly, 2 of these genes, ssa and mf4, encoded superantigens but were only present in a restricted set of GAS M-types. The third gene, spa, was found in variable distributions in all M-types in the study. Conclusions. Our comprehensive analysis of GAS virulence profiles provides strong evidence for the incongruent relationships among any of the 226 genes represented on the array and the overall propensity of GAS to cause invasive disease, underscoring the pathogenic complexity of these diseases, as well as the importance of multiple bacteria and/ or host factors.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

An emerging issue in the field of astronomy is the integration, management and utilization of databases from around the world to facilitate scientific discovery. In this paper, we investigate application of the machine learning techniques of support vector machines and neural networks to the problem of amalgamating catalogues of galaxies as objects from two disparate data sources: radio and optical. Formulating this as a classification problem presents several challenges, including dealing with a highly unbalanced data set. Unlike the conventional approach to the problem (which is based on a likelihood ratio) machine learning does not require density estimation and is shown here to provide a significant improvement in performance. We also report some experiments that explore the importance of the radio and optical data features for the matching problem.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The dissertation starts by providing a description of the phenomena related to the increasing importance recently acquired by satellite applications. The spread of such technology comes with implications, such as an increase in maintenance cost, from which derives the interest in developing advanced techniques that favor an augmented autonomy of spacecrafts in health monitoring. Machine learning techniques are widely employed to lay a foundation for effective systems specialized in fault detection by examining telemetry data. Telemetry consists of a considerable amount of information; therefore, the adopted algorithms must be able to handle multivariate data while facing the limitations imposed by on-board hardware features. In the framework of outlier detection, the dissertation addresses the topic of unsupervised machine learning methods. In the unsupervised scenario, lack of prior knowledge of the data behavior is assumed. In the specific, two models are brought to attention, namely Local Outlier Factor and One-Class Support Vector Machines. Their performances are compared in terms of both the achieved prediction accuracy and the equivalent computational cost. Both models are trained and tested upon the same sets of time series data in a variety of settings, finalized at gaining insights on the effect of the increase in dimensionality. The obtained results allow to claim that both models, combined with a proper tuning of their characteristic parameters, successfully comply with the role of outlier detectors in multivariate time series data. Nevertheless, under this specific context, Local Outlier Factor results to be outperforming One-Class SVM, in that it proves to be more stable over a wider range of input parameter values. This property is especially valuable in unsupervised learning since it suggests that the model is keen to adapting to unforeseen patterns.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Yellow fever virus (YFV) was isolated from Haemagogus leucocelaenus mosquitoes during an epizootic in 2001 in the Rio Grande do Sul State in southern Brazil In October 2008 a yellow fever outbreak was reported there with nonhuman primate deaths and human cases This latter outbreak led to intensification of surveillance measures for early detection of YFV and support for vaccination programs We report entomologic surveillance in 2 municipalities that recorded nonhuman primate deaths Mosquitoes were collected at ground level identified and processed for virus isolation and molecular analyses Eight YFV strains were isolated (7 from pools of Hg leucocelaenus mosquitoes and another from Aedes serratus mosquitoes) 6 were sequenced and they grouped in the YFV South American genotype I The results confirmed the role of Hg leucocelaenus mosquitoes as the main YFV vector in southern Brazil and suggest that Ae serratus mosquitoes may have a potential role as a secondary vector

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Due to the several kinds of services that use the Internet and data networks infra-structures, the present networks are characterized by the diversity of types of traffic that have statistical properties as complex temporal correlation and non-gaussian distribution. The networks complex temporal correlation may be characterized by the Short Range Dependence (SRD) and the Long Range Dependence - (LRD). Models as the fGN (Fractional Gaussian Noise) may capture the LRD but not the SRD. This work presents two methods for traffic generation that synthesize approximate realizations of the self-similar fGN with SRD random process. The first one employs the IDWT (Inverse Discrete Wavelet Transform) and the second the IDWPT (Inverse Discrete Wavelet Packet Transform). It has been developed the variance map concept that allows to associate the LRD and SRD behaviors directly to the wavelet transform coefficients. The developed methods are extremely flexible and allow the generation of Gaussian time series with complex statistical behaviors.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Coquillettidia linealis is a severe pest on some of the Moreton Bay islands in Queensland, Australia, but little is known of its breeding habitats and biology. Because of its high abundance and its association with Ross River (RR) and Barmah Forest (BF) viruses by field isolation, its vector competence was evaluated in the laboratory by feeding dilutions of both viruses in blood. For RR, Cq. linealis was of comparable efficiency to Ochlerotatus vigilax (Skuse), recognised as being a major vector. Results were as follows for Cq. linealis and Oc. vigilax , respectively: dose to infect 50%, 10(2.2) and

Relevância:

100.00% 100.00%

Publicador:

Resumo:

OBJECTIVE: Describe the overall transmission of malaria through a compartmental model, considering the human host and mosquito vector. METHODS: A mathematical model was developed based on the following parameters: human host immunity, assuming the existence of acquired immunity and immunological memory, which boosts the protective response upon reinfection; mosquito vector, taking into account that the average period of development from egg to adult mosquito and the extrinsic incubation period of parasites (transformation of infected but non-infectious mosquitoes into infectious mosquitoes) are dependent on the ambient temperature. RESULTS: The steady state equilibrium values obtained with the model allowed the calculation of the basic reproduction ratio in terms of the model's parameters. CONCLUSIONS: The model allowed the calculation of the basic reproduction ratio, one of the most important epidemiological variables.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Designing electric installation projects, demands not only academic knowledge, but also other types of knowledge not easily acquired through traditional instructional methodologies. A lot of additional empirical knowledge is missing and so the academic instruction must be completed with different kinds of knowledge, such as real-life practical examples and simulations. On the other hand, the practical knowledge detained by the most experienced designers is not formalized in such a way that is easily transmitted. In order to overcome these difficulties present in the engineers formation, we are developing an Intelligent Tutoring System (ITS), for training and support concerning the development of electrical installation projects to be used by electrical engineers, technicians and students.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Many learning problems require handling high dimensional datasets with a relatively small number of instances. Learning algorithms are thus confronted with the curse of dimensionality, and need to address it in order to be effective. Examples of these types of data include the bag-of-words representation in text classification problems and gene expression data for tumor detection/classification. Usually, among the high number of features characterizing the instances, many may be irrelevant (or even detrimental) for the learning tasks. It is thus clear that there is a need for adequate techniques for feature representation, reduction, and selection, to improve both the classification accuracy and the memory requirements. In this paper, we propose combined unsupervised feature discretization and feature selection techniques, suitable for medium and high-dimensional datasets. The experimental results on several standard datasets, with both sparse and dense features, show the efficiency of the proposed techniques as well as improvements over previous related techniques.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Arguably, the most difficult task in text classification is to choose an appropriate set of features that allows machine learning algorithms to provide accurate classification. Most state-of-the-art techniques for this task involve careful feature engineering and a pre-processing stage, which may be too expensive in the emerging context of massive collections of electronic texts. In this paper, we propose efficient methods for text classification based on information-theoretic dissimilarity measures, which are used to define dissimilarity-based representations. These methods dispense with any feature design or engineering, by mapping texts into a feature space using universal dissimilarity measures; in this space, classical classifiers (e.g. nearest neighbor or support vector machines) can then be used. The reported experimental evaluation of the proposed methods, on sentiment polarity analysis and authorship attribution problems, reveals that it approximates, sometimes even outperforms previous state-of-the-art techniques, despite being much simpler, in the sense that they do not require any text pre-processing or feature engineering.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

More than ever, there is an increase of the number of decision support methods and computer aided diagnostic systems applied to various areas of medicine. In breast cancer research, many works have been done in order to reduce false-positives when used as a double reading method. In this study, we aimed to present a set of data mining techniques that were applied to approach a decision support system in the area of breast cancer diagnosis. This method is geared to assist clinical practice in identifying mammographic findings such as microcalcifications, masses and even normal tissues, in order to avoid misdiagnosis. In this work a reliable database was used, with 410 images from about 115 patients, containing previous reviews performed by radiologists as microcalcifications, masses and also normal tissue findings. Throughout this work, two feature extraction techniques were used: the gray level co-occurrence matrix and the gray level run length matrix. For classification purposes, we considered various scenarios according to different distinct patterns of injuries and several classifiers in order to distinguish the best performance in each case described. The many classifiers used were Naïve Bayes, Support Vector Machines, k-nearest Neighbors and Decision Trees (J48 and Random Forests). The results in distinguishing mammographic findings revealed great percentages of PPV and very good accuracy values. Furthermore, it also presented other related results of classification of breast density and BI-RADS® scale. The best predictive method found for all tested groups was the Random Forest classifier, and the best performance has been achieved through the distinction of microcalcifications. The conclusions based on the several tested scenarios represent a new perspective in breast cancer diagnosis using data mining techniques.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Dissertação apresentada na Faculdade de Ciências e Tecnologia da Universidade Nova de Lisboa para obtenção do grau de Mestre em Engenharia Electrotécnica e Computadores

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Quality of life is a concept influenced by social, economic, psychological, spiritual or medical state factors. More specifically, the perceived quality of an individual's daily life is an assessment of their well-being or lack of it. In this context, information technologies may help on the management of services for healthcare of chronic patients such as estimating the patient quality of life and helping the medical staff to take appropriate measures to increase each patient quality of life. This paper describes a Quality of Life estimation system developed using information technologies and the application of data mining algorithms to access the information of clinical data of patients with cancer from Otorhinolaryngology and Head and Neck services of an oncology institution. The system was evaluated with a sample composed of 3013 patients. The results achieved show that there are variables that may be significant predictors for the Quality of Life of the patient: years of smoking (p value 0.049) and size of the tumor (p value < 0.001). In order to assign the variables to the classification of the quality of life the best accuracy was obtained by applying the John Platt's sequential minimal optimization algorithm for training a support vector classifier. In conclusion data mining techniques allow having access to patients additional information helping the physicians to be able to know the quality of life and produce a well-informed clinical decision.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The Internet of Things (IoT) has emerged as a paradigm over the last few years as a result of the tight integration of the computing and the physical world. The requirement of remote sensing makes low-power wireless sensor networks one of the key enabling technologies of IoT. These networks encompass several challenges, especially in communication and networking, due to their inherent constraints of low-power features, deployment in harsh and lossy environments, and limited computing and storage resources. The IPv6 Routing Protocol for Low Power and Lossy Networks (RPL) [1] was proposed by the IETF ROLL (Routing Over Low-power Lossy links) working group and is currently adopted as an IETF standard in the RFC 6550 since March 2012. Although RPL greatly satisfied the requirements of low-power and lossy sensor networks, several issues remain open for improvement and specification, in particular with respect to Quality of Service (QoS) guarantees and support for mobility. In this paper, we focus mainly on the RPL routing protocol. We propose some enhancements to the standard specification in order to provide QoS guarantees for static as well as mobile LLNs. For this purpose, we propose OF-FL (Objective Function based on Fuzzy Logic), a new objective function that overcomes the limitations of the standardized objective functions that were designed for RPL by considering important link and node metrics, namely end-to-end delay, number of hops, ETX (Expected transmission count) and LQL (Link Quality Level). In addition, we present the design of Co-RPL, an extension to RPL based on the corona mechanism that supports mobility in order to overcome the problem of slow reactivity to frequent topology changes and thus providing a better quality of service mainly in dynamic networks application. Performance evaluation results show that both OF-FL and Co-RPL allow a great improvement when compared to the standard specification, mainly in terms of packet loss ratio and average network latency. 2015 Elsevier B.V. Al

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Geographic information systems give us the possibility to analyze, produce, and edit geographic information. Furthermore, these systems fall short on the analysis and support of complex spatial problems. Therefore, when a spatial problem, like land use management, requires a multi-criteria perspective, multi-criteria decision analysis is placed into spatial decision support systems. The analytic hierarchy process is one of many multi-criteria decision analysis methods that can be used to support these complex problems. Using its capabilities we try to develop a spatial decision support system, to help land use management. Land use management can undertake a broad spectrum of spatial decision problems. The developed decision support system had to accept as input, various formats and types of data, raster or vector format, and the vector could be polygon line or point type. The support system was designed to perform its analysis for the Zambezi river Valley in Mozambique, the study area. The possible solutions for the emerging problems had to cover the entire region. This required the system to process large sets of data, and constantly adjust to new problems’ needs. The developed decision support system, is able to process thousands of alternatives using the analytical hierarchy process, and produce an output suitability map for the problems faced.