852 results for 080109 Pattern Recognition and Data Mining
Abstract:
[…] intelligence applications for the banking industry. Searches were performed in relevant journals, resulting in 219 articles published between 2002 and 2013. To analyze such a large number of manuscripts, text mining techniques were used in pursuit of relevant terms in both the business intelligence and banking domains. Moreover, latent Dirichlet allocation modeling was used in order to group the articles into several relevant topics. The analysis was conducted using a dictionary of terms belonging to both the banking and business intelligence domains. This procedure allowed the identification of relationships between terms and the topics grouping the articles, allowing hypotheses regarding research directions to emerge. To confirm these hypotheses, relevant articles were collected and scrutinized, which validated the text mining procedure. The results show that credit in banking is clearly the main application trend, particularly predicting risk and thus supporting credit approval or denial. There is also a relevant interest in bankruptcy and fraud prediction. Customer retention seems to be associated, although weakly, with targeting, justifying bank offers to reduce churn. In addition, a large number of articles focused more on business intelligence techniques and their applications, using the banking industry just for evaluation and thus not clearly claiming benefits for the banking business. By identifying these current research topics, this study also highlights opportunities for future research.
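The dictionary-based step described in this abstract can be illustrated with a minimal sketch. It only counts dictionary terms per document, producing the kind of document-term matrix that a topic model such as latent Dirichlet allocation would take as input. The dictionary entries and function names below are hypothetical and are not taken from the study.

```python
import re
from collections import Counter

# Hypothetical dictionary of domain terms (an assumption, not the study's list).
DICTIONARY = ["credit", "fraud", "bankruptcy", "churn", "data mining"]


def count_terms(text, dictionary=DICTIONARY):
    """Count whole-word, case-insensitive occurrences of each dictionary term."""
    lowered = text.lower()
    counts = Counter()
    for term in dictionary:
        pattern = r"\b" + re.escape(term) + r"\b"
        counts[term] = len(re.findall(pattern, lowered))
    return counts


def document_term_matrix(documents, dictionary=DICTIONARY):
    """Return one row of term counts per document, in dictionary order."""
    rows = []
    for doc in documents:
        counts = count_terms(doc, dictionary)
        rows.append([counts[term] for term in dictionary])
    return rows


docs = [
    "Credit risk and credit approval.",
    "Fraud detection with data mining; churn.",
]
matrix = document_term_matrix(docs)
```

Each row of `matrix` corresponds to one document and each column to one dictionary term, so restricting the vocabulary to the dictionary keeps the matrix small and focused on the two domains of interest.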
Abstract:
In this paper, we present an integrated system for real-time automatic detection of human actions from video. The proposed approach uses the boundary of humans as the main feature for recognizing actions. Background subtraction is performed using a Gaussian mixture model. Then, features are extracted from silhouettes and vector quantization is used to map the features into symbols (bag-of-words approach). Finally, actions are detected using a Hidden Markov Model. The proposed system was validated using a newly collected real-world dataset. The results show that the system is capable of robust human detection in both indoor and outdoor environments. Moreover, promising classification results were achieved when detecting two basic human actions: walking and sitting.
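The vector quantization step mentioned above, mapping feature vectors to discrete symbols, can be sketched as a nearest-codeword lookup. This is an illustration under assumptions: the 2-D codebook and the feature values are hypothetical, and the abstract does not say how the codebook was built.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(features, codebook):
    """Map each feature vector to the index of its nearest codeword."""
    symbols = []
    for feature in features:
        nearest = min(
            range(len(codebook)),
            key=lambda i: squared_distance(feature, codebook[i]),
        )
        symbols.append(nearest)
    return symbols


# Hypothetical codebook and silhouette features (assumed values).
CODEBOOK = [(0.0, 0.0), (1.0, 1.0), (0.0, 2.0)]
features = [(0.1, 0.2), (0.9, 1.2), (0.2, 1.9)]
symbols = quantize(features, CODEBOOK)
```

The resulting symbol sequence is the kind of discrete observation sequence a Hidden Markov Model can consume.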
Abstract:
Integrated master's dissertation in Information Systems Engineering and Management
Abstract:
Data mining, frequent pattern mining, database mining, mining algorithms in SQL
Abstract:
This paper presents general problems and approaches for spatial data analysis using machine learning algorithms. Machine learning is a very powerful approach to adaptive data analysis, modelling and visualisation. The key feature of machine learning algorithms is that they learn from empirical data and can be used in cases when the modelled environmental phenomena are hidden, nonlinear, noisy and highly variable in space and in time. Most machine learning algorithms are universal and adaptive modelling tools developed to solve basic problems of learning from data: classification/pattern recognition, regression/mapping and probability density modelling. In the present report some of the widely used machine learning algorithms, namely artificial neural networks (ANN) of different architectures and Support Vector Machines (SVM), are adapted to the problems of the analysis and modelling of geo-spatial data. Machine learning algorithms have an important advantage over traditional models of spatial statistics when problems are considered in high-dimensional geo-feature spaces, that is, when the dimension of the space exceeds 5. Such features are usually generated, for example, from digital elevation models, remote sensing images, etc. An important extension of the models concerns the consideration of real-space constraints such as geomorphology, networks, and other natural structures. Recent developments in semi-supervised learning can improve the modelling of environmental phenomena by taking geo-manifolds into account. An important part of the study deals with the analysis of relevant variables and model inputs. This problem is approached using different nonlinear feature selection/feature extraction tools.
To demonstrate the application of machine learning algorithms several interesting case studies are considered: digital soil mapping using SVM, automatic mapping of soil and water system pollution using ANN; natural hazards risk analysis (avalanches, landslides), assessments of renewable resources (wind fields) with SVM and ANN models, etc. The dimensionality of spaces considered varies from 2 to more than 30. Figures 1, 2, 3 demonstrate some results of the studies and their outputs. Finally, the results of environmental mapping are discussed and compared with traditional models of geostatistics.
Abstract:
Pathogenic microorganisms trigger two kinds of immune responses in the host. The first is immediate and non-specific and is termed innate immunity, whereas the second, specifically targeted at the invading agent, is termed adaptive immunity.
Innate immunity, which represents the first line of defense against invading pathogens, gives the host the ability to recognize molecular structures common to many microbial pathogens ("Pathogen-Associated Molecular Patterns", PAMPs) through cytosolic or membrane-associated receptors ("Pattern Recognition Receptors", PRRs), the latter being represented by a family of receptors termed "toll-like receptors", or TLRs. Once activated by the binding of their specific ligand, these receptors activate intracellular signaling pathways, which initiate the non-specific inflammatory response aimed at eradicating the pathogens. The two pathways implicated in this process are the mitogen-activated protein kinase (MAPK) and the nuclear factor kappa B (NF-κB) signaling pathways, whose activation ultimately elicits the expression of inflammatory proteins termed cytokines, as well as various enzymes producing a wealth of additional inflammatory mediators. In some circumstances, the innate immune response can become amplified and dysregulated, triggering an overwhelming systemic inflammatory response in the host, identified as sepsis. Sepsis can be associated with multiple organ dysfunction (severe sepsis) and, in its most severe form, with cardiovascular collapse, defining septic shock. The cardiovascular failure associated with septic shock affects blood vessels as well as the heart, resulting in a particular form of acute heart failure termed "septic cardiac dysfunction", whose pathogenic mechanisms remain partly undefined. Gram-negative bacteria can initiate such phenomena, notably by releasing lipopolysaccharide (LPS), which activates innate immune signaling by interacting with its specific toll-like receptor, TLR4. Besides LPS, most Gram-negative bacteria also release flagellin into their environment. Flagellin is the main structural protein of the bacterial flagellum, an appendage extending from the outer bacterial membrane that is responsible for the motility of the microorganism.
Recent data indicated that flagellin activates immune responses upon binding to its receptor, TLR5, in various cell types. However, the role of the flagellin/TLR5 interaction in the development of inflammation and organ dysfunction during sepsis is not known. Therefore, we designed the present work to address the hypothesis that flagellin might trigger such inflammatory responses and thus represent a potential mediator of organ dysfunction during Gram-negative sepsis, with a particular emphasis on cardiac inflammation and contractile dysfunction. In the first part of this work, we investigated the effects of flagellin on NF-κB and MAPK activation and the generation of pro-inflammatory mediators within the heart in vitro (cultured cardiomyocytes) and in vivo (injection of recombinant flagellin into mice). We first observed that TLR5 protein is strongly expressed by the myocardium. We then demonstrated that flagellin activates NF-κB and MAP kinases (p38 and JNK), upregulates the transcription of pro-inflammatory cytokines and chemokines in vitro and in vivo, and stimulates the activation of polymorphonuclear neutrophils within the heart in vivo. Finally, we demonstrated that flagellin triggers acute cardiac dilation and a significant reduction of left ventricular contractility, mimicking characteristics of clinical septic cardiac dysfunction. In the second part, we determined the TLR5 distribution in other major mouse organs (lung, liver, gut and kidney) and characterized in these organs the effects of flagellin on NF-κB and MAPK activation, on the expression of pro-inflammatory cytokines, and on the induction of apoptosis. We demonstrated that TLR5 protein is constitutively expressed and that flagellin activates prototypical innate immune responses and pro-apoptotic pathways in all these organs. Finally, we also observed that flagellin induces a significant increase of multiple cytokines in the plasma from 1 to 6 hours after its intravenous administration.
Altogether, these data provide evidence that bacterial flagellin (a) triggers an important inflammatory response and an acute dysfunction of the myocardium, and (b) significantly activates the mechanisms of innate immunity in most major organs and elicits a systemic inflammatory response. In consequence, flagellin may represent a potent mediator of inflammation and multiple organ failure, notably cardiac dysfunction, during Gram-negative septic shock.
Abstract:
Photo-mosaicing techniques have become popular for seafloor mapping in various marine science applications. However, the common methods cannot accurately map regions with high relief and topographical variations. Ortho-mosaicing, borrowed from photogrammetry, is an alternative technique that takes the 3-D shape of the terrain into account. A serious bottleneck is the volume of elevation information that needs to be estimated from the video data, fused, and processed to generate a composite ortho-photo that covers a relatively large seafloor area. We present a framework that combines the advantages of dense depth-map and 3-D feature estimation techniques based on visual motion cues. The main goal is to identify and reconstruct certain key terrain feature points that adequately represent the surface with minimal complexity in the form of piecewise planar patches. The proposed implementation utilizes local depth maps for feature selection, while tracking over several views enables 3-D reconstruction by bundle adjustment. Experimental results with synthetic and real data validate the effectiveness of the proposed approach.
Abstract:
The automatic interpretation of conventional traffic signs is very complex and time consuming. This paper concerns an automatic warning system for driving assistance. The system does not interpret the standard traffic signs on the roadside; instead, the proposal is to add to the existing signs another type of traffic sign whose information can be more easily interpreted by a processor. The information to be added is extensive, and therefore the most important objective is the robustness of the system. The basic idea of this new approach is that the co-pilot system for automatic warning and driving assistance can interpret the information contained in the new sign with greater ease, whilst the human driver only has to interpret the "classic" sign. One of the codings that has been tested with good results, and which seems easy to implement, has a rectangular shape and 4 vertical bars of different colours. The size of these signs is equivalent to that of conventional signs (approximately 0.4 m²). The colour information from the sign can be easily interpreted by the proposed processor, and its interpretation is much easier and quicker than that of the pictographs of the classic signs.
Abstract:
This study is part of an ongoing collaborative effort between the medical and the signal processing communities to promote research on applying standard Automatic Speech Recognition (ASR) techniques for the automatic diagnosis of patients with severe obstructive sleep apnoea (OSA). Early detection of severe apnoea cases is important so that patients can receive early treatment. Effective ASR-based detection could dramatically cut medical testing time. Working with a carefully designed speech database of healthy and apnoea subjects, we describe an acoustic search for distinctive apnoea voice characteristics. We also study abnormal nasalization in OSA patients by modelling vowels in nasal and nonnasal phonetic contexts using Gaussian Mixture Model (GMM) pattern recognition on speech spectra. Finally, we present experimental findings regarding the discriminative power of GMMs applied to severe apnoea detection. We have achieved an 81% correct classification rate, which is very promising and underpins the interest in this line of inquiry.
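As an illustration of GMM-based classification, the sketch below scores a set of samples under two mixtures and picks the class with the higher log-likelihood. It is a simplification under stated assumptions: features are scalar rather than speech spectra, and the mixture parameters and the names `MODELS` and `classify` are hypothetical, not values or code from the study.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def gmm_log_likelihood(samples, components):
    """Log-likelihood of samples under a mixture of (weight, mean, variance)."""
    total = 0.0
    for x in samples:
        density = sum(w * gaussian_pdf(x, m, v) for w, m, v in components)
        total += math.log(density)
    return total


def classify(samples, models):
    """Return the label whose mixture gives the highest log-likelihood."""
    return max(models, key=lambda label: gmm_log_likelihood(samples, models[label]))


# Hypothetical two-component mixtures for each class (assumed parameters).
MODELS = {
    "control": [(0.5, 0.0, 1.0), (0.5, 2.0, 1.0)],
    "apnoea": [(0.5, 5.0, 1.0), (0.5, 7.0, 1.0)],
}

label_a = classify([0.5, 1.5, 2.2], MODELS)
label_b = classify([5.5, 6.8], MODELS)
```

In practice the mixtures would be trained on labelled data (for example with expectation-maximization) before being used for scoring in this way.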
Abstract:
Evaluation of segmentation methods is a crucial aspect of image processing, especially in the medical imaging field, where small differences between segmented regions in the anatomy can be of paramount importance. Usually, segmentation evaluation is based on a measure that depends on the number of segmented voxels inside and outside some reference regions, called gold standards. Although some other measures have also been used, in this work we propose a set of new similarity measures based on different features, such as the location and intensity values of the misclassified voxels, and the connectivity and the boundaries of the segmented data. Using the multidimensional information provided by these measures, we propose a new evaluation method whose results are visualized by applying a Principal Component Analysis to the data, yielding a simplified graphical method for comparing different segmentation results. We have carried out an intensive study using several classic segmentation methods applied to a set of simulated MRI data of the brain with several noise and RF inhomogeneity levels, and also to real data. The study shows that the new measures proposed here, together with the results obtained from the multidimensional evaluation, improve the robustness of the evaluation and provide a better understanding of the differences between segmentation methods.
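The conventional kind of measure the abstract refers to, one based on counts of voxels inside and outside a reference region, can be illustrated with the Dice coefficient. Choosing Dice here is an assumption for illustration; the abstract does not name a specific measure, and the voxel sets below are hypothetical.

```python
def voxel_counts(segmented, reference):
    """Count agreeing and disagreeing voxels between two sets of coordinates."""
    return {
        "true_positive": len(segmented & reference),
        "false_positive": len(segmented - reference),
        "false_negative": len(reference - segmented),
    }


def dice_coefficient(segmented, reference):
    """Dice overlap between two voxel sets; 1.0 means identical regions."""
    if not segmented and not reference:
        return 1.0  # two empty regions agree trivially
    overlap = len(segmented & reference)
    return 2.0 * overlap / (len(segmented) + len(reference))


# Hypothetical gold standard and segmentation result, as (x, y, z) voxels.
reference = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
segmented = {(0, 0, 0), (0, 0, 1), (1, 0, 0)}

counts = voxel_counts(segmented, reference)
dice = dice_coefficient(segmented, reference)
```

A measure like this ignores where the misclassified voxels lie and what their intensities are, which is the limitation the location-, intensity-, connectivity- and boundary-based measures proposed in the abstract are meant to address.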
Abstract:
One of the challenges of tumour immunology remains the identification of strongly immunogenic tumour antigens for vaccination. Reverse immunology, that is, the procedure to predict and identify immunogenic peptides from the sequence of a gene product of interest, has been postulated to be a particularly efficient, high-throughput approach for tumour antigen discovery. Over one decade after this concept was born, we discuss the reverse immunology approach in terms of costs and efficacy: data mining with bioinformatic algorithms, molecular methods to identify tumour-specific transcripts, prediction and determination of proteasomal cleavage sites, peptide-binding prediction to HLA molecules and experimental validation, assessment of the in vitro and in vivo immunogenic potential of selected peptide antigens, isolation of specific cytolytic T lymphocyte clones and final validation in functional assays of tumour cell recognition. We conclude that the overall low sensitivity and yield of every prediction step often requires a compensatory up-scaling of the initial number of candidate sequences to be screened, rendering reverse immunology an unexpectedly complex approach.
Abstract:
BACKGROUND: Solexa/Illumina short-read ultra-high throughput DNA sequencing technology produces millions of short tags (up to 36 bases) by parallel sequencing-by-synthesis of DNA colonies. The processing and statistical analysis of such high-throughput data poses new challenges; currently a fair proportion of the tags are routinely discarded due to an inability to match them to a reference sequence, thereby reducing the effective throughput of the technology. RESULTS: We propose a novel base calling algorithm using model-based clustering and probability theory to identify ambiguous bases and code them with IUPAC symbols. We also select optimal sub-tags using a score based on information content to remove uncertain bases towards the ends of the reads. CONCLUSION: We show that the method improves genome coverage and number of usable tags as compared with Solexa's data processing pipeline by an average of 15%. An R package is provided which allows fast and accurate base calling of Solexa's fluorescence intensity files and the production of informative diagnostic plots.
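Coding ambiguous bases with IUPAC symbols can be sketched as follows. The IUPAC nucleotide codes in the table are standard; the decision rule (keep every base whose probability reaches a threshold) and the names `call_base` and `call_read` are assumptions for illustration and do not reproduce the model-based clustering of the published method.

```python
# Standard IUPAC nucleotide codes, keyed by the set of bases they represent.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def call_base(probabilities, threshold=0.25):
    """Map per-base probabilities (keys A, C, G, T) to one IUPAC symbol.

    Assumed rule: every base with probability >= threshold is kept as a
    candidate; if none qualifies, the single most probable base is used.
    """
    candidates = frozenset(b for b, p in probabilities.items() if p >= threshold)
    if not candidates:
        candidates = frozenset({max(probabilities, key=probabilities.get)})
    return IUPAC[candidates]


def call_read(per_position_probabilities, threshold=0.25):
    """Call every position of a read and join the symbols into a string."""
    return "".join(call_base(p, threshold) for p in per_position_probabilities)


read = call_read([
    {"A": 0.90, "C": 0.05, "G": 0.03, "T": 0.02},
    {"A": 0.48, "C": 0.03, "G": 0.47, "T": 0.02},
    {"A": 0.10, "C": 0.40, "G": 0.10, "T": 0.40},
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
])
```

Keeping an ambiguity symbol rather than discarding the tag preserves partial information that a matching step can still use against a reference sequence.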
Abstract:
Gene expression patterns are a key feature in understanding gene function, notably in development. Comparing gene expression patterns between animals is a major step in the study of gene function as well as of animal evolution. It also provides a link between genes and phenotypes. Thus we have developed Bgee, a database designed to compare expression patterns between animals, by implementing ontologies describing anatomies and developmental stages of species, and then designing homology relationships between anatomies and comparison criteria between developmental stages. To define homology relationships between anatomical features we have developed the software Homolonto, which uses a modified ontology alignment approach to propose homology relationships between ontologies. Bgee then uses these aligned ontologies, onto which heterogeneous expression data types are mapped. These already include microarrays and ESTs.
Abstract:
Regulated by histone acetyltransferases and deacetylases (HDACs), histone acetylation is a key epigenetic mechanism controlling chromatin structure, DNA accessibility, and gene expression. HDAC inhibitors induce growth arrest, differentiation, and apoptosis of tumor cells and are used as anticancer agents. Here we describe the effects of HDAC inhibitors on microbial sensing by macrophages and dendritic cells in vitro and host defenses against infection in vivo. HDAC inhibitors down-regulated the expression of numerous host defense genes, including pattern recognition receptors, kinases, transcription regulators, cytokines, chemokines, growth factors, and costimulatory molecules as assessed by genome-wide microarray analyses or innate immune responses of macrophages and dendritic cells stimulated with Toll-like receptor agonists. HDAC inhibitors induced the expression of Mi-2β and enhanced the DNA-binding activity of the Mi-2/NuRD complex that acts as a transcriptional repressor of macrophage cytokine production. In vivo, HDAC inhibitors increased the susceptibility to bacterial and fungal infections but conferred protection against toxic and septic shock. Thus, these data identify an essential role for HDAC inhibitors in the regulation of the expression of innate immune genes and host defenses against microbial pathogens.
Abstract:
Over the past three decades, pedotransfer functions (PTFs) have been widely used by soil scientists to estimate soil properties in temperate regions in response to the lack of soil data for these regions. Several authors indicated that little effort has been dedicated to the prediction of soil properties in the humid tropics, where the need for soil property information is of even greater priority. The aim of this paper is to provide an up-to-date repository of past and recently published articles, as well as papers from proceedings of events, dealing with water-retention PTFs for soils of the humid tropics. Of the 35 publications found in the literature on PTFs for the prediction of water retention of soils of the humid tropics, 91 % of the PTFs are based on an empirical approach, and only 9 % are based on a semi-physical approach. Of the empirical PTFs, 97 % are continuous, and 3 % (one) is a class PTF; of the empirical PTFs, 97 % are based on multiple linear and nth-order polynomial regression techniques, and 3 % (one) is based on the k-Nearest Neighbor approach; 84 % of the continuous PTFs are point-based, and 16 % are parameter-based; 97 % of the continuous PTFs are equation-based PTFs, and 3 % (one) is based on pattern recognition. Additionally, it was found that 26 % of the tropical water-retention PTFs were developed for soils in Brazil, 26 % for soils in India, 11 % for soils in other countries in America, and 11 % for soils in other countries in Africa.