800 results for information bottleneck method
Abstract:
The gametocytes of the malaria parasite Plasmodium falciparum are highly resistant to antimalarial drugs. Their presence in the blood can be detected even after a successful malaria treatment. This paper explains a modified Annular Ring Ratio method which successfully locates and differentiates gametocytes of P. falciparum species in thin blood film images. The method can be used as an efficient tool for gametocyte detection in post-treatment malaria diagnosis. It also identifies the presence of any White Blood Cells (WBCs) in the image, and discards other artifacts and non-infected cells. It utilizes information based on the structure, color and geometry of the cells and does not require any segmentation or non-illumination correction techniques that are commonly used for cell detection.
Abstract:
This paper proposes a method for analysing the operational complexity in supply chains by using an entropic measure based on information theory. The proposed approach estimates the operational complexity at each stage of the supply chain and analyses the changes between stages. In this paper a stage is identified by the exchange of data and/or material. Through analysis the method identifies the stages where the operational complexity is both generated and propagated (exported, imported, generated or absorbed). Central to the method is the identification of a reference point within the supply chain. This is where the operational complexity is at a local minimum along the data transfer stages. Such a point can be thought of as a ‘sink’ for turbulence generated in the supply chain. Where it exists, it has the merit of stabilising the supply chain by attenuating uncertainty. However, the location of the reference point is also a matter of choice. If the preferred location is other than the current one, this is a trigger for management action. The analysis can help decide appropriate remedial action. More generally, the approach can assist logistics management by highlighting problem areas. An industrial application is presented to demonstrate the applicability of the method.
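The abstract above does not give the exact entropic formulation, so the following is only a minimal sketch under stated assumptions: each stage is summarized by a hypothetical list of observed states (for example "ok" or "late"), complexity is taken to be the Shannon entropy of the state frequencies, and a candidate reference point is any stage whose entropy is a strict local minimum along the ordered stages. The names `shannon_entropy`, `stage_complexities` and `local_minima` are illustrative and do not come from the paper.

```python
import math
from collections import Counter


def shannon_entropy(states):
    """Shannon entropy (in bits) of the empirical distribution of observed states."""
    counts = Counter(states)
    total = len(states)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def stage_complexities(stages):
    """Map each stage name to the entropy of its observed states (assumed measure)."""
    return {name: shannon_entropy(obs) for name, obs in stages.items()}


def local_minima(ordered_names, complexity):
    """Stages whose complexity is strictly lower than that of both neighbours."""
    minima = []
    for i, name in enumerate(ordered_names):
        left = complexity[ordered_names[i - 1]] if i > 0 else math.inf
        right = complexity[ordered_names[i + 1]] if i < len(ordered_names) - 1 else math.inf
        if complexity[name] < left and complexity[name] < right:
            minima.append(name)
    return minima


# Hypothetical observations per stage, in supply chain order.
stages = {
    "supplier": ["ok", "late", "early", "ok"],
    "plant": ["ok", "ok", "ok", "ok"],
    "customer": ["ok", "late"],
}
order = ["supplier", "plant", "customer"]
h = stage_complexities(stages)
candidates = local_minima(order, h)
```

In this toy data the "plant" stage absorbs variability, so it is the only candidate reference point. The differences between consecutive entropies would indicate whether a stage generates or absorbs complexity.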
Abstract:
We describe a novel approach to explore DNA nucleotide sequence data, aiming to produce high-level categorical and structural information about the underlying chromosomes, genomes and species. The article starts by analyzing chromosomal data through histograms using fixed length DNA sequences. After creating the DNA-related histograms, a correlation between pairs of histograms is computed, producing a global correlation matrix. These data are then used as input to several data processing methods for information extraction and tabular/graphical output generation. A set of 18 species is processed and the extensive results reveal that the proposed method is able to generate significant and diversified outputs, in good accordance with current scientific knowledge in domains such as genomics and phylogenetics.
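The first two steps described above (fixed-length word histograms, then pairwise correlation) can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: words are counted with an overlapping sliding window over the alphabet A, C, G, T, words containing other symbols are ignored, and the correlation is Pearson's. The function names are hypothetical.

```python
import math
from collections import Counter
from itertools import product


def kmer_histogram(seq, k):
    """Relative frequencies of all 4**k DNA words, in fixed lexicographic order."""
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[w] for w in words)  # ignores words with other symbols
    return [counts[w] / total if total else 0.0 for w in words]


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def correlation_matrix(seqs, k):
    """Global matrix of pairwise correlations between the sequences' histograms."""
    hists = [kmer_histogram(s, k) for s in seqs]
    return [[pearson(a, b) for b in hists] for a in hists]


# Toy sequences standing in for chromosomes.
m = correlation_matrix(["AACCGGTT", "AACCGGTT", "ACGTACGA"], k=2)
```

The resulting matrix could then be passed to clustering or visualization methods of the kind the abstract mentions.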
Abstract:
This paper analyzes the DNA code of several species from the perspective of information content. For that purpose, several concepts and mathematical tools are selected to establish a quantitative method that does not distort a priori the alphabet represented by the sequence of DNA bases. The synergies of associating Gray code, histogram characterization and multidimensional scaling visualization lead to a collection of plots with a categorical representation of species and chromosomes.
Abstract:
Proteins are biochemical entities consisting of one or more blocks typically folded in a 3D pattern. Each block (a polypeptide) is a single linear sequence of amino acids that are biochemically bonded together. The amino acid sequence in a protein is defined by the sequence of a gene or several genes encoded in the DNA-based genetic code. This genetic code typically uses twenty amino acids, but in certain organisms the genetic code can also include two other amino acids. After linking the amino acids during protein synthesis, each amino acid becomes a residue in a protein, which is then chemically modified, ultimately changing and defining the protein function. In this study, the authors analyze the amino acid sequence using alignment-free methods, aiming to identify structural patterns in sets of proteins and in the proteome, without any other previous assumptions. The paper starts by analyzing amino acid sequence data by means of histograms using fixed length amino acid words (tuples). After creating the initial relative frequency histograms, they are transformed and processed in order to generate quantitative results for information extraction and graphical visualization. Selected samples from two reference datasets are used, and results reveal that the proposed method is able to generate relevant outputs in accordance with current scientific knowledge in domains like protein sequence/proteome analysis.
Abstract:
This paper presents a contrastive approach to three different ways of building concepts, after demonstrating the similar syntactic possibilities that coexist in terms. From the semantic point of view, however, each language family has a different distribution of meaning. The most important point we try to show is that the differences found in the psychological process of communicating concepts should guide the translator and the terminologist in target text production and in the terminology planning process. Differences between languages in the information transmission process are due to the different roles that different types of knowledge play. We distinguish here analytic-descriptive knowledge and analogical knowledge, among others. We also state that neither is the best for determining the correctness of a term, but there have to be adequacy criteria in the selection process. Success in concept building or term building is important when looking at the linguistic map of the information society.
Abstract:
Seismic data is difficult to analyze, and classical mathematical tools reveal strong limitations in exposing hidden relationships between earthquakes. In this paper, we study earthquake phenomena from the perspective of complex systems. Global seismic data covering the period from 1962 to 2011 is analyzed. The events, characterized by their magnitude, geographic location and time of occurrence, are divided into groups, either according to the Flinn-Engdahl (F-E) seismic regions of Earth or using a rectangular grid based on latitude and longitude coordinates. Two methods of analysis are considered and compared in this study. In the first method, the distributions of magnitudes are approximated by Gutenberg-Richter (G-R) distributions and the parameters are used to reveal the relationships among regions. In the second method, the mutual information is calculated and adopted as a measure of similarity between regions. In both cases, using clustering analysis, visualization maps are generated, providing an intuitive and useful representation of the complex relationships that are present among seismic data. Such relationships might not be perceived on classical geographic maps. Therefore, the generated charts are a valid alternative to other visualization tools for understanding the global behavior of earthquakes.
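The second method can be illustrated with a small sketch of mutual information between two regions. The abstract does not say how the events are encoded, so this assumes each region is represented by a hypothetical sequence of discrete activity levels over the same time bins (for example 0 for quiet and 1 for active); the function name is illustrative.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Mutual information (in bits) between two aligned sequences of discrete symbols."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    # Sum over observed pairs of p(a, b) * log2(p(a, b) / (p(a) * p(b))).
    return sum(
        (c / n) * math.log2(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


# Hypothetical activity levels of three regions over four time bins.
region_a = [0, 0, 1, 1]
region_b = [0, 0, 1, 1]   # moves together with region_a
region_c = [0, 1, 0, 1]   # statistically independent of region_a

mi_ab = mutual_information(region_a, region_b)
mi_ac = mutual_information(region_a, region_c)
```

A higher value means the two regions' activity is more strongly related. A matrix of such values over all region pairs could serve as the similarity input to a clustering step.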
Abstract:
To avoid additional hardware deployment, indoor localization systems have to be designed so that they rely on existing infrastructure only. Besides processing measurements between nodes, the localization procedure can incorporate all available information about the environment. To enhance the performance of Wi-Fi based localization systems, the solution presented in this paper also considers negative information. An indoor tracking method inspired by Kalman filtering is also proposed.
Abstract:
In this work, kriging with covariates is used to model and map the spatial distribution of salinity measurements gathered by an autonomous underwater vehicle in a sea outfall monitoring campaign aiming to distinguish the effluent plume from the receiving waters and characterize its spatial variability in the vicinity of the discharge. Four different geostatistical linear models for salinity were assumed, where the distance to the diffuser, the west-east positioning, and the south-north positioning were used as covariates. Sample variograms were fitted by Matérn models using weighted least squares and maximum likelihood estimation methods as a way to detect possible discrepancies. Typically, the maximum likelihood method estimated very low ranges, which limited the kriging process. So, at least for these data sets, weighted least squares proved to be the most appropriate estimation method for variogram fitting. The kriged maps clearly show the spatial variation of salinity, and it is possible to identify the effluent plume in the area studied. The results obtained suggest some guidelines for sewage monitoring when a geostatistical analysis of the data is intended. It is important to treat anomalous values properly and to adopt a sampling strategy that includes transects parallel and perpendicular to the effluent dispersion.
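As a rough illustration of the sample variogram that precedes model fitting, the sketch below uses the classical estimator, half the mean squared difference between pairs of values grouped by separation distance. It covers only this step, not the Matérn fit or the kriging, and the binning scheme, function name and toy data are assumptions rather than details from the paper.

```python
import math


def sample_variogram(points, values, bin_width, n_bins):
    """Classical sample variogram: half the mean squared difference per distance bin.

    Bin b collects pairs whose separation d satisfies
    b * bin_width <= d < (b + 1) * bin_width. Empty bins yield None.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])
            b = int(d // bin_width)
            if b < n_bins:
                sums[b] += (values[i] - values[j]) ** 2
                counts[b] += 1
    return [s / (2 * c) if c else None for s, c in zip(sums, counts)]


# Hypothetical salinity samples along a short transect (coordinates in metres).
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
salinity = [1.0, 2.0, 4.0]
gamma = sample_variogram(points, salinity, bin_width=1.0, n_bins=3)
```

A parametric model would then be fitted to these binned values, for instance by weighted least squares with weights related to the number of pairs per bin.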
Abstract:
In the recent past, hardly anyone could have predicted the course that GIS development has taken. GIS is moving from the desktop to the cloud. Web 2.0 enabled people to contribute data to the web, and these data are increasingly geolocated. The resulting large volumes of data have come to be called "Big Data", and scientists still do not fully know how to deal with them. Various data mining tools are used to try to extract useful information from Big Data. In our study, we deal with one part of these data: User Generated Geographic Content (UGGC). The Panoramio initiative allows people to upload photos and describe them with tags. These photos are geolocated, which means that they have an exact location on the Earth's surface according to a certain spatial reference system. Using data mining tools, we try to determine whether it is possible to extract land use information from Panoramio photo tags, and to what extent this information can be accurate. Finally, we compared different data mining methods to determine which performs best for this kind of data, which is text. Our answers are quite encouraging. With more than 70% accuracy, we showed that extracting land use information is possible to some extent. We also found the Memory Based Reasoning (MBR) method to be the most suitable for this kind of data in all cases.
Abstract:
Diagnostic information on children is typically elicited from both children and their parents. The aims of the present paper were to: (1) compare prevalence estimates according to maternal reports, paternal reports and direct interviews of children [major depressive disorder (MDD), anxiety and attention-deficit and disruptive behavioural disorders]; (2) assess mother-child, father-child and inter-parental agreement for these disorders; (3) determine the association between several child, parent and familial characteristics and the degree of diagnostic agreement or the likelihood of parental reporting; (4) determine the predictive validity of diagnostic information provided by parents and children. Analyses were based on 235 mother-offspring, 189 father-offspring and 128 mother-father pairs. Diagnostic assessment included the Kiddie-schedule for Affective Disorders and Schizophrenia (K-SADS) (offspring) and the Diagnostic Interview for Genetic Studies (DIGS) (parents and offspring at follow-up) interviews. Parental reports were collected using the Family History - Research Diagnostic Criteria (FH-RDC). Analyses revealed: (1) prevalence estimates for internalizing disorders were generally lower according to parental information than according to the K-SADS; (2) mother-child and father-child agreement was poor and within similar ranges; (3) parents with a history of MDD or attention deficit hyperactivity disorder (ADHD) reported these disorders in their children more frequently; (4) in a sub-sample followed-up into adulthood, diagnoses of MDD, separation anxiety and conduct disorder at baseline concurred with the corresponding lifetime diagnosis at age 19 according to the child rather than according to the parents. 
In conclusion, our findings support large discrepancies of diagnostic information provided by parents and children with generally lower reporting of internalizing disorders by parents, and differential reporting of depression and ADHD by parental disease status. Follow-up data also supports the validity of information provided by adolescent offspring.
Abstract:
The present thesis study is a systematic investigation of information processing at sleep onset, using auditory event-related potentials (ERPs) as a test of the neurocognitive model of insomnia. Insomnia is an extremely prevalent disorder in society resulting in problems with daytime functioning (e.g., memory, concentration, job performance, mood, job and driving safety). Various models have been put forth in an effort to better understand the etiology and pathophysiology of this disorder. One of the newer models, the neurocognitive model of insomnia, suggests that chronic insomnia occurs through conditioned central nervous system arousal. This arousal is reflected through increased information processing which may interfere with sleep initiation or maintenance. The present thesis employed event-related potentials as a direct method to test information processing during the sleep-onset period. Thirteen poor sleepers with sleep-onset insomnia and 12 good sleepers participated in the present study. All poor sleepers met the diagnostic criteria for psychophysiological insomnia and had a complaint of problems with sleep initiation. All good sleepers reported no trouble sleeping and no excessive daytime sleepiness. Good and poor sleepers spent two nights at the Brock University Sleep Research Laboratory. The first night was used to screen for sleep disorders; the second night was used to investigate information processing during the sleep-onset period. Both groups underwent a repeated sleep-onsets task during which an auditory oddball paradigm was delivered. Participants signalled detection of a higher pitch target tone with a button press as they fell asleep. In addition, waking alert ERPs were recorded 1 hour before and after sleep on both Nights 1 and 2. As predicted by the neurocognitive model of insomnia, increased CNS activity was found in the poor sleepers; this was reflected by their smaller amplitude P2 component seen during wake of the sleep-onset period.
Unlike the P2 component, the N1, N350, and P300 did not vary between the groups. The smaller P2 seen in our poor sleepers indicates that they have a deficit in the sleep initiation processes. Specifically, poor sleepers do not disengage their attention from the outside environment to the same extent as good sleepers during the sleep-onset period. The lack of findings for the N350 suggests that this sleep component may be intact in those with insomnia and that it is the waking components (i.e., N1, P2) that may be leading to the deficit in sleep initiation. Further, it may be that the mechanism responsible for the disruption of sleep initiation in the poor sleepers is most reflected by the P2 component. Future research investigating ERPs in insomnia should focus on the identification of the components most sensitive to sleep disruption. As well, methods should be developed in order to more clearly identify the various types of insomnia populations in research contexts (e.g., psychophysiological vs. sleep-state misperception) and the various individual (personality characteristics, motivation) and environmental factors (arousal-related variables) that influence particular ERP components. Insomnia has serious consequences for health, safety, and daytime functioning; thus, research efforts should continue in order to help alleviate this highly prevalent condition.
Abstract:
A mixed-method investigation of undergraduate and graduate international students' proficiencies in both information literacy and academic writing, to see whether a relationship exists between them.
Abstract:
Multispectral satellite images, particularly those with high spatial resolution (finer than 30 m on the ground), are an invaluable source of information for decision-making in various fields related to natural resource management, environmental preservation, and the planning and management of urban centres. Study scales can range from local (resolutions finer than 5 m) to regional (resolutions coarser than 5 m). These images characterize the variation of object reflectance across the spectrum, which is the key information for a large number of applications of these data. However, satellite sensor measurements are also affected by "parasitic" factors related to illumination and viewing conditions, the atmosphere, topography and sensor properties. Two questions concerned us in this research. What is the best approach for retrieving ground reflectances from the digital numbers recorded by the sensors, taking these parasitic factors into account? Is this retrieval a sine qua non for extracting reliable information from the images, given the issues specific to the various application domains (land mapping, environmental monitoring, landscape change tracking, resource inventories, etc.)? Research carried out over the past 30 years has produced a series of techniques for correcting the data for the effects of parasitic factors, some of which make it possible to retrieve ground reflectances. Several questions nevertheless remain open, and others require further study in order, on the one hand, to improve the accuracy of the results and, on the other, to make these techniques more versatile by adapting them to a wider range of data acquisition conditions.
We can mention a few: - How can atmospheric characteristics (notably aerosol particles) adapted to local and regional conditions be taken into account, rather than relying on default models that indicate long-term spatiotemporal trends but fit poorly with instantaneous, spatially restricted observations? - How can the "contamination" of the signal from the object targeted by the sensor by signals from surrounding objects (the adjacency effect) be accounted for? This phenomenon becomes very important for images with a resolution finer than 5 m. - What are the effects of off-nadir sensor viewing angles, which are increasingly common because they offer better temporal resolution and the possibility of obtaining stereoscopic image pairs? - How can the effectiveness of automatic processing and analysis techniques for multispectral images be increased for rugged and mountainous terrain, taking into account the multiple effects of topographic relief on the remotely sensed signal? Moreover, despite numerous demonstrations by researchers that the information extracted from satellite images can be altered by all these parasitic factors, it must be acknowledged that radiometric corrections remain little used on a routine basis, unlike geometric corrections. For the latter, commercial remote sensing software offers versatile, powerful algorithms that are accessible to users. Radiometric correction algorithms, when they are offered, remain inflexible black boxes that most of the time require expert users. The objectives we set for this research are as follows: 1) Develop software for retrieving ground reflectances that takes the questions raised above into account.
This software had to be sufficiently modular that it could be enhanced, improved and adapted to various satellite image application problems; and 2) Apply this software in different contexts (urban, agricultural, forest) and analyse the results obtained in order to assess the gain in accuracy of the information extracted from satellite images converted into ground reflectance images, and consequently the need to proceed in this way regardless of the application. Through this research, we thus produced a ground reflectance retrieval tool (the new version of the REFLECT software). This software is based on the formulation (and routines) of the 6S code (Seconde Simulation du Signal Satellitaire dans le Spectre Solaire) and on the dark-target method for estimating aerosol optical depth (AOD), which is the most difficult factor to correct. Substantial improvements were made to existing models. These improvements mainly concern aerosol properties (integration of a more recent model, improved search for dark targets for AOD estimation), accounting for the adjacency effect using a specular reflection model, support for most of the high-resolution multispectral sensors (Landsat TM and ETM+, all the HR sensors of SPOT 1 to 5, EO-1 ALI and ASTER) and very-high-resolution sensors (QuickBird and Ikonos) currently in use, and correction of topographic effects using a model that separates the direct and diffuse components of solar radiation and also adapts to the forest canopy. Validation work showed that REFLECT retrieves ground reflectance with an accuracy on the order of ±0.01 reflectance units (for the visible, NIR and MIR spectral bands), even for a surface with variable topography.
Through simulations of apparent reflectances, this software showed the extent to which the parasitic factors influencing the digital numbers of images can modify the useful signal, which is the ground reflectance (errors of 10 to more than 50%). REFLECT was also used to assess the importance of using ground reflectances rather than raw digital numbers for various common remote sensing applications in classification, change monitoring, agriculture and forestry. In most applications (change monitoring with multi-date images, use of vegetation indices, estimation of biophysical parameters, ...), image correction is a crucial operation for obtaining reliable results. From a computing standpoint, REFLECT takes the form of a series of easy-to-use menus corresponding to the different steps: entering the scene inputs, computing gaseous transmittances, estimating AOD by the dark-target method and, finally, applying radiometric corrections to the image, notably via the fast option, which processes a 5000 by 5000 pixel image in about 15 minutes. This research opens up a series of avenues for further improvements to models and methods in the field of radiometric corrections, notably the integration of the BRDF (bidirectional reflectance distribution function) into the formulation, the handling of translucent clouds by modelling non-selective scattering, and the automation of the equivalent-slopes method proposed for topographic corrections.
Abstract:
The World Organisation for Animal Health (OIE) is the international institution responsible for establishing the sanitary measures associated with trade in live animals. Zoning is a control method recommended by the OIE for certain infectious diseases, including avian influenza. Outbreaks of avian influenza have been extremely costly for the poultry industry worldwide. To assess the feasibility of using this approach in Ontario, data on poultry production sites were provided by the province's poultry producer federations. Information on the industries associated with poultry production, namely feed mills, slaughterhouses, hatcheries and egg grading stations, was obtained from several sources, including poultry industry representatives. Flow diagrams were created to understand the interactions between production sites and their associated industries. These industries constituted the basic elements required for zoning. This analysis made it possible to create a database of production inputs and outputs for each poultry farm site and for the production sites of the industries associated with poultry farming. Using ArcGIS software, this information was merged with Statistics Canada geospatial data for Ontario and Quebec. The resulting database was used to carry out the zoning trials. Seventy-two trials were conducted. Four were retained because they minimized industry production losses to a similar degree. These trials show that the method used to study zoning can reveal production deficits and surpluses in the commercial poultry industry in Ontario.
These can serve as a starting point for discussions among poultry industry stakeholders, given that cooperation and communication are essential to the success of zoning.