978 results for text message analysis and question-answering system
Abstract:
Information about the population genetic structures of parasites is important for an understanding of parasite transmission pathways and ultimately the co-evolution with their hosts. If parasites cannot disperse independently of their hosts, a parasite's population structure will depend upon the host's spatial distribution. Geographical barriers affecting host dispersal can therefore lead to structured parasite populations. However, how the host's social system affects the genetic structure of parasite populations is largely unknown. We used mitochondrial DNA (mtDNA) to describe the spatio-temporal population structure of a contact-transmitted parasitic wing mite (Spinturnix bechsteini) and compared it to that of its social host, the Bechstein's bat (Myotis bechsteinii). We observed no genetic differentiation between mites living on different bats within a colony. This suggests that mites can move freely among bats of the same colony. As expected in the case of restricted inter-colony dispersal, we observed a strong genetic differentiation of mites among demographically isolated bat colonies. In contrast, we found a strong genetic turnover between years when we investigated the temporal variation of mite haplotypes within colonies. This can be explained by mite dispersal occurring between colonies and by bottlenecks of mite populations within colonies. The observed absence of isolation by distance could result from genetic drift and/or from mites dispersing even between remote bat colonies, whose members may meet at mating sites in autumn or in hibernacula in winter. Our data show that the population structure of this parasitic wing mite is influenced by its own demography and the peculiar social system of its bat host.
Abstract:
The present research reviews the analysis and modeling of Swiss franc interest rate curves (IRC) using unsupervised (SOM, Gaussian mixtures) and supervised (MLP) machine learning algorithms. IRC are considered as objects embedded in different feature spaces: maturities, maturity-date, and the parameters of the Nelson-Siegel model (NSM). Analysis of the NSM parameters and their temporal and clustering structures helps to understand the relevance of the model and its potential use for forecasting. A mapping of IRC in the maturity-date feature space is presented and analyzed for visualization and forecasting purposes.
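For readers unfamiliar with the Nelson-Siegel model mentioned above, the following is a minimal sketch of its standard yield formula. The parameter values and maturities are purely illustrative assumptions and are not taken from the study.

```python
import math

def nelson_siegel(tau, beta0, beta1, beta2, lam):
    """Standard Nelson-Siegel yield for maturity tau > 0 with decay lam > 0.

    beta0 is the long-term level, beta1 the short-term component and
    beta2 the medium-term (curvature) component.
    """
    x = tau / lam
    # (1 - exp(-x)) / x, computed with expm1 for accuracy at small x
    loading = -math.expm1(-x) / x
    return beta0 + beta1 * loading + beta2 * (loading - math.exp(-x))

# Illustrative (assumed) parameters and maturities in years
maturities = [0.25, 1, 2, 5, 10, 30]
curve = [nelson_siegel(t, beta0=2.0, beta1=-1.5, beta2=0.5, lam=2.0)
         for t in maturities]
```

Each observed curve can then be summarised by its four fitted parameters, which is one way a curve becomes a point in the NSM feature space described in the abstract.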
Abstract:
The paper presents the Multiple Kernel Learning (MKL) approach as a modelling and exploratory data analysis tool and applies it to the problem of wind speed mapping. Support Vector Regression (SVR) is used to predict spatial variations of the mean wind speed from terrain features (slopes, terrain curvature, directional derivatives) generated at different spatial scales. Multiple Kernel Learning is applied to learn kernels for individual features and thematic feature subsets, both in the context of feature selection and optimal parameter determination. An empirical study on real-life data confirms the usefulness of MKL as a tool that enhances the interpretability of data-driven models.
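The core idea of MKL, a weighted combination of one kernel per feature or feature subset, can be sketched as follows. This is only an illustration under stated assumptions: the feature values, bandwidths and weights are hypothetical, and the weights are fixed by hand here, whereas an MKL algorithm would learn them together with the regression model.

```python
import math

def rbf_kernel_1d(values, bandwidth):
    """Gaussian (RBF) kernel matrix for a single scalar feature."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2)) for b in values]
            for a in values]

def combine_kernels(kernels, weights):
    """Convex combination of kernel matrices: K = sum_m d_m * K_m."""
    total = sum(weights)
    norm = [w / total for w in weights]  # rescale so the weights sum to 1
    n = len(kernels[0])
    return [[sum(d * K[i][j] for d, K in zip(norm, kernels)) for j in range(n)]
            for i in range(n)]

# Hypothetical terrain features at four locations
slope = [0.10, 0.35, 0.20, 0.60]
curvature = [-0.02, 0.01, 0.03, -0.05]

K_slope = rbf_kernel_1d(slope, bandwidth=0.2)
K_curv = rbf_kernel_1d(curvature, bandwidth=0.03)

# Assumed weights; a large weight marks a feature as relevant
K = combine_kernels([K_slope, K_curv], weights=[0.7, 0.3])
```

The combined matrix can be passed to any kernel method that accepts a precomputed kernel, and the learned weights are what make the model interpretable: a weight near zero indicates that the corresponding feature contributes little.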
Abstract:
The present research deals with an important public health threat: the pollution created by radon gas accumulation inside dwellings. The spatial modeling of indoor radon in Switzerland is particularly complex and challenging because of the many influencing factors that should be taken into account. Indoor radon data analysis must be addressed from both a statistical and a spatial point of view. As this is a multivariate process, it was important first to define the influence of each factor. In particular, it was important to define the influence of geology, which is closely associated with indoor radon. This association was indeed observed for the Swiss data but was not proved to be the sole determinant for the spatial modeling. The statistical analysis of data, at both the univariate and multivariate level, was followed by an exploratory spatial analysis. Many tools proposed in the literature were tested and adapted, including fractality, declustering and moving windows methods. The use of the Quantité Morisita Index (QMI) as a procedure to evaluate data clustering as a function of the radon level was proposed. The existing methods of declustering were revised and applied in an attempt to approach the global histogram parameters. The exploratory phase comes along with the definition of multiple scales of interest for indoor radon mapping in Switzerland. The analysis was done with a top-down resolution approach, from regional to local levels, in order to find the appropriate scales for modeling. In this sense, data partition was optimized in order to cope with the stationarity conditions of geostatistical models. Common methods of spatial modeling such as K Nearest Neighbors (KNN), variography and General Regression Neural Networks (GRNN) were proposed as exploratory tools. In the following section, different spatial interpolation methods were applied to a particular dataset. 
A bottom-up approach to method complexity was adopted and the results were analyzed together in order to find common definitions of continuity and neighborhood parameters. Additionally, a data filter based on cross-validation (the CVMF) was tested with the purpose of reducing noise at the local scale. At the end of the chapter, a series of tests for data consistency and method robustness was performed. This led to conclusions about the importance of data splitting and the limitations of generalization methods for reproducing statistical distributions. The last section was dedicated to modeling methods with probabilistic interpretations. Data transformation and simulations thus allowed the use of multigaussian models and helped take the uncertainty of the indoor radon pollution data into consideration. The categorization transform was presented as a solution for modeling extreme values through classification. Simulation scenarios were proposed, including an alternative proposal for the reproduction of the global histogram based on the sampling domain. Sequential Gaussian simulation (SGS) was presented as the method giving the most complete information, while classification performed in a more robust way. An error measure was defined in relation to the decision function for data classification hardening. Among the classification methods, probabilistic neural networks (PNN) proved to be better adapted for modeling high-threshold categorization and for automation. Support vector machines (SVM), by contrast, performed well under balanced category conditions. In general, it was concluded that no particular prediction or estimation method is better under all conditions of scale and neighborhood definitions. Simulations should be the basis, while other methods can provide complementary information to support efficient decision making on indoor radon.
Abstract:
Accurate prediction of transcription factor binding sites is needed to unravel the function and regulation of genes discovered in genome sequencing projects. To evaluate current computer prediction tools, we have begun a systematic study of the sequence-specific DNA-binding of a transcription factor belonging to the CTF/NFI family. Using a systematic collection of rationally designed oligonucleotides combined with an in vitro DNA binding assay, we found that the sequence specificity of this protein cannot be represented by a simple consensus sequence or weight matrix. For instance, CTF/NFI uses a flexible DNA binding mode that allows for variations of the binding site length. From the experimental data, we derived a novel prediction method using a generalised profile as a binding site predictor. Experimental evaluation of the generalised profile indicated that it accurately predicts the binding affinity of the transcription factor to natural or synthetic DNA sequences. Furthermore, the in vitro measured binding affinities of a subset of oligonucleotides were found to correlate with their transcriptional activities in transfected cells. The combined computational-experimental approach exemplified in this work thus resulted in an accurate prediction method for CTF/NFI binding sites potentially functioning as regulatory regions in vivo.
Abstract:
sublattices ferrimagnet Cu2OSeO3 with a cubic symmetry and a linear magnetoelectric effect. There is no spectroscopic evidence for structural lattice distortions below T_C = 60 K, which are expected due to magnetoelectric coupling. Using symmetry arguments we explain this observation by considering a special type of ferrimagnetic ground state which does not generate a spontaneous electric polarization. Interestingly, Raman scattering reveals a strong increase in the electric polarization of the medium through a dynamic magnetoelectric effect, seen as a remarkable enhancement of the scattering intensity below T_C. New lines of purely magnetic origin have been detected in the magnetically ordered state. Some of them are attributed to scattering on exchange magnons. Using this observation and further symmetry considerations we argue for a strong Dzyaloshinskii-Moriya interaction in Cu2OSeO3. (c) 2010 American Institute of Physics. [doi:10.1063/1.3455808]
Abstract:
In recent years the Iowa DOT has shifted emphasis from the construction of new roads to the maintenance and preservation of existing highways. A need has developed for analyzing pavements structurally to select the correct rehabilitation strategy and to properly design a pavement overlay if necessary. This need has been fulfilled by Road Rater testing which has been used successfully on all types of pavements to evaluate pavement and subgrade conditions and to design asphaltic concrete overlays. The Iowa Road Rater Design Method has been simplified so that it may be easily understood and used by the widely diverse groups of individuals which may be involved in pavement restoration and management. Road Rater analysis techniques have worked well to date and have been verified by pavement coring, soils sampling and testing, and pavement removal by block sampling. Void detection testing has also been performed experimentally in Iowa, and results indicate that the Road Rater can be used to locate pavement voids and that Road Rater analysis techniques are reasonably accurate. The success of Road Rater research and development has made deflection test data one of the most important pavement management inputs.
Abstract:
It is estimated that around 230 people die each year due to radon (222Rn) exposure in Switzerland. 222Rn occurs mainly in closed environments like buildings and originates primarily from the subjacent ground. Therefore it depends strongly on geology and shows substantial regional variations. Correct identification of these regional variations would lead to substantial reduction of 222Rn exposure of the population based on appropriate construction of new and mitigation of already existing buildings. Prediction of indoor 222Rn concentrations (IRC) and identification of 222Rn prone areas is however difficult since IRC depend on a variety of different variables like building characteristics, meteorology, geology and anthropogenic factors. The present work aims at the development of predictive models and the understanding of IRC in Switzerland, taking into account a maximum of information in order to minimize the prediction uncertainty. The predictive maps will be used as a decision-support tool for 222Rn risk management. The construction of these models is based on different data-driven statistical methods, in combination with geographical information systems (GIS). In a first phase we performed univariate analysis of IRC for different variables, namely the detector type, building category, foundation, year of construction, the average outdoor temperature during measurement, altitude and lithology. All variables showed significant associations to IRC. Buildings constructed after 1900 showed significantly lower IRC compared to earlier constructions. We observed a further drop of IRC after 1970. In addition to that, we found an association of IRC with altitude. With regard to lithology, we observed the lowest IRC in sedimentary rocks (excluding carbonates) and sediments and the highest IRC in the Jura carbonates and igneous rock. The IRC data was systematically analyzed for potential bias due to spatially unbalanced sampling of measurements. 
In order to facilitate the modeling and the interpretation of the influence of geology on IRC, we developed an algorithm based on k-medoids clustering which makes it possible to define geological classes that are coherent in terms of IRC. We performed a soil gas 222Rn concentration (SRC) measurement campaign in order to determine the predictive power of SRC with respect to IRC. We found that the use of SRC is limited for IRC prediction. The second part of the project was dedicated to predictive mapping of IRC using models which take into account the multidimensionality of the process of 222Rn entry into buildings. We used kernel regression and ensemble regression trees for this purpose. We could explain up to 33% of the variance of the log-transformed IRC all over Switzerland. This is a good performance compared to former attempts at IRC modeling in Switzerland. As predictor variables we considered geographical coordinates, altitude, outdoor temperature, building type, foundation, year of construction and detector type. Ensemble regression trees like random forests make it possible to determine the role of each IRC predictor in a multidimensional setting. We found spatial information like geology, altitude and coordinates to have a stronger influence on IRC than building-related variables like foundation type, building type and year of construction. Based on kernel estimation we developed an approach to determine the local probability of IRC exceeding 300 Bq/m3. In addition, we developed a confidence index in order to provide an estimate of the uncertainty of the map. All methods allow an easy creation of tailor-made maps for different building characteristics. Our work is an essential step towards a 222Rn risk assessment which accounts at the same time for different architectural situations as well as geological and geographical conditions. For the communication of 222Rn hazard to the population we recommend using the probability map based on kernel estimation. 
The communication of 222Rn hazard could for example be implemented via a web interface where users specify the characteristics and coordinates of their home in order to obtain the probability of exceeding a given IRC, together with a corresponding index of confidence. Taking into account the health effects of 222Rn, our results have the potential to substantially improve the estimation of the effective dose from 222Rn delivered to the Swiss population.
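The idea of a kernel-based local exceedance probability can be illustrated with a minimal sketch. It assumes a Gaussian kernel over two planar coordinates and a Nadaraya-Watson style weighted average of an exceedance indicator; the function name, bandwidth and sample values are hypothetical, and the estimator used in the work itself (which also involves building covariates and a confidence index) may differ.

```python
import math

def exceedance_probability(query, samples, threshold=300.0, bandwidth=1.0):
    """Kernel-weighted share of neighbouring measurements above `threshold`.

    query   : (x, y) location of interest
    samples : iterable of (x, y, concentration in Bq/m3)
    """
    num = 0.0
    den = 0.0
    for x, y, conc in samples:
        d2 = (x - query[0]) ** 2 + (y - query[1]) ** 2
        w = math.exp(-d2 / (2.0 * bandwidth ** 2))  # Gaussian kernel weight
        num += w * (1.0 if conc > threshold else 0.0)
        den += w
    # No usable neighbours (all weights underflow): probability is undefined
    return num / den if den > 0.0 else float("nan")

# Hypothetical measurements: (x in km, y in km, indoor concentration)
measurements = [(0.0, 0.0, 450.0), (1.0, 0.0, 120.0),
                (0.0, 1.0, 80.0), (5.0, 5.0, 600.0)]
p_home = exceedance_probability((0.0, 0.0), measurements, bandwidth=1.0)
```

The bandwidth controls how local the estimate is: a small value relies on the nearest measurements only, while a large value approaches the overall share of measurements above the threshold.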
Abstract:
A headspace solid-phase microextraction (HS-SPME) procedure was developed for the profiling of traces present in 3,4-methylenedioxymethamphetamine (MDMA). Traces were first extracted using HS-SPME and then analyzed by gas chromatography-mass spectrometry (GC-MS). The HS-SPME conditions were optimized by varying the extraction parameters. Optimal results were obtained when 40 mg of crushed MDMA sample was heated at 80 °C for 15 min, followed by extraction at 80 °C for 15 min with a polydimethylsiloxane/divinylbenzene coated fibre. A total of 31 compounds were identified as traces related to MDMA synthesis, namely precursors, intermediates or by-products. In addition, some fatty acids used as tabletting materials, and caffeine used as an adulterant, were also detected. The use of a restricted set of 10 target compounds was also proposed for developing a screening tool for clustering samples with similar profiles. 114 seizures were analyzed using an SPME auto-sampler (MultiPurpose Sampler MPS2), purchased from Gerstel GmbH & Co. (Germany), and coupled to GC-MS. The data were handled using various pre-treatment methods, followed by the study of similarities between sample pairs based on the Pearson correlation. The results show that HS-SPME, coupled with a suitable statistical method, is a powerful tool for distinguishing specimens coming from the same seizure from specimens coming from different seizures. This information can be used by law enforcement personnel to visualize the ecstasy distribution network as well as clandestine tablet manufacturing.
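The comparison of sample pairs by Pearson correlation can be sketched as follows. The peak areas are invented for illustration, and the pre-treatment shown (normalisation to the total area followed by a square root) is only one plausible choice, assumed here because the abstract does not specify which pre-treatments were used.

```python
import math

def pretreat(peak_areas):
    """Assumed pre-treatment: normalise to the total area, then take the square root."""
    total = sum(peak_areas)
    return [math.sqrt(a / total) for a in peak_areas]

def pearson(a, b):
    """Pearson correlation coefficient between two equally long profiles."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical peak areas for 10 target compounds in three specimens
specimen_a = [10, 50, 5, 80, 20, 0, 15, 60, 30, 40]
specimen_b = [12, 48, 6, 75, 22, 1, 14, 65, 28, 38]   # profile close to specimen_a
specimen_c = [70, 5, 40, 10, 60, 55, 3, 8, 45, 2]     # clearly different profile

sim_ab = pearson(pretreat(specimen_a), pretreat(specimen_b))
sim_ac = pearson(pretreat(specimen_a), pretreat(specimen_c))
```

A decision threshold on the correlation would then separate linked from unlinked pairs; choosing that threshold requires reference data from known same-seizure and different-seizure pairs.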
Abstract:
A raga is a collective melodic expression consisting of motifs. A raga can be identified using motifs which are unique to it. Motifs can be thought of as signature prosodic phrases. Different ragas may be composed of the same set of notes, or even phrases, but the prosody may be completely different. In this paper, an attempt is made to determine the characteristic motifs that enable identification of a raga and distinguish it from others. To this end, motifs are first manually marked by a professional musician for a set of five popular ragas. The motifs are then normalised with respect to the tonic. HMMs are trained for each motif using 80% of the data, and the remaining 20% or so is used for testing. The results indicate that about 80% of the motifs are accurately identified as belonging to a specific raga.
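Normalisation with respect to the tonic can be sketched as a conversion of a pitch contour to cents relative to the performer's tonic, which makes motifs sung at different tonics comparable. The cent scale is an assumption here, since the abstract does not say how the normalisation was done, and the pitch values are hypothetical.

```python
import math

def to_cents(pitch_hz, tonic_hz):
    """Express each pitch as cents above (or below) the tonic.

    1200 cents correspond to one octave, so the tonic itself maps to 0.
    """
    return [1200.0 * math.log2(f / tonic_hz) for f in pitch_hz]

# Hypothetical pitch contour of one motif (Hz) sung at a tonic of 220 Hz
motif_hz = [220.0, 247.5, 293.3, 330.0, 440.0]
motif_cents = to_cents(motif_hz, tonic_hz=220.0)
```

The resulting tonic-independent sequences are the kind of input that per-motif HMMs could be trained on.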
Abstract:
The United Nations Office on Drugs and Crime, in its 2012 "Global Report on Trafficking in Persons", states that "sexual exploitation is by far the most frequently detected form of trafficking in persons", accounting for 79% of cases. Link to the full report: http://www.unodc.org/documents/data-and-analysis/glotip/Trafficking_in_Persons_2012_web.pdf
Abstract:
ABSTRACT: BACKGROUND: Decision curve analysis has been introduced as a method to evaluate prediction models in terms of their clinical consequences if used for a binary classification of subjects into a group who should and a group who should not be treated. The key concept for this type of evaluation is the "net benefit", a concept borrowed from utility theory. METHODS: We recall the foundations of decision curve analysis and discuss some new aspects. First, we stress the formal distinction between the net benefit for the treated and for the untreated and define the concept of the "overall net benefit". Next, we revisit the important distinction between the concept of accuracy, as typically assessed using the Youden index and a receiver operating characteristic (ROC) analysis, and the concept of utility of a prediction model, as assessed using decision curve analysis. Finally, we provide an explicit implementation of decision curve analysis to be applied in the context of case-control studies. RESULTS: We show that the overall net benefit, which combines the net benefit for the treated and the untreated, is a natural alternative to the benefit achieved by a model, being invariant with respect to the coding of the outcome, and conveying a more comprehensive picture of the situation. Further, within the framework of decision curve analysis, we illustrate the important difference between the accuracy and the utility of a model, demonstrating how poor an accurate model may be in terms of its net benefit. Finally, we show that decision curve analysis can be applied to case-control studies, where an accurate estimate of the true prevalence of a disease cannot be obtained from the data, with a few modifications to the original calculation procedure. CONCLUSIONS: We present several interrelated extensions to decision curve analysis that will both facilitate its interpretation and broaden its potential area of application.
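As background, the standard net benefit for the treated at a given threshold probability can be computed as in the sketch below. It covers only the usual formula from decision curve analysis, not the overall net benefit or the case-control adaptation proposed in the paper, and the outcomes and predicted risks are invented for illustration.

```python
def net_benefit(y_true, risk, threshold):
    """Net benefit for the treated at threshold probability `threshold`.

    Subjects with predicted risk >= threshold are classified as "treat".
    NB = TP/n - FP/n * threshold / (1 - threshold), with 0 < threshold < 1.
    """
    n = len(y_true)
    tp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 1)
    fp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)

# Hypothetical outcomes (1 = diseased) and predicted risks
outcomes = [1, 1, 0, 0]
risks = [0.9, 0.8, 0.7, 0.1]

nb_model = net_benefit(outcomes, risks, threshold=0.5)
# "Treat all" reference strategy: everyone is classified as positive
nb_treat_all = net_benefit(outcomes, [1.0] * len(outcomes), threshold=0.5)
```

A decision curve is obtained by evaluating this quantity over a range of thresholds and comparing the model with the "treat all" and "treat none" (net benefit 0) strategies.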
Abstract:
Indoleamine 2,3-dioxygenase 1 (IDO1) is a key regulator of immune responses and therefore an important therapeutic target for the treatment of diseases that involve pathological immune escape, such as cancer. Here, we describe a robust and sensitive high-throughput screen (HTS) for IDO1 inhibitors using the Prestwick Chemical Library of 1200 FDA-approved drugs and the Maybridge HitFinder Collection of 14,000 small molecules. Of the 60 hits selected for follow-up studies, 14 displayed IC50 values below 20 μM under the secondary assay conditions, and 4 showed an activity in cellular tests. In view of the high attrition rate we used both experimental and computational techniques to identify and to characterize compounds inhibiting IDO1 through unspecific inhibition mechanisms such as chemical reactivity, redox cycling, or aggregation. One specific IDO1 inhibitor scaffold, the imidazole antifungal agents, was chosen for rational structure-based lead optimization, which led to more soluble and smaller compounds with micromolar activity.
Abstract:
Alzheimer's disease is the most prevalent form of progressive degenerative dementia and has a high socio-economic impact in Western countries. It is therefore one of the most active research areas today. Alzheimer's is sometimes diagnosed by excluding other dementias, and definitive confirmation is only obtained through a post-mortem study of the patient's brain tissue. The work presented here is part of a larger study that aims to identify novel technologies and biomarkers for early Alzheimer's disease detection, and it focuses on evaluating the suitability of a new approach for early diagnosis of Alzheimer’s disease by non-invasive methods. The purpose is to examine, in a pilot study, the potential of applying Machine Learning algorithms to speech features obtained from suspected Alzheimer's sufferers in order to help diagnose this disease and determine its degree of severity. Two human capabilities relevant to communication have been analyzed for feature selection: Spontaneous Speech and Emotional Response. The experimental results obtained were very satisfactory and promising for the early diagnosis and classification of Alzheimer’s disease patients.