947 results for Data pre-processing


Relevance:

90.00%

Publisher:

Abstract:

1. The ecological niche is a fundamental biological concept. Modelling species' niches is central to numerous ecological applications, including predicting species invasions, identifying reservoirs for disease, nature reserve design and forecasting the effects of anthropogenic and natural climate change on species' ranges.
2. A computational analogue of Hutchinson's ecological niche concept (the multidimensional hyperspace of species' environmental requirements) is the support of the distribution of environments in which the species persist. Recently developed machine-learning algorithms can estimate the support of such high-dimensional distributions. We show how support vector machines can be used to map ecological niches using only observations of species presence to train distribution models for 106 species of woody plants and trees in a montane environment using up to nine environmental covariates.
3. We compared the accuracy of three methods that differ in their approaches to reducing model complexity. We tested models with independent observations of both species presence and species absence. We found that the simplest procedure, which uses all available variables and no pre-processing to reduce correlation, was best overall. Ecological niche models based on support vector machines are theoretically superior to models that rely on simulating pseudo-absence data and are comparable in empirical tests.
4. Synthesis and applications. Accurate species distribution models are crucial for effective environmental planning, management and conservation, and for unravelling the role of the environment in human health and welfare. Models based on distribution estimation rather than classification overcome theoretical and practical obstacles that pervade species distribution modelling. In particular, ecological niche models based on machine-learning algorithms for estimating the support of a statistical distribution provide a promising new approach to identifying species' potential distributions and to project changes in these distributions as a result of climate change, land use and landscape alteration.

Relevance:

90.00%

Publisher:

Abstract:

Metacaspases are cysteine peptidases that could play a role similar to caspases in the cell death programme of plants, fungi and protozoa. The human protozoan parasite Leishmania major expresses a single metacaspase (LmjMCA) harbouring a central domain with the catalytic dyad histidine and cysteine as found in caspases. In this study, we investigated the processing sites important for the maturation of the LmjMCA catalytic domain, the cellular localization of LmjMCA polypeptides, and the functional role of the catalytic domain in the cell death pathway of Leishmania parasites. Although the LmjMCA polypeptide precursor form harbours a functional mitochondrial localization signal (MLS), we determined that LmjMCA polypeptides are mainly localized in the cytoplasm. Under stress conditions, LmjMCA precursor forms were extensively processed into soluble forms containing the catalytic domain. This domain was sufficient to enhance the sensitivity of parasites to hydrogen peroxide by impairing the mitochondrion. These data provide experimental evidence of the importance of LmjMCA processing into an active catalytic domain and of its role in disrupting mitochondria, which could be relevant in the design of new drugs to fight leishmaniasis and likely other protozoan parasitic diseases.

Relevance:

90.00%

Publisher:

Abstract:

The increasing interest aroused by more advanced forecasting techniques, together with the requirement for more accurate forecasts of tourism demand at the destination level due to the constant growth of world tourism, has led us to evaluate the forecasting performance of neural modelling relative to that of time series methods at a regional level. Seasonality and volatility are important features of tourism data, which makes it a particularly favourable context in which to compare the forecasting performance of linear models to that of nonlinear alternative approaches. Pre-processed official statistical data of overnight stays and tourist arrivals from all the different countries of origin to Catalonia from 2001 to 2009 are used in the study. When comparing the forecasting accuracy of the different techniques for different time horizons, autoregressive integrated moving average models outperform self-exciting threshold autoregressions and artificial neural network models, especially for shorter horizons. These results suggest that there is a trade-off between the degree of pre-processing and the accuracy of the forecasts obtained with neural networks, which are more suitable in the presence of nonlinearity in the data. In spite of the significant differences between countries, which can be explained by different patterns of consumer behaviour, we also find that forecasts of tourist arrivals are more accurate than forecasts of overnight stays.
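As a rough illustration of the kind of pre-processing and accuracy comparison the abstract refers to, the sketch below removes seasonality by seasonal differencing and scores a seasonal naive baseline with the mean absolute percentage error (MAPE). The abstract does not name its accuracy measure or its baseline, so MAPE, the seasonal naive forecast, the function names and the monthly figures are all assumptions made for illustration only.

```python
def seasonal_difference(series, period=12):
    """Subtract the value observed one season earlier (a common pre-processing step)."""
    return [series[i] - series[i - period] for i in range(period, len(series))]


def seasonal_naive_forecast(history, horizon, period=12):
    """Forecast each future step with the value observed in the last full season."""
    start = len(history) - period
    return [history[start + (h % period)] for h in range(horizon)]


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (assumes no zero in `actual`)."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical monthly overnight stays (thousands): one seasonal profile plus a yearly step.
base = [50, 55, 70, 90, 120, 160, 210, 220, 150, 100, 60, 52]
series = [value + 2 * year for year in range(3) for value in base]

train, test = series[:24], series[24:]
forecast = seasonal_naive_forecast(train, horizon=12)
error = mape(test, forecast)          # accuracy of the baseline on the held-out year
diffed = seasonal_difference(series)  # the seasonal profile is removed, the yearly step remains
```

Any candidate model (ARIMA, threshold autoregression or a neural network) could be scored with the same `mape` function per forecast horizon, which is what makes a like-for-like comparison possible.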

Relevance:

90.00%

Publisher:

Abstract:

As part of the Universal Converter (UNICON) project, an embedded system based on a digital signal processor (DSP) was designed for the control and measurement of electric motor drives. To ensure sufficient computing power, a multiprocessor system was chosen. The DSP chip for the processor system was selected on the basis of the processing power and multiprocessor support offered by the candidate chips. Analog Devices' SHARC-series DSPs best met the requirements: in addition to an efficient instruction set, they offer a large internal memory and built-in multiprocessor support. Because the system is by nature a measuring instrument, a key design goal was to create fast data links from the measurement sensors to the DSP system. This was implemented using programmable FPGA logic circuits to receive and pre-process the digital measurement data. The data link to a PC was implemented using a dedicated interface card between the DSP system and the PC. The main task of the interface card is to buffer the data being transferred. This arrangement prevents the PC from affecting the operation of the DSP system, so that real-time operation of the system can be guaranteed under all conditions.

Relevance:

90.00%

Publisher:

Abstract:

In this thesis we study the field of opinion mining by giving a comprehensive review of the available research on the topic. Using this knowledge, we also present a case study of a multilevel opinion mining system for a student organization's sales management system. We describe the field of opinion mining by discussing its historical roots, its motivations and applications, and the different scientific approaches that have been used to solve the challenging problem of mining opinions. To deal with this huge subfield of natural language processing, we first give an abstraction of the problem of opinion mining and describe the theoretical frameworks available for dealing with appraisal language. We then discuss the relation between opinion mining and computational linguistics, which is a crucial pre-processing step for the accuracy of the subsequent steps of opinion mining. The second part of the thesis deals with the semantics of opinions: we describe the different ways used to collect lists of opinion words, as well as the methods and techniques available for extracting knowledge from opinions present in unstructured textual data. In the part about collecting lists of opinion words, we describe manual, semi-manual and automatic approaches and review the available lists that are used as gold standards in opinion mining research. For the methods and techniques of opinion mining, we divide the task into three levels: the document, sentence and feature levels. The techniques presented at the document and sentence levels are divided into supervised and unsupervised approaches, which are used to determine the subjectivity and polarity of texts and sentences at these levels of analysis. At the feature level we describe the techniques available for finding the opinion targets, the polarity of the opinions about these targets, and the opinion holders. Also at the feature level, we discuss the various ways to summarize and visualize the results of this level of analysis. In the third part of the thesis we present a case study of a sales management system that uses free-form text and that can benefit from an opinion mining system. Using the knowledge gathered in the review of this field, we propose a theoretical multilevel opinion mining system (MLOM) that can perform most of the tasks needed from an opinion mining system. Based on previous research, we give some indication that the system can support many of the laborious market research tasks done by the sales force that uses this sales management system, improving the sales force's insight into its partners and thereby the quality of its sales services and its overall results.

Relevance:

90.00%

Publisher:

Abstract:

This work investigates the performance of recent feature-based matching techniques when applied to the registration of underwater images. Matching methods are tested against different contrast-enhancing pre-processing of the images. Based on experiments covering the various underwater artifacts that dominate the images and the deformations present, the best-performing pre-processing, detection and description methods are proposed.

Relevance:

90.00%

Publisher:

Abstract:

The influence of pre-processing of arabica coffee beans on the composition of volatile precursors including sugars, chlorogenic acids, phenolics, proteins, amino acids, trigonelline and fatty acids was assessed and correlated with volatiles formed during roasting. Reducing sugars and free amino acids were highest for natural coffees, whereas total sugars, chlorogenic acids and trigonelline were highest for washed coffees. The highest correlation was observed for total phenolics and volatile phenolics (R = 0.999). Experimental data were evaluated by Principal Components Analysis and results showed that washed coffees formed a distinct group in relation to semi-washed and natural coffees.
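The correlation reported above is presumably a Pearson coefficient, although the abstract does not say so. Under that assumption, a minimal sketch of how such a coefficient is computed is given below; the function name and the three pairs of measurements are hypothetical and are not taken from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Hypothetical values for three pre-processing treatments (natural, semi-washed, washed).
total_phenolics = [4.1, 4.6, 5.2]        # precursor content in green beans, arbitrary units
volatile_phenolics = [10.0, 12.1, 14.3]  # volatiles formed during roasting, arbitrary units

r = pearson_r(total_phenolics, volatile_phenolics)
```

With only three treatments, a coefficient close to 1 is easy to obtain, so a value such as R = 0.999 is best read together with the number of samples behind it.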

Relevance:

90.00%

Publisher:

Abstract:

This work determined the life-cycle greenhouse gas emissions and the production-process energy consumption of biodiesel and biomethane produced for transport use from category 2 animal by-products, based on input data obtained from literature sources. On this basis, a combined process producing both fuels was studied to determine whether such a production method can reduce emissions and improve the energy efficiency of fuel production. The greenhouse gas calculation method is based on the guidance given in Directive 2009/28/EC, and the characterisation of the different greenhouse gas emissions on the IPCC 100-year assessment model. The practical calculation was carried out as a life cycle assessment study as defined in the standards SFS-EN ISO 14040 and 14044. The results, calculated on the basis of the calculation method used and the production technologies selected for review, show that the combined process achieves neither greater emission reductions nor better energy efficiency than the production methods currently in use. The results are, however, very sensitive to variation in the assumptions made and the input data used in the calculation, and to changes in the chosen calculation methods. In all product systems, the largest single factor causing emissions and energy consumption is the production of the heat needed for the sterilisation required in the pre-treatment of category 2 by-products. In the product systems studied, the heat is produced wholly or partly with fossil fuels. Greenhouse gas emissions could be reduced significantly by switching entirely to renewable fuels in heat production. Sterilisation is a treatment required by law, and energy consumption is therefore very difficult to reduce significantly under prevailing conditions.

Relevance:

90.00%

Publisher:

Abstract:

Many studies have analyzed grain storage facility costs; however, few of them have taken into account the factors associated with all pre-processing and storage steps. The objective of this work was to develop a decision support system for determining the costs and utilization fees of grain storage facilities. Data from a CONAB storage facility located in Ponta Grossa - PR, Brazil, were used as input to the system to analyze the facility's specific characteristics, such as the amount of product received and stored throughout the year and the hourly capacity of drying, cleaning, receiving and dispatch. By applying the decision support system, it was observed that the reception and dispatch costs were exponentially reduced as the turnover rate of the storage increased. The cleaning and drying costs increased linearly with initial grain moisture. The storage cost increased exponentially as the occupancy rate of the storage facility decreased.

Relevance:

90.00%

Publisher:

Abstract:

In this thesis, the suitability of different trackers for finger tracking in high-speed videos was studied. Tracked finger trajectories from the videos were post-processed and analysed using various filtering and smoothing methods. The position derivatives of the trajectories, speed and acceleration, were extracted for the purposes of hand motion analysis. Overall, two methods, Kernelized Correlation Filters and Spatio-Temporal Context Learning tracking, performed better than the others in the tests. Both achieved high accuracy for the selected high-speed videos and also allowed real-time processing, being able to process over 500 frames per second. In addition, the results showed that different filtering methods can be applied to produce more appropriate velocity and acceleration curves calculated from the tracking data. Local Regression filtering and the Unscented Kalman Smoother gave the best results in the tests. Furthermore, the results show that tracking and filtering methods are suitable for high-speed hand-tracking and trajectory-data post-processing.
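The post-processing step described above, deriving speed and acceleration from tracked positions, can be sketched with finite differences. The sketch below is only an illustration: it uses a plain centered moving average as the smoother, not the Local Regression filter or Unscented Kalman Smoother evaluated in the thesis, and the function names, frame rate and trajectory are hypothetical.

```python
import math


def moving_average(values, window=5):
    """Centered moving average; the window shrinks near the ends of the sequence."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def derivative(values, dt):
    """Central differences in the interior, one-sided differences at both ends."""
    n = len(values)
    result = [0.0] * n
    for i in range(1, n - 1):
        result[i] = (values[i + 1] - values[i - 1]) / (2 * dt)
    result[0] = (values[1] - values[0]) / dt
    result[-1] = (values[-1] - values[-2]) / dt
    return result


def speed_and_acceleration(xs, ys, fps):
    """Return per-frame speed and acceleration magnitudes for a 2-D trajectory."""
    dt = 1.0 / fps
    vx, vy = derivative(xs, dt), derivative(ys, dt)
    ax, ay = derivative(vx, dt), derivative(vy, dt)
    speed = [math.hypot(a, b) for a, b in zip(vx, vy)]
    accel = [math.hypot(a, b) for a, b in zip(ax, ay)]
    return speed, accel


# Hypothetical fingertip track: 0.6 px of horizontal motion per frame at 500 frames/s.
fps = 500
xs = [0.6 * i for i in range(20)]
ys = [0.0] * 20
speed, accel = speed_and_acceleration(xs, ys, fps)
```

Differentiation amplifies tracking jitter, which is why a smoother such as `moving_average` (or the stronger filters named in the abstract) is normally applied to the positions before the derivatives are taken.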

Relevance:

90.00%

Publisher:

Abstract:

Microarray technology remains to this day an important tool for measuring gene expression. Beyond the technology itself, the analysis of microarray data is a complex statistical problem, which explains the myriad of methods proposed for pre-processing and, in particular, for differential expression analysis. However, the lack of calibration data or of an appropriate comparison methodology has prevented the emergence of a consensus on optimal analysis methods. As a result, the analyst's decision to choose one method over another is most often made subjectively, based for example on ease of use, access to software or popularity. This thesis presents a new approach to the problem of comparing differential expression analysis methods. More than 800 analysis pipelines are applied to more than a hundred experiments on two different Affymetrix platforms. The performance of each pipeline is evaluated by computing the average level of co-regulation by means of enrichment scores for different collections of molecular signatures. The proposed comparative approach thus relies on a varied set of relevant biological data, does not confuse reproducibility with accuracy, and can easily be applied to new methods. Among the methods tested, the superiority of FARMS summarization and of the TREAT differential expression statistic is unequivocal. In addition, the results obtained for the differential expression statistic corroborate the conclusions of other recent studies about the importance of taking into account the magnitude of the change in addition to its statistical significance.

Relevance:

90.00%

Publisher:

Abstract:

Slides describing streaming data, data stream processing systems and stream reasoning. They also include some description of CSPARQL.

Relevance:

90.00%

Publisher:

Abstract:

Deep Brain Stimulator devices are becoming widely used for therapeutic benefits in movement disorders such as Parkinson's disease. Prolonging the battery life span of such devices could dramatically reduce the risks and cumulative costs associated with surgical replacement. This paper demonstrates how an artificial neural network can be trained using pre-processing frequency analysis of deep brain electrode recordings to detect the onset of tremor in Parkinsonian patients. Implementing this solution in an 'intelligent' neurostimulator device would remove the need for the continuous stimulation currently used, and open up the possibility of demand-driven stimulation. Such a methodology could potentially decrease the power consumption of a deep brain pulse generator.
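One plausible form of the frequency-analysis pre-processing mentioned above is to compute the share of signal power that falls in a tremor band and pass it to the classifier as an input feature. The sketch below assumes a 3 to 7 Hz band, a 100 Hz sampling rate and a plain discrete Fourier transform; none of these choices, nor the function names, come from the paper, and the neural network itself is not shown.

```python
import cmath
import math


def band_power(signal, fs, f_lo, f_hi):
    """Sum of squared DFT magnitudes for bins whose frequency lies in [f_lo, f_hi]."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]  # remove the DC offset before the transform
    power = 0.0
    for k in range(1, n // 2 + 1):
        freq = k * fs / n
        if f_lo <= freq <= f_hi:
            coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                        for t, x in enumerate(centered))
            power += abs(coeff) ** 2
    return power


def tremor_band_fraction(signal, fs, band=(3.0, 7.0)):
    """Fraction of total (non-DC) power inside the assumed tremor band."""
    total = band_power(signal, fs, 0.0, fs / 2)
    return band_power(signal, fs, band[0], band[1]) / total if total > 0 else 0.0


# Hypothetical 2-second recordings sampled at 100 Hz.
fs, n = 100, 200
tremor_like = [math.sin(2 * math.pi * 5 * t / fs) for t in range(n)]   # 5 Hz oscillation
background = [math.sin(2 * math.pi * 20 * t / fs) for t in range(n)]   # 20 Hz oscillation

feature_tremor = tremor_band_fraction(tremor_like, fs)
feature_background = tremor_band_fraction(background, fs)
```

A feature of this kind is cheap to compute on a short sliding window, which matters for an implanted device whose power budget is the motivation for demand-driven stimulation in the first place.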

Relevance:

90.00%

Publisher:

Abstract:

This work compares classification results for lactose, mandelic acid and dl-mandelic acid, obtained on the basis of their respective THz transients. The performance of three different pre-processing algorithms applied to the time-domain signatures obtained using a THz-transient spectrometer is contrasted by evaluating the classifier performance. A range of amplitudes of zero-mean white Gaussian noise is used to artificially degrade the signal-to-noise ratio of the time-domain signatures to generate the data sets that are presented to the classifier for both learning and validation purposes. This gradual degradation of interferograms by increasing the noise level is equivalent to performing measurements assuming a reduced integration time. Three signal processing algorithms were adopted for the evaluation of the complex insertion loss function of the samples under study: a) standard evaluation by ratioing the sample with the background spectra, b) a subspace identification algorithm and c) a novel wavelet-packet identification procedure. Within-class and between-class dispersion metrics are adopted for the three data sets. A discrimination metric evaluates how well the three classes can be distinguished within the frequency range 0.1 - 1.0 THz using the above algorithms.
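The noise-degradation procedure described above can be sketched as follows: zero-mean white Gaussian noise of increasing standard deviation is added to a clean time-domain trace, and the resulting signal-to-noise ratio is measured. The damped oscillation standing in for a THz transient, the noise amplitudes, the seed and the function names are hypothetical; the three evaluation algorithms of the paper are not reproduced here.

```python
import math
import random


def add_white_gaussian_noise(signal, sigma, rng):
    """Return a copy of `signal` with independent zero-mean Gaussian noise added."""
    return [s + rng.gauss(0.0, sigma) for s in signal]


def snr_db(clean, noisy):
    """Signal-to-noise ratio in decibels, using the clean trace as the reference."""
    signal_power = sum(c * c for c in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10.0 * math.log10(signal_power / noise_power)


# Hypothetical stand-in for a time-domain transient: a damped oscillation.
clean = [math.exp(-t / 200.0) * math.sin(2 * math.pi * t / 50.0) for t in range(2000)]

rng = random.Random(0)        # fixed seed so the degraded data sets are reproducible
sigmas = [0.01, 0.05, 0.2]    # assumed noise amplitudes, from mild to severe
data_sets = [add_white_gaussian_noise(clean, sigma, rng) for sigma in sigmas]
snrs = [snr_db(clean, noisy) for noisy in data_sets]
```

Each degraded trace would then go through the same pre-processing and classification chain, so that classifier performance can be plotted against the measured signal-to-noise ratio.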

Relevance:

90.00%

Publisher:

Abstract:

Transient episodes of synchronisation of neuronal activity in particular frequency ranges are thought to underlie cognition. Empirical mode decomposition phase locking (EMDPL) analysis is a method for determining the frequency and timing of phase synchrony that is adaptive to intrinsic oscillations within data, alleviating the need for arbitrary bandpass filter cut-off selection. It is extended here to address the choice of reference electrode and removal of spurious synchrony resulting from volume conduction. Spline Laplacian transformation and independent component analysis (ICA) are performed as pre-processing steps, and preservation of phase synchrony between synthetic signals, combined using a simple forward model, is demonstrated. The method is contrasted with the use of bandpass filtering following the same pre-processing steps, and filter cut-offs are shown to influence synchrony detection markedly. Furthermore, an approach to the assessment of multiple EEG trials using the method is introduced, and the assessment of statistical significance of phase locking episodes is extended to render it adaptive to local phase synchrony levels. EMDPL is validated in the analysis of real EEG data, during finger tapping. The time course of event-related (de)synchronisation (ERD/ERS) is shown to differ from that of longer range phase locking episodes, implying different roles for these different types of synchronisation. It is suggested that the increase in phase locking which occurs just prior to movement, coinciding with a reduction in power (or ERD), may result from selection of the neural assembly relevant to the particular movement. (C) 2009 Elsevier B.V. All rights reserved.
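The quantity at the centre of this kind of analysis, the degree of phase locking between two signals, can be illustrated with the widely used phase locking value. The sketch below obtains instantaneous phases from a DFT-based analytic signal and is only an assumption about how phase synchrony might be scored; it does not perform empirical mode decomposition, the Spline Laplacian transformation or ICA, and the function names and test signals are hypothetical.

```python
import cmath
import math


def analytic_signal(x):
    """Analytic signal via the DFT: keep DC, double positive bins, zero negative bins."""
    n = len(x)
    spectrum = [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
                for k in range(n)]
    for k in range(n):
        if 0 < k < n / 2:
            spectrum[k] *= 2
        elif k > n / 2:
            spectrum[k] = 0
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def phase_locking_value(x, y):
    """Magnitude of the mean unit phasor of the phase difference (0 = none, 1 = perfect)."""
    phase_x = [cmath.phase(z) for z in analytic_signal(x)]
    phase_y = [cmath.phase(z) for z in analytic_signal(y)]
    phasors = [cmath.exp(1j * (a - b)) for a, b in zip(phase_x, phase_y)]
    return abs(sum(phasors)) / len(phasors)


# Hypothetical signals: two share a frequency with a fixed lag, the third does not.
n = 128
reference = [math.sin(2 * math.pi * 8 * t / n) for t in range(n)]
lagged = [math.sin(2 * math.pi * 8 * t / n + 0.7) for t in range(n)]
other = [math.sin(2 * math.pi * 13 * t / n) for t in range(n)]

plv_locked = phase_locking_value(reference, lagged)
plv_unlocked = phase_locking_value(reference, other)
```

A constant phase lag still counts as locking, which is why the value for `lagged` is high even though the two signals are not in phase; applying the same statistic to broadband data requires first isolating a narrow-band component, the step that EMD or bandpass filtering provides.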