905 results for classification methods


Relevance:

60.00%

Publisher:

Abstract:

AIMS: Adolescent mental health problems require treatment and care that are adapted to their needs. To evaluate this issue, it was decided to implement a multidimensional instrument focused on a global approach to adolescent social and behavioural functioning, combined with the ICD-10 classification. METHODS: The combination of an assessment interview and a classification tool enabled the method to integrate the measurement of several domains of patient-based outcome rather than focus on the measurement of symptoms. A group of 68 adolescents from an inpatient unit were compared with 67 adolescents from the general population. RESULTS: Results suggest that adolescents from the care unit adopt significantly riskier behaviour compared with adolescents from the control group. As expected, the main problems identified refer to the psychological and familial areas. A cluster analysis was performed and provided three different profiles: a group with externalizing disorders and two groups with internalizing disorders. On the basis of a structured interview it was possible to obtain information in a systematic way about the adolescents' trajectory (delinquency, physical and sexual abuse, psychoactive substance use). CONCLUSION: It was shown that treatment and care should not focus exclusively on mental health symptoms, but also upon physical, psychological and social aspects of the adolescent. A global approach helps in the consideration of the multitude of factors which must be taken into account when working with people with serious mental health problems and may help to turn the care unit's activity more specifically towards the needs of these adolescents.

Relevance:

60.00%

Publisher:

Abstract:

Adaptation to different ecological environments can promote speciation. Although numerous examples of such 'ecological speciation' now exist, the genomic basis of the process, and the role of gene flow in it, remain less understood. This is, at least in part, because systems that are well characterized in terms of their ecology often lack genomic resources. In this study, we characterize the transcriptome of Timema cristinae stick insects, a system that has been researched intensively in terms of ecological speciation, but for which genomic resources have not been previously developed. Specifically, we obtained >1 million 454 sequencing reads that assembled into 84,937 contigs representing approximately 18,282 unique genes and tens of thousands of potential molecular markers. Second, as an illustration of their utility, we used these genomic resources to assess multilocus genetic divergence within both an ecotype pair and a species pair of Timema stick insects. The results suggest variable levels of genetic divergence and gene flow among taxon pairs and genes and illustrate a first step towards future genomic work in Timema.

Relevance:

60.00%

Publisher:

Abstract:

SUMMARY: A top scoring pair (TSP) classifier consists of a pair of variables whose relative ordering can be used for accurately predicting the class label of a sample. This classification rule has the advantage of being easily interpretable and more robust against technical variations in data, such as those due to different microarray platforms. Here we describe a parallel implementation of this classifier, which significantly reduces the training time, and a number of extensions, including a multi-class approach, which has the potential to improve classification performance. AVAILABILITY AND IMPLEMENTATION: Full C++ source code and the R package Rgtsp are freely available from http://lausanne.isb-sib.ch/~vpopovic/research/. The implementation relies on existing OpenMP libraries.
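The pairwise ordering rule described in this abstract can be illustrated with a minimal single-threaded sketch. It is not the Rgtsp implementation: the function names `train_tsp` and `predict_tsp`, the two-class restriction and the toy data are assumptions made for illustration, and the scoring used here (the absolute difference between the two classes in how often feature i is smaller than feature j) is the commonly described TSP score.

```python
from itertools import combinations


def train_tsp(samples, labels):
    """Find the feature pair (i, j) whose ordering best separates two classes.

    Returns (i, j, class_if_less, class_otherwise). Hypothetical helper,
    written for illustration only.
    """
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    a, b = classes
    n_features = len(samples[0])
    best = None
    for i, j in combinations(range(n_features), 2):
        # Fraction of samples in each class where feature i < feature j.
        p = {}
        for c in classes:
            rows = [s for s, y in zip(samples, labels) if y == c]
            p[c] = sum(r[i] < r[j] for r in rows) / len(rows)
        score = abs(p[a] - p[b])
        if best is None or score > best[0]:
            # The class in which "i < j" is more frequent wins that ordering.
            if_less, otherwise = (a, b) if p[a] > p[b] else (b, a)
            best = (score, i, j, if_less, otherwise)
    _, i, j, if_less, otherwise = best
    return i, j, if_less, otherwise


def predict_tsp(model, sample):
    """Predict a label from the relative ordering of the selected pair."""
    i, j, if_less, otherwise = model
    return if_less if sample[i] < sample[j] else otherwise


# Toy data (made up): in class "A" feature 0 is below feature 1, in "B" above.
toy_samples = [[1, 5, 3], [2, 6, 1], [0, 4, 9], [7, 2, 3], [8, 1, 1], [9, 3, 9]]
toy_labels = ["A", "A", "A", "B", "B", "B"]
toy_model = train_tsp(toy_samples, toy_labels)
```

Because the rule depends only on which of the two values is larger, any strictly increasing rescaling of a sample leaves the prediction unchanged, which is one way to read the robustness to platform differences mentioned above.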

Relevance:

60.00%

Publisher:

Abstract:

For the last 2 decades, supertree reconstruction has been an active field of research and has seen the development of a large number of major algorithms. Because of the growing popularity of the supertree methods, it has become necessary to evaluate the performance of these algorithms to determine which are the best options (especially with regard to the supermatrix approach that is widely used). In this study, seven of the most commonly used supertree methods are investigated by using a large empirical data set (in terms of number of taxa and molecular markers) from the worldwide flowering plant family Sapindaceae. Supertree methods were evaluated using several criteria: similarity of the supertrees with the input trees, similarity between the supertrees and the total evidence tree, level of resolution of the supertree and computational time required by the algorithm. Additional analyses were also conducted on a reduced data set to test if the performance levels were affected by the heuristic searches rather than the algorithms themselves. Based on our results, two main groups of supertree methods were identified: on one hand, the matrix representation with parsimony (MRP), MinFlip, and MinCut methods performed well according to our criteria, whereas the average consensus, split fit, and most similar supertree methods showed a poorer performance or at least did not behave the same way as the total evidence tree. Results for the super distance matrix, that is, the most recent approach tested here, were promising with at least one derived method performing as well as MRP, MinFlip, and MinCut. The output of each method was only slightly improved when applied to the reduced data set, suggesting a correct behavior of the heuristic searches and a relatively low sensitivity of the algorithms to data set sizes and missing data. 
Results also showed that the MRP analyses could reach a high level of quality even when using a simple heuristic search strategy, with the exception of MRP with Purvis coding scheme and reversible parsimony. The future of supertrees lies in the implementation of a standardized heuristic search for all methods and the increase in computing power to handle large data sets. The latter would prove to be particularly useful for promising approaches such as the maximum quartet fit method that yet requires substantial computing power.
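As an illustration of the matrix representation step behind MRP, the sketch below builds a standard Baum-Ragan style matrix from a set of input trees: each clade of each input tree becomes one column, with 1 for taxa inside the clade, 0 for taxa in that tree but outside the clade, and ? for taxa missing from that tree. The nested-tuple tree encoding and the function names are assumptions for illustration; the parsimony search that turns the matrix into a supertree is not shown.

```python
def leaves(tree):
    """Return the set of taxon names in a tree given as nested tuples."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))


def clades(tree):
    """Return the leaf set of every internal node except the root."""
    found = []

    def visit(node, is_root):
        if isinstance(node, str):
            return
        if not is_root:
            found.append(frozenset(leaves(node)))
        for child in node:
            visit(child, False)

    visit(tree, True)
    return found


def mrp_matrix(trees):
    """Encode input trees as one character string per taxon."""
    all_taxa = sorted(set().union(*(leaves(t) for t in trees)))
    rows = {taxon: [] for taxon in all_taxa}
    for tree in trees:
        present = leaves(tree)
        for clade in clades(tree):
            for taxon in all_taxa:
                if taxon not in present:
                    rows[taxon].append("?")  # taxon absent from this tree
                elif taxon in clade:
                    rows[taxon].append("1")  # taxon inside the clade
                else:
                    rows[taxon].append("0")  # taxon in tree, outside clade
    return {taxon: "".join(states) for taxon, states in rows.items()}


# Two small hypothetical input trees with partially overlapping taxa.
example_trees = [(("A", "B"), "C"), (("B", "D"), "A")]
example_matrix = mrp_matrix(example_trees)
```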

Relevance:

60.00%

Publisher:

Abstract:

The present research deals with an important public health threat: the pollution created by radon gas accumulation inside dwellings. The spatial modeling of indoor radon in Switzerland is particularly complex and challenging because of the many influencing factors that must be taken into account. Indoor radon data analysis must be addressed from both a statistical and a spatial point of view. Because this is a multivariate process, it was important first to define the influence of each factor. In particular, it was important to define the influence of geology, which is closely associated with indoor radon. This association was indeed observed for the Swiss data but was not proved to be the sole determinant for the spatial modeling. The statistical analysis of the data, at both the univariate and multivariate levels, was followed by an exploratory spatial analysis. Many tools proposed in the literature were tested and adapted, including fractality, declustering and moving-window methods. The use of the Quantité Morisita Index (QMI) as a procedure to evaluate data clustering as a function of the radon level was proposed. The existing declustering methods were revised and applied in an attempt to approach the global histogram parameters. The exploratory phase comes along with the definition of multiple scales of interest for indoor radon mapping in Switzerland. The analysis was done with a top-down resolution approach, from regional to local levels, in order to find the appropriate scales for modeling. In this sense, data partition was optimized in order to cope with the stationarity conditions of geostatistical models. Common methods of spatial modeling such as K Nearest Neighbors (KNN), variography and General Regression Neural Networks (GRNN) were proposed as exploratory tools. In the following section, different spatial interpolation methods were applied to a particular dataset.
A bottom-up approach to method complexity was adopted and the results were analyzed together in order to find common definitions of continuity and neighborhood parameters. Additionally, a data filter based on cross-validation (the CVMF) was tested with the purpose of reducing noise at the local scale. At the end of the chapter, a series of tests for data consistency and method robustness was performed. This led to conclusions about the importance of data splitting and the limitations of generalization methods for reproducing statistical distributions. The last section was dedicated to modeling methods with probabilistic interpretations. Data transformation and simulations thus allowed the use of multigaussian models and helped take the uncertainty of the indoor radon pollution data into consideration. The categorization transform was presented as a solution for modeling extreme values through classification. Simulation scenarios were proposed, including an alternative proposal for the reproduction of the global histogram based on the sampling domain. Sequential Gaussian simulation (SGS) was presented as the method giving the most complete information, while classification performed in a more robust way. An error measure was defined in relation to the decision function for data classification hardening. Among the classification methods, probabilistic neural networks (PNN) were shown to be better adapted for modeling high-threshold categorization and for automation. Support vector machines (SVM), on the contrary, performed well under balanced category conditions. In general, it was concluded that no particular prediction or estimation method is better under all conditions of scale and neighborhood definitions. Simulations should be the basis, while other methods can provide complementary information to support efficient decision making on indoor radon.

Relevance:

60.00%

Publisher:

Abstract:

Land plants have had the reputation of being problematic for DNA barcoding for two general reasons: (i) the standard DNA regions used in algae, animals and fungi have exceedingly low levels of variability and (ii) the typically used land plant plastid phylogenetic markers (e.g. rbcL, trnL-F, etc.) appear to have too little variation. However, no one has assessed how well current phylogenetic resources might work in the context of identification (versus phylogeny reconstruction). In this paper, we make such an assessment, particularly with two of the markers commonly sequenced in land plant phylogenetic studies, plastid rbcL and internal transcribed spacers of the large subunits of nuclear ribosomal DNA (ITS), and find that both of these DNA regions perform well even though the data currently available in GenBank/EBI were not produced to be used as barcodes and BLAST searches are not an ideal tool for this purpose. These results bode well for the use of even more variable regions of plastid DNA (such as, for example, psbA-trnH) as barcodes, once they have been widely sequenced. In the short term, efforts to bring land plant barcoding up to the standards being used now in other organisms should make swift progress. There are two categories of DNA barcode users, scientists in fields other than taxonomy and taxonomists. For the former, the use of mitochondrial and plastid DNA, the two most easily assessed genomes, is at least in the short term a useful tool that permits them to get on with their studies, which depend on knowing roughly which species or species groups they are dealing with, but these same DNA regions have important drawbacks for use in taxonomic studies (i.e. studies designed to elucidate species limits). For these purposes, DNA markers from uniparentally (usually maternally) inherited genomes can only provide half of the story required to improve taxonomic standards being used in DNA barcoding. 
In the long term, we will need to develop more sophisticated barcoding tools, which would be multiple, low-copy nuclear markers with sufficient genetic variability and PCR-reliability; these would permit the detection of hybrids and permit researchers to identify the 'genetic gaps' that are useful in assessing species limits.

Relevance:

60.00%

Publisher:

Abstract:

The objectives of this work were to develop inventory management practices and to create tools for monitoring and maintaining them. The main research question was: 'How should the inventory management practice be improved so that cost savings are achieved with the current resources without lowering the customer service level?' The central theoretical content relates to inventory management practices. It is covered through the fundamentals of inventory management, inventory control, performance evaluation and the change process of an inventory management practice. The empirical part is carried out by working through a logistics change process model for the case company, including key figures, replenishment methods, product classification and other analyses. The stages of the change process are: clarifying the prerequisites, describing and analysing the current state, proposing alternative solutions, comparing the current state with the proposed solutions, selecting one solution, implementing the change and monitoring the final results. The main results of the work are the calculation of various key figures related to inventory management, a product classification, formulas for the replenishment methods, a warehouse map and a presentation of alternative courses of action. The last of these includes a modified 'can order' replenishment method tailored to the case company, a proposal for getting rid of non-moving items, the relocation of items in the warehouse according to the warehouse map, a new division of labour, regular monitoring of the change proposal and the setting of new goals, as well as proposals to the parent company concerning training needs and the development of the information system.

Relevance:

60.00%

Publisher:

Abstract:

An important task in environmental monitoring is to assess the current state of the environment and the changes humans cause to it, and to analyse and find the relationships between them. Environmental change can be managed by collecting and analysing data. This Master's thesis studies changes observed in aquatic vegetation using remotely sensed measurement data and image analysis methods. Aerial images of Saimaa, the largest lake in Finland, taken in 1996 and 1999 were used for the environmental monitoring. The first stage of the image analysis is geometric correction, whose purpose is to register and scale the images to the same coordinate system. The second stage is to align the corresponding local areas and identify changes in the vegetation. Several approaches were used to identify the vegetation, including supervised and unsupervised recognition methods. The study used real, noisy measurement data, and the experiments based on it gave good results.

Relevance:

60.00%

Publisher:

Abstract:

This Master's thesis studies the problem of modelling the depth of anaesthesia. It analyses EEG signals using nonlinear dynamics theory and then classifies the values obtained. The main stages of the study are: data preprocessing; calculation of optimal embedding parameters for phase space reconstruction; reconstruction of the phase portrait of each EEG signal; formation of a feature set characterising the phase portraits obtained; and classification of four different anaesthesia levels based on the previously estimated features. Classification was performed with linear and quadratic discriminant analysis, the k nearest neighbours method and online clustering. In addition, the work provides an overview of existing approaches to anaesthesia depth monitoring, a description of the basic concepts of nonlinear dynamics theory used in the thesis, and a comparative analysis of several classification methods.
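The phase space reconstruction stage can be sketched with time-delay embedding, the usual construction in nonlinear dynamics. The abstract does not say which reconstruction procedure the thesis used or how the embedding parameters were chosen, so the function name `delay_embed` and the parameter names `dim` (embedding dimension) and `tau` (delay in samples) are assumptions for illustration.

```python
def delay_embed(signal, dim, tau):
    """Build delay vectors (x[t], x[t + tau], ..., x[t + (dim - 1) * tau]).

    Each returned tuple is one point of the reconstructed phase portrait.
    """
    if dim < 1 or tau < 1:
        raise ValueError("dim and tau must be positive integers")
    n_vectors = len(signal) - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("signal too short for this dim and tau")
    return [
        tuple(signal[t + k * tau] for k in range(dim))
        for t in range(n_vectors)
    ]


# A short made-up series stands in for one EEG channel.
example_points = delay_embed([0, 1, 2, 3, 4, 5], dim=3, tau=2)
```

Features computed from such points would then feed the classifiers listed in the abstract.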

Relevance:

60.00%

Publisher:

Abstract:

Customer satisfaction should be the main focus of every part of a business. The supply chain behind the business usually plays a key role in pursuing this focus, especially in the repair service business. For the materials needed to repair equipment under service contracts, the time aspect of quality is critical. Do late deliveries from suppliers affect the service performance of repairs when the distribution center of a centralized purchasing unit acts as a buffer between the suppliers and the repair service business? And if so, how should improvement efforts be prioritized? These are the two main questions of this thesis. Correlation and linear regression were tested between the service levels of the suppliers and of the distribution center: the percentage of on-time supplier deliveries was compared with the outbound delivery service level. A statistically significant correlation was found between the success of inbound and outbound operations. The second question, improvement prioritization, was answered by creating a supplier classification based on material availability and, in addition, by developing a decision process for analysing the most critical suppliers. This was built on previous supplier and material classification methods.
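The correlation and regression test between inbound and outbound service levels can be sketched as follows. This is a minimal illustration, not the thesis's analysis: the function name `pearson_and_ols` and the monthly figures are made up, and no significance test is included.

```python
def pearson_and_ols(x, y):
    """Return (Pearson r, slope, intercept) for a simple regression of y on x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    r = sxy / (sxx * syy) ** 0.5
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


# Hypothetical monthly figures: supplier on-time percentage (inbound)
# and distribution center delivery service level (outbound).
inbound_on_time = [82.0, 85.0, 90.0, 93.0, 96.0]
outbound_service = [88.0, 89.5, 92.0, 94.0, 95.5]
r, slope, intercept = pearson_and_ols(inbound_on_time, outbound_service)
```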

Relevance:

60.00%

Publisher:

Abstract:

This work examines the green technology classification schemes of two international patent classification systems in the automotive industry. Its purpose is to study how much the results of a green technology patent analysis differ when different patent classification systems are used. The older system, the International Patent Classification, is a well-established international patent classification system. The Cooperative Patent Classification, introduced only in recent years, is a patent classification system developed by the European and United States patent offices. The research methods draw on patent analysis and set theory. The green technology classification methods compared in the study gave quantitatively similar results, but the patents they contained were, as a rule, not the same. The work also examines the development of the green automotive technology paths of Toyota, Daimler and Ford. Toyota and Daimler in particular will invest increasingly in electric and hybrid cars relative to combustion-engine cars.
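The set-theoretic comparison can be illustrated with a small sketch that shows how two classification schemes can return similar counts while sharing few patents. The patent identifiers, the function name `compare_patent_sets` and the use of the Jaccard index as the overlap measure are assumptions for illustration; the abstract does not say which set measures the study used.

```python
def compare_patent_sets(ipc_green, cpc_green):
    """Summarise the overlap between two sets of patent identifiers."""
    ipc, cpc = set(ipc_green), set(cpc_green)
    shared = ipc & cpc
    union = ipc | cpc
    return {
        "ipc_count": len(ipc),
        "cpc_count": len(cpc),
        "shared": len(shared),
        "only_ipc": len(ipc - cpc),
        "only_cpc": len(cpc - ipc),
        # Jaccard index: shared patents divided by all distinct patents.
        "jaccard": len(shared) / len(union) if union else 1.0,
    }


# Hypothetical identifiers: equal counts, but only half the patents coincide.
summary = compare_patent_sets(
    ["P1", "P2", "P3", "P4"],
    ["P3", "P4", "P5", "P6"],
)
```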

Relevance:

60.00%

Publisher:

Abstract:

Heart-related illnesses are one of the major causes of death worldwide and have cost many lives in recent decades. The good news is that many of these illnesses are preventable if they are detected at an early stage. On the other hand, the number of doctors is much lower than the number of patients, which makes automatic diagnosis of diseases increasingly essential. Furthermore, the current state of the art in diagnosis methods and algorithms lacks a comprehensive comparison of the different diagnosis solutions. The absence of a single valid diagnosis solution has increased confusion among scholars and made it harder for them to take further steps. This master's thesis addresses the issue of a reliable diagnosis algorithm. We investigate ECG signals and the relation between different diseases and the heart's electrical activity. We also discuss the steps needed for automatic diagnosis of heart diseases, including the literature on the topic. The main goal of this thesis is to find a single reliable diagnosis algorithm and to identify the best classifier to date for heart-related illnesses. Five of the most suitable and best-known classifiers, KNN, CART, MLP, AdaBoost and SVM, were investigated. To ensure a fair comparison, the experimental conditions were kept the same for all classification methods. The UCI repository arrhythmia dataset was used, and the data were not preprocessed. The experimental results indicate that AdaBoost classifies the different diseases with considerably better accuracy.

Relevance:

60.00%

Publisher:

Abstract:

Documents published by companies, such as press releases, contain a wealth of information about the companies' various activities. They are a valuable source for business intelligence analyses. However, tools must be developed to exploit this source automatically, given its large volume. This thesis describes work within one strand of business intelligence, namely the detection of business relationships between the companies described in press releases. We propose a classification-based approach. Existing classification methods do not achieve satisfactory performance. This is due in particular to two problems: representing the text by all of its words, which does not necessarily help to characterize a business relationship, and the imbalance between classes. To address the first problem, we propose a representation based on pivot words, that is, the names of the companies concerned, in order to better identify the words likely to describe them. For the second problem, we propose a two-stage classification. This method proves more appropriate than traditional resampling methods. We tested our approaches on a collection of press releases from the automotive domain. Our experiments show that the proposed approaches can improve classification performance. In particular, the document representation based on pivot words lets us focus better on the words that are useful for detecting business relationships. The two-stage classification offers an effective solution to the class imbalance problem. This work shows that the automatic detection of business relationships is a feasible task. The result of this detection could be used in a business intelligence analysis.

Relevance:

60.00%

Publisher:

Abstract:

This paper highlights the prediction of Learning Disabilities (LD) in school-age children using two classification methods, Support Vector Machine (SVM) and Decision Tree (DT), with an emphasis on applications of data mining. About 10% of children enrolled in school have a learning disability. Predicting learning disabilities in school-age children is a very complicated task because they tend to be identified in elementary school, where there is no single sign to look for. By using either of the two classification methods, SVM and DT, we can easily and accurately predict LD in any child. We can also determine the merits and demerits of the two classifiers, so that the better one can be selected for use in the relevant field. In this study, the Sequential Minimal Optimization (SMO) algorithm is used to train the SVM and the J48 algorithm is used to construct decision trees.

Relevance:

60.00%

Publisher:

Abstract:

An analysis of historical Corona images, Landsat images, recent radar and Google Earth® images was conducted to determine land use and land cover changes of oasis settlements and surrounding rangelands at the fringe of the Altay Mountains from 1964 to 2008. For the Landsat datasets, supervised classification methods were used to test the suitability of the Maximum Likelihood Classifier with subsequent smoothing and the Sequential Maximum A Posteriori Classifier (SMAPC). The results show a trend typical for the steppe and desert regions of northern China. From 1964 to 2008 farmland strongly increased (+61%), while the area of grassland and forest in the floodplains decreased (-43%). The urban areas increased threefold and 400 ha of former agricultural land were abandoned. Farmland apparently affected by soil salinity decreased in size from 1990 (1180 ha) to 2008 (630 ha). The vegetated areas of the surrounding rangelands decreased, mainly as a result of overgrazing and drought events. The SMAPC with subsequent post-processing yielded the highest classification accuracy. However, the specific landscape characteristics of mountain oasis systems required labour-intensive post-processing. Further research is needed to test the use of ancillary information for an automated classification of the examined landscape features.