847 results for Classification Methods
Abstract:
Introduction: Baseline severity and clinical stroke syndrome (Oxford Community Stroke Project, OCSP) classification are predictors of outcome in stroke. We used data from the ‘Tinzaparin in Acute Ischaemic Stroke Trial’ (TAIST) to assess the relationship between stroke severity, early recovery, outcome and OCSP syndrome. Methods: TAIST was a randomised controlled trial assessing the safety and efficacy of tinzaparin versus aspirin in 1,484 patients with acute ischaemic stroke. Severity was measured as the Scandinavian Neurological Stroke Scale (SNSS) at baseline and days 4, 7 and 10, and baseline OCSP clinical classification was recorded: total anterior circulation infarct (TACI), partial anterior circulation infarct (PACI), lacunar infarct (LACI) and posterior circulation infarct (POCI). Recovery was calculated as change in SNSS from baseline at days 4 and 10. The relationships between stroke syndrome, SNSS at days 4 and 10, and outcome (modified Rankin Scale at 90 days) were assessed. Results: Stroke severity was significantly different between TACI (most severe) and LACI (mildest) at all four time points (p<0.001), with no difference between PACI and POCI. The largest change in SNSS score occurred between baseline and day 4; improvement was least in TACI (median 2 units), compared to other groups (median 3 units) (p<0.001). If SNSS did not improve by day 4, then early recovery and late functional outcome tended to be limited irrespective of clinical syndrome (SNSS, baseline: 31, day 10: 32; mRS, day 90: 4); patients who recovered early tended to continue to improve and had better functional outcome irrespective of syndrome (SNSS, baseline: 35, day 10: 50; mRS, day 90: 2). Conclusions: Although functional outcome is related to baseline clinical syndrome (best with LACI, worst with TACI), patients who improve early have a more favourable functional outcome, irrespective of their OCSP syndrome.
Hence, patients with a TACI syndrome may still achieve a reasonable outcome if early recovery occurs.
Abstract:
Background: Statistical analysis of DNA microarray data provides a valuable diagnostic tool for the investigation of genetic components of diseases. To take advantage of the multitude of available data sets and analysis methods, it is desirable to combine both different algorithms and data from different studies. Applying ensemble learning, consensus clustering and cross-study normalization methods for this purpose in an almost fully automated process and linking different analysis modules together under a single interface would simplify many microarray analysis tasks. Results: We present ArrayMining.net, a web-application for microarray analysis that provides easy access to a wide choice of feature selection, clustering, prediction, gene set analysis and cross-study normalization methods. In contrast to other microarray-related web-tools, multiple algorithms and data sets for an analysis task can be combined using ensemble feature selection, ensemble prediction, consensus clustering and cross-platform data integration. By interlinking different analysis tools in a modular fashion, new exploratory routes become available, e.g. ensemble sample classification using features obtained from a gene set analysis and data from multiple studies. The analysis is further simplified by automatic parameter selection mechanisms and linkage to web tools and databases for functional annotation and literature mining. Conclusion: ArrayMining.net is a free web-application for microarray analysis combining a broad choice of algorithms based on ensemble and consensus methods, using automatic parameter selection and integration with annotation databases.
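The ensemble feature selection idea described above can be sketched generically: several scorers each rank the features, and the ranks are averaged before the top features are kept. The two scorers below (absolute class-mean difference and variance) are simple stand-ins for illustration, not ArrayMining.net's actual algorithms, and the data are synthetic.

```python
import numpy as np

def rank_by(scores):
    """Convert per-feature scores into ranks (0 = best score)."""
    order = np.argsort(-scores)
    ranks = np.empty_like(order)
    ranks[order] = np.arange(len(scores))
    return ranks

def ensemble_select(X, y, scorers, k):
    """Average each feature's rank over all scorers and keep the top k."""
    mean_rank = np.mean([rank_by(f(X, y)) for f in scorers], axis=0)
    return np.argsort(mean_rank)[:k]

# Two stand-in scorers: absolute difference of class means, and variance
diff_score = lambda X, y: np.abs(X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0))
var_score = lambda X, y: X.var(axis=0)

rng = np.random.default_rng(1)
X = rng.normal(size=(40, 20))
y = np.array([0] * 20 + [1] * 20)
X[y == 1, 0] += 3.0               # make feature 0 strongly class-associated
selected = ensemble_select(X, y, [diff_score, var_score], k=5)
print(selected)
```

Because the strongly class-associated feature ranks near the top under both scorers, the averaged ranking recovers it; disagreements between scorers are smoothed out rather than letting one algorithm's bias dominate.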
Abstract:
The paper catalogues the procedures and steps involved in agroclimatic classification. These vary from conventional descriptive methods to modern computer-based numerical techniques. There are three mutually independent numerical classification techniques, namely ordination, cluster analysis and minimum spanning tree, and under each technique several forms of grouping exist. The choice of numerical classification procedure differs with the type of data set. In the case of numerical continuous data sets with both positive and negative values, the simple and least controversial procedures are the unweighted pair group method (UPGMA) and the weighted pair group method (WPGMA) under clustering techniques, with a similarity measure obtained either from the Gower metric or the standardized Euclidean metric. Where the number of attributes is large, these can be reduced to fewer new attributes defined by the principal components or coordinates through ordination. The first few components or coordinates explain the maximum variance in the data matrix. These derived attributes are less affected by noise in the data set. It is possible to check misclassifications using the minimum spanning tree.
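A minimal sketch of the pipeline the abstract recommends, using SciPy's average-linkage clustering (UPGMA) on standardized Euclidean distances, plus a minimum spanning tree over the same distances; the station-by-attribute matrix here is synthetic, standing in for real agroclimatic data.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.sparse.csgraph import minimum_spanning_tree

rng = np.random.default_rng(0)
X = rng.normal(size=(12, 4))       # hypothetical station x attribute matrix

d = pdist(X, metric="seuclidean")  # standardized Euclidean metric
Z = linkage(d, method="average")   # UPGMA = average linkage
groups = fcluster(Z, t=3, criterion="maxclust")

# Minimum spanning tree over the full distance matrix; stations whose MST
# neighbours fall in a different cluster are candidates for misclassification
mst = minimum_spanning_tree(squareform(d))
print(groups, mst.nnz)             # an MST on n nodes has n-1 edges
```

WPGMA would simply be `method="weighted"` in the same call; the Gower metric is not built into `pdist` and would need to be supplied as a custom distance.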
Abstract:
Sequential panel selection methods (SPSMs — procedures that sequentially use conventional panel unit root tests to identify I(0) time series in panels) are increasingly used in the empirical literature. We check the reliability of SPSMs by using Monte Carlo simulations based on directly generating the individual asymptotic p values to be combined into the panel unit root tests, in this way isolating the classification abilities of the procedures from the small-sample properties of the underlying univariate unit root tests. The simulations consider both independent and cross-dependent individual test statistics. Results suggest that SPSMs may offer advantages over time series tests only under special conditions.
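For readers unfamiliar with the procedure, here is a hedged sketch of one SPSM variant built on a Fisher-type combination of individual p values (the combination rule and the 5% cutoff are illustrative assumptions, not necessarily the paper's exact design): while the panel test rejects the joint unit root null, the series with the smallest p value is classified as I(0) and removed, and the rest are re-tested.

```python
import numpy as np
from scipy import stats

def fisher_panel_test(pvals):
    """Fisher combination: -2*sum(log p) ~ chi2(2N) under the joint null."""
    stat = -2.0 * np.sum(np.log(pvals))
    return stat, stats.chi2.sf(stat, df=2 * len(pvals))

def spsm(pvals, alpha=0.05):
    """Return indices of series classified as I(0) by sequential removal."""
    remaining = list(range(len(pvals)))
    stationary = []
    while remaining:
        _, p = fisher_panel_test(pvals[remaining])
        if p >= alpha:                # joint unit root null not rejected: stop
            break
        k = remaining[np.argmin(pvals[remaining])]
        stationary.append(k)          # smallest-p series classified as I(0)
        remaining.remove(k)
    return stationary

# As in the paper's Monte Carlo design, p values can be generated directly
# rather than computed from simulated series:
print(spsm(np.array([1e-6, 1e-6, 0.9, 0.9])))
```

Working directly with generated p values, as the abstract describes, means the loop above is exercised in isolation from any particular univariate test's small-sample behaviour.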
Abstract:
Aim: To determine the prevalence and classification of bifid mandibular canals using cone beam computed tomography (CBCT). Methods: The sample comprised 300 CBCT scans obtained from the Radiology and Imaging Department database at São Leopoldo Mandic Dental School, Campinas, SP, Brazil. All images were acquired on a Classic I-Cat® CBCT scanner, with voxel size standardized at 0.25 mm and a 13 cm FOV (field of view). From an axial slice (0.25 mm), a guiding plane was drawn along the alveolar ridge in order to obtain a cross-section. Results: Among the 300 patients, 188 (62.7%) were female and 112 (37.3%) were male, aged between 13 and 87 years. Changes in the mandibular canal were observed in 90 patients (30.0% of the sample): 51 women (56.7%) and 39 men (43.3%). Regarding affected sides, 32.2% were on the right and 24.5% on the left, with 43.3% bilateral cases. Conclusions: According to the results obtained in this study, a prevalence of 30% of bifid mandibular canals was found, with the most prevalent types classified as B (mesial direction) and bilateral.
Abstract:
This paper analyses the advantages and limitations of the Troll, Hargreaves and modified Thornthwaite approaches for the demarcation of the semi-arid tropics. Data from India, Africa, Brazil, Australia and Thailand were used to compare these three methods. The modified Thornthwaite approach provided the most relevant agriculturally oriented demarcation of the semi-arid tropics. This method is not only simple, but uses input data that are available for a global network of stations. Using this method, the semi-arid tropics include major dryland or rainfed agricultural zones with annual rainfall varying from about 400 to 1,250 mm. Major dryland crops are pearl millet, sorghum, pigeonpea and groundnut. This paper also presents a brief description of the climate, soils and farming systems of the semi-arid tropics.
Abstract:
When it comes to information sets in real life, pieces of the whole set are often unavailable. This problem can have various origins and therefore takes different patterns. In the literature, this problem is known as missing data. The issue can be handled in various ways, from discarding incomplete observations, to estimating what the missing values originally were, to simply ignoring the fact that some values are missing. The methods used to estimate missing data are called imputation methods. The work presented in this thesis has two main goals. The first is to determine whether any interactions exist between missing data, imputation methods and supervised classification algorithms when they are applied together. For this first problem we consider a scenario in which the databases used are discrete, understanding discrete to mean that no relation between observations is assumed. These datasets underwent processes involving different combinations of the three components mentioned. The outcome showed that the missing data pattern strongly influences the results produced by a classifier. Also, in some of the cases, the complex imputation techniques investigated in the thesis obtained better results than simple ones. The second goal of this work is to propose a new imputation strategy, this time constraining the specifications of the previous problem to a special kind of dataset, the multivariate time series. We designed new imputation techniques for this particular domain and combined them with some of the contrasted strategies tested in the previous chapter of this thesis. The time series were also subjected to processes involving missing data and imputation, in order to finally propose an overall better imputation method. In the final chapter of this work, a real-world example is presented, describing a water quality prediction problem. The databases that characterize this problem had their own original latent values, which provides a real-world benchmark to test the algorithms developed in this thesis.
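The experimental setup the thesis describes — inject a missingness pattern, impute, and score the result — can be illustrated with a toy example. The data, function names and the MCAR (missing completely at random) pattern below are illustrative assumptions, not the thesis code; mean imputation stands in for the "simple" end of the strategies compared.

```python
import numpy as np

def inject_mcar(X, rate, rng):
    """Blank out a random fraction of entries (an MCAR missingness pattern)."""
    X = X.copy()
    X[rng.random(X.shape) < rate] = np.nan
    return X

def mean_impute(X):
    """Replace each NaN with its column mean (the simple baseline strategy)."""
    X = X.copy()
    col_means = np.nanmean(X, axis=0)
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = col_means[cols]
    return X

rng = np.random.default_rng(42)
truth = rng.normal(size=(50, 3))
holed = inject_mcar(truth, rate=0.2, rng=rng)
filled = mean_impute(holed)
rmse = np.sqrt(np.nanmean((filled - truth) ** 2))
print(rmse)
```

Because the ground truth is known before injection, the reconstruction error of any imputation method can be scored directly, which is the basis for comparing simple against complex strategies.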
Abstract:
The accuracy of a map depends on the reference dataset used in its construction. Classification analyses used in thematic mapping can, for example, be sensitive to a range of sampling and data quality concerns. With particular focus on the latter, the effects of reference data quality on land cover classifications from airborne thematic mapper data are explored. Variations in sampling intensity and effort are highlighted in a dataset that is widely used in mapping and modelling studies; these may need accounting for in analyses. The quality of the labelling in the reference dataset was also a key variable influencing mapping accuracy. Accuracy varied with the amount and nature of mislabelled training cases, with the nature of the effects varying between classifiers. The largest impacts on accuracy occurred when mislabelling involved confusion between similar classes. Accuracy was also typically negatively related to the magnitude of mislabelled cases, and the support vector machine (SVM), which has been claimed to be relatively insensitive to training data error, was the most sensitive of the set of classifiers investigated, with overall classification accuracy declining by 8% (significant at the 95% level of confidence) when a training set containing 20% mislabelled cases was used.
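The shape of the mislabelling experiment can be mimicked on synthetic data: flip a fraction of training labels between two classes and compare test accuracy against a clean-label baseline. The nearest-centroid classifier below is a simple stand-in for the classifiers investigated in the study, and the two-dimensional Gaussian classes are made up for illustration.

```python
import numpy as np

def nearest_centroid(Xtr, ytr, Xte):
    """Fit per-class centroids and predict the class of the closest one."""
    classes = np.unique(ytr)
    cents = np.stack([Xtr[ytr == c].mean(axis=0) for c in classes])
    d = ((Xte[:, None, :] - cents[None, :, :]) ** 2).sum(axis=2)
    return classes[np.argmin(d, axis=1)]

def flip_labels(y, fraction, rng):
    """Mislabel a random fraction of training cases between the two classes."""
    y = y.copy()
    idx = rng.choice(len(y), size=int(fraction * len(y)), replace=False)
    y[idx] = 1 - y[idx]
    return y

rng = np.random.default_rng(7)
Xtr = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(3, 1, (100, 2))])
ytr = np.array([0] * 100 + [1] * 100)
Xte = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
yte = np.array([0] * 50 + [1] * 50)

clean_acc = (nearest_centroid(Xtr, ytr, Xte) == yte).mean()
noisy_acc = (nearest_centroid(Xtr, flip_labels(ytr, 0.2, rng), Xte) == yte).mean()
print(clean_acc, noisy_acc)
```

How far `noisy_acc` falls below `clean_acc` depends on the classifier and on whether the confused classes are similar, which is exactly the sensitivity the study measures across real classifiers and real land cover classes.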
Abstract:
In knowledge technology work, as expressed by the scope of this conference, there are a number of communities, each uncovering new methods, theories, and practices. The Library and Information Science (LIS) community is one such community. This community, through tradition and innovation, theories and practice, organizes knowledge and develops knowledge technologies formed by iterative research hewn to the values of equal access and discovery for all. The Information Modeling community is another contributor to knowledge technologies. It concerns itself with the construction of symbolic models that capture the meaning of information and organize it in ways that are computer-based, but human understandable. A recent paper that examines certain assumptions in information modeling builds a bridge between these two communities, offering a forum for a discussion on common aims from a common perspective. In a June 2000 article, Parsons and Wand separate classes from instances in information modeling in order to free instances from what they call the “tyranny” of classes. They attribute a number of problems in information modeling to inherent classification – or the disregard for the fact that instances can be conceptualized independent of any class assignment. By faceting instances from classes, Parsons and Wand strike a sonorous chord with classification theory as understood in LIS. In the practice community and in the publications of LIS, faceted classification has shifted the paradigm of knowledge organization theory in the twentieth century. Here, with the proposal of inherent classification and the resulting layered information modeling, a clear line joins both the LIS classification theory community and the information modeling community. Both communities have their eyes turned toward networked resource discovery, and with this conceptual conjunction a new paradigmatic conversation can take place. 
Parsons and Wand propose that the layered information model can facilitate schema integration, schema evolution, and interoperability. These three spheres in information modeling have their own connotations, but are not distant from the aims of classification research in LIS. In this new conceptual conjunction, established by Parsons and Wand, information modeling through the layered information model can expand the horizons of classification theory beyond LIS, promoting a cross-fertilization of ideas on the interoperability of subject access tools like classification schemes, thesauri, taxonomies, and ontologies. This paper examines the common ground between the layered information model and faceted classification, establishing a vocabulary and outlining some common principles. It then turns to the issue of schema, the horizons of conventional classification, and the differences between Information Modeling and Library and Information Science. Finally, a framework is proposed that deploys an interpretation of the layered information modeling approach in a knowledge technologies context. In order to design subject access systems that will integrate, evolve and interoperate in a networked environment, knowledge organization specialists must consider a semantic class independence like the one Parsons and Wand propose for information modeling.