42 results for Classify
Abstract:
Objectives: Our objective was to test the performance of CA125 in classifying serum samples from a cohort of malignant and benign ovarian neoplasms and age-matched healthy controls, and to assess whether combining information from matrix-assisted laser desorption/ionization (MALDI) time-of-flight profiling could improve diagnostic performance. Materials and Methods: Serum samples from women with ovarian neoplasms and healthy volunteers were subjected to CA125 assay and MALDI time-of-flight mass spectrometry (MS) profiling. Models were built from training data sets using discriminatory MALDI MS peaks in combination with CA125 values, and their ability to classify blinded test samples was tested. These were compared with models using CA125 threshold levels from 193 patients with ovarian cancer, 290 with benign neoplasms, and 2236 postmenopausal healthy controls. Results: Using a CA125 cutoff of 30 U/mL, an overall sensitivity of 94.8% (96.6% specificity) was obtained when comparing malignancies versus healthy postmenopausal controls, whereas a cutoff of 65 U/mL provided a sensitivity of 83.9% (99.6% specificity). High classification accuracies were obtained for early-stage cancers (93.5% sensitivity). Reasons for the high accuracies include recruitment bias, restriction to postmenopausal women, and inclusion of only primary invasive epithelial ovarian cancer cases. The combination of MS profiling information with CA125 did not significantly improve the specificity/accuracy compared with classification on the basis of CA125 alone. Conclusions: We report unexpectedly good performance of serum CA125 using threshold classification in discriminating healthy controls and women with benign masses from those with invasive ovarian cancer. This highlights the dependence of diagnostic tests on the characteristics of the study population and the crucial need for authors to provide sufficient relevant details to allow comparison.
Our study also shows that MS profiling information adds little to diagnostic accuracy. This finding contrasts with other reports and shows the limitations of serum MS profiling for biomarker discovery and as a diagnostic tool.
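The CA125 threshold classification evaluated above amounts to a one-parameter decision rule. A minimal sketch (not the study's code; the values and labels below are invented for illustration) of computing sensitivity and specificity at a given cutoff:

```python
def threshold_classify(ca125_values, cutoff):
    """Label a sample malignant (1) when CA125 exceeds the cutoff (U/mL)."""
    return [1 if v > cutoff else 0 for v in ca125_values]

def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP/(TP+FN), specificity = TN/(TN+FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Illustrative data only: 1 = invasive cancer, 0 = healthy control
labels = [1, 1, 1, 0, 0, 0]
ca125  = [120.0, 45.0, 28.0, 12.0, 31.0, 8.0]

sens, spec = sensitivity_specificity(labels, threshold_classify(ca125, 30))
```

Raising the cutoff (e.g. from 30 to 65 U/mL) trades sensitivity for specificity, which is the pattern reported in the abstract.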
Abstract:
The dynamics of inter-regional communication within the brain during cognitive processing – referred to as functional connectivity – are investigated as a control feature for a brain–computer interface (BCI). EMDPL is used to map phase synchronization levels between all channel-pair combinations in the EEG. This results in complex networks of channel connectivity at all time–frequency locations. The mean clustering coefficient is then used as a descriptive feature encapsulating information about inter-channel connectivity. Hidden Markov models are applied to characterize and classify the dynamics of the resulting complex networks. Highly accurate levels of classification are achieved when this technique is applied to classify EEG recorded during real and imagined single finger taps. These results are compared to traditional features used in the classification of a finger-tap BCI, demonstrating that functional connectivity dynamics provide additional information and improved BCI control accuracies.
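The mean clustering coefficient feature described above can be sketched as follows; the synchronization matrix and threshold are illustrative stand-ins, not the paper's data or EMDPL output:

```python
def mean_clustering(adj):
    """Mean clustering coefficient of an undirected graph given as a
    symmetric 0/1 adjacency matrix (list of lists)."""
    n = len(adj)
    coeffs = []
    for i in range(n):
        nbrs = [j for j in range(n) if j != i and adj[i][j]]
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)  # too few neighbours to form a triangle
            continue
        # Count edges among the neighbours of node i
        links = sum(adj[u][v] for a, u in enumerate(nbrs) for v in nbrs[a + 1:])
        coeffs.append(2.0 * links / (k * (k - 1)))
    return sum(coeffs) / n

def threshold_network(sync, tau):
    """Binarize a symmetric synchronization matrix at threshold tau."""
    n = len(sync)
    return [[1 if (i != j and sync[i][j] >= tau) else 0 for j in range(n)]
            for i in range(n)]

# Illustrative 4-channel phase-synchronization levels in [0, 1]
sync = [[1.0, 0.8, 0.7, 0.2],
        [0.8, 1.0, 0.9, 0.1],
        [0.7, 0.9, 1.0, 0.3],
        [0.2, 0.1, 0.3, 1.0]]

cc = mean_clustering(threshold_network(sync, 0.5))
```

Computed per time–frequency window, a sequence of such scalars is the kind of feature stream a hidden Markov model can then characterize.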
Abstract:
Unhealthy diets can lead to various diseases, which in turn can translate into a bigger burden for the state in the form of health services and lost production. Obesity alone has enormous costs and claims thousands of lives every year. Although diet quality in the European Union has improved across countries, it still falls well short of conformity with the World Health Organization dietary guidelines. In this review, we classify types of policy interventions addressing healthy eating and identify through a literature review what specific policy interventions are better suited to improve diets. Policy interventions are classified into two broad categories: information measures and measures targeting the market environment. Using this classification, we summarize a number of previous systematic reviews, academic papers, and institutional reports and draw some conclusions about their effectiveness. Of the information measures, policy interventions aimed at reducing or banning unhealthy food advertisements generally have had a weak positive effect on improving diets, while public information campaigns have been successful in raising awareness of unhealthy eating but have failed to translate the message into action. Nutritional labeling allows for informed choice. However, informed choice is not necessarily healthier; knowing or being able to read and interpret nutritional labeling on food purchased does not necessarily result in consumption of healthier foods. Interventions targeting the market environment, such as fiscal measures and nutrient, food, and diet standards, are rarer and generally more effective, though more intrusive. Overall, we conclude that measures to support informed choice have a mixed and limited record of success. On the other hand, measures to target the market environment are more intrusive but may be more effective.
Abstract:
Deep Brain Stimulation has been used in the study and treatment of Parkinson's Disease (PD) tremor symptoms since the 1980s. In the research reported here we have carried out a comparative analysis to classify tremor onset based on intraoperative microelectrode recordings of a PD patient's brain Local Field Potential (LFP) signals. In particular, we compared the performance of a Support Vector Machine (SVM) with two well-known artificial neural network classifiers, namely a Multiple Layer Perceptron (MLP) and a Radial Basis Function Network (RBN). The results show that in this study, using PD-specific data, the SVM provided an overall better classification rate, achieving a recognition accuracy of 81%.
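A hedged sketch of this kind of classifier comparison, using scikit-learn and a synthetic stand-in for the LFP feature vectors (scikit-learn has no radial basis function network, so an SVM with an RBF kernel stands in for the RBN here; none of this is the authors' pipeline):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# Synthetic two-class "tremor / no-tremor" feature set (illustrative only)
X, y = make_classification(n_samples=400, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

classifiers = {
    "SVM (linear)": SVC(kernel="linear"),
    "SVM (RBF kernel)": SVC(kernel="rbf"),
    "MLP": MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000,
                         random_state=0),
}

# Held-out accuracy for each classifier
scores = {name: clf.fit(X_tr, y_tr).score(X_te, y_te)
          for name, clf in classifiers.items()}
```

On real LFP data the ranking would of course depend on the features extracted and the patient recordings used.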
Abstract:
In this paper we explore classification techniques for ill-posed problems. Two classes are linearly separable in some Hilbert space X if they can be separated by a hyperplane. We investigate stable separability, i.e. the case where we have a positive distance between two separating hyperplanes. When the data in the space Y is generated by a compact operator A applied to the system states x ∈ X, we show that in general we do not obtain stable separability in Y even if the problem in X is stably separable. In particular, we show this for the case where a nonlinear classification is generated from a non-convergent family of linear classes in X. We apply our results to the problem of quality control of fuel cells, where we classify fuel cells according to their efficiency. We can potentially classify a fuel cell using either some external measured magnetic field or some internal current. However, we cannot measure the current directly since we cannot access the fuel cell in operation. The first possibility is to apply discrimination techniques directly to the measured magnetic fields. The second approach first reconstructs currents and then carries out the classification on the current distributions. We show that both approaches need regularization and that the regularized classifications are not equivalent in general. Finally, we investigate a widely used linear classification algorithm, Fisher's linear discriminant, with respect to its ill-posedness when applied to data generated via a compact integral operator. We show that the method cannot remain stable when the number of measurement points becomes large.
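For reference, Fisher's linear discriminant (the algorithm examined above) chooses the direction that maximises the ratio of between-class to within-class variance; a standard statement of the criterion (textbook form, not taken from the paper) is:

```latex
J(w) \;=\; \frac{\left(w^{\top}(\mu_1 - \mu_2)\right)^2}{w^{\top} S_W\, w},
\qquad
w^{\ast} \;\propto\; S_W^{-1}\,(\mu_1 - \mu_2),
```

where \(\mu_1, \mu_2\) are the class means and \(S_W\) is the within-class scatter matrix. The instability discussed above is visible here: when the data are images under a compact operator \(A\), \(S_W\) acquires arbitrarily small singular values as the number of measurement points grows, so applying \(S_W^{-1}\) amplifies noise without bound unless the problem is regularized.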
Abstract:
Cladistic analyses begin with an assessment of variation for a group of organisms and the subsequent representation of that variation as a data matrix. The step of converting observed organismal variation into a data matrix has been considered subjective, contentious, under-investigated, imprecise, unquantifiable, intuitive, and a black box, and at the same time ultimately the most influential phase of any cladistic analysis (Pimentel and Riggins, 1987; Bryant, 1989; Pogue and Mickevich, 1990; de Pinna, 1991; Stevens, 1991; Bateman et al., 1992; Smith, 1994; Pleijel, 1995; Wilkinson, 1995; Patterson and Johnson, 1997). Despite the concerns of these authors, primary homology assessment is often perceived as reproducible. In a recent paper, Hawkins et al. (1997) reiterated two points made by a number of these authors: that different interpretations of characters and coding are possible, and that different workers will perceive and define characters in different ways. One reviewer challenged us: did we really think that two people working on the same group would come up with different data sets? The conflicting views regarding the reproducibility of the cladistic character matrix provoke a number of questions. Do the majority of workers consistently follow the same guidelines? Has the theoretical framework informing primary homology assessment been adequately explored? The objective of this study is to classify approaches to primary homology assessment, and to quantify the extent to which different approaches are found in the literature by examining variation in the way characters are defined and coded in a data matrix.
Abstract:
PURPOSE: Since its introduction in 2006, the microblogging system Twitter has provided a rich dataset for researchers through the messages posted to it, leading to the publication of over a thousand academic papers. This paper aims to identify this published work and to classify it in order to understand Twitter-based research. DESIGN/METHODOLOGY/APPROACH: Firstly, the papers on Twitter were identified. Secondly, following a review of the literature, a classification of the dimensions of microblogging research was established. Thirdly, papers were qualitatively classified using open-coded content analysis, based on each paper's title and abstract, in order to analyze method, subject, and approach. FINDINGS: The majority of published work relating to Twitter concentrates on aspects of the messages sent and details of the users. A variety of methodological approaches are used across a range of identified domains. RESEARCH LIMITATIONS/IMPLICATIONS: This work reviewed the abstracts of all papers available via database search on the term "Twitter", and this has two major implications: 1) the full papers are not considered, so works may be misclassified if their abstract is not clear; 2) publications not indexed by the databases, such as book chapters, are not included. ORIGINALITY/VALUE: To date there has not been an overarching study looking at the methods and purposes of those using Twitter as a research subject. Our major contribution is to scope out papers published on Twitter up to the close of 2011. The classification derived here will provide a framework within which researchers studying Twitter-related topics will be able to position and ground their work.
Abstract:
Top Down Induction of Decision Trees (TDIDT) is the most commonly used method of constructing a model from a dataset, in the form of classification rules, to classify previously unseen data. Alternative algorithms have been developed, such as the Prism algorithm. Prism constructs modular rules which are qualitatively better than the rules induced by TDIDT. However, with the increasing size of databases, many existing rule-learning algorithms have proved to be computationally expensive on large datasets. To tackle the problem of scalability, parallel classification rule induction algorithms have been introduced. As TDIDT is the most popular classifier, even though there are strongly competitive alternative algorithms, most parallel approaches to inducing classification rules are based on TDIDT. In this paper we describe work on a distributed classifier that induces classification rules in a parallel manner based on Prism.
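Prism's greedy rule-construction step, which the distributed classifier above parallelises, can be sketched as follows (toy data; this is a simplified illustration, not the paper's implementation):

```python
def induce_rule(instances, target):
    """Induce one modular rule for class `target`, Prism-style.

    instances: list of (attribute-dict, label) pairs.
    Greedily adds the attribute-value test with the highest precision
    p(target | test) until the rule covers only `target` instances.
    Returns the rule as a dict of attribute -> required value.
    """
    rule, covered = {}, list(instances)
    while any(lbl != target for _, lbl in covered):
        best, best_prec = None, -1.0
        for attrs, _ in covered:
            for a, v in attrs.items():
                if a in rule:
                    continue
                matches = [(x, l) for x, l in covered if x.get(a) == v]
                prec = sum(1 for _, l in matches if l == target) / len(matches)
                if prec > best_prec:
                    best, best_prec = (a, v), prec
        a, v = best
        rule[a] = v
        covered = [(x, l) for x, l in covered if x.get(a) == v]
    return rule

# Toy weather-style dataset
data = [({"outlook": "sunny", "windy": "no"},  "play"),
        ({"outlook": "sunny", "windy": "yes"}, "stay"),
        ({"outlook": "rain",  "windy": "no"},  "stay")]

rule = induce_rule(data, "play")
```

The full algorithm repeats this step per class, removing covered instances each time; a distributed version can, for example, partition the data or the classes across workers.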
Abstract:
A number of tests exist to check for statistical significance of phase synchronisation within the electroencephalogram (EEG); however, the majority suffer from a lack of generality and applicability. They may also fail to account for temporal dynamics in the phase synchronisation, treating synchronisation as a constant state instead of a dynamical process. Therefore, a novel test is developed for identifying the statistical significance of phase synchronisation, based upon a combination of work characterising the temporal dynamics of multivariate time series and Markov modelling. We show how this method is better able to assess the significance of phase synchronisation than a range of commonly used significance tests. We also show how the method may be applied to identify and classify significantly different phase synchronisation dynamics in both univariate and multivariate datasets.
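For contrast with the Markov-based test proposed above, a common baseline significance test shuffles one phase series to build a surrogate distribution for the phase-locking value (PLV). This sketch uses synthetic phase series and is not the paper's method; note that shuffling destroys exactly the temporal structure the abstract argues should be modelled:

```python
import cmath
import random

def plv(phases1, phases2):
    """Phase-locking value of two instantaneous-phase series."""
    n = len(phases1)
    return abs(sum(cmath.exp(1j * (a - b))
                   for a, b in zip(phases1, phases2))) / n

def surrogate_p_value(phases1, phases2, n_surr=200, seed=0):
    """Fraction of shuffled surrogates whose PLV >= the observed PLV."""
    rng = random.Random(seed)
    observed = plv(phases1, phases2)
    count = 0
    for _ in range(n_surr):
        shuffled = rng.sample(phases2, len(phases2))
        if plv(phases1, shuffled) >= observed:
            count += 1
    return count / n_surr

# Two strongly locked synthetic phase series: constant lag plus small jitter
rng = random.Random(1)
p1 = [0.1 * t for t in range(500)]
p2 = [x + 0.5 + rng.gauss(0, 0.05) for x in p1]

p = surrogate_p_value(p1, p2)
```

A small p-value here indicates that the observed locking is unlikely under the shuffled null, but the test says nothing about how the synchronisation evolves in time.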
Abstract:
This paper generalises and applies recently developed blocking diagnostics in a two-dimensional latitude-longitude context, which takes into consideration both mid- and high-latitude blocking. These diagnostics identify characteristics of the associated wave-breaking as seen in the potential temperature (θ) on the dynamical tropopause, in particular the cyclonic or anticyclonic Direction of wave-Breaking (DB index), and the Relative Intensity (RI index) of the air masses that contribute to blocking formation. The methodology is extended to a 2-D domain and a cluster technique is deployed to classify mid- and high-latitude blocking according to the wave-breaking characteristics. Mid-latitude blocking is observed over Europe and Asia, where the meridional gradient of θ is generally weak, whereas high-latitude blocking is mainly present over the oceans, to the north of the jet-stream, where the meridional gradient of θ is much stronger. They occur respectively on the equatorward and poleward flank of the jet-stream, where the horizontal shear ∂u/∂y is positive in the first case and negative in the second case. A regional analysis is also conducted. It is found that cold-anticyclonic and cyclonic blocking divert the storm-track respectively to the south and to the north over the East Atlantic and western Europe. Furthermore, warm-cyclonic blocking over the Pacific and cold-anticyclonic blocking over Europe are identified as the most persistent types and are associated with large amplitude anomalies in temperature and precipitation. Finally, the high-latitude, cyclonic events seem to correlate well with low-frequency modes of variability over the Pacific and Atlantic Ocean.
Abstract:
Background: Since their inception, Twitter and related microblogging systems have provided a rich source of information for researchers and have attracted interest in their affordances and use. Since 2009 PubMed has included 123 journal articles on medicine and Twitter, but no overview exists as to how the field uses Twitter in research. // Objective: This paper aims to identify published work relating to Twitter indexed by PubMed, and then to classify it. This classification will provide a framework in which future researchers will be able to position their work, and will provide an understanding of the current reach of research using Twitter in medical disciplines. Limiting the study to papers indexed by PubMed ensures the work provides a reproducible benchmark. // Methods: Papers on Twitter and related topics indexed by PubMed were identified and reviewed. The papers were then qualitatively classified based on each paper's title and abstract to determine their focus. The work that was Twitter-focused was studied in detail to determine what data, if any, it was based on, and from this a categorization of the data set sizes used in the studies was developed. Using open-coded content analysis, additional important categories were also identified, relating to the primary methodology, domain and aspect. // Results: As of 2012, PubMed comprises more than 21 million citations from biomedical literature, and from these a corpus of 134 potentially Twitter-related papers was identified, eleven of which were subsequently found not to be relevant. There were no papers prior to 2009 relating to microblogging, a term first used in 2006. Of the remaining 123 papers which mentioned Twitter, thirty were focused on Twitter (the others referring to it tangentially). The early Twitter-focused papers introduced the topic and highlighted the potential, not carrying out any form of data analysis.
The majority of published papers used analytic techniques to sort through thousands, if not millions, of individual tweets, often depending on automated tools to do so. Our analysis demonstrates that researchers are starting to use knowledge discovery methods and data mining techniques to understand vast quantities of tweets: the study of Twitter is becoming quantitative research. // Conclusions: This work is, to the best of our knowledge, the first overview study of medical-related research based on Twitter and related microblogging. We have used five dimensions to categorise published medical-related research on Twitter. This classification provides a framework within which researchers studying the development and use of Twitter within medical-related research, and those undertaking comparative studies of research relating to Twitter in the area of medicine and beyond, can position and ground their work.
Abstract:
The two-way relationship between Rossby Wave-Breaking (RWB) and the intensification of extratropical cyclones is analysed over the Euro-Atlantic sector. In particular, the timing, intensity and location of cyclone development are related to RWB occurrences. For this purpose, two potential-temperature-based indices are used to detect and classify anticyclonic and cyclonic RWB episodes from ERA-40 Re-Analysis data. Results show that explosive cyclogenesis over the North Atlantic (NA) is fostered by enhanced occurrence of RWB on days prior to the cyclone's maximum intensification. Under such conditions, the eddy-driven jet stream is accelerated over the NA, thus enhancing conditions for cyclogenesis. For explosive cyclogenesis over the eastern NA, enhanced cyclonic RWB over eastern Greenland and anticyclonic RWB over the sub-tropical NA are observed. Typically only one of these is present in any given case, with the RWB over eastern Greenland being more frequent than its southern counterpart. This leads to an intensification of the jet over the eastern NA and an enhanced probability of windstorms reaching Western Europe. Explosive cyclones evolving under simultaneous RWB on both sides of the jet feature higher mean intensity and deepening rates than cyclones preceded by a single RWB event. Explosive developments over the western NA are typically linked to a single area of enhanced cyclonic RWB over western Greenland. Here, the eddy-driven jet is accelerated over the western NA. Enhanced occurrence of cyclonic RWB over southern Greenland and anticyclonic RWB over Europe is also observed after explosive cyclogenesis, potentially leading to the onset of Scandinavian Blocking. However, only very intense developments have a considerable influence on the large-scale atmospheric flow. Non-explosive cyclones show no sign of enhanced RWB over the whole NA area.
We conclude that the links between RWB and cyclogenesis over the Euro-Atlantic sector are sensitive to the cyclone’s maximum intensity, deepening rate and location.
Abstract:
Full-waveform laser scanning data acquired with a Riegl LMS-Q560 instrument were used to classify an orange orchard into orange trees, grass and ground using waveform parameters alone. Gaussian decomposition was performed on data captured during the National Airborne Field Experiment in November 2006, using a custom peak-detection procedure and a trust-region-reflective algorithm for fitting Gaussian functions. Calibration was carried out using waveforms returned from a road surface, and the backscattering coefficient c was derived for every waveform peak. The processed data were then analysed according to the number of returns detected within each waveform and classified into three classes based on pulse width and c. For single-peak waveforms the scatterplot of c versus pulse width was used to distinguish between ground, grass and orange trees. In the case of multiple returns, the relationship between first (or first plus middle) and last return c values was used to separate ground from other targets. Refinement of this classification, and further sub-classification into grass and orange trees, was performed using the c versus pulse width scatterplots of last returns. In all cases the separation was carried out using a decision tree with empirical relationships between the waveform parameters. Ground points were successfully separated from orange tree points. The most difficult class to separate and verify was grass, but those points in general corresponded well with the grass areas identified in the aerial photography. The overall accuracy reached 91%, using photography and relative elevation as ground truth. The overall accuracy for two classes, orange tree and a combined class of grass and ground, reached 95%. Finally, the backscattering coefficient c of single-peak waveforms was also used to derive reflectance values for the three classes.
The reflectance values of the orange tree class (0.31) and the ground class (0.60) are consistent with published values at the wavelength of the Riegl scanner (1550 nm). The grass class reflectance (0.46) falls between the other two, as might be expected, since this class mixes the contributions of vegetation and ground reflectance properties.
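The empirical decision-tree step for single-peak waveforms might look like the following sketch; the threshold values are invented for illustration and are not the paper's calibrated relationships:

```python
def classify_single_peak(pulse_width, backscatter):
    """Toy decision tree over the (pulse width, backscatter c) scatterplot.

    Thresholds are hypothetical stand-ins for the empirical relationships
    derived in the study; pulse width in ns, backscatter dimensionless.
    """
    if pulse_width < 4.5 and backscatter > 0.5:
        return "ground"       # narrow pulse, strong return: hard surface
    if backscatter > 0.35:
        return "grass"        # intermediate backscatter
    return "orange tree"      # broadened and/or weak return: canopy

# Illustrative (pulse width, backscatter) pairs
samples = [(4.0, 0.62), (5.0, 0.45), (6.5, 0.30)]
labels = [classify_single_peak(w, b) for w, b in samples]
```

Multi-return waveforms would take a different branch first, comparing first (or first-plus-middle) and last return backscatter to split ground from vegetation, as described in the abstract.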
Abstract:
We examine the effects of international and product diversification through mergers and acquisitions (M&As) on the firm's risk–return profile. We identify the rewards from different types of M&As and investigate whether becoming a global firm is a value-enhancing strategy. Drawing on the theoretical work of Vachani (Journal of International Business Studies, 22 (1991), pp. 307−322) and on Rugman and Verbeke's (Journal of International Business Studies, 35 (2004), pp. 3−18) metrics, we classify firms according to their degree of international and product diversification. To account for the endogeneity of M&As, we develop a panel vector autoregression. We find that global and host-region multinational enterprises benefit from cross-border M&As that reinforce their geographical footprint. Cross-industry M&As enhance the risk–return profile of home-region firms. This effect depends on the degree of product diversification. Hence there is no value-enhancing M&A strategy for home-region and bi-regional firms to become 'truly global'.
Abstract:
Hydrological ensemble prediction systems (HEPS) have in recent years been increasingly used by European hydrometeorological agencies for the operational forecasting of floods. The most obvious advantage of HEPS is that more of the uncertainty in the modelling system can be assessed. In addition, ensemble prediction systems generally have better skill than deterministic systems, both in terms of mean forecast performance and in the potential forecasting of extreme events. Research efforts have so far mostly been devoted to improving the physical and technical aspects of the model systems, such as increased resolution in time and space and better description of physical processes. Developments like these are certainly needed; however, in this paper we argue that there are other areas of HEPS that need urgent attention. This was also the outcome of a group exercise and a survey conducted among operational forecasters within the European Flood Awareness System (EFAS) to identify the top priorities for improvement in their own system. These priorities turned out to span a range of areas, the most popular being to include verification and assessment of past forecast performance, a multi-model approach to hydrological modelling, increased forecast skill on the medium range (>3 days), and more focus on education and training in the interpretation of forecasts. In light of limited resources, we suggest a simple model to classify the identified priorities in terms of their cost and complexity, in order to decide in which order to tackle them. This model is then used to create an action plan of short-, medium- and long-term research priorities, with the ultimate goal of optimally improving EFAS in particular and spurring the development of operational HEPS in general.
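The proposed cost/complexity triage can be illustrated with a small sketch; the category assignments below are invented for the example (the priority names come from the abstract, but their scores do not reflect the survey's actual assessment):

```python
def triage(cost, complexity):
    """Map 'low'/'high' cost and complexity to an action-plan horizon:
    cheap and simple -> short-term; expensive and complex -> long-term;
    everything else -> medium-term."""
    if cost == "low" and complexity == "low":
        return "short-term"
    if cost == "high" and complexity == "high":
        return "long-term"
    return "medium-term"

# Priorities from the survey, with hypothetical (cost, complexity) scores
priorities = {
    "past-performance verification": ("low", "low"),
    "multi-model hydrology": ("high", "high"),
    "medium-range skill (>3 days)": ("high", "low"),
    "forecaster training": ("low", "high"),
}

plan = {name: triage(c, x) for name, (c, x) in priorities.items()}
```

The point of such a model is not the scoring itself but forcing an explicit ordering of improvements when resources are limited.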