19 results for New classification

in Aston University Research Archive


Relevance:

60.00%

Publisher:

Abstract:

This paper addresses the problem of novelty detection in the case that the observed data are a mixture of a known 'background' process contaminated with an unknown other process, which generates the outliers, or novel observations. The framework we describe here is quite general, employing univariate classification with incomplete information, based on knowledge of the distribution (the 'probability density function', 'pdf') of the data generated by the 'background' process. The relative proportion of this 'background' component (the 'prior' 'background' probability), the 'pdf' and the 'prior' probabilities of all other components are all assumed unknown. The main contribution is a new classification scheme that identifies the maximum proportion of observed data following the known 'background' distribution. The method exploits the Kolmogorov-Smirnov test to estimate the proportions, and afterwards data are Bayes optimally separated. Results, demonstrated with synthetic data, show that this approach can produce more reliable results than a standard novelty detection scheme. The classification algorithm is then applied to the problem of identifying outliers in the SIC2004 data set, in order to detect the radioactive release simulated in the 'joker' data set. We propose this method as a reliable means of novelty detection in the emergency situation which can also be used to identify outliers prior to the application of a more general automatic mapping algorithm. © Springer-Verlag 2007.
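As a rough illustration of the role the Kolmogorov-Smirnov statistic plays in this kind of scheme, the sketch below measures how far a sample departs from a known background distribution. It is not the paper's proportion-estimation procedure: the standard normal background, the outlier value 5.0 and the function name `ks_statistic` are all assumptions made for the example.

```python
from statistics import NormalDist


def ks_statistic(sample, cdf):
    """One-sample Kolmogorov-Smirnov statistic: sup |F_n(x) - F(x)|."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF steps from i/n up to (i + 1)/n at x.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


# Assumed background process: a standard normal (hypothetical choice).
background = NormalDist(0.0, 1.0)

# Deterministic "clean" sample: evenly spaced quantiles of the background.
clean = [background.inv_cdf((i + 0.5) / 100) for i in range(100)]
# Contaminated sample: the same points plus 20 outliers from another process.
contaminated = clean + [5.0] * 20

d_clean = ks_statistic(clean, background.cdf)
d_contaminated = ks_statistic(contaminated, background.cdf)
print(d_clean, d_contaminated)
```

The statistic stays small for the clean sample and grows once outliers are mixed in, which is the signal a proportion estimate could be built on.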

Relevance:

60.00%

Publisher:

Abstract:

Neurodegenerative disorders are characterized by the formation of distinct pathological changes in the brain, including extracellular protein deposits, cellular inclusions, and changes in cell morphology. Since the earliest published descriptions of these disorders, diagnosis has been based on clinicopathological features, namely, the coexistence of a specific clinical profile together with the presence or absence of particular types of lesion. In addition, the molecular profile of lesions has become an increasingly important feature both in the diagnosis of existing disorders and in the description of new disease entities. Recent studies, however, have reported considerable overlap between the clinicopathological features of many disorders leading to difficulties in the diagnosis of individual cases and to calls for a new classification of neurodegenerative disease. This article discusses: (i) the nature and degree of the overlap between different neurodegenerative disorders and includes a discussion of Alzheimer's disease, dementia with Lewy bodies, the fronto-temporal dementias, and prion disease; (ii) the factors that contribute to disease overlap, including historical factors, the presence of disease heterogeneity, age-related changes, the problem of apolipoprotein genotype, and the co-occurrence of common diseases; and (iii) whether the current nosological status of disorders should be reconsidered.

Relevance:

60.00%

Publisher:

Abstract:

The IRDS standard is an international standard produced by the International Organisation for Standardisation (ISO). In this work the process for producing standards in formal standards organisations, for example the ISO, and in more informal bodies, for example the Object Management Group (OMG), is examined. This thesis examines previous models and classifications of standards. The previous models and classifications are then combined to produce a new classification. The IRDS standard is then placed in a class in the new model as a reference anticipatory standard. Anticipatory standards are standards which are developed ahead of the technology in order to attempt to guide the market. The diffusion of the IRDS is traced over a period of eleven years. The economic conditions which affect the diffusion of standards are examined, particularly the economic conditions which prevail in compatibility markets such as the IT and ICT markets. Additionally, the consequences of the introduction of gateway or converter devices into a market where a standard has not yet been established are examined. The IRDS standard did not have an installed base and this hindered its diffusion. The thesis concludes that the IRDS standard was overtaken by new developments such as object oriented technologies and middleware. This was partly because of the slow process of developing standards in traditional organisations which operate on a consensus basis and partly because the IRDS standard did not have an installed base. Also, the rise and proliferation of middleware products resulted in exchange mechanisms becoming dominant rather than repository solutions. The research method used in this work is a longitudinal study of the development and diffusion of the ISO/IEC IRDS standard. The research is regarded as a single case study and follows the interpretative epistemological point of view.

Relevance:

60.00%

Publisher:

Abstract:

The project described in this thesis investigates the needs of a group of people working cooperatively in an OSI environment, and recommends tools and services to meet these needs. The project looks specifically at Services for Activities in Group Editing, and is identified as the 'SAGE' project. The project uses case studies to identify user requirements and to determine common functionalities for a variety of group editing activities. A prototype is implemented in an X.400 environment to help refine user requirements, as a source of new ideas and to test the proposed functionalities. The conceptual modelling follows current CCITT proposals, but a new classification of group activities is proposed: Informative, Objective and Supportive application groups. It is proposed that each of these application groups has its own Service Agent. Use of this classification allows the possibility of developing three sets of tools which will cover a wide range of group activities, rather than developing tools for individual activities. Group editing is considered to be in the Supportive application group. A set of additional services and tools to support group editing is proposed in the context of the CCITT draft on group communication, X.gc. The proposed services and tools are mapped onto the X.400 series of recommendations, with the Abstract Service Definition of the operational objects defined, along with their associated component files, by extending the X.420 protocol functionality. It is proposed that each of the Informative, Objective and Supportive application groups should be implemented as a modified X.420 inter-personal messaging system.

Relevance:

60.00%

Publisher:

Abstract:

The object of this project was to identify those elements of management practice which characterised firms in the West Midlands Road Transport Industry, with the aim of establishing the contents of what might be termed a management policy portfolio for growth. The first phase was a review of those factors which were generally accepted as having an influence on the success rate of transport firms, in order to ascertain whether they explained observed patterns. The second phase, if this were not the case, was to instigate a field work study to isolate those policies which were associated with growth organizations. Investigation of the vehicle movements for the entire West Midlands Fleet over a complete licence cycle suggested that conventional explanations could not fully account for the observed patterns. To carry out the second phase of the study, a sample of growth firms was visited in order to measure their attitudes on a range of factors hypothesised to affect growth. Field data were analysed to establish management activities over a wide range of areas and the results further investigated through a Principal Components and Cluster Analysis programme. The outcome of the study indicates that some past attitudes on the skills and attitudes of transport managers may have to be re-examined. As a result, the project produced a new classification of road transport firms based not on the conventional categories of long and short haul, or the types of traffics carried, but on the marketing policies and management skills employed within the organization.

Relevance:

30.00%

Publisher:

Abstract:

We consider the problem of assigning an input vector $\mathbf{x}$ to one of $m$ classes by predicting $P(c \mid \mathbf{x})$ for $c = 1, \ldots, m$. For a two-class problem, the probability of class 1 given $\mathbf{x}$ is estimated by $s(y(\mathbf{x}))$, where $s(y) = 1/(1 + e^{-y})$. A Gaussian process prior is placed on $y(\mathbf{x})$, and is combined with the training data to obtain predictions for new $\mathbf{x}$ points. We provide a Bayesian treatment, integrating over uncertainty in $y$ and in the parameters that control the Gaussian process prior; the necessary integration over $y$ is carried out using Laplace's approximation. The method is generalized to multi-class problems ($m > 2$) using the softmax function. We demonstrate the effectiveness of the method on a number of datasets.
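The two link functions named in this abstract can be sketched in a few lines. The code below only shows how a latent value y is mapped to class probabilities; it does not implement the Gaussian process prior or Laplace's approximation, and the latent values passed in are made-up numbers chosen for illustration.

```python
import math


def sigmoid(y):
    """Logistic function s(y) = 1 / (1 + exp(-y)), written to avoid overflow."""
    if y >= 0:
        return 1.0 / (1.0 + math.exp(-y))
    z = math.exp(y)
    return z / (1.0 + z)


def softmax(ys):
    """Softmax over m latent values; subtracting the max avoids overflow."""
    top = max(ys)
    exps = [math.exp(y - top) for y in ys]
    total = sum(exps)
    return [e / total for e in exps]


# Two-class case: probability of class 1 for a latent value of 0.
p_class1 = sigmoid(0.0)
# Multi-class case (m = 3) with hypothetical latent values.
probs = softmax([2.0, 1.0, 0.1])
# With two classes and the second latent value fixed at 0,
# softmax reduces to the logistic function.
two_class = softmax([1.5, 0.0])[0]
```

The last line shows why the softmax is a natural generalization: for m = 2 it gives the same probability as the logistic function applied to the difference of the two latent values.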

Relevance:

30.00%

Publisher:

Abstract:

While the retrieval of existing designs to prevent unnecessary duplication of parts is a recognised strategy in the control of design costs, the available techniques to achieve this, even in product data management systems, are limited in performance or require large resources. A novel system has been developed based on a new version of an existing coding system (CAMAC) that allows automatic coding of engineering drawings and their subsequent retrieval using a drawing of the desired component as the input. The ability to find designs using a detail drawing rather than textual descriptions is a significant achievement in itself. Previous testing of the system has demonstrated this capability, but if a means could be found to find parts from a simple sketch then its practical application would be much more effective. This paper describes the development and testing of such a search capability using a database of over 3000 engineering components.

Relevance:

30.00%

Publisher:

Abstract:

We consider the problem of assigning an input vector $x$ to one of $m$ classes by predicting $P(c \mid x)$ for $c = 1, \ldots, m$. For a two-class problem, the probability of class one given $x$ is estimated by $s(y(x))$, where $s(y) = 1/(1 + e^{-y})$. A Gaussian process prior is placed on $y(x)$, and is combined with the training data to obtain predictions for new $x$ points. We provide a Bayesian treatment, integrating over uncertainty in $y$ and in the parameters that control the Gaussian process prior; the necessary integration over $y$ is carried out using Laplace's approximation. The method is generalized to multiclass problems ($m > 2$) using the softmax function. We demonstrate the effectiveness of the method on a number of datasets.

Relevance:

30.00%

Publisher:

Abstract:

This thesis presents a number of methodological developments that were raised by a real life application to measuring the efficiency of bank branches. The advent of internet banking and phone banking is changing the role of bank branches from a predominantly transaction-based one to a sales-oriented role. This fact requires the development of new forms of assessing and comparing branches of a bank. In addition, performance assessment models must also take into account the fact that bank branches are service and for-profit organisations to which providing adequate service quality as well as being profitable are crucial objectives. This study analyses bank branches' performance in their new roles in three different areas: their effectiveness in fostering the use of new transaction channels such as the internet and the telephone (transactional efficiency); their effectiveness in increasing sales and their customer base (operational efficiency); and their effectiveness in generating profits without compromising the quality of service (profit efficiency). The chosen methodology for the overall analysis is Data Envelopment Analysis (DEA). The application attempted here required some adaptations to existing DEA models and indeed some new models so that some special features of our data could be handled. These concern the development of models that can account for negative data, the development of models to measure profit efficiency, and the development of models that yield production units with targets that are nearer to their observed levels than targets yielded by traditional DEA models. The application of the developed models to a sample of Portuguese bank branches allowed their classification according to the three performance dimensions (transactional, operational and profit efficiency). It also provided useful insights to bank managers regarding how bank branches compare with one another in terms of their performance, and how, in general, the three performance dimensions are connected to one another.
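To give a feel for what a DEA efficiency score is, the sketch below handles only the simplest textbook case: one input, one output, constant returns to scale. In that case each unit's score is its output-to-input ratio divided by the best ratio observed. General DEA solves a linear programme per unit, and none of the thesis's extensions (negative data, profit efficiency, closer targets) are represented here. The branch figures and the function name `crs_efficiency` are hypothetical.

```python
def crs_efficiency(inputs, outputs):
    """Efficiency scores for units with a single input and a single output.

    Under constant returns to scale, a unit's score is its productivity
    (output / input) relative to the most productive unit in the sample.
    """
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]


# Hypothetical branches: staff count as the input, sales as the output.
staff = [10.0, 20.0, 30.0]
sales = [100.0, 150.0, 300.0]

scores = crs_efficiency(staff, sales)
print(scores)
```

A score of 1.0 marks a unit on the efficient frontier; a score below 1.0 is the proportion of its input that an efficient unit would need to produce the same output.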

Relevance:

30.00%

Publisher:

Abstract:

This thesis presents a thorough and principled investigation into the application of artificial neural networks to the biological monitoring of freshwater. It contains original ideas on the classification and interpretation of benthic macroinvertebrates, and aims to demonstrate their superiority over the biotic systems currently used in the UK to report river water quality. The conceptual basis of a new biological classification system is described, and a full review and analysis of a number of river data sets is presented. The biological classification is compared to the common biotic systems using data from the Upper Trent catchment. These data contained 292 expertly classified invertebrate samples identified to mixed taxonomic levels. The neural network experimental work concentrates on the classification of the invertebrate samples into biological class, where only a subset of the sample is used to form the classification. Other experimentation is conducted into the identification of novel input samples, the classification of samples from different biotopes and the use of prior information in the neural network models. The biological classification is shown to provide an intuitive interpretation of a graphical representation, generated without reference to the class labels, of the Upper Trent data. The selection of key indicator taxa is considered using three different approaches: one novel, one from information theory and one from classical statistical methods. Good indicators of quality class based on these analyses are found to be in good agreement with those chosen by a domain expert. The change in information associated with different levels of identification and enumeration of taxa is quantified. The feasibility of using neural network classifiers and predictors to develop numeric criteria for the biological assessment of sediment contamination in the Great Lakes is also investigated.

Relevance:

30.00%

Publisher:

Abstract:

We address the important bioinformatics problem of predicting protein function from a protein's primary sequence. We consider the functional classification of G-Protein-Coupled Receptors (GPCRs), whose functions are specified in a class hierarchy. We tackle this task using a novel top-down hierarchical classification system where, for each node in the class hierarchy, the predictor attributes to be used in that node and the classifier to be applied to the selected attributes are chosen in a data-driven manner. Compared with a previous hierarchical classification system selecting classifiers only, our new system significantly reduced processing time without significantly sacrificing predictive accuracy.
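The top-down idea can be sketched as a walk down the class hierarchy in which every node carries its own attribute subset and its own classifier. The hierarchy, attribute names, thresholds and lambda "classifiers" below are all hypothetical stand-ins; the abstract does not say how the attributes or classifiers were actually selected, only that the choice was data-driven.

```python
def predict_top_down(node, features):
    """Walk from the root, letting each node's own classifier pick a child."""
    path = []
    while node.get("children"):
        # Each node sees only the attributes selected for it.
        selected = {name: features[name] for name in node["attributes"]}
        label = node["classifier"](selected)
        path.append(label)
        node = node["children"][label]
    return path


# Toy two-level hierarchy with made-up attributes and decision rules.
hierarchy = {
    "attributes": ["length"],
    "classifier": lambda f: "A" if f["length"] > 350 else "B",
    "children": {
        "A": {
            "attributes": ["charge"],
            "classifier": lambda f: "A1" if f["charge"] > 0 else "A2",
            "children": {"A1": {}, "A2": {}},
        },
        "B": {},
    },
}

path_deep = predict_top_down(hierarchy, {"length": 400, "charge": -2.0})
path_shallow = predict_top_down(hierarchy, {"length": 300, "charge": 1.0})
print(path_deep, path_shallow)
```

Because a sample only visits the classifiers along its own path, restricting each node to a small attribute subset is one plausible source of the reduced processing time the abstract reports.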

Relevance:

30.00%

Publisher:

Abstract:

We propose a novel framework where an initial classifier is learned by incorporating prior information extracted from an existing sentiment lexicon. Preferences on expectations of sentiment labels of those lexicon words are expressed using generalized expectation criteria. Documents classified with high confidence are then used as pseudo-labeled examples for automatic domain-specific feature acquisition. The word-class distributions of such self-learned features are estimated from the pseudo-labeled examples and are used to train another classifier by constraining the model's predictions on unlabeled instances. Experiments on both the movie review data and the multi-domain sentiment dataset show that our approach attains comparable or better performance than existing weakly-supervised sentiment classification methods despite using no labeled documents.

Relevance:

30.00%

Publisher:

Abstract:

There has been considerable recent research into the connection between Parkinson's disease (PD) and speech impairment. Recently, a wide range of speech signal processing algorithms (dysphonia measures) aiming to predict PD symptom severity using speech signals have been introduced. In this paper, we test how accurately these novel algorithms can be used to discriminate PD subjects from healthy controls. In total, we compute 132 dysphonia measures from sustained vowels. Then, we select four parsimonious subsets of these dysphonia measures using four feature selection algorithms, and map these feature subsets to a binary classification response using two statistical classifiers: random forests and support vector machines. We use an existing database consisting of 263 samples from 43 subjects, and demonstrate that these new dysphonia measures can outperform state-of-the-art results, reaching almost 99% overall classification accuracy using only ten dysphonia features. We find that some of the recently proposed dysphonia measures complement existing algorithms in maximizing the ability of the classifiers to discriminate healthy controls from PD subjects. We see these results as an important step toward noninvasive diagnostic decision support in PD.

Relevance:

30.00%

Publisher:

Abstract:

MOTIVATION: There is much interest in reducing the complexity inherent in the representation of the 20 standard amino acids within bioinformatics algorithms by developing a so-called reduced alphabet. Although there is no universally applicable residue grouping, there are numerous physicochemical criteria upon which one can base groupings. Local descriptors are a form of alignment-free analysis, the efficiency of which is dependent upon the correct selection of amino acid groupings. RESULTS: Within the context of G-protein coupled receptor (GPCR) classification, an optimization algorithm was developed, which was able to identify the most efficient grouping when used to generate local descriptors. The algorithm was inspired by the relatively new computational intelligence paradigm of artificial immune systems. A number of amino acid groupings produced by this algorithm were evaluated with respect to their ability to generate local descriptors capable of providing an accurate classification algorithm for GPCRs.
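A reduced alphabet is simply a many-to-one mapping from the 20 standard residues to a smaller set of group symbols. The sketch below applies one such mapping to a sequence. The six-group partition, the group symbols and the example peptide are illustrative assumptions based on broad physicochemical classes; they are not the groupings found by the optimization algorithm described above.

```python
# Hypothetical six-group partition of the 20 standard amino acids.
GROUPS = {
    "a": "AVLIMC",  # aliphatic / hydrophobic
    "r": "FWYH",    # aromatic
    "p": "STNQ",    # polar, uncharged
    "+": "KR",      # positively charged
    "-": "DE",      # negatively charged
    "s": "GP",      # small / structurally special
}

# Invert the partition into a residue -> group-symbol lookup table.
REDUCED = {aa: symbol for symbol, members in GROUPS.items() for aa in members}


def reduce_sequence(sequence):
    """Rewrite a protein sequence in the reduced alphabet.

    Raises KeyError for characters outside the 20 standard residues.
    """
    return "".join(REDUCED[aa] for aa in sequence.upper())


reduced = reduce_sequence("MKTLDEGF")
print(reduced)
```

Swapping in a different `GROUPS` dictionary is all it takes to evaluate another candidate grouping, which is the search space an optimizer would explore.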

Relevance:

30.00%

Publisher:

Abstract:

Quantitative structure–activity relationship (QSAR) analysis is a cornerstone of modern informatic disciplines. Predictive computational models, based on QSAR technology, of peptide-major histocompatibility complex (MHC) binding affinity have now become a vital component of modern day computational immunovaccinology. Historically, such approaches have been built around semi-qualitative classification methods, but these are now giving way to quantitative regression methods. The additive method, an established immunoinformatics technique for the quantitative prediction of peptide–protein affinity, was used here to identify the sequence dependence of peptide binding specificity for three mouse class I MHC alleles: H2–Db, H2–Kb and H2–Kk. As we show, in terms of reliability the resulting models represent a significant advance on existing methods. They can be used for the accurate prediction of T-cell epitopes and are freely available online (http://www.jenner.ac.uk/MHCPred).