20 results for Stepwise Discriminant Analysis

in Aston University Research Archive


Relevance:

100.00%

Abstract:

Discriminant analysis (also known as discriminant function analysis or multiple discriminant analysis) is a multivariate statistical method for testing the degree to which two or more populations overlap. It was devised independently by several statisticians, including Fisher, Mahalanobis, and Hotelling. The technique has several possible applications in microbiology. First, in a clinical microbiological setting, if two different infectious diseases are defined by a number of clinical and pathological variables, it may be useful to decide which measurements are the most effective at distinguishing between the two diseases. Second, in an environmental microbiological setting, the technique can be used to study the relationships between different populations, e.g., to what extent do the properties of soils in which the bacterium Azotobacter is found differ from those in which it is absent? Third, the method can be used as a multivariate ‘t’ test, i.e., given a number of related measurements on two groups, the analysis can provide a single test of the hypothesis that the two populations have the same means for all the variables studied. This statnote describes one of the most popular applications of discriminant analysis: identifying the descriptive variables that can distinguish between two populations.
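
The two-population case described above can be sketched with Fisher's linear discriminant, whose weights are the inverse pooled covariance matrix applied to the difference in group means. This is only an illustrative sketch under stated assumptions, not the statnote's own procedure: it handles exactly two variables, and the soil measurements (pH and organic matter for sites with and without Azotobacter) and all function names are hypothetical.

```python
from statistics import mean

# Hypothetical soil measurements (pH, % organic matter); invented for illustration.
PRESENT = [(7.1, 3.2), (6.8, 2.9), (7.4, 3.6), (7.0, 3.1)]  # Azotobacter found
ABSENT = [(5.6, 2.0), (5.9, 2.4), (5.4, 1.8), (6.0, 2.2)]   # Azotobacter absent


def group_means(group):
    """Mean of each of the two variables in a group."""
    return [mean(row[j] for row in group) for j in range(2)]


def scatter(group, m):
    """2x2 within-group sums of squares and cross-products."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for row in group:
        d = [row[0] - m[0], row[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_weights(group_a, group_b):
    """Weights w = S_pooled^-1 (mean_a - mean_b) for two variables."""
    ma, mb = group_means(group_a), group_means(group_b)
    sa, sb = scatter(group_a, ma), scatter(group_b, mb)
    dof = len(group_a) + len(group_b) - 2
    pooled = [[(sa[i][j] + sb[i][j]) / dof for j in range(2)] for i in range(2)]
    det = pooled[0][0] * pooled[1][1] - pooled[0][1] * pooled[1][0]
    inv = [[pooled[1][1] / det, -pooled[0][1] / det],
           [-pooled[1][0] / det, pooled[0][0] / det]]
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]


def discriminant_score(w, x):
    """Project one observation onto the discriminant axis."""
    return w[0] * x[0] + w[1] * x[1]


weights = fisher_weights(PRESENT, ABSENT)
present_scores = [discriminant_score(weights, x) for x in PRESENT]
absent_scores = [discriminant_score(weights, x) for x in ABSENT]
```

The relative sizes of the weights indicate which variables contribute most to separating the groups, although a real analysis would standardise the variables first and test the separation formally.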

Relevance:

100.00%

Abstract:

The accurate in silico identification of T-cell epitopes is a critical step in the development of peptide-based vaccines, reagents, and diagnostics. It has a direct impact on the success of subsequent experimental work. Epitopes arise as a consequence of complex proteolytic processing within the cell. Prior to being recognized by T cells, an epitope is presented on the cell surface as a complex with a major histocompatibility complex (MHC) protein. A prerequisite therefore for T-cell recognition is that an epitope is also a good MHC binder. Thus, T-cell epitope prediction overlaps strongly with the prediction of MHC binding. In the present study, we compare discriminant analysis and multiple linear regression as algorithmic engines for the definition of quantitative matrices for binding affinity prediction. We apply these methods to peptides which bind the well-studied human MHC allele HLA-A*0201. A matrix which results from combining results of the two methods proved powerfully predictive under cross-validation. The new matrix was also tested on an external set of 160 binders to HLA-A*0201; it was able to recognize 135 (84%) of them.
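
A quantitative matrix of the kind mentioned above is usually applied additively: each amino acid at each peptide position contributes a weight, and the predicted score is the sum of the contributions. The sketch below illustrates only that scoring step and assumes nine-residue peptides. The weights, the intercept and the function names are invented for illustration and are not the matrix derived in the study. The second helper simply reproduces the reported recognition rate from the counts given in the abstract.

```python
# Hypothetical contributions: 0-based position -> residue -> weight.
# These numbers are invented; they are NOT the published matrix.
QM = {
    1: {"L": 0.8, "M": 0.6, "A": -0.2},  # peptide position 2
    8: {"V": 0.7, "L": 0.5, "K": -0.6},  # peptide position 9
}
INTERCEPT = 5.0  # hypothetical baseline score


def predict_score(peptide, matrix=QM, intercept=INTERCEPT):
    """Additive score: intercept plus one weight per (position, residue)."""
    if len(peptide) != 9:
        raise ValueError("this sketch assumes nine-residue peptides")
    return intercept + sum(
        matrix.get(i, {}).get(aa, 0.0) for i, aa in enumerate(peptide)
    )


def recognition_rate(n_recognized, n_total):
    """Percentage of known binders that the matrix recognizes."""
    return 100.0 * n_recognized / n_total
```

For example, `recognition_rate(135, 160)` gives 84.375, which rounds to the 84% quoted for the external test set.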

Relevance:

100.00%

Abstract:

Most existing color-based tracking algorithms use the statistical color information of the object as the tracking cue, without maintaining the spatial structure within a single chromatic image. Recent research on multilinear algebra makes it possible to preserve the spatial structural relationships in a representation of image ensembles. In this paper, a third-order color tensor is constructed to represent the object to be tracked. To account for the influence of environmental changes on tracking, biased discriminant analysis (BDA) is extended to tensor biased discriminant analysis (TBDA) for distinguishing the object from the background. In addition, an incremental scheme for TBDA is developed for online learning of the tensor biased discriminant subspace, which can adapt to appearance variations of both the object and the background. The experimental results show that the proposed method can precisely track objects undergoing large pose, scale and lighting changes, as well as partial occlusion. © 2009 Elsevier B.V.

Relevance:

80.00%

Abstract:

This retrospective study was designed to investigate the factors that influence performance in examinations comprising multiple-choice questions (MCQs), short-answer questions (SAQs), and essay questions in an undergraduate population. Final-year optometry degree examination marks were analyzed for two separate cohorts. Direct comparison found that students performed better in MCQs than in essays. However, forward stepwise regression analysis of module marks compared with the overall score showed that MCQs were the least influential, and the essay or SAQ mark was a more reliable predictor of overall grade. This has implications for examination design.

Relevance:

80.00%

Abstract:

Risk and knowledge are two concepts and components of business management which have so far been studied almost independently. This is especially true where risk management (RM) is conceived mainly in financial terms, as for example, in the financial institutions sector. Financial institutions are affected by internal and external changes with the consequent accommodation to new business models, new regulations and new global competition that includes new big players. These changes induce financial institutions to develop different methodologies for managing risk, such as the enterprise risk management (ERM) approach, in order to adopt a holistic view of risk management and, consequently, to deal with different types of risk, levels of risk appetite, and policies in risk management. However, the methodologies for analysing risk do not explicitly include knowledge management (KM). This research examines the potential relationships between KM and two RM concepts: perceived quality of risk control and perceived value of ERM. To fulfill the objective of identifying how KM concepts can have a positive influence on some RM concepts, a literature review of KM and its processes and RM and its processes was performed. From this literature review eight hypotheses were analysed using a classification into people, process and technology variables. The data for this research was gathered from a survey applied to risk management employees in financial institutions and 121 answers were analysed. The analysis of the data was based on multivariate techniques, more specifically stepwise regression analysis. The results showed that the perceived quality of risk control is significantly associated with the variables: perceived quality of risk knowledge sharing, perceived quality of communication among people, web channel functionality, and risk management information system functionality. 
However, the relationships of the KM variables to the perceived value of ERM were not identified because of the low performance of the models describing these relationships. The analysis reveals important insights into the potential KM support to RM, such as: the better the adoption of KM people and technology actions, the better the perceived quality of risk control. Equally, the results suggest that the quality of risk control and the benefits of ERM follow different patterns, given that there is no correlation between the two concepts and that the KM variables influence each concept differently. The ERM scenario is different from that of risk control because ERM, as an answer to RM failures and adaptation to new regulation in financial institutions, has led organizations to adopt new processes, technologies, and governance models. Thus, the search for factors influencing the perceived value of ERM implementation needs additional analysis, because what is improved in RM processes individually does not have the same effect on the perceived value of ERM. Based on these model results and the literature review, the basis of the ERKMAS (Enterprise Risk Knowledge Management System) is presented.

Relevance:

80.00%

Abstract:

The relationship between the daily deposition of soredia of Hypogymnia physodes (L.) Nyl. and local climatic records was studied in the field during three periods at a site in Seattle, WA, U.S.A.: (1) 11 August – 16 September 1986 (Study A); (2) 16 December – 11 January 1987 (Study B) and (3) 8 July 1988 – 30 January 1989 (Study C). The soredia were trapped on adhesive strips placed at various locations on a Prunus blireiana L. tree for 24-hour periods. A correlation matrix of the data from all three studies revealed a negative correlation between soredial deposition and relative humidity, and a positive correlation with rainfall and temperature. A multiple regression and forward stepwise regression analysis selected relative humidity as the most significant climatic variable, i.e. more soredia tended to be deposited when relative humidity was low. Analysis of individual studies by multiple regression revealed: (1) no significant relationships between soredial deposition and climate in Study A; (2) positive relationships with temperature and wind speed in Study B and (3) positive relationships with wind speed and rainfall in the summer/autumn months of Study C; in the winter months no relationships with climate were found because few soredia were deposited. The data suggest that in the field seasonal photoperiod differences combined with moderately high temperatures and high relative humidity may promote soredial formation and accumulation on thalli prior to soredia dispersal. In addition, low relative humidity may promote soredial release while wind and raindrops may be possible agents of dispersal.
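
Forward stepwise regression of the kind used above can be sketched as a greedy loop: start with an intercept-only model, add whichever predictor most reduces the residual sum of squares, and stop when the gain becomes negligible. The sketch below is a simplification under stated assumptions. It stops on a minimum increase in R-squared rather than the F-to-enter test that statistical packages typically use, it assumes the predictors are not collinear, and the climate records, deposition counts and all names are synthetic and hypothetical, not the study's data.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def rss(columns, y):
    """Residual sum of squares of a least-squares fit with an intercept."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(design[0])
    xtx = [[sum(r[p] * r[q] for r in design) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * yi for r, yi in zip(design, y)) for p in range(k)]
    beta = solve(xtx, xty)
    return sum(
        (yi - sum(bj * xj for bj, xj in zip(beta, r))) ** 2
        for r, yi in zip(design, y)
    )


def forward_stepwise(predictors, y, min_gain=0.05):
    """Add predictors while each raises R-squared by at least min_gain."""
    total = rss([], y)  # intercept-only model gives the total sum of squares
    selected, current = [], total
    remaining = sorted(predictors)
    while remaining and total > 0:
        trials = {
            name: rss([predictors[n] for n in selected + [name]], y)
            for name in remaining
        }
        best = min(remaining, key=trials.get)
        if (current - trials[best]) / total < min_gain:
            break
        selected.append(best)
        remaining.remove(best)
        current = trials[best]
    return selected


# Synthetic daily records, invented for illustration only.
CLIMATE = {
    "humidity": [90, 85, 70, 60, 55, 80, 65, 50],
    "temperature": [12, 15, 14, 18, 20, 13, 17, 16],
    "wind": [3, 5, 2, 6, 4, 7, 3, 5],
}
SOREDIA = [31, 34, 51, 59, 66, 39, 56, 69]  # synthetic deposition counts

chosen = forward_stepwise(CLIMATE, SOREDIA)
```

With these synthetic values the counts were constructed to fall as humidity rises, so humidity is the only variable that enters the model.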

Relevance:

80.00%

Abstract:

This thesis describes the development of a complete data visualisation system for large tabular databases, such as those commonly found in a business environment. A state-of-the-art 'cyberspace cell' data visualisation technique was investigated and a powerful visualisation system using it was implemented. Although allowing databases to be explored and conclusions drawn, it had several drawbacks, the majority of which were due to the three-dimensional nature of the visualisation. A novel two-dimensional generic visualisation system, known as MADEN, was then developed and implemented, based upon a 2-D matrix of 'density plots'. MADEN allows an entire high-dimensional database to be visualised in one window, while permitting close analysis in 'enlargement' windows. Selections of records can be made and examined, and dependencies between fields can be investigated in detail. MADEN was used as a tool for investigating and assessing many data processing algorithms, firstly data-reducing (clustering) methods, then dimensionality-reducing techniques. These included a new 'directed' form of principal components analysis, several novel applications of artificial neural networks, and discriminant analysis techniques which illustrated how groups within a database can be separated. To illustrate the power of the system, MADEN was used to explore customer databases from two financial institutions, resulting in a number of discoveries which would be of interest to a marketing manager. Finally, the database of results from the 1992 UK Research Assessment Exercise was analysed. Using MADEN allowed both universities and disciplines to be graphically compared, and supplied some startling revelations, including empirical evidence of the 'Oxbridge factor'.

Relevance:

80.00%

Abstract:

The thesis began as a study of new firm formation. Preliminary research suggested that infant death rate was considered to be a closely related problem and the search was for a theory of new firm formation which would explain both. The thesis finds theories of exit and entry inadequate in this respect and focusses instead on theories of entrepreneurship, particularly those which concentrate on entrepreneurship as an agent of change. The role of information is found to be fundamental to economic change and an understanding of information generation and dissemination and the nature and direction of information flows is postulated to lead coterminously to an understanding of entrepreneurship and economic change. The economics of information is applied to theories of entrepreneurship and some testable hypotheses are derived. The testing relies on establishing and measuring the information bases of the founders of new firms and then testing for certain hypothesised differences between the information bases of survivors and non-survivors. No theory of entrepreneurship is likely to be straightforwardly testable and many postulates have to be established to bring the theory to a testable stage. A questionnaire is used to gather information from a sample of firms taken from a new micro-data set established as part of the work of the thesis. Discriminant Analysis establishes the variables which best distinguish between survivors and non-survivors. The variables which emerge as important discriminators are consistent with the theory which the analysis is testing. While there are alternative interpretations of the important variables, collective consistency with the theory under test is established. The thesis concludes with an examination of the implications of the theory for policy towards stimulating new firm formation.

Relevance:

80.00%

Abstract:

Since the Second World War a range of policies have been implemented by central and local government agencies, with a view to improving accessibility to facilities, housing and employment opportunities within rural areas. It has been suggested that a lack of reasonable access to a range of such facilities and opportunities constitutes a key aspect of deprivation or disadvantage for rural residents. Despite considerable interest, very few attempts have been made to assess the nature and incidence of this disadvantage or the reaction of different sections of the population of rural areas to it. Moreover, almost all previous assessments have relied on so-called 'objective' measures of accessibility and disadvantage and failed to consider the relationship between such measures and 'subjective' measures such as individual perceptions. It is this gap in knowledge that the research described in this thesis has addressed. Following a critical review of relevant literature, the thesis describes the way in which data on 'objective' and 'subjective' indicators of accessibility and behavioural responses to accessibility problems was collected in six case study areas in Shropshire. Analysis of this data indicates that planning and other government policies have failed to significantly improve rural residents' accessibility to their basic requirements, and may in some cases have worsened it, and that as a result certain sections of the rural population are relatively disadvantaged. Moreover, analysis shows that certain aspects of individual 'subjective' assessments of such accessibility disadvantage are significantly associated with more easily obtained 'objective' measures. By using discriminant analysis the research demonstrates that it is possible to predict the likely levels of satisfaction with access to facilities from a range of 'objective' measures.
The research concludes by highlighting the potential practical applications of such indicators in policy formulation, policy appraisal and policy evaluation.

Relevance:

80.00%

Abstract:

Decomposition of domestic wastes in an anaerobic environment results in the production of landfill gas. Public concern about landfill disposal and particularly the production of landfill gas has been heightened over the past decade. This has been due in large part to the increased quantities of gas being generated as a result of modern disposal techniques, and also to their increasing effect on modern urban developments. In order to avert disasters, effective means of preventing gas migration are required. This, in turn, requires accurate detection and monitoring of gas in the subsurface. Point sampling techniques have many drawbacks, and accurate measurement of gas is difficult. Some of the disadvantages of these techniques could be overcome by assessing the impact of gas on biological systems. This research explores the effects of landfill gas on plants, and hence on the spectral response of vegetation canopies. Examination of the landfill gas/vegetation relationship is covered, both by review of the literature and statistical analysis of field data. The work showed that, although vegetation health was related to landfill gas, it was not possible to define a simple correlation. In the landfill environment, contribution from other variables, such as soil characteristics, frequently confused the relationship. Two sites are investigated in detail, the sites contrasting in terms of the data available, site conditions, and the degree of damage to vegetation. Gas migration at the Panshanger site was dominantly upwards, affecting crops being grown on the landfill cap. The injury was expressed as an overall decline in plant health. Discriminant analysis was used to account for the variations in plant health, and hence the differences in spectral response of the crop canopy, using a combination of soil and gas variables. Damage to both woodland and crops at the Ware site was severe, and could be easily related to the presence of gas.
Air photographs, aerial video, and airborne thematic mapper data were used to identify damage to vegetation, and relate this to soil type. The utility of different sensors for this type of application is assessed, and possible improvements that could lead to more widespread use are identified. The situations in which remote sensing data could be combined with ground survey are identified. In addition, a possible methodology for integrating the two approaches is suggested.

Relevance:

80.00%

Abstract:

With business incubators deemed a potent infrastructural element for entrepreneurship development, business incubation management practice and performance have received widespread attention. However, despite this surge of interest, scholars have questioned the extent to which business incubation delivers added value. Thus, there is a growing awareness among researchers, practitioners and policy makers of the need for more rigorous evaluation of business incubation output performance. Aligned to this is an increasing demand for benchmarking business incubation input/process performance and highlighting best practice. This paper offers a business incubation assessment framework, which considers input/process and output performance domains with relevant indicators. This tool adds value on different levels. It has been developed in collaboration with practitioners and industry experts and should therefore be relevant and useful to business incubation managers. Once a large enough database of completed questionnaires has been populated on an online platform managed by a coordinating mechanism, such as a business incubation membership association, business incubator managers can reflect on their practices by using this assessment framework to learn their relative position vis-à-vis their peers against each domain. This will enable them to align with best practice in this field. Beyond implications for business incubation management practice, this performance assessment framework would also be useful to researchers and policy makers concerned with business incubation management practice and impact. Future large-scale research could test for construct validity and reliability. Also, discriminant analysis could help link input and process indicators with output measures.

Relevance:

80.00%

Abstract:

Background: Allergy is a form of hypersensitivity to normally innocuous substances, such as dust, pollen, foods or drugs. Allergens are small antigens that commonly provoke an IgE antibody response. There are two types of bioinformatics-based allergen prediction. The first approach follows FAO/WHO Codex alimentarius guidelines and searches for sequence similarity. The second approach is based on identifying conserved allergenicity-related linear motifs. Both approaches assume that allergenicity is a linearly coded property. In the present study, we applied ACC pre-processing to sets of known allergens, developing alignment-independent models for allergen recognition based on the main chemical properties of amino acid sequences. Results: A set of 684 food, 1,156 inhalant and 555 toxin allergens was collected from several databases. A set of non-allergens from the same species was selected to mirror the allergen set. The amino acids in the protein sequences were described by three z-descriptors (z1, z2 and z3) and converted into uniform vectors by auto- and cross-covariance (ACC) transformation. Each protein was represented as a vector of 45 variables. Five machine learning methods for classification were applied in the study to derive models for allergen prediction. The methods were: discriminant analysis by partial least squares (DA-PLS), logistic regression (LR), decision tree (DT), naïve Bayes (NB) and k nearest neighbours (kNN). The best performing model was derived by kNN at k = 3. It was optimized, cross-validated and implemented in a server named AllerTOP, freely accessible at http://www.pharmfac.net/allertop. AllerTOP also predicts the most probable route of exposure. In comparison to other servers for allergen prediction, AllerTOP outperforms them with 94% sensitivity. Conclusions: AllerTOP is the first alignment-free server for in silico prediction of allergens based on the main physicochemical properties of proteins.
Significantly, as well as allergenicity, AllerTOP is able to predict the route of allergen exposure: food, inhalant or toxin. © 2013 Dimitrov et al.; licensee BioMed Central Ltd.
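
The ACC step and the kNN vote described in this abstract can be sketched as follows. An auto- or cross-covariance term for descriptors j and k at a given lag averages the product of descriptor j at residue i and descriptor k at residue i + lag over the sequence, so proteins of any length map to vectors of equal size. The 45 variables are consistent with 3 x 3 descriptor pairs times 5 lags, but the maximum lag of 5 is an assumption here, as are all function names. The descriptor triples below are placeholders, not the published z-scale values, and the sketch is not AllerTOP's implementation.

```python
import math

# Placeholder descriptor triples (z1, z2, z3); NOT the published z-scales.
Z = {
    "A": (0.1, -1.7, 0.1),
    "E": (3.0, -0.5, -1.0),
    "G": (2.2, -5.4, 0.3),
    "K": (2.8, 1.4, -3.1),
    "L": (-4.2, -1.0, -1.0),
    "S": (1.9, -1.6, 0.6),
}


def acc_transform(sequence, max_lag=5, z=Z):
    """Fixed-length auto-/cross-covariance vector: 3 * 3 * max_lag values."""
    n = len(sequence)
    if n <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    desc = [z[aa] for aa in sequence]
    features = []
    for j in range(3):
        for k in range(3):
            for lag in range(1, max_lag + 1):
                total = sum(desc[i][j] * desc[i + lag][k] for i in range(n - lag))
                features.append(total / (n - lag))
    return features


def knn_predict(query, training, k=3):
    """Majority label among the k training vectors nearest to the query."""
    nearest = sorted(training, key=lambda item: math.dist(query, item[0]))[:k]
    labels = [label for _, label in nearest]
    return max(set(labels), key=labels.count)
```

In use, each training protein would be passed through `acc_transform` and stored with its label, and a query protein would be transformed the same way before calling `knn_predict` with k = 3, the value the abstract reports as best performing.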

Relevance:

80.00%

Abstract:

Circulating low density lipoproteins (LDL) are thought to play a crucial role in the onset and development of atherosclerosis, though the detailed molecular mechanisms responsible for their biological effects remain controversial. The complexity of biomolecules (lipids, glycans and protein) and structural features (isoforms and chemical modifications) found in LDL particles hampers the complete understanding of the mechanism underlying its atherogenicity. For this reason the screening of LDL for features discriminative of a particular pathology in search of biomarkers is of high importance. Three major biomolecule classes (lipids, protein and glycans) in LDL particles were screened using mass spectrometry coupled to liquid chromatography. Dual-polarity screening resulted in good lipidome coverage, identifying over 300 lipid species from 12 lipid sub-classes. Multivariate analysis was used to investigate potential discriminators in the individual lipid sub-classes for different study groups (age, gender, pathology). Additionally, the high protein sequence coverage of ApoB-100 routinely achieved (≥70%) assisted in the search for protein modifications correlating with aging and pathology. The large size and complexity of the datasets required the use of chemometric methods (Partial Least Squares-Discriminant Analysis, PLS-DA) for their analysis and for the identification of ions that discriminate between study groups. The peptide profile from enzymatically digested ApoB-100 can be correlated with the high structural complexity of lipids associated with ApoB-100 using exploratory data analysis. In addition, using targeted scanning modes, glycosylation sites within neutral and acidic sugar residues in ApoB-100 are also being explored.
Together or individually, knowledge of the profiles and modifications of the major biomolecules in LDL particles will contribute towards an in-depth understanding, will help to map the structural features that contribute to the atherogenicity of LDL, and may allow identification of reliable, pathology-specific biomarkers. This research was supported by a Marie Curie Intra-European Fellowship within the 7th European Community Framework Program (IEF 255076). Work of A. Rudnitskaya was supported by Portuguese Science and Technology Foundation, through the European Social Fund (ESF) and "Programa Operacional Potencial Humano - POPH".

Relevance:

40.00%

Abstract:

Growth in availability and ability of modern statistical software has resulted in greater numbers of research techniques being applied across the marketing discipline. However, with such advances come concerns that techniques may be misinterpreted by researchers. This issue is critical since misinterpretation could cause erroneous findings. This paper investigates some assumptions regarding: 1) the assessment of discriminant validity; and 2) what confirmatory factor analysis accomplishes. Examples that address these points are presented, and some procedural remedies are suggested based upon the literature. This paper is, therefore, primarily concerned with the development of measurement theory and practice. If advances in theory development are not based upon sound methodological practice, we as researchers could be basing our work upon shaky foundations.