937 results for principal components analysis (PCA) algorithm
Abstract:
The constancy of phenotypic variation and covariation is an assumption that underlies most recent investigations of past selective regimes and attempts to predict future responses to selection. Few studies have tested this assumption of constancy despite good reasons to expect that the pattern of phenotypic variation and covariation may vary in space and time. We compared phenotypic variance-covariance matrices (P) estimated for populations of six species of distantly related coral reef fishes sampled at two locations on Australia's Great Barrier Reef separated by more than 1000 km. The intraspecific similarity between these matrices was estimated using two methods: matrix correlation and common principal component analysis. Although there was no evidence of equality between pairs of P, both statistical approaches indicated a high degree of similarity in morphology between the two populations for each species. In general, the hierarchical decomposition of the variance-covariance structure of these populations indicated that all principal components of phenotypic variance-covariance were shared but that they differed in the degree of variation associated with each of these components. The consistency of this pattern is remarkable given the diversity of morphologies and life histories encompassed by these species. Although some phenotypic instability was indicated, these results were consistent with a generally conserved pattern of multivariate selection between populations.
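The matrix-correlation method mentioned above can be sketched as follows: flatten the unique (lower-triangular) elements of the two covariance matrices and correlate them. A minimal illustration on synthetic trait data; the populations, traits, and noise level here are hypothetical stand-ins, not the study's data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for two populations measured on the same four traits:
# population B is a perturbed copy of population A.
base = rng.normal(size=(200, 4))
pop_a = base @ np.array([[1.0, 0.5, 0.2, 0.1],
                         [0.0, 1.0, 0.4, 0.2],
                         [0.0, 0.0, 1.0, 0.3],
                         [0.0, 0.0, 0.0, 1.0]])
pop_b = pop_a + rng.normal(scale=0.3, size=pop_a.shape)

P_a = np.cov(pop_a, rowvar=False)  # phenotypic covariance matrix, population A
P_b = np.cov(pop_b, rowvar=False)  # population B

# Matrix correlation: Pearson correlation between the unique
# (lower-triangular, including diagonal) elements of the two matrices.
idx = np.tril_indices_from(P_a)
r = np.corrcoef(P_a[idx], P_b[idx])[0, 1]
print(round(r, 3))  # high r indicates similar covariance structure
```

A significance test for such a correlation usually requires a permutation (Mantel-style) procedure rather than the standard Pearson test, because matrix elements are not independent.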
Abstract:
Quality of life has been shown to be poor among people living with chronic hepatitis C. However, it is not clear how this relates to the presence of symptoms and their severity. The aim of this study was to describe the typology of a broad array of symptoms that were attributed to hepatitis C virus (HCV) infection. Phase 1 used qualitative methods to identify symptoms. In Phase 2, 188 treatment-naive people living with HCV participated in a quantitative survey. The most prevalent symptom was physical tiredness (86%) followed by irritability (75%), depression (70%), mental tiredness (70%), and abdominal pain (68%). Temporal clustering of symptoms was reported in 62% of participants. Principal components analysis identified four symptom clusters: neuropsychiatric (mental tiredness, poor concentration, forgetfulness, depression, irritability, physical tiredness, and sleep problems); gastrointestinal (day sweats, nausea, food intolerance, night sweats, abdominal pain, poor appetite, and diarrhea); algesic (joint pain, muscle pain, and general body pain); and dysesthetic (noise sensitivity, light sensitivity, skin problems, and headaches). These data demonstrate that symptoms are prevalent in treatment-naive people with HCV and support the hypothesis that symptom clustering occurs.
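The clustering step the study describes, PCA on a respondents-by-symptoms matrix, can be sketched as follows. The two-factor synthetic data below is purely illustrative (the real study analysed 188 respondents' symptom ratings; only the sample size is taken from the abstract):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 188  # survey size reported in the study

# Hypothetical symptom ratings: two latent factors (think "neuropsychiatric"
# and "gastrointestinal") each driving a block of three observed symptoms.
f1 = rng.normal(size=(n, 1))
f2 = rng.normal(size=(n, 1))
X = np.hstack([
    f1 @ np.ones((1, 3)) + rng.normal(scale=0.5, size=(n, 3)),  # factor-1 block
    f2 @ np.ones((1, 3)) + rng.normal(scale=0.5, size=(n, 3)),  # factor-2 block
])

# PCA via SVD of the standardized data matrix.
Z = (X - X.mean(0)) / X.std(0)
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
explained = s**2 / (s**2).sum()  # proportion of variance per component

# With two latent factors, the first two components dominate, and their
# loadings (rows of Vt) separate the two symptom blocks.
print(explained.round(2))
```

In practice, symptom studies like this typically retain components by an eigenvalue or scree criterion and rotate the loadings (e.g. varimax) before naming the clusters; the sketch stops at the unrotated solution.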
Abstract:
In 2001/02, five case study communities in both metropolitan and regional urban locations in Australia were chosen as test sites to develop measures of community strength on four domains: natural capital; produced economic capital; human capital; and social and institutional capital. Secondary data sources were used to develop measures on the first three domains. For the fourth domain, social and institutional capital, primary data collection was undertaken through sample surveys of households. A structured approach was devised. This involved developing a survey instrument using scaled items relating to four elements: formal norms; informal norms; formal structures; and informal structures. These elements embrace the concepts of trust, reciprocity, bonds, bridges, links, and networks in the interaction of individuals with their community, inherent in the notion of social capital. Exploratory principal components analysis was used to identify factors that measure those aspects of social and institutional capital, with confirmatory analysis conducted using Cronbach's alpha. This enabled the construction of four primary scales and 15 sub-scales as a tool for measuring social and institutional capital. Further analysis reveals that two measures, anomie and perceived quality of life and wellbeing, relate to certain primary scales of social capital.
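The confirmatory step above, Cronbach's alpha, has a simple closed form: alpha = k/(k-1) * (1 - sum of item variances / variance of the summed scale), where k is the number of items. A minimal sketch on synthetic survey data; the latent-trait setup and all numbers are illustrative, not the study's:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Internal-consistency reliability of a scale.

    items: (respondents, items) matrix of item scores.
    """
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()      # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)       # variance of the scale score
    return k / (k - 1) * (1 - item_var / total_var)

rng = np.random.default_rng(2)
latent = rng.normal(size=(300, 1))                       # shared latent trait
items = latent + rng.normal(scale=0.8, size=(300, 5))    # five correlated items
alpha = cronbach_alpha(items)
print(round(alpha, 2))
```

Values of alpha above roughly 0.7 are conventionally read as acceptable internal consistency for a scale built from items like these.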
Abstract:
Multidimensional compound optimization is a new paradigm in the drug discovery process, yielding efficiencies during early stages and reducing attrition in the later stages of drug development. The success of this strategy relies heavily on understanding this multidimensional data and extracting useful information from it. This paper demonstrates how principled visualization algorithms can be used to understand and explore a large data set created in the early stages of drug discovery. The experiments presented are performed on a real-world data set comprising biological activity data and some whole-molecular physicochemical properties. Data visualization is a popular way of presenting complex data in a simpler form. We have applied powerful principled visualization methods, such as generative topographic mapping (GTM) and hierarchical GTM (HGTM), to help the domain experts (screening scientists, chemists, biologists, etc.) understand the data and draw meaningful conclusions. We also benchmark these principled methods against better-known visualization approaches, principal component analysis (PCA), Sammon's mapping, and self-organizing maps (SOMs), to demonstrate their enhanced power to help the user visualize the large multidimensional data sets one has to deal with during the early stages of the drug discovery process. The results reported clearly show that the GTM and HGTM algorithms allow the user to cluster active compounds for different targets and understand them better than the benchmarks. An interactive software tool supporting these visualization algorithms was provided to the domain experts. The tool helps the domain experts explore the projections obtained from the visualization algorithms, providing facilities such as parallel coordinate plots, magnification factors, directional curvatures, and integration with industry-standard software. © 2006 American Chemical Society.
Abstract:
Background: The MacDQoL is an individualised measure of the impact of macular degeneration (MD) on quality of life (QoL). There is preliminary evidence of its psychometric properties and sensitivity to severity of MD. The aim of this study was to carry out further psychometric evaluation with a larger sample and investigate the measure's sensitivity to MD severity. Methods: Patients with MD (n = 156: 99 women, 57 men, mean age 79 ± 13 years), recruited from eye clinics (one NHS, one private) completed the MacDQoL by telephone interview and later underwent a clinic vision assessment including near and distance visual acuity (VA), comfortable near VA, contrast sensitivity, colour recognition, recovery from glare and presence or absence of distortion or scotoma in the central 10° of the visual field. Results: The completion rate for the MacDQoL items was 99.8%. Of the 26 items, three were dropped from the measure due to redundancy. A fourth was retained in the questionnaire but excluded when computing the scale score. Principal components analysis and Cronbach's alpha (0.944) supported combining the remaining 22 items in a single scale. Lower MacDQoL scores, indicating more negative impact of MD on QoL, were associated with poorer distance VA (better eye r = -0.431 p < 0.001; worse eye r = -0.350 p < 0.001; binocular vision r = -0.419 p < 0.001) and near VA (better eye r = -0.326 p < 0.001; worse eye r = -0.226 p < 0.001; binocular vision r = -0.326 p < 0.001). Poorer MacDQoL scores were associated with poorer contrast sensitivity (better eye r = 0.392 p < 0.001; binocular vision r = 0.423 p < 0.001), poorer colour recognition (r = 0.417 p < 0.001) and poorer comfortable near VA (r = -0.283, p < 0.001). The MacDQoL differentiated between those with and without binocular scotoma (U = 1244 p < 0.001). Conclusion: The MacDQoL 22-item scale has excellent internal consistency reliability and a single-factor structure.
The measure is acceptable to respondents and the generic QoL item, MD-specific QoL item and average weighted impact score are related to several measures of vision. The MacDQoL demonstrates that MD has considerable negative impact on many aspects of QoL, particularly independence, leisure activities, dealing with personal affairs and mobility. The measure may be valuable for use in clinical trials and routine clinical care. © 2005 Mitchell et al; licensee BioMed Central Ltd.
Abstract:
Exploratory analysis of data in all sciences seeks to find common patterns to gain insights into the structure and distribution of the data. Typically, visualisation methods like principal components analysis are used, but these methods are not easily able to deal with missing data, nor can they capture non-linear structure in the data. One approach to discovering complex, non-linear structure in the data is through the use of linked plots, or brushing, while ignoring the missing data. In this technical report we discuss a complementary approach based on a non-linear probabilistic model. The generative topographic mapping enables the visualisation of the effects of many variables on a single plot, which is able to incorporate far more structure than a two-dimensional principal components plot could, and at the same time deal with missing data. We show that the generative topographic mapping provides an optimal method to explore the data while being able to replace missing values in a dataset, particularly where a large proportion of the data is missing.
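GTM itself is not reproduced here, but the missing-value replacement idea can be illustrated with a simpler model-based stand-in: iterative (EM-style) low-rank PCA imputation, which alternates between fitting a principal subspace and re-filling the missing entries from the model. The data, rank, and missingness rate below are assumptions chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)

# Low-rank synthetic data with 20% of entries removed at random.
W = rng.normal(size=(150, 2)) @ rng.normal(size=(2, 6))
X = W + rng.normal(scale=0.1, size=W.shape)
mask = rng.random(X.shape) < 0.2
X_obs = X.copy()
X_obs[mask] = np.nan

# Iterative rank-2 PCA imputation: start from column means, then
# repeatedly project the filled matrix onto its top-2 principal subspace
# and overwrite only the missing entries with the model's reconstruction.
filled = np.where(mask, np.nanmean(X_obs, axis=0), X_obs)
for _ in range(50):
    mu = filled.mean(0)
    U, s, Vt = np.linalg.svd(filled - mu, full_matrices=False)
    approx = (U[:, :2] * s[:2]) @ Vt[:2] + mu
    filled[mask] = approx[mask]

# How close are the imputed values to the held-out truth?
rmse = np.sqrt(((filled[mask] - X[mask]) ** 2).mean())
print(round(rmse, 3))
```

Because the data really is near rank 2, the imputed entries land close to the truth; GTM plays the analogous model-based role for non-linear structure.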
Abstract:
Today, the data available to tackle many scientific challenges is vast in quantity and diverse in nature. The exploration of heterogeneous information spaces requires suitable mining algorithms as well as effective visual interfaces. miniDVMS v1.8 provides a flexible visual data mining framework which combines advanced projection algorithms developed in the machine learning domain and visual techniques developed in the information visualisation domain. The advantage of this interface is that the user is directly involved in the data mining process. Principled projection methods, such as generative topographic mapping (GTM) and hierarchical GTM (HGTM), are integrated with powerful visual techniques, such as magnification factors, directional curvatures, parallel coordinates, and user interaction facilities, to provide this integrated visual data mining framework. The software also supports conventional visualisation techniques such as principal component analysis (PCA), Neuroscale, and PhiVis. This user manual gives an overview of the purpose of the software tool, highlights some of the issues to consider when creating a new model, and provides information about how to install and use the tool. The user manual does not require the readers to have familiarity with the algorithms it implements. Basic computing skills are enough to operate the software.
Abstract:
This article examines female response to gender role portrayals in advertising in Ukraine and Turkey. As both countries are new potential EU candidates, we argue that gender stereotypes could also be used as a 'barometer' of progress towards more generally accepted EU norms of behaviour towards women. While their histories remain different, both politically and in terms of societal values, both countries currently face constraints that require convergence or justification of practices and understanding. Principal components analysis is employed on 290 questionnaires to identify the underlying dimensions. Results indicate overall similarities in perceptions and fragmentation within groups, but seem to show divergence regarding thresholds.
Abstract:
A visualization plot of a molecular data set is a useful tool for gaining insight into a set of molecules. In chemoinformatics, most visualization plots are of molecular descriptors, and the statistical model most often used to produce a visualization is principal component analysis (PCA). This paper takes PCA, together with four other statistical models (NeuroScale, GTM, LTM, and LTM-LIN), and evaluates their ability to produce clustering in visualizations not of molecular descriptors but of molecular fingerprints. Two different tasks are addressed: understanding structural information (particularly combinatorial libraries) and relating structure to activity. The quality of the visualizations is compared both subjectively (by visual inspection) and objectively (with global distance comparisons and local k-nearest-neighbor predictors). On the data sets used to evaluate clustering by structure, LTM is found to perform significantly better than the other models. In particular, the clusters in LTM visualization space are consistent with the relationships between the core scaffolds that define the combinatorial sublibraries. On the data sets used to evaluate clustering by activity, LTM again gives the best performance but by a smaller margin. The results of this paper demonstrate the value of using both a nonlinear projection map and a Bernoulli noise model for modeling binary data.
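The objective evaluation mentioned above, local k-nearest-neighbour prediction in the 2-D visualization space, can be sketched as follows. The binary "fingerprints" here are synthetic stand-ins (not the paper's data), and PCA is used as the projection since it is one of the benchmarked models:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical binary fingerprints: two activity classes whose first
# eight bits are set with different probabilities.
n, d = 100, 32
class0 = (rng.random((n, d)) < 0.2).astype(float)
class1 = (rng.random((n, d)) < 0.2).astype(float)
class1[:, :8] = (rng.random((n, 8)) < 0.8).astype(float)
X = np.vstack([class0, class1])
y = np.array([0] * n + [1] * n)

# Project to 2-D with PCA (top two right singular vectors).
Z = X - X.mean(0)
_, _, Vt = np.linalg.svd(Z, full_matrices=False)
X2 = Z @ Vt[:2].T

def knn_accuracy(pts, labels, k=5):
    """Leave-one-out k-NN agreement: do a point's neighbours share its label?"""
    dists = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
    np.fill_diagonal(dists, np.inf)           # exclude the point itself
    nbrs = np.argsort(dists, axis=1)[:, :k]
    votes = (labels[nbrs].mean(axis=1) > 0.5).astype(int)
    return (votes == labels).mean()

acc = knn_accuracy(X2, y)
print(round(acc, 2))  # high accuracy = classes stay clustered after projection
</```

A projection that preserves activity clusters scores high on this measure; a projection that mixes the classes scores near 0.5, which is how the paper's local comparison discriminates between models.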
Abstract:
This thesis describes the development of a complete data visualisation system for large tabular databases, such as those commonly found in a business environment. A state-of-the-art 'cyberspace cell' data visualisation technique was investigated and a powerful visualisation system using it was implemented. Although allowing databases to be explored and conclusions drawn, it had several drawbacks, the majority of which were due to the three-dimensional nature of the visualisation. A novel two-dimensional generic visualisation system, known as MADEN, was then developed and implemented, based upon a 2-D matrix of 'density plots'. MADEN allows an entire high-dimensional database to be visualised in one window, while permitting close analysis in 'enlargement' windows. Selections of records can be made and examined, and dependencies between fields can be investigated in detail. MADEN was used as a tool for investigating and assessing many data processing algorithms, firstly data-reducing (clustering) methods, then dimensionality-reducing techniques. These included a new 'directed' form of principal components analysis, several novel applications of artificial neural networks, and discriminant analysis techniques which illustrated how groups within a database can be separated. To illustrate the power of the system, MADEN was used to explore customer databases from two financial institutions, resulting in a number of discoveries which would be of interest to a marketing manager. Finally, the database of results from the 1992 UK Research Assessment Exercise was analysed. Using MADEN allowed both universities and disciplines to be graphically compared, and supplied some startling revelations, including empirical evidence of the 'Oxbridge factor'.
Abstract:
Exploratory analysis of petroleum geochemical data seeks to find common patterns to help distinguish between different source rocks, oils and gases, and to explain their source, maturity and any intra-reservoir alteration. However, at the outset, one is typically faced with (a) a large matrix of samples, each with a range of molecular and isotopic properties, (b) a spatially and temporally unrepresentative sampling pattern, (c) noisy data and (d) often, a large number of missing values. This inhibits analysis using conventional statistical methods. Typically, visualisation methods like principal components analysis are used, but these methods are not easily able to deal with missing data, nor can they capture non-linear structure in the data. One approach to discovering complex, non-linear structure in the data is through the use of linked plots, or brushing, while ignoring the missing data. In this paper we introduce a complementary approach based on a non-linear probabilistic model. Generative topographic mapping enables the visualisation of the effects of many variables on a single plot, while also dealing with missing data. We show how using generative topographic mapping also provides an optimal method with which to replace missing values in two geochemical datasets, particularly where a large proportion of the data is missing.
Abstract:
Guest editorial Ali Emrouznejad is a Senior Lecturer at the Aston Business School in Birmingham, UK. His areas of research interest include performance measurement and management, efficiency and productivity analysis as well as data mining. He has published widely in various international journals. He is an Associate Editor of IMA Journal of Management Mathematics and Guest Editor to several special issues of journals including Journal of Operational Research Society, Annals of Operations Research, Journal of Medical Systems, and International Journal of Energy Management Sector. He is on the editorial board of several international journals and co-founder of Performance Improvement Management Software. William Ho is a Senior Lecturer at the Aston University Business School. Before joining Aston in 2005, he had worked as a Research Associate in the Department of Industrial and Systems Engineering at the Hong Kong Polytechnic University. His research interests include supply chain management, production and operations management, and operations research. He has published extensively in various international journals, including Computers & Operations Research, Engineering Applications of Artificial Intelligence, European Journal of Operational Research, Expert Systems with Applications, International Journal of Production Economics, International Journal of Production Research, and Supply Chain Management: An International Journal. His first authored book was published in 2006. He is an Editorial Board member of the International Journal of Advanced Manufacturing Technology and an Associate Editor of the OR Insight Journal. Currently, he is a Scholar of the Advanced Institute of Management Research.
Uses of frontier efficiency methodologies and multi-criteria decision making for performance measurement in the energy sector This special issue aims to focus on holistic, applied research on performance measurement in energy sector management and for publication of relevant applied research to bridge the gap between industry and academia. After a rigorous refereeing process, seven papers were included in this special issue. The volume opens with five data envelopment analysis (DEA)-based papers. Wu et al. apply the DEA-based Malmquist index to evaluate the changes in relative efficiency and the total factor productivity of coal-fired electricity generation of 30 Chinese administrative regions from 1999 to 2007. Factors considered in the model include fuel consumption, labor, capital, sulphur dioxide emissions, and electricity generated. The authors reveal that the east provinces were relatively and technically more efficient, whereas the west provinces had the highest growth rate in the period studied. Ioannis E. Tsolas applies the DEA approach to assess the performance of Greek fossil fuel-fired power stations taking undesirable outputs into consideration, such as carbon dioxide and sulphur dioxide emissions. In addition, the bootstrapping approach is deployed to address the uncertainty surrounding DEA point estimates, and provide bias-corrected estimations and confidence intervals for the point estimates. The author revealed from the sample that the non-lignite-fired stations are on average more efficient than the lignite-fired stations. Maethee Mekaroonreung and Andrew L. Johnson compare the relative performance of three DEA-based measures, which estimate production frontiers and evaluate the relative efficiency of 113 US petroleum refineries while considering undesirable outputs.
Three inputs (capital, energy consumption, and crude oil consumption), two desirable outputs (gasoline and distillate generation), and an undesirable output (toxic release) are considered in the DEA models. The authors discover that refineries in the Rocky Mountain region performed the best, and about 60 percent of oil refineries in the sample could improve their efficiencies further. H. Omrani, A. Azadeh, S. F. Ghaderi, and S. Abdollahzadeh present an integrated approach, combining DEA, corrected ordinary least squares (COLS), and principal component analysis (PCA) methods, to calculate the relative efficiency scores of 26 Iranian electricity distribution units from 2003 to 2006. Specifically, both DEA and COLS are used to check three internal consistency conditions, whereas PCA is used to verify and validate the final ranking results of either DEA (consistency) or DEA-COLS (non-consistency). Three inputs (network length, transformer capacity, and number of employees) and two outputs (number of customers and total electricity sales) are considered in the model. Virendra Ajodhia applies three DEA-based models to evaluate the relative performance of 20 electricity distribution firms from the UK and the Netherlands. The first model is a traditional DEA model for analyzing cost-only efficiency. The second model includes (inverse) quality by modelling total customer minutes lost as an input. The third model is based on the idea of using total social costs, including the firm's private costs and the interruption costs incurred by consumers, as an input. Both energy delivered and number of consumers are treated as the outputs in the models. After the five DEA papers, Stelios Grafakos, Alexandros Flamos, Vlasis Oikonomou, and D. Zevgolis present a multiple criteria analysis weighting approach to evaluate energy and climate policy.
The proposed approach is akin to the analytic hierarchy process, which consists of pairwise comparisons, consistency verification, and criteria prioritization. In the approach, stakeholders and experts in the energy policy field are incorporated in the evaluation process through an interactive means of verbal, numerical, and visual representation of their preferences. A total of 14 evaluation criteria were considered and classified into four objectives: climate change mitigation, energy effectiveness, socioeconomic impact, and competitiveness and technology. Finally, Borge Hess applies the stochastic frontier analysis approach to analyze the impact of various business strategies, including acquisition, holding structures, and joint ventures, on a firm's efficiency within a sample of 47 natural gas transmission pipelines in the USA from 1996 to 2005. The author finds no significant changes in a firm's efficiency following an acquisition, and only weak evidence of efficiency improvements caused by the new shareholder. Besides, the author discovers that parent companies appear not to influence a subsidiary's efficiency positively. In addition, the analysis shows a negative impact of a joint venture on the technical efficiency of the pipeline company. To conclude, we are grateful to all the authors for their contributions, and to all the reviewers for their constructive comments, which made this special issue possible. We hope that this issue will contribute significantly to performance improvement in the energy sector.
Abstract:
The α-synuclein-immunoreactive pathology of dementia associated with Parkinson disease (DPD) comprises Lewy bodies (LB), Lewy neurites (LN), and Lewy grains (LG). The densities of LB, LN, LG together with vacuoles, neurons, abnormally enlarged neurons (EN), and glial cell nuclei were measured in fifteen cases of DPD. Densities of LN and LG were up to 19 and 70 times those of LB, respectively, depending on region. Densities were significantly greater in amygdala, entorhinal cortex (EC), and sectors CA2/CA3 of the hippocampus, whereas middle frontal gyrus, sector CA1, and dentate gyrus were least affected. Low densities of vacuoles and EN were recorded in most regions. There were differences in the numerical density of neurons between regions, but no statistical difference between patients and controls. In the cortex, the density of LB and vacuoles was similar in upper and lower laminae, while the densities of LN and LG were greater in upper cortex. The densities of LB, LN, and LG were positively correlated. Principal components analysis suggested that DPD cases were heterogeneous with pathology primarily affecting either hippocampus or cortex. The data suggest in DPD: (1) ratio of LN and LG to LB varies between regions, (2) low densities of vacuoles and EN are present in most brain regions, (3) degeneration occurs across cortical laminae, upper laminae being particularly affected, (4) LB, LN and LG may represent degeneration of the same neurons, and (5) disease heterogeneity may result from variation in anatomical pathway affected by cell-to-cell transfer of α-synuclein. © 2013 Springer-Verlag Wien.
Abstract:
Three studies tested the impact of properties of behavioral intention on intention-behavior consistency, information processing, and resistance. Principal components analysis showed that properties of intention formed distinct factors. Study 1 demonstrated that temporal stability, but not the other intention attributes, moderated intention-behavior consistency. Study 2 found that greater stability of intention was associated with improved memory performance. In Study 3, participants were confronted with a rating scale manipulation designed to alter their intention scores. Findings showed that stable intentions were able to withstand attack. Overall, the present research findings suggest that different properties of intention are not simply manifestations of a single underlying construct ("intention strength"), and that temporal stability exhibits superior resistance and impact compared to other intention attributes. © 2013 Wiley Periodicals, Inc.
Abstract:
Geography, retailing, and power are institutionally bound up together. Within these, the authors situate their research in Clegg's work on power. Online shopping offers a growing challenge to the apparent hegemony of the traditional physical retail store format. While novel e-formats appear regularly, blogshops in Singapore are enjoying astonishing success that has taken the large retailers by surprise. Even though there are well-developed theoretical frameworks for understanding the role of institutional entrepreneurs and other major stakeholders in bringing about change and innovation, much less attention has been paid to the role of unorganized, nonstrategic actors, such as blogshops, in catalyzing retail change. The authors explore how blogshops are perceived by consumers and how they challenge the power of other shopping formats. They use principal components analysis to analyze results from a survey of 349 blogshop users. While the results show that blogshops stay true to traditional online shopping attributes, deviations occur on the concept of value. Furthermore, consumer power is counterintuitively found to be strongly present in the areas related to cultural ties, excitement, and the search for individualist novelty (as opposed to mass production), thereby encouraging researchers to think critically about emerging power behavior in media practices.