37 results for Qualitative data analysis software
Abstract:
Exploratory data analysis seeks to find common patterns to gain insights into the structure and distribution of the data. In geochemistry it is a valuable means of gaining insight into the complicated processes making up a petroleum system. Typically, linear visualisation methods such as principal components analysis, linked plots, or brushing are used. These methods cannot be employed directly when data are missing, and while they can capture non-linear structure in the data locally, they struggle to do so globally. This thesis discusses a complementary approach based on a non-linear probabilistic model. The generative topographic mapping (GTM) enables the visualisation of the effects of many variables on a single plot, which can incorporate more structure than a two-dimensional principal components plot. The model can deal with uncertainty and missing data, and allows for the exploration of non-linear structure in the data. In this thesis a novel approach to initialising the GTM with arbitrary projections is developed. This makes it possible to combine GTM with algorithms such as Isomap and to fit complex non-linear structures such as the Swiss roll. Another novel extension is the incorporation of prior knowledge about the structure of the covariance matrix. This extension greatly enhances the modelling capabilities of the algorithm, resulting in a better fit to the data and better imputation of missing values. Additionally, an extensive benchmark study of the missing-data imputation capabilities of GTM is performed. Further, a novel approach based on missing data is introduced to benchmark the fit of probabilistic visualisation algorithms on unlabelled data. Finally, the work is complemented by evaluating the algorithms on real-life datasets from geochemical projects.
Abstract:
We analyze a Big Data set of geo-tagged tweets spanning one year (Oct. 2013–Oct. 2014) to understand regional linguistic variation in the U.S. Prior work on regional linguistic variation usually took a long time to collect data and focused on either rural or urban areas. Geo-tagged Twitter data offers an unprecedented database with rich linguistic representation at fine spatiotemporal resolution and continuity. From the one-year Twitter corpus, we extract lexical characteristics for Twitter users by summarizing the frequencies of a set of lexical alternations that each user has used. We spatially aggregate and smooth each lexical characteristic to derive county-based linguistic variables, from which orthogonal dimensions are extracted using principal component analysis (PCA). Finally, a regionalization method is used to discover hierarchical dialect regions from the PCA components. The regionalization results reveal interesting regional linguistic variation in the U.S. The discovered regions not only confirm past research findings in the literature but also provide new insights and a more detailed understanding of very recent linguistic patterns in the U.S.
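The dimension-extraction step of the pipeline (centre the county-based linguistic variables, then project onto orthogonal principal components) can be sketched with a minimal NumPy-only PCA. The matrix below is random stand-in data, not the Twitter-derived variables, and the function name is an assumption for illustration:

```python
import numpy as np

def pca_components(X, k):
    """Project rows of X onto the top-k principal components.

    X: (n_counties, n_lexical_vars) matrix of smoothed lexical frequencies.
    Returns the (n_counties, k) scores on orthogonal dimensions.
    """
    Xc = X - X.mean(axis=0)             # centre each linguistic variable
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                # scores on the first k components

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 12))          # 100 counties, 12 lexical variables
scores = pca_components(X, k=3)
print(scores.shape)                     # (100, 3)
```

The resulting score columns are mutually orthogonal, which is what makes them suitable inputs for a subsequent regionalization or clustering step.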
Abstract:
Objective: Recently, much research has proposed using nature-inspired algorithms to perform complex machine learning tasks. Ant colony optimization (ACO) is one such algorithm; it is based on swarm intelligence and derived from a model inspired by the collective foraging behavior of ants. Taking advantage of ACO traits such as self-organization and robustness, this paper investigates ant-based algorithms for gene expression data clustering and associative classification. Methods and material: An ant-based clustering algorithm (Ant-C) and an ant-based association rule mining algorithm (Ant-ARM) are proposed for gene expression data analysis. The proposed algorithms make use of the natural behavior of ants, such as cooperation and adaptation, to allow for a flexible, robust search for a good candidate solution. Results: Ant-C has been tested on three datasets selected from the Stanford Genomic Resource Database and achieved relatively high accuracy compared to other classical clustering methods. Ant-ARM has been tested on the acute lymphoblastic leukemia (ALL)/acute myeloid leukemia (AML) dataset and generated about 30 classification rules with high accuracy. Conclusions: Ant-C can generate an optimal number of clusters without incorporating any other algorithm such as K-means or agglomerative hierarchical clustering. For associative classification, while several well-known algorithms such as Apriori, FP-growth and Magnum Opus are unable to mine any association rules from the ALL/AML dataset within a reasonable period of time, Ant-ARM is able to extract associative classification rules.
Abstract:
DUE TO COPYRIGHT RESTRICTIONS ONLY AVAILABLE FOR CONSULTATION AT ASTON UNIVERSITY LIBRARY AND INFORMATION SERVICES WITH PRIOR ARRANGEMENT
Abstract:
A recent novel approach to the visualisation and analysis of datasets, and one which is particularly applicable to those of a high dimension, is discussed in the context of real applications. A feed-forward neural network is utilised to effect a topographic, structure-preserving, dimension-reducing transformation of the data, with an additional facility to incorporate different degrees of associated subjective information. The properties of this transformation are illustrated on synthetic and real datasets, including the 1992 UK Research Assessment Exercise for funding in higher education. The method is compared and contrasted to established techniques for feature extraction, and related to topographic mappings, the Sammon projection and the statistical field of multidimensional scaling.
Abstract:
Consideration of the influence of test technique and data analysis method is important for data comparison and design purposes. The paper highlights the effects of replication interval, crack growth rate averaging and curve-fitting procedures on crack growth rate results for a Ni-base alloy. It is shown that an upper bound crack growth rate line is not appropriate for use in fatigue design, and that the derivative of a quadratic fit to the a vs N data looks promising. However, this type of averaging, or curve fitting, is not useful in developing an understanding of microstructure/crack tip interactions. For this purpose, simple replica-to-replica growth rate calculations are preferable. © 1988.
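The quadratic-fit averaging that the paper finds promising (fit a vs N with a quadratic, then differentiate to get a smoothed da/dN) can be sketched as follows. The crack-length law and cycle counts below are invented for illustration, not the paper's Ni-base alloy data:

```python
import numpy as np

# Synthetic crack-length readings a (mm) at cycle counts N, mimicking
# replica measurements; the "true" law here is a = 5.0 + 0.01*N + 1e-6*N**2.
N = np.linspace(0.0, 1000.0, 21)
a = 5.0 + 0.01 * N + 1e-6 * N**2

# Fit a quadratic a(N) = c2*N^2 + c1*N + c0 and differentiate it to obtain
# the smoothed growth rate da/dN = 2*c2*N + c1 (the averaging favoured for
# design data, as opposed to point estimates).
c2, c1, c0 = np.polyfit(N, a, 2)
dadN = 2 * c2 * N + c1

# Simple replica-to-replica (secant) estimates, preferable when studying
# microstructure/crack tip interactions:
dadN_secant = np.diff(a) / np.diff(N)
```

The contrast between the smooth `dadN` curve and the noisier secant estimates is exactly the trade-off the paper discusses: averaging suits design curves, point-to-point rates suit mechanistic interpretation.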
Abstract:
DUE TO INCOMPLETE PAPERWORK, ONLY AVAILABLE FOR CONSULTATION AT ASTON UNIVERSITY LIBRARY AND INFORMATION SERVICES WITH PRIOR ARRANGEMENT
Abstract:
This is a multiple case study of the leadership language of three senior women working in a large corporation in Bahrain. The study’s main aim is to explore the linguistic practices the women leaders use with their colleagues and subordinates in corporate meetings. Adopting a Foucauldian (1972) notion of ‘discourses’ as social practices and a view of gender as socially constructed and discursively performed (Butler 1990), this research aims to unveil the competing discourses which may shape the leadership language of senior women in their communities of practice. The research is situated within the broader field of Sociolinguistics and the specific field of Language and Gender. To address the research aim, a case study approach incorporating multiple methods of qualitative data collection (observation, interviews, and shadowing) was utilised to gather information about the three women leaders and produce a rich description of their use of language in and out of meeting contexts. For analysis, principles of Qualitative Data Analysis (QDA) were used to organise and sort the large amount of data. Feminist Post-Structuralist Discourse Analysis (FPDA) was also adopted to produce a multi-faceted analysis of the subjects, their leadership language, power relations, and competing discourses in the context. It was found that the three senior women enact leadership differently, making variable use of a repertoire of conventionally masculine and feminine linguistic practices. However, they all appear to have limited language resources and even more limiting subject positions, and they all have to exercise considerable linguistic expertise to police and modify their language in order to avoid the ‘double bind’. Yet the extent of these limitations and constraints depends on the community of practice with its prevailing discourses, which appear to have their roots in Islamic and cultural practices as well as some Western influences acquired throughout the company’s history.
It is concluded that it may be particularly challenging for Middle Eastern women to achieve any degree of equality with men in the workplace because discourses of gender difference lie at the core of Islamic teaching and ideology.
Abstract:
The availability of a regular power supply has been identified as one of the major stimulants for the growth and development of any nation and is thus important for its economic well-being. The problems of the Nigerian power sector stem from many factors, culminating in slow developmental growth and an inability to meet the power demands of the nation's citizens, regardless of the abundance of human and natural resources in the country. The main aim of the research was therefore to investigate the importance and contributions of risk management to the success of projects specific to the power sector. To achieve this aim it was pertinent to examine the efficacy of the risk management process in practice, elucidate the various risks typically associated with projects in the power sector (construction, contractual, political, financial, design, human resource and environmental risk factors), and determine the current state of risk management practice in Nigeria. To address these factors, which have been subject to only limited in-depth academic research, a rigorous mixed research method (quantitative and qualitative data analysis) was adopted. A review of the Nigerian power sector was also carried out as a precursor to the data collection stage. Using a purposive sampling technique, respondents were identified and a questionnaire survey was administered. The research hypotheses were tested using inferential statistics (Pearson correlation, Chi-square test, t-test and ANOVA), and the findings revealed the need for the development of a new risk management implementation framework.
The proposed framework was tested within a company project to interpret the dynamism and essential benefits of risk management, with the aim of improving project performance (time), reducing the level of fragmentation (quality) and improving profitability (cost) within the Nigerian power sector, in order to bridge the gap between theory and practice. It was concluded that Nigeria's poor risk management practices have prevented the sector from experiencing strong growth and development. The study concludes, however, that successful implementation of the developed risk management framework may help it attain this status by making it more prepared and flexible in facing the challenges that previously led to project failures, thus contributing to its prosperity. The research provides an original contribution, theoretically, methodologically and practically, which adds to the project risk management body of knowledge and to the Nigerian power sector.
Abstract:
This article is aimed primarily at eye care practitioners who are undertaking advanced clinical research, and who wish to apply analysis of variance (ANOVA) to their data. ANOVA is a data analysis method of great utility and flexibility. This article describes why and how ANOVA was developed, the basic logic which underlies the method and the assumptions that the method makes for it to be validly applied to data from clinical experiments in optometry. The application of the method to the analysis of a simple data set is then described. In addition, the methods available for making planned comparisons between treatment means and for making post hoc tests are evaluated. The problem of determining the number of replicates or patients required in a given experimental situation is also discussed. Copyright (C) 2000 The College of Optometrists.
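The basic logic the article describes (partition total variability into between-treatment and within-treatment sums of squares, then form the F ratio) can be illustrated in a few lines. The three groups below are hypothetical scores invented for the sketch, not clinical data from the article:

```python
# One-way ANOVA by hand: partition variability into between-treatment and
# within-treatment sums of squares, then form the F ratio.
groups = [
    [1.0, 2.0, 3.0],   # e.g. scores under treatment A (hypothetical)
    [2.0, 3.0, 4.0],   # treatment B
    [4.0, 5.0, 6.0],   # treatment C
]

n_total = sum(len(g) for g in groups)
grand_mean = sum(sum(g) for g in groups) / n_total

# Between-treatment sum of squares: group means around the grand mean.
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
# Within-treatment sum of squares: observations around their group mean.
ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)

df_between = len(groups) - 1       # k - 1 treatments
df_within = n_total - len(groups)  # N - k residual degrees of freedom
F = (ss_between / df_between) / (ss_within / df_within)
print(F)  # 7.0 for this data
```

A large F indicates that variation between treatment means exceeds what the within-group scatter would predict; the observed F is then referred to the F distribution with (df_between, df_within) degrees of freedom.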
Abstract:
Software development methodologies are becoming increasingly abstract, progressing from low level assembly and implementation languages such as C and Ada, to component based approaches that can be used to assemble applications using technologies such as JavaBeans and the .NET framework. Meanwhile, model driven approaches emphasise the role of higher level models and notations, and embody a process of automatically deriving lower level representations and concrete software implementations. The relationship between data and software is also evolving. Modern data formats are becoming increasingly standardised, open and empowered in order to support a growing need to share data in both academia and industry. Many contemporary data formats, most notably those based on XML, are self-describing, able to specify valid data structure and content, and can also describe data manipulations and transformations. Furthermore, while applications of the past have made extensive use of data, the runtime behaviour of future applications may be driven by data, as demonstrated by the field of dynamic data driven application systems. The combination of empowered data formats and high level software development methodologies forms the basis of modern game development technologies, which drive software capabilities and runtime behaviour using empowered data formats describing game content. While low level libraries provide optimised runtime execution, content data is used to drive a wide variety of interactive and immersive experiences. This thesis describes the Fluid project, which combines component based software development and game development technologies in order to define novel component technologies for the description of data driven component based applications. The thesis makes explicit contributions to the fields of component based software development and visualisation of spatiotemporal scenes, and also describes potential implications for game development technologies. 
The thesis also proposes a number of developments in dynamic data driven application systems in order to further empower the role of data in this field.
Abstract:
This book is aimed primarily at microbiologists who are undertaking research and who require a basic knowledge of statistics to analyse their experimental data. Computer software employing a wide range of data analysis methods is widely available to experimental scientists. The availability of this software, however, makes it essential that investigators understand the basic principles of statistics. Statistical analysis of data can be complex with many different methods of approach, each of which applies in a particular experimental circumstance. Hence, it is possible to apply an incorrect statistical method to data and to draw the wrong conclusions from an experiment. The purpose of this book, which has its origin in a series of articles published in the Society for Applied Microbiology journal ‘The Microbiologist’, is an attempt to present the basic logic of statistics as clearly as possible and therefore, to dispel some of the myths that often surround the subject. The 28 ‘Statnotes’ deal with various topics that are likely to be encountered, including the nature of variables, the comparison of means of two or more groups, non-parametric statistics, analysis of variance, correlating variables, and more complex methods such as multiple linear regression and principal components analysis. In each case, the relevant statistical method is illustrated with examples drawn from experiments in microbiological research. The text incorporates a glossary of the most commonly used statistical terms and there are two appendices designed to aid the investigator in the selection of the most appropriate test.
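As an illustration of one of the Statnote topics listed (correlating variables), a Pearson correlation can be computed from first principles. The temperature and colony-count figures below are hypothetical, chosen only to mimic the kind of microbiological data the book uses:

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: incubation temperature (°C) vs bacterial colony count.
temp = [20, 25, 30, 35, 40]
count = [12, 17, 25, 31, 38]
r = pearson_r(temp, count)
```

Here `r` is close to +1, indicating a strong positive linear association; as the book stresses, choosing this statistic (rather than, say, a non-parametric rank correlation) is itself a decision that depends on the experimental circumstance.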
Abstract:
Objectives: We explored the perceptions, views and experiences of diabetes education in people with type 2 diabetes who were participating in a UK randomized controlled trial of methods of education. The intervention arm of the trial was based on DESMOND, a structured programme of group education sessions aimed at enabling self-management of diabetes, while the standard arm was usual care from general practices. Methods: Individual semi-structured interviews were conducted with 36 adult patients, of whom 19 had attended DESMOND education sessions and 17 had been randomized to receive usual care. Data analysis was based on the constant comparative method. Results: Four principal orientations towards diabetes and its management were identified: ‘resisters’, ‘identity resisters, consequence accepters’, ‘identity accepters, consequence resisters’ and ‘accepters’. Participants offered varying accounts of the degree of personal responsibility that needed to be assumed in response to the diagnosis. Preferences for different styles of education were also expressed, with many reporting that they enjoyed and benefited from group education, although some reported ambivalence or disappointment with their experiences of education. It was difficult to identify striking thematic differences between accounts of people on different arms of the trial, although there was some very tentative evidence that those who attended DESMOND were more accepting of a changed identity and its implications for their management of diabetes. Discussion: No single approach to education is likely to suit all people newly diagnosed with diabetes, although structured group education may suit many. This paper identifies varying orientations and preferences of people with diabetes towards forms of both education and self-management, which should be taken into account when planning approaches to education.