936 results for qualitative data analysis
Abstract:
We analyze a big data set of geo-tagged tweets collected over one year (Oct. 2013–Oct. 2014) to understand regional linguistic variation in the U.S. Prior work on regional linguistic variation typically required lengthy data collection and focused on either rural or urban areas. Geo-tagged Twitter data offer an unprecedented corpus with rich linguistic representation at fine spatiotemporal resolution and continuity. From the one-year Twitter corpus, we extract lexical characteristics for Twitter users by summarizing the frequencies of a set of lexical alternations that each user has used. We spatially aggregate and smooth each lexical characteristic to derive county-based linguistic variables, from which orthogonal dimensions are extracted using principal component analysis (PCA). Finally, a regionalization method is used to discover hierarchical dialect regions from the PCA components. The regionalization results reveal interesting regional linguistic variation in the U.S. The discovered regions not only confirm past findings in the literature but also provide new insights and a more detailed understanding of very recent linguistic patterns in the U.S.
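For illustration, here is a minimal sketch of the county-level steps described above (lexical variables, PCA, then clustering as a stand-in for regionalization), using synthetic data and scikit-learn; the paper's spatial smoothing and regionalization procedures are not reproduced, and all sizes and names are assumptions.

```python
# Sketch: per-county lexical-alternation frequencies -> PCA -> hierarchical
# clustering as a stand-in for the regionalization step. Data are synthetic.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(0)
n_counties, n_variants = 300, 40                      # hypothetical sizes
county_freqs = rng.random((n_counties, n_variants))   # smoothed lexical variables

# Extract orthogonal dimensions from the county-based linguistic variables.
components = PCA(n_components=10).fit_transform(county_freqs)

# Hierarchical clustering of counties in PCA space (stand-in for regionalization).
regions = AgglomerativeClustering(n_clusters=8).fit_predict(components)
print(np.bincount(regions))                           # counties per discovered "region"
```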
Abstract:
Objective: Recently, much research has applied nature-inspired algorithms to complex machine learning tasks. Ant colony optimization (ACO) is one such algorithm based on swarm intelligence; it is derived from a model inspired by the collective foraging behavior of ants. Taking advantage of ACO traits such as self-organization and robustness, this paper investigates ant-based algorithms for gene expression data clustering and associative classification. Methods and material: An ant-based clustering algorithm (Ant-C) and an ant-based association rule mining algorithm (Ant-ARM) are proposed for gene expression data analysis. The proposed algorithms make use of natural ant behaviors such as cooperation and adaptation to allow a flexible, robust search for good candidate solutions. Results: Ant-C has been tested on three datasets selected from the Stanford Genomic Resource Database and achieved relatively high accuracy compared to other classical clustering methods. Ant-ARM has been tested on the acute lymphoblastic leukemia (ALL)/acute myeloid leukemia (AML) dataset and generated about 30 classification rules with high accuracy. Conclusions: Ant-C can generate an optimal number of clusters without incorporating any other algorithms such as K-means or agglomerative hierarchical clustering. For associative classification, while several well-known algorithms such as Apriori, FP-growth and Magnum Opus are unable to mine any association rules from the ALL/AML dataset within a reasonable period of time, Ant-ARM is able to extract associative classification rules.
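As a rough illustration of the general family of ant-based clustering the paper builds on, the following toy sketch implements a Lumer-Faieta-style pick-up/drop scheme on synthetic data; it is not the authors' Ant-C or Ant-ARM algorithm, and all parameters and thresholds are assumptions.

```python
# Toy ant-based clustering: an ant picks up isolated items and drops them
# where similar items are already nearby, so similar items end up co-located.
import numpy as np

rng = np.random.default_rng(1)
n_items, grid = 60, 25
data = np.vstack([rng.normal(0, 1, (30, 4)),
                  rng.normal(4, 1, (30, 4))])          # two synthetic groups
pos = rng.integers(0, grid, (n_items, 2))              # item positions on the grid
carried = None                                          # item the ant is carrying

def local_density(i, radius=3, alpha=2.0):
    """Similarity of item i to items placed nearby on the grid."""
    near = np.where(np.all(np.abs(pos - pos[i]) <= radius, axis=1))[0]
    near = near[near != i]
    if near.size == 0:
        return 0.0
    d = np.linalg.norm(data[near] - data[i], axis=1)
    return max(0.0, float(np.mean(1.0 - d / alpha)))

for step in range(20000):
    if carried is None:
        i = int(rng.integers(n_items))
        if rng.random() < (0.1 / (0.1 + local_density(i))) ** 2:   # pick up isolated items
            carried = i
    else:
        pos[carried] = rng.integers(0, grid, 2)                     # wander, then maybe drop
        f = local_density(carried)
        if rng.random() < (f / (0.3 + f)) ** 2:
            carried = None

print(pos[:5])   # items from similar groups tend to end up spatially close
```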
Abstract:
Due to copyright restrictions, only available for consultation at Aston University Library and Information Services with prior arrangement.
Abstract:
A recent, novel approach to the visualisation and analysis of datasets, one which is particularly applicable to high-dimensional data, is discussed in the context of real applications. A feed-forward neural network is utilised to effect a topographic, structure-preserving, dimension-reducing transformation of the data, with an additional facility to incorporate different degrees of associated subjective information. The properties of this transformation are illustrated on synthetic and real datasets, including the 1992 UK Research Assessment Exercise for funding in higher education. The method is compared and contrasted with established techniques for feature extraction, and related to topographic mappings, the Sammon projection and the statistical field of multidimensional scaling.
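For a concrete point of comparison, here is a brief sketch of the classical technique mentioned above (multidimensional scaling) applied to synthetic data with scikit-learn; it is a stand-in for the neural-network mapping, not its implementation.

```python
# Classical metric MDS: project high-dimensional data to 2-D while trying to
# preserve pairwise distances (a structure-preserving, topographic projection).
import numpy as np
from sklearn.manifold import MDS

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))                  # hypothetical high-dimensional dataset

embedding = MDS(n_components=2, random_state=0).fit_transform(X)
print(embedding.shape)                          # (100, 2)
```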
Abstract:
Consideration of the influence of test technique and data analysis method is important for data comparison and design purposes. The paper highlights the effects of replication interval, crack growth rate averaging and curve-fitting procedures on crack growth rate results for a Ni-base alloy. It is shown that an upper-bound crack growth rate line is not appropriate for use in fatigue design, and that the derivative of a quadratic fit to the a vs N data looks promising. However, this type of averaging, or curve fitting, is not useful in developing an understanding of microstructure/crack tip interactions. For this purpose, simple replica-to-replica growth rate calculations are preferable.
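A minimal sketch of the approach highlighted above, fitting a quadratic to a vs N data and differentiating it to estimate the growth rate da/dN, alongside simple point-to-point (replica-to-replica) rates; the crack-length data are synthetic and no alloy-specific values are reproduced.

```python
# Quadratic fit to crack length a versus cycles N, then analytic derivative
# as a smoothed estimate of da/dN; compared with raw point-to-point rates.
import numpy as np

N = np.linspace(0, 1e5, 30)                        # load cycles (synthetic)
a = 1.0 + 2e-5 * N + 5e-11 * N**2                  # hypothetical crack length, mm

coeffs = np.polyfit(N, a, 2)                       # a(N) = c2*N^2 + c1*N + c0
dadN_fit = np.polyval(np.polyder(coeffs), N)       # smoothed growth rate da/dN

dadN_raw = np.diff(a) / np.diff(N)                 # replica-to-replica rates
print(dadN_fit[:3], dadN_raw[:3])
```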
Abstract:
Developers of interactive software are confronted by an increasing variety of software tools to help engineer the interactive aspects of software applications. Not only do these tools fall into different categories in terms of functionality, but within each category there is a growing number of competing tools with similar, although not identical, features. Choice of user interface development tool (UIDT) is therefore becoming increasingly complex.
Abstract:
Due to incomplete paperwork, only available for consultation at Aston University Library and Information Services with prior arrangement.
Abstract:
The study builds on the premise that the higher the level of trustworthiness in a given business relationship, the more likely it is that high-risk activities take place within it. In such cases trustworthiness becomes a governance mechanism for the events and actions in the relationship, and trust, interpreted as a willingness to act, appears in the business relationship. The study draws attention to the difference between the concepts of trust and trustworthiness and to the importance of systematically separating them. It presents an application of so-called dyadic data analysis in business research. Its empirical results confirm that this method allows a deeper analysis of the social characteristics of business relationships (among them trust) and of the connections between them. ____ The paper rests on the behavioral interpretation of trust, making a clear distinction between trustworthiness (honesty) and trust interpreted as willingness to engage in risky situations with specific partners. The hypothesis tested is that in a business relation marked by high levels of trustworthiness as perceived by the opposite parties, willingness to be involved in risky situations is higher than it is in relations where actors do not believe their partners to be highly trustworthy. Testing this hypothesis clearly calls for dyadic operationalization, measurement, and analysis. The authors present the first economic application of a newly developed statistical technique called dyadic data analysis, which has already been applied in social psychology. It clearly overcomes the problem of single-ended research in business relations analysis and allows a deeper understanding of any dyadic phenomenon, including trust/trustworthiness as a governance mechanism.
Abstract:
With the latest developments in computer science, multivariate data analysis methods have become increasingly popular among economists. Pattern recognition in complex economic data and empirical model construction can be more straightforward with proper application of modern software. However, despite the appealing simplicity of some popular software packages, the interpretation of data analysis results requires strong theoretical knowledge. This book aims at combining the development of both theoretical and application-related data analysis knowledge. The text is designed for advanced-level studies and assumes acquaintance with elementary statistical terms. After a brief introduction to selected mathematical concepts, the highlighting of selected model features is followed by a practice-oriented introduction to the interpretation of SPSS outputs for the described data analysis methods. Learning data analysis is usually time-consuming and requires effort, but with tenacity the learning process can bring about a significant improvement in individual data analysis skills.
Abstract:
Background: Autism Spectrum Disorder (ASD) is a major global health challenge, as the majority of individuals with ASD live in low- and middle-income countries (LMICs) and receive little to no services or support from health or social care systems. Despite this global crisis, the development and validation of ASD interventions have occurred almost exclusively in high-income countries, leaving many unanswered questions regarding what contextual factors would need to be considered to ensure the effectiveness of interventions in LMICs. This study sought to conduct exploratory research on the contextual adaptation of a caregiver-mediated early ASD intervention for use in a low-resource setting in South Africa.
Methods: Participants included 22 caregivers of children with autism, including mothers (n=16), fathers (n=4), and grandmothers (n=2). Four focus group discussions, lasting between 1.5 and 3.5 hours, were conducted with caregivers in Cape Town, South Africa. Data were recorded, translated, and transcribed by research personnel, then coded for emerging themes and analyzed using the NVivo qualitative data analysis software package.
Results: Nine contextual factors were reported to be important for the adaptation process, including culture, language, location of treatment, cost of treatment, type of service provider, familial needs, length of treatment, support, and parenting practices. One contextual factor, evidence-based treatment, was reported by caregivers as both important and unimportant for adaptation. The contextual factor of stigma was identified as an emerging theme and a particularly relevant challenge when developing an ASD intervention for use in a South African context.
Conclusions: Eleven contextual factors were discussed in detail by caregivers and examples were given regarding the challenges, sources, and preferences related to the contextual adaptation of a parent-mediated early ASD intervention in South Africa. Caregivers reported a preference for an affordable, in-home, individualized early ASD intervention, where they have an active voice in shaping treatment goals. Distrust of community-based nurses and health workers to deliver an early ASD intervention and challenges associated with ASD-based stigma were two unanticipated findings from this data set. Implications for practice and further research are discussed.
Abstract:
A substantial amount of information on the Internet is present in the form of text. The value of this semi-structured and unstructured data has been widely acknowledged, with consequent scientific and commercial exploitation. The ever-increasing volume of data production, however, pushes data analytic platforms to their limits. This thesis proposes techniques for more efficient textual big data analysis suitable for the Hadoop analytic platform. The research explores the direct processing of compressed textual data, focusing on novel compression methods with properties that support text-based big data analysis in distributed environments. The novel contributions of this work include the following. Firstly, a Content-aware Partial Compression (CaPC) scheme is developed. CaPC distinguishes between informational and functional content and compresses only the informational content, so the compressed data remain transparent to existing software libraries, which often rely on functional content to work. Secondly, a context-free, bit-oriented compression scheme (Approximated Huffman Compression) based on the Huffman algorithm is developed. It uses a hybrid data structure that allows pattern searching in compressed data in linear time. Thirdly, several modern compression schemes have been extended so that the compressed data can be safely split with respect to logical data records in distributed file systems. Furthermore, an innovative two-layer compression architecture is used, in which each compression layer is appropriate for the corresponding stage of data processing. Peripheral libraries are developed that seamlessly link the proposed compression schemes to existing analytic platforms and computational frameworks, and also make the use of compressed data transparent to developers. The compression schemes have been evaluated on a number of standard MapReduce analysis tasks using a collection of real-world datasets. In comparison with existing solutions, they show substantial improvements in performance and significant reductions in system resource requirements.
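For context, a minimal sketch of classical Huffman coding, the algorithm underlying the Approximated Huffman Compression scheme described above; this is a textbook illustration, not the thesis's hybrid data structure or implementation, and the example text is invented.

```python
# Build a prefix-free Huffman code table and encode a short string with it.
import heapq
from collections import Counter

def huffman_codes(text):
    """Return {symbol: bitstring} built from symbol frequencies in text."""
    heap = [[freq, i, sym] for i, (sym, freq) in enumerate(Counter(text).items())]
    heapq.heapify(heap)
    codes = {sym: "" for _, _, sym in heap}
    if len(heap) == 1:                        # degenerate single-symbol input
        codes[heap[0][2]] = "0"
    i = len(heap)
    while len(heap) > 1:
        lo, hi = heapq.heappop(heap), heapq.heappop(heap)
        for sym in lo[2]:                     # prepend a bit to every merged symbol
            codes[sym] = "0" + codes[sym]
        for sym in hi[2]:
            codes[sym] = "1" + codes[sym]
        heapq.heappush(heap, [lo[0] + hi[0], i, lo[2] + hi[2]])
        i += 1
    return codes

text = "big data analysis on hadoop"
table = huffman_codes(text)
encoded = "".join(table[c] for c in text)
print(len(encoded), "bits vs", 8 * len(text), "bits uncompressed")
```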
Abstract:
In the last several years there has been an increase in the amount of qualitative research using in-depth interviews and comprehensive content analyses in sport psychology. However, no explicit method has been provided to deal with the large amount of unstructured data. This article provides common guidelines for organizing and interpreting unstructured data. Two main operations are suggested and discussed: first, coding meaningful text segments, or creating tags, and second, regrouping similar text segments, or creating categories. Furthermore, software programs for the microcomputer are presented as a way to facilitate the organization and interpretation of qualitative data.
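A toy illustration of the two operations described above: tagging meaningful text segments, then regrouping tags into higher-level categories. The segments, tags, and categories here are invented for the example.

```python
# Tag text segments, then regroup segments under categories via their tags.
segments = {
    "I felt nervous before the race": ["pre-competition anxiety"],
    "Coach's feedback kept me focused": ["coach support"],
    "My heart was pounding at the start line": ["pre-competition anxiety"],
}

categories = {
    "emotional states": ["pre-competition anxiety"],
    "social resources": ["coach support"],
}

grouped = {cat: [seg for seg, tags in segments.items() if set(tags) & set(tag_list)]
           for cat, tag_list in categories.items()}
print(grouped)
```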
Abstract:
This paper is part of a special issue of Applied Geochemistry focusing on reliable applications of compositional multivariate statistical methods. This study outlines the application of compositional data analysis (CoDa) to the calibration of geochemical data and to multivariate statistical modelling of geochemistry and grain-size data from a set of Holocene sedimentary cores from the Ganges-Brahmaputra (G-B) delta. Over the last two decades, understanding near-continuous records of sedimentary sequences has required the use of core-scanning X-ray fluorescence (XRF) spectrometry for both terrestrial and marine sedimentary sequences. Initial XRF data are generally unusable in ‘raw’ format and require processing to remove instrument bias, as well as informed sequence interpretation. The applicability of conventional calibration equations to core-scanning XRF data is further limited by the constraints posed by unknown measurement geometry and specimen homogeneity, as well as matrix effects. Log-ratio based calibration schemes have been developed and applied to clastic sedimentary sequences, focusing mainly on energy-dispersive XRF (ED-XRF) core-scanning. This study applied high-resolution core-scanning XRF to Holocene sedimentary sequences from the tide-dominated Indian Sundarbans (Ganges-Brahmaputra delta plain). The Log-Ratio Calibration Equation (LRCE) was applied to a sub-set of core-scan and conventional ED-XRF data to quantify elemental composition, providing a robust calibration scheme using reduced major axis regression of log-ratio transformed geochemical data. Through partial least squares (PLS) modelling of geochemical and grain-size data, it is possible to derive robust proxy information for the Sundarbans depositional environment. The application of these techniques to Holocene sedimentary data offers an improved methodological framework for unravelling Holocene sedimentation patterns.
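As a hedged sketch of the log-ratio based calibration idea described above: a centred log-ratio (clr) transform of compositional counts followed by reduced major axis regression against conventional measurements, on synthetic data; this is not the study's LRCE implementation, and all values are assumptions.

```python
# clr transform of compositional XRF counts, then reduced major axis (RMA)
# regression of conventional values on the clr scores for one element.
import numpy as np

rng = np.random.default_rng(0)
counts = rng.uniform(1, 100, size=(50, 5))            # hypothetical core-scan counts

# Centred log-ratio transform (each row is a composition).
clr = np.log(counts) - np.log(counts).mean(axis=1, keepdims=True)

x = clr[:, 0]
y = 2.0 * x + rng.normal(0, 0.2, size=x.size)          # hypothetical conventional ED-XRF values

# RMA slope = sign(r) * sd(y) / sd(x); intercept passes through the means.
r = np.corrcoef(x, y)[0, 1]
slope = np.sign(r) * y.std(ddof=1) / x.std(ddof=1)
intercept = y.mean() - slope * x.mean()
print(round(float(slope), 3), round(float(intercept), 3))
```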
Abstract:
Qualitative Comparative Analysis (QCA) is a method for the systematic analysis of cases. A holistic view of cases and an approach to causality that emphasizes complexity are among its core features. Over the last decades, QCA has found application in many fields of the social sciences. In spite of this, its uptake in feminist research has been slower, and only recently has QCA been applied to topics related to social care, the political representation of women, and reproductive politics. Despite the comparative turn in feminist studies, researchers still privilege qualitative methods, in particular case studies, and are often skeptical of quantitative techniques (Spierings 2012). These studies show that the meaning and measurement of many gender concepts differ across countries and that the factors leading to feminist success and failure are context-specific. However, case study analyses struggle to systematically account for the ways in which these forces operate in different locations.