880 results for selection methods
Abstract:
The research objectives of this thesis were to contribute to Bayesian statistical methodology, specifically to risk assessment methodology and to spatial and spatio-temporal methodology, by modelling error structures using complex hierarchical models. Specifically, I hoped to consider two applied areas, and use these applications as a springboard for developing new statistical methods as well as undertaking analyses which might give answers to particular applied questions. Thus, this thesis considers a series of models, firstly in the context of risk assessments for recycled water, and secondly in the context of water usage by crops. The research objective was to model error structures using hierarchical models in two problems, namely risk assessment analyses for wastewater, and secondly, in a four-dimensional dataset, assessing differences between cropping systems over time and over three spatial dimensions. The aim was to use the simplicity and insight afforded by Bayesian networks to develop appropriate models for risk scenarios, and again to use Bayesian hierarchical models to explore the necessarily complex modelling of four-dimensional agricultural data. The specific objectives of the research were to develop a method for the calculation of credible intervals for the point estimates of Bayesian networks; to develop a model structure to incorporate all the experimental uncertainty associated with various constants thereby allowing the calculation of more credible credible intervals for a risk assessment; to model a single day’s data from the agricultural dataset which satisfactorily captured the complexities of the data; to build a model for several days’ data, in order to consider how the full data might be modelled; and finally to build a model for the full four-dimensional dataset and to consider the time-varying nature of the contrast of interest, having satisfactorily accounted for possible spatial and temporal autocorrelations.
This work forms five papers, two of which have been published, with two submitted, and the final paper still in draft. The first two objectives were met by recasting the risk assessments as directed acyclic graphs (DAGs). In the first case, we elicited uncertainty for the conditional probabilities needed by the Bayesian net, incorporated these into a corresponding DAG, and used Markov chain Monte Carlo (MCMC) to find credible intervals for all the scenarios and outcomes of interest. In the second case, we incorporated the experimental data underlying the risk assessment constants into the DAG, and also treated some of that data as needing to be modelled as an ‘errors-in-variables’ problem [Fuller, 1987]. This illustrated a simple method for the incorporation of experimental error into risk assessments. In considering one day of the three-dimensional agricultural data, it became clear that geostatistical models or conditional autoregressive (CAR) models over the three dimensions were not the best way to approach the data. Instead, CAR models are used with neighbours only in the same depth layer. This gave flexibility to the model, allowing both the spatially structured and non-structured variances to differ at all depths. We call this model the CAR layered model. Given the experimental design, the fixed part of the model could have been modelled as a set of means by treatment and by depth, but doing so allows little insight into how the treatment effects vary with depth. Hence, a number of essentially non-parametric approaches were taken to see the effects of depth on treatment, with the model of choice incorporating an errors-in-variables approach for depth in addition to a non-parametric smooth. The statistical contribution here was the introduction of the CAR layered model; the applied contribution was the analysis of moisture over depth and estimation of the contrast of interest together with its credible intervals.
These models were fitted using WinBUGS [Lunn et al., 2000]. The work in the fifth paper deals with the fact that with large datasets, the use of WinBUGS becomes more problematic because of its highly correlated term-by-term updating. In this work, we introduce a Gibbs sampler with block updating for the CAR layered model. The Gibbs sampler was implemented by Chris Strickland using pyMCMC [Strickland, 2010]. This framework is then used to consider five days’ data, and we show that moisture in the soil for all the various treatments reaches levels particular to each treatment at a depth of 200 cm and thereafter stays constant, albeit with increasing variances with depth. In an analysis across three spatial dimensions and across time, there are many interactions of time and the spatial dimensions to be considered. Hence, we chose to use a daily model and to repeat the analysis at all time points, effectively creating an interaction model of time by the daily model. Such an approach allows great flexibility. However, this approach does not allow insight into the way in which the parameter of interest varies over time. Hence, a two-stage approach was also used, with estimates from the first stage being analysed as a set of time series. We see this spatio-temporal interaction model as being a useful approach to data measured across three spatial dimensions and time, since it does not assume additivity of the random spatial or temporal effects.
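The layering idea in this abstract, in which CAR neighbours are restricted to the same depth layer so that each depth can have its own variance, can be illustrated with a minimal sketch. It assumes a regular grid, rook (edge-sharing) adjacency within a layer, and the intrinsic CAR precision form tau * (D - W) per layer. None of these details are stated in the abstract, and the function names are hypothetical.

```python
from itertools import product


def layered_neighbours(nx, ny, nz):
    """Map each (x, y, z) site to its neighbours within the same depth layer z.

    Assumption: rook adjacency on a regular nx-by-ny grid per layer.
    """
    nbrs = {}
    for x, y, z in product(range(nx), range(ny), range(nz)):
        candidates = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
        nbrs[(x, y, z)] = [
            (i, j, z) for i, j in candidates if 0 <= i < nx and 0 <= j < ny
        ]
    return nbrs


def layer_precision(nbrs, layer, tau):
    """Build Q = tau * (D - W) for one depth layer as nested lists.

    D holds neighbour counts on the diagonal, W is the 0/1 adjacency matrix,
    and tau is a layer-specific precision (hypothetical parameter name).
    """
    sites = sorted(s for s in nbrs if s[2] == layer)
    index = {s: k for k, s in enumerate(sites)}
    n = len(sites)
    q = [[0.0] * n for _ in range(n)]
    for s in sites:
        i = index[s]
        q[i][i] = tau * len(nbrs[s])
        for t in nbrs[s]:
            q[i][index[t]] = -tau
    return q


# Usage: a 3 x 3 grid with two depth layers and a different precision per layer.
neighbours = layered_neighbours(3, 3, 2)
q_top = layer_precision(neighbours, layer=0, tau=2.0)
q_deep = layer_precision(neighbours, layer=1, tau=0.5)
```

Because no site has a neighbour in another layer, the full precision matrix is block diagonal by depth, which is what lets the structured variance differ from layer to layer in this sketch.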
Abstract:
This paper is the second in a series of reviews of cross-cultural studies of menopausal symptoms. The goal of this review is to compare and contrast methods which have been previously utilized in Cross-Cultural Midlife Women's Health Studies with a view to (1) identifying the challenges in measurement across cultures in psychological symptoms and (2) suggesting a set of unified questions and tools that can be used in future research in this area. This review also aims to examine the determinants of psychological symptoms and how those determinants were measured. The review included eight studies that explicitly compared symptoms in different countries or different ethnic groups in the same country and included: Australian/Japanese Midlife Women's Health Study (AJMWHS), Decisions At Menopause Study (DAMeS), Four Major Ethnic Groups (FMEG), Hilo Women's Health Survey (HWHS), Penn Ovarian Aging Study (POAS), Study of Women's Health Across the Nation (SWAN), Women's Health in Midlife National Study (WHiMNS), and the Women's International Study of Health and Sexuality (WISHeS). This review concludes that mental morbidity does affect vasomotor symptom prevalence across cultures and therefore should be measured. Based on the review of these eight studies, it is recommended that the following items be included when measuring psychological symptoms across cultures: feeling tense or nervous, sleeping difficulty, difficulty in concentrating, feeling depressed, and irritability, along with the CES-D Scale and the Perceived Stress Scale. The measurement of these symptoms will provide an evidence-based approach when forming any future menopause symptom list and allow for comparisons across studies.
Abstract:
This is the fourth in a series of reviews of cross-cultural studies of menopausal symptoms. The purpose of this review is to examine methods used in cross-cultural comparisons of sexual symptoms among women at midlife, and to examine the determinants of sexual symptoms and how those determinants were measured. The goal of this review is to make recommendations that will improve cross-cultural comparisons in the future. The review included nine studies that explicitly examined symptoms in different countries or different ethnic groups in the same country and included: Australian/Japanese Midlife Women's Health Study (AJMWHS), Decisions At Menopause Study (DAMeS), Four Major Ethnic Groups (FMEG), Hilo Women's Health Survey (HWHS), Mid-Aged Health in Women from the Indian Subcontinent (MAHWIS), Penn Ovarian Aging Study (POAS), Study of Women's Health Across the Nation (SWAN), Women's Health in Midlife National Study (WHiMNS), and Women's International Study of Health and Sexuality (WISHeS). Although methods used for assessing sexual symptoms across cultures differed between studies, statistically significant differences were reported. Cross-cultural differences in sexual symptoms exist, and should be measured by including the following: loss of interest in sex, vaginal dryness, and the Female Sexual Function Index, which covers desire, arousal, lubrication, orgasm, satisfaction, and pain on intercourse. The measurement of these symptoms will provide an evidence-based approach when forming any future menopause symptom list and allow for comparisons across studies.
Abstract:
Methodological differences among studies of vasomotor symptoms limit rigorous comparison or systematic review. Vasomotor symptoms generally include hot flushes and night sweats, although other associated symptoms exist. Prevalence rates vary between and within populations, but different studies collect data on frequency, bothersomeness, and/or severity using different outcome measures and scales, making comparisons difficult. We reviewed only cross-cultural studies of menopausal symptoms that explicitly examined symptoms in general populations of women in different countries or different ethnic groups in the same country. This resulted in the inclusion of nine studies: Australian/Japanese Midlife Women's Health Study (AJMWHS), Decisions At Menopause Study (DAMeS), Four Major Ethnic Groups (FMEG), Hilo Women's Health Survey (HWHS), Mid-Aged Health in Women from the Indian Subcontinent (MAHWIS), Penn Ovarian Aging Study (POAS), Study of Women's Health Across the Nation (SWAN), Women's Health in Midlife National Study (WHiMNS), and Women's International Study of Health and Sexuality (WISHeS). These studies highlight the methodological challenges involved in conducting multi-population studies, particularly when languages differ, but also highlight the importance of performing multivariate and factor analyses. Significant cultural differences in one or more vasomotor symptoms were observed in 8 of 9 studies, and symptoms were influenced by the following determinants: menopausal status, hormones (and variance), age (or, more precisely, the square of age, age²), BMI, depression, anxiety, poor physical health, perceived stress, lifestyle factors (hormone therapy use, smoking and exposure to passive smoke), and acculturation (in immigrant populations). Recommendations are made to improve methodological rigor and facilitate comparisons in future cross-cultural menopause studies.
Abstract:
This paper reviews the methods used in cross-cultural studies of menopausal symptoms with the goal of formulating recommendations to facilitate comparisons of menopausal symptoms across cultures. It provides an overview of existing approaches and serves to introduce four separate reviews of vasomotor, psychological, somatic, and sexual symptoms at midlife. Building on an earlier review of cross-cultural studies of menopause covering time periods until 2004, these reviews are based on searches of Medline, PsycINFO, CINAHL and Google Scholar for English-language articles published from 2004 to 2010 using the terms “cross cultural comparison” and “menopause.” Two major criteria were used: a study had to include more than one culture, country, or ethnic group and to have asked about actual menopausal symptom experience. We found considerable variation across studies in age ranges, symptom lists, reference period for symptom recall, variables included in multivariate analyses, and the measurement of factors (e.g., menopausal status and hormonal factors, demographic, anthropometric, mental/physical health, and lifestyle measures) that influence vasomotor, psychological, somatic and sexual symptoms. Based on these reviews, we make recommendations for future research regarding age range, symptom lists, reference/recall periods, and measurement of menopausal status. Recommendations specific to the cross-cultural study of vasomotor, psychological, somatic, and sexual symptoms are found in the four reviews that follow this introduction.
Abstract:
This paper is the third in a series of reviews of cross-cultural studies of symptoms at midlife. The goal of this review is to examine methods used previously in cross-cultural studies of menopause and women's health at midlife to (1) identify challenges in the measurement of somatic symptoms across cultures and (2) recommend questions and tools that can be used in future research. This review also aims to examine the determinants of somatic symptoms. The review concludes that methods used for assessing somatic symptoms differ across studies. Somatic symptoms, particularly aches, pain, and fatigue, have a high prevalence. Statistically significant differences were seen in the prevalence of somatic symptoms across cultures. Based on the number of studies that demonstrated cross-cultural differences in symptom prevalence, we recommend that the following symptoms be included in future studies of symptoms at midlife: headaches, aches/pain, palpitations, dizziness, fatigue, breathing difficulties, numbness or tingling, and gastrointestinal difficulties. We also recommend that objective measures of physical function be administered when possible to supplement subjective self-evaluation.
Abstract:
The psychological contract is a key analytical device utilised by both academics and practitioners to conceptualise and explore the dynamics of the employment relationship. However, despite the recognised import of the construct, some authors suggest that its empirical investigation has fallen into a 'methodological rut' [Conway & Briner, 2005, p. 89] and is neglecting to assess key tenets of the concept, such as its temporal and dynamic nature. This paper describes the research design of a longitudinal, mixed methods study which draws upon the strengths of both qualitative and quantitative modes of inquiry in order to explore the development of, and changes in, the psychological contract. Underpinned by a critical realist philosophy, the paper seeks to offer a research design suitable for exploring the process of change not only within the psychological contract domain, but also for similar constructs in the human resource management and broader organisational behaviour fields.
Abstract:
In 2009 the Australian Federal and State governments are expected to have spent some AU$30 billion procuring infrastructure projects. For governments with finite resources but many competing projects, formal capital rationing is achieved through use of Business Cases. These Business cases articulate the merits of investing in particular projects along with the estimated costs and risks of each project. Despite the sheer size and impact of infrastructure projects, there is very little research in Australia, or internationally, on the performance of these projects against Business Case assumptions when the decision to invest is made. If such assumptions (particularly cost assumptions) are not met, then there is serious potential for the misallocation of Australia’s finite financial resources. This research addresses this important gap in the literature by using combined quantitative and qualitative research methods, to examine the actual performance of 14 major Australian government infrastructure projects. The research findings are controversial as they challenge widely held perceptions of the effectiveness of certain infrastructure delivery practices. Despite this controversy, the research has had a significant impact on the field and has been described as ‘outstanding’ and ‘definitive’ (Alliancing Association of Australasia), "one of the first of its kind" (Infrastructure Partnerships of Australia) and "making a critical difference to infrastructure procurement" (Victorian Department of Treasury). The implications for practice of the research have been profound and included the withdrawal by Government of various infrastructure procurement guidelines, the formulation of new infrastructure policies by several state governments and the preparation of new infrastructure guidelines that substantially reflect the research findings. 
Building on the practical research, a more rigorous academic investigation focussed on the comparative cost uplift of various project delivery strategies was submitted to Australia’s premier academic management conference, the Australian and New Zealand Academy of Management (ANZAM) Annual Conference. This paper has been accepted for the 2010 ANZAM National Conference following a process of double blind peer review with reviewers rating the paper’s overall contribution as "Excellent" and "Good".
Abstract:
Background: Cancer patients experience distress and anxiety related to their diagnosis, treatment and the unfamiliar cancer centre. Strategies that aim to orient patients to a cancer care facility may improve patient outcomes. Although meeting patients' information needs at different stages is important, there is little agreement about the type of information and the timing for information to be given. Orientation interventions aim to address information needs at the start of a person's experience with a cancer care facility. The extent of any benefit of these interventions is unknown. Objectives: To assess the effects of information interventions which orient patients and their carers/family to a cancer care facility, and to the services available in the facility. Search Methods: We searched the Cochrane Central Register of Controlled Trials (CENTRAL) (The Cochrane Library 2011, Issue 2), MEDLINE (OvidSP) (1966 to Jun 2011), EMBASE (OvidSP) (1966 to Jun 2011), CINAHL (EBSCO) (1982 to Jun 2011), PsycINFO (OvidSP) (1966 to Jun 2011), review articles and reference lists of relevant articles. We contacted principal investigators and experts in the field. Selection Criteria: Randomised controlled trials (RCTs), cluster RCTs and quasi-RCTs evaluating the effects of information interventions that orient patients and their carers/family to a cancer care facility. Data collection and analysis: Results of searches were reviewed against the pre-determined criteria for inclusion by two review authors. The primary outcomes were knowledge and understanding, health status and wellbeing, evaluation of care, and harms. Secondary outcomes were communication, skills acquisition, behavioural outcomes, service delivery, and health professional outcomes. We pooled results of RCTs using mean differences (MD) and 95% confidence intervals (CI). Main results: We included four RCTs involving 610 participants.
All four trials aimed to investigate the effects of orientation programs for cancer patients to a cancer facility. There was high risk of bias across studies. Findings from two of the RCTs demonstrated significant benefits of the orientation intervention in relation to levels of distress (mean difference (MD) -8.96, 95% confidence interval (CI) -11.79 to -6.13), but non-significant benefits in relation to state anxiety levels (MD -9.77, 95% CI -24.96 to 5.41). Other outcomes for participants were generally positive (e.g. more knowledgeable about the cancer centre and cancer therapy, better coping abilities). No harms or adverse effects were measured or reported by any of the included studies. There were insufficient data on the other outcomes of interest. Authors' conclusions: This review has demonstrated the feasibility and some potential benefits of orientation interventions. There was a low level of evidence suggesting that orientation interventions can reduce distress in patients. However, most of the other outcomes remain inconclusive (patient knowledge recall/satisfaction). The majority of studies were subject to high risk of bias, and were likely to be insufficiently powered. Further well-conducted and adequately powered RCTs are required to provide evidence for determining the most appropriate intensity, nature, mode and resources for such interventions. Patient and carer-focused outcomes should be included.
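As an illustration of how mean differences and their 95% confidence intervals can be pooled across trials, the following is a minimal sketch of fixed-effect, inverse-variance pooling. The review does not state which pooling model it used, so the fixed-effect choice is an assumption, the function names are hypothetical, and the two input trials are invented for illustration and are not data from the review.

```python
import math

# Two-sided 95% normal quantile.
Z95 = 1.959963984540054


def se_from_ci(lower, upper):
    """Recover a standard error from a symmetric 95% confidence interval."""
    return (upper - lower) / (2 * Z95)


def pool_fixed_effect(studies):
    """Pool (md, ci_lower, ci_upper) tuples with inverse-variance weights.

    Returns the pooled mean difference and its 95% confidence limits.
    """
    weights = [1.0 / se_from_ci(lo, hi) ** 2 for _, lo, hi in studies]
    total = sum(weights)
    pooled = sum(w * md for w, (md, _, _) in zip(weights, studies)) / total
    se = math.sqrt(1.0 / total)
    return pooled, pooled - Z95 * se, pooled + Z95 * se


# Usage with two invented trials (not taken from the review).
example_trials = [(-8.0, -12.0, -4.0), (-10.0, -16.0, -4.0)]
pooled_md, ci_low, ci_high = pool_fixed_effect(example_trials)
```

An interval that excludes zero corresponds to a statistically significant pooled effect at the 5% level, which is how the distress and state anxiety results above differ.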
Abstract:
With the increasing number of XML documents in varied domains, it has become essential to identify ways of finding interesting information in these documents. Data mining techniques are used to derive this information. Mining of XML documents is affected by the model used to represent them, owing to their semi-structured nature. Hence, in this chapter we present an overview of the various models of XML documents, how these models have been used for mining, and some of the issues and challenges in these models. In addition, this chapter provides some insights into future models of XML documents for effectively capturing the two important features, namely the structure and content of XML documents, for mining.
Abstract:
This paper proposes a new research method, Participatory Action Design Research (PADR), for studies in the Urban Informatics domain. PADR supports Urban Informatics research in developing new technological means (e.g. using mobile and ubiquitous computing) to resolve contemporary issues or support everyday life in urban environments. The paper discusses the nature, aims and inherent methodological needs of Urban Informatics research, and proposes PADR as a method to address these needs. Situated in a socio-technical context, Urban Informatics requires a close dialogue between social and design-oriented fields of research as well as their methods. PADR combines Action Research and Design Science Research, both of which are used in Information Systems, another field with a strong socio-technical emphasis, and further adapts them to the cross-disciplinary needs and research context of Urban Informatics.
Abstract:
Bioinformatics involves analyses of biological data such as DNA sequences, microarrays and protein-protein interaction (PPI) networks. Its two main objectives are the identification of genes or proteins and the prediction of their functions. Biological data often contain uncertain and imprecise information. Fuzzy theory provides useful tools to deal with this type of information, hence has played an important role in analyses of biological data. In this thesis, we aim to develop some new fuzzy techniques and apply them on DNA microarrays and PPI networks. We will focus on three problems: (1) clustering of microarrays; (2) identification of disease-associated genes in microarrays; and (3) identification of protein complexes in PPI networks. The first part of the thesis aims to detect, by the fuzzy C-means (FCM) method, clustering structures in DNA microarrays corrupted by noise. Because of the presence of noise, some clustering structures found in random data may not have any biological significance. In this part, we propose to combine the FCM with the empirical mode decomposition (EMD) for clustering microarray data. The purpose of EMD is to reduce, preferably to remove, the effect of noise, resulting in what is known as denoised data. We call this method the fuzzy C-means method with empirical mode decomposition (FCM-EMD). We applied this method on yeast and serum microarrays, and the silhouette values are used for assessment of the quality of clustering. The results indicate that the clustering structures of denoised data are more reasonable, implying that genes have tighter association with their clusters. Furthermore we found that the estimation of the fuzzy parameter m, which is a difficult step, can be avoided to some extent by analysing denoised microarray data. The second part aims to identify disease-associated genes from DNA microarray data which are generated under different conditions, e.g., patients and normal people. 
We developed a type-2 fuzzy membership (FM) function for identification of disease-associated genes. This approach is applied to diabetes and lung cancer data, and a comparison with the original FM test was carried out. Among the ten best-ranked genes of diabetes identified by the type-2 FM test, seven genes have been confirmed as diabetes-associated genes according to gene description information in Gene Bank and the published literature. An additional gene is further identified. Among the ten best-ranked genes identified in lung cancer data, seven are confirmed to be associated with lung cancer or its treatment. The type-2 FM-d values are significantly different, which makes the identifications more convincing than those of the original FM test. The third part of the thesis aims to identify protein complexes in large interaction networks. Identification of protein complexes is crucial to understand the principles of cellular organisation and to predict protein functions. In this part, we proposed a novel method which combines the fuzzy clustering method and interaction probability to identify the overlapping and non-overlapping community structures in PPI networks, then to detect protein complexes in these sub-networks. Our method is based on both the fuzzy relation model and the graph model. We applied the method to several PPI networks and compared it with a popular protein complex identification method, the clique percolation method. For the same data, we detected more protein complexes. We also applied our method to two social networks. The results showed that our method works well for detecting sub-networks and gives a reasonable understanding of these communities.
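For readers unfamiliar with the fuzzy C-means (FCM) method that the first part of this thesis builds on, the following is a minimal sketch of the standard FCM iteration with fuzzy parameter m. It is not the thesis's FCM-EMD method: the empirical mode decomposition denoising step is not shown, and the function name, random initialisation, fixed iteration count and toy data are all assumptions made for illustration.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, seed=0):
    """Standard fuzzy C-means on a list of equal-length numeric tuples.

    Returns (centres, memberships), where memberships[k][i] is the degree
    to which point k belongs to cluster i and each row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, normalised per point.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centres = []
    for _ in range(iters):
        # Centre update: mean of the points weighted by membership ** m.
        centres = []
        for i in range(c):
            w = [u[k][i] ** m for k in range(n)]
            total = sum(w)
            centres.append(
                [sum(w[k] * points[k][d] for k in range(n)) / total
                 for d in range(dim)]
            )
        # Membership update from relative distances to every centre.
        for k in range(n):
            dist = [max(math.dist(points[k], centres[i]), 1e-12)
                    for i in range(c)]
            u[k] = [
                1.0 / sum((dist[i] / dist[j]) ** (2.0 / (m - 1.0))
                          for j in range(c))
                for i in range(c)
            ]
    return centres, u


# Usage on toy two-dimensional data (invented, not microarray data).
toy = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
toy_centres, toy_memberships = fuzzy_c_means(toy, c=2)
```

Larger values of m give softer memberships, which is why the choice of m matters and why the abstract notes that analysing denoised data can reduce the need to estimate it.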
Abstract:
It is a big challenge to guarantee the quality of discovered relevance features in text documents for describing user preferences because of the large number of terms, patterns, and noise. Most existing popular text mining and classification methods have adopted term-based approaches. However, they have all suffered from the problems of polysemy and synonymy. Over the years, people have often held the hypothesis that pattern-based methods should perform better than term-based ones in describing user preferences, but many experiments do not support this hypothesis. This research presents a promising method, Relevance Feature Discovery (RFD), for solving this challenging issue. It discovers both positive and negative patterns in text documents as high-level features in order to accurately weight low-level features (terms) based on their specificity and their distributions in the high-level features. The thesis also introduces an adaptive model (called ARFD) to enhance the flexibility of using RFD in adaptive environments. ARFD automatically updates the system's knowledge based on a sliding window over new incoming feedback documents. It can efficiently decide which incoming documents can bring new knowledge into the system. Substantial experiments using the proposed models on Reuters Corpus Volume 1 and TREC topics show that the proposed models significantly outperform both the state-of-the-art term-based methods underpinned by Okapi BM25, Rocchio or Support Vector Machine and other pattern-based methods.
Abstract:
With the growing number of XML documents on the Web it becomes essential to effectively organise these XML documents in order to retrieve useful information from them. A possible solution is to apply clustering on the XML documents to discover knowledge that promotes effective data management, information retrieval and query processing. However, many issues arise in discovering knowledge from these types of semi-structured documents due to their heterogeneity and structural irregularity. Most of the existing research on clustering techniques focuses only on one feature of the XML documents, this being either their structure or their content due to scalability and complexity problems. The knowledge gained in the form of clusters based on the structure or the content is not suitable for real-life datasets. It therefore becomes essential to include both the structure and content of XML documents in order to improve the accuracy and meaning of the clustering solution. However, the inclusion of both these kinds of information in the clustering process results in a huge overhead for the underlying clustering algorithm because of the high dimensionality of the data. The overall objective of this thesis is to address these issues by: (1) proposing methods to utilise frequent pattern mining techniques to reduce the dimension; (2) developing models to effectively combine the structure and content of XML documents; and (3) utilising the proposed models in clustering. This research first determines the structural similarity in the form of frequent subtrees and then uses these frequent subtrees to represent the constrained content of the XML documents in order to determine the content similarity. A clustering framework with two types of models, implicit and explicit, is developed. The implicit model uses a Vector Space Model (VSM) to combine the structure and the content information.
The explicit model uses a higher order model, namely a 3-order Tensor Space Model (TSM), to explicitly combine the structure and the content information. This thesis also proposes a novel incremental technique to decompose large-sized tensor models to utilise the decomposed solution for clustering the XML documents. The proposed framework and its components were extensively evaluated on several real-life datasets exhibiting extreme characteristics to understand the usefulness of the proposed framework in real-life situations. Additionally, this research evaluates the outcome of the clustering process on the collection selection problem in the information retrieval on the Wikipedia dataset. The experimental results demonstrate that the proposed frequent pattern mining and clustering methods outperform the related state-of-the-art approaches. In particular, the proposed framework of utilising frequent structures for constraining the content shows an improvement in accuracy over content-only and structure-only clustering results. The scalability evaluation experiments conducted on large-scale datasets clearly show the strengths of the proposed methods over state-of-the-art methods. In particular, this thesis work contributes to effectively combining the structure and the content of XML documents for clustering, in order to improve the accuracy of the clustering solution. In addition, it also contributes by addressing the research gaps in frequent pattern mining to generate efficient and concise frequent subtrees with various node relationships that could be used in clustering.
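The idea of using frequent structure to constrain content before measuring similarity in a vector space can be sketched as follows. This is a deliberately simplified illustration and not the thesis's method: it uses frequent root-to-element tag paths in place of frequent subtrees, raw term counts in place of any weighting scheme, and cosine similarity as the comparison. All function names and sample documents are hypothetical.

```python
import math
import xml.etree.ElementTree as ET
from collections import Counter


def path_terms(xml_text):
    """Return (tag_path, term) pairs for every word of element text."""
    pairs = []

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        for term in (node.text or "").lower().split():
            pairs.append((path, term))
        for child in node:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return pairs


def frequent_paths(all_pairs, min_support):
    """Keep tag paths that occur in at least min_support documents."""
    support = Counter()
    for pairs in all_pairs:
        support.update({path for path, _ in pairs})
    return {path for path, count in support.items() if count >= min_support}


def vectorise(pairs, keep):
    """Count (path, term) features, restricted to the frequent paths."""
    return Counter((p, t) for p, t in pairs if p in keep)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Usage on three invented documents.
docs = [
    "<book><title>data mining</title><author>smith</author></book>",
    "<book><title>data clustering</title><author>jones</author></book>",
    "<article><title>data mining</title></article>",
]
pairs_per_doc = [path_terms(d) for d in docs]
keep = frequent_paths(pairs_per_doc, min_support=2)
vectors = [vectorise(p, keep) for p in pairs_per_doc]
```

In this sketch the third document shares words with the first but not a frequent structure, so its content is filtered out. That is the dimensionality-reducing effect the abstract attributes to constraining content by frequent structure.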