879 resultados para Panel Data Model
Resumo:
Motivation: This paper introduces the software EMMIX-GENE that has been developed for the specific purpose of a model-based approach to the clustering of microarray expression data, in particular, of tissue samples on a very large number of genes. The latter is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. A feasible approach is provided by first selecting a subset of the genes relevant for the clustering of the tissue samples by fitting mixtures of t distributions to rank the genes in order of increasing size of the likelihood ratio statistic for the test of one versus two components in the mixture model. The imposition of a threshold on the likelihood ratio statistic used in conjunction with a threshold on the size of a cluster allows the selection of a relevant set of genes. However, even this reduced set of genes will usually be too large for a normal mixture model to be fitted directly to the tissues, and so the use of mixtures of factor analyzers is exploited to reduce effectively the dimension of the feature space of genes. Results: The usefulness of the EMMIX-GENE approach for the clustering of tissue samples is demonstrated on two well-known data sets on colon and leukaemia tissues. For both data sets, relevant subsets of the genes are able to be selected that reveal interesting clusterings of the tissues that are either consistent with the external classification of the tissues or with background and biological knowledge of these sets.
Resumo:
We consider a mixture model approach to the regression analysis of competing-risks data. Attention is focused on inference concerning the effects of factors on both the probability of occurrence and the hazard rate conditional on each of the failure types. These two quantities are specified in the mixture model using the logistic model and the proportional hazards model, respectively. We propose a semi-parametric mixture method to estimate the logistic and regression coefficients jointly, whereby the component-baseline hazard functions are completely unspecified. Estimation is based on maximum likelihood on the basis of the full likelihood, implemented via an expectation-conditional maximization (ECM) algorithm. Simulation studies are performed to compare the performance of the proposed semi-parametric method with a fully parametric mixture approach. The results show that when the component-baseline hazard is monotonic increasing, the semi-parametric and fully parametric mixture approaches are comparable for mildly and moderately censored samples. When the component-baseline hazard is not monotonic increasing, the semi-parametric method consistently provides less biased estimates than a fully parametric approach and is comparable in efficiency in the estimation of the parameters for all levels of censoring. The methods are illustrated using a real data set of prostate cancer patients treated with different dosages of the drug diethylstilbestrol. Copyright (C) 2003 John Wiley Sons, Ltd.
Resumo:
In microarray studies, the application of clustering techniques is often used to derive meaningful insights into the data. In the past, hierarchical methods have been the primary clustering tool employed to perform this task. The hierarchical algorithms have been mainly applied heuristically to these cluster analysis problems. Further, a major limitation of these methods is their inability to determine the number of clusters. Thus there is a need for a model-based approach to these. clustering problems. To this end, McLachlan et al. [7] developed a mixture model-based algorithm (EMMIX-GENE) for the clustering of tissue samples. To further investigate the EMMIX-GENE procedure as a model-based -approach, we present a case study involving the application of EMMIX-GENE to the breast cancer data as studied recently in van 't Veer et al. [10]. Our analysis considers the problem of clustering the tissue samples on the basis of the genes which is a non-standard problem because the number of genes greatly exceed the number of tissue samples. We demonstrate how EMMIX-GENE can be useful in reducing the initial set of genes down to a more computationally manageable size. The results from this analysis also emphasise the difficulty associated with the task of separating two tissue groups on the basis of a particular subset of genes. These results also shed light on why supervised methods have such a high misallocation error rate for the breast cancer data.
Resumo:
Websites are, nowadays, the face of institutions, but they are often neglected, especially when it comes to contents. In the present paper, we put forth an investigation work whose final goal is the development of a model for the measurement of data quality in institutional websites for health units. To that end, we have carried out a bibliographic review of the available approaches for the evaluation of website content quality, in order to identify the most recurrent dimensions and the attributes, and we are currently carrying out a Delphi Method process, presently in its second stage, with the purpose of reaching an adequate set of attributes for the measurement of content quality.
Resumo:
This article presents a research work, the goal of which was to achieve a model for the evaluation of data quality in institutional websites of health units in a broad and balanced way. We have carried out a literature review of the available approaches for the evaluation of website content quality, in order to identify the most recurrent dimensions and the attributes, and we have also carried out a Delphi method process with experts in order to reach an adequate set of attributes and their respective weights for the measurement of content quality. The results obtained revealed a high level of consensus among the experts who participated in the Delphi process. On the other hand, the different statistical analysis and techniques implemented are robust and attach confidence to our results and consequent model obtained.
Resumo:
27th Annual Conference of the European Cetacean Society. Setúbal, Portugal, 8-10 April 2013.
Resumo:
Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial Technologies
Resumo:
This paper presents the creation and development of technological schools directly linked to the business community and to higher public education. Establishing themselves as the key interface between the two sectors they make a signigicant contribution by having a greater competitive edge when faced with increasing competition in the tradional markets. The development of new business strategies supported by references of excellence, quality and competitiveness also provides a good link between the estalishment of partnerships aiming at the qualification of education boards at a medium level between the technological school and higher education with a technological foundation. We present a case study as an example depicting the success of Escola Tecnológica de Vale de Cambra.
Resumo:
Dissertação para a obtenção do grau de Mestre em Engenharia Electrotécnica Ramo de Energia
Resumo:
Dissertação apresentada para obtenção do Grau de Doutor em Engenharia do Ambiente pela Universidade Nova de Lisboa,Faculdade de Ciências e Tecnologia
Resumo:
This article develops a latent class model for estimating willingness-to-pay for public goods using simultaneously contingent valuation (CV) and attitudinal data capturing protest attitudes related to the lack of trust in public institutions providing those goods. A measure of the social cost associated with protest responses and the consequent loss in potential contributions for providing the public good is proposed. The presence of potential justification biases is further considered, that is, the possibility that for psychological reasons the response to the CV question affects the answers to the attitudinal questions. The results from our empirical application suggest that psychological factors should not be ignored in CV estimation for policy purposes, allowing for a correct identification of protest responses.
Resumo:
Genome-scale metabolic models are valuable tools in the metabolic engineering process, based on the ability of these models to integrate diverse sources of data to produce global predictions of organism behavior. At the most basic level, these models require only a genome sequence to construct, and once built, they may be used to predict essential genes, culture conditions, pathway utilization, and the modifications required to enhance a desired organism behavior. In this chapter, we address two key challenges associated with the reconstruction of metabolic models: (a) leveraging existing knowledge of microbiology, biochemistry, and available omics data to produce the best possible model; and (b) applying available tools and data to automate the reconstruction process. We consider these challenges as we progress through the model reconstruction process, beginning with genome assembly, and culminating in the integration of constraints to capture the impact of transcriptional regulation. We divide the reconstruction process into ten distinct steps: (1) genome assembly from sequenced reads; (2) automated structural and functional annotation; (3) phylogenetic tree-based curation of genome annotations; (4) assembly and standardization of biochemistry database; (5) genome-scale metabolic reconstruction; (6) generation of core metabolic model; (7) generation of biomass composition reaction; (8) completion of draft metabolic model; (9) curation of metabolic model; and (10) integration of regulatory constraints. Each of these ten steps is documented in detail.
Resumo:
Publicado em "Information control in manufacturing 1998 : (INCOM'98) : advances in industrial engineering : a proceedings volume from the 9th IFAC Symposium, Nancy-Metz, France, 24-26 June 1998. Vol. 2"
Resumo:
Magdeburg, Univ., Fak. für Informatik, Diss., 2015