963 results for count data models
Abstract:
27th Annual Conference of the European Cetacean Society. Setúbal, Portugal, 8-10 April 2013.
Abstract:
A great number of low-temperature geothermal fields related to fractured rocks occur in northern Portugal. The most important surface manifestations of these hydrothermal systems appear in pull-apart tectonic basins and are strongly conditioned by the orientation of the main fault systems in the region. This work presents the interpretation of gravity gradient maps and of a 3D inversion model produced from a regional gravity survey. The horizontal gradients reveal a complex fault system. The resulting 3D model of density contrast highlights the main fault zone in the region and the depth distribution of the granitic bodies. Their relationship with the hydrothermal systems supports the conceptual models developed from hydrochemical and isotopic water analyses. This work emphasizes the role of the gravity method in understanding the connection between hydrothermal systems, the fractured rock pattern and the surrounding geology. (c) 2013 Elsevier B.V. All rights reserved.
Abstract:
Master's degree in Electrical Engineering – Electrical Power Systems
Abstract:
This article is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. The Attribution-NonCommercial (CC BY-NC) license lets others remix, tweak, and build upon the work non-commercially; new works must also acknowledge the original work and be non-commercial.
Abstract:
Project work submitted to obtain the degree of Master in Informatics and Computer Engineering
Abstract:
The tongue is the most important and dynamic articulator in speech formation, both because of its anatomy (particularly the large volume of this muscular organ compared with the surrounding organs of the vocal tract) and because of the wide range of movements and flexibility involved. In speech communication research, a variety of techniques have been used to measure three-dimensional vocal tract shapes. More recently, magnetic resonance imaging (MRI) has become common, mainly because it allows the collection of a set of static and dynamic images that can represent the entire vocal tract along any orientation. Over the years, different anatomical organs of the vocal tract have been modelled, notably through 2D and 3D tongue models built with parametric or statistical modelling procedures. Our aim is to present and describe some 3D models reconstructed from MRI data for one subject uttering sustained articulations of some typical Portuguese sounds. We thus present a 3D database of the tongue obtained by stack combinations with the subject articulating Portuguese vowels. This 3D knowledge of the speech organs could be very important, especially for clinical purposes (for example, the assessment of articulatory impairments following tongue surgery in speech rehabilitation), and for a better understanding of acoustic theory in speech formation.
Abstract:
The aim of this paper is to develop models for experimental open-channel water delivery systems and to assess the use of three data-driven modeling tools toward that end. Water delivery canals are nonlinear dynamical systems and thus should be modeled to meet given operational requirements while capturing all relevant dynamics, including transport delays. Typically, first-principles models for open-channel systems are derived from the Saint-Venant equations for shallow water, which is a time-consuming task that demands specific expertise. The present paper proposes and assesses three data-driven modeling tools: artificial neural networks, composite local linear models and fuzzy systems. The canal of the Hydraulics and Canal Control Nucleus (Évora University, Portugal) is used as a benchmark: the models are identified using data collected from the experimental facility, and their performance is then assessed against suitable validation criteria. The models are compared with each other and with the experimental data to show how effectively such tools capture all significant dynamics within the canal system and, therefore, provide accurate nonlinear models that can be used for simulation or control. The models are available upon request from the authors.
Abstract:
Research on the problem of feature selection for clustering continues to develop. This is a challenging task, mainly due to the absence of class labels to guide the search for relevant features. Categorical feature selection for clustering has rarely been addressed in the literature, with most of the proposed approaches focusing on numerical data. In this work, we propose an approach to simultaneously cluster categorical data and select a subset of relevant features. Our approach is based on a modification of a finite mixture model (of multinomial distributions), in which a set of latent variables indicates the relevance of each feature. To estimate the model parameters, we implement a variant of the expectation-maximization algorithm that simultaneously selects the subset of relevant features, using a minimum message length criterion. The proposed approach compares favourably with two baseline methods: a filter based on an entropy measure and a wrapper based on mutual information. The results obtained on synthetic data illustrate the ability of the proposed expectation-maximization method to recover ground truth. An application to real data, drawn from official statistics, shows its usefulness.
Abstract:
Research on cluster analysis for categorical data continues to develop, with new clustering algorithms being proposed. However, in this context, the determination of the number of clusters is rarely addressed. We propose a new approach in which clustering and the estimation of the number of clusters are done simultaneously for categorical data. We assume that the data originate from a finite mixture of multinomial distributions and use a minimum message length (MML) criterion to select the number of clusters (Wallace and Bolton, 1986). For this purpose, we implement an EM-type algorithm (Silvestre et al., 2008) based on the approach of Figueiredo and Jain (2002). The novelty of the approach rests on integrating model estimation and the selection of the number of clusters in a single algorithm, rather than selecting this number from a set of pre-estimated candidate models. The performance of our approach is compared with the use of the Bayesian Information Criterion (BIC) (Schwarz, 1978) and the Integrated Completed Likelihood (ICL) (Biernacki et al., 2000) using synthetic data. The results illustrate the capacity of the proposed algorithm to attain the true number of clusters while outperforming BIC and ICL in speed, which is especially relevant when dealing with large data sets.
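As a rough illustration of the model family mentioned above, the following sketch fits a finite mixture of independent categorical features with plain EM. It assumes a fixed number of clusters `k` and does not implement the MML-based selection described in the abstract. The function name, the additive smoothing constant `alpha` and the toy data are all hypothetical choices made for this example.

```python
import math
import random


def em_categorical_mixture(data, n_categories, k, n_iter=100, alpha=1e-2, seed=0):
    """Fit a k-component mixture of independent categorical features by EM.

    data: list of tuples of integer category codes, one entry per feature.
    n_categories: number of categories of each feature.
    alpha: small additive smoothing (an assumption; keeps probabilities > 0).
    Returns (weights, theta, resp), theta[c][j][v] = P(feature j = v | cluster c).
    """
    rng = random.Random(seed)
    n = len(data)
    # Random soft initialisation of the responsibilities.
    resp = []
    for _ in range(n):
        row = [rng.random() + 1e-3 for _ in range(k)]
        total = sum(row)
        resp.append([r / total for r in row])

    for _ in range(n_iter):
        # M-step: mixing weights and smoothed weighted category frequencies.
        sizes = [sum(resp[i][c] for i in range(n)) for c in range(k)]
        weights = [s / n for s in sizes]
        theta = []
        for c in range(k):
            per_feature = []
            for j, m in enumerate(n_categories):
                counts = [alpha] * m
                for i, x in enumerate(data):
                    counts[x[j]] += resp[i][c]
                total = sum(counts)
                per_feature.append([v / total for v in counts])
            theta.append(per_feature)
        # E-step: posterior cluster probabilities, computed in log space.
        for i, x in enumerate(data):
            logp = [
                math.log(weights[c] + 1e-300)
                + sum(math.log(theta[c][j][x[j]]) for j in range(len(x)))
                for c in range(k)
            ]
            top = max(logp)
            unnorm = [math.exp(v - top) for v in logp]
            total = sum(unnorm)
            resp[i] = [u / total for u in unnorm]
    return weights, theta, resp


# Toy data (made up): two well-separated groups over four binary features.
data = [(0, 0, 0, 0)] * 30 + [(1, 1, 1, 1)] * 20
weights, theta, resp = em_categorical_mixture(data, [2, 2, 2, 2], k=2)
labels = [max(range(2), key=lambda c: r[c]) for r in resp]
```

Criteria such as BIC or ICL would require running a routine like this once per candidate `k` and comparing the fits afterwards, which is the repeated cost the abstract's single-pass approach is meant to avoid.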
Abstract:
Myocardial Perfusion Gated Single Photon Emission Tomography (Gated-SPET) imaging is used for the combined evaluation of myocardial perfusion and left ventricular (LV) function. The purpose of this study is to evaluate the influence of the total number of counts acquired from the myocardium on the calculation of myocardial functional parameters using routine software procedures. Methods: Gated-SPET studies were simulated using the Monte Carlo GATE package and the NURBS phantom. Simulated data were reconstructed and processed using the commercial software package Quantitative Gated-SPECT. The Bland-Altman and Mann-Whitney-Wilcoxon tests were used to analyze the influence of the total number of counts on the calculation of LV myocardium functional parameters. Results: In studies simulated with 3 MBq in the myocardium there were significant differences in the functional parameters left ventricular ejection fraction (LVEF), end-systolic volume (ESV), Motility and Thickness between studies acquired with 15 s/projection and 30 s/projection. Simulations with 4.2 MBq show significant differences in LVEF, end-diastolic volume (EDV) and Thickness. In the simulations with 5.4 MBq and 8.4 MBq the differences were statistically significant for Motility and Thickness. Conclusion: The total number of counts per simulation does not significantly interfere with the determination of Gated-SPET functional parameters using the administered average activity of 450 MBq to 5.4 MBq in the myocardium.
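For readers unfamiliar with the Bland-Altman analysis named in the methods, a minimal sketch follows. It computes the mean difference (bias) between two paired series and the conventional 95% limits of agreement (bias ± 1.96 standard deviations of the differences). The function name and the numbers are hypothetical illustrations, not values from the study.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower_limit, upper_limit) for paired measurements a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Made-up LVEF values (%) for two acquisition times; not data from the study.
lvef_15s = [55.0, 60.0, 48.0, 62.0, 58.0]
lvef_30s = [57.0, 59.0, 50.0, 65.0, 58.0]
bias, lower, upper = bland_altman(lvef_15s, lvef_30s)
```

A bias close to zero with narrow limits would suggest the two acquisition settings agree; whether the limits are clinically acceptable is a judgment the method itself does not make.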
Abstract:
Adhesively-bonded joints are extensively used in several fields of engineering. Cohesive Zone Models (CZM) have been used for the strength prediction of adhesive joints, as an add-in to Finite Element (FE) analyses that allows simulation of damage growth based on energetic principles. A useful feature of CZM is that different shapes can be developed for the cohesive laws, depending on the nature of the material or interface to be simulated, allowing an accurate strength prediction. This work studies the influence of the CZM shape (triangular, exponential or trapezoidal) used to model a thin adhesive layer in single-lap adhesive joints, to estimate its influence on the strength prediction under different material conditions. The study provides guidelines on the possibility of using a CZM shape that may not be the best suited to a particular adhesive, but that may be more straightforward to use or implement and have fewer convergence problems (e.g. the triangular CZM), thus attaining the solution faster. The overall results showed that joints bonded with ductile adhesives are highly influenced by the CZM shape, and that the trapezoidal shape best fits the experimental data. Moreover, the smaller the overlap length (LO), the greater the influence of the CZM shape. On the other hand, the influence of the CZM shape can be neglected when using brittle adhesives, without compromising the accuracy of the strength predictions too much.
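The triangular law mentioned above is commonly written as a bilinear traction-separation curve: linear elastic up to a peak traction, then linear softening to zero. The sketch below evaluates such a curve for a single mode under monotonic loading. It omits unloading and damage irreversibility, and the parameter names and values are hypothetical, not taken from the paper.

```python
def triangular_traction(delta, t0, delta0, delta_f):
    """Traction at separation delta for a triangular (bilinear) cohesive law.

    t0: peak traction, reached at the damage-onset separation delta0.
    delta_f: separation at complete failure (traction drops to zero).
    """
    if not 0 < delta0 < delta_f:
        raise ValueError("require 0 < delta0 < delta_f")
    if delta <= 0:
        return 0.0
    if delta <= delta0:
        return t0 * delta / delta0  # linear elastic branch
    if delta < delta_f:
        return t0 * (delta_f - delta) / (delta_f - delta0)  # linear softening
    return 0.0  # fully failed


def fracture_energy(t0, delta_f):
    """Area under the triangular law, i.e. the energy released per unit area."""
    return 0.5 * t0 * delta_f


# Illustrative values only (MPa and mm), not from the study.
peak = triangular_traction(0.01, t0=30.0, delta0=0.01, delta_f=0.05)
half = triangular_traction(0.03, t0=30.0, delta0=0.01, delta_f=0.05)
gc = fracture_energy(30.0, 0.05)
```

A trapezoidal law would add a constant-traction plateau between the elastic and softening branches, which is one way to represent the plastic flow of ductile adhesives.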
Abstract:
Transdermal biotechnologies are an ever-growing field of interest, owing to the medical and pharmaceutical applications they underlie. Several mathematical models in use give a more inclusive view of purely experimental data and even allow practical extrapolation to new dermal diffusion methodologies. However, they involve a complex variety of theories and assumptions that restrict their use to specific situations. Models based on Fick's first law are better suited to contexts where scaled particle theory models would be too time-consuming, but the reverse also holds as the context of transdermal diffusion of particular active compounds changes. This article extensively reviews the theoretical methodologies for studying dermal diffusion in the rate-limiting dermal barrier, the stratum corneum, and systematizes their characteristics, proper context of application, advantages and limitations, as well as future perspectives.
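For reference, Fick's first law and the steady-state membrane form often used for the stratum corneum can be written as below. The symbols are assumptions introduced here, not notation from the article: $J$ is the flux, $D$ the diffusion coefficient, $C$ the concentration, $x$ the depth, $K$ the vehicle/membrane partition coefficient, $h$ the barrier thickness, $\Delta C$ the concentration difference across the barrier, and $k_p$ the permeability coefficient.

```latex
J = -D \,\frac{\partial C}{\partial x},
\qquad
J_{\mathrm{ss}} = \frac{D\,K}{h}\,\Delta C = k_p\,\Delta C
```

The second expression assumes a homogeneous barrier and a constant concentration gradient, which is one reason such models apply only in specific situations.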
Abstract:
The principal topic of this work is the application of data mining techniques, in particular machine learning, to the discovery of knowledge in a protein database. The first chapter presents general background. Section 1.1 gives an overview of the methodology of a data mining project and its main algorithms. Section 1.2 introduces proteins and their supporting file formats. The chapter concludes with section 1.3, which defines the main problem we intend to address with this work: determining whether an amino acid is exposed or buried in a protein, in a discrete (i.e. not continuous) way, for five exposure levels: 2%, 10%, 20%, 25% and 30%. The second chapter, closely following the CRISP-DM methodology, presents the whole process of constructing the database that supported this work. It describes the process of loading data from the Protein Data Bank, DSSP and SCOP. An initial data exploration is then performed and a simple baseline prediction model of the relative solvent accessibility of an amino acid is introduced. The chapter also introduces the Data Mining Table Creator, a program developed to produce the data mining tables required for this problem. In the third chapter the results obtained are analyzed with statistical significance tests. First, the classifiers used (Neural Networks, C5.0, CART and Chaid) are compared, and it is concluded that C5.0 is the most suitable for the problem at hand. The chapter also compares the influence of parameters such as the amino acid information level, the amino acid window size and the SCOP class type on the accuracy of the predictive models. The fourth chapter starts with a brief review of the literature on amino acid relative solvent accessibility. We then summarize the main results achieved and finally discuss possible future work. The fifth and last chapter consists of appendices. Appendix A has the schema of the database that supported this thesis. Appendix B has a set of tables with additional information. Appendix C describes the software provided on the DVD accompanying this thesis, which allows the reconstruction of the present work.
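The discretization step described in section 1.3 of that thesis can be pictured with a small sketch: given a relative solvent accessibility value in percent, assign an exposed/buried label at each of the five thresholds. The function name and the convention that a value equal to the threshold counts as exposed are assumptions made here; the abstract does not state them.

```python
# Exposure levels (%) named in the abstract.
THRESHOLDS = (2, 10, 20, 25, 30)


def exposure_labels(rsa_percent, thresholds=THRESHOLDS):
    """Map one relative solvent accessibility value to a label per threshold.

    Assumption: a residue is 'exposed' when rsa_percent >= threshold.
    """
    return {t: ("exposed" if rsa_percent >= t else "buried") for t in thresholds}


# A residue with 22% relative accessibility (illustrative value).
labels_22 = exposure_labels(22.0)
```

Each threshold yields its own binary classification problem, which is why a separate data mining table per exposure level is a natural layout.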
Abstract:
Managing the physical and compute infrastructure of a large data center is an embodiment of a Cyber-Physical System (CPS). The physical parameters of the data center (such as power, temperature, pressure and humidity) are tightly coupled with computations, even more so in upcoming data centers, where the location of workloads can vary substantially because, for example, workloads are moved within a cloud infrastructure hosted in the data center. In this paper, we describe a data collection and distribution architecture that enables gathering the physical parameters of a large data center with sensor measurements of very high temporal and spatial resolution. We think this is an important characteristic for enabling more accurate heat-flow models of the data center and, with them, opportunities to optimize energy consumption. Having a high-resolution picture of the data center conditions also enables minimizing local hotspots, performing more accurate predictive maintenance (pending failures in cooling and other infrastructure equipment can be detected more promptly) and billing more accurately. We detail this architecture and define the structure of the underlying messaging system that is used to collect and distribute the data. Finally, we show the results of a preliminary study of a typical data center radio environment.