159 results for multivariate data
Abstract:
The multivariate skew-t distribution (J Multivar Anal 79:93-113, 2001; J R Stat Soc, Ser B 65:367-389, 2003; Statistics 37:359-363, 2003) includes the Student t, skew-Cauchy and Cauchy distributions as special cases and the normal and skew-normal ones as limiting cases. In this paper, we explore the use of Markov Chain Monte Carlo (MCMC) methods to develop a Bayesian analysis of repeated measures, pretest/post-test data, under a multivariate null intercept measurement error model (J Biopharm Stat 13(4):763-771, 2003) where the random errors and the unobserved value of the covariate (latent variable) follow a Student t and a skew-t distribution, respectively. The results and methods are numerically illustrated with an example in the field of dentistry.
Abstract:
Considering the Wald, score, and likelihood ratio asymptotic test statistics, we analyze a multivariate null intercept errors-in-variables regression model in which the explanatory and the response variables are subject to measurement errors and a possible dependency structure between the measurements taken within the same individual is incorporated, representing a longitudinal structure. This model was proposed by Aoki et al. (2003b) and analyzed under the Bayesian approach. In this article, considering the classical approach, we analyze asymptotic test statistics and present a simulation study to compare the behavior of the three test statistics for different sample sizes, parameter values and nominal levels of the test. Closed-form expressions for the score function and the Fisher information matrix are also presented. We consider two real numerical illustrations: the odontological data set from Hadgu and Koch (1999) and a quality control data set.
Abstract:
The scale mixtures of the skew-normal (SMSN) distribution form a class of asymmetric thick-tailed distributions that includes the skew-normal (SN) distribution as a special case. The main advantage of this class of distributions is that its members are easy to simulate and have a convenient hierarchical representation, which facilitates implementation of the expectation-maximization algorithm for maximum-likelihood estimation. In this paper, we assume an SMSN distribution for the unobserved value of the covariates and a symmetric scale mixture of the normal distribution for the error term of the model. This provides a robust alternative for parameter estimation in multivariate measurement error models. Specific distributions examined include univariate and multivariate versions of the SN, skew-t, skew-slash and skew-contaminated normal distributions. The results and methods are applied to a real data set.
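The claim that SMSN distributions are easy to simulate can be illustrated with a minimal sketch. It assumes the standard stochastic representation of the skew-normal (a weighted sum of a half-normal and an independent normal variate) and a gamma mixing variable, which yields the skew-t. The function names, parameter values and seed are illustrative choices, not taken from the paper.

```python
import math
import random


def sample_skew_normal(lam, rng):
    """Draw one standard skew-normal variate with shape parameter lam."""
    delta = lam / math.sqrt(1.0 + lam * lam)
    t0 = abs(rng.gauss(0.0, 1.0))  # half-normal component
    t1 = rng.gauss(0.0, 1.0)       # independent normal component
    return delta * t0 + math.sqrt(1.0 - delta * delta) * t1


def sample_skew_t(mu, sigma, lam, nu, rng):
    """Draw one skew-t variate as a scale mixture of the skew-normal."""
    # Mixing variable U ~ Gamma(shape=nu/2, rate=nu/2).
    # random.gammavariate takes (shape, scale), so the scale is 2/nu.
    u = rng.gammavariate(nu / 2.0, 2.0 / nu)
    return mu + sigma * sample_skew_normal(lam, rng) / math.sqrt(u)


rng = random.Random(0)  # fixed seed, chosen only for reproducibility
sn_draws = [sample_skew_normal(3.0, rng) for _ in range(5000)]
st_draws = [sample_skew_t(0.0, 1.0, 3.0, 5.0, rng) for _ in range(5000)]
```

Other members of the class would follow by swapping the distribution of the mixing variable `u`; the two-stage structure is the same hierarchical representation that makes an EM implementation convenient.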
Abstract:
This paper derives the second-order biases of maximum likelihood estimates from a multivariate normal model where the mean vector and the covariance matrix have parameters in common. We show that the second-order bias can always be obtained by means of ordinary weighted least-squares regressions. We conduct simulation studies which indicate that the bias correction scheme yields nearly unbiased estimators. (C) 2009 Elsevier B.V. All rights reserved.
A robust Bayesian approach to null intercept measurement error model with application to dental data
Abstract:
Measurement error models often arise in epidemiological and clinical research. Usually, in this setup, it is assumed that the latent variable has a normal distribution. However, the normality assumption may not always be correct. The skew-normal/independent distribution is a class of asymmetric thick-tailed distributions which includes the skew-normal distribution as a special case. In this paper, we explore the use of the skew-normal/independent distribution as a robust alternative for the null intercept measurement error model under a Bayesian paradigm. We assume that the random errors and the unobserved value of the covariate (latent variable) jointly follow a skew-normal/independent distribution, providing an appealing robust alternative to the routine use of the symmetric normal distribution in this type of model. Specific distributions examined include univariate and multivariate versions of the skew-normal, skew-t, skew-slash and skew-contaminated normal distributions. The methods developed are illustrated using a real data set from a dental clinical trial. (C) 2008 Elsevier B.V. All rights reserved.
Abstract:
Advances in diagnostic research are moving towards methods whereby periodontal risk can be identified and quantified by objective measures using biomarkers. Patients with periodontitis may have elevated circulating levels of specific inflammatory markers that can be correlated with the severity of the disease. The purpose of this study was to evaluate whether serum levels of inflammatory biomarkers differ between healthy and periodontitis patients. Twenty-five patients (8 healthy patients and 17 chronic periodontitis patients) were enrolled in the study. A 15 mL blood sample was used for identification of the inflammatory markers with a human inflammatory flow cytometry multiplex assay. Among 24 assessed cytokines, only 3 (RANTES, MIG and Eotaxin) were statistically different between groups (p<0.05). In conclusion, some of the selected markers of inflammation are differentially expressed in healthy and periodontitis patients. Cytokine profile analysis may be further explored to distinguish periodontitis patients from those free of disease and also to be used as a measure of risk. The present data, however, are limited, and studies with larger sample sizes are required to validate the findings for the specific biomarkers.
Abstract:
Microphysical and thermodynamical features of two tropical systems, namely Hurricane Ivan and Typhoon Conson, and one subtropical system, Catarina, have been analyzed based on spaceborne PR radar measurements from the TRMM satellite. The procedure to classify the reflectivity profiles followed the Heymsfield et al (2000) and Steiner et al (1995) methodologies. The water and ice content have been calculated using a relationship obtained with data from the surface SPOL radar and PR in Rondonia State in Brazil. The diabatic heating rate due to latent heat release has been estimated using the methodology developed by Tao et al (1990). A more detailed analysis has been performed for Hurricane Catarina, the first of its kind in the South Atlantic. High mean water content has been found in Conson and Ivan at low levels and close to their centers. Results indicate that Hurricane Catarina was shallower than the other two systems and held less water, which was concentrated closer to its center. The mean ice content in Catarina was about 0.05 g kg-1, while in Conson it was 0.06 g kg-1 and in Ivan 0.08 g kg-1. Conson and Ivan had water content up to 0.3 g kg-1 above the 0°C layer, while Catarina had less than 0.15 g kg-1. The latent heat released by Catarina proved to be very similar to that of the other two systems, except in the regions closer to the center.
Abstract:
The origin and dispersal of the Tupiguarani peoples have been intensely debated among archaeologists and linguists over the last five decades. In short, the idea that these peoples, who occupied much of Brazilian territory and parts of Bolivia, Paraguay, Uruguay and Argentina, had their ethnogenesis in Amazonia and spread from there to the east and south around 2,500 years before present is widely accepted among specialists, although a dispersal in the opposite direction, that is, from south to north, originating in the Tietê-Paraná basin, has not been completely ruled out. Among the archaeologists who regard Amazonia as the cradle of these peoples, some believe that they emerged in central Amazonia. Others believe that Tupiguarani ethnogenesis took place in southwestern Amazonia, where the greatest linguistic diversity of the Tupi stock is concentrated today. In this work, the morphology of 19 crania associated with Tupiguarani pottery or ethnographically classified as Tupiguarani was compared with several Brazilian prehistoric and ethnographic cranial series by means of multivariate statistics. Two multivariate techniques were employed: Principal Component Analysis, applied to the centroids of each series, and Mahalanobis distances, applied to the individual data. The results suggest an Amazonian origin for the Tupiguarani peoples, mainly because of the strong association found between the Tupi and Guarani crania from southeastern and southern Brazil, together with the Tupi from northern Brazil, and the specimens from Marajó Island included in the study.
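As an illustration of the Mahalanobis distance applied to individual specimens, the sketch below measures how far one measurement vector lies from the centroid of a reference series, scaled by that series' sample covariance. The helper names and the toy two-variable series are hypothetical. The abstract does not describe the study's actual variables or software, so this is only a generic pure-Python rendering of the distance.

```python
import math


def mean_vector(rows):
    """Centroid of a series: the per-variable mean."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def covariance_matrix(rows):
    """Sample covariance matrix (divisor n - 1)."""
    n = len(rows)
    mu = mean_vector(rows)
    p = len(mu)
    cov = [[0.0] * p for _ in range(p)]
    for r in rows:
        d = [x - m for x, m in zip(r, mu)]
        for i in range(p):
            for j in range(p):
                cov[i][j] += d[i] * d[j] / (n - 1)
    return cov


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def mahalanobis(x, rows):
    """Mahalanobis distance from point x to the centroid of rows."""
    mu = mean_vector(rows)
    d = [xi - mi for xi, mi in zip(x, mu)]
    z = solve(covariance_matrix(rows), d)  # z = S^{-1} d without an explicit inverse
    return math.sqrt(sum(di * zi for di, zi in zip(d, z)))


# Hypothetical reference series with two strongly correlated measurements.
series = [[1.0, 1.0], [2.0, 2.1], [3.0, 2.9], [4.0, 4.2], [5.0, 4.8]]
along_trend = mahalanobis([6.0, 6.0], series)
off_trend = mahalanobis([3.0, 5.0], series)
```

In this toy series the point `[3.0, 5.0]` is closer to the centroid in Euclidean terms yet much farther in Mahalanobis terms, because it departs from the correlation between the two variables.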
Abstract:
Lepidocharax, new genus, and Lepidocharax diamantina and L. burnsi, new species from eastern Brazil, are described herein. Lepidocharax is considered a monophyletic genus of the Stevardiinae and can be distinguished from the other members of this subfamily except Planaltina, Pseudocorynopoma, and Xenurobrycon by having the dorsal-fin origin vertically aligned with the anal-fin origin, vs. dorsal-fin origin anterior or posterior to anal-fin origin. Additionally, the new genus can be distinguished from those three genera by not having the scales extending over the ventral caudal-fin lobe modified to form the dorsal border of the pheromone pouch organ or to represent a pouch scale in sexually mature males. In this paper, we describe these two recently discovered species and the ultrastructure of their spermatozoa.
Abstract:
During the exploration and mapping of new caves in the Serra do Ramalho karst area, southern Bahia state, cavers from the Grupo Bambuí de Pesquisas Espeleológicas - GBPE (Belo Horizonte) noticed the presence of troglomorphic catfishes (species with reduced eyes and/or melanic pigmentation), whose ecology and behavior we have investigated intensively since 2005. Non-troglomorphic fishes regularly found in the studied caves were included in this investigation. We present here data on the natural history of two troglobitic (exclusively subterranean troglomorphic species) fishes, Rhamdia enfurnada Bichuette & Trajano, 2005 (Heptapteridae; Gruna do Enfurnado) and an undescribed Trichomycterus species (Trichomycteridae; Lapa dos Peixes and Gruna da Água Clara), and of the non-troglomorphic Hoplias cf. malabaricus, probably a troglophile (able to form populations both in epigean and subterranean habitats) in the Gruna do Enfurnado, and Pimelodella sp., a species with a sink population in the Lapa dos Peixes.
Abstract:
Orchid bees (Apini, Euglossina) have a mainly Neotropical distribution, with about 200 species and five genera described. Many local faunal surveys are available in the literature, but comparative studies on the composition and distribution of the Euglossina are still scarce. The aim of this study is to analyze the available data from 29 assemblages in order to understand the general patterns of spatial distribution in the areas sampled across the Neotropics. Ordination methods (DCA and NMDS) were used to describe the groupings of assemblages according to orchid bee occurrences. Forest localities in Central America and Amazonia formed cohesive groups in both analyses, whereas Atlantic Forest localities were more scattered in the plots. Localities on the eastern margin of Amazonia appear as characteristic transition areas between this subregion and the Atlantic Forest. Analyses of variance between the first DCA axis and selected variables yielded significant values for the influence of the latitude, longitude and precipitation gradients, as well as of the biogeographic subregions, on the groupings of assemblages. The general pattern found is congruent with the biogeographic patterns previously proposed for the Neotropical region. The DCA results also help to identify, independently, the faunal elements of each of the vegetation formations studied.
Abstract:
Geographic Data Warehouses (GDW) are one of the main technologies used in decision-making processes and spatial analysis, and the literature proposes several conceptual and logical data models for GDW. However, little effort has been focused on studying how spatial data redundancy affects SOLAP (Spatial On-Line Analytical Processing) query performance over GDW. In this paper, we investigate this issue. Firstly, we compare redundant and non-redundant GDW schemas and conclude that redundancy is related to high performance losses. We also analyze the issue of indexing, aiming at improving SOLAP query performance on a redundant GDW. Comparisons of the SB-index approach, the star-join aided by R-tree and the star-join aided by GiST indicate that the SB-index significantly improves the elapsed time in query processing from 25% up to 99% with regard to SOLAP queries defined over the spatial predicates of intersection, enclosure and containment and applied to roll-up and drill-down operations. We also investigate the impact of the increase in data volume on the performance. The increase did not impair the performance of the SB-index, which highly improved the elapsed time in query processing. Performance tests also show that the SB-index is far more compact than the star-join, requiring only a small fraction of at most 0.20% of the volume. Moreover, we propose a specific enhancement of the SB-index to deal with spatial data redundancy. This enhancement improved performance from 80 to 91% for redundant GDW schemas.
Abstract:
Due to the imprecise nature of biological experiments, biological data are often characterized by the presence of redundant and noisy data. This may be due to errors that occurred during data collection, such as contamination of laboratory samples. This is the case for gene expression data, where the equipment and tools currently used frequently produce noisy biological data. Machine Learning algorithms have been successfully used in gene expression data analysis. Although many Machine Learning algorithms can deal with noise, detecting and removing noisy instances from the training data set can help the induction of the target hypothesis. This paper evaluates the use of distance-based pre-processing techniques for noise detection in gene expression data classification problems. This evaluation analyzes the effectiveness of the techniques investigated in removing noisy data, measured by the accuracy obtained by different Machine Learning classifiers over the pre-processed data.
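The abstract does not name the distance-based techniques it evaluates. As one plausible example of the family, the sketch below uses an edited-nearest-neighbour style filter: an instance is flagged as noisy when its label disagrees with the majority label of its k nearest neighbours. The function names, the value of k and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def flag_noisy(features, labels, k=3):
    """Return indices whose label disagrees with the majority of their k nearest neighbours."""
    noisy = []
    for i, xi in enumerate(features):
        others = [j for j in range(len(features)) if j != i]
        others.sort(key=lambda j: euclidean(xi, features[j]))
        votes = Counter(labels[j] for j in others[:k])
        majority = votes.most_common(1)[0][0]
        if majority != labels[i]:
            noisy.append(i)
    return noisy


# Toy data: two well-separated clusters; the last instance sits in
# cluster A but carries label B (a deliberately mislabeled example).
features = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2),
            (5.0, 5.1), (5.2, 4.9), (4.9, 5.3), (5.1, 5.0),
            (0.15, 0.15)]
labels = ["A", "A", "A", "A", "B", "B", "B", "B", "B"]

suspect = flag_noisy(features, labels)
clean = [(f, y) for i, (f, y) in enumerate(zip(features, labels)) if i not in suspect]
```

A classifier would then be trained on `clean` rather than on the full data set, and its accuracy compared with training on the unfiltered data, which matches the evaluation criterion the abstract describes.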
Abstract:
OBJECTIVE: To estimate the spatial intensity of urban violence events using wavelet-based methods and emergency room data. METHODS: Information on victims attended at the emergency room of a public hospital in the city of São Paulo, Southeastern Brazil, from January 1, 2002 to January 11, 2003 was obtained from hospital records. The spatial distribution of 3,540 events was recorded, and a uniform random procedure was used to allocate records with incomplete addresses. Point processes and wavelet analysis techniques were used to estimate the spatial intensity, defined as the expected number of events per unit area. RESULTS: Of all georeferenced points, 59% were accidents and 40% were assaults. The spatial distribution of the events is non-homogeneous, with high concentration in two districts and three large avenues in the southern area of the city of São Paulo. CONCLUSIONS: Hospital records combined with methodological tools to estimate the intensity of events are useful for studying urban violence. Wavelet analysis is useful in the computation of the expected number of events and their respective confidence bands for any sub-region and, consequently, in the specification of risk estimates that could be used in decision-making processes for public policies.
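The definition of spatial intensity as the expected number of events per unit area can be made concrete with a simple grid-count estimator. This is not the wavelet-based estimator used in the study. It is a cruder substitute, shown only to illustrate the quantity being estimated, and the function name, window and coordinates are hypothetical.

```python
def grid_intensity(points, x_range, y_range, nx, ny):
    """Estimate events per unit area on an nx-by-ny grid of equal cells."""
    (x0, x1), (y0, y1) = x_range, y_range
    cell_w = (x1 - x0) / nx
    cell_h = (y1 - y0) / ny
    counts = [[0] * nx for _ in range(ny)]
    for x, y in points:
        if not (x0 <= x <= x1 and y0 <= y <= y1):
            continue  # ignore events outside the study window
        # Points on the upper boundary are assigned to the last cell.
        col = min(int((x - x0) / cell_w), nx - 1)
        row = min(int((y - y0) / cell_h), ny - 1)
        counts[row][col] += 1
    area = cell_w * cell_h
    return [[c / area for c in row] for row in counts]


# Hypothetical event coordinates in a 2 x 2 study window (arbitrary units).
events = [(0.5, 0.5), (0.6, 0.4), (1.5, 1.5)]
intensity = grid_intensity(events, (0.0, 2.0), (0.0, 2.0), nx=2, ny=2)
```

Multiplying each cell's estimate by the cell area and summing over a sub-region recovers the number of events observed there, which is the sense in which an intensity surface yields expected counts for any sub-region.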
Abstract:
The aim of this study was to analyze the distribution and abundance of the fish fauna of Palmas bay on Anchieta Island in southeastern Brazil. Specimens were caught in the summer and winter of 1992, using an otter trawl at three locations in the bay. The specimens were caught in both the nighttime and daytime. Data on the water temperature and salinity were recorded for the characterization of the predominant water mass in the region, and sediment samples were taken for granulometric analysis. A total of 7 656 specimens (79 species), with a total weight of approximately 300 kg, were recorded. The most abundant species were Eucinostomus argenteus, Ctenosciaena gracilicirrhus, Haemulon steindachneri, Eucinostomus gula and Diapterus rhombeus, which together accounted for more than 73% of the sample. In general, the ecological indices showed no differences in the composition of species for the abiotic variables analyzed. The multivariate analysis showed that the variations in the distribution of the fish fauna were mainly associated with intra-annual differences in temperature and salinity, resulting from the presence of South Atlantic Central Water (SACW) in the area during the summer. The analysis also showed an association with the type of bottom and a lesser association with respect to the night/day periods.