901 results for Techniques of data analysis
Abstract:
In this paper, we present an algorithm for cluster analysis that integrates aspects of cluster ensembles and multi-objective clustering. The algorithm is based on a Pareto-based multi-objective genetic algorithm with a special crossover operator, and it uses clustering validation measures as objective functions. The proposed algorithm can deal with data sets presenting different types of clusters, without requiring expertise in cluster analysis. Its result is a concise set of partitions representing alternative trade-offs among the objective functions. We compare the results obtained with our algorithm, in the context of gene expression data sets, to those achieved with Multi-Objective Clustering with automatic K-determination (MOCK), the algorithm most closely related to ours. (C) 2009 Elsevier B.V. All rights reserved.
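The idea of a Pareto set of partitions can be illustrated with a minimal sketch. It is not the authors' algorithm: it only filters candidate partitions by Pareto dominance, assuming that every objective is to be minimized. The names `dominates` and `pareto_front` and the score values are hypothetical.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and strictly
    better on at least one (all objectives minimized, by assumption)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Return the indices of the score vectors that no other vector dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical validation scores for five candidate partitions,
# one (objective_1, objective_2) pair per partition.
scores = [(0.2, 0.9), (0.5, 0.5), (0.9, 0.1), (0.6, 0.6), (0.9, 0.9)]
front = pareto_front(scores)
```

The three partitions kept in `front` are alternative trade-offs: improving one objective among them necessarily worsens the other.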
Abstract:
The TCABR data analysis and acquisition system has been upgraded to support a joint research programme using remote participation technologies. The architecture of the new system uses the Java language as its programming environment. Since application parameters and hardware in a joint experiment are complex, with a large variability of components, requirements and specification solutions need to be flexible and modular, independent of operating system and computer architecture. To describe and organize the information on all the components and the connections among them, systems are developed using the Extensible Markup Language (XML) technology. The communication between clients and servers uses remote procedure call (RPC) based on XML (RPC-XML technology). The integration of the Java language, XML and RPC-XML technologies makes it easy to develop a standard data and communication access layer between users and laboratories using common software libraries and a Web application. The libraries allow data retrieval using the same methods for all user laboratories in the joint collaboration, and the Web application provides access through a simple graphical user interface (GUI). The TCABR tokamak team, in collaboration with the IPFN (Instituto de Plasmas e Fusão Nuclear, Instituto Superior Técnico, Universidade Técnica de Lisboa), is implementing these remote participation technologies. The first version was tested at the Joint Experiment on TCABR (TCABRJE), a Host Laboratory Experiment, organized in cooperation with the IAEA (International Atomic Energy Agency) in the framework of the IAEA Coordinated Research Project (CRP) on "Joint Research Using Small Tokamaks". (C) 2010 Elsevier B.V. All rights reserved.
Abstract:
Specific choices about how to represent complex networks can have a substantial impact on the execution time required for the respective construction and analysis of those structures. In this work we report a comparison of the effects of representing complex networks statically by adjacency matrices or dynamically by adjacency lists. Three theoretical models of complex networks are considered: two types of Erdos-Renyi as well as the Barabasi-Albert model. We investigated the effect of the different representations with respect to the construction and measurement of several topological properties (i.e. degree, clustering coefficient, shortest path length, and betweenness centrality). We found that different forms of representation generally have a substantial effect on the execution time, with the sparse representation frequently resulting in remarkably superior performance. (C) 2011 Elsevier B.V. All rights reserved.
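The two representations being compared can be sketched as follows. This is an illustrative example only, not the code used in the study; the function names and the small undirected graph are assumptions. It shows why a degree computation costs O(n^2) on an adjacency matrix but O(n + m) on adjacency lists for a graph with n nodes and m edges.

```python
from typing import List, Tuple

Edge = Tuple[int, int]


def to_matrix(n: int, edges: List[Edge]) -> List[List[int]]:
    """Static representation: an n x n matrix, O(n^2) memory."""
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        matrix[u][v] = 1
        matrix[v][u] = 1  # undirected graph
    return matrix


def to_lists(n: int, edges: List[Edge]) -> List[List[int]]:
    """Dynamic representation: one neighbor list per node, O(n + m) memory."""
    adjacency: List[List[int]] = [[] for _ in range(n)]
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    return adjacency


def degrees_from_matrix(matrix: List[List[int]]) -> List[int]:
    """Scans every cell of the matrix: O(n^2) time."""
    return [sum(row) for row in matrix]


def degrees_from_lists(adjacency: List[List[int]]) -> List[int]:
    """Reads one list length per node: O(n) time once the lists exist."""
    return [len(neighbors) for neighbors in adjacency]


# Hypothetical 5-node graph; node 4 is isolated.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
matrix_degrees = degrees_from_matrix(to_matrix(5, edges))
list_degrees = degrees_from_lists(to_lists(5, edges))
```

Both representations give the same degrees; the difference lies in the memory used and in how much work is spent on absent edges, which matters most for sparse networks.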
Abstract:
The Survivability of Swedish Emergency Management Related Research Centers and Academic Programs: A Preliminary Sociology of Science Analysis.
Despite being a relatively safe nation, Sweden has four different universities supporting four emergency management research centers and an equal and growing number of academic programs. In this paper, I discuss how these centers and programs survive within the current organizational environment. The sociology of science or the sociology of scientific knowledge perspectives should provide a theoretical guide. Yet scholars working in these perspectives have produced no research on these related topics. Thus, the population ecology model and the notion of organizational niche provide my theoretical foundation. My data come from 26 interviews at those four institutions, collected documents, and observations. I found that each institution has found its own niche with little or no competition, with one exception. Three of the universities do have an international focus. Yet their foci have minimal overlap. Finally, I suggest that key aspects of Swedish culture, including safety and a need to aid the poor, help explain the extensive funding these centers and programs receive to survive.
Abstract:
Excessive labor turnover may be considered, to a great extent, an undesirable feature of a given economy. This follows from considerations such as underinvestment in human capital by firms. Understanding the determinants and the evolution of turnover in a particular labor market is therefore of paramount importance, including policy considerations. The present paper proposes an econometric analysis of turnover in the Brazilian labor market, based on a partial observability bivariate probit model. This model considers the interdependence of decisions taken by workers and firms, helping to elucidate the causes that lead each of them to end an employment relationship. The Employment and Unemployment Survey (PED) conducted by the State System of Data Analysis (SEADE) and by the Inter-Union Department of Statistics and Socioeconomic Studies (DIEESE) provides data at the individual worker level, allowing for the estimation of the joint probabilities of decisions to quit or stay on the job on the worker’s side, and to maintain or fire the employee on the firm’s side, during a given time period. The estimated parameters relate these estimated probabilities to the characteristics of workers, job contracts, and to the potential macroeconomic determinants in different time periods. The results confirm the theoretical prediction that the probability of termination of an employment relationship tends to be smaller as the worker acquires specific skills. The results also show that the establishment of a formal employment relationship reduces the probability of a quit decision by the worker, and also the firm’s firing decision in non-industrial sectors. With regard to the evolution of quit probability over time, the results show that an increase in the unemployment rate inhibits quitting, although this tends to wane as the unemployment rate rises.
Abstract:
The aim of this article is to assess the role of real effective exchange rate (RER) volatility in long-run economic growth for a set of 82 advanced and emerging economies, using a panel data set ranging from 1970 to 2009. With an accurate measure of exchange rate volatility, the results for the two-step system GMM panel growth models show that a more (less) volatile RER has a significant negative (positive) impact on economic growth, and the results are robust to different model specifications. In addition, exchange rate stability seems to be more important for fostering long-run economic growth than exchange rate misalignment.
Abstract:
There are four different hypotheses analyzed in the literature that explain deunionization, namely: the decrease in the demand for union representation by the workers; the impact of globalization on unionization rates; technical change; and changes in the legal and political systems against unions. This paper aims to test all of them. We estimate a logistic regression using a panel data procedure with 35 industries from 1973 to 1999 and conclude that the four hypotheses cannot be rejected by the data. We also use a variance analysis decomposition to study the impact of these variables on the drop in unionization rates. In the model with no demographic variables, the results show that these economic (tested) variables can account for 10% to 12% of the drop in unionization. However, when we include demographic variables, these tested variables can account for 10% to 35% of the total variation in unionization rates. In this case the four hypotheses tested can explain up to 50% of the total drop in unionization rates explained by the model.
Abstract:
The objective of the present study was to investigate the effect of data structure on estimated genetic parameters and predicted breeding values of direct and maternal genetic effects for weaning weight (WW) and weight gain from birth to weaning (BWG), including or not the genetic covariance between direct and maternal effects. Records of 97,490 Nellore animals born between 1993 and 2006, from the Jacarezinho cattle raising farm, were used. Two different data sets were analyzed: DI_all, which included all available progenies of dams without their own performance; DII_all, which included DI_all + 20% of recorded progenies with maternal phenotypes. Two subsets were obtained from each data set (DI_all and DII_all): DI_1 and DII_1, which included only dams with three or fewer progenies; DI_5 and DII_5, which included only dams with five or more progenies. (Co)variance components and heritabilities were estimated by Bayesian inference through Gibbs sampling using univariate animal models. In general, for the population and traits studied, the proportion of dams with known phenotypic information and the number of progenies per dam influenced direct and maternal heritabilities, as well as the contribution of maternal permanent environmental variance to phenotypic variance. Only small differences were observed in the genetic and environmental parameters when the genetic covariance between direct and maternal effects was set to zero in the data sets studied. Thus, the inclusion or not of the genetic covariance between direct and maternal effects had little effect on the ranking of animals according to their breeding values for WW and BWG. Accurate estimation of genetic correlations between direct and maternal genetic effects depends on the data structure. Thus, this covariance should be set to zero in Nellore data sets in which the proportion of dams with phenotypic information is low, the number of progenies per dam is small, and pedigree relationships are poorly known. 
(c) 2012 Elsevier B.V. All rights reserved.
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
OBJECTIVE: to compare tooth-size measurements, their reproducibility, and the application of the Tanaka and Johnston regression equation in predicting the size of canines and premolars on plaster and digital models. METHODS: thirty plaster models were scanned to obtain the digital models. Mesiodistal tooth lengths were measured with a digital caliper on the plaster models and with the O3d software (Widialabs) on the digital models. The sum of the sizes of the lower incisors was used to obtain predicted values for the size of the premolars and canines using the regression equation, and these values were compared with the actual tooth sizes. The data were analyzed statistically using Pearson's correlation test, Dahlberg's formula, the paired t-test and analysis of variance (p < 0.05). RESULTS: excellent intra-examiner agreement was observed for the measurements made on both types of model. Random error was not present in the measurements obtained with the caliper, and systematic error was more frequent in the digital model. The space prediction obtained by applying the regression equation was greater than the sum of the premolars and canines present in the plaster models and in the digital models. CONCLUSION: despite the good reproducibility of the measurements made on both types of model, most measurements from the digital models were larger than those from the plaster models. The predicted space was overestimated in both types of model and was significantly greater in the digital models.
Abstract:
Diplopods feed on decomposing organic matter; however, some environmental factors can promote changes in the tissues of these animals. Sewage sludge has been applied to restore the physical structure of degraded soil. This work analyzed the influence of sludge from a city in São Paulo on the midgut of the diplopod Rhinocricus padbergi. After exposure to the sludge, the midgut was prepared for histological and ultrastructural analyses. After 1 week of exposure, there were various glycoprotein globules in the fat body, which appeared, ultrastructurally, only slightly electron-dense. In the animals exposed for 2 weeks, there was intense renewal of the epithelium with invasion of regenerative cells, observed in both the histological and ultrastructural analyses. These data show that the sludge contains various substances that are very hazardous to these animals; more studies are necessary before it is applied in agriculture.
Abstract:
Restricted breeding seasons in beef cattle lead to censoring of reproductive data. In this paper, age at first conception (AFC) of Nellore females exposed to sires for the first time between 11 and 16 months of age was studied, with the aim of verifying whether sexual precocity can be genetically advanced using a survival model. The final data set contained 6699 records of AFC in days. Records of females that did not calve in the year following exposure to the sire were considered censored (77.5% of total). The model used was a Weibull mixed survival model including effects of contemporary groups, period (fixed) and animal (random). The effect of the contemporary groups on AFC was important (p < 0.01). Heritabilities were 0.51 and 0.76 on the logarithmic and original scales, respectively. The results indicate that it is possible to genetically advance sexual precocity, using the outcome of survival analysis of AFC as the selection criterion. They also suggest that improvements to the environment could advance sexual precocity too; thus, an adequate pregnancy rate for farmers could quickly be achieved.
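How censored records enter a Weibull survival analysis can be shown with a minimal sketch. It is not the mixed model used in the study: it has no contemporary group, period or animal effects, and the parameterization (shape and scale) and the function name `weibull_loglik` are assumptions. An observed conception contributes the log hazard plus the log survival; a censored record contributes only the log survival.

```python
import math
from typing import Sequence


def weibull_loglik(
    times: Sequence[float],
    events: Sequence[int],
    shape: float,
    scale: float,
) -> float:
    """Right-censored Weibull log-likelihood.

    Survival function: S(t) = exp(-(t / scale) ** shape).
    events[i] is 1 if the event was observed at times[i], 0 if censored.
    """
    if shape <= 0 or scale <= 0:
        raise ValueError("shape and scale must be positive")
    total = 0.0
    for t, observed in zip(times, events):
        z = t / scale
        if observed:
            # log hazard: log(shape / scale) + (shape - 1) * log(t / scale)
            total += math.log(shape / scale) + (shape - 1.0) * math.log(z)
        # log survival, contributed by every record
        total -= z ** shape
    return total


# Hypothetical records in arbitrary time units: one observed, one censored.
loglik = weibull_loglik(times=[1.0, 2.0], events=[1, 0], shape=1.0, scale=1.0)
```

With shape 1 the Weibull reduces to the exponential distribution, so the value above is simply -1 for the observed record and -2 for the censored one.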
Abstract:
Linear mixed effects models are frequently used to analyse longitudinal data, due to their flexibility in modelling the covariance structure between and within observations. Further, it is easy to deal with unbalanced data, either with respect to the number of observations per subject or per time period, and with varying time intervals between observations. In most applications of mixed models to biological sciences, a normal distribution is assumed both for the random effects and for the residuals. This, however, makes inferences vulnerable to the presence of outliers. Here, linear mixed models employing thick-tailed distributions for robust inferences in longitudinal data analysis are described. Specific distributions discussed include the Student-t, the slash and the contaminated normal. A Bayesian framework is adopted, and the Gibbs sampler and the Metropolis-Hastings algorithms are used to carry out the posterior analyses. An example with data on orthodontic distance growth in children is discussed to illustrate the methodology. Analyses based on either the Student-t distribution or on the usual Gaussian assumption are contrasted. The thick-tailed distributions provide an appealing robust alternative to the Gaussian process for modelling distributions of the random effects and of residuals in linear mixed models, and the MCMC implementation allows the computations to be performed in a flexible manner.
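One reason thick-tailed distributions fit naturally into a Gibbs sampler is that the Student-t can be written as a scale mixture of normals: a normal draw divided by the square root of a Gamma-distributed weight. The sketch below only simulates from that representation and is not the posterior sampler described in the abstract; the function name, seed and parameter values are assumptions.

```python
import random


def sample_student_t(
    rng: random.Random, nu: float, mu: float = 0.0, sigma: float = 1.0
) -> float:
    """Draw from a Student-t with nu degrees of freedom via a normal scale mixture.

    w ~ Gamma(shape = nu / 2, scale = 2 / nu), so E[w] = 1;
    y | w ~ Normal(mu, sigma^2 / w).
    """
    w = rng.gammavariate(nu / 2.0, 2.0 / nu)
    return mu + sigma * rng.gauss(0.0, 1.0) / (w ** 0.5)


rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [sample_student_t(rng, nu=5.0) for _ in range(20000)]

# Share of draws more than 3 units from the center. For a standard normal
# this share is about 0.003; the t with 5 degrees of freedom puts roughly
# ten times more mass there, which is what makes it tolerant of outliers.
tail_fraction = sum(abs(y) > 3.0 for y in draws) / len(draws)
```

A small weight `w` inflates the variance of a single observation, so an outlier is explained by its own weight rather than by shifting the estimates of the model parameters.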