Digital Library

37 results for Yield curve data sets

in the Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)

An incremental space to visualize dynamic data sets

Relevance: 100.00%

Abstract:

In Information Visualization, adding and removing data elements can strongly impact the underlying visual space. We have developed an inherently incremental technique (incBoard) that maintains a coherent disposition of elements from a dynamic multidimensional data set on a 2D grid as the set changes. Here, we introduce a novel layout that uses pairwise similarity from grid neighbors, as defined in incBoard, to reposition elements on the visual space, free from constraints imposed by the grid. The board continues to be updated and can be displayed alongside the new space. As similar items are placed together, while dissimilar neighbors are moved apart, it supports users in the identification of clusters and subsets of related elements. Densely populated areas identified in the incSpace can be efficiently explored with the corresponding incBoard visualization, which is not susceptible to occlusion. The solution remains inherently incremental and maintains a coherent disposition of elements, even for fully renewed sets. The algorithm considers relative positions for the initial placement of elements, and raw dissimilarity to fine-tune the visualization. It has low computational cost, with complexity depending only on the size of the currently viewed subset, V. Thus, a data set of size N can be sequentially displayed in O(N) time, reaching O(N^2) only if the complete set is simultaneously displayed.

Two-Phase Mapping for Projecting Massive Data Sets

Relevance: 100.00%

Abstract:

Most multidimensional projection techniques rely on distance (dissimilarity) information between data instances to embed high-dimensional data into a visual space. When data are endowed with Cartesian coordinates, an extra computational effort is necessary to compute the needed distances, making multidimensional projection prohibitive in applications dealing with interactivity and massive data. The novel multidimensional projection technique proposed in this work, called Part-Linear Multidimensional Projection (PLMP), has been tailored to handle multivariate data represented in Cartesian high-dimensional spaces, requiring only distance information between pairs of representative samples. This characteristic renders PLMP faster than previous methods when processing large data sets while still being competitive in terms of precision. Moreover, knowing the range of variation for data instances in the high-dimensional space, we can make PLMP a truly streaming data projection technique, a trait absent in previous methods.

Transiting exoplanets from the CoRoT space mission XI. CoRoT-8b: a hot and dense sub-Saturn around a K1 dwarf

Relevance: 100.00%

Abstract:

Aims. We report the discovery of CoRoT-8b, a dense small Saturn-class exoplanet that orbits a K1 dwarf in 6.2 days, and we derive its orbital parameters, mass, and radius. Methods. We analyzed two complementary data sets: the photometric transit curve of CoRoT-8b as measured by CoRoT and the radial velocity curve of CoRoT-8 as measured by the HARPS spectrometer. Results. We find that CoRoT-8b is on a circular orbit with a semi-major axis of 0.063 ± 0.001 AU. It has a radius of 0.57 ± 0.02 R_J, a mass of 0.22 ± 0.03 M_J, and therefore a mean density of 1.6 ± 0.1 g cm^-3. Conclusions. With 67% of the size of Saturn and 72% of its mass, CoRoT-8b has a density comparable to that of Neptune (1.76 g cm^-3). We estimate its content in heavy elements to be 47-63 M_⊕, and the mass of its hydrogen-helium envelope to be 7-23 M_⊕. At 0.063 AU, the thermal loss of hydrogen of CoRoT-8b should be no more than ~0.1% over an assumed integrated lifetime of 3 Ga.
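The mean density quoted in this abstract follows from the mass and radius by treating the planet as a sphere. The sketch below is only an illustration of that arithmetic: the Jupiter reference values are assumptions (the abstract does not say which convention for the Jupiter radius was used), and the function name is hypothetical.

```python
import math

# Assumed reference values; none of these appear in the abstract.
M_JUPITER_G = 1.898e30        # Jupiter mass in grams
R_JUPITER_MEAN_CM = 6.9911e9  # Jupiter volumetric mean radius in cm
R_JUPITER_EQ_CM = 7.1492e9    # Jupiter equatorial radius in cm


def mean_density(mass_mj, radius_rj, r_jupiter_cm):
    """Mean density in g/cm^3 of a sphere whose mass and radius are in Jupiter units."""
    mass_g = mass_mj * M_JUPITER_G
    radius_cm = radius_rj * r_jupiter_cm
    volume_cm3 = 4.0 / 3.0 * math.pi * radius_cm ** 3
    return mass_g / volume_cm3


# Central values from the abstract: 0.22 M_J and 0.57 R_J.
rho_mean = mean_density(0.22, 0.57, R_JUPITER_MEAN_CM)
rho_eq = mean_density(0.22, 0.57, R_JUPITER_EQ_CM)
print(f"mean-radius convention:       {rho_mean:.2f} g/cm^3")
print(f"equatorial-radius convention: {rho_eq:.2f} g/cm^3")
```

Both conventions give roughly 1.5 to 1.6 g cm^-3, so the result is sensitive to the assumed Jupiter radius; the sketch does not propagate the quoted uncertainties.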

The Kumaraswamy Weibull distribution with application to failure data

Relevance: 100.00%

Abstract:

For the first time, we introduce and study some mathematical properties of the Kumaraswamy Weibull distribution, which is a quite flexible model for analyzing positive data. It contains as special sub-models the exponentiated Weibull, exponentiated Rayleigh, exponentiated exponential, Weibull and also the new Kumaraswamy exponential distribution. We provide explicit expressions for the moments and moment generating function. We examine the asymptotic distributions of the extreme values. Explicit expressions are derived for the mean deviations, Bonferroni and Lorenz curves, reliability and Rényi entropy. The moments of the order statistics are calculated. We also discuss the estimation of the parameters by maximum likelihood. We obtain the expected information matrix. We provide applications involving two real data sets on failure times. Finally, some multivariate generalizations of the Kumaraswamy Weibull distribution are discussed. (C) 2010 The Franklin Institute. Published by Elsevier Ltd. All rights reserved.
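To make the sub-model relationships concrete, here is a minimal sketch of the cumulative distribution function, assuming the usual Kumaraswamy-G construction F(x) = 1 - (1 - G(x)^a)^b applied to a Weibull baseline G(x) = 1 - exp(-(λx)^c). The abstract does not state the parameterization, so the parameter names a, b, c, lam and both function names are assumptions.

```python
import math


def weibull_cdf(x, c, lam):
    """Baseline Weibull CDF with shape c and scale parameter lam (assumed form)."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((lam * x) ** c))


def kw_weibull_cdf(x, a, b, c, lam):
    """Kumaraswamy-G CDF, 1 - (1 - G^a)^b, with a Weibull baseline G."""
    g = weibull_cdf(x, c, lam)
    return 1.0 - (1.0 - g ** a) ** b


# Special cases under this assumed parameterization:
#   a = b = 1        -> Weibull
#   b = 1            -> exponentiated Weibull (G raised to the power a)
#   b = 1, c = 2     -> exponentiated Rayleigh
#   b = 1, c = 1     -> exponentiated exponential
#   c = 1            -> Kumaraswamy exponential
print(kw_weibull_cdf(1.3, 1.0, 1.0, 2.0, 0.5), weibull_cdf(1.3, 2.0, 0.5))
```

Setting the extra shape parameters to 1 recovers the baseline, which is a quick sanity check when fitting nested models by maximum likelihood.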

Modeling of transpiration reduction in van Genuchten-Mualem type soils

Relevance: 100.00%

Abstract:

We derive an analytic expression for the matric flux potential (M) for van Genuchten-Mualem (VGM) type soils which can also be written in terms of a converging infinite series. The accuracy of the approximation given by the first four terms of this series was verified by comparing it to values of M estimated by numerical finite difference integration. Using values of the parameters for three soils from different texture classes, the proposed four-term approximation showed an almost perfect match with the numerical solution, except for effective saturations higher than 0.9. Including more terms reduced the discrepancy but also increased the complexity of the equation. The four-term equation can be used for most applications. Cases with special interest in nearly saturated soils should include more terms from the infinite series. A transpiration reduction function for use with the VGM equations is derived by combining the derived expression for M with a root water extraction model. The shape of the resulting reduction function and its dependency on the derivative of the soil hydraulic diffusivity D with respect to the soil water content θ is discussed. Positive and negative values of dD/dθ yield concave and convex or S-shaped reduction functions, respectively. On the basis of three data sets, the hydraulic properties of virtually all soils yield concave reduction curves. Such curves based solely on soil hydraulic properties do not account for the complex interactions between shoot growth, root growth, and water availability.
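As an illustration of the numerical reference against which such a series can be checked, the sketch below integrates the standard VGM conductivity function over pressure head with the trapezoidal rule, taking M(h) as the integral of K from a lower limit h_wilt up to h. The soil parameters, the lower limit, and all function names are hypothetical; this is not the authors' analytic expression or their integration scheme.

```python
def effective_saturation(h, alpha, n):
    """van Genuchten effective saturation for pressure head h <= 0 (cm)."""
    m = 1.0 - 1.0 / n
    return (1.0 + (alpha * abs(h)) ** n) ** (-m)


def conductivity(h, k_sat, alpha, n, l=0.5):
    """Mualem-van Genuchten hydraulic conductivity K(h), same units as k_sat."""
    m = 1.0 - 1.0 / n
    se = effective_saturation(h, alpha, n)
    return k_sat * se ** l * (1.0 - (1.0 - se ** (1.0 / m)) ** m) ** 2


def matric_flux_potential(h, h_wilt, k_sat, alpha, n, steps=2000):
    """Trapezoidal estimate of M(h), the integral of K from h_wilt to h."""
    dh = (h - h_wilt) / steps
    total = 0.5 * (conductivity(h_wilt, k_sat, alpha, n) + conductivity(h, k_sat, alpha, n))
    for i in range(1, steps):
        total += conductivity(h_wilt + i * dh, k_sat, alpha, n)
    return total * dh


# Hypothetical soil: alpha = 0.02 1/cm, n = 1.5, K_sat = 10 cm/d, h_wilt = -15000 cm.
for head in (-1000.0, -100.0, -10.0):
    m_value = matric_flux_potential(head, -15000.0, 10.0, 0.02, 1.5)
    print(f"h = {head:8.1f} cm  M = {m_value:.4f} cm^2/d")
```

A uniform grid in h is crude near saturation, where K changes quickly; a grid that is uniform in log|h| would likely be a better choice for a careful comparison.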

Comparison of different nonlinear functions to describe Nelore cattle growth

Relevance: 100.00%

Abstract:

This work aims to compare different nonlinear functions for describing the growth curves of Nelore females. The growth curve parameters, their (co)variance components, and environmental and genetic effects were estimated jointly through a Bayesian hierarchical model. In the first stage of the hierarchy, 4 nonlinear functions were compared: Brody, Von Bertalanffy, Gompertz, and logistic. The analyses were carried out using 3 different data sets to check goodness of fit while having animals with few records. Three different assumptions about SD of fitting errors were considered: constancy throughout the trajectory, linear increase until 3 yr of age and constancy thereafter, and variation following the nonlinear function applied in the first stage of the hierarchy. Comparisons of the overall goodness of fit were based on the Akaike information criterion, the Bayesian information criterion, and the deviance information criterion. Goodness of fit at different points of the growth curve was compared by applying Gelfand's check function. The posterior means of adult BW ranged from 531.78 to 586.89 kg. Greater estimates of adult BW were observed when the fitting error variance was considered constant along the trajectory. The models were not suitable to describe the SD of fitting errors at the beginning of the growth curve. All functions provided less accurate predictions at the beginning of growth, and predictions were more accurate after 48 mo of age. The prediction of adult BW using nonlinear functions can be accurate when growth curve parameters and their (co)variance components are estimated jointly. The hierarchical model used in the present study can be applied to the prediction of mature BW in herds in which a portion of the animals are culled before adult age. Gompertz, Von Bertalanffy, and Brody functions were adequate to establish mean growth patterns and to predict the adult BW of Nelore females. The Brody model was more accurate in predicting the birth weight of these animals and presented the best overall goodness of fit.
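For readers unfamiliar with the four growth functions named in this abstract, the sketch below writes them in one common three-parameter form, with a as asymptotic (adult) weight, b as an integration constant, and k as a maturing rate. The abstract does not give the exact parameterizations used, so these forms and the parameter values are assumptions for illustration only.

```python
import math


def brody(t, a, b, k):
    """Brody curve: a * (1 - b * exp(-k t))."""
    return a * (1.0 - b * math.exp(-k * t))


def von_bertalanffy(t, a, b, k):
    """Von Bertalanffy curve: a * (1 - b * exp(-k t))^3."""
    return a * (1.0 - b * math.exp(-k * t)) ** 3


def gompertz(t, a, b, k):
    """Gompertz curve: a * exp(-b * exp(-k t))."""
    return a * math.exp(-b * math.exp(-k * t))


def logistic(t, a, b, k):
    """Logistic curve: a / (1 + b * exp(-k t))."""
    return a / (1.0 + b * math.exp(-k * t))


# Hypothetical parameters: a = 550 kg, b = 0.94, k = 0.05 per month.
for age_months in (0, 12, 48, 120):
    weights = [f(age_months, 550.0, 0.94, 0.05) for f in (brody, von_bertalanffy, gompertz, logistic)]
    print(age_months, [round(w, 1) for w in weights])
```

All four curves approach the same asymptote a but differ most at early ages, which is consistent with the abstract's remark that predictions were least accurate at the beginning of growth.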

Assessment of DGAT1 and LEP gene polymorphisms in three Nelore (Bos indicus) lines selected for growth and their relationship with growth and carcass traits

Relevance: 100.00%

Abstract:

The aim of this study was to analyze LEP and DGAT1 gene polymorphisms in 3 Nelore lines selected for growth and to evaluate their effects on growth and carcass traits. Traits analyzed were birth, weaning, and yearling weight, rump height, LM area, backfat thickness, and rump fat thickness obtained by ultrasound. Two SNP in the LEP gene [LEP 1620(A/G) and LEP 305(T/C)] and the K232A mutation in the DGAT1 gene were analyzed. The sample consisted of 357 Nelore heifers from 2 lines selected for yearling weight and a control line, established in 1980, at the Estação Experimental de Zootecnia de Sertãozinho (Sertãozinho, Brazil). Three genotypes were obtained for each marker. Differences in allele frequencies among the 3 lines were only observed for the DGAT1 K232A polymorphism, with the frequency of the A allele being greater in the control line than in the selected lines. The DGAT1 K232A mutation was associated only with rump height, whereas LEP 1620(A/G) was associated with weaning weight and LEP 305(T/C) with birth weight and backfat thickness. However, more studies, with larger data sets, are necessary before these markers can be used for marker-assisted selection.

Are Reanalysis Data Useful for Calculating Climate Indices over South America?

Relevance: 100.00%

Abstract:

Precipitation and temperature climate indices are calculated using the National Centers for Environmental Prediction/National Center for Atmospheric Research (NCEP/NCAR) reanalysis and validated against observational data from some stations over Brazil and other data sources. The spatial patterns of the climate indices trends are analyzed for the period 1961-1990 over South America. In addition, the correlation and linear regression coefficients for some specific stations were also obtained in order to compare with the reanalysis data. In general, the results suggest that NCEP/NCAR reanalysis can provide useful information about minimum temperature and consecutive dry days indices at individual grid cells in Brazil. However, some regional differences in the climate indices trends are observed when different data sets are compared. For instance, the NCEP/NCAR reanalysis shows a reversal signal for all rainfall annual indices and the cold night index over Argentina. Despite these differences, maps of the trends for most of the annual climate indices obtained from the NCEP/NCAR reanalysis and BRANT analysis are generally in good agreement with other available data sources and previous findings in the literature for large areas of southern South America. The pattern of trends for the precipitation annual indices over the 30 years analyzed indicates a change to wetter conditions over southern and southeastern parts of Brazil, Paraguay, Uruguay, central and northern Argentina, and parts of Chile and a decrease over southwestern South America. All over South America, the climate indices related to the minimum temperature (warm or cold nights) have clearly shown a warming tendency; however, no consistent changes in maximum temperature extremes (warm and cold days) have been observed. Therefore, one must be careful before suggesting any trends for warm or cold days.

PCA Tomography: how to extract information from data cubes

Relevance: 100.00%

Abstract:

Astronomy has evolved almost exclusively by the use of spectroscopic and imaging techniques, operated separately. With the development of modern technologies, it is possible to obtain data cubes in which one combines both techniques simultaneously, producing images with spectral resolution. Extracting information from them can be quite complex, and hence the development of new methods of data analysis is desirable. We present a method of analysis of data cubes (data from single field observations, containing two spatial and one spectral dimension) that uses Principal Component Analysis (PCA) to express the data in the form of reduced dimensionality, facilitating efficient information extraction from very large data sets. PCA transforms the system of correlated coordinates into a system of uncorrelated coordinates ordered by principal components of decreasing variance. The new coordinates are referred to as eigenvectors, and the projections of the data onto these coordinates produce images we will call tomograms. The association of the tomograms (images) to eigenvectors (spectra) is important for the interpretation of both. The eigenvectors are mutually orthogonal, and this information is fundamental for their handling and interpretation. When the data cube shows objects that present uncorrelated physical phenomena, the eigenvectors' orthogonality may be instrumental in separating and identifying them. By handling eigenvectors and tomograms, one can enhance features, extract noise, compress data, extract spectra, etc. We applied the method, for illustration purposes only, to the central region of the low ionization nuclear emission region (LINER) galaxy NGC 4736, and demonstrate that it has a type 1 active nucleus, not known before. Furthermore, we show that it is displaced from the centre of its stellar bulge.
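The eigenvector and tomogram construction described in this abstract can be sketched with NumPy as follows. This is a generic PCA on a cube of shape (ny, nx, n_lambda), not the authors' pipeline: subtracting the mean spectrum, the covariance normalization, and the function name pca_tomography are all assumptions.

```python
import numpy as np


def pca_tomography(cube):
    """Generic PCA of a data cube with axes (y, x, wavelength).

    Returns eigenvalues (descending), eigenvectors as columns (spectra),
    and tomograms (one image per eigenvector, stacked on the last axis).
    """
    ny, nx, n_lambda = cube.shape
    # One row per spatial pixel, one column per spectral pixel.
    spectra = cube.reshape(ny * nx, n_lambda).astype(float)
    centered = spectra - spectra.mean(axis=0)  # assumed: remove the mean spectrum
    cov = centered.T @ centered / (ny * nx - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]          # reorder by decreasing variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Projecting every pixel spectrum onto each eigenvector yields the tomograms.
    tomograms = (centered @ eigvecs).reshape(ny, nx, n_lambda)
    return eigvals, eigvecs, tomograms


rng = np.random.default_rng(0)
demo_cube = rng.normal(size=(4, 5, 6))  # synthetic stand-in for an observed cube
eigvals, eigvecs, tomograms = pca_tomography(demo_cube)
print(eigvals.round(3))
print(tomograms[:, :, 0].shape)  # image associated with the first eigenvector
```

Because the eigenvectors are orthonormal, multiplying the tomograms by the transposed eigenvector matrix restores the centered cube, and dropping trailing components before doing so gives the compression and noise-removal uses the abstract mentions.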

Generic relationships and dating of lineages in Winteraceae based on nuclear (ITS) and plastid (rpS16 and psbA-trnH) sequence data

Relevance: 100.00%

Abstract:

Phylogenetic analyses of representative species from the five genera of Winteraceae (Drimys, Pseudowintera, Takhtajania, Tasmannia, and Zygogynum s.l.) were performed using ITS nuclear sequences and a combined data-set of ITS + psbA-trnH + rpS16 sequences (sampling of 30 and 15 species, respectively). Indel informativity using simple gap coding or gaps as a fifth character was examined in both data-sets. Parsimony and Bayesian analyses support the monophyly of Drimys, Tasmannia, and Zygogynum s.l., but do not support the monophyly of Belliolum, Zygogynum s.s., and Bubbia. Within Drimys, the combined data-set recovers two subclades. Divergence time estimates suggest that the splitting between Drimys and its sister clade (Pseudowintera + Zygogynum s.l.) occurred around the end of the Cretaceous; in contrast, the divergence between the two subclades within Drimys is more recent (15.5-18.5 MY) and coincides in time with the Andean uplift. Estimates suggest that the earliest divergences within Winteraceae could have predated the first events of Gondwana fragmentation. (C) 2009 Elsevier Inc. All rights reserved.

Evolutionary fuzzy clustering of relational data

Relevance: 100.00%

Abstract:

This paper is concerned with the computational efficiency of fuzzy clustering algorithms when the data set to be clustered is described by a proximity matrix only (relational data) and the number of clusters must be automatically estimated from such data. A fuzzy variant of an evolutionary algorithm for relational clustering is derived and compared against two systematic (pseudo-exhaustive) approaches that can also be used to automatically estimate the number of fuzzy clusters in relational data. An extensive collection of experiments involving 18 artificial and two real data sets is reported and analyzed. (C) 2011 Elsevier B.V. All rights reserved.

Multi-objective clustering ensemble for gene expression data analysis

Relevance: 100.00%

Abstract:

In this paper, we present an algorithm for cluster analysis that integrates aspects from cluster ensemble and multi-objective clustering. The algorithm is based on a Pareto-based multi-objective genetic algorithm, with a special crossover operator, which uses clustering validation measures as objective functions. The proposed algorithm can deal with data sets presenting different types of clusters, without the need for expertise in cluster analysis. Its result is a concise set of partitions representing alternative trade-offs among the objective functions. We compare the results obtained with our algorithm, in the context of gene expression data sets, to those achieved with Multi-Objective Clustering with automatic K-determination (MOCK), the algorithm most closely related to ours. (C) 2009 Elsevier B.V. All rights reserved.

A hybrid approach to learn with imbalanced classes using evolutionary algorithms

Relevance: 100.00%

Abstract:

There is an increasing interest in the application of Evolutionary Algorithms (EAs) to induce classification rules. This hybrid approach can benefit areas where classical methods for rule induction have not been very successful. One example is the induction of classification rules in imbalanced domains. Imbalanced data occur when one or more classes heavily outnumber other classes. Frequently, classical machine learning (ML) classifiers are not able to learn in the presence of imbalanced data sets, inducing classification models that always predict the most numerous classes. In this work, we propose a novel hybrid approach to deal with this problem. We create several balanced data sets with all minority class cases and a random sample of majority class cases. These balanced data sets are fed to classical ML systems that produce rule sets. The rule sets are combined, creating a pool of rules, and an EA is used to build a classifier from this pool of rules. This hybrid approach has some advantages over undersampling, since it reduces the amount of discarded information, and some advantages over oversampling, since it avoids overfitting. The proposed approach was experimentally analysed, and the experimental results show an improvement in classification performance measured as the area under the receiver operating characteristic (ROC) curve.
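The data preparation step described in this abstract (all minority cases plus a random sample of majority cases, repeated several times) can be sketched as below. Only that step is shown; rule induction and the evolutionary selection are not. The function name, the choice of sampling without replacement, and the one-to-one class ratio are assumptions.

```python
import random


def balanced_subsets(minority, majority, n_subsets, seed=0):
    """Build n_subsets data sets, each with every minority case plus an
    equally sized random sample (without replacement) of majority cases."""
    rng = random.Random(seed)
    minority = list(minority)
    majority = list(majority)
    subsets = []
    for _ in range(n_subsets):
        sampled_majority = rng.sample(majority, len(minority))
        subsets.append(minority + sampled_majority)
    return subsets


# Hypothetical toy data: 3 minority cases and 20 majority cases.
minority_cases = ["m1", "m2", "m3"]
majority_cases = [f"M{i}" for i in range(20)]
subsets = balanced_subsets(minority_cases, majority_cases, n_subsets=4)
for subset in subsets:
    print(subset)
```

Drawing a fresh majority sample for each subset is what lets the pooled rule sets see more of the majority class than a single undersampled data set would.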

Transformed generalized linear models

Relevance: 100.00%

Abstract:

Estimating a data transformation is very useful for obtaining response variables that closely satisfy a normal linear model. Generalized linear models enable the fitting of models to a wide range of data types. These models are based on exponential dispersion models. We propose a new class of transformed generalized linear models to extend the Box and Cox models and the generalized linear models. We use the generalized linear model framework to fit these models and discuss maximum likelihood estimation and inference. We give a simple formula to estimate the parameter that indexes the transformation of the response variable for a subclass of models. We also give a simple formula to estimate the rth moment of the original dependent variable. We explore the possibility of applying these models to time series data to extend the generalized autoregressive moving average models discussed by Benjamin et al. [Generalized autoregressive moving average models. J. Amer. Statist. Assoc. 98, 214-223]. The usefulness of these models is illustrated in a simulation study and in applications to three real data sets. (C) 2009 Elsevier B.V. All rights reserved.
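The Box and Cox family that this abstract extends transforms a positive response y to (y^λ - 1)/λ, with log y as the limiting case λ = 0. The sketch below shows only that classical transformation and its inverse; it is not the transformed generalized linear model itself, and the function names are hypothetical.

```python
import math


def box_cox(y, lam):
    """Classical Box-Cox transformation of a positive value y."""
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


def inverse_box_cox(z, lam):
    """Map a transformed value z back to the original scale."""
    if lam == 0:
        return math.exp(z)
    return (lam * z + 1.0) ** (1.0 / lam)


for lam in (-1.0, 0.0, 0.5, 1.0):
    z = box_cox(5.0, lam)
    print(lam, round(z, 4), round(inverse_box_cox(z, lam), 4))
```

In practice λ is estimated from the data, for example by maximum likelihood, rather than fixed in advance as it is in this toy loop.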

Female mortality from hypertension: a multiple-cause analysis

Relevance: 100.00%

Abstract:

INTRODUCTION: The prevalence of arterial hypertension has been rising in the country, making it a public health problem because of its magnitude and the difficulty of controlling it. OBJECTIVE: To assess the quality of data on hypertension as a cause of death and to verify the gain in information on mortality from arterial hypertension among women aged 10 to 49 years, using the methodology of multiple-cause-of-death analysis. MATERIAL AND METHODS: A database was assembled with 7,332 deaths that occurred in the first half of 2002 and belong to the "Estudo da Morbi-Mortalidade de Mulheres de 10 a 49 anos" (Study of Morbidity and Mortality of Women Aged 10 to 49 Years). The RAMOS (Reproductive Age Mortality Survey) methodology was applied in all Brazilian state capitals and the Federal District. With the additional information, a new death certificate (DO-NOVA) was completed. Two data sets were analyzed: DO-ORIGINAL (before the investigation) and DO-NOVA (after the information was retrieved). Comparisons were made by underlying and multiple causes according to data source (DO-O, DO-N). RESULTS AND CONCLUSION: The DO-ORIGINAL showed some quantitative and qualitative flaws. It was concluded that multiple-cause analysis enriches the information obtained from death certificates (DO). Continuous action is needed to improve the completion of death certificates by physicians, and more studies adopting the multiple-cause methodology are needed.
