54 results for random forest data analysis


Relevance: 100.00%

Abstract:

A four-parameter extension of the generalized gamma distribution capable of modelling a bathtub-shaped hazard rate function is defined and studied. The beauty and importance of this distribution lie in its ability to model monotone and non-monotone failure rate functions, which are quite common in lifetime data analysis and reliability. The new distribution has a number of well-known lifetime special sub-models, such as the exponentiated Weibull, exponentiated generalized half-normal, exponentiated gamma and generalized Rayleigh, among others. We derive two infinite sum representations for its moments. We calculate the density of the order statistics and two expansions for their moments. The method of maximum likelihood is used for estimating the model parameters and the observed information matrix is obtained. Finally, a real data set from the medical area is analysed.

Relevance: 100.00%

Abstract:

The use of remote sensing is necessary for monitoring forest carbon stocks at large scales. Optical remote sensing, although not the most suitable technique for the direct estimation of stand biomass, offers the advantage of providing large temporal and spatial datasets. In particular, information on canopy structure is encompassed in stand reflectance time series. This study focused on the example of Eucalyptus forest plantations, which have recently attracted much attention as a result of their high expansion rate in many tropical countries. Stand-scale time series of Normalized Difference Vegetation Index (NDVI) were obtained from MODIS satellite data after a procedure involving un-mixing and interpolation, on about 15,000 ha of plantations in southern Brazil. The comparison of the planting date of the current rotation (and therefore the age of the stands) estimated from these time series with real values provided by the company showed that the root mean square error was 35.5 days. Age alone explained more than 82% of stand wood volume variability and 87% of stand dominant height variability. Age variables were combined with other variables derived from the NDVI time series and simple bioclimatic data by means of linear (Stepwise) or nonlinear (Random Forest) regressions. The nonlinear regressions gave r-squared values of 0.90 for volume and 0.92 for dominant height, and an accuracy of about 25 m³/ha for volume (15% of the volume average value) and about 1.6 m for dominant height (8% of the height average value). The improvement from including NDVI and bioclimatic data comes from the fact that the cumulative NDVI since planting date integrates the interannual variability of leaf area index (LAI), light interception by the foliage and growth due, for example, to variations in seasonal water stress. The accuracy of biomass and height predictions was strongly improved by using the NDVI integrated over the first two years after planting, which are critical for stand establishment. These results open perspectives for cost-effective monitoring of biomass at large scales in intensively-managed plantation forests. (C) 2011 Elsevier Inc. All rights reserved.
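As a rough illustration of the "NDVI integrated over the first two years after planting" predictor mentioned in this abstract, the sketch below computes NDVI from near-infrared and red reflectance and integrates a time series with the trapezoidal rule. The function names, the 730-day horizon and the trapezoidal rule are assumptions made here for illustration; the abstract does not say how the authors computed the integral.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def integrated_ndvi(days_since_planting, ndvi_values, horizon_days=730):
    """Trapezoidal integral of NDVI from the first sample up to horizon_days.

    Hypothetical helper: samples are assumed sorted by day, and the last
    segment is linearly interpolated if it crosses the horizon.
    """
    points = list(zip(days_since_planting, ndvi_values))
    total = 0.0
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if t0 >= horizon_days:
            break
        if t1 > horizon_days:
            # Cut the segment at the horizon using linear interpolation.
            v1 = v0 + (v1 - v0) * (horizon_days - t0) / (t1 - t0)
            t1 = horizon_days
        total += 0.5 * (v0 + v1) * (t1 - t0)
    return total


# Illustrative values only, not data from the study.
example = integrated_ndvi([0, 365, 730, 1095], [0.5, 0.5, 0.5, 0.5])
```

A value like `example` could then be used as one input variable, alongside stand age, in a linear or Random Forest regression.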

Relevance: 100.00%

Abstract:

The stock market suffers uncertain relations throughout the entire negotiation process, with different variables exerting direct and indirect influence on stock prices. This study focuses on the analysis of certain aspects that may influence these values offered by the capital market, based on the Brazil Index of the Sao Paulo Stock Exchange (Bovespa), which selects 100 stocks among the most traded on Bovespa in terms of number of trades and financial volume. The selected variables are characterized by the companies' activity area and the business volume in the month of data collection, i.e. April 2007. This article proposes an analysis that joins the accounting view of the stock price variables that can be influenced with the use of multivariate qualitative data analysis. Data were explored through Correspondence Analysis (Anacor) and Homogeneity Analysis (Homals). According to the research, the selected variables are associated with the values presented by the stocks, which become an internal control instrument and a decision-making tool when it comes to choosing investments.

Relevance: 100.00%

Abstract:

Saimiri sciureus is one of the smallest Cebidae, native to the Amazon region and also found in the biological reserve of the northeast Atlantic forest. It is an omnivorous animal, with a diversified diet that directly influences the lingual mucosa, which includes certain types of papillae with different organization levels. The present study attempted to describe the morphological and ultrastructural aspects of the dorsal surface of the tongue of S. sciureus. Five tongues of S. sciureus were analyzed, from three males and two females that died from natural causes and were obtained from breeding colonies of CENP-Ananindeua-PA. The main macroscopic features were a generally triangular shape with craniocaudal elongation and a pointed apex. Tissue samples (apex, body, and root of the tongue) were fixed in modified Karnovsky solution, following the standard scanning protocol, mounted on stubs, coated with gold, and analyzed by Scanning Electron Microscopy (SEM). Four types of papillae were described: filiform (along all tissue extension, 154 μm in diameter), fungiform (along all tissue extension, 272 μm in diameter), vallate [just three units in the caudal (dorsal) portion, 830 μm in diameter] and foliate (one pair at the caudolateral surface with ~13 projections and 3000 μm in length). Data analysis indicates that the distribution and ultrastructural morphology of the S. sciureus lingual papillae are somewhat similar to those of other primates. Microsc. Res. Tech. 74:484-487, 2011. (C) 2010 Wiley-Liss, Inc.

Relevance: 100.00%

Abstract:

During field work in Nazare Paulista, state of Sao Paulo, Brazil, we found 13 (56.5%) of 23 birds (mostly Passeriformes) to be infested by 28 larvae and 1 nymph of Amblyomma spp. Two larvae were reared to the adult stage, being taxonomically identified as Amblyomma parkeri Fonseca and Aragao, whereas five larvae and one nymph were identified as Amblyomma longirostre Koch. All six A. longirostre specimens were shown to be infected by rickettsia, as demonstrated by polymerase chain reaction (PCR) targeting two rickettsial genes (gltA and ompA) or isolation of rickettsia in cell culture from one of the ticks. This isolate was designated as strain AL, which was established in Vero cell culture and was molecularly characterized by DNA sequencing fragments of the rickettsial genes gltA, htrA, ompA, and ompB. Phylogenetic analyses inferred from ompA and ompB partial sequences showed a high degree of similarity of strain AL with Rickettsia sp. strain ARANHA, previously detected by PCR in A. longirostre ticks from Rondonia, northern Brazil. We conclude that strain AL is a new rickettsia genotype belonging to the same species as strain ARANHA, which are closely related to Candidatus 'R. amblyommii'. Further studies should elucidate if strains AL and ARANHA are different strains of Candidatus 'R. amblyommii' or are a new species.

Relevance: 100.00%

Abstract:

We studied superclusters of galaxies in a volume-limited sample extracted from the Sloan Digital Sky Survey Data Release 7 and from mock catalogues based on a semi-analytical model of galaxy evolution in the Millennium Simulation. A density field method was applied to a sample of galaxies brighter than M_r = -21 + 5 log h_100 to identify superclusters, taking into account selection and boundary effects. In order to evaluate the influence of the threshold density, we have chosen two thresholds: the first maximizes the number of objects (D1) and the second constrains the maximum supercluster size to ~120 h^-1 Mpc (D2). We have performed a morphological analysis, using Minkowski Functionals, based on a parameter, which increases monotonically from filaments to pancakes. An anticorrelation was found between supercluster richness (and total luminosity or size) and the morphological parameter, indicating that filamentary structures tend to be richer, larger and more luminous than pancakes in both observed and mock catalogues. We have also used the mock samples to compare supercluster morphologies identified in position and velocity spaces, concluding that our morphological classification is not biased by the peculiar velocities. Monte Carlo simulations designed to investigate the reliability of our results with respect to random fluctuations show that these results are robust. Our analysis indicates that filaments and pancakes present different luminosity and size distributions.

Relevance: 100.00%

Abstract:

Astronomy has evolved almost exclusively by the use of spectroscopic and imaging techniques, operated separately. With the development of modern technologies, it is possible to obtain data cubes in which one combines both techniques simultaneously, producing images with spectral resolution. Extracting information from them can be quite complex, and hence the development of new methods of data analysis is desirable. We present a method of analysis of data cubes (data from single field observations, containing two spatial and one spectral dimension) that uses Principal Component Analysis (PCA) to express the data in the form of reduced dimensionality, facilitating efficient information extraction from very large data sets. PCA transforms the system of correlated coordinates into a system of uncorrelated coordinates ordered by principal components of decreasing variance. The new coordinates are referred to as eigenvectors, and the projections of the data on to these coordinates produce images we will call tomograms. The association of the tomograms (images) to eigenvectors (spectra) is important for the interpretation of both. The eigenvectors are mutually orthogonal, and this information is fundamental for their handling and interpretation. When the data cube shows objects that present uncorrelated physical phenomena, the eigenvectors' orthogonality may be instrumental in separating and identifying them. By handling eigenvectors and tomograms, one can enhance features, extract noise, compress data, extract spectra, etc. We applied the method, for illustration purposes only, to the central region of the low ionization nuclear emission region (LINER) galaxy NGC 4736, and demonstrate that it has a type 1 active nucleus, not known before. Furthermore, we show that it is displaced from the centre of its stellar bulge.
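The PCA step described in this abstract can be sketched as follows. This is a minimal illustration, assuming NumPy is available and that the cube is stored with axes (y, x, wavelength); the function name `pca_tomography` and the synthetic cube are hypothetical and are not taken from the authors' implementation.

```python
import numpy as np


def pca_tomography(cube):
    """PCA of a data cube with shape (ny, nx, n_lambda).

    Returns the variances (eigenvalues, decreasing), the eigenvectors
    (one spectrum per column) and the tomograms (one image per eigenvector).
    """
    ny, nx, n_lambda = cube.shape
    # One row per spatial pixel, one column per spectral channel.
    spectra = cube.reshape(ny * nx, n_lambda).astype(float)
    centered = spectra - spectra.mean(axis=0)
    # Covariance between spectral channels.
    cov = centered.T @ centered / (ny * nx - 1)
    variances, vectors = np.linalg.eigh(cov)
    # eigh returns ascending eigenvalues; reorder to decreasing variance.
    order = np.argsort(variances)[::-1]
    variances, vectors = variances[order], vectors[:, order]
    # Project every pixel onto every eigenvector and restore the image shape.
    tomograms = (centered @ vectors).reshape(ny, nx, n_lambda)
    return variances, vectors, tomograms


# Synthetic cube for illustration only.
rng = np.random.default_rng(0)
cube = rng.normal(size=(8, 9, 5))
variances, vectors, tomograms = pca_tomography(cube)
```

Here `tomograms[:, :, k]` is the image associated with the spectrum `vectors[:, k]`, which mirrors the tomogram and eigenvector pairing the abstract describes.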

Relevance: 100.00%

Abstract:

Non-linear methods for estimating variability in time-series are currently in widespread use. Among such methods are approximate entropy (ApEn) and sample approximate entropy (SampEn). The applicability of ApEn and SampEn in analyzing data is evident and their use is increasing. However, consistency is a point of concern in these tools, i.e., the classification of the temporal organization of a data set might indicate a relatively less ordered series in relation to another when the opposite is true. As highlighted by their proponents themselves, ApEn and SampEn might present incorrect results due to this lack of consistency. In this study, we present a method which gains consistency by using ApEn repeatedly in a wide range of combinations of window lengths and matching error tolerance. The tool is called volumetric approximate entropy, vApEn. We analyze nine artificially generated prototypical time-series with different degrees of temporal order (combinations of sine waves, logistic maps with different control parameter values, random noises). While ApEn/SampEn clearly fail to consistently identify the temporal order of the sequences, vApEn does so correctly. In order to validate the tool we performed shuffled and surrogate data analysis. Statistical analysis confirmed the consistency of the method. (C) 2008 Elsevier Ltd. All rights reserved.
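To make the idea of "using ApEn repeatedly over combinations of window lengths and tolerances" concrete, here is a small sketch. `approximate_entropy` follows the standard ApEn definition (self-matches counted, Chebyshev distance). `apen_over_grid` is a hypothetical aggregate that simply sums ApEn over a grid; the abstract does not state how vApEn combines the individual values, so this should not be read as the authors' definition.

```python
import math
import random


def approximate_entropy(series, m, r):
    """ApEn(m, r) of a sequence, counting self-matches."""
    n = len(series)

    def phi(length):
        # All overlapping templates of the given length.
        templates = [series[i:i + length] for i in range(n - length + 1)]
        total = 0.0
        for a in templates:
            # Number of templates within tolerance r (Chebyshev distance).
            matches = sum(
                1 for b in templates
                if max(abs(x - y) for x, y in zip(a, b)) <= r
            )
            total += math.log(matches / len(templates))
        return total / len(templates)

    return phi(m) - phi(m + 1)


def apen_over_grid(series, windows, tolerances):
    """Hypothetical aggregate: sum of ApEn over all (m, r) combinations."""
    return sum(
        approximate_entropy(series, m, r)
        for m in windows
        for r in tolerances
    )


# Two illustrative series: an ordered sine wave and uniform random noise.
sine = [math.sin(2 * math.pi * i / 25) for i in range(200)]
rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(200)]
```

Comparing `apen_over_grid(sine, (1, 2), (0.1, 0.2, 0.3))` with the same call on `noise` gives a ranking that does not hinge on a single choice of window length and tolerance.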

Relevance: 100.00%

Abstract:

Sodreaninae is reviewed and all ten species are combined under its type genus, Sodreana Mello-Leitao, 1922, according to a cladistic analysis of morphological characters, which revealed a pectinate pattern of clades. The subfamily is endemic to the Brazilian Atlantic rainforest from Santa Catarina state to Rio de Janeiro state. Sodreana is herein considered a senior synonym of Stygnobates Mello-Leitao, 1927, Zortalia Mello-Leitao, 1936, Gertia B. Soares & H. Soares, 1946 and Annampheres H. Soares, 1979. The following new combinations are proposed: Sodreana barbiellinii (Mello-Leitao, 1927), Sodreana hatschbachi (B. Soares & H. Soares, 1946), Sodreana inscripta (Mello-Leitao, 1939), Sodreana leprevosti (B. Soares & H. Soares, 1947b), Sodreana bicalcarata (Mello-Leitao, 1936). Sodreana granulata (Mello-Leitao, 1937) is revalidated from the synonymy of Sodreana sodreana Mello-Leitao, 1922. Three new species are described: Sodreana glaucoi from Ilhabela and Boraceia, Sao Paulo state; S. curupira from Parque Nacional da Serra dos Orgaos, Rio de Janeiro state, and S. caipora from Ubatuba, Sao Paulo state. Sodreaninae species are restricted to forested areas and most occur in the southern part of the coastal Atlantic rainforest; one species occurs in the interior Atlantic rainforest. The biogeographical analysis (Brooks Parsimony Analysis) resulted in a single and fully resolved most parsimonious tree with three main components: northern (Bahia and Serra do Espinhaco), southern (Santa Catarina, Parana, Serra do Mar of Sao Paulo), and central (Espirito Santo, Serra da Bocaina, southern state of Rio de Janeiro, Serra dos Orgaos, Serra da Mantiqueira, Serra do Mar of Sao Paulo).

Relevance: 100.00%

Abstract:

The species related to Vriesea paraibica (Bromeliaceae, Tillandsioideae) have controversial taxonomic limits. For several decades, this group has been identified in herbarium collections as V. × morreniana, an artificial hybrid that does not grow in natural habitats. The aim of this study was to assess the morphological variation in the V. paraibica complex through morphometric analyses of natural populations. Two sets of analyses were performed: the first involved six natural populations (G1) and the second was carried out on taxa that emerged from the first analysis, but using material from herbarium collections (G2). Univariate ANOVA was used, as well as discriminant analysis of 16 morphometric variables in G1 and 18 in G2. The results of the analyses of the two groups were similar and led to the selection of diagnostic traits of four species. Lengths of the lower and median floral bracts were significant for the separation of red and yellow floral bracts. Vriesea paraibica and V. interrogatoria have red bracts; these two species are differentiated by the widths of the lower and median portions of the inflorescence and by scape length. These structures are larger in the former and smaller in the latter. Of the species with yellow floral bracts, V. eltoniana is distinguished by longer leaf blades and scapes and V. flava is characterized by its shorter sepal lengths. (C) 2009 The Linnean Society of London, Botanical Journal of the Linnean Society, 2009, 159, 163-181.

Relevance: 100.00%

Abstract:

A large amount of biological data has been produced in recent years. Important knowledge can be extracted from these data by the use of data analysis techniques. Clustering plays an important role in data analysis, by organizing similar objects from a dataset into meaningful groups. Several clustering algorithms have been proposed in the literature. However, each algorithm has its bias, being more adequate for particular datasets. This paper presents a mathematical formulation to support the creation of consistent clusters for biological data. Moreover, it shows a clustering algorithm to solve this formulation that uses GRASP (Greedy Randomized Adaptive Search Procedure). We compared the proposed algorithm with three other known algorithms. The proposed algorithm presented the best clustering results, as confirmed statistically. (C) 2009 Elsevier Ltd. All rights reserved.

Relevance: 100.00%

Abstract:

In interval-censored survival data, the event of interest is not observed exactly but is only known to occur within some time interval. Such data appear very frequently. In this paper, we are concerned only with parametric forms, and so a location-scale regression model based on the exponentiated Weibull distribution is proposed for modeling interval-censored data. We show that the proposed log-exponentiated Weibull regression model for interval-censored data represents a parametric family of models that includes other regression models that are broadly used in lifetime data analysis. Assuming the use of interval-censored data, we employ a frequentist analysis, a jackknife estimator, a parametric bootstrap and a Bayesian analysis for the parameters of the proposed model. We derive the appropriate matrices for assessing local influences on the parameter estimates under different perturbation schemes and present some ways to assess global influences. Furthermore, for different parameter settings, sample sizes and censoring percentages, various simulations are performed; in addition, the empirical distribution of some modified residuals is displayed and compared with the standard normal distribution. These studies suggest that the residual analysis usually performed in normal linear regression models can be straightforwardly extended to a modified deviance residual in log-exponentiated Weibull regression models for interval-censored data. (C) 2009 Elsevier B.V. All rights reserved.
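For readers unfamiliar with interval censoring, the sketch below shows how such observations enter a likelihood: an event known to fall in (lower, upper] contributes F(upper) - F(lower). It uses the exponentiated Weibull distribution function F(t) = [1 - exp(-(t/scale)^shape)]^power without covariates. The parameter names are chosen here for illustration, and the regression structure of the paper's log-exponentiated Weibull model is deliberately left out.

```python
import math


def ew_cdf(t, shape, scale, power):
    """Exponentiated Weibull distribution function."""
    if t <= 0:
        return 0.0
    if math.isinf(t):
        return 1.0
    return (1.0 - math.exp(-((t / scale) ** shape))) ** power


def interval_censored_loglik(intervals, shape, scale, power):
    """Log-likelihood for events known only to lie in (lower, upper].

    An upper limit of math.inf represents a right-censored observation.
    """
    total = 0.0
    for lower, upper in intervals:
        prob = ew_cdf(upper, shape, scale, power) - ew_cdf(lower, shape, scale, power)
        total += math.log(prob)
    return total


# Illustrative intervals only, not data from the paper.
intervals = [(1.0, 2.0), (0.5, 3.0), (2.0, math.inf)]
loglik = interval_censored_loglik(intervals, shape=1.5, scale=2.0, power=0.8)
```

Maximizing a function like this over the parameters would give maximum likelihood estimates for the no-covariate case; with `power=1` the distribution reduces to the ordinary Weibull.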

Relevance: 100.00%

Abstract:

In this work we propose and analyze nonlinear elliptical models for longitudinal data, which represent an alternative to Gaussian models in the cases of heavy tails, for instance. The elliptical distributions may help to control the influence of the observations in the parameter estimates by naturally attributing different weights for each case. We consider random effects to introduce the within-group correlation and work with the marginal model without requiring numerical integration. An iterative algorithm to obtain maximum likelihood estimates for the parameters is presented, as well as diagnostic results based on residual distances and local influence [Cook, D., 1986. Assessment of local influence. Journal of the Royal Statistical Society - Series B 48 (2), 133-169; Cook D., 1987. Influence assessment. Journal of Applied Statistics 14 (2), 117-131; Escobar, L.A., Meeker, W.Q., 1992. Assessing influence in regression analysis with censored data. Biometrics 48, 507-528]. As numerical illustration, we apply the obtained results to a kinetics longitudinal data set presented in [Vonesh, E.F., Carter, R.L., 1992. Mixed-effects nonlinear regression for unbalanced repeated measures. Biometrics 48, 1-17], which was analyzed under the assumption of normality. (C) 2009 Elsevier B.V. All rights reserved.

Relevance: 100.00%

Abstract:

The use of inter-laboratory test comparisons to determine the performance of individual laboratories for specific tests (or for calibration) [ISO/IEC Guide 43-1, 1997. Proficiency testing by interlaboratory comparisons - Part 1: Development and operation of proficiency testing schemes] is called Proficiency Testing (PT). In this paper we propose the use of the generalized likelihood ratio test to compare the performance of the group of laboratories for specific tests relative to the assigned value and illustrate the procedure using actual data from the PT program in the area of volume. The proposed test extends the test criteria in use, making it possible to test for the consistency of the group of laboratories. Moreover, the class of elliptical distributions is considered for the obtained measurements. (C) 2008 Elsevier B.V. All rights reserved.

Relevance: 100.00%

Abstract:

In survival analysis applications, the failure rate function may frequently present a unimodal shape. In such cases, the log-normal or log-logistic distributions are used. In this paper, we shall be concerned only with parametric forms, so a location-scale regression model based on the Burr XII distribution is proposed for modeling data with a unimodal failure rate function as an alternative to the log-logistic regression model. Assuming censored data, we consider a classic analysis, a Bayesian analysis and a jackknife estimator for the parameters of the proposed model. For different parameter settings, sample sizes and censoring percentages, various simulation studies are performed and compared to the performance of the log-logistic and log-Burr XII regression models. Besides, we use sensitivity analysis to detect influential or outlying observations, and residual analysis is used to check the assumptions in the model. Finally, we analyze a real data set under log-Burr XII regression models. (C) 2008 Published by Elsevier B.V.
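The unimodal failure rate mentioned in this abstract can be seen directly from the Burr XII hazard function. The sketch below assumes the common parameterization with survival function S(t) = [1 + (t/s)^c]^(-k), which gives h(t) = (c k / s) (t/s)^(c-1) / [1 + (t/s)^c]. For c > 1 this hazard rises to a single peak at t = s (c - 1)^(1/c) and then declines. The parameter names c, k and s are an assumption here and may differ from the paper's notation.

```python
def burr_xii_hazard(t, c, k, s=1.0):
    """Hazard of the Burr XII distribution with S(t) = (1 + (t/s)**c) ** (-k)."""
    z = (t / s) ** c
    return (c * k / s) * (t / s) ** (c - 1) / (1.0 + z)


def burr_xii_hazard_mode(c, s=1.0):
    """Location of the hazard peak; only meaningful when c > 1."""
    if c <= 1:
        raise ValueError("the hazard is unimodal only for c > 1")
    return s * (c - 1) ** (1.0 / c)


# Illustrative parameter values: hazard evaluated before, at and after the peak.
peak = burr_xii_hazard_mode(c=2.0)
values = [burr_xii_hazard(t, c=2.0, k=1.0) for t in (0.5 * peak, peak, 2.0 * peak)]
```

With k = 1 the Burr XII distribution coincides with the log-logistic distribution, which is consistent with the abstract presenting the Burr XII model as an alternative to log-logistic regression.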