54 results for Data distribution


Relevance:

30.00%

Publisher:

Abstract:

There is great scientific and popular interest in understanding the genetic history of populations in the Americas. We wish to understand when different regions of the continent were inhabited, where settlers came from, and how current inhabitants relate genetically to earlier populations. Recent studies unraveled parts of the genetic history of the continent using genotyping arrays and uniparental markers. The 1000 Genomes Project provides a unique opportunity for improving our understanding of population genetic history by providing over a hundred sequenced low-coverage genomes and exomes from Colombian (CLM), Mexican-American (MXL), and Puerto Rican (PUR) populations. Here, we explore the genomic contributions of African, European, and especially Native American ancestry to these populations. Estimated Native American ancestry is 48% in MXL, 25% in CLM, and 13% in PUR. Native American ancestry in PUR is most closely related to populations surrounding the Orinoco River basin, confirming the South American ancestry of the Taíno people of the Caribbean. We present new methods to estimate the allele frequencies in the Native American fraction of the populations, and model their distribution using a demographic model for three ancestral Native American populations. These ancestral populations likely split in close succession: the most likely scenario, based on a peopling of the Americas 16 thousand years ago (kya), supports that the MXL ancestors split 12.2 kya, with a subsequent split of the ancestors of CLM and PUR 11.7 kya. The model also features effective populations of 62,000 in Mexico, 8,700 in Colombia, and 1,900 in Puerto Rico. Modeling identity-by-descent (IBD) and ancestry tract length, we show that post-contact populations also differ markedly in their effective sizes and migration patterns, with Puerto Rico showing the smallest effective size and the earliest migration from Europe. Finally, we compare IBD and ancestry assignments to find evidence for relatedness among European founders of the three populations.

Relevance:

30.00%

Publisher:

Abstract:

The origin of Spanish regional economic divergence can be traced back at least to the seventeenth century, although it took full shape during industrialisation. Historians have often included uneven regional infrastructure endowments among the factors that explain divergence among Spanish regions, although no systematic analysis of the spatial distribution of Spanish infrastructure and its determinants has been carried out so far. This paper aims to fill that gap by describing the regional distribution of the main Spanish transport infrastructure between the middle of the nineteenth century and the Civil War. In addition, it estimates a panel data model to investigate the main reasons for the differences among the Spanish regional endowments of railways and roads during that period. The results of that analysis indicate that both institutional factors and the physical characteristics of each area had a strong influence on the distribution of transport infrastructure among the Spanish regions.

Relevance:

30.00%

Publisher:

Abstract:

The aim of this study is to collect and statistically analyze variables that could be related to second-hand house prices in Barcelona, disaggregated as far as possible, for the period 2008 to 2011. The study consists of two parts. The first is a preliminary study of the data; the second is an econometric analysis to determine whether there is any relationship between second-hand house prices and the chosen variables. Finally, we checked for atypical observations and for multicollinearity in the model. With all this information, we draw some conclusions and then analyze the information in more depth.
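
The abstract does not say which multicollinearity diagnostic was used. One common choice is the variance inflation factor (VIF), which regresses each explanatory variable on the others and reports 1 / (1 - R^2). The following is a minimal sketch under that assumption; the variable names (`area`, `rooms`, `distance`) and the simulated data are hypothetical and are not taken from the study.

```python
import random

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def r_squared(y, columns):
    """R^2 of an OLS regression of y on the given columns plus an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(design[i][p] * design[i][q] for i in range(n)) for q in range(k)]
           for p in range(k)]
    xty = [sum(design[i][p] * y[i] for i in range(n)) for p in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(beta[p] * design[i][p] for p in range(k)) for i in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((y[i] - fitted[i]) ** 2 for i in range(n))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - ss_res / ss_tot

def variance_inflation_factors(columns):
    """VIF of each column: 1 / (1 - R^2) from regressing it on the others."""
    vifs = []
    for j, target in enumerate(columns):
        others = columns[:j] + columns[j + 1:]
        vifs.append(1.0 / (1.0 - r_squared(target, others)))
    return vifs

# Hypothetical, simulated explanatory variables (not data from the study).
random.seed(1)
area = [random.gauss(0.0, 1.0) for _ in range(200)]
rooms = [a + random.gauss(0.0, 0.1) for a in area]       # nearly collinear with area
distance = [random.gauss(0.0, 1.0) for _ in range(200)]  # independent of the others
vifs = variance_inflation_factors([area, rooms, distance])
```

A frequently quoted rule of thumb treats a VIF above about 10 as a sign of problematic collinearity; here the first two variables would be flagged and the third would not.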

Relevance:

30.00%

Publisher:

Abstract:

The most suitable method for estimating size diversity is investigated. Size diversity is computed on the basis of the Shannon diversity expression adapted for continuous variables, such as size. It takes the form of an integral involving the probability density function (pdf) of the size of the individuals. Different approaches for the estimation of the pdf are compared: parametric methods, which assume that the data come from a given family of pdfs, and nonparametric methods, where the pdf is estimated using some kind of local evaluation. Exponential, generalized Pareto, normal, and log-normal distributions have been used to generate simulated samples using parameters estimated from real samples. Nonparametric methods include discrete computation of data histograms based on size intervals and continuous kernel estimation of the pdf. The kernel approach gives an accurate estimate of size diversity, whilst parametric methods are only useful when the reference distribution has a shape similar to the real one. Special attention is given to data standardization. The division of data by the sample geometric mean is proposed as the most suitable standardization method, which shows additional advantages: the same size diversity value is obtained when using original size or log-transformed data, and size measurements with different dimensionality (lengths, areas, volumes or biomasses) may be immediately compared with the simple addition of ln k, where k is the dimensionality (1, 2, or 3, respectively). Thus, kernel estimation, after data standardization by division by the sample geometric mean, emerges as the most reliable and generalizable method of size diversity evaluation.
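
The procedure described above (standardize by the geometric mean, estimate the pdf with a kernel, integrate the Shannon expression) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes a Gaussian kernel, a normal-reference rule-of-thumb bandwidth, the natural logarithm, and trapezoidal integration on a regular grid, none of which are specified in the abstract. All function names are hypothetical.

```python
import math
import random

def standardize_by_geometric_mean(sizes):
    """Divide each (strictly positive) size by the sample geometric mean."""
    log_mean = sum(math.log(s) for s in sizes) / len(sizes)
    gm = math.exp(log_mean)
    return [s / gm for s in sizes]

def rule_of_thumb_bandwidth(sample):
    """Normal-reference bandwidth 1.06 * sd * n^(-1/5) (an assumed choice)."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((s - mean) ** 2 for s in sample) / (n - 1))
    return 1.06 * sd * n ** (-0.2)

def gaussian_kde(sample, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    norm = 1.0 / (len(sample) * bandwidth * math.sqrt(2.0 * math.pi))
    def pdf(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
                          for s in sample)
    return pdf

def size_diversity(sizes, grid_points=400):
    """Estimate -integral p(x) ln p(x) dx for geometric-mean-standardized sizes."""
    x = standardize_by_geometric_mean(sizes)
    h = rule_of_thumb_bandwidth(x)
    pdf = gaussian_kde(x, h)
    lo, hi = min(x) - 4.0 * h, max(x) + 4.0 * h
    step = (hi - lo) / (grid_points - 1)
    total = 0.0
    for i in range(grid_points):
        p = pdf(lo + i * step)
        integrand = -p * math.log(p) if p > 0.0 else 0.0
        weight = 0.5 if i in (0, grid_points - 1) else 1.0  # trapezoidal rule
        total += weight * integrand * step
    return total

# Simulated log-normal sizes, used only to exercise the functions.
random.seed(0)
sizes = [random.lognormvariate(0.0, 0.5) for _ in range(200)]
diversity = size_diversity(sizes)
```

Because the data are divided by their geometric mean first, changing the measurement unit (for example multiplying every size by 10) leaves the estimate unchanged. Note that a Gaussian kernel places some probability mass below zero for strictly positive data; a boundary-corrected or log-scale kernel may be preferable in practice.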

Relevance:

30.00%

Publisher:

Abstract:

This paper aims to provide insights into the phenomenon of knowledge flows. We study one of the main mechanisms through which these flows occur, i.e., the mobility of highly skilled individuals. We focus on the geographical mobility of inventors across European regions. Thus, patent data are used to trace the pattern of inventors' mobility across European regions, to identify the centres that attract talent throughout the continent, and to study their distribution across space. To do so, we gather information from PCT patent documents. We first match the names that appear to belong to the same inventor, and then we create a new algorithm to decide whether each patent applied for under each name belongs to the same inventor.
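
The abstract does not describe how names are matched, so the following is only a sketch of what a first name-matching pass could look like, assuming accent-insensitive, order-insensitive string similarity. The threshold, the helper names, and the sample names are hypothetical; the second step (deciding whether patents under matched names belong to one inventor) is not attempted here.

```python
import difflib
import unicodedata

def normalize_name(name):
    """Lower-case, strip accents and punctuation, and sort the name tokens."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() else " " for ch in no_accents.lower())
    return " ".join(sorted(cleaned.split()))

def similar(a, b, threshold=0.9):
    """True if the normalized names are at least `threshold` similar."""
    ratio = difflib.SequenceMatcher(
        None, normalize_name(a), normalize_name(b)).ratio()
    return ratio >= threshold

def group_names(names, threshold=0.9):
    """Greedily assign each name to the first group whose first member matches."""
    groups = []
    for name in names:
        for group in groups:
            if similar(name, group[0], threshold):
                group.append(name)
                break
        else:
            groups.append([name])
    return groups

# Hypothetical inventor names as they might appear on patent documents.
names = ["García, José", "Jose Garcia", "GARCIA JOSE",
         "Müller, Hans", "Hans Mueller", "Smith, Anna"]
groups = group_names(names)
```

String similarity alone cannot separate different people who share a name, which is presumably why a second, patent-level decision step is needed.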

Relevance:

30.00%

Publisher:

Abstract:

Soil properties on the Cap de Creus Peninsula, NE Spain, depend primarily on scarce agricultural practices and early abandonment. In the study area, 90% of which is mainly covered by Cistus shrubs, 8 environments representing variations in land use/land cover and soil properties at different depths were identified. In each environment variously vegetated areas were selected and sampled. The soils, collected at different depths, were classified as Lithic Xerorthents according to the United States Department of Agriculture system of soil classification (USDA-NRCS 1975). Differences in soil properties largely followed the evolution of the plant canopy and the land use history. To identify underlying patterns in soil properties related to environmental evolution, factor analysis was performed and factor scores were used to determine how the factor patterns varied between soil variables, soil depths and selected environments. The three-factor model always accounted for 80% of the total variation in the data at the different soil depths. Organic matter was the most relevant soil property at 0–2 cm depth, whereas active minerals (silt and clay) were found to be the most relevant soil parameters controlling soil dynamics at the other depths investigated. Results showed that vineyard and olive tree soils are poorly developed and present worse conditions for mineral and organic compounds. Analysis of factor scores allowed independent assessment of soils, depth and plant cover and demonstrated that soils present the best physico-chemical characteristics under Erica arborea and meadows. In contrast, soils under Cistus monspeliensis were less nutrient rich and less well structured.

Relevance:

30.00%

Publisher:

Abstract:

Phenomena with a constrained sample space appear frequently in practice. This is the case, for example, with strictly positive data, or with compositional data such as percentages or proportions. If the natural measure of difference is not the absolute one, simple algebraic properties show that it is more convenient to work with a geometry different from the usual Euclidean geometry in real space, and with a measure different from the usual Lebesgue measure, leading to alternative models which better fit the phenomenon under study. The general approach is presented and illustrated using the normal distribution, both on the positive real line and on the D-part simplex. The original ideas of McAlister in his introduction to the lognormal distribution in 1879 are recovered and updated.
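
For the positive real line, the change of geometry and measure described above can be written out as follows. This is a sketch under the assumption that differences between positive values are measured on the log scale; the symbols $d_+$, $\lambda_+$, $f_+$, $\mu$ and $\sigma$ are notation introduced here for illustration and do not appear in the abstract.

```latex
\begin{align*}
d_+(x, y) &= \lvert \ln x - \ln y \rvert, \qquad
\mathrm{d}\lambda_+(x) = \frac{\mathrm{d}x}{x}, \qquad x, y > 0, \\
f_+(x) &= \frac{1}{\sqrt{2\pi}\,\sigma}
          \exp\!\left(-\frac{(\ln x - \mu)^2}{2\sigma^2}\right)
          && \text{(density with respect to } \lambda_+\text{)}, \\
f_{\mathrm{LN}}(x) &= \frac{1}{x}\, f_+(x)
          && \text{(classical lognormal density with respect to Lebesgue measure)}.
\end{align*}
```

Both densities assign the same probability to any interval, since $f_+\,\mathrm{d}\lambda_+ = f_{\mathrm{LN}}\,\mathrm{d}x$. They differ in the reference measure, which changes, for example, where the density peaks: at $e^{\mu}$ for $f_+$ and at $e^{\mu-\sigma^2}$ for $f_{\mathrm{LN}}$.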

Relevance:

30.00%

Publisher:

Abstract:

Peer-reviewed