54 results for Data distribution
at Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Abstract:
This project describes the merging of the daily monitoring needs of the ATLAS experiment from the cloud point of view. The main idea is to develop a set of collectors that gather information on data distribution and processing and on the WLCG SAM (Service Availability Monitoring) tests, storing it in dedicated databases so that the results can be shown on a single HLM (High Level Monitoring) page. Once this is achieved, the application must allow further investigation through interaction with the front-end, which will be fed by the statistics stored in the database.
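The abstract gives no implementation details; purely as an illustration of the collector-plus-database pattern it describes, the Python sketch below polls a hypothetical monitoring endpoint and stores the results in SQLite, where a front-end page could later query them. The URL, JSON fields and table layout are assumptions, not the project's actual schema.

    # Minimal sketch: a collector storing SAM-style test results in SQLite.
    # The endpoint URL, JSON schema and table layout are hypothetical.
    import json, sqlite3, urllib.request

    DB = sqlite3.connect("hlm_stats.db")
    DB.execute("""CREATE TABLE IF NOT EXISTS sam_results
                  (site TEXT, test TEXT, status TEXT, ts TEXT)""")

    def collect(url):
        """Fetch one batch of test results and store them."""
        with urllib.request.urlopen(url) as resp:
            records = json.load(resp)
        DB.executemany("INSERT INTO sam_results VALUES (?, ?, ?, ?)",
                       [(r["site"], r["test"], r["status"], r["timestamp"])
                        for r in records])
        DB.commit()

    # A front-end page could then be fed by a query such as:
    # SELECT site, test, status FROM sam_results ORDER BY ts DESC LIMIT 100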
Abstract:
Final-year project in the Computer Networks area of the Computer Engineering degree. The project concerns the development of an automatic system for downloading, distributing and transforming data from buoys using the ARGOS system, together with a mobile application for the iOS mobile operating system, for use on iPhone devices.
Abstract:
In this paper, we propose a new supervised linear feature extraction technique for multiclass classification problems that is specially suited to the nearest neighbor classifier (NN). The problem of finding the optimal linear projection matrix is defined as a classification problem, and the AdaBoost algorithm is used to compute it in an iterative way. This strategy allows the introduction of a multitask learning (MTL) criterion into the method and results in a solution that makes no assumptions about the data distribution and that is especially appropriate for solving the small sample size problem. The performance of the method is illustrated by an application to the face recognition problem. The experiments show that the representation obtained following the multitask approach improves on the classic feature extraction algorithms when using the NN classifier, especially when we have few examples from each class.
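The boosted projection-learning method itself is not spelled out in the abstract, so the sketch below only illustrates the surrounding pipeline (learn a linear projection from labelled data, then classify with 1-NN). scikit-learn's LinearDiscriminantAnalysis is used as a stand-in for the paper's AdaBoost-based projection, and the data are synthetic.

    # Project-then-1-NN pipeline; LDA stands in for the paper's boosted
    # projection, and the data set is synthetic.
    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    X = rng.normal(size=(300, 50))            # 300 samples, 50 features
    y = rng.integers(0, 3, size=300)          # 3 classes
    X[y == 1] += 1.0                          # give the classes some separation
    X[y == 2] -= 1.0

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    proj = LinearDiscriminantAnalysis(n_components=2).fit(X_tr, y_tr)
    nn = KNeighborsClassifier(n_neighbors=1).fit(proj.transform(X_tr), y_tr)
    print("1-NN accuracy in the projected space:",
          nn.score(proj.transform(X_te), y_te))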
Abstract:
Regional disparities in unemployment rates are large and persistent. The literature provides evidence of their magnitude and evolution, as well as evidence of the role of certain economic, demographic and environmental factors in explaining the gap between regions of low and high unemployment. Most of these studies, however, adopt an aggregate approach and so do not account for the individual characteristics of the unemployed and employed in each region. This paper, by drawing on micro-data from the Spanish wave of the Labour Force Survey, seeks to remedy this shortcoming by analysing regional differentials in unemployment rates. An appropriate decomposition of the regional gap in the average probability of being unemployed enables us to distinguish the contribution of differences in the regional distribution of individual characteristics from that attributable to a different impact of these characteristics on the probability of unemployment. Our results suggest that the well-documented disparities in regional unemployment are not just the result of regional heterogeneity in the distribution of individual characteristics. Non-negligible differences in the probability of unemployment remain after controlling for this type of heterogeneity, as a result of differences across regions in the impact of the observed characteristics. Among the factors considered in our analysis, regional differences in the endowment and impact of an individual’s education are shown to play a major role.
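The abstract does not write out the decomposition; one common way to express such a gap decomposition for a nonlinear probability model (an Oaxaca/Fairlie-type split, shown purely as an illustration and not necessarily the exact variant used by the authors) is

\[
\bar{P}_A - \bar{P}_B =
\Big[\tfrac{1}{N_A}\textstyle\sum_{i\in A} F(X_i\hat{\beta}_A) - \tfrac{1}{N_B}\sum_{i\in B} F(X_i\hat{\beta}_A)\Big]
+ \Big[\tfrac{1}{N_B}\textstyle\sum_{i\in B} F(X_i\hat{\beta}_A) - \tfrac{1}{N_B}\sum_{i\in B} F(X_i\hat{\beta}_B)\Big],
\]

where F is the probability model (e.g. a logit or probit), X_i the individual characteristics, A and B the low- and high-unemployment regions; the first bracket captures differences in the distribution of characteristics, the second differences in their impact.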
Abstract:
Income distribution in Spain experienced a substantial improvement towards equalisation during the second half of the seventies and the eighties, a period during which most OECD countries experienced the opposite trend. In spite of the many recent papers on the Spanish income distribution, the period covered by those studies stops in 1990. The aim of this paper is to extend the analysis to 1996, employing the same methodology and the same data set (ECPF). Our results not only corroborate the (decreasing inequality) trend found by others during the second half of the eighties, but also suggest that this trend extends over the first half of the nineties. We also show that our main conclusions are robust to changes in the equivalence scale, to changes in the definition of income and to potential data contamination. Finally, we analyse some of the causes which may be driving the overall picture of income inequality using two decomposition techniques. From these analyses, three variables emerge as the factors mainly responsible for the observed improvement in the income distribution: education, household composition and the socioeconomic situation of the household head.
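The abstract does not name the two decomposition techniques; as one standard example of decomposing inequality by population subgroups (e.g. by education or household type), the Theil T index splits into within- and between-group terms:

\[
T = \sum_{g} s_g\, T_g + \sum_{g} s_g \ln\!\frac{\bar{y}_g}{\bar{y}},
\qquad s_g = \frac{N_g\,\bar{y}_g}{N\,\bar{y}},
\]

where \(\bar{y}_g\) and \(N_g\) are the mean income and size of group g, \(T_g\) the Theil index computed within group g, and \(s_g\) the group's income share. This illustrates the type of technique involved, not necessarily the one used in the paper.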
Abstract:
We present experimental and theoretical analyses of data requirements for haplotype inference algorithms. Our experiments include a broad range of problem sizes under two standard models of tree distribution and were designed to yield statistically robust results despite the size of the sample space. Our results validate Gusfield's conjecture that a population size of n log n is required to give (with high probability) sufficient information to deduce the n haplotypes and their complete evolutionary history. We complement these experimental findings with theoretical bounds on the population size. We also analyze the population size required to deduce some fixed fraction of the evolutionary history of a set of n haplotypes and establish linear bounds on the required sample size. These linear bounds are also shown theoretically.
Abstract:
Report for the scientific sojourn at the Simon Fraser University, Canada, from July to September 2007. General context: landscape change during recent years is having significant impacts on biodiversity in many Mediterranean areas. Land abandonment, urbanisation and especially fire are profoundly transforming large areas in the Western Mediterranean basin, and we know little about how these changes influence species distribution and, in particular, how species will respond to further change in a context of global change, including climate. General objectives: integrate landscape and population dynamics models in a platform that allows capturing species distribution responses to landscape changes and assessing the impact on species distribution of different scenarios of further change. Specific objective 1: develop a landscape dynamic model capturing fire and forest succession dynamics in Catalonia, linked to a stochastic landscape occupancy model (SLOM) or spatially explicit population model (SEPM) for the Ortolan bunting, a species strongly linked to fire-related habitat in the region. Predictions from the Ortolan bunting occupancy or spatially explicit population model (SEPM) should be evaluated using data from the DINDIS database, which tracks bird colonisation of recently burnt large areas (>50 ha). Through a number of different SEPM scenarios with different values for a number of parameters, we should be able to assess different hypotheses about the factors driving bird colonisation of newly burnt patches. These factors are mainly landscape context (i.e. difficulty of reaching the patch and the potential presence of coloniser sources), dispersal constraints, the type of regenerating vegetation after fire, and species characteristics (niche breadth, etc.).
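The SLOM/SEPM itself is not described in the abstract; the toy Python sketch below only conveys the general flavour of a stochastic patch-occupancy model (colonisation probability rising with connectivity to occupied patches, a constant extinction probability). All parameter values and the distance-based connectivity rule are invented for illustration and are not those of the project's model.

    # Toy stochastic patch-occupancy model; parameters and the connectivity
    # rule are illustrative only.
    import numpy as np

    rng = np.random.default_rng(1)
    n_patches, years = 50, 20
    xy = rng.uniform(0, 100, size=(n_patches, 2))    # patch coordinates (km)
    occ = rng.random(n_patches) < 0.2                # initial occupancy
    alpha, c0, e0 = 0.1, 0.05, 0.1                   # dispersal, colonisation, extinction

    dist = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)

    for t in range(years):
        conn = (np.exp(-alpha * dist) * occ).sum(axis=1)   # connectivity to occupied patches
        p_col = 1 - np.exp(-c0 * conn)
        colonise = (~occ) & (rng.random(n_patches) < p_col)
        go_extinct = occ & (rng.random(n_patches) < e0)
        occ = (occ | colonise) & ~go_extinct
        print(f"year {t+1}: {occ.sum()} occupied patches")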
Abstract:
An abundant scientific literature on climate change economics points out that the future participation of developing countries in international environmental policies will depend on their payoffs inside and outside specific agreements. These studies aim to analyse coalition stability, typically through a game-theoretical approach. Though these contributions represent a cornerstone of the research field investigating plausible future international coalitions and the reasons behind the difficulties incurred over time in implementing emission-stabilizing actions, they cannot satisfactorily disentangle the role that equality plays in inducing poor regions to tackle global warming. If we focus on the Stern Review findings, which stress that climate change will generate heavy damages and that policy actions will be costly over a finite time horizon, we understand why there is a great incentive to free ride in order to exploit the benefits of others' emission reduction efforts. The reluctance of poor countries to join international agreements is mainly supported by the historical responsibility of rich regions in generating atmospheric carbon concentration, whereas rich countries claim that emission-stabilizing policies will be effective only when developing countries join them. Scholars have recently pointed out that a perceived fairness in the distribution of emissions would facilitate widespread participation in international agreements. In this paper we review the literature on distributional aspects of emissions, focusing on those contributions that investigate past trends of emission distribution through empirical data and future trajectories through simulations obtained with integrated assessment models. We explain the methodologies used to elaborate the data and the link between real data and those coming from simulations. Results from this strand of research are interpreted in order to discuss future negotiations for post-Kyoto agreements, which will be the focus of the next Conference of the Parties in Copenhagen at the end of 2009. Particular attention is devoted to the role that technological change will play in affecting the distribution of emissions over time, and to how spillovers and the diffusion of experience could influence equality issues and future outcomes of policy negotiations.
Abstract:
In this paper we analyze the persistence of aggregate real exchange rates (RERs) for a group of EU-15 countries by using sectoral data. The tight relation between aggregate and sectoral persistence recently investigated by Mayoral (2008) allows us to decompose aggregate RER persistence into the persistence of its different subcomponents. We show that the distribution of sectoral persistence is highly heterogeneous and very skewed to the right, and that a limited number of sectors are responsible for the high levels of persistence observed at the aggregate level. We use quantile regression to investigate whether the traditional theories proposed to account for the slow reversion to parity (lack of arbitrage due to nontradabilities, or imperfect competition and price stickiness) are able to explain the behavior of the upper quantiles of sectoral persistence. We conclude that pricing to market in the intermediate goods sector, together with price stickiness, has more explanatory power than variables related to the tradability of the goods or their inputs.
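As a rough illustration of the quantile-regression step (not the authors' actual specification or data), the Python sketch below fits an upper-quantile regression of sectoral persistence on hypothetical explanatory variables using statsmodels; the column names and data are invented.

    # Hedged sketch: quantile regression of sectoral persistence on
    # hypothetical regressors (invented column names and data).
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(2)
    n = 200
    df = pd.DataFrame({
        "persistence":      rng.beta(2, 2, n),   # persistence measure per sector
        "nontradability":   rng.random(n),       # share of nontraded inputs (hypothetical)
        "price_stickiness": rng.random(n),       # frequency-of-adjustment proxy (hypothetical)
    })

    # Fit at the 90th percentile to focus on the most persistent sectors.
    res = smf.quantreg("persistence ~ nontradability + price_stickiness", df).fit(q=0.9)
    print(res.summary())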
Abstract:
The aim of this paper is to analyse the colocation patterns of industries and firms. We study the spatial distribution of firms from different industries at a microgeographic level and from this identify the main reasons for this locational behaviour. The empirical application uses data from the Mercantile Registers of Spanish firms (manufacturers and services). Inter-sectorial linkages are shown using self-organizing maps.
Key words: clusters, microgeographic data, self-organizing maps, firm location
JEL classification: R10, R12, R34
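The abstract does not detail the SOM configuration; as one possible way to train a self-organizing map on industry colocation profiles in Python, the sketch below uses the third-party MiniSom package on synthetic data. The grid size, feature count and training parameters are illustrative assumptions.

    # Train a small self-organizing map on synthetic industry colocation
    # profiles using the MiniSom package (pip install minisom).
    import numpy as np
    from minisom import MiniSom

    rng = np.random.default_rng(3)
    profiles = rng.random((120, 10))   # 120 industries x 10 colocation features (synthetic)

    som = MiniSom(6, 6, input_len=10, sigma=1.0, learning_rate=0.5, random_seed=3)
    som.random_weights_init(profiles)
    som.train_random(profiles, num_iteration=1000)

    # Industries mapped to the same node share similar colocation patterns.
    for i, p in enumerate(profiles[:5]):
        print(f"industry {i} -> node {som.winner(p)}")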
Abstract:
This analysis was stimulated by the real data analysis problem of household expenditure data. The full dataset contains expenditure data for a sample of 1224 households. The expenditure is broken down at 2 hierarchical levels: 9 major levels (e.g. housing, food, utilities etc.) and 92 minor levels. There are also 5 factors and 5 covariates at the household level. Not surprisingly, there are a small number of zeros at the major level, but many zeros at the minor level. The question is how best to model the zeros. Clearly, models that try to add a small amount to the zero terms are not appropriate in general, as at least some of the zeros are clearly structural, e.g. alcohol/tobacco for households that are teetotal. The key question then is how to build suitable conditional models. For example, is the sub-composition of spending excluding alcohol/tobacco similar for teetotal and non-teetotal households? In other words, we are looking for sub-compositional independence. Also, what determines whether a household is teetotal? Can we assume that it is independent of the composition? In general, whether a household is teetotal will clearly depend on the household-level variables, so we need to be able to model this dependence. The other tricky question is that, with zeros on more than one component, we need to be able to model dependence and independence of zeros on the different components. Lastly, while some zeros are structural, others may not be; for example, for expenditure on durables, it may be a matter of chance whether a particular household spends money on durables within the sample period. This would clearly be distinguishable if we had longitudinal data, but may still be distinguishable by looking at the distribution, on the assumption that random zeros will usually occur in situations where any non-zero expenditure is not small. While this analysis is based around economic data, the ideas carry over to many other situations, including geological data, where minerals may be missing for structural reasons (similar to alcohol), or missing because they occur only in random regions which may be missed in a sample (similar to the durables).
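The abstract poses the modelling question without committing to a model; one way to formalise the kind of conditional model it asks for (shown only as an illustration, not the authors' proposal) is a two-part formulation in which a binary model governs whether the alcohol/tobacco component is zero and a separate compositional model describes the remaining sub-composition:

\[
\Pr\!\big(y_{i,\mathrm{alc}} = 0 \mid z_i\big) = \pi(z_i),
\qquad
(x_{i,1},\dots,x_{i,D-1}) \mid z_i \;\sim\; f\!\big(\cdot \mid z_i,\ \mathbf{1}\{y_{i,\mathrm{alc}}=0\}\big),
\]

where \(z_i\) are the household-level factors and covariates and the sub-composition excludes alcohol/tobacco; sub-compositional independence corresponds to f not depending on the indicator \(\mathbf{1}\{y_{i,\mathrm{alc}}=0\}\).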
Abstract:
The low levels of unemployment recorded in the UK in recent years are widely cited as evidence of the country’s improved economic performance, and the apparent convergence of unemployment rates across the country’s regions is used to suggest that the longstanding divide in living standards between the relatively prosperous ‘south’ and the more depressed ‘north’ has been substantially narrowed. Dissenters from these conclusions have drawn attention to the greatly increased extent of non-employment (around a quarter of the UK’s working-age population are not in employment) and the marked regional dimension in its distribution across the country. Amongst these dissenters it is generally agreed that non-employment is concentrated amongst older males previously employed in the now very much smaller ‘heavy’ industries (e.g. coal, steel, shipbuilding). This paper uses the tools of compositional data analysis to provide a much richer picture of non-employment, one which challenges the conventional wisdom about UK labour market performance as well as the dissenters’ view of the nature of the problem. It is shown that, associated with the striking ‘north/south’ divide in non-employment rates, there is a statistically significant relationship between the size of the non-employment rate and the composition of non-employment. Specifically, it is shown that the share of unemployment in non-employment is negatively correlated with the overall non-employment rate: in regions where the non-employment rate is high, the share of unemployment is relatively low. So the unemployment rate is not a very reliable indicator of regional disparities in labour market performance. Even more importantly from a policy viewpoint, a significant positive relationship is found between the size of the non-employment rate and the share of those not employed by reason of sickness or disability, and it seems (contrary to the dissenters) that this connection is just as strong for women as it is for men.
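As a self-contained illustration of the kind of relationship described (the share of unemployment within non-employment versus the overall non-employment rate), the Python sketch below computes a simple pairwise log-ratio of the non-employment composition and its correlation with the non-employment rate on synthetic regional data; it is not the paper's data or its exact compositional method.

    # Synthetic illustration: correlation between the overall non-employment
    # rate and the log-ratio share of unemployment within non-employment.
    import numpy as np

    rng = np.random.default_rng(4)
    n_regions = 12
    nonemp_rate = rng.uniform(0.15, 0.35, n_regions)    # non-employment rate by region

    # Composition of non-employment: unemployment vs sickness/disability vs other.
    unemp_share = np.clip(0.5 - 0.8 * nonemp_rate + rng.normal(0, 0.03, n_regions), 0.05, 0.9)
    sick_share  = np.clip(0.2 + 0.9 * nonemp_rate + rng.normal(0, 0.03, n_regions), 0.05, 0.9)

    logratio = np.log(unemp_share / sick_share)          # a simple pairwise log-ratio
    print("corr(non-employment rate, log(unemp/sick)):",
          np.corrcoef(nonemp_rate, logratio)[0, 1].round(2))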
Abstract:
This paper presents a new charging scheme for cost distribution along a point-to-multipoint connection when the destination nodes are responsible for the cost. The scheme focuses on QoS considerations, and a complete range of choices is presented. These choices go from a scheme that is safe for the network operator to one that is fair to the customer; the in-between cases are also covered. Specific and general problems, such as the effect of users disconnecting dynamically, are also discussed. The aim of the scheme is to encourage users to disperse the resource demand instead of having a large number of direct connections to the source of the data, which would result in higher-than-necessary bandwidth use from the source. Dispersing demand would benefit the overall performance of the network. The implementation must balance the need to offer a competitive service against the risk that the network operator does not recover the cost of that service. Throughout this paper, multicast charging is discussed without reference to any specific category of service. The proposed scheme is also evaluated against the criteria set proposed in the European ATM charging project CANCAN.
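The paper's actual charging formulas are not given in the abstract; purely as an illustration of distributing the cost of a point-to-multipoint tree among its receivers, the sketch below splits each link's cost equally among the downstream destinations that use it. The tree topology and link costs are invented.

    # Illustrative cost split on a multicast tree: each link's cost is shared
    # equally by the destinations downstream of it (topology/costs are invented).
    from collections import defaultdict

    # parent -> list of (child, link_cost); "src" is the data source.
    tree = {"src": [("r1", 4.0)],
            "r1":  [("d1", 2.0), ("r2", 3.0)],
            "r2":  [("d2", 1.0), ("d3", 1.0)]}
    destinations = {"d1", "d2", "d3"}

    def downstream_dests(node):
        """Destinations reachable through `node` (including node itself)."""
        found = {node} & destinations
        for child, _ in tree.get(node, []):
            found |= downstream_dests(child)
        return found

    charge = defaultdict(float)
    for parent, links in tree.items():
        for child, cost in links:
            users = downstream_dests(child)
            for d in users:
                charge[d] += cost / len(users)

    print(dict(charge))   # per-destination charges; they sum to the total tree cost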
Abstract:
Fault location has been studied in depth for transmission lines due to its importance in power systems. Nowadays the problem of fault location in distribution systems is receiving special attention, mainly because of power quality regulations. In this context, this paper presents application software developed in Matlab that automatically calculates the location of a fault in a distribution power system, starting from the voltages and currents measured at the line terminal and the model data of the distribution power system. The application is based on an N-ary tree structure, which is suitable for this application due to the highly branched and non-homogeneous nature of distribution systems, and has been developed for single-phase, two-phase, two-phase-to-ground, and three-phase faults. The implemented application is tested using fault data from a real electrical distribution power system.
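The abstract does not show the tree structure or the location algorithm; as a minimal Python illustration of representing a radial feeder as an N-ary tree and walking it to enumerate candidate fault sections, one might write something like the following. The node names, impedances and distance criterion are invented, not the paper's method.

    # Minimal N-ary tree for a radial feeder; the search simply lists the
    # sections whose cumulative impedance is close to an estimated value.
    class Section:
        def __init__(self, name, impedance, children=None):
            self.name = name
            self.impedance = impedance        # series impedance of this section (ohm)
            self.children = children or []    # N-ary: any number of downstream sections

    feeder = Section("S0", 0.2, [
        Section("S1", 0.5, [Section("S3", 0.4), Section("S4", 0.6)]),
        Section("S2", 0.3, [Section("S5", 0.7)]),
    ])

    def candidates(node, z_est, z_acc=0.0, tol=0.1):
        """Depth-first search for sections whose cumulative impedance matches z_est."""
        z_acc += node.impedance
        hits = [node.name] if abs(z_acc - z_est) <= tol else []
        for child in node.children:
            hits += candidates(child, z_est, z_acc, tol)
        return hits

    print(candidates(feeder, z_est=1.0))   # sections at ~1.0 ohm from the terminal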
Abstract:
The log-ratio methodology makes available powerful tools for analyzing compositional data. Nevertheless, the use of this methodology is only possible for those data sets without null values. Consequently, in those data sets where zeros are present, a previous treatment becomes necessary. Recent advances in the treatment of compositional zeros have centred especially on zeros of a structural nature and on rounded zeros. These tools do not contemplate the particular case of count compositional data sets with null values. In this work we deal with "count zeros" and we introduce a treatment based on a mixed Bayesian-multiplicative estimation. We use the Dirichlet probability distribution as a prior and we estimate the posterior probabilities. Then we apply a multiplicative modification to the non-zero values. We present a case study where this new methodology is applied.
Key words: count data, multiplicative replacement, composition, log-ratio analysis
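As a rough sketch of the Bayesian-multiplicative idea described above (a Dirichlet prior yields posterior-mean proportions for the zero counts, and the non-zero parts are then adjusted multiplicatively so the composition still sums to one), the Python code below uses a uniform Dirichlet prior. The prior, its strength and the count vector are illustrative choices, not the paper's case study.

    # Sketch of a Bayesian-multiplicative replacement for count zeros:
    # zeros get their Dirichlet posterior-mean proportion, and non-zero parts
    # are rescaled multiplicatively so the composition sums to 1.
    # The uniform prior and strength s are illustrative choices.
    import numpy as np

    def bm_replace(counts, s=0.5):
        counts = np.asarray(counts, dtype=float)
        D, n = counts.size, counts.sum()
        t = np.full(D, 1.0 / D)                   # uniform Dirichlet prior Dir(s*t)
        post = (counts + s * t) / (n + s)         # posterior-mean proportions
        zeros = counts == 0
        x = counts / n                            # observed proportions
        x[zeros] = post[zeros]                    # replace zero parts
        x[~zeros] *= (1 - post[zeros].sum())      # multiplicative adjustment of non-zeros
        return x

    print(bm_replace([12, 0, 5, 0, 33]))          # sums to 1, no zeros remain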