923 results for Data Systems


Relevance:

30.00%

Publisher:

Abstract:

Numerical weather prediction can be regarded as an initial value problem whereby the governing atmospheric equations are integrated forward from fully determined initial values of the meteorological parameters. However, in spite of considerable improvements in observing systems in recent years, the initial values are known only incompletely and inaccurately, and one of the major tasks of any forecasting centre is to determine the best possible initial state from the available observations.
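
To make the initial-value framing concrete, here is a minimal sketch (our illustration, with the Lorenz-63 system standing in for the governing atmospheric equations): the model is integrated forward from its initial values, and a tiny initial error grows into a large forecast error, which is why determining the best possible initial state matters.

```python
# Toy illustration of forecasting as an initial value problem: the
# Lorenz-63 system stands in for the governing atmospheric equations,
# integrated forward in time from a (possibly imperfect) initial state.
import numpy as np

def lorenz63(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = state
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def integrate(state, dt=0.01, steps=1000):
    # Simple fourth-order Runge-Kutta time stepping.
    for _ in range(steps):
        k1 = lorenz63(state)
        k2 = lorenz63(state + 0.5 * dt * k1)
        k3 = lorenz63(state + 0.5 * dt * k2)
        k4 = lorenz63(state + dt * k3)
        state = state + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return state

truth = integrate(np.array([1.0, 1.0, 1.0]))
# A tiny error in the initial values grows into a large forecast error.
perturbed = integrate(np.array([1.0, 1.0, 1.0]) + 1e-6)
print(np.abs(truth - perturbed))
```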

Relevance:

30.00%

Publisher:

Abstract:

The purpose of this lecture is to review recent developments in data analysis, initialization and data assimilation. The development of 3-dimensional multivariate schemes has been very timely because of their suitability for handling the many different types of observations available during FGGE. Great progress has been made in the initialization of global models with the aid of the non-linear normal mode technique. However, in spite of this progress, several fundamental problems remain unsatisfactorily solved. Of particular importance are the initialization of the divergent wind field in the Tropics and the search for proper ways to initialize weather systems driven by non-adiabatic processes. The unsatisfactory ways in which such processes are currently initialized lead to excessively long spin-up times.

Relevance:

30.00%

Publisher:

Abstract:

With the introduction of new observing systems based on asynoptic observations, the analysis problem has changed in character. In the near future we may expect that a considerable part of meteorological observations will be unevenly distributed in four dimensions, i.e. three dimensions in space and one in time. The term analysis, or objective analysis, in meteorology means the process of interpolating meteorological observations from unevenly distributed locations to a network of regularly spaced grid points. Necessitated by the requirement of numerical weather prediction models to solve the governing finite-difference equations on such a grid lattice, objective analysis is a three-dimensional (or, mostly, two-dimensional) interpolation technique. As a consequence of the structure of the conventional synoptic network, with separated data-sparse and data-dense areas, four-dimensional analysis has in fact been used intensively for many years. Weather services have thus based their analyses not only on synoptic data valid at the time of the analysis and on climatology, but also on the fields predicted from the previous observation hour and valid at the time of the analysis. The inclusion of the time dimension in objective analysis will be called four-dimensional data assimilation. From one point of view it seems possible to apply the conventional technique to the new data sources by simply reducing the time interval in the analysis-forecasting cycle. This could in fact be justified for the conventional observations as well: we have fairly good coverage of surface observations 8 times a day, and several upper-air stations make radiosonde and radiowind observations 4 times a day. If we use a 3-hour step in the analysis-forecasting cycle instead of the 12 hours most often applied, we may treat all observations as synoptic without any difficulty. No observation would then be more than 90 minutes off time, and even during strongly transient motion the observations would fall within a horizontal mesh of 500 km × 500 km.
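
As a worked illustration of the 3-hour-cycle argument (a sketch, not part of the original text), assigning each observation to its nearest 3-hourly analysis time bounds the time offset at 90 minutes:

```python
# Treating asynoptic observations as synoptic under a 3-hour
# analysis-forecasting cycle: each observation is assigned to the
# nearest 3-hourly analysis time, so none is more than 90 min off time.
CYCLE_MINUTES = 180

def nearest_analysis_time(obs_minutes_since_midnight):
    cycle = round(obs_minutes_since_midnight / CYCLE_MINUTES)
    return (cycle * CYCLE_MINUTES) % (24 * 60)

for obs in (95, 269, 271, 1430):  # observation times in minutes UTC
    t = nearest_analysis_time(obs)
    offset = min(abs(obs - t), 24 * 60 - abs(obs - t))  # wrap at midnight
    assert offset <= 90
    print(f"obs at {obs:4d} min -> analysis at {t:4d} min (offset {offset} min)")
```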

Relevance:

30.00%

Publisher:

Abstract:

The concepts of on-line transactional processing (OLTP) and on-line analytical processing (OLAP) are often confused with the technologies or models used to design transactional and analytics-based information systems. This has in some ways contributed to the existence of gaps between the semantics of information captured during transactional processing and the information stored for analytical use. In this paper, we propose the use of a unified semantics design model as a solution to help bridge the semantic gaps between data captured by OLTP systems and the information provided by OLAP systems. The central focus of this design approach is on enabling business intelligence using not just data, but data with context.
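
One hypothetical reading of "data with context" (the names and structure below are our illustrative assumptions, not the paper's design model) is to let the transactional write path and the analytical read path share a single semantic definition of each business entity:

```python
# Hypothetical sketch: one shared definition of a sale serves both the
# OLTP write path and the OLAP read path, so the semantics of its fields
# cannot drift between the two systems. All names are illustrative.
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class Sale:
    # Transactional facts captured at write time...
    sale_id: str
    amount: float
    occurred_at: datetime
    # ...together with the business context that analytics needs.
    customer_segment: str
    sales_channel: str
    currency: str

def to_fact_row(sale: Sale) -> dict:
    # The OLAP fact row is derived from the same definition as the
    # transaction, rather than re-modelled independently.
    return {
        "sale_id": sale.sale_id,
        "amount": sale.amount,
        "date_key": sale.occurred_at.strftime("%Y%m%d"),
        "segment": sale.customer_segment,
        "channel": sale.sales_channel,
        "currency": sale.currency,
    }
```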

Relevance:

30.00%

Publisher:

Abstract:

Global communication requirements and load imbalance of some parallel data mining algorithms are the major obstacles to exploiting the computational power of large-scale systems. This work investigates how non-uniform data distributions can be exploited to remove the global communication requirement and to reduce the communication cost in parallel data mining algorithms and, in particular, in the k-means algorithm for cluster analysis. In the straightforward parallel formulation of the k-means algorithm, data and computation loads are uniformly distributed over the processing nodes. This approach has excellent load balancing characteristics that may suggest it could scale up to large and extreme-scale parallel computing systems. However, at each iteration step the algorithm requires a global reduction operation, which hinders the scalability of the approach. This work studies a different parallel formulation of the algorithm in which the requirement of global communication is removed, while maintaining the same deterministic nature as the centralised algorithm. The proposed approach exploits a non-uniform data distribution which can either be found in real-world distributed applications or be induced by means of multi-dimensional binary search trees. The approach can also be extended to accommodate an approximation error, which allows a further reduction of the communication costs. The effectiveness of the exact and approximate methods has been tested in a parallel computing system with 64 processors and in simulations with 1024 processing elements.
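
For reference, here is a sketch of the straightforward formulation whose global reduction the text identifies as the bottleneck, assuming an MPI setting via mpi4py (the function names are ours):

```python
# Straightforward parallel k-means: data are spread uniformly over
# processes, and every iteration ends with a global reduction of the
# per-cluster sums and counts -- the step the non-uniform formulation
# is designed to remove.
import numpy as np
from mpi4py import MPI

def parallel_kmeans(local_data, k, iters=20):
    comm = MPI.COMM_WORLD
    dim = local_data.shape[1]
    centroids = np.empty((k, dim))
    if comm.rank == 0:
        centroids[:] = local_data[:k]  # naive init; assumes rank 0 holds >= k points
    comm.Bcast(centroids, root=0)
    for _ in range(iters):
        # Local assignment step: nearest centroid for each local point.
        d = np.linalg.norm(local_data[:, None, :] - centroids[None], axis=2)
        labels = d.argmin(axis=1)
        sums = np.zeros((k, dim))
        counts = np.zeros(k)
        for j in range(k):
            sums[j] = local_data[labels == j].sum(axis=0)
            counts[j] = (labels == j).sum()
        # Global reduction: every process synchronises with every other.
        comm.Allreduce(MPI.IN_PLACE, sums, op=MPI.SUM)
        comm.Allreduce(MPI.IN_PLACE, counts, op=MPI.SUM)
        nonzero = counts > 0
        centroids[nonzero] = sums[nonzero] / counts[nonzero][:, None]
    return centroids
```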

Relevance:

30.00%

Publisher:

Abstract:

In the past decade, the analysis of data has faced the challenge of dealing with very large and complex datasets and with the real-time generation of data. Technologies to store and access these complex and large datasets are in place. However, robust and scalable analysis technologies are needed to extract meaningful information from these datasets. The research field of Information Visualization and Visual Data Analytics addresses this need. Information visualization and data mining are often used in a complementary fashion. Their common goal is the extraction of meaningful information from complex and possibly large data. However, while data mining focuses on the use of silicon hardware, visualization techniques also aim to exploit the powerful image-processing capabilities of the human brain. This article surveys research on data visualization and visual analytics techniques. Furthermore, we highlight existing visual analytics techniques, systems, and applications, including a perspective on the field from the chemical process industry.

Relevance:

30.00%

Publisher:

Abstract:

Background: Expression microarrays are increasingly used to obtain large-scale transcriptomic information on a wide range of biological samples. Nevertheless, there is still much debate on the best ways to process data, to design experiments and to analyse the output. Furthermore, many of the more sophisticated mathematical approaches to data analysis in the literature remain inaccessible to much of the biological research community. In this study we examine ways of extracting and analysing a large data set obtained using the Agilent long oligonucleotide transcriptomics platform, applied to a set of human macrophage and dendritic cell samples. Results: We describe and validate a series of data extraction, transformation and normalisation steps which are implemented via a new R function. Analysis of replicate normalised reference data demonstrates that intra-array variability is small (only around 2% of the mean log signal), while inter-array variability from replicate array measurements has a standard deviation (SD) of around 0.5 log(2) units (6% of mean). The common practice of working with ratios of Cy5/Cy3 signal offers little further improvement in terms of reducing error. Comparison to expression data obtained using Arabidopsis samples demonstrates that the large number of genes in each sample showing a low level of transcription reflects the real complexity of the cellular transcriptome. Multidimensional scaling is used to show that the processed data identify an underlying structure which reflects some of the key biological variables defining the data set. This structure is robust, allowing reliable comparison of samples collected over a number of years and by a variety of operators. Conclusions: This study outlines a robust and easily implemented pipeline for extracting, transforming, normalising and visualising transcriptomic array data from the Agilent expression platform. The analysis is used to obtain quantitative estimates of the SD arising from experimental (non-biological) intra- and inter-array variability, and a lower threshold for determining whether an individual gene is expressed. The study provides a reliable basis for further, more extensive studies of the systems biology of eukaryotic cells.
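
The pipeline itself is an R function not reproduced here; the following Python sketch (our own, illustrative) only mimics the general shape of a log-transformation and normalisation step and of estimating inter-array variability from replicates:

```python
# Illustrative stand-in for a transformation/normalisation step:
# log2-transform each array and centre it on its median signal so that
# replicate arrays become comparable, then estimate inter-array SD.
import numpy as np

def normalise(arrays):
    """arrays: (n_arrays, n_genes) matrix of raw intensities."""
    logged = np.log2(arrays)
    # Median-centre each array to remove array-wide intensity shifts.
    return logged - np.median(logged, axis=1, keepdims=True)

rng = np.random.default_rng(0)
raw = rng.lognormal(mean=8, sigma=1, size=(4, 1000))  # synthetic replicates
norm = normalise(raw)
# Inter-array SD of replicate measurements, in log2 units
# (cf. the ~0.5 log2 units reported in the text).
print(norm.std(axis=0).mean())
```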

Relevance:

30.00%

Publisher:

Abstract:

Communication signal processing applications often involve complex-valued (CV) functional representations for signals and systems. CV artificial neural networks have been studied theoretically and applied widely in nonlinear signal and data processing [1–11]. Note that most artificial neural networks cannot be automatically extended from the real-valued (RV) domain to the CV domain, because the resulting model would in general violate the Cauchy-Riemann conditions, which renders the training algorithms unusable. A number of analytic functions have been introduced for fully CV multilayer perceptrons (MLPs) [4]. A fully CV radial basis function (RBF) network was introduced in [8] for regression and classification applications. Alternatively, the problem can be avoided by using two RV artificial neural networks, one processing the real part and the other processing the imaginary part of the CV signal/system. An even more challenging problem is the inverse of a CV
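
A minimal sketch of the two-RV-network workaround mentioned above (our illustration; scikit-learn's MLPRegressor merely stands in for an RV neural network): the complex signal is split into real and imaginary channels, each fitted by an ordinary real-valued model, so the Cauchy-Riemann issue never arises.

```python
# Split-complex approach: fit two real-valued networks, one for the real
# part and one for the imaginary part of the complex-valued target.
import numpy as np
from sklearn.neural_network import MLPRegressor

def fit_complex(x, y):
    """x: (n, d) real features; y: (n,) complex targets."""
    re = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000).fit(x, y.real)
    im = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000).fit(x, y.imag)
    return lambda x_new: re.predict(x_new) + 1j * im.predict(x_new)

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=(500, 2))
y = (x[:, 0] + 1j * x[:, 1]) ** 2  # toy complex-valued mapping
model = fit_complex(x, y)
print(model(x[:5]))
```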

Relevance:

30.00%

Publisher:

Abstract:

The optimal utilisation of hyper-spectral satellite observations in numerical weather prediction is often inhibited by incorrectly assuming independent interchannel observation errors. However, in order to represent these observation-error covariance structures, an accurate knowledge of the true variances and correlations is needed. This structure is likely to vary with observation type and assimilation system. The work in this article presents the initial results for the estimation of IASI interchannel observation-error correlations when the data are processed in the Met Office one-dimensional (1D-Var) and four-dimensional (4D-Var) variational assimilation systems. The method used to calculate the observation errors is a post-analysis diagnostic which utilises the background and analysis departures from the two systems. The results show significant differences in the source and structure of the observation errors when processed in the two different assimilation systems, but also highlight some common features. When the observations are processed in 1D-Var, the diagnosed error variances are approximately half the size of the error variances used in the current operational system and are very close in size to the instrument noise, suggesting that this is the main source of error. The errors contain no consistent correlations, with the exception of a handful of spectrally close channels. When the observations are processed in 4D-Var, we again find that the observation errors are being overestimated operationally, but the overestimation is significantly larger for many channels. In contrast to 1D-Var, the diagnosed error variances are often larger than the instrument noise in 4D-Var. It is postulated that horizontal errors of representation, not seen in 1D-Var, are a significant contributor to the overall error here. Finally, observation errors diagnosed from 4D-Var are found to contain strong, consistent correlation structures for channels sensitive to water vapour and surface properties.
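
The article itself only says the method is a post-analysis diagnostic using background and analysis departures; departure-based estimates of this kind are commonly computed in the spirit of the Desroziers et al. (2005) diagnostic, which the numpy sketch below (our assumption about the method class) illustrates, with background departures d_b = y - H(x_b) and analysis departures d_a = y - H(x_a):

```python
# Departure-based estimate of the observation-error covariance R:
# R_hat ~ E[d_a d_b^T], computed from samples of background and analysis
# departures, then symmetrised against sampling noise.
import numpy as np

def diagnosed_obs_error_covariance(d_background, d_analysis):
    """Both inputs: (n_cases, n_channels) departure samples."""
    db = d_background - d_background.mean(axis=0)
    da = d_analysis - d_analysis.mean(axis=0)
    r = da.T @ db / db.shape[0]
    return 0.5 * (r + r.T)

def correlation(cov):
    # Interchannel correlation structure implied by the covariance.
    sd = np.sqrt(np.diag(cov))
    return cov / np.outer(sd, sd)

# Synthetic usage example with correlated departures across 3 channels.
rng = np.random.default_rng(4)
db = rng.normal(size=(5000, 3)) + rng.normal(size=(5000, 1))
da = 0.5 * db + 0.1 * rng.normal(size=(5000, 3))
print(correlation(diagnosed_obs_error_covariance(db, da)))
```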

Relevance:

30.00%

Publisher:

Abstract:

BACKGROUND: Reduction of vegetation height is recommended as a management strategy for controlling rodent pests of rice in South-east Asia, but there are limited field data to assess its effectiveness. The breeding biology of the main pest species of rodent in the Philippines, Rattus tanezumi, suggests that habitat manipulation in irrigated rice–coconut cropping systems may be an effective strategy to limit the quality and availability of their nesting habitat. The authors imposed a replicated manipulation of vegetation cover in adjacent coconut groves during a single rice-cropping season, and added artificial nest sites to facilitate capture and culling of young. RESULTS: Three trapping sessions in four rice fields (two treatments, two controls) adjacent to coconut groves led to the capture of 176 R. tanezumi, 12 Rattus exulans and seven Chrotomys mindorensis individuals. There was no significant difference in overall abundance between crop stages or between treatments, and there was no treatment effect on damage to tillers or rice yield. Only two R. tanezumi were caught at the artificial nest sites. CONCLUSION: Habitat manipulation to reduce the quality of R. tanezumi nesting habitat adjacent to rice fields is not effective as a lone rodent management tool in rice–coconut cropping systems.

Relevance:

30.00%

Publisher:

Abstract:

This thesis is concerned with the development of improved management practices in indigenous chicken production systems, through a research process that includes participatory approaches with smallholder farmers and other stakeholders in Kenya. The research process involved a wide range of activities, including on-station experiments, field surveys, stakeholder consultations in workshops, seminars and visits, and on-farm farmer participatory research to evaluate the effect of some improved management interventions on the production performance of indigenous chickens. The participatory research was greatly informed by the collective experiences and lessons of the previous activities. The on-station studies focused on hatching, growth and nutritional characteristics of the indigenous chickens. Four research publications from these studies are included in this thesis. Quantitative statistical analyses were applied: growth models estimated with non-linear regressions for the growth characteristics, chi-square determinations to investigate differences among different reciprocal crosses of indigenous chickens, and general linear models and covariance determination for the nutrition study. The on-station studies brought greater understanding of the performance and production characteristics of indigenous chickens and the influence of management practices on these characteristics. The field surveys and stakeholder consultations helped in understanding the overarching issues affecting the productivity of indigenous chicken systems and their place in the livelihoods of smallholder farmers. These activities created strong networking opportunities with stakeholders from a wide spectrum. The on-farm farmer participatory research involved the selection of 200 farmers in five regions, followed by training and the introduction of interventions on improved management practices, which included housing, vaccination, deworming and feed supplementation. Implementation and monitoring were mainly done by individual farmers continuously for close to one and a half years. Six quarterly visits to the farms were made by the research team to monitor and provide support for ongoing project activities. The data collected were analysed for 5 consecutive 3-monthly periods. Descriptive and inferential statistics were applied to analyse the data collected on treatment applications, production characteristics and flock demography. Of the 200 farmers initially selected, 173 had records on treatment applications and flock demography while 127 had records on production characteristics. The demographic analysis, using a dissimilarity index of flock size, produced 7 distinct farm groups among the 173 farms. Two of these farm groups were represented in similar numbers in each of the five regions. The research process also involved a number of dissemination and communication strategies that have brought the process and project outcomes into the domain of accessibility by a wider readership locally and globally. These include workshops, seminars, field visits and consultations, local and international conferences, electronic conferencing, publications and personal communication via email and conventional post. A number of research and development proposals were also developed based on the knowledge and experiences gained from the research process.
The thesis captures the research process activities and outcomes in 8 chapters, which comprise, in ascending order: introduction, theoretical concepts underpinning FPR, research methodology and process, on-station research output, FPR descriptive statistical analysis, FPR inferential statistical analysis on production characteristics, FPR demographic analysis, and conclusions. Various research approaches, both quantitative and qualitative, have been applied in the research process, indicating the possibilities and importance of combining both for a greater understanding of the issues being studied. In our case, participatory studies of the improved management of indigenous chickens indicate their potential importance as livelihood assets for poor people.

Relevance:

30.00%

Publisher:

Abstract:

Exascale systems are the next frontier in high-performance computing and are expected to deliver a performance of the order of 10^18 operations per second using massive multicore processors. Very large- and extreme-scale parallel systems pose critical algorithmic challenges, especially related to concurrency, locality and the need to avoid global communication patterns. This work investigates a novel protocol for dynamic group communication that can be used to remove the global communication requirement and to reduce the communication cost in parallel formulations of iterative data mining algorithms. The protocol is used to provide a communication-efficient parallel formulation of the k-means algorithm for cluster analysis. The approach is based on a collective communication operation for dynamic groups of processes and exploits non-uniform data distributions. Non-uniform data distributions can be either found in real-world distributed applications or induced by means of multidimensional binary search trees. The analysis of the proposed dynamic group communication protocol has shown that it does not introduce significant communication overhead. The parallel clustering algorithm has also been extended to accommodate an approximation error, which allows a further reduction of the communication costs. The effectiveness of the exact and approximate methods has been tested in a parallel computing system with 64 processors and in simulations with 1024 processing elements.
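
As an illustrative aside (implementation details assumed, not taken from the paper), a multidimensional binary search tree can induce the kind of non-uniform distribution referred to above through recursive median splits, giving each process a compact, spatially disjoint block of the data:

```python
# Inducing a spatially localised partition with a multidimensional
# binary search tree: recursive median cuts along cycling dimensions.
import numpy as np

def kd_partition(data, n_parts, depth=0):
    """Split data into n_parts blocks by recursive median cuts."""
    if n_parts == 1:
        return [data]
    axis = depth % data.shape[1]          # cycle through dimensions
    order = np.argsort(data[:, axis])
    half = len(data) // 2
    left, right = data[order[:half]], data[order[half:]]
    return (kd_partition(left, n_parts // 2, depth + 1)
            + kd_partition(right, n_parts - n_parts // 2, depth + 1))

rng = np.random.default_rng(2)
blocks = kd_partition(rng.normal(size=(10_000, 3)), n_parts=8)
print([len(b) for b in blocks])  # balanced sizes, disjoint regions
```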

Relevance:

30.00%

Publisher:

Abstract:

We propose a new class of neurofuzzy construction algorithms with the aim of maximizing generalization capability, specifically for imbalanced data classification problems, based on leave-one-out (LOO) cross-validation. The algorithms operate in two stages: first, an initial rule base is constructed by estimating a Gaussian mixture model with analysis of variance decomposition from the input data; the second stage carries out joint weighted least squares parameter estimation and rule selection using an orthogonal forward subspace selection (OFSS) procedure. We show how different LOO-based rule selection criteria can be incorporated with OFSS, and advocate either maximizing the leave-one-out area under the curve of the receiver operating characteristic, or maximizing the leave-one-out F-measure if the data sets exhibit an imbalanced class distribution. Extensive comparative simulations illustrate the effectiveness of the proposed algorithms.
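
A schematic illustration (not the paper's OFSS procedure) of scoring a candidate model by leave-one-out AUC: held-out decision scores are pooled over all LOO folds, and a single ROC curve is computed over them, which is the kind of criterion the text advocates for imbalanced data.

```python
# Leave-one-out AUC as a model selection criterion: pool the held-out
# decision scores across all LOO folds, then compute one ROC AUC.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import LeaveOneOut

def loo_auc(model, x, y):
    scores = np.empty(len(y))
    for train, test in LeaveOneOut().split(x):
        scores[test] = model.fit(x[train], y[train]).decision_function(x[test])
    return roc_auc_score(y, scores)

rng = np.random.default_rng(3)
x = rng.normal(size=(80, 4))
y = (x[:, 0] + 0.5 * rng.normal(size=80) > 1.2).astype(int)  # imbalanced labels
print(loo_auc(LogisticRegression(), x, y))
```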

Relevance:

30.00%

Publisher:

Abstract:

We contrast attempts to introduce what were seen as sophisticated Western-style human resource management (HRM) systems into two Russian oil companies – a joint venture with a Western multinational corporation (TNK-BP) and a wholly Russian-owned company (Yukos). The drivers for Western hegemony within the joint venture, heavily influenced by expatriates and the established HRM processes introduced by the Western parent, were counteracted to a significant degree by the Russian spetsifika – the peculiarly Russian way of thinking and doing things. In contrast, developments were absorbed faster in the more authoritarian Russian-owned company. The research adds to the theoretical debate about international knowledge transfer and provides detailed empirical data to support our understanding of the effect of both organizational and cultural context on the knowledge-transfer mechanisms of local and multinational companies. As the analysis is based on the perspective of senior local nationals, we also address a relatively under-researched area in the international HRM literature which mostly relies on empirical data collected from expatriates and those based solely in multinational headquarters.