946 results for Legacy datasets
Abstract:
Summarizing topological relations is fundamental to many spatial applications, including spatial query optimization. In this paper, we present several novel techniques to effectively construct cell-density-based spatial histograms for range (window) summarizations restricted to the four most important topological relations: contains, contained, overlap, and disjoint. We first present a novel framework to construct a multiscale histogram composed of multiple Euler histograms that guarantees exact summarization results for aligned windows in constant time. We then present an approximate algorithm, with an approximation ratio of 19/12, to minimize the storage space of such multiscale Euler histograms, although the problem is generally NP-hard. To conform to a limited storage space where only k Euler histograms are allowed, an effective algorithm is presented to construct multiscale histograms that achieve high accuracy. Finally, we present a new constant-time approximate algorithm for querying an Euler histogram when exact answers cannot be guaranteed. Our extensive experiments against both synthetic and real-world datasets demonstrate that the approximate multiscale histogram techniques may improve the accuracy of existing techniques by several orders of magnitude while retaining cost efficiency, and that the exact multiscale histogram technique requires only a storage space linearly proportional to the number of cells for the real datasets.
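The four relations being summarized can be made concrete with a minimal, hedged sketch: a brute-force classifier of axis-aligned data rectangles against an aligned query window. This is only the ground truth that an Euler-histogram summary is designed to approximate in constant time, not the paper's histogram construction; all names are illustrative.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle: (x1, y1) lower-left, (x2, y2) upper-right."""
    x1: float
    y1: float
    x2: float
    y2: float

def relation(obj: Rect, window: Rect) -> str:
    """Classify the topological relation of `obj` with respect to `window`."""
    # No shared extent at all -> disjoint.
    if obj.x2 < window.x1 or obj.x1 > window.x2 or obj.y2 < window.y1 or obj.y1 > window.y2:
        return "disjoint"
    # Window lies entirely inside the object -> the object contains the window.
    if obj.x1 <= window.x1 and obj.y1 <= window.y1 and obj.x2 >= window.x2 and obj.y2 >= window.y2:
        return "contains"
    # Object lies entirely inside the window -> the object is contained.
    if window.x1 <= obj.x1 and window.y1 <= obj.y1 and window.x2 >= obj.x2 and window.y2 >= obj.y2:
        return "contained"
    # Otherwise they partially overlap.
    return "overlap"

def summarize(objects: list[Rect], window: Rect) -> dict[str, int]:
    """Ground-truth counts that a histogram-based summary would approximate."""
    counts = {"contains": 0, "contained": 0, "overlap": 0, "disjoint": 0}
    for obj in objects:
        counts[relation(obj, window)] += 1
    return counts

if __name__ == "__main__":
    data = [Rect(0, 0, 10, 10), Rect(2, 2, 3, 3), Rect(20, 20, 25, 25), Rect(4, 4, 12, 6)]
    print(summarize(data, Rect(1, 1, 8, 8)))
    # {'contains': 1, 'contained': 1, 'overlap': 1, 'disjoint': 1}
```

The point of the paper's histograms is to return such counts without scanning the objects, which the brute-force loop above cannot avoid.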
Abstract:
Rapid economic development has occurred during the past few decades in China, with the Yangtze River Delta (YRD) area as one of the most progressive areas. Urbanization, industrialization, agricultural and aquaculture activities result in extensive production and application of chemicals. Organohalogen contaminants (OHCs) have been widely used as, for example, pesticides, flame retardants and plasticizers. They are persistent, bioaccumulative and pose a potential threat to ecosystem and human health. However, limited research has been conducted in the YRD with respect to environmental exposure to chemicals. The main objective of this thesis is to investigate the contamination level, distribution pattern and sources of OHCs in the YRD. Wildlife from different habitats is used to indicate the environmental pollution situation and to evaluate selected matrices for use in long-term biomonitoring to determine the environmental stress the contamination may cause. In addition, a method is developed for dicofol analysis. Moreover, a specific effort is made to introduce statistical power analysis to assist in optimal sampling design. The thesis results show extensive contamination of OHCs in wildlife in the YRD. High concentrations of chlorinated paraffins (CPs) are reported in wildlife, in particular in terrestrial species (i.e. short-tailed mamushi snake and peregrine falcon). Impurities and byproducts of pentachlorophenol products, i.e. polychlorinated diphenyl ethers (PCDEs) and hydroxylated polychlorinated diphenyl ethers (OH-PCDEs), are identified and reported for the first time in eggs from black-crowned night heron and whiskered tern. High concentrations of octachlorodibenzo-p-dioxin (OCDD) are determined in these samples. The toxic equivalents (TEQs) of polychlorinated dibenzo-p-dioxins (PCDDs) and polychlorinated dibenzofurans (PCDFs) are at mean levels of 300 and 520 pg TEQ g-1 lw (WHO2005 TEQ) in eggs from the two bird species, respectively. This is two orders of magnitude higher than the European Union (EU) regulatory limit for chicken eggs. Also, a novel pattern of polychlorinated biphenyls (PCBs), with octa- to deca-CBs contributing as much as 20% of total PCBs, is reported in birds. The legacy POPs show a common characteristic, with relatively high levels of organochlorine pesticides (i.e. DDT, hexachlorocyclohexanes (HCHs) and Mirex), indicating historic applications. In contrast, rather low concentrations of industrial chemicals such as PCBs and polybrominated diphenyl ethers (PBDEs) are found. A refined and improved analytical method is developed to separate dicofol from its major decomposition product, 4,4'-dichlorobenzophenone, so that dicofol can be assessed as such. Statistical power analysis demonstrates that sampling of sedentary species should be consistently spread over a larger area to monitor temporal trends of contaminants in a robust manner. The results presented in this thesis show high CP and OCDD concentrations in wildlife. The levels and patterns of OHCs in the YRD differ from other well-studied areas of the world, likely due to the extensive production and use of chemicals in the YRD. The results strongly signal the need for biomonitoring research programs that meet the current situation of the YRD. Such programs will contribute to the management of chemicals and the environment in the YRD, with the potential to grow into the human health sector and to expand to China as a whole.
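The role of statistical power analysis in sampling design can be illustrated with a small simulation-based sketch: estimating the power to detect a log-linear temporal trend in contaminant concentrations for different sampling intensities. This is only a generic illustration under assumed parameter values (5% annual change, lognormal noise), not the thesis's specific analysis.

```python
import numpy as np
from scipy import stats

def trend_power(annual_change=0.05, cv=0.5, years=10, n_per_year=10,
                alpha=0.05, n_sim=2000, seed=0):
    """Estimate, by simulation, the power to detect a log-linear temporal
    trend in contaminant concentrations.

    annual_change : assumed proportional change per year (illustrative)
    cv            : between-sample coefficient of variation (lognormal noise)
    """
    rng = np.random.default_rng(seed)
    sigma = np.sqrt(np.log(1.0 + cv ** 2))       # lognormal sd on the log scale
    slope = np.log(1.0 + annual_change)          # true slope on the log scale
    t = np.repeat(np.arange(years), n_per_year)  # sampling years
    detected = 0
    for _ in range(n_sim):
        y = slope * t + rng.normal(0.0, sigma, t.size)  # simulated log concentrations
        if stats.linregress(t, y).pvalue < alpha:
            detected += 1
    return detected / n_sim

if __name__ == "__main__":
    for n in (5, 10, 20):
        print(n, "samples/year -> power", round(trend_power(n_per_year=n), 2))
```

Comparing power across candidate designs in this way is what lets a monitoring programme choose a sampling effort (and spatial spread) that detects trends robustly.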
Abstract:
Data Envelopment Analysis (DEA) is one of the most widely used methods for measuring the efficiency and productivity of Decision Making Units (DMUs). DEA for a large dataset with many inputs/outputs would require huge computer resources in terms of memory and CPU time. This paper proposes a neural network back-propagation Data Envelopment Analysis to address this problem for the very large scale datasets now emerging in practice. The neural network's requirements for computer memory and CPU time are far lower than those of conventional DEA methods, so it can be a useful tool for measuring the efficiency of large datasets. Finally, the back-propagation DEA algorithm is applied to five large datasets and the results are compared with those obtained by conventional DEA.
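A hedged sketch of the general idea (not the paper's algorithm): solve the input-oriented CCR linear program exactly for a manageable subsample of DMUs, then train a small back-propagation network on those scores so that efficiencies for the remaining DMUs can be approximated without solving one LP per DMU. All data and parameter choices below are illustrative.

```python
import numpy as np
from scipy.optimize import linprog
from sklearn.neural_network import MLPRegressor

def ccr_efficiency(X, Y, o):
    """Input-oriented CCR efficiency of DMU `o` (envelopment form).

    X: (n_dmus, n_inputs), Y: (n_dmus, n_outputs).
    Decision variables are [theta, lambda_1, ..., lambda_n]; minimise theta.
    """
    n, m = X.shape
    s = Y.shape[1]
    c = np.zeros(n + 1)
    c[0] = 1.0                                       # objective: min theta
    # Inputs:  sum_j lambda_j * x_ij - theta * x_io <= 0
    A_in = np.hstack([-X[o].reshape(m, 1), X.T])
    b_in = np.zeros(m)
    # Outputs: -sum_j lambda_j * y_rj <= -y_ro
    A_out = np.hstack([np.zeros((s, 1)), -Y.T])
    b_out = -Y[o]
    res = linprog(c, A_ub=np.vstack([A_in, A_out]),
                  b_ub=np.concatenate([b_in, b_out]),
                  bounds=[(None, None)] + [(0, None)] * n, method="highs")
    return res.x[0]

rng = np.random.default_rng(1)
X = rng.uniform(1, 10, size=(200, 2))                # two inputs (synthetic)
Y = rng.uniform(1, 10, size=(200, 1))                # one output (synthetic)

# Exact DEA on a subsample, then a back-propagation surrogate for all DMUs.
train = rng.choice(200, size=60, replace=False)
scores = np.array([ccr_efficiency(X[train], Y[train], i) for i in range(len(train))])
net = MLPRegressor(hidden_layer_sizes=(16, 16), max_iter=5000, random_state=0)
net.fit(np.hstack([X[train], Y[train]]), scores)
approx = net.predict(np.hstack([X, Y]))              # fast approximate efficiencies
print(approx[:5].round(3))
```

The trade-off is the usual one: the surrogate removes the per-DMU LP cost but its accuracy depends on how well the training subsample covers the efficient frontier.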
Abstract:
Guest editorial
Abstract:
This paper revisits the issue of intra-industry foreign direct investment (FDI). This issue was considered in Stephen Hymer's early work, but was not subsequently developed, and was largely ignored in the literature for some time. Using the example of the UK, this paper traces the patterns of intra-industry FDI, both across countries and industries, for both the manufacturing and service sectors. Despite the undoubted increase in the integration of goods and factor markets since the time of Hymer's writing, the analysis presented here shows that the pattern has changed little in the last 40 years. The paper then goes on to discuss the motives for intra-industry FDI, relating it to technology flows and factor cost differentials. Finally, we present some analysis relating intra-industry FDI to uneven development, both between developed and developing countries, and between regions of a developed country. It is clear that intra-industry FDI is still very much a developed country phenomenon, as Hymer suggested, with both developing countries and poorer regions of developed countries unlikely to reap any of the benefits. In this context, one-way and two-way FDI must be seen as different phenomena within the debate on globalisation. © The Author 2005. Published by Oxford University Press on behalf of the Cambridge Political Economy Society. All rights reserved.
Abstract:
Very large spatially referenced datasets, for example those derived from satellite-based sensors that sample across the globe or from large monitoring networks of individual sensors, are becoming increasingly common and more widely available for use in environmental decision making. In large or dense sensor networks, huge quantities of data can be collected over small time periods. In many applications the generation of maps, or predictions at specific locations, from the data in (near) real time is crucial. Geostatistical operations such as interpolation are vital in this map-generation process and, in emergency situations, the resulting predictions need to be available almost instantly, so that decision makers can make informed decisions and define risk and evacuation zones. It is also helpful when analysing data in less time-critical applications, for example when interacting directly with the data for exploratory analysis, that the algorithms are responsive within a reasonable time frame. Performing geostatistical analysis on such large spatial datasets can present a number of problems, particularly in the case where maximum likelihood estimation is used. Although the storage requirements only scale linearly with the number of observations in the dataset, the computational complexity in terms of memory and speed scales quadratically and cubically, respectively. Most modern commodity hardware has at least two processor cores, if not more, and other mechanisms for allowing parallel computation, such as Grid-based systems, are also becoming increasingly available. However, currently there seems to be little interest in exploiting this extra processing power within the context of geostatistics. In this paper we review the existing parallel approaches for geostatistics. By recognising that different natural parallelisms exist and can be exploited depending on whether the dataset is sparsely or densely sampled with respect to the range of variation, we introduce two contrasting novel implementations of parallel algorithms based on approximating the data likelihood, extending the methods of Vecchia [1988] and Tresp [2000]. Using parallel maximum likelihood variogram estimation and parallel prediction algorithms we show that computational time can be significantly reduced. We demonstrate this with both sparsely sampled and densely sampled data on a variety of architectures, ranging from the common dual-core processor found in many modern desktop computers to large multi-node supercomputers. To highlight the strengths and weaknesses of the different methods we employ synthetic datasets and go on to show how the methods allow maximum likelihood based inference on the exhaustive Walker Lake dataset.
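One family of likelihood approximations in this spirit treats spatial blocks as independent and sums their Gaussian log-likelihoods, which makes the evaluation embarrassingly parallel. The sketch below is a minimal, hedged illustration of that idea (not the paper's exact Vecchia- or Tresp-based algorithms); the blocking scheme, covariance model and data are illustrative.

```python
import numpy as np
from multiprocessing import Pool

def exp_cov(coords, sill, range_par, nugget):
    """Exponential covariance matrix for a set of 2-D coordinates."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return sill * np.exp(-d / range_par) + nugget * np.eye(len(coords))

def block_loglik(args):
    """Gaussian log-likelihood of one spatial block."""
    coords, y, sill, range_par, nugget = args
    C = exp_cov(coords, sill, range_par, nugget)
    _, logdet = np.linalg.slogdet(C)
    alpha = np.linalg.solve(C, y)
    return -0.5 * (len(y) * np.log(2 * np.pi) + logdet + y @ alpha)

def approx_loglik(coords, y, params, n_blocks=8, n_workers=4):
    """Approximate the full likelihood by summing independent blocks in parallel."""
    order = np.argsort(coords[:, 0])             # crude spatial blocking by x-coordinate
    blocks = np.array_split(order, n_blocks)
    tasks = [(coords[b], y[b], *params) for b in blocks]
    with Pool(n_workers) as pool:
        return sum(pool.map(block_loglik, tasks))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    coords = rng.uniform(0, 100, size=(2000, 2))
    y = rng.normal(size=2000)                    # placeholder observations
    print(approx_loglik(coords, y, params=(1.0, 10.0, 0.1)))
```

Wrapping this evaluation in a numerical optimiser over the covariance parameters gives an approximate maximum likelihood fit whose wall-clock time drops roughly with the number of workers.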
Abstract:
Half a decade has passed since the objectives and benefits of autonomic computing were stated, yet even the latest system designs and deployments exhibit only limited and isolated elements of autonomic functionality. From an autonomic computing standpoint, all computing systems – old, new or under development – are legacy systems, and will continue to be so for some time to come. In this paper, we propose a generic architecture for developing fully-fledged autonomic systems out of legacy, non-autonomic components, and we investigate how existing technologies can be used to implement this architecture.
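The generic pattern of adding autonomic behaviour around a legacy, non-autonomic component can be sketched as a monitor-analyse-plan-execute loop. The code below follows the common MAPE-style structure from the autonomic computing literature rather than the specific architecture proposed in the paper; all class and metric names are illustrative.

```python
import random
import time

class LegacyService:
    """Stand-in for an existing, non-autonomic component."""
    def __init__(self):
        self.worker_threads = 2

    def queue_length(self):
        return random.randint(0, 100)            # placeholder load metric

    def set_worker_threads(self, n):
        self.worker_threads = n

class AutonomicManager:
    """A minimal monitor-analyse-plan-execute loop around the legacy component."""
    def __init__(self, service, high=80, low=20):
        self.service, self.high, self.low = service, high, low

    def step(self):
        load = self.service.queue_length()        # Monitor
        if load > self.high:                      # Analyse
            plan = self.service.worker_threads + 1    # Plan: scale up
        elif load < self.low and self.service.worker_threads > 1:
            plan = self.service.worker_threads - 1    # Plan: scale down
        else:
            return                                # nothing to do
        self.service.set_worker_threads(plan)     # Execute
        print(f"load={load}, workers -> {plan}")

if __name__ == "__main__":
    manager = AutonomicManager(LegacyService())
    for _ in range(5):
        manager.step()
        time.sleep(0.1)
```

The key point is that the legacy component is untouched: all self-management logic lives in the external manager, which is what allows existing systems to be retrofitted.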
Abstract:
DUE TO COPYRIGHT RESTRICTIONS ONLY AVAILABLE FOR CONSULTATION AT ASTON UNIVERSITY LIBRARY AND INFORMATION SERVICES WITH PRIOR ARRANGEMENT
Abstract:
With the ability to collect and store increasingly large datasets on modern computers comes the need to process the data in a way that is useful to a geostatistician or application scientist. Although the storage requirements only scale linearly with the number of observations in the dataset, the computational complexity in terms of memory and speed scales quadratically and cubically, respectively, for likelihood-based geostatistics. Various methods have been proposed and are extensively used in an attempt to overcome these complexity issues. This thesis introduces a number of principled techniques for treating large datasets, with an emphasis on three main areas: reduced-complexity covariance matrices, sparsity in the covariance matrix, and parallel algorithms for distributed computation. These techniques are presented individually, but it is also shown how they can be combined to produce techniques for further improving computational efficiency.
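One common instance of a reduced-complexity covariance matrix is a low-rank (Nyström-style) approximation built from m inducing points, which together with the Woodbury identity replaces the O(n^3) dense solve with an O(n m^2) one. The sketch below is a hedged illustration under an assumed exponential covariance, not the thesis's specific techniques; all names and values are illustrative.

```python
import numpy as np

def exp_cov(a, b, sill=1.0, range_par=10.0):
    """Exponential covariance between two sets of 2-D coordinates."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sill * np.exp(-d / range_par)

def low_rank_solve(coords, y, inducing, noise=0.1):
    """Solve (noise*I + Knm Kmm^{-1} Kmn)^{-1} y via the Woodbury identity.

    Cost is O(n m^2) for m inducing points, instead of O(n^3) for a dense solve.
    """
    Knm = exp_cov(coords, inducing)                       # (n, m)
    Kmm = exp_cov(inducing, inducing) + 1e-8 * np.eye(len(inducing))
    # Woodbury: (s*I + U Kmm^{-1} U^T)^{-1} y
    #         = y/s - U (Kmm + U^T U / s)^{-1} U^T y / s^2
    inner = Kmm + (Knm.T @ Knm) / noise
    tmp = np.linalg.solve(inner, Knm.T @ y)
    return y / noise - (Knm @ tmp) / noise ** 2

rng = np.random.default_rng(0)
coords = rng.uniform(0, 100, size=(5000, 2))
y = rng.normal(size=5000)                                 # placeholder observations
inducing = coords[rng.choice(5000, size=100, replace=False)]
alpha = low_rank_solve(coords, y, inducing)               # weights usable for prediction
print(alpha.shape)
```

The same low-rank structure also gives a cheap log-determinant via the matrix determinant lemma, so likelihood evaluation inherits the reduced cost as well.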
Abstract:
This thesis presents an experimental investigation of different effects and techniques that can be used to upgrade legacy WDM communication systems. The main issue in upgrading legacy systems is that the fundamental setup, including component settings such as EDFA gains, should not be altered; the improvement must therefore be carried out at the network terminal. A general introduction to optical fibre communications is given at the beginning, including optical communication components and system impairments. Experimental techniques for performing laboratory optical transmission experiments are presented before the experimental work of this thesis. These techniques include optical transmitter and receiver designs as well as the design and operation of the recirculating loop. The main experimental work includes three different studies. The first study involves the development of line monitoring equipment that can be reliably used to monitor the performance of optically amplified long-haul undersea systems. This equipment can provide instant location of faults along the legacy communication link, which in turn enables rapid repair to be performed, hence upgrading the legacy system. The second study investigates the effect of changing the number of transmitted 1s and 0s on the performance of a WDM system. This effect can, in reality, be seen in some coding systems, e.g. the forward-error correction (FEC) technique, where the proportion of 1s and 0s is changed at the transmitter by adding extra bits to the original bit sequence. The final study presents transmission results after all-optical format conversion from NRZ to CSRZ and from RZ to CSRZ using a semiconductor optical amplifier in a nonlinear optical loop mirror (SOA-NOLM). This study is mainly based on the fact that all-optical processing, including format conversion, has become attractive for future data networks that are proposed to be all-optical. The feasibility of the SOA-NOLM device for converting single and WDM signals is described. The optical conversion bandwidth and its limitations for WDM conversion are also investigated. All studies in this thesis employ 10 Gbit/s single-channel or WDM signals transmitted over a dispersion-managed fibre span in the recirculating loop. The fibre span is composed of single-mode fibres (SMF) whose losses and dispersion are compensated using erbium-doped fibre amplifiers (EDFAs) and dispersion compensating fibres (DCFs), respectively. Different configurations of the fibre span are presented in different parts of the thesis.
Abstract:
Despite expectations being high, the industrial take-up of Semantic Web technologies in developing services and applications has been slower than expected. One of the main reasons is that many legacy systems have been developed without considering the potential of the Web in integrating services and sharing resources. Without a systematic methodology and proper tool support, the migration from legacy systems to Semantic Web Service-based systems can be a tedious and expensive process, which carries a significant risk of failure. There is an urgent need to provide strategies allowing the migration of legacy systems to Semantic Web Services platforms, and also tools to support such strategies. In this paper we propose a methodology and its tool support for transitioning these applications to Semantic Web Services, allowing users to migrate their applications to Semantic Web Services platforms automatically or semi-automatically. The transition of the GATE system is used as a case study. © 2009 - IOS Press and the authors. All rights reserved.
Abstract:
Although the importance of dataset fitness-for-use evaluation and intercomparison is widely recognised within the GIS community, no practical tools have yet been developed to support such interrogation. GeoViQua aims to develop a GEO label which will visually summarise and allow interrogation of key informational aspects of geospatial datasets upon which users rely when selecting datasets for use. The proposed GEO label will be integrated in the Global Earth Observation System of Systems (GEOSS) and will be used as a value and trust indicator for datasets accessible through the GEO Portal. As envisioned, the GEO label will act as a decision support mechanism for dataset selection and thereby hopefully improve user recognition of the quality of datasets. To date we have conducted three user studies to (1) identify the informational aspects of geospatial datasets upon which users rely when assessing dataset quality and trustworthiness, (2) elicit initial user views on a GEO label and its potential role, and (3) evaluate prototype label visualisations. Our first study revealed that, when evaluating the quality of data, users consider eight facets: dataset producer information; producer comments on dataset quality; dataset compliance with international standards; community advice; dataset ratings; links to dataset citations; expert value judgements; and quantitative quality information. Our second study confirmed the relevance of these facets in terms of the community-perceived function that a GEO label should fulfil: users and producers of geospatial data supported the concept of a GEO label that provides a drill-down interrogation facility covering all eight informational aspects. Consequently, we developed three prototype label visualisations and evaluated their comparative effectiveness and user preference via a third user study to arrive at a final graphical GEO label representation. When integrated into the GEOSS, an individual GEO label will be provided for each dataset in the GEOSS clearinghouse (or other data portals and clearinghouses) based on its available quality information. Producer and feedback metadata documents are used to dynamically assess information availability and generate the GEO labels. The producer metadata document can either be a standard ISO-compliant metadata record supplied with the dataset or an extended version of a GeoViQua-derived metadata record, and is used to assess the availability of a producer profile, producer comments, compliance with standards, citations and quantitative quality information. GeoViQua is also currently developing a feedback server to collect and encode (as metadata records) user and producer feedback on datasets; these metadata records will be used to assess the availability of user comments, ratings, expert reviews and user-supplied citations for a dataset. The GEO label will provide drill-down functionality allowing a user to navigate to a GEO label page offering detailed quality information for its associated dataset. At this stage, we are developing the GEO label service that will be used to provide GEO labels on demand based on supplied metadata records. In this presentation, we will provide a comprehensive overview of the GEO label development process, with specific emphasis on the GEO label implementation and integration into the GEOSS.
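The core mechanism, deriving a label from whether each of the eight facets is available in the producer and feedback metadata, can be illustrated with a small sketch. The field names below are hypothetical placeholders, not the GeoViQua or ISO metadata schemas.

```python
# Hypothetical field names; the real producer/feedback metadata schemas differ.
FACETS = {
    "producer_information": "producer_profile",
    "producer_comments": "producer_comments",
    "standards_compliance": "conformance_reports",
    "community_advice": "user_comments",
    "ratings": "user_ratings",
    "citations": "citations",
    "expert_reviews": "expert_reviews",
    "quantitative_quality": "quality_measures",
}

def geo_label_summary(producer_meta: dict, feedback_meta: dict) -> dict:
    """Report which of the eight label facets have information available."""
    merged = {**producer_meta, **feedback_meta}
    return {facet: bool(merged.get(field)) for facet, field in FACETS.items()}

if __name__ == "__main__":
    producer = {"producer_profile": {"name": "Example Agency"},
                "conformance_reports": ["ISO 19115"],
                "quality_measures": [{"rmse": 0.4}]}
    feedback = {"user_ratings": [4, 5], "user_comments": []}
    print(geo_label_summary(producer, feedback))
```

A label service along these lines would render each available facet as filled in the graphical label and leave the others greyed out, with drill-down links to the underlying metadata.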