976 resultados para pacs: document processing techniques
Resumo:
This paper discusses how numerical gradient estimation methods may be used in order to reduce the computational demands on a class of multidimensional clustering algorithms. The study is motivated by the recognition that several current point-density based cluster identification algorithms could benefit from a reduction of computational demand if approximate a-priori estimates of the cluster centres present in a given data set could be supplied as starting conditions for these algorithms. In this particular presentation, the algorithm shown to benefit from the technique is the Mean-Tracking (M-T) cluster algorithm, but the results obtained from the gradient estimation approach may also be applied to other clustering algorithms and their related disciplines.
Resumo:
Optimal state estimation from given observations of a dynamical system by data assimilation is generally an ill-posed inverse problem. In order to solve the problem, a standard Tikhonov, or L2, regularization is used, based on certain statistical assumptions on the errors in the data. The regularization term constrains the estimate of the state to remain close to a prior estimate. In the presence of model error, this approach does not capture the initial state of the system accurately, as the initial state estimate is derived by minimizing the average error between the model predictions and the observations over a time window. Here we examine an alternative L1 regularization technique that has proved valuable in image processing. We show that for examples of flow with sharp fronts and shocks, the L1 regularization technique performs more accurately than standard L2 regularization.
Resumo:
The technique of constructing a transformation, or regrading, of a discrete data set such that the histogram of the transformed data matches a given reference histogram is commonly known as histogram modification. The technique is widely used for image enhancement and normalization. A method which has been previously derived for producing such a regrading is shown to be “best” in the sense that it minimizes the error between the cumulative histogram of the transformed data and that of the given reference function, over all single-valued, monotone, discrete transformations of the data. Techniques for smoothed regrading, which provide a means of balancing the error in matching a given reference histogram against the information lost with respect to a linear transformation are also examined. The smoothed regradings are shown to optimize certain cost functionals. Numerical algorithms for generating the smoothed regradings, which are simple and efficient to implement, are described, and practical applications to the processing of LANDSAT image data are discussed.
Resumo:
Traditional chemometrics techniques are augmented with algorithms tailored specifically for the de-noising and analysis of femtosecond duration pulse datasets. The new algorithms provide additional insights on sample responses to broadband excitation and multi-moded propagation phenomena.
Resumo:
Keyphrases are added to documents to help identify the areas of interest they contain. However, in a significant proportion of papers author selected keyphrases are not appropriate for the document they accompany: for instance, they can be classificatory rather than explanatory, or they are not updated when the focus of the paper changes. As such, automated methods for improving the use of keyphrases are needed, and various methods have been published. However, each method was evaluated using a different corpus, typically one relevant to the field of study of the method’s authors. This not only makes it difficult to incorporate the useful elements of algorithms in future work, but also makes comparing the results of each method inefficient and ineffective. This paper describes the work undertaken to compare five methods across a common baseline of corpora. The methods chosen were Term Frequency, Inverse Document Frequency, the C-Value, the NC-Value, and a Synonym based approach. These methods were analysed to evaluate performance and quality of results, and to provide a future benchmark. It is shown that Term Frequency and Inverse Document Frequency were the best algorithms, with the Synonym approach following them. Following these findings, a study was undertaken into the value of using human evaluators to judge the outputs. The Synonym method was compared to the original author keyphrases of the Reuters’ News Corpus. The findings show that authors of Reuters’ news articles provide good keyphrases but that more often than not they do not provide any keyphrases.
Resumo:
n the past decade, the analysis of data has faced the challenge of dealing with very large and complex datasets and the real-time generation of data. Technologies to store and access these complex and large datasets are in place. However, robust and scalable analysis technologies are needed to extract meaningful information from these datasets. The research field of Information Visualization and Visual Data Analytics addresses this need. Information visualization and data mining are often used complementary to each other. Their common goal is the extraction of meaningful information from complex and possibly large data. However, though data mining focuses on the usage of silicon hardware, visualization techniques also aim to access the powerful image-processing capabilities of the human brain. This article highlights the research on data visualization and visual analytics techniques. Furthermore, we highlight existing visual analytics techniques, systems, and applications including a perspective on the field from the chemical process industry.
Resumo:
We have optimised the atmospheric radiation algorithm of the FAMOUS climate model on several hardware platforms. The optimisation involved translating the Fortran code to C and restructuring the algorithm around the computation of a single air column. Instead of the existing MPI-based domain decomposition, we used a task queue and a thread pool to schedule the computation of individual columns on the available processors. Finally, four air columns are packed together in a single data structure and computed simultaneously using Single Instruction Multiple Data operations. The modified algorithm runs more than 50 times faster on the CELL’s Synergistic Processing Elements than on its main PowerPC processing element. On Intel-compatible processors, the new radiation code runs 4 times faster. On the tested graphics processor, using OpenCL, we find a speed-up of more than 2.5 times as compared to the original code on the main CPU. Because the radiation code takes more than 60% of the total CPU time, FAMOUS executes more than twice as fast. Our version of the algorithm returns bit-wise identical results, which demonstrates the robustness of our approach. We estimate that this project required around two and a half man-years of work.
Resumo:
It is well known that atmospheric concentrations of carbon dioxide (CO2) (and other greenhouse gases) have increased markedly as a result of human activity since the industrial revolution. It is perhaps less appreciated that natural and managed soils are an important source and sink for atmospheric CO2 and that, primarily as a result of the activities of soil microorganisms, there is a soil-derived respiratory flux of CO2 to the atmosphere that overshadows by tenfold the annual CO2 flux from fossil fuel emissions. Therefore small changes in the soil carbon cycle could have large impacts on atmospheric CO2 concentrations. Here we discuss the role of soil microbes in the global carbon cycle and review the main methods that have been used to identify the microorganisms responsible for the processing of plant photosynthetic carbon inputs to soil. We discuss whether application of these techniques can provide the information required to underpin the management of agro-ecosystems for carbon sequestration and increased agricultural sustainability. We conclude that, although crucial in enabling the identification of plant-derived carbon-utilising microbes, current technologies lack the high-throughput ability to quantitatively apportion carbon use by phylogentic groups and its use efficiency and destination within the microbial metabolome. It is this information that is required to inform rational manipulation of the plant–soil system to favour organisms or physiologies most important for promoting soil carbon storage in agricultural soil.
Resumo:
Social network has gained remarkable attention in the last decade. Accessing social network sites such as Twitter, Facebook LinkedIn and Google+ through the internet and the web 2.0 technologies has become more affordable. People are becoming more interested in and relying on social network for information, news and opinion of other users on diverse subject matters. The heavy reliance on social network sites causes them to generate massive data characterised by three computational issues namely; size, noise and dynamism. These issues often make social network data very complex to analyse manually, resulting in the pertinent use of computational means of analysing them. Data mining provides a wide range of techniques for detecting useful knowledge from massive datasets like trends, patterns and rules [44]. Data mining techniques are used for information retrieval, statistical modelling and machine learning. These techniques employ data pre-processing, data analysis, and data interpretation processes in the course of data analysis. This survey discusses different data mining techniques used in mining diverse aspects of the social network over decades going from the historical techniques to the up-to-date models, including our novel technique named TRCM. All the techniques covered in this survey are listed in the Table.1 including the tools employed as well as names of their authors.
Resumo:
The processing of fish roe leads to changes in its chemical composition, the extent of which depends on the techniques and additives employed. This study aimed to investigate the effects of ripening temperature and the use of sodium benzoate and citric acid on the quality of ripened cod roe, with respect to the contents of volatile base nitrogen (VBN), trimethylamine (TMA), biogenic amines (BA) and on the lipid composition. In comparison with fresh roes, ripened roes presented higher contents of VBN, TMA, BA and the proportion of free fatty acids regardless of the temperature and additives used during the ripening process. The greatest increases were observed in the samples ripened at 17 degrees C without additives, in which histamine was detected at 8.8 mg/100 g. A low ripening temperature was the main factor responsible for minimising changes in the cod roe composition. The addition of sodium benzoate as a preservative or citric acid to decrease the pH value had a significant effect in maintaining the quality of the cod roes, mainly at high ripening temperature. (C) 2011 Elsevier Ltd. All rights reserved.
Resumo:
Point placement strategies aim at mapping data points represented in higher dimensions to bi-dimensional spaces and are frequently used to visualize relationships amongst data instances. They have been valuable tools for analysis and exploration of data sets of various kinds. Many conventional techniques, however, do not behave well when the number of dimensions is high, such as in the case of documents collections. Later approaches handle that shortcoming, but may cause too much clutter to allow flexible exploration to take place. In this work we present a novel hierarchical point placement technique that is capable of dealing with these problems. While good grouping and separation of data with high similarity is maintained without increasing computation cost, its hierarchical structure lends itself both to exploration in various levels of detail and to handling data in subsets, improving analysis capability and also allowing manipulation of larger data sets.
Resumo:
Photovoltaic processing is one of the processes that have significance in semiconductor process line. It is complicated due to the no. of elements involved that directly or indirectly affect the processing and final yield. So mathematically or empirically we can’t say assertively about the results specially related with diffusion, antireflective coating and impurity poisoning. Here I have experimented and collected data on the mono-crystal silicon wafers with varying properties and outputs. Then by using neural network with available experimental data output required can be estimated which is further tested by the test data for authenticity. One can say that it’s a kind of process simulation with varying input of raw wafers to get desired yield of photovoltaic mono-crystal cells.
Resumo:
A sound and complete first-order goal-oriented sequent-type calculus is developed with ``large-block'' inference rules. In particular, the calculus contains formal analogues of such natural proof-search techniques as handling definitions and applying auxiliary propositions.
Resumo:
Being able to ask questions about the provenance of some data requires documentation on each influence on that data's existence and content. Much software exists, and is being developed, for which there is no provenance-awareness, i.e. at best, the data it outputs can be connected to its inputs, but with no record of intermediate processing. Further, where some record of processing does exist, e.g. as logs, it is not in a form easily connected with that of other processes. We would like to enable compiled software to record useful documentation without requiring prior manual adaptation. In this paper, we present an approach to adapting source code from its original form without manual manipulation, to record information on data provenance during execution.
Resumo:
With the ever increasing demands for high complexity consumer electronic products, market pressures demand faster product development and lower cost. SoCbased design can provide the required design flexibility and speed by allowing the use of IP cores. However, testing costs in the SoC environment can reach a substantial percent of the total production cost. Analog testing costs may dominate the total test cost, as testing of analog circuits usually require functional verification of the circuit and special testing procedures. For RF analog circuits commonly used in wireless applications, testing is further complicated because of the high frequencies involved. In summary, reducing analog test cost is of major importance in the electronic industry today. BIST techniques for analog circuits, though potentially able to solve the analog test cost problem, have some limitations. Some techniques are circuit dependent, requiring reconfiguration of the circuit being tested, and are generally not usable in RF circuits. In the SoC environment, as processing and memory resources are available, they could be used in the test. However, the overhead for adding additional AD and DA converters may be too costly for most systems, and analog routing of signals may not be feasible and may introduce signal distortion. In this work a simple and low cost digitizer is used instead of an ADC in order to enable analog testing strategies to be implemented in a SoC environment. Thanks to the low analog area overhead of the converter, multiple analog test points can be observed and specific analog test strategies can be enabled. As the digitizer is always connected to the analog test point, it is not necessary to include muxes and switches that would degrade the signal path. For RF analog circuits, this is specially useful, as the circuit impedance is fixed and the influence of the digitizer can be accounted for in the design phase. Thanks to the simplicity of the converter, it is able to reach higher frequencies, and enables the implementation of low cost RF test strategies. The digitizer has been applied successfully in the testing of both low frequency and RF analog circuits. Also, as testing is based on frequency-domain characteristics, nonlinear characteristics like intermodulation products can also be evaluated. Specifically, practical results were obtained for prototyped base band filters and a 100MHz mixer. The application of the converter for noise figure evaluation was also addressed, and experimental results for low frequency amplifiers using conventional opamps were obtained. The proposed method is able to enhance the testability of current mixed-signal designs, being suitable for the SoC environment used in many industrial products nowadays.