862 resultados para Arabic alphabet Data processing


Relevância:

100.00% 100.00%

Publicador:

Resumo:

All-optical technologies for data processing and signal manipulation are expected to play a major role in future optical communications. Nonlinear phenomena occurring in optical fibre have many attractive features and great, but not yet fully exploited potential in optical signal processing. Here, we overview our recent results and advances in developing novel photonic techniques and approaches to all-optical processing based on fibre nonlinearities. Amongst other topics, we will discuss phase-preserving optical 2R regeneration, the possibility of using parabolic/flat-top pulses for optical signal processing and regeneration, and nonlinear optical pulse shaping. A method for passive nonlinear pulse shaping based on pulse pre-chirping and propagation in a normally dispersive fibre will be presented. The approach provides a simple way of generating various temporal waveforms of fundamental and practical interest. Particular emphasis will be given to the formation and characterization of pulses with a triangular intensity profile. A new technique of doubling/copying optical pulses in both the frequency and time domains using triangular-shaped pulses will be also introduced.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Very often the experimental data are the realization of the process, fully determined by some unknown function, being distorted by hindrances. Treatment and experimental data analysis are substantially facilitated, if these data to represent as analytical expression. The experimental data processing algorithm and the example of using this algorithm for spectrographic analysis of oncologic preparations of blood is represented in this article.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The purpose of this paper is to investigate the technological development of electronic inventory solutions from perspective of patent analysis. We first applied the international patent classification to classify the top categories of data processing technologies and their corresponding top patenting countries. Then we identified the core technologies by the calculation of patent citation strength and standard deviation criterion for each patent. To eliminate those core innovations having no reference relationships with the other core patents, relevance strengths between core technologies were evaluated also. Our findings provide market intelligence not only for the research and development community, but for the decision making of advanced inventory solutions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The development of new all-optical technologies for data processing and signal manipulation is a field of growing importance with a strong potential for numerous applications in diverse areas of modern science. Nonlinear phenomena occurring in optical fibres have many attractive features and great, but not yet fully explored, potential in signal processing. Here, we review recent progress on the use of fibre nonlinearities for the generation and shaping of optical pulses and on the applications of advanced pulse shapes in all-optical signal processing. Amongst other topics, we will discuss ultrahigh repetition rate pulse sources, the generation of parabolic shaped pulses in active and passive fibres, the generation of pulses with triangular temporal profiles, and coherent supercontinuum sources. The signal processing applications will span optical regeneration, linear distortion compensation, optical decision at the receiver in optical communication systems, spectral and temporal signal doubling, and frequency conversion. © Copyright 2012 Sonia Boscolo and Christophe Finot.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This research presents several components encompassing the scope of the objective of Data Partitioning and Replication Management in Distributed GIS Database. Modern Geographic Information Systems (GIS) databases are often large and complicated. Therefore data partitioning and replication management problems need to be addresses in development of an efficient and scalable solution. ^ Part of the research is to study the patterns of geographical raster data processing and to propose the algorithms to improve availability of such data. These algorithms and approaches are targeting granularity of geographic data objects as well as data partitioning in geographic databases to achieve high data availability and Quality of Service(QoS) considering distributed data delivery and processing. To achieve this goal a dynamic, real-time approach for mosaicking digital images of different temporal and spatial characteristics into tiles is proposed. This dynamic approach reuses digital images upon demand and generates mosaicked tiles only for the required region according to user's requirements such as resolution, temporal range, and target bands to reduce redundancy in storage and to utilize available computing and storage resources more efficiently. ^ Another part of the research pursued methods for efficient acquiring of GIS data from external heterogeneous databases and Web services as well as end-user GIS data delivery enhancements, automation and 3D virtual reality presentation. ^ There are vast numbers of computing, network, and storage resources idling or not fully utilized available on the Internet. Proposed "Crawling Distributed Operating System "(CDOS) approach employs such resources and creates benefits for the hosts that lend their CPU, network, and storage resources to be used in GIS database context. ^ The results of this dissertation demonstrate effective ways to develop a highly scalable GIS database. The approach developed in this dissertation has resulted in creation of TerraFly GIS database that is used by US government, researchers, and general public to facilitate Web access to remotely-sensed imagery and GIS vector information. ^

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The microarray technology provides a high-throughput technique to study gene expression. Microarrays can help us diagnose different types of cancers, understand biological processes, assess host responses to drugs and pathogens, find markers for specific diseases, and much more. Microarray experiments generate large amounts of data. Thus, effective data processing and analysis are critical for making reliable inferences from the data. ^ The first part of dissertation addresses the problem of finding an optimal set of genes (biomarkers) to classify a set of samples as diseased or normal. Three statistical gene selection methods (GS, GS-NR, and GS-PCA) were developed to identify a set of genes that best differentiate between samples. A comparative study on different classification tools was performed and the best combinations of gene selection and classifiers for multi-class cancer classification were identified. For most of the benchmarking cancer data sets, the gene selection method proposed in this dissertation, GS, outperformed other gene selection methods. The classifiers based on Random Forests, neural network ensembles, and K-nearest neighbor (KNN) showed consistently god performance. A striking commonality among these classifiers is that they all use a committee-based approach, suggesting that ensemble classification methods are superior. ^ The same biological problem may be studied at different research labs and/or performed using different lab protocols or samples. In such situations, it is important to combine results from these efforts. The second part of the dissertation addresses the problem of pooling the results from different independent experiments to obtain improved results. Four statistical pooling techniques (Fisher inverse chi-square method, Logit method. Stouffer's Z transform method, and Liptak-Stouffer weighted Z-method) were investigated in this dissertation. These pooling techniques were applied to the problem of identifying cell cycle-regulated genes in two different yeast species. As a result, improved sets of cell cycle-regulated genes were identified. The last part of dissertation explores the effectiveness of wavelet data transforms for the task of clustering. Discrete wavelet transforms, with an appropriate choice of wavelet bases, were shown to be effective in producing clusters that were biologically more meaningful. ^

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This dissertation established a software-hardware integrated design for a multisite data repository in pediatric epilepsy. A total of 16 institutions formed a consortium for this web-based application. This innovative fully operational web application allows users to upload and retrieve information through a unique human-computer graphical interface that is remotely accessible to all users of the consortium. A solution based on a Linux platform with My-SQL and Personal Home Page scripts (PHP) has been selected. Research was conducted to evaluate mechanisms to electronically transfer diverse datasets from different hospitals and collect the clinical data in concert with their related functional magnetic resonance imaging (fMRI). What was unique in the approach considered is that all pertinent clinical information about patients is synthesized with input from clinical experts into 4 different forms, which were: Clinical, fMRI scoring, Image information, and Neuropsychological data entry forms. A first contribution of this dissertation was in proposing an integrated processing platform that was site and scanner independent in order to uniformly process the varied fMRI datasets and to generate comparative brain activation patterns. The data collection from the consortium complied with the IRB requirements and provides all the safeguards for security and confidentiality requirements. An 1-MR1-based software library was used to perform data processing and statistical analysis to obtain the brain activation maps. Lateralization Index (LI) of healthy control (HC) subjects in contrast to localization-related epilepsy (LRE) subjects were evaluated. Over 110 activation maps were generated, and their respective LIs were computed yielding the following groups: (a) strong right lateralization: (HC=0%, LRE=18%), (b) right lateralization: (HC=2%, LRE=10%), (c) bilateral: (HC=20%, LRE=15%), (d) left lateralization: (HC=42%, LRE=26%), e) strong left lateralization: (HC=36%, LRE=31%). Moreover, nonlinear-multidimensional decision functions were used to seek an optimal separation between typical and atypical brain activations on the basis of the demographics as well as the extent and intensity of these brain activations. The intent was not to seek the highest output measures given the inherent overlap of the data, but rather to assess which of the many dimensions were critical in the overall assessment of typical and atypical language activations with the freedom to select any number of dimensions and impose any degree of complexity in the nonlinearity of the decision space.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A substantial amount of information on the Internet is present in the form of text. The value of this semi-structured and unstructured data has been widely acknowledged, with consequent scientific and commercial exploitation. The ever-increasing data production, however, pushes data analytic platforms to their limit. This thesis proposes techniques for more efficient textual big data analysis suitable for the Hadoop analytic platform. This research explores the direct processing of compressed textual data. The focus is on developing novel compression methods with a number of desirable properties to support text-based big data analysis in distributed environments. The novel contributions of this work include the following. Firstly, a Content-aware Partial Compression (CaPC) scheme is developed. CaPC makes a distinction between informational and functional content in which only the informational content is compressed. Thus, the compressed data is made transparent to existing software libraries which often rely on functional content to work. Secondly, a context-free bit-oriented compression scheme (Approximated Huffman Compression) based on the Huffman algorithm is developed. This uses a hybrid data structure that allows pattern searching in compressed data in linear time. Thirdly, several modern compression schemes have been extended so that the compressed data can be safely split with respect to logical data records in distributed file systems. Furthermore, an innovative two layer compression architecture is used, in which each compression layer is appropriate for the corresponding stage of data processing. Peripheral libraries are developed that seamlessly link the proposed compression schemes to existing analytic platforms and computational frameworks, and also make the use of the compressed data transparent to developers. The compression schemes have been evaluated for a number of standard MapReduce analysis tasks using a collection of real-world datasets. In comparison with existing solutions, they have shown substantial improvement in performance and significant reduction in system resource requirements.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Cloud computing offers massive scalability and elasticity required by many scien-tific and commercial applications. Combining the computational and data handling capabilities of clouds with parallel processing also has the potential to tackle Big Data problems efficiently. Science gateway frameworks and workflow systems enable application developers to implement complex applications and make these available for end-users via simple graphical user interfaces. The integration of such frameworks with Big Data processing tools on the cloud opens new oppor-tunities for application developers. This paper investigates how workflow sys-tems and science gateways can be extended with Big Data processing capabilities. A generic approach based on infrastructure aware workflows is suggested and a proof of concept is implemented based on the WS-PGRADE/gUSE science gateway framework and its integration with the Hadoop parallel data processing solution based on the MapReduce paradigm in the cloud. The provided analysis demonstrates that the methods described to integrate Big Data processing with workflows and science gateways work well in different cloud infrastructures and application scenarios, and can be used to create massively parallel applications for scientific analysis of Big Data.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper is part of a special issue of Applied Geochemistry focusing on reliable applications of compositional multivariate statistical methods. This study outlines the application of compositional data analysis (CoDa) to calibration of geochemical data and multivariate statistical modelling of geochemistry and grain-size data from a set of Holocene sedimentary cores from the Ganges-Brahmaputra (G-B) delta. Over the last two decades, understanding near-continuous records of sedimentary sequences has required the use of core-scanning X-ray fluorescence (XRF) spectrometry, for both terrestrial and marine sedimentary sequences. Initial XRF data are generally unusable in ‘raw-format’, requiring data processing in order to remove instrument bias, as well as informed sequence interpretation. The applicability of these conventional calibration equations to core-scanning XRF data are further limited by the constraints posed by unknown measurement geometry and specimen homogeneity, as well as matrix effects. Log-ratio based calibration schemes have been developed and applied to clastic sedimentary sequences focusing mainly on energy dispersive-XRF (ED-XRF) core-scanning. This study has applied high resolution core-scanning XRF to Holocene sedimentary sequences from the tidal-dominated Indian Sundarbans, (Ganges-Brahmaputra delta plain). The Log-Ratio Calibration Equation (LRCE) was applied to a sub-set of core-scan and conventional ED-XRF data to quantify elemental composition. This provides a robust calibration scheme using reduced major axis regression of log-ratio transformed geochemical data. Through partial least squares (PLS) modelling of geochemical and grain-size data, it is possible to derive robust proxy information for the Sundarbans depositional environment. The application of these techniques to Holocene sedimentary data offers an improved methodological framework for unravelling Holocene sedimentation patterns.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Thesis (Ph.D.)--University of Washington, 2016-08

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Recent advances in the massively parallel computational abilities of graphical processing units (GPUs) have increased their use for general purpose computation, as companies look to take advantage of big data processing techniques. This has given rise to the potential for malicious software targeting GPUs, which is of interest to forensic investigators examining the operation of software. The ability to carry out reverse-engineering of software is of great importance within the security and forensics elds, particularly when investigating malicious software or carrying out forensic analysis following a successful security breach. Due to the complexity of the Nvidia CUDA (Compute Uni ed Device Architecture) framework, it is not clear how best to approach the reverse engineering of a piece of CUDA software. We carry out a review of the di erent binary output formats which may be encountered from the CUDA compiler, and their implications on reverse engineering. We then demonstrate the process of carrying out disassembly of an example CUDA application, to establish the various techniques available to forensic investigators carrying out black-box disassembly and reverse engineering of CUDA binaries. We show that the Nvidia compiler, using default settings, leaks useful information. Finally, we demonstrate techniques to better protect intellectual property in CUDA algorithm implementations from reverse engineering.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The only method used to date to measure dissolved nitrate concentration (NITRATE) with sensors mounted on profiling floats is based on the absorption of light at ultraviolet wavelengths by nitrate ion (Johnson and Coletti, 2002; Johnson et al., 2010; 2013; D’Ortenzio et al., 2012). Nitrate has a modest UV absorption band with a peak near 210 nm, which overlaps with the stronger absorption band of bromide, which has a peak near 200 nm. In addition, there is a much weaker absorption due to dissolved organic matter and light scattering by particles (Ogura and Hanya, 1966). The UV spectrum thus consists of three components, bromide, nitrate and a background due to organics and particles. The background also includes thermal effects on the instrument and slow drift. All of these latter effects (organics, particles, thermal effects and drift) tend to be smooth spectra that combine to form an absorption spectrum that is linear in wavelength over relatively short wavelength spans. If the light absorption spectrum is measured in the wavelength range around 217 to 240 nm (the exact range is a bit of a decision by the operator), then the nitrate concentration can be determined. Two different instruments based on the same optical principles are in use for this purpose. The In Situ Ultraviolet Spectrophotometer (ISUS) built at MBARI or at Satlantic has been mounted inside the pressure hull of a Teledyne/Webb Research APEX and NKE Provor profiling floats and the optics penetrate through the upper end cap into the water. The Satlantic Submersible Ultraviolet Nitrate Analyzer (SUNA) is placed on the outside of APEX, Provor, and Navis profiling floats in its own pressure housing and is connected to the float through an underwater cable that provides power and communications. Power, communications between the float controller and the sensor, and data processing requirements are essentially the same for both ISUS and SUNA. There are several possible algorithms that can be used for the deconvolution of nitrate concentration from the observed UV absorption spectrum (Johnson and Coletti, 2002; Arai et al., 2008; Sakamoto et al., 2009; Zielinski et al., 2011). In addition, the default algorithm that is available in Satlantic sensors is a proprietary approach, but this is not generally used on profiling floats. There are some tradeoffs in every approach. To date almost all nitrate sensors on profiling floats have used the Temperature Compensated Salinity Subtracted (TCSS) algorithm developed by Sakamoto et al. (2009), and this document focuses on that method. It is likely that there will be further algorithm development and it is necessary that the data systems clearly identify the algorithm that is used. It is also desirable that the data system allow for recalculation of prior data sets using new algorithms. To accomplish this, the float must report not just the computed nitrate, but the observed light intensity. Then, the rule to obtain only one NITRATE parameter is, if the spectrum is present then, the NITRATE should be recalculated from the spectrum while the computation of nitrate concentration can also generate useful diagnostics of data quality.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The CATARINA Leg1 cruise was carried out from June 22 to July 24 2012 on board the B/O Sarmiento de Gamboa, under the scientific supervision of Aida Rios (CSIC-IIM). It included the occurrence of the OVIDE hydrological section that was performed in June 2002, 2004, 2006, 2008 and 2010, as part of the CLIVAR program (name A25) ), and under the supervision of Herlé Mercier (CNRSLPO). This section begins near Lisbon (Portugal), runs through the West European Basin and the Iceland Basin, crosses the Reykjanes Ridge (300 miles north of Charlie-Gibbs Fracture Zone, and ends at Cape Hoppe (southeast tip of Greenland). The objective of this repeated hydrological section is to monitor the variability of water mass properties and main current transports in the basin, complementing the international observation array relevant for climate studies. In addition, the Labrador Sea was partly sampled (stations 101-108) between Greenland and Newfoundland, but heavy weather conditions prevented the achievement of the section south of 53°40’N. The quality of CTD data is essential to reach the first objective of the CATARINA project, i.e. to quantify the Meridional Overturning Circulation and water mass ventilation changes and their effect on the changes in the anthropogenic carbon ocean uptake and storage capacity. The CATARINA project was mainly funded by the Spanish Ministry of Sciences and Innovation and co-funded by the Fondo Europeo de Desarrollo Regional. The hydrological OVIDE section includes 95 surface-bottom stations from coast to coast, collecting profiles of temperature, salinity, oxygen and currents, spaced by 2 to 25 Nm depending on the steepness of the topography. The position of the stations closely follows that of OVIDE 2002. In addition, 8 stations were carried out in the Labrador Sea. From the 24 bottles closed at various depth at each stations, samples of sea water are used for salinity and oxygen calibration, and for measurements of biogeochemical components that are not reported here. The data were acquired with a Seabird CTD (SBE911+) and an SBE43 for the dissolved oxygen, belonging to the Spanish UTM group. The software SBE data processing was used after decoding and cleaning the raw data. Then, the LPO matlab toolbox was used to calibrate and bin the data as it was done for the previous OVIDE cruises, using on the one hand pre and post-cruise calibration results for the pressure and temperature sensors (done at Ifremer) and on the other hand the water samples of the 24 bottles of the rosette at each station for the salinity and dissolved oxygen data. A final accuracy of 0.002°C, 0.002 psu and 0.04 ml/l (2.3 umol/kg) was obtained on final profiles of temperature, salinity and dissolved oxygen, compatible with international requirements issued from the WOCE program.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Monitoring unused or dark IP addresses offers opportunities to extract useful information about both on-going and new attack patterns. In recent years, different techniques have been used to analyze such traffic including sequential analysis where a change in traffic behavior, for example change in mean, is used as an indication of malicious activity. Change points themselves say little about detected change; further data processing is necessary for the extraction of useful information and to identify the exact cause of the detected change which is limited due to the size and nature of observed traffic. In this paper, we address the problem of analyzing a large volume of such traffic by correlating change points identified in different traffic parameters. The significance of the proposed technique is two-fold. Firstly, automatic extraction of information related to change points by correlating change points detected across multiple traffic parameters. Secondly, validation of the detected change point by the simultaneous presence of another change point in a different parameter. Using a real network trace collected from unused IP addresses, we demonstrate that the proposed technique enables us to not only validate the change point but also extract useful information about the causes of change points.