963 results for Fish populations -- Data processing
Abstract:
The purpose of this paper is to investigate the technological development of electronic inventory solutions from the perspective of patent analysis. We first applied the International Patent Classification to classify the top categories of data processing technologies and their corresponding top patenting countries. Then we identified the core technologies by calculating the citation strength of each patent and applying a standard deviation criterion. To eliminate core innovations that have no reference relationships with the other core patents, we also evaluated the relevance strengths between core technologies. Our findings provide market intelligence not only for the research and development community but also for decision making on advanced inventory solutions.
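By way of illustration only, a minimal sketch of the kind of citation-strength screening described above, assuming a count-based strength measure and a mean-plus-k-standard-deviations cutoff; the patent identifiers and the relevance count are hypothetical, not the paper's actual data or procedure:

```python
from statistics import mean, stdev

def core_patents(cited_by, k=1.0):
    """Flag 'core' patents whose citation strength (number of citing patents)
    exceeds the mean by at least k standard deviations."""
    strength = {pid: len(citing) for pid, citing in cited_by.items()}
    threshold = mean(strength.values()) + k * stdev(strength.values())
    return {pid for pid, s in strength.items() if s >= threshold}

def relevance_strength(core, cited_by):
    """Count citation links each core patent has with the other core patents."""
    return {pid: sum(1 for c in cited_by[pid] if c in core) for pid in core}

# Toy data: patent id -> ids of patents that cite it (all ids hypothetical)
cited_by = {"P1": ["P2", "P3", "P4"], "P2": ["P1"], "P3": [], "P4": ["P1", "P2"]}
core = core_patents(cited_by, k=0.0)       # k=0: anything above the mean
print(sorted(core))                        # ['P1', 'P4']
print(relevance_strength(core, cited_by))  # each has one link to the other core patent
```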
Abstract:
The development of new all-optical technologies for data processing and signal manipulation is a field of growing importance with a strong potential for numerous applications in diverse areas of modern science. Nonlinear phenomena occurring in optical fibres have many attractive features and great, but not yet fully explored, potential in signal processing. Here, we review recent progress on the use of fibre nonlinearities for the generation and shaping of optical pulses and on the applications of advanced pulse shapes in all-optical signal processing. Amongst other topics, we will discuss ultrahigh repetition rate pulse sources, the generation of parabolic shaped pulses in active and passive fibres, the generation of pulses with triangular temporal profiles, and coherent supercontinuum sources. The signal processing applications will span optical regeneration, linear distortion compensation, optical decision at the receiver in optical communication systems, spectral and temporal signal doubling, and frequency conversion. © Copyright 2012 Sonia Boscolo and Christophe Finot.
Abstract:
This research presents several components encompassing the objective of data partitioning and replication management in a distributed GIS database. Modern Geographic Information System (GIS) databases are often large and complicated, so data partitioning and replication management problems need to be addressed in the development of an efficient and scalable solution. Part of the research is to study the patterns of geographical raster data processing and to propose algorithms to improve the availability of such data. These algorithms and approaches target the granularity of geographic data objects as well as data partitioning in geographic databases, to achieve high data availability and Quality of Service (QoS) for distributed data delivery and processing. To achieve this goal, a dynamic, real-time approach for mosaicking digital images of different temporal and spatial characteristics into tiles is proposed. This dynamic approach reuses digital images on demand and generates mosaicked tiles only for the required region, according to the user's requirements such as resolution, temporal range, and target bands, to reduce redundancy in storage and to utilize available computing and storage resources more efficiently. Another part of the research pursued methods for the efficient acquisition of GIS data from external heterogeneous databases and Web services, as well as enhancements to end-user GIS data delivery, automation, and 3D virtual reality presentation. Vast numbers of computing, network, and storage resources on the Internet sit idle or underutilized. The proposed "Crawling Distributed Operating System" (CDOS) approach employs such resources and creates benefits for the hosts that lend their CPU, network, and storage resources to be used in a GIS database context. The results of this dissertation demonstrate effective ways to develop a highly scalable GIS database. The approach developed in this dissertation resulted in the creation of the TerraFly GIS database, which is used by the US government, researchers, and the general public to facilitate Web access to remotely sensed imagery and GIS vector information.
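A rough sketch of the on-demand tile selection idea described above (not TerraFly's actual implementation); the catalog fields, selection policy, and bounding boxes are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class SourceImage:
    name: str
    bbox: tuple          # (min_lon, min_lat, max_lon, max_lat)
    resolution_m: float  # ground sample distance in metres
    year: int
    bands: frozenset

def intersects(a, b):
    """Axis-aligned bounding-box overlap test."""
    return not (a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1])

def select_for_tile(catalog, tile_bbox, max_resolution_m, year_range, bands):
    """Return candidate images for one tile, finest resolution first."""
    hits = [img for img in catalog
            if intersects(img.bbox, tile_bbox)
            and img.resolution_m <= max_resolution_m
            and year_range[0] <= img.year <= year_range[1]
            and bands <= img.bands]
    return sorted(hits, key=lambda img: img.resolution_m)

catalog = [
    SourceImage("landsat_2004", (-80.5, 25.0, -80.0, 25.5), 30.0, 2004, frozenset("RGBN")),
    SourceImage("aerial_2006",  (-80.3, 25.1, -80.1, 25.3),  1.0, 2006, frozenset("RGB")),
]
tile = (-80.25, 25.15, -80.20, 25.20)
print([img.name for img in
       select_for_tile(catalog, tile, 30.0, (2000, 2010), frozenset("RGB"))])
```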
Abstract:
The microarray technology provides a high-throughput technique to study gene expression. Microarrays can help us diagnose different types of cancers, understand biological processes, assess host responses to drugs and pathogens, find markers for specific diseases, and much more. Microarray experiments generate large amounts of data. Thus, effective data processing and analysis are critical for making reliable inferences from the data. The first part of the dissertation addresses the problem of finding an optimal set of genes (biomarkers) to classify a set of samples as diseased or normal. Three statistical gene selection methods (GS, GS-NR, and GS-PCA) were developed to identify a set of genes that best differentiate between samples. A comparative study on different classification tools was performed and the best combinations of gene selection and classifiers for multi-class cancer classification were identified. For most of the benchmark cancer data sets, the gene selection method proposed in this dissertation, GS, outperformed other gene selection methods. The classifiers based on Random Forests, neural network ensembles, and K-nearest neighbor (KNN) showed consistently good performance. A striking commonality among these classifiers is that they all use a committee-based approach, suggesting that ensemble classification methods are superior. The same biological problem may be studied at different research labs and/or performed using different lab protocols or samples. In such situations, it is important to combine results from these efforts. The second part of the dissertation addresses the problem of pooling the results from different independent experiments to obtain improved results. Four statistical pooling techniques (Fisher inverse chi-square method, Logit method, Stouffer's Z-transform method, and Liptak-Stouffer weighted Z-method) were investigated in this dissertation. These pooling techniques were applied to the problem of identifying cell cycle-regulated genes in two different yeast species. As a result, improved sets of cell cycle-regulated genes were identified. The last part of the dissertation explores the effectiveness of wavelet data transforms for the task of clustering. Discrete wavelet transforms, with an appropriate choice of wavelet bases, were shown to be effective in producing clusters that were biologically more meaningful.
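For reference, the Fisher and (Liptak-)Stouffer pooling methods named above are standard p-value combination formulas and can be sketched as follows; the example p-values are hypothetical, and this is not the dissertation's code:

```python
import numpy as np
from scipy import stats

def fisher_pool(pvals):
    """Fisher's inverse chi-square method: -2*sum(ln p) ~ chi2 with 2k df."""
    pvals = np.asarray(pvals, dtype=float)
    statistic = -2.0 * np.sum(np.log(pvals))
    return stats.chi2.sf(statistic, df=2 * len(pvals))

def stouffer_pool(pvals, weights=None):
    """Stouffer's Z-transform method; weights give the Liptak-Stouffer variant."""
    z = stats.norm.isf(np.asarray(pvals, dtype=float))  # one-sided z-scores
    w = np.ones_like(z) if weights is None else np.asarray(weights, dtype=float)
    combined_z = np.sum(w * z) / np.sqrt(np.sum(w ** 2))
    return stats.norm.sf(combined_z)

# Hypothetical p-values for one gene from three independent experiments
p = [0.04, 0.10, 0.30]
print(fisher_pool(p), stouffer_pool(p), stouffer_pool(p, weights=[50, 30, 20]))
```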
Abstract:
This dissertation established a software-hardware integrated design for a multisite data repository in pediatric epilepsy. A total of 16 institutions formed a consortium for this web-based application. This innovative, fully operational web application allows users to upload and retrieve information through a unique human-computer graphical interface that is remotely accessible to all users of the consortium. A solution based on a Linux platform with MySQL and PHP (Personal Home Page) scripts was selected. Research was conducted to evaluate mechanisms to electronically transfer diverse datasets from different hospitals and to collect the clinical data in concert with their related functional magnetic resonance imaging (fMRI). What was unique in the approach considered is that all pertinent clinical information about patients is synthesized, with input from clinical experts, into 4 different forms: Clinical, fMRI scoring, Image information, and Neuropsychological data entry forms. A first contribution of this dissertation was to propose an integrated processing platform that is site and scanner independent, in order to uniformly process the varied fMRI datasets and to generate comparative brain activation patterns. The data collection from the consortium complied with IRB requirements and provides all the safeguards for security and confidentiality. An fMRI-based software library was used to perform data processing and statistical analysis to obtain the brain activation maps. The Lateralization Index (LI) of healthy control (HC) subjects was evaluated in contrast to that of localization-related epilepsy (LRE) subjects. Over 110 activation maps were generated, and their respective LIs were computed, yielding the following groups: (a) strong right lateralization (HC=0%, LRE=18%), (b) right lateralization (HC=2%, LRE=10%), (c) bilateral (HC=20%, LRE=15%), (d) left lateralization (HC=42%, LRE=26%), (e) strong left lateralization (HC=36%, LRE=31%). Moreover, nonlinear multidimensional decision functions were used to seek an optimal separation between typical and atypical brain activations on the basis of the demographics as well as the extent and intensity of these brain activations. The intent was not to seek the highest output measures given the inherent overlap of the data, but rather to assess which of the many dimensions were critical in the overall assessment of typical and atypical language activations, with the freedom to select any number of dimensions and impose any degree of complexity in the nonlinearity of the decision space.
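The Lateralization Index is conventionally computed as LI = (L - R)/(L + R) from left- and right-hemisphere activation measures; a minimal sketch follows, in which the category cut-offs and voxel counts are illustrative assumptions rather than the study's values:

```python
def lateralization_index(left_activation, right_activation):
    """LI = (L - R) / (L + R); +1 is fully left-lateralized, -1 fully right."""
    return (left_activation - right_activation) / (left_activation + right_activation)

def classify(li, strong=0.5, weak=0.2):
    # Category cut-offs are illustrative assumptions, not the study's values.
    if li >= strong:  return "strong left lateralization"
    if li >= weak:    return "left lateralization"
    if li <= -strong: return "strong right lateralization"
    if li <= -weak:   return "right lateralization"
    return "bilateral"

# e.g. 840 supra-threshold voxels in the left ROI, 310 in the right (hypothetical)
li = lateralization_index(840, 310)
print(round(li, 2), classify(li))   # 0.46 left lateralization
```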
Abstract:
We used a one-dimensional, spatially explicit model to simulate the community of small fishes in the freshwater wetlands of southern Florida, USA. The seasonality of rainfall in these wetlands causes annual fluctuations in the amount of flooded area. We modeled fish populations that differed from each other only in efficiency of resource utilization and dispersal ability. The simulations showed that these trade-offs, along with the spatial and temporal variability of the environment, allow coexistence of several species competing exploitatively for a common resource type. This mechanism, while sharing some characteristics with other mechanisms proposed for coexistence of competing species, is novel in detail. Simulated fish densities resembled patterns observed in Everglades empirical data. Cells with hydroperiods less than 6 months accumulated negligible fish biomass. One unique model result was that, when multiple species coexisted, it was possible for one of the coexisting species to have both lower local resource utilization efficiency and lower dispersal ability than one of the other species. This counterintuitive result is a consequence of stronger effects of other competitors on the superior species.
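A toy sketch of the efficiency-versus-dispersal trade-off on a seasonally drying transect, loosely in the spirit of the model described above; all parameter values and update rules are illustrative assumptions, not those of the published model:

```python
import numpy as np

n_cells, n_years = 50, 20
# name -> (resource-use efficiency, fraction dispersing per step); the two
# species are otherwise identical, mirroring the trade-off described above
species = {"efficient_poor_disperser": (0.9, 0.02),
           "inefficient_good_disperser": (0.6, 0.20)}
pop = {name: np.full(n_cells, 1.0) for name in species}

for year in range(n_years):
    for month in range(12):
        flooded = n_cells if month < 6 else n_cells // 3    # dry-season contraction
        total = sum(q[:flooded] for q in pop.values())      # shared resource demand
        for name, (eff, disp) in species.items():
            p = pop[name]
            p[:flooded] *= 1.0 + eff / (1.0 + total)        # crowding-limited growth
            p[flooded:] *= 0.05                             # heavy mortality in dry cells
            moved = disp * p                                # nearest-neighbour dispersal
            p += 0.5 * (np.roll(moved, 1) + np.roll(moved, -1)) - moved  # periodic edges

for name, p in pop.items():
    print(name, round(p.sum(), 1))
```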
Abstract:
During the 1960s, water management practices resulted in the conversion of the wetlands that fringe northeastern Florida Bay (USA) from freshwater/oligohaline herbaceous marshes to dwarf red mangrove forests. Coincident with this conversion were several ecological changes to Florida Bay’s fauna, including reductions in the abundances of top trophic-level consumers: piscivorous fishes, alligators, crocodiles, and wading birds. Because these taxa rely on a common forage base of small demersal fishes, food stress has been implicated as playing a role in their respective declines. In the present study, we monitored the demersal fishes seasonally at six sites over an 8-year time period. During monitoring, extremely high rainfall conditions occurred over a 3.5-year period leading to salinity regimes that can be viewed as “windows” to the area’s natural past and future restored states. In this paper, we: (1) examine the changes in fish communities over the 8-year study period and relate them to measured changes in salinity; (2) make comparisons among marine, brackish and freshwater demersal fish communities in terms of species composition, density, and biomass; and (3) discuss several implications of our findings in light of the intended and unintended water management changes that are planned or underway as part of Everglades restoration. Results suggest the reduction in freshwater flow to Florida Bay over the last several decades has reduced demersal fish populations, and thus prey availability for apex consumers in the coastal wetlands compared to the pre-drainage inferred standard. Furthermore, greater discharge of freshwater toward Florida Bay may result in the re-establishment of pre-1960s fauna, including a more robust demersal-fish community that should prompt increases in populations of several important predatory species.
Abstract:
A substantial amount of information on the Internet is present in the form of text. The value of this semi-structured and unstructured data has been widely acknowledged, with consequent scientific and commercial exploitation. The ever-increasing data production, however, pushes data analytic platforms to their limit. This thesis proposes techniques for more efficient textual big data analysis suitable for the Hadoop analytic platform. This research explores the direct processing of compressed textual data. The focus is on developing novel compression methods with a number of desirable properties to support text-based big data analysis in distributed environments. The novel contributions of this work include the following. Firstly, a Content-aware Partial Compression (CaPC) scheme is developed. CaPC makes a distinction between informational and functional content, and only the informational content is compressed. Thus, the compressed data is made transparent to existing software libraries, which often rely on functional content to work. Secondly, a context-free, bit-oriented compression scheme (Approximated Huffman Compression) based on the Huffman algorithm is developed. This uses a hybrid data structure that allows pattern searching in compressed data in linear time. Thirdly, several modern compression schemes have been extended so that the compressed data can be safely split with respect to logical data records in distributed file systems. Furthermore, an innovative two-layer compression architecture is used, in which each compression layer is appropriate for the corresponding stage of data processing. Peripheral libraries are developed that seamlessly link the proposed compression schemes to existing analytic platforms and computational frameworks, and also make the use of the compressed data transparent to developers. The compression schemes have been evaluated for a number of standard MapReduce analysis tasks using a collection of real-world datasets. In comparison with existing solutions, they have shown substantial improvement in performance and significant reduction in system resource requirements.
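For orientation, plain Huffman coding (the starting point for the Approximated Huffman Compression scheme mentioned above) can be sketched as follows; this is a generic illustration, not the thesis's hybrid data structure or compressed-pattern-search method:

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Return a {symbol: bitstring} Huffman code table for `text`."""
    heap = [[freq, [sym, ""]] for sym, freq in Counter(text).items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        lo = heapq.heappop(heap)          # two least-frequent subtrees
        hi = heapq.heappop(heap)
        for pair in lo[1:]:
            pair[1] = "0" + pair[1]       # left branch
        for pair in hi[1:]:
            pair[1] = "1" + pair[1]       # right branch
        heapq.heappush(heap, [lo[0] + hi[0]] + lo[1:] + hi[1:])
    return {sym: code for sym, code in heap[0][1:]}

def compress(text, codes):
    return "".join(codes[c] for c in text)

text = "this is an example of huffman coding"
codes = huffman_codes(text)
bits = compress(text, codes)
print(len(bits), "bits vs", 8 * len(text), "bits uncompressed")
```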
Abstract:
Changes in olfactory-mediated behaviour caused by elevated CO2 levels in the ocean could affect recruitment to reef fish populations because larval fish become more vulnerable to predation. However, it is currently unclear how elevated CO2 will impact the other key part of the predator-prey interaction - the predators. We investigated the effects of elevated CO2 and reduced pH on olfactory preferences, activity levels and feeding behaviour of a common coral reef meso-predator, the brown dottyback (Pseudochromis fuscus). Predators were exposed to either current-day CO2 levels or one of two elevated CO2 levels (~600 µatm or ~950 µatm) that may occur by 2100 according to climate change predictions. Exposure to elevated CO2 and reduced pH caused a shift from preference to avoidance of the smell of injured prey, with CO2-treated predators spending approximately 20% less time in a water stream containing prey odour compared with controls. Furthermore, activity levels of fish were higher in the high-CO2 treatment and feeding activity was lower for fish in the mid-CO2 treatment, indicating that future conditions may potentially reduce the ability of the fish to respond rapidly to fluctuations in food availability. Elevated activity levels of predators in the high-CO2 treatment, however, may compensate for reduced olfactory ability, as greater movement facilitated visual detection of food. Our findings show that, at least for the species tested to date, both parties in the predator-prey relationship may be affected by ocean acidification. Although impairment of olfactory-mediated behaviour of predators might reduce the risk of predation for larval fishes, the magnitude of the observed effects of elevated CO2 appears to be more dramatic for prey than for predators. Thus, it is unlikely that the altered behaviour of predators is sufficient to fully compensate for the effects of ocean acidification on prey mortality.
Abstract:
During U.S. Department of Interior, Bureau of Land Management (BLM) public hearings held in 1973, 1974 and 1975 prior to Texas Outer Continental Shelf (OCS) oil and gas lease sales, concern was expressed by the National Marine Fisheries Service, scientists from Texas A&M and the University of Texas and private citizens over the possible environmental impact of oil and gas drilling and production operations on coral reefs and fishing banks in or adjacent to lease blocks to be sold. As a result, certain restrictive regulations concerning drilling operations in the vicinity of the well-documented coral reefs and biostromal communities at the East and West Flower Gardens were established by BLM, and Signal Oil Company was required to provide a biological and geological baseline study of the less well known Stetson Bank before a drilling permit could be issued. Considering the almost total lack of knowledge of the geology and biotic communities associated with the South Texas OCS banks lying in or near lease blocks to be offered for sale in 1975, BLM contracted with Texas A&M University to provide the biological and geological baseline information required to facilitate judgments as to the extent and nature of restrictive regulations on drilling near these banks which might be required to ensure their protection. In pursuit of this, scientists from Texas A&M University were to direct their attention toward assessments of groundfish populations, unique biological and geological features, substratum type and distribution, and the biotic and geologic relationships between these banks and those farther north.
Abstract:
Fine-fraction (<63 µm) grain-size analyses of 530 samples from Holes 1095A, 1095B, and 1095D allow assessment of the downhole grain-size distribution at Drift 7. A variety of data processing methods, statistical treatment, and display techniques were used to describe this data set. The downhole fine-fraction grain-size distribution documents significant variations in the average grain-size composition and its cyclic pattern, revealed in five prominent intervals: (1) between 0 and 40 meters composite depth (mcd) (0 and 1.3 Ma), (2) between 40 and 80 mcd (1.3 and 2.4 Ma), (3) between 80 and 220 mcd (2.4 and 6 Ma), (4) between 220 and 360 mcd, and (5) below 360 mcd (prior to 8.1 Ma). In an approach designed to characterize depositional processes at Drift 7, we used statistical parameters determined by the method of moments for the sortable silt fraction to distinguish groups in the grain-size data set. We found three distinct grain-size populations and used these for a tentative environmental interpretation. Population 1 is related to a process in which glacially eroded shelf material was redeposited by turbidites with an ice-rafted debris influence. Population 2 is composed of interglacial turbidites. Population 3 is connected to depositional sequence tops linked to bioturbated sections that, in turn, are influenced by contourite currents and pelagic background sedimentation.
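The method-of-moments statistics referred to above (mean, sorting, skewness, kurtosis) can be sketched as follows; the grain-size classes and weights are hypothetical, and whether the study computed them on phi mid-points in exactly this way is an assumption:

```python
import numpy as np

def moment_statistics(midpoints_phi, weight_percent):
    """Method-of-moments grain-size statistics from class mid-points (phi units)
    and their weight percentages: mean, sorting (std dev), skewness, kurtosis."""
    w = np.asarray(weight_percent, dtype=float)
    w = w / w.sum()                               # normalise weights
    x = np.asarray(midpoints_phi, dtype=float)
    mean = np.sum(w * x)
    sorting = np.sqrt(np.sum(w * (x - mean) ** 2))
    skewness = np.sum(w * (x - mean) ** 3) / sorting ** 3
    kurtosis = np.sum(w * (x - mean) ** 4) / sorting ** 4
    return mean, sorting, skewness, kurtosis

# Hypothetical sortable-silt classes (phi mid-points) and weight percentages
phi = [4.5, 5.5, 6.5, 7.5]
wt = [10.0, 35.0, 40.0, 15.0]
print(moment_statistics(phi, wt))
```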
Abstract:
Cloud computing offers the massive scalability and elasticity required by many scientific and commercial applications. Combining the computational and data handling capabilities of clouds with parallel processing also has the potential to tackle Big Data problems efficiently. Science gateway frameworks and workflow systems enable application developers to implement complex applications and make these available for end-users via simple graphical user interfaces. The integration of such frameworks with Big Data processing tools on the cloud opens new opportunities for application developers. This paper investigates how workflow systems and science gateways can be extended with Big Data processing capabilities. A generic approach based on infrastructure-aware workflows is suggested and a proof of concept is implemented based on the WS-PGRADE/gUSE science gateway framework and its integration with the Hadoop parallel data processing solution based on the MapReduce paradigm in the cloud. The provided analysis demonstrates that the methods described to integrate Big Data processing with workflows and science gateways work well in different cloud infrastructures and application scenarios, and can be used to create massively parallel applications for scientific analysis of Big Data.
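As a reminder of the MapReduce paradigm the integration builds on, a minimal in-process sketch follows; it only illustrates the map, shuffle, and reduce stages and is unrelated to the WS-PGRADE/gUSE or Hadoop code itself:

```python
from collections import defaultdict

def map_phase(record):
    """map: one input record -> iterable of (key, value) pairs."""
    for word in record.split():
        yield word.lower(), 1

def shuffle(pairs):
    """Group intermediate values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """reduce: (key, all values for that key) -> (key, aggregated result)."""
    return key, sum(values)

records = ["big data processing", "big data on the cloud"]
intermediate = [pair for r in records for pair in map_phase(r)]
result = dict(reduce_phase(k, v) for k, v in shuffle(intermediate).items())
print(result)   # {'big': 2, 'data': 2, ...}
```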
Abstract:
This paper is part of a special issue of Applied Geochemistry focusing on reliable applications of compositional multivariate statistical methods. This study outlines the application of compositional data analysis (CoDa) to calibration of geochemical data and multivariate statistical modelling of geochemistry and grain-size data from a set of Holocene sedimentary cores from the Ganges-Brahmaputra (G-B) delta. Over the last two decades, understanding near-continuous records of sedimentary sequences has required the use of core-scanning X-ray fluorescence (XRF) spectrometry, for both terrestrial and marine sedimentary sequences. Initial XRF data are generally unusable in 'raw' format, requiring data processing in order to remove instrument bias, as well as informed sequence interpretation. The applicability of conventional calibration equations to core-scanning XRF data is further limited by the constraints posed by unknown measurement geometry and specimen homogeneity, as well as matrix effects. Log-ratio-based calibration schemes have been developed and applied to clastic sedimentary sequences, focusing mainly on energy dispersive XRF (ED-XRF) core-scanning. This study has applied high-resolution core-scanning XRF to Holocene sedimentary sequences from the tide-dominated Indian Sundarbans (Ganges-Brahmaputra delta plain). The Log-Ratio Calibration Equation (LRCE) was applied to a subset of core-scan and conventional ED-XRF data to quantify elemental composition. This provides a robust calibration scheme using reduced major axis regression of log-ratio transformed geochemical data. Through partial least squares (PLS) modelling of geochemical and grain-size data, it is possible to derive robust proxy information for the Sundarbans depositional environment. The application of these techniques to Holocene sedimentary data offers an improved methodological framework for unravelling Holocene sedimentation patterns.
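A generic sketch of a log-ratio calibration fitted by reduced major axis (RMA) regression, in the spirit of the LRCE described above; the element names, synthetic data, and function signature are illustrative assumptions, not the study's implementation:

```python
import numpy as np

def rma_calibration(scanner_counts, reference_conc, element, reference):
    """Fit ln(C_e/C_ref) = a + b * ln(I_e/I_ref) by reduced major axis regression.
    Inputs are dicts of 1-D arrays keyed by element (names are assumptions)."""
    x = np.log(scanner_counts[element] / scanner_counts[reference])
    y = np.log(reference_conc[element] / reference_conc[reference])
    slope = np.std(y, ddof=1) / np.std(x, ddof=1)
    slope *= np.sign(np.corrcoef(x, y)[0, 1])   # RMA slope carries the sign of r
    intercept = y.mean() - slope * x.mean()
    return slope, intercept

# Hypothetical paired core-scan counts and conventional ED-XRF concentrations
rng = np.random.default_rng(0)
counts = {"Ca": rng.uniform(5e3, 2e4, 30), "Ti": rng.uniform(1e3, 4e3, 30)}
conc = {"Ca": counts["Ca"] * 1e-3 * rng.normal(1, 0.05, 30),
        "Ti": counts["Ti"] * 2e-3 * rng.normal(1, 0.05, 30)}
b, a = rma_calibration(counts, conc, element="Ca", reference="Ti")
print(round(b, 3), round(a, 3))
```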
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-08
Abstract:
Recent advances in the massively parallel computational abilities of graphical processing units (GPUs) have increased their use for general purpose computation, as companies look to take advantage of big data processing techniques. This has given rise to the potential for malicious software targeting GPUs, which is of interest to forensic investigators examining the operation of software. The ability to carry out reverse-engineering of software is of great importance within the security and forensics fields, particularly when investigating malicious software or carrying out forensic analysis following a successful security breach. Due to the complexity of the Nvidia CUDA (Compute Unified Device Architecture) framework, it is not clear how best to approach the reverse engineering of a piece of CUDA software. We carry out a review of the different binary output formats which may be encountered from the CUDA compiler, and their implications on reverse engineering. We then demonstrate the process of carrying out disassembly of an example CUDA application, to establish the various techniques available to forensic investigators carrying out black-box disassembly and reverse engineering of CUDA binaries. We show that the Nvidia compiler, using default settings, leaks useful information. Finally, we demonstrate techniques to better protect intellectual property in CUDA algorithm implementations from reverse engineering.