11 results for multi-source noise
in Digital Commons at Florida International University
Abstract:
To carry out their specific roles in the cell, genes and gene products often work together in groups, forming many relationships among themselves and with other molecules. Such relationships include physical protein-protein interaction relationships, regulatory relationships, metabolic relationships, genetic relationships, and many more. With advances in science and technology, high-throughput technologies have been developed to simultaneously detect tens of thousands of pairwise protein-protein interactions and protein-DNA interactions. However, the data generated by high-throughput methods are prone to noise. Furthermore, the technology itself has its limitations and cannot detect all kinds of relationships between genes and their products. Thus there is a pressing need to investigate all kinds of relationships and their roles in a living system using bioinformatic approaches; this is a central challenge in Computational Biology and Systems Biology. This dissertation focuses on exploring relationships between genes and gene products using bioinformatic approaches. Specifically, we consider problems related to regulatory relationships, protein-protein interactions, and semantic relationships between genes. A regulatory element is an important pattern or "signal", often located in the promoter of a gene, which is used in the process of turning a gene "on" or "off". Predicting regulatory elements is a key step in exploring the regulatory relationships between genes and gene products. In this dissertation, we consider the problem of improving the prediction of regulatory elements by using comparative genomics data. With regard to protein-protein interactions, we have developed bioinformatics techniques to estimate support for the data on these interactions.
While protein-protein interactions and regulatory relationships can be detected by high-throughput biological techniques, there is another type of relationship, called a semantic relationship, that cannot be detected by a single technique but can be inferred using multiple sources of biological data. The contributions of this thesis involved the development and application of a set of bioinformatic approaches that address the challenges mentioned above. These included (i) an EM-based algorithm that improves the prediction of regulatory elements using comparative genomics data, (ii) an approach for estimating the support of protein-protein interaction data, with application to functional annotation of genes, (iii) a novel method for inferring functional networks of genes, and (iv) techniques for clustering genes using multi-source data.
Abstract:
The primary aim of this dissertation is to develop data mining tools for knowledge discovery in biomedical data when multiple (homogeneous or heterogeneous) sources of data are available. The central hypothesis is that, when information from multiple sources of data is used appropriately and effectively, knowledge discovery can be better achieved than what is possible from only a single source. Recent advances in high-throughput technology have enabled biomedical researchers to generate large volumes of diverse types of data on a genome-wide scale. These data include DNA sequences, gene expression measurements, and much more; they provide the motivation for building analysis tools to elucidate the modular organization of the cell. The challenges include efficiently and accurately extracting information from the multiple data sources, representing the information effectively, developing analytical tools, and interpreting the results in the context of the domain. The first part considers the application of feature-level integration to design classifiers that discriminate between soil types. The machine learning tools SVM and KNN were used to successfully distinguish between several soil samples. The second part considers clustering using multiple heterogeneous data sources. The resulting Multi-Source Clustering (MSC) algorithm was shown to perform better than clustering methods that use only a single data source or a simple feature-level integration of heterogeneous data sources. The third part proposes a new approach to effectively incorporate incomplete data into clustering analysis. Adapted from the K-means algorithm, the Generalized Constrained Clustering (GCC) algorithm makes use of incomplete data in the form of constraints to perform exploratory analysis. Novel approaches for extracting constraints were proposed. For sufficiently large constraint sets, the GCC algorithm outperformed the MSC algorithm.
The last part considers the problem of providing a theme-specific environment for mining multi-source biomedical data. The database called PlasmoTFBM, focusing on gene regulation of Plasmodium falciparum, contains diverse information and has a simple interface to allow biologists to explore the data. It provided a framework for comparing different analytical tools for predicting regulatory elements and for designing useful data mining tools. The conclusion is that the experiments reported in this dissertation strongly support the central hypothesis.
Abstract:
Climate warming is predicted to cause an increase in the growing season by as much as 30% for regions of the arctic tundra. This will have a significant effect on the physiological activity of the vascular plant species and the ecosystem as a whole. The need to understand the possible physiological change within this ecosystem is confounded by the fact that research in this extreme environment has been limited to periods when conditions are most favorable, mid June–mid August. This study attempted to develop the most comprehensive understanding to date of the physiological activity of seven tundra plant species in the Alaskan Arctic under natural and lengthened growing season conditions. Four interrelated lines of research, scaling from cellular signals to ecosystem processes, set the foundation for this study. I established an experiment looking at the physiological response of arctic sedges to soil temperature stress with emphasis on the role of the hormone abscisic acid (ABA). A manipulation was also developed where the growing season was lengthened and soils were warmed in an attempt to determine the maximum physiological capacity of these seven vascular species. Additionally, the physiological capacities of four evergreens were tested in the subnivean environment along with the potential role anthocyanins play in their activity. The measurements were scaled up to determine the physiological role of these evergreens in maintaining ecosystem carbon fluxes. These studies determined that soil temperature differentials significantly affect vascular plant physiology. ABA appears to be a physiological modifier that limits stomatal processes when root temperatures are low. Photosynthetic capacity was limited by internal plant physiological mechanisms in the face of a lengthened growing season. Therefore, shifts in ecosystem carbon dynamics are driven by changes in species composition and biomass production on a per-unit-area basis.
These studies also found that changes in soil temperatures will have a greater effect on physiological processes than would the same magnitude of change in air temperature. The subnivean environment exhibits conditions that are favorable for photosynthetic activity in evergreen species. These measurements, when scaled to the ecosystem, have a significant role in limiting the system's carbon source capacity.
Abstract:
The Internet has become an integral part of our nation’s critical socio-economic infrastructure. With its heightened use and growing complexity, however, organizations are at greater risk of cyber crimes. To aid in the investigation of crimes committed on or via the Internet, a network forensics analysis tool pulls together needed digital evidence. It provides a platform for performing deep network analysis by capturing, recording and analyzing network events to find out the source of a security attack or other information security incidents. Existing network forensics work has been mostly focused on the Internet and fixed networks. But the exponential growth and use of wireless technologies, coupled with their unprecedented characteristics, necessitates the development of new network forensic analysis tools. This dissertation fostered the emergence of a new research field in cellular and ad-hoc network forensics. It was one of the first works to identify this problem and offer fundamental techniques and tools that laid the groundwork for future research. In particular, it introduced novel methods to record network incidents and report logged incidents. For recording incidents, location is considered essential to documenting network incidents. However, in network topology spaces, location cannot be measured due to the absence of a ‘distance metric’. Therefore, a novel solution was proposed to label locations of nodes within network topology spaces, and then to authenticate the identity of nodes in ad hoc environments. For reporting logged incidents, a novel technique based on Distributed Hash Tables (DHT) was adopted. Although the direct use of DHTs for reporting logged incidents would result in uncontrollably recursive traffic, a new mechanism was introduced that overcame this recursive process. These logging and reporting techniques aided forensics over cellular and ad-hoc networks, which in turn increased their ability to track and trace attacks to their source. 
These techniques were a starting point for further research and development that would result in equipping future ad hoc networks with forensic components to complement existing security mechanisms.
Abstract:
Dissolved organic matter (DOM) in groundwater and surface water samples from the Florida coastal Everglades was studied using excitation–emission matrix fluorescence modeled through parallel factor analysis (EEM-PARAFAC). DOM in both surface and groundwater from the eastern Everglades S332 basin reflected a terrestrial-derived fingerprint through dominantly higher abundances of humic-like PARAFAC components. In contrast, surface water DOM from northeastern Florida Bay featured a microbial-derived DOM signature based on the higher abundance of microbial humic-like and protein-like components, consistent with its marine source. Surprisingly, groundwater DOM from northeastern Florida Bay reflected a terrestrial-derived source, except for samples from the central Florida Bay well, which mirrored a combination of terrestrial and marine end-member origin. Furthermore, surface water and groundwater displayed effects of different degradation pathways such as photodegradation and biodegradation, as exemplified by two PARAFAC components seemingly indicative of such degradation processes. Finally, Principal Component Analysis of the EEM-PARAFAC data was able to distinguish and classify most of the samples according to DOM origins and degradation processes experienced, except for a small overlap of S332 surface water and groundwater, implying rather active surface-to-ground water interaction in some sites, particularly during the rainy season. This study highlights that EEM-PARAFAC could be used successfully to trace and differentiate DOM from diverse sources across both horizontal and vertical flow profiles, and as such could be a convenient and useful tool for the better understanding of hydrological interactions and carbon biogeochemical cycling.
Abstract:
The Everglades freshwater marl prairie is a dynamic and spatially heterogeneous landscape, containing thousands of tree islands nested within a marsh matrix. Spatial processes underlie population and community dynamics across the mosaic, especially the balance between woody and graminoid components, and landscape patterns reflect interactions among multiple biotic and abiotic drivers. To better understand these complex, multi-scaled relationships, we employed a three-tiered hierarchical design to investigate the effects of seed source, hydrology, and, more indirectly, fire on the establishment of new woody recruits in the marsh, and to assess current tree island patterning across the landscape. Our analyses were conducted at the ground level at two scales, which we term the micro- and meso-scapes, and results were related to remotely detected tree island distributions assessed in the broader landscape, that is, the macro-scape. Seed source and hydrologic effects on recruitment in the micro- and meso-scapes were analyzed via logistic regression, and spatial aggregation in the macro-scape was evaluated using a grid-based univariate O-ring function. Results varied among regions and scales, but several general trends were observed. The patterning of adult populations was the strongest driver of recruitment in the micro- and meso-scape prairies, with recruits frequently aggregating around adults or tree islands. However, in the macro-scape, biologically associated (second order) aggregation was rare, suggesting that emergent woody patches are heavily controlled by underlying physical and environmental factors such as topography, hydrology, and fire.
Abstract:
The assessment of organic matter (OM) sources in sediments and soils is key to better understanding the biogeochemical cycling of carbon in aquatic environments. While traditional molecular marker-based methods have provided such information for typical two end-member (allochthonous/terrestrial vs. autochthonous/microbial)-dominated systems, more detailed, biomass-specific assessments are needed for ecosystems with complex OM inputs, such as tropical and sub-tropical wetlands and estuaries where aquatic macrophytes and macroalgae may play an important role as OM sources. The aim of this study was to assess the utility of a combined approach using compound-specific stable carbon isotope analysis and an n-alkane based proxy (Paq) to differentiate submerged and emergent/terrestrial vegetation OM inputs to soils/sediments from a sub-tropical wetland and estuarine system, the Florida Coastal Everglades. Results show that Paq values (0.13–0.51) for the emergent/terrestrial plants were generally lower than those for freshwater/marine submerged vegetation (0.45–1.00) and that compound-specific δ13C values for the n-alkanes (C23 to C31) were distinctively different for terrestrial/emergent and freshwater/marine submerged plants. While crossplots of the Paq and n-alkane stable isotope values for the C23 n-alkane suggest that OM inputs are controlled by vegetation changes along the freshwater to marine transect, further resolution regarding OM input changes along this landscape was obtained through principal component analysis (PCA), successfully grouping the study sites according to the OM source strengths. The data show the potential for this n-alkane based multi-proxy approach as a means of assessing OM inputs to complex ecosystems.
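The abstract above reports Paq ranges but does not spell out how the proxy is computed. The sketch below assumes the widely used definition, Paq = (C23 + C25) / (C23 + C25 + C29 + C31), applied to n-alkane abundances; whether the study used exactly this form is an assumption, and the sample abundances are hypothetical. The two ranges are the ones quoted in the abstract.

```python
# Paq ranges quoted in the abstract above.
EMERGENT_TERRESTRIAL_RANGE = (0.13, 0.51)
SUBMERGED_RANGE = (0.45, 1.00)


def paq(abundances):
    """Return Paq = (C23 + C25) / (C23 + C25 + C29 + C31).

    `abundances` maps n-alkane labels ("C23", "C25", "C29", "C31") to
    abundances in any consistent unit. The formula is the commonly used
    definition and is assumed here, not taken from the abstract.
    """
    c23, c25, c29, c31 = (abundances[k] for k in ("C23", "C25", "C29", "C31"))
    total = c23 + c25 + c29 + c31
    if total <= 0:
        raise ValueError("total n-alkane abundance must be positive")
    return (c23 + c25) / total


def consistent_groups(value):
    """List the reported vegetation groups whose Paq range contains `value`."""
    groups = []
    lo, hi = EMERGENT_TERRESTRIAL_RANGE
    if lo <= value <= hi:
        groups.append("emergent/terrestrial")
    lo, hi = SUBMERGED_RANGE
    if lo <= value <= hi:
        groups.append("submerged")
    return groups


# Hypothetical abundances, for illustration only.
sample = {"C23": 4.0, "C25": 6.0, "C29": 20.0, "C31": 10.0}
value = paq(sample)
print(value, consistent_groups(value))
```

Because the two reported ranges overlap between 0.45 and 0.51, a value in that interval is consistent with both groups, which is one reason the abstract pairs Paq with isotope data rather than relying on it alone.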
Abstract:
Experimental and theoretical studies regarding noise processes in various kinds of AlGaAs/GaAs heterostructures with a quantum well are reported. The measurement processes, involving a Fast Fourier Transform and analog wave analyzer in the frequency range from 10 Hz to 1 MHz, a computerized data storage and processing system, and a cryostat in the temperature range from 78 K to 300 K, are described in detail. The current noise spectra are obtained with the “three-point method”, using a Quan-Tech and avalanche noise source for calibration. The properties of both GaAs and AlGaAs materials and field effect transistors, based on the two-dimensional electron gas in the interface quantum well, are discussed. Extensive measurements are performed in three types of heterostructures, viz., Hall structures with a large spacer layer, modulation-doped non-gated FETs, and more standard gated FETs; all structures are grown by MBE techniques. The Hall structures show Lorentzian generation-recombination noise spectra with nearly temperature-independent relaxation times. This noise is attributed to g-r processes in the 2D electron gas. For the TEGFET structures, we observe several Lorentzian g-r noise components which have strongly temperature-dependent relaxation times. This noise is attributed to trapping processes in the doped AlGaAs layer. The trap level energies are determined from an Arrhenius plot of log(τT²) versus 1/T as well as from the plateau values. The theory to interpret these measurements and to extract the defect level data is reviewed and further developed. Good agreement with the data is found for all reported devices.
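The Arrhenius analysis mentioned above can be illustrated with a small fit. The sketch assumes the standard thermally activated form τT² = C·exp(E/kT), so that ln(τT²) is linear in 1/T with slope E/k; the abstract does not give the exact model, and the temperatures, prefactor, and trap energy below are synthetic values chosen only to show that the fit recovers the energy it was given.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def trap_energy_from_arrhenius(temps_k, taus_s):
    """Least-squares line through ln(tau * T^2) versus 1/T.

    Returns (energy_ev, intercept), where energy_ev = slope * k_B.
    Assumes tau * T^2 = C * exp(E / (k_B * T)); this model is an assumption.
    """
    if len(temps_k) != len(taus_s) or len(temps_k) < 2:
        raise ValueError("need at least two (T, tau) pairs of equal length")
    xs = [1.0 / t for t in temps_k]
    ys = [math.log(tau * t * t) for t, tau in zip(temps_k, taus_s)]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope * K_B_EV, intercept


def synthetic_taus(temps_k, energy_ev, prefactor):
    """Generate noise-free relaxation times from the assumed model."""
    return [prefactor * math.exp(energy_ev / (K_B_EV * t)) / (t * t)
            for t in temps_k]


# Synthetic illustration: temperatures within the 78 K to 300 K range above.
temps = [100.0, 150.0, 200.0, 250.0, 300.0]
taus = synthetic_taus(temps, energy_ev=0.40, prefactor=1e-7)
energy, _ = trap_energy_from_arrhenius(temps, taus)
print(f"fitted trap energy: {energy:.3f} eV")
```

With measured data, each Lorentzian component contributes its own series of (T, τ) pairs, and the fit is repeated per component.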
Abstract:
Electronic noise has been investigated in AlxGa1−xN/GaN Modulation-Doped Field Effect Transistors (MODFETs) of submicron dimensions, grown for us by MBE (Molecular Beam Epitaxy) techniques at Virginia Commonwealth University by Dr. H. Morkoç and coworkers. Some 20 devices were grown on a GaN substrate, four of which have leads bonded to source (S), drain (D), and gate (G) pads, respectively. Conduction takes place in the quasi-2D layer of the junction (xy plane) which is perpendicular to the quantum well (z-direction) of average triangular width ∼3 nm. A non-doped intrinsic buffer layer of ∼5 nm separates the Si-doped donors in the AlxGa1−xN layer from the 2D-transistor plane, which affords a very high electron mobility, thus enabling high-speed devices. Since all contacts (S, D, and G) must reach through the AlxGa1−xN layer to connect internally to the 2D plane, parallel conduction through this layer is a feature of all modulation-doped devices. While the shunting effect may account for no more than a few percent of the current IDS, it is responsible for most excess noise, over and above thermal noise of the device. The excess noise has been analyzed as a sum of Lorentzian spectra and 1/f noise. The Lorentzian noise has been ascribed to trapping of the carriers in the AlxGa1−xN layer. A detailed, multitrapping generation-recombination noise theory is presented, which shows that an exponential relationship exists for the time constants obtained from the spectral components as a function of 1/kT. The trap depths have been obtained from Arrhenius plots of log(τT²) vs. 1000/T. Comparison with previous noise results for GaAs devices shows that: (a) many more trapping levels are present in these nitride-based devices; (b) the traps are deeper (farther below the conduction band) than for GaAs. 
Furthermore, the magnitude of the noise is strongly dependent on the level of depletion of the AlxGa1−xN donor layer, which can be altered by a negative or positive gate bias VGS. Altogether, these frontier nitride-based devices are promising for bluish light optoelectronic devices and lasers; however, the noise, though well understood, indicates that the purity of the constituent layers should be greatly improved for future technological applications.
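The decomposition of excess noise into Lorentzian spectra plus 1/f noise can be written down directly. The sketch below assumes each g-r component has the usual Lorentzian shape, plateau / (1 + (2πfτ)²), and that the flicker term is a simple coefficient divided by frequency; the parameter names and numeric values are hypothetical and are not taken from the dissertation.

```python
import math


def lorentzian(freq_hz, plateau, tau_s):
    """Single g-r component: flat at `plateau` for low f, rolling off
    as 1/f^2 above the corner frequency 1 / (2 * pi * tau)."""
    return plateau / (1.0 + (2.0 * math.pi * freq_hz * tau_s) ** 2)


def excess_noise(freq_hz, components, flicker_coeff):
    """Sum of Lorentzian components plus a 1/f term.

    `components` is a list of (plateau, tau_s) pairs; `flicker_coeff` is the
    1/f magnitude at 1 Hz. Both are illustrative parameters.
    """
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    gr_part = sum(lorentzian(freq_hz, p, t) for p, t in components)
    return gr_part + flicker_coeff / freq_hz


# Hypothetical two-trap example evaluated at a few frequencies.
traps = [(2.0e-18, 1.0e-3), (5.0e-19, 1.0e-5)]
for f in (10.0, 1.0e3, 1.0e5):
    print(f, excess_noise(f, traps, flicker_coeff=1.0e-17))
```

Fitting this form to a measured spectrum at each temperature yields the time constants τ that feed the Arrhenius plots described in the abstract.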
Abstract:
A pilot-scale multi-media filtration system was used to evaluate the effectiveness of filtration in removing petroleum hydrocarbons from a source water contaminated with diesel fuel. Source water was artificially prepared by mixing bentonite clay and tap water to produce a turbidity range of 10–15 NTU. Diesel fuel concentrations of 150 ppm or 750 ppm were used to contaminate the source water. The coagulants used included Cat Floc K-10 and Cat Floc T-2. The experimental phase was conducted under direct filtration conditions at constant head and constant rate filtration at 8.0 gpm. Filtration experiments were run until the filter reached its clogging point as noted by a measured peak pressure loss of 10 psi. The experimental variables include type of coagulant, oil concentration and source water. Filtration results were evaluated based on turbidity removal and petroleum hydrocarbon (PHC) removal efficiency as measured by gas chromatography. Experiments indicated that clogging was controlled by the clay loading on the filter and that inadequate destabilization of the contaminated water by the coagulant limited the PHC removal.