28 results for data movement problem
in Digital Commons at Florida International University
Abstract:
Using near-infrared spectroscopy (NIRS) and modified versions of the Somanetics® optodes currently used with the INVOS oximeter, the optodes were made fairly functional not only across the forehead but also across the hairy regions of the scalp. A major problem arises in positioning these optodes on the patient's scalp and holding them in place while recording data. Another problem is the inconsistent repeatability of the trends displayed in the recorded data. A method was developed to facilitate easy placement of these optodes on the patient's scalp while keeping the patient's comfort in mind. The sensitivity of the optodes was also improved by incorporating more refined techniques for manufacturing the fiber optic brushes and fixing them to the optode transmitting and receiving windows. The modified and improved optodes, in both single and multiplexed modes, were subjected to various tests on different areas of the brain to determine their efficiency and functionality.
Abstract:
The purpose of this study was to examine the perspectives of three graduates of a problem-based learning (PBL) physical therapy (PT) program about their clinical practice. The study used the qualitative methods of observations, interviews, and journaling to gather the data. Three sessions of audiotaped interviews and two observation sessions were conducted with three exemplars from the Nova Southeastern University PBL PT program. Each participant also maintained a reflective journal. The data were analyzed using content analysis. A systematic filing system was used, employing a mechanical means of maintaining and indexing coded data and sorting data into coded classifications of subtopics or themes. All interview transcripts, field notes from observations, and journal accounts were read, and index sheets were appropriately annotated. From the findings of the study, it was noted that, from the participants' perspectives, they were practicing at typically expected levels as clinicians. The attributes that governed the perspectives of the participants about their physical therapy clinical practice included flexibility, reflection, analysis, decision-making, self-reliance, problem-solving, independent thinking, and critical thinking. Further, the findings indicated that the factors that influenced those attributes included the PBL process, parents' value system, self-reliant personality, innate personality traits, and deliberate choice. Finally, the findings indicated that the participants' perspectives, for the most part, appeared to support the espoused efficacy of the PBL educational approach. In conclusion, there is evidence that the physical therapy clinical practice of the participants was positively impacted by the PBL curriculum. Among the many attributes they noted as governing these perspectives, problem-solving, as postulated by Barrows, was one of the most frequently mentioned benefits gained from their PBL PT training.
With more schools adopting the PBL approach, this research will hopefully add to the knowledge base regarding the efficacy of embracing a problem-based learning instructional approach in physical therapy programs.
Abstract:
This dissertation develops a new mathematical approach that overcomes the effect of a data processing phenomenon known as “histogram binning” inherent to flow cytometry data. A real-time procedure is introduced to demonstrate the effectiveness and fast implementation of such an approach on real-world data. The histogram binning effect is a dilemma posed by two seemingly antagonistic developments: (1) flow cytometry data in its histogram form is extended in its dynamic range to improve its analysis and interpretation, and (2) the inevitable dynamic range extension introduces an unwelcome side effect, the binning effect, which skews the statistics of the data and consequently undermines the accuracy of the analysis and the eventual interpretation of the data. Researchers in the field have contended with this dilemma for many years, resorting either to hardware approaches that are rather costly and carry inherent calibration and noise effects, or to software techniques that filter the binning effect but fail to preserve the statistical content of the original data. The mathematical approach introduced in this dissertation is so appealing that a patent application has been filed. The contribution of this dissertation is an incremental scientific innovation based on a mathematical framework that will allow researchers in the field of flow cytometry to improve the interpretation of data knowing that its statistical meaning has been faithfully preserved for its optimized analysis. Furthermore, with the same mathematical foundation, proof of the origin of such an inherent artifact is provided. These results are unique in that new mathematical derivations are established to define and solve the critical problem of the binning effect faced at the experimental assessment level, providing a data platform that preserves its statistical content. In addition, a novel method for accumulating the log-transformed data was developed.
This new method uses the properties of the transformation of statistical distributions to accumulate the output histogram in a non-integer and multi-channel fashion. Although the mathematics of this new mapping technique seem intricate, the concise nature of the derivations allows for a procedure that lends itself to real-time implementation using lookup tables, a task that is also introduced in this dissertation.
Abstract:
The primary aim of this dissertation is to develop data mining tools for knowledge discovery in biomedical data when multiple (homogeneous or heterogeneous) sources of data are available. The central hypothesis is that, when information from multiple sources of data is used appropriately and effectively, knowledge discovery can be better achieved than is possible from only a single source. Recent advances in high-throughput technology have enabled biomedical researchers to generate large volumes of diverse types of data on a genome-wide scale. These data include DNA sequences, gene expression measurements, and much more; they provide the motivation for building analysis tools to elucidate the modular organization of the cell. The challenges include efficiently and accurately extracting information from the multiple data sources, representing the information effectively, developing analytical tools, and interpreting the results in the context of the domain. The first part considers the application of feature-level integration to design classifiers that discriminate between soil types. The machine learning tools SVM and KNN were used to successfully distinguish between several soil samples. The second part considers clustering using multiple heterogeneous data sources. The resulting Multi-Source Clustering (MSC) algorithm was shown to perform better than clustering methods that use only a single data source or a simple feature-level integration of heterogeneous data sources. The third part proposes a new approach to effectively incorporate incomplete data into clustering analysis. Adapted from the K-means algorithm, the Generalized Constrained Clustering (GCC) algorithm makes use of incomplete data in the form of constraints to perform exploratory analysis. Novel approaches for extracting constraints were proposed. For sufficiently large constraint sets, the GCC algorithm outperformed the MSC algorithm.
The last part considers the problem of providing a theme-specific environment for mining multi-source biomedical data. The database called PlasmoTFBM, focusing on gene regulation of Plasmodium falciparum, contains diverse information and has a simple interface to allow biologists to explore the data. It provided a framework for comparing different analytical tools for predicting regulatory elements and for designing useful data mining tools. The conclusion is that the experiments reported in this dissertation strongly support the central hypothesis.
Abstract:
Microarray technology provides a high-throughput technique to study gene expression. Microarrays can help us diagnose different types of cancers, understand biological processes, assess host responses to drugs and pathogens, find markers for specific diseases, and much more. Microarray experiments generate large amounts of data. Thus, effective data processing and analysis are critical for making reliable inferences from the data. The first part of the dissertation addresses the problem of finding an optimal set of genes (biomarkers) to classify a set of samples as diseased or normal. Three statistical gene selection methods (GS, GS-NR, and GS-PCA) were developed to identify a set of genes that best differentiate between samples. A comparative study on different classification tools was performed and the best combinations of gene selection and classifiers for multi-class cancer classification were identified. For most of the benchmark cancer data sets, the gene selection method proposed in this dissertation, GS, outperformed other gene selection methods. The classifiers based on Random Forests, neural network ensembles, and K-nearest neighbor (KNN) showed consistently good performance. A striking commonality among these classifiers is that they all use a committee-based approach, suggesting that ensemble classification methods are superior. The same biological problem may be studied at different research labs and/or performed using different lab protocols or samples. In such situations, it is important to combine results from these efforts. The second part of the dissertation addresses the problem of pooling the results from different independent experiments to obtain improved results. Four statistical pooling techniques (Fisher inverse chi-square method, Logit method, Stouffer's Z transform method, and Liptak-Stouffer weighted Z-method) were investigated in this dissertation.
These pooling techniques were applied to the problem of identifying cell cycle-regulated genes in two different yeast species. As a result, improved sets of cell cycle-regulated genes were identified. The last part of the dissertation explores the effectiveness of wavelet data transforms for the task of clustering. Discrete wavelet transforms, with an appropriate choice of wavelet bases, were shown to be effective in producing clusters that were biologically more meaningful.
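As an illustration of the pooling idea named in this abstract, the following minimal sketch combines p-values from independent experiments using the textbook forms of Fisher's inverse chi-square method and Stouffer's Z method (with optional weights, the Liptak-Stouffer form). The function names and the example p-values are assumptions made for illustration; the abstract does not describe the dissertation's actual implementation or data.

```python
import math
from statistics import NormalDist


def fisher_combined(p_values):
    """Fisher's method: X = -2 * sum(ln p) follows chi-square with 2k d.o.f.

    Each p-value must lie strictly between 0 and 1.
    """
    k = len(p_values)
    half = -sum(math.log(p) for p in p_values)  # this is X / 2
    # Closed-form chi-square survival function for an even number (2k) of d.o.f.:
    # exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


def stouffer_combined(p_values, weights=None):
    """Stouffer's Z method; passing weights gives the Liptak-Stouffer weighted form."""
    nd = NormalDist()
    if weights is None:
        weights = [1.0] * len(p_values)
    z_scores = [nd.inv_cdf(1.0 - p) for p in p_values]  # one-sided p -> Z
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    z_combined = numerator / math.sqrt(sum(w * w for w in weights))
    return 1.0 - nd.cdf(z_combined)


if __name__ == "__main__":
    # Hypothetical p-values for one gene from three independent experiments.
    p_vals = [0.04, 0.10, 0.03]
    print(fisher_combined(p_vals))
    print(stouffer_combined(p_vals, weights=[2.0, 1.0, 1.0]))
```

Both functions return a single pooled p-value; several moderately small p-values can pool to a value smaller than any one of them, which is the sense in which pooling independent experiments can improve results.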
Abstract:
Groundwater systems of different densities are often mathematically modeled to understand and predict environmental behavior such as seawater intrusion or submarine groundwater discharge. Additional data collection may be justified if it will cost-effectively aid in reducing the uncertainty of a model's prediction. The collection of salinity as well as temperature data could aid in reducing predictive uncertainty in a variable-density model. However, before numerical models can be created, rigorous testing of the modeling code needs to be completed. This research documents the benchmark testing of a new modeling code, SEAWAT Version 4. The benchmark problems include various combinations of density-dependent flow resulting from variations in concentration and temperature. The verified code, SEAWAT, was then applied to two different hydrological analyses to explore the capacity of a variable-density model to guide data collection. The first analysis tested a linear method to guide data collection by quantifying the contribution of different data types and locations toward reducing predictive uncertainty in a nonlinear variable-density flow and transport model. The relative contributions of temperature and concentration measurements, at different locations within a simulated carbonate platform, for predicting movement of the saltwater interface were assessed. Results from the method showed that concentration data had greater worth than temperature data in reducing predictive uncertainty in this case. Results also indicated that a linear method could be used to quantify data worth in a nonlinear model. The second hydrological analysis utilized a model to identify the transient response of the salinity, temperature, age, and amount of submarine groundwater discharge to changes in tidal ocean stage, seasonal temperature variations, and different types of geology.
The model was compared to multiple kinds of data to (1) calibrate and verify the model, and (2) explore the potential for the model to be used to guide the collection of data using techniques such as electromagnetic resistivity, thermal imagery, and seepage meters. Results indicated that the model can be used to give insight into submarine groundwater discharge and to guide data collection.
Abstract:
Graph-structured databases are widely prevalent, and the problem of effective search and retrieval from such graphs has been receiving much attention recently. For example, the Web can be naturally viewed as a graph. Likewise, a relational database can be viewed as a graph where tuples are modeled as vertices connected via foreign-key relationships. Keyword search querying has emerged as one of the most effective paradigms for information discovery, especially over HTML documents in the World Wide Web. One of the key advantages of keyword search querying is its simplicity: users do not have to learn a complex query language, and can issue queries without any prior knowledge about the structure of the underlying data. The purpose of this dissertation was to develop techniques for user-friendly, high-quality, and efficient searching of graph-structured databases. Several ranked search methods on data graphs have been studied in recent years. Given a top-k keyword search query on a graph and some ranking criteria, a keyword proximity search finds the top-k answers, where each answer is a substructure of the graph containing all query keywords and illustrating the relationships among the keywords present in the graph. We applied keyword proximity search on the web and the page graph of web documents to find top-k answers that satisfy users' information needs and increase user satisfaction. Another effective ranking mechanism applied on data graphs is the authority-flow based ranking mechanism. Given a top-k keyword search query on a graph, an authority-flow based search finds the top-k answers, where each answer is a node in the graph ranked according to its relevance and importance to the query. We developed techniques that improved the authority-flow based search on data graphs by creating a framework to explain and reformulate them, taking into consideration user preferences and feedback.
We also applied the proposed graph search techniques for Information Discovery over biological databases. Our algorithms were experimentally evaluated for performance and quality. The quality of our method was compared to current approaches by using user surveys.
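To make the authority-flow idea more concrete, here is a minimal sketch of one common way to realize it: a personalized PageRank-style iteration in which authority is injected at the nodes that match the query keywords (the "base set") and then flows along edges. This formulation, the function name `authority_flow_rank`, the damping value, and the toy graph are all assumptions for illustration; the abstract does not state which exact formulation the dissertation uses.

```python
def authority_flow_rank(edges, base_set, damping=0.85, iterations=50):
    """Rank nodes by authority flowing out of the keyword-matching base set.

    edges: iterable of (source, target) pairs; base_set: nodes matching the query.
    Returns a dict mapping each node to a score; scores sum to 1.
    """
    nodes = sorted({n for edge in edges for n in edge} | set(base_set))
    out_links = {n: [] for n in nodes}
    for source, target in edges:
        out_links[source].append(target)

    # Random jumps land only on base-set nodes (the query-specific part).
    jump = {n: (1.0 / len(base_set) if n in base_set else 0.0) for n in nodes}
    score = dict(jump)

    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * jump[n] for n in nodes}
        for source in nodes:
            targets = out_links[source]
            if targets:
                share = damping * score[source] / len(targets)
                for target in targets:
                    nxt[target] += share
            else:
                # Dangling node: hand its authority back to the base set.
                for n in nodes:
                    nxt[n] += damping * score[source] * jump[n]
        score = nxt
    return score


if __name__ == "__main__":
    # Hypothetical graph: node "a" matches the query keyword.
    toy_edges = [("a", "b"), ("b", "c"), ("d", "a")]
    scores = authority_flow_rank(toy_edges, base_set={"a"})
    top_k = sorted(scores, key=scores.get, reverse=True)[:2]
    print(top_k, scores)
```

A top-k answer list is then simply the k highest-scoring nodes. Nodes that cannot be reached from the base set, such as "d" in the toy graph, receive no authority.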
Abstract:
Research has identified a number of putative risk factors that place adolescents at incrementally higher risk for involvement in alcohol and other drug (AOD) use and sexual risk behaviors (SRBs). Such factors include personality characteristics such as sensation-seeking, cognitive factors such as positive expectancies and inhibition conflict, as well as peer norm processes. The current study was guided by a conceptual perspective that supports the notion that an integrative framework that includes multi-level factors has significant explanatory value for understanding processes associated with the co-occurrence of AOD use and sexual risk behavior outcomes. This study evaluated simultaneously the mediating role of AOD-sex related expectancies and inhibition conflict on antecedents of AOD use and SRBs including sexual sensation-seeking and peer norms for condom use. The sample was drawn from the Enhancing My Personal Options While Evaluating Risk (EMPOWER: Jonathan Tubman, PI) data set (N = 396; aged 12-18 years). Measures used in the study included the Sexual Sensation-Seeking Scale, Inhibition Conflict for Condom Use, and the Risky Sex Scale. All relevant measures had well-documented psychometric properties. A global assessment of alcohol, drug use and sexual risk behaviors was used. Results demonstrated that AOD-sex related expectancies mediated the influence of sexual sensation-seeking on the co-occurrence of alcohol and other drug use and sexual risk behaviors. The evaluation of the integrative model also revealed that sexual sensation-seeking was positively associated with peer norms for condom use. Also, peer norms predicted inhibition conflict among this sample of multi-problem youth. This dissertation research identified mechanisms of risk and protection associated with the co-occurrence of AOD use and SRBs among a multi-problem sample of adolescents receiving treatment for alcohol or drug use and related problems.
This study is informative for adolescent-serving programs that address those individual and contextual characteristics that enhance treatment efficacy and effectiveness among adolescents receiving substance use and related problems services.
Abstract:
As massive data sets become increasingly available, people are facing the problem of how to effectively process and understand these data. Traditional sequential computing models are giving way to parallel and distributed computing models, such as MapReduce, due both to the large size of the data sets and to their high dimensionality. In line with other MapReduce-based research, this dissertation develops effective techniques and applications using MapReduce that can help people solve large-scale problems. Three different problems are tackled in the dissertation. The first deals with processing terabytes of raster data in a spatial data management system. Aerial imagery files are broken into tiles to enable data-parallel computation. The second and third problems deal with dimension reduction techniques that can be used to handle data sets of high dimensionality. Three variants of the nonnegative matrix factorization technique are scaled up to factorize matrices with dimensions on the order of millions in MapReduce, based on different matrix multiplication implementations. Two algorithms, which compute CANDECOMP/PARAFAC and Tucker tensor decompositions respectively, are parallelized in MapReduce by carefully partitioning the data and arranging the computation to maximize data locality and parallelism.
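Since this abstract builds its factorization work on matrix multiplication in MapReduce, the following single-process sketch shows how a sparse matrix product can be phrased as a map step that emits partial products keyed by output cell and a reduce step that sums them. It is only an illustration of the programming model under assumed names (`map_phase`, `reduce_phase`, `matmul`) and an assumed `{(row, col): value}` storage format; it is not the dissertation's implementation and does not use a real MapReduce framework.

```python
from collections import defaultdict


def map_phase(A, B):
    """Emit ((i, k), a_ij * b_jk) for every pair of entries sharing the index j.

    A and B are sparse matrices stored as {(row, col): value}.
    """
    b_by_row = defaultdict(list)
    for (j, k), b in B.items():
        b_by_row[j].append((k, b))
    for (i, j), a in A.items():
        for k, b in b_by_row[j]:
            yield (i, k), a * b


def reduce_phase(pairs):
    """Sum all partial products that share the same output cell (i, k)."""
    cells = defaultdict(float)
    for key, value in pairs:
        cells[key] += value
    return dict(cells)


def matmul(A, B):
    """Sparse product C = A x B via a map step followed by a reduce step."""
    return reduce_phase(map_phase(A, B))


if __name__ == "__main__":
    A = {(0, 0): 1, (0, 1): 2, (1, 1): 3}
    B = {(0, 0): 4, (1, 0): 5, (1, 1): 6}
    print(matmul(A, B))
```

In a distributed setting, the grouping by key that `reduce_phase` performs in memory would be handled by the framework's shuffle stage, and how the entries are partitioned across workers determines data locality.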
Abstract:
This dissertation develops a process improvement method for service operations based on the Theory of Constraints (TOC), a management philosophy that has been shown to be effective in manufacturing for decreasing WIP and improving throughput. While TOC has enjoyed much attention and success in the manufacturing arena, its application to services in general has been limited. The contribution to industry and knowledge is a method for improving global performance measures based on TOC principles. The method proposed in this dissertation will be tested using discrete event simulation based on the scenario of the service factory of airline turnaround operations. To evaluate the method, a simulation model of aircraft turn operations of a U.S. based carrier was made and validated using actual data from airline operations. The model was then adjusted to reflect an application of the Theory of Constraints for determining how to deploy the scarce resource of ramp workers. The results indicate that, given slight modifications to TOC terminology and the development of a method for constraint identification, the Theory of Constraints can be applied with success to services. Bottlenecks in services must be defined as those processes for which the process rates and amount of work remaining are such that completing the process will not be possible without an increase in the process rate. The bottleneck ratio is used to determine to what degree a process is a constraint. Simulation results also suggest that redefining performance measures to reflect a global business perspective of reducing costs related to specific flights versus the operational local optimum approach of turning all aircraft quickly results in significant savings to the company. 
Savings to the annual operating costs of the airline were simulated to equal 30% of possible current expenses for misconnecting passengers with a modest increase in utilization of the workers through a more efficient heuristic of deploying them to the highest priority tasks. This dissertation contributes to the literature on service operations by describing a dynamic, adaptive dispatch approach to manage service factory operations similar to airline turnaround operations using the management philosophy of the Theory of Constraints.
Abstract:
A method to estimate speed of free-ranging fishes using a passive sampling device is described and illustrated with data from the Everglades, U.S.A. Catch per unit effort (CPUE) from minnow traps embedded in drift fences was treated as an encounter rate and used to estimate speed, when combined with an independent estimate of density obtained by use of throw traps that enclose 1 m² of marsh habitat. Underwater video was used to evaluate capture efficiency and species-specific bias of minnow traps and two sampling studies were used to estimate trap saturation and diel-movement patterns; these results were used to optimize sampling and derive correction factors to adjust species-specific encounter rates for bias and capture efficiency. Sailfin mollies Poecilia latipinna displayed a high frequency of escape from traps, whereas eastern mosquitofish Gambusia holbrooki were most likely to avoid a trap once they encountered it; dollar sunfish Lepomis marginatus were least likely to avoid the trap once they encountered it or to escape once they were captured. Length of sampling and time of day affected CPUE; fishes generally had a very low retention rate over a 24 h sample time and only the Everglades pygmy sunfish Elassoma evergladei were commonly captured at night. Dispersal speed of fishes in the Florida Everglades, U.S.A., was shown to vary seasonally and among species, ranging from 0.05 to 0.15 m s⁻¹ for small poeciliids and fundulids to 0.1 to 1.8 m s⁻¹ for L. marginatus. Speed was generally highest late in the wet season and lowest in the dry season, possibly tied to dispersal behaviours linked to finding and remaining in dry-season refuges. These speed estimates can be used to estimate the diffusive movement rate, which is commonly employed in spatial ecological models.
Abstract:
Modern data centers host hundreds of thousands of servers to achieve economies of scale. Such a huge number of servers creates challenges for the data center network (DCN) to provide proportionally large bandwidth. In addition, the deployment of virtual machines (VMs) in data centers raises the requirements for efficient resource allocation and fine-grained resource sharing. Further, the large number of servers and switches in the data center consume significant amounts of energy. Even though servers have become more energy efficient with various energy saving techniques, the DCN still accounts for 20% to 50% of the energy consumed by the entire data center. The objective of this dissertation is to enhance DCN performance as well as its energy efficiency by conducting optimizations on both host and network sides. First, as the DCN demands huge bisection bandwidth to interconnect all the servers, we propose a parallel packet switch (PPS) architecture that directly processes variable-length packets without segmentation-and-reassembly (SAR). The proposed PPS achieves large bandwidth by combining switching capacities of multiple fabrics, and it further improves the switch throughput by avoiding padding bits in SAR. Second, since certain resource demands of the VM are bursty and stochastic in nature, to satisfy both deterministic and stochastic demands in VM placement, we propose the Max-Min Multidimensional Stochastic Bin Packing (M3SBP) algorithm. M3SBP calculates an equivalent deterministic value for the stochastic demands, and maximizes the minimum resource utilization ratio of each server. Third, to provide necessary traffic isolation for VMs that share the same physical network adapter, we propose the Flow-level Bandwidth Provisioning (FBP) algorithm. By reducing the flow scheduling problem to multiple stages of packet queuing problems, FBP guarantees the provisioned bandwidth and delay performance for each flow.
Finally, because DCNs are typically provisioned with full bisection bandwidth while DCN traffic fluctuates, we propose a joint host-network optimization scheme to enhance the energy efficiency of DCNs during off-peak traffic hours. The proposed scheme utilizes a unified representation method that converts the VM placement problem to a routing problem and employs depth-first and best-fit search to find efficient paths for flows.
Abstract:
The dissertation reports on two studies. The purpose of Study I was to develop and evaluate a measure of cognitive competence (the Critical Problem Solving Skills Scale – Qualitative Extension) using Relational Data Analysis (RDA) with a multi-ethnic, adolescent sample. My study builds on previous work that has been conducted to provide evidence for the reliability and validity of the RDA framework in evaluating youth development programs (Kurtines et al., 2008). Inter-coder percent agreement among the TOC and TCC coders for each of the category levels was moderate to high, with a range of .76 to .94. The Fleiss' kappa across all category levels was from substantial agreement to almost perfect agreement, with a range of .72 to .91. The correlation between the TOC and the TCC demonstrated medium to high correlation, with a range of r(40)=.68, p<.001 to r(40)=.79, p<.001. Study II reports an investigation of a positive youth development program using an Outcome Mediation Cascade (OMC) evaluation model, an integrated model for evaluating the empirical intersection between intervention and developmental processes. The Changing Lives Program (CLP) is a community supported positive youth development intervention implemented in a practice setting as a selective/indicated program for multi-ethnic, multi-problem at risk youth in urban alternative high schools in the Miami Dade County Public Schools (M-DCPS). The 259 participants for this study were drawn from the CLP's archival data file. The study used a structural equation modeling approach to construct and evaluate the hypothesized model. Findings indicated that the hypothesized model fit the data (χ2 (7) = 5.651, p = .83; RMSEA = .00; CFI = 1.00; WRMR = .319). 
My study built on previous research using the OMC evaluation model (Eichas, 2010), and the findings are consistent with the hypothesis that in addition to having effects on targeted positive outcomes, PYD interventions are likely to have progressive cascading effects on untargeted problem outcomes that operate through effects on positive outcomes.
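The inter-coder reliability figures in this abstract are reported as Fleiss' kappa. For readers unfamiliar with the statistic, the sketch below computes it from a table of category counts using the standard textbook definition. The function name and the example ratings are hypothetical; they are not data from the study.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of raters
    who assigned item i to category j. Every item must have the same number
    of raters (at least two), and more than one category must be in use.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])

    # Observed agreement: for each item, the share of rater pairs that agree.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement from the overall proportion of each category.
    proportions = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_expected = sum(p * p for p in proportions)

    return (p_observed - p_expected) / (1.0 - p_expected)


if __name__ == "__main__":
    # Hypothetical ratings: 4 responses, 3 coders, 2 categories.
    ratings = [[3, 0], [0, 3], [2, 1], [1, 2]]
    print(fleiss_kappa(ratings))
```

A value of 1 indicates perfect agreement among coders, and values near 0 indicate agreement no better than chance.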
Abstract:
Chromium (Cr) is a metal of particular environmental concern, owing to its toxicity and widespread occurrence in groundwater, soil, and soil solution. A combination of hydrological, geochemical, and microbiological processes governs the subsurface migration of Cr. Little effort has been devoted to examining how these biogeochemical reactions combine with hydrologic processes to influence Cr migration. This study has focused on the complex problem of predicting Cr transport in laboratory column experiments. A 1-D reactive transport model was developed and evaluated against data obtained from laboratory column experiments. A series of dynamic laboratory column experiments were conducted under abiotic and biotic conditions. Cr(III) was injected into columns packed with β-MnO2-coated sand at different initial concentrations, variable flow rates, and two different pore water pH values (3.0 and 4.0). In biotic anaerobic column experiments, Cr(VI) along with lactate was injected into columns packed with quartz sand or β-MnO2-coated sand and bacteria, Shewanella alga Simidu (BrY-MT). A mathematical model was developed which included advection-dispersion equations for the movement of Cr(III), Cr(VI), dissolved oxygen, lactate, and biomass. The model included first-order rate laws governing the adsorption of each Cr species and lactate. The equations for transport and adsorption were coupled with nonlinear equations for rate-limited oxidation-reduction reactions along with dual-Monod kinetic equations. Kinetic batch experiments were conducted to determine the reduction of Cr(VI) by BrY-MT in three different substrates. Results of the column experiments with Cr(III)-containing influent solutions demonstrate that β-MnO2 effectively catalyzes the oxidation of Cr(III) to Cr(VI). For a given influent concentration and pore water velocity, oxidation rates are higher, and hence effluent concentrations of Cr(VI) are greater, at pH 4 relative to pH 3.
Reduction of Cr(VI) by BrY-MT was rapid (within one hour) in columns packed with quartz sand, whereas Cr(VI) reduction by BrY-MT was delayed (57 hours) in the presence of β-MnO2-coated sand. BrY-MT grown in BHIB (brain heart infusion broth) reduced the largest amount of Cr(VI) to Cr(III), followed by TSB (tryptic soy broth) and M9 (minimum media). The comparisons of data and model results from the column experiments show that the depths associated with Cr(III) oxidation and transport within sediments of shallow aquatic systems can strongly influence trends in surface water quality. The results of this study suggest that carefully performed laboratory column experiments are a useful tool in determining the biotransformation of redox-sensitive metals, even in the presence of a strong oxidant such as β-MnO2.
Abstract:
This study reports one of the first controlled studies to examine the impact of a school-based positive youth development program (Lerner, Fisher, & Weinberg, 2000) on promoting qualitative change in life course experiences as a positive intervention outcome. The study built on a recently proposed relational developmental methodological metanarrative (Overton, 1998) and advances in the use of qualitative research methods (Denzin & Lincoln, 2000). The study investigated the use of the Life Course Interview (Clausen, 1998) and an integrated qualitative and quantitative data analytic strategy (IQ-DAS) to provide empirical documentation of the impact of the Changing Lives Program on qualitative change in positive identity in a multicultural population of troubled youth in an alternative public high school. The psychosocial life course intervention approach used in this study draws its developmental framework from both psychosocial developmental theory (Erikson, 1968) and life course theory (Elder, 1998) and its intervention strategies from the transformative pedagogy of Freire (1983/1970). Using the 22 participants in the Intervention Condition and the 10 participants in the Control Condition, RMANOVAs found significantly more positive qualitative change in personal identity for program participants relative to the non-intervention control condition. In addition, the 2X2X2X3 mixed design RMANOVA, in which Time (pre, post) was the repeated factor and Condition (Intervention versus Control), Gender, and Ethnicity the between-group factors, also found significant interactions for Time by Gender and Time by Ethnicity. Moreover, the directionality of the basic pattern of change was positive for participants of both genders and all three ethnic groups.
The pattern of the moderation effects also indicated a marked tendency for participants in the intervention group to characterize their sense of self as more secure and less negative at the end of their first semester in the intervention, a tendency that was stable across both genders and all three ethnicities. The basic differential pattern of an increase in the intervention condition of a positive characterization of sense of self, relative both to pre-test and to the directionality of the movement of the non-intervention controls, was stable across both genders and all three ethnic groups.