842 results for data movement problem


Relevance: 30.00%

Publisher:

Abstract:

For the treatment and monitoring of Parkinson's disease (PD) to be scientific, a key requirement is that measurement of disease stages and severity is quantitative, reliable, and repeatable. The last 50 years in PD research have been dominated by qualitative, subjective ratings obtained by human interpretation of the presentation of disease signs and symptoms at clinical visits. More recently, “wearable,” sensor-based, quantitative, objective, and easy-to-use systems for quantifying PD signs for large numbers of participants over extended durations have been developed. This technology has the potential to significantly improve both clinical diagnosis and management in PD and the conduct of clinical studies. However, the large-scale, high-dimensional character of the data captured by these wearable sensors requires sophisticated signal processing and machine-learning algorithms to transform it into scientifically and clinically meaningful information. Such algorithms that “learn” from data have shown remarkable success in making accurate predictions for complex problems in which human skill has been required to date, but they are challenging to evaluate and apply without a basic understanding of the underlying logic on which they are based. This article contains a nontechnical tutorial review of relevant machine-learning algorithms, describing their limitations and how these can be overcome, and discusses the implications of this technology together with a practical road map for realizing its full potential in PD research and practice. © 2016 International Parkinson and Movement Disorder Society.
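The review itself is nontechnical, but the workflow it describes (engineered features from wearable sensors fed to a learning algorithm and validated against clinician ratings) can be sketched concretely. The following is a minimal, hypothetical illustration, not code from the article: synthetic feature vectors stand in for sensor-derived measures, and a random forest is cross-validated against simulated severity labels.

```python
# A minimal, hypothetical sketch (not from the article): classify simulated
# wearable-sensor feature vectors into severity classes with a random forest.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Assumed inputs: one row per recording session; columns are summary features
# extracted from raw accelerometer/gyroscope signals (e.g., tremor band power,
# gait variability). Labels stand in for clinician-assigned severity stages.
X = rng.normal(size=(500, 24))        # 500 sessions, 24 engineered features
y = rng.integers(0, 3, size=500)      # 3 hypothetical severity classes

clf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)  # cross-validation guards against
print(scores.mean())                       # the overfitting the review warns of
```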

Relevance: 30.00%

Publisher:

Abstract:

A numerical method based on integral equations is proposed and investigated for the Cauchy problem for the Laplace equation in 3-dimensional smooth bounded doubly connected domains. To numerically reconstruct a harmonic function from knowledge of the function and its normal derivative on the outer of two closed boundary surfaces, the harmonic function is represented as a single-layer potential. Matching this representation against the given Cauchy data yields a system of boundary integral equations to be solved for two unknown densities. This system is rewritten over the unit sphere under the assumption that each of the two boundary surfaces can be mapped smoothly and one-to-one to the unit sphere. For the discretization of this system, Weinert’s method (PhD, Göttingen, 1990) is employed, which generates a Galerkin-type procedure for the numerical solution, and the densities in the system of integral equations are expressed in terms of spherical harmonics. Tikhonov regularization is incorporated, and numerical results are included showing the efficiency of the proposed procedure.
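To make the representation concrete, here is a hedged sketch in standard potential-theory notation; the symbols are assumptions, since the abstract does not fix them. Let the doubly connected domain have inner boundary Γ₁ and outer boundary Γ₂, on which the Cauchy data are given.

```latex
% Hedged sketch in assumed notation; the paper's own symbols may differ.
% Single-layer representation of the harmonic function u with unknown
% densities \varphi_1, \varphi_2 and the fundamental solution \Phi of the
% 3-D Laplacian:
\[
  u(x) = \int_{\Gamma_1} \varphi_1(y)\,\Phi(x,y)\,ds(y)
       + \int_{\Gamma_2} \varphi_2(y)\,\Phi(x,y)\,ds(y),
  \qquad
  \Phi(x,y) = \frac{1}{4\pi\,\lvert x-y\rvert}.
\]
% Imposing the given Cauchy data u = f and \partial u/\partial\nu = g on the
% outer surface \Gamma_2 yields two integral equations for
% (\varphi_1, \varphi_2), which are then transferred to the unit sphere,
% discretized via spherical harmonics, and Tikhonov-regularized.
```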

Relevance: 30.00%

Publisher:

Abstract:

The original contribution of this work is threefold. Firstly, this thesis develops a critical perspective on current evaluation practice for business support, with a focus on the timing of evaluation. The time frame generally applied in business support policy evaluation is limited to one to two, seldom three, years post-intervention. This is despite calls for long-term impact studies by various authors concerned about time lags before effects are fully realised. This desire for long-term evaluation conflicts with the requirements of policy-makers and funders, who seek quick results. Moreover, current ‘best practice’ frameworks do not refer to timing or its implications, and data availability constrains the ability to undertake long-term evaluation. Secondly, this thesis provides methodological value for follow-up and similar studies by linking scheme-beneficiary data with official performance datasets; data availability problems are thus avoided through the use of secondary data. Thirdly, this thesis builds the evidence through a longitudinal impact study of small business support in England covering seven years of post-intervention data. This illustrates the variability of results across different evaluation periods and the value of using multiple years of data for a robust understanding of support impact. For survival, the impact of assistance is found to be immediate but limited. Concerning growth, significant impact centres on a two-to-three-year period post-intervention for the linear selection and quantile regression models: positive for employment and turnover, negative for productivity. Attribution of impact may present a problem for subsequent periods. The results clearly support the argument for the use of longitudinal data and analysis, and for a greater appreciation by evaluators of the time factor. This analysis recommends a time frame of four to five years post-intervention for evaluating soft business support.

Relevance: 30.00%

Publisher:

Abstract:

This study examines the statistical features of hourly prices on auction-based electricity exchanges. Its purpose is to use the latest research findings to present, in a new light, the main statements about hourly electricity prices, which can later serve as a basis for price modelling. The phenomena are illustrated with price data for products traded on the EEX and Nord Pool power exchanges. It emerges that several ideas about the statistical behaviour of electricity prices have to be reconsidered.

Relevance: 30.00%

Publisher:

Abstract:

The study rests on a behavioral interpretation of trust, making a clear distinction between trustworthiness (honesty) and trust interpreted as the willingness to engage in risky situations with specific partners. The hypothesis tested is that in a business relationship marked by high levels of trustworthiness, as perceived by the opposite parties, willingness to be involved in risky situations is higher than in relationships where actors do not believe their partners to be highly trustworthy; in such cases trustworthiness becomes a governance mechanism for the events and actions within the relationship. Testing this hypothesis calls for dyadic operationalization, measurement, and analysis. The authors present the first application in business research of a newly developed statistical technique called dyadic data analysis, which has already been applied in social psychology. The empirical results confirm that it overcomes the problem of single-ended research in business relations analysis and allows a deeper understanding of the social characteristics of business relationships, including trust and trustworthiness, and of the connections between them.

Relevance: 30.00%

Publisher:

Abstract:

The purpose of this study was to examine the perspectives of three graduates of a problem-based learning (PBL) physical therapy (PT) program about their clinical practice. The study used the qualitative methods of observation, interviews, and journaling to gather the data. Three sessions of audiotaped interviews and two observation sessions were conducted with three exemplars from the Nova Southeastern University PBL PT program. Each participant also maintained a reflective journal. The data were analyzed using content analysis. A systematic filing system was employed, using a mechanical means of maintaining and indexing coded data and sorting it into coded classifications of subtopics or themes. All interview transcripts, field notes from observations, and journal accounts were read, and index sheets were appropriately annotated. The findings indicated that, from the participants' perspectives, they were practicing at the levels typically expected of clinicians. The attributes that governed the participants' perspectives about their physical therapy clinical practice included flexibility, reflection, analysis, decision-making, self-reliance, problem-solving, independent thinking, and critical thinking. Further, the findings indicated that the factors influencing those attributes included the PBL process, parents' value systems, a self-reliant personality, innate personality traits, and deliberate choice. Finally, the participants' perspectives, for the most part, appeared to support the espoused efficacy of the PBL educational approach. In conclusion, there is evidence that the physical therapy clinical practice of the participants was positively affected by the PBL curriculum. Among the attributes noted, problem-solving, as postulated by Barrows, was one of the most frequently mentioned benefits gained from PBL PT training. With more schools adopting the PBL approach, this research will hopefully add to the knowledge base regarding the efficacy of a problem-based learning instructional approach in physical therapy programs.

Relevance: 30.00%

Publisher:

Abstract:

This dissertation develops a new mathematical approach that overcomes the effect of a data processing phenomenon known as “histogram binning”, inherent to flow cytometry data. A real-time procedure is introduced to prove the effectiveness and fast implementation of such an approach on real-world data. The histogram binning effect is a dilemma posed by two seemingly antagonistic developments: (1) flow cytometry data in its histogram form is extended in its dynamic range to improve its analysis and interpretation, and (2) the inevitable dynamic range extension introduces an unwelcome side effect, the binning effect, which skews the statistics of the data, undermining the accuracy of the analysis and the eventual interpretation of the data.

Researchers in the field have contended with this dilemma for many years, resorting either to hardware approaches, which are rather costly and carry inherent calibration and noise effects, or to software techniques based on filtering the binning effect, which fail to preserve the statistical content of the original data.

The mathematical approach introduced in this dissertation is so appealing that a patent application has been filed. The contribution of this dissertation is an incremental scientific innovation based on a mathematical framework that will allow researchers in the field of flow cytometry to improve the interpretation of data, knowing that its statistical meaning has been faithfully preserved for optimized analysis. Furthermore, the same mathematical foundation provides proof of the origin of this inherent artifact.

These results are unique in that new mathematical derivations are established to define and solve the critical problem of the binning effect faced at the experimental assessment level, providing a data platform that preserves its statistical content.

In addition, a novel method for accumulating the log-transformed data was developed. This new method uses the properties of the transformation of statistical distributions to accumulate the output histogram in a non-integer and multi-channel fashion. Although the mathematics of this new mapping technique seems intricate, the concise nature of the derivations allows for an implementation procedure that lends itself to real-time implementation using lookup tables, a task that is also introduced in this dissertation.
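Since the dissertation's patented method is not reproduced in the abstract, the following is only a rough, hypothetical sketch of the underlying phenomenon and of a lookup-table mapping of the general kind described: log-transforming already-binned integer channels produces the skew, and a precomputed table maps every channel to its log-scale position in one pass.

```python
# Rough, hypothetical sketch (the dissertation's patented method is not
# reproduced here): the binning effect from log-transforming integer channel
# data, and a one-pass lookup-table mapping to log-scale display bins.
import numpy as np

rng = np.random.default_rng(1)
events = rng.lognormal(mean=5.0, sigma=1.0, size=100_000)
channels = np.clip(events.astype(int), 1, 262_143)  # assumed 18-bit range

# Naive path: log-transform already-binned integers. Low channels map to few
# distinct log values, creating gaps and pile-ups that skew the statistics.
naive_log = np.log10(channels)
print(np.unique(naive_log[channels < 10]).size)     # very few distinct values

# Lookup-table path: precompute the log position of every possible channel
# once, then accumulate event counts into fixed display bins in one pass.
lut = np.log10(np.arange(1, 262_144))               # channel -> log position
edges = np.linspace(0.0, lut[-1], 1025)             # 1024 display bins
hist, _ = np.histogram(lut[channels - 1], bins=edges)
print(hist.sum())                                   # all 100,000 events kept
```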

Relevance: 30.00%

Publisher:

Abstract:

The primary aim of this dissertation is to develop data mining tools for knowledge discovery in biomedical data when multiple (homogeneous or heterogeneous) sources of data are available. The central hypothesis is that, when information from multiple sources of data is used appropriately and effectively, knowledge discovery can be achieved better than is possible from a single source alone.

Recent advances in high-throughput technology have enabled biomedical researchers to generate large volumes of diverse types of data on a genome-wide scale. These data include DNA sequences, gene expression measurements, and much more; they provide the motivation for building analysis tools to elucidate the modular organization of the cell. The challenges include efficiently and accurately extracting information from the multiple data sources, representing the information effectively, developing analytical tools, and interpreting the results in the context of the domain.

The first part considers the application of feature-level integration to design classifiers that discriminate between soil types. The machine learning tools SVM and KNN were used to successfully distinguish between several soil samples.

The second part considers clustering using multiple heterogeneous data sources. The resulting Multi-Source Clustering (MSC) algorithm was shown to perform better than clustering methods that use only a single data source or a simple feature-level integration of heterogeneous data sources.

The third part proposes a new approach to effectively incorporate incomplete data into clustering analysis. Adapted from the K-means algorithm, the Generalized Constrained Clustering (GCC) algorithm makes use of incomplete data in the form of constraints to perform exploratory analysis. Novel approaches for extracting constraints were proposed. For sufficiently large constraint sets, the GCC algorithm outperformed the MSC algorithm.

The last part considers the problem of providing a theme-specific environment for mining multi-source biomedical data. The database, called PlasmoTFBM and focusing on gene regulation of Plasmodium falciparum, contains diverse information and has a simple interface that allows biologists to explore the data. It provided a framework for comparing different analytical tools for predicting regulatory elements and for designing useful data mining tools.

The conclusion is that the experiments reported in this dissertation strongly support the central hypothesis.
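The GCC algorithm is described only at a high level here, so the sketch below shows the general idea of folding pairwise constraints into a K-means-style assignment step, in the spirit of COP-KMeans (Wagstaff et al.); the function and the cannot-link constraint format are assumptions, not the dissertation's actual design.

```python
# Constrained K-means sketch (COP-KMeans style), offered only as an
# illustration of constraint-based clustering, not the dissertation's GCC.
import numpy as np

def violates(i, c, labels, assigned, cannot_link):
    # A cannot-link pair (a, b) forbids a and b from sharing a cluster.
    return any((a == i and assigned[b] and labels[b] == c) or
               (b == i and assigned[a] and labels[a] == c)
               for a, b in cannot_link)

def constrained_kmeans(X, k, cannot_link, n_iter=20, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    assigned = np.zeros(len(X), dtype=bool)
    for _ in range(n_iter):
        assigned[:] = False
        for i, x in enumerate(X):
            # Try the nearest centers first; skip any a constraint forbids.
            # (If every center is forbidden, the point keeps its old label.)
            for c in np.argsort(((centers - x) ** 2).sum(axis=1)):
                if not violates(i, c, labels, assigned, cannot_link):
                    labels[i], assigned[i] = c, True
                    break
        for c in range(k):                      # recompute centroids
            if (labels == c).any():
                centers[c] = X[labels == c].mean(axis=0)
    return labels

# Toy usage: two blobs, plus one cannot-link pair spanning them.
X = np.vstack([np.random.default_rng(1).normal(0, .5, (20, 2)),
               np.random.default_rng(2).normal(3, .5, (20, 2))])
print(constrained_kmeans(X, k=2, cannot_link=[(0, 20)]))
```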

Relevance: 30.00%

Publisher:

Abstract:

Microarray technology provides a high-throughput technique to study gene expression. Microarrays can help us diagnose different types of cancers, understand biological processes, assess host responses to drugs and pathogens, find markers for specific diseases, and much more. Microarray experiments generate large amounts of data; thus, effective data processing and analysis are critical for making reliable inferences from the data.

The first part of the dissertation addresses the problem of finding an optimal set of genes (biomarkers) to classify a set of samples as diseased or normal. Three statistical gene selection methods (GS, GS-NR, and GS-PCA) were developed to identify a set of genes that best differentiate between samples. A comparative study on different classification tools was performed, and the best combinations of gene selection and classifiers for multi-class cancer classification were identified. For most of the benchmark cancer data sets, the gene selection method proposed in this dissertation, GS, outperformed other gene selection methods. Classifiers based on Random Forests, neural network ensembles, and K-nearest neighbors (KNN) showed consistently good performance. A striking commonality among these classifiers is that they all use a committee-based approach, suggesting that ensemble classification methods are superior.

The same biological problem may be studied at different research labs and/or performed using different lab protocols or samples. In such situations, it is important to combine results from these efforts. The second part of the dissertation addresses the problem of pooling the results from different independent experiments to obtain improved results. Four statistical pooling techniques (Fisher's inverse chi-square method, the Logit method, Stouffer's Z-transform method, and the Liptak-Stouffer weighted Z-method) were investigated in this dissertation. These pooling techniques were applied to the problem of identifying cell cycle-regulated genes in two different yeast species, and as a result, improved sets of cell cycle-regulated genes were identified.

The last part of the dissertation explores the effectiveness of wavelet data transforms for the task of clustering. Discrete wavelet transforms, with an appropriate choice of wavelet bases, were shown to be effective in producing clusters that were biologically more meaningful.
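Of the four pooling techniques named, Stouffer's Z-transform method is simple enough to sketch directly; the weighted variant below corresponds to the Liptak-Stouffer method. This is a generic illustration of the well-known formulas, not the dissertation's code.

```python
# Generic Stouffer's Z-transform pooling of per-gene p-values from
# independent experiments (weighted form = Liptak-Stouffer).
import numpy as np
from scipy.stats import norm

def stouffer_pool(pvals, weights=None):
    """Combine a 1-D array of independent p-values into one pooled p-value."""
    z = norm.isf(np.asarray(pvals, dtype=float))  # p -> z (inverse survival)
    if weights is None:
        weights = np.ones_like(z)                 # unweighted Stouffer
    z_pooled = (weights * z).sum() / np.sqrt((weights ** 2).sum())
    return norm.sf(z_pooled)                      # pooled z -> pooled p

# Example: three experiments, each individually weak, pool to p < 0.05.
print(stouffer_pool([0.08, 0.05, 0.10]))
```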

Relevance: 30.00%

Publisher:

Abstract:

Groundwater systems of different densities are often modeled mathematically to understand and predict environmental behavior such as seawater intrusion or submarine groundwater discharge. Additional data collection may be justified if it will cost-effectively aid in reducing the uncertainty of a model's prediction. The collection of salinity as well as temperature data could aid in reducing predictive uncertainty in a variable-density model. However, before numerical models can be created, rigorous testing of the modeling code needs to be completed. This research documents the benchmark testing of a new modeling code, SEAWAT Version 4. The benchmark problems include various combinations of density-dependent flow resulting from variations in concentration and temperature. The verified code, SEAWAT, was then applied in two different hydrological analyses to explore the capacity of a variable-density model to guide data collection.

The first analysis tested a linear method to guide data collection by quantifying the contribution of different data types and locations toward reducing predictive uncertainty in a nonlinear variable-density flow and transport model. The relative contributions of temperature and concentration measurements, at different locations within a simulated carbonate platform, for predicting movement of the saltwater interface were assessed. Results showed that concentration data had greater worth than temperature data in reducing predictive uncertainty in this case. Results also indicated that a linear method can be used to quantify data worth in a nonlinear model.

The second hydrological analysis utilized a model to identify the transient response of the salinity, temperature, age, and amount of submarine groundwater discharge to changes in tidal ocean stage, seasonal temperature variations, and different types of geology. The model was compared to multiple kinds of data to (1) calibrate and verify the model, and (2) explore the potential for the model to guide the collection of data using techniques such as electromagnetic resistivity, thermal imagery, and seepage meters. Results indicated that the model can give insight into submarine groundwater discharge and can be used to guide data collection.
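The abstract does not spell out its linear method, but a standard first-order (FOSM) formulation of predictive uncertainty, of the kind commonly used for data-worth analysis in groundwater modeling, reads as follows; the notation is an assumption, not taken from the dissertation.

```latex
% Assumed FOSM-style data-worth formulation (not the dissertation's own
% equations). X and y are sensitivities (Jacobians) of the observations and
% of the prediction to the parameters p; C(p) is the prior parameter
% covariance and C(e) the observation-noise covariance.
\[
  \sigma_s^2 \;=\; \mathbf{y}^{\mathsf T}\,\mathbf{C}(\mathbf{p})\,\mathbf{y}
  \;-\;
  \mathbf{y}^{\mathsf T}\,\mathbf{C}(\mathbf{p})\,\mathbf{X}^{\mathsf T}
  \left[\mathbf{X}\,\mathbf{C}(\mathbf{p})\,\mathbf{X}^{\mathsf T}
        + \mathbf{C}(\boldsymbol{\epsilon})\right]^{-1}
  \mathbf{X}\,\mathbf{C}(\mathbf{p})\,\mathbf{y}.
\]
% The worth of a candidate measurement (a temperature or concentration
% observation at a given location) is the drop in \sigma_s^2 when its row
% is added to X, which allows the data types to be ranked.
```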

Relevance: 30.00%

Publisher:

Abstract:

Graph-structured databases are widely prevalent, and the problem of effective search and retrieval from such graphs has been receiving much attention recently. For example, the Web can be naturally viewed as a graph. Likewise, a relational database can be viewed as a graph where tuples are modeled as vertices connected via foreign-key relationships. Keyword search querying has emerged as one of the most effective paradigms for information discovery, especially over HTML documents in the World Wide Web. One of the key advantages of keyword search querying is its simplicity: users do not have to learn a complex query language and can issue queries without any prior knowledge of the structure of the underlying data. The purpose of this dissertation was to develop techniques for user-friendly, high-quality, and efficient searching of graph-structured databases. Several ranked search methods on data graphs have been studied in recent years. Given a top-k keyword search query on a graph and some ranking criteria, a keyword proximity search finds the top-k answers, where each answer is a substructure of the graph containing all query keywords that illustrates the relationships among the keywords present in the graph. We applied keyword proximity search to the Web and to the page graph of web documents to find top-k answers that satisfy the user's information need and increase user satisfaction. Another effective ranking mechanism applied to data graphs is authority-flow based ranking. Given a top-k keyword search query on a graph, an authority-flow based search finds the top-k answers, where each answer is a node in the graph ranked according to its relevance and importance to the query. We developed techniques that improve authority-flow based search on data graphs by creating a framework to explain and reformulate the queries, taking into consideration user preferences and feedback. We also applied the proposed graph search techniques to information discovery over biological databases. Our algorithms were experimentally evaluated for performance and quality, and the quality of our method was compared to current approaches using user surveys.
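Authority flow on a data graph is commonly computed as a personalized (random-surfer) walk that restarts at the nodes matching the query keywords, as in ObjectRank-style systems. The sketch below illustrates that generic mechanism; it is an assumption, not the dissertation's specific framework.

```python
# Generic authority-flow ranking via personalized-PageRank power iteration;
# the dissertation's exact ranking framework is not shown in the abstract.
import numpy as np

def authority_flow(adj, base_set, d=0.85, n_iter=50):
    """adj: row-stochastic adjacency matrix; base_set: 0/1 keyword-match vector."""
    base = base_set / base_set.sum()           # restart distribution
    r = np.full(len(base), 1.0 / len(base))
    for _ in range(n_iter):
        r = d * adj.T @ r + (1 - d) * base     # flow along edges + restart
    return r

# Toy 4-node graph: node 0 matches the keyword; ranks decay with distance.
adj = np.array([[0, 1, 0, 0],
                [0, 0, .5, .5],
                [1, 0, 0, 0],
                [1, 0, 0, 0]], dtype=float)
print(authority_flow(adj, np.array([1., 0., 0., 0.])))
```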

Relevance: 30.00%

Publisher:

Abstract:

Research has identified a number of putative risk factors that place adolescents at incrementally higher risk of involvement in alcohol and other drug (AOD) use and sexual risk behaviors (SRBs). Such factors include personality characteristics such as sensation-seeking, cognitive factors such as positive expectancies and inhibition conflict, and peer norm processes. The current study was guided by a conceptual perspective supporting the notion that an integrative framework including multi-level factors has significant explanatory value for understanding processes associated with the co-occurrence of AOD use and sexual risk behavior outcomes. This study simultaneously evaluated the mediating role of AOD-sex related expectancies and inhibition conflict on antecedents of AOD use and SRBs, including sexual sensation-seeking and peer norms for condom use.

The sample was drawn from the Enhancing My Personal Options While Evaluating Risk (EMPOWER: Jonathan Tubman, PI) data set (N = 396; aged 12-18 years). Measures used in the study included the Sexual Sensation-Seeking Scale, Inhibition Conflict for Condom Use, and the Risky Sex Scale. All relevant measures had well-documented psychometric properties. A global assessment of alcohol, drug use, and sexual risk behaviors was used.

Results demonstrated that AOD-sex related expectancies mediated the influence of sexual sensation-seeking on the co-occurrence of alcohol and other drug use and sexual risk behaviors. The evaluation of the integrative model also revealed that sexual sensation-seeking was positively associated with peer norms for condom use, and that peer norms predicted inhibition conflict among this sample of multi-problem youth.

This dissertation research identified mechanisms of risk and protection associated with the co-occurrence of AOD use and SRBs among a multi-problem sample of adolescents receiving treatment for alcohol or drug use and related problems. The study is informative for adolescent-serving programs that address the individual and contextual characteristics that enhance treatment efficacy and effectiveness among adolescents receiving services for substance use and related problems.

Relevance: 30.00%

Publisher:

Abstract:

As massive data sets become increasingly available, people are facing the problem of how to effectively process and understand these data. Traditional sequential computing models are giving way to parallel and distributed computing models, such as MapReduce, due to both the large size of the data sets and their high dimensionality. This dissertation, in the same direction as other research based on MapReduce, develops effective techniques and applications using MapReduce that can help people solve large-scale problems. Three different problems are tackled in the dissertation. The first deals with processing terabytes of raster data in a spatial data management system; aerial imagery files are broken into tiles to enable data-parallel computation. The second and third problems deal with dimension reduction techniques for handling data sets of high dimensionality. Three variants of the nonnegative matrix factorization technique are scaled up in MapReduce to factorize matrices with dimensions on the order of millions, based on different matrix multiplication implementations. Two algorithms, which compute the CANDECOMP/PARAFAC and Tucker tensor decompositions respectively, are parallelized in MapReduce by carefully partitioning the data and arranging the computation to maximize data locality and parallelism.
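For concreteness, here are the classic multiplicative updates for nonnegative matrix factorization (Lee and Seung) in serial form; which variants the dissertation scales up is not stated, but the matrix products marked below are exactly the operations a MapReduce implementation distributes across row partitions.

```python
# Serial NMF with multiplicative updates (Lee & Seung), shown only to make
# the factorization concrete; in MapReduce, the row blocks of A and W and
# the marked matrix products are computed in parallel per partition.
import numpy as np

def nmf(A, k, n_iter=200, eps=1e-9, seed=0):
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = rng.random((m, k))
    H = rng.random((k, n))
    for _ in range(n_iter):
        H *= (W.T @ A) / (W.T @ W @ H + eps)   # W.T @ A: map over rows of A
        W *= (A @ H.T) / (W @ H @ H.T + eps)   # A @ H.T: map over rows of A
    return W, H

A = np.abs(np.random.default_rng(1).random((100, 40)))
W, H = nmf(A, k=5)
print(np.linalg.norm(A - W @ H) / np.linalg.norm(A))  # relative fit error
```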

Relevance: 30.00%

Publisher:

Abstract:

This dissertation develops a process improvement method for service operations based on the Theory of Constraints (TOC), a management philosophy shown to be effective in manufacturing for decreasing WIP and improving throughput. While TOC has enjoyed much attention and success in the manufacturing arena, its application to services in general has been limited. The contribution to industry and knowledge is a method for improving global performance measures based on TOC principles. The method proposed in this dissertation is tested using discrete event simulation of the service factory scenario of airline turnaround operations. To evaluate the method, a simulation model of the aircraft turn operations of a U.S.-based carrier was built and validated using actual data from airline operations. The model was then adjusted to reflect an application of the Theory of Constraints for determining how to deploy the scarce resource of ramp workers. The results indicate that, given slight modifications to TOC terminology and the development of a method for constraint identification, the Theory of Constraints can be applied with success to services. Bottlenecks in services must be defined as those processes for which the process rates and the amount of work remaining are such that completing the process will not be possible without an increase in the process rate; the bottleneck ratio is used to determine the degree to which a process is a constraint. Simulation results also suggest that redefining performance measures to reflect a global business perspective of reducing costs related to specific flights, rather than the local-optimum operational approach of turning all aircraft quickly, yields significant savings to the company. Annual operating cost savings were simulated to equal 30% of current possible expenses for misconnecting passengers, with a modest increase in worker utilization through a more efficient heuristic of deploying workers to the highest-priority tasks. This dissertation contributes to the literature on service operations by describing a dynamic, adaptive dispatch approach for managing service factory operations similar to airline turnarounds, using the management philosophy of the Theory of Constraints.
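The abstract defines a bottleneck in terms of process rate and work remaining but does not give the formula for its bottleneck ratio. One plausible reading, offered purely as an assumption, is remaining work divided by the work achievable before departure; the sketch below ranks turnaround processes by that ratio.

```python
# Hypothetical reading of the abstract's "bottleneck ratio" (not its actual
# definition): remaining work over capacity remaining before departure.
# A ratio > 1.0 means the process cannot finish without a higher rate.
from dataclasses import dataclass

@dataclass
class Process:
    name: str
    work_remaining: float      # e.g., bags left to load
    process_rate: float        # units per minute at current staffing
    time_to_departure: float   # minutes

def bottleneck_ratio(p: Process) -> float:
    return p.work_remaining / (p.process_rate * p.time_to_departure)

# Toy turnaround: ramp workers would go to the highest-ratio process first.
turns = [Process("bag load", 120, 4.0, 25),
         Process("fueling", 1, 0.05, 25),
         Process("catering", 10, 1.0, 25)]
for p in sorted(turns, key=bottleneck_ratio, reverse=True):
    print(p.name, round(bottleneck_ratio(p), 2))
```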