929 results for Distributed lag model
Abstract:
The research reported in this article is based on the Ph.D. project of Dr. RK, which was funded by the Scottish Informatics and Computer Science Alliance (SICSA). KvD acknowledges support from the EPSRC under the RefNet grant (EP/J019615/1).
Abstract:
Fitting statistical models is computationally challenging when the sample size or the dimension of the dataset is huge. An attractive approach for down-scaling the problem size is to first partition the dataset into subsets and then fit using distributed algorithms. The dataset can be partitioned either horizontally (in the sample space) or vertically (in the feature space), and the challenge arises in defining an algorithm with low communication, theoretical guarantees, and excellent practical performance in general settings. For sample space partitioning, I propose a MEdian Selection Subset AGgregation Estimator (message) algorithm for solving these issues. The algorithm applies feature selection in parallel for each subset using regularized regression or a Bayesian variable selection method, calculates the 'median' feature inclusion index, estimates coefficients for the selected features in parallel for each subset, and then averages these estimates. The algorithm is simple, involves minimal communication, scales efficiently in sample size, and has theoretical guarantees. I provide extensive experiments to show excellent performance in feature selection, estimation, prediction, and computation time relative to standard competitors.
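A minimal sketch of the aggregation idea in message, under assumed choices (scikit-learn's LassoCV as the per-subset selector and an OLS refit, neither of which is specified above):

```python
# Illustrative sketch of message-style aggregation (not the author's code):
# per-subset lasso selection, a median inclusion vote, per-subset OLS refits,
# and a final average of the coefficient estimates.
import numpy as np
from sklearn.linear_model import LassoCV, LinearRegression

def message_estimate(X, y, m=4, seed=None):
    """Split rows into m subsets, select features by the median inclusion
    indicator across subsets, then average per-subset OLS estimates."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    subsets = np.array_split(rng.permutation(n), m)

    # Step 1: feature selection on each subset (parallel in principle).
    inclusion = np.zeros((m, p))
    for k, rows in enumerate(subsets):
        lasso = LassoCV(cv=5).fit(X[rows], y[rows])
        inclusion[k] = np.abs(lasso.coef_) > 1e-8

    # Step 2: 'median' inclusion index -- keep features selected by at
    # least half of the subsets.
    selected = np.where(np.median(inclusion, axis=0) >= 0.5)[0]
    if selected.size == 0:
        return selected, np.array([])

    # Step 3: estimate coefficients for the selected features on each
    # subset, then average the estimates.
    betas = np.empty((m, selected.size))
    for k, rows in enumerate(subsets):
        betas[k] = LinearRegression().fit(X[rows][:, selected], y[rows]).coef_
    return selected, betas.mean(axis=0)
```

Only the selected indices and the per-subset coefficient estimates need to travel between workers and the aggregator, which is the sense in which communication stays minimal.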
While sample space partitioning is useful in handling datasets with a large sample size, feature space partitioning is more effective when the data dimension is high. Existing methods for partitioning features, however, are either vulnerable to high correlations or inefficient in reducing the model dimension. In the thesis, I propose a new embarrassingly parallel framework named DECO for distributed variable selection and parameter estimation. In DECO, variables are first partitioned and allocated to m distributed workers. The decorrelated subset data within each worker are then fitted via any algorithm designed for high-dimensional problems. We show that by incorporating the decorrelation step, DECO can achieve consistent variable selection and parameter estimation on each subset with (almost) no assumptions. In addition, the convergence rate is nearly minimax optimal for both sparse and weakly sparse models and does not depend on the partition number m. Extensive numerical experiments are provided to illustrate the performance of the new framework.
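A hedged sketch of what a DECO-style decorrelation step could look like before the per-worker fits; the scaling, the ridge term r, and the use of a lasso on each column block are illustrative assumptions, not the thesis's exact procedure:

```python
# Sketch of a decorrelation step followed by per-worker lasso on column blocks.
# The sqrt(p) scaling and ridge term r are assumptions for illustration.
import numpy as np
from scipy.linalg import sqrtm
from sklearn.linear_model import LassoCV

def deco_fit(X, y, m=2, r=1.0):
    n, p = X.shape
    # Decorrelation: premultiply X and y so that the transformed design has
    # (approximately) orthogonal rows across the full feature set.
    F = np.sqrt(p) * np.real(sqrtm(np.linalg.inv(X @ X.T + r * np.eye(n))))
    X_t, y_t = F @ X, F @ y

    # Partition features across m workers and fit each block independently.
    coef = np.zeros(p)
    for cols in np.array_split(np.arange(p), m):
        coef[cols] = LassoCV(cv=5).fit(X_t[:, cols], y_t).coef_
    return coef
```

The point of the premultiplication is that the column blocks handed to different workers are nearly uncorrelated, so each worker's fit is not misled by strong correlations with features it cannot see.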
For datasets with both large sample sizes and high dimensionality, I propose a new "divide-and-conquer" framework, DEME (DECO-message), that leverages both the DECO and the message algorithms. The new framework first partitions the dataset in the sample space into row cubes using message and then partitions the feature space of the cubes using DECO. This procedure is equivalent to partitioning the original data matrix into multiple small blocks, each with a feasible size that can be stored and fitted on a single machine in parallel. The results are then synthesized via the DECO and message algorithms in reverse order to produce the final output. The whole framework is extremely scalable.
Abstract:
Mobile Cloud Computing promises to overcome the physical limitations of mobile devices by executing demanding mobile applications on cloud infrastructure. In practice, implementing this paradigm is difficult; network disconnection often occurs, bandwidth may be limited, and a large power draw is required from the battery, resulting in a poor user experience. This thesis presents a mobile cloud middleware solution, Context Aware Mobile Cloud Services (CAMCS), which provides cloud-based services to mobile devices in a disconnected fashion. An integrated user experience is delivered by designing for anticipated network disconnection and low data transfer requirements. CAMCS achieves this by means of the Cloud Personal Assistant (CPA); each user of CAMCS is assigned their own CPA, which can complete user-assigned tasks, received as descriptions from the mobile device, by using existing cloud services. Service execution is personalised to the user's situation with contextual data, and task execution results are stored with the CPA until the user can connect with his/her mobile device to obtain them. Requirements for an integrated user experience are outlined, along with the design and implementation of CAMCS. The operation of CAMCS and CPAs with cloud-based services is presented, specifically in terms of service description, discovery, and task execution. The use of contextual awareness to personalise service discovery and service consumption to the user's situation is also presented. Resource management by CAMCS is studied and compared with existing solutions, and additional application models that can be provided by CAMCS are described. Evaluation is performed with CAMCS deployed on the Amazon EC2 cloud. The resource usage of the CAMCS Client, running on Android-based mobile devices, is also evaluated, and a user study with volunteers using CAMCS on their own mobile devices is presented. Results show that CAMCS meets the requirements outlined for an integrated user experience.
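As a rough illustration of the task-description exchange described above, the following sketch uses hypothetical class and field names; they are not taken from the CAMCS implementation:

```python
# Hypothetical sketch of the task-description / result exchange between a
# mobile client and its Cloud Personal Assistant (CPA). All names are
# illustrative, not CAMCS APIs.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class TaskDescription:
    task_id: str
    action: str                                    # e.g. "book_restaurant"
    parameters: dict = field(default_factory=dict)
    context: dict = field(default_factory=dict)    # location, time, device state

@dataclass
class TaskResult:
    task_id: str
    status: str                                    # "completed", "failed", ...
    payload: Optional[dict] = None                 # held until the device reconnects

class CloudPersonalAssistant:
    """Executes user tasks with cloud services while the device is offline."""
    def __init__(self):
        self._pending_results = {}

    def submit(self, task: TaskDescription) -> None:
        # A real CPA would discover and invoke a matching cloud service,
        # personalised by the task's context; here we just stub the result.
        self._pending_results[task.task_id] = TaskResult(task.task_id, "completed",
                                                         {"note": "stub"})

    def fetch_results(self) -> list:
        # Called when the mobile device reconnects.
        results = list(self._pending_results.values())
        self._pending_results.clear()
        return results
```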
Abstract:
Stroke is a prevalent disorder with immense socioeconomic impact. A variety of chronic neurological deficits result from stroke. In particular, sensorimotor deficits are a significant barrier to achieving post-stroke independence. Unfortunately, the majority of pre-clinical studies that show improved outcomes in animal stroke models have failed in clinical trials. Pre-clinical studies using non-human primate (NHP) stroke models prior to initiating human trials are a potential step to improving translation from animal studies to clinical trials. Robotic assessment tools represent a quantitative, reliable, and reproducible means to assess reaching behaviour following stroke in both humans and NHPs. We investigated the use of robotic technology to assess sensorimotor impairments in NHPs following middle cerebral artery occlusion (MCAO). Two cynomolgus macaques underwent transient MCAO for 90 minutes. Approximately 1.5 years following the procedure, these NHPs and two non-stroke control monkeys were trained in a reaching task with both arms in the KINARM exoskeleton. This robot permits elbow and shoulder movements in the horizontal plane. The task required NHPs to make reaching movements from a centrally positioned start target to 1 of 8 peripheral targets uniformly distributed around the first target. We analyzed four movement parameters to characterize sensorimotor deficits: reaction time, movement time (MT), initial direction error (IDE), and number of speed maxima. We hypothesized that the paretic limb of NHPs following MCAO would show reduced performance on these attributes during a neurobehavioural task compared to controls. Reaching movements in the non-affected limbs of control and experimental NHPs showed bell-shaped velocity profiles. In contrast, reaching movements with the affected limbs were highly variable. We found distinctive patterns in MT, IDE, and number of speed peaks between control and experimental monkeys and between limbs of NHPs with MCAO. NHPs with MCAO demonstrated more speed peaks, longer MTs, and greater IDE in their paretic limb compared to controls. These initial results qualitatively match human stroke subjects' performance, suggesting that robotic neurobehavioural assessment in NHPs with stroke is feasible and could have translational relevance in subsequent human studies. Further studies will be necessary to replicate and expand on these preliminary findings.
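For intuition, the following sketch computes the trajectory-based parameters (MT, IDE, and number of speed maxima) from a sampled reach; the thresholds and definitions are plausible choices, not the study's exact analysis pipeline:

```python
# Illustrative computation of movement time, initial direction error, and
# number of speed peaks from a sampled hand trajectory. Thresholds are
# assumptions, not the study's analysis parameters.
import numpy as np
from scipy.signal import find_peaks

def movement_parameters(t, xy, target_xy, onset_speed=0.05):
    """t: time (s); xy: Nx2 hand position (m); target_xy: peripheral target (m)."""
    v = np.gradient(xy, t, axis=0)
    speed = np.linalg.norm(v, axis=1)

    moving = speed > onset_speed
    onset = np.argmax(moving)                       # first sample above threshold
    offset = len(speed) - 1 - np.argmax(moving[::-1])  # last sample above threshold
    movement_time = t[offset] - t[onset]

    # Initial direction error: angle between the early movement direction and
    # the straight line from the start position to the target.
    initial_dir = xy[min(onset + 10, len(xy) - 1)] - xy[onset]
    target_dir = np.asarray(target_xy) - xy[onset]
    cosang = np.dot(initial_dir, target_dir) / (
        np.linalg.norm(initial_dir) * np.linalg.norm(target_dir))
    ide_deg = np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))

    # Number of speed maxima within the movement window.
    n_speed_peaks = len(find_peaks(speed[onset:offset + 1])[0])
    return movement_time, ide_deg, n_speed_peaks
```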
Abstract:
The real-time optimization of large-scale systems is a difficult problem due to the need for complex models involving uncertain parameters and the high computational cost of solving such problems by a centralized approach. Extremum-seeking control (ESC) is a model-free real-time optimization technique which can estimate unknown parameters and can optimize nonlinear time-varying systems using only a measurement of the cost function to be minimized. In this thesis, we develop a distributed version of extremum-seeking control which allows large-scale systems to be optimized without models and with minimal computing power. First, we develop a continuous-time distributed extremum-seeking controller. It has three main components: consensus, parameter estimation, and optimization. The consensus provides each local controller with an estimate of the cost to be minimized, allowing the controllers to coordinate their actions. Using this cost estimate, parameters for a local input-output model are estimated, and the cost is minimized by following a gradient descent direction based on the estimated gradient. Next, a similar distributed extremum-seeking controller is developed in discrete time. Finally, we consider an interesting application of distributed ESC: formation control of high-altitude balloons for high-speed wireless internet. These balloons must be steered into a favourable formation where they are spread out over the Earth and provide coverage to the entire planet. Distributed ESC is applied to this problem and is shown to be effective for a system of 1,200 balloons subjected to realistic wind currents. The approach does not require a wind model and uses a cost function based on a Voronoi partition of the sphere. Distributed ESC is able to steer balloons from a few initial launch sites into a formation which provides coverage to the entire Earth, and can maintain a similar formation as the balloons move with the wind around the Earth.
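The following is a toy discrete-time sketch of a perturbation-based, consensus-assisted extremum-seeking loop in the spirit of the description above; the dither frequencies, gains, dynamic average consensus step, and toy cost are illustrative assumptions, not the thesis's controller:

```python
# Toy discrete-time sketch of distributed extremum seeking: a dynamic average
# consensus step shares the measured cost, and each agent demodulates it with
# its own sinusoidal dither to estimate a gradient and descend along it.
import numpy as np

def distributed_esc(local_costs, W, n_agents, steps=4000, a=0.1, gamma=0.02):
    """local_costs[i](u) returns agent i's measured cost for the input vector u;
    W is a doubly stochastic consensus weight matrix."""
    u = np.zeros(n_agents)                        # decision variables
    omegas = 0.5 + 0.3 * np.arange(n_agents)      # distinct dither frequencies
    z = np.zeros(n_agents)                        # consensus estimates of the cost
    y_prev = np.zeros(n_agents)
    for k in range(steps):
        dither = a * np.sin(omegas * k)
        y = np.array([local_costs[i](u + dither) for i in range(n_agents)])
        z = W @ z + y - y_prev                    # dynamic average consensus step
        y_prev = y
        grad_est = z * np.sin(omegas * k)         # demodulate the shared cost
        u -= gamma * grad_est                     # local gradient descent
    return u

# Example: separable quadratic costs whose average is minimized at u_i = i;
# with these small gains u drifts toward the minimizer rather than reaching it.
n = 5
W = np.full((n, n), 1.0 / n)                      # complete-graph averaging weights
costs = [lambda u, i=i: (u[i] - i) ** 2 for i in range(n)]
print(distributed_esc(costs, W, n_agents=n))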
Abstract:
Purines are nitrogen-rich compounds that are widely distributed in the marine environment and are an important component of the dissolved organic nitrogen (DON) pool. Even though purines have been shown to be degraded by bacterioplankton, the identities of marine bacteria capable of purine degradation and their underlying catabolic mechanisms are currently unknown. This study shows that Ruegeria pomeroyi, a model marine bacterium and Marine Roseobacter Clade (MRC) representative, utilizes xanthine as a source of carbon and nitrogen. The R. pomeroyi genome contains putative genes that encode xanthine dehydrogenase (XDH), which is expressed during growth with xanthine. RNAseq-based analysis of the R. pomeroyi transcriptome revealed that the transcription of an XDH-initiated catabolic pathway is up-regulated during growth with xanthine, with transcription greatest when xanthine was the only available carbon source. The RNAseq-deduced pathway indicates that glyoxylate and ammonia are the key intermediates from xanthine degradation. Utilising a laboratory model, this study has identified the potential genes and catabolic pathway active during xanthine degradation. The ability of R. pomeroyi to utilize xanthine provides novel insights into the capabilities of the MRC that may contribute to their success in marine ecosystems and the potential biogeochemical importance of the group in processing DON.
Abstract:
Although microturbines (MTs) are among the most successfully commercialized distributed energy resources, their long-term effects on the distribution network have not been fully investigated, owing to the complex thermo-fluid-mechanical energy conversion processes involved. This is further complicated by the fact that the parameters and internal data of MTs are not always available to the electric utility, due to different ownerships and confidentiality concerns. To address this issue, a general modeling approach for MTs is proposed in this paper, which allows for the long-term simulation of the distribution network with multiple MTs. First, the feasibility of deriving a simplified MT model for long-term dynamic analysis of the distribution network is discussed, based on a physical understanding of the dynamic processes occurring within MTs. Then a three-stage identification method is developed to obtain a piecewise MT model and predict electro-mechanical system behaviors with saturation. Next, assisted by an electric power flow calculation tool, a fast simulation methodology is proposed to evaluate the long-term impact of multiple MTs on the distribution network. Finally, the model is verified using Capstone C30 microturbine experiments and further applied to the dynamic simulation of a modified IEEE 37-node test feeder with promising results.
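As a rough illustration of the kind of reduced-order, piecewise model with saturation that such an identification targets, the sketch below uses assumed time constants; the 30 kW cap reflects the Capstone C30 rating, but none of the parameters are identified values:

```python
# Toy reduced-order microturbine model: piecewise first-order electro-mechanical
# dynamics with output (power) saturation. Parameters are assumptions.
import numpy as np

def simulate_mt(p_ref, dt=1.0, p_max=30.0, tau_up=15.0, tau_down=5.0):
    """p_ref: array of power set-points (kW), sampled every dt seconds."""
    p = np.zeros(len(p_ref))
    for k in range(1, len(p_ref)):
        tau = tau_up if p_ref[k] > p[k - 1] else tau_down   # piecewise dynamics
        p[k] = p[k - 1] + dt / tau * (p_ref[k] - p[k - 1])  # first-order lag
        p[k] = np.clip(p[k], 0.0, p_max)                    # output saturation
    return p

# Example: a step from 5 kW to 25 kW after 60 s.
setpoint = np.concatenate([np.full(60, 5.0), np.full(240, 25.0)])
print(simulate_mt(setpoint)[[0, 59, 120, 299]])
```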
Abstract:
An RVE-based stochastic numerical model is used to calculate the permeability of randomly generated porous media at different values of the fiber volume fraction for the case of transverse flow in a unidirectional ply. Analysis of the numerical results shows that the permeability is not normally distributed. With the aim of providing new insight into this topic, the permeability data are fitted using both a mixture model and a unimodal distribution. Our findings suggest that permeability can be fitted well using a mixture model based on the lognormal and power-law distributions. In the case of a unimodal distribution, it is found, using the maximum-likelihood estimation (MLE) method, that the generalized extreme value (GEV) distribution provides the best fit. Finally, an expression for the permeability as a function of the fiber volume fraction based on the GEV distribution is discussed in light of the previous results.
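The distribution-fitting step can be reproduced in outline as follows; the synthetic lognormal samples stand in for the RVE permeability results, and the log-likelihood comparison is only one of several reasonable criteria:

```python
# Sketch of the unimodal fitting step: maximum-likelihood GEV and lognormal
# fits to (placeholder) permeability samples, compared by log-likelihood.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
permeability = rng.lognormal(mean=-25.0, sigma=0.4, size=2000)  # placeholder data

# Maximum-likelihood fits.
gev_shape, gev_loc, gev_scale = stats.genextreme.fit(permeability)
ln_shape, ln_loc, ln_scale = stats.lognorm.fit(permeability, floc=0)

# Compare fits by log-likelihood (higher is better).
ll_gev = np.sum(stats.genextreme.logpdf(permeability, gev_shape, gev_loc, gev_scale))
ll_ln = np.sum(stats.lognorm.logpdf(permeability, ln_shape, ln_loc, ln_scale))
print(f"GEV log-likelihood: {ll_gev:.1f}, lognormal log-likelihood: {ll_ln:.1f}")
```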
Abstract:
Wireless sensor networks (WSNs) differ from conventional distributed systems in many aspects. The resource limitations of sensor nodes, the ad-hoc communication and topology of the network, coupled with an unpredictable deployment environment, are difficult non-functional constraints that must be carefully taken into account when developing software systems for a WSN. Thus, more research needs to be done on designing, implementing, and maintaining software for WSNs. This thesis aims to contribute to research in this area by presenting an approach to WSN application development that improves the reusability, flexibility, and maintainability of the software. Firstly, we present a programming model and software architecture aimed at describing WSN applications independently of the underlying operating system and hardware. The proposed architecture is described and realized using the Model-Driven Architecture (MDA) standard in order to achieve satisfactory levels of encapsulation and abstraction when programming sensor nodes. In addition, we study different non-functional constraints of WSN applications and propose two approaches to optimize the application to satisfy these constraints. A real prototype framework was built to demonstrate the solutions developed in the thesis. The framework implemented the programming model and the multi-layered software architecture as components. A graphical interface, code generation components, and supporting tools were also included to help developers design, implement, optimize, and test WSN software. Finally, we evaluate and critically assess the proposed concepts. Two case studies are provided to support the evaluation. The first case study, a framework evaluation, is designed to assess the ease with which novice and intermediate users can develop correct and power-efficient WSN applications, the portability level achieved by developing applications at a high level of abstraction, and the estimated overhead of using the framework in terms of the footprint and executable code size of the application. In the second case study, we discuss the design, implementation, and optimization of a real-world application named TempSense, in which a sensor network is used to monitor the temperature within an area.
Abstract:
This study examined team processes and outcomes among 12 multi-university distributed project teams from 11 universities during their early and late development stages over a 14-month project period. A longitudinal model of team interaction is presented and tested at the individual level to consider the extent to which both formal and informal network connections, measured as degree centrality, relate to changes in team members' individual perceptions of cohesion and conflict in their teams, and to their individual performance as team members over time. The study showed a negative network centrality-cohesion relationship with significant temporal patterns, indicating that as team members perceive less degree centrality in distributed project teams, they report more team cohesion during the last four months of the project. We also found that changes in team cohesion from the first three months (i.e., the early development stage) to the last four months (i.e., the late development stage) of the project relate positively to changes in team member performance. Although degree centrality did not relate significantly to changes in team conflict over time, a strong inverse relationship was found between changes in team conflict and cohesion, suggesting that team conflict emphasizes a different but related aspect of how individuals view their experience with the team process. Changes in team conflict, however, did not relate to changes in team member performance. Ultimately, we showed that individuals who are less central in the network and report higher levels of team cohesion performed better in distributed teams over time.
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-08
Abstract:
We propose three research problems to explore the relations between trust and security in the setting of distributed computation. In the first problem, we study trust-based adversary detection in distributed consensus computation. The adversaries we consider behave arbitrarily, disobeying the consensus protocol. We propose a trust-based consensus algorithm with local and global trust evaluations. The algorithm can be abstracted using a two-layer structure, with the top layer running a trust-based consensus algorithm and the bottom layer executing a global trust update scheme as a subroutine. We utilize a set of pre-trusted nodes, called headers, to propagate local trust opinions throughout the network. This two-layer framework is flexible in that it can easily be extended to incorporate more complicated decision rules and global trust schemes. The first problem assumes that normal nodes are homogeneous, i.e., it is guaranteed that a normal node always behaves as it is programmed. In the second and third problems, however, we assume that nodes are heterogeneous, i.e., given a task, the probability that a node generates a correct answer varies from node to node. The adversaries considered in these two problems are workers from the open crowd who either invest little effort in the tasks assigned to them or intentionally give wrong answers to questions. In the second part of the thesis, we consider a typical crowdsourcing task that aggregates input from multiple workers as a problem in information fusion. To cope with the issue of noisy and sometimes malicious input from workers, trust is used to model workers' expertise. In a multi-domain knowledge learning task, however, using scalar-valued trust to model a worker's performance is not sufficient to reflect the worker's trustworthiness in each of the domains. To address this issue, we propose a probabilistic model to jointly infer multi-dimensional trust of workers, multi-domain properties of questions, and true labels of questions. Our model is very flexible and can be extended to incorporate metadata associated with questions. To show that, we further propose two extended models, one of which handles input tasks with real-valued features while the other handles tasks with text features by incorporating topic models. Our models can effectively recover trust vectors of workers, which can be very useful for future task assignment adaptive to workers' trust. These results can be applied to the fusion of information from multiple data sources such as sensors, human input, machine learning results, or a hybrid of them. In the second subproblem, we address crowdsourcing with adversaries under logical constraints. We observe that questions are often not independent in real-life applications; instead, there are logical relations between them. Similarly, workers that provide answers are not independent of each other either: answers given by workers with similar attributes tend to be correlated. Therefore, we propose a novel unified graphical model consisting of two layers. The top layer encodes domain knowledge, which allows users to express logical relations using first-order logic rules, and the bottom layer encodes a traditional crowdsourcing graphical model. Our model can be seen as a generalized probabilistic soft logic framework that encodes both logical relations and probabilistic dependencies. To solve the collective inference problem efficiently, we have devised a scalable joint inference algorithm based on the alternating direction method of multipliers.
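As a flavor of the first problem's trust-weighted consensus, the toy sketch below down-weights neighbours whose values deviate strongly; the trust update rule and thresholds are illustrative, not the thesis algorithm (which additionally uses headers and a global trust layer):

```python
# Toy trust-weighted consensus: each node averages neighbours' values weighted
# by trust, and trust decays for neighbours whose values deviate strongly.
# The update rule and deviation threshold are illustrative assumptions.
import numpy as np

def trust_consensus(x0, adjacency, steps=50, thresh=2.0, decay=0.5):
    n = len(x0)
    x = np.array(x0, dtype=float)
    trust = adjacency.astype(float)              # start by trusting every neighbour
    for _ in range(steps):
        # Local trust update: penalise neighbours far from the node's own value.
        spread = np.std(x)
        for i in range(n):
            for j in np.nonzero(adjacency[i])[0]:
                if abs(x[j] - x[i]) > thresh * (spread + 1e-9):
                    trust[i, j] *= decay
        # Trust-weighted averaging step (each node keeps weight on itself).
        W = trust + np.eye(n)
        W = W / W.sum(axis=1, keepdims=True)
        x = W @ x
    return x
```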
The third part of the thesis considers the problem of optimal assignment under budget constraints when workers are unreliable and sometimes malicious. In a real crowdsourcing market, each answer obtained from a worker incurs a cost. The cost is associated with both the level of trustworthiness of workers and the difficulty of tasks. Typically, access to expert-level (more trustworthy) workers is more expensive than to the average crowd, and completion of a challenging task is more costly than a click-away question. We address the problem of optimally assigning heterogeneous tasks to workers of varying trust levels under budget constraints. Specifically, we design a trust-aware task allocation algorithm that takes as inputs the estimated trust of workers and a pre-set budget, and outputs the optimal assignment of tasks to workers. We derive a bound on the total error probability that relates naturally to the budget, the trustworthiness of the crowd, and the costs of obtaining labels from the crowd: a higher budget, more trustworthy crowds, and less costly jobs result in a lower theoretical bound. Our allocation scheme does not depend on the specific design of the trust evaluation component; therefore, it can be combined with generic trust evaluation algorithms.
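A simple greedy heuristic conveys the intuition of trading off worker trust, cost, and task difficulty under a budget; it is not the thesis's optimal allocation scheme and does not come with its error bound:

```python
# Greedy illustration of trust-aware allocation under a budget: repeatedly buy
# the most cost-effective remaining label for the task with the fewest labels.
def greedy_allocate(task_costs, worker_trust, worker_cost, budget):
    """task_costs[t]: difficulty multiplier of task t; returns {task: [workers]}."""
    assignment = {t: [] for t in task_costs}
    # Workers ranked by trust per unit cost (most cost-effective first).
    ranked = sorted(worker_trust, key=lambda w: worker_trust[w] / worker_cost[w],
                    reverse=True)
    spent = 0.0
    progress = True
    while progress:
        progress = False
        # Each round, try to give one more label to the least-covered tasks first.
        for t in sorted(task_costs, key=lambda t: len(assignment[t])):
            w = next((w for w in ranked if w not in assignment[t]), None)
            if w is None:
                continue
            price = worker_cost[w] * task_costs[t]
            if spent + price <= budget:
                assignment[t].append(w)
                spent += price
                progress = True
    return assignment

# Example: two tasks of different difficulty, three workers, budget 10.
tasks = {"q1": 1.0, "q2": 2.0}
trust = {"expert": 0.95, "avg1": 0.7, "avg2": 0.65}
cost = {"expert": 5.0, "avg1": 1.0, "avg2": 1.0}
print(greedy_allocate(tasks, trust, cost, budget=10.0))
```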
Abstract:
Statistical association between a single nucleotide polymorphism (SNP) genotype and a quantitative trait in genome-wide association studies is usually assessed using a linear regression model or, in the case of non-normally distributed trait values, the Kruskal-Wallis test. While linear regression models assume an additive mode of inheritance via equidistant genotype scores, the Kruskal-Wallis test merely tests global differences in trait values associated with the three genotype groups. Both approaches thus exhibit suboptimal power when the underlying inheritance mode is dominant or recessive. Furthermore, these tests do not perform well in the common situations where only a few trait values are available in a rare genotype category (disbalance), or where the values associated with the three genotype categories exhibit unequal variance (variance heterogeneity). We propose a maximum test based on a Marcus-type multiple contrast test for relative effect sizes. This test allows model-specific testing of a dominant, additive, or recessive mode of inheritance, and it is robust against variance heterogeneity. We show how to obtain mode-specific simultaneous confidence intervals for the relative effect sizes to aid in interpreting the biological relevance of the results. Further, we discuss the use of a related all-pairwise-comparisons contrast test with range-preserving confidence intervals as an alternative to the Kruskal-Wallis heterogeneity test. We applied the proposed maximum test to the Bogalusa Heart Study dataset and gained a remarkable increase in the power to detect association, particularly for rare genotypes. Our simulation study also demonstrated that the proposed non-parametric tests control the family-wise error rate in the presence of non-normality and variance heterogeneity, contrary to the standard parametric approaches. We provide the publicly available R library nparcomp, which can be used to estimate simultaneous confidence intervals or compatible multiplicity-adjusted p-values associated with the proposed maximum test.
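For intuition only, the sketch below takes the maximum association statistic over dominant, additive, and recessive genotype codings and calibrates it by permutation; this is a simplified parametric stand-in, whereas the proposed test is based on nonparametric relative effects (implemented in the R package nparcomp):

```python
# Simplified "maximum over inheritance modes" illustration: score genotypes
# under dominant, additive, and recessive codings, take the largest absolute
# correlation with the trait, and calibrate it by permutation. Not the
# proposed nonparametric relative-effect test.
import numpy as np

MODES = {"dominant": {0: 0, 1: 1, 2: 1},
         "additive": {0: 0, 1: 1, 2: 2},
         "recessive": {0: 0, 1: 0, 2: 1}}

def max_mode_test(genotype, trait, n_perm=2000, seed=None):
    rng = np.random.default_rng(seed)
    def stat(y):
        return max(abs(np.corrcoef(np.vectorize(m.get)(genotype), y)[0, 1])
                   for m in MODES.values())
    observed = stat(trait)
    perms = np.array([stat(rng.permutation(trait)) for _ in range(n_perm)])
    p_value = (1 + np.sum(perms >= observed)) / (1 + n_perm)
    return observed, p_value

# Example with a simulated recessive effect.
rng = np.random.default_rng(1)
g = rng.choice([0, 1, 2], size=300, p=[0.49, 0.42, 0.09])
y = 0.8 * (g == 2) + rng.normal(size=300)
print(max_mode_test(g, y, seed=2))
```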