981 results for sequential data
Abstract:
In an investigation intended to determine the training needs of flight crews, Bowers et al. (1998, this issue) report two studies showing that the patterning of communication is a better discriminator of good and poor crews than is the content of communication. Bowers et al. characterize their studies as intended to generate hypotheses for training needs and draw connections with Exploratory Sequential Data Analysis (ESDA). Although applauding the intentions of Bowers et al., we point out some concerns with their characterization and implementation of ESDA. Our principal concern is that the Bowers et al. exploration of the data does not convincingly lead them back to a better fundamental understanding of the original phenomena they are investigating.
Abstract:
This dissertation deals with aspects of sequential data assimilation (in particular ensemble Kalman filtering) and numerical weather forecasting. In the first part, the recently formulated Ensemble Kalman-Bucy filter (EnKBF) is revisited. It is shown that the previously used numerical integration scheme fails when the magnitude of the background error covariance grows beyond that of the observational error covariance in the forecast window. We therefore present a suitable integration scheme that handles the stiffening of the differential equations involved without incurring additional computational expense. Moreover, a transform-based alternative to the EnKBF is developed: under this scheme, the operations are performed in the ensemble space instead of the state space. Advantages of this formulation are explained. For the first time, the EnKBF is implemented in an atmospheric model. The second part of this work deals with ensemble clustering, a phenomenon that arises when performing data assimilation with deterministic ensemble square root filters (EnSRFs) in highly nonlinear forecast models: an M-member ensemble detaches into an outlier and a cluster of M-1 members. Previous works may suggest that this issue represents a failure of EnSRFs; this work dispels that notion. It is shown that ensemble clustering can also be reverted by nonlinear processes, in particular the alternation between nonlinear expansion and compression of the ensemble in different regions of the attractor. Some EnSRFs that use random rotations have been developed to overcome this issue; these formulations are analyzed and their advantages and disadvantages with respect to common EnSRFs are discussed. The third and last part contains the implementation of the Robert-Asselin-Williams (RAW) filter in an atmospheric model. The RAW filter is an improvement on the widely used Robert-Asselin filter that successfully suppresses spurious computational waves while avoiding any distortion in the mean value of the function. Using statistical significance tests at both the local and the field level, it is shown that the climatology of the SPEEDY model is not modified by the changed time-stepping scheme; hence, no retuning of the parameterizations is required. It is also found that the accuracy of medium-term forecasts is increased by using the RAW filter.
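To make the filtering step concrete, the following is a minimal sketch of a stochastic (perturbed-observation) ensemble Kalman filter analysis, the generic building block behind the ensemble methods discussed above; it is not the EnKBF or a square-root variant from the dissertation, and the state dimension, observation operator and values are invented for illustration.

```python
import numpy as np

def enkf_analysis(E, y, H, R, rng):
    """One stochastic (perturbed-observation) EnKF analysis step.

    E : (n, M) forecast ensemble, one state column per member
    y : (p,)   observation vector
    H : (p, n) linear observation operator
    R : (p, p) observation-error covariance
    """
    M = E.shape[1]
    A = E - E.mean(axis=1, keepdims=True)            # ensemble anomalies
    Pf = A @ A.T / (M - 1)                           # sample background covariance
    K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)   # Kalman gain
    # perturb the observation for each member so the analysis spread stays consistent
    Y = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, size=M).T
    return E + K @ (Y - H @ E)

# toy usage: 3-variable state, 2 observations, 20 members
rng = np.random.default_rng(0)
E = rng.normal(size=(3, 20))
H = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
R = 0.1 * np.eye(2)
y = np.array([0.5, -0.2])
print(enkf_analysis(E, y, H, R, rng).mean(axis=1))
```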
Abstract:
Recently, much effort has been devoted to the efficient computation of kriging predictors when observations are assimilated sequentially. In particular, kriging update formulae enabling significant computational savings were derived. Taking advantage of the previous kriging mean and variance computations helps avoid a costly matrix inversion when one observation is added to those already available. In addition to traditional update formulae taking into account a single new observation, Emery (2009) proposed formulae for the batch-sequential case, i.e. when several new observations are simultaneously assimilated. However, the kriging variance and covariance formulae given in Emery (2009) for the batch-sequential case are not correct. In this paper, we fix this issue and establish correct expressions for updated kriging variances and covariances when assimilating observations in parallel. An application to sequential conditional simulation finally shows that coupling update and residual substitution approaches may enable significant speed-ups.
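As an illustration of the single-observation case, here is a minimal sketch of the kriging update under a simple-kriging assumption (zero prior mean, known covariance): the predictor and its covariance at the target points are corrected using only the kriging covariance with the new point, and the result agrees with a full recomputation. The kernel, points and values are invented, and this is not the corrected batch-sequential formula established in the paper.

```python
import numpy as np

def k(a, b, ell=0.3):
    """Squared-exponential covariance between 1-D point sets a and b."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)

def simple_kriging(X, Y, Xs):
    """Simple kriging (zero prior mean): mean and covariance at Xs given data (X, Y)."""
    Kinv = np.linalg.inv(k(X, X) + 1e-10 * np.eye(len(X)))
    return k(Xs, X) @ Kinv @ Y, k(Xs, Xs) - k(Xs, X) @ Kinv @ k(X, Xs)

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, 5); Y = np.sin(6 * X)            # n observations
x_new = np.array([0.42]); y_new = np.sin(6 * x_new)     # one more, arriving later
Xs = np.linspace(0, 1, 7)                               # prediction points

# predict from the first n points, including the covariance with the new point
m_n, C_n = simple_kriging(X, Y, np.concatenate([Xs, x_new]))
c_s_new = C_n[:-1, -1]                                  # cov_n(Xs, x_new)
v_new = C_n[-1, -1]                                     # var_n(x_new)

# update formulae: condition the n-point predictor on the extra observation
m_upd = m_n[:-1] + c_s_new / v_new * (y_new[0] - m_n[-1])
C_upd = C_n[:-1, :-1] - np.outer(c_s_new, c_s_new) / v_new

# reference: kriging recomputed from scratch with all n+1 observations
m_full, C_full = simple_kriging(np.concatenate([X, x_new]),
                                np.concatenate([Y, y_new]), Xs)
print(np.allclose(m_upd, m_full), np.allclose(C_upd, C_full))
```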
Abstract:
Narrative therapy is a postmodern therapy that takes the position that people create self-narratives to make sense of their experiences. To date, narrative therapy has compiled virtually no quantitative and very little qualitative research, leaving gaps in almost all areas of process and outcome. White (2006a), one of the therapy's founders, has recently utilized Vygotsky's (1934/1987) theories of the zone of proximal development (ZPD) and concept formation to describe the process of change in narrative therapy with children. In collaboration with the child client, the narrative therapist formalizes therapeutic concepts and submits them to increasing levels of generalization to create a ZPD. This study sought to determine whether the child's development proceeds through the stages of concept formation over the course of a session, and whether therapists' utterances scaffold this movement. A sequential analysis was used due to its unique ability to measure dynamic processes in social interactions. Stages of concept formation and scaffolding were coded over time. A hierarchical log-linear analysis was performed on the sequential data to develop a model of therapist scaffolding and child concept development. This was intended to determine what patterns occur and whether the stated intent of narrative therapy matches its actual process. In accordance with narrative therapy theory, the log-linear analysis produced a final model with interactions between therapist and child utterances, and between both therapist and child utterances and time. Specifically, the child and youth participants in therapy tended to respond to therapist scaffolding at the corresponding level of concept formation. Both children and youth and therapists also tended to move away from earlier and toward later stages of White's scaffolding conversations map as the therapy session advanced. These findings provide support for White's contention that narrative therapists promote child development by scaffolding child concept formation in therapy.
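As a simplified illustration of how coded utterances feed a sequential, log-linear analysis, the sketch below tabulates lag-1 transitions between hypothetical therapist scaffolding codes (T1-T4) and child concept-formation codes (C1-C4); the codes and the sequence are invented, and a hierarchical log-linear model would then be fitted to counts of this kind.

```python
from collections import Counter
import pandas as pd

# hypothetical coded transcript: therapist scaffolding codes (T1-T4) and child
# concept-formation codes (C1-C4), in utterance order
codes = ["T1", "C1", "T2", "C2", "T2", "C2", "T3", "C3", "T3", "C4", "T4", "C4"]

# count lag-1 transitions: each (antecedent, following) pair of utterances
transitions = Counter(zip(codes, codes[1:]))

# arrange the counts as a contingency table; a hierarchical log-linear model
# would then be fitted to counts of this kind
idx = pd.MultiIndex.from_tuples(transitions.keys(), names=["given", "then"])
table = pd.Series(list(transitions.values()), index=idx).unstack(fill_value=0)
print(table)
```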
Impact of hydrographic data assimilation on the modelled Atlantic meridional overturning circulation
Abstract:
Here we make an initial step toward the development of an ocean assimilation system that can constrain the modelled Atlantic Meridional Overturning Circulation (AMOC) to support climate predictions. A detailed comparison is presented of 1° and 1/4° resolution global model simulations with and without sequential data assimilation, to the observations and transport estimates from the RAPID mooring array across 26.5° N in the Atlantic. Comparisons of modelled water properties with the observations from the merged RAPID boundary arrays demonstrate the ability of in situ data assimilation to accurately constrain the east-west density gradient between these mooring arrays. However, the presence of an unconstrained "western boundary wedge" between Abaco Island and the RAPID mooring site WB2 (16 km offshore) leads to the intensification of an erroneous southwards flow in this region when in situ data are assimilated. The result is an overly intense southward upper mid-ocean transport (0–1100 m) as compared to the estimates derived from the RAPID array. Correction of upper layer zonal density gradients is found to compensate mostly for a weak subtropical gyre circulation in the free model run (i.e. with no assimilation). Despite the important changes to the density structure and transports in the upper layer imposed by the assimilation, very little change is found in the amplitude and sub-seasonal variability of the AMOC. This shows that assimilation of upper layer density information projects mainly on the gyre circulation with little effect on the AMOC at 26° N due to the absence of corrections to density gradients below 2000 m (the maximum depth of Argo). The sensitivity to initial conditions was explored through two additional experiments using a climatological initial condition. These experiments showed that the weak bias in gyre intensity in the control simulation (without data assimilation) develops over a period of about 6 months, but does so independently from the overturning, with no change to the AMOC. However, differences in the properties and volume transport of North Atlantic Deep Water (NADW) persisted throughout the 3 year simulations resulting in a difference of 3 Sv in AMOC intensity. The persistence of these dense water anomalies and their influence on the AMOC is promising for the development of decadal forecasting capabilities. The results suggest that the deeper waters must be accurately reproduced in order to constrain the AMOC.
Abstract:
Recently, major processor manufacturers have announced a dramatic shift in their paradigm to increase computing power over the coming years. Instead of focusing on faster clock speeds and more powerful single-core CPUs, the trend clearly goes towards multi-core systems. This will also result in a paradigm shift for the development of algorithms for computationally expensive tasks, such as data mining applications. Obviously, work on parallel algorithms is not new per se, but concentrated efforts in the many application domains are still missing. Multi-core systems, but also clusters of workstations and even large-scale distributed computing infrastructures, provide new opportunities and pose new challenges for the design of parallel and distributed algorithms. Since data mining and machine learning systems rely on high-performance computing systems, research on the corresponding algorithms must be at the forefront of parallel algorithm research in order to keep pushing data mining and machine learning applications to be more powerful and, especially for the former, interactive. To bring together researchers and practitioners working in this exciting field, a workshop on parallel data mining was organized as part of PKDD/ECML 2006 (Berlin, Germany). The six contributions selected for the program describe various aspects of data mining and machine learning approaches featuring low to high degrees of parallelism: The first contribution addresses the classic problem of distributed association rule mining and focuses on communication efficiency to improve the state of the art. After this, a parallelization technique for speeding up decision tree construction by means of thread-level parallelism for shared memory systems is presented. The next paper discusses the design of a parallel approach to the frequent subgraph mining problem for distributed memory systems; this approach is based on a hierarchical communication topology to solve issues related to multi-domain computational environments. The fourth paper describes the combined use and customization of software packages to facilitate top-down parallelism in the tuning of Support Vector Machines (SVMs), and the next contribution presents an interesting idea concerning parallel training of Conditional Random Fields (CRFs) and motivates their use in labeling sequential data. The last contribution finally focuses on very efficient feature selection: it describes a parallel algorithm for feature selection from random subsets. Selecting the papers included in this volume would not have been possible without the help of an international Program Committee that provided detailed reviews for each paper. We would also like to thank Matthew Otey, who helped with publicity for the workshop.
Abstract:
Radiometric data in the visible domain acquired by satellite remote sensing have proven to be powerful for monitoring the states of the ocean, both physical and biological. With the help of these data it is possible to understand certain variations in biological responses of marine phytoplankton on ecological time scales. Here, we implement a sequential data-assimilation technique to estimate from a conventional nutrient–phytoplankton–zooplankton (NPZ) model the time variations of observed and unobserved variables. In addition, we estimate the time evolution of two biological parameters, namely, the specific growth rate and specific mortality of phytoplankton. Our study demonstrates that: (i) the series of time-varying estimates of specific growth rate obtained by sequential data assimilation improves the fitting of the NPZ model to the satellite-derived time series: the model trajectories are closer to the observations than those obtained by implementing static values of the parameter; (ii) the estimates of unobserved variables, i.e., nutrient and zooplankton, obtained from an NPZ model by implementation of a pre-defined parameter evolution can be different from those obtained on applying the sequences of parameters estimated by assimilation; and (iii) the maximum estimated specific growth rate of phytoplankton in the study area is more sensitive to the sea-surface temperature than would be predicted by temperature-dependent functions reported previously. The overall results of the study are potentially useful for enhancing our understanding of the biological response of phytoplankton in a changing environment.
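One common way a sequential filter can track a time-varying biological parameter is state augmentation: the parameter is appended to the model state so that each analysis step updates it together with the observed variables. The sketch below carries the log of the phytoplankton growth rate in the state vector of a toy NPZ model; it is a generic illustration under that assumption, not the authors' model or assimilation scheme, and all names and values are invented.

```python
import numpy as np

def npz_step(state, dt=0.1, m_z=0.1, g=0.4, k_n=0.5):
    """One Euler step of a toy NPZ model with a time-varying growth rate.

    state = [N, P, Z, log_mu]; the phytoplankton growth rate mu rides along in
    the state vector (state augmentation) so that a sequential filter can update
    it at every analysis together with the biological variables.
    """
    N, P, Z, log_mu = state
    mu = np.exp(log_mu)
    uptake = mu * N / (k_n + N) * P          # nutrient-limited primary production
    grazing = g * P * Z
    dN = -uptake + m_z * Z
    dP = uptake - grazing
    dZ = grazing - m_z * Z
    d_log_mu = 0.0                           # parameter persists between analyses
    return state + dt * np.array([dN, dP, dZ, d_log_mu])

state = np.array([1.0, 0.5, 0.2, np.log(0.8)])
for _ in range(5):
    state = npz_step(state)
print(state)
```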
Abstract:
Spatial data warehouses (SDWs) allow for spatial analysis together with analytical multidimensional queries over huge volumes of data. The challenge is to retrieve data related to ad hoc spatial query windows according to spatial predicates, avoiding the high cost of joining large tables. Therefore, mechanisms to provide efficient query processing over SDWs are essential. In this paper, we propose two efficient indices for SDWs: the SB-index and the HSB-index. The proposed indices share the following characteristics. They enable multidimensional queries with spatial predicates over SDWs and also support predefined spatial hierarchies. Furthermore, they compute the spatial predicate and transform it into a conventional one, which can be evaluated together with other conventional predicates by accessing a star-join Bitmap index. While the SB-index has a sequential data structure, the HSB-index uses a hierarchical data structure to enable spatial object clustering and a specialized buffer-pool to decrease the number of disk accesses. The advantages of the SB-index and the HSB-index over the DBMS resources for SDW indexing (i.e. star-join computation and materialized views) were investigated through performance tests, which issued roll-up operations extended with containment and intersection range queries. The performance results showed that improvements ranged from 68% up to 99% over both the star-join computation and the materialized view. Furthermore, the proposed indices proved to be very compact, adding less than 1% to the storage requirements. Therefore, both the SB-index and the HSB-index are excellent choices for SDW indexing. Choosing between the SB-index and the HSB-index mainly depends on the query selectivity of spatial predicates. While low query selectivity benefits the HSB-index, the SB-index provides better performance for higher query selectivity.
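To illustrate the general idea of computing a spatial predicate and transforming it into a conventional one, here is a minimal sketch assuming a flat sequence of (key, MBR) entries that is scanned sequentially; it is not the authors' SB-index or HSB-index implementation, and all names and data are invented.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    key: int        # surrogate key of the spatial dimension member
    mbr: tuple      # (xmin, ymin, xmax, ymax) minimum bounding rectangle

def intersects(a, b):
    """True if two rectangles (xmin, ymin, xmax, ymax) overlap."""
    return not (a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1])

def spatial_to_conventional(index, window):
    """Sequentially scan the index and turn a spatial window predicate into a
    conventional key-set predicate (which a bitmap index could then answer)."""
    return {e.key for e in index if intersects(e.mbr, window)}

# illustrative data: three spatial members and an ad hoc query window
index = [Entry(1, (0, 0, 2, 2)), Entry(2, (3, 3, 5, 5)), Entry(3, (1, 4, 2, 6))]
window = (0.5, 0.5, 4, 4)
keys = spatial_to_conventional(index, window)

# the rewritten predicate "city_key IN keys" is then evaluated on the fact table
fact_rows = [(1, 120.0), (2, 75.5), (3, 98.2), (1, 11.0)]   # (city_key, revenue)
print([row for row in fact_rows if row[0] in keys])
```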
Abstract:
The joint modeling of longitudinal and survival data is a new approach to many applications such as HIV, cancer vaccine trials and quality-of-life studies. There are recent developments of the methodologies with respect to each of the components of the joint model as well as the statistical processes that link them together. Among these, second-order polynomial random effect models and linear mixed effects models are the most commonly used for the longitudinal trajectory function. In this study, we first relax the parametric constraints for polynomial random effect models by using Dirichlet process priors, and three longitudinal markers rather than only one marker are considered in one joint model. Second, we use a linear mixed effect model for the longitudinal process in a joint model analyzing the three markers. In this research these methods were applied to the Primary Biliary Cirrhosis sequential data, which were collected from a clinical trial of primary biliary cirrhosis (PBC) of the liver conducted between 1974 and 1984 at the Mayo Clinic. The effects of three longitudinal markers, (1) Total Serum Bilirubin, (2) Serum Albumin and (3) Serum Glutamic-Oxaloacetic Transaminase (SGOT), on patients' survival were investigated. The proportion of treatment effect was also studied using the proposed joint modeling approaches. Based on the results, we conclude that the proposed modeling approaches yield a better fit to the data and give less biased parameter estimates for these trajectory functions than previous methods. Model fit is also improved after considering three longitudinal markers instead of only one. The results from the analysis of the proportion of treatment effect from these joint models lead to the same conclusion as the final model of Fleming and Harrington (1991): Bilirubin and Albumin together have a stronger impact in predicting patients' survival and serve as surrogate endpoints for treatment.
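As a small illustration of one building block named above, the sketch below fits a linear mixed-effects model (random intercept and slope per patient) to a single simulated longitudinal marker using statsmodels; it is not the joint longitudinal-survival model of the study, and the dataset, column names and values are invented.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# simulated longitudinal marker: repeated serum bilirubin measurements per patient,
# with patient-specific random intercepts and slopes (all values are invented)
rng = np.random.default_rng(2)
rows = []
for patient in range(30):
    b0 = 1.0 + rng.normal(scale=0.4)             # random intercept
    b1 = 0.2 + rng.normal(scale=0.1)             # random slope
    for time in range(6):                        # six visits per patient
        rows.append({"patient": patient, "time": time,
                     "bilirubin": b0 + b1 * time + rng.normal(scale=0.2)})
df = pd.DataFrame(rows)

# linear mixed-effects model: fixed effect of time, random intercept and slope
model = smf.mixedlm("bilirubin ~ time", df, groups=df["patient"], re_formula="~time")
print(model.fit().summary())
```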
Abstract:
Visceral leishmaniasis is caused by protozoan parasites of the Leishmania donovani complex. During active disease in humans, the high levels of IFN-γ and TNF-α detected in blood serum, and the high expression of IFN-γ mRNA in samples of the lymphoid organs, suggest that the immune system is highly activated. However, studies using peripheral blood mononuclear cells have found immunosuppression specific to Leishmania antigens; this poor immune response probably results from Leishmania antigen-engaged lymphocytes being trapped in the lymphoid organs. The deactivating cytokines IL-10 and TGF-β may be acting on macrophages to allow the parasites to multiply, as may anti-Leishmania antibodies that opsonize amastigotes and induce IL-10 production in macrophages. These high activation and deactivation processes are likely to occur mainly in the spleen and liver and can be confirmed through the examination of organ samples. However, an analysis of sequential data from studies of visceral leishmaniasis in hamsters suggests that factors outside of the immune system are responsible for the early inactivation of inducible nitric oxide synthase, which occurs before the expression of deactivating cytokines. In active visceral leishmaniasis, the immune system actively participates in non-lymphoid organ lesioning. While current views only consider immunocomplex deposition, macrophages, T cells, cytokines, and immunoglobulins acting by diverse mechanisms also play important roles in the pathogenesis.
Abstract:
In Canada, the mental disorder Review Boards of each province are responsible for determining the conditions of supervision of persons found Not Criminally Responsible on account of Mental Disorder (NCRMD) and for rendering, on an annual basis, one of the following three dispositions: a) detention in hospital, b) conditional discharge, or c) absolute discharge. To promote community reintegration, a conditional discharge may be ordered with the condition that the person live in a community housing resource. Among people living with mental illness, access to housing resources has been associated with greater residential stability, a reduction in the number and duration of hospital stays, and fewer contacts with the justice system. However, access to housing resources for persons found NCRMD is limited, partly because of the stigma surrounding this population. Few studies have addressed housing placement in forensic psychiatry. To address this question, this thesis comprises three components, presented in two manuscripts: 1) to evaluate the role of housing placement on rehospitalization and reoffending among persons found NCRMD; 2) to describe trajectories of disposition and housing placement; and 3) to better understand the factors associated with these trajectories. Data for the province of Quebec from the National Trajectory Project of individuals found NCRMD were used. The sample comprises 934 persons found NCRMD between May 1, 2000 and April 30, 2005. In the first manuscript, survival analysis shows that individuals placed in independent housing following a conditional discharge by the Review Board are more likely to commit a new offence and to be rehospitalized than persons in housing resources. In the second article, sequential data analysis generated four statistically stable trajectories of disposition and residential placement over the 36 months following an NCRMD verdict: 1) conditional discharge to a housing resource (11%), 2) conditional discharge to independent housing (32%), 3) detention (43%), and 4) absolute discharge (14%). A multinomial logistic regression reveals that the probability of placement in supervised housing rather than continued detention is significantly reduced for persons treated in a specialized forensic psychiatric hospital, as well as for those who committed a severe offence. Conversely, the probability of being subject to less restrictive dispositions (independent housing and absolute discharge) is strongly associated with clinical factors such as a smaller number of prior psychiatric hospitalizations, a diagnosis of mood disorder and the absence of a personality disorder diagnosis. The findings of this doctoral project underline the protective value of housing resources for persons found NCRMD and provide strong arguments for a risk management approach for persons found NCRMD that incorporates contextual elements of risk prevention, such as access to housing resources.
Abstract:
This work demonstrates how partial evaluation can be put to practical use in the domain of high-performance numerical computation. I have developed a technique for performing partial evaluation by using placeholders to propagate intermediate results. For an important class of numerical programs, a compiler based on this technique improves performance by an order of magnitude over conventional compilation techniques. I show that by eliminating inherently sequential data-structure references, partial evaluation exposes the low-level parallelism inherent in a computation. I have implemented several parallel scheduling and analysis programs that study the tradeoffs involved in the design of an architecture that can effectively utilize this parallelism. I present these results using the 9-body gravitational attraction problem as an example.
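The following is a minimal sketch of the placeholder idea under a toy expression language: known values are folded at partial-evaluation time, unknown inputs remain as symbolic placeholders, and the fixed structure of the data disappears, leaving independent arithmetic that could run in parallel. It illustrates the general technique, not the compiler described in the work, and all names are invented.

```python
from dataclasses import dataclass

# A tiny expression language: numbers, named placeholders, and binary operations.
@dataclass(frozen=True)
class Var:            # placeholder for a value unknown until run time
    name: str

@dataclass(frozen=True)
class Add:
    left: object
    right: object

@dataclass(frozen=True)
class Mul:
    left: object
    right: object

def peval(expr):
    """Partially evaluate: fold anything numeric, keep placeholders symbolic."""
    if isinstance(expr, (int, float, Var)):
        return expr
    a, b = peval(expr.left), peval(expr.right)
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return a + b if isinstance(expr, Add) else a * b
    return type(expr)(a, b)

# The "program": sum the pairwise products of two 3-element sequences.  The
# sequence structure (length and indexing) is known statically, so the loop and
# the data-structure references disappear; only placeholder arithmetic remains,
# and the three products are now visibly independent of one another.
xs = [Var("x0"), 2.0, Var("x2")]        # some entries already known
ws = [0.5, Var("w1"), 4.0]
dot = Add(Add(Mul(xs[0], ws[0]), Mul(xs[1], ws[1])), Mul(xs[2], ws[2]))
print(peval(dot))
```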
Abstract:
A dynamic size-structured model is developed for phytoplankton and nutrients in the oceanic mixed layer and applied to extract phytoplankton biomass at discrete size fractions from remotely sensed ocean-colour data. General relationships between cell size and biophysical processes (such as sinking, grazing, and primary production) of phytoplankton were included in the model through a bottom-up approach. Time-dependent mixed-layer depth was used as a forcing variable, and a sequential data-assimilation scheme was implemented to derive model trajectories. From a given time series, the method produces estimates of size-structured biomass at every observation, and thus estimates the seasonal succession of individual phytoplankton size classes, derived here from remote sensing for the first time. From these estimates, normalized phytoplankton biomass size spectra over a period of 9 years were calculated for one location in the North Atlantic. Further analysis demonstrated that strong relationships exist between the seasonal trends of the estimated size spectra and the mixed-layer depth, nutrient biomass, and total chlorophyll. The results contain useful information on the time-dependent biomass flux in the pelagic ecosystem.
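As an illustration of the final computation mentioned above, here is a minimal sketch of a normalized biomass size spectrum: the biomass in each size class is divided by the class width and summarized by the slope of a log-log fit. The size classes and biomass values are invented and do not come from the study.

```python
import numpy as np

# hypothetical size-fractionated biomass: class bounds in equivalent spherical
# diameter (um) and the biomass (mg C m^-3) estimated for each fraction
size_bounds = np.array([0.5, 2.0, 20.0, 200.0])   # pico-, nano-, microplankton
biomass = np.array([12.0, 8.5, 3.1])

# normalized size spectrum: biomass per class divided by the class width
width = np.diff(size_bounds)
normalized = biomass / width

# the spectrum is usually summarized by the slope of a log-log fit
centers = np.sqrt(size_bounds[:-1] * size_bounds[1:])  # geometric class midpoints
slope, intercept = np.polyfit(np.log10(centers), np.log10(normalized), 1)
print(normalized, slope)
```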
Abstract:
This paper presents a novel mobile sink area allocation scheme for consumer-based mobile robotic devices, with a proven application to robotic vacuum cleaners. In the home or office environment, rooms are physically separated by walls and an automated robotic cleaner cannot decide which room to move to in order to perform the cleaning task. Likewise, state-of-the-art cleaning robots do not move to other rooms without direct human intervention. In a smart home monitoring system, sensor nodes may be deployed to monitor each separate room. In this work, a quad-tree-based data gathering scheme is proposed whereby the mobile sink physically moves through every room and logically links all separated sub-networks together. The proposed scheme sequentially collects data from the monitored environment and transmits the information back to a base station. According to the sensor node information, the base station can command a cleaning robot to move to a specific location in the home environment. The quad-tree-based data gathering scheme minimizes the data gathering tour length and time through the efficient allocation of data gathering areas. A calculated shortest-path data gathering tour can efficiently be allocated to the robotic cleaner to complete the cleaning task within a minimum time period. Simulation results show that the proposed scheme can effectively allocate and control the cleaning area for the robot vacuum cleaner without any direct intervention from the consumer. The performance of the proposed scheme is then validated with a set of practical sequential data gathering tours in a typical office/home environment.
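As a simplified stand-in for the tour-allocation step, the sketch below computes a greedy nearest-neighbour data-gathering tour over one gathering point per room and reports its total length; it is not the quad-tree allocation algorithm of the paper, and the coordinates are invented.

```python
import math

def nearest_neighbour_tour(points, start=0):
    """Greedy tour: from the start point, repeatedly visit the closest unvisited
    gathering point, then return home.  A simple stand-in for tour planning."""
    unvisited = set(range(len(points))) - {start}
    tour, pos = [start], start
    while unvisited:
        pos = min(unvisited, key=lambda j: math.dist(points[pos], points[j]))
        tour.append(pos)
        unvisited.remove(pos)
    tour.append(start)                      # return to the base station
    length = sum(math.dist(points[a], points[b]) for a, b in zip(tour, tour[1:]))
    return tour, length

# hypothetical gathering points: base station first, then one point per room
rooms = [(0.0, 0.0), (4.0, 0.5), (4.5, 3.0), (1.0, 3.5), (2.5, 1.5)]
tour, length = nearest_neighbour_tour(rooms)
print(tour, round(length, 2))
```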