976 results for NCHS data brief (Series)


Relevance:

40.00%

Publisher:

Abstract:

We live in an era of abundant data. This has necessitated the development of new and innovative statistical algorithms to get the most from experimental data. For example, faster algorithms make practical the analysis of larger genomic data sets, allowing us to extend the utility of cutting-edge statistical methods. We present a randomised algorithm that accelerates the clustering of time series data using the Bayesian Hierarchical Clustering (BHC) statistical method. BHC is a general method for clustering any discretely sampled time series data; in this paper we focus on a particular application to microarray gene expression data. We define and analyse the randomised algorithm before presenting results on both synthetic and real biological data sets. We show that the randomised algorithm leads to substantial gains in speed with minimal loss in clustering quality. The randomised time series BHC algorithm is available as part of the R package BHC, which can be downloaded from Bioconductor (version 2.10 and above) via http://bioconductor.org/packages/2.10/bioc/html/BHC.html. We have also made available a set of R scripts which can be used to reproduce the analyses carried out in this paper; these are available from https://sites.google.com/site/randomisedbhc/.
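The paper's algorithm is BHC-specific and is implemented in the R package BHC noted above; the sketch below only illustrates, under assumed names and a substitute (non-Bayesian) clustering step, why randomisation speeds things up: the expensive hierarchy is built on a random subset of series and the remaining series are attached cheaply afterwards.

```python
# Illustrative sketch of the randomisation idea: cluster a random subset of
# the time series, then attach every other series to its nearest cluster.
# This is NOT the BHC algorithm itself (which is Bayesian and lives in the
# R package BHC); function and parameter names are assumptions.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import cdist

def randomised_cluster(series, n_subset=200, n_clusters=10, seed=0):
    """series: (n_series, n_timepoints) array of observations."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(series), size=min(n_subset, len(series)), replace=False)
    # Expensive step runs only on the random subset.
    labels_subset = fcluster(linkage(series[idx], method="average"),
                             t=n_clusters, criterion="maxclust")
    present = np.unique(labels_subset)
    centroids = np.vstack([series[idx][labels_subset == k].mean(axis=0)
                           for k in present])
    # Cheap step: assign every series (subset included) to the nearest centroid.
    return present[cdist(series, centroids).argmin(axis=1)]

# Example with synthetic data: 1000 series of 20 time points each.
labels = randomised_cluster(np.random.randn(1000, 20))
```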

Relevance:

40.00%

Publisher:

Abstract:

In modern process industry, it is often difficult to analyze a manufacturing process due to its numerous time-series data. Analysts wish not only to interpret the evolution of data over time within a single working procedure, but also to examine the changes across the whole production process through time. To meet such analytic requirements, we have developed ProcessLine, an interactive visualization tool for large amounts of time-series data in process industry. The data are displayed in a fisheye timeline. ProcessLine provides good overviews of the whole production process and details for the focused working procedure. A preliminary user study using beer industry production data has shown that the tool is effective.
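The abstract does not give ProcessLine's actual distortion function; the sketch below uses one common graphical fisheye formulation (the Sarkar-Brown transform) on a one-dimensional timeline normalised to [0, 1], purely to illustrate how a focused working procedure can be given more screen space than the rest of the process.

```python
# Minimal sketch of a one-dimensional fisheye timeline using the classic
# Sarkar-Brown distortion G(x) = (d + 1) * x / (d * x + 1). This is an
# assumed formulation for illustration, not ProcessLine's implementation.
def fisheye(t, focus, d=4.0):
    """Map a normalised time t in [0, 1] to a distorted position in [0, 1].

    t     -- original position on the timeline
    focus -- position of the focused working procedure
    d     -- distortion factor (0 = no distortion)
    """
    span = 1.0 - focus if t >= focus else focus        # distance to the nearer edge
    x = abs(t - focus) / span if span > 0 else 0.0     # normalised offset from the focus
    g = (d + 1) * x / (d * x + 1)                      # Sarkar-Brown transform
    return focus + g * span if t >= focus else focus - g * span

# Ticks near the focus (0.5) are spread apart; distant ones are compressed.
positions = [round(fisheye(t / 10, focus=0.5), 3) for t in range(11)]
```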

Relevance:

40.00%

Publisher:

Abstract:

A new approach is proposed for clustering time-series data. The approach can be used to discover groupings of similar object motions that were observed in a video collection. A finite mixture of hidden Markov models (HMMs) is fitted to the motion data using the expectation-maximization (EM) framework. Previous approaches for HMM-based clustering employ a k-means formulation, where each sequence is assigned to only a single HMM. In contrast, the formulation presented in this paper allows each sequence to belong to more than a single HMM with some probability, and the hard decision about the sequence class membership can be deferred until a later time when such a decision is required. Experiments with simulated data demonstrate the benefit of using this EM-based approach when there is more "overlap" in the processes generating the data. Experiments with real data show the promising potential of HMM-based motion clustering in a number of applications.
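A short sketch of the soft-assignment E-step described above may help: each sequence receives a probabilistic membership in every component HMM, whereas a k-means-style formulation would take only the argmax. The forward algorithm and E-step are standard; the discrete-observation model and parameter layout are illustrative assumptions, not the paper's implementation.

```python
# E-step sketch for a finite mixture of HMMs: soft cluster memberships for
# each sequence rather than a hard assignment. Discrete observations and
# the (pi, A, B) parameter layout per HMM are assumptions for illustration.
import numpy as np
from scipy.special import logsumexp

def forward_loglik(obs, log_pi, log_A, log_B):
    """Log-likelihood of one observation sequence under one HMM
    (standard forward algorithm in log space)."""
    alpha = log_pi + log_B[:, obs[0]]
    for o in obs[1:]:
        alpha = logsumexp(alpha[:, None] + log_A, axis=0) + log_B[:, o]
    return logsumexp(alpha)

def soft_memberships(sequences, hmms, log_mix):
    """Responsibility matrix: one row per sequence, one column per component HMM.

    hmms    -- list of (log_pi, log_A, log_B) parameter tuples
    log_mix -- log mixing proportions of the components
    """
    ll = np.array([[forward_loglik(s, *h) for h in hmms] for s in sequences])
    log_r = log_mix + ll
    log_r -= logsumexp(log_r, axis=1, keepdims=True)
    return np.exp(log_r)   # taking argmax of each row would recover the hard assignment
```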

Relevance:

40.00%

Publisher:

Abstract:

It is estimated that the quantity of digital data being transferred, processed or stored at any one time currently stands at 4.4 zettabytes (4.4 × 2^70 bytes), and this figure is expected to have grown by a factor of 10, to 44 zettabytes, by 2020. Exploiting this data is, and will remain, a significant challenge. At present there is the capacity to store 33% of the digital data in existence at any one time; by 2020 this capacity is expected to fall to 15%. These statistics suggest that, in the era of Big Data, the identification of important, exploitable data will need to be done in a timely manner. Systems for the monitoring and analysis of data, e.g. stock markets, smart grids and sensor networks, can be made up of massive numbers of individual components. These components can be geographically distributed yet may interact with one another via continuous data streams, which in turn may affect the state of the sender or receiver. This introduces a dynamic causality, which further complicates the overall system by introducing a temporal constraint that is difficult to accommodate. Practical approaches to realising the system described above have led to a multiplicity of analysis techniques, each of which concentrates on specific characteristics of the system being analysed and treats these characteristics as the dominant component affecting the results being sought. This multiplicity of analysis techniques introduces another layer of heterogeneity, that is, heterogeneity of approach, partitioning the field to the extent that results from one domain are difficult to exploit in another. The question asked is whether a generic solution can be identified for the monitoring and analysis of data that accommodates temporal constraints, bridges the gap between expert knowledge and raw data, and enables data to be effectively interpreted and exploited in a transparent manner. The approach proposed in this dissertation acquires, analyses and processes data in a manner that is free of the constraints of any particular analysis technique, while at the same time facilitating these techniques where appropriate. Constraints are applied by defining a workflow based on the production, interpretation and consumption of data. This supports the application of different analysis techniques to the same raw data without the danger of incorporating hidden bias. To illustrate and realise this approach, a software platform has been created that allows for the transparent analysis of data, combining analysis techniques with a maintainable record of provenance so that independent third-party analysis can be applied to verify any derived conclusions. In order to demonstrate these concepts, a complex real-world example involving the near real-time capture and analysis of neurophysiological data from a neonatal intensive care unit (NICU) was chosen. A system was engineered to gather raw data, analyse that data using different analysis techniques, uncover information, incorporate that information into the system and curate the evolution of the discovered knowledge. The application domain was chosen for three reasons: firstly, because it is complex and no comprehensive solution exists; secondly, because it requires tight interaction with domain experts, thus requiring the handling of subjective knowledge and inference; and thirdly, because, given the dearth of neurophysiologists, there is a real-world need to provide a solution for this domain.
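The abstract describes a workflow built around the production, interpretation and consumption of data with a maintainable record of provenance. The sketch below is a hypothetical illustration of that idea only: the function name, record fields and append-only JSON-lines log are assumptions, not the platform's API.

```python
# Hypothetical sketch of a provenance-recording analysis step: run an
# analysis technique over some input data and append who/what/when plus
# content hashes to an append-only log, so that independent third-party
# analysis can later verify derived conclusions. All names and fields are
# assumptions for illustration, not the dissertation's platform.
import hashlib, json, time

def run_with_provenance(step_name, technique, inputs, analyse, log_path="provenance.jsonl"):
    """Run analyse(inputs) and record the step in the provenance log."""
    result = analyse(inputs)
    record = {
        "step": step_name,
        "technique": technique,   # which analysis technique produced the result
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "input_hash": hashlib.sha256(json.dumps(inputs, sort_keys=True).encode()).hexdigest(),
        "output_hash": hashlib.sha256(json.dumps(result, sort_keys=True).encode()).hexdigest(),
    }
    with open(log_path, "a") as log:
        log.write(json.dumps(record) + "\n")
    return result

# The same raw data can be passed through different techniques, each
# leaving an auditable entry in the same provenance log.
mean_hr = run_with_provenance("summarise", "mean", [72, 75, 71], lambda xs: sum(xs) / len(xs))
```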

Relevance:

40.00%

Publisher:

Abstract:

The application of semantic technologies to the integration of biological data and the interoperability of bioinformatics analysis and visualization tools has been the common theme of a series of annual BioHackathons hosted in Japan for the past five years. Here we provide a review of the activities and outcomes from the BioHackathons held in 2011 in Kyoto and 2012 in Toyama. In order to efficiently implement semantic technologies in the life sciences, participants formed various sub-groups and worked on the following topics: Resource Description Framework (RDF) models for specific domains, text mining of the literature, ontology development, essential metadata for biological databases, platforms to enable efficient Semantic Web technology development and interoperability, and the development of applications for Semantic Web data. In this review, we briefly introduce the themes covered by these sub-groups. The observations made, conclusions drawn, and software development projects that emerged from these activities are discussed.
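For readers unfamiliar with the Resource Description Framework mentioned above, the sketch below shows what a small RDF model looks like, using the rdflib Python library; the example namespace and properties are invented for illustration and are not the ontologies or metadata produced at the BioHackathons.

```python
# Minimal illustration of an RDF model of the kind the sub-groups worked on,
# built with rdflib. The namespace and properties are hypothetical.
from rdflib import Graph, Literal, Namespace, RDF, URIRef

EX = Namespace("http://example.org/biodb/")          # hypothetical namespace
g = Graph()
gene = URIRef(EX["gene/BRCA1"])

g.add((gene, RDF.type, EX.Gene))                     # typed resource
g.add((gene, EX.symbol, Literal("BRCA1")))           # literal property
g.add((gene, EX.locatedOn, EX["chromosome/17"]))     # link to another resource

print(g.serialize(format="turtle"))                  # interoperable Turtle output
```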

Relevance:

40.00%

Publisher:

Abstract:

BACKGROUND: Adenosine-induced transient flow arrest has been used to facilitate clip ligation of intracranial aneurysms. However, the starting dose that is most likely to produce an adequate duration of profound hypotension remains unclear. We reviewed our experience to determine the dose-response relationship and apparent perioperative safety profile of adenosine in intracranial aneurysm patients. METHODS: This case series describes 24 aneurysm clip ligation procedures performed under an anesthetic consisting of remifentanil, low-dose volatile anesthetic, and propofol in which adenosine was used. The report focuses on the doses administered; the duration of systolic blood pressure <60 mm Hg (SBP <60 mm Hg); and any cardiovascular, neurologic, or pulmonary complications observed in the perioperative period. RESULTS: A median dose of 0.34 mg/kg ideal body weight (range: 0.29-0.44 mg/kg) resulted in SBP <60 mm Hg for a median of 57 seconds (range: 26-105 seconds). There was a linear relationship between the log-transformed dose of adenosine and the duration of SBP <60 mm Hg (R² = 0.38). Two patients developed transient, hemodynamically stable atrial fibrillation, 2 had postoperative troponin levels >0.03 ng/mL without any evidence of cardiac dysfunction, and 3 had postoperative neurologic changes. CONCLUSIONS: For intracranial aneurysms in which temporary occlusion is impractical or difficult, adenosine is capable of providing brief periods of profound systemic hypotension with low perioperative morbidity. On the basis of these data, a dose of 0.3 to 0.4 mg/kg ideal body weight may be the recommended starting dose to achieve approximately 45 seconds of profound systemic hypotension during a remifentanil/low-dose volatile anesthetic with propofol-induced burst suppression.

Relevance:

40.00%

Publisher:

Abstract:

BACKGROUND: The Affordable Care Act encourages healthcare systems to integrate behavioral and medical healthcare, as well as to employ electronic health records (EHRs) for health information exchange and quality improvement. Pragmatic research paradigms that employ EHRs in research are needed to produce clinical evidence in real-world medical settings for informing learning healthcare systems. Adults with comorbid diabetes and substance use disorders (SUDs) tend to use costly inpatient treatments; however, there is a lack of empirical data on implementing behavioral healthcare to reduce health risk in adults with high-risk diabetes. Given the complexity of high-risk patients' medical problems and the cost of conducting randomized trials, a feasibility project is warranted to guide practical study designs. METHODS: We describe the study design, which explores the feasibility of implementing substance use Screening, Brief Intervention, and Referral to Treatment (SBIRT) among adults with high-risk type 2 diabetes mellitus (T2DM) within a home-based primary care setting. Our study includes the development of an integrated EHR datamart to identify eligible patients and collect diabetes healthcare data, and the use of a geographic health information system to understand the social context in patients' communities. Analysis will examine recruitment, proportion of patients receiving brief intervention and/or referrals, substance use, SUD treatment use, diabetes outcomes, and retention. DISCUSSION: By capitalizing on an existing T2DM project that uses home-based primary care, our study results will provide timely clinical information to inform the designs and implementation of future SBIRT studies among adults with multiple medical conditions.

Relevance:

40.00%

Publisher:

Abstract:

Time-series and sequences are important patterns in data mining. Based on an ontology of time-elements, this paper presents a formal characterization of time-series and state-sequences, where a state denotes a collection of data whose validity is dependent on time. While a time-series is formalized as a vector of time-elements temporally ordered one after another, a state-sequence is formalized as a list of states correspondingly ordered by a time-series. In general, a time-series and a state-sequence can be incomplete in various ways. This leads to the distinction between complete and incomplete time-series, and between complete and incomplete state-sequences, which allows the expression of both absolute and relative temporal knowledge in data mining.
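The sketch below illustrates the distinction the paper draws, with a time-series as an ordered vector of time-elements and a state-sequence as states ordered by that time-series. The class names and the naive completeness rule (adjacent time-elements must meet without a gap) are assumptions for illustration, not the paper's formal ontology.

```python
# Illustrative sketch: time-series = ordered vector of time-elements;
# state-sequence = states ordered by a time-series. Names and the
# completeness check are assumptions, not the paper's formalism.
from dataclasses import dataclass
from typing import Any, List

@dataclass
class TimeElement:
    start: float          # a time point can be modelled as an interval with start == end
    end: float

@dataclass
class TimeSeries:
    elements: List[TimeElement]   # temporally ordered one after another

    def is_complete(self) -> bool:
        """Complete if consecutive time-elements meet without a gap."""
        return all(a.end == b.start
                   for a, b in zip(self.elements, self.elements[1:]))

@dataclass
class StateSequence:
    states: List[Any]     # the i-th state is valid during the i-th time-element
    over: TimeSeries

ts = TimeSeries([TimeElement(0, 1), TimeElement(1, 2), TimeElement(3, 4)])
seq = StateSequence(["idle", "heating", "cooling"], ts)
print(ts.is_complete())   # False: the series is incomplete (gap between t=2 and t=3)
```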

Relevance:

40.00%

Publisher:

Abstract:

In 2000 a Review of Current Marine Observations in relation to present and future needs was undertaken by the Inter-Agency Committee for Marine Science and Technology (IACMST). The Marine Environmental Change Network (MECN) was initiated in 2002 as a direct response to the recommendations of the report. A key part of the current phase of the MECN is to ensure that information from the network is provided to policy makers and other end-users to enable them to produce more accurate assessments of ecosystem state and gain a clearer understanding of factors influencing change in marine ecosystems. The MECN holds workshops on an annual basis, bringing together partners maintaining time-series and long-term datasets as well as end-users interested in outputs from the network. It was decided that the first workshop of the MECN continuation phase should consist of an evaluation of the time series and datasets maintained by partners in the MECN, with regard to whether they are 'fit for purpose' for answering key science questions and informing policy development. This report is based on the outcomes of the workshop. Section one of the report contains a brief introduction to monitoring, time series and long-term datasets. The various terms are defined and the need for MECN-type data to complement compliance monitoring programmes is discussed. Outlines are also given of initiatives such as the United Kingdom Marine Monitoring and Assessment Strategy (UKMMAS) and Oceans 2025. Section two contains detailed information for each of the MECN time series / long-term datasets, including information on scientific outputs and current objectives. This information is mainly based on the presentations given at the workshop and therefore follows a format whereby the following headings are addressed: origin of the time series, including original objectives; current objectives; policy relevance; products (advice, publications, science and society). Section three consists of comments made by the review panel concerning all the time series and the network. Needs or issues highlighted by the panel with regard to the future of long-term datasets and time-series in the UK are shown, along with advice and potential solutions where offered. The recommendations are divided into four categories: 'The MECN and end-user requirements'; 'Procedures & protocols'; 'Securing data series'; and 'Future developments'. Ever since marine environmental protection issues came to the fore in the 1960s, it has been recognised that there is a requirement for a suitable evidence base on environmental change in order to support policy and management for UK waters. Section four gives a brief summary of the development of marine policy in the UK, along with comments on the availability and necessity of long-term marine observations for the implementation of this policy. Policy relating to three main areas is discussed: marine conservation (protecting biodiversity and marine ecosystems), marine pollution, and fisheries. The conclusion of this section is that there has always been a specific requirement for information on long-term change in marine ecosystems around the UK in order to address concerns over pollution, fishing and general conservation. It is now imperative that this need is addressed in order for the UK to be able to fulfil its policy commitments and manage marine ecosystems in the light of climate change and other factors.

Relevance:

40.00%

Publisher:

Abstract:

We present a unique view of mackerel (Scomber scombrus) in the North Sea based on a new time series of larvae caught by the Continuous Plankton Recorder (CPR) survey from 1948 to 2005, covering the period both before and after the collapse of the North Sea stock. Hydrographic backtrack modelling suggested that the effect of advection between spawning and the capture of larvae in the CPR survey is very limited. Using a statistical technique not previously applied to CPR data, we then generated a larval index that accounts for catchability as well as spatial and temporal autocorrelation. The resulting time series documents the significant decrease in spawning from before 1970 to recent depleted levels. Spatial distributions of the larvae, and thus of the spawning area, showed a shift from early to recent decades, suggesting that the central North Sea is no longer as important as the areas further west and south. These results provide a consistent and unique perspective on the dynamics of mackerel in this region and can potentially resolve many of the outstanding questions about this stock.

Relevance:

40.00%

Publisher:

Abstract:

Interannual and seasonal trends of zooplankton abundance and species composition were compared between the Bongo net and Continuous Plankton Recorder (CPR) time series in the Gulf of Maine. Data from 5799 Bongo and 3118 CPR samples were compared for the years 1978–2006. The two programs use different sampling methods: the Bongo time series is composed of bimonthly, vertically integrated samples from locations throughout the region, while the CPR was towed monthly at 10 m depth on a transect that bisects the region. There was a significant correlation between the interannual (r = 0.67, P < 0.01) and seasonal (r = 0.95, P < 0.01) variability of total zooplankton counts. Abundance rankings of individual taxa were highly correlated, and temporal trends of the dominant copepods were similar between samplers. Multivariate analysis also showed that both time series equally detected major shifts in community structure through time. However, absolute abundance levels were higher in the Bongo samples, and temporal patterns for many of the less abundant taxa were not similar between the two devices. The different mesh sizes of the samplers probably caused some of the discrepancies, but diel migration patterns, damage to soft-bodied animals and avoidance of the small CPR aperture by some taxa likely contributed to the catch differences between the two devices. Nonetheless, the Bongo data presented here confirm the previously published patterns found in the CPR data set, and both show that the abundance increase of the 1990s has been followed by average to below-average levels from 2002 to 2006.
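The interannual comparison reported above amounts to correlating annual mean abundances from the two samplers; a minimal sketch of that calculation is below. The column names and toy values are hypothetical and are not the Bongo or CPR data analysed in the paper.

```python
# Minimal sketch of the interannual comparison: Pearson correlation of
# annual mean total zooplankton counts from two samplers. Toy data only.
import pandas as pd
from scipy.stats import pearsonr

samples = pd.DataFrame({
    "year":    [1998, 1998, 1999, 1999, 2000, 2000, 2001, 2001],
    "sampler": ["bongo", "cpr"] * 4,
    "count":   [1200, 450, 980, 360, 1500, 610, 800, 300],   # hypothetical values
})

annual = samples.pivot_table(index="year", columns="sampler",
                             values="count", aggfunc="mean")
r, p = pearsonr(annual["bongo"], annual["cpr"])
print(f"interannual correlation r = {r:.2f}, P = {p:.3f}")
```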

Relevance:

40.00%

Publisher:

Abstract:

Historical GIS has the potential to re-invigorate our use of statistics from historical censuses and related sources. In particular, areal interpolation can be used to create long-run time-series of spatially detailed data that will enable us to significantly enhance our understanding of geographical change over periods of a century or more. The difficulty with areal interpolation, however, is that the data it generates are estimates that will inevitably contain some error. This paper describes a technique that allows the automated identification of possible errors at the level of individual data values.
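To make the problem concrete, the sketch below shows simple area-weighted areal interpolation together with a basic volume-preservation check. It illustrates why interpolated values are estimates rather than observations; it is not the error-identification technique described in the paper, and all names and numbers are assumptions.

```python
# Sketch of area-weighted areal interpolation: redistribute counts reported
# for historical source zones onto modern target zones in proportion to
# overlapping area, then check that totals are preserved. Illustrative only;
# not the paper's error-detection technique.
import numpy as np

def areal_interpolate(source_counts, area_weights):
    """area_weights[i, j] = share of source zone i's area lying in target zone j;
    each row sums to 1 when the source zone is fully covered by the targets."""
    return source_counts @ area_weights

source_counts = np.array([1000.0, 400.0, 250.0])        # e.g. zone populations (hypothetical)
area_weights = np.array([[0.7, 0.3, 0.0],
                         [0.0, 0.5, 0.5],
                         [0.2, 0.0, 0.8]])
estimates = areal_interpolate(source_counts, area_weights)

# Basic consistency check: interpolation should redistribute, not create, people.
assert np.isclose(estimates.sum(), source_counts.sum())
```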