40 results for Patent data analysis
Abstract:
Binning and truncation of data are common in data analysis and machine learning. This paper addresses the problem of fitting mixture densities to multivariate binned and truncated data. The EM approach proposed by McLachlan and Jones (Biometrics, 44: 2, 571-578, 1988) for the univariate case is generalized to multivariate measurements. The multivariate solution requires the evaluation of multidimensional integrals over each bin at each iteration of the EM procedure. Naive implementation of the procedure can lead to computationally inefficient results. To reduce the computational cost a number of straightforward numerical techniques are proposed. Results on simulated data indicate that the proposed methods can achieve significant computational gains with no loss in the accuracy of the final parameter estimates. Furthermore, experimental results suggest that with a sufficient number of bins and data points it is possible to estimate the true underlying density almost as well as if the data were not binned. The paper concludes with a brief description of an application of this approach to diagnosis of iron deficiency anemia, in the context of binned and truncated bivariate measurements of volume and hemoglobin concentration from an individual's red blood cells.
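As a rough illustration of why bin integrals sit at the centre of this problem: in the univariate case the probability of a bin under a Gaussian mixture is just a difference of CDF values, and truncation is handled by renormalising over the observed range. The sketch below computes only the binned log-likelihood; it is not the authors' EM procedure, and the function names, bin edges, counts, and mixture parameters are all hypothetical.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a univariate normal."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def mixture_bin_probs(edges, weights, means, sds):
    """Probability of each bin [edges[i], edges[i+1]) under the mixture,
    renormalised so the bins sum to 1 over the truncated (observed) range."""
    raw = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mass = sum(w * (normal_cdf(hi, m, s) - normal_cdf(lo, m, s))
                   for w, m, s in zip(weights, means, sds))
        raw.append(mass)
    total = sum(raw)  # mixture mass that falls inside the truncation limits
    return [p / total for p in raw]

def binned_log_likelihood(counts, probs):
    """Multinomial log-likelihood of the bin counts, up to an additive constant."""
    return sum(n * math.log(p) for n, p in zip(counts, probs) if n > 0)

# Hypothetical data: six bins on [-2, 4] and a two-component mixture.
edges = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
counts = [5, 20, 30, 25, 15, 5]
probs = mixture_bin_probs(edges, [0.6, 0.4], [0.0, 2.0], [1.0, 1.0])
loglik = binned_log_likelihood(counts, probs)
print(probs, loglik)
```

In more than one dimension the CDF differences become integrals over rectangular bins with no closed form in general, which is the cost the abstract refers to.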
Abstract:
An increasing number of studies shows that the glycogen-accumulating organisms (GAOs) can survive and may indeed proliferate under the alternating anaerobic/aerobic conditions found in EBPR systems, thus forming a strong competitor of the polyphosphate-accumulating organisms (PAOs). Understanding their behaviors in a mixed PAO and GAO culture under various operational conditions is essential for developing operating strategies that disadvantage the growth of this group of unwanted organisms. A model-based data analysis method is developed in this paper for the study of the anaerobic PAO and GAO activities in a mixed PAO and GAO culture. The method primarily makes use of the hydrogen ion production rate and the carbon dioxide transfer rate resulting from the acetate uptake processes by PAOs and GAOs, measured with a recently developed titration and off-gas analysis (TOGA) sensor. The method is demonstrated using the data from a laboratory-scale sequencing batch reactor (SBR) operated under alternating anaerobic and aerobic conditions. The data analysis using the proposed method strongly indicates a coexistence of PAOs and GAOs in the system, which was independently confirmed by fluorescent in situ hybridization (FISH) measurement. The model-based analysis also allowed the identification of the respective acetate uptake rates by PAOs and GAOs, along with a number of kinetic and stoichiometric parameters involved in the PAO and GAO models. The excellent fit between the model predictions and the experimental data not involved in parameter identification shows that the parameter values found are reliable and accurate. It also demonstrates that the current anaerobic PAO and GAO models are able to accurately characterize the PAO/GAO mixed culture obtained in this study. 
This is of major importance as no pure culture of either PAOs or GAOs has been reported to date, and hence the current PAO and GAO models were developed for the interpretation of experimental results of mixed cultures. The proposed method is readily applicable for detailed investigations of the competition between PAOs and GAOs in enriched cultures. However, the fermentation of organic substrates carried out by ordinary heterotrophs needs to be accounted for when the method is applied to the study of PAO and GAO competition in full-scale sludges. (C) 2003 Wiley Periodicals, Inc.
Abstract:
Quantile computation has many applications including data mining and financial data analysis. It has been shown that an ε-approximate summary can be maintained so that, given a quantile query (φ, ε), the data item at rank ⌈φN⌉ may be approximately obtained within the rank error precision εN over all N data items in a data stream or in a sliding window. However, scalable online processing of massive continuous quantile queries with different φ and ε poses a new challenge because the summary is continuously updated with new arrivals of data items. In this paper, first we aim to dramatically reduce the number of distinct query results by grouping a set of different queries into a cluster so that they can be processed virtually as a single query while the precision requirements from users can be retained. Second, we aim to minimize the total query processing costs. Efficient algorithms are developed to minimize the total number of times for reprocessing clusters and to produce the minimum number of clusters, respectively. The techniques are extended to maintain near-optimal clustering when queries are registered and removed in an arbitrary fashion against whole data streams or sliding windows. In addition to theoretical analysis, our performance study indicates that the proposed techniques are indeed scalable with respect to the number of input queries as well as the number of items and the item arrival rate in a data stream.
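To make the precision guarantee concrete, the following minimal sketch checks whether a candidate answer satisfies a quantile query with rank fraction phi and error bound epsilon over N items. It assumes the target rank is the ceiling of phi times N and that any item whose rank lies within epsilon times N of that target is acceptable. The helper names are hypothetical, and the check uses an exact sorted list rather than the approximate summary or the query clustering described in the abstract.

```python
import math

def rank_bounds(phi, eps, n):
    """Inclusive range of 1-based ranks that satisfy a (phi, eps) query over n items."""
    target = math.ceil(phi * n)   # assumed reading of the target rank
    slack = math.floor(eps * n)   # allowed rank error
    return max(1, target - slack), min(n, target + slack)

def is_valid_answer(sorted_items, value, phi, eps):
    """True if `value` occupies some rank inside the allowed range."""
    lo, hi = rank_bounds(phi, eps, len(sorted_items))
    return value in sorted_items[lo - 1:hi]

# Hypothetical stream of 1000 items, already sorted for the exact check.
data = list(range(1, 1001))
print(rank_bounds(0.5, 0.01, len(data)))
print(is_valid_answer(data, 495, 0.5, 0.01))
print(is_valid_answer(data, 480, 0.5, 0.01))
```

Because every query tolerates a whole range of ranks, a single stored item can serve several queries at once, which is the slack that grouping queries into clusters can exploit.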
Abstract:
The paper investigates a Bayesian hierarchical model for the analysis of categorical longitudinal data from a large social survey of immigrants to Australia. Data for each subject are observed on three separate occasions, or waves, of the survey. One of the features of the data set is that observations for some variables are missing for at least one wave. A model for the employment status of immigrants is developed by introducing, at the first stage of a hierarchical model, a multinomial model for the response and then subsequent terms are introduced to explain wave and subject effects. To estimate the model, we use the Gibbs sampler, which allows missing data for both the response and the explanatory variables to be imputed at each iteration of the algorithm, given some appropriate prior distributions. After accounting for significant covariate effects in the model, results show that the relative probability of remaining unemployed diminished with time following arrival in Australia.
Abstract:
The importance of availability of comparable real income aggregates and their components to applied economic research is highlighted by the popularity of the Penn World Tables. Any methodology designed to achieve such a task requires the combination of data from several sources. The first is purchasing power parities (PPP) data available from the International Comparisons Project roughly every five years since the 1970s. The second is national level data on a range of variables that explain the behaviour of the ratio of PPP to market exchange rates. The final source of data is the national accounts publications of different countries which include estimates of gross domestic product and various price deflators. In this paper we present a method to construct a consistent panel of comparable real incomes by specifying the problem in state-space form. We present our completed work as well as briefly indicate our work in progress.
Abstract:
Traditionally the basal ganglia have been implicated in motor behavior, as they are involved in both the execution of automatic actions and the modification of ongoing actions in novel contexts. Corresponding to cognition, the role of the basal ganglia has not been defined as explicitly. Relative to linguistic processes, contemporary theories of subcortical participation in language have endorsed a role for the globus pallidus internus (GPi) in the control of lexical-semantic operations. However, attempts to empirically validate these postulates have been largely limited to neuropsychological investigations of verbal fluency abilities subsequent to pallidotomy. We evaluated the impact of bilateral posteroventral pallidotomy (BPVP) on language function across a range of general and high-level linguistic abilities, and validated/extended working theories of pallidal participation in language. Comprehensive linguistic profiles were compiled up to 1 month before and 3 months after BPVP in 6 subjects with Parkinson's disease (PD). Commensurate linguistic profiles were also gathered over a 3-month period for a nonsurgical control cohort of 16 subjects with PD and a group of 16 non-neurologically impaired controls (NC). Nonparametric between-groups comparisons were conducted and reliable change indices calculated, relative to baseline/3-month follow-up difference scores. Group-wise statistical comparisons between the three groups failed to reveal significant postoperative changes in language performance. Case-by-case data analysis relative to clinically consequential change indices revealed reliable alterations in performance across several language variables as a consequence of BPVP. These findings lend support to models of subcortical participation in language, which promote a role for the GPi in lexical-semantic manipulation mechanisms. 
Concomitant improvements and decrements in postoperative performance were interpreted within the context of additive and subtractive postlesional effects. Relative to parkinsonian cohorts, clinically reliable versus statistically significant changes on a case by case basis may provide the most accurate method of characterizing the way in which pathophysiologically divergent basal ganglia linguistic circuits respond to BPVP.
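The abstract does not say which reliable change formula was used. One commonly cited form is the Jacobson-Truax index, which divides an individual's pre/post difference by the standard error of the difference. The sketch below assumes that form; the scores, baseline standard deviation, and test-retest reliability are hypothetical and are not taken from the study.

```python
import math

def reliable_change_index(pre, post, sd_baseline, test_retest_r):
    """Jacobson-Truax style index (an assumed formula, not confirmed by the abstract)."""
    sem = sd_baseline * math.sqrt(1.0 - test_retest_r)  # standard error of measurement
    s_diff = math.sqrt(2.0) * sem                       # standard error of the difference
    return (post - pre) / s_diff

# Hypothetical single-subject scores before and after surgery.
rci = reliable_change_index(pre=40.0, post=31.0, sd_baseline=8.0, test_retest_r=0.85)
is_reliable = abs(rci) > 1.96  # conventional two-sided 95% threshold
print(rci, is_reliable)
```

An index of this kind is evaluated per subject, which is why individual changes can be flagged as reliable even when a group-wise comparison shows no significant effect.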
Abstract:
Observational longitudinal research is particularly useful for assessing etiology and prognosis and for providing evidence for clinical decision making. However, there are no structured reporting requirements for studies of this design to assist authors, editors, and readers. The authors developed and tested a checklist of criteria related to threats to the internal and external validity of observational longitudinal studies. The checklist criteria concerned recruitment, data collection, biases, and data analysis and descriptive issues relevant to study rationale, study population, and generalizability. Two raters independently assessed 49 randomly selected articles describing stroke research published from 1999 to 2003 in six journals: American Journal of Epidemiology, Journal of Epidemiology and Community Health, Stroke, Annals of Neurology, Archives of Physical Medicine and Rehabilitation, and American Journal of Physical Medicine and Rehabilitation. On average, 17 of the 33 checklist criteria were reported. Criteria describing the study design were better reported than those related to internal validity. No relation was found between study type (etiologic or prognostic) or word count and quality of reporting. A flow diagram for summarizing participant flow through a study was developed. Editors and authors should consider using a checklist and flow diagram when reporting on observational longitudinal research.
Abstract:
Purpose: The purpose of the study was to assess quantitative ultrasound (QUS) parameters in collegiate female gymnasts, a population whose training incorporates high-impact loading, which is particularly osteogenic, and to determine the discriminative capacity of this relatively new radiation-free technique compared with bone densitometry in a young healthy population. Methods: We studied 19 collegiate gymnasts and 23 healthy controls undergoing regular weight-bearing activity, matched for age (gymnasts 19.2 ± 1.2, controls 19.9 ± 1.6 yr) and body weight (gymnasts 56.7 ± 3.7, controls 57.7 ± 7.8 kg). QUS parameters of the calcaneus (broadband ultrasound attenuation (BUA), bone velocity (BV), and speed of sound (SOS)) were measured by a Walker Sonix UBA 575+. Bone mineral density (BMD; g·cm⁻²) of the lumbar spine, hip (femoral neck, trochanter, Ward's triangle), and whole body was assessed by dual energy x-ray absorptiometry (DXA, Hologic QDR 1000/W). Data analysis included unpaired two-tailed Student's t-tests, analysis of variance, Pearson product-moment, and Spearman rank-order correlations. Results: Regional and whole body BMD of gymnasts was greater than controls (P < 0.001), with the difference being 7-28%. Average QUS parameters of the right and left calcaneus were also higher (P < 0.001) in the gymnasts. BUA, BV, and SOS were significantly (P < 0.001) correlated to each bone site with r = 0.54-0.79. Analysis of receiver operating characteristic (ROC) curves indicated no significant difference in sensitivity and specificity for QUS and DXA measures. Conclusions: These results indicate that QUS parameters of the calcaneus are higher in young women gymnasts compared to individuals who undergo regular weight-bearing activity and that QUS parameters are able to discriminate between these two groups in a similar manner as does regional and whole body BMD.
Abstract:
We used an event-related fMRI design to study the BOLD response in Huntington’s disease (HD) patients during performance of a Simon interference task. We hypothesised that HD patients would demonstrate significantly slower RTs than controls, and that there would be significant differences in the pattern of brain activation between groups. Seventeen HD patients and 15 age- and sex-matched controls were scanned using a 3T GE scanner (FOV = 24 cm²; TE = 40 ms; TR = 3 s; FA = 60°; slice thickness = 6 mm; in-plane resolution = 1.88 × 1.88 mm²). The task involved two activation conditions, namely congruent (for example, a left-pointing arrow appearing on the left side of the screen) and incongruent (for example, a left-pointing arrow appearing on the right side of the screen), and a baseline condition. Each stimulus was presented for 2500 ms followed by a blank screen for 500 ms. Subjects were instructed to press a button using the same hand as indicated by the direction of the arrow head and were given 3000 ms to respond. Data analysis was performed using SPM2 with a random effects analysis model. For each subject, parameter estimates for combined task conditions (congruent and incongruent combined) were calculated. Comparisons such as these, based on block designs, have superior statistical power for detecting subtle changes in the BOLD response anywhere in the brain. The activations reported are significant at PFDR_corr
Abstract:
The humpback whales that migrate along the east coast of Australia were hunted to near-extinction in the 1950s and early 1960s. Two independent series of land-based surveys conducted over the last 25 years during the whales’ northward migration along the Australian coastline have demonstrated a rapid increase in the size of the population. In 2004 we conducted a survey of the migratory population as a continuation of these series of surveys. Two methods of data analysis were used in line with the previous surveys, both for calculation of absolute and relative abundance. We consider the best estimates for 2004 to be 7,090 ± 660 (95% CI) whales with an annual rate of increase of 10.6 ± 0.5% (95% CI) for 1987 – 2004. The rate of increase agrees with those previously obtained for this population and demonstrates the continuation of a strong post-exploitation recovery. While there are still some uncertainties concerning the absolute abundance estimate and structure of this population, the rate of annual increase should be independent of these and highly robust.
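For a sense of scale, the reported rate of increase can be turned into a cumulative growth factor and a doubling time. The sketch below assumes a constant annual rate applied as simple compounding over 1987 to 2004, which is a simplification of whatever model the authors fitted. The variable names are hypothetical; only the 10.6% rate and the two years come from the abstract.

```python
import math

rate = 0.106          # annual rate of increase reported in the abstract
years = 2004 - 1987   # span over which the rate was estimated

# Cumulative multiplication of the population under constant compounding.
growth_factor = (1.0 + rate) ** years

# Years needed for the population to double at this rate.
doubling_time = math.log(2.0) / math.log(1.0 + rate)

print(f"growth factor over {years} years: {growth_factor:.2f}")
print(f"doubling time: {doubling_time:.1f} years")
```

Because both quantities depend only on the rate, they are unaffected by the remaining uncertainty in the absolute abundance estimate that the abstract mentions.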