353 results for pseudo-random number generator


Relevance: 20.00%

Abstract:

In this paper we present a new method for performing Bayesian parameter inference and model choice for low count time series models with intractable likelihoods. The method involves incorporating an alive particle filter within a sequential Monte Carlo (SMC) algorithm to create a novel pseudo-marginal algorithm, which we refer to as alive SMC^2. The advantages of this approach over competing approaches are that it is naturally adaptive, it does not involve the between-model proposals required in reversible jump Markov chain Monte Carlo, and it does not rely on potentially rough approximations. The algorithm is demonstrated on Markov process and integer autoregressive moving average models applied to real biological datasets of hospital-acquired pathogen incidence, animal health time series and the cumulative number of prion disease cases in mule deer.
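
A minimal sketch of the pseudo-marginal idea this abstract builds on: a Metropolis-Hastings step in which the intractable likelihood is replaced by an unbiased particle-filter estimate. The toy Poisson count model, prior support, and particle count below are illustrative assumptions, not the paper's alive SMC^2 implementation.

```python
# Pseudo-marginal Metropolis-Hastings sketch (not the paper's alive SMC^2):
# the intractable likelihood is replaced by a bootstrap particle-filter estimate.
# The toy model, prior support and particle count are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def particle_log_likelihood(theta, y, n_particles=200):
    """Particle-filter estimate of log p(y | theta) for a toy Poisson count
    model with a latent AR(1) log-intensity (phi, sigma)."""
    phi, sigma = theta
    x = rng.normal(0.0, 1.0, n_particles)                      # initial latent states
    log_lik = 0.0
    for obs in y:
        x = phi * x + sigma * rng.normal(size=n_particles)     # propagate particles
        w = np.exp(obs * x - np.exp(x))                         # unnormalised Poisson weights
        if w.sum() == 0:
            return -np.inf
        log_lik += np.log(w.mean())
        x = rng.choice(x, size=n_particles, p=w / w.sum())      # resample
    return log_lik

def pseudo_marginal_mh(y, theta0, n_iter=2000, step=0.05):
    theta = np.array(theta0, float)
    ll = particle_log_likelihood(theta, y)
    samples = []
    for _ in range(n_iter):
        prop = theta + step * rng.normal(size=theta.size)
        if 0 < prop[0] < 1 and prop[1] > 0:                     # crude uniform prior support
            ll_prop = particle_log_likelihood(prop, y)
            if np.log(rng.uniform()) < ll_prop - ll:            # accept/reject on noisy estimates
                theta, ll = prop, ll_prop
        samples.append(theta.copy())
    return np.array(samples)

y = rng.poisson(lam=2.0, size=50)                                # stand-in count series
draws = pseudo_marginal_mh(y, theta0=(0.5, 0.5))
```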

Relevance: 20.00%

Abstract:

The proliferation of the web presents an unsolved problem of automatically analyzing billions of pages of natural language. We introduce a scalable algorithm that clusters hundreds of millions of web pages into hundreds of thousands of clusters. It does this on a single mid-range machine using efficient algorithms and compressed document representations. It is applied to two web-scale crawls covering tens of terabytes: ClueWeb09 and ClueWeb12 contain 500 and 733 million web pages, respectively, and were clustered into 500,000 to 700,000 clusters. To the best of our knowledge, such fine-grained clustering has not been previously demonstrated. Previous approaches clustered only a sample, which limits the maximum number of discoverable clusters. The proposed EM-tree algorithm uses the entire collection in clustering and produces several orders of magnitude more clusters than existing algorithms. Fine-grained clustering is necessary for meaningful clustering in massive collections where the number of distinct topics grows linearly with collection size. These fine-grained clusters show improved cluster quality when assessed with two novel evaluations using ad hoc search relevance judgments and spam classifications for external validation. These evaluations solve the problem of assessing the quality of clusters where categorical labeling is unavailable or infeasible.
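
The EM-tree algorithm itself is not reproduced here, but the sketch below illustrates the general recipe the abstract relies on: clustering a large document collection on one machine using compressed (hashed) representations and mini-batch k-means. The toy corpus, feature width, and cluster count are assumptions.

```python
# Illustrative single-machine clustering of documents using hashed (compressed)
# representations and mini-batch k-means. This is not the EM-tree algorithm;
# corpus, dimensionality and cluster count are toy assumptions.
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.cluster import MiniBatchKMeans

docs = ["web scale clustering", "random projections compress documents",
        "spam pages form their own clusters", "ad hoc search relevance"] * 250

# Fixed-width hashed features keep memory bounded regardless of vocabulary size.
X = HashingVectorizer(n_features=2**12, alternate_sign=False).fit_transform(docs)

# Mini-batch updates let the whole collection stream through a mid-range machine.
km = MiniBatchKMeans(n_clusters=8, batch_size=256, random_state=0).fit(X)
print(km.labels_[:10])
```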

Relevance: 20.00%

Abstract:

A measurement campaign was conducted from 3 to 19 December 2012 at an urban site in Brisbane, Australia. Size distributions of ions and particle number concentrations were measured to investigate the influence of particle formation and biomass burning on atmospheric ion and particle concentrations. Overall ion concentrations during the measurement period were −1.2 × 10³ cm⁻³ (negative ions) and +1.6 × 10³ cm⁻³ (positive ions), and the overall particle number concentration was 4.4 × 10³ cm⁻³. The results of correlation analysis between concentrations of ions and nitrogen oxides indicated that positive and negative ions originated from similar sources, and that vehicle exhaust emissions had a more significant influence on intermediate/large ions, while cluster ions rapidly attached to larger particles once emitted into the atmosphere. Diurnal variations in ion concentration suggested the enrichment of intermediate and large ions on new particle formation event days, indicating that they were involved in the particle formation processes. Elevated total ion concentrations, particularly of larger ions, and particle number concentrations were found during biomass burning episodes. This could be due to the attachment of cluster ions onto accumulation mode particles or the production of charged particles from biomass burning, which were in turn transported to the measurement site. The results of this work enhance scientific understanding of the sources of atmospheric ions in an urban environment, as well as their interactions with particles during particle formation processes.

Relevance: 20.00%

Abstract:

This paper presents an uncertainty quantification study of the performance analysis of the high pressure ratio single-stage radial-inflow turbine used in the Sundstrand Power Systems T-100 Multi-purpose Small Power Unit. A deterministic 3D volume-averaged Computational Fluid Dynamics (CFD) solver is coupled with a non-statistical generalized Polynomial Chaos (gPC) representation based on a pseudo-spectral projection method. One advantage of this approach is that it does not require any modification of the CFD code for the propagation of random disturbances in the aerodynamic and geometric fields. The stochastic results highlight the importance of the blade thickness and trailing edge tip radius for the total-to-static efficiency of the turbine, compared to the angular velocity and trailing edge tip length. From a theoretical point of view, the use of the gPC representation on an arbitrary grid also allows investigation of the sensitivity of the turbine efficiency to the blade thickness profile. The gPC approach is also applied to coupled random parameters. The results show that the most influential coupled random variables are the trailing edge tip radius coupled with the angular velocity.
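
As a rough illustration of the non-intrusive pseudo-spectral projection mentioned above, the sketch below evaluates a black-box model at Gauss-Legendre quadrature nodes and projects the outputs onto a Legendre basis for a single uniform random input. The placeholder solver function and the polynomial order are assumptions standing in for the CFD evaluation of turbine efficiency.

```python
# Minimal non-intrusive pseudo-spectral gPC sketch: one uncertain input with a
# Legendre basis for a uniform variable on [-1, 1]. The "solver" is a stand-in
# for an expensive deterministic model run; nothing here reproduces the paper.
import numpy as np
from numpy.polynomial.legendre import leggauss, legval

def solver(xi):
    """Placeholder for a deterministic model evaluation at input xi."""
    return np.exp(0.3 * xi) + 0.1 * xi**2

order = 6
nodes, weights = leggauss(order + 1)                  # Gauss-Legendre quadrature
evals = np.array([solver(x) for x in nodes])           # one model run per node

# Projection: c_k = (2k+1)/2 * sum_j w_j f(x_j) P_k(x_j)
coeffs = np.array([
    (2 * k + 1) / 2.0 * np.sum(weights * evals *
                               legval(nodes, np.eye(order + 1)[k]))
    for k in range(order + 1)
])

mean = coeffs[0]                                       # E[f] for a U(-1,1) input
variance = np.sum(coeffs[1:] ** 2 / (2 * np.arange(1, order + 1) + 1))
print(mean, variance)
```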

Relevance: 20.00%

Abstract:

Objective: This paper presents an automatic active learning-based system for the extraction of medical concepts from clinical free-text reports. Specifically, we determine (1) the contribution of active learning in reducing the annotation effort, and (2) the robustness of an incremental active learning framework across different selection criteria and datasets. Materials and methods: The comparative performance of an active learning framework and a fully supervised approach was investigated to study how active learning reduces the annotation effort while achieving the same effectiveness as a supervised approach. Conditional Random Fields were used as the supervised method, with least confidence and information density as the two selection criteria for the active learning framework. The effect of incremental learning vs. standard learning on the robustness of the models within the active learning framework with different selection criteria was also investigated. Two clinical datasets were used for evaluation: the i2b2/VA 2010 NLP challenge and the ShARe/CLEF 2013 eHealth Evaluation Lab. Results: The annotation effort saved by active learning to achieve the same effectiveness as supervised learning is up to 77%, 57%, and 46% of the total number of sequences, tokens, and concepts, respectively. Compared to the random sampling baseline, the saving is at least doubled. Discussion: Incremental active learning guarantees robustness across all selection criteria and datasets. The reduction of annotation effort is always above the random sampling and longest sequence baselines. Conclusion: Incremental active learning is a promising approach for building effective and robust medical concept extraction models, while significantly reducing the burden of manual annotation.
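
A minimal sketch of a least-confidence active-learning loop, with a logistic regression classifier standing in for the CRF sequence labeller described above; the synthetic dataset, seed size, and number of annotation rounds are illustrative assumptions.

```python
# Least-confidence active-learning loop sketch. A logistic regression classifier
# stands in for the CRF sequence labeller; the synthetic pool and seed sizes
# are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
labelled = list(range(20))                 # small seed set of "annotated" items
pool = list(range(20, len(X)))             # unlabelled pool

model = LogisticRegression(max_iter=1000)
for _ in range(10):                        # ten annotation rounds
    model.fit(X[labelled], y[labelled])
    probs = model.predict_proba(X[pool])
    confidence = probs.max(axis=1)         # confidence of the most likely label
    query = pool[int(np.argmin(confidence))]   # least-confident instance
    labelled.append(query)                 # simulate asking the annotator
    pool.remove(query)

print(model.score(X, y))
```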

Relevance: 20.00%

Abstract:

Cerebral responses to alternating periods of a control task and a selective letter generation paradigm were investigated with functional Magnetic Resonance Imaging (fMRI). Subjects selectively generated letters from four designated sets of six letters of the English alphabet, with the instruction that they were not to produce letters in alphabetical order (either forward or backward), nor to repeat or alternate letters. Performance during this condition was compared with that of a control condition in which subjects recited the same letters in alphabetical order. Analyses revealed significant and extensive foci of activation in a number of cerebral regions, including mid-dorsolateral frontal cortex, inferior frontal gyrus, precuneus, supramarginal gyrus, and cerebellum, during the selective letter generation condition. These findings are discussed with respect to recent positron emission tomography (PET) and fMRI studies of verbal working memory and encoding/retrieval in episodic memory.

Relevance: 20.00%

Abstract:

As connectivity analyses become more popular, claims are often made about how the brain's anatomical networks depend on age, sex, or disease. It is unclear how results depend on the tractography methods used to compute fiber networks. We applied 11 tractography methods to high angular resolution diffusion images of the brain (4-Tesla 105-gradient HARDI) from 536 healthy young adults. We parcellated 70 cortical regions, yielding 70×70 connectivity matrices encoding fiber density. We computed popular graph theory metrics, including network efficiency and characteristic path length. Both metrics were robust to the number of spherical harmonics used to model diffusion (4th-8th order). Age effects were detected only for networks computed with the probabilistic Hough transform method, which excludes smaller fibers. Sex and total brain volume affected networks measured with deterministic, tensor-based fiber tracking but not with the Hough method. Each tractography method includes different fibers, which affects inferences made about the reconstructed networks.
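
A small sketch of the graph metrics named above (characteristic path length and global efficiency) computed from a weighted connectivity matrix with networkx; the random symmetric matrix and sparsification threshold are stand-ins for the 70×70 fiber-density matrices.

```python
# Graph metrics from a weighted connectivity matrix. The random symmetric
# matrix below stands in for a 70x70 fiber-density matrix.
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)
n = 70
W = rng.random((n, n))
W = np.triu(W, 1) + np.triu(W, 1).T          # symmetric, zero diagonal
W[W < 0.7] = 0                                # sparsify: keep the strongest "connections"

G = nx.from_numpy_array(W)
# Treat edge weight as connection strength; use its inverse as a distance.
for u, v, d in G.edges(data=True):
    d["length"] = 1.0 / d["weight"]

largest = G.subgraph(max(nx.connected_components(G), key=len))
char_path_length = nx.average_shortest_path_length(largest, weight="length")
global_efficiency = nx.global_efficiency(G)   # networkx computes this unweighted
print(char_path_length, global_efficiency)
```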

Relevance: 20.00%

Abstract:

For the first decade of its existence, the concept of citizen journalism has described an approach which was seen as a broadening of the participant base in journalistic processes, but still involved only a comparatively small subset of overall society – for the most part, citizen journalists were news enthusiasts and “political junkies” (Coleman, 2006) who, as some exasperated professional journalists put it, “wouldn’t get a job at a real newspaper” (The Australian, 2007), but nonetheless followed many of the same journalistic principles. The investment – if not of money, then at least of time and effort – involved in setting up a blog or participating in a citizen journalism Website remained substantial enough to prevent the majority of Internet users from engaging in citizen journalist activities to any significant extent; what emerged in the form of news blogs and citizen journalism sites was a new online elite which for some time challenged the hegemony of the existing journalistic elite, but gradually also merged with it. The mass adoption of next-generation social media platforms such as Facebook and Twitter, however, has led to the emergence of a new wave of quasi-journalistic user activities which now much more closely resemble the “random acts of journalism” which JD Lasica envisaged in 2003. Social media are not exclusively or even predominantly used for citizen journalism; instead, citizen journalism is now simply a by-product of user communities engaging in exchanges about the topics which interest them, or tracking emerging stories and events as they happen. Such platforms – and especially Twitter with its system of ad hoc hashtags that enable the rapid exchange of information about issues of interest – provide spaces for users to come together to “work the story” through a process of collaborative gatewatching (Bruns, 2005), content curation, and information evaluation which takes place in real time and brings together everyday users, domain experts, journalists, and potentially even the subjects of the story themselves. Compared to the spaces of news blogs and citizen journalism sites, but also of conventional online news Websites, which are controlled by their respective operators and inherently position user engagement as a secondary activity to content publication, these social media spaces are centred around user interaction, providing a third-party space in which everyday as well as institutional users, laypeople as well as experts converge without being able to control the exchange. Drawing on a number of recent examples, this article will argue that this results in a new dynamic of interaction and enables the emergence of a more broadly-based, decentralised, second wave of citizen engagement in journalistic processes.

Relevance: 20.00%

Abstract:

Product reviews are the foremost source of information for customers and manufacturers, helping them make appropriate purchasing and production decisions. Natural language data is typically very sparse: the most common words are those that do not carry much semantic content, occurrences of any particular content-bearing word are rare, and co-occurrences of these words are rarer still. With the growth of e-commerce, mining product aspects along with their corresponding opinions, known as Aspect-Based Opinion Mining (ABOM), has become essential, and the need for automatic mining of reviews has grown accordingly. In this work, we treat ABOM as a sequence labelling problem and propose a supervised extraction method to identify product aspects and the corresponding opinions. We use Conditional Random Fields (CRFs) to solve the extraction problem and propose a feature function to enhance accuracy. The proposed method is evaluated using two different datasets. We also evaluate the effectiveness of the feature function and the optimisation through multiple experiments.
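
A minimal sketch of CRF-based sequence labelling for aspect/opinion extraction, assuming the third-party sklearn-crfsuite package; the two toy sentences, BIO tags, and word-level features are illustrative and are not the feature function proposed in the paper.

```python
# CRF sequence-labelling sketch for aspect/opinion extraction, assuming the
# sklearn-crfsuite package. Sentences, BIO tags and features are toy examples.
import sklearn_crfsuite

def word_features(sent, i):
    w = sent[i]
    return {
        "lower": w.lower(),
        "is_title": w.istitle(),
        "prev": sent[i - 1].lower() if i > 0 else "<BOS>",
        "next": sent[i + 1].lower() if i < len(sent) - 1 else "<EOS>",
    }

sents = [["The", "battery", "life", "is", "great"],
         ["Terrible", "screen", "quality"]]
tags = [["O", "B-ASPECT", "I-ASPECT", "O", "B-OPINION"],
        ["B-OPINION", "B-ASPECT", "I-ASPECT"]]

X = [[word_features(s, i) for i in range(len(s))] for s in sents]
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=50)
crf.fit(X, tags)
print(crf.predict(X))
```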

Relevance: 20.00%

Abstract:

This research investigated the use of DNA fingerprinting to characterise the bacterium Streptococcus pneumoniae, or pneumococcus, and hence gain insight into the development of new vaccines or antibiotics. Different bacterial DNA fingerprinting methods were studied, and a novel method was developed and validated which characterises the different cell coatings that pneumococci produce. This method was used to study the epidemiology of pneumococci in Queensland before and after the introduction of the current pneumococcal vaccine. The study demonstrated that pneumococcal disease is highly prevalent in children under four years of age, that the bacterium can 'switch' its cell coating to evade the vaccine, and that some DNA fingerprinting methods are more discriminatory than others. This has an impact on understanding which strains are more prone to cause invasive disease. The research findings have been published in high-impact, internationally refereed journals.

Relevance: 20.00%

Abstract:

Objective: We examined whether exposure to a greater number of fruits, vegetables, and noncore foods (i.e., foods that are nutrient-poor and high in saturated fats, added sugars, or added salt) at age 14 months was related to children's preference for and intake of these foods, as well as maternal-reported food fussiness and measured child weight status, at age 3.7 years. Methods: This study reports secondary analyses of longitudinal data from mothers and children (n=340) participating in the NOURISH randomized controlled trial. Exposure was quantified as the number of food items (n=55) tried by a child from specified lists at age 14 months. At age 3.7 years, food preferences, intake patterns, and fussiness (also at age 14 months) were assessed using maternal-completed, established questionnaires. Child weight and length/height were measured by study staff at both age points. Multivariable linear regression models were tested to predict food preferences, intake patterns, fussy eating, and body mass index z score at age 3.7 years, adjusting for a range of maternal and child covariates. Results: Having tried a greater number of vegetables, fruits, and noncore foods at age 14 months predicted corresponding preferences and higher intakes at age 3.7 years but did not predict child body mass index z score. Adjusting for fussiness at age 14 months, having tried more vegetables at age 14 months was associated with lower fussiness at age 3.7 years. Conclusions: These prospective analyses support the hypothesis that early taste and texture experiences influence subsequent food preferences and acceptance. The findings indicate that introducing a variety of fruits and vegetables, and limiting noncore food exposure, from an early age are important strategies for improving later diet quality.
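
A sketch of the kind of adjusted multivariable linear regression described above, fitted with statsmodels on synthetic data; the variable names, covariates, and simulated effects are assumptions for illustration only.

```python
# Adjusted multivariable linear regression sketch with statsmodels.
# Variable names and the simulated data frame are illustrative assumptions.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 340
df = pd.DataFrame({
    "veg_tried_14m": rng.integers(0, 15, n),      # exposure at 14 months
    "fussiness_14m": rng.normal(3, 1, n),         # baseline covariate
    "maternal_age": rng.normal(32, 5, n),
    "child_bmiz_14m": rng.normal(0, 1, n),
})
df["veg_preference_3y"] = (0.2 * df["veg_tried_14m"]
                           - 0.3 * df["fussiness_14m"]
                           + rng.normal(0, 1, n))

model = smf.ols("veg_preference_3y ~ veg_tried_14m + fussiness_14m"
                " + maternal_age + child_bmiz_14m", data=df).fit()
print(model.params["veg_tried_14m"], model.pvalues["veg_tried_14m"])
```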

Relevance: 20.00%

Abstract:

The inverse temperature hyperparameter of the hidden Potts model governs the strength of spatial cohesion and therefore has a substantial influence over the resulting model fit. The difficulty arises from the dependence of an intractable normalising constant on the value of the inverse temperature; thus there is no closed-form solution for sampling from the distribution directly. We review three computational approaches for addressing this issue, namely pseudolikelihood, path sampling, and the approximate exchange algorithm. We compare the accuracy and scalability of these methods in a simulation study.
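
A minimal sketch of the first of the three approaches reviewed, pseudolikelihood estimation of the inverse temperature, for a small Potts-style label image; the lattice size, number of labels, torus boundary, and simulated labelling are toy assumptions.

```python
# Pseudolikelihood sketch for the inverse temperature beta of a Potts-style
# lattice. Lattice size, label count and the simulated labelling are toy
# assumptions; this is one of the three approaches the abstract reviews.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
q, side = 3, 32
z = rng.integers(0, q, size=(side, side))        # stand-in label image

def neighbour_counts(z, q):
    """For each pixel, count 4-connected neighbours carrying each label."""
    counts = np.zeros(z.shape + (q,))
    for shift, axis in [(1, 0), (-1, 0), (1, 1), (-1, 1)]:
        nb = np.roll(z, shift, axis=axis)         # torus wrap keeps the sketch short
        for k in range(q):
            counts[..., k] += (nb == k)
    return counts

counts = neighbour_counts(z, q)
own = np.take_along_axis(counts, z[..., None], axis=-1)[..., 0]

def neg_pseudo_loglik(beta):
    # log p(z_i | neighbours) = beta * n_i(z_i) - log sum_k exp(beta * n_i(k))
    log_norm = np.log(np.exp(beta * counts).sum(axis=-1))
    return -(beta * own - log_norm).sum()

beta_hat = minimize_scalar(neg_pseudo_loglik, bounds=(0.0, 3.0), method="bounded").x
print(beta_hat)
```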

Relevance: 20.00%

Abstract:

This research investigated differences and associations in performance in number processing and executive function for children attending primary school in a large Australian metropolitan city. In a cross-sectional study, 25 children in the first full-time year of school (Prep; mean age = 5.5 years) and 21 children in Year 3 (mean age = 8.5 years) completed three number processing tasks and three executive function tasks. Year 3 children consistently outperformed the Prep year children on measures of accuracy and reaction time on the tasks of number comparison, calculation, shifting, and inhibition, but not on number line estimation. The components of executive function (shifting, inhibition, and working memory) showed different patterns of correlation with performance on number processing tasks across the early years of school. The findings could be used to enhance teachers' understanding of the role of the cognitive processes employed by children in numeracy learning, and so inform teachers' classroom practices.

Relevance: 20.00%

Abstract:

Traffic congestion has been a growing issue in many metropolitan areas in recent years, which necessitates identifying its key contributors and developing sustainable strategies to help decrease its adverse impacts on traffic networks. Road incidents in general, and crashes in particular, have been acknowledged as the cause of a large proportion of travel delays in urban areas and account for 25% to 60% of traffic congestion on motorways. Identifying the critical determinants of travel delays is of significant importance to incident management systems, which constantly collect and store incident duration data. This study investigates the individual and simultaneous differential effects of the relevant determinants on motorway crash duration probabilities. In particular, it applies parametric Accelerated Failure Time (AFT) hazard-based models to develop in-depth insights into how crash-specific characteristics and the associated temporal and infrastructural determinants affect the duration. AFT models with both fixed and random parameters were calibrated on one year of traffic crash records from two major Australian motorways in South East Queensland, and the differential effects of determinants on crash survival functions were studied on these two motorways individually. A comprehensive spectrum of commonly used parametric fixed-parameter AFT models, including the generalized gamma and generalized F families, was compared to random-parameter AFT structures in terms of goodness of fit to the duration data, and as a result the random-parameter Weibull AFT model was selected as the most appropriate model. Significant determinants of motorway crash duration included traffic diversion requirement, crash injury type, number and type of vehicles involved in a crash, day of week and time of day, towing support requirement, and damage to infrastructure. A major finding of this research is that the motorways under study are significantly different in terms of crash durations, such that one motorway exhibits durations that are on average 19% shorter than those on the other. The differential effects of explanatory variables on crash durations are also different on the two motorways. The presented analysis confirms that treating the motorway network as a whole and neglecting the individual differences between roads can lead to erroneous interpretations of duration and inefficient strategies for mitigating travel delays along a particular motorway.
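
A sketch of a fixed-parameter Weibull AFT fit to incident-duration data using the lifelines package; the synthetic covariates and simulated durations below are stand-ins for the crash records described above, and the random-parameter extension is not shown.

```python
# Fixed-parameter Weibull AFT fit sketch with the lifelines package.
# The synthetic covariates stand in for crash characteristics; the
# random-parameter extension from the abstract is not shown.
import numpy as np
import pandas as pd
from lifelines import WeibullAFTFitter

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "towing_required": rng.integers(0, 2, n),
    "injury_crash": rng.integers(0, 2, n),
    "vehicles_involved": rng.integers(1, 5, n),
    "peak_hour": rng.integers(0, 2, n),
})
# Durations (minutes) lengthened by towing and injuries in this toy data.
scale = np.exp(3.0 + 0.4 * df["towing_required"] + 0.3 * df["injury_crash"])
df["duration"] = scale * rng.weibull(1.5, n)

aft = WeibullAFTFitter()
aft.fit(df, duration_col="duration")       # remaining columns enter as covariates
aft.print_summary()
```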