992 results for computational statistics
Abstract:
The Guardian's reportage of the United Kingdom Member of Parliament (MP) expenses scandal of 2009 used crowdsourcing and computational journalism techniques. Computational journalism can be broadly defined as the application of computer science techniques to the activities of journalism. Its foundation lies in computer-assisted reporting techniques, and its importance is increasing due to: (a) the increasing availability of large-scale government datasets for scrutiny; (b) the declining cost, increasing power and ease of use of data mining and filtering software, and of Web 2.0; and (c) the explosion of online public engagement and opinion. This paper provides a case study of the Guardian MP expenses scandal reportage and reveals some key challenges and opportunities for digital journalism. Firstly, it finds that journalists may increasingly take an active role in understanding, interpreting, verifying and reporting clues or conclusions that arise from the interrogation of datasets (computational journalism). Secondly, a distinction should be made between information reportage and computational journalism in the digital realm, just as a distinction might be made between citizen reporting and citizen journalism. Thirdly, an opportunity exists for online news providers to take a ‘curatorial’ role, selecting and making easily available the best data sources for readers to use (information reportage). These activities have always been fundamental to journalism; however, the way in which they are undertaken may change. The findings may suggest opportunities and challenges for the implementation of computational journalism techniques in practice by digital Australian media providers, and further areas of research.
Abstract:
There are at least four key challenges in the online news environment that computational journalism may address. Firstly, news providers operate in a rapidly evolving environment, and larger businesses are typically slower to adapt to market innovations. Secondly, news consumption patterns have changed, and news providers need to find new ways to capture and retain digital users. Meanwhile, declining financial performance has led to cost cuts in mass market newspapers. Finally, investigative reporting is typically slow, costly and sometimes tedious, and yet it is valuable to the reputation of a news provider. Computational journalism involves the application of software and technologies to the activities of journalism, and it draws from the fields of computer science, social science and communications. New technologies may enhance the traditional aims of journalism, or may require “a new breed of people who are midway between technologists and journalists” (Irfan Essa in Mecklin 2009: 3). Historically referred to as ‘computer assisted reporting’, the use of software in online reportage is increasingly valuable due to three factors: larger datasets are becoming publicly available; software is becoming sophisticated and ubiquitous; and the Australian digital economy is developing. This paper introduces key elements of computational journalism: it describes why it is needed, what it involves, and its benefits and challenges, and it provides a case study and examples. When correctly used, computational techniques can quickly provide a solid factual basis for original investigative journalism and may increase interaction with readers. Computational journalism is a major opportunity to enhance the delivery of original investigative journalism, which ultimately may attract and retain readers online.
Abstract:
The health of tollbooth workers is seriously threatened by long-term exposure to polluted air from vehicle exhausts. Using traffic data collected at a toll plaza, vehicle movements were simulated by a system dynamics model with different traffic volumes and toll collection procedures. This allowed the average travel time of vehicles to be calculated. A three-dimensional Computational Fluid Dynamics (CFD) model was used with a k–ε turbulence model to simulate pollutant dispersion at the toll plaza for different traffic volumes and toll collection procedures. It was shown that pollutant concentration around tollbooths increases as traffic volume increases. Whether traffic volume is low or high (1500 vehicles/h or 2500 vehicles/h), pollutant concentration decreases if electronic toll collection (ETC) is adopted. In addition, pollutant concentration around tollbooths decreases as the proportion of ETC-equipped vehicles increases. However, if the proportion of ETC-equipped vehicles is very low and the traffic volume is not heavy, then pollutant concentration increases as the number of ETC lanes increases.
Abstract:
Recently, the numerical modelling and simulation of fractional partial differential equations (FPDE), which have found wide application in modern engineering and science, have been attracting increasing attention. The current dominant numerical method for modelling FPDE is the explicit Finite Difference Method (FDM), which is based on a pre-defined grid and therefore has inherent shortcomings. This paper aims to develop an implicit meshless approach based on radial basis functions (RBF) for the numerical simulation of time fractional diffusion equations. The discrete system of equations is obtained by using the RBF meshless shape functions and the strong form of the equations. The stability and convergence of this meshless approach are then discussed and theoretically proven. Several numerical examples with different problem domains are used to validate the newly developed meshless formulation and to investigate its accuracy and efficiency. The results obtained by the meshless formulation are also compared with those obtained by FDM in terms of accuracy and efficiency. It is concluded that the present meshless formulation is very effective for the modelling and simulation of FPDE.
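For orientation, a time fractional diffusion equation is commonly written in the form below. This is a generic textbook form, not necessarily the one used in the paper. The symbols $u$, $\kappa$, $f$ and the order $\alpha$ are assumptions introduced here, and the fractional derivative is taken in the Caputo sense.

```latex
\[
\frac{\partial^{\alpha} u(\mathbf{x},t)}{\partial t^{\alpha}}
  = \kappa \,\nabla^{2} u(\mathbf{x},t) + f(\mathbf{x},t),
\qquad 0 < \alpha < 1,
\]
\[
\frac{\partial^{\alpha} u(\mathbf{x},t)}{\partial t^{\alpha}}
  = \frac{1}{\Gamma(1-\alpha)}
    \int_{0}^{t} (t-s)^{-\alpha}\,
    \frac{\partial u(\mathbf{x},s)}{\partial s}\,\mathrm{d}s .
\]
```

As $\alpha \to 1$ the first equation approaches the classical diffusion equation. For $\alpha < 1$ the integral makes the time derivative depend on the whole solution history.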
Abstract:
Methicillin-resistant Staphylococcus aureus (MRSA) is a pathogen that continues to be of major concern in hospitals. We develop models and computational schemes based on observed weekly incidence data to estimate MRSA transmission parameters. We extend the deterministic model of McBryde, Pettitt, and McElwain (2007, Journal of Theoretical Biology 245, 470–481) involving an underlying population of MRSA colonized patients and health-care workers that describes, among other processes, transmission between uncolonized patients and colonized health-care workers and vice versa. We develop new bivariate and trivariate Markov models to include incidence so that estimated transmission rates can be based directly on new colonizations rather than indirectly on prevalence. Imperfect sensitivity of pathogen detection is modeled using a hidden Markov process. The advantages of our approach are that (i) the number of colonized health-care workers is assumed to be discrete valued, (ii) two transmission parameters can be incorporated into the likelihood, (iii) the likelihood depends on the number of new cases, which improves precision of inference, (iv) individual patient records are not required, and (v) the possibility of imperfect detection of colonization is incorporated. We compare our approach with that used by McBryde et al. (2007), which is based on an approximation that eliminates the health-care workers from the model and uses Markov chain Monte Carlo and individual patient data. We apply these models to MRSA colonization data collected in a small intensive care unit at the Princess Alexandra Hospital, Brisbane, Australia.
Abstract:
For many decades, correlation and the power spectrum have been primary tools for digital signal processing applications in the biomedical area. The information contained in the power spectrum is essentially that of the autocorrelation sequence, which is sufficient for a complete statistical description of Gaussian signals of known mean. However, there are practical situations where one needs to look beyond the autocorrelation of a signal to extract information regarding deviation from Gaussianity and the presence of phase relations. Higher order spectra, also known as polyspectra, are spectral representations of higher order statistics, i.e. moments and cumulants of third order and beyond. HOS (higher order statistics or higher order spectra) can detect deviations from linearity, stationarity or Gaussianity in the signal. Most biomedical signals are non-linear, non-stationary and non-Gaussian in nature, and it can therefore be more advantageous to analyze them with HOS than with second order correlations and power spectra. In this paper we discuss the application of HOS to different bio-signals. HOS methods of analysis are explained using a typical heart rate variability (HRV) signal, and applications to other signals are reviewed.
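The point that third-order statistics expose deviations from Gaussianity can be sketched with a sample estimate of the third-order cumulant. This is an illustration under stated assumptions and is not taken from the paper: the function name `third_cumulant`, the synthetic signals and the sample size are all hypothetical. The bispectrum is the two-dimensional Fourier transform of this cumulant, which the sketch does not compute.

```python
import random


def third_cumulant(x, lag1=0, lag2=0):
    """Sample third-order cumulant c3(lag1, lag2) for non-negative lags.

    After the mean is removed, the third-order cumulant equals the
    third-order moment E[z(t) * z(t + lag1) * z(t + lag2)].
    """
    n = len(x)
    mean = sum(x) / n
    z = [v - mean for v in x]
    m = n - max(lag1, lag2)  # number of usable products
    return sum(z[t] * z[t + lag1] * z[t + lag2] for t in range(m)) / m


# Hypothetical test signals: one Gaussian, one skewed (exponential).
rng = random.Random(0)
gaussian = [rng.gauss(0.0, 1.0) for _ in range(20000)]
skewed = [rng.expovariate(1.0) for _ in range(20000)]

c3_gauss = third_cumulant(gaussian)  # close to 0 for a Gaussian signal
c3_skew = third_cumulant(skewed)     # close to 2 for a unit-rate exponential
```

Two signals can share the same autocorrelation yet differ in this quantity, which is why second-order tools alone cannot separate them.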
Abstract:
Scoliosis is a spinal deformity that requires surgical correction in progressive cases. In order to optimize surgical outcomes, patient-specific finite element models are being developed by our group. In this paper, a single rod anterior correction procedure is simulated for a group of six scoliosis patients. For each patient, personalised model geometry was derived from low-dose CT scans, and clinically measured intra-operative corrective forces were applied. However, tissue material properties were not patient-specific, being derived from existing literature. Clinically, the patient group had a mean initial Cobb angle of 47.3 degrees, which was corrected to 17.5 degrees after surgery. The mean simulated post-operative Cobb angle for the group was 18.1 degrees. Although this represents good agreement between clinical and simulated corrections at the group level, the discrepancy between clinical and simulated Cobb angle for individual patients varied between -10.3 and +8.6 degrees, with only three of the six patients matching the clinical result to within the accepted Cobb measurement error of ±5 degrees. The results of this study suggest that spinal tissue material properties play an important role in governing the correction obtained during surgery, and that patient-specific modelling approaches must address the question of how to prescribe patient-specific soft tissue properties for spine surgery simulation.
Abstract:
A computational fluid dynamics (CFD) analysis has been performed for a flat plate photocatalytic reactor using the CFD code FLUENT. Under the simulated conditions (Reynolds number Re of around 2650), a detailed time-accurate computation shows the different stages of flow evolution and the effects of the finite length of the reactor in creating flow instability, which is important for improving the performance of the reactor for stormwater and wastewater reuse. The efficiency of a photocatalytic reactor for pollutant decontamination depends on reactor hydrodynamics and configuration. This study aims to investigate the role of different parameters in optimizing the reactor design for improved performance. In this regard, more modelling and experimental efforts are ongoing to better understand the interplay of the parameters that influence the performance of the flat plate photocatalytic reactor.
Abstract:
The hydrodynamic behaviour of a novel flat plate photocatalytic reactor for water treatment is investigated using the CFD code FLUENT. The reactor consists of a reactive section that features negligible pressure drop and uniform illumination of the photocatalyst to ensure enhanced photocatalytic efficiency. The numerical simulations allowed the identification of several design issues in the original reactor, which include extensive boundary layer separation near the photocatalyst support and regions of flow recirculation that render a significant portion of the reactive area ineffective. The simulations reveal that these issues could be addressed by selecting appropriate inlet positions and configurations. This modification causes minimal pressure drop across the reactive zone and achieves a significantly more uniform distribution of the tested pollutant on the photocatalyst surface. The influence of the type of roughness element has also been studied with a view to identifying its role in the distribution of pollutant concentration on the photocatalyst surface. The results presented here indicate that the flow and pollutant concentration fields strongly depend on the geometric parameters and flow conditions.
Abstract:
Many of the classification algorithms developed in the machine learning literature, including the support vector machine and boosting, can be viewed as minimum contrast methods that minimize a convex surrogate of the 0–1 loss function. The convexity makes these algorithms computationally efficient. The use of a surrogate, however, has statistical consequences that must be balanced against the computational virtues of convexity. To study these issues, we provide a general quantitative relationship between the risk as assessed using the 0–1 loss and the risk as assessed using any nonnegative surrogate loss function. We show that this relationship gives nontrivial upper bounds on excess risk under the weakest possible condition on the loss function—that it satisfies a pointwise form of Fisher consistency for classification. The relationship is based on a simple variational transformation of the loss function that is easy to compute in many applications. We also present a refined version of this result in the case of low noise, and show that in this case, strictly convex loss functions lead to faster rates of convergence of the risk than would be implied by standard uniform convergence arguments. Finally, we present applications of our results to the estimation of convergence rates in function classes that are scaled convex hulls of a finite-dimensional base class, with a variety of commonly used loss functions.
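The idea of a convex surrogate can be sketched by writing each loss as a function of the margin y·f(x). The hinge loss (support vector machine) and exponential loss (boosting) correspond to the algorithms named in the abstract. The logistic loss, the function names and the margin grid are assumptions added here for illustration, not taken from the paper.

```python
import math


def zero_one(margin):
    """0-1 loss on the margin y*f(x); a margin of 0 is counted as an error."""
    return 1.0 if margin <= 0 else 0.0


def hinge(margin):
    """Hinge loss, the surrogate minimized by the support vector machine."""
    return max(0.0, 1.0 - margin)


def exponential(margin):
    """Exponential loss, the surrogate minimized by AdaBoost-style boosting."""
    return math.exp(-margin)


def logistic(margin):
    """Logistic loss in base 2, scaled so that it equals 1 at margin 0."""
    return math.log2(1.0 + math.exp(-margin))


# Check on a grid of margins that each surrogate upper-bounds the 0-1 loss.
margins = [m / 4.0 for m in range(-12, 13)]
surrogates = (hinge, exponential, logistic)
dominates = all(loss(m) >= zero_one(m) for m in margins for loss in surrogates)
```

Each surrogate is convex in the margin and lies on or above the 0-1 loss, so minimizing it is tractable. The abstract's results quantify how much excess 0-1 risk can remain after the surrogate risk has been driven down.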
Abstract:
This important work describes recent theoretical advances in the study of artificial neural networks. It explores probabilistic models of supervised learning problems, and addresses the key statistical and computational questions. Chapters survey research on pattern classification with binary-output networks, including a discussion of the relevance of the Vapnik-Chervonenkis dimension, and of estimates of the dimension for several neural network models. In addition, Anthony and Bartlett develop a model of classification by real-output networks, and demonstrate the usefulness of classification with a "large margin." The authors explain the role of scale-sensitive versions of the Vapnik-Chervonenkis dimension in large margin classification, and in real prediction. Key chapters also discuss the computational complexity of neural network learning, describing a variety of hardness results, and outlining two efficient, constructive learning algorithms. The book is self-contained and accessible to researchers and graduate students in computer science, engineering, and mathematics.