996 results for approximate membership extraction


Relevance:

20.00% 20.00%

Publisher:

Abstract:

We identify relation completion (RC) as one recurring problem that is central to the success of novel big data applications such as Entity Reconstruction and Data Enrichment. Given a semantic relation, RC attempts to link entity pairs between two entity lists under the relation. To accomplish the RC goals, we propose to formulate search queries for each query entity α based on some auxiliary information, so as to detect its target entity β from the set of retrieved documents. For instance, a pattern-based method (PaRE) uses extracted patterns as the auxiliary information in formulating search queries. However, high-quality patterns may decrease the probability of finding suitable target entities. As an alternative, we propose the CoRE method, which uses context terms learned surrounding the expression of a relation as the auxiliary information in formulating queries. The experimental results based on several real-world web data collections demonstrate that CoRE reaches a much higher accuracy than PaRE for the purpose of RC.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This chapter examines two core dimensions of women’s gendered experiences of mining in Australia and more specifically in Western Australia (WA). First, the chapter explores what has been and continues to be women’s principal relationship to mining encapsulated in the social and cultural identity of the ‘mining wife’ and, more recently, ‘fly-in/fly-out (FIFO) wife’. Second, the chapter addresses the fraught emergence of women as mineworkers. As the research presented in this chapter makes clear, the human cost of developmentalism was and continues to be deeply gendered.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper examines the properties of various approximation methods for solving stochastic dynamic programs in structural estimation problems. The problem addressed is evaluating the expected value of the maximum of available choices. The paper shows that approximating this by the maximum of expected values frequently has poor properties. It also shows that choosing a convenient distributional assumption for the errors and then solving exactly conditional on that assumption leads to small approximation errors even if the distribution is misspecified. © 1997 Cambridge University Press.
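
The gap between the expected maximum and the maximum of expected values can be illustrated with a small numerical sketch. This is not the paper's procedure: the choice values, the standard normal "true" errors, and the function names `emax_monte_carlo` and `emax_logit` are assumptions made for illustration. The closed form uses mean-zero Gumbel errors whose scale is matched to unit variance, one example of a convenient distributional assumption.

```python
import math
import random


def emax_monte_carlo(values, draws=100_000, seed=0):
    """Estimate E[max_j (v_j + e_j)] by simulation, with e_j ~ N(0, 1) (assumed truth)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        total += max(v + rng.gauss(0.0, 1.0) for v in values)
    return total / draws


def emax_logit(values, scale=math.sqrt(6.0) / math.pi):
    """Closed-form E[max] under mean-zero Gumbel errors with the given scale.

    The default scale gives the errors unit variance, matching the normal case.
    The result is scale * logsumexp(values / scale).
    """
    scaled = [v / scale for v in values]
    top = max(scaled)
    return scale * (top + math.log(sum(math.exp(s - top) for s in scaled)))


if __name__ == "__main__":
    values = [1.0, 1.2, 0.5]  # hypothetical deterministic choice values
    print("max of expected values:", max(values))
    print("simulated expected max:", emax_monte_carlo(values))
    print("Gumbel closed form    :", emax_logit(values))
```

In this toy setting the maximum of expected values ignores the option value created by the errors, so it understates the simulated expected maximum, while the closed form under the (misspecified) Gumbel assumption lands much closer.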

Relevance:

20.00% 20.00%

Publisher:

Abstract:

A set system $(X, F)$ with $X = \{x_1, \ldots, x_m\}$ and $F = \{B_1, \ldots, B_n\}$, where $B_i \subseteq X$, is called an $(n, m)$ cover-free set system (or CF set system) if for any $1 \le i, j, k \le n$ and $j \ne k$, $|B_i| > 2\,|B_j \cap B_k| + 1$. In this paper, we show that CF set systems can be used to construct anonymous membership broadcast schemes (or AMB schemes), allowing a center to broadcast a secret identity among a set of users in such a way that the users can verify whether or not the broadcast message contains their valid identity. Our goal is to construct (n, m) CF set systems in which, for given m, the value n is as large as possible. We give two constructions for CF set systems, the first from error-correcting codes and the other from combinatorial designs. We link CF set systems to the concept of cover-free family studied by Erdős et al. in the early 1980s to derive bounds on the parameters of CF set systems. We also discuss some possible extensions of the current work, motivated by different applications.
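
The defining condition can be checked by brute force on small examples. The sketch below assumes the inequality reads |B_i| > 2|B_j ∩ B_k| + 1 with a strict sign, as the extracted abstract suggests; the function name `is_cover_free` and the sample blocks are hypothetical and do not come from the paper.

```python
def is_cover_free(blocks):
    """Return True if |B_i| > 2*|B_j & B_k| + 1 for all i, j, k with j != k.

    `blocks` is an iterable of collections over the ground set X.
    The strict inequality is an assumption about the intended definition.
    """
    sets = [frozenset(b) for b in blocks]
    n = len(sets)
    for i in range(n):
        for j in range(n):
            for k in range(n):
                if j == k:
                    continue
                if not len(sets[i]) > 2 * len(sets[j] & sets[k]) + 1:
                    return False
    return True


if __name__ == "__main__":
    # Pairwise disjoint blocks of size 2 satisfy the condition: 2 > 2*0 + 1.
    print(is_cover_free([{1, 2}, {3, 4}, {5, 6}]))
    # Overlapping blocks of size 2 violate it: 2 > 2*1 + 1 is false.
    print(is_cover_free([{1, 2}, {2, 3}]))
```

The triple loop costs O(n^3) set intersections, which is adequate for verifying small hand-built systems but not for searching for large ones.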

Relevance:

20.00% 20.00%

Publisher:

Abstract:

An anonymous membership broadcast scheme is a method in which a sender broadcasts the secret identity of one out of a set of n receivers, in such a way that only the right receiver knows that he is the intended receiver, while the others cannot determine any information about this identity (except that they know that they are not the intended ones). In a w-anonymous membership broadcast scheme no coalition of up to w receivers, not containing the selected receiver, is able to determine any information about the identity of the selected receiver. We present two new constructions of w-anonymous membership broadcast schemes. The first construction is based on error-correcting codes, and we show that there exist schemes that allow a flexible choice of w while keeping the complexities for broadcast communication, user storage and required randomness polynomial in log n. The second construction is based on the concept of collision-free arrays, which is introduced in this paper. The construction results in more flexible schemes, allowing trade-offs between different complexities.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The spatiotemporal dynamics of an alien species invasion across a real landscape are typically complex. While surveillance is an essential part of a management response, planning surveillance in space and time presents a difficult challenge due to this complexity. We show here a method for determining the highest probability sites for occupancy across a landscape at an arbitrary point in the future, based on occupancy data from a single slice in time. We apply the method to the invasion of Giant Hogweed, a serious weed in the Czech Republic and throughout Europe.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Sparse optical flow algorithms, such as the Lucas-Kanade approach, provide more robustness to noise than dense optical flow algorithms and are the preferred approach in many scenarios. Sparse optical flow algorithms estimate the displacement for a selected number of pixels in the image. These pixels can be chosen randomly. However, pixels in regions with more variance between the neighbours will produce more reliable displacement estimates. The selected pixel locations should therefore be chosen wisely. In this study, the suitability of Harris corners, Shi-Tomasi's “Good features to track”, SIFT and SURF interest point extractors, Canny edges, and random pixel selection for the purpose of frame-by-frame tracking using a pyramidal Lucas-Kanade algorithm is investigated. The evaluation considers the important factors of processing time, feature count, and feature trackability in indoor and outdoor scenarios using ground vehicles and unmanned aerial vehicles, and for the purpose of visual odometry estimation.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this paper we present a novel scheme for improving speaker diarization by making use of repeating speakers across multiple recordings within a large corpus. We call this technique speaker re-diarization and demonstrate that it is possible to reuse the initial speaker-linked diarization outputs to boost diarization accuracy within individual recordings. We first propose and evaluate two novel re-diarization techniques. We demonstrate their complementary characteristics and fuse the two techniques to successfully conduct speaker re-diarization across the SAIVT-BNEWS corpus of Australian broadcast data. This corpus contains recurring speakers in various independent recordings that need to be linked across the dataset. We show that our speaker re-diarization approach can provide a relative improvement of 23% in diarization error rate (DER), over the original diarization results, as well as improve the estimated number of speakers and the cluster purity and coverage metrics.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Commercially viable carbon-neutral biodiesel production from microalgae has potential for replacing depleting petroleum diesel. The process of biodiesel production from microalgae involves harvesting, drying and extraction of lipids, which are energy- and cost-intensive processes. The development of effective large-scale lipid extraction processes which overcome the complexity of microalgae cell structure is considered one of the most vital requirements for commercial production. Thus the aim of this work was to investigate suitable extraction methods with optimised conditions to progress opportunities for sustainable microalgal biodiesel production. In this study, the green microalgal species consortium, Tarong polyculture, was used to investigate lipid extraction with hexane (solvent) under high pressure and variable temperature and biomass moisture conditions using an Accelerated Solvent Extraction (ASE) method. The performance of high pressure solvent extraction was examined over a range of different process and sample conditions (dry biomass to water ratios (DBWRs): 100%, 75%, 50% and 25%; temperatures from 70 to 120 °C; process time 5–15 min). Maximum total lipid yields were achieved at 50% and 75% sample dryness at temperatures of 90–120 °C. We show that individual fatty acids (Palmitic acid C16:0; Stearic acid C18:0; Oleic acid C18:1; Linolenic acid C18:3) extraction optima are influenced by temperature and sample dryness, consequently affecting microalgal biodiesel quality parameters. Higher heating values and kinematic viscosity were compliant with biodiesel quality standards under all extraction conditions used. Our results indicate that biodiesel quality can be positively manipulated by selecting process extraction conditions that favour extraction of saturated and mono-unsaturated fatty acids over optimal extraction conditions for polyunsaturated fatty acids, yielding positive effects on cetane number and iodine values. Exceeding biodiesel standards for these two parameters opens blending opportunities with biodiesels that fall outside the minimal cetane and maximal iodine values.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Organic compounds in Australian coal seam gas produced water (CSG water) are poorly understood despite their environmental contamination potential. In this study, the presence of some organic substances is identified from government-held CSG water-quality data from the Bowen and Surat Basins, Queensland. These records revealed the presence of polycyclic aromatic hydrocarbons (PAHs) in 27% of samples of CSG water from the Walloon Coal Measures at concentrations <1 µg/L, and it is likely these compounds leached from in situ coals. PAHs identified from wells include naphthalene, phenanthrene, chrysene and dibenz[a,h]anthracene. In addition, the likelihood of coal-derived organic compounds leaching to groundwater is assessed by undertaking toxicity leaching experiments using coal rank and water chemistry as variables. These tests suggest higher molecular weight PAHs (including benzo[a]pyrene) leach from higher rank coals, whereas lower molecular weight PAHs leach at greater concentrations from lower rank coal. Some of the identified organic compounds have carcinogenic or health risk potential, but they are unlikely to be acutely toxic at the observed concentrations which are almost negligible (largely due to the hydrophobicity of such compounds). Hence, this study will be useful to practitioners assessing CSG water related environmental and health risk.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Quantifying the impact of biochemical compounds on collective cell spreading is an essential element of drug design, with various applications including developing treatments for chronic wounds and cancer. Scratch assays are a technically simple and inexpensive method used to study collective cell spreading; however, most previous interpretations of scratch assays are qualitative and do not provide estimates of the cell diffusivity, D, or the cell proliferation rate, λ. Estimating D and λ is important for investigating the efficacy of a potential treatment and provides insight into the mechanism through which the potential treatment acts. While a few methods for estimating D and λ have been proposed, these previous methods lead to point estimates of D and λ, and provide no insight into the uncertainty in these estimates. Here, we compare various types of information that can be extracted from images of a scratch assay, and quantify D and λ using discrete computational simulations and approximate Bayesian computation. We show that it is possible to robustly recover estimates of D and λ from synthetic data, as well as a new set of experimental data. For the first time, our approach also provides a method to estimate the uncertainty in our estimates of D and λ. We anticipate that our approach can be generalized to deal with more realistic experimental scenarios in which we are interested in estimating D and λ, as well as additional relevant parameters such as the strength of cell-to-cell adhesion or the strength of cell-to-substrate adhesion.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

‘Approximate Bayesian Computation’ (ABC) represents a powerful methodology for the analysis of complex stochastic systems for which the likelihood of the observed data under an arbitrary set of input parameters may be entirely intractable – the latter condition rendering useless the standard machinery of tractable likelihood-based, Bayesian statistical inference [e.g. conventional Markov chain Monte Carlo (MCMC) simulation]. In this paper, we demonstrate the potential of ABC for astronomical model analysis by application to a case study in the morphological transformation of high-redshift galaxies. To this end, we develop, first, a stochastic model for the competing processes of merging and secular evolution in the early Universe, and secondly, through an ABC-based comparison against the observed demographics of massive (M_gal > 10^11 M⊙) galaxies (at 1.5 < z < 3) in the Cosmic Assembly Near-IR Deep Extragalactic Legacy Survey (CANDELS)/Extended Groth Strip (EGS) data set we derive posterior probability densities for the key parameters of this model. The ‘Sequential Monte Carlo’ implementation of ABC exhibited herein, featuring both a self-generating target sequence and self-refining MCMC kernel, is amongst the most efficient of contemporary approaches to this important statistical algorithm. We highlight as well through our chosen case study the value of careful summary statistic selection, and demonstrate two modern strategies for assessment and optimization in this regard. Ultimately, our ABC analysis of the high-redshift morphological mix returns tight constraints on the evolving merger rate in the early Universe and favours major merging (with disc survival or rapid reformation) over secular evolution as the mechanism most responsible for building up the first generation of bulges in early-type discs.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Analytically or computationally intractable likelihood functions can arise in complex statistical inferential problems making them inaccessible to standard Bayesian inferential methods. Approximate Bayesian computation (ABC) methods address such inferential problems by replacing direct likelihood evaluations with repeated sampling from the model. ABC methods have been predominantly applied to parameter estimation problems and less to model choice problems due to the added difficulty of handling multiple model spaces. The ABC algorithm proposed here addresses model choice problems by extending Fearnhead and Prangle (2012, Journal of the Royal Statistical Society, Series B 74, 1–28) where the posterior mean of the model parameters estimated through regression formed the summary statistics used in the discrepancy measure. An additional stepwise multinomial logistic regression is performed on the model indicator variable in the regression step and the estimated model probabilities are incorporated into the set of summary statistics for model choice purposes. A reversible jump Markov chain Monte Carlo step is also included in the algorithm to increase model diversity for thorough exploration of the model space. This algorithm was applied to a validating example to demonstrate the robustness of the algorithm across a wide range of true model probabilities. Its subsequent use in three pathogen transmission examples of varying complexity illustrates the utility of the algorithm in inferring preference of particular transmission models for the pathogens.
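
The core idea of replacing likelihood evaluations with repeated sampling from the model can be shown with the simplest ABC variant, rejection sampling. The sketch below is not the algorithm proposed in the abstract (it has no regression step, no model choice, and no reversible jump move); the binomial toy model, the uniform prior, the tolerance, and the names `simulate` and `abc_rejection` are all assumptions chosen for illustration.

```python
import random


def simulate(p, n, rng):
    """Draw the number of successes in n Bernoulli(p) trials (toy model)."""
    return sum(rng.random() < p for _ in range(n))


def abc_rejection(observed, n, draws=20_000, tolerance=2, seed=1):
    """Return prior draws of p whose simulated data fall within `tolerance`
    of the observed success count. No likelihood is ever evaluated."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(draws):
        p = rng.random()  # draw from the assumed Uniform(0, 1) prior
        if abs(simulate(p, n, rng) - observed) <= tolerance:
            accepted.append(p)
    return accepted


if __name__ == "__main__":
    posterior = abc_rejection(observed=30, n=100)
    print("accepted draws:", len(posterior))
    print("approximate posterior mean:", sum(posterior) / len(posterior))
```

Here the success count is itself the summary statistic and the absolute difference is the discrepancy measure. Shrinking the tolerance tightens the approximation at the cost of a lower acceptance rate, which is the trade-off that more elaborate ABC schemes try to manage.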

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The strain data acquired from structural health monitoring (SHM) systems play an important role in the state monitoring and damage identification of bridges. Due to the environmental complexity of civil structures, a better understanding of the actual strain data will help fill the gap between theoretical or laboratory results and practical application. In this study, the multi-scale features of strain response are first revealed after extensive investigation of the actual data from two typical long-span bridges. Results show that strain types at the three typical temporal scales of 10^5, 10^2 and 10^0 sec are caused by temperature change, trains and heavy trucks, and have their respective cut-off frequencies in the order of 10^-2, 10^-1 and 10^0 Hz. Multi-resolution analysis and wavelet shrinkage are applied for separating and extracting these strain types. During the above process, two methods for determining thresholds are introduced. The excellent ability of the wavelet transform to perform simultaneous time-frequency analysis leads to effective information extraction. After extraction, the strain data will be compressed at an attractive ratio. This research may contribute to a further understanding of actual strain data of long-span bridges; also, the proposed extraction methodology is applicable to actual SHM systems.
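
Wavelet shrinkage can be sketched in a few lines: decompose the signal, soft-threshold the detail coefficients, and reconstruct. The abstract does not say which wavelet or threshold rules were used, so the Haar wavelet, the fixed threshold, and the function names below are assumptions for illustration only.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(signal):
    """One level of the orthonormal Haar transform; the length must be even."""
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert one level of the orthonormal Haar transform."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return out


def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by `threshold`."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]


def wavelet_shrink(signal, levels, threshold):
    """Multi-level Haar decomposition with soft-thresholded details.

    The signal length must be divisible by 2**levels.
    """
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_forward(approx)
        details.append(soft_threshold(detail, threshold))
    for detail in reversed(details):
        approx = haar_inverse(approx, detail)
    return approx


if __name__ == "__main__":
    # Hypothetical signal: a constant level of 1.0 plus small alternating noise.
    noisy = [1.0 + 0.1 * (-1) ** i for i in range(8)]
    print(wavelet_shrink(noisy, levels=1, threshold=0.2))
```

Coefficients that shrink to zero need not be stored, which is how the same step also yields compression. Separating components at different temporal scales would correspond to keeping or discarding whole decomposition levels rather than thresholding within them.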