207 results for FRESH EXTRACTION SITES


Relevance: 20.00%

Abstract:

This paper presents a new active learning query strategy for information extraction, called Domain Knowledge Informativeness (DKI). Active learning is often used to reduce the amount of annotation effort required to obtain training data for machine learning algorithms. A key component of an active learning approach is the query strategy, which is used to iteratively select samples for annotation. Knowledge resources have been used in information extraction as a means to derive additional features for sample representation. DKI is, however, the first query strategy that exploits such resources to inform sample selection. To evaluate the merits of DKI, in particular the reduction in annotation effort that it achieves, we conduct a comprehensive empirical comparison of active learning query strategies for information extraction within the clinical domain. The clinical domain was chosen for this work because of the availability of extensive structured knowledge resources, which have often been exploited for feature generation. In addition, the clinical domain offers a compelling use case for active learning because of the high costs and hurdles associated with obtaining annotations in this domain. Our experimental findings demonstrate that 1) amongst existing query strategies, the ones based on the classification model’s confidence are a better choice for clinical data as they perform equally well with a much lighter computational load, and 2) significant reductions in annotation effort are achievable by exploiting knowledge resources within active learning query strategies, with up to 14% fewer tokens and concepts to manually annotate than with state-of-the-art query strategies.
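To make the idea of a confidence-based query strategy concrete, the following is a minimal sketch of least-confidence sampling, one common strategy of that family. It is not DKI and not the authors' implementation; the function names, the sample ids and the predicted probabilities are all hypothetical.

```python
def least_confidence(probabilities):
    """Uncertainty score: 1 minus the probability of the most likely label."""
    return 1.0 - max(probabilities)


def select_queries(pool, batch_size):
    """Return the ids of the `batch_size` samples the model is least sure about.

    `pool` maps a sample id to the model's predicted label distribution.
    """
    ranked = sorted(pool, key=lambda sid: least_confidence(pool[sid]), reverse=True)
    return ranked[:batch_size]


# Hypothetical predicted label distributions for four unlabelled sentences.
pool = {
    "s1": [0.98, 0.01, 0.01],
    "s2": [0.40, 0.35, 0.25],
    "s3": [0.70, 0.20, 0.10],
    "s4": [0.34, 0.33, 0.33],
}

# The two most uncertain samples would be sent to the annotator next.
queries = select_queries(pool, batch_size=2)
print(queries)
```

Scoring only needs the model's existing predictions, which is one plausible reason such strategies carry a light computational load.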

Relevance: 20.00%

Abstract:

An automated method for extracting brain volumes from three commonly acquired three-dimensional (3D) MR images (proton density, T1-weighted, and T2-weighted) of the human head is described. The procedure is divided into four levels: preprocessing, segmentation, scalp removal, and postprocessing. A user-provided reference point is the sole operator-dependent input required. The method's parameters were first optimized and then fixed and applied to 30 repeat data sets from 15 normal older adult subjects to investigate its reproducibility. Percent differences between total brain volumes (TBVs) for the subjects' repeated data sets ranged from 0.5% to 2.2%. We conclude that the method is both robust and reproducible and has the potential for wide application.

Relevance: 20.00%

Abstract:

We are currently facing overwhelming growth in the number of reliable information sources on the Internet. The quantity of information available to everyone via the Internet is growing dramatically each year [15]. At the same time, the temporal and cognitive resources of human users are not changing, which causes the phenomenon of information overload. The World Wide Web is one of the main sources of information for decision makers (reference to my research). However, our studies show that, at least in Poland, decision makers see some important problems when turning to the Internet as a source of decision information. One of the most common obstacles raised is the distribution of relevant information among many sources, and therefore the need to visit different Web sources in order to collect all important content and analyze it. A few research groups have recently turned to the problem of information extraction from the Web [13]. Most effort so far has been directed toward collecting data from dispersed databases accessible via web pages (referred to as data extraction or information extraction from the Web) and toward understanding natural language texts by means of fact, entity, and association recognition (referred to as information extraction). Data extraction efforts show some interesting results; however, proper integration of web databases is still beyond us. The information extraction field has recently been very successful in retrieving information from natural language texts; however, it still lacks the ability to understand more complex information that requires common sense knowledge, discourse analysis and disambiguation techniques.

Relevance: 20.00%

Abstract:

We present an empirical evaluation and comparison of two content extraction methods in HTML: absolute XPath expressions and relative XPath expressions. We argue that the relative XPath expressions, although not widely used, should be used in preference to absolute XPath expressions in extracting content from human-created Web documents. Evaluation of robustness covers four thousand queries executed on several hundred webpages. We show that in referencing parts of real world dynamic HTML documents, relative XPath expressions are on average significantly more robust than absolute XPath ones.
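The robustness difference can be illustrated with a small sketch using Python's standard `xml.etree.ElementTree`, which supports a subset of XPath. The two page snippets and both expressions are hypothetical, and the exact definitions of "absolute" and "relative" used in the paper may differ from this simplified reading (a full positional path from the root versus a path anchored on a stable attribute).

```python
import xml.etree.ElementTree as ET

# Hypothetical page at the time the expressions were written.
ORIGINAL = """<html><body>
<div><p>Menu</p></div>
<div><p id="price">42 EUR</p></div>
</body></html>"""

# The same page after a banner is inserted above the existing content.
CHANGED = """<html><body>
<div><p>Banner</p></div>
<div><p>Menu</p></div>
<div><p id="price">42 EUR</p></div>
</body></html>"""

# ElementTree evaluates paths from the root element, so the absolute
# XPath /html/body/div[2]/p[1] is written here as ./body/div[2]/p[1].
ABSOLUTE = "./body/div[2]/p[1]"
# Anchored on an attribute rather than on positions along the full path.
RELATIVE = ".//p[@id='price']"


def extract(markup, path):
    """Return the text of the first node matching `path`, or None."""
    node = ET.fromstring(markup).find(path)
    return None if node is None else node.text


for name, page in (("original", ORIGINAL), ("changed", CHANGED)):
    print(name, extract(page, ABSOLUTE), extract(page, RELATIVE))
```

In this toy case the positional path silently returns the wrong paragraph once the layout shifts, while the attribute-anchored path still finds the intended one.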

Relevance: 20.00%

Abstract:

In recent years, both developing and industrialised societies have experienced riots and civil unrest over the corporate exploitation of fresh water. Water conflicts increase as water scarcity rises and the unsustainable use of fresh water will continue to have profound implications for sustainable development and the realisation of human rights. Rather than states adopting more costly water conservation strategies or implementing efficient water technologies, corporations are exploiting natural resources in what has been described as the “privatization of water”. By using legal doctrines, states and corporations construct fresh water sources as something that can be owned or leased. For some regions, the privatization of water has enabled corporations and corrupt states to exploit a fundamental human right. Arguing that such matters are of relevance to criminology, which should be concerned with fundamental environmental and human rights, this article adopts a green criminological perspective and draws upon Treadmill of Production theory.

Relevance: 20.00%

Abstract:

Blasting is an integral part of large-scale open cut mining that often occurs in close proximity to population centers and often results in the emission of particulate material and gases potentially hazardous to health. Current air quality monitoring methods rely on limited numbers of fixed sampling locations to validate a complex fluid environment and collect sufficient data to confirm model effectiveness. This paper describes the development of a methodology to address the need for a more precise approach that is capable of characterizing blasting plumes in near-real time. The integration of the system required the modification and integration of an opto-electrical dust sensor, SHARP GP2Y10, into a small fixed-wing and multi-rotor copter, resulting in the collection of data streamed during flight. The paper also describes the calibration of the optical sensor against an industry-grade dust-monitoring device, Dusttrak 8520, demonstrating a high correlation between them, with coefficients of determination (R²) greater than 0.9. The laboratory and field tests demonstrate the feasibility of coupling the sensor with the UAVs. However, further work must be done in the areas of sensor selection and calibration as well as flight planning.
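As an illustration of how such a calibration might be checked, the sketch below fits a straight line between raw readings from a low-cost sensor and readings from a reference instrument, then reports R². The linear model, the function name and all the numbers are assumptions made for the example; the abstract does not describe the calibration procedure in this detail.

```python
def linear_calibration(raw, reference):
    """Least-squares line reference = slope * raw + intercept, plus R^2."""
    n = len(raw)
    mean_x = sum(raw) / n
    mean_y = sum(reference) / n
    sxx = sum((x - mean_x) ** 2 for x in raw)
    syy = sum((y - mean_y) ** 2 for y in reference)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(raw, reference))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a simple linear fit, R^2 equals the squared Pearson correlation.
    r2 = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r2


# Hypothetical paired readings: sensor output (V) and reference dust level (mg/m^3).
raw_volts = [0.9, 1.2, 1.6, 2.1, 2.5]
reference_mg_m3 = [0.05, 0.10, 0.18, 0.27, 0.35]

slope, intercept, r2 = linear_calibration(raw_volts, reference_mg_m3)
print(f"slope={slope:.4f} intercept={intercept:.4f} R2={r2:.4f}")
```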

Relevance: 20.00%

Abstract:

Employees’ safety climate perceptions dictate their safety behavior because individuals act based on their perceptions of reality. Extensive empirical research in applied psychology has confirmed this relationship. However, few efforts have been made in construction research to investigate the factors contributing to a favorable safety climate. As an initial effort to address the knowledge gap, this paper examines factors contributing to a psychological safety climate, an operationalization of a safety climate at the individual level, and, hence, the basic element of a safety climate at higher levels. A multiperspective framework of contributors to a psychological safety climate is estimated by a structural equation modeling technique using individual questionnaire responses from a random sample of construction project personnel. The results inform management of three routes to psychological safety climate: a client’s proactive involvement in safety management, a workforce-friendly workplace created by the project team, and transformational supervisors’ communication about safety matters with the workforce. This paper contributes to the field of construction engineering and management by highlighting a broader contextual influence in a systematic formation of psychological safety climate perceptions.

Relevance: 20.00%

Abstract:

Purpose: The purpose of this paper is to highlight and promote fresh thinking in services marketing research. Design/methodology/approach: The topic of the special issue was deliberately chosen to encourage fresh ideas and concepts that will move the discipline forward. The accepted papers have been categorised for ease and convenience of reading by scholars and practitioners, with a short commentary on each category. Findings: There is a wealth of forward-thinking by service(s) marketing researchers that bodes well for the future of the sub-discipline. Research limitations/implications: The special issue does not address fresh thinking in all areas of services marketing research. Other potential areas for fresh thinking are identified. Originality/value: New thinking in a scholarly field is necessary to propel the discipline forward.

Relevance: 20.00%

Abstract:

Molecular phylogenetic studies of homologous sequences of nucleotides often assume that the underlying evolutionary process was globally stationary, reversible, and homogeneous (SRH), and that a model of evolution with one or more site-specific and time-reversible rate matrices (e.g., the GTR rate matrix) is enough to accurately model the evolution of data over the whole tree. However, an increasing body of data suggests that evolution under these conditions is an exception, rather than the norm. To address this issue, several non-SRH models of molecular evolution have been proposed, but they either ignore heterogeneity in the substitution process across sites (HAS) or assume it can be modeled accurately using the Γ distribution. As an alternative to these models of evolution, we introduce a family of mixture models that approximate HAS without the assumption of an underlying predefined statistical distribution. This family of mixture models is combined with non-SRH models of evolution that account for heterogeneity in the substitution process across lineages (HAL). We also present two algorithms for searching model space and identifying an optimal model of evolution that is less likely to over- or underparameterize the data. The performance of the two new algorithms was evaluated using alignments of nucleotides with 10 000 sites simulated under complex non-SRH conditions on a 25-tipped tree. The algorithms were found to be very successful, identifying the correct HAL model with a 75% success rate (the average success rate for assigning rate matrices to the tree's 48 edges was 99.25%) and, for the correct HAL model, identifying the correct HAS model with a 98% success rate. Finally, parameter estimates obtained under the correct HAL-HAS model were found to be accurate and precise.
The merits of our new algorithms were illustrated with an analysis of 42 337 second codon sites extracted from a concatenation of 106 alignments of orthologous genes encoded by the nuclear genomes of Saccharomyces cerevisiae, S. paradoxus, S. mikatae, S. kudriavzevii, S. castellii, S. kluyveri, S. bayanus, and Candida albicans. Our results show that second codon sites in the ancestral genome of these species contained 49.1% invariable sites, 39.6% variable sites belonging to one rate category (V1), and 11.3% variable sites belonging to a second rate category (V2). The ancestral nucleotide content was found to differ markedly across these three sets of sites, and the evolutionary processes operating at the variable sites were found to be non-SRH and best modeled by a combination of eight edge-specific rate matrices (four for V1 and four for V2). The number of substitutions per site at the variable sites also differed markedly, with sites belonging to V1 evolving more slowly than those belonging to V2 along the lineages separating the seven species of Saccharomyces. Finally, sites belonging to V1 appeared to have ceased evolving along the lineages separating S. cerevisiae, S. paradoxus, S. mikatae, S. kudriavzevii, and S. bayanus, implying that they might have become so selectively constrained that they could be considered invariable sites in these species.

Relevance: 20.00%

Abstract:

A method for the determination of tricyclazole in water using solid phase extraction (SPE) and high performance liquid chromatography (HPLC) with UV detection at 230 nm and a mobile phase of acetonitrile:water (20:80, v/v) was developed. A performance comparison between two types of solid phase sorbents, the C18 sorbent of the Supelclean ENVI-18 cartridge and the styrene-divinyl benzene copolymer sorbent of the Sep-Pak PS2-Plus cartridge, was conducted. The Sep-Pak PS2-Plus cartridges were found to be more suitable for extracting tricyclazole from water samples than the Supelclean ENVI-18 cartridges. For this cartridge, both methanol and ethyl acetate produced good results. The method was validated with good linearity and with a limit of detection of 0.008gL-1 for a 500-fold concentration through the SPE procedure. The recoveries of the method were stable at 80% and the precision ranged from 1.1% to 6.0% within the range of fortified concentrations. The validated method was also applied to measure the concentrations of tricyclazole in real paddy water.

Relevance: 20.00%

Abstract:

Past research has suggested that social engineering poses the most significant security risk. Recent studies have suggested that social networking sites (SNSs) are the most common source of social engineering attacks. The risk of social engineering attacks in SNSs is associated with the difficulty of making accurate judgments regarding source credibility in the virtual environment of SNSs. In this paper, we quantitatively investigate source credibility dimensions in terms of social engineering on Facebook, as well as the source characteristics that influence Facebook users to judge an attacker as credible, therefore making them susceptible to victimization. Moreover, in order to predict users’ susceptibility to social engineering victimization based on their demographics, we investigate the effectiveness of source characteristics on different demographic groups by measuring the consent intentions and behavior responses of users to social engineering requests using a role-play experiment.

Relevance: 20.00%

Abstract:

Frog protection has become increasingly essential due to the rapid decline of frog biodiversity. Therefore, it is valuable to develop new methods for studying this biodiversity. In this paper, a novel feature extraction method is proposed based on perceptual wavelet packet decomposition for classifying frog calls in noisy environments. Pre-processing and syllable segmentation are first applied to the frog call. Then, a spectral peak track is extracted from each syllable if possible. Track duration, dominant frequency and oscillation rate are directly extracted from the track. With the k-means clustering algorithm, the calculated dominant frequencies of all frog species are clustered into k parts, which produce a frequency scale for wavelet packet decomposition. Based on the adaptive frequency scale, wavelet packet decomposition is applied to the frog calls. Using the wavelet packet decomposition coefficients, a new feature set named perceptual wavelet packet decomposition sub-band cepstral coefficients is extracted. Finally, a k-nearest neighbour (k-NN) classifier is used for the classification. The experimental results show that the proposed features can achieve an average classification accuracy of 97.45%, which outperforms syllable features (86.87%) and Mel-frequency cepstral coefficients (MFCCs) features (90.80%).
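Two of the steps described above, clustering dominant frequencies into k groups and classifying with k-NN, can be sketched in plain Python. This is only an illustration under simplifying assumptions: the wavelet packet decomposition and cepstral feature extraction are omitted, the initialisation of the cluster centres is a deterministic choice made for the example, and every frequency, feature vector and species label is hypothetical.

```python
from collections import Counter


def kmeans_1d(values, k, iterations=50):
    """Cluster scalar values into k groups; return the sorted cluster centres."""
    data = sorted(values)
    # Deterministic initialisation: k evenly spaced points from the sorted data.
    centres = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda c: abs(v - centres[c]))
            clusters[nearest].append(v)
        # Keep the old centre if a cluster happens to be empty.
        updated = [sum(c) / len(c) if c else centres[i] for i, c in enumerate(clusters)]
        if updated == centres:
            break
        centres = updated
    return sorted(centres)


def knn_predict(train, query, k=3):
    """Majority label among the k training vectors closest to `query`."""
    def squared_distance(item):
        return sum((a - b) ** 2 for a, b in zip(item[0], query))
    nearest = sorted(train, key=squared_distance)[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical dominant frequencies (Hz) pooled over all species.
dominant_hz = [500, 520, 540, 2000, 2100, 2050, 4000, 4100, 3900]
frequency_scale = kmeans_1d(dominant_hz, k=3)

# Hypothetical 2-D feature vectors with species labels.
train = [
    ((0.10, 0.20), "species_a"), ((0.20, 0.10), "species_a"), ((0.15, 0.25), "species_a"),
    ((0.90, 0.80), "species_b"), ((0.80, 0.90), "species_b"), ((0.85, 0.95), "species_b"),
]
print(frequency_scale, knn_predict(train, (0.2, 0.2)))
```

The cluster centres stand in for the adaptive frequency scale; in the method described, that scale would then guide the wavelet packet decomposition before features reach the classifier.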

Relevance: 20.00%

Abstract:

This study investigates the use of unsupervised features derived from word embedding approaches and novel sequence representation approaches for improving clinical information extraction systems. Our results corroborate previous findings that indicate that the use of word embeddings significantly improves the effectiveness of concept extraction models; however, we further determine the influence of the corpora used to generate such features. We also demonstrate the promise of sequence-based unsupervised features for further improving concept extraction.

Relevance: 20.00%

Abstract:

Prostate cancer is the second most common malignancy among men worldwide. Genome-wide association studies (GWAS) have identified 100 risk variants for prostate cancer, which can explain approximately 33% of the familial risk of the disease. We hypothesized that a comprehensive analysis of genetic variations found within the 3' untranslated region of genes predicted to affect miRNA binding (miRSNP) can identify additional prostate cancer risk variants. We investigated the association between 2,169 miRSNPs and prostate cancer risk in a large-scale analysis of 22,301 cases and 22,320 controls of European ancestry from 23 participating studies. Twenty-two miRSNPs were associated (P < 2.3 × 10⁻⁵) with risk of prostate cancer, 10 of which were within 7 genes previously not mapped by GWAS. Further, using miRNA mimics and reporter gene assays, we showed that miR-3162-5p has specific affinity for the KLK3 rs1058205 miRSNP T-allele, whereas miR-370 has greater affinity for the VAMP8 rs1010 miRSNP A-allele, validating their functional role. Significance: Findings from this large association study suggest that a focus on miRSNPs, including functional evaluation, can identify candidate risk loci below currently accepted statistical levels of genome-wide significance. Studies of miRNAs and their interactions with SNPs could provide further insights into the mechanisms of prostate cancer risk.
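The abstract does not say how the significance threshold was chosen. As a worked check, dividing a conventional 0.05 level by the 2,169 variants tested gives roughly 2.3 × 10⁻⁵, which is consistent with a Bonferroni correction. That reading is an inference from the arithmetic, not a statement from the study, and the p-values in the usage lines are hypothetical.

```python
ALPHA = 0.05      # conventional family-wise error rate (assumed, not stated in the abstract)
N_TESTS = 2169    # number of miRSNPs tested, as reported in the abstract

# Bonferroni-corrected per-test threshold.
threshold = ALPHA / N_TESTS


def is_significant(p_value, cutoff=threshold):
    """True when a single test passes the corrected threshold."""
    return p_value < cutoff


print(f"{threshold:.3e}")
# Hypothetical p-values, for illustration only.
print(is_significant(1.0e-6), is_significant(1.0e-3))
```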

Relevance: 20.00%

Abstract:

Previous microarray analyses identified 22 microRNAs (miRNAs) differentially expressed in paired ectopic and eutopic endometrium of women with and without endometriosis. To investigate further the role of these miRNAs in women with endometriosis, we conducted an association study aiming to explore the relationship between endometriosis risk and single-nucleotide polymorphisms (SNPs) in miRNA target sites for these differentially expressed miRNAs. A panel of 102 SNPs in the predicted miRNA binding sites was evaluated for an endometriosis association study and an ingenuity pathway analysis was performed. Fourteen rare variants were identified in this study. We found that SNP rs14647 in the Wolf-Hirschhorn syndrome candidate gene 1 (WHSC1) 3'UTR (untranslated region) was associated with endometriosis-related infertility, presenting an odds ratio of 12.2 (95% confidence interval = 2.4-60.7, P = 9.03 × 10⁻⁵). SNP haplotype AGG in the solute carrier family 22, member 23 (SLC22A23) 3'UTR was associated with endometriosis-related infertility and more severe disease. With the individual genotyping data, ingenuity pathways analysis identified the tumour necrosis factor and cyclin-dependent kinase inhibitor as major factors in the molecular pathways. Significant associations between WHSC1 alleles and endometriosis-related infertility, and between SLC22A23 haplotypes and the severe disease stage, were identified. These findings may help focus future research on subphenotypes of this disease. Replication studies in independent large sample sets are needed to confirm and characterize the involvement of the gene variation in the pathogenesis of endometriosis.
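For readers unfamiliar with how an odds ratio and its 95% confidence interval are obtained from a 2×2 table, here is a minimal sketch using the standard log-odds (Woolf) approximation. The counts are hypothetical and are not taken from the study, and the abstract does not state which interval method the authors used.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with an approximate 95% CI from a 2x2 table.

    a: cases carrying the allele      b: controls carrying the allele
    c: cases without the allele       d: controls without the allele
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts, NOT the study's data.
estimate, lower, upper = odds_ratio_ci(a=10, b=5, c=20, d=100)
print(f"OR={estimate:.1f} (95% CI {lower:.1f}-{upper:.1f})")
```

Small cell counts make the interval wide, which is one way a large point estimate can come with a broad range such as the one reported above.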