Biblioteca Digital

913 results for Amazon metric

The Rademacher complexity of coregularized kernel classes

Relevance: 10.00%

Abstract:

In the multi-view approach to semisupervised learning, we choose one predictor from each of multiple hypothesis classes, and we co-regularize our choices by penalizing disagreement among the predictors on the unlabeled data. We examine the co-regularization method used in the co-regularized least squares (CoRLS) algorithm, in which the views are reproducing kernel Hilbert spaces (RKHSs), and the disagreement penalty is the average squared difference in predictions. The final predictor is the pointwise average of the predictors from each view. We call the set of predictors that can result from this procedure the co-regularized hypothesis class. Our main result is a tight bound on the Rademacher complexity of the co-regularized hypothesis class in terms of the kernel matrices of each RKHS. We find that co-regularization reduces the Rademacher complexity by an amount that depends on the distance between the two views, as measured by a data-dependent metric. We then use standard techniques to bound the gap between training error and test error for the CoRLS algorithm. Experimentally, we find that the reduction in complexity introduced by co-regularization correlates with the improvement that co-regularization gives in the CoRLS algorithm.
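
As a rough illustration of the quantity being bounded, the sketch below computes the empirical Rademacher complexity of a single, plain unit RKHS ball from its kernel matrix. It does not implement the co-regularized class or the bound from the paper. It assumes the convention R = (1/n) E_sigma sup_f sum_i sigma_i f(x_i), and the function names, the RBF kernel and the sample points are all hypothetical choices made for this example.

```python
import itertools
import math


def rbf_kernel_matrix(points, gamma=1.0):
    """Gram matrix K[i][j] = exp(-gamma * (x_i - x_j)^2) for 1-D points."""
    return [[math.exp(-gamma * (a - b) ** 2) for b in points] for a in points]


def empirical_rademacher_unit_ball(K):
    """Exact empirical Rademacher complexity of {f : ||f||_H <= 1}.

    For a unit RKHS ball the supremum over f has a closed form, so the
    complexity equals (1/n) * E_sigma[sqrt(sigma^T K sigma)]. The expectation
    is taken exactly over all 2^n sign vectors, which is only feasible for
    small n.
    """
    n = len(K)
    total = 0.0
    for sigma in itertools.product((-1.0, 1.0), repeat=n):
        quad = sum(sigma[i] * K[i][j] * sigma[j]
                   for i in range(n) for j in range(n))
        total += math.sqrt(max(quad, 0.0))  # guard against tiny negative rounding
    return total / (2 ** n) / n


def trace_bound(K):
    """Standard upper bound sqrt(trace(K)) / n, which follows from Jensen's inequality."""
    return math.sqrt(sum(K[i][i] for i in range(len(K)))) / len(K)


if __name__ == "__main__":
    K = rbf_kernel_matrix([0.0, 0.3, 0.9, 1.4, 2.0, 2.5])
    print(empirical_rademacher_unit_ball(K), trace_bound(K))
```

The exact value never exceeds the trace bound. The abstract's result concerns how much smaller the complexity becomes once two such classes are tied together by the disagreement penalty.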

Impacts of traditional husbandry practices on exploitable levels of genetic diversity in cultured 'Tra' catfish (Pangasianodon hypophthalmus) in the Mekong Delta, Vietnam

Relevance: 10.00%

Abstract:

Sutchi catfish (Pangasianodon hypophthalmus), known more universally by the Vietnamese name ‘Tra’, is an economically important freshwater fish in the Mekong Delta in Vietnam that constitutes an important food resource. Artificial propagation technology for Tra catfish has only recently been developed along the main branches of the Mekong River, where more than 60% of the local human population participate in fishing or aquaculture. Extensive support for catfish culture in general, and that of Tra (P. hypophthalmus) in particular, has been provided by the Vietnamese government both to increase the scale of production and to develop international export markets. In 2006, total Vietnamese catfish exports reached approximately 286,602 metric tons (MT) and were valued at USD 736.87 million, with a number of large new export destinations being developed. Total value of production from catfish culture has been predicted to increase to approximately USD 1 billion by 2020. While freshwater catfish culture in Vietnam has a promising future, concerns have been raised about the long-term quality of fry and the effectiveness of current brood stock management practices, issues that have been largely neglected to date. In this study, four DNA markers (microsatellite loci: CB4, CB7, CB12 and CB13) that were developed specifically for Tra (P. hypophthalmus) in an earlier study were applied to examine the genetic quality of artificially propagated Tra fry in the Mekong Delta in Vietnam. The goals of the study were to assess: (i) how well available levels of genetic variation in Tra brood stock used for artificial propagation in the Mekong Delta of Vietnam (breeders from three private hatcheries and Research Institute of Aquaculture No. 2 (RIA2) founders) have been conserved; and (ii) whether or not genetic diversity had declined significantly over time in a stock improvement program for Tra catfish at RIA2.
A secondary issue addressed was how genetic markers could best be used to assist industry development. DNA was extracted from fins of catfish collected from the two main branches of the Mekong River in Vietnam, three private hatcheries and samples from the Tra improvement program at RIA2. Study outcomes: (i) genetic diversity estimates for Tra brood stock samples were similar to, and slightly higher than, wild reference samples. In addition, the relative contribution by breeders to fry in commercial private hatcheries strongly suggests that the true Ne is likely to be significantly less than the breeder numbers used; (ii) in a stock improvement program for Tra catfish at RIA2, no significant differences were detected in gene frequencies among generations (FST = 0.021, P = 0.036, which exceeds the Bonferroni-corrected threshold of 0.002), and only small differences were observed in allele frequencies among sample populations. To date, genetic markers have not been applied in the Tra catfish industry, but in the current project they were used to evaluate the levels of genetic variation in the Tra catfish selective breeding program at RIA2 and to test for correlations between genetic marker and trait variation. While no associations were detected using only four loci, the analysis provided training in the practical applications of molecular markers in aquaculture in general, and in Tra culture in particular.

Overview of the INEX 2009 Link the Wiki Track

Relevance: 10.00%

Abstract:

In the third year of the Link the Wiki track, the focus has been shifted to anchor-to-bep (best entry point) link discovery. The participants were encouraged to utilize different technologies to resolve the issue of focused link discovery. Apart from the 2009 Wikipedia collection, the Te Ara collection was introduced for the first time in INEX. For the Link the Wiki tasks, 5000 file-to-file topics were randomly selected and 33 anchor-to-bep topics were nominated by the participants. The Te Ara collection does not contain hyperlinks and the task was to cross link the entire collection. A GUI tool for self-verification of the linking results was distributed. This helps participants verify the location of the anchor and bep. The assessment tool and the evaluation tool were revised to improve efficiency. Submitted runs were evaluated against the Wikipedia ground truth and against a manual result set, respectively. Focus-based evaluation was undertaken using a new metric. Evaluation results are presented and link discovery approaches are described.

A Regularization Approach to Metrical Task Systems

Relevance: 10.00%

Abstract:

We address the problem of constructing randomized online algorithms for the Metrical Task Systems (MTS) problem on a metric δ against an oblivious adversary. Restricting our attention to the class of “work-based” algorithms, we provide a framework for designing algorithms that uses the technique of regularization. For the case when δ is a uniform metric, we exhibit two algorithms that arise from this framework, and we prove a bound on the competitive ratio of each. We show that the second of these algorithms is (ln n + O(log log n))-competitive, which is the current state of the art for the uniform MTS problem.
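
For readers unfamiliar with the terminology, the standard textbook definitions behind "competitive ratio" and "uniform metric" can be written as below. The symbols ALG, OPT, σ (a task sequence), c and α are notation introduced here for illustration and are not taken from the abstract.

```latex
% A randomized online algorithm ALG is c-competitive against an oblivious
% adversary if there is a constant \alpha such that, for every task sequence \sigma,
\mathbb{E}\bigl[\mathrm{cost}_{\mathrm{ALG}}(\sigma)\bigr]
  \;\le\; c \cdot \mathrm{cost}_{\mathrm{OPT}}(\sigma) + \alpha .

% Uniform metric on n states: every move between distinct states costs the same.
\delta(i, j) =
\begin{cases}
  0 & \text{if } i = j, \\
  1 & \text{if } i \neq j .
\end{cases}
```

Under these definitions, the result quoted in the abstract corresponds to c = ln n + O(log log n) on the uniform metric.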

Aggregate distance based clustering using Fibonacci Series -FIBCLUS

Relevance: 10.00%

Abstract:

This paper proposes an innovative instance similarity based evaluation metric that reduces the search map for clustering to be performed. An aggregate global score is calculated for each instance using the novel idea of the Fibonacci series. The use of Fibonacci numbers is able to separate the instances effectively and, hence, the intra-cluster similarity is increased and the inter-cluster similarity is decreased during clustering. The proposed FIBCLUS algorithm is able to handle datasets with numerical, categorical and a mix of both types of attributes. Results obtained with FIBCLUS are compared with the results of existing algorithms such as k-means, x-means, expectation maximization and hierarchical algorithms that are widely used to cluster numeric, categorical and mixed data types. Empirical analysis shows that FIBCLUS is able to produce better clustering solutions in terms of entropy, purity and F-score in comparison to the above-described existing algorithms.
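
Two of the evaluation measures named above, purity and entropy, can be sketched as follows. This is a minimal illustration of one common definition of each measure, not the FIBCLUS algorithm or the authors' evaluation code. The function names and the toy labels are hypothetical.

```python
import math
from collections import Counter


def _class_counts_per_cluster(cluster_ids, class_labels):
    """Map each cluster id to a Counter of the true class labels it contains."""
    clusters = {}
    for cluster, label in zip(cluster_ids, class_labels):
        clusters.setdefault(cluster, Counter())[label] += 1
    return clusters


def purity(cluster_ids, class_labels):
    """Fraction of instances that belong to the majority class of their cluster."""
    clusters = _class_counts_per_cluster(cluster_ids, class_labels)
    majority_total = sum(max(counts.values()) for counts in clusters.values())
    return majority_total / len(class_labels)


def clustering_entropy(cluster_ids, class_labels):
    """Size-weighted average of per-cluster class entropy, in bits (lower is better)."""
    clusters = _class_counts_per_cluster(cluster_ids, class_labels)
    n = len(class_labels)
    result = 0.0
    for counts in clusters.values():
        size = sum(counts.values())
        h = -sum((c / size) * math.log2(c / size) for c in counts.values())
        result += (size / n) * h
    return result


if __name__ == "__main__":
    cluster_ids = [0, 0, 0, 1, 1, 1]
    class_labels = ["a", "a", "b", "b", "b", "b"]
    print(purity(cluster_ids, class_labels))
    print(clustering_entropy(cluster_ids, class_labels))
```

Higher purity and lower entropy both indicate clusters that align more closely with the known classes. Some papers additionally normalize the entropy by the logarithm of the number of classes, which this sketch does not do.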

Defining a session on Web search engines

Relevance: 10.00%

Abstract:

Detecting query reformulations within a session by a Web searcher is an important area of research for designing more helpful searching systems and targeting content to particular users. Methods explored by other researchers include both qualitative approaches (i.e., the use of human judges to manually analyze query patterns on usually small samples) and nondeterministic algorithms, typically using large amounts of training data to predict query modification during sessions. In this article, we explore three alternative methods for detection of session boundaries. All three methods are computationally straightforward and therefore easily implemented for detection of session changes. We examine 2,465,145 interactions from 534,507 users of Dogpile.com on May 6, 2005. We compare session analysis using (a) Internet Protocol address and cookie; (b) Internet Protocol address, cookie, and a temporal limit on intrasession interactions; and (c) Internet Protocol address, cookie, and query reformulation patterns. Overall, our analysis shows that defining sessions by query reformulation along with Internet Protocol address and cookie provides the best measure, resulting in an 82% increase in the count of sessions. Regardless of the method used, the mean session length was fewer than three queries, and the mean session duration was less than 30 min. Searchers most often modified their query by changing query terms (nearly 23% of all query modifications) rather than adding or deleting terms. Implications are that for measuring searching traffic, unique sessions may be a better indicator than the common metric of unique visitors. This research also sheds light on the more complex aspects of Web searching involving query modifications and may lead to advances in searching tools.
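
Methods (a) and (b) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. The record layout, the function name and the 30-minute cutoff are assumptions made for the example, since the abstract does not state the temporal limit that was used. Method (c) is omitted because the abstract does not give the reformulation rules.

```python
from datetime import datetime, timedelta


def count_sessions(interactions, time_limit=None):
    """Count sessions in an iterable of (ip, cookie, timestamp) records.

    With time_limit=None this follows method (a): one session per distinct
    (ip, cookie) pair. With a timedelta it follows method (b): a new session
    also starts whenever the gap since the same user's previous interaction
    exceeds the limit.
    """
    last_seen = {}
    sessions = 0
    for ip, cookie, ts in sorted(interactions, key=lambda record: record[2]):
        user = (ip, cookie)
        previous = last_seen.get(user)
        if previous is None:
            sessions += 1  # first interaction from this user
        elif time_limit is not None and ts - previous > time_limit:
            sessions += 1  # same user, but the gap exceeds the limit
        last_seen[user] = ts
    return sessions


if __name__ == "__main__":
    log = [
        ("10.0.0.1", "cookieA", datetime(2005, 5, 6, 10, 0)),
        ("10.0.0.1", "cookieA", datetime(2005, 5, 6, 10, 5)),
        ("10.0.0.2", "cookieB", datetime(2005, 5, 6, 10, 1)),
        ("10.0.0.1", "cookieA", datetime(2005, 5, 6, 11, 0)),
    ]
    print(count_sessions(log))                         # method (a): 2
    print(count_sessions(log, timedelta(minutes=30)))  # method (b): 3
```

Any rule that splits one user's activity into several sessions can only raise the session count relative to method (a), which is consistent in direction with the increase the abstract reports for method (c).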

A physico-chemical characterisation of particulate emissions from a compression ignition engine : the influence of biodiesel feedstock

Relevance: 10.00%

Abstract:

This study undertook a physico-chemical characterisation of particle emissions from a single compression ignition engine operated at one test mode with 3 biodiesel fuels made from 3 different feedstocks (i.e. soy, tallow and canola) at 4 different blend percentages (20%, 40%, 60% and 80%) to gain insights into their particle-related health effects. Particle physical properties were inferred by measuring particle number size distributions both with and without heating within a thermodenuder (TD) and also by measuring particulate matter (PM) emission factors with an aerodynamic diameter less than 10 μm (PM10). The chemical properties of particulates were investigated by measuring particle and vapour phase Polycyclic Aromatic Hydrocarbons (PAHs) and also Reactive Oxygen Species (ROS) concentrations. The particle number size distributions showed strong dependency on feedstock and blend percentage with some fuel types showing increased particle number emissions, whilst others showed particle number reductions. In addition, the median particle diameter decreased as the blend percentage was increased. Particle and vapour phase PAHs were generally reduced with biodiesel, with the results being relatively independent of the blend percentage. The ROS concentrations increased monotonically with biodiesel blend percentage, but did not exhibit strong feedstock variability. Furthermore, the ROS concentrations correlated quite well with the organic volume percentage of particles – a quantity which increased with increasing blend percentage. At higher blend percentages, the particle surface area was significantly reduced, but the particles were internally mixed with a greater organic volume percentage (containing ROS) which has implications for using surface area as a regulatory metric for diesel particulate matter (DPM) emissions.

Text mining and probabilistic language modeling for online review spam detecting

Relevance: 10.00%

Abstract:

In the era of Web 2.0, huge volumes of consumer reviews are posted to the Internet every day. Manual approaches to detecting and analyzing fake reviews (i.e., spam) are not practical due to the problem of information overload. However, the design and development of automated methods of detecting fake reviews is a challenging research problem. The main reason is that fake reviews are specifically composed to mislead readers, so they may appear the same as legitimate reviews (i.e., ham). As a result, discriminatory features that would enable individual reviews to be classified as spam or ham may not be available. Guided by the design science research methodology, the main contribution of this study is the design and instantiation of novel computational models for detecting fake reviews. In particular, a novel text mining model is developed and integrated into a semantic language model for the detection of untruthful reviews. The models are then evaluated based on a real-world dataset collected from amazon.com. The results of our experiments confirm that the proposed models outperform other well-known baseline models in detecting fake reviews. To the best of our knowledge, the work discussed in this article represents the first successful attempt to apply text mining methods and semantic language models to the detection of fake consumer reviews. A managerial implication of our research is that firms can apply our design artifacts to monitor online consumer reviews to develop effective marketing or product design strategies based on genuine consumer feedback posted to the Internet.

Solder joint defects classification using the Log-Gabor Filter, the Discrete Wavelet Transform and the Discrete Cosine Transform

Relevance: 10.00%

Abstract:

Inspection of solder joints has been a critical process in the electronic manufacturing industry to reduce manufacturing cost, improve yield, and ensure project quality and reliability. This paper proposes the use of the Log-Gabor filter bank, Discrete Wavelet Transform and Discrete Cosine Transform for feature extraction of solder joint images on Printed Circuit Boards (PCBs). A distance based on the Mahalanobis Cosine metric is also presented for classification of five different types of solder joints. From the experimental results, this methodology achieved high accuracy and a well generalised performance. This can be an effective method to reduce cost and improve quality in the production of PCBs in the manufacturing industry.
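
The abstract does not spell out the distance it uses. The sketch below shows one common form of a Mahalanobis cosine measure, under the assumption that the feature dimensions are already decorrelated (for example, PCA coefficients), so that mapping into Mahalanobis space reduces to dividing each dimension by its standard deviation. The function name, the choice of 1 minus the cosine as the distance, and the sample values are illustrative assumptions.

```python
import math


def mahalanobis_cosine_distance(u, v, stddevs):
    """1 - cosine of the angle between u and v after per-dimension whitening.

    Assumes the dimensions are uncorrelated, so whitening is a division by
    each dimension's standard deviation. Neither whitened vector may be zero.
    """
    m = [a / s for a, s in zip(u, stddevs)]
    n = [b / s for b, s in zip(v, stddevs)]
    dot = sum(a * b for a, b in zip(m, n))
    norm_m = math.sqrt(sum(a * a for a in m))
    norm_n = math.sqrt(sum(b * b for b in n))
    return 1.0 - dot / (norm_m * norm_n)


if __name__ == "__main__":
    stddevs = [2.0, 0.5]
    print(mahalanobis_cosine_distance([2.0, 1.0], [4.0, 0.5], stddevs))
```

A nearest-neighbour or nearest-class-mean classifier could then assign a solder joint image to the defect type whose reference feature vector has the smallest distance.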

CAT-SLAM : probabilistic localisation and mapping using a continuous appearance-based trajectory

Relevance: 10.00%

Abstract:

This paper describes a new system, dubbed Continuous Appearance-based Trajectory Simultaneous Localisation and Mapping (CAT-SLAM), which augments sequential appearance-based place recognition with local metric pose filtering to improve the frequency and reliability of appearance-based loop closure. As in other approaches to appearance-based mapping, loop closure is performed without calculating global feature geometry or performing 3D map construction. Loop-closure filtering uses a probabilistic distribution of possible loop closures along the robot’s previous trajectory, which is represented by a linked list of previously visited locations linked by odometric information. Sequential appearance-based place recognition and local metric pose filtering are evaluated simultaneously using a Rao–Blackwellised particle filter, which weights particles based on appearance matching over sequential frames and the similarity of robot motion along the trajectory. The particle filter explicitly models both the likelihood of revisiting previous locations and exploring new locations. A modified resampling scheme counters particle deprivation and allows loop-closure updates to be performed in constant time for a given environment. We compare the performance of CAT-SLAM with FAB-MAP (a state-of-the-art appearance-only SLAM algorithm) using multiple real-world datasets, demonstrating an increase in the number of correct loop closures detected by CAT-SLAM.

Towards persistent localization and mapping with a continuous appearance-based topology

Relevance: 10.00%

Abstract:

Appearance-based localization can provide loop closure detection at vast scales regardless of accumulated metric error. However, the computation time and memory requirements of current appearance-based methods scale not only with the size of the environment but also with the operation time of the platform. Additionally, repeated visits to locations will develop multiple competing representations, which will reduce recall performance over time. These properties impose severe restrictions on long-term autonomy for mobile robots, as loop closure performance will inevitably degrade with increased operation time. In this paper we present a graphical extension to CAT-SLAM, a particle filter-based algorithm for appearance-based localization and mapping, to provide constant computation and memory requirements over time and minimal degradation of recall performance during repeated visits to locations. We demonstrate loop closure detection in a large urban environment with capped computation time and memory requirements and performance exceeding previous appearance-based methods by a factor of 2. We discuss the limitations of the algorithm with respect to environment size, appearance change over time and applications in topological planning and navigation for long-term robot operation.

Capping computation time and storage requirements for appearance-based localization with CAT-SLAM

Relevance: 10.00%

Abstract:

Appearance-based localization is increasingly used for loop closure detection in metric SLAM systems. Since it relies only upon the appearance-based similarity between images from two locations, it can perform loop closure regardless of accumulated metric error. However, the computation time and memory requirements of current appearance-based methods scale linearly not only with the size of the environment but also with the operation time of the platform. These properties impose severe restrictions on long-term autonomy for mobile robots, as loop closure performance will inevitably degrade with increased operation time. We present a set of improvements to the appearance-based SLAM algorithm CAT-SLAM to constrain computation scaling and memory usage with minimal degradation in performance over time. The appearance-based comparison stage is accelerated by exploiting properties of the particle observation update, and nodes in the continuous trajectory map are removed according to minimal information loss criteria. We demonstrate constant time and space loop closure detection in a large urban environment with recall performance exceeding FAB-MAP by a factor of 3 at 100% precision, and investigate the minimum computational and memory requirements for maintaining mapping performance.

Lost in translation (and rotation) : rapid extrinsic calibration for 2D and 3D LIDARs

Relevance: 10.00%

Abstract:

This paper describes a novel method for determining the extrinsic calibration parameters between 2D and 3D LIDAR sensors with respect to a vehicle base frame. To recover the calibration parameters we attempt to optimize the quality of a 3D point cloud produced by the vehicle as it traverses an unknown, unmodified environment. The point cloud quality metric is derived from Rényi Quadratic Entropy and quantifies the compactness of the point distribution using only a single tuning parameter. We also present a fast approximate method to reduce the computational requirements of the entropy evaluation, allowing unsupervised calibration in vast environments with millions of points. The algorithm is analyzed using real world data gathered in many locations, showing robust calibration performance and substantial speed improvements from the approximations.
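
To make the quality metric concrete, the sketch below evaluates the Rényi Quadratic Entropy of a point cloud modelled as a Gaussian kernel density estimate with a single isotropic bandwidth. It is a brute-force O(N^2) illustration under that modelling assumption, not the authors' implementation and not their fast approximation. The function name, the bandwidth value and the toy point clouds are hypothetical.

```python
import math


def renyi_quadratic_entropy(points, sigma):
    """Renyi Quadratic Entropy of a Gaussian KDE with bandwidth sigma.

    With p(x) = (1/N) * sum_i N(x; x_i, sigma^2 I), the integral of p(x)^2
    has the closed form (1/N^2) * sum_i sum_j N(x_i - x_j; 0, 2 sigma^2 I),
    and the entropy is minus the log of that value. Lower entropy means a
    more compact ("crisper") cloud.
    """
    n = len(points)
    dim = len(points[0])
    var = 2.0 * sigma ** 2  # variance of the pairwise Gaussian
    norm = (2.0 * math.pi * var) ** (-dim / 2.0)
    total = 0.0
    for p in points:
        for q in points:
            sq_dist = sum((a - b) ** 2 for a, b in zip(p, q))
            total += norm * math.exp(-sq_dist / (2.0 * var))
    return -math.log(total / (n * n))


if __name__ == "__main__":
    compact = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0), (0.1, 0.1, 0.0)]
    spread = [(10.0 * x, 10.0 * y, 10.0 * z) for x, y, z in compact]
    print(renyi_quadratic_entropy(compact, sigma=0.5))
    print(renyi_quadratic_entropy(spread, sigma=0.5))
```

A calibration search would adjust the extrinsic parameters, rebuild the point cloud, and keep the parameters that give the lowest entropy. Here sigma plays the role of the single tuning parameter mentioned in the abstract.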

Learning domain-specific sentiment lexicons for predicting product sales

Relevance: 10.00%

Abstract:

Generic sentiment lexicons have been widely used for sentiment analysis. However, manually constructing sentiment lexicons is very time-consuming and may not be feasible for application domains where annotation expertise is not available. One contribution of this paper is the development of a statistical learning based computational method for the automatic construction of domain-specific sentiment lexicons to enhance cross-domain sentiment analysis. Our initial experiments show that the proposed methodology can automatically generate domain-specific sentiment lexicons which contribute to improving the effectiveness of opinion retrieval at the document level. Another contribution of our work is that we show the feasibility of applying the sentiment metric derived from the automatically constructed sentiment lexicons to predict product sales of certain product categories. Our research contributes to the development of more effective sentiment analysis systems to extract business intelligence from numerous opinionated expressions posted to the Web.

Democracy, Crime and Justice. Annals of the American Academy of Political and Social Sciences

Relevance: 10.00%

Abstract:

As a growing number of nations embark on a path to democracy, criminologists have become increasingly interested and engaged in the challenges, concerns, and questions connecting democracy with both crime and criminal justice. Rising levels of violence, street crime, white-collar crime and corruption, both in countries where democracy is securely in place and in those where it is struggling, have fuelled a deepening skepticism as to the capacity of democracy to deliver on its promise of security and justice for all citizens. What role do crime and criminal justice play in the future of democracy and in democratic political development on a global level? The editors of this special volume of The Annals realized the importance of collecting research from a broad spectrum of countries and covering a range of problems that affect citizens, politicians, and criminal justice officials. The articles here represent a solid balance between mature democracies like the U.S. and U.K. and emerging democracies around the globe, specifically in Latin America, Africa and Eastern Europe. They are based on large and small cross-national samples, regional comparisons, and case studies. Each contribution addresses a seminal question for the future of democratic political development across the globe. What is the role of criminal justice in the process of building democracy and instilling confidence in its institutions? Is there a role for unions in democratizing police forces? What is the impact of widespread disenfranchisement of felons on democratic citizenship and the life of democratic institutions? Under what circumstances do mature democracies adopt punitive sentencing regimes? Addressing sensitive topics such as relations between police and the Muslim communities of Western Europe in the wake of terrorist attacks, this volume also sheds light on the effects of terrorism on mature democracies under increasing pressure to provide security for their citizens.
By taking a broad vantage point, this collection of research delves into complex topics such as the relationship between the process of democratization and violent crime waves; the impact of rising crime rates on newly established as well as secure democracies; how crime may endanger the transition to democracy; and how existing practices of criminal justice in mature democracies affect their core values and institutions. The collection of these insightful articles not only begins to fill a gap in criminological research but also addresses issues of critical interest to political scientists as well as other social and behavioral scientists and scholars. Taking a fresh approach to the intersection of crime, criminal justice, and democracy, this volume of The Annals is a must-read for criminologists and political scientists and provides a solid foundation for further interdisciplinary research.
