945 resultados para Document ranking
Resumo:
A necessary step for the recognition of scanned documents is binarization, which is essentially the segmentation of the document. In order to binarize a scanned document, we can find several algorithms in the literature. What is the best binarization result for a given document image? To answer this question, a user needs to check different binarization algorithms for suitability, since different algorithms may work better for different type of documents. Manually choosing the best from a set of binarized documents is time consuming. To automate the selection of the best segmented document, either we need to use ground-truth of the document or propose an evaluation metric. If ground-truth is available, then precision and recall can be used to choose the best binarized document. What is the case, when ground-truth is not available? Can we come up with a metric which evaluates these binarized documents? Hence, we propose a metric to evaluate binarized document images using eigen value decomposition. We have evaluated this measure on DIBCO and H-DIBCO datasets. The proposed method chooses the best binarized document that is close to the ground-truth of the document.
Resumo:
Eleven GCMs (BCCR-BCCM2.0, INGV-ECHAM4, GFDL2.0, GFDL2.1, GISS, IPSL-CM4, MIROC3, MRI-CGCM2, NCAR-PCMI, UKMO-HADCM3 and UKMO-HADGEM1) were evaluated for India (covering 73 grid points of 2.5 degrees x 2.5 degrees) for the climate variable `precipitation rate' using 5 performance indicators. Performance indicators used were the correlation coefficient, normalised root mean square error, absolute normalised mean bias error, average absolute relative error and skill score. We used a nested bias correction methodology to remove the systematic biases in GCM simulations. The Entropy method was employed to obtain weights of these 5 indicators. Ranks of the 11 GCMs were obtained through a multicriterion decision-making outranking method, PROMETHEE-2 (Preference Ranking Organisation Method of Enrichment Evaluation). An equal weight scenario (assigning 0.2 weight for each indicator) was also used to rank the GCMs. An effort was also made to rank GCMs for 4 river basins (Godavari, Krishna, Mahanadi and Cauvery) in peninsular India. The upper Malaprabha catchment in Karnataka, India, was chosen to demonstrate the Entropy and PROMETHEE-2 methods. The Spearman rank correlation coefficient was employed to assess the association between the ranking patterns. Our results suggest that the ensemble of GFDL2.0, MIROC3, BCCR-BCCM2.0, UKMO-HADCM3, MPIECHAM4 and UKMO-HADGEM1 is suitable for India. The methodology proposed can be extended to rank GCMs for any selected region.
Resumo:
We consider two variants of the classical gossip algorithm. The first variant is a version of asynchronous stochastic approximation. We highlight a fundamental difficulty associated with the classical asynchronous gossip scheme, viz., that it may not converge to a desired average, and suggest an alternative scheme based on reinforcement learning that has guaranteed convergence to the desired average. We then discuss a potential application to a wireless network setting with simultaneous link activation constraints. The second variant is a gossip algorithm for distributed computation of the Perron-Frobenius eigenvector of a nonnegative matrix. While the first variant draws upon a reinforcement learning algorithm for an average cost controlled Markov decision problem, the second variant draws upon a reinforcement learning algorithm for risk-sensitive control. We then discuss potential applications of the second variant to ranking schemes, reputation networks, and principal component analysis.
Resumo:
The problem of bipartite ranking, where instances are labeled positive or negative and the goal is to learn a scoring function that minimizes the probability of mis-ranking a pair of positive and negative instances (or equivalently, that maximizes the area under the ROC curve), has been widely studied in recent years. A dominant theoretical and algorithmic framework for the problem has been to reduce bipartite ranking to pairwise classification; in particular, it is well known that the bipartite ranking regret can be formulated as a pairwise classification regret, which in turn can be upper bounded using usual regret bounds for classification problems. Recently, Kotlowski et al. (2011) showed regret bounds for bipartite ranking in terms of the regret associated with balanced versions of the standard (non-pairwise) logistic and exponential losses. In this paper, we show that such (non-pairwise) surrogate regret bounds for bipartite ranking can be obtained in terms of a broad class of proper (composite) losses that we term as strongly proper. Our proof technique is much simpler than that of Kotlowski et al. (2011), and relies on properties of proper (composite) losses as elucidated recently by Reid and Williamson (2010, 2011) and others. Our result yields explicit surrogate bounds (with no hidden balancing terms) in terms of a variety of strongly proper losses, including for example logistic, exponential, squared and squared hinge losses as special cases. An important consequence is that standard algorithms minimizing a (non-pairwise) strongly proper loss, such as logistic regression and boosting algorithms (assuming a universal function class and appropriate regularization), are in fact consistent for bipartite ranking; moreover, our results allow us to quantify the bipartite ranking regret in terms of the corresponding surrogate regret. We also obtain tighter surrogate bounds under certain low-noise conditions via a recent result of Clemencon and Robbiano (2011).
Resumo:
Eleven general circulation models/global climate models (GCMs) - BCCR-BCCM2.0, INGV-ECHAM4, GFDL2.0, GFDL2.1, GISS, IPSL-CM4, MIROC3, MRI-CGCM2, NCAR-PCMI, UKMO-HADCM3 and UKMO-HADGEM1 - are evaluated for Indian climate conditions using the performance indicator, skill score (SS). Two climate variables, temperature T (at three levels, i.e. 500, 700, 850 mb) and precipitation rate (Pr) are considered resulting in four SS-based evaluation criteria (T500, T700, T850, Pr). The multicriterion decision-making method, technique for order preference by similarity to an ideal solution, is applied to rank 11 GCMs. Efforts are made to rank GCMs for the Upper Malaprabha catchment and two river basins, namely, Krishna and Mahanadi (covered by 17 and 15 grids of size 2.5 degrees x 2.5 degrees, respectively). Similar efforts are also made for India (covered by 73 grid points of size 2.5 degrees x 2.5 degrees) for which an ensemble of GFDL2.0, INGV-ECHAM4, UKMO-HADCM3, MIROC3, BCCR-BCCM2.0 and GFDL2.1 is found to be suitable. It is concluded that the proposed methodology can be applied to similar situations with ease.
Resumo:
(287 page document)
Resumo:
The principal purpose of this document is to assist programme teams throughout the development process when they are considering the development or review of a route through the award where it will be delivered wholly, or primarily, via online distance learning. Please note that this document is current as of Sept 2015 but it is considered to be an evolving document and is updated/tweaked from time to time.