13 results for E-Metrics
in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)
Abstract:
Establishing metrics to assess machine translation (MT) systems automatically is now crucial owing to the widespread use of MT over the web. In this study we show that such evaluation can be done by modeling text as complex networks. Specifically, we extend our previous work by employing additional metrics of complex networks, whose results were used as input for machine learning methods and allowed MT texts of distinct qualities to be distinguished. Also shown is that the node-to-node mapping between source and target texts (English-Portuguese and Spanish-Portuguese pairs) can be improved by adding further hierarchical levels for the metrics out-degree, in-degree, hierarchical common degree, cluster coefficient, inter-ring degree, intra-ring degree and convergence ratio. The results presented here amount to a proof-of-principle that the possible capturing of a wider context with the hierarchical levels may be combined with machine learning methods to yield an approach for assessing the quality of MT systems. (C) 2010 Elsevier B.V. All rights reserved.
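As a rough illustration of the in-degree and out-degree metrics named in this abstract, the sketch below builds a directed network from a text and counts the degrees of each node. The abstract does not say how the networks are constructed, so the word-adjacency rule used here (an edge from each word to the word that follows it) is an assumption, as are the function names. The hierarchical levels and the machine learning step are not covered.

```python
from collections import defaultdict


def word_adjacency_network(text):
    """Build a directed graph with an edge from each word to its successor.

    Assumption: adjacency of consecutive words defines the edges; the
    abstract does not specify the construction rule.
    """
    words = text.lower().split()
    successors = defaultdict(set)
    predecessors = defaultdict(set)
    for current, following in zip(words, words[1:]):
        if current != following:  # ignore self-loops from repeated words
            successors[current].add(following)
            predecessors[following].add(current)
    return set(words), successors, predecessors


def degree_metrics(text):
    """Return {word: (in_degree, out_degree)} for the word-adjacency network."""
    nodes, successors, predecessors = word_adjacency_network(text)
    return {
        word: (len(predecessors.get(word, ())), len(successors.get(word, ())))
        for word in nodes
    }


if __name__ == "__main__":
    for word, (in_deg, out_deg) in sorted(degree_metrics("the cat saw the dog").items()):
        print(word, in_deg, out_deg)
```

Per-node values like these could be averaged over a text to give one feature per metric, which is one plausible way to feed a classifier.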
Abstract:
1. The use of indicators to identify areas of conservation importance has been challenged on several grounds, but nonetheless retains appeal as no more parsimonious approach exists. Among the many variants, two indicator strategies stand out: the use of indicator species and the use of metrics of landscape structure. While the first has been thoroughly studied, the same cannot be said about the latter. We aimed to contrast the relative efficacy of species-based and landscape-based indicators by: (i) comparing their ability to reflect changes in community integrity at regional and landscape spatial scales, (ii) assessing their sensitivity to changes in data resolution, and (iii) quantifying the degree to which indicators that are generated in one landscape or at one spatial scale can be transferred to additional landscapes or scales. 2. We used data from more than 7000 bird captures in 65 sites from six 10 000-ha landscapes with different proportions of forest cover in the Atlantic Forest of Brazil. Indicator species and landscape-based indicators were tested in terms of how effective they were in reflecting changes in community integrity, defined as deviations in bird community composition from control areas. 3. At the regional scale, indicator species provided more robust depictions of community integrity than landscape-based indicators. At the landscape scale, however, landscape-based indicators performed more effectively, more consistently and were also more transferable among landscapes. The effectiveness of high resolution landscape-based indicators was reduced by just 12% when these were used to explain patterns of community integrity in independent data sets. By contrast, the effectiveness of species-based indicators was reduced by 33%. 4. Synthesis and applications. The use of indicator species proved to be effective; however, their results were variable and sensitive to changes in scale and resolution, and their application requires extensive and time-consuming field work. Landscape-based indicators were not only effective but were also much less context-dependent. The use of landscape-based indicators may allow the rapid identification of priority areas for conservation and restoration, and indicate which restoration strategies should be pursued, using remotely sensed imagery. We suggest that landscape-based indicators might often be a better, simpler, and cheaper strategy for informing decisions in conservation.
Abstract:
One of the main consequences of habitat loss and fragmentation is the increase in patch isolation and the consequent decrease in landscape connectivity. In this context, species persistence depends on their responses to this new landscape configuration, particularly on their capacity to move through the interhabitat matrix. Here, we aimed first to determine gap-crossing probabilities related to different gap widths for two forest birds (Thamnophilus caerulescens, Thamnophilidae, and Basileuterus culicivorus, Parulidae) from the Brazilian Atlantic rainforest. These values were defined with a playback technique and then used in analyses based on graph theory to determine functional connections among forest patches. Both species were capable of crossing forest gaps between patches, and these movements were related to gap width. The probability of crossing 40 m gaps was 50% for both species. This probability fell to 10% when the gaps were 60 m (for B. culicivorus) or 80 m (for T. caerulescens). In within-forest control trials, birds responded to playback at about twice the distance observed in gap-crossing trials. Models that included gap-crossing capacity improved the explanatory power of species abundance variation in comparison to strictly structural models based merely on patch area and distance measurements. These results highlighted that even very simple functional connectivity measurements related to gap-crossing capacity can improve the understanding of the effect of habitat fragmentation on bird occurrence and abundance.
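One simple way to picture the graph-based step is to treat forest patches as nodes and link two patches whenever the gap between them is narrow enough to be crossed. The sketch below does this and returns the groups of functionally connected patches. It is only an illustration: the patch names, gap widths, and the use of a single hard threshold are assumptions, and the abstract does not describe the actual graph analysis in this detail.

```python
def functional_components(patches, gap_widths_m, max_gap_m):
    """Group patches that are linked by gaps no wider than max_gap_m.

    patches: iterable of patch identifiers (hypothetical names).
    gap_widths_m: {(patch_a, patch_b): gap width in metres}.
    max_gap_m: widest gap treated as crossable (an assumed hard cutoff).
    """
    adjacency = {patch: set() for patch in patches}
    for (a, b), width in gap_widths_m.items():
        if width <= max_gap_m:
            adjacency[a].add(b)
            adjacency[b].add(a)

    visited = set()
    components = []
    for patch in adjacency:
        if patch in visited:
            continue
        # Depth-first search collects every patch reachable from this one.
        component = set()
        stack = [patch]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        visited |= component
        components.append(component)
    return components


if __name__ == "__main__":
    gaps = {("A", "B"): 30, ("B", "C"): 70, ("C", "D"): 50}
    print(functional_components("ABCD", gaps, max_gap_m=60))
    print(functional_components("ABCD", gaps, max_gap_m=80))
```

With the widths reported in the abstract, a 60 m or 80 m cutoff would correspond to the distance at which crossing probability drops to 10% for each species, so the same landscape can yield different connected groups per species.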
Abstract:
Information to guide decision making is especially urgent in human-dominated landscapes in the tropics, where urban and agricultural frontiers are still expanding in an unplanned manner. Nevertheless, most studies that have investigated the influence of landscape structure on species distribution have not considered the heterogeneity of altered habitats of the matrix, which is usually high in human-dominated landscapes. Using the distribution of small mammals in forest remnants and in the four main altered habitats in an Atlantic forest landscape, we investigated 1) how explanatory power of models describing species distribution in forest remnants varies between landscape structure variables that do or do not incorporate matrix quality and 2) the importance of spatial scale for analyzing the influence of landscape structure. We used standardized sampling in remnants and altered habitats to generate two indices of habitat quality, corresponding to the abundance and to the occurrence of small mammals. For each remnant, we calculated habitat quantity and connectivity at different spatial scales, with and without considering the quality of surrounding habitats. The incorporation of matrix quality increased model explanatory power across all spatial scales for half the species that occurred in the matrix, but only when taking into account the distance between habitat patches (connectivity). These connectivity models were also less affected by spatial scale than habitat quantity models. The few consistent responses to the variation in spatial scales indicate that despite their small size, small mammals perceive landscape features at large spatial scales. The matrix quality index based on species occurrence performed as well as or better than the index based on species abundance. Results indicate the importance of the matrix for the dynamics of fragmented landscapes and suggest that relatively simple indices can improve our understanding of species distribution, and could be applied in modeling, monitoring and managing complex tropical landscapes.
Abstract:
Generating quadrilateral meshes is a highly non-trivial task, as design decisions are frequently driven by specific application demands. Automatic techniques can optimize objective quality metrics, such as mesh regularity, orthogonality, alignment and adaptivity; however, they cannot make subjective design decisions. There are a few quad meshing approaches that offer some mechanisms to include the user in the mesh generation process; however, these techniques either require a large amount of user interaction or do not provide necessary or easy to use inputs. Here, we propose a template-based approach for generating quad-only meshes from triangle surfaces. Our approach offers a flexible mechanism to allow external input, through the definition of alignment features that are respected during the mesh generation process. While allowing user inputs to support subjective design decisions, our approach also takes into account objective quality metrics to produce semi-regular, quad-only meshes that align well to desired surface features. Published by Elsevier Ltd.
Abstract:
Automatic summarization of texts is now crucial for several information retrieval tasks owing to the huge amount of information available in digital media, which has increased the demand for simple, language-independent extractive summarization strategies. In this paper, we employ concepts and metrics of complex networks to select sentences for an extractive summary. The graph or network representing one piece of text consists of nodes corresponding to sentences, while edges connect sentences that share common meaningful nouns. Because various metrics could be used, we developed a set of 14 summarizers, generically referred to as CN-Summ, employing network concepts such as node degree, length of shortest paths, d-rings and k-cores. An additional summarizer was created which selects the highest ranked sentences in the 14 systems, as in a voting system. When applied to a corpus of Brazilian Portuguese texts, some CN-Summ versions performed better than summarizers that do not employ deep linguistic knowledge, with results comparable to state-of-the-art summarizers based on expensive linguistic resources. The use of complex networks to represent texts appears therefore as suitable for automatic summarization, consistent with the belief that the metrics of such networks may capture important text features. (c) 2008 Elsevier Inc. All rights reserved.
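The degree-based variant of this idea can be sketched in a few lines: sentences are nodes, two sentences are linked when they share at least one noun, and the most connected sentences are kept. This is a minimal sketch, not the CN-Summ implementation. It assumes the nouns of each sentence have already been extracted (for example by a part-of-speech tagger, which is outside this example), and the function name and tie-breaking rule are hypothetical.

```python
def summarize_by_degree(sentence_nouns, summary_size):
    """Pick the sentences with the highest node degree.

    sentence_nouns: list of sets, one set of nouns per sentence
        (assumed to be extracted beforehand).
    summary_size: number of sentences to keep.
    Returns the indices of the selected sentences in document order.
    """
    count = len(sentence_nouns)
    degree = [0] * count
    for i in range(count):
        for j in range(i + 1, count):
            # An edge exists when the two sentences share a noun.
            if sentence_nouns[i] & sentence_nouns[j]:
                degree[i] += 1
                degree[j] += 1
    # Highest degree first; earlier sentences win ties (an assumed rule).
    ranked = sorted(range(count), key=lambda index: (-degree[index], index))
    return sorted(ranked[:summary_size])


if __name__ == "__main__":
    nouns = [{"network", "text"}, {"text", "summary"}, {"bird"}, {"summary"}]
    print(summarize_by_degree(nouns, 2))
```

Swapping the degree count for another node score, such as one based on shortest paths or k-cores, would give other members of the family the abstract describes.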
Abstract:
Complex networks have been increasingly used in text analysis, including in connection with natural language processing tools, as important text features appear to be captured by the topology and dynamics of the networks. Following previous works that apply complex networks concepts to text quality measurement, summary evaluation, and author characterization, we now focus on machine translation (MT). In this paper we assess the possible representation of texts as complex networks to evaluate cross-linguistic issues inherent in manual and machine translation. We show that translations of differing quality generated by MT tools can be distinguished from their manual counterparts by means of metrics such as in-degree (ID) and out-degree (OD), clustering coefficient (CC), and shortest paths (SP). For instance, we demonstrate that the average OD in networks of automatic translations consistently exceeds the values obtained for manual ones, and that the CC values of source texts are not preserved for manual translations, but are for good automatic translations. This probably reflects the text rearrangements humans perform during manual translation. We envisage that such findings could lead to better MT tools and automatic evaluation metrics.
Abstract:
In this paper we present a novel approach for multispectral image contextual classification by combining iterative combinatorial optimization algorithms. The pixel-wise decision rule is defined using a Bayesian approach to combine two MRF models: a Gaussian Markov Random Field (GMRF) for the observations (likelihood) and a Potts model for the a priori knowledge, to regularize the solution in the presence of noisy data. Hence, the classification problem is stated according to a Maximum a Posteriori (MAP) framework. In order to approximate the MAP solution we apply several combinatorial optimization methods using multiple simultaneous initializations, making the solution less sensitive to the initial conditions and reducing both computational cost and time in comparison to Simulated Annealing, often unfeasible in many real image processing applications. Markov Random Field model parameters are estimated by Maximum Pseudo-Likelihood (MPL) approach, avoiding manual adjustments in the choice of the regularization parameters. Asymptotic evaluations assess the accuracy of the proposed parameter estimation procedure. To test and evaluate the proposed classification method, we adopt metrics for quantitative performance assessment (Cohen's Kappa coefficient), allowing a robust and accurate statistical analysis. The obtained results clearly show that combining sub-optimal contextual algorithms significantly improves the classification performance, indicating the effectiveness of the proposed methodology. (C) 2010 Elsevier B.V. All rights reserved.
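For readers unfamiliar with the evaluation metric mentioned here, Cohen's Kappa compares the observed agreement between predicted and reference labels with the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e). The sketch below computes it from a confusion matrix. The matrix values are made up for illustration and the function name is hypothetical; nothing here reproduces the paper's data or classifier.

```python
def cohens_kappa(confusion):
    """Compute Cohen's Kappa from a square confusion matrix.

    confusion[i][j] is the number of samples with reference class i
    that were assigned to class j.
    """
    total = sum(sum(row) for row in confusion)
    size = len(confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(size)) / total
    # Chance agreement: product of row and column marginals per class.
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(confusion[i][j] for i in range(size)) for j in range(size)]
    expected = sum(row_sums[k] * col_sums[k] for k in range(size)) / (total * total)
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Illustrative two-class matrix (invented numbers).
    print(cohens_kappa([[20, 5], [10, 15]]))
```

A value of 1 means perfect agreement and 0 means agreement no better than chance, which is why the coefficient is often preferred over raw accuracy when classes are unbalanced.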
Abstract:
Given a Lorentzian manifold (M, g), an event p and an observer U in M, then p and U are light conjugate if there exists a lightlike geodesic gamma : [0, 1] -> M joining p and U whose endpoints are conjugate along gamma. Using functional analytical techniques, we prove that if one fixes p and U in a differentiable manifold M, then the set of stationary Lorentzian metrics in M for which p and U are not light conjugate is generic in a strong sense. The result is obtained by reduction to a Finsler geodesic problem via a second order Fermat principle for light rays, and using a transversality argument in an infinite dimensional Banach manifold setup.
Abstract:
We consider a family of variational problems on a Hilbert manifold parameterized by an open subset of a Banach manifold, and we discuss the genericity of the nondegeneracy condition for the critical points. Using classical techniques, we prove an abstract genericity result that employs the infinite dimensional Sard-Smale theorem, along the lines of an analogous result of B. White [29]. Applications are given by proving the genericity of metrics without degenerate geodesics between fixed endpoints in general (non compact) semi-Riemannian manifolds, in orthogonally split semi-Riemannian manifolds and in globally hyperbolic Lorentzian manifolds. We discuss the genericity property also in stationary Lorentzian manifolds.
Abstract:
We prove the semi-Riemannian bumpy metric theorem using equivariant variational genericity. The theorem states that, on a given compact manifold M, the set of semi-Riemannian metrics that admit only nondegenerate closed geodesics is generic relative to the C^k-topology, k = 2, ..., infinity, in the set of metrics of a given index on M. A higher-order genericity Riemannian result of Klingenberg and Takens is extended to semi-Riemannian geometry.
Abstract:
Let M be a possibly noncompact manifold. We prove, generically in the C^k-topology (2 <= k <= infinity), that semi-Riemannian metrics of a given index on M do not possess any degenerate geodesics satisfying suitable boundary conditions. This extends a result of L. Biliotti, M. A. Javaloyes and P. Piccione [6] for geodesics with fixed endpoints to the case where endpoints lie on a compact submanifold P subset of M x M that satisfies an admissibility condition. Such a condition holds, for example, when P is transversal to the diagonal Delta subset of M x M. Further aspects of these boundary conditions are discussed and general conditions under which metrics without degenerate geodesics are C^k-generic are given.
Abstract:
Cytochrome P450 (CYP450) is a class of enzymes for which substrate identification is particularly important. It would help medicinal chemists to design drugs with fewer side effects due to drug-drug interactions and to extensive genetic polymorphism. Herein, we discuss the application of 2D and 3D similarity searches in identifying reference structures with higher capacity to retrieve substrates of three important CYP enzymes (CYP2C9, CYP2D6, and CYP3A4). On the basis of the complementarities of multiple reference structures selected by different similarity search methods, we proposed the fusion of their individual Tanimoto scores into a consensus Tanimoto score (T(consensus)). Using this new score, true positive rates of 63% (CYP2C9) and 81% (CYP2D6) were achieved with false positive rates of 4% for the CYP2C9-CYP2D6 data set. Extended similarity searches were carried out on a validation data set, and the results showed that by using the T(consensus) score, not only did the area under the ROC graph increase, but more substrates were also recovered at the beginning of a ranked list.
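The Tanimoto score between two fingerprints is the number of shared features divided by the number of features present in either one. The sketch below computes it on sets of feature indices and then combines the scores against several reference structures into one value. The abstract does not state the fusion rule behind T(consensus), so the arithmetic mean used here is purely an assumption, and the fingerprints and function names are hypothetical.

```python
def tanimoto(fingerprint_a, fingerprint_b):
    """Tanimoto similarity between two sets of fingerprint feature indices."""
    union = len(fingerprint_a | fingerprint_b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(fingerprint_a & fingerprint_b) / union


def consensus_score(candidate, references):
    """Fuse Tanimoto scores against several reference structures.

    Assumption: the scores are averaged. The fusion rule actually used
    for T(consensus) is not given in the abstract.
    """
    scores = [tanimoto(candidate, reference) for reference in references]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    candidate = {1, 2, 3}                  # hypothetical fingerprint bits
    references = [{2, 3, 4}, {1, 2, 3}]    # hypothetical reference structures
    print(consensus_score(candidate, references))
```

Ranking a compound library by such a fused score, highest first, gives the kind of ranked list from which early recovery of substrates can be measured.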