901 results for distinguishability metrics
Abstract:
Complex networks have been increasingly used in text analysis, including in connection with natural language processing tools, as important text features appear to be captured by the topology and dynamics of the networks. Following previous works that apply complex network concepts to text quality measurement, summary evaluation, and author characterization, we now focus on machine translation (MT). In this paper we assess the possible representation of texts as complex networks to evaluate cross-linguistic issues inherent in manual and machine translation. We show that translations of different quality generated by MT tools can be distinguished from their manual counterparts by means of metrics such as in-degree (ID) and out-degree (OD), clustering coefficient (CC), and shortest paths (SP). For instance, we demonstrate that the average OD in networks of automatic translations consistently exceeds the values obtained for manual ones, and that the CC values of source texts are not preserved for manual translations, but are for good automatic translations. This probably reflects the text rearrangements humans perform during manual translation. We envisage that such findings could lead to better MT tools and automatic evaluation metrics.
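The network metrics named in this abstract can be illustrated with a minimal sketch. It assumes a simple word-adjacency network (a directed edge from each word to the word that follows it); the abstract does not say how the authors build their networks, so the construction, the undirected-neighbourhood definition of the clustering coefficient, and all function names below are illustrative assumptions rather than the paper's method.

```python
from collections import defaultdict, deque


def build_network(text):
    """Directed word-adjacency network (assumed construction): u -> v if v follows u."""
    words = text.lower().split()
    succ, pred = defaultdict(set), defaultdict(set)
    for u, v in zip(words, words[1:]):
        if u != v:  # ignore self-loops produced by repeated words
            succ[u].add(v)
            pred[v].add(u)
    return set(words), succ, pred


def average_out_degree(nodes, succ):
    """Mean number of distinct successors per node (OD)."""
    return sum(len(succ.get(n, ())) for n in nodes) / len(nodes)


def average_clustering(nodes, succ, pred):
    """Mean clustering coefficient (CC), treating edges as undirected (an assumption)."""
    def linked(a, b):
        return b in succ.get(a, ()) or a in succ.get(b, ())

    total = 0.0
    for n in nodes:
        nbrs = sorted((succ.get(n, set()) | pred.get(n, set())) - {n})
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbours contribute 0
        links = sum(linked(a, b) for i, a in enumerate(nbrs) for b in nbrs[i + 1:])
        total += links / (k * (k - 1) / 2)
    return total / len(nodes)


def average_shortest_path(nodes, succ):
    """Mean directed shortest-path length (SP) over reachable ordered pairs, via BFS."""
    total, pairs = 0, 0
    for src in nodes:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in succ.get(u, ()):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


if __name__ == "__main__":
    nodes, succ, pred = build_network("the cat saw the dog and the dog saw the cat")
    print(average_out_degree(nodes, succ))
    print(average_clustering(nodes, succ, pred))
    print(average_shortest_path(nodes, succ))
```

Computing the same three numbers for a source text, a manual translation, and an automatic translation would give the kind of comparison the abstract describes, although real studies typically preprocess the text (for example, removing stopwords) before building the network.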
Abstract:
In this paper we present a novel approach for multispectral image contextual classification by combining iterative combinatorial optimization algorithms. The pixel-wise decision rule is defined using a Bayesian approach to combine two MRF models: a Gaussian Markov Random Field (GMRF) for the observations (likelihood) and a Potts model for the a priori knowledge, to regularize the solution in the presence of noisy data. Hence, the classification problem is stated according to a Maximum a Posteriori (MAP) framework. In order to approximate the MAP solution we apply several combinatorial optimization methods using multiple simultaneous initializations, making the solution less sensitive to the initial conditions and reducing both computational cost and time in comparison to Simulated Annealing, which is often infeasible in many real image processing applications. Markov Random Field model parameters are estimated by the Maximum Pseudo-Likelihood (MPL) approach, avoiding manual adjustments in the choice of the regularization parameters. Asymptotic evaluations assess the accuracy of the proposed parameter estimation procedure. To test and evaluate the proposed classification method, we adopt metrics for quantitative performance assessment (Cohen's Kappa coefficient), allowing a robust and accurate statistical analysis. The obtained results clearly show that combining sub-optimal contextual algorithms significantly improves the classification performance, indicating the effectiveness of the proposed methodology. (C) 2010 Elsevier B.V. All rights reserved.
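As a brief illustration of the evaluation metric mentioned in this abstract, the following sketch computes Cohen's Kappa coefficient from a list of reference labels and a list of predicted labels. The function name and the handling of the degenerate single-class case are assumptions made for this example; the abstract does not describe the authors' implementation.

```python
from collections import Counter


def cohens_kappa(reference, predicted):
    """Cohen's Kappa: (p_o - p_e) / (1 - p_e) for two equal-length label sequences."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(reference)
    # Observed agreement: fraction of positions where the labels coincide.
    observed = sum(r == p for r, p in zip(reference, predicted)) / n
    # Chance agreement: product of the marginal label frequencies, summed over classes.
    ref_counts, pred_counts = Counter(reference), Counter(predicted)
    expected = sum(ref_counts[c] * pred_counts[c] for c in ref_counts) / (n * n)
    if expected == 1.0:
        # Kappa is undefined when both sequences contain one identical class;
        # returning 1.0 is a convention chosen for this sketch only.
        return 1.0
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    truth = ["water", "water", "forest", "forest", "urban", "urban"]
    guess = ["water", "water", "forest", "urban", "urban", "urban"]
    print(cohens_kappa(truth, guess))
```

Unlike plain accuracy, Kappa discounts the agreement expected by chance, which matters when some classes dominate the image.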
Abstract:
Given a Lorentzian manifold (M, g), an event p and an observer U in M, then p and U are light conjugate if there exists a lightlike geodesic gamma : [0, 1] -> M joining p and U whose endpoints are conjugate along gamma. Using functional analytical techniques, we prove that if one fixes p and U in a differentiable manifold M, then the set of stationary Lorentzian metrics in M for which p and U are not light conjugate is generic in a strong sense. The result is obtained by reduction to a Finsler geodesic problem via a second order Fermat principle for light rays, and using a transversality argument in an infinite dimensional Banach manifold setup.
Abstract:
We consider a family of variational problems on a Hilbert manifold parameterized by an open subset of a Banach manifold, and we discuss the genericity of the nondegeneracy condition for the critical points. Using classical techniques, we prove an abstract genericity result that employs the infinite dimensional Sard-Smale theorem, along the lines of an analogous result of B. White [29]. Applications are given by proving the genericity of metrics without degenerate geodesics between fixed endpoints in general (non compact) semi-Riemannian manifolds, in orthogonally split semi-Riemannian manifolds and in globally hyperbolic Lorentzian manifolds. We discuss the genericity property also in stationary Lorentzian manifolds.
Abstract:
We prove the semi-Riemannian bumpy metric theorem using equivariant variational genericity. The theorem states that, on a given compact manifold M, the set of semi-Riemannian metrics that admit only nondegenerate closed geodesics is generic relative to the C(k)-topology, k=2, ..., infinity, in the set of metrics of a given index on M. A higher-order genericity Riemannian result of Klingenberg and Takens is extended to semi-Riemannian geometry.
Abstract:
Let M be a possibly noncompact manifold. We prove, generically in the C(k)-topology (2 <= k <= infinity), that semi-Riemannian metrics of a given index on M do not possess any degenerate geodesics satisfying suitable boundary conditions. This extends a result of L. Biliotti, M. A. Javaloyes and P. Piccione [6] for geodesics with fixed endpoints to the case where endpoints lie on a compact submanifold P subset of M x M that satisfies an admissibility condition. Such a condition holds, for example, when P is transversal to the diagonal Delta subset of M x M. Further aspects of these boundary conditions are discussed, and general conditions under which metrics without degenerate geodesics are C(k)-generic are given.
Abstract:
Cytochrome P450 (CYP450) is a class of enzymes for which substrate identification is particularly important. It would help medicinal chemists to design drugs with fewer side effects due to drug-drug interactions and to extensive genetic polymorphism. Herein, we discuss the application of 2D and 3D similarity searches in identifying reference structures with a higher capacity to retrieve substrates of three important CYP enzymes (CYP2C9, CYP2D6, and CYP3A4). On the basis of the complementarity of multiple reference structures selected by different similarity search methods, we proposed the fusion of their individual Tanimoto scores into a consensus Tanimoto score (T(consensus)). Using this new score, true positive rates of 63% (CYP2C9) and 81% (CYP2D6) were achieved with false positive rates of 4% for the CYP2C9-CYP2D6 data set. Extended similarity searches were carried out on a validation data set, and the results showed that using the T(consensus) score not only increased the area under the ROC curve but also recovered more substrates at the beginning of the ranked list.
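The idea of fusing per-reference Tanimoto scores can be sketched as follows. Fingerprints are represented here as sets of "on" bit indices, and the fusion rule is taken to be the maximum over references; the abstract does not state which fusion rule the authors used, so the max rule, the function names, and the toy fingerprints are all assumptions for illustration.

```python
def tanimoto(a, b):
    """Tanimoto coefficient |A & B| / |A | B| between two sets of on-bit indices."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention for two empty fingerprints in this sketch
    return len(a & b) / len(union)


def consensus_score(candidate, references, fuse=max):
    """Fuse the candidate's Tanimoto scores against several reference structures.

    `fuse` defaults to max (an assumed rule); any reducer over an iterable
    of floats can be passed instead.
    """
    return fuse(tanimoto(candidate, ref) for ref in references)


def rank_candidates(candidates, references):
    """Return candidate names sorted by decreasing consensus score."""
    return sorted(
        candidates,
        key=lambda name: consensus_score(candidates[name], references),
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical fingerprints, not real molecular data.
    references = [{1, 2, 3, 5}, {2, 3, 4}]
    candidates = {"mol_a": {1, 2, 3}, "mol_b": {7, 8}, "mol_c": {2, 3, 4, 9}}
    print(rank_candidates(candidates, references))
```

Ranking by the fused score is what allows an evaluation of how many true substrates appear near the top of the list, as the abstract reports.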
Abstract:
A challenge for the clinical management of Parkinson's disease (PD) is the large within- and between-patient variability in symptom profiles as well as the emergence of motor complications which represent a significant source of disability in patients. This thesis deals with the development and evaluation of methods and systems for supporting the management of PD by using repeated measures, consisting of subjective assessments of symptoms and objective assessments of motor function through fine motor tests (spirography and tapping), collected by means of a telemetry touch screen device. One aim of the thesis was to develop methods for objective quantification and analysis of the severity of motor impairments being represented in spiral drawings and tapping results. This was accomplished by first quantifying the digitized movement data with time series analysis and then using them in data-driven modelling for automating the process of assessment of symptom severity. The objective measures were then analysed with respect to subjective assessments of motor conditions. Another aim was to develop a method for providing comparable information content as clinical rating scales by combining subjective and objective measures into composite scores, using time series analysis and data-driven methods. The scores represent six symptom dimensions and an overall test score for reflecting the global health condition of the patient. In addition, the thesis presents the development of a web-based system for providing a visual representation of symptoms over time allowing clinicians to remotely monitor the symptom profiles of their patients. The quality of the methods was assessed by reporting different metrics of validity, reliability and sensitivity to treatment interventions and natural PD progression over time. 
Results from two studies demonstrated that the methods developed for the fine motor tests had good metrics indicating that they are appropriate to quantitatively and objectively assess the severity of motor impairments of PD patients. The fine motor tests captured different symptoms; spiral drawing impairment and tapping accuracy related to dyskinesias (involuntary movements) whereas tapping speed related to bradykinesia (slowness of movements). A longitudinal data analysis indicated that the six symptom dimensions and the overall test score contained important elements of information of the clinical scales and can be used to measure effects of PD treatment interventions and disease progression. A usability evaluation of the web-based system showed that the information presented in the system was comparable to qualitative clinical observations and the system was recognized as a tool that will assist in the management of patients.
Abstract:
The desire to maintain high quality in the code written when developing systems and applications is nothing new in the development world. Several larger companies use various metrics to measure the quality of the code in their systems, with the goal of maintaining high operational reliability. Trafikverket (the Swedish Transport Administration) is a government agency responsible for operating, among other things, the systems that keep Sweden's railway network running. Because these systems play an important part in securing operations and ensuring that train positioning, departure planning, and the handling of operational disruptions work around the clock for the whole country, the agency considers it important to strive for high system quality. The purpose of this degree project was to find out which metrics can be used during the system development process to measure code quality, and how those metrics can be used to increase the quality of IT solutions, so that the quality of the code written in both existing and newly developed systems can be measured at an early stage. The study is a case study carried out at Trafikverket. The metrics examined were code coverage, the level of the maintainability index, and the number of reported incidents for each system. Measurements were made on seven of Trafikverket's systems and, in the analysis, compared against the number of reported incidents. Interviews were conducted to give a picture of how working practices during development can affect quality. The literature study brought up a metric that could not be used in practice in this case but is highly interesting: cyclomatic complexity, which is part of the maintainability index but also separately affects the ability to write unit tests. The results of the study show that the metrics are useful for the purpose but should not be used as standalone measures of quality, since they serve different functions.
It is important that the working practices around development follow a clear structure and that the developers both know how to work with unit tests and follow coding principles for structure. Clear links between the level of code coverage and the inflow of incidents could be seen in the systems examined, where high code coverage gives a lower inflow of incidents. No correlation between the maintainability index and incidents could be found.
Abstract:
The core concepts of CA. In the theoretical framework of the capability approach (CA), well-being is constituted by a person's unique way of functioning and capabilities. This means that a person's well-being is personal and involves freedom of choice, which in turn means they have a number of options. Although many people may have the same resources, it is important to study how these resources are converted into how they function. Thus, well-being is about the person's freedom to achieve in general and the capabilities to function in particular (Sen, 1995).
Strength of the capability approach. The capability approach is a useful tool for matching objective evaluations with subjective metrics. Furthermore, although one's individual abilities are in focus, contextual factors and subjective perceptions and experiences are taken into consideration.
Critiques against the CA. The capability approach has been criticized for being too individual-centered and not taking sufficient account of social structures in society. It is difficult to know what a person would choose to do if other options were available. Therefore, operationalizing abilities involves uncertainties.
Abstract:
The mobile operating system Android is today a fairly dominant operating system in the mobile market, partly because of its openness but also because it is widely available, with both cheap and expensive phones on offer. However, Android currently has no predefined design pattern, which means that each developer decides what to use; this can sometimes lead to unnecessarily complex code in applications, which then becomes hard to test and hard to manage. This work aims to compare two design patterns, Passive Model View Controller (PMVC) and Model View View-Model (MVVM), to see which yields the least complexity, by computing measurements using the Cyclomatic Complexity Number (CCN). The study follows the Design & Creation approach and aims to contribute knowledge about which pattern one should choose, and about whether CCN can point out which parts of an application will take more or less time to test. During the study we also examined the differences between using and not using the Single Responsibility Principle (SRP), to see whether separated views make any difference to the applications' complexity. In the end, the study shows that complexity in small applications is very similar, but that even in small applications differences in code complexity can be seen, and that code complexity at the method level can provide guidelines for test cases.
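To make the CCN measurement concrete, here is a minimal sketch that approximates the cyclomatic complexity of each function as one plus the number of decision points. It analyses Python source with the standard `ast` module purely as a stand-in: the study itself concerns Android applications and does not say which tool was used, so the choice of language, the set of node types counted, and the function names are assumptions.

```python
import ast

# Node types counted as one decision point each (an approximation).
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.AsyncFor, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(func_node):
    """Approximate CCN of one function: 1 + number of decision points."""
    ccn = 1
    for node in ast.walk(func_node):
        if isinstance(node, _BRANCH_NODES):
            ccn += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two extra paths.
            ccn += len(node.values) - 1
        elif isinstance(node, ast.comprehension):
            # One for the implicit loop, plus one per `if` filter.
            ccn += 1 + len(node.ifs)
    return ccn


def ccn_per_function(source):
    """Map each function name in `source` to its approximate CCN.

    Nested functions are also counted inside their enclosing function.
    """
    tree = ast.parse(source)
    return {
        node.name: cyclomatic_complexity(node)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }


SAMPLE = '''
def simple(x):
    return x + 1

def branchy(x, y):
    if x > 0 and y > 0:
        return 1
    for i in range(y):
        if i == x:
            return i
    return 0
'''

if __name__ == "__main__":
    for name, ccn in ccn_per_function(SAMPLE).items():
        print(name, ccn)
```

Per-method values of this kind are what the abstract suggests can guide test-case planning, since each additional decision point is at least one more path to cover.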
Abstract:
In a global economy, manufacturers mainly compete on the cost efficiency of production, as the price of raw materials is similar worldwide. Heavy industry has two big issues to deal with. On the one hand, there is a lot of data that needs to be analyzed effectively; on the other hand, making big improvements via investments in corporate structure or new machinery is neither economically nor physically viable. Machine learning offers a promising way for manufacturers to address both these problems, as they are in an excellent position to employ learning techniques with their massive resource of historical production data. However, choosing a modelling strategy in this setting is far from trivial, and addressing that choice is the objective of this article. The article investigates characteristics of the most popular classifiers used in industry today. Support Vector Machines, Multilayer Perceptron, Decision Trees, Random Forests, and the meta-algorithms Bagging and Boosting are mainly investigated in this work. Lessons from real-world implementations of these learners are also provided, together with future directions for when different learners are expected to perform well. The importance of feature selection and relevant selection methods in an industrial setting is further investigated. Performance metrics are also discussed for the sake of completeness.
Abstract:
The authors take a broad view that ultimately Grid- or Web-services must be located via personalised, semantic-rich discovery processes. They argue that such processes must rely on the storage of arbitrary metadata about services that originates from both service providers and service users. Examples of such metadata are reliability metrics, quality of service data, or semantic service description markup. This paper presents UDDI-MT, an extension to the standard UDDI service directory approach that supports the storage of such metadata via a tunnelling technique that ties the metadata store to the original UDDI directory. They also discuss the use of a rich, graph-based RDF query language for syntactic queries on this data. Finally, they analyse the performance of each of these contributions in their implementation.
Abstract:
We take a broad view that ultimately Grid- or Web-services must be located via personalised, semantic-rich discovery processes. We argue that such processes must rely on the storage of arbitrary metadata about services that originates from both service providers and service users. Examples of such metadata are reliability metrics, quality of service data, or semantic service description markup. This paper presents UDDI-MT, an extension to the standard UDDI service directory approach that supports the storage of such metadata via a tunnelling technique that ties the metadata store to the original UDDI directory. We also discuss the use of a rich, graph-based RDF query language for syntactic queries on this data. Finally, we analyse the performance of each of these contributions in our implementation.