935 resultados para complex data


Relevância:

60.00% 60.00%

Publicador:

Resumo:

Remote sensing techniques involving hyperspectral imagery have applications in a number of sciences that study some aspects of the surface of the planet. The analysis of hyperspectral images is complex because of the large amount of information involved and the noise within that data. Investigating images with regard to identify minerals, rocks, vegetation and other materials is an application of hyperspectral remote sensing in the earth sciences. This thesis evaluates the performance of two classification and clustering techniques on hyperspectral images for mineral identification. Support Vector Machines (SVM) and Self-Organizing Maps (SOM) are applied as classification and clustering techniques, respectively. Principal Component Analysis (PCA) is used to prepare the data to be analyzed. The purpose of using PCA is to reduce the amount of data that needs to be processed by identifying the most important components within the data. A well-studied dataset from Cuprite, Nevada and a dataset of more complex data from Baffin Island were used to assess the performance of these techniques. The main goal of this research study is to evaluate the advantage of training a classifier based on a small amount of data compared to an unsupervised method. Determining the effect of feature extraction on the accuracy of the clustering and classification method is another goal of this research. This thesis concludes that using PCA increases the learning accuracy, and especially so in classification. SVM classifies Cuprite data with a high precision and the SOM challenges SVM on datasets with high level of noise (like Baffin Island).

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Le développement du logiciel actuel doit faire face de plus en plus à la complexité de programmes gigantesques, élaborés et maintenus par de grandes équipes réparties dans divers lieux. Dans ses tâches régulières, chaque intervenant peut avoir à répondre à des questions variées en tirant des informations de sources diverses. Pour améliorer le rendement global du développement, nous proposons d'intégrer dans un IDE populaire (Eclipse) notre nouvel outil de visualisation (VERSO) qui calcule, organise, affiche et permet de naviguer dans les informations de façon cohérente, efficace et intuitive, afin de bénéficier du système visuel humain dans l'exploration de données variées. Nous proposons une structuration des informations selon trois axes : (1) le contexte (qualité, contrôle de version, bogues, etc.) détermine le type des informations ; (2) le niveau de granularité (ligne de code, méthode, classe, paquetage) dérive les informations au niveau de détails adéquat ; et (3) l'évolution extrait les informations de la version du logiciel désirée. Chaque vue du logiciel correspond à une coordonnée discrète selon ces trois axes, et nous portons une attention toute particulière à la cohérence en naviguant entre des vues adjacentes seulement, et ce, afin de diminuer la charge cognitive de recherches pour répondre aux questions des utilisateurs. Deux expériences valident l'intérêt de notre approche intégrée dans des tâches représentatives. Elles permettent de croire qu'un accès à diverses informations présentées de façon graphique et cohérente devrait grandement aider le développement du logiciel contemporain.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Formal Concept Analysis allows to derive conceptual hierarchies from data tables. Formal Concept Analysis is applied in various domains, e.g., data analysis, information retrieval, and knowledge discovery in databases. In order to deal with increasing sizes of the data tables (and to allow more complex data structures than just binary attributes), conceputal scales habe been developed. They are considered as metadata which structure the data conceptually. But in large applications, the number of conceptual scales increases as well. Techniques are needed which support the navigation of the user also on this meta-level of conceptual scales. In this paper, we attack this problem by extending the set of scales by hierarchically ordered higher level scales and by introducing a visualization technique called nested scaling. We extend the two-level architecture of Formal Concept Analysis (the data table plus one level of conceptual scales) to many-level architecture with a cascading system of conceptual scales. The approach also allows to use representation techniques of Formal Concept Analysis for the visualization of thesauri and ontologies.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Este trabajo está dividido en dos aspectos puntales de la migración de colombianos hacia la ciudad de Nueva York. El primero, está centrado en una reconstrucción del proceso migratorio, con la finalidad de discutir y repensar la reducción que se hace del fenómeno a lineamientos netamente económicos. Para ello, ésta atiende características puntuales tales como las condiciones de inmigración, las redes migratorias y la percepción general del fenómeno por parte de los involucrados; para proponer que aspectos como la “aventura” o “el vamos a ver qué pasa” determinan en gran medida el desplazamiento de colombianos hacia el exterior. Mientras que el segundo, está enfocado en analizar el proceso subsiguiente de integración que vivencian los inmigrantes una vez entran en contacto con los universos culturales de la sociedad huésped. Para reconstruir este aspecto, en él se siguen los planteamientos del ciclo de relaciones raciales y de interacción y violencia simbólica, que permiten realizar una discusión pertinente del fenómeno en el contexto emigratorio colombiano. Este documento está basado en tres aspectos metodológicos complementarios: La reconstrucción y análisis de relatos de vida, una corta inmersión etnográfica y observación participante. Merced de ello, en él se replantean algunos factores analizados levemente en investigaciones previas sobre el caso y, además, se abren las puertas para tratamientos más profundos sobre esta población en el futuro. Palabras claves: Migración, redes migratorias, integración, reapropiación identitaria y mezcla cultural.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In real world applications sequential algorithms of data mining and data exploration are often unsuitable for datasets with enormous size, high-dimensionality and complex data structure. Grid computing promises unprecedented opportunities for unlimited computing and storage resources. In this context there is the necessity to develop high performance distributed data mining algorithms. However, the computational complexity of the problem and the large amount of data to be explored often make the design of large scale applications particularly challenging. In this paper we present the first distributed formulation of a frequent subgraph mining algorithm for discriminative fragments of molecular compounds. Two distributed approaches have been developed and compared on the well known National Cancer Institute’s HIV-screening dataset. We present experimental results on a small-scale computing environment.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

It is generally acknowledged that population-level assessments provide,I better measure of response to toxicants than assessments of individual-level effects. population-level assessments generally require the use of models to integrate potentially complex data about the effects of toxicants on life-history traits, and to provide a relevant measure of ecological impact. Building on excellent earlier reviews we here briefly outline the modelling options in population-level risk assessment. Modelling is used to calculate population endpoints from available data, which is often about Individual life histories, the ways that individuals interact with each other, the environment and other species, and the ways individuals are affected by pesticides. As population endpoints, we recommend the use of population abundance, population growth rate, and the chance of population persistence. We recommend two types of model: simple life-history models distinguishing two life-history stages, juveniles and adults; and spatially-explicit individual-based landscape models. Life-history models are very quick to set up and run, and they provide a great deal or insight. At the other extreme, individual-based landscape models provide the greatest verisimilitude, albeit at the cost of greatly increased complexity. We conclude with a discussion of the cations of the severe problems of parameterising models.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Modern database applications are increasingly employing database management systems (DBMS) to store multimedia and other complex data. To adequately support the queries required to retrieve these kinds of data, the DBMS need to answer similarity queries. However, the standard structured query language (SQL) does not provide effective support for such queries. This paper proposes an extension to SQL that seamlessly integrates syntactical constructions to express similarity predicates to the existing SQL syntax and describes the implementation of a similarity retrieval engine that allows posing similarity queries using the language extension in a relational DBM. The engine allows the evaluation of every aspect of the proposed extension, including the data definition language and data manipulation language statements, and employs metric access methods to accelerate the queries. Copyright (c) 2008 John Wiley & Sons, Ltd.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Currently, one of the biggest challenges for the field of data mining is to perform cluster analysis on complex data. Several techniques have been proposed but, in general, they can only achieve good results within specific areas providing no consensus of what would be the best way to group this kind of data. In general, these techniques fail due to non-realistic assumptions about the true probability distribution of the data. Based on this, this thesis proposes a new measure based on Cross Information Potential that uses representative points of the dataset and statistics extracted directly from data to measure the interaction between groups. The proposed approach allows us to use all advantages of this information-theoretic descriptor and solves the limitations imposed on it by its own nature. From this, two cost functions and three algorithms have been proposed to perform cluster analysis. As the use of Information Theory captures the relationship between different patterns, regardless of assumptions about the nature of this relationship, the proposed approach was able to achieve a better performance than the main algorithms in literature. These results apply to the context of synthetic data designed to test the algorithms in specific situations and to real data extracted from problems of different fields

Relevância:

60.00% 60.00%

Publicador:

Resumo:

It s notorious the advance of computer networks in recent decades, whether in relation to transmission rates, the number of interconnected devices or the existing applications. In parallel, it s also visible this progress in various sectors of the automation, such as: industrial, commercial and residential. In one of its branches, we find the hospital networks, which can make the use of a range of services, ranging from the simple registration of patients to a surgery by a robot under the supervision of a physician. In the context of both worlds, appear the applications in Telemedicine and Telehealth, which work with the transfer in real time of high resolution images, sound, video and patient data. Then comes a problem, since the computer networks, originally developed for the transfer of less complex data, is now being used by a service that involves high transfer rates and needs requirements for quality of service (QoS) offered by the network . Thus, this work aims to do the analysis and comparison of performance of a network when subjected to this type of application, for two different situations: the first without the use of QoS policies, and the second with the application of such policies, using as scenario for testing, the Metropolitan Health Network of the Federal University of Rio Grande do Norte (UFRN)

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Oil prospecting is one of most complex and important features of oil industry Direct prospecting methods like drilling well logs are very expensive, in consequence indirect methods are preferred. Among the indirect prospecting techniques the seismic imaging is a relevant method. Seismic method is based on artificial seismic waves that are generated, go through the geologic medium suffering diffraction and reflexion and return to the surface where they are recorded and analyzed to construct seismograms. However, the seismogram contains not only actual geologic information, but also noise, and one of the main components of the noise is the ground roll. Noise attenuation is essential for a good geologic interpretation of the seismogram. It is common to study seismograms by using time-frequency transformations that map the seismic signal into a frequency space where it is easier to remove or attenuate noise. After that, data is reconstructed in the original space in such a way that geologic structures are shown in more detail. In addition, the curvelet transform is a new and effective spectral transformation that have been used in the analysis of complex data. In this work, we employ the curvelet transform to represent geologic data using basis functions that are directional in space. This particular basis can represent more effectively two dimensional objects with contours and lines. The curvelet analysis maps real space into frequencies scales and angular sectors in such way that we can distinguish in detail the sub-spaces where is the noise and remove the coefficients corresponding to the undesired data. In this work we develop and apply the denoising analysis to remove the ground roll of seismograms. We apply this technique to a artificial seismogram and to a real one. In both cases we obtain a good noise attenuation

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The use of increasingly complex software applications is demanding greater investment in the development of such systems to ensure applications with better quality. Therefore, new techniques are being used in Software Engineering, thus making the development process more effective. Among these new approaches, we highlight Formal Methods, which use formal languages that are strongly based on mathematics and have a well-defined semantics and syntax. One of these languages is Circus, which can be used to model concurrent systems. It was developed from the union of concepts from two other specification languages: Z, which specifies systems with complex data, and CSP, which is normally used to model concurrent systems. Circus has an associated refinement calculus, which can be used to develop software in a precise and stepwise fashion. Each step is justified by the application of a refinement law (possibly with the discharge of proof obligations). Sometimes, the same laws can be applied in the same manner in different developments or even in different parts of a single development. A strategy to optimize this calculus is to formalise these application as a refinement tactic, which can then be used as a single transformation rule. CRefine was developed to support the Circus refinement calculus. However, before the work presented here, it did not provide support for refinement tactics. The aim of this work is to provide tool support for refinement tactics. For that, we develop a new module in CRefine, which automates the process of defining and applying refinement tactics that are formalised in the tactic language ArcAngelC. Finally, we validate the extension by applying the new module in a case study, which used the refinement tactics in a refinement strategy for verification of SPARK Ada implementations of control systems. In this work, we apply our module in the first two phases of this strategy

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Non-conventional database management systems are used to achieve a better performance when dealing with complex data. One fundamental concept of these systems is object identity (OID), because each object in the database has a unique identifier that is used to access and reference it in relationships to other objects. Two approaches can be used for the implementation of OIDs: physical or logical OIDs. In order to manage complex data, was proposed the Multimedia Data Manager Kernel (NuGeM) that uses a logical technique, named Indirect Mapping. This paper proposes an improvement to the technique used by NuGeM, whose original contribution is management of OIDs with a fewer number of disc accesses and less processing, thus reducing management time from the pages and eliminating the problem with exhaustion of OIDs. Also, the technique presented here can be applied to others OODBMSs. © 2011 IEEE.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Non-conventional database management systems are used to achieve a better performance when dealing with complex data. One fundamental concept of these systems is object identity (OID). Two techniques can be used for the implementation of OIDs: physical or logical. A logical implementation of OIDs, based on an Indirection Table, is used by NuGeM, a multimedia data manager kernel which is described in this paper. NuGeM Indirection Table allows the relocation of all pages in a database. The proposed strategy modifies the workings of this table so that it is possible to reduce considerably the number of I/O operations during the request and release of pages containing objects and their OIDs. Tests show a reduction of 84% in reading operations and a 67% reduction in writing operations when pages are requested. Although no changes were observed in writing operations during the release of pages, a 100% of reduction in reading operations was obtained. © 2012 IEEE.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)