996 results for Deep web


Relevance:

100.00%

Publisher:

Abstract:

Matching query interfaces is a crucial step in data integration across multiple Web databases. The problem is closely related to schema matching, which typically exploits different features of schemas. Relying on a single feature of schemas is not sufficient. We propose an evidential approach to combining multiple matchers using the Dempster-Shafer theory of evidence. First, our approach views the match results of an individual matcher as a source of evidence that provides a level of confidence in the validity of each candidate attribute correspondence. Second, it combines multiple sources of evidence into a combined mass function that represents the overall level of confidence, taking into account the match results of the different matchers. Our combination mechanism does not require weighting parameters, so no setting and tuning of them is needed. Third, it selects the top k attribute correspondences of each source attribute from the target schema based on the combined mass function. Finally, it uses heuristics to resolve any conflicts between the attribute correspondences of different source attributes. Our experimental results show that our approach is highly accurate and effective.
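
The combination step described above follows Dempster's rule of combination. The following is a minimal, generic Python sketch of that rule, not the paper's implementation; the attribute names and mass values are hypothetical, and each matcher's output is assumed to have already been converted into a mass function over candidate correspondences.

from itertools import product

def combine(m1, m2):
    """Dempster's rule: combine two mass functions over frozenset focal elements."""
    combined, conflict = {}, 0.0
    for (a, w1), (b, w2) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + w1 * w2
        else:
            conflict += w1 * w2  # mass falling on the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

# Hypothetical example: two matchers assign mass to candidate correspondences
# for a source attribute, over the frame Theta = {title, author, keyword}.
THETA = frozenset({"title", "author", "keyword"})
m_string_matcher = {frozenset({"author"}): 0.6, THETA: 0.4}
m_type_matcher = {frozenset({"author", "title"}): 0.7, THETA: 0.3}

m = combine(m_string_matcher, m_type_matcher)
# Rank candidates by combined mass, mimicking the top-k selection step.
for focal, mass in sorted(m.items(), key=lambda kv: -kv[1]):
    print(sorted(focal), round(mass, 3))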

Relevance:

100.00%

Publisher:

Abstract:

Web sites that rely on databases for their content are now ubiquitous. Query result pages are dynamically generated from these databases in response to user-submitted queries. Automatically extracting structured data from query result pages is a challenging problem, as the structure of the data is not explicitly represented. While humans show good intuition in visually understanding data records on a query result page as displayed by a web browser, no existing approach to data record extraction has made full use of this intuition. We propose a novel approach that makes use of the common sources of evidence humans use to understand data records on a displayed query result page: structural regularity, and visual and content similarity between the data records displayed on the page. Based on these observations, we propose new techniques that can identify each data record individually while ignoring noise items such as navigation bars and adverts. We have implemented these techniques in a software prototype, rExtractor, and tested it on two datasets. Our experimental results show that our approach achieves significantly higher accuracy than previous approaches. Furthermore, it establishes the case for using vision-based algorithms in the context of data extraction from web sites.
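
As a rough, hypothetical illustration of the visual-similarity idea, not the rExtractor algorithm itself, the sketch below groups rendered page blocks that are left-aligned and of similar height and keeps the largest group as the data records, discarding items such as navigation bars and adverts; the block geometry and thresholds are invented, and a real system would obtain them from a browser's rendering engine.

from collections import defaultdict

# Hypothetical rendered blocks: (x, y, width, height, text), as a browser might report them.
blocks = [
    (100, 200, 600, 80, "Result 1: title, price, rating"),
    (100, 290, 600, 82, "Result 2: title, price, rating"),
    (100, 382, 600, 79, "Result 3: title, price, rating"),
    (720, 200, 160, 400, "Sponsored links"),         # advert column
    (100, 40, 800, 30, "Home > Search > Results"),   # navigation bar
]

def group_records(blocks, x_tol=5, h_tol=10):
    """Cluster blocks by left edge and similar height; keep the largest cluster."""
    clusters = defaultdict(list)
    for block in blocks:
        x, _, _, height, _ = block
        key = (round(x / x_tol), round(height / h_tol))  # coarse alignment and size bucket
        clusters[key].append(block)
    # Data records usually form the largest group of visually similar blocks.
    return max(clusters.values(), key=len)

for record in group_records(blocks):
    print(record[4])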

Relevance:

100.00%

Publisher:

Abstract:

Privacy and security on the Internet are a topical issue. Through the study of various documents and experimentation with applications for guaranteeing anonymity, this thesis analyses the current situation. Our privacy is compromised, and global awareness-raising is important.

Relevance:

100.00%

Publisher:

Abstract:

In the 21st century, where the new Information and Communication Technologies are the order of the day, manifestations that traditionally belonged to other, less virtual environments now occur online and encompass previously unknown age groups in which gender differences are evident. With easy Internet access and connectivity, many of us have sufficient tools to write on social networks about certain emotions in their most extreme degree, as well as suicidal ideation. However, there are deeper locations, unknown to some users, such as the Deep Web (and its browser, Tor), which allow complete user anonymity. Hence the need arises to create a corpus of suicide-related messages and messages related to deep emotions, in order to analyse their lexicon by computational means, with a prior categorisation of the results, so as to foster the creation of programs that detect these manifestations and perform a preventive role.

Relevance:

100.00%

Publisher:

Abstract:

Work based on the report for the course "Sociologia das Novas Tecnologias de Informação", within the Integrated Master's in Industrial Engineering and Management of the Faculdade de Ciências e Tecnologia da Universidade Nova de Lisboa, 2015-16. The work was supervised by Prof. António Brandão Moniz of the Departamento de Ciências Sociais Aplicadas (DCSA) at the same faculty.

Relevance:

70.00%

Publisher:

Abstract:

The Web is a steadily evolving resource comprising much more than mere HTML pages. With its ever-growing data sources in a variety of formats, it provides great potential for knowledge discovery. In this article, we shed light on some interesting phenomena of the Web: the deep Web, which surfaces database records as Web pages; the Semantic Web, which defines meaningful data exchange formats; XML, which has established itself as a lingua franca for Web data exchange; and domain-specific markup languages, which are designed based on XML syntax with the goal of preserving semantics in targeted domains. We detail these four developments in Web technology, and explain how they can be used for data mining. Our goal is to show that all these areas can be as useful for knowledge discovery as the HTML-based part of the Web.
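
As a small, generic illustration of the XML point, not taken from the article, the sketch below parses a hypothetical XML fragment with Python's standard library and flattens it into attribute/value records that a data-mining algorithm could consume.

import xml.etree.ElementTree as ET

# Hypothetical XML data, e.g. surfaced from a deep-Web database or a Web feed.
xml_data = """
<catalog>
  <book genre="databases"><title>Query Processing</title><price>35.00</price></book>
  <book genre="web"><title>Mining the Web</title><price>42.50</price></book>
</catalog>
"""

root = ET.fromstring(xml_data)
# Flatten the tree into attribute/value records suitable for data mining.
records = [
    {"genre": book.get("genre"),
     "title": book.findtext("title"),
     "price": float(book.findtext("price"))}
    for book in root.iter("book")
]
print(records)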

Relevance:

60.00%

Publisher:

Abstract:

CH.1: introduction to the creation of the Internet and its diffusion; CH.2: overview of the data available online and the tools with which information can be extracted from them; CH.3: concepts of privacy and anonymity applied to the Internet, some regulations, and a summary of cookies and spyware; CH.4: the deep web, what it is and how to reach it; CH.5: the TOR project, a list of its components, an explanation of the protocol for creating anonymous connections, peculiarities and problematic aspects; CH.6: conclusions; an overview of projects related to TOR, statistics on the use of the anonymous Internet, and considerations on the social effects of anonymity and on the inviolability of this system.

Relevance:

60.00%

Publisher:

Abstract:

The expansion of information and communications technology (ICT) has brought many benefits, but also some dangers. News about ICT-related crime is now common. The terms cyber crime and cyber terrorism are widely used, but are they really a threat to society? This work analyses cyber crime and cyber terrorism through an in-depth study from several points of view. First, the basic aspects of the subject are examined: the context in which these activities take place, cyberspace and its characteristics, the advantages cyber crime has over traditional crime, the characteristics and relevant examples of cyber terrorism, and the importance of protecting critical infrastructures. Then the world of cyber crime is studied: the different types of cyber criminals, the most common criminal acts, the tools and techniques used, the deep web and cryptocurrency. Some of the best-known criminal groups and their activities are also described, and the economic consequences of cyber crime are assessed. Finally, the legal means that different countries and organisations have established to combat these criminal acts are reviewed: security strategies of various kinds approved in many countries around the world, operational response groups (both law enforcement and CSIRT/CERT), and the legislation published to prosecute cyber crime and cyber terrorism, with special attention to Spanish legislation. After reading this work, the reader thus gains a complete overview of the world of cyber crime and cyber terrorism.

Relevance:

60.00%

Publisher:

Abstract:

The growing number of sites and resources on the Internet has made it harder to search for and retrieve information useful for education and research. Although the Web contains extremely important sources, part of it is unknown to traditional search engines; this part is called the Deep Web. To access this small universe of the Internet, it is necessary to know the mechanisms, strategies and tools that facilitate and guarantee the achievement of our objectives.

Relevance:

60.00%

Publisher:

Abstract:

Recent years have seen an astronomical rise in SQL Injection Attacks (SQLIAs) used to compromise the confidentiality, authentication and integrity of organisations' databases. Intruders are becoming smarter at obfuscating web requests to evade detection, and this, combined with increasing volumes of web traffic from the Internet of Things (IoT), cloud-hosted and on-premise business applications, has made it evident that existing approaches, which rely mostly on static signatures, cannot cope with novel attack patterns. A SQLIA detection and prevention solution can be achieved by exploring an alternative bio-inspired supervised learning approach that takes a labelled dataset of numerical attributes as input for classifying true positives and negatives. In this paper we present Numerical Encoding to Tame SQLIA (NETSQLIA), a proof of concept for scalable numerical encoding of features into labelled dataset attributes obtained from deep web traffic analysis. For the numerical attribute encoding, the model leverages a proxy to intercept and decrypt web traffic. The intercepted web requests are then assembled for front-end SQL parsing and pattern matching using a traditional Non-Deterministic Finite Automaton (NFA). The paper describes a technique for extracting numerical attributes of any size, primed as an input dataset for an Artificial Neural Network (ANN) and statistical Machine Learning (ML) algorithms, implemented using a Two-Class Averaged Perceptron (TCAP) and Two-Class Logistic Regression (TCLR) respectively. This methodology forms the subject of an empirical evaluation of the model's suitability for accurately classifying both legitimate web requests and SQLIA payloads.
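
The sketch below is a minimal illustration of the general idea of encoding web requests as numerical attributes for a two-class learner; the features, keyword list and toy data are hypothetical, it stands in for, rather than reproduces, the NFA-based parsing and the TCAP/TCLR models described above, and it uses scikit-learn's LogisticRegression for the logistic-regression side.

import re
from sklearn.linear_model import LogisticRegression

SQL_KEYWORDS = re.compile(r"\b(union|select|insert|drop|or|and|sleep)\b", re.I)

def encode(request):
    """Map a raw web request to hypothetical numerical attributes."""
    return [
        len(request),                                 # request length
        request.count("'") + request.count('"'),      # quote characters
        len(SQL_KEYWORDS.findall(request)),           # SQL keyword occurrences
        request.count("--") + request.count("#"),     # comment markers
    ]

# Toy labelled traffic: 1 = SQLIA payload, 0 = legitimate request.
requests = [
    ("id=42&page=2", 0),
    ("q=deep+web+search", 0),
    ("id=1' OR '1'='1' --", 1),
    ("name=a' UNION SELECT password FROM users --", 1),
]
X = [encode(r) for r, _ in requests]
y = [label for _, label in requests]

classifier = LogisticRegression().fit(X, y)
print(classifier.predict([encode("id=7"), encode("id=7' OR sleep(5)--")]))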