933 results for unclean internet data


Relevance:

30.00%

Publisher:

Abstract:

[EN] This paper is an outcome of the ERASMUS IP program called TOPCART; more information about this project can be accessed from the following item:

Relevance:

30.00%

Publisher:

Abstract:

Anyone who in future needs information about a location or an area, whether literature, measurement data, photos or administrative information, may simply click on that spot on a screen map on the Internet. A search programme started in this way will offer all the available information held in databanks. A step towards such a solution, the retrieval of location-related literature and measurement data from different kinds of databanks, is presented by the project “Baltic Sea Web” (http://www.baltic.vtt.fi/demonstrator/index.html). The basic idea was to make the available information about a certain location accessible by linking its geographical coordinates, longitude and latitude, to a map in a web browser.

Relevance:

30.00%

Publisher:

Abstract:

The dissertation is concerned with the mathematical study of various network problems. First, three real-world networks are considered: (i) the human brain network, (ii) communication networks, and (iii) electric power networks. Although these networks perform very different tasks, they share similar mathematical foundations. The high-level goal is to analyze and/or synthesize each of these systems from a “control and optimization” point of view. After studying these three real-world networks, two abstract network problems are also explored, which are motivated by power systems. The first one is “flow optimization over a flow network” and the second one is “nonlinear optimization over a generalized weighted graph”. The results derived in this dissertation are summarized below.

Brain Networks: Neuroimaging data reveals the coordinated activity of spatially distinct brain regions, which may be represented mathematically as a network of nodes (brain regions) and links (interdependencies). To obtain the brain connectivity network, the graphs associated with the correlation matrix and the inverse covariance matrix—describing marginal and conditional dependencies between brain regions—have been proposed in the literature. A question arises as to whether any of these graphs provides useful information about the brain connectivity. Due to the electrical properties of the brain, this problem is investigated in the context of electrical circuits. First, we consider an electric circuit model and show that the inverse covariance matrix of the node voltages reveals the topology of the circuit. Second, we study the problem of finding the topology of the circuit based only on measurements. In this case, assuming that the circuit is hidden inside a black box and only the nodal signals are available for measurement, the aim is to recover the topology of the circuit from a limited number of samples. For this purpose, we deploy the graphical lasso technique to estimate a sparse inverse covariance matrix. It is shown that the graphical lasso may find most of the circuit topology if the exact covariance matrix is well-conditioned, but it may fail to work well when this matrix is ill-conditioned. To deal with ill-conditioned matrices, we propose a small modification to the graphical lasso algorithm and demonstrate its performance. Finally, the technique developed in this work is applied to the resting-state fMRI data of a number of healthy subjects.
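
As an illustration of the estimation step described above, the following minimal Python sketch recovers a sparse conditional-dependence graph from synthetic nodal signals with scikit-learn's GraphicalLasso. The data, the penalty value and the edge threshold are illustrative assumptions, not the circuit model or the modified algorithm from the dissertation.

    # Minimal sketch: estimate a sparse inverse covariance (precision) matrix from
    # samples and read its nonzero off-diagonal entries as edges of a dependency graph.
    # Synthetic data and the alpha penalty are illustrative assumptions.
    import numpy as np
    from sklearn.covariance import GraphicalLasso

    rng = np.random.default_rng(0)

    true_precision = np.array([[ 2.0, -0.8,  0.0],
                               [-0.8,  2.0, -0.5],
                               [ 0.0, -0.5,  2.0]])
    cov = np.linalg.inv(true_precision)
    samples = rng.multivariate_normal(np.zeros(3), cov, size=500)

    # L1-penalized maximum-likelihood estimate of the precision matrix.
    model = GraphicalLasso(alpha=0.05).fit(samples)
    precision = model.precision_

    # Edges of the estimated graph: entries clearly away from zero off the diagonal.
    edges = [(i, j) for i in range(3) for j in range(i + 1, 3)
             if abs(precision[i, j]) > 1e-3]
    print(edges)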

Communication Networks: Congestion control techniques aim to adjust the transmission rates of competing users in the Internet in such a way that the network resources are shared efficiently. Despite the progress in the analysis and synthesis of Internet congestion control, almost all existing fluid models of congestion control assume that every link in the path of a flow observes the original source rate. To address this issue, a more accurate model is derived in this work for the behavior of the network under an arbitrary congestion controller, which takes into account the effect of buffering (queueing) on data flows. Using this model, it is proved that the well-known Internet congestion control algorithms may no longer be stable for the common pricing schemes, unless a sufficient condition is satisfied. It is also shown that these algorithms are guaranteed to be stable if a new pricing mechanism is used.
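
For context, the sketch below simulates the classic single-link fluid model that the abstract says most prior work relies on: every source reacts to one congestion price computed from the instantaneous total rate. The price function, gains and capacity are made-up illustrative values, and the buffering-aware model proposed in the dissertation is not reproduced here.

    # Classic single-link primal congestion-control fluid model (illustrative only):
    # each source follows dx/dt = gain * (weight - x * p), with a shared price p that
    # grows once the total rate exceeds the link capacity.
    import numpy as np

    def price(total_rate, capacity=10.0):
        # Illustrative congestion price, zero until the link is overloaded.
        return max(0.0, total_rate - capacity) / capacity

    def simulate(num_sources=3, gain=1.0, weight=1.0, steps=5000, dt=0.05):
        rates = np.ones(num_sources)            # initial source rates
        for _ in range(steps):
            p = price(rates.sum())              # every source sees the same price
            rates += dt * gain * (weight - rates * p)
            rates = np.clip(rates, 0.0, None)
        return rates

    print(simulate())                           # symmetric sources settle at equal rates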

Electrical Power Networks: Optimal power flow (OPF) has been one of the most studied problems for power systems since its introduction by Carpentier in 1962. This problem is concerned with finding an optimal operating point of a power network minimizing the total power generation cost subject to network and physical constraints. It is well known that OPF is computationally hard to solve due to the nonlinear interrelation among the optimization variables. The objective is to identify a large class of networks over which every OPF problem can be solved in polynomial time. To this end, a convex relaxation is proposed, which solves the OPF problem exactly for every radial network and every meshed network with a sufficient number of phase shifters, provided power over-delivery is allowed. The concept of “power over-delivery” is equivalent to relaxing the power balance equations to inequality constraints.
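
The "power over-delivery" idea can be pictured on a toy dispatch problem: the nodal power-balance equalities are written as inequalities inside a convex program. The two-bus data, costs and limits below are invented for illustration, and this is not the SDP relaxation developed in the dissertation.

    # Toy illustration of relaxing nodal power-balance equalities to inequalities
    # ("power over-delivery"); all numbers are made up.
    import cvxpy as cp

    demand = [3.0, 2.0]                  # loads at buses 0 and 1
    gen = cp.Variable(2, nonneg=True)    # generator outputs at the two buses
    flow = cp.Variable()                 # flow on the single line from bus 0 to bus 1

    cost = 1.0 * gen[0] + 4.0 * gen[1]   # bus 0 generation is cheaper

    constraints = [
        gen[0] - demand[0] >= flow,      # relaxed balance at bus 0 (>= instead of ==)
        gen[1] - demand[1] >= -flow,     # relaxed balance at bus 1
        cp.abs(flow) <= 2.5,             # line capacity
        gen <= 6.0,                      # generator limits
    ]

    problem = cp.Problem(cp.Minimize(cost), constraints)
    problem.solve()
    print(gen.value, float(flow.value), problem.value)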

Flow Networks: In this part of the dissertation, the minimum-cost flow problem over an arbitrary flow network is considered. In this problem, each node is associated with some possibly unknown injection, each line has two unknown flows at its ends related to each other via a nonlinear function, and all injections and flows need to satisfy certain box constraints. This problem, named generalized network flow (GNF), is highly non-convex due to its nonlinear equality constraints. Under the assumption of monotonicity and convexity of the flow and cost functions, a convex relaxation is proposed, which always finds the optimal injections. A primary application of this work is in the OPF problem. The results of this work on GNF prove that the relaxation on power balance equations (i.e., load over-delivery) is not needed in practice under a very mild angle assumption.

Generalized Weighted Graphs: Motivated by power optimizations, this part aims to find a global optimization technique for a nonlinear optimization defined over a generalized weighted graph. Every edge of this type of graph is associated with a weight set corresponding to the known parameters of the optimization (e.g., the coefficients). The motivation behind this problem is to investigate how the (hidden) structure of a given real/complex valued optimization makes the problem easy to solve, and indeed the generalized weighted graph is introduced to capture the structure of an optimization. Various sufficient conditions are derived, which relate the polynomial-time solvability of different classes of optimization problems to weak properties of the generalized weighted graph such as its topology and the sign definiteness of its weight sets. As an application, it is proved that a broad class of real and complex optimizations over power networks are polynomial-time solvable due to the passivity of transmission lines and transformers.

Relevance:

30.00%

Publisher:

Abstract:

Smartphones and other powerful sensor-equipped consumer devices make it possible to sense the physical world at an unprecedented scale. Nearly 2 million Android and iOS devices are activated every day, each carrying numerous sensors and a high-speed internet connection. Whereas traditional sensor networks have typically deployed a fixed number of devices to sense a particular phenomenon, community networks can grow as additional participants choose to install apps and join the network. In principle, this allows networks of thousands or millions of sensors to be created quickly and at low cost. However, making reliable inferences about the world using so many community sensors involves several challenges, including scalability, data quality, mobility, and user privacy.

This thesis focuses on how learning at both the sensor- and network-level can provide scalable techniques for data collection and event detection. First, this thesis considers the abstract problem of distributed algorithms for data collection, and proposes a distributed, online approach to selecting which set of sensors should be queried. In addition to providing theoretical guarantees for submodular objective functions, the approach is also compatible with local rules or heuristics for detecting and transmitting potentially valuable observations. Next, the thesis presents a decentralized algorithm for spatial event detection, and describes its use in detecting strong earthquakes within the Caltech Community Seismic Network. Despite the fact that strong earthquakes are rare and complex events, and that community sensors can be very noisy, our decentralized anomaly detection approach obtains theoretical guarantees for event detection performance while simultaneously limiting the rate of false alarms.
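
To make the sensor-selection idea concrete, here is a minimal greedy-selection sketch for a submodular coverage objective, the setting in which the usual (1 - 1/e)-style greedy guarantees apply. The sensors, their coverage sets and the query budget are illustrative, and the distributed, online aspects of the thesis are not shown.

    # Greedy selection under a submodular coverage objective (illustrative data).
    def greedy_select(coverage, budget):
        """coverage: dict mapping sensor id -> set of regions it observes."""
        chosen, covered = [], set()
        remaining = set(coverage)
        for _ in range(min(budget, len(remaining))):
            # pick the sensor with the largest marginal coverage gain
            best = max(remaining, key=lambda s: len(coverage[s] - covered))
            chosen.append(best)
            covered |= coverage[best]
            remaining.remove(best)
        return chosen, covered

    sensors = {
        "phone_a": {"r1", "r2"},
        "phone_b": {"r2", "r3", "r4"},
        "phone_c": {"r4"},
    }
    print(greedy_select(sensors, budget=2))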

Relevance:

30.00%

Publisher:

Abstract:

[ES] This project aims at the design and implementation of a tool for integrating the Internet quality-of-service (QoS) data published by the Spanish regulator. The tool is intended, on the one hand, to unify the different formats in which the QoS data are published and, on the other, to facilitate the preservation of the data, supporting the production of historical records, statistics and reports. On the regulator's website only the data for the last five quarters can be accessed, and previously published data do not remain accessible but are replaced by the most recent ones, so that, from the end user's point of view, these data are lost. The tool proposed in this work solves this problem, in addition to unifying formats and easing access to the data of interest. The system was designed using the latest web application development technologies, so its power and the scope for future extensions are considerable.

Relevance:

30.00%

Publisher:

Abstract:

185 p.

Relevance:

30.00%

Publisher:

Abstract:

[ES] This final-year project presents a study of different methodologies for estimating Internet access speed. The study not only analyses the methodologies of the most widely used tools but also considers the main influencing factors, examining their overall effect on the results obtained. The results of this study will give the various parties involved useful information for developing their own tools. In addition, the conclusions of the study could lead, in the near future, to the standardisation of a unified methodology by the international bodies of the sector, enabling data comparisons as well as the verification of service-level agreements, which is of interest to users, operators and regulators.

Relevance:

30.00%

Publisher:

Abstract:

Realizing the Internet of Things (IoT) requires the integration and interaction of devices and services with heterogeneous communication protocols. The data generated by the devices need to be analysed and interpreted according to a common data model, which can be addressed with technologies for semantic modelling, processing, reasoning and data persistence. Context-aware computing offers solutions to these challenges, with mechanisms that associate context data with the data collected by the devices. However, the IoT needs to go beyond context-aware computing; solutions for security, privacy and scalability are needed at the same time. Integrating these technologies requires supporting infrastructure, which can be implemented as a middleware. A centralized solution for integrating heterogeneous devices, however, can hurt scalability. This integration is therefore delegated to software agents, which are responsible for integrating the devices and services, encapsulating the specifics of their interfaces and communication protocols. This work explores the security, persistence and naming aspects of resource agents. To this end, ContQuest, a framework that facilitates the integration of new resources and the development of context-aware applications for the IoT, was developed, based on a service architecture and a data model. ContQuest includes consistent solutions for persistence, security and access control both for the middleware services and for the Resource Agents, which encapsulate devices and services, and for client applications. ContQuest uses OWL for resource modelling and includes a mechanism for generating universally unique identifiers in the ontologies. A ContQuest prototype was developed and validated through the integration of three Resource Agents for real devices: an Arduino device, an RFID reader and a sensor network. An experiment was also carried out to evaluate the performance of the system components, observing the impact of the proposed security mechanism on the prototype's performance. The validation and performance results are satisfactory.

Relevance:

30.00%

Publisher:

Abstract:

This paper presents a design method for an Internet-based networked robot control system with real-time and reliability guarantees. A networked real-time control system designed with this method can satisfy the real-time, efficient and flexible technical requirements of robots. The core of the method is a network data compensation algorithm based on the UDP transport protocol: by predicting and compensating, online and in real time, for the data lost during network transmission, the impact of network data loss on the system is reduced. Experimental results demonstrate the effectiveness and soundness of the method.
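
As a rough illustration of the compensation idea, the sketch below fills in a missing UDP sample by linearly extrapolating the last two received values. The sequence-numbered stream and the predictor are simplified assumptions, not the algorithm from the paper.

    # Fill gaps in a sequence-numbered sample stream by online prediction
    # (linear extrapolation of the last two values); illustrative only.
    def compensate(received, num_expected):
        """received: dict mapping sequence number -> value for packets that arrived."""
        history, output = [], []
        for seq in range(num_expected):
            if seq in received:
                value = received[seq]
            elif len(history) >= 2:
                value = 2 * history[-1] - history[-2]   # linear extrapolation
            elif history:
                value = history[-1]                     # hold the last value
            else:
                value = 0.0                             # nothing received yet
            history.append(value)
            output.append(value)
        return output

    samples = {0: 0.0, 1: 0.1, 3: 0.35, 4: 0.4}         # packet 2 was lost
    print(compensate(samples, num_expected=5))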

Relevance:

30.00%

Publisher:

Abstract:

2000

Relevance:

30.00%

Publisher:

Abstract:

As distributed information services like the World Wide Web become increasingly popular on the Internet, problems of scale are clearly evident. A promising technique that addresses many of these problems is service (or document) replication. However, when a service is replicated, clients then need the additional ability to find a "good" provider of that service. In this paper we report on techniques for finding good service providers without a priori knowledge of server location or network topology. We consider the use of two principal metrics for measuring distance in the Internet: hops, and round-trip latency. We show that these two metrics yield very different results in practice. Surprisingly, we show data indicating that the number of hops between two hosts in the Internet is not strongly correlated to round-trip latency. Thus, the distance in hops between two hosts is not necessarily a good predictor of the expected latency of a document transfer. Instead of using known or measured distances in hops, we show that the extra cost at runtime incurred by dynamic latency measurement is well justified based on the resulting improved performance. In addition, we show that selection based on dynamic latency measurement performs much better in practice than any static selection scheme. Finally, the difference between the distribution of hops and latencies is fundamental enough to suggest differences in algorithms for server replication. We show that conclusions drawn about service replication based on the distribution of hops need to be revised when the distribution of latencies is considered instead.
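
A minimal sketch of dynamic latency-based selection: time a TCP connection to each candidate replica and pick the fastest. The host names are placeholders, and a real client would measure repeatedly and cache the results; the point is only to contrast measured latency with a static, hop-based choice.

    # Pick the replica with the lowest measured connection latency (placeholder hosts).
    import socket
    import time

    def rtt(host, port=80, timeout=2.0):
        start = time.monotonic()
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return time.monotonic() - start
        except OSError:
            return float("inf")          # unreachable hosts lose the comparison

    def pick_fastest(hosts):
        return min(hosts, key=rtt)

    replicas = ["example.com", "example.org", "example.net"]   # placeholder mirrors
    print(pick_fastest(replicas))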

Relevance:

30.00%

Publisher:

Abstract:

This paper proposes a novel protocol which uses the Internet Domain Name System (DNS) to partition Web clients into disjoint sets, each of which is associated with a single DNS server. We define an L-DNS cluster to be a grouping of Web clients that use the same local DNS server to resolve Internet host names. We identify such clusters in real time using data obtained from a Web server in conjunction with that server's authoritative DNS, both instrumented with an implementation of our clustering algorithm. Using these clusters, we perform measurements from four distinct Internet locations. Our results show that L-DNS clustering enables a better estimation of the proximity of a Web client to a Web server than previously proposed techniques. Thus, in a Content Distribution Network, a DNS-based scheme that redirects a request from a web client to one of many servers based on the client's name server coordinates (e.g., hops/latency/loss-rates between the client and servers) would perform better with our algorithm.
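
The clustering step itself is simple to picture: once Web-server and authoritative-DNS log entries have been paired, clients are grouped by the local resolver that issued their lookups. The joined records below are invented placeholders; the pairing logic and the real-time instrumentation from the paper are not shown.

    # Group Web clients into L-DNS clusters keyed by their local DNS resolver.
    from collections import defaultdict

    def cluster_by_ldns(records):
        """records: iterable of (client_ip, local_dns_ip) pairs from joined logs."""
        clusters = defaultdict(set)
        for client_ip, local_dns_ip in records:
            clusters[local_dns_ip].add(client_ip)
        return dict(clusters)

    joined_log = [
        ("198.51.100.7", "203.0.113.53"),
        ("198.51.100.9", "203.0.113.53"),
        ("192.0.2.20",   "203.0.113.99"),
    ]
    print(cluster_by_ldns(joined_log))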

Relevance:

30.00%

Publisher:

Abstract:

One relatively unexplored question about the Internet's physical structure concerns the geographical location of its components: routers, links and autonomous systems (ASes). We study this question using two large inventories of Internet routers and links, collected by different methods and about two years apart. We first map each router to its geographical location using two different state-of-the-art tools. We then study the relationship between router location and population density; between geographic distance and link density; and between the size and geographic extent of ASes. Our findings are consistent across the two datasets and both mapping methods. First, as expected, router density per person varies widely over different economic regions; however, in economically homogeneous regions, router density shows a strong superlinear relationship to population density. Second, the probability that two routers are directly connected is strongly dependent on distance; our data is consistent with a model in which a majority (up to 75-95%) of link formation is based on geographical distance (as in the Waxman topology generation method). Finally, we find that ASes show high variability in geographic size, which is correlated with other measures of AS size (degree and number of interfaces). Among small to medium ASes there is wide variability in geographic dispersal; however, all ASes exceeding a certain size threshold are maximally dispersed geographically. These findings have many implications for the next generation of topology generators, which we envisage as producing router-level graphs annotated with attributes such as link latencies, AS identifiers and geographical locations.
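
For reference, the Waxman model mentioned above links two nodes with a probability that decays exponentially with their distance. The sketch below samples a small random topology this way; the coordinates and the alpha/beta parameters are arbitrary choices, not values fitted to the paper's router datasets.

    # Waxman-style topology generation: link probability decays with distance.
    import math
    import random

    def waxman_link_prob(u, v, alpha=0.4, beta=0.1, max_dist=1.0):
        # P(u, v) = alpha * exp(-d(u, v) / (beta * max_dist))
        return alpha * math.exp(-math.dist(u, v) / (beta * max_dist))

    random.seed(0)
    nodes = [(random.random(), random.random()) for _ in range(50)]
    diameter = math.sqrt(2)          # maximum possible distance in the unit square
    edges = [(i, j)
             for i in range(len(nodes)) for j in range(i + 1, len(nodes))
             if random.random() < waxman_link_prob(nodes[i], nodes[j], max_dist=diameter)]
    print(len(edges), "links generated")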

Relevance:

30.00%

Publisher:

Abstract:

The measurement of users’ attitudes towards and confidence in using the Internet is an important yet poorly researched topic. Previous research has encountered issues that serve to obfuscate rather than clarify. Such issues include a lack of distinction between the terms ‘attitude’ and ‘self-efficacy’, the absence of a theoretical framework to measure each concept, and failure to follow well-established techniques for the measurement of both attitude and self-efficacy. Thus, the primary aim of this research was to develop two statistically reliable scales which independently measure attitudes towards the Internet and Internet self-efficacy. This research addressed the outlined issues by applying appropriate theoretical frameworks to each of the constructs under investigation. First, the well-known three-component (affect, behaviour, cognition) model of attitudes was applied to previous Internet attitude statements. The scale was distributed to four large samples of participants. Exploratory factor analyses revealed four underlying factors in the scale: Internet Affect, Internet Exhilaration, Social Benefit of the Internet and Internet Detriment. The final scale contains 21 items, demonstrates excellent reliability and achieved excellent model fit in the confirmatory factor analysis. Second, Bandura’s (1997) model of self-efficacy was followed to develop a reliable measure of Internet self-efficacy. Data collected as part of this research suggest that there are ten main activities which individuals can carry out on the Internet. Preliminary analyses suggested that self-efficacy is confounded with previous experience; thus, individuals were invited to indicate how frequently they performed the listed Internet tasks in addition to rating their feelings of self-efficacy for each task. The scale was distributed to a sample of 841 participants. Results from the analyses suggest that the more frequently an individual performs an activity on the Internet, the higher their self-efficacy score for that activity. This suggests that frequency of use ought to be taken into account in individuals’ self-efficacy scores to obtain a ‘true’ self-efficacy score for the individual. Thus, a formula was devised to incorporate participants’ previous experience of Internet tasks in their Internet self-efficacy scores. This formula was then used to obtain an overall Internet self-efficacy score for participants. Following the development of both scales, gender and age differences were explored in Internet attitudes and Internet self-efficacy scores. The analyses indicated that there were no gender differences between groups for Internet attitude or Internet self-efficacy scores. However, age group differences were identified for both attitudes and self-efficacy. Individuals aged 25-34 years achieved the highest scores on both the Internet attitude and Internet self-efficacy measures. Internet attitude and self-efficacy scores tended to decrease with age, with older participants achieving lower scores on both measures than younger participants. It was also found that the more exposure individuals had to the Internet, the higher their Internet attitude and Internet self-efficacy scores. Examination of the relationship between attitude and self-efficacy found a significant positive relationship between the two measures, suggesting that the two constructs are related. Implications of these findings and directions for future research are outlined in detail in the Discussion section of this thesis.

Relevance:

30.00%

Publisher:

Abstract:

BACKGROUND: Sharing of epidemiological and clinical data sets among researchers is poor at best, to the detriment of science and the community at large. The purpose of this paper is therefore to (1) describe a novel Web application designed to share information on study data sets focusing on epidemiological clinical research in a collaborative environment and (2) create a policy model placing this collaborative environment into the current scientific social context. METHODOLOGY: The Database of Databases application was developed based on feedback from epidemiologists and clinical researchers requiring a Web-based platform that would allow for sharing of information about epidemiological and clinical study data sets in a collaborative environment. This platform should ensure that researchers can modify the information. Model-based predictions of the number of publications and funding resulting from combinations of different policy implementation strategies (for metadata and data sharing) were generated using System Dynamics modeling. PRINCIPAL FINDINGS: The application allows researchers to easily upload information about clinical study data sets, which is searchable and modifiable by other users in a wiki environment. All modifications are filtered by the database principal investigator in order to maintain quality control. The application has been extensively tested and currently contains 130 clinical study data sets from the United States, Australia, China and Singapore. Model results indicated that any policy implementation would be better than the current strategy, that metadata sharing is better than data sharing, and that combined policies achieve the best results in terms of publications. CONCLUSIONS: Based on our empirical observations and resulting model, the social network environment surrounding the application can help epidemiologists and clinical researchers contribute and search for metadata in a collaborative environment, thus potentially facilitating collaboration efforts among research communities distributed around the globe.