940 results for World Wide Web (WWW)


Relevance: 100.00%

Abstract:

Decision support systems (DSS) have evolved rapidly during the last decade from stand-alone or limited networked solutions to online participatory solutions. One of the major enablers of this change is one of the fastest-growing areas of geographical information system (GIS) technology development: the use of the Internet as a means to access, display, and analyze geospatial data remotely. Worldwide, many federal, state, and particularly local governments are designing systems to facilitate data sharing using interactive Internet map servers. This new generation of DSS or planning support systems (PSS), the interactive Internet map server, is a solution for delivering dynamic maps and GIS data and services via the World Wide Web, and for providing public participatory GIS (PPGIS) opportunities to a wider community (Carver, 2001; Jankowski & Nyerges, 2001). It provides a highly scalable framework for GIS Web publishing, Web-based public participatory GIS (WPPGIS), which meets the needs of corporate intranets and the demands of worldwide Internet access (Craig, 2002). The establishment of WPPGIS provides spatial data access through a support centre or a GIS portal to facilitate efficient access to and sharing of related geospatial data (Yigitcanlar, Baum, & Stimson, 2003). As more and more public and private entities adopt WPPGIS technology, the importance and complexity of facilitating geospatial data sharing is growing rapidly (Carver, 2003). Therefore, this article focuses on the online public participation dimension of GIS technology. The article provides an overview of recent literature on GIS and WPPGIS, and includes a discussion of the potential use of these technologies in providing a democratic platform for the public in decision-making.
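The abstract does not name a particular map-server product or protocol. Purely as an illustration of how an interactive Internet map server delivers dynamic maps over the Web, the sketch below builds a standard OGC WMS GetMap request in Python; the host name and layer names are hypothetical.

    from urllib.parse import urlencode

    # Hypothetical endpoint of an interactive Internet map server.
    WMS_BASE = "https://maps.example.gov/wms"

    # A standard OGC WMS 1.1.1 GetMap request: the server renders the requested
    # layers for the given bounding box into a map image a browser can display.
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": "landuse,roads",          # hypothetical layer names
        "SRS": "EPSG:4326",                 # latitude/longitude coordinates
        "BBOX": "152.9,-27.6,153.2,-27.3",  # min lon, min lat, max lon, max lat
        "WIDTH": "800",
        "HEIGHT": "600",
        "FORMAT": "image/png",
    }

    print(f"{WMS_BASE}?{urlencode(params)}")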

Relevance: 100.00%

Abstract:

Interactive documents for use with the World Wide Web have been developed for viewing multi-dimensional radiographic and visual images of human anatomy, derived from the Visible Human Project. Emphasis has been placed on user-controlled features and selections. The purpose was to develop an interface, independent of the host operating system and browser software, that would allow information to be viewed by multiple users. The interfaces were implemented using HyperText Markup Language (HTML) forms, the C programming language and the Perl scripting language. Images were pre-processed using ANALYZE and stored on a Web server in CompuServe GIF format. Viewing options were included in the document design, such as interactive thresholding and two-dimensional slice direction. The interface is an example of what may be achieved using the World Wide Web. Key applications envisaged for such software include education, research, access to information in internal databases, and the simultaneous sharing of images between remote computers by health personnel for diagnostic purposes.
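The original interface was written in Perl and C; as a rough sketch only, the Python CGI-style handler below shows the general shape of such a script: it reads the submitted form parameters, locates a pre-processed GIF slice, and streams it back to the browser. The directory layout and parameter names are hypothetical.

    #!/usr/bin/env python3
    """Minimal sketch of a CGI handler of the kind described in the abstract.

    The original interface used Perl and C; this Python version is an
    illustration only. Parameter names and file layout are hypothetical.
    """
    import os
    import sys
    from urllib.parse import parse_qs

    IMAGE_DIR = "/var/www/visible-human/gif"   # hypothetical location of pre-processed GIF slices

    def main() -> None:
        # CGI passes the form submission in the QUERY_STRING environment variable.
        query = parse_qs(os.environ.get("QUERY_STRING", ""))
        direction = query.get("direction", ["axial"])[0]   # e.g. axial, coronal, sagittal
        slice_no = int(query.get("slice", ["0"])[0])

        path = os.path.join(IMAGE_DIR, direction, f"slice_{slice_no:04d}.gif")

        # Emit the HTTP response: header, blank line, then the raw GIF bytes.
        sys.stdout.write("Content-Type: image/gif\r\n\r\n")
        sys.stdout.flush()
        with open(path, "rb") as fh:
            sys.stdout.buffer.write(fh.read())

    if __name__ == "__main__":
        main()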

Relevance: 100.00%

Abstract:

The proliferation of the web presents an unsolved problem of automatically analyzing billions of pages of natural language. We introduce a scalable algorithm that clusters hundreds of millions of web pages into hundreds of thousands of clusters. It does this on a single mid-range machine using efficient algorithms and compressed document representations. It is applied to two web-scale crawls, ClueWeb09 and ClueWeb12, covering tens of terabytes; these contain 500 million and 733 million web pages and were clustered into 500,000 to 700,000 clusters. To the best of our knowledge, such fine-grained clustering has not been previously demonstrated. Previous approaches clustered a sample, which limits the maximum number of discoverable clusters. The proposed EM-tree algorithm uses the entire collection in clustering and produces several orders of magnitude more clusters than the existing algorithms. Fine-grained clustering is necessary for meaningful clustering in massive collections where the number of distinct topics grows linearly with collection size. These fine-grained clusters show improved cluster quality when assessed with two novel evaluations using ad hoc search relevance judgments and spam classifications for external validation. These evaluations solve the problem of assessing the quality of clusters where categorical labeling is unavailable or infeasible.
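The EM-tree algorithm itself is not reproduced here. As a stand-in illustration of clustering compressed document representations, the sketch below hashes documents to fixed-size vectors and clusters them with scikit-learn's MiniBatchKMeans; this is not the authors' method and does not match its scalability.

    # Illustration only: a stand-in for large-scale document clustering using
    # hashed (compressed) document vectors. This is NOT the EM-tree algorithm.
    from sklearn.feature_extraction.text import HashingVectorizer
    from sklearn.cluster import MiniBatchKMeans

    docs = [
        "web page about search engines",
        "web page about information retrieval",
        "recipe for sourdough bread",
        "bread baking instructions",
    ]

    # Hashing gives a fixed-size, vocabulary-free representation of each document.
    vectors = HashingVectorizer(n_features=2**12, alternate_sign=False).transform(docs)

    # Mini-batch k-means processes the data in small batches, so it scales to
    # collections much larger than this toy example.
    km = MiniBatchKMeans(n_clusters=2, n_init=3, random_state=0).fit(vectors)
    print(km.labels_)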

Relevance: 100.00%

Abstract:

The purpose of this study is to examine what kinds of information search strategies searchers use when looking for information on the Internet. Users are classified into three groups according to their search strategy. Search-oriented users mostly use search engines, both those covering the whole Internet and site-internal ones. Link-oriented users, in turn, either know or guess the address of the target site, or use large hierarchical directories to find information. They also prefer to navigate within a site by following links and generally do not use the search function. Differentiated users do not consistently favour either approach, but choose a strategy according to the task. Data were collected in two ways: with a questionnaire on a WWW page and with a search test in which users were given various information search tasks to perform. The search tasks were sorted into three groups according to which strategy they favoured: tasks favouring the search strategy, tasks favouring the link strategy, and neutral tasks. The research problem was to determine how task type and computer and Internet experience affect the choice of search strategy. It turned out that users' orientation toward a particular strategy does not affect the choice of search strategy; only task type was a significant factor. In light of earlier research, experienced users favour the search-oriented strategy. In this study it was found that experience increased the use of both strategies equally, but this effect was observable only in the questionnaire data, not in the tests. The use of both search strategies increases with experience, but their relative proportions remain the same. The suggested reason why experienced users did not favour the search strategy is that the tasks were too easy, so experience could not help. No substantial differences in completion times or in the frequency of switching strategies were observed in relation to experience, only in relation to task type; this, too, was explained by the easiness of the other types of tasks. The study also discusses the development of expertise in the context of information search and presents a metaknowledge hypothesis, according to which an important factor in the choice of search strategy is the user's metaknowledge of search services. Metaknowledge includes knowing which search engines are available, what information is worth searching for on the Web, which companies and organisations have content-rich sites, and what type of information is generally available. Overall, three levels of knowledge are proposed to underlie strategy choice: 1) the user's own expertise in the domain being searched, 2) metaknowledge of Internet search services, and 3) technical knowledge of how search engines work. Keywords: information retrieval, information search strategy, search engine, WWW, metaknowledge, cognitive psychology

Relevance: 100.00%

Abstract:

The World Wide Web consists of a vast number of web pages connected to one another by hyperlinks. Traditionally, the analysis and retrieval of information on the Web has relied on analysing and processing page content. For example, a traditional web search engine analyses and indexes the text on web pages, stores the processed information in a database, and then analyses the user's query input against it to produce search results.
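As a toy illustration of the analyse-index-query cycle described above, the sketch below builds a minimal inverted index over page text and answers a boolean query against it; the URLs and text are made up.

    # Illustration only: a toy inverted index of the kind a traditional
    # text-based search engine builds over page content.
    from collections import defaultdict

    pages = {
        "http://example.org/a": "the world wide web is made of pages",
        "http://example.org/b": "pages link to other pages with hyperlinks",
    }

    index = defaultdict(set)            # term -> set of page URLs containing it
    for url, text in pages.items():
        for term in text.lower().split():
            index[term].add(url)

    def search(query: str) -> set:
        """Return pages containing every term of the query (boolean AND)."""
        terms = query.lower().split()
        results = [index.get(t, set()) for t in terms]
        return set.intersection(*results) if results else set()

    print(search("web pages"))          # -> {'http://example.org/a'}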

Relevance: 100.00%

Abstract:

Serious concerns have been raised about the ecological effects of industrialized fishing [1, 2, 3], spurring a United Nations resolution on restoring fisheries and marine ecosystems to healthy levels [4]. However, a prerequisite for restoration is a general understanding of the composition and abundance of unexploited fish communities, relative to contemporary ones. We constructed trajectories of community biomass and composition of large predatory fishes in four continental shelf and nine oceanic systems, using all available data from the beginning of exploitation. Industrialized fisheries typically reduced community biomass by 80% within 15 years of exploitation. Compensatory increases in fast-growing species were observed, but often reversed within a decade. Using a meta-analytic approach, we estimate that large predatory fish biomass today is only about 10% of pre-industrial levels. We conclude that declines of large predators in coastal regions [5] have extended throughout the global ocean, with potentially serious consequences for ecosystems [5, 6, 7]. Our analysis suggests that management based on recent data alone may be misleading, and provides minimum estimates for unexploited communities, which could serve as the ‘missing baseline’ [8] needed for future restoration efforts.

Relevance: 100.00%

Abstract:

Web data mining is an emerging research field that applies data mining techniques and theory to mining WWW resources. This paper introduces the basic concepts and classification of Web data mining, presents its basic principles and methods, and finally points out the applications of Web data mining and looks ahead to its promising development prospects.

Relevance: 100.00%

Abstract:

The exploding demand for services like the World Wide Web reflects the potential that is presented by globally distributed information systems. The number of WWW servers worldwide has doubled every 3 to 5 months since 1993, outstripping even the growth of the Internet. At each of these self-managed sites, the Common Gateway Interface (CGI) and Hypertext Transfer Protocol (HTTP) already constitute a rudimentary basis for contributing local resources to remote collaborations. However, the Web has serious deficiencies that make it unsuited for use as a true medium for metacomputing, the process of bringing hardware, software, and expertise from many geographically dispersed sources to bear on large-scale problems. These deficiencies are, paradoxically, the direct result of the very simple design principles that enabled its exponential growth. There are many symptoms of the problems exhibited by the Web: disk and network resources are consumed extravagantly; information search and discovery are difficult; protocols are aimed at data movement rather than task migration, and ignore the potential for distributing computation. However, all of these can be seen as aspects of a single problem: as a distributed system for metacomputing, the Web offers unpredictable performance and unreliable results. The goal of our project is to use the Web as a medium (within either the global Internet or an enterprise intranet) for metacomputing in a reliable way with performance guarantees. We attack this problem on four levels.
(1) Resource Management Services: Globally distributed computing allows novel approaches to the old problems of performance guarantees and reliability. Our first set of ideas involves setting up a family of real-time resource management models organized by the Web Computing Framework, with a standard Resource Management Interface (RMI), a Resource Registry, a Task Registry, and resource management protocols that allow resource needs and availability information to be collected and disseminated, so that a family of algorithms with varying computational precision and accuracy of representations can be chosen to meet real-time and reliability constraints.
(2) Middleware Services: Complementary to techniques for allocating and scheduling available resources to serve application needs under real-time and reliability constraints, the second set of ideas aims at reducing communication latency, traffic congestion, server workload, and so on. We develop customizable middleware services that exploit application characteristics in traffic analysis to drive new server/browser design strategies (e.g., exploiting the self-similarity of Web traffic), derive document access patterns via multi-server cooperation, and use them in speculative prefetching, document caching, and aggressive replication to reduce server load and bandwidth requirements.
(3) Communication Infrastructure: To achieve any guarantee of quality of service or performance, one must work at the network layer, which provides the basic guarantees of bandwidth, latency, and reliability. The third area is therefore a set of new techniques in network service and protocol design.
(4) Object-Oriented Web Computing Framework: A useful resource management system must deal with job priority, fault tolerance, quality of service, complex resources such as ATM channels, probabilistic models, and so on, and models must be tailored to represent the best tradeoff for a particular setting. This requires a family of models, organized within an object-oriented framework, because no one-size-fits-all approach is appropriate. This presents a software engineering challenge requiring the integration of solutions at all levels: algorithms, models, protocols, and profiling and monitoring tools. The framework captures the abstract class interfaces of the collection of cooperating components, but allows the concretization of each component to be driven by the requirements of a specific approach and environment.
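The abstract names a Resource Management Interface (RMI), a Resource Registry, and a Task Registry without giving their signatures. Purely as an illustration, the Python sketch below shows one way such interfaces could be shaped; every class, method, and field name here is a hypothetical assumption, not the project's actual API.

    from abc import ABC, abstractmethod
    from dataclasses import dataclass, field
    from typing import Dict, List

    @dataclass
    class Resource:
        name: str
        capacity: float          # e.g. available CPU share or bandwidth
        reliability: float       # estimated probability of completing a task

    @dataclass
    class Task:
        name: str
        demand: float            # resource units required
        deadline_s: float        # real-time constraint in seconds

    class ResourceManagementInterface(ABC):
        """Abstract interface for matching registered tasks to registered resources."""

        @abstractmethod
        def schedule(self, tasks: List[Task], resources: List[Resource]) -> Dict[str, str]:
            """Return a mapping from task name to resource name."""

    @dataclass
    class Registry:
        resources: Dict[str, Resource] = field(default_factory=dict)
        tasks: Dict[str, Task] = field(default_factory=dict)

        def register_resource(self, r: Resource) -> None:
            self.resources[r.name] = r

        def register_task(self, t: Task) -> None:
            self.tasks[t.name] = t

    class GreedyScheduler(ResourceManagementInterface):
        """Toy policy: give each task the most reliable resource that can hold it."""

        def schedule(self, tasks, resources):
            assignment = {}
            for task in sorted(tasks, key=lambda t: t.deadline_s):
                candidates = [r for r in resources if r.capacity >= task.demand]
                if candidates:
                    best = max(candidates, key=lambda r: r.reliability)
                    best.capacity -= task.demand
                    assignment[task.name] = best.name
            return assignment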

Relevance: 100.00%

Abstract:

As the World Wide Web (Web) is increasingly adopted as the infrastructure for large-scale distributed information systems, issues of performance modeling become ever more critical. In particular, locality of reference is an important property in the performance modeling of distributed information systems. In the case of the Web, understanding the nature of reference locality will help improve the design of middleware, such as caching, prefetching, and document dissemination systems. For example, good measurements of reference locality would allow us to generate synthetic reference streams with accurate performance characteristics, would allow us to compare empirically measured streams to explain differences, and would allow us to predict expected performance for system design and capacity planning. In this paper we propose models for both temporal and spatial locality of reference in streams of requests arriving at Web servers. We show that simple models based only on document popularity (likelihood of reference) are insufficient for capturing either temporal or spatial locality. Instead, we rely on an equivalent, but numerical, representation of a reference stream: a stack distance trace. We show that temporal locality can be characterized by the marginal distribution of the stack distance trace, and we propose models for typical distributions and compare their cache performance to our traces. We also show that spatial locality in a reference stream can be characterized using the notion of self-similarity. Self-similarity describes long-range correlations in the dataset, which is a property that previous researchers have found hard to incorporate into synthetic reference strings. We show that stack distance strings appear to be strongly self-similar, and we provide measurements of the degree of self-similarity in our traces. Finally, we discuss methods for generating synthetic Web traces that exhibit the properties of temporal and spatial locality that we measured in our data.
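The stack distance transformation referred to above is standard: each request is replaced by the depth of the referenced document in an LRU stack. A minimal sketch follows (not the paper's own tooling).

    # Compute an LRU stack distance trace from a stream of document requests,
    # as used to characterise temporal locality. First references, which have
    # no previous occurrence, are reported as None (infinite distance).
    def stack_distance_trace(requests):
        stack = []                             # most recently used document first
        trace = []
        for doc in requests:
            if doc in stack:
                depth = stack.index(doc) + 1   # 1 = re-reference of the last document
                stack.remove(doc)
            else:
                depth = None                   # first reference
            stack.insert(0, doc)
            trace.append(depth)
        return trace

    print(stack_distance_trace(["a", "b", "a", "c", "b", "a"]))
    # [None, None, 2, None, 3, 3]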

Relevance: 100.00%

Abstract:

ImageRover is a search-by-image-content navigation tool for the World Wide Web. The staggering size of the WWW dictates certain strategies and algorithms for image collection, digestion, indexing, and the user interface. This paper describes two key components of the ImageRover strategy: image digestion and relevance feedback. Image digestion occurs during image collection; robots digest the images they find, computing image decompositions and indices, and storing this extracted information in vector form for searches based on image content. Relevance feedback occurs during index search; users can iteratively guide the search through the selection of relevant examples. ImageRover employs a novel relevance feedback algorithm to determine the weighted combination of image similarity metrics appropriate for a particular query. ImageRover is available and running on the web site.
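The abstract does not spell out the weighting rule. As an illustration only, and not necessarily ImageRover's algorithm, one common relevance-feedback heuristic weights each similarity metric inversely to its spread over the user-selected relevant examples, so that metrics on which the relevant images agree contribute more to the combined score.

    # Illustration only: a generic relevance-feedback weighting heuristic,
    # not necessarily the algorithm used by ImageRover.
    import numpy as np

    def feedback_weights(relevant_features: np.ndarray, eps: float = 1e-6) -> np.ndarray:
        """relevant_features: shape (n_relevant_images, n_metrics)."""
        spread = relevant_features.std(axis=0) + eps   # per-metric spread over relevant examples
        w = 1.0 / spread                               # tight agreement -> large weight
        return w / w.sum()

    def combined_distance(query, candidate, weights):
        """Weighted sum of per-metric distances between two feature vectors."""
        return float(np.sum(weights * np.abs(query - candidate)))

    relevant = np.array([[0.10, 0.80, 0.30],
                         [0.45, 0.20, 0.31],
                         [0.70, 0.55, 0.29]])
    w = feedback_weights(relevant)
    print(w)   # the third metric, with the least spread, gets the largest weight
    print(combined_distance(np.array([0.11, 0.50, 0.30]),
                            np.array([0.60, 0.52, 0.31]), w))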

Relevance: 100.00%

Abstract:

One of the most vexing questions facing researchers interested in the World Wide Web is why users often experience long delays in document retrieval. The Internet's size, complexity, and continued growth make this a difficult question to answer. We describe the Wide Area Web Measurement project (WAWM) which uses an infrastructure distributed across the Internet to study Web performance. The infrastructure enables simultaneous measurements of Web client performance, network performance and Web server performance. The infrastructure uses a Web traffic generator to create representative workloads on servers, and both active and passive tools to measure performance characteristics. Initial results based on a prototype installation of the infrastructure are presented in this paper.
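The project's own active and passive measurement tools are not shown here. As a sketch of the kind of client-side active probe involved, the following times the connect and transfer phases of a single document retrieval; the URL is a placeholder.

    # Sketch of an active measurement probe: time the phases of one HTTP
    # document retrieval. This is not the WAWM project's own tooling.
    import socket
    import time
    from urllib.parse import urlsplit

    def probe(url: str) -> dict:
        parts = urlsplit(url)
        host, port = parts.hostname, parts.port or 80
        path = parts.path or "/"

        t0 = time.monotonic()
        sock = socket.create_connection((host, port), timeout=10)   # TCP connect
        t1 = time.monotonic()
        request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
        sock.sendall(request.encode("ascii"))
        body = b""
        while chunk := sock.recv(4096):                              # full transfer
            body += chunk
        t2 = time.monotonic()
        sock.close()
        return {"connect_s": t1 - t0, "transfer_s": t2 - t1, "bytes": len(body)}

    print(probe("http://example.com/"))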

Relevance: 100.00%

Abstract:

Web sites that rely on databases for their content are now ubiquitous. Query result pages are dynamically generated from these databases in response to user-submitted queries. Automatically extracting structured data from query result pages is a challenging problem, as the structure of the data is not explicitly represented. While humans have shown good intuition in visually understanding data records on a query result page as displayed by a web browser, no existing approach to data record extraction has made full use of this intuition. We propose a novel approach, in which we make use of the common sources of evidence that humans use to understand data records on a displayed query result page. These include structural regularity, and visual and content similarity between data records displayed on a query result page. Based on these observations we propose new techniques that can identify each data record individually, while ignoring noise items, such as navigation bars and adverts. We have implemented these techniques in a software prototype, rExtractor, and tested it using two datasets. Our experimental results show that our approach achieves significantly higher accuracy than previous approaches. Furthermore, it establishes the case for use of vision-based algorithms in the context of data extraction from web sites.
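rExtractor itself is not described in enough detail here to reproduce. As a much-simplified stand-in that uses only structural regularity, the sketch below groups the elements of a result page by a tag-path-plus-class signature and treats the largest group as the data records, ignoring one-off items such as navigation bars.

    # Simplified stand-in for data record extraction, NOT the rExtractor
    # algorithm: structurally identical, repeated elements are taken to be
    # the data records; one-off items like navigation bars fall out naturally.
    from collections import defaultdict
    from html.parser import HTMLParser

    class RecordFinder(HTMLParser):
        def __init__(self):
            super().__init__()
            self.path = []
            self.groups = defaultdict(list)    # signature -> positions of matching elements

        def handle_starttag(self, tag, attrs):
            self.path.append(tag)
            cls = dict(attrs).get("class")
            if cls:                            # only classed containers are record candidates
                signature = "/".join(self.path) + "." + cls
                self.groups[signature].append(self.getpos())

        def handle_endtag(self, tag):
            if self.path and self.path[-1] == tag:
                self.path.pop()

    html = """
    <html><body>
      <div class="nav">Home | About</div>
      <div class="result"><a href="/1">First record</a></div>
      <div class="result"><a href="/2">Second record</a></div>
      <div class="result"><a href="/3">Third record</a></div>
    </body></html>
    """

    finder = RecordFinder()
    finder.feed(html)
    signature, positions = max(finder.groups.items(), key=lambda kv: len(kv[1]))
    print(signature, len(positions))   # the repeated "result" divs form the largest group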