875 results for document clustering


Relevance:

20.00% 20.00%

Publisher:

Abstract:

Tianjin University of Technology

Relevance:

20.00% 20.00%

Publisher:

Abstract:

ACM SIGIR; ACM SIGWEB

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Deuterated polyethylene tracer molecules with a small number of branches (12 C2H5- branches per 1000 backbone carbon atoms) were blended with a hydrogenated polyethylene matrix to form a homogeneous mixture. The conformational evolution of the deuterated chains in a stretched semi-crystalline film was observed via online small angle neutron scattering measurements during annealing at high temperatures close to the melting point. Because the sample was annealed at a temperature just below its melting point, the crystalline lamellae were only partially molten and the system could not fully relax. The global chain dimensions were preserved during annealing. Recrystallization of released polymeric chain segments allows for local phase separation, thus driving the deuterated chain segments into the confining interlamellar amorphous layers and giving rise to an interesting intra-molecular clustering effect of the long deuterated chain. This clustering is deduced from characteristic small angle neutron scattering patterns. The confined phase separation originates primarily from the small number of branches on the deuterated polymers, which impede the crystallization of the deuterated chain segments.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Along with the development of marine industries, especially marine petroleum exploitation, more and more pipelines are buried in marine sediment. It is therefore necessary and useful to know the corrosion environment and corrosiveness of marine sediment. In this paper, field corrosion environmental factors were investigated in Liaodong Bay marine sediment containing sulfate-reducing bacteria (SRB), and the corrosion rate of steel in some of the sediment specimens was determined by the transplanting burying method. Based on these data, fuzzy clustering analysis (FCA) was applied to evaluate and predict the corrosiveness of marine sediment. On that basis, the factors influencing corrosion damage were discussed.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This survey was undertaken by the film crew accompanying Cary Grant when making the film "Charade" in 1963.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This article contributes to the debate on what form of preparation and support can enhance the intercultural student experience during the Year Abroad. It presents a credit-bearing and multi-modal module at a UK university designed to both prepare students prior to departure through a series of workshops and activities on an e-portfolio and help them engage in meta-reflection on intercultural issues during their stay. The presentation of the curricular components of the course and instances extracted from student blogs are contextualised within theoretical considerations on intercultural education and a holistic approach to student development. The longitudinal evolution of the module is presented in the context of an iterative approach leading to a cycle of revisions and amendments. With its pragmatic stance this article aims to address one of the concerns recently expressed about intercultural education, namely that although intercultural theories are suitably incorporated in the latest thinking on communicative competence, there is a lack of evidence-based practice.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Struyf, J., Dzeroski, S., Blockeel, H. and Clare, A. (2005) Hierarchical Multi-classification with Predictive Clustering Trees in Functional Genomics. In Proceedings of the EPIA 2005 CMB Workshop.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We present a type system, StaXML, which employs the stacked type syntax to represent essential aspects of the potential roles of XML fragments in the structure of complete XML documents. The simplest application of this system is to enforce well-formedness upon the construction of XML documents without requiring the use of templates or balanced "gap plugging" operators; this allows it to be applied to programs written according to common imperative web scripting idioms, particularly the echoing of unbalanced XML fragments to an output buffer. The system can be extended to verify particular XML applications such as XHTML and to identify individual XML tags constructed from their lexical components. We also present StaXML for PHP, a prototype precompiler for the PHP4 scripting language which infers StaXML types for expressions without assistance from the programmer.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

A system is described that tracks moving objects in a video dataset so as to extract a representation of the objects' 3D trajectories. The system then finds hierarchical clusters of similar trajectories in the video dataset. Objects' motion trajectories are extracted via an EKF formulation that provides each object's 3D trajectory up to a constant factor. To increase accuracy when occlusions occur, multiple tracking hypotheses are followed. For trajectory-based clustering and retrieval, a modified version of edit distance, called longest common subsequence (LCSS), is employed. Similarities are computed between projections of trajectories on coordinate axes. Trajectories are grouped using an agglomerative clustering algorithm. To check the validity of the approach, experiments using real data were performed.
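The LCSS comparison of per-axis projections mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the matching tolerance `eps`, the optional index window `delta`, the normalization by the shorter sequence, and the averaging over axes are all assumptions made for illustration, and the function names are hypothetical.

```python
def lcss_length(a, b, eps=0.5, delta=None):
    """Longest common subsequence length for two 1D numeric sequences.

    Assumption: a[i] and b[j] "match" when they differ by at most eps and,
    if delta is given, when their indices differ by at most delta.
    """
    n, m = len(a), len(b)
    # table[i][j] holds the LCSS length of a[:i] and b[:j]
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            close_in_value = abs(a[i - 1] - b[j - 1]) <= eps
            close_in_time = delta is None or abs(i - j) <= delta
            if close_in_value and close_in_time:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[n][m]


def lcss_similarity(a, b, eps=0.5, delta=None):
    """LCSS length normalized by the shorter sequence (assumed normalization)."""
    if not a or not b:
        return 0.0
    return lcss_length(a, b, eps, delta) / min(len(a), len(b))


def trajectory_similarity(t1, t2, eps=0.5, delta=None):
    """Average the LCSS similarity over coordinate-axis projections.

    t1 and t2 are lists of equal-dimension coordinate tuples, e.g. (x, y, z).
    """
    axes1 = [list(axis) for axis in zip(*t1)]
    axes2 = [list(axis) for axis in zip(*t2)]
    sims = [lcss_similarity(p, q, eps, delta) for p, q in zip(axes1, axes2)]
    return sum(sims) / len(sims) if sims else 0.0
```

A pairwise matrix of `1 - trajectory_similarity(...)` values could then serve as the distance input to any agglomerative clustering routine.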

Relevance:

20.00% 20.00%

Publisher:

Abstract:

With the increasing demand for document transfer services such as the World Wide Web comes a need for better resource management to reduce the latency of documents in these systems. To address this need, we analyze the potential for document caching at the application level in document transfer services. We have collected traces of actual executions of Mosaic, reflecting over half a million user requests for WWW documents. Using those traces, we study the tradeoffs between caching at three levels in the system, and the potential for use of application-level information in the caching system. Our traces show that while a high hit rate in terms of URLs is achievable, only a much lower hit rate is achievable in terms of bytes, because most profitably-cached documents are small. We consider the performance of caching when applied at the level of individual user sessions, at the level of individual hosts, and at the level of a collection of hosts on a single LAN. We show that the performance gain achievable by caching at the session level (which is straightforward to implement) is nearly all of that achievable at the LAN level (where caching is more difficult to implement). However, when resource requirements are considered, LAN level caching becomes much more desirable, since it can achieve a given level of caching performance using a much smaller amount of cache space. Finally, we consider the use of organizational boundary information as an example of the potential for use of application-level information in caching. Our results suggest that distinguishing between documents produced locally and those produced remotely can provide useful leverage in designing caching policies, because of differences in the potential for sharing these two document types among multiple users.
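The gap between URL hit rate and byte hit rate described in this abstract can be shown with a small sketch. It assumes an unbounded cache and a hypothetical trace of `(url, size_in_bytes)` pairs; it does not reproduce the paper's traces or its caching policies.

```python
def hit_rates(trace):
    """Return (url_hit_rate, byte_hit_rate) for a request trace.

    Assumption: the cache is unbounded, so every repeat request is a hit.
    trace is a list of (url, size_in_bytes) pairs.
    """
    cached = set()
    hits = 0
    hit_bytes = 0
    total_bytes = 0
    for url, size in trace:
        total_bytes += size
        if url in cached:
            hits += 1
            hit_bytes += size
        else:
            cached.add(url)  # first request is a miss; store the document
    url_rate = hits / len(trace) if trace else 0.0
    byte_rate = hit_bytes / total_bytes if total_bytes else 0.0
    return url_rate, byte_rate


# Hypothetical trace: one small, popular document and one large, rare one.
example_trace = [
    ("/a", 100),
    ("/a", 100),
    ("/a", 100),
    ("/big", 10000),
    ("/a", 100),
]
url_rate, byte_rate = hit_rates(example_trace)
```

In this made-up trace the repeated requests all target the small document, so the fraction of requests served from cache is far larger than the fraction of bytes served from cache.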

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We analyzed the logs of our departmental HTTP server http://cs-www.bu.edu as well as the logs of the more popular Rolling Stones HTTP server http://www.stones.com. These servers have very different purposes; the former caters primarily to local clients, whereas the latter caters exclusively to remote clients all over the world. In both cases, our analysis showed that remote HTTP accesses were confined to a very small subset of documents. Using a validated analytical model of server popularity and file access profiles, we show that by disseminating the most popular documents on servers (proxies) closer to the clients, network traffic could be reduced considerably, while server loads are balanced. We argue that this process could be generalized so as to provide for an automated demand-based duplication of documents. We believe that such server-based information dissemination protocols will be more effective at reducing both network bandwidth and document retrieval times than client-based caching protocols [2].

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper proposes a novel protocol which uses the Internet Domain Name System (DNS) to partition Web clients into disjoint sets, each of which is associated with a single DNS server. We define an L-DNS cluster to be a grouping of Web clients that use the same local DNS server to resolve Internet host names. We identify such clusters in real time using data obtained from a Web server in conjunction with that server's authoritative DNS, both instrumented with an implementation of our clustering algorithm. Using these clusters, we perform measurements from four distinct Internet locations. Our results show that L-DNS clustering enables a better estimation of the proximity of a Web client to a Web server than previously proposed techniques. Thus, in a Content Distribution Network, a DNS-based scheme that redirects a request from a Web client to one of many servers based on the client's name server coordinates (e.g., hops/latency/loss-rates between the client and servers) would perform better with our algorithm.
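The grouping step behind an L-DNS cluster can be sketched as follows. The abstract does not say how a client is linked to its local DNS server, so this sketch simply assumes that `(client, resolver)` observations are already available, and it assigns a client seen with several resolvers to the most frequent one. Both choices, and the function name, are illustrative assumptions rather than the paper's algorithm.

```python
from collections import Counter, defaultdict


def ldns_clusters(observations):
    """Partition clients into disjoint sets keyed by local DNS resolver.

    observations: iterable of (client_address, resolver_address) pairs.
    Assumption: a client seen with several resolvers is assigned to the one
    observed most often; ties go to the lexicographically smallest resolver.
    """
    counts = defaultdict(Counter)
    for client, resolver in observations:
        counts[client][resolver] += 1

    clusters = defaultdict(set)
    for client, resolver_counts in counts.items():
        # Sorting first makes the tie-break deterministic.
        resolver, _ = max(sorted(resolver_counts.items()), key=lambda kv: kv[1])
        clusters[resolver].add(client)
    return dict(clusters)


# Hypothetical observations; the addresses and resolver names are made up.
example = [
    ("10.0.0.1", "ns1"),
    ("10.0.0.2", "ns1"),
    ("10.0.0.3", "ns2"),
    ("10.0.0.1", "ns2"),
    ("10.0.0.1", "ns1"),
]
clusters = ldns_clusters(example)
```

Because each client ends up under exactly one resolver, the resulting sets are disjoint, matching the partition described in the abstract.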