17 results for unclean internet data
in CentAUR: Central Archive University of Reading - UK
Abstract:
Active Networks can be seen as an evolution of the classical model of packet-switched networks. The traditional, “passive” network model is based on a static definition of network node behaviour. Active Networks propose an “active” model in which the intermediate nodes (switches and routers) can load and execute user code contained in the data units (packets). Active Networks are a programmable network model in which bandwidth and computation are both considered shared network resources. This approach opens up interesting new research fields. This paper gives a short introduction to Active Networks, discusses the advantages they introduce and presents the research advances in this field.
Abstract:
This article reviews current technological developments, particularly Peer-to-Peer technologies and Distributed Data Systems, and their value to community memory projects, particularly those concerned with the preservation of the cultural, literary and administrative data of cultures which have suffered genocide or are at risk of genocide. It draws attention to the comparatively good representation online of genocide denial groups and changes in the technological strategies of holocaust denial and other far-right groups. It draws on the author's work in providing IT support for a UK-based Non-Governmental Organization providing support for survivors of genocide in Rwanda.
Abstract:
Purpose - The purpose of this paper is to identify the most popular techniques used to rank a web page highly in Google. Design/methodology/approach - The paper presents the results of a study of 50 highly optimized web pages that were created as part of a Search Engine Optimization competition. The study focuses on the most popular techniques that were used to rank highest in this competition, and includes an analysis of the use of PageRank, number of pages, number of in-links, domain age and the use of third-party sites such as directories and social bookmarking sites. A separate study of 50 non-optimized web pages was made for comparison. Findings - The paper provides insight into the techniques that successful Search Engine Optimizers use to ensure a page ranks highly in Google. It recognizes the importance of PageRank and links as well as directories and social bookmarking sites. Research limitations/implications - Only the top 50 web sites for a specific query were analyzed. Analysing more web sites and comparing with similar studies in different competitions would provide more concrete results. Practical implications - The paper offers a revealing insight into the techniques used by industry experts to rank highly in Google, and the success or otherwise of those techniques. Originality/value - This paper fulfils an identified need for web sites and e-commerce sites keen to attract a wider web audience.
Abstract:
The general packet radio service (GPRS) has been developed to allow packet data to be transported efficiently over an existing circuit-switched radio network, such as GSM. The main applications of GPRS are in transporting Internet protocol (IP) datagrams from web servers (for telemetry or for mobile Internet browsers). Four GPRS baseband coding schemes are defined to offer a trade-off in requested data rates versus propagation channel conditions. However, data rates of the order of 100 kbit/s or more are only achievable if the simplest coding scheme (CS-4) is used, which offers little error detection and correction (EDC) and therefore requires excellent SNR, and if the receiver hardware is capable of full duplex, which is not currently available in the consumer market. A simple EDC scheme to improve the GPRS block error rate (BLER) performance is presented, particularly for CS-4; however, gains in other coding schemes are also seen. A GPRS radio block that is corrected by the EDC scheme does not need to be retransmitted, which releases bandwidth in the channel and improves the user's application data rate. As GPRS requires intensive processing in the baseband, a viable field programmable gate array (FPGA) solution is presented in this paper.
Abstract:
The General Packet Radio Service (GPRS) was developed to allow packet data to be transported efficiently over an existing circuit-switched radio network. The main applications for GPRS are in transporting IP datagrams between the user’s mobile Internet browser and the Internet, or in telemetry equipment. A simple Error Detection and Correction (EDC) scheme to improve the GPRS Block Error Rate (BLER) performance is presented, particularly for coding scheme 4 (CS-4); however, gains in other coding schemes are also seen. A GPRS radio block that is corrected by the EDC scheme does not need to be retransmitted, which releases bandwidth in the channel and improves throughput and the user’s application data rate. As GPRS requires intensive processing in the baseband, this paper discusses a viable hardware solution for a GPRS BLER co-processor, which has been implemented in a Field Programmable Gate Array (FPGA).
Abstract:
In the decade since OceanObs'99, great advances have been made in the field of ocean data dissemination. The use of Internet technologies has transformed the landscape: users can now find, evaluate and access data rapidly and securely using only a web browser. This paper describes the current state of the art in dissemination methods for ocean data, focussing particularly on ocean observations from in situ and remote sensing platforms. We discuss current efforts being made to improve the consistency of delivered data and to increase the potential for automated integration of diverse datasets. An important recent development is the adoption of open standards from the Geographic Information Systems community; we discuss the current impact of these new technologies and their future potential. We conclude that new approaches will indeed be necessary to exchange data more effectively and forge links between communities, but these approaches must be evaluated critically through practical tests, and existing ocean data exchange technologies must be used to their best advantage. Investment in key technology components, cross-community pilot projects and the enhancement of end-user software tools will be required in order to assess and demonstrate the value of any new technology.
Abstract:
Purpose - The role of affective states in consumer behaviour is well established. However, no study to date has empirically examined online affective states as a basis for constructing typologies of internet users and for assessing the invariance of clusters across national cultures. Design/methodology/approach - Four focus groups with internet users were carried out to adapt a set of affective states identified from the literature to the online environment. An online survey was then designed to collect data from internet users in four Western and four East Asian countries. Findings - Based on a cluster analysis, six cross-national market segments are identified and labelled "Positive Online Affectivists", "Offline Affectivists", "On/Off-line Negative Affectivists", "Online Affectivists", "Indistinguishable Affectivists", and "Negative Offline Affectivists". The resulting clusters discriminate on the basis of national culture, gender, working status and perceptions towards online brands. Practical implications - Marketers may use this typology to segment internet users in order to predict their perceptions towards online brands. Also, a standardised approach to e-marketing is not recommended on the basis of affective state-based segmentation. Originality/value - This is the first study proposing affective state-based typologies of internet users using comparable samples from four Western and four East Asian countries.
Abstract:
This study examines the evolution of prices in markets with Internet price-comparison search engines. The empirical study analyzes laboratory data of prices available to informed consumers, for two industry sizes and two conditions on the sample (complete and incomplete). Distributions are typically bimodal. One of the two modes of distribution, corresponding to monopoly pricing, tends to attract such pricing strategies increasingly over time. The second one, corresponding to interior pricing, follows a decreasing trend. Monopoly pricing can serve as a means of insurance against more competitive (but riskier) behavior. In fact, experimental subjects who initially earn low profits due to interior pricing are more likely to switch to monopoly pricing than subjects who experience good returns from the start.
Abstract:
Purpose: Personalised intervention may have greater potential for reducing the global burden of non-communicable diseases and for promoting better health and wellbeing across the life-span than the conventional “one size fits all” approach. However, the characteristics of individuals interested in personalised nutrition (PN) are unclear. Therefore, the aim of this study was to describe the characteristics of European adults interested in taking part in an internet-based PN study. Methods: Individuals from seven European countries (UK, Ireland, Germany, the Netherlands, Spain, Greece and Poland) were invited to participate in the study via the Food4Me website (http://www.food4me.org). Two screening questionnaires were used to collect data on socio-demographic, anthropometric and health characteristics as well as dietary intakes. Results: A total of 5662 individuals expressed an interest in the study (mean age 40 ± 12.7; range 15-87 years). Of these, 64.6% were female and 96.9% were Caucasian. Overall, 12.9% were smokers and 46.8% reported the presence of a clinically diagnosed disease. Furthermore, 46.9% were overweight or obese and 34.9% were sedentary during leisure time. Assessment of dietary intakes showed that 54.3% of individuals reported consuming at least 5 portions of fruit and vegetables per day, 45.9% consumed more than 3 servings of wholegrains and 37.2% limited their salt intake to less than 5.75 g per day. Conclusions: Our data indicate that individuals volunteering to participate in an internet-based PN study are broadly representative of the European adult population, most of whom had adequate nutrient intakes but who could benefit from improved dietary choices and greater physical activity. Future use of internet-based PN approaches is thus relevant to a wide target audience.
Abstract:
In e-health intervention studies, there are concerns about the reliability of internet-based, self-reported (SR) data and about the potential for identity fraud. This study introduced and tested a novel procedure for assessing the validity of internet-based, SR identity and validated anthropometric and demographic data via measurements performed face-to-face in a validation study (VS). Participants (n = 140) from seven European countries, participating in the Food4Me intervention study which aimed to test the efficacy of personalised nutrition approaches delivered via the internet, were invited to take part in the VS. Participants visited a research centre in each country within 2 weeks of providing SR data via the internet. Participants received detailed instructions on how to perform each measurement. Each individual’s identity was checked visually and by repeated collection and analysis of buccal cell DNA for 33 genetic variants. Validation of identity using genomic information showed perfect concordance between SR and VS. Similar results were found for demographic data (age and sex verification). We observed strong intra-class correlation coefficients between SR and VS for anthropometric data (height 0.990, weight 0.994 and BMI 0.983). However, internet-based SR weight was under-reported (Δ −0.70 kg [−3.6 to 2.1], p < 0.0001) and, therefore, BMI was lower for SR data (Δ −0.29 kg m−2 [−1.5 to 1.0], p < 0.0001). BMI classification was correct in 93% of cases. We demonstrate the utility of genotype information for detection of possible identity fraud in e-health studies and confirm the reliability of internet-based, SR anthropometric and demographic data collected in the Food4Me study.
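The link between under-reported weight and BMI classification can be illustrated with a short Python sketch. It uses the standard definition BMI = weight (kg) / height (m)² and the conventional WHO adult cut-offs; the participant values (1.70 m, 72.5 kg) are hypothetical and chosen only to show how an under-report of 0.70 kg can move a borderline case into a different class. The function names are illustrative and do not come from the study.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


def classify(bmi_value: float) -> str:
    """Conventional WHO adult BMI classes."""
    if bmi_value < 18.5:
        return "underweight"
    if bmi_value < 25.0:
        return "normal"
    if bmi_value < 30.0:
        return "overweight"
    return "obese"


# Hypothetical participant (not taken from the study data).
height_m = 1.70
measured_kg = 72.5                  # face-to-face measurement
reported_kg = measured_kg - 0.70    # self-report, under-reported by 0.70 kg

measured_bmi = bmi(measured_kg, height_m)
reported_bmi = bmi(reported_kg, height_m)

print(f"measured: {measured_bmi:.2f} ({classify(measured_bmi)})")
print(f"reported: {reported_bmi:.2f} ({classify(reported_bmi)})")
```

For most participants a shift of this size leaves the class unchanged; only values close to a cut-off are affected.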
Abstract:
Social networks have gained remarkable attention in the last decade. Accessing social network sites such as Twitter, Facebook, LinkedIn and Google+ through the internet and Web 2.0 technologies has become more affordable. People are becoming more interested in, and more reliant on, social networks for information, news and the opinions of other users on diverse subject matters. This heavy reliance causes social network sites to generate massive data characterised by three computational issues, namely size, noise and dynamism. These issues often make social network data very complex to analyse manually, so computational means of analysis are needed. Data mining provides a wide range of techniques for detecting useful knowledge, such as trends, patterns and rules, from massive datasets [44]. Data mining techniques are used for information retrieval, statistical modelling and machine learning. These techniques employ data pre-processing, data analysis and data interpretation processes in the course of data analysis. This survey discusses different data mining techniques used in mining diverse aspects of social networks over the decades, from the historical techniques to up-to-date models, including our novel technique named TRCM. All the techniques covered in this survey are listed in Table 1, together with the tools employed and the names of their authors.
Abstract:
An important application of Big Data Analytics is the real-time analysis of streaming data. Streaming data poses unique challenges for data mining algorithms, such as concept drift, the need to analyse data on the fly because streams are unbounded, and the need for scalable algorithms because data throughput is potentially high. Real-time classification algorithms that are fast and adaptive to concept drift exist; however, most approaches are not naturally parallel and are thus limited in their scalability. This paper presents work on the Micro-Cluster Nearest Neighbour (MC-NN) classifier. MC-NN is based on an adaptive statistical data summary built from Micro-Clusters. MC-NN is very fast and adaptive to concept drift whilst maintaining the parallel properties of the base KNN classifier. Also, MC-NN is competitive with existing data stream classifiers in terms of accuracy and speed.
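The general idea of classifying a stream against micro-cluster summaries rather than stored instances can be sketched in Python as follows. This is a deliberately simplified illustration, not the MC-NN algorithm from the paper: it assumes each micro-cluster keeps only a count, a per-feature linear sum and a class label, and it omits whatever mechanisms the paper uses for concept drift adaptation and parallel execution. The class and method names are hypothetical.

```python
import math


class MicroCluster:
    """Running summary of the instances absorbed so far (assumed structure)."""

    def __init__(self, x, label):
        self.n = 1
        self.linear_sum = list(x)   # per-feature sum of absorbed instances
        self.label = label

    def centroid(self):
        return [s / self.n for s in self.linear_sum]

    def absorb(self, x):
        # Constant-time incremental update; the raw instance is not stored.
        self.n += 1
        self.linear_sum = [s + v for s, v in zip(self.linear_sum, x)]


class MicroClusterNN:
    """Nearest-centroid classifier over micro-clusters (simplified sketch)."""

    def __init__(self):
        self.clusters = []

    def _nearest(self, x, label=None):
        candidates = [c for c in self.clusters
                      if label is None or c.label == label]
        if not candidates:
            return None
        return min(candidates, key=lambda c: math.dist(c.centroid(), x))

    def predict(self, x):
        nearest = self._nearest(x)
        return None if nearest is None else nearest.label

    def learn(self, x, label):
        # Fold the instance into the nearest micro-cluster of its own class,
        # or start a new micro-cluster the first time a class is seen.
        own = self._nearest(x, label)
        if own is None:
            self.clusters.append(MicroCluster(x, label))
        else:
            own.absorb(x)


# Toy stream with two well-separated classes (made-up data).
model = MicroClusterNN()
stream = [((0.0, 0.0), "a"), ((0.2, 0.1), "a"),
          ((5.0, 5.0), "b"), ((5.1, 4.9), "b")]
for x, y in stream:
    model.learn(x, y)

print(model.predict((0.1, 0.3)))
print(model.predict((4.8, 5.2)))
```

Because each instance updates only a fixed-size summary, memory use does not grow with the length of the stream, which is the property that makes summaries of this kind attractive for unbounded data.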