50 resultados para Intelligent computing techniques
Resumo:
Generally classifiers tend to overfit if there is noise in the training data or there are missing values. Ensemble learning methods are often used to improve a classifier's classification accuracy. Most ensemble learning approaches aim to improve the classification accuracy of decision trees. However, alternative classifiers to decision trees exist. The recently developed Random Prism ensemble learner for classification aims to improve an alternative classification rule induction approach, the Prism family of algorithms, which addresses some of the limitations of decision trees. However, Random Prism suffers like any ensemble learner from a high computational overhead due to replication of the data and the induction of multiple base classifiers. Hence even modest sized datasets may impose a computational challenge to ensemble learners such as Random Prism. Parallelism is often used to scale up algorithms to deal with large datasets. This paper investigates parallelisation for Random Prism, implements a prototype and evaluates it empirically using a Hadoop computing cluster.
Resumo:
Advances in hardware and software technology enable us to collect, store and distribute large quantities of data on a very large scale. Automatically discovering and extracting hidden knowledge in the form of patterns from these large data volumes is known as data mining. Data mining technology is not only a part of business intelligence, but is also used in many other application areas such as research, marketing and financial analytics. For example medical scientists can use patterns extracted from historic patient data in order to determine if a new patient is likely to respond positively to a particular treatment or not; marketing analysts can use extracted patterns from customer data for future advertisement campaigns; finance experts have an interest in patterns that forecast the development of certain stock market shares for investment recommendations. However, extracting knowledge in the form of patterns from massive data volumes imposes a number of computational challenges in terms of processing time, memory, bandwidth and power consumption. These challenges have led to the development of parallel and distributed data analysis approaches and the utilisation of Grid and Cloud computing. This chapter gives an overview of parallel and distributed computing approaches and how they can be used to scale up data mining to large datasets.
Resumo:
Older adult computer users often lose track of the mouse cursor and so resort to methods such as shaking the mouse or searching the entire screen to find the cursor again. Hence, this paper describes how a standard optical mouse was modified to include a touch sensor, activated by releasing and touching the mouse, which automatically centers the mouse cursor to the screen, potentially making it easier to find a ‘lost’ cursor. Six older adult computer users and six younger computer users were asked to compare the touch sensitive mouse with cursor centering with two alternative techniques for locating the mouse cursor: manually shaking the mouse and using the Windows sonar facility. The time taken to click on a target after a distractor task was recorded, and results show that centering the mouse was the fastest to use, with a 35% improvement over shaking the mouse. Five out of six older participants ranked the touch sensitive mouse with cursor centering as the easiest to use.
Resumo:
n this study, the authors discuss the effective usage of technology to solve the problem of deciding on journey start times for recurrent traffic conditions. The developed algorithm guides the vehicles to travel on more reliable routes that are not easily prone to congestion or travel delays, ensures that the start time is as late as possible to avoid the traveller waiting too long at their destination and attempts to minimise the travel time. Experiments show that in order to be more certain of reaching their destination on time, a traveller has to leave early and correspondingly arrive early, resulting in a large waiting time. The application developed here asks the user to set this certainty factor as per the task in hand, and computes the best start time and route.
Resumo:
SOA (Service Oriented Architecture), workflow, the Semantic Web, and Grid computing are key enabling information technologies in the development of increasingly sophisticated e-Science infrastructures and application platforms. While the emergence of Cloud computing as a new computing paradigm has provided new directions and opportunities for e-Science infrastructure development, it also presents some challenges. Scientific research is increasingly finding that it is difficult to handle “big data” using traditional data processing techniques. Such challenges demonstrate the need for a comprehensive analysis on using the above mentioned informatics techniques to develop appropriate e-Science infrastructure and platforms in the context of Cloud computing. This survey paper describes recent research advances in applying informatics techniques to facilitate scientific research particularly from the Cloud computing perspective. Our particular contributions include identifying associated research challenges and opportunities, presenting lessons learned, and describing our future vision for applying Cloud computing to e-Science. We believe our research findings can help indicate the future trend of e-Science, and can inform funding and research directions in how to more appropriately employ computing technologies in scientific research. We point out the open research issues hoping to spark new development and innovation in the e-Science field.