318 results for Frequent Sequential Patterns


Relevance:

20.00% 20.00%

Publisher:

Abstract:

With the phenomenal growth of electronic data and information, there is strong demand for efficient and effective systems (tools) to perform data mining tasks on multidimensional databases. Association rules describe associations between items in the same transaction (intra-transaction) or in different transactions (inter-transaction). Association mining attempts to find interesting or useful association rules in databases; this is the crucial issue for the application of data mining in the real world. Association mining can be used in many application areas, such as the discovery of associations between customers’ locations and shopping behaviours in market basket analysis. Association mining includes two phases. The first phase, called pattern mining, is the discovery of frequent patterns. The second phase, called rule generation, is the discovery of interesting and useful association rules in the discovered patterns. The first phase, however, often takes a long time to find all frequent patterns, and these patterns also include much noise. The second phase is also time consuming and can generate many redundant rules. To improve the quality of association mining in databases, this thesis provides an alternative technique, granule-based association mining, for knowledge discovery in databases, where a granule refers to a predicate that describes common features of a group of transactions. The new technique first transforms transaction databases into basic decision tables, then uses multi-tier structures to integrate pattern mining and rule generation in one phase for both intra-transaction and inter-transaction association rule mining. To evaluate the proposed technique, this research defines the concept of meaningless rules by considering the correlations between data dimensions for intra-transaction association rule mining. It also uses precision to evaluate the effectiveness of inter-transaction association rules. The experimental results show that the proposed technique is promising.
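
The conventional two-phase process described above (pattern mining, then rule generation) can be sketched as follows. This is a minimal illustration of the standard support/confidence approach, not the thesis's granule-based technique; the function names, thresholds and toy transactions are assumptions made for the example.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Phase 1 (pattern mining): return {itemset: support} for every
    itemset whose support is at least min_support."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([i]) for i in items]
    frequent = {}
    size = 1
    while candidates:
        level = {}
        for c in candidates:
            # Support = fraction of transactions that contain the candidate.
            support = sum(1 for t in transactions if c <= t) / n
            if support >= min_support:
                level[c] = support
        frequent.update(level)
        size += 1
        # Apriori-style join: union pairs of frequent itemsets to build
        # the candidates that are exactly one item larger.
        candidates = list(
            {a | b for a, b in combinations(list(level), 2) if len(a | b) == size}
        )
    return frequent


def generate_rules(frequent, min_confidence):
    """Phase 2 (rule generation): return (antecedent, consequent, support,
    confidence) tuples for rules that meet min_confidence."""
    rules = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(sorted(itemset), r)):
                # Every subset of a frequent itemset is itself frequent,
                # so its support is already in the dictionary.
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, support, confidence))
    return rules


if __name__ == "__main__":
    # Hypothetical market-basket data, for illustration only.
    baskets = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk"},
    ]
    patterns = frequent_itemsets(baskets, min_support=0.5)
    for rule in generate_rules(patterns, min_confidence=0.9):
        print(rule)
```

Even on this toy input the first phase enumerates every frequent itemset before any rule is produced, which is the cost the abstract attributes to the two-phase design.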

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The increasing diversity of the Internet has created a vast number of multilingual resources on the Web. A huge number of these documents are written in languages other than English. Consequently, the demand for searching in non-English languages is growing exponentially. It is desirable that a search engine can search for information over collections of documents in other languages. This research investigates techniques for developing high-quality Chinese information retrieval systems. A distinctive feature of Chinese text is that a Chinese document is a sequence of Chinese characters with no space or boundary between Chinese words. This feature makes Chinese information retrieval more difficult: a retrieved document that contains the query term as a sequence of Chinese characters may not be relevant to the query, because the query term (as a sequence of Chinese characters) may not be a valid Chinese word in that document. On the other hand, a document that is actually relevant may not be retrieved because it does not contain the query sequence but contains other relevant words. In this research, we propose two approaches to deal with these problems. In the first approach, we propose a hybrid Chinese information retrieval model that combines word-based techniques with the traditional character-based techniques. The aim of this approach is to investigate the influence of Chinese segmentation on the performance of Chinese information retrieval. Two ranking methods are proposed to rank retrieved documents by their relevance to the query, calculated by combining character-based ranking and word-based ranking. Our experimental results show that Chinese segmentation can improve the performance of Chinese information retrieval, but the improvement is not significant if only Chinese segmentation is combined with the traditional character-based approach.
In the second approach, we propose a novel query expansion method which applies text mining techniques in order to find the most relevant words to extend the query. Most existing query expansion methods select highly frequent indexing terms from the retrieved documents to expand the query. In our approach, by contrast, we use text mining techniques to find patterns in the retrieved documents that correlate highly with the query term, and then use the relevant words in the patterns to expand the original query. This research project develops and implements a Chinese information retrieval system for evaluating the proposed approaches. There are two stages in the experiments. The first stage investigates whether high-accuracy segmentation can improve Chinese information retrieval. In the second stage, a text mining based query expansion approach is implemented, and a further experiment compares the performance of the standard Rocchio approach with that of the proposed text mining based query expansion method. The NTCIR5 Chinese collections are used in the experiments. The experimental results show that by incorporating the text mining based query expansion into the hybrid model, significant improvement has been achieved in both precision and recall.
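
For reference, the standard Rocchio baseline named above can be sketched as below. This shows only the textbook formula, not the thesis's text mining based method; the weights alpha, beta and gamma are common textbook defaults chosen as assumptions, and the sparse term-weight dictionaries and English terms are purely illustrative.

```python
from collections import Counter


def rocchio(query, relevant_docs, nonrelevant_docs, alpha=1.0, beta=0.75, gamma=0.15):
    """Expand a query with the textbook Rocchio formula:

        q' = alpha * q + beta * mean(relevant) - gamma * mean(nonrelevant)

    Vectors are sparse {term: weight} dictionaries.
    """
    expanded = Counter()
    for term, weight in query.items():
        expanded[term] += alpha * weight
    for docs, coeff in ((relevant_docs, beta), (nonrelevant_docs, -gamma)):
        if not docs:
            continue  # an empty feedback set contributes nothing
        for doc in docs:
            for term, weight in doc.items():
                expanded[term] += coeff * weight / len(docs)
    # Terms pushed to zero or below are conventionally dropped.
    return {term: weight for term, weight in expanded.items() if weight > 0}


if __name__ == "__main__":
    # Hypothetical term weights, for illustration only.
    q = {"mining": 1.0}
    relevant = [{"mining": 1.0, "pattern": 2.0}, {"pattern": 2.0}]
    nonrelevant = [{"soccer": 1.0}]
    print(rocchio(q, relevant, nonrelevant))
```

Rocchio weights expansion terms by their average weight in the feedback documents, which is the frequency-driven selection that the abstract contrasts with pattern-based expansion.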

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The aim of this paper has been to investigate the morphological patterns and kinetics of PDMS spreading on silicon wafers using a combination of techniques: ellipsometry, atomic force microscopy (AFM), scanning electron microscopy (SEM) and optical microscopy. Macroscopic silicone oil drops as well as water-based PDMS emulsions were studied after deposition on a flat silicon wafer surface in air, water and vacuum. Our own measurements using an imaging ellipsometer also clearly show the presence of a precursor film. The diffusion constant of this film, measured with a 60 000 cS PDMS sample spreading on a hydrophilic silicon wafer, is Df = 1.4 × 10^-11 m^2/s. Regardless of their size, density and method of deposition, droplets on both types of wafer (hydrophilic and hydrophobic) flatten out over a period of many hours, up to 3 days. During this process neighbouring droplets may coalesce, but there is strong evidence that some of the PDMS from the droplets migrates into a thin, continuous film that covers the surface in between droplets. The thin film appears to be ubiquitous if there has been any deposition of PDMS. However, this statement needs further verification. One question is whether the film forms immediately after forced drying, or whether in some or all cases it only forms by spreading from isolated droplets as they slowly flatten out.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Continuous learning and development have become increasingly important in the information age. However, employees with limited formal education in lower status occupations may be disadvantaged in their opportunities for development, as their jobs tend to require more limited knowledge and skills. In mature age, such workers may be subject to cumulative disadvantage with respect to work related learning and development, as well as negative stereotyping. This thesis concerns work related learning and development from a lifespan development psychology perspective. Development across the lifespan is grounded in biocultural co-constructivism. That is, the reciprocal influences of the individual and environment produce change in the individual. Existing theories and models of adaptive development attempt to explain how developmental resources are allocated across the lifespan. These include the Meta-theory of Selective Optimisation with Compensation, the Dual Process Model of Self Regulation, and Developmental Regulation via Optimisation and Primary and Secondary Control. These models were integrated to create the Model of Adaptive Development for Work Related Learning. The Learning and Development Survey (LDS) was constructed to measure the hypothesised processes of adaptive development for work related learning, which were individual goal selection, individual goal engagement, individual goal disengagement, organisational opportunities (selection and engagement), and organisational constraints. Data collection was undertaken in two phases: the pilot study and the main study. The objective of the pilot study was to test the LDS on a target population of 112 employees from a local government organisation. Exploratory factor analysis reduced the pilot version of the survey to 38 items encompassing eight constructs which covered the processes of the model of adaptive development for work related learning.
In the main study, the Revised Learning and Development Survey (R-LDS) was administered to another group of 137 employees from the local government organisation, as well as 110 employees from a private healthcare organisation. The purpose of the main study was to validate the R-LDS on two different groups to provide evidence of stability, and compare survey scores according to age and occupational status to determine construct validity. Findings from the main study indicated that only four constructs of the R-LDS were stable, which were organisational opportunities – selection, individual goal engagement, organisational constraints – disengagement and organisational opportunities – engagement. In addition, MANOVA studies revealed that the demographic variables affected organisational opportunities and constraints in the workplace, although individual goal engagement was not influenced by age. The findings from the pilot and main study partially supported the model of adaptive development for work related learning. Given that only four factors displayed adequate reliability in terms of internal consistency and stability, the findings suggest that individual goal selection and individual goal disengagement are less relevant to work related learning and development. Some recent research which emerged during the course of the current study has suggested that individual goal selection and individual goal disengagement are more relevant when goal achievement is impeded by biological constraints such as ageing. However, correlations between the retained factors support the model of adaptive development for work related learning, and represent the role of biocultural co-constructivism in development. Individual goal engagement was positively correlated with both opportunity factors (selection and engagement), while organisational constraints – disengagement was negatively correlated with organisational opportunities – selection. 
Demographic findings indicated that higher occupational status was associated with more opportunities for development. Age was associated with fewer opportunities or greater constraints for development, especially for lower status workers.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Road curves are an important feature of road infrastructure and many serious crashes occur on road curves. In Queensland, twice as many fatalities occur on curves as on straight roads. Therefore, there is a need to reduce drivers’ exposure to crash risk on road curves. Road crashes in Australia and in the Organisation for Economic Co-operation and Development (OECD) have plateaued in the last five years (2004 to 2008) and the road safety community is desperately seeking innovative interventions to reduce the number of crashes. However, designing an innovative and effective intervention may prove to be difficult as it relies on providing theoretical foundation, coherence, understanding, and structure to both the design and validation of the efficiency of the new intervention. Researchers from multiple disciplines have developed various models to determine the contributing factors for crashes on road curves with a view towards reducing the crash rate. However, most of the existing methods are based on statistical analysis of contributing factors described in government crash reports. In order to further explore the contributing factors related to crashes on road curves, this thesis designs a novel method to analyse and validate these contributing factors. The use of crash claim reports from an insurance company is proposed for analysis using data mining techniques. To the best of our knowledge, this is the first attempt to use data mining techniques to analyse crashes on road curves. Text mining is employed because the reports consist of thousands of textual descriptions, from which text mining is able to identify the contributing factors. Beyond identifying the contributing factors, few studies to date have investigated the relationships between these factors, especially for crashes on road curves. Thus, this study proposes the use of the rough set analysis technique to determine these relationships.
The results from this analysis are used to assess the effect of these contributing factors on crash severity. The findings obtained through the use of data mining techniques presented in this thesis have been found to be consistent with existing identified contributing factors. Furthermore, this thesis has identified new contributing factors towards crashes and the relationships between them. A significant pattern related to crash severity is the time of day: severe road crashes occur more frequently in the evening or at night. Tree collision is another common pattern: crashes that occur in the morning and involve hitting a tree are likely to have a higher crash severity. Another factor that influences crash severity is the age of the driver. Most age groups face a high crash severity except for drivers between 60 and 100 years old, who have the lowest crash severity. The significant relationship identified between contributing factors consists of the time of the crash, the manufactured year of the vehicle, the age of the driver and hitting a tree. Having identified new contributing factors and relationships, a validation process is carried out using a traffic simulator in order to determine their accuracy. The validation process indicates that the results are accurate. This demonstrates that data mining techniques are a powerful tool in road safety research, and can be usefully applied within the Intelligent Transport System (ITS) domain. The research presented in this thesis provides an insight into the complexity of crashes on road curves. The findings of this research have important implications for both practitioners and academics. For road safety practitioners, the results from this research illustrate practical benefits for the design of interventions for road curves that will potentially help in decreasing related injuries and fatalities.
For academics, this research opens up a new research methodology to assess crash severity, related to road crashes on curves.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper investigates the patterns and determinants of life satisfaction in Germany following reunification. We implement a new fixed-effect estimator for ordinal life satisfaction in the German Socio-Economic Panel and find negative effects on life satisfaction from being recently fired, losing a spouse through either death or separation, and time spent in hospital, while we find strong positive effects from income and marriage. Using a new causal decomposition technique, we find that East Germans experienced a continued improvement in life satisfaction to which increased household incomes contributed around 12 percent. Most of the improvement is explained by better average circumstances, such as greater political freedom. For West Germans, we find little change in average life satisfaction over this period.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Information fusion in biometrics has received considerable attention. The architecture proposed here is based on the sequential integration of multi-instance and multi-sample fusion schemes. This method is analytically shown to improve the performance and allow a controlled trade-off between false alarms and false rejects when the classifier decisions are statistically independent. Equations developed for detection error rates are experimentally evaluated by considering the proposed architecture for text dependent speaker verification using HMM based digit dependent speaker models. The tuning of parameters, n classifiers and m attempts/samples, is investigated and the resultant detection error trade-off performance is evaluated on individual digits. Results show that performance improvement can be achieved even for weaker classifiers (FRR-19.6%, FAR-16.7%). The architectures investigated apply to speaker verification from spoken digit strings such as credit card numbers in telephone or VOIP or internet based applications.
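
The trade-off between false alarms and false rejects under statistical independence can be illustrated with a short calculation. The decision rule below is an assumption made for this sketch and is not taken from the paper: within each instance the claimant is accepted if any of the m attempts is accepted, and the overall claim is accepted only if all n instances accept. The function name is hypothetical; the base rates are the weaker-classifier figures quoted in the abstract.

```python
def fused_error_rates(far, frr, n, m):
    """Return (FAR, FRR) after fusing n instances with m attempts each.

    Assumed rule (illustrative, not the paper's equations):
      - multi-sample:   an instance accepts if ANY of its m attempts accepts
      - multi-instance: the claim is accepted only if ALL n instances accept
    All individual decisions are assumed statistically independent.
    """
    if n < 1 or m < 1:
        raise ValueError("n and m must be positive integers")
    # Extra attempts give an impostor more chances and a genuine user
    # more chances: false accepts rise, false rejects fall.
    far_instance = 1 - (1 - far) ** m
    frr_instance = frr ** m
    # Requiring every instance to accept has the opposite effect.
    far_total = far_instance ** n
    frr_total = 1 - (1 - frr_instance) ** n
    return far_total, frr_total


if __name__ == "__main__":
    base_far, base_frr = 0.167, 0.196  # weaker-classifier rates from the abstract
    for n in (1, 2, 3):
        for m in (1, 2, 3):
            far, frr = fused_error_rates(base_far, base_frr, n, m)
            print(f"n={n} m={m} FAR={far:.4f} FRR={frr:.4f}")
```

Under this assumed rule, increasing n lowers the false accept rate at the cost of more false rejects, while increasing m does the reverse, so the two parameters can be tuned against each other.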

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper presents a new approach to improving the effectiveness of autonomous systems that deal with dynamic environments. The basis of the approach is to find repeating patterns of behavior in the dynamic elements of the system, and then to use predictions of the repeating elements to better plan goal directed behavior. It is a layered approach involving classifying, modeling, predicting and exploiting. Classifying involves using observations to place the moving elements into previously defined classes. Modeling involves recording features of the behavior on a coarse grained grid. Exploitation is achieved by integrating predictions from the model into the behavior selection module to improve the utility of the robot's actions. This is in contrast to typical approaches that use the model to select between different strategies or plays. Three methods of adaptation to the dynamic features of the environment are explored. The effectiveness of each method is determined using statistical tests over a number of repeated experiments. The work is presented in the context of predicting opponent behavior in the highly dynamic and multi-agent robot soccer domain (RoboCup).

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Objective: To describe quality of life (QOL) over a 12-month period among women with breast cancer, consider the association between QOL and overall survival (OS), and explore characteristics associated with QOL declines. Methods: A population-based sample of Australian women (n=287) with invasive, unilateral breast cancer (Stage I+) was observed prospectively for a median of 6.6 years. QOL was assessed at six, 12 and 18 months post-diagnosis, using the Functional Assessment of Cancer Therapy, Breast (FACT-B+4) questionnaire. Raw scores for the FACT-B+4 and subscales were computed and individuals were categorized according to whether QOL declined, remained stable or improved between six and 18 months. Kaplan-Meier and Cox proportional hazards survival methods were used to estimate OS and its associations with QOL. Logistic regression models identified factors associated with QOL decline. Results: Within FACT-B+4 subscales, between 10% and 23% of women showed declines in QOL. Following adjustment for established prognostic factors, emotional well-being and FACT-B+4 scores at six months post-diagnosis were associated with OS (p<0.05). Declines in physical (p<0.01) or functional (p=0.02) well-being between six and 18 months post-diagnosis were also associated significantly with OS. Receiving multiple forms of adjuvant treatment, a perception of not handling stress well and reporting one or more other major life events at six months post-diagnosis were factors associated with declines in QOL in multivariable analyses. Conclusions: Interventions targeted at preventing QOL declines may ultimately improve quantity as well as quality of life following breast cancer.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Today’s evolving networks are experiencing a large number of different attacks, ranging from system break-ins and infection by automatic attack tools such as worms, viruses and trojan horses to denial of service (DoS). One important aspect of such attacks is that they are often indiscriminate and target Internet addresses without regard to whether they are bona fide allocated or not. Due to the absence of any advertised host services, the traffic observed on unused IP addresses is by definition unsolicited and likely to be either opportunistic or malicious. The analysis of large repositories of such traffic can be used to extract useful information about both ongoing and new attack patterns and unearth unusual attack behaviors. However, such an analysis is difficult due to the size and nature of the collected traffic on unused address spaces. In this dissertation, we present a network traffic analysis technique which uses traffic collected from unused address spaces and relies on the statistical properties of the collected traffic, in order to accurately and quickly detect new and ongoing network anomalies. Detection of network anomalies is based on the concept that an anomalous activity usually transforms the network parameters in such a way that their statistical properties no longer remain constant, resulting in abrupt changes. In this dissertation, we use sequential analysis techniques to identify changes in the behavior of network traffic targeting unused address spaces to unveil both ongoing and new attack patterns. Specifically, we have developed a dynamic sliding window based non-parametric cumulative sum change detection technique for identification of changes in network traffic. Furthermore, we have introduced dynamic thresholds to detect changes in network traffic behavior and also to detect when a particular change has ended.
Experimental results are presented that demonstrate the operational effectiveness and efficiency of the proposed approach, using both synthetically generated datasets and real network traces collected from a dedicated block of unused IP addresses.
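
The general idea of sliding-window, non-parametric cumulative sum (CUSUM) change detection can be sketched as below. This is a simplified illustration and not the dissertation's algorithm: it uses a fixed alarm threshold rather than the dynamic thresholds described above, it does not detect when a change ends, and the window length, slack and threshold values are assumptions chosen for the toy data.

```python
from collections import deque


def sliding_cusum(series, window=5, slack=1.0, threshold=5.0):
    """Return the indices at which a one-sided (upward) CUSUM alarm fires.

    The baseline is the mean of the last `window` observations, so no
    distribution is assumed for the traffic. `slack` is the deviation
    tolerated per sample before evidence of a change accumulates.
    """
    history = deque(maxlen=window)
    score = 0.0
    alarms = []
    for i, x in enumerate(series):
        if len(history) == window:
            baseline = sum(history) / window
            # Accumulate only deviations above baseline + slack; never go below zero.
            score = max(0.0, score + x - baseline - slack)
            if score > threshold:
                alarms.append(i)
                # Restart so the baseline is re-learned at the new traffic level.
                score = 0.0
                history.clear()
        history.append(x)
    return alarms


if __name__ == "__main__":
    # Hypothetical packets-per-interval counts with an abrupt rise at index 10.
    traffic = [10.0] * 10 + [20.0] * 10
    print(sliding_cusum(traffic))
```

Clearing the window after an alarm is one simple way to stop a sustained change from triggering repeated alarms; a production detector would need a more careful policy.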

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Taking an 'action genre' approach (Lemke, 199**), this paper analyses the representational strategies of three genres of photography: press photography, photojournalism and documentary photography. While much has been written on editorial photography, there is no organised body of scholarship that distinguishes between these three very different modes of editorial photography.