818 results for MULTI-RELATIONAL DATA MINING
Abstract:
The main challenges of multimedia data retrieval lie in the effective mapping between low-level features and high-level concepts, and in the individual users' subjective perceptions of multimedia content. The objective of this dissertation is to develop an integrated multimedia indexing and retrieval framework that aims to bridge the gap between semantic concepts and low-level features. To achieve this goal, a set of core techniques has been developed, including image segmentation, content-based image retrieval, object tracking, video indexing, and video event detection. These core techniques are integrated in a systematic way to enable the semantic search for images/videos, and can be tailored to solve problems in other multimedia-related domains. In image retrieval, two new methods of bridging the semantic gap are proposed: (1) for general content-based image retrieval, a stochastic mechanism is utilized to enable the long-term learning of high-level concepts from a set of training data, such as user access frequencies and access patterns of images; (2) in addition to whole-image retrieval, a novel multiple instance learning framework is proposed for object-based image retrieval, by which a user can search more effectively for images that contain multiple objects of interest. An enhanced image segmentation algorithm is developed to extract the object information from images. This segmentation algorithm is further used in video indexing and retrieval, by which a robust video shot/scene segmentation method is developed based on low-level visual feature comparison, object tracking, and audio analysis. Based on shot boundaries, a novel data mining framework is further proposed to detect events in soccer videos, while fully utilizing the multi-modality features and object information obtained through video shot/scene detection.
Another contribution of this dissertation is the potential of the above techniques to be tailored and applied to other multimedia applications. This is demonstrated by their utilization in traffic video surveillance applications. The enhanced image segmentation algorithm, coupled with an adaptive background learning algorithm, improves the performance of vehicle identification. A sophisticated object tracking algorithm is proposed to track individual vehicles, while the spatial and temporal relationships of vehicle objects are modeled by an abstract semantic model.
Abstract:
Convergence among treatment, prevention, and developmental intervention approaches has led to the recognition of the need for evaluation models and research designs that employ a full range of evaluation information to provide an empirical basis for enhancing the efficiency, efficacy, and effectiveness of prevention and positive development interventions. This study reports an investigation of a positive youth development program using an Outcome Mediation Cascade (OMC) evaluation model, an integrated model for evaluating the empirical intersection between intervention and developmental processes. The Changing Lives Program (CLP) is a community-supported positive youth development (PYD) intervention implemented in a practice setting as a selective/indicated program for multi-ethnic, multi-problem, at-risk youth in urban alternative high schools. This study used a Relational Data Analysis integration of quantitative and qualitative data analysis strategies, including the use of both fixed and free response measures and a structural equation modeling approach, to construct and evaluate the hypothesized OMC model. Findings indicated that the hypothesized model fit the data (χ²(7) = 6.991, p = .43; RMSEA = .00; CFI = 1.00; WRMR = .459). Findings also provided preliminary evidence consistent with the hypothesis that, in addition to having effects on targeted positive outcomes, PYD interventions are likely to have progressive cascading effects on untargeted problem outcomes that operate through effects on positive outcomes. Furthermore, the general pattern of findings suggested the need to use methods capable of capturing both quantitative and qualitative change in order to increase the likelihood of identifying more complete, theory-informed, empirically supported models of developmental intervention change processes.
Abstract:
Recent intervention efforts in promoting positive identity in troubled adolescents have begun to draw on the potential for an integration of the self-construction and self-discovery perspectives in conceptualizing identity processes, as well as the integration of quantitative and qualitative data analytic strategies. This study reports an investigation of the Changing Lives Program (CLP), using an Outcome Mediation (OM) evaluation model, an integrated model for evaluating targets of intervention, while theoretically including a Self-Transformative Model of Identity Development (STM), a proposed integration of self-discovery and self-construction identity processes. This study also used a Relational Data Analysis (RDA) integration of quantitative and qualitative analysis strategies and a structural equation modeling (SEM) approach to construct and evaluate the hypothesized OM/STM model. The CLP is a community-supported positive youth development intervention, targeting multi-problem youth in alternative high schools in the Miami Dade County Public Schools (M-DCPS). The 259 participants for this study were drawn from the CLP’s archival data file. The model evaluated in this study utilized three indices of core identity processes, (1) personal expressiveness, (2) identity conflict resolution, and (3) informational identity style, that were conceptualized as mediators of the effects of participation in the CLP on change in two qualitative outcome indices of participants’ sense of self and identity. Findings indicated the model fit the data (χ²(10) = 3.638, p = .96; RMSEA = .00; CFI = 1.00; WRMR = .299). The pattern of findings supported the utilization of the STM in conceptualizing identity processes and provided support for the OM design.
The findings also suggested the need for methods capable of detecting and rendering unique, sample-specific free-response data to increase the likelihood of identifying emergent core developmental research concepts and constructs in studies of intervention/developmental change over time in ways not possible using fixed-response methods alone.
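As a worked check on the reported fit, the RMSEA of .00 follows directly from the chi-square statistic being smaller than its degrees of freedom. The formula below is one common definition and is an assumption here: the abstract does not say which formula or software was used, and some packages use N in place of N − 1. The values χ² = 3.638, df = 10, and N = 259 are taken from the abstract.

```latex
\mathrm{RMSEA}
  = \sqrt{\frac{\max\left(\chi^2 - df,\; 0\right)}{df\,(N - 1)}}
  = \sqrt{\frac{\max\left(3.638 - 10,\; 0\right)}{10 \times 258}}
  = \sqrt{\frac{0}{2580}}
  = 0
```

Because the numerator is truncated at zero whenever χ² is at most df, the result does not depend on the sample size in this case.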
Abstract:
With increasing competition and more demanding members, clubs need a tool to help them better attract and retain members and predict their behavior. Data mining is such a tool. This article presents an overview of how data warehousing, data marting, and data mining can provide the foundation on which clubs can build strategies to outsmart competitors, build loyalty, identify new members, and lower costs.
Abstract:
Many systems and applications continuously produce events. These events record the status of a system and trace its behavior. By examining them, system administrators can check for potential problems. If the temporal dynamics of the systems are further investigated, the underlying patterns can be discovered. The uncovered knowledge can be leveraged to predict future system behavior or to mitigate potential risks. Moreover, system administrators can use the temporal patterns to set up event management rules that make the system more intelligent. With the popularity of data mining techniques in recent years, these events have gradually become more and more useful. Despite recent advances in data mining techniques, their application to system event mining is still at a rudimentary stage. Most works still focus on episode mining or frequent pattern discovery. These methods are unable to provide a brief yet comprehensible summary that reveals the valuable information from a high-level perspective. Moreover, these methods provide little actionable knowledge to help system administrators better manage the systems. To make better use of the recorded events, more practical techniques are required. From the perspective of data mining, three correlated directions are considered helpful for system management: (1) provide concise yet comprehensive summaries of the running status of the systems; (2) make the systems more intelligent and autonomous; (3) effectively detect abnormal system behavior. Due to the richness of the event logs, all these directions can be addressed in a data-driven manner. In this way, the robustness of the systems can be enhanced and the goal of autonomous management can be approached.
This dissertation mainly focuses on the foregoing directions, leveraging temporal mining techniques to facilitate system management. More specifically, three concrete topics are discussed: event summarization, resource demand prediction, and streaming anomaly detection. Besides the theoretical contributions, an experimental evaluation is also presented to demonstrate the effectiveness and efficacy of the corresponding solutions.
Abstract:
Educational Data Mining is an application domain within artificial intelligence that has been extensively explored in recent years. Technological advances, and in particular the increasing use of virtual learning environments, have allowed the generation of considerable amounts of data to be investigated. Among the activities addressed in this context is the prediction of students' school performance, which can be accomplished through the use of machine learning techniques. Such techniques may be used to classify students into predefined labels. One strategy for applying these techniques is to combine them into multi-classifier systems, whose efficiency is supported by results achieved in studies conducted in several other areas, such as medicine, commerce, and biometrics. The data used in the experiments were obtained from the interactions between students in Moodle, one of the most widely used virtual learning environments. In this context, this paper presents the results of several experiments that use specific multi-classifier systems, called ensembles, aiming to reach better results in school performance prediction, that is, to achieve the highest accuracy in student classification. This paper therefore presents a significant exploration of educational data and analyses of the relevant results of these experiments.
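To illustrate how an ensemble can combine several classifiers, here is a minimal sketch of majority voting, one of the simplest combination rules. It is not the ensemble method used in the paper, which the abstract does not specify. The classifier outputs, labels, and function names below are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per student."""
    combined = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        top = max(counts.values())
        # On a tie, keep the label proposed by the earliest classifier.
        combined.append(next(label for label in labels if counts[label] == top))
    return combined


def accuracy(predicted, actual):
    """Fraction of students whose predicted label matches the true label."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical predictions from three base classifiers for four students.
clf_a = ["pass", "fail", "fail", "pass"]
clf_b = ["pass", "pass", "pass", "pass"]
clf_c = ["fail", "fail", "pass", "pass"]
truth = ["pass", "fail", "pass", "pass"]

ensemble = majority_vote([clf_a, clf_b, clf_c])
print(ensemble, accuracy(ensemble, truth))
```

In this toy data each base classifier makes one mistake on a different student, so the vote corrects all of them. Real ensembles gain less when the base classifiers make correlated errors.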
Abstract:
This thesis presents a study of D3, a web graphics library written in JavaScript, and catalogs the charts implemented with it that are available on the web. The aim is to evaluate the library and study its strengths and weaknesses in order to determine whether it is suitable for use in a European project. To this end, the chart classification methods found in the literature are studied, and the state of the art in data visualization is presented and described. The classification method proposed by the project's design team is then described, and the chart gallery on the D3 library's website is cataloged. Finally, an algorithm for selecting a chart according to the user's needs is presented and formally studied.
Abstract:
Peer reviewed
Abstract:
Heating, ventilation, and air conditioning (HVAC) systems are significant consumers of energy; however, building management systems do not typically operate them in accordance with occupant movements. Due to the delayed response of HVAC systems, prediction of occupant locations is necessary to maximize energy efficiency. We present an approach to occupant location prediction based on association rule mining, allowing prediction based on historical occupant locations. Association rule mining is a machine learning technique designed to find any correlations that exist in a given dataset. Occupant location datasets have a number of properties that differentiate them from the market basket datasets for which association rule mining was originally designed. This thesis adapts the approach to suit such datasets, focusing the rule mining process on patterns that are useful for location prediction. This approach, named OccApriori, allows for the prediction of occupants’ next locations as well as their locations further in the future, and can take into account any available data, for example the day of the week, the recent movements of the occupant, and timetable data. By integrating an existing extension of association rule mining into the approach, it is able to make predictions based on general classes of locations as well as specific locations.
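The idea of predicting a next location from association rules can be sketched with plain support and confidence counting. This is a simplified illustration and not OccApriori itself: the rule shape is fixed to "(day, current location) implies next location" instead of being found by an Apriori-style itemset search. The history records, thresholds, and function name are hypothetical.

```python
from collections import Counter


def mine_next_location_rules(history, min_support=2, min_confidence=0.5):
    """Mine rules of the form (day, current location) -> next location.

    history is a list of (day, current_location, next_location) tuples.
    Returns {(day, current): (next, confidence)}, keeping only the most
    confident rule per antecedent that passes both thresholds.
    """
    antecedent_counts = Counter((day, cur) for day, cur, _ in history)
    rule_counts = Counter(((day, cur), nxt) for day, cur, nxt in history)
    rules = {}
    for (antecedent, nxt), count in rule_counts.items():
        # Confidence = how often this antecedent was followed by nxt.
        confidence = count / antecedent_counts[antecedent]
        if count >= min_support and confidence >= min_confidence:
            best = rules.get(antecedent)
            if best is None or confidence > best[1]:
                rules[antecedent] = (nxt, confidence)
    return rules


# Hypothetical movement log for one occupant.
history = [
    ("Mon", "office", "lab"),
    ("Mon", "office", "lab"),
    ("Mon", "office", "canteen"),
    ("Tue", "office", "meeting_room"),
    ("Tue", "office", "meeting_room"),
    ("Tue", "lab", "office"),
]

rules = mine_next_location_rules(history)
print(rules.get(("Mon", "office")))
```

A rule seen only once, such as the Tuesday move from the lab to the office, falls below the support threshold and yields no prediction, which is the usual trade-off between coverage and reliability in rule mining.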
Abstract:
During the SINOPS project, an optimal state-of-the-art simulation of the marine silicon cycle is attempted, employing a biogeochemical ocean general circulation model (BOGCM) through three particular time steps relevant for global (paleo-)climate. In order to tune the model optimally, results of the simulations are compared to a comprehensive data set of 'real' observations. SINOPS' scientific data management ensures that data structure becomes homogeneous throughout the project. The practical work routine comprises systematic progress from data acquisition, through preparation, processing, quality check, and archiving, up to the presentation of data to the scientific community. Meta-information and analytical data are mapped by an n-dimensional catalogue in order to itemize the analytical value and to serve as an unambiguous identifier. In practice, data management is carried out by means of the online-accessible information system PANGAEA, which offers a tool set comprising a data warehouse, Graphical Information System (GIS), 2-D plot, cross-section plot, etc., and whose multidimensional data model promotes scientific data mining. Besides scientific and technical aspects, this alliance between the scientific project team and the data management crew serves to integrate the participants and allows them to gain mutual respect and appreciation.
Abstract:
This paper presents a numerical study of a linear compressor cascade to investigate the effective end wall profiling rules for highly loaded axial compressors. The first step in the research applies a correlation analysis of the different flow field parameters, by data mining over 600 profiling samples, to quantify how variations of loss, secondary flow, and passage vortex interact with each other under the influence of a profiled end wall. The result identifies the dominant role of corner separation in the control of total pressure loss, providing a principle that only in a flow field with serious corner separation does the profiled end wall change total pressure loss, secondary flow, and passage vortex in the same direction. Then, in the second step, a multi-objective optimization of a profiled end wall is performed to reduce loss at the design point and near the stall point. The development of effective end wall profiling rules is based on the manner of secondary flow control rather than on the geometric features of the end wall. Using the optimum end wall cases from the Pareto front, a quantitative tool for analyzing secondary flow control is employed. The driving forces induced by a profiled end wall on different regions of the end wall flow are subjected to a detailed analysis and identified for their positive or negative influence in relieving corner separation, from which the effective profiling rules are further confirmed. It is found that the profiling rules on a cascade show distinct differences at the design point and near the stall point; thus, loss control at different operating points is generally independent.
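The Pareto front mentioned above is the set of candidate designs that no other candidate beats on every objective. As a minimal sketch, assuming two objectives that are both minimized (loss at the design point and loss near the stall point), the front can be extracted by a pairwise dominance check. The design names and loss values are hypothetical and do not come from the paper.

```python
def pareto_front(candidates):
    """Return the names of candidates not dominated by any other candidate.

    candidates maps a name to a tuple of objective values, all minimized.
    b dominates a if b is no worse on every objective and strictly better
    on at least one.
    """
    front = []
    for name, a in candidates.items():
        dominated = any(
            all(x <= y for x, y in zip(b, a)) and any(x < y for x, y in zip(b, a))
            for other, b in candidates.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical end wall designs: (loss at design point, loss near stall).
designs = {
    "A": (0.030, 0.060),
    "B": (0.028, 0.065),
    "C": (0.032, 0.062),
    "D": (0.035, 0.055),
}

print(pareto_front(designs))
```

Design C is dropped because A is better on both objectives, while A, B, and D each win on one objective and lose on the other. The pairwise check is quadratic in the number of candidates, which is fine for a few hundred samples.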
Abstract:
In this talk, I will describe various computational modelling and data mining solutions that form the basis of how the office of Deputy Head of Department (Resources) works to serve you. These include lessons I learn about, and from, optimisation issues in resource allocation, uncertainty analysis on league tables, modelling the process of winning external grants, and lessons we learn from student satisfaction surveys, some of which I have attempted to inject into our planning processes.
Abstract:
Discovery Driven Analysis (DDA) is a common feature of OLAP technology for analyzing structured data. In essence, DDA helps analysts to discover anomalous data by highlighting 'unexpected' values in the OLAP cube. By giving indications to the analyst on what dimensions to explore, DDA speeds up the process of discovering anomalies and their causes. However, Discovery Driven Analysis (and OLAP in general) is only applicable to structured data, such as records in databases. We propose a system to extend DDA technology to semi-structured text documents, that is, text documents with a small amount of structured data. Our system pipeline consists of two stages: first, the text part of each document is structured around user-specified dimensions, using the semi-PLSA algorithm; then, we adapt DDA to these fully structured documents, thus enabling DDA on text documents. We present some applications of this system in OLAP analysis and show how scalability issues are solved. Results show that our system can handle reasonable datasets of documents, in real time, without any need for pre-computation.
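To make the notion of an 'unexpected' cube value concrete, the sketch below flags cells of a two-dimensional cube that deviate from a simple additive expectation (row mean plus column mean minus grand mean). This is a deliberately simplified stand-in and is not the model used by the system in the abstract or by any particular OLAP product. The dimensions, values, threshold, and function name are hypothetical.

```python
from math import sqrt


def unexpected_cells(cube, threshold=1.5):
    """Flag cells whose residual is large relative to the RMS residual.

    cube maps (row, col) to a numeric value and must be a complete grid.
    The expected value of a cell is row mean + column mean - grand mean.
    """
    rows = sorted({r for r, _ in cube})
    cols = sorted({c for _, c in cube})
    grand = sum(cube.values()) / len(cube)
    row_mean = {r: sum(cube[r, c] for c in cols) / len(cols) for r in rows}
    col_mean = {c: sum(cube[r, c] for r in rows) / len(rows) for c in cols}
    residual = {
        (r, c): cube[r, c] - (row_mean[r] + col_mean[c] - grand)
        for r, c in cube
    }
    rms = sqrt(sum(v * v for v in residual.values()) / len(residual))
    if rms == 0:
        return []  # the data is perfectly additive, nothing is unexpected
    return [cell for cell, v in residual.items() if abs(v) / rms > threshold]


# Hypothetical region-by-month cube; ("north", "feb") is 9 above the trend.
cube = {
    ("east", "jan"): 11, ("east", "feb"): 12, ("east", "mar"): 13,
    ("north", "jan"): 21, ("north", "feb"): 31, ("north", "mar"): 23,
    ("south", "jan"): 31, ("south", "feb"): 32, ("south", "mar"): 33,
}

print(unexpected_cells(cube))
```

Note that a single anomaly also shifts the row and column means, so neighbouring cells pick up smaller residuals of their own. Comparing each residual against the overall RMS keeps those secondary effects below the threshold in this example.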