15 resultados para Information Retrieval, Weblogs, Decision Support
em Digital Commons at Florida International University
Resumo:
The rapid growth of the Internet and the advancements of the Web technologies have made it possible for users to have access to large amounts of on-line music data, including music acoustic signals, lyrics, style/mood labels, and user-assigned tags. The progress has made music listening more fun, but has raised an issue of how to organize this data, and more generally, how computer programs can assist users in their music experience. An important subject in computer-aided music listening is music retrieval, i.e., the issue of efficiently helping users in locating the music they are looking for. Traditionally, songs were organized in a hierarchical structure such as genre->artist->album->track, to facilitate the users’ navigation. However, the intentions of the users are often hard to be captured in such a simply organized structure. The users may want to listen to music of a particular mood, style or topic; and/or any songs similar to some given music samples. This motivated us to work on user-centric music retrieval system to improve users’ satisfaction with the system. The traditional music information retrieval research was mainly concerned with classification, clustering, identification, and similarity search of acoustic data of music by way of feature extraction algorithms and machine learning techniques. More recently the music information retrieval research has focused on utilizing other types of data, such as lyrics, user-access patterns, and user-defined tags, and on targeting non-genre categories for classification, such as mood labels and styles. This dissertation focused on investigating and developing effective data mining techniques for (1) organizing and annotating music data with styles, moods and user-assigned tags; (2) performing effective analysis of music data with features from diverse information sources; and (3) recommending music songs to the users utilizing both content features and user access patterns.
Resumo:
Infrastructure management agencies are facing multiple challenges, including aging infrastructure, reduction in capacity of existing infrastructure, and availability of limited funds. Therefore, decision makers are required to think innovatively and develop inventive ways of using available funds. Maintenance investment decisions are generally made based on physical condition only. It is important to understand that spending money on public infrastructure is synonymous with spending money on people themselves. This also requires consideration of decision parameters, in addition to physical condition, such as strategic importance, socioeconomic contribution and infrastructure utilization. Consideration of multiple decision parameters for infrastructure maintenance investments can be beneficial in case of limited funding. Given this motivation, this dissertation presents a prototype decision support framework to evaluate trade-off, among competing infrastructures, that are candidates for infrastructure maintenance, repair and rehabilitation investments. Decision parameters' performances measured through various factors are combined to determine the integrated state of an infrastructure using Multi-Attribute Utility Theory (MAUT). The integrated state, cost and benefit estimates of probable maintenance actions are utilized alongside expert opinion to develop transition probability and reward matrices for each probable maintenance action for a particular candidate infrastructure. These matrices are then used as an input to the Markov Decision Process (MDP) for the finite-stage dynamic programming model to perform project (candidate)-level analysis to determine optimized maintenance strategies based on reward maximization. The outcomes of project (candidate)-level analysis are then utilized to perform network-level analysis taking the portfolio management approach to determine a suitable portfolio under budgetary constraints. The major decision support outcomes of the prototype framework include performance trend curves, decision logic maps, and a network-level maintenance investment plan for the upcoming years. The framework has been implemented with a set of bridges considered as a network with the assistance of the Pima County DOT, AZ. It is expected that the concept of this prototype framework can help infrastructure management agencies better manage their available funds for maintenance.
Resumo:
Increased pressure to control costs and increased competition has prompted health care managers to look for tools to effectively operate their institutions. This research sought a framework for the development of a Simulation-Based Decision Support System (SB-DSS) to evaluate operating policies. A prototype of this SB-DSS was developed. It incorporates a simulation model that uses real or simulated data. ER decisions have been categorized and, for each one, an implementation plan has been devised. Several issues of integrating heterogeneous tools have been addressed. The prototype revealed that simulation can truly be used in this environment in a timely fashion because the simulation model has been complemented with a series of decision-making routines. These routines use a hierarchical approach to organize the various scenarios under which the model may run and to partially reconfigure the ARENA model at run time. Hence, the SB-DSS tailors its responses to each node in the hierarchy.
Resumo:
Construction organizations typically deal with large volumes of project data containing valuable information. It is found that these organizations do not use these data effectively for planning and decision-making. There are two reasons. First, the information systems in construction organizations are designed to support day-to-day construction operations. The data stored in these systems are often non-validated, non-integrated and are available in a format that makes it difficult for decision makers to use in order to make timely decisions. Second, the organizational structure and the IT infrastructure are often not compatible with the information systems thereby resulting in higher operational costs and lower productivity. These two issues have been investigated in this research with the objective of developing systems that are structured for effective decision-making. ^ A framework was developed to guide storage and retrieval of validated and integrated data for timely decision-making and to enable construction organizations to redesign their organizational structure and IT infrastructure matched with information system capabilities. The research was focused on construction owner organizations that were continuously involved in multiple construction projects. Action research and Data warehousing techniques were used to develop the framework. ^ One hundred and sixty-three construction owner organizations were surveyed in order to assess their data needs, data management practices and extent of use of information systems in planning and decision-making. For in-depth analysis, Miami-Dade Transit (MDT) was selected which is in-charge of all transportation-related construction projects in the Miami-Dade county. A functional model and a prototype system were developed to test the framework. The results revealed significant improvements in data management and decision-support operations that were examined through various qualitative (ease in data access, data quality, response time, productivity improvement, etc.) and quantitative (time savings and operational cost savings) measures. The research results were first validated by MDT and then by a representative group of twenty construction owner organizations involved in various types of construction projects. ^
Resumo:
Construction organizations typically deal with large volumes of project data containing valuable information. It is found that these organizations do not use these data effectively for planning and decision-making. There are two reasons. First, the information systems in construction organizations are designed to support day-to-day construction operations. The data stored in these systems are often non-validated, nonintegrated and are available in a format that makes it difficult for decision makers to use in order to make timely decisions. Second, the organizational structure and the IT infrastructure are often not compatible with the information systems thereby resulting in higher operational costs and lower productivity. These two issues have been investigated in this research with the objective of developing systems that are structured for effective decision-making. A framework was developed to guide storage and retrieval of validated and integrated data for timely decision-making and to enable construction organizations to redesign their organizational structure and IT infrastructure matched with information system capabilities. The research was focused on construction owner organizations that were continuously involved in multiple construction projects. Action research and Data warehousing techniques were used to develop the framework. One hundred and sixty-three construction owner organizations were surveyed in order to assess their data needs, data management practices and extent of use of information systems in planning and decision-making. For in-depth analysis, Miami-Dade Transit (MDT) was selected which is in-charge of all transportation-related construction projects in the Miami-Dade county. A functional model and a prototype system were developed to test the framework. The results revealed significant improvements in data management and decision-support operations that were examined through various qualitative (ease in data access, data quality, response time, productivity improvement, etc.) and quantitative (time savings and operational cost savings) measures. The research results were first validated by MDT and then by a representative group of twenty construction owner organizations involved in various types of construction projects.
Resumo:
This thesis develops and validates the framework of a specialized maintenance decision support system for a discrete part manufacturing facility. Its construction utilizes a modular approach based on the fundamental philosophy of Reliability Centered Maintenance (RCM). The proposed architecture uniquely integrates System Decomposition, System Evaluation, Failure Analysis, Logic Tree Analysis, and Maintenance Planning modules. It presents an ideal solution to the unique maintenance inadequacies of modern discrete part manufacturing systems. Well established techniques are incorporated as building blocks of the system's modules. These include Failure Mode Effect and Criticality Analysis (FMECA), Logic Tree Analysis (LTA), Theory of Constraints (TOC), and an Expert System (ES). A Maintenance Information System (MIS) performs the system's support functions. Validation was performed by field testing of the system at a Miami based manufacturing facility. Such a maintenance support system potentially reduces downtime losses and contributes to higher product quality output. Ultimately improved profitability is the final outcome. ^
Resumo:
Since multimedia data, such as images and videos, are way more expressive and informative than ordinary text-based data, people find it more attractive to communicate and express with them. Additionally, with the rising popularity of social networking tools such as Facebook and Twitter, multimedia information retrieval can no longer be considered a solitary task. Rather, people constantly collaborate with one another while searching and retrieving information. But the very cause of the popularity of multimedia data, the huge and different types of information a single data object can carry, makes their management a challenging task. Multimedia data is commonly represented as multidimensional feature vectors and carry high-level semantic information. These two characteristics make them very different from traditional alpha-numeric data. Thus, to try to manage them with frameworks and rationales designed for primitive alpha-numeric data, will be inefficient. An index structure is the backbone of any database management system. It has been seen that index structures present in existing relational database management frameworks cannot handle multimedia data effectively. Thus, in this dissertation, a generalized multidimensional index structure is proposed which accommodates the atypical multidimensional representation and the semantic information carried by different multimedia data seamlessly from within one single framework. Additionally, the dissertation investigates the evolving relationships among multimedia data in a collaborative environment and how such information can help to customize the design of the proposed index structure, when it is used to manage multimedia data in a shared environment. Extensive experiments were conducted to present the usability and better performance of the proposed framework over current state-of-art approaches.
Resumo:
Methods for accessing data on the Web have been the focus of active research over the past few years. In this thesis we propose a method for representing Web sites as data sources. We designed a Data Extractor data retrieval solution that allows us to define queries to Web sites and process resulting data sets. Data Extractor is being integrated into the MSemODB heterogeneous database management system. With its help database queries can be distributed over both local and Web data sources within MSemODB framework. ^ Data Extractor treats Web sites as data sources, controlling query execution and data retrieval. It works as an intermediary between the applications and the sites. Data Extractor utilizes a twofold “custom wrapper” approach for information retrieval. Wrappers for the majority of sites are easily built using a powerful and expressive scripting language, while complex cases are processed using Java-based wrappers that utilize specially designed library of data retrieval, parsing and Web access routines. In addition to wrapper development we thoroughly investigate issues associated with Web site selection, analysis and processing. ^ Data Extractor is designed to act as a data retrieval server, as well as an embedded data retrieval solution. We also use it to create mobile agents that are shipped over the Internet to the client's computer to perform data retrieval on behalf of the user. This approach allows Data Extractor to distribute and scale well. ^ This study confirms feasibility of building custom wrappers for Web sites. This approach provides accuracy of data retrieval, and power and flexibility in handling of complex cases. ^
Resumo:
The increasing emphasis on mass customization, shortened product lifecycles, synchronized supply chains, when coupled with advances in information system, is driving most firms towards make-to-order (MTO) operations. Increasing global competition, lower profit margins, and higher customer expectations force the MTO firms to plan its capacity by managing the effective demand. The goal of this research was to maximize the operational profits of a make-to-order operation by selectively accepting incoming customer orders and simultaneously allocating capacity for them at the sales stage. ^ For integrating the two decisions, a Mixed-Integer Linear Program (MILP) was formulated which can aid an operations manager in an MTO environment to select a set of potential customer orders such that all the selected orders are fulfilled by their deadline. The proposed model combines order acceptance/rejection decision with detailed scheduling. Experiments with the formulation indicate that for larger problem sizes, the computational time required to determine an optimal solution is prohibitive. This formulation inherits a block diagonal structure, and can be decomposed into one or more sub-problems (i.e. one sub-problem for each customer order) and a master problem by applying Dantzig-Wolfe’s decomposition principles. To efficiently solve the original MILP, an exact Branch-and-Price algorithm was successfully developed. Various approximation algorithms were developed to further improve the runtime. Experiments conducted unequivocally show the efficiency of these algorithms compared to a commercial optimization solver.^ The existing literature addresses the static order acceptance problem for a single machine environment having regular capacity with an objective to maximize profits and a penalty for tardiness. This dissertation has solved the order acceptance and capacity planning problem for a job shop environment with multiple resources. Both regular and overtime resources is considered. ^ The Branch-and-Price algorithms developed in this dissertation are faster and can be incorporated in a decision support system which can be used on a daily basis to help make intelligent decisions in a MTO operation.^
Resumo:
Background As the use of electronic health records (EHRs) becomes more widespread, so does the need to search and provide effective information discovery within them. Querying by keyword has emerged as one of the most effective paradigms for searching. Most work in this area is based on traditional Information Retrieval (IR) techniques, where each document is compared individually against the query. We compare the effectiveness of two fundamentally different techniques for keyword search of EHRs. Methods We built two ranking systems. The traditional BM25 system exploits the EHRs' content without regard to association among entities within. The Clinical ObjectRank (CO) system exploits the entities' associations in EHRs using an authority-flow algorithm to discover the most relevant entities. BM25 and CO were deployed on an EHR dataset of the cardiovascular division of Miami Children's Hospital. Using sequences of keywords as queries, sensitivity and specificity were measured by two physicians for a set of 11 queries related to congenital cardiac disease. Results Our pilot evaluation showed that CO outperforms BM25 in terms of sensitivity (65% vs. 38%) by 71% on average, while maintaining the specificity (64% vs. 61%). The evaluation was done by two physicians. Conclusions Authority-flow techniques can greatly improve the detection of relevant information in EHRs and hence deserve further study.
Resumo:
Construction projects are complex endeavors that require the involvement of different professional disciplines in order to meet various project objectives that are often conflicting. The level of complexity and the multi-objective nature of construction projects lend themselves to collaborative design and construction such as integrated project delivery (IPD), in which relevant disciplines work together during project conception, design and construction. Traditionally, the main objectives of construction projects have been to build in the least amount of time with the lowest cost possible, thus the inherent and well-established relationship between cost and time has been the focus of many studies. The importance of being able to effectively model relationships among multiple objectives in building construction has been emphasized in a wide range of research. In general, the trade-off relationship between time and cost is well understood and there is ample research on the subject. However, despite sustainable building designs, relationships between time and environmental impact, as well as cost and environmental impact, have not been fully investigated. The objectives of this research were mainly to analyze and identify relationships of time, cost, and environmental impact, in terms of CO2 emissions, at different levels of a building: material level, component level, and building level, at the pre-use phase, including manufacturing and construction, and the relationships of life cycle cost and life cycle CO2 emissions at the usage phase. Additionally, this research aimed to develop a robust simulation-based multi-objective decision-support tool, called SimulEICon, which took construction data uncertainty into account, and was capable of incorporating life cycle assessment information to the decision-making process. The findings of this research supported the trade-off relationship between time and cost at different building levels. Moreover, the time and CO2 emissions relationship presented trade-off behavior at the pre-use phase. The results of the relationship between cost and CO2 emissions were interestingly proportional at the pre-use phase. The same pattern continually presented after the construction to the usage phase. Understanding the relationships between those objectives is a key in successfully planning and designing environmentally sustainable construction projects.
Resumo:
The increasing amount of available semistructured data demands efficient mechanisms to store, process, and search an enormous corpus of data to encourage its global adoption. Current techniques to store semistructured documents either map them to relational databases, or use a combination of flat files and indexes. These two approaches result in a mismatch between the tree-structure of semistructured data and the access characteristics of the underlying storage devices. Furthermore, the inefficiency of XML parsing methods has slowed down the large-scale adoption of XML into actual system implementations. The recent development of lazy parsing techniques is a major step towards improving this situation, but lazy parsers still have significant drawbacks that undermine the massive adoption of XML. Once the processing (storage and parsing) issues for semistructured data have been addressed, another key challenge to leverage semistructured data is to perform effective information discovery on such data. Previous works have addressed this problem in a generic (i.e. domain independent) way, but this process can be improved if knowledge about the specific domain is taken into consideration. This dissertation had two general goals: The first goal was to devise novel techniques to efficiently store and process semistructured documents. This goal had two specific aims: We proposed a method for storing semistructured documents that maps the physical characteristics of the documents to the geometrical layout of hard drives. We developed a Double-Lazy Parser for semistructured documents which introduces lazy behavior in both the pre-parsing and progressive parsing phases of the standard Document Object Model's parsing mechanism. The second goal was to construct a user-friendly and efficient engine for performing Information Discovery over domain-specific semistructured documents. This goal also had two aims: We presented a framework that exploits the domain-specific knowledge to improve the quality of the information discovery process by incorporating domain ontologies. We also proposed meaningful evaluation metrics to compare the results of search systems over semistructured documents.
Resumo:
The increasing emphasis on mass customization, shortened product lifecycles, synchronized supply chains, when coupled with advances in information system, is driving most firms towards make-to-order (MTO) operations. Increasing global competition, lower profit margins, and higher customer expectations force the MTO firms to plan its capacity by managing the effective demand. The goal of this research was to maximize the operational profits of a make-to-order operation by selectively accepting incoming customer orders and simultaneously allocating capacity for them at the sales stage. For integrating the two decisions, a Mixed-Integer Linear Program (MILP) was formulated which can aid an operations manager in an MTO environment to select a set of potential customer orders such that all the selected orders are fulfilled by their deadline. The proposed model combines order acceptance/rejection decision with detailed scheduling. Experiments with the formulation indicate that for larger problem sizes, the computational time required to determine an optimal solution is prohibitive. This formulation inherits a block diagonal structure, and can be decomposed into one or more sub-problems (i.e. one sub-problem for each customer order) and a master problem by applying Dantzig-Wolfe’s decomposition principles. To efficiently solve the original MILP, an exact Branch-and-Price algorithm was successfully developed. Various approximation algorithms were developed to further improve the runtime. Experiments conducted unequivocally show the efficiency of these algorithms compared to a commercial optimization solver. The existing literature addresses the static order acceptance problem for a single machine environment having regular capacity with an objective to maximize profits and a penalty for tardiness. This dissertation has solved the order acceptance and capacity planning problem for a job shop environment with multiple resources. Both regular and overtime resources is considered. The Branch-and-Price algorithms developed in this dissertation are faster and can be incorporated in a decision support system which can be used on a daily basis to help make intelligent decisions in a MTO operation.
Resumo:
Methods for accessing data on the Web have been the focus of active research over the past few years. In this thesis we propose a method for representing Web sites as data sources. We designed a Data Extractor data retrieval solution that allows us to define queries to Web sites and process resulting data sets. Data Extractor is being integrated into the MSemODB heterogeneous database management system. With its help database queries can be distributed over both local and Web data sources within MSemODB framework. Data Extractor treats Web sites as data sources, controlling query execution and data retrieval. It works as an intermediary between the applications and the sites. Data Extractor utilizes a two-fold "custom wrapper" approach for information retrieval. Wrappers for the majority of sites are easily built using a powerful and expressive scripting language, while complex cases are processed using Java-based wrappers that utilize specially designed library of data retrieval, parsing and Web access routines. In addition to wrapper development we thoroughly investigate issues associated with Web site selection, analysis and processing. Data Extractor is designed to act as a data retrieval server, as well as an embedded data retrieval solution. We also use it to create mobile agents that are shipped over the Internet to the client's computer to perform data retrieval on behalf of the user. This approach allows Data Extractor to distribute and scale well. This study confirms feasibility of building custom wrappers for Web sites. This approach provides accuracy of data retrieval, and power and flexibility in handling of complex cases.
Resumo:
Our jury system is predicated upon the expectation that jurors engage in systematic processing when considering evidence and making decisions. They are instructed to interpret facts and apply the appropriate law in a fair, dispassionate manner, free of all bias, including that of emotion. However, emotions containing an element of certainty (e.g., anger and happiness, which require little cognitive effort in determining their source) can often lead people to engage in superficial, heuristic-based processing. Compare this to uncertain emotions (e.g., hope and fear, which require people to seek out explanations for their emotional arousal), which instead has the potential to lead them to engage in deeper, more systematic processing. The purpose of the current research is in part to confirm past research (Tiedens & Linton, 2001; Semmler & Brewer, 2002) that uncertain emotions (like fear) can influence decision-making towards a more systematic style of processing, whereas more certain emotional states (like anger) will lead to a more heuristic style of processing. Studies One, Two, and Three build upon this prior research with the goal of improving methodological rigor through the use of film clips to reliably induce emotions, with awareness of testimonial details serving as measures of processing style. The ultimate objective of the current research was to explore this effect in Study Four by inducing either fear, anger, or neutral emotion in mock jurors, half of whom then followed along with a trial transcript featuring eight testimonial inconsistencies, while the other participants followed along with an error-free version of the same transcript. Overall rates of detection for these inconsistencies was expected to be higher for the uncertain/fearful participants due to their more effortful processing compared to certain/angry participants. These expectations were not fulfilled, with significant main effects only for the transcript version (with or without inconsistencies) on overall inconsistency detection rates. There are a number of plausible explanations for these results, so further investigation is needed.