989 results for 290701 Mining Engineering
Abstract:
Recent advances in machine learning methods increasingly enable the automatic construction of various types of computer-assisted methods that have been difficult or laborious for human experts to program. The tasks for which these kinds of tools are needed arise in many areas, here especially in the fields of bioinformatics and natural language processing. Machine learning methods may not work satisfactorily if they are not appropriately tailored to the task in question. However, their learning performance can often be improved by taking advantage of deeper insight into the application domain or the learning problem at hand. This thesis considers developing kernel-based learning algorithms that incorporate this kind of prior knowledge of the task in question in an advantageous way. Moreover, computationally efficient algorithms for training the learning machines for specific tasks are presented. In the context of kernel-based learning methods, the incorporation of prior knowledge is often done by designing appropriate kernel functions. Another well-known way is to develop cost functions that fit the task under consideration. For disambiguation tasks in natural language, we develop kernel functions that take into account the positional information and the mutual similarities of words. It is shown that the use of this information significantly improves the disambiguation performance of the learning machine. Further, we design a new cost function that is better suited to the task of information retrieval and to more general ranking problems than the cost functions designed for regression and classification. We also consider other applications of the kernel-based learning algorithms, such as text categorization and pattern recognition in differential display. We develop computationally efficient algorithms for training the considered learning machines with the proposed kernel functions. 
We also design a fast cross-validation algorithm for regularized least-squares type of learning algorithm. Further, an efficient version of the regularized least-squares algorithm that can be used together with the new cost function for preference learning and ranking tasks is proposed. In summary, we demonstrate that the incorporation of prior knowledge is possible and beneficial, and novel advanced kernels and cost functions can be used in algorithms efficiently.
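As a rough illustration of why cross-validation can be made fast for regularized least squares, the sketch below uses the standard closed-form leave-one-out identity for kernel RLS and checks it against naive retraining. It is not the thesis's algorithm. The linear kernel, the toy data, the value of `lam`, and all function names are assumptions made for this example.

```python
def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0.0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]


def linear_kernel(u, v):
    """Illustrative kernel choice (an assumption, any valid kernel works)."""
    return sum(a * b for a, b in zip(u, v))


def rls_fit(X, y, lam):
    """Return dual coefficients a = G y and G = (K + lam*I)^-1."""
    n = len(X)
    K_reg = [[linear_kernel(X[i], X[j]) + (lam if i == j else 0.0)
              for j in range(n)] for i in range(n)]
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    G = solve(K_reg, identity)
    a = [sum(G[i][j] * y[j] for j in range(n)) for i in range(n)]
    return a, G


def loo_residuals_fast(X, y, lam):
    """Leave-one-out residuals from one fit: r_i = a_i / G_ii."""
    a, G = rls_fit(X, y, lam)
    return [a[i] / G[i][i] for i in range(len(X))]


def loo_residuals_naive(X, y, lam):
    """Leave-one-out residuals by retraining once per held-out example."""
    residuals = []
    for i in range(len(X)):
        X_tr = X[:i] + X[i + 1:]
        y_tr = y[:i] + y[i + 1:]
        a, _ = rls_fit(X_tr, y_tr, lam)
        pred = sum(a_j * linear_kernel(x_j, X[i]) for a_j, x_j in zip(a, X_tr))
        residuals.append(y[i] - pred)
    return residuals


# Toy data, invented for the example.
X = [[0.0, 1.0], [1.0, 0.5], [2.0, -1.0], [3.0, 0.3], [4.0, 2.0]]
y = [1.0, 1.4, 0.2, 2.9, 5.1]
fast = loo_residuals_fast(X, y, lam=0.5)
naive = loo_residuals_naive(X, y, lam=0.5)
print(max(abs(f - n) for f, n in zip(fast, naive)))  # agreement up to rounding
```

The shortcut needs a single matrix inversion, whereas the naive loop refits the model once per training example.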
Abstract:
Customer relationship management has been an essential part of marketing for over 20 years. Today’s business environment is fast-changing, international and highly competitive, and that is why the most important factor for long-term profitability is one-to-one customer relationships. However, managing relationships and serving profitable customers has always been challenging. In this thesis the objective was to define the main obstacles that the case company must overcome to succeed in CRM. Possible solutions have also been defined. The main elements of the implementation, i.e. people, processes and technologies, can clearly be found behind these obstacles and solutions. This thesis also presents theoretical information about CRM and is meant to act as a guidebook inside the organisation, spreading information about CRM to those who are not so familiar with the topic.
Abstract:
This book is one of eight IAEG XII Congress volumes and deals with landslide processes, including: field data and monitoring techniques, prediction and forecasting of landslide occurrence, regional landslide inventories and dating studies, modeling of slope instabilities and secondary hazards (e.g. impulse waves and landslide-induced tsunamis, landslide dam failures and breaching), hazard and risk assessment, earthquake- and rainfall-induced landslides, instabilities of volcanic edifices, remedial works and mitigation measures, development of innovative stabilization techniques and applicability to specific engineering geological conditions, use of geophysical techniques for landslide characterization and investigation of triggering mechanisms. Focus is given to innovative techniques, well-documented case studies in different environments, critical components of engineering geological and geotechnical investigations, hydrological and hydrogeological investigations, remote sensing and geophysical techniques, modeling of triggering, collapse, runout and landslide reactivation, geotechnical design and construction procedures in landslide zones, interaction of landslides with structures and infrastructures and possibility of domino effects. The Engineering Geology for Society and Territory volumes of the IAEG XII Congress, held in Torino from September 15-19, 2014, analyze the dynamic role of engineering geology in our changing world and build on the four main themes of the congress: environment, processes, issues, and approaches.
Abstract:
Biomedical research is currently facing a new type of challenge: an excess of information, both in terms of raw data from experiments and in the number of scientific publications describing their results. Mirroring the focus on data mining techniques to address the issues of structured data, there has recently been great interest in the development and application of text mining techniques to make more effective use of the knowledge contained in biomedical scientific publications, accessible only in the form of natural human language. This thesis describes research done in the broader scope of projects aiming to develop methods, tools and techniques for text mining tasks in general and for the biomedical domain in particular. More specifically, the work described here aims to extract information from statements concerning relations of biomedical entities, such as protein-protein interactions. The approach taken uses full parsing (syntactic analysis of the entire structure of sentences) and machine learning, aiming to develop reliable methods that can further be generalized to apply to other domains as well. The five papers at the core of this thesis describe research on a number of distinct but related topics in text mining. In the first of these studies, we assessed the applicability of two popular general English parsers to biomedical text mining and, finding their performance limited, identified several specific challenges to accurate parsing of domain text. In a follow-up study focusing on parsing issues related to specialized domain terminology, we evaluated three lexical adaptation methods. We found that the accurate resolution of unknown words can considerably improve parsing performance and introduced a domain-adapted parser that reduced the error rate of the original by 10% while also roughly halving parsing time. 
To establish the relative merits of parsers that differ in the applied formalisms and the representation given to their syntactic analyses, we have also developed evaluation methodology, considering different approaches to establishing comparable dependency-based evaluation results. We introduced a methodology for creating highly accurate conversions between different parse representations, demonstrating the feasibility of unifying diverse syntactic schemes under a shared, application-oriented representation. In addition to allowing formalism-neutral evaluation, we argue that such unification can also increase the value of parsers for domain text mining. As a further step in this direction, we analysed the characteristics of publicly available biomedical corpora annotated for protein-protein interactions and created tools for converting them into a shared form, thus also contributing to the unification of text mining resources. The introduced unified corpora allowed us to perform a task-oriented comparative evaluation of biomedical text mining corpora. This evaluation established clear limits on the comparability of results for text mining methods evaluated on different resources, prompting further efforts toward standardization. To support this and other research, we have also designed and annotated BioInfer, the first domain corpus of its size combining annotation of syntax and biomedical entities with a detailed annotation of their relationships. The corpus represents a major design and development effort of the research group, with manual annotation that identifies over 6000 entities, 2500 relationships and 28,000 syntactic dependencies in 1100 sentences. In addition to combining these key annotations for a single set of sentences, BioInfer was also the first domain resource to introduce a representation of entity relations that is supported by ontologies and able to capture complex, structured relationships. 
Part I of this thesis presents a summary of this research in the broader context of a text mining system, and Part II contains reprints of the five included publications.
Abstract:
A major challenge of cardiac tissue engineering is directing cells to establish the physiological structure and function of the myocardium being replaced. In the native heart, pacing cells generate electrical stimuli that spread throughout the heart, causing cell membrane depolarization and activation of the contractile apparatus. We sought to examine whether electrical stimulation of adipose tissue-derived progenitor cells (ATDPCs) exerts phenotypic and genetic changes that enhance their cardiomyogenic potential.
Abstract:
European educational institutions have the challenge and the commitment to enhance multilingual competence, and teaching curricular subjects in a foreign language is seen as one of the most promising alternatives. In that context, professors teaching different engineering subjects at the School of Engineering of the UPC at Manresa (EPSEM) have been involved in projects aiming at analyzing the current linguistic situation and developing some on-line open-access materials using CLIL as a strategy. They formed the u-Linguatech Research Group on Multilingual Communication in Science and Technology in order to provide such resources in an effective and efficient way. In this paper, we focus on students’ perception of the improvement of their multilingual competence throughout their Engineering degree, by means of subjects taught in English by non-native speakers. Data about the English level of current students are taken into account. We also describe the use of the above resources to improve the quality of learning in subjects related to Chemical Engineering curricula.
Abstract:
Objective To construct a Portuguese language index of information on the practice of diagnostic radiology in order to improve the standardization of the medical language and terminology. Materials and Methods A total of 61,461 definitive reports were collected from the database of the Radiology Information System at Hospital das Clínicas – Faculdade de Medicina de Ribeirão Preto (RIS/HCFMRP) as follows: 30,000 chest x-ray reports; 27,000 mammography reports; and 4,461 thyroid ultrasonography reports. The text mining technique was applied for the selection of terms, and the ANSI/NISO Z39.19-2005 standard was utilized to construct the index based on a thesaurus structure. The system was created in *html. Results The text mining resulted in a set of 358,236 (n = 100%) words. Out of this total, 76,347 (n = 21%) terms were selected to form the index. Such terms refer to anatomical pathology description, imaging techniques, equipment, type of study and some other composite terms. The index system was developed with 78,538 *html web pages. Conclusion The utilization of text mining on a radiological reports database has allowed the construction of a lexical system in Portuguese language consistent with the clinical practice in Radiology.
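The figures quoted in this abstract can be checked with a few lines of arithmetic. The snippet below is only a consistency check on the numbers as stated; the variable names are invented for the example.

```python
# Report counts as stated in the abstract.
reports = {
    "chest x-ray": 30_000,
    "mammography": 27_000,
    "thyroid ultrasonography": 4_461,
}
total_reports = sum(reports.values())

# Words obtained by text mining and terms selected for the index.
words_mined = 358_236
terms_selected = 76_347
selected_share = 100 * terms_selected / words_mined

print(total_reports)
print(f"{selected_share:.1f}%")
```

The report counts add up to the stated total of 61,461, and the selected terms come to roughly 21% of the mined words, in line with the abstract.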
Abstract:
The vast majority of users don’t look at results beyond the second page offered by the search engine, so if a site fails to be among the top 20 results (that is, within the first two pages), it is considered to have poor SEO and is therefore not visible to the user. The overall objective of this project is to conduct a study to discover the factors that determine (or not) the positioning of websites in a search engine.
Abstract:
This work presents an analysis of the assessment tools used by professors at the Universitat Politècnica de Catalunya to assess the generic competencies introduced in the Bachelor’s Degrees in Engineering. In order to conduct this study, a survey was designed and administered anonymously to a sample of the professors most receptive to educational innovation at their own university. In total, 80 professors responded to this survey, of whom 26% turned out to be members of the university’s own evaluation innovation group (https://www.upc.edu/rima/grups/grapa), GRAPA. This percentage represents 47% of the total GRAPA membership, meaning that nearly half of the professors most concerned about evaluation at the university chose to participate. The analysis of the variables, carried out using the statistical program SPSS v19, shows that for practically 49% of those surveyed, rubrics are the tools most commonly used to assess generic competencies integrated in more specific ones. Of those surveyed, 60% use them either frequently or always. The most frequently evaluated generic competencies were teamwork (28%), problem solving (26%), effective oral and written communication (24%) and autonomous learning (13%), all of which constitute commonly recognized competencies in the engineering profession. A two-dimensional crosstabs analysis with SPSS v19 shows a significant correlation (Asymp. Sig. 0.001) between the type of tool used and the competencies assessed. However, no significant correlation was found between the type of assessment tool used and the type of subject, type of evaluation (formative or summative), frequency of feedback given to the students or the degree of student satisfaction, and thus none of these variables can be considered to have an influence on the kind of assessment tool used. In addition, the results also indicate that there are no significant differences between the instructors belonging to GRAPA and the rest of those surveyed.
Abstract:
Teachers of the course Introduction to Mathematics for Engineers at the UOC, an online distance-learning university, have designed, developed and tested online study material. It includes basic pre-university mathematics, indications for correct follow-up of this content and recommendations for finding appropriate support and complementary materials. Many different resources are used, depending on the characteristics of the contents: Flash sequences, interactive applets, WIRIS calculators and PDF files. During the last semester, the new study material was tested with 119 students. The academic results and student satisfaction have allowed us to outline and prioritise future lines of action.
Abstract:
Teachers of the course Introduction to Mathematics for Engineers at the UOC, an online distance-learning university, have designed and produced online study material which includes basic pre-university mathematics, instructions for correct follow-up of this content and recommendations for finding appropriate support and complementary materials.
Abstract:
In this thesis we study the field of opinion mining by giving a comprehensive review of the available research that has been done on this topic. Using this knowledge, we also present a case study of a multilevel opinion mining system for a student organization's sales management system. We describe the field of opinion mining by discussing its historical roots, its motivations and applications, as well as the different scientific approaches that have been used to solve the challenging problem of mining opinions. To deal with this huge subfield of natural language processing, we first give an abstraction of the problem of opinion mining and describe the theoretical frameworks that are available for dealing with appraisal language. Then we discuss the relation between opinion mining and computational linguistics, which is a crucial pre-processing step for the accuracy of the subsequent steps of opinion mining. The second part of our thesis deals with the semantics of opinions, where we describe the different ways used to collect lists of opinion words as well as the methods and techniques available for extracting knowledge from opinions present in unstructured textual data. In the part about collecting lists of opinion words we describe manual, semi-manual and automatic ways to do so and give a review of the available lists that are used as gold standards in opinion mining research. For the methods and techniques of opinion mining we divide the task into three levels: the document, sentence and feature levels. The techniques presented at the document and sentence levels are divided into supervised and unsupervised approaches that are used to determine the subjectivity and polarity of texts and sentences at these levels of analysis. At the feature level we give a description of the techniques available for finding the opinion targets, the polarity of the opinions about these opinion targets and the opinion holders. 
Also at the feature level we discuss the various ways to summarize and visualize the results of this level of analysis. In the third part of our thesis we present a case study of a sales management system that uses free-form text and that can benefit from an opinion mining system. Using the knowledge gathered in the review of this field, we provide a theoretical multilevel opinion mining system (MLOM) that can perform most of the tasks needed from an opinion mining system. Based on the previous research, we give some indications that such a system can ease many of the laborious market research tasks done by the sales force that uses this sales management system, improving their insight into their partners and thereby increasing the quality of their sales services and their overall results.
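To make the unsupervised, word-list-based approach at the sentence and document levels more concrete, here is a minimal sketch. It is not the MLOM system described in the thesis. The tiny lexicon, the negation list, the scoring rule, and the function names are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical opinion-word list; real systems use much larger lexicons.
LEXICON = {"good": 1, "great": 1, "helpful": 1, "bad": -1, "slow": -1, "poor": -1}
NEGATIONS = {"not", "never", "no"}


def sentence_polarity(sentence):
    """Sum opinion-word scores, flipping the sign of the next opinion word
    that follows a negation."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    score = 0
    negate = False
    for tok in tokens:
        if tok in NEGATIONS:
            negate = True
            continue
        if tok in LEXICON:
            value = LEXICON[tok]
            score += -value if negate else value
            negate = False  # the negation is consumed by this opinion word
    return score


def document_polarity(text):
    """Aggregate sentence-level scores into a document-level score."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return sum(sentence_polarity(s) for s in sentences)


print(sentence_polarity("The partner was not helpful"))
print(document_polarity("Great meeting. The reply was slow and not good."))
```

A positive score suggests positive polarity and a negative score negative polarity. Feature-level analysis would additionally have to link each opinion word to its target, which this sketch does not attempt.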