30 results for rule mining, closed sequential patterns

at Universidade do Minho


Relevance:

30.00%

Publisher:

Abstract:

Earthworks aim at levelling the ground surface at a target construction area and precede any kind of structural construction (e.g., road and railway construction). They comprise sequential tasks, such as excavation, transportation, spreading and compaction, and rely strongly on heavy mechanical equipment and repetitive processes. In this context, it is essential to optimize the usage of all available resources under two key criteria: the cost and the duration of earthwork projects. In this paper, we present an integrated system that uses two artificial-intelligence-based techniques: data mining and evolutionary multi-objective optimization. The former is used to build data-driven models capable of providing realistic estimates of resource productivity, while the latter is used to optimize resource allocation considering the two main earthwork objectives (duration and cost). Experiments using real-world data from a construction site show that the proposed system is competitive with current manual earthwork design.
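As a rough illustration of the two-objective setting described in this abstract, the sketch below filters candidate resource allocations down to the non-dominated (Pareto) set for cost and duration. It is not the authors' system: the function names and the (cost, duration) pairs are hypothetical, and a real evolutionary optimizer would generate and refine the candidates rather than take them as given.

```python
from typing import List, Tuple

Plan = Tuple[float, float]  # (cost, duration); both are to be minimized


def dominates(a: Plan, b: Plan) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def pareto_front(plans: List[Plan]) -> List[Plan]:
    """Keep only the plans that no other plan dominates."""
    return [p for p in plans if not any(dominates(q, p) for q in plans)]


# Hypothetical (cost, duration) pairs for candidate allocations; not real project data.
candidates = [(100.0, 30.0), (120.0, 25.0), (110.0, 35.0), (150.0, 20.0), (130.0, 25.0)]
front = pareto_front(candidates)
print(front)
```

The surviving plans are the trade-offs a designer would choose among: none of them can be made cheaper without becoming slower, or faster without becoming more expensive.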

Relevance:

30.00%

Publisher:

Abstract:

Rockburst is characterized by a violent explosion of a block causing a sudden rupture in the rock and is quite common in deep tunnels. It is critical to understand the phenomenon of rockburst, focusing on its patterns of occurrence, so that these events can be avoided and/or managed, saving costs and possibly lives. The failure mechanism of rockburst needs to be better understood. Laboratory experiments are under way at the Laboratory for Geomechanics and Deep Underground Engineering (SKLGDUE) of Beijing, and the system is described. A large number of rockburst tests were performed, and their information was collected, stored in a database and analyzed. Data Mining (DM) techniques were applied to the database in order to develop predictive models for the rockburst maximum stress (σRB) and the rockburst risk index (IRB), parameters that require the results of such tests to be determined. With the developed models it is possible to predict these parameters with high accuracy using data on the rock mass and the specific project.

Relevance:

20.00%

Publisher:

Abstract:

Hospitals nowadays collect vast amounts of data related to patient records. All these data hold valuable knowledge that can be used to improve hospital decision making. Data mining techniques aim precisely at the extraction of useful knowledge from raw data. This work describes an implementation of a medical data mining project based on the CRISP-DM methodology. Recent real-world data, from 2000 to 2013, related to inpatient hospitalization were collected from a Portuguese hospital. The goal was to predict generic hospital Length Of Stay based on indicators that are commonly available at the hospitalization process (e.g., gender, age, episode type, medical specialty). At the data preparation stage, the data were cleaned and variables were selected and transformed, leading to 14 inputs. Next, at the modeling stage, a regression approach was adopted, in which six learning methods were compared: Average Prediction, Multiple Regression, Decision Tree, Artificial Neural Network ensemble, Support Vector Machine and Random Forest. The best learning model was obtained by the Random Forest method, which achieved a high coefficient of determination (0.81). This model was then opened using a sensitivity analysis procedure that revealed three influential input attributes: the hospital episode type, the physical service where the patient is hospitalized and the associated medical specialty. Such extracted knowledge confirmed that the obtained predictive model is credible and has potential value for supporting the decisions of hospital managers.
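For readers unfamiliar with the metric reported in this abstract, the following is a minimal sketch of the coefficient of determination, R² = 1 - SS_res / SS_tot. It also shows why an average-prediction baseline scores exactly zero. The function name and the length-of-stay values are hypothetical and are not taken from the study.

```python
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    m = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - m) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical lengths of stay in days; illustrative values only.
observed = [2.0, 4.0, 6.0, 8.0]
baseline = [mean(observed)] * len(observed)  # always predict the average
model = [3.0, 4.0, 6.0, 7.0]                 # a made-up model's predictions

print(r_squared(observed, baseline))
print(r_squared(observed, model))
```

Under this definition, a value of 0.81 means the model removes roughly 81% of the squared error that the average-prediction baseline would leave.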

Relevance:

20.00%

Publisher:

Abstract:

This paper aims to describe the Sequential Excavation Method, used for excavation in underground works, as well as the related risks and preventive measures. This method has characteristics that differentiate it from other tunnelling techniques: it uses a larger number of workers and equipment; it has a high concurrency of tasks, with various workers and equipment quite exposed to hazards; and it uses many potentially aggressive chemicals. First, a broad overview of this issue is given. Then, the results of a survey of a sample of experienced technicians are presented. The survey aimed at gauging the relevance of a set of guidelines relating to the design and work phases, applicable to the domestic market and prepared following technical visits to works abroad.

Relevance:

20.00%

Publisher:

Abstract:

telligence applications for the banking industry. Searches were performed in relevant journals, resulting in 219 articles published between 2002 and 2013. To analyze such a large number of manuscripts, text mining techniques were used in pursuit of relevant terms in both the business intelligence and banking domains. Moreover, latent Dirichlet allocation modeling was used in order to group articles into several relevant topics. The analysis was conducted using a dictionary of terms belonging to both the banking and business intelligence domains. This procedure allowed the identification of relationships between terms and the topics grouping articles, enabling hypotheses regarding research directions to emerge. To confirm these hypotheses, relevant articles were collected and scrutinized, which validated the text mining procedure. The results show that credit in banking is clearly the main application trend, particularly predicting risk and thus supporting credit approval or denial. There is also a relevant interest in bankruptcy and fraud prediction. Customer retention seems to be associated, although weakly, with targeting, justifying bank offers to reduce churn. In addition, a large number of articles focused more on business intelligence techniques and their applications, using the banking industry just for evaluation and thus not clearly claiming benefits for the banking business. By identifying these current research topics, this study also highlights opportunities for future research.

Relevance:

20.00%

Publisher:

Abstract:

We present a study on human mobility at small spatial scales. Unlike large-scale mobility, recently studied through dollar-bill tracking and mobile phone data sets within one big country or continent, we report Brownian features of human mobility at smaller scales. In particular, the scaling exponent found at the smallest scales is typically close to one-half, unlike the larger values of the exponent characterizing mobility at larger scales. We carefully analyze 12 months of data from the Eduroam database within the Portuguese University of Minho. A full procedure is introduced with the aim of properly characterizing human mobility within the network of access points composing the wireless system of the university. In particular, measures of flux are introduced for estimating a distance between access points. This distance is typically non-Euclidean, since the spatial constraints at such small scales distort the continuum space on which human mobility occurs. Since two different exponents are found depending on the scale at which human motion takes place, we raise the question of the scale at which the transition from Brownian to non-Brownian motion occurs. In this context, we discuss how the numerical approach can be extended to larger scales, using the full Eduroam in Europe and in Asia, for uncovering the transition between both dynamical regimes.
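The "one-half" exponent mentioned in this abstract is the signature of Brownian motion: the root-mean-square displacement grows as t^(1/2). The sketch below checks this on a simulated one-dimensional random walk by fitting a log-log slope. It is a generic textbook illustration, not the authors' flux-based procedure, and all names and parameters are assumptions.

```python
import math
import random


def rms_displacement(n_walkers, n_steps, times, seed=0):
    """Root-mean-square displacement of simple +/-1 random walks at the given times."""
    rng = random.Random(seed)
    sq_sums = {t: 0.0 for t in times}
    for _ in range(n_walkers):
        x = 0
        for t in range(1, n_steps + 1):
            x += rng.choice((-1, 1))
            if t in sq_sums:
                sq_sums[t] += x * x
    return {t: math.sqrt(s / n_walkers) for t, s in sq_sums.items()}


def fit_exponent(rms):
    """Least-squares slope of log(rms displacement) against log(time)."""
    xs = [math.log(t) for t in rms]
    ys = [math.log(r) for r in rms.values()]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


rms = rms_displacement(n_walkers=500, n_steps=256, times=[4, 16, 64, 256])
alpha = fit_exponent(rms)
print(alpha)  # expected to be near 0.5 for Brownian motion
```

An exponent clearly above one-half on the same kind of plot would indicate the non-Brownian regime that the abstract associates with larger scales.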

Relevance:

20.00%

Publisher:

Abstract:

During the last few years, many research efforts have been made to improve the design of ETL (Extract-Transform-Load) systems. ETL systems are considered very time-consuming, error-prone and complex, involving several participants from different knowledge domains. ETL processes are among the most important components of a data warehousing system and are strongly influenced by the complexity of business requirements and by their change and evolution. These aspects influence not only the structure of a data warehouse but also the structures of the data sources involved. To minimize the negative impact of such variables, we propose the use of ETL patterns to build specific ETL packages. In this paper, we formalize this approach using BPMN (Business Process Model and Notation) for modelling more conceptual ETL workflows, mapping them to real execution primitives through a domain-specific language that allows the generation of specific instances that can be executed in a commercial ETL tool.

Relevance:

20.00%

Publisher:

Abstract:

Worldwide, around 9% of children are born with less than 37 weeks of gestation, causing risk to the premature child, who is not prepared to develop a number of basic functions that begin soon after birth. In order to ensure that these at-risk pregnancies are properly monitored by obstetricians in time to avoid those problems, Data Mining (DM) models were induced in this study to predict preterm births in a real environment, using data from 3376 patients (women) admitted to the maternal and perinatal care unit of Centro Hospitalar of Oporto. A sensitive metric to predict preterm deliveries was developed, assisting physicians in the decision-making process regarding the patients' observation. Promising results were obtained, with sensitivity and specificity values of 96% and 98%, respectively.
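The two metrics reported in this abstract come directly from a binary confusion matrix: sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). A minimal sketch follows; the function name and the counts are hypothetical and do not reproduce the study's data.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp: positive cases correctly flagged      fn: positive cases missed
    tn: negative cases correctly cleared      fp: negative cases wrongly flagged
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts for illustration only.
sens, spec = sensitivity_specificity(tp=45, fn=5, tn=180, fp=20)
print(sens, spec)
```

In a screening setting such as preterm birth prediction, sensitivity is the share of actual preterm cases the model catches, while specificity is the share of term pregnancies it correctly leaves unflagged.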

Relevance:

20.00%

Publisher:

Abstract:

Lecture Notes in Computer Science, 9273

Relevance:

20.00%

Publisher:

Abstract:

In Maternity Care, a quick decision has to be made about the most suitable delivery type for the current patient. Physicians follow guidelines to support that decision; however, those practice recommendations are limited and underused. In recent years, caesarean delivery has been pursued in over 28% of pregnancies, and other operative techniques addressing specific problems have also been excessively employed. This study identifies obstetric and pregnancy factors that can be used to predict the most appropriate delivery technique, through the induction of data mining models using real data gathered in the perinatal and maternal care unit of Centro Hospitalar of Oporto (CHP). Predicting the type of birth envisions high-quality services and increased safety and effectiveness of specific practices, helping to guide maternity care decisions and facilitate optimal outcomes for mother and child. In this work it was possible to obtain good results, with sensitivity and specificity values of 90.11% and 80.05%, respectively, providing the CHP with a model capable of correctly identifying caesarean sections and vaginal deliveries.

Relevance:

20.00%

Publisher:

Abstract:

Doctoral thesis in Child Studies (specialization in Visual Communication and Plastic Expression)

Relevance:

20.00%

Publisher:

Abstract:

Doctoral thesis, Industrial and Systems Engineering branch

Relevance:

20.00%

Publisher:

Abstract:

Transcriptional Regulatory Networks (TRNs) are a powerful tool for representing several interactions that occur within a cell. Recent studies have provided information to help researchers in the tasks of building and understanding these networks. One of the major sources of information for building TRNs is the biomedical literature. However, due to the rapidly increasing number of scientific papers, it is quite difficult to analyse the large number of papers that have been published on this subject. This fact has heightened the importance of Biomedical Text Mining approaches in this task. Also, owing to the lack of adequate standards, inconsistencies concerning gene and protein names and identifiers become common as the number of databases increases. In this work, we developed an integrated approach for the reconstruction of TRNs that retrieves the relevant information from important biological databases and inserts it into a single repository, named KREN. We also applied text mining techniques over this integrated repository to build TRNs. To do so, it was necessary to create a dictionary of names and synonyms associated with these entities and to develop an approach that retrieves all the abstracts of the related scientific papers stored in PubMed, in order to create a corpus of data about genes. Furthermore, these tasks were integrated into @Note, a software system that allows the use of some methods from the Biomedical Text Mining field, including algorithms for Named Entity Recognition (NER), extraction of all relevant terms from publication abstracts, and extraction of relationships between biological entities (genes, proteins and transcription factors). Finally, we extended this tool to allow the reconstruction of Transcriptional Regulatory Networks from the scientific literature.

Relevance:

20.00%

Publisher:

Abstract:

Wild boar (Sus scrofa) and red deer (Cervus elaphus) are the main maintenance hosts for bovine tuberculosis (bTB) in continental Europe. Understanding Mycobacterium tuberculosis complex (MTC) excretion routes is crucial to define strategies to control bTB in free-ranging populations; nevertheless, available information is scarce. To fill this gap, four different MTC excretion routes (oronasal, bronchial-alveolar, fecal and urinary) were investigated by molecular methods in naturally infected, hunter-harvested wild boar and red deer. In addition, MTC concentrations were estimated by the Most Probable Number method. MTC DNA was amplified in all types of excretion routes. MTC DNA was amplified in at least one excretion route from 83.0% (CI95 70.8-90.8) of wild ungulates with bTB-like lesions. Oronasal or bronchial-alveolar shedding was detected with higher frequency than fecal shedding (p < 0.001). The majority of shedders yielded MTC concentrations <10^3 CFU/g or mL. However, among those ungulates from which oronasal, bronchial-alveolar and fecal samples were available, 28.2% of wild boar (CI95 16.6-43.8) and 35.7% of red deer (CI95 16.3-61.2) yielded MTC concentrations >10^3 CFU/g or mL (referred to here as super-shedders). Red deer have a significantly higher risk of being super-shedders compared to wild boar (OR = 11.8, CI95 2.3-60.2). The existence of super-shedders among the naturally infected population of wild boar and red deer is thus reported here for the first time, and MTC DNA concentrations greater than the minimum infective doses were estimated in excretion samples from both species.