860 results for conceptual data modelling
Abstract:
A complete workflow specification requires careful integration of many different process characteristics. Decisions must be made as to the definitions of individual activities, their scope, the order of execution that maintains the overall business process logic, the rules governing the discipline of work list scheduling to performers, identification of time constraints and more. The goal of this paper is to address an important issue in workflow modelling and specification: data flow, and its modelling, specification and validation. Researchers have neglected this dimension of process analysis for some time, mainly focussing on structural considerations with limited verification checks. In this paper, we identify and justify the importance of data modelling in overall workflow specification and verification. We illustrate and define several potential data flow problems that, if not detected prior to workflow deployment, may prevent the process from executing correctly, cause it to execute on inconsistent data or even lead to process suspension. A discussion of the essential requirements that the workflow data model must meet in order to support data validation is also given.
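As a rough illustration of the kind of check the abstract argues for, the sketch below scans a purely sequential workflow for two data flow anomalies before deployment. The abstract does not name the problems it defines, so the two categories used here ("missing" for an item read before anything produced it, "redundant" for an item written but never used) are assumptions, as are the `Activity` class, the `check_data_flow` function and the order-handling example.

```python
from dataclasses import dataclass, field


@dataclass
class Activity:
    """One workflow step with the data items it consumes and produces."""
    name: str
    reads: set = field(default_factory=set)
    writes: set = field(default_factory=set)


def check_data_flow(activities, initial_data=(), final_outputs=()):
    """Return (kind, activity, item) tuples for a sequential workflow.

    Hypothetical categories, chosen for illustration only:
      "missing"   - an activity reads an item nothing has produced yet
      "redundant" - an item is written, never read, and not a final output
    """
    available = set(initial_data)   # items produced so far
    unread = {}                     # item -> last writer, until someone reads it
    problems = []
    for act in activities:
        for item in sorted(act.reads):
            if item not in available:
                problems.append(("missing", act.name, item))
            unread.pop(item, None)
        for item in sorted(act.writes):
            available.add(item)
            unread[item] = act.name
    for item, writer in sorted(unread.items()):
        if item not in final_outputs:
            problems.append(("redundant", writer, item))
    return problems


# Hypothetical three-step process used only to exercise the checker.
workflow = [
    Activity("receive_order", writes={"order"}),
    Activity("check_credit", reads={"order", "customer_rating"}, writes={"approval"}),
    Activity("ship", reads={"order"}, writes={"tracking_id"}),
]
problems = check_data_flow(workflow, final_outputs={"tracking_id"})
for kind, activity, item in problems:
    print(kind, activity, item)
```

A real workflow model would also have to handle branching, parallel paths and loops, which this linear scan deliberately ignores.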
Abstract:
The design and implementation of data bases involve, firstly, the formulation of a conceptual data model by systematic analysis of the structure and information requirements of the organisation for which the system is being designed; secondly, the logical mapping of this conceptual model onto the data structure of the target data base management system (DBMS); and thirdly, the physical mapping of this structured model into storage structures of the target DBMS. The accuracy of both the logical and the physical mapping determines the performance of the resulting systems. This thesis describes research which develops software tools to facilitate the implementation of data bases. A conceptual model describing the information structure of a hospital is derived using the Entity-Relationship (E-R) approach and this model forms the basis for mapping onto the logical model. Rules are derived for automatically mapping the conceptual model onto relational and CODASYL types of data structures. Further algorithms are developed for partly automating the implementation of these models on INGRES, MIMER and VAX-11 DBMS.
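To make the idea of rule-based mapping from an E-R model to a relational schema concrete, here is a minimal sketch. The thesis's actual rules are not given in the abstract; the two rules below are the common textbook ones (a 1:N relationship becomes a foreign key on the "many" side, an M:N relationship becomes its own table). The hospital entities, attribute names and the `er_to_relational` function are hypothetical, and SQLite stands in for the DBMSs named above.

```python
import sqlite3


def er_to_relational(entities, relationships):
    """Map a small E-R description onto CREATE TABLE statements."""
    tables = {}
    for name, spec in entities.items():
        tables[name] = {
            "columns": [f"{attr} TEXT" for attr in spec["attributes"]],
            "constraints": [f"PRIMARY KEY ({spec['key']})"],
        }
    for rel in relationships:
        left, right = rel["left"], rel["right"]
        lkey, rkey = entities[left]["key"], entities[right]["key"]
        if rel["kind"] == "1:N":
            # Rule 1: foreign key on the "many" side (right) pointing at left.
            fk = f"{left}_{lkey}"
            tables[right]["columns"].append(f"{fk} TEXT")
            tables[right]["constraints"].append(
                f"FOREIGN KEY ({fk}) REFERENCES {left}({lkey})")
        elif rel["kind"] == "M:N":
            # Rule 2: a separate table keyed by both participants.
            lfk, rfk = f"{left}_{lkey}", f"{right}_{rkey}"
            tables[rel["name"]] = {
                "columns": [f"{lfk} TEXT", f"{rfk} TEXT"],
                "constraints": [
                    f"PRIMARY KEY ({lfk}, {rfk})",
                    f"FOREIGN KEY ({lfk}) REFERENCES {left}({lkey})",
                    f"FOREIGN KEY ({rfk}) REFERENCES {right}({rkey})",
                ],
            }
        else:
            raise ValueError(f"unsupported relationship kind: {rel['kind']}")
    return [
        f"CREATE TABLE {name} ({', '.join(t['columns'] + t['constraints'])});"
        for name, t in tables.items()
    ]


# Hypothetical fragment of a hospital model.
entities = {
    "ward": {"key": "ward_id", "attributes": ["ward_id", "name"]},
    "patient": {"key": "patient_id", "attributes": ["patient_id", "surname"]},
    "doctor": {"key": "doctor_id", "attributes": ["doctor_id", "specialty"]},
}
relationships = [
    {"name": "admits", "kind": "1:N", "left": "ward", "right": "patient"},
    {"name": "treats", "kind": "M:N", "left": "doctor", "right": "patient"},
]
ddl = er_to_relational(entities, relationships)

# Run the generated statements against an in-memory database as a syntax check.
conn = sqlite3.connect(":memory:")
for statement in ddl:
    conn.execute(statement)
    print(statement)
```

Mapping onto a CODASYL (network) structure would need different rules, with record types and owner/member sets instead of foreign keys, and is not attempted here.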
Abstract:
The algorithmic approach to data modelling has developed rapidly in recent years; in particular, methods based on data mining and machine learning have been used in a growing number of applications. These methods follow a data-driven methodology, aiming at providing the best possible generalization and predictive abilities instead of concentrating on the properties of the data model. One of the most successful groups of such methods is known as Support Vector algorithms. Following the fruitful developments in applying Support Vector algorithms to spatial data, this paper introduces a new extension of the traditional support vector regression (SVR) algorithm. This extension allows for the simultaneous modelling of environmental data at several spatial scales. The joint influence of environmental processes presenting different patterns at different scales is here learned automatically from data, providing the optimum mixture of short-scale and large-scale models. The method is adaptive to the spatial scale of the data. With this advantage, it can provide an efficient means to model local anomalies that may typically arise in the early phase of an environmental emergency. However, the proposed approach still requires some prior knowledge of the possible existence of such short-scale patterns. This is a possible limitation of the method for its implementation in early warning systems. The purpose of this paper is to present the multi-scale SVR model and to illustrate its use with an application to the mapping of Cs-137 activity given the measurements taken in the region of Briansk following the Chernobyl accident.
Abstract:
A unified approach is proposed for sparse kernel data modelling that includes regression and classification as well as probability density function estimation. The orthogonal-least-squares forward selection method based on the leave-one-out test criteria is presented within this unified data-modelling framework to construct sparse kernel models that generalise well. Examples from regression, classification and density estimation applications are used to illustrate the effectiveness of this generic sparse kernel data modelling approach.
Abstract:
A basic principle in data modelling is to incorporate available a priori information regarding the underlying data generating mechanism into the modelling process. We adopt this principle and consider grey-box radial basis function (RBF) modelling capable of incorporating prior knowledge. Specifically, we show how to explicitly incorporate two types of prior knowledge: the underlying data generating mechanism exhibits a known symmetric property, and the underlying process obeys a set of given boundary value constraints. The class of orthogonal least squares regression algorithms can readily be applied to construct parsimonious grey-box RBF models with enhanced generalisation capability.
Abstract:
In this paper, Prony's method is applied to time-domain waveform data modelling in the presence of noise. Three problems encountered in this work are studied: (1) determination of the order of the waveform; (2) determination of the number of multiple roots; (3) determination of the residues. Methods for solving these problems are given and simulated on a computer. Finally, an output pulse of a model PG-10N signal generator, and the distorted waveform obtained by transmitting that pulse through a length of coaxial cable, are modelled, and satisfactory results are obtained. The effectiveness of Prony's method for waveform data modelling in the presence of noise is thus confirmed.
Abstract:
A fundamental principle in data modelling is to incorporate available a priori information regarding the underlying data generating mechanism into the modelling process. We adopt this principle and consider grey-box radial basis function (RBF) modelling capable of incorporating prior knowledge. Specifically, we show how to explicitly incorporate two types of prior knowledge: (i) the underlying data generating mechanism exhibits a known symmetric property, and (ii) the underlying process obeys a set of given boundary value constraints. The class of efficient orthogonal least squares regression algorithms can readily be applied without any modification to construct parsimonious grey-box RBF models with enhanced generalisation capability.
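One way to picture how a known symmetry can be built into an RBF model is to make each basis function itself symmetric, so that every weighted sum of them inherits the property. The sketch below assumes a one-dimensional input, Gaussian nodes and even symmetry f(x) = f(-x); none of these specifics are stated in the abstract, and the centres, weights and function names are hypothetical. The weights are fixed by hand here, whereas the paper fits them with orthogonal least squares.

```python
import math


def gaussian(x, centre, width):
    """Standard Gaussian RBF node."""
    return math.exp(-((x - centre) ** 2) / (2.0 * width ** 2))


def symmetric_basis(x, centre, width):
    """Grey-box node that responds identically to x and -x."""
    return gaussian(x, centre, width) + gaussian(-x, centre, width)


def model(x, centres, weights, width=1.0):
    """Weighted sum of symmetric nodes; even in x for any choice of weights."""
    return sum(w * symmetric_basis(x, c, width) for c, w in zip(centres, weights))


# Hypothetical centres and weights, chosen only to exercise the model.
centres = [0.5, 1.5, 3.0]
weights = [0.8, -0.3, 1.2]

for x in (0.25, 1.0, 2.7):
    # The symmetry holds by construction, not because of the fitted values.
    assert math.isclose(model(x, centres, weights), model(-x, centres, weights))
    print(x, model(x, centres, weights))
```

Because the constraint lives in the basis functions, a linear-in-the-weights fitting procedure can be applied to them unchanged, which is consistent with the abstract's remark that the regression algorithms need no modification.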
Abstract:
Climate model projections show that climate change will further increase the risk of flooding in many regions of the world. There is a need for climate adaptation, but building new infrastructure or additional retention basins has its limits, especially in densely populated areas where open spaces are limited. Another solution is the more efficient use of the existing infrastructure. This research investigates a method for real-time flood control by means of existing gated weirs and retention basins. The method was tested for the specific study area of the Demer basin in Belgium but is generally applicable. Today, retention basins along the Demer River are controlled by means of adjustable gated weirs based on fixed logic rules. However, because of the high complexity of the system, only suboptimal results are achieved by these rules. By making use of precipitation forecasts and combined hydrological-hydraulic river models, the state of the river network can be predicted. To speed up the calculations, a conceptual river model was used. The conceptual model was combined with a Model Predictive Control (MPC) algorithm and a Genetic Algorithm (GA). The MPC algorithm predicts the state of the river network depending on the positions of the adjustable weirs in the basin. The GA generates these positions in a semi-random way. Cost functions, based on water levels, were introduced to evaluate the efficiency of each generation, based on flood damage minimization. In the final phase of this research the influence of the most important MPC and GA parameters was investigated by means of a sensitivity study. The results show that the MPC-GA algorithm manages to reduce the total flood volume during the historical event of September 1998 by 46% in comparison with the current regulation. Based on the MPC-GA results, some recommendations could be formulated to improve the logic rules.
Abstract:
The speculation that climate change may impact on sustainable fish production suggests a need to understand how these effects influence fish catch on a broad scale. With a gross annual value of A$2.2 billion, the fishing industry is a significant primary industry in Australia. Many commercially important fish species use estuarine habitats such as mangroves, tidal flats and seagrass beds as nurseries or breeding grounds and have lifecycles correlated to rainfall and temperature patterns. Correlation of catches of mullet (e.g. Mugil cephalus) and barramundi (Lates calcarifer) with rainfall suggests that fisheries may be sensitive to effects of climate change. This work reviews key commercial fish and crustacean species and their link to estuaries and climate parameters. A conceptual model demonstrates ecological and biophysical links of estuarine habitats that influence capture fisheries production. The difficulty involved in explaining the effect of climate change on fisheries arising from the lack of ecological knowledge may be overcome by relating climate parameters to long-term fish catch data. Catch per unit effort (CPUE), rainfall, the Southern Oscillation Index (SOI) and catch time series for specific combinations of climate seasons and regions have been explored and surplus production models applied to Queensland's commercial fish catch data with the program CLIMPROD. Results indicate that up to 30% of Queensland's total fish catch and up to 80% of the barramundi catch variation for specific regions can be explained by rainfall, often with a lagged response to rainfall events. Our approach allows an evaluation of the economic consequences of climate parameters on estuarine fisheries, thus highlighting the need to develop forecast models and manage estuaries for future climate change impact by adjusting the quota for climate change sensitive species. Different modelling approaches are discussed with respect to their forecast ability. (c) 2006 Elsevier Ltd. All rights reserved.
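The phrase "explained by rainfall, often with a lagged response" can be illustrated with a lagged correlation: shift the catch series against the rainfall series and see which lag gives the highest squared correlation. This is only a simplified stand-in, not the surplus production modelling or the CLIMPROD program used in the study, and the two short series below are invented for the example (the catch values were constructed to follow rainfall one year later).

```python
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def lagged_r2(driver, response, lag):
    """Share of variance in `response` linearly explained by `driver` `lag` steps earlier."""
    x = driver[: len(driver) - lag]
    y = response[lag:]
    return pearson_r(x, y) ** 2


# Invented annual series (rainfall in mm, catch in tonnes), for illustration only.
rainfall = [900, 1200, 700, 1500, 1000, 800, 1300, 1100]
catch = [140, 142, 167, 121, 198, 153, 129, 182]

r2_by_lag = {lag: lagged_r2(rainfall, catch, lag) for lag in range(3)}
best_lag = max(r2_by_lag, key=r2_by_lag.get)
for lag, r2 in r2_by_lag.items():
    print(f"lag {lag}: r^2 = {r2:.2f}")
print("best lag:", best_lag)
```

With real data the series would need to be much longer, and catch would normally be standardised by fishing effort (CPUE) before any such comparison.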
Abstract:
Simulation modelling has been used for many years in the manufacturing sector but has now become a mainstream tool in business situations. This is partly because of the popularity of business process re-engineering (BPR) and other process based improvement methods that use simulation to help analyse changes in process design. This textbook includes case studies in both manufacturing and service situations to demonstrate the usefulness of the approach. A further reason for the increasing popularity of the technique is the development of business orientated and user-friendly Windows-based software. This text provides a guide to the use of ARENA, SIMUL8 and WITNESS simulation software systems that are widely used in industry and available to students. Overall this text provides a practical guide to building and implementing the results from a simulation model. All the steps in a typical simulation study are covered including data collection, input data modelling and experimentation.
Abstract:
Models of plant architecture allow us to explore how genotype-environment interactions affect the development of plant phenotypes. Such models generate masses of data organised in complex hierarchies. This paper presents a generic system for creating and automatically populating a relational database from data generated by the widely used L-system approach to modelling plant morphogenesis. Techniques from compiler technology are applied to generate attributes (new fields) in the database, to simplify query development for the recursively-structured branching relationship. Use of biological terminology in an interactive query builder contributes towards making the system biologist-friendly. (C) 2002 Elsevier Science Ireland Ltd. All rights reserved.
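The pipeline described above can be sketched in miniature: expand an L-system, parse the bracketed string into segments with parent links, and store a derived attribute alongside the recursive relationship. The rewriting rule, the table layout, and the `branch_order` field are all assumptions made for the example; the paper's own schema and attribute set are not given in the abstract, and SQLite is used purely for convenience.

```python
import sqlite3


def expand(axiom, rules, steps):
    """Apply the L-system rewriting rules `steps` times."""
    s = axiom
    for _ in range(steps):
        s = "".join(rules.get(ch, ch) for ch in s)
    return s


def segments(lstring):
    """Parse a bracketed L-string into (id, parent_id, branch_order) rows.

    "F" creates a segment, "[" starts a side branch, "]" returns to the
    point where that branch started.
    """
    rows, stack = [], []
    parent, order = None, 0
    for ch in lstring:
        if ch == "F":
            seg_id = len(rows) + 1
            rows.append((seg_id, parent, order))
            parent = seg_id
        elif ch == "[":
            stack.append((parent, order))
            order += 1
        elif ch == "]":
            parent, order = stack.pop()
    return rows


# Hypothetical rule: every segment grows a side branch and an extension.
lstring = expand("F", {"F": "F[F]F"}, steps=2)
rows = segments(lstring)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE segment (id INTEGER PRIMARY KEY, parent_id INTEGER, branch_order INTEGER)")
conn.executemany("INSERT INTO segment VALUES (?, ?, ?)", rows)

# With the derived attribute, "all first-order branch segments" is a flat query.
first_order = conn.execute(
    "SELECT COUNT(*) FROM segment WHERE branch_order = 1").fetchone()[0]

# Without it, questions about the branching structure need a recursive query.
descendants = conn.execute(
    """
    WITH RECURSIVE sub(id) AS (
        SELECT id FROM segment WHERE parent_id = ?
        UNION ALL
        SELECT s.id FROM segment s JOIN sub ON s.parent_id = sub.id
    )
    SELECT COUNT(*) FROM sub
    """,
    (3,),
).fetchone()[0]
print(lstring, first_order, descendants)
```

Precomputing fields such as `branch_order` at load time is what lets most queries avoid the recursive form, which appears to be the motivation for the generated attributes mentioned in the abstract.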
Abstract:
Coastal areas are highly exposed to natural hazards associated with the sea. Wherever there is historical evidence of devastating tsunamis, as is the case on the southern coasts of the Iberian Peninsula, there is a need for quantitative tsunami hazard assessment to support spatial planning. Local authorities must also be able to act pre-emptively to protect the population, informing people 'what to do' and 'where to go' and, in the event of an alarm, making them aware of the incoming danger. With this in mind, we investigated the inundation extent, run-up and water depths of a 1755-like event in the region of Huelva, located on the Spanish southwestern coast, one of the regions that was affected in the past by several high-energy events, as shown by historical documents and sedimentological data. Modelling was carried out with a slightly modified version of the COMCOT (Cornell Multi-grid Coupled Tsunami Model) code. Sensitivity tests were performed for a single source in order to understand the relevance and influence of the source parameters on the inundation extent and the fundamental impact parameters. We show that a 1755-like event will have a dramatic impact on a large area close to Huelva, inundating between 82 and 92 km² and reaching a maximum run-up of around 5 m. In this sense our results show that small variations in the characteristics of the tsunami source are not too significant for the impact assessment. We show that the maximum flow depth and the maximum run-up increase with the average slip on the source, while the strike of the fault is not a critical factor, as Huelva is significantly far away from the potential sources identified up to now. We also show that the maximum flow depth within the inundated area is very dependent on the tidal level, while maximum run-up is less affected, as a consequence of the complex morphology of the area.
Abstract:
An individual experiences double coverage when he benefits from more than one health insurance plan at the same time. This paper examines the impact of such supplementary insurance on the demand for health care services. Its novelty is that within the context of count data modelling and without imposing restrictive parametric assumptions, the analysis is carried out for different points of the conditional distribution, not only for its mean location. Results indicate that moral hazard is present across the whole outcome distribution for both public and private second layers of health insurance coverage but with greater magnitude in the latter group. By looking at different points we unveil that stronger double coverage effects are smaller for high levels of usage. We use data for Portugal, taking advantage of particular features of the public and private protection schemes on top of the statutory National Health Service. By exploring the last Portuguese Health Survey, we were able to evaluate their impacts on the consumption of doctor visi
Abstract:
SUMMARY: The Portuguese demographic structure is marked by low birth and mortality rates, with the elderly making up an ever larger share of the population as a result of greater longevity. The incidence of cancer is, in general, highest precisely in that age group. Alongside other equally harmful diseases (e.g. cardiovascular, degenerative) whose incidence increases with age, cancer deserves particular attention. Epidemiological studies present cancer as the world's leading cause of death. In developed countries it accounts for 25% of all deaths, a percentage that more than doubles in other countries. Obesity, low fruit and vegetable intake, a sedentary lifestyle, tobacco use and alcohol consumption are five risk factors present in 30% of deaths diagnosed as due to cancer. Worldwide, and in the south of Portugal in particular, cancers of the stomach, rectum and colon have high incidence and mortality rates. From a strictly economic point of view, cancer is the disease that consumes the most resources, while from a physical and psychological point of view it is a disease whose reach is not limited to the patient. Cancer is therefore an ever-current and increasingly present disease, since it reflects the habits and environment of a society, notwithstanding the characteristics intrinsic to each individual. Applying statistical methodology to the modelling of cancer data is especially valuable and pertinent when the information comes from Population-Based Cancer Registries (PBCR). This is because such registries make it possible to assess, in a specific population, the risk of having and/or developing a given neoplasm. The burden of stomach, colon and rectal neoplasms was one of the factors that motivated the present study, which aims to analyse trends, projections, relative survival and the spatial distribution of these neoplasms.
This study considered all cases diagnosed in the period 1998-2006 by the PBCR of the southern region of Portugal (ROR-Sul). The initial descriptive study of the incidence rates and of the trend in each of these neoplasms was based on a single temporal variable, the year of diagnosis, also called the period. However, a methodology that covers only one temporal variable is limiting. In cancer, besides the period, the age at diagnosis and the birth cohort are temporal variables that may make an additional contribution to characterising the incidence rates. The relevance of these temporal variables justified their inclusion in a class of models known as Age-Period-Cohort (APC) models, used to model the incidence rates of the neoplasms under study. These models make it possible to overcome the problem of nonlinear relationships and/or sudden changes in the linear trend of the rates. For the APC models, both the classical approach and an approach using smoothing functions were considered. The modelling of the rates was stratified by sex. The corresponding submodels (with only one or two temporal variables) were also studied. Once the behaviour of the incidence rates is known, a subsequent question concerns their projection into future periods. However, the effect of structural changes in the population, to which Portugal is no stranger, substantially alters the expected number of future cancer cases. Estimates of worldwide cancer incidence obtained from demographic projections point to a 25% increase in cancer cases over the next two decades. Although projecting incidence involves some uncertainty, projections help in planning health policies for the allocation of resources and allow the evaluation of scenarios and interventions aimed at reducing the impact of cancer.
Because no projections of the incidence rates of these neoplasms were available for the area covered by ROR-Sul, projection models were used that differ from one another in their structure, in the linearity (or otherwise) of their coefficients, and in the behaviour of the rates in the historical data series (e.g. increasing, decreasing or stable). These models followed two approaches: (i) models linear in time and (ii) extrapolation of the temporal effects identified by the APC models to future periods. Incidence rates were projected for the years 2007 to 2010, taking into account gender, age and neoplasm. An estimate of the economic impact of these neoplasms over the projection period is also presented. A pertinent and common question in the clinical setting, which the present study seeks to answer, is what the neoplasm itself contributes to the patient's survival. To that end, cause-specific mortality is commonly used to estimate the mortality attributable solely to the cancer under study. However, there are many situations in which the cause of death is unknown and, even when this information is available from death certificates, it is not easy to distinguish the cases in which cancer is the main cause of death. Relative survival is an objective measure whose calculation does not require knowledge of the specific cause of death, and it gives an estimate of the probability of survival in the hypothetical scenario where the cancer under analysis is the only cause of death. Since the main cause of death was unknown for the cases diagnosed with cancer in the ROR-Sul registry, relative survival was determined for each of the neoplasms under study, for a follow-up period of 5 years, taking into account sex, age and each of the regions that make up the registry. A period analysis was adopted, together with the conventional and model-based approaches.
In the final part of this study, the influence of space-time variability on the incidence rates is analysed. The long latency period of oncological diseases, the difficulty of identifying sudden changes in the behaviour of the rates, and populations of small size and low risk are some of the factors that make it difficult to analyse the temporal variation of the rates. In some cases these variations may reflect random fluctuations. The effect of the temporal component measured by the APC models gives an incomplete picture of cancer incidence. The aetiology of this disease, when known, is quite often associated with risk factors such as socioeconomic conditions, eating habits and lifestyle, occupation, geographic location and genetic make-up. The "contribution" of the risk factors is sometimes decisive and should not be ignored. Hence the need to complement the temporal study of the rates with a spatial approach. The aim was therefore to assess whether the variations in incidence rates observed among the municipalities in the ROR-Sul registry area could be explained by temporal and geographical variability, by socioeconomic factors or by unequal lifestyles. Bayesian hierarchical space-time models were used in order to identify space-time trends in the incidence rates and to quantify some risk factors adjusted for the simultaneous influence of region and time. The results obtained by implementing all these methodologies are considered an asset for the knowledge of these neoplasms in Portugal.
ABSTRACT: The Portuguese demographic structure is marked by low birth and mortality rates, with the elderly being an increasingly representative sector of the population, mainly due to greater longevity. The incidence of cancer, in general, is greater precisely in that age group. Alongside other equally damaging diseases (e.g.
cardiovascular, degenerative), whose incidence rates increase with age, cancer is of special note. In epidemiological studies, cancer is the global leader in mortality. In developed countries it accounts for 25% of the total number of deaths, with this percentage more than doubling in other countries. Obesity, a reduced consumption of fruit and vegetables, physical inactivity, smoking and alcohol consumption are five risk factors present in 30% of deaths due to cancer. Globally, and in particular in the South of Portugal, stomach, rectum and colon cancers have high incidence and mortality rates. From a strictly economic perspective, cancer is the disease that consumes the most resources, while from a physical and psychological point of view, it is a disease that is not limited to the patient. Cancer is therefore an ever-present disease of increasing importance, since it reflects the habits and the environment of a society, regardless of the intrinsic characteristics of each individual. The adoption of statistical methodology applied to cancer data modelling is especially valuable and relevant when the information comes from population-based cancer registries (PBCR). Such registries allow the assessment, in a specific population, of the risk of having or developing a given neoplasm. The weight that stomach, colon and rectum cancers assume in Portugal was one of the motivations of the present study, which focuses on analyzing trends, projections, relative survival and spatial distribution of these neoplasms. The data considered in this study are all cases diagnosed between 1998 and 2006 by the PBCR of the southern region of Portugal, ROR-Sul. The year of diagnosis, also called period, was the only time variable considered in the initial descriptive analysis of the incidence rates and trends for each of the three neoplasms considered.
However, a methodology that considers only a single time variable will probably fall short in the conclusions that can be drawn from the data under study. In cancer, apart from the period, the age at diagnosis and the birth cohort are also temporal variables and may provide an additional contribution to the characterization of the incidence. The relevance of these temporal variables justified their inclusion in a class of models called Age-Period-Cohort (APC) models. This class of models was used for the analysis of the incidence rates of the three cancers under study. APC models make it possible to model nonlinearity and/or sudden changes in the linear trend of the rates. Two approaches to APC modelling were considered: the classical one and one using smoothing functions. The models were stratified by gender and, when justified, further studies explored sub-models in which only one or two temporal variables were considered. After the analysis of the incidence rates, a subsequent goal is their projection into future periods. Although the effect of structural changes in the population, to which Portugal is not immune, may substantially change the expected number of future cancer cases, the results of these projections could help in planning health policies with the proper allocation of resources, allowing for the evaluation of scenarios and interventions that aim to reduce the impact of cancer in a population. It is worth noting that estimates of worldwide cancer incidence obtained from demographic projections point to an increase of 25% in cancer cases over the next two decades. The lack of projections of incidence rates of the three cancers under study in the area covered by ROR-Sul led us to use a variety of forecasting models that differ in nature and structure, for example in the linearity or nonlinearity of their coefficients and in the trend of the incidence rates in the historical data series (e.g.
increasing, decreasing or stable). The models followed two approaches: (i) linear models regarding time and (ii) extrapolation of temporal effects identified by the APC models for future periods. The study provides incidence rate projections and the numbers of newly diagnosed cases for the years 2007 to 2010, taking into account gender, age and the type of cancer. In addition, an estimate of the economic impact of these neoplasms is presented for the projection period considered. This research also tries to address a relevant and common clinical question in this type of study, regarding the contribution of the type of cancer to patient survival. In such studies, the primary cause of death is commonly used to estimate the mortality specifically due to the cancer. However, there are many situations in which the cause of death is unknown, or, even if this information is available through the death certificates, it is not easy to distinguish the cases where the primary cause of death is the cancer. With this in mind, relative survival is an alternative measure that can be calculated without knowledge of the specific cause of death. This estimate represents the survival probability in the hypothetical scenario of a certain cancer being the only cause of death. For the patients with unknown cause of death that were diagnosed with cancer in the ROR-Sul, the relative survival was calculated for each of the cancers under study, for a follow-up period of 5 years, considering gender, age and each one of the regions that are part of the registry. A period analysis was undertaken, considering both the conventional and the model approaches. In the final part of this study, we analyzed the influence of space-time variability on the incidence rates.
The long latency period of oncologic diseases, the difficulty in identifying sudden changes in the behavior of the rates, and populations of reduced size and low risk are some of the elements that can make the analysis of temporal variations in rates challenging; in some cases, these variations may simply reflect random fluctuations. The effect of the temporal component measured by the APC models gives an incomplete picture of the cancer incidence. The etiology of this disease, when known, is frequently associated with risk factors such as socioeconomic conditions, eating habits and lifestyle, occupation, geographic location and genetic component. The "contribution" of such risk factors is sometimes decisive in the evolution of the disease and should not be ignored. Therefore, there was a need to consider an additional approach in this study, one of a spatial nature, addressing the possibility that changes in incidence rates observed in the ROR-Sul area could be explained either by temporal and geographical variability or by unequal socio-economic or lifestyle factors. Thus, Bayesian hierarchical space-time models were used with the purpose of identifying space-time trends in incidence rates, together with the analysis of the effect of the risk factors considered in the study. The results obtained and the implementation of all these methodologies are considered to be an added value to the knowledge of these neoplasms in Portugal.
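As a rough illustration of why relative survival needs no cause-of-death information, the sketch below takes it as the ratio of the observed cumulative survival in the patient group to the survival expected in a comparable general population. This is the basic textbook ratio only; it does not reproduce the period analysis or the conventional and model-based estimators used in the study, and every number in it is invented for the example.

```python
def cumulative_survival(interval_probs):
    """Turn yearly conditional survival probabilities into cumulative survival."""
    cumulative, running = [], 1.0
    for p in interval_probs:
        running *= p
        cumulative.append(running)
    return cumulative


def relative_survival(observed_interval, expected_interval):
    """Observed cumulative survival divided by expected cumulative survival.

    `observed_interval` comes from the patient cohort (deaths from any cause);
    `expected_interval` would come from general-population life tables matched
    on characteristics such as age and sex.
    """
    observed = cumulative_survival(observed_interval)
    expected = cumulative_survival(expected_interval)
    return [o / e for o, e in zip(observed, expected)]


# Invented yearly survival probabilities over a 5-year follow-up.
observed_yearly = [0.80, 0.90, 0.95, 0.95, 0.96]   # patients, all causes of death
expected_yearly = [0.97, 0.97, 0.97, 0.97, 0.97]   # matched general population

rs = relative_survival(observed_yearly, expected_yearly)
for year, value in enumerate(rs, start=1):
    print(f"year {year}: relative survival = {value:.3f}")
```

Dividing by the expected survival removes the deaths that would have occurred anyway, so the remainder is attributed to the cancer without anyone having to classify individual deaths.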
Abstract:
Master's dissertation in Information Systems