982 results for Data handling


Relevance:

30.00%

Publisher:

Abstract:

Doctoral thesis, Informatics (Computer Science), Universidade de Lisboa, Faculdade de Ciências, 2014

Relevance:

30.00%

Publisher:

Abstract:

Blood pressure follows a circadian rhythm, with a physiologic 10% to 20% decrease during the night. There is now increasing evidence that a blunted decrease or an increase in nighttime blood pressure is associated with a greater prevalence of target organ damage and faster disease progression in patients with chronic kidney disease. Several factors contribute to the changes in nighttime blood pressure, including changes in hormonal profiles such as variations in the activity of the renin-angiotensin and sympathetic nervous systems. Recently, it was hypothesized that the absence of a blood pressure decrease during the nighttime (nondipping) is in fact a pressure-natriuresis mechanism enabling subjects with an impaired capacity to excrete sodium to remain in sodium balance. In this article, we review the clinical and epidemiologic data that tend to support this hypothesis. Moreover, we show that most, if not all, clinical conditions associated with an impaired dipping profile are diseases associated with a low glomerular filtration rate and/or an impaired ability to excrete sodium. These observations suggest that renal function, and most importantly the ability to eliminate sodium during the day, is indeed a key determinant of the circadian rhythm of blood pressure.

Relevance:

30.00%

Publisher:

Abstract:

Knowledge discovery in databases is the non-trivial process of identifying valid, novel, potentially useful and ultimately understandable patterns in data. The term data mining refers to the process of performing exploratory analysis on the data and building models from it. To infer patterns from data, data mining involves different approaches such as association rule mining, classification techniques and clustering techniques. Among the many data mining techniques, clustering plays a major role, since it groups related data for assessing properties and drawing conclusions. Most clustering algorithms act on a dataset with a uniform format, since the similarity or dissimilarity between data points is a significant factor in finding the clusters. If a dataset consists of mixed attributes, i.e. a combination of numerical and categorical variables, a preferred approach is to convert the different formats into a uniform one. This study explores various techniques for converting mixed data sets into a numerical equivalent, so that statistical and similar algorithms can be applied. The results of clustering mixed-category data after conversion to a numeric data type are demonstrated using a crime data set. The thesis also proposes an extension to a well-known algorithm for handling mixed data types, to deal with data sets containing only categorical data. The proposed conversion has been validated on a data set concerning breast cancer. Another issue with the clustering process is the visualization of output: geometric techniques such as scatter plots or projection plots are available, but none of them display results for the database as a whole; rather, they support only attribute-pairwise analysis.
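
To make the convert-then-cluster idea concrete, here is a minimal Python sketch using one common numeric mapping, one-hot encoding, followed by k-means. The thesis's specific conversion scheme and its crime and breast cancer data sets are not reproduced; the column names and values are illustrative only.

```python
# Minimal sketch of the convert-then-cluster idea: map categorical columns to a
# numeric representation (here, one-hot encoding) so a standard distance-based
# algorithm such as k-means can be applied. Column names are illustrative.
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

df = pd.DataFrame({
    "age":      [25, 47, 31, 52, 38, 29],          # numerical attribute
    "district": ["N", "S", "N", "E", "S", "E"],    # categorical attribute
    "severity": ["low", "high", "low", "high", "low", "high"],
})

# One-hot encode the categorical columns, then scale so no attribute dominates.
numeric = pd.get_dummies(df, columns=["district", "severity"])
scaled = StandardScaler().fit_transform(numeric)

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(scaled)
print(labels)
```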

Relevance:

30.00%

Publisher:

Abstract:

The common GIS-based approach to regional analyses of soil organic carbon (SOC) stocks and changes is to define geographic layers for which unique sets of driving variables are derived, including land use, climate, and soils. These GIS layers, with their associated attribute data, can then be fed into a range of empirical and dynamic models. Common methodologies for collating and formatting regional data sets on land use, climate, and soils were adopted for the project Assessment of Soil Organic Carbon Stocks and Changes at National Scale (GEFSOC). This permitted the development of a uniform protocol for handling the various inputs to the dynamic GEFSOC Modelling System. Consistent soil data sets for Amazon-Brazil, the Indo-Gangetic Plains (IGP) of India, Jordan and Kenya, the case study areas considered in the GEFSOC project, were prepared using methodologies developed for the World Soils and Terrain Database (SOTER). The approach involved three main stages: (1) compiling new soil geographic and attribute data in SOTER format; (2) using expert estimates and common sense to fill selected gaps in the measured or primary data; (3) using a scheme of taxonomy-based pedotransfer rules and expert rules to derive soil parameter estimates for similar soil units with missing soil analytical data. The most appropriate approach varied from country to country, depending largely on the overall accessibility and quality of the primary soil data available in the case study areas. The secondary SOTER data sets discussed here are appropriate for a wide range of environmental applications at national scale, including agro-ecological zoning, land evaluation, modelling of soil C stocks and changes, and studies of soil vulnerability to pollution. Estimates of national-scale SOC stocks, calculated using SOTER methods, are presented as a first example of database application. Independent estimates of SOC stocks are needed to evaluate the outcome of the GEFSOC Modelling System for current conditions of land use and climate. (C) 2007 Elsevier B.V. All rights reserved.
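
As an illustration of stage (3), the following Python sketch applies a taxonomy-based pedotransfer rule to fill gaps in a soil-unit record. The rule table, attribute names and values are hypothetical stand-ins, not the actual SOTER/GEFSOC rules.

```python
# Illustrative sketch of stage (3): taxonomy-based pedotransfer rules that fill
# missing analytical values for a soil unit from its classification. The rule
# table and values below are hypothetical, not the actual SOTER/GEFSOC rules.
PEDOTRANSFER_RULES = {
    # taxonomy -> default estimates for missing attributes
    "Ferralsol": {"organic_carbon_pct": 1.8, "bulk_density": 1.1},
    "Vertisol":  {"organic_carbon_pct": 1.2, "bulk_density": 1.4},
}

def fill_missing(unit: dict) -> dict:
    """Return a copy of a soil-unit record with gaps filled by rule estimates."""
    defaults = PEDOTRANSFER_RULES.get(unit["taxonomy"], {})
    filled = dict(unit)
    for attribute, estimate in defaults.items():
        if filled.get(attribute) is None:      # only fill genuinely missing data
            filled[attribute] = estimate
    return filled

unit = {"taxonomy": "Ferralsol", "organic_carbon_pct": None, "bulk_density": 1.05}
print(fill_missing(unit))   # organic carbon estimated; measured bulk density kept
```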

Relevance:

30.00%

Publisher:

Abstract:

We have compared the biokinetics of deuterated natural (RRR) and synthetic (all-rac) alpha-tocopherol in male apoE4-carrying smokers and nonsmokers. In a randomized, crossover study, subjects underwent two 4-week treatments (400 mg/day) with undeuterated RRR- and all-rac-alpha-tocopheryl acetate, separated by a 12-week washout. Before and after each supplementation period, subjects underwent a 48-h biokinetic protocol with 150 mg deuterated RRR- or all-rac-alpha-tocopheryl acetate. During the biokinetic protocols, the elimination of endogenous plasma alpha-tocopherol was significantly faster in smokers (P < 0.05). However, smokers had a lower uptake of deuterated RRR than nonsmokers, whereas there was no difference in uptake of deuterated all-rac. The supplementation regimes significantly raised plasma alpha-tocopherol (P < 0.001), with no differences in response between smokers and nonsmokers or between alpha-tocopherol forms. Smokers had significantly lower excretion of alpha-carboxyethyl-hydroxychroman than nonsmokers following supplementation (P < 0.05). Nonsmokers excreted more alpha-carboxyethyl-hydroxychroman following RRR than all-rac; smokers, however, did not differ in excretion between forms. At baseline, smokers had significantly lower ascorbate (P < 0.01) and higher F2-isoprostanes (P < 0.05). F2-isoprostanes in smokers remained unchanged during the study, but increased in nonsmokers following alpha-tocopherol supplementation. These data suggest that apoE4-carrying smokers and nonsmokers differ in their handling of natural and synthetic alpha-tocopherol. (c) 2006 Elsevier Inc. All rights reserved.

Relevance:

30.00%

Publisher:

Abstract:

This paper describes a prototype grid infrastructure, called the eMinerals minigrid, for molecular simulation scientists, which is based on an integration of shared compute and data resources. We describe the key components: Condor pools; Linux/Unix clusters with PBS and IBM's LoadLeveler job-handling tools; Globus for security handling; Condor-G tools for wrapping Globus job-submission commands; Condor's DAGMan tool for handling workflow; the Storage Resource Broker for handling data; and the CCLRC dataportal and associated tools, both for archiving data with metadata and for making data available to other workers.
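
As a sketch of the workflow-handling layer, the following Python snippet writes a two-step DAGMan file (a compute job followed by an archiving job) and submits it with HTCondor's condor_submit_dag. The job names and .sub files are hypothetical placeholders, not the actual eMinerals configuration.

```python
# Minimal sketch of a DAGMan workflow: a two-step DAG (simulate, then archive)
# written out and handed to condor_submit_dag (requires an HTCondor install).
# The job names and submit files are hypothetical placeholders.
import subprocess
from pathlib import Path

dag_text = """\
# simulate.sub would be a Condor-G submit file wrapping a Globus job submission
JOB simulate simulate.sub
# archive.sub would push the outputs to the Storage Resource Broker
JOB archive archive.sub
PARENT simulate CHILD archive
"""

Path("eminerals.dag").write_text(dag_text)
subprocess.run(["condor_submit_dag", "eminerals.dag"], check=True)
```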

Relevance:

30.00%

Publisher:

Abstract:

Novel imaging techniques are playing an increasingly important role in drug development, providing insight into the mechanism of action of new chemical entities. The data sets obtained by these methods can be large, with complex inter-relationships, but the most appropriate statistical analysis for handling these data is often uncertain, precisely because of the exploratory nature of the way the data are collected. We present an example from a clinical trial using magnetic resonance imaging to assess changes in atherosclerotic plaques following treatment with a tool compound with established clinical benefit. We compared two specific approaches to handling the correlations due to physical location and repeated measurements: two-level and four-level multilevel models. The two methods identified similar structural variables, but the higher-level multilevel models had the advantage of explaining a greater proportion of the variation, and their modeling assumptions appeared to be better satisfied.
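
The trial's variables are not given in the abstract, but the two-level case can be illustrated generically: the Python sketch below fits a random-intercept mixed model with statsmodels to simulated repeated measurements nested within subjects. All names and data are placeholders.

```python
# Generic illustration of the two-level case: repeated plaque measurements
# nested within subjects, fitted as a random-intercept mixed model. Variable
# names and the simulated data are placeholders, not the trial's actual data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_subjects, n_repeats = 40, 4
subject = np.repeat(np.arange(n_subjects), n_repeats)
treated = np.repeat(rng.integers(0, 2, n_subjects), n_repeats)
# Subject-level random intercepts plus a treatment effect plus residual noise.
y = (2.0 - 0.5 * treated
     + rng.normal(0, 1, n_subjects)[subject]
     + rng.normal(0, 0.5, n_subjects * n_repeats))

data = pd.DataFrame({"y": y, "treated": treated, "subject": subject})
model = smf.mixedlm("y ~ treated", data, groups=data["subject"]).fit()
print(model.summary())
```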

Relevance:

30.00%

Publisher:

Abstract:

In Britain, substantial cuts in police budgets, alongside the controversial handling of incidents such as politically sensitive enquiries, public disorder and relations with the media, have recently triggered much debate about public knowledge of and trust in the police. To date, however, little academic research has investigated how knowledge of police performance affects citizens' trust. We address this long-standing lacuna by exploring citizens' trust before and after exposure to real performance data in the context of a British police force. The results reveal that being informed of performance data affects citizens' trust significantly. Furthermore, the direction and degree of change in trust are related to variations across the different elements of the reported performance criteria. Interestingly, the volatility of citizens' trust is related to initial performance perceptions (citizens with low initial perceptions of police performance react more strongly to evidence of both good and bad performance than citizens with high initial perceptions), and citizens' intentions to support the police do not always correlate with their cognitive and affective trust towards the police. In discussing our findings, we explore how transparency with performance data can both help and hinder the development of citizens' trust in a public organisation such as the police. From our study, we pose a number of ethical challenges that practitioners face when deciding what data to highlight, to whom, and for what purpose.

Relevance:

30.00%

Publisher:

Abstract:

Astronomy has evolved almost exclusively through the use of spectroscopic and imaging techniques, operated separately. With the development of modern technologies, it is possible to obtain data cubes that combine both techniques simultaneously, producing images with spectral resolution. Extracting information from them can be quite complex, and hence the development of new methods of data analysis is desirable. We present a method for analysing data cubes (data from single-field observations, containing two spatial dimensions and one spectral dimension) that uses Principal Component Analysis (PCA) to express the data in a form of reduced dimensionality, facilitating efficient information extraction from very large data sets. PCA transforms a system of correlated coordinates into a system of uncorrelated coordinates ordered by principal components of decreasing variance. The new coordinates are referred to as eigenvectors, and the projections of the data onto these coordinates produce images we call tomograms. The association of the tomograms (images) with the eigenvectors (spectra) is important for the interpretation of both. The eigenvectors are mutually orthogonal, and this property is fundamental for their handling and interpretation. When the data cube shows objects that present uncorrelated physical phenomena, the eigenvectors' orthogonality may be instrumental in separating and identifying them. By handling eigenvectors and tomograms, one can enhance features, extract noise, compress data, extract spectra, etc. We applied the method, for illustration purposes only, to the central region of the low-ionization nuclear emission region (LINER) galaxy NGC 4736, and demonstrate that it has a type 1 active nucleus, not known before. Furthermore, we show that it is displaced from the centre of its stellar bulge.
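
The core computation can be sketched in a few lines: flatten the cube's two spatial axes so that each row is a spectrum, run PCA, and read the principal axes as eigenspectra and the reshaped projections as tomograms. The Python sketch below uses a random cube standing in for real observations.

```python
# Sketch of the core PCA step: flatten the cube's two spatial axes into one,
# so each spectrum is a row; the principal axes are then eigenspectra and the
# projections, reshaped back to the sky plane, are the tomograms.
import numpy as np
from sklearn.decomposition import PCA

nx, ny, nwave = 30, 30, 200
cube = np.random.default_rng(1).normal(size=(nx, ny, nwave))   # placeholder cube

spectra = cube.reshape(nx * ny, nwave)       # rows: spaxels, columns: wavelengths
pca = PCA(n_components=5)
scores = pca.fit_transform(spectra)          # projections onto the eigenvectors

eigenspectra = pca.components_               # shape (5, nwave): "eigenvectors"
tomograms = scores.reshape(nx, ny, 5)        # shape (nx, ny, 5): "tomograms"
print(eigenspectra.shape, tomograms.shape)
```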

Relevance:

30.00%

Publisher:

Abstract:

In this paper we introduce a parametric model for handling lifetime data in which an early lifetime can be related either to an infant-mortality failure or to wear processes, but we do not know which risk is responsible for the failure. The maximum likelihood approach and the sampling-based approach are used to obtain the inferences of interest. Some special cases of the proposed model are studied via Monte Carlo methods for the size and power of hypothesis tests. To illustrate the proposed methodology, we present an example based on a real data set.
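
The abstract does not give the model's exact parametric form. As a generic illustration of the maximum likelihood approach for such data, the Python sketch below fits a competing-risks model with two independent Weibull risks, one with decreasing hazard (infant mortality) and one with increasing hazard (wear).

```python
# Illustrative maximum-likelihood fit for lifetimes T = min(T1, T2) with two
# independent Weibull risks whose identity is unobserved. This is a generic
# stand-in; the paper's specific parametric model is not given in the abstract.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
t = np.minimum(rng.weibull(0.6, 300) * 5.0,    # infant-mortality risk (shape < 1)
               rng.weibull(3.0, 300) * 10.0)   # wear-out risk (shape > 1)

def neg_log_lik(params):
    k1, s1, k2, s2 = np.exp(params)            # log-parametrization keeps params > 0
    h = (k1 / s1) * (t / s1) ** (k1 - 1) + (k2 / s2) * (t / s2) ** (k2 - 1)
    log_surv = -((t / s1) ** k1) - ((t / s2) ** k2)
    return -np.sum(np.log(h) + log_surv)       # f(t) = h(t) * S(t) for independent risks

fit = minimize(neg_log_lik, x0=np.log([0.5, 1.0, 2.0, 5.0]), method="Nelder-Mead")
print(np.exp(fit.x))                           # fitted shapes and scales
```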

Relevance:

30.00%

Publisher:

Abstract:

Model trees are a particular type of decision tree employed to solve regression problems. They have the advantage of presenting an interpretable output, helping the end user gain confidence in the prediction and providing a basis for new insight about the data, confirming or rejecting previously formed hypotheses. Moreover, model trees present an acceptable level of predictive performance in comparison to most techniques used for solving regression problems. Since generating the optimal model tree is an NP-complete problem, traditional model tree induction algorithms use a greedy top-down divide-and-conquer strategy, which may not converge to the globally optimal solution. In this paper, we propose a novel algorithm based on the evolutionary algorithms paradigm as an alternative heuristic for generating model trees, in order to improve convergence to globally near-optimal solutions. We call our new approach evolutionary model tree induction (E-Motion). We test its predictive performance using public UCI data sets, and we compare the results to traditional greedy regression/model tree induction algorithms, as well as to other evolutionary approaches. The results show that our method presents a good trade-off between predictive performance and model comprehensibility, which may be crucial in many machine learning applications. (C) 2010 Elsevier Inc. All rights reserved.
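
E-Motion's tree encoding and genetic operators are not described in the abstract. As a compact, runnable stand-in for the evolutionary paradigm it builds on (selection, variation, fitness-based survival), the Python sketch below evolves regression-tree parameters against cross-validated error; all settings are illustrative.

```python
# Skeleton of an evolutionary search: each genome encodes tree parameters,
# fitness is cross-validated R^2, and the fittest genomes survive and mutate.
# A stand-in for the paradigm only, not the E-Motion algorithm itself.
import random
from sklearn.datasets import load_diabetes
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

X, y = load_diabetes(return_X_y=True)
random.seed(0)

def fitness(genome):
    """Cross-validated R^2 of the tree a genome encodes (higher is fitter)."""
    depth, leaf = genome
    tree = DecisionTreeRegressor(max_depth=depth, min_samples_leaf=leaf,
                                 random_state=0)
    return cross_val_score(tree, X, y, cv=3).mean()

def mutate(genome):
    """Small random perturbation of a parent genome."""
    depth, leaf = genome
    return (max(1, depth + random.choice([-1, 1])),
            max(1, leaf + random.choice([-2, 2])))

# Each genome encodes (max_depth, min_samples_leaf) of a candidate tree.
population = [(random.randint(1, 10), random.randint(1, 20)) for _ in range(10)]
for _ in range(15):
    population.sort(key=fitness, reverse=True)        # selection
    survivors = population[:5]
    population = survivors + [mutate(random.choice(survivors)) for _ in range(5)]

population.sort(key=fitness, reverse=True)
print("best genome (max_depth, min_samples_leaf):", population[0])
```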

Relevance:

30.00%

Publisher:

Abstract:

Data mining refers to extracting or "mining" knowledge from large amounts of data. It is an increasingly popular field that uses statistical, visualization, machine learning, and other data manipulation and knowledge extraction techniques aimed at gaining insight into the relationships and patterns hidden in the data. The availability of digital data within picture archiving and communication systems raises the possibility of enhancing health care and research through the manipulation, processing and handling of data by computers. That is the basis for the development of computer-assisted radiology. Its further development is associated with the use of new intelligent capabilities, such as multimedia support and data mining, to discover knowledge relevant for diagnosis. It is very useful if the results of data mining can be communicated to humans in an understandable way. In this paper, we present our work on data mining in medical image archiving systems. We investigate the use of a very efficient data mining technique, the decision tree, to learn the knowledge needed for computer-assisted image analysis. We apply our method to the classification of x-ray images for lung cancer diagnosis. The proposed technique is based on an inductive decision tree learning algorithm that has low complexity with high transparency and accuracy. The results show that the proposed algorithm is robust, accurate, fast, and produces a comprehensible structure summarizing the knowledge it induces.
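
A minimal sketch of the classification step, assuming feature vectors have already been extracted from the x-ray images (random placeholders stand in for them here), might look like the Python below; export_text prints the comprehensible tree structure the paper emphasizes.

```python
# Sketch of the classification step only: a transparent decision tree trained
# on feature vectors extracted from x-ray images. Real texture/shape features
# from the archive would replace the random placeholders used here.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(3)
features = rng.normal(size=(200, 8))                        # 8 image features per region
labels = (features[:, 0] + features[:, 3] > 0).astype(int)  # 1 = suspicious region

X_train, X_test, y_train, y_test = train_test_split(features, labels, random_state=0)
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)

print("accuracy:", tree.score(X_test, y_test))
print(export_text(tree))     # the human-readable rule structure
```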

Relevance:

30.00%

Publisher:

Abstract:

Wireless sensor networks (WSNs) have been proposed as a powerful means of fine-grained monitoring in different classes of applications, at very low cost and for extended periods of time. Among various solutions, supporting WSNs with intelligent mobile platforms that handle data management has proved beneficial for extending the network lifetime and enhancing its performance. The mobility model applied strongly affects data latency in the network as well as the sensors' energy consumption levels. Intelligent models that take the network's runtime conditions into consideration are adopted to overcome such problems. In this chapter, existing proposals that use intelligent mobility for managing data in WSNs are surveyed. Different classifications are presented throughout the chapter to give a complete view of the solutions in this domain. Furthermore, these models are compared with respect to various metrics and design goals.

Relevance:

30.00%

Publisher:

Abstract:

In this study, we develop deterministic metamodels to quickly and precisely predict the future behaviour of a technically complex system. The underlying system is a stochastic, discrete event simulation model of a large baggage handling system. This highly detailed simulation model is used to conduct experiments and log data, which are then used to train artificial neural network metamodels. The results show that the developed metamodels predict different performance measures related to the travel time of bags within the system well. In contrast to the simulation models, which are computationally expensive and require extensive expertise to develop, run, and maintain, the artificial neural network metamodels can serve as real-time decision-aiding tools that are considerably fast, precise, simple to use, and reliable.
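
A minimal sketch of the metamodelling step: train a neural network on logged (simulation input, bag travel time) pairs so that later predictions are near-instant. In the Python below, a simple quadratic function and made-up input names stand in for the expensive discrete event model.

```python
# Sketch of the metamodelling step: fit a neural network to logged simulation
# runs so predictions are near-instant. The quadratic "simulator" below is a
# placeholder for the expensive discrete-event baggage-handling model.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(4)
inputs = rng.uniform(0, 1, size=(500, 3))   # e.g. arrival rate, belt speed, load
travel_time = (30 + 40 * inputs[:, 0] ** 2 - 10 * inputs[:, 1]
               + rng.normal(0, 1, 500))     # logged output of the "simulator"

X_train, X_test, y_train, y_test = train_test_split(inputs, travel_time,
                                                    random_state=0)
metamodel = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000,
                         random_state=0).fit(X_train, y_train)
print("R^2 on held-out runs:", metamodel.score(X_test, y_test))
```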

Relevance:

30.00%

Publisher:

Abstract:

This paper presents a triple-random ensemble learning method for handling multi-label classification problems. The proposed method integrates and develops the concepts of the random subspace, bagging and random k-label sets ensemble learning methods to form an approach for classifying multi-label data. It applies the random subspace method to the feature space, the label space and the instance space. The devised subset selection procedure is executed iteratively: each multi-label classifier is trained using the randomly selected subsets and, at the end of the iterations, optimal parameters are selected and the ensemble of multi-label classifiers is constructed. The proposed method was implemented and its performance compared against that of popular multi-label classification methods. The experimental results reveal that the proposed method outperforms the examined counterparts on most occasions when tested on six small-to-large multi-label datasets from different domains, demonstrating that it possesses general applicability to various multi-label classification problems.
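
A minimal sketch of the triple-random idea, under illustrative sizes and a simple base learner: each ensemble member is trained on a random subset of instances (bagging), features (random subspace) and labels (random k-label sets), and the members' votes are averaged per label. The data, subset sizes and base learner below are assumptions, not the paper's tuned configuration.

```python
# Sketch of triple-random ensembling: random instances + random features +
# random k-label sets per member, with per-label vote averaging at the end.
import numpy as np
from sklearn.multioutput import MultiOutputClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(5)
X = rng.normal(size=(300, 20))                                  # placeholder features
Y = (X[:, :6] + rng.normal(0, 0.5, (300, 6)) > 0).astype(int)   # 6 binary labels

members = []
for _ in range(25):
    rows = rng.choice(len(X), size=len(X), replace=True)     # random instances (bagging)
    feats = rng.choice(X.shape[1], size=10, replace=False)   # random feature subspace
    labs = rng.choice(Y.shape[1], size=3, replace=False)     # random k-label set (k = 3)
    clf = MultiOutputClassifier(DecisionTreeClassifier(random_state=0))
    clf.fit(X[np.ix_(rows, feats)], Y[np.ix_(rows, labs)])
    members.append((clf, feats, labs))

# Aggregate: average each member's votes over the labels it was trained on.
votes = np.zeros(Y.shape)
counts = np.zeros(Y.shape[1])
for clf, feats, labs in members:
    votes[:, labs] += clf.predict(X[:, feats])
    counts[labs] += 1
predictions = (votes / counts > 0.5).astype(int)   # in-sample, for illustration only
print("per-label accuracy:", (predictions == Y).mean(axis=0))
```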