952 resultados para Data recovery (Computer science)
Resumo:
Content-based image retrieval is still a challenging issue due to the inherent complexity of images and choice of the most discriminant descriptors. Recent developments in the field have introduced multidimensional projections to burst accuracy in the retrieval process, but many issues such as introduction of pattern recognition tasks and deeper user intervention to assist the process of choosing the most discriminant features still remain unaddressed. In this paper, we present a novel framework to CBIR that combines pattern recognition tasks, class-specific metrics, and multidimensional projection to devise an effective and interactive image retrieval system. User interaction plays an essential role in the computation of the final multidimensional projection from which image retrieval will be attained. Results have shown that the proposed approach outperforms existing methods, turning out to be a very attractive alternative for managing image data sets.
Resumo:
This article describes a real-world production planning and scheduling problem occurring at an integrated pulp and paper mill (P&P) which manufactures paper for cardboard out of produced pulp. During the cooking of wood chips in the digester, two by-products are produced: the pulp itself (virgin fibers) and the waste stream known as black liquor. The former is then mixed with recycled fibers and processed in a paper machine. Here, due to significant sequence-dependent setups in paper type changeovers, sizing and sequencing of lots have to be made simultaneously in order to efficiently use capacity. The latter is converted into electrical energy using a set of evaporators, recovery boilers and counter-pressure turbines. The planning challenge is then to synchronize the material flow as it moves through the pulp and paper mills, and energy plant, maximizing customer demand (as backlogging is allowed), and minimizing operation costs. Due to the intensive capital feature of P&P, the output of the digester must be maximized. As the production bottleneck is not fixed, to tackle this problem we propose a new model that integrates the critical production units associated to the pulp and paper mills, and energy plant for the first time. Simple stochastic mixed integer programming based local search heuristics are developed to obtain good feasible solutions for the problem. The benefits of integrating the three stages are discussed. The proposed approaches are tested on real-world data. Our work may help P&P companies to increase their competitiveness and reactiveness in dealing with demand pattern oscillations. (C) 2012 Elsevier Ltd. All rights reserved.
Resumo:
The attributes describing a data set may often be arranged in meaningful subsets, each of which corresponds to a different aspect of the data. An unsupervised algorithm (SCAD) that simultaneously performs fuzzy clustering and aspects weighting was proposed in the literature. However, SCAD may fail and halt given certain conditions. To fix this problem, its steps are modified and then reordered to reduce the number of parameters required to be set by the user. In this paper we prove that each step of the resulting algorithm, named ASCAD, globally minimizes its cost-function with respect to the argument being optimized. The asymptotic analysis of ASCAD leads to a time complexity which is the same as that of fuzzy c-means. A hard version of the algorithm and a novel validity criterion that considers aspect weights in order to estimate the number of clusters are also described. The proposed method is assessed over several artificial and real data sets.
Resumo:
For the first time, we introduce a generalized form of the exponentiated generalized gamma distribution [Cordeiro et al. The exponentiated generalized gamma distribution with application to lifetime data, J. Statist. Comput. Simul. 81 (2011), pp. 827-842.] that is the baseline for the log-exponentiated generalized gamma regression model. The new distribution can accommodate increasing, decreasing, bathtub- and unimodal-shaped hazard functions. A second advantage is that it includes classical distributions reported in the lifetime literature as special cases. We obtain explicit expressions for the moments of the baseline distribution of the new regression model. The proposed model can be applied to censored data since it includes as sub-models several widely known regression models. It therefore can be used more effectively in the analysis of survival data. We obtain maximum likelihood estimates for the model parameters by considering censored data. We show that our extended regression model is very useful by means of two applications to real data.
Resumo:
The beta-Birnbaum-Saunders (Cordeiro and Lemonte, 2011) and Birnbaum-Saunders (Birnbaum and Saunders, 1969a) distributions have been used quite effectively to model failure times for materials subject to fatigue and lifetime data. We define the log-beta-Birnbaum-Saunders distribution by the logarithm of the beta-Birnbaum-Saunders distribution. Explicit expressions for its generating function and moments are derived. We propose a new log-beta-Birnbaum-Saunders regression model that can be applied to censored data and be used more effectively in survival analysis. We obtain the maximum likelihood estimates of the model parameters for censored data and investigate influence diagnostics. The new location-scale regression model is modified for the possibility that long-term survivors may be presented in the data. Its usefulness is illustrated by means of two real data sets. (C) 2011 Elsevier B.V. All rights reserved.
Resumo:
The Primary Care Information System (SIAB) concentrates basic healthcare information from all different regions of Brazil. The information is collected by primary care teams on a paper-based procedure that degrades the quality of information provided to the healthcare authorities and slows down the process of decision making. To overcome these problems we propose a new data gathering application that uses a mobile device connected to a 3G network and a GPS to be used by the primary care teams for collecting the families' data. A prototype was developed in which a digital version of one SIAB form is made available at the mobile device. The prototype was tested in a basic healthcare unit located in a suburb of Sao Paulo. The results obtained so far have shown that the proposed process is a better alternative for data collecting at primary care, both in terms of data quality and lower deployment time to health care authorities.
Resumo:
Traditional supervised data classification considers only physical features (e. g., distance or similarity) of the input data. Here, this type of learning is called low level classification. On the other hand, the human (animal) brain performs both low and high orders of learning and it has facility in identifying patterns according to the semantic meaning of the input data. Data classification that considers not only physical attributes but also the pattern formation is, here, referred to as high level classification. In this paper, we propose a hybrid classification technique that combines both types of learning. The low level term can be implemented by any classification technique, while the high level term is realized by the extraction of features of the underlying network constructed from the input data. Thus, the former classifies the test instances by their physical features or class topologies, while the latter measures the compliance of the test instances to the pattern formation of the data. Our study shows that the proposed technique not only can realize classification according to the pattern formation, but also is able to improve the performance of traditional classification techniques. Furthermore, as the class configuration's complexity increases, such as the mixture among different classes, a larger portion of the high level term is required to get correct classification. This feature confirms that the high level classification has a special importance in complex situations of classification. Finally, we show how the proposed technique can be employed in a real-world application, where it is capable of identifying variations and distortions of handwritten digit images. As a result, it supplies an improvement in the overall pattern recognition rate.
Resumo:
Spatial data warehouses (SDWs) allow for spatial analysis together with analytical multidimensional queries over huge volumes of data. The challenge is to retrieve data related to ad hoc spatial query windows according to spatial predicates, avoiding the high cost of joining large tables. Therefore, mechanisms to provide efficient query processing over SDWs are essential. In this paper, we propose two efficient indices for SDW: the SB-index and the HSB-index. The proposed indices share the following characteristics. They enable multidimensional queries with spatial predicate for SDW and also support predefined spatial hierarchies. Furthermore, they compute the spatial predicate and transform it into a conventional one, which can be evaluated together with other conventional predicates by accessing a star-join Bitmap index. While the SB-index has a sequential data structure, the HSB-index uses a hierarchical data structure to enable spatial objects clustering and a specialized buffer-pool to decrease the number of disk accesses. The advantages of the SB-index and the HSB-index over the DBMS resources for SDW indexing (i.e. star-join computation and materialized views) were investigated through performance tests, which issued roll-up operations extended with containment and intersection range queries. The performance results showed that improvements ranged from 68% up to 99% over both the star-join computation and the materialized view. Furthermore, the proposed indices proved to be very compact, adding only less than 1% to the storage requirements. Therefore, both the SB-index and the HSB-index are excellent choices for SDW indexing. Choosing between the SB-index and the HSB-index mainly depends on the query selectivity of spatial predicates. While low query selectivity benefits the HSB-index, the SB-index provides better performance for higher query selectivity.
Resumo:
Dimensionality reduction is employed for visual data analysis as a way to obtaining reduced spaces for high dimensional data or to mapping data directly into 2D or 3D spaces. Although techniques have evolved to improve data segregation on reduced or visual spaces, they have limited capabilities for adjusting the results according to user's knowledge. In this paper, we propose a novel approach to handling both dimensionality reduction and visualization of high dimensional data, taking into account user's input. It employs Partial Least Squares (PLS), a statistical tool to perform retrieval of latent spaces focusing on the discriminability of the data. The method employs a training set for building a highly precise model that can then be applied to a much larger data set very effectively. The reduced data set can be exhibited using various existing visualization techniques. The training data is important to code user's knowledge into the loop. However, this work also devises a strategy for calculating PLS reduced spaces when no training data is available. The approach produces increasingly precise visual mappings as the user feeds back his or her knowledge and is capable of working with small and unbalanced training sets.
Resumo:
Current scientific applications have been producing large amounts of data. The processing, handling and analysis of such data require large-scale computing infrastructures such as clusters and grids. In this area, studies aim at improving the performance of data-intensive applications by optimizing data accesses. In order to achieve this goal, distributed storage systems have been considering techniques of data replication, migration, distribution, and access parallelism. However, the main drawback of those studies is that they do not take into account application behavior to perform data access optimization. This limitation motivated this paper which applies strategies to support the online prediction of application behavior in order to optimize data access operations on distributed systems, without requiring any information on past executions. In order to accomplish such a goal, this approach organizes application behaviors as time series and, then, analyzes and classifies those series according to their properties. By knowing properties, the approach selects modeling techniques to represent series and perform predictions, which are, later on, used to optimize data access operations. This new approach was implemented and evaluated using the OptorSim simulator, sponsored by the LHC-CERN project and widely employed by the scientific community. Experiments confirm this new approach reduces application execution time in about 50 percent, specially when handling large amounts of data.
Resumo:
Statistical methods have been widely employed to assess the capabilities of credit scoring classification models in order to reduce the risk of wrong decisions when granting credit facilities to clients. The predictive quality of a classification model can be evaluated based on measures such as sensitivity, specificity, predictive values, accuracy, correlation coefficients and information theoretical measures, such as relative entropy and mutual information. In this paper we analyze the performance of a naive logistic regression model (Hosmer & Lemeshow, 1989) and a logistic regression with state-dependent sample selection model (Cramer, 2004) applied to simulated data. Also, as a case study, the methodology is illustrated on a data set extracted from a Brazilian bank portfolio. Our simulation results so far revealed that there is no statistically significant difference in terms of predictive capacity between the naive logistic regression models and the logistic regression with state-dependent sample selection models. However, there is strong difference between the distributions of the estimated default probabilities from these two statistical modeling techniques, with the naive logistic regression models always underestimating such probabilities, particularly in the presence of balanced samples. (C) 2012 Elsevier Ltd. All rights reserved.
Resumo:
Abstract Background Several mathematical and statistical methods have been proposed in the last few years to analyze microarray data. Most of those methods involve complicated formulas, and software implementations that require advanced computer programming skills. Researchers from other areas may experience difficulties when they attempting to use those methods in their research. Here we present an user-friendly toolbox which allows large-scale gene expression analysis to be carried out by biomedical researchers with limited programming skills. Results Here, we introduce an user-friendly toolbox called GEDI (Gene Expression Data Interpreter), an extensible, open-source, and freely-available tool that we believe will be useful to a wide range of laboratories, and to researchers with no background in Mathematics and Computer Science, allowing them to analyze their own data by applying both classical and advanced approaches developed and recently published by Fujita et al. Conclusion GEDI is an integrated user-friendly viewer that combines the state of the art SVR, DVAR and SVAR algorithms, previously developed by us. It facilitates the application of SVR, DVAR and SVAR, further than the mathematical formulas present in the corresponding publications, and allows one to better understand the results by means of available visualizations. Both running the statistical methods and visualizing the results are carried out within the graphical user interface, rendering these algorithms accessible to the broad community of researchers in Molecular Biology.
Resumo:
Abstract Background Smallpox is a lethal disease that was endemic in many parts of the world until eradicated by massive immunization. Due to its lethality, there are serious concerns about its use as a bioweapon. Here we analyze publicly available microarray data to further understand survival of smallpox infected macaques, using systems biology approaches. Our goal is to improve the knowledge about the progression of this disease. Results We used KEGG pathways annotations to define groups of genes (or modules), and subsequently compared them to macaque survival times. This technique provided additional insights about the host response to this disease, such as increased expression of the cytokines and ECM receptors in the individuals with higher survival times. These results could indicate that these gene groups could influence an effective response from the host to smallpox. Conclusion Macaques with higher survival times clearly express some specific pathways previously unidentified using regular gene-by-gene approaches. Our work also shows how third party analysis of public datasets can be important to support new hypotheses to relevant biological problems.
Resumo:
The treatment of the Cerebral Palsy (CP) is considered as the “core problem” for the whole field of the pediatric rehabilitation. The reason why this pathology has such a primary role, can be ascribed to two main aspects. First of all CP is the form of disability most frequent in childhood (one new case per 500 birth alive, (1)), secondarily the functional recovery of the “spastic” child is, historically, the clinical field in which the majority of the therapeutic methods and techniques (physiotherapy, orthotic, pharmacologic, orthopedic-surgical, neurosurgical) were first applied and tested. The currently accepted definition of CP – Group of disorders of the development of movement and posture causing activity limitation (2) – is the result of a recent update by the World Health Organization to the language of the International Classification of Functioning Disability and Health, from the original proposal of Ingram – A persistent but not unchangeable disorder of posture and movement – dated 1955 (3). This definition considers CP as a permanent ailment, i.e. a “fixed” condition, that however can be modified both functionally and structurally by means of child spontaneous evolution and treatments carried out during childhood. The lesion that causes the palsy, happens in a structurally immature brain in the pre-, peri- or post-birth period (but only during the firsts months of life). The most frequent causes of CP are: prematurity, insufficient cerebral perfusion, arterial haemorrhage, venous infarction, hypoxia caused by various origin (for example from the ingestion of amniotic liquid), malnutrition, infection and maternal or fetal poisoning. In addition to these causes, traumas and malformations have to be included. The lesion, whether focused or spread over the nervous system, impairs the whole functioning of the Central Nervous System (CNS). As a consequence, they affect the construction of the adaptive functions (4), first of all posture control, locomotion and manipulation. The palsy itself does not vary over time, however it assumes an unavoidable “evolutionary” feature when during growth the child is requested to meet new and different needs through the construction of new and different functions. It is essential to consider that clinically CP is not only a direct expression of structural impairment, that is of etiology, pathogenesis and lesion timing, but it is mainly the manifestation of the path followed by the CNS to “re”-construct the adaptive functions “despite” the presence of the damage. “Palsy” is “the form of the function that is implemented by an individual whose CNS has been damaged in order to satisfy the demands coming from the environment” (4). Therefore it is only possible to establish general relations between lesion site, nature and size, and palsy and recovery processes. It is quite common to observe that children with very similar neuroimaging can have very different clinical manifestations of CP and, on the other hand, children with very similar motor behaviors can have completely different lesion histories. A very clear example of this is represented by hemiplegic forms, which show bilateral hemispheric lesions in a high percentage of cases. The first section of this thesis is aimed at guiding the interpretation of CP. First of all the issue of the detection of the palsy is treated from historical viewpoint. Consequently, an extended analysis of the current definition of CP, as internationally accepted, is provided. The definition is then outlined in terms of a space dimension and then of a time dimension, hence it is highlighted where this definition is unacceptably lacking. The last part of the first section further stresses the importance of shifting from the traditional concept of CP as a palsy of development (defect analysis) towards the notion of development of palsy, i.e., as the product of the relationship that the individual however tries to dynamically build with the surrounding environment (resource semeiotics) starting and growing from a different availability of resources, needs, dreams, rights and duties (4). In the scientific and clinic community no common classification system of CP has so far been universally accepted. Besides, no standard operative method or technique have been acknowledged to effectively assess the different disabilities and impairments exhibited by children with CP. CP is still “an artificial concept, comprising several causes and clinical syndromes that have been grouped together for a convenience of management” (5). The lack of standard and common protocols able to effectively diagnose the palsy, and as a consequence to establish specific treatments and prognosis, is mainly because of the difficulty to elevate this field to a level based on scientific evidence. A solution aimed at overcoming the current incomplete treatment of CP children is represented by the clinical systematic adoption of objective tools able to measure motor defects and movement impairments. A widespread application of reliable instruments and techniques able to objectively evaluate both the form of the palsy (diagnosis) and the efficacy of the treatments provided (prognosis), constitutes a valuable method able to validate care protocols, establish the efficacy of classification systems and assess the validity of definitions. Since the ‘80s, instruments specifically oriented to the analysis of the human movement have been advantageously designed and applied in the context of CP with the aim of measuring motor deficits and, especially, gait deviations. The gait analysis (GA) technique has been increasingly used over the years to assess, analyze, classify, and support the process of clinical decisions making, allowing for a complete investigation of gait with an increased temporal and spatial resolution. GA has provided a basis for improving the outcome of surgical and nonsurgical treatments and for introducing a new modus operandi in the identification of defects and functional adaptations to the musculoskeletal disorders. Historically, the first laboratories set up for gait analysis developed their own protocol (set of procedures for data collection and for data reduction) independently, according to performances of the technologies available at that time. In particular, the stereophotogrammetric systems mainly based on optoelectronic technology, soon became a gold-standard for motion analysis. They have been successfully applied especially for scientific purposes. Nowadays the optoelectronic systems have significantly improved their performances in term of spatial and temporal resolution, however many laboratories continue to use the protocols designed on the technology available in the ‘70s and now out-of-date. Furthermore, these protocols are not coherent both for the biomechanical models and for the adopted collection procedures. In spite of these differences, GA data are shared, exchanged and interpreted irrespectively to the adopted protocol without a full awareness to what extent these protocols are compatible and comparable with each other. Following the extraordinary advances in computer science and electronics, new systems for GA no longer based on optoelectronic technology, are now becoming available. They are the Inertial and Magnetic Measurement Systems (IMMSs), based on miniature MEMS (Microelectromechanical systems) inertial sensor technology. These systems are cost effective, wearable and fully portable motion analysis systems, these features gives IMMSs the potential to be used both outside specialized laboratories and to consecutive collect series of tens of gait cycles. The recognition and selection of the most representative gait cycle is then easier and more reliable especially in CP children, considering their relevant gait cycle variability. The second section of this thesis is focused on GA. In particular, it is firstly aimed at examining the differences among five most representative GA protocols in order to assess the state of the art with respect to the inter-protocol variability. The design of a new protocol is then proposed and presented with the aim of achieving gait analysis on CP children by means of IMMS. The protocol, named ‘Outwalk’, contains original and innovative solutions oriented at obtaining joint kinematic with calibration procedures extremely comfortable for the patients. The results of a first in-vivo validation of Outwalk on healthy subjects are then provided. In particular, this study was carried out by comparing Outwalk used in combination with an IMMS with respect to a reference protocol and an optoelectronic system. In order to set a more accurate and precise comparison of the systems and the protocols, ad hoc methods were designed and an original formulation of the statistical parameter coefficient of multiple correlation was developed and effectively applied. On the basis of the experimental design proposed for the validation on healthy subjects, a first assessment of Outwalk, together with an IMMS, was also carried out on CP children. The third section of this thesis is dedicated to the treatment of walking in CP children. Commonly prescribed treatments in addressing gait abnormalities in CP children include physical therapy, surgery (orthopedic and rhizotomy), and orthoses. The orthotic approach is conservative, being reversible, and widespread in many therapeutic regimes. Orthoses are used to improve the gait of children with CP, by preventing deformities, controlling joint position, and offering an effective lever for the ankle joint. Orthoses are prescribed for the additional aims of increasing walking speed, improving stability, preventing stumbling, and decreasing muscular fatigue. The ankle-foot orthosis (AFO), with a rigid ankle, are primarily designed to prevent equinus and other foot deformities with a positive effect also on more proximal joints. However, AFOs prevent the natural excursion of the tibio-tarsic joint during the second rocker, hence hampering the natural leaning progression of the whole body under the effect of the inertia (6). A new modular (submalleolar) astragalus-calcanear orthosis, named OMAC, has recently been proposed with the intention of substituting the prescription of AFOs in those CP children exhibiting a flat and valgus-pronated foot. The aim of this section is thus to present the mechanical and technical features of the OMAC by means of an accurate description of the device. In particular, the integral document of the deposited Italian patent, is provided. A preliminary validation of OMAC with respect to AFO is also reported as resulted from an experimental campaign on diplegic CP children, during a three month period, aimed at quantitatively assessing the benefit provided by the two orthoses on walking and at qualitatively evaluating the changes in the quality of life and motor abilities. As already stated, CP is universally considered as a persistent but not unchangeable disorder of posture and movement. Conversely to this definition, some clinicians (4) have recently pointed out that movement disorders may be primarily caused by the presence of perceptive disorders, where perception is not merely the acquisition of sensory information, but an active process aimed at guiding the execution of movements through the integration of sensory information properly representing the state of one’s body and of the environment. Children with perceptive impairments show an overall fear of moving and the onset of strongly unnatural walking schemes directly caused by the presence of perceptive system disorders. The fourth section of the thesis thus deals with accurately defining the perceptive impairment exhibited by diplegic CP children. A detailed description of the clinical signs revealing the presence of the perceptive impairment, and a classification scheme of the clinical aspects of perceptual disorders is provided. In the end, a functional reaching test is proposed as an instrumental test able to disclosure the perceptive impairment. References 1. Prevalence and characteristics of children with cerebral palsy in Europe. Dev Med Child Neurol. 2002 Set;44(9):633-640. 2. Bax M, Goldstein M, Rosenbaum P, Leviton A, Paneth N, Dan B, et al. Proposed definition and classification of cerebral palsy, April 2005. Dev Med Child Neurol. 2005 Ago;47(8):571-576. 3. Ingram TT. A study of cerebral palsy in the childhood population of Edinburgh. Arch. Dis. Child. 1955 Apr;30(150):85-98. 4. Ferrari A, Cioni G. The spastic forms of cerebral palsy : a guide to the assessment of adaptive functions. Milan: Springer; 2009. 5. Olney SJ, Wright MJ. Cerebral Palsy. Campbell S et al. Physical Therapy for Children. 2nd Ed. Philadelphia: Saunders. 2000;:533-570. 6. Desloovere K, Molenaers G, Van Gestel L, Huenaerts C, Van Campenhout A, Callewaert B, et al. How can push-off be preserved during use of an ankle foot orthosis in children with hemiplegia? A prospective controlled study. Gait Posture. 2006 Ott;24(2):142-151.
Resumo:
This thesis offers a practical and theoretical evaluations about gossip-epidemic algorithms, comparing those most common in the literature with new proposed algorithms and analyzing their behavior. Tests have been executed using one hundred graphs that has been randomly generated by Large Unstructured NEtwork Simulator (LUNES), a simulation software provided by Parallel and Distributed Simulation Research Group (PADS), of the Department of Computer Science, Università di Bologna and simulated using Advanced RTI System (ARTÌS), based on the High Level Architecture standard. Literatures algorithms have been analyzed and taken as base for new algorithms.