903 resultados para Data-driven Methods


Relevância:

40.00% 40.00%

Publicador:

Resumo:

This article deals with classification problems involving unequal probabilities in each class and discusses metrics to systems that use multilayer perceptrons neural networks (MLP) for the task of classifying new patterns. In addition we propose three new pruning methods that were compared to other seven existing methods in the literature for MLP networks. All pruning algorithms presented in this paper have been modified by the authors to do pruning of neurons, in order to produce fully connected MLP networks but being small in its intermediary layer. Experiments were carried out involving the E. coli unbalanced classification problem and ten pruning methods. The proposed methods had obtained good results, actually, better results than another pruning methods previously defined at the MLP neural network area. (C) 2014 Elsevier Ltd. All rights reserved.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In this work, different methods to estimate the value of thin film residual stresses using instrumented indentation data were analyzed. This study considered procedures proposed in the literature, as well as a modification on one of these methods and a new approach based on the effect of residual stress on the value of hardness calculated via the Oliver and Pharr method. The analysis of these methods was centered on an axisymmetric two-dimensional finite element model, which was developed to simulate instrumented indentation testing of thin ceramic films deposited onto hard steel substrates. Simulations were conducted varying the level of film residual stress, film strain hardening exponent, film yield strength, and film Poisson's ratio. Different ratios of maximum penetration depth h(max) over film thickness t were also considered, including h/t = 0.04, for which the contribution of the substrate in the mechanical response of the system is not significant. Residual stresses were then calculated following the procedures mentioned above and compared with the values used as input in the numerical simulations. In general, results indicate the difference that each method provides with respect to the input values depends on the conditions studied. The method by Suresh and Giannakopoulos consistently overestimated the values when stresses were compressive. The method provided by Wang et al. has shown less dependence on h/t than the others.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Background: The CUPID (Cultural and Psychosocial Influences on Disability) study was established to explore the hypothesis that common musculoskeletal disorders (MSDs) and associated disability are importantly influenced by culturally determined health beliefs and expectations. This paper describes the methods of data collection and various characteristics of the study sample. Methods/Principal Findings: A standardised questionnaire covering musculoskeletal symptoms, disability and potential risk factors, was used to collect information from 47 samples of nurses, office workers, and other (mostly manual) workers in 18 countries from six continents. In addition, local investigators provided data on economic aspects of employment for each occupational group. Participation exceeded 80% in 33 of the 47 occupational groups, and after pre-specified exclusions, analysis was based on 12,426 subjects (92 to 1018 per occupational group). As expected, there was high usage of computer keyboards by office workers, while nurses had the highest prevalence of heavy manual lifting in all but one country. There was substantial heterogeneity between occupational groups in economic and psychosocial aspects of work; three-to fivefold variation in awareness of someone outside work with musculoskeletal pain; and more than ten-fold variation in the prevalence of adverse health beliefs about back and arm pain, and in awareness of terms such as "repetitive strain injury" (RSI). Conclusions/Significance: The large differences in psychosocial risk factors (including knowledge and beliefs about MSDs) between occupational groups should allow the study hypothesis to be addressed effectively.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In this paper we discuss the detection of glucose and triglycerides using information visualization methods to process impedance spectroscopy data. The sensing units contained either lipase or glucose oxidase immobilized in layer-by-layer (LbL) films deposited onto interdigitated electrodes. The optimization consisted in identifying which part of the electrical response and combination of sensing units yielded the best distinguishing ability. It is shown that complete separation can be obtained for a range of concentrations of glucose and triglyceride when the interactive document map (IDMAP) technique is used to project the data into a two-dimensional plot. Most importantly, the optimization procedure can be extended to other types of biosensors, thus increasing the versatility of analysis provided by tailored molecular architectures exploited with various detection principles. (C) 2012 Elsevier B.V. All rights reserved.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Abstract Background With the development of DNA hybridization microarray technologies, nowadays it is possible to simultaneously assess the expression levels of thousands to tens of thousands of genes. Quantitative comparison of microarrays uncovers distinct patterns of gene expression, which define different cellular phenotypes or cellular responses to drugs. Due to technical biases, normalization of the intensity levels is a pre-requisite to performing further statistical analyses. Therefore, choosing a suitable approach for normalization can be critical, deserving judicious consideration. Results Here, we considered three commonly used normalization approaches, namely: Loess, Splines and Wavelets, and two non-parametric regression methods, which have yet to be used for normalization, namely, the Kernel smoothing and Support Vector Regression. The results obtained were compared using artificial microarray data and benchmark studies. The results indicate that the Support Vector Regression is the most robust to outliers and that Kernel is the worst normalization technique, while no practical differences were observed between Loess, Splines and Wavelets. Conclusion In face of our results, the Support Vector Regression is favored for microarray normalization due to its superiority when compared to the other methods for its robustness in estimating the normalization curve.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

With the increasing production of information from e-government initiatives, there is also the need to transform a large volume of unstructured data into useful information for society. All this information should be easily accessible and made available in a meaningful and effective way in order to achieve semantic interoperability in electronic government services, which is a challenge to be pursued by governments round the world. Our aim is to discuss the context of e-Government Big Data and to present a framework to promote semantic interoperability through automatic generation of ontologies from unstructured information found in the Internet. We propose the use of fuzzy mechanisms to deal with natural language terms and present some related works found in this area. The results achieved in this study are based on the architectural definition and major components and requirements in order to compose the proposed framework. With this, it is possible to take advantage of the large volume of information generated from e-Government initiatives and use it to benefit society.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

ÈN]A trans-oceanic section at 24.5°N in the North Atlantic has been sampled at a decadal frequency. This work demonstrates that the wind-driven component of the Meridional Overturning Circulation (MOC) may be monitored using autonomous profiling floats deployed in the eastern North Atlantic Subtropical Gyre. More than 500 CTD vertical profiles from the surface to 2000 m depth, spanning one year (from April 2002 to March 2003), are used to compute the geostrophic transport stream function at 24.5°N. The baroclinic transport obtained from the autonomous profiling floats is not statistically different than that from three hydrographic cruises carried out in 1957, 1981 and 1992. A good agreement is found between the geostrophic transport stream function and the transport derived from the wind field through the Sverdrup relation.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Ontology design and population -core aspects of semantic technologies- re- cently have become fields of great interest due to the increasing need of domain-specific knowledge bases that can boost the use of Semantic Web. For building such knowledge resources, the state of the art tools for ontology design require a lot of human work. Producing meaningful schemas and populating them with domain-specific data is in fact a very difficult and time-consuming task. Even more if the task consists in modelling knowledge at a web scale. The primary aim of this work is to investigate a novel and flexible method- ology for automatically learning ontology from textual data, lightening the human workload required for conceptualizing domain-specific knowledge and populating an extracted schema with real data, speeding up the whole ontology production process. Here computational linguistics plays a fundamental role, from automati- cally identifying facts from natural language and extracting frame of relations among recognized entities, to producing linked data with which extending existing knowledge bases or creating new ones. In the state of the art, automatic ontology learning systems are mainly based on plain-pipelined linguistics classifiers performing tasks such as Named Entity recognition, Entity resolution, Taxonomy and Relation extraction [11]. These approaches present some weaknesses, specially in capturing struc- tures through which the meaning of complex concepts is expressed [24]. Humans, in fact, tend to organize knowledge in well-defined patterns, which include participant entities and meaningful relations linking entities with each other. In literature, these structures have been called Semantic Frames by Fill- 6 Introduction more [20], or more recently as Knowledge Patterns [23]. Some NLP studies has recently shown the possibility of performing more accurate deep parsing with the ability of logically understanding the structure of discourse [7]. In this work, some of these technologies have been investigated and em- ployed to produce accurate ontology schemas. The long-term goal is to collect large amounts of semantically structured information from the web of crowds, through an automated process, in order to identify and investigate the cognitive patterns used by human to organize their knowledge.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In recent years, the use of Reverse Engineering systems has got a considerable interest for a wide number of applications. Therefore, many research activities are focused on accuracy and precision of the acquired data and post processing phase improvements. In this context, this PhD Thesis deals with the definition of two novel methods for data post processing and data fusion between physical and geometrical information. In particular a technique has been defined for error definition in 3D points’ coordinates acquired by an optical triangulation laser scanner, with the aim to identify adequate correction arrays to apply under different acquisition parameters and operative conditions. Systematic error in data acquired is thus compensated, in order to increase accuracy value. Moreover, the definition of a 3D thermogram is examined. Object geometrical information and its thermal properties, coming from a thermographic inspection, are combined in order to have a temperature value for each recognizable point. Data acquired by an optical triangulation laser scanner are also used to normalize temperature values and make thermal data independent from thermal-camera point of view.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The purpose of my PhD thesis has been to face the issue of retrieving a three dimensional attenuation model in volcanic areas. To this purpose, I first elaborated a robust strategy for the analysis of seismic data. This was done by performing several synthetic tests to assess the applicability of spectral ratio method to our purposes. The results of the tests allowed us to conclude that: 1) spectral ratio method gives reliable differential attenuation (dt*) measurements in smooth velocity models; 2) short signal time window has to be chosen to perform spectral analysis; 3) the frequency range over which to compute spectral ratios greatly affects dt* measurements. Furthermore, a refined approach for the application of spectral ratio method has been developed and tested. Through this procedure, the effects caused by heterogeneities of propagation medium on the seismic signals may be removed. The tested data analysis technique was applied to the real active seismic SERAPIS database. It provided a dataset of dt* measurements which was used to obtain a three dimensional attenuation model of the shallowest part of Campi Flegrei caldera. Then, a linearized, iterative, damped attenuation tomography technique has been tested and applied to the selected dataset. The tomography, with a resolution of 0.5 km in the horizontal directions and 0.25 km in the vertical direction, allowed to image important features in the off-shore part of Campi Flegrei caldera. High QP bodies are immersed in a high attenuation body (Qp=30). The latter is well correlated with low Vp and high Vp/Vs values and it is interpreted as a saturated marine and volcanic sediments layer. High Qp anomalies, instead, are interpreted as the effects either of cooled lava bodies or of a CO2 reservoir. A pseudo-circular high Qp anomaly was detected and interpreted as the buried rim of NYT caldera.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In the past two decades the work of a growing portion of researchers in robotics focused on a particular group of machines, belonging to the family of parallel manipulators: the cable robots. Although these robots share several theoretical elements with the better known parallel robots, they still present completely (or partly) unsolved issues. In particular, the study of their kinematic, already a difficult subject for conventional parallel manipulators, is further complicated by the non-linear nature of cables, which can exert only efforts of pure traction. The work presented in this thesis therefore focuses on the study of the kinematics of these robots and on the development of numerical techniques able to address some of the problems related to it. Most of the work is focused on the development of an interval-analysis based procedure for the solution of the direct geometric problem of a generic cable manipulator. This technique, as well as allowing for a rapid solution of the problem, also guarantees the results obtained against rounding and elimination errors and can take into account any uncertainties in the model of the problem. The developed code has been tested with the help of a small manipulator whose realization is described in this dissertation together with the auxiliary work done during its design and simulation phases.

Relevância:

40.00% 40.00%

Publicador:

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The accuracy of medicine use information was compared for a telephone interview and mail questionnaire, using an in-home medicine check as the standard of assessment The validity of medicine use information varied by data source, level of specificity of data, and respondent characteristics. The mail questionnaire was the more valid source of overall medicine use information. Implications for both service providers and researchers are provided.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This article gives an overview over the methods used in the low--level analysis of gene expression data generated using DNA microarrays. This type of experiment allows to determine relative levels of nucleic acid abundance in a set of tissues or cell populations for thousands of transcripts or loci simultaneously. Careful statistical design and analysis are essential to improve the efficiency and reliability of microarray experiments throughout the data acquisition and analysis process. This includes the design of probes, the experimental design, the image analysis of microarray scanned images, the normalization of fluorescence intensities, the assessment of the quality of microarray data and incorporation of quality information in subsequent analyses, the combination of information across arrays and across sets of experiments, the discovery and recognition of patterns in expression at the single gene and multiple gene levels, and the assessment of significance of these findings, considering the fact that there is a lot of noise and thus random features in the data. For all of these components, access to a flexible and efficient statistical computing environment is an essential aspect.