965 results for Data Repository


Relevance:

30.00%

Publisher:

Abstract:

The present study aimed to compare the structures of social representations across two data collection procedures: internet forms distributed on the WWW and conventional paper-and-pencil questionnaires. Overall, 893 individuals participated in the research, 58% of whom were female. A total of 217 questionnaires about the social representation of football (soccer) and 218 about the representation of aging were answered by Brazilian university students in classrooms. Electronic versions of the same instrument were distributed through an internet forum linked to the same university. There were 238 answers for the football questionnaire and 230 for the aging one. The instrument asked participants to indicate five words or expressions related to one of the social objects. Sample characteristics and structural analyses were carried out separately for the two data collection procedures. Data indicated that internet-based research allows for higher sample diversity, but it is essential to adopt measures that select only the desired participants. Results also pointed out the need to take into account the nature of the social object to be investigated through internet research on representations, in order to avoid self-selection effects, which can bias results, as seems to have happened with the football social object.

Relevance:

30.00%

Publisher:

Abstract:

This document is developed on the basis of the guidelines defined in the PMBOK, which will serve as the foundation for implementing the corporate strategy through the MultiAccess Project, whose initial chapter sets out the Initiation processes.

Relevance:

30.00%

Publisher:

Abstract:

Because the topic is still developing, the information available on it is somewhat limited; little literature has been produced so far, especially in countries where Habeas Data has a shorter history. That is why our research is a tool

Relevance:

30.00%

Publisher:

Abstract:

This document examines the time-series properties of the wage differentials that arise between the public and private sectors in Colombia during the sample period 1984 to 2005. We find conflicting results in unit-root and stationarity tests when looking at wage differentials at an aggregate level (such as for men, women or both). However, when we analyse wage differentials at higher levels of disaggregation, treat them jointly as a panel of data, and allow for the presence of potential cross-section dependence, there is more supportive evidence for the view that wage differentials are stationary. This implies that although wage differentials do exist, they have not been consistently increasing (or decreasing) over time.

Relevance:

30.00%

Publisher:

Abstract:

This paper applies stationarity tests to examine evidence of market integration for a relatively large sample of food products in Colombia. We find little support for market integration when using the univariate KPSS tests for stationarity. However, within a panel context and after allowing for cross-sectional dependence, the Hadri tests provide much more evidence supporting the view that food markets are integrated or, in other words, that the law of one price holds for most products.

Relevance:

30.00%

Publisher:

Abstract:

Measuring inequality of opportunity with the PISA databases involves several limitations: (i) the sample represents only a limited fraction of the cohorts of 15-year-olds in developing countries, and (ii) these fractions are not uniform across countries or across periods. This casts doubt on the reliability of these measurements when they are used for international comparisons: greater equity may be the result of a more restricted and more homogeneous sample. Unlike previous approaches based on reconstructing the samples, the approach of this document is to provide a two-dimensional index that includes achievement and access as its dimensions. Several aggregation methods are used, and considerable changes are observed in the rankings of (in)equity of opportunity between considering achievement alone and considering both dimensions in the PISA 2006/2009 tests. Finally, a generalization of the approach is proposed that allows additional dimensions and other weights in the aggregation.
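
As a rough illustration of the aggregation step this abstract describes, the sketch below combines an achievement score and an access rate into a single index, once with a weighted arithmetic mean and once with a weighted geometric mean. The function names, the equal weights, and the three country values are hypothetical; they are not taken from the document or from PISA data, and the document's own aggregation methods may differ.

```python
import math

def arithmetic_index(achievement, access, w_achievement=0.5):
    """Weighted arithmetic mean of two dimensions scaled to [0, 1]."""
    return w_achievement * achievement + (1.0 - w_achievement) * access

def geometric_index(achievement, access, w_achievement=0.5):
    """Weighted geometric mean; penalizes imbalance between the dimensions."""
    return math.pow(achievement, w_achievement) * math.pow(access, 1.0 - w_achievement)

def rank(countries, score):
    """Return country names ordered from highest to lowest score."""
    return sorted(countries, key=lambda name: score(*countries[name]), reverse=True)

# Hypothetical (achievement equity, access) pairs, both in [0, 1].
countries = {
    "A": (0.90, 0.40),
    "B": (0.70, 0.80),
    "C": (0.80, 0.60),
}

# Ranking when only the achievement dimension is observed.
achievement_only = sorted(countries, key=lambda n: countries[n][0], reverse=True)

# Rankings when both dimensions are aggregated.
both_arithmetic = rank(countries, arithmetic_index)
both_geometric = rank(countries, geometric_index)

print(achievement_only, both_arithmetic, both_geometric)
```

With these made-up values, the country that ranks first on achievement alone drops to last once access is included, which is the kind of ranking change the abstract reports.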

Relevance:

30.00%

Publisher:

Abstract:

There has been a clear lack of common data exchange semantics for inter-organisational workflow management systems, where research has mainly focused on technical issues rather than language constructs. This paper presents the neutral data exchange semantics required for workflow integration within the AXAEDIS framework and presents the mechanism for object discovery from the object repository where little or no knowledge about the object is available. The paper also presents a workflow-independent integration architecture with the AXAEDIS framework.

Relevance:

30.00%

Publisher:

Abstract:

With the increasing awareness of protein folding disorders, the explosion of genomic information, and the need for efficient ways to predict protein structure, protein folding and unfolding has become a central issue in molecular sciences research. Molecular dynamics computer simulations are increasingly employed to understand the folding and unfolding of proteins. Running protein unfolding simulations is computationally expensive and finding ways to enhance performance is a grid issue on its own. However, more and more groups run such simulations and generate a myriad of data, which raises new challenges in managing and analyzing these data. Because of the vast range of proteins researchers want to study and simulate, the computational effort needed to generate data, the large data volumes involved, and the different types of analyses scientists need to perform, it is desirable to provide a public repository allowing researchers to pool and share protein unfolding data. This paper describes efforts to provide a grid-enabled data warehouse for protein unfolding data. We outline the challenge and present first results in the design and implementation of the data warehouse.

Relevance:

30.00%

Publisher:

Abstract:

The P-found protein folding and unfolding simulation repository is designed to allow scientists to perform data mining and other analyses across large, distributed simulation data sets. There are two storage components in P-found: a primary repository of simulation data that is used to populate the second component, and a data warehouse that contains important molecular properties. These properties may be used for data mining studies. Here we demonstrate how grid technologies can support multiple, distributed P-found installations. In particular, we look at two aspects: firstly, how grid data management technologies can be used to access the distributed data warehouses; and secondly, how the grid can be used to transfer analysis programs to the primary repositories — this is an important and challenging aspect of P-found, due to the large data volumes involved and the desire of scientists to maintain control of their own data. The grid technologies we are developing with the P-found system will allow new large data sets of protein folding simulations to be accessed and analysed in novel ways, with significant potential for enabling scientific discovery.

Relevance:

30.00%

Publisher:

Abstract:

The purpose of instance selection is to identify which instances (examples, patterns) in a large dataset should be selected as representatives of the entire dataset, without significant loss of information. When a machine learning method is applied to the reduced dataset, the accuracy of the model should not be significantly worse than if the same method were applied to the entire dataset. The reducibility of any dataset, and hence the success of instance selection methods, surely depends on the characteristics of the dataset, as well as the machine learning method. This paper adopts a meta-learning approach, via an empirical study of 112 classification datasets from the UCI Repository [1], to explore the relationship between data characteristics, machine learning methods, and the success of instance selection methods.

Relevance:

30.00%

Publisher:

Abstract:

Healthcare plays an important role in promoting the general health and well-being of people around the world. The difficulty in healthcare data classification arises from the uncertainty and the high-dimensional nature of the medical data collected. This paper proposes an integration of the fuzzy standard additive model (SAM) with a genetic algorithm (GA), called GSAM, to deal with uncertainty and computational challenges. The GSAM learning process comprises three consecutive steps: rule initialization by unsupervised learning using adaptive vector quantization clustering, evolutionary rule optimization by GA, and parameter tuning by gradient descent supervised learning. Wavelet transformation is employed to extract discriminative features for high-dimensional datasets. GSAM becomes highly capable when deployed with a small number of wavelet features, as its computational burden is remarkably reduced. The proposed method is evaluated using two frequently used medical datasets: the Wisconsin breast cancer and Cleveland heart disease datasets from the UCI Repository for machine learning. Experiments are organized with five-fold cross-validation, and the performance of classification techniques is measured by a number of important metrics: accuracy, F-measure, mutual information and area under the receiver operating characteristic curve. Results demonstrate the superiority of GSAM compared to other machine learning methods including probabilistic neural network, support vector machine, fuzzy ARTMAP, and adaptive neuro-fuzzy inference system. The proposed approach is thus helpful as a decision support system for medical practitioners in healthcare practice.
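
The five-fold cross-validation protocol mentioned in this abstract can be sketched as follows. This is a minimal, generic illustration in plain Python; the function name `k_fold_indices`, the seed, and the sample count are assumptions made for the example, and the paper's actual splitting procedure (for instance, whether it stratifies by class) is not stated in the abstract.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducibility
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# Hypothetical dataset size, chosen so the folds are not all equal.
splits = list(k_fold_indices(23, k=5))
for train, test in splits:
    print(len(train), len(test))
```

Each sample appears in exactly one test fold, so every sample is used for evaluation once and for training in the remaining four rounds.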

Relevance:

30.00%

Publisher:

Abstract:

In this paper, a hybrid model consisting of the fuzzy ARTMAP (FAM) neural network and the classification and regression tree (CART) is formulated. FAM is useful for tackling the stability–plasticity dilemma pertaining to data-based learning systems, while CART is useful for depicting its learned knowledge explicitly in a tree structure. By combining the benefits of both models, FAM–CART is capable of learning data samples stably and, at the same time, explaining its predictions with a set of decision rules. In other words, FAM–CART possesses two important properties of an intelligent system, i.e., learning in a stable manner (by overcoming the stability–plasticity dilemma) and extracting useful explanatory rules (by overcoming the opaqueness issue). To evaluate the usefulness of FAM–CART, six benchmark medical data sets from the UCI repository of machine learning and a real-world medical data classification problem are used. For performance comparison, a number of performance metrics, including accuracy, specificity, sensitivity, and the area under the receiver operating characteristic curve, are computed. The results are quantified with statistical indicators and compared with those reported in the literature. The outcomes indicate that FAM–CART is effective for undertaking data classification tasks. In addition to producing good results, it provides justifications of the predictions in the form of a decision tree so that domain users can easily understand the predictions, making it a useful decision support tool.
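
For readers unfamiliar with the metrics named in this abstract, the sketch below computes accuracy, sensitivity, and specificity from binary labels using their standard confusion-matrix definitions. The function names and the label vectors are made up for illustration; they are not results from the paper.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn

def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity (recall) and specificity."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }

# Hypothetical labels: 1 = disease present, 0 = disease absent.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Sensitivity and specificity are reported separately because, in medical data, a missed positive case and a false alarm usually carry different costs that accuracy alone hides.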

Relevance:

30.00%

Publisher:

Abstract:

This paper introduces an automated medical data classification method using wavelet transformation (WT) and an interval type-2 fuzzy logic system (IT2FLS). Wavelet coefficients, which serve as inputs to the IT2FLS, are a compact form of the original data, but they exhibit highly discriminative features. The integration of WT and IT2FLS aims to cope with both the high-dimensional data challenge and uncertainty. IT2FLS utilizes a hybrid learning process comprising unsupervised structure learning by fuzzy c-means (FCM) clustering and supervised parameter tuning by a genetic algorithm. This learning process is computationally expensive, especially when employed with high-dimensional data. The application of WT therefore reduces the computational burden and enhances the performance of IT2FLS. Experiments are implemented with two frequently used medical datasets from the UCI Repository for machine learning: the Wisconsin breast cancer and Cleveland heart disease datasets. A number of important metrics are computed to measure the performance of the classification: accuracy, sensitivity, specificity and area under the receiver operating characteristic curve. Results demonstrate a significant dominance of the wavelet-IT2FLS approach compared to other machine learning methods including probabilistic neural network, support vector machine, fuzzy ARTMAP, and adaptive neuro-fuzzy inference system. The proposed approach is thus useful as a decision support system for clinicians and practitioners in medical practice. © 2015 Elsevier B.V. All rights reserved.

Relevance:

30.00%

Publisher:

Abstract:

Graduate Program in Information Science - FFC

Relevance:

30.00%

Publisher:

Abstract:

Graduate Program in Information Science - FFC