905 results for Three-way data
Abstract:
As a sequel to a paper that dealt with the analysis of two-way quantitative data in large germplasm collections, this paper presents analytical methods appropriate for two-way data matrices consisting of mixed data types, namely ordered multicategory and quantitative data. While various pattern analysis techniques have been identified as suitable for the analysis of the mixed data types which occur in germplasm collections, the clustering and ordination methods used often cannot deal explicitly with the computational consequences of large data sets (i.e. greater than 5000 accessions) with incomplete information. However, it is shown that the ordination technique of principal component analysis and the mixture maximum likelihood method of clustering can be employed to achieve such analyses. Germplasm evaluation data for 11436 accessions of groundnut (Arachis hypogaea L.) from the International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Andhra Pradesh, India were examined. Data for nine quantitative descriptors measured in the post-rainy season and five ordered multicategory descriptors were used. Pattern analysis results generally indicated that the accessions could be distinguished into four regions along the continuum of growth habit (or plant erectness). Interpretation of accession membership in these regions was found to be consistent with taxonomic information, such as subspecies. Each growth habit region contained accessions from three of the most common groundnut botanical varieties, which implies that within each of the habit types there is the full range of expression for the other descriptors used in the analysis. Using these types of insights, the patterns of variability in germplasm collections can provide scientists with valuable information for their plant improvement programs.
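As an illustration of how such an analysis can be set up, the sketch below combines principal component ordination with mixture-model (maximum likelihood) clustering on a mixed-type accession table using scikit-learn. The file name, descriptor columns and the integer scoring of the ordered categories are assumptions for illustration, not the paper's actual pipeline.

```python
# Minimal sketch: PCA ordination followed by mixture-model clustering on a
# mixed-type germplasm table. File name and column names are hypothetical.
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

acc = pd.read_csv("groundnut_descriptors.csv")        # one row per accession
quant = ["pod_length", "pod_width", "seed_weight"]    # quantitative descriptors (assumed)
ordinal = ["growth_habit", "branching_pattern"]       # ordered multicategory descriptors (assumed)

X = acc[quant + ordinal].dropna().copy()              # drop incomplete rows for simplicity
for col in ordinal:
    # score ordered categories as consecutive integers (assumes category order is meaningful)
    X[col] = X[col].astype("category").cat.codes
X = StandardScaler().fit_transform(X)

scores = PCA(n_components=3).fit_transform(X)         # ordination onto leading components
groups = GaussianMixture(n_components=4, random_state=0).fit_predict(scores)  # ML mixture clusters
```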
Abstract:
Establishing relationships among taxa is an essential step in the process of cataloguing and evaluating the material conserved in a germplasm bank. Different evaluation methods exist depending on the type of characters studied. When the characters are recorded repeatedly over time and in different environments, the intrinsically genetic variability among taxa must be separated from that due to the environment and, further, from any variability due to the genotype × environment interaction, before purely phylogenetic relationships can be established. This paper compares the feasibility of two statistical analysis strategies for solving this problem. The first is the traditional analysis, in which a Principal Component Analysis is performed on the characters averaged over the different environments; the second comprises more complex methods in which each datum arises from three modes (individuals, variables and environmental conditions), such as Multiple Factor Analysis and Generalized Procrustes Analysis. Although the resulting configurations were all equivalent, the three-way methods allow the genotype × environment interaction to be interpreted.
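To make the two strategies concrete, here is a small numpy/scikit-learn sketch that contrasts a classical PCA on environment-averaged traits with a simple Multiple-Factor-Analysis-style alternative that keeps the third (environment) mode. The array shapes, the synthetic data and the block weighting are illustrative assumptions, not the analyses reported in the paper.

```python
# Sketch: two ways to analyse a genotypes x traits x environments array.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8, 3))            # 50 genotypes, 8 traits, 3 environments (synthetic)

# Strategy 1: classical PCA on trait values averaged over environments.
scores_avg = PCA(n_components=2).fit_transform(X.mean(axis=2))

# Strategy 2: an MFA-style view that keeps the environment mode: standardise each
# environment slice, weight it by the inverse of its first singular value, and
# concatenate the slices column-wise before a global PCA.
blocks = []
for k in range(X.shape[2]):
    Z = (X[:, :, k] - X[:, :, k].mean(axis=0)) / X[:, :, k].std(axis=0)
    blocks.append(Z / np.linalg.svd(Z, compute_uv=False)[0])
scores_3way = PCA(n_components=2).fit_transform(np.hstack(blocks))
```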
Abstract:
We consider the wireless two-way relay channel, in which two-way data transfer takes place between the end nodes with the help of a relay. For the Denoise-And-Forward (DNF) protocol, Koike-Akino et al. showed that adaptively changing the network coding map used at the relay greatly reduces the impact of Multiple Access Interference at the relay. The harmful effect of deep channel fade conditions can be effectively mitigated by a proper choice of these network coding maps at the relay. Alternatively, in this paper we propose a Distributed Space Time Coding (DSTC) scheme, which effectively removes most of the deep fade channel conditions at the transmitting nodes themselves, without any CSIT and without any need to adaptively change the network coding map used at the relay. It is shown that the deep fades occur when the channel fade coefficient vector falls in one of a finite number of vector subspaces, which are referred to as the singular fade subspaces. A DSTC design criterion, referred to as the singularity minimization criterion, under which the number of such vector subspaces is minimized, is obtained. A criterion to maximize the coding gain of the DSTC is also obtained. Explicit low-decoding-complexity DSTC designs which satisfy the singularity minimization criterion and maximize the coding gain for QAM and PSK signal sets are provided. Simulation results show that at high Signal to Noise Ratio the DSTC scheme provides large gains compared to the conventional Exclusive OR network code and performs better than the adaptive network coding scheme.
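As a toy reference for the baseline that the proposed scheme is compared against, the sketch below shows the conventional Exclusive-OR network coding map at the relay of a two-way relay channel, with made-up bit streams and no channel model; it does not implement the proposed DSTC or the adaptive maps.

```python
# Toy illustration of XOR network coding in a two-way relay channel.
import numpy as np

rng = np.random.default_rng(1)
bits_a = rng.integers(0, 2, size=10)     # bits sent by end node A
bits_b = rng.integers(0, 2, size=10)     # bits sent by end node B

relay_broadcast = bits_a ^ bits_b        # relay broadcasts the XOR network code

# Each end node recovers the other's bits by XOR-ing the broadcast with its own bits.
recovered_at_a = relay_broadcast ^ bits_a
assert np.array_equal(recovered_at_a, bits_b)
```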
Abstract:
Conceptualisations of disability that emphasise the contextual and cultural nature of disability and the embodiment of these within a national system of data collection present a number of challenges especially where this process is devolved to schools. The requirement for measures based on contextual and subjective experiences gives rise to particular difficulties in achieving parity in the way data is analysed and reported. This paper presents an account of the testing of a tool intended for use by schools as they collect data from parents to identify children who meet the criteria of disability established in Disability Discrimination Acts (DDAs). Data were validated through interviews with parents and teachers and observations of children and highlighted the pivotal role of the criterion of impact. The findings are set in the context of schools meeting their legal duties to identify disabled children and their support needs in a way that captures the complexity of disabled children’s school lives and provides useful and useable data.
Abstract:
© 2014 Cises. This work is distributed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license (CC BY-NC-ND 4.0).
Abstract:
This work identifies the limitations of n-way data analysis techniques in multidimensional stream data, such as Internet chat room communications data, and establishes a link between data collection and the performance of these techniques. Its contributions are twofold. First, it extends data analysis to multiple dimensions by constructing n-way data arrays known as high order tensors. Chat room tensors are generated by a simulator which collects and models actual communication data. The accuracy of the model is determined by the Kolmogorov-Smirnov goodness-of-fit test, which compares the simulation data with the observed (real) data. Second, a detailed computational comparison is performed to test several data analysis techniques, including SVD [1], and multi-way techniques including Tucker1, Tucker3 [2], and Parafac [3].
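To make the two building blocks concrete, the sketch below runs a two-sample Kolmogorov-Smirnov check of simulated against observed data and then computes Tucker3 and Parafac decompositions of a chat-room-like tensor. The data are synthetic stand-ins and tensorly is an assumed dependency; the paper's own tooling is not specified here.

```python
# Sketch: KS goodness-of-fit check plus Tucker3 and Parafac decompositions.
import numpy as np
from scipy.stats import ks_2samp
import tensorly as tl
from tensorly.decomposition import tucker, parafac

rng = np.random.default_rng(2)
observed = rng.exponential(scale=4.0, size=500)     # stand-in for real inter-message times
simulated = rng.exponential(scale=4.2, size=500)    # stand-in for simulator output
stat, pvalue = ks_2samp(observed, simulated)        # large p-value: fit not rejected

T = tl.tensor(rng.poisson(2.0, size=(30, 40, 24)).astype(float))  # users x keywords x time
core, tucker_factors = tucker(T, rank=[3, 3, 3])    # Tucker3 decomposition
cp = parafac(T, rank=3)                             # Parafac (CP): holds weights and factor matrices
```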
Abstract:
This work investigates the accuracy and efficiency tradeoffs between centralized and collective (distributed) algorithms for (i) sampling, and (ii) n-way data analysis techniques in multidimensional stream data, such as Internet chatroom communications. Its contributions are threefold. First, we use the Kolmogorov-Smirnov goodness-of-fit test to show that the statistical differences between real data obtained by collective sampling in the time dimension from multiple servers and data obtained from a single server are insignificant. Second, we show, using the real data, that collective analysis of 3-way data arrays (users x keywords x time), known as high order tensors, is more efficient than centralized algorithms with respect to both space and computational cost. Furthermore, we show that this gain is obtained without loss of accuracy. Third, we examine the sensitivity of collective construction and analysis of high order data tensors to the choice of server selection and sampling window size. We construct 4-way tensors (users x keywords x time x servers) and analyze them to show the impact of server and window size selections on the results.
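Before any decomposition, the 3-way arrays described above have to be assembled from raw chat records. The snippet below is a minimal, assumed construction of a (users x keywords x time) count tensor from hypothetical (user, keyword, hour) tuples; the 4-way (users x keywords x time x servers) case only adds a server index to each record and one more axis to the array.

```python
# Sketch: build a (users x keywords x time) count tensor from chat records.
import numpy as np

messages = [                              # hypothetical (user_id, keyword_id, hour) tuples
    (0, 3, 0), (1, 3, 0), (0, 5, 1), (2, 1, 1), (1, 5, 2),
]
n_users, n_keywords, n_hours = 3, 8, 3

T = np.zeros((n_users, n_keywords, n_hours))
for u, k, t in messages:
    T[u, k, t] += 1                       # accumulate message counts per (user, keyword, hour) cell
```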
Abstract:
Fast content addressable data access mechanisms have compelling applications in today's systems. Many of these exploit the powerful wildcard matching capabilities provided by ternary content addressable memories (TCAMs). For example, TCAM based implementations of important algorithms in data mining have been developed in recent years; these achieve an order of magnitude speedup over prevalent techniques. However, large hardware TCAMs are still prohibitively expensive in terms of power consumption and cost per bit. This has been a barrier to extending their exploitation beyond niche and special purpose systems. We propose an approach to overcome this barrier by extending the traditional virtual memory hierarchy to scale up the user visible capacity of TCAMs while mitigating the power consumption overhead. By exploiting the notion of content locality (as opposed to spatial locality), we devise a novel combination of software and hardware techniques to provide an abstraction of a large virtual ternary content addressable space. In the long run, such abstractions enable applications to disassociate considerations of spatial locality and contiguity from the way data is referenced. If successful, ideas for making content addressability a first class abstraction in computing systems can open up a radical shift in the way applications are optimized for memory locality, just as storage class memories are soon expected to shift the way in which applications are typically optimized for disk access locality.
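As a point of reference for the wildcard matching the abstract builds on, the following sketch models ternary matching in plain Python: each entry stores a value and a care-mask, and a lookup returns the first entry whose cared-about bits equal the key's bits. It is an illustrative software model only, not the proposed hardware/software design.

```python
# Minimal software model of ternary (wildcard) matching as a TCAM performs it.
def tcam_lookup(key, entries):
    """Return the index of the first matching (value, mask) entry, or None."""
    for i, (value, mask) in enumerate(entries):
        if key & mask == value & mask:    # bits where mask is 0 are "don't care"
            return i
    return None

entries = [
    (0b1010_0000, 0b1111_0000),  # matches any key whose high nibble is 1010
    (0b0000_0001, 0b0000_0001),  # matches any odd key
]
print(tcam_lookup(0b1010_1111, entries))  # -> 0
print(tcam_lookup(0b0110_0011, entries))  # -> 1
```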
Abstract:
Dissertation presented to Universidade Fernando Pessoa as part of the requirements for the degree of Master in Computer Engineering, Information Systems and Multimedia branch.
Abstract:
Recent studies confirm that personality tests are widely used for selection purposes in North American organizations and that their frequency of use continues to grow (Boudrias, Pettersen, Longpré, & Plunier, 2008; Rothstein & Goffin, 2006). However, research findings on the predictive link between personality and overall job performance are not very convincing (Morgeson et al., 2007b; Murphy & Dzieweczynski, 2005). This thesis aims to verify whether the predictive links between personality and job performance could be improved by changing the way the predictor variables derived from personality inventories are operationalized and by refining the criteria to be predicted so as to make them more specific and better aligned. To this end, the predictive capacity of a criterion-centred approach, that is, the use of composites of personality traits, is compared with the traditional predictor-centred approach, in this case the five broad personality factors (Big Five). In addition, job performance is operationalized in terms of job competencies, which makes it possible to differentiate its dimensions and to increase the specificity of the criteria. Hypotheses specifying which personality factors should predict each of the evaluated competencies are tested, as are hypotheses specifying the personality traits used to build the composite variables. Finally, a hypothesis comparing the predictive power of the two approaches is put to the test. The research sample consists of 225 employees holding various jobs within a large Quebec organization. They completed a workplace personality inventory as part of the organization's selection processes, and their immediate supervisor assessed their competencies and performance at least six (6) months after hiring. The results show that mastery of the competencies is better predicted by the predictor-centred approach (i.e., the Big Five) than by the criterion-centred approach (i.e., the composite variables). Indeed, only three hypotheses concerning the link between certain personality factors and the competencies were partially supported. Additional statistical analyses, carried out a posteriori to better understand these results, suggest the presence of moderating variables, notably situational characteristics. In sum, it seems more likely that a structured method for building composite variables yielding stronger predictive links will be found in the future than that composite variables generalizable to all jobs and all organizations will be discovered. We also encourage practitioners to pay attention to how personality data are used. For the time being, personality factors appear to predict future job performance in part, whereas empirical evidence on the effectiveness of other approaches remains relatively scarce and, above all, insufficient to reliably guide practitioners through the choices required to use them.
Abstract:
A set of four eddy-permitting global ocean reanalyses produced in the framework of the MyOcean project have been compared over the altimetry period 1993–2011. The main differences among the reanalyses used here come from the data assimilation scheme implemented to control the ocean state by inserting reprocessed observations of sea surface temperature (SST), in situ temperature and salinity profiles, sea level anomaly and sea-ice concentration. A first objective of this work is to assess the interannual variability and trends for a series of parameters usually considered in the community as essential ocean variables: SST, sea surface salinity, temperature and salinity averaged over meaningful layers of the water column, sea level, transports across pre-defined sections, and sea ice parameters. The eddy-permitting nature of the global reanalyses also allows the eddy kinetic energy to be estimated. The results show that in general there is good consistency between the different reanalyses. An intercomparison against experiments without data assimilation was done during the MyOcean project, and we conclude that data assimilation is crucial for correctly simulating some quantities, such as regional trends of sea level and the eddy kinetic energy. A second objective is to show that the ensemble mean of the reanalyses can be evaluated as one single system with regard to its reliability in reproducing the climate signals, where both variability and uncertainties are assessed through the ensemble spread and signal-to-noise ratio. The main advantage of having access to several reanalyses that differ in the way data assimilation is performed is that it becomes possible to assess part of the total uncertainty. Given that we use very similar ocean models and atmospheric forcing, we can conclude that the spread of the ensemble of reanalyses is mainly representative of our ability to gauge uncertainty in the assimilation methods. This uncertainty varies considerably from one ocean parameter to another, especially in global indices. However, despite several caveats in the design of the multi-system ensemble, the main conclusion from this study is that the eddy-permitting multi-system ensemble approach has become mature, and our results provide a first step towards a systematic comparison of eddy-permitting global ocean reanalyses aimed at providing robust conclusions on the recent evolution of the oceanic state.
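The ensemble diagnostics mentioned above (ensemble mean, spread and signal-to-noise ratio) can be written compactly; the sketch below uses four synthetic monthly series standing in for the reanalyses, so the numbers are illustrative only.

```python
# Sketch: ensemble mean, spread and signal-to-noise ratio across four reanalyses.
import numpy as np

rng = np.random.default_rng(4)
months = 12 * 19                                   # monthly values over 1993-2011
signal = 0.002 * np.arange(months)                 # common trend (arbitrary units)
reanalyses = np.stack([signal + rng.normal(scale=0.01, size=months) for _ in range(4)])

ens_mean = reanalyses.mean(axis=0)                 # the ensemble treated as one single system
ens_spread = reanalyses.std(axis=0, ddof=1)        # proxy for part of the total uncertainty
signal_to_noise = np.abs(ens_mean) / np.where(ens_spread > 0, ens_spread, np.nan)
```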
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
Graduate Program in Human Development and Technologies - IBRC
Abstract:
Graduate Program in Electrical Engineering - FEIS
Abstract:
Independencia Bay can be considered one of the most productive invertebrate fishing grounds worldwide. One of the most important exploited species is the scallop (Argopecten purpuratus), with strong catch fluctuations related to El Niño and La Niña events and to inadequate management strategies. During strong warming periods annual landings reach up to 50000 t in an area of about 150 km², while during cold years they remain around 500 to 1000 t. This study analyses the changes in scallop landings at Independencia Bay observed during the last two decades and discusses the main factors affecting scallop proliferations during El Niño events. For this, data on landings, sea surface temperature, growth, reproduction, predation, mean density and oxygen concentration from published and unpublished papers are used. The relationship between annual catches and the average water temperature over the preceding reproductive period of the scallop over the past 20-year period showed that scallop production is affected positively only by strong El Niño events such as those of 1983 and 1998. Our review showed that the scallop stock proliferation can be traced to the combined effect of (1) an increase in reproductive output through an acceleration of gonad maturation and a higher spawning frequency; (2) a shortening of the larval period and an increase in larval survival; (3) an increase in individual growth performance; (4) an increase in juvenile and adult survival through reduction of predator biomass; and (5) an increase in the carrying capacity of the scallop banks due to elevated oxygen levels.