849 results for subspace mapping
Abstract:
We discuss the feasibility of wireless terahertz communications links deployed in a metropolitan area and model the large-scale fading of such channels. The model takes into account reception through direct line of sight, ground and wall reflection, as well as diffraction around a corner. The movement of the receiver is modeled by an autonomous dynamic linear system in state space, whereas the geometric relations involved in the attenuation and multipath propagation of the electric field are described by a static nonlinear mapping. A subspace algorithm in conjunction with polynomial regression is used to identify a single-output Wiener model from time-domain measurements of the field intensity when the receiver motion is simulated using a constant angular speed and an exponentially decaying radius. The identification procedure is validated by using the model to perform q-step ahead predictions. The sensitivity of the algorithm to small-scale fading, detector noise, and atmospheric changes is discussed. The performance of the algorithm is tested in the diffraction zone assuming a range of emitter frequencies (2, 38, 60, 100, 140, and 400 GHz). Extensions of the simulation results to situations where a more complicated trajectory describes the motion of the receiver are also implemented, providing information on the performance of the algorithm under a worst-case scenario. Finally, a sensitivity analysis to model parameters for the identified Wiener system is proposed.
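The Wiener structure described above (an autonomous linear state-space system followed by a static nonlinearity) can be sketched in a few lines. The matrices, polynomial coefficients, and motion parameters below are illustrative placeholders, not the values identified in the paper:

```python
import numpy as np

def simulate_wiener(A, C, poly_coeffs, x0, n_steps):
    """Simulate a single-output Wiener model: autonomous linear
    dynamics x[k+1] = A x[k], then a static polynomial nonlinearity
    y[k] = p(C x[k])."""
    x = np.asarray(x0, dtype=float)
    y = np.empty(n_steps)
    for k in range(n_steps):
        z = float(C @ x)                    # intermediate linear output
        y[k] = np.polyval(poly_coeffs, z)   # static nonlinear map
        x = A @ x                           # autonomous state update
    return y

# Illustrative receiver motion: constant angular speed with an
# exponentially decaying radius, encoded as a damped rotation.
omega, decay, dt = 0.3, 0.01, 1.0
r = np.exp(-decay * dt)
A = r * np.array([[np.cos(omega * dt), -np.sin(omega * dt)],
                  [np.sin(omega * dt),  np.cos(omega * dt)]])
C = np.array([1.0, 0.0])
y = simulate_wiener(A, C, poly_coeffs=[0.5, 0.0, 1.0], x0=[1.0, 0.0], n_steps=200)
```

The damped rotation matrix encodes the constant angular speed and exponentially decaying radius used in the paper's simulated receiver trajectory; the quadratic output polynomial is purely a stand-in for the identified static nonlinearity.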
Abstract:
This paper introduces a new neurofuzzy model construction and parameter estimation algorithm for observed finite data sets, based on a Takagi and Sugeno (T-S) inference mechanism and a new extended Gram-Schmidt orthogonal decomposition algorithm, for the modeling of a priori unknown dynamical systems in the form of a set of fuzzy rules. The first contribution of the paper is the introduction of a one-to-one mapping between a fuzzy rule-base and a model matrix feature subspace using the T-S inference mechanism. This link enables the numerical properties associated with a rule-based matrix subspace, the relationships amongst these matrix subspaces, and the correlation between the output vector and a rule-base matrix subspace to be investigated and extracted as rule-based knowledge to enhance model transparency. The matrix subspace spanned by a fuzzy rule is initially derived as the input regression matrix multiplied by a weighting matrix that consists of the corresponding fuzzy membership functions over the training data set. Model transparency is explored through the derivation of an equivalence between an A-optimality experimental design criterion of the weighting matrix and the average model output sensitivity to the fuzzy rule, so that rule-bases can be effectively measured by their identifiability via the A-optimality experimental design criterion. The A-optimality experimental design criterion of the weighting matrices of fuzzy rules is used to construct an initial model rule-base. An extended Gram-Schmidt algorithm is then developed to estimate the parameter vector for each rule. This new algorithm decomposes the model rule-bases via an orthogonal subspace decomposition approach, so as to enhance model transparency with the capability of interpreting the derived rule-base energy level.
This new approach is computationally simpler than the conventional Gram-Schmidt algorithm for resolving high-dimensional regression problems, whereby it is computationally desirable to decompose complex models into a few submodels rather than a single model with a large number of input variables and the associated curse-of-dimensionality problem. Numerical examples are included to demonstrate the effectiveness of the proposed new algorithm.
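For readers unfamiliar with the baseline on which the extended algorithm builds, a minimal sketch of conventional Gram-Schmidt orthogonal least squares follows. The regression data are synthetic, and the error-reduction ratio computed here merely stands in for the per-regressor "energy level" discussed in the abstract:

```python
import numpy as np

def gram_schmidt_ols(P, y):
    """Classical Gram-Schmidt orthogonal least squares: orthogonalize
    the regression matrix P column by column, then compute each
    orthogonal regressor's weight and error-reduction ratio w.r.t. y."""
    n, m = P.shape
    W = np.zeros((n, m))      # orthogonalized regressors
    g = np.zeros(m)           # orthogonal weights
    err = np.zeros(m)         # error-reduction ratios ("energy")
    yTy = y @ y
    for k in range(m):
        w = P[:, k].copy()
        for j in range(k):    # subtract projections onto earlier regressors
            w -= (W[:, j] @ P[:, k]) / (W[:, j] @ W[:, j]) * W[:, j]
        W[:, k] = w
        g[k] = (w @ y) / (w @ w)
        err[k] = g[k] ** 2 * (w @ w) / yTy
    return W, g, err

# Synthetic regression problem: y depends on two of three regressors.
rng = np.random.default_rng(0)
P = rng.standard_normal((100, 3))
y = 2.0 * P[:, 0] - 1.0 * P[:, 2] + 0.01 * rng.standard_normal(100)
W, g, err = gram_schmidt_ols(P, y)
```

Because the columns of W are mutually orthogonal, the error-reduction ratios sum to the fraction of output variance the model explains, which is what makes the decomposition useful for ranking regressors (or, in the paper's setting, rule-base subspaces).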
Abstract:
A new robust neurofuzzy model construction algorithm has been introduced for the modeling of a priori unknown dynamical systems from observed finite data sets in the form of a set of fuzzy rules. Based on a Takagi-Sugeno (T-S) inference mechanism, a one-to-one mapping between a fuzzy rule-base and a model matrix feature subspace is established. This link enables rule-based knowledge to be extracted from the matrix subspace to enhance model transparency. In order to achieve maximized model robustness and sparsity, a new robust extended Gram-Schmidt (G-S) method has been introduced via two effective and complementary approaches: regularization and D-optimality experimental design. Model rule-bases are decomposed into orthogonal subspaces, so as to enhance model transparency with the capability of interpreting the derived rule-base energy level. A locally regularized orthogonal least squares algorithm, combined with a D-optimality criterion used for subspace-based rule selection, has been extended for fuzzy rule regularization and subspace-based information extraction. By using a weighting for the D-optimality cost function, the entire model construction procedure becomes automatic. Numerical examples are included to demonstrate the effectiveness of the proposed new algorithm.
Abstract:
Machine learning techniques are used for extracting valuable knowledge from data. Nowadays, these techniques are becoming even more important due to the evolution in data acquisition and storage, which is leading to data with different characteristics that must be exploited. Therefore, advances in data collection must be accompanied by advances in machine learning techniques to solve new challenges that might arise, in both academic and real applications. There are several machine learning techniques, depending on both data characteristics and purpose. Unsupervised classification, or clustering, is one of the best-known techniques when data lack supervision (unlabeled data) and the aim is to discover data groups (clusters) according to their similarity. On the other hand, supervised classification needs data with supervision (labeled data) and its aim is to make predictions about the labels of new data. The presence of data labels is a very important characteristic that guides not only the learning task but also other related tasks such as validation. When only some of the available data are labeled while the others remain unlabeled (partially labeled data), neither clustering nor supervised classification can be used. This scenario, which is becoming common nowadays because the labeling process is unknown or costly, is tackled with semi-supervised learning techniques. This thesis focuses on the branch of semi-supervised learning closest to clustering, i.e., discovering clusters using the available labels as support to guide and improve the clustering process. Another important data characteristic, different from the presence of data labels, is the relevance or not of data features. Data are characterized by features, but it is possible that not all of them are relevant, or equally relevant, for the learning process.
A recent clustering trend, related to data relevance and called subspace clustering, claims that different clusters might be described by different feature subsets. This differs from traditional solutions to the data relevance problem, where a single feature subset (usually the complete set of original features) is found and used to perform the clustering process. The proximity of this work to clustering leads to the first goal of this thesis. As commented above, clustering validation is a difficult task due to the absence of data labels. Although there are many indices that can be used to assess the quality of clustering solutions, these validations depend on the clustering algorithms and data characteristics. Hence, in the first goal three well-known clustering algorithms are used to cluster data with outliers and noise, to critically study how some of the best-known validation indices behave. The main goal of this work is, however, to combine semi-supervised clustering with subspace clustering to obtain clustering solutions that can be correctly validated by using either known indices or expert opinions. Two different algorithms are proposed from different points of view to discover clusters characterized by different subspaces. In the first algorithm, the available data labels are used to search for subspaces first, before searching for clusters. This algorithm assigns each instance to only one cluster (hard clustering) and is based on mapping the known labels to subspaces using supervised classification techniques. The subspaces are then used to find clusters using traditional clustering techniques. The second algorithm uses the available data labels to search for subspaces and clusters at the same time in an iterative process. This algorithm assigns each instance to each cluster based on a membership probability (soft clustering) and is based on integrating the known labels and the search for subspaces into a model-based clustering approach.
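A heavily simplified sketch of the first (hard-clustering) algorithm's flow, under stated assumptions: a univariate Fisher-style score computed on the labeled instances stands in for the supervised subspace search, and a plain k-means stands in for the traditional clustering step. None of the function names or parameters below come from the thesis:

```python
import numpy as np

def fisher_scores(X, y):
    """Rank features by a between/within-class variance ratio
    computed on the labeled instances only."""
    classes = np.unique(y)
    overall = X.mean(axis=0)
    between = sum((X[y == c].mean(axis=0) - overall) ** 2 * (y == c).sum()
                  for c in classes)
    within = sum(((X[y == c] - X[y == c].mean(axis=0)) ** 2).sum(axis=0)
                 for c in classes)
    return between / (within + 1e-12)

def kmeans(X, k, n_iter=50, seed=0):
    """Plain k-means (hard assignments) with a guard for empty clusters."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(n_iter):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        centers = np.array([X[labels == j].mean(axis=0)
                            if (labels == j).any() else centers[j]
                            for j in range(k)])
    return labels

# Synthetic partially labeled data: the clusters differ on feature 0
# only; features 1-2 are noise.  -1 marks unlabeled instances.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 3))
X[:100, 0] += 4.0
y_partial = np.full(200, -1)
y_partial[:10], y_partial[100:110] = 0, 1

labeled = y_partial >= 0
scores = fisher_scores(X[labeled], y_partial[labeled])
subspace = np.argsort(scores)[-1:]       # keep the most discriminative feature
clusters = kmeans(X[:, subspace], k=2)   # cluster in the found subspace
```

The essential point the sketch preserves is the ordering of the two steps: the labels drive the subspace search, and only then is an unsupervised clustering run in that subspace over all (labeled and unlabeled) instances.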
The different proposals are tested using different real and synthetic databases, and comparisons to other methods are also included when appropriate. Finally, as an example of a real and current application, different machine learning techniques, including one of the proposals of this work (the most sophisticated one), are applied to one of the most challenging biological problems nowadays: human brain modeling. Specifically, expert neuroscientists have not agreed on a neuron classification for the cerebral cortex, which makes impossible not only any modeling attempt but also day-to-day work, since there is no common way to name neurons. Therefore, machine learning techniques may help to reach an accepted solution to this problem, which could be an important milestone for future research in neuroscience.
Abstract:
The generative topographic mapping (GTM) model was introduced by Bishop et al. (1998, Neural Comput. 10(1), 215-234) as a probabilistic reformulation of the self-organizing map (SOM). It offers a number of advantages compared with the standard SOM, and has already been used in a variety of applications. In this paper we report on several extensions of the GTM, including an incremental version of the EM algorithm for estimating the model parameters, the use of local subspace models, extensions to mixed discrete and continuous data, semi-linear models which permit the use of high-dimensional manifolds whilst avoiding computational intractability, Bayesian inference applied to hyper-parameters, and an alternative framework for the GTM based on Gaussian processes. All of these developments directly exploit the probabilistic structure of the GTM, thereby allowing the underlying modelling assumptions to be made explicit. They also highlight the advantages of adopting a consistent probabilistic framework for the formulation of pattern recognition algorithms.
Abstract:
Dulce de leche samples available in the Brazilian market were submitted to sensory profiling by quantitative descriptive analysis and an acceptance test, as well as sensory evaluation using the just-about-right scale and purchase intent. External preference mapping and the ideal sensory characteristics of dulce de leche were determined. The results were also evaluated by principal component analysis, hierarchical cluster analysis, partial least squares regression, artificial neural networks, and logistic regression. Overall, significant product acceptance was related to intermediate scores of the sensory attributes in the descriptive test, and this trend was observed even after consumer segmentation. The results obtained by sensometric techniques showed that optimizing an ideal dulce de leche from the sensory standpoint is a multidimensional process, with necessary adjustments to the appearance, aroma, taste, and texture attributes of the product for better consumer acceptance and purchase. The optimum dulce de leche was characterized by high scores for the attributes sweet taste, caramel taste, brightness, color, and caramel aroma, in accordance with the preference mapping findings. In industrial terms, this means changing the parameters used in the thermal treatment and making quantitative changes in the ingredients used in formulations.
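As an illustration of the first step in such a sensometric analysis, a minimal principal component analysis via SVD is sketched below. The sample-by-attribute matrix is entirely hypothetical, not the study's panel data:

```python
import numpy as np

def pca_scores(X, n_components=2):
    """PCA via SVD of the mean-centred data matrix: returns sample
    scores and the fraction of variance explained per component."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = (S ** 2) / (S ** 2).sum()
    return scores, explained[:n_components]

# Hypothetical descriptive matrix: rows are dulce de leche samples,
# columns are mean panel scores for four sensory attributes
# (sweet taste, caramel taste, brightness, color).
X = np.array([[7.1, 6.8, 5.2, 6.0],
              [4.2, 3.9, 6.1, 5.8],
              [8.0, 7.5, 4.8, 6.3],
              [5.1, 4.6, 5.9, 5.5]])
scores, explained = pca_scores(X)
```

Plotting such scores against consumer acceptance data is, in essence, what external preference mapping does; the remaining methods in the abstract (HCA, PLS regression, neural networks, logistic regression) build on the same sample-by-attribute matrix.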
Abstract:
The evolution and population dynamics of avian coronaviruses (AvCoVs) remain underexplored. In the present study, in-depth phylogenetic and Bayesian phylogeographic studies were conducted to investigate the evolutionary dynamics of AvCoVs detected in wild and synanthropic birds. A total of 500 samples, including tracheal and cloacal swabs collected from 312 wild birds belonging to 42 species, were analysed using molecular assays. A total of 65 samples (13%) from 22 bird species were positive for AvCoV. Molecular evolution analyses revealed that the sequences from samples collected in Brazil did not cluster with any of the AvCoV S1 gene sequences deposited in the GenBank database. Bayesian framework analysis estimated an AvCoV strain from Sweden (1999) as the most recent common ancestor of the AvCoVs detected in this study. Furthermore, the analysis inferred an increase in the AvCoV dynamic demographic population in different wild and synanthropic bird species, suggesting that birds may be potential new hosts responsible for spreading this virus.
Abstract:
Mapping of elements in biological tissue by laser-induced mass spectrometry is a fast-growing analytical methodology in the life sciences. This method provides a wealth of useful information on metal, nonmetal, metalloid and isotopic distributions at major, minor and trace concentration ranges, usually with a lateral resolution of 12-160 µm. Selected applications in medical research require an improved lateral resolution of laser-induced mass spectrometric techniques at the low micrometre scale and below. The present work demonstrates the applicability of a recently developed analytical methodology - laser microdissection associated with inductively coupled plasma mass spectrometry (LMD ICP-MS) - to obtain elemental images of different solid biological samples at high lateral resolution. LMD ICP-MS images of uranium-stained and native (unstained) mouse brain tissue samples are shown, and a direct comparison of the LMD and laser ablation (LA) ICP-MS imaging methodologies, in terms of elemental quantification, is performed.
Abstract:
QTL mapping provides useful information for breeding programs since it allows the estimation of genomic locations and genetic effects of chromosomal regions related to the expression of quantitative traits. The objective of this study was to map QTL related to several agronomically important traits associated with grain yield: ear weight (EW), prolificacy (PROL), ear number (NE), ear length (EL) and diameter (ED), number of rows on the ear (NRE), and number of kernels per row on the ear (NKPR). Four hundred F2:3 tropical maize progenies were evaluated in five environments in Piracicaba, Sao Paulo, Brazil. The genetic map was previously estimated and had 117 microsatellite loci with an average distance of 14 cM. Data were analysed using Composite Interval Mapping for each trait. Thirty-six QTL were mapped and related to the expression of EW (2), PROL (3), NE (2), EL (5), ED (5), NRE (10), and NKPR (5). Few QTL were mapped since there was high GxE interaction. The traits EW, PROL and NE showed high genetic correlation with grain yield, and several QTL mapped to similar genomic regions, which could cause the observed correlation. However, further analysis using appropriate statistical models is required to separate linked versus pleiotropic QTL. Five QTL (named Ew1, Ne1, Ed3, Nre3 and Nre10) had high genetic effects, explaining from 10.8% (Nre3) to 16.9% (Nre10) of the phenotypic variance, and could be considered in further studies.
Abstract:
The identification of alternatively spliced transcripts has contributed to a better comprehension of developmental mechanisms, tissue-specific physiological processes, and human diseases. Polymerase chain reaction amplification of alternatively spliced variants commonly leads to the formation of heteroduplexes as a result of base pairing involving exons common to the two variants. S1 nuclease cleaves single-stranded loops of heteroduplexes and also nicks the opposite DNA strand. In order to establish a strategy for mapping alternative splice-prone sites across the whole transcriptome, we developed a method combining the formation of heteroduplexes between two distinct splicing variants and S1 nuclease digestion. For 20 consensuses identified here using this methodology, 5 revealed a conserved splice site after inspection of the cDNA alignment against the human genome (exact splice sites). For 8 other consensuses, conserved splice sites were mapped at 2 to 30 bp from the border, called proximal splice sites; for the remaining 7 consensuses, conserved splice sites were mapped at 40 to 800 bp, called distal splice sites. These latter cases reflected a nonspecific activity of S1 nuclease in digesting double-stranded DNA. Of the 20 consensuses identified here, 5 were selected for reverse transcription-polymerase chain reaction validation, confirming the splice sites. These data showed the potential of the strategy for mapping splice sites. However, the lack of specificity of the S1 nuclease enzyme is a significant obstacle that impedes the use of this strategy in large-scale studies.
Abstract:
A combined analytical and numerical study is performed of the mapping between strongly interacting fermions and weakly interacting spins, in the framework of the Hubbard, t-J, and Heisenberg models. While for spatially homogeneous models in the thermodynamic limit the mapping is thoroughly understood, we here focus on aspects that become relevant in spatially inhomogeneous situations, such as the effect of boundaries, impurities, superlattices, and interfaces. We consider parameter regimes that are relevant for traditional applications of these models, such as electrons in cuprates and manganites, and for more recent applications to atoms in optical lattices. The rate of the mapping as a function of the interaction strength is determined from the Bethe ansatz for infinite systems and from numerical diagonalization for finite systems. We show analytically that if translational symmetry is broken through the presence of impurities, the mapping persists and is, in a certain sense, as local as possible, provided the spin-spin interaction between two sites of the Heisenberg model is calculated from the harmonic mean of the onsite Coulomb interaction on adjacent sites of the Hubbard model. Numerical calculations corroborate these findings also at interfaces and in superlattices, where analytical calculations are more complicated.
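The harmonic-mean prescription admits a compact statement: with hopping t and site-dependent interaction U_i, the effective coupling on each bond is J = 4t²/U_harm, where U_harm = 2/(1/U_i + 1/U_{i+1}) is the harmonic mean of the onsite interactions on the two adjacent sites, reducing to the textbook strong-coupling result J = 4t²/U in the homogeneous limit. A small illustrative sketch (the parameter values are arbitrary):

```python
def heisenberg_couplings(t, U):
    """Effective spin-spin couplings J[i] between sites i and i+1 of an
    inhomogeneous Hubbard chain, using the harmonic mean of the onsite
    interactions on adjacent sites:
        J = 4 t^2 / U_harm,  U_harm = 2 / (1/U_i + 1/U_{i+1}),
    i.e. equivalently J = 2 t^2 (1/U_i + 1/U_{i+1})."""
    return [2.0 * t**2 * (1.0 / U[i] + 1.0 / U[i + 1])
            for i in range(len(U) - 1)]

# Homogeneous chain: recovers J = 4 t^2 / U on every bond.
J_hom = heisenberg_couplings(t=1.0, U=[8.0] * 4)

# An impurity site with stronger repulsion weakens its adjacent bonds.
J_imp = heisenberg_couplings(t=1.0, U=[8.0, 16.0, 8.0])
```

The locality of the mapping shows up directly in the code: each J depends only on the two sites of its own bond, with no reference to the rest of the inhomogeneous chain.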
Abstract:
This paper reports the use of a non-destructive, continuous magnetic Barkhausen noise (CMBN) technique to investigate the size and thickness of volumetric defects in a 1070 steel. The magnetic behavior of the probe used was analyzed by numerical simulation using the finite element method (FEM). Results indicated that the presence of a ferrite coil core in the probe favors MBN emissions. The samples were scanned at different speeds and probe configurations to determine the effect of the flaw on the CMBN signal amplitude. A moving smoothing window, based on a second-order statistical moment, was used for analyzing the time signal. The results show the technique's good repeatability and high capacity for detecting this type of defect.
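A moving window based on a second-order statistical moment is essentially a sliding variance. The sketch below (not the authors' exact implementation; the synthetic trace is illustrative) localizes a burst of higher-amplitude noise, the analogue of a defect signature in the CMBN time signal:

```python
import numpy as np

def moving_second_moment(signal, window):
    """Sliding-window second-order statistical moment (variance) of a
    1-D time signal, computed with cumulative sums for efficiency."""
    s = np.asarray(signal, dtype=float)
    c1 = np.cumsum(np.insert(s, 0, 0.0))        # running sum of x
    c2 = np.cumsum(np.insert(s * s, 0, 0.0))    # running sum of x^2
    n = window
    mean = (c1[n:] - c1[:-n]) / n
    mean_sq = (c2[n:] - c2[:-n]) / n
    return mean_sq - mean ** 2                  # Var = E[x^2] - (E[x])^2

# Illustrative CMBN-like trace: a burst of higher-amplitude noise
# (the "defect") embedded in a weaker background.
rng = np.random.default_rng(0)
trace = 0.1 * rng.standard_normal(1000)
trace[400:500] += 0.5 * rng.standard_normal(100)
envelope = moving_second_moment(trace, window=50)
```

The envelope peaks where the window overlaps the burst, which is how amplitude changes in the CMBN signal can be turned into a defect-detection statistic.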
Abstract:
Competition among companies depends on the speed and efficiency with which they can create and commercialize knowledge in a timely and cost-efficient manner. In this context, collaboration emerges as a reaction to environmental changes. Although strategic alliances and networks have been explored in the strategy literature for decades, the complexity and continuous usage of these cooperation structures, in a world of growing competition, justify the continuing interest in both themes. This article presents a survey of the contemporary academic production on strategic alliances and networks, covering the period from January 1997 to August 2007, based on the top five journals according to the 2006 Journal Citation Reports in the business and management categories simultaneously. The results point to a retraction in publications about strategic alliances and a significant growth in the area of strategic networks. The joint view of strategic alliances and networks, cited by some authors as the evolutionary path of the field, did not yet appear salient. The most cited topics found in the alliance literature are governance structure, cooperation, knowledge transfer, culture, control, trust, alliance formation, previous experience, resources, competition and partner selection. The network theme focuses mainly on structure, knowledge transfer and social networks, while the joint view is highly concentrated on the subjects of alliance formation and governance choice.
Abstract:
Phaeosphaeria leaf spot (PLS) is an important disease in tropical and subtropical maize (Zea mays L.) growing areas, but there is limited information on its inheritance. Thus, this research was conducted to study the inheritance of PLS disease in tropical maize by using QTL mapping and to assess the feasibility of using marker-assisted selection to develop genotypes resistant to this disease. The highly susceptible L14-04B and highly resistant L08-05F inbred lines were crossed to develop an F2 population. Two hundred fifty-six F2 plants were genotyped with 143 microsatellite markers, and their F2:3 progenies were evaluated in seven environments. Ten plants per plot were evaluated 30 days after silk emergence following a rating scale, and the plot means were used for analyses. The heritability coefficient on a progeny-mean basis was high (91.37%), and six QTL were mapped, with one QTL on each of chromosomes 1, 3, 4, and 6, and two QTL on chromosome 8. The gene action of the QTL ranged from additive to partial dominance, and the average level of dominance was partial dominance; a dominance × dominance epistatic effect was also detected between the QTL mapped on chromosome 8. The phenotypic variance explained by each QTL ranged from 2.91 to 11.86%, and the joint QTL effects explained 41.62% of the phenotypic variance. The alleles conditioning resistance to PLS disease at all mapped QTL were in the resistant parental inbred L08-05F. Thus, these alleles could be transferred to other elite maize inbreds by marker-assisted backcross selection to develop hybrids resistant to PLS disease.
Abstract:
Despite its importance to agriculture, the genetic basis of heterosis is still not well understood. The main competing hypotheses include dominance, overdominance, and epistasis. NC design III is an experimental design that has been used for estimating the average degree of dominance of quantitative trait loci (QTL) and also for studying heterosis. In this study, we first develop a multiple-interval mapping (MIM) model for design III that provides a platform to estimate the number, genomic positions, augmented additive and dominance effects, and epistatic interactions of QTL. The model can be used for parents with any generation of selfing. We apply the method to two data sets, one for maize and one for rice. Our results show that heterosis in maize is mainly due to dominant gene action, although overdominance of individual QTL could not be completely ruled out due to the mapping resolution and limitations of NC design III. For rice, the estimated QTL dominance effects could not explain the observed heterosis. There is evidence that additive × additive epistatic effects of QTL could be the main cause of the heterosis in rice. The difference in the genetic basis of heterosis seems to be related to the open or self pollination of the two species. The MIM model for NC design III is implemented in Windows QTL Cartographer, a freely distributed software package.