903 results for Probabilistic latent semantic analysis (PLSA)


Relevance: 30.00%

Publisher:

Abstract:

In this paper we use concepts from graph theory and cellular biology, represented as ontologies, to carry out semantic mining tasks on signaling pathway networks. Specifically, the paper describes the semantic enrichment of signaling pathway networks. A cell signaling network describes basic cellular activities and their interactions. The main contribution of this paper lies in the signaling pathway research area: it proposes a new technique to analyze and understand how changes in these networks may affect the transmission and flow of information, which can produce diseases such as cancer and diabetes. Our approach is based on three concepts from graph theory (modularity, clustering, and centrality) frequently used in social network analysis. It consists of two phases: the first uses the graph theory concepts to determine the cellular groups in the network, which we call communities; the second uses ontologies for the semantic enrichment of these cellular communities. The graph-theoretic measures allow us to determine the set of cells that are close (for example, in a disease) and the main cells in each community. We evaluate our approach on two cases: the TGF-β pathway and Alzheimer's disease.
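
As an illustration of the first phase, the following sketch detects communities via modularity maximization and ranks nodes by centrality, assuming the networkx library; the toy graph, node names, and the choice of betweenness centrality are illustrative assumptions, not the networks or exact measures used in the paper.

```python
# A minimal sketch of community detection plus centrality on a toy signaling
# graph, assuming networkx; node names and edges are illustrative only.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Toy signaling pathway graph: nodes are cellular components, edges interactions.
G = nx.Graph()
G.add_edges_from([
    ("TGFB1", "TGFBR2"), ("TGFBR2", "TGFBR1"), ("TGFBR1", "SMAD2"),
    ("SMAD2", "SMAD4"), ("SMAD3", "SMAD4"), ("SMAD4", "target_gene"),
])

# Phase 1a: modularity-based community detection.
communities = list(greedy_modularity_communities(G))

# Phase 1b: centrality identifies the "main" node within each community.
centrality = nx.betweenness_centrality(G)
for i, community in enumerate(communities):
    hub = max(community, key=centrality.get)
    print(f"community {i}: {sorted(community)}, hub: {hub}")

# Phase 2 (semantic enrichment) would annotate each community with ontology
# terms, e.g. from the Gene Ontology, which is outside this sketch.
```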

Relevance: 30.00%

Publisher:

Abstract:

Considerable interest in renewable energy has grown in recent years due to concerns over the environmental impact of conventional energy sources and their price volatility. In particular, wind power has enjoyed dramatic global growth in installed capacity over the past few decades. Nowadays, the advancement of the wind turbine industry presents challenges for several engineering areas, including materials science, computer science, aerodynamics, analytical design and analysis methods, testing and monitoring, and power electronics. In particular, the technological improvement of wind turbines is currently tied to the use of advanced design methodologies, allowing designers to develop new and more efficient design concepts. Integrating mathematical optimization techniques into the multidisciplinary design of wind turbines constitutes a promising way to enhance the profitability of these devices. In the literature, wind turbine design optimization is typically performed deterministically. Deterministic optimizations do not consider any degree of randomness affecting the inputs of the system under consideration and therefore result in a unique set of outputs. However, given the stochastic nature of the wind and the uncertainties associated, for instance, with wind turbine operating conditions or geometric tolerances, deterministically optimized designs may be inefficient. Therefore, one way to further improve the design of modern wind turbines is to take the aforementioned sources of uncertainty into account in the optimization process, achieving robust configurations with minimal performance sensitivity to the factors causing variability. The research work presented in this thesis deals with the development of a novel integrated multidisciplinary design framework for the robust aeroservoelastic design optimization of multi-megawatt horizontal axis wind turbine (HAWT) rotors, accounting for the stochastic variability of the input variables. The design system is based on a multidisciplinary analysis module integrating the several simulation tools needed to characterize the aeroservoelastic behavior of wind turbines and determine their economic performance by means of the levelized cost of energy (LCOE). The reported design framework is portable and modular in that any of its analysis modules can be replaced with counterparts of user-selected fidelity. The presented technology is applied to the design of a 5-MW HAWT rotor to be used at sites of wind power density class from 3 to 7, where the mean wind speed at 50 m above the ground ranges from 6.4 to 11.9 m/s. Assuming the mean wind speed to vary stochastically in this range, the rotor design is optimized by minimizing the mean and standard deviation of the LCOE. Airfoil shapes, spanwise distributions of blade chord and twist, internal structural layup, and rotor speed are optimized concurrently, subject to an extensive set of structural and aeroelastic constraints. The effectiveness of the multidisciplinary and robust design framework is demonstrated by showing that the probabilistically designed turbine achieves more favorable probabilistic performance than both the initial baseline turbine and a turbine designed deterministically.
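
A minimal sketch of the robust optimization loop described above, under stated assumptions: numpy/scipy are available, and lcoe() is a hypothetical one-line surrogate standing in for the thesis's full aeroservoelastic analysis chain.

```python
# Robust design sketch: minimize mean + std of LCOE over sampled wind speeds.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
wind_speeds = rng.uniform(6.4, 11.9, size=200)  # stochastic mean wind speed [m/s]

def lcoe(x, v):
    """Hypothetical LCOE surrogate: x = design variables, v = mean wind speed."""
    rotor_radius, tip_speed_ratio = x
    energy = 0.3 * rotor_radius**2 * v**3 / (1.0 + 0.1 * (tip_speed_ratio - 7.0)**2)
    cost = 1e3 + 50.0 * rotor_radius**2.5
    return cost / energy

def robust_objective(x, weight=1.0):
    # Weighted sum of mean and standard deviation of LCOE over the samples.
    samples = lcoe(x, wind_speeds)
    return samples.mean() + weight * samples.std()

result = minimize(robust_objective, x0=[60.0, 7.5],
                  bounds=[(40.0, 80.0), (5.0, 10.0)])
print(result.x, result.fun)
```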

Relevance: 30.00%

Publisher:

Abstract:

Markov chain analysis was recently proposed to assess the time scales and preferential pathways in biological or physical networks by computing residence times, first passage times, rates of transfer between nodes, and numbers of passages in a node. We propose to adapt an algorithm already published for simple systems to physical systems described with a high-resolution hydrodynamic model. The method is applied to bays and estuaries on the eastern coast of Canada, chosen for their interest for shellfish aquaculture. Current velocities were computed on a two-dimensional grid of elements, and circulation patterns were summarized by averaging Eulerian flows between adjacent elements. The flows and volumes allow us to compute transition probabilities between elements and to assess the average time needed by virtual particles to move from one element to another, the rate of transfer between two elements, and the average residence time of each system. We also combined transfer rates and times to assess the main pathways of virtual particles released in farmed areas and the potential influence of farmed areas on other areas. We suggest that Markov chain analysis is complementary to other sets of ecological indicators proposed to analyse the interactions between farmed areas, e.g. the depletion index or carrying capacity assessment. Markov chain analysis has several advantages with respect to estimating connectivity between pairs of sites: it makes it possible to estimate transfer rates and times at once, in a very quick and efficient way, without the need to perform long-term simulations of particle or tracer concentrations.
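
The core construction can be sketched as follows, assuming numpy; the flows, volumes, and time step below are illustrative placeholders rather than hydrodynamic model output.

```python
# Transition probabilities and mean first passage times from flows and volumes.
import numpy as np

dt = 3600.0                                    # time step [s]
volumes = np.array([1e6, 2e6, 1.5e6])          # element volumes [m^3]
flows = np.array([[0.0, 50.0, 10.0],           # Eulerian flows F[i, j] [m^3/s]
                  [40.0, 0.0, 30.0],
                  [5.0, 25.0, 0.0]])

# Transition probability: fraction of element i's volume moved to j per step.
P = flows * dt / volumes[:, None]
np.fill_diagonal(P, 1.0 - P.sum(axis=1))

def mean_first_passage(P, j):
    """Mean first passage time into element j: solve t = 1 + Q t on states != j."""
    idx = [i for i in range(len(P)) if i != j]
    Q = P[np.ix_(idx, idx)]
    t = np.linalg.solve(np.eye(len(idx)) - Q, np.ones(len(idx)))
    return dict(zip(idx, t * dt / 86400.0))     # convert steps to days

print(mean_first_passage(P, j=2))
```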

Relevance: 30.00%

Publisher:

Abstract:

This dissertation investigates the connection between spectral analysis and frame theory. When considering the spectral properties of a frame, we present a few novel results relating to the spectral decomposition. We first show that scalable frames have the property that the inner product of the scaling coefficients and the eigenvectors must equal the inverse eigenvalues. From this, we prove a similar result when an approximate scaling is obtained. We then focus on the optimization problems inherent to scalable frames, first showing that there is an equivalence between scaling a frame and optimization problems with a non-restrictive objective function. Various objective functions are considered, and an analysis of the solution type is presented. With linear objectives we can encourage sparse scalings, and with barrier objective functions we force dense solutions. We further consider frames in high dimensions and derive various solution techniques. From here, we restrict ourselves to particular frame classes to add more specificity to the results. Using frames generated from distributions allows for the placement of probabilistic bounds on scalability. For discrete distributions (Bernoulli and Rademacher), we bound the probability of encountering an orthonormal basis (ONB), and for continuous symmetric distributions (uniform and Gaussian), we show that symmetry is retained in the transformed domain. We also prove several hyperplane-separation results. With the theory developed, we discuss graph applications of the scalability framework. We make a connection with graph conditioning and show the infeasibility of the problem in the general case; after a modification, we show that any complete graph can be conditioned. We then present a modification of standard PCA (robust PCA) developed by Candès, and give some background on Electron Energy-Loss Spectroscopy (EELS). We design a novel scheme for processing EELS data through robust PCA and least-squares regression, and test this scheme on biological samples. Finally, we take the idea of robust PCA and apply the technique of kernel PCA to perform robust manifold learning. We derive the problem and present an algorithm for its solution. We also discuss the differences from RPCA that make theoretical guarantees difficult.
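
For the scalability results, the numerical criterion can be sketched as follows, assuming numpy/scipy: a frame is scalable exactly when nonnegative weights make the weighted frame operator a multiple of the identity, which reduces to a nonnegative least-squares problem. This is a generic illustration, not the dissertation's algorithms.

```python
# Numerical scalability test: find w >= 0 with sum_i w_i^2 f_i f_i^T = I.
import numpy as np
from scipy.optimize import nnls

def scaling_weights(F):
    """F: n x m matrix whose columns are the frame vectors. Returns w, residual."""
    n, m = F.shape
    # With u_i = w_i^2, stack outer products f_i f_i^T as columns of an
    # (n*n) x m matrix and solve A u = vec(I) subject to u >= 0.
    A = np.column_stack([np.outer(F[:, i], F[:, i]).ravel() for i in range(m)])
    b = np.eye(n).ravel()
    u, residual = nnls(A, b)
    return np.sqrt(u), residual

# Example: the Mercedes-Benz frame in R^2 is tight, hence scalable.
angles = np.array([np.pi / 2, np.pi / 2 + 2 * np.pi / 3, np.pi / 2 + 4 * np.pi / 3])
F = np.vstack([np.cos(angles), np.sin(angles)])
w, res = scaling_weights(F)
print(w, res)   # near-equal weights, residual ~ 0 => scalable
```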

Relevance: 30.00%

Publisher:

Abstract:

Verbal fluency is the ability to produce a satisfying sequence of spoken words during a given time interval. The core of verbal fluency lies in the capacity to manage the executive aspects of language. The standard scores of the semantic verbal fluency test are broadly used in the neuropsychological assessment of the elderly, and different analytical methods are likely to extract even more information from the data generated by this test. Graph theory, a mathematical approach to analyzing relations between items, represents a promising tool to understand a variety of neuropsychological states. This study reports a graph analysis of data generated by the semantic verbal fluency test in cognitively healthy elderly (NC), patients with Mild Cognitive Impairment, subtypes amnestic (aMCI) and amnestic multiple domain (a+mdMCI), and patients with Alzheimer's disease (AD). Sequences of words were represented as speech graphs in which every word corresponded to a node and temporal links between consecutive words were represented by directed edges. To characterize the structure of the data we calculated 13 speech graph attributes (SGAs). The individuals were compared when divided into three (NC, MCI, AD) and four (NC, aMCI, a+mdMCI, AD) groups. When the three groups were compared, significant differences were found in the standard measure of correct words produced and in three SGAs: diameter, average shortest path, and network density. The SGAs sorted the elderly groups with good specificity and sensitivity. When the four groups were compared, the groups differed significantly in network density, except between the two MCI subtypes and between NC and aMCI. The diameter of the network and the average shortest path were significantly different between NC and AD, and between aMCI and AD. The SGAs sorted the elderly into their groups with good specificity and sensitivity, performing better than the standard score of the task. These findings provide support for a new methodological framework to assess the strength of semantic memory through the verbal fluency task, with potential to amplify the predictive power of this test. Graph analysis is likely to become clinically relevant in neurology and psychiatry, and may be particularly useful for the differential diagnosis of the elderly.
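
The speech-graph construction and three of the SGAs compared in the study can be sketched as follows, assuming networkx; the word sequence is a made-up example.

```python
# Build a speech graph from a fluency word sequence and compute three SGAs.
import networkx as nx

words = ["dog", "cat", "horse", "cat", "cow", "dog", "sheep"]

# Nodes are words; directed edges link temporally consecutive words.
G = nx.DiGraph()
G.add_edges_from(zip(words, words[1:]))

density = nx.density(G)
# Diameter and average shortest path are computed on the underlying
# undirected graph so they are defined even when G is not strongly connected.
U = G.to_undirected()
diameter = nx.diameter(U)
avg_shortest_path = nx.average_shortest_path_length(U)
print(density, diameter, avg_shortest_path)
```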

Relevance: 30.00%

Publisher:

Abstract:

In this thesis, we present a quantitative approach using probabilistic verification techniques for the analysis of the reliability, availability, maintainability, and safety (RAMS) properties of satellite systems. The subject of our research is satellites used in mission-critical industrial applications. Our verification results make a strong case for using probabilistic model checking to support RAMS analysis of satellite systems. This study is intended to build a foundation that helps reliability engineers with a basic background in model checking to apply probabilistic model checking to small satellite systems. We make two major contributions. The first is the application of RAMS analysis to satellite systems. In the past, RAMS analysis has been extensively applied in the field of electrical and electronics engineering. It allows system designers and reliability engineers to predict the likelihood of failures from historical or current operational data. There is high potential for the application of RAMS analysis in the field of space science and engineering; however, there is a lack of standardisation and of suitable procedures for the correct study of the RAMS characteristics of satellite systems. This thesis considers the promising application of RAMS analysis to satellite design, use, and maintenance, focusing on the system segments. Data collection and verification procedures are discussed, and a number of considerations are also presented on how to predict the probability of failure. Our second contribution is leveraging the power of probabilistic model checking to analyse satellite systems. We present techniques for analysing satellite systems that differ from the more common quantitative approaches based on traditional simulation and testing, and that have not been applied in this context before. We present the use of probabilistic techniques via a suite of detailed examples, together with their analysis. Our presentation proceeds incrementally, in terms of the complexity of the application domains and system models, with a detailed PRISM model for each scenario. We also provide results from practical work together with a discussion of future improvements.
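
The thesis performs these analyses in PRISM; as a simplified stand-in, the sketch below computes one RAMS quantity, steady-state availability, from a small continuous-time Markov chain in numpy. The states and rates are illustrative assumptions only.

```python
# Steady-state availability of a three-state CTMC (operational/degraded/failed).
import numpy as np

failure, degradation, repair = 1e-4, 5e-4, 1e-2   # rates [1/h], illustrative
Q = np.array([
    [-(degradation + failure), degradation, failure],
    [repair, -(repair + failure), failure],
    [repair, 0.0, -repair],
])

# Steady-state distribution pi solves pi Q = 0 with sum(pi) = 1.
A = np.vstack([Q.T, np.ones(3)])
b = np.array([0.0, 0.0, 0.0, 1.0])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)

availability = pi[0] + pi[1]      # probability the system is not failed
print(f"steady-state availability: {availability:.6f}")
```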

Relevance: 30.00%

Publisher:

Abstract:

The size of online image datasets is constantly increasing. For an image dataset with millions of images, image retrieval becomes a seemingly intractable problem for exhaustive similarity search algorithms. Hashing methods, which encode high-dimensional descriptors into compact binary strings, have become very popular because of their high efficiency in both search and storage. In the first part, we propose a multimodal retrieval method based on latent feature models. The procedure consists of a nonparametric Bayesian framework for learning underlying, semantically meaningful abstract features in a multimodal dataset, a probabilistic retrieval model that allows cross-modal queries, and an extension model for relevance feedback. In the second part, we focus on supervised hashing with kernels. We describe a flexible hashing procedure that treats binary codes and pairwise semantic similarity as latent and observed variables, respectively, in a probabilistic model based on Gaussian processes for binary classification, and we present a scalable inference algorithm using the sparse pseudo-input Gaussian process (SPGP) model and distributed computing. In the last part, we define an incremental hashing strategy for dynamic databases to which new images are added frequently. The method is based on a two-stage classification framework using binary and multi-class SVMs, and it enforces balance in the binary codes through an imbalance penalty, yielding higher-quality codes. We learn hash functions by an efficient algorithm in which the NP-hard problem of finding optimal binary codes is solved via cyclic coordinate descent and the SVMs are trained in a parallelized, incremental manner. For modifications such as adding images from an unseen class, we propose an incremental procedure for effective and efficient updates to the previous hash functions. Experiments on three large-scale image datasets demonstrate that the incremental strategy is capable of efficiently updating the hash functions to the same retrieval performance as hashing from scratch.
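
As a generic illustration of the hashing pipeline (not the Gaussian-process or SVM-based methods developed in the thesis), the sketch below encodes descriptors into binary codes with random hyperplanes and searches by Hamming distance, assuming numpy.

```python
# Classic random-hyperplane hashing: encode descriptors, search by Hamming distance.
import numpy as np

rng = np.random.default_rng(0)
n_images, dim, n_bits = 10000, 128, 64

descriptors = rng.normal(size=(n_images, dim))   # stand-in image features
hyperplanes = rng.normal(size=(dim, n_bits))     # one hash function per column

def encode(X):
    """Encode descriptors into compact binary codes (one bit per hyperplane)."""
    return (X @ hyperplanes > 0).astype(np.uint8)

codes = encode(descriptors)

def search(query, k=5):
    """Exhaustive Hamming-distance scan; fast because the codes are tiny."""
    q = encode(query[None, :])[0]
    distances = (codes != q).sum(axis=1)
    return np.argsort(distances)[:k]

print(search(descriptors[42]))    # the query itself ranks first (distance 0)
```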

Relevance: 30.00%

Publisher:

Abstract:

Objective: Cost-effectiveness analysis of a 6-month treatment with apixaban (10 mg/12 h for the first 7 days; 5 mg/12 h afterwards) for the treatment of a first event of venous thromboembolism (VTE) and prevention of recurrences, versus treatment with low-molecular-weight heparins/vitamin K antagonists (LMWH/VKA). Material and methods: A lifetime Markov model with 13 health states was used to describe the course of the disease. Efficacy and safety data were obtained from the AMPLIFY and AMPLIFY-EXT clinical trials; health outcomes were measured as life years gained (LYG) and quality-adjusted life years (QALYs). The analysis was conducted from the perspective of the Spanish National Health System (NHS). Costs of drugs, VTE management, and complications were obtained from several Spanish data sources (€, 2014). A 3% discount rate was applied to both health outcomes and costs. Univariate and probabilistic sensitivity analyses (SA) were performed to assess the robustness of the results. Results: Apixaban was the more effective therapy, with 7.182 LYG and 5.865 QALYs, versus 7.160 LYG and 5.838 QALYs with LMWH/VKA. Furthermore, apixaban had a lower total cost (€13,374.70 vs €13,738.30). The probabilistic SA confirmed the dominance of apixaban (better health outcomes at lower cost) in 89% of the simulations. Conclusions: Apixaban 5 mg/12 h versus LMWH/VKA was an efficient therapeutic strategy for the treatment and prevention of recurrences of VTE from the NHS perspective.
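
A minimal sketch of a discounted Markov cohort model of this kind, assuming numpy; the states, transition probabilities, costs, and utilities below are invented placeholders, not the published model inputs.

```python
# Discounted Markov cohort model: accumulate costs and QALYs over a lifetime.
import numpy as np

states = ["on_treatment", "recurrent_VTE", "dead"]
P = np.array([[0.97, 0.02, 0.01],          # annual transition probabilities
              [0.00, 0.95, 0.05],
              [0.00, 0.00, 1.00]])
cost = np.array([800.0, 2500.0, 0.0])      # annual cost per state [EUR]
utility = np.array([0.85, 0.70, 0.0])      # annual QALY weight per state

discount = 0.03
cohort = np.array([1.0, 0.0, 0.0])         # everyone starts on treatment
total_cost = total_qaly = 0.0
for year in range(60):                     # lifetime horizon
    factor = 1.0 / (1.0 + discount) ** year
    total_cost += factor * cohort @ cost
    total_qaly += factor * cohort @ utility
    cohort = cohort @ P                    # advance the cohort one cycle

print(f"cost: {total_cost:.0f} EUR, QALYs: {total_qaly:.3f}")
```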

Relevance: 30.00%

Publisher:

Abstract:

Introduction: Studies on infant dietary intake do not generally focus on the types of liquids consumed. Objective: To document, by age and breastfeeding status, the types of liquids present in the diet of Mexican children under 1 year of age (< 1 y) who participated in the National Health and Nutrition Survey 2012 (ENSANUT-2012). Methods: Analysis of the feeding practices of infants < 1 y from the ENSANUT-2012 survey, in non-breastfed (non-BF) and breastfed (BF) infants by status quo, for the consumption of liquids grouped into: water; formula; fortified LICONSA milk; nutritive liquids (NL: thin cereal-based gruel with water or milk, and coffee with milk); and non-nutritive liquids (non-NL) such as sugared water, water-based drinks, tea, bean or chicken broth, aguamiel, and coffee. In these infants we also analyzed the ungrouped consumption of liquids during the first three days of life (newborns), based on the mother's recall. Percentages and 95% confidence intervals (CI) were calculated, adjusting for the survey design. Statistical differences were assessed with the Z test. Results: We observed high consumption of human milk, followed by formula (56.7%) and water (51.1%), in infants under 6 months of age (< 6 mo). The proportion of non-BF infants consuming non-NL was higher than that of BF infants (p < 0.05). More than 60% of older infants (6 mo to < 1 y) consumed formula and were non-BF. In newborns, formula consumption was predominant, followed by tea or infusions and water. Conclusions: Liquids other than breast milk are undesirably present in Mexican infants' diets, and non-NL are consumed earlier than NL, revealing inadequate early dietary practices.

Relevance: 30.00%

Publisher:

Abstract:

The U.S. Nuclear Regulatory Commission implemented a safety goal policy in response to the 1979 Three Mile Island accident. This policy addresses the question “How safe is safe enough?” by specifying quantitative health objectives (QHOs) for comparison with results from nuclear power plant (NPP) probabilistic risk analyses (PRAs) to determine whether proposed regulatory actions are justified based on potential safety benefit. Lessons learned from recent operating experience—including the 2011 Fukushima accident—indicate that accidents involving multiple units at a shared site can occur with non-negligible frequency. Yet risk contributions from such scenarios are excluded by policy from safety goal evaluations—even for the nearly 60% of U.S. NPP sites that include multiple units. This research develops and applies methods for estimating risk metrics for comparison with safety goal QHOs using models from state-of-the-art consequence analyses to evaluate the effect of including multi-unit accident risk contributions in safety goal evaluations.
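
The arithmetic behind such an evaluation can be sketched in a few lines of Python; every number below is an illustrative placeholder, not a value from this research or from NRC policy documents.

```python
# Sketch: aggregate individual risk with and without multi-unit contributions,
# then compare against a QHO threshold. All numbers are illustrative only.

# Each scenario: (frequency per site-year, conditional individual early
# fatality risk near the site boundary).
single_unit_scenarios = [(1e-6, 1e-3), (5e-7, 5e-3)]
multi_unit_scenarios = [(2e-7, 8e-3)]   # excluded under current policy

def individual_risk(scenarios):
    """Frequency-weighted individual risk summed over accident scenarios."""
    return sum(freq * cond_risk for freq, cond_risk in scenarios)

QHO = 5e-7   # illustrative early-fatality QHO threshold [1/yr]

risk_single = individual_risk(single_unit_scenarios)
risk_all = individual_risk(single_unit_scenarios + multi_unit_scenarios)
print(f"single-unit only: {risk_single:.2e}/yr, meets QHO: {risk_single < QHO}")
print(f"incl. multi-unit: {risk_all:.2e}/yr, meets QHO: {risk_all < QHO}")
```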

Relevance: 30.00%

Publisher:

Abstract:

For climate risk management, cumulative distribution functions (CDFs) are an important source of information. They are ideally suited to compare probabilistic forecasts of primary (e.g. rainfall) or secondary data (e.g. crop yields). Summarised as CDFs, such forecasts allow an easy quantitative assessment of possible, alternative actions. Although the degree of uncertainty associated with CDF estimation could influence decisions, such information is rarely provided. Hence, we propose Cox-type regression models (CRMs) as a statistical framework for making inferences on CDFs in climate science. CRMs were designed for modelling probability distributions rather than just mean or median values. This makes the approach appealing for risk assessments where probabilities of extremes are often more informative than central tendency measures. CRMs are semi-parametric approaches originally designed for modelling risks arising from time-to-event data. Here we extend this original concept beyond time-dependent measures to other variables of interest. We also provide tools for estimating CDFs and surrounding uncertainty envelopes from empirical data. These statistical techniques intrinsically account for non-stationarities in time series that might be the result of climate change. This feature makes CRMs attractive candidates to investigate the feasibility of developing rigorous global circulation model (GCM)-CRM interfaces for provision of user-relevant forecasts. To demonstrate the applicability of CRMs, we present two examples for El Niño/Southern Oscillation (ENSO)-based forecasts: the onset date of the wet season (Cairns, Australia) and total wet season rainfall (Quixeramobim, Brazil). This study emphasises the methodological aspects of CRMs rather than discussing merits or limitations of the ENSO-based predictors.
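
A minimal sketch of the idea, assuming the lifelines package and synthetic data: fit a Cox proportional hazards model with an ENSO covariate and read off the CDF as one minus the predicted survival function. The variable names and synthetic distributions are assumptions for illustration only.

```python
# Cox regression as a CDF estimator for a climate variable (synthetic data).
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(1)
n = 300
enso = rng.integers(0, 2, size=n)               # 0 = La Nina, 1 = El Nino
# Synthetic wet-season onset date (days after 1 Oct), later under El Nino.
onset = rng.gamma(shape=5.0, scale=8.0 + 4.0 * enso)
df = pd.DataFrame({"onset": onset, "observed": 1, "enso": enso})

cph = CoxPHFitter()
cph.fit(df, duration_col="onset", event_col="observed")

# CDF of the onset date for each ENSO phase: F(t) = 1 - S(t).
covariates = pd.DataFrame({"enso": [0, 1]})
cdf = 1.0 - cph.predict_survival_function(covariates)
print(cdf.iloc[::20])   # probability the wet season has started by day t
```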

Relevance: 30.00%

Publisher:

Abstract:

The aim of the present study was to propose and evaluate the use of factor analysis (FA) to obtain latent variables (factors) that represent a set of pig traits simultaneously, for use in genome-wide selection (GWS) studies. We used an outbred F2 population obtained from crosses between Brazilian Piau and commercial pigs. Data were obtained on 345 F2 pigs, genotyped for 237 SNPs, with 41 traits. FA allowed us to obtain four biologically interpretable factors: "weight", "fat", "loin", and "performance". These factors were used as dependent variables in multiple regression models of genomic selection (Bayes A, Bayes B, RR-BLUP, and Bayesian LASSO). FA is presented as an interesting alternative for selecting individuals for multiple variables simultaneously in GWS studies; accuracy measurements for the factors were similar to those obtained when the original traits were considered individually. The overlap between the top 10% of individuals selected by each factor and those selected by the individual traits was also satisfactory. Moreover, the estimated marker effects for the traits were similar to those found for the relevant factor.
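
The two-step idea can be sketched with scikit-learn, as below: extract latent factors from the traits, then regress a factor on the SNP markers with a ridge penalty as an RR-BLUP-like stand-in for the Bayesian models used in the study. All data here are synthetic placeholders.

```python
# Factor analysis on traits, then genomic prediction of a factor from SNPs.
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.linear_model import Ridge

rng = np.random.default_rng(2)
n_animals, n_traits, n_snps = 345, 41, 237
snps = rng.integers(0, 3, size=(n_animals, n_snps)).astype(float)  # 0/1/2 coding
traits = rng.normal(size=(n_animals, n_traits))                    # placeholder

# Step 1: summarize the 41 traits into 4 interpretable latent factors.
fa = FactorAnalysis(n_components=4, random_state=0)
factors = fa.fit_transform(traits)

# Step 2: genomic prediction of one factor (e.g. "weight") from the markers.
model = Ridge(alpha=10.0).fit(snps, factors[:, 0])
marker_effects = model.coef_
predicted = model.predict(snps)
print(marker_effects[:5], np.corrcoef(predicted, factors[:, 0])[0, 1])
```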

Relevance: 30.00%

Publisher:

Abstract:

Master's dissertation, Universidade de Brasília, Faculdade de Economia, Administração e Contabilidade, Programa de Pós-Graduação em Administração, 2016.

Relevance: 30.00%

Publisher:

Abstract:

The study examined changes in doctors' working hours and satisfaction with working hours over five time points, and explored the influence of personal characteristics on these outcomes. Latent growth curve modeling was applied to data from the Medicine in Australia: Balancing Employment and Life survey, collected from 2008 to 2012. Findings showed that working hours declined significantly over time, with a greater decrease among males, older doctors, and doctors with fewer children. Satisfaction increased faster over time among specialists, doctors with poorer health, those whose partners did not work full-time, and those with older children. The more hours doctors worked initially, the lower the satisfaction they reported and the greater their subsequent increase in satisfaction. The findings are consistent with a culture change in the medical profession, whereby long working hours are no longer seen as synonymous with professionalism. This is important to take into account when projecting future workforce supply.
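
Latent growth curve models are commonly approximated by mixed models with random intercepts and slopes; a minimal sketch using statsmodels on synthetic long-format data (not the actual survey data) follows.

```python
# Random-intercept, random-slope growth model for repeated working-hours data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n_doctors, n_waves = 200, 5
doctor = np.repeat(np.arange(n_doctors), n_waves)
wave = np.tile(np.arange(n_waves), n_doctors)
slope = rng.normal(-0.8, 0.3, size=n_doctors)        # hours decline over time
hours = (45 + rng.normal(0, 5, size=n_doctors)[doctor]
         + slope[doctor] * wave
         + rng.normal(0, 2, size=n_doctors * n_waves))

df = pd.DataFrame({"doctor": doctor, "wave": wave, "hours": hours})

# Random intercept and random slope for each doctor over the five waves.
model = smf.mixedlm("hours ~ wave", df, groups=df["doctor"], re_formula="~wave")
fit = model.fit()
print(fit.params["wave"])   # average change in weekly hours per wave
```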

Relevance: 30.00%

Publisher:

Abstract:

In this paper, we study the challenging problem of mining the data-generating rules and state-transforming rules (i.e., semantics) underlying multiple correlated time series streams. A novel Correlation field-based Semantics Learning Framework (CfSLF) is proposed to learn these semantics. In the framework, we use a Hidden Markov Random Field (HMRF) to model the relationship between latent states and observations in the multiple correlated time series, thereby learning the data-generating rules. The state-transforming rules are learned from the corresponding latent state sequences of the multiple time series based on their Markov chain character. The reusable semantics learned by CfSLF can be fed into various analysis tools, such as prediction or anomaly detection. Moreover, we present two algorithms based on the learned semantics that can be applied to next-n-step prediction and anomaly detection. Experiments on real-world data sets demonstrate the efficiency and effectiveness of the proposed method.
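
As a simplified stand-in for the HMRF-based learning (an ordinary Gaussian HMM, assuming the hmmlearn package and synthetic streams), the sketch below learns per-state emission distributions (the "data generating rules") and empirical transitions between decoded states (the "state transforming rules").

```python
# Latent-state learning on two correlated streams with a Gaussian HMM.
import numpy as np
from hmmlearn.hmm import GaussianHMM

rng = np.random.default_rng(4)
# Two correlated time series streams stacked as a multivariate observation.
t = np.arange(1000)
base = np.where((t // 100) % 2 == 0, 0.0, 3.0)        # alternating regimes
streams = np.column_stack([base + rng.normal(0, 0.5, t.size),
                           2 * base + rng.normal(0, 0.5, t.size)])

# Data-generating rules: per-state emission distributions learned by the HMM.
hmm = GaussianHMM(n_components=2, covariance_type="diag", n_iter=50,
                  random_state=0)
hmm.fit(streams)
states = hmm.predict(streams)

# State-transforming rules: empirical transition probabilities between the
# decoded latent states.
counts = np.zeros((2, 2))
for a, b in zip(states[:-1], states[1:]):
    counts[a, b] += 1
print(counts / counts.sum(axis=1, keepdims=True))
```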