898 resultados para Exponential Random Graph Model


Relevância:

30.00% 30.00%

Publicador:

Resumo:

In the past decade, the advent of efficient genome sequencing tools and high-throughput experimental biotechnology has lead to enormous progress in the life science. Among the most important innovations is the microarray tecnology. It allows to quantify the expression for thousands of genes simultaneously by measurin the hybridization from a tissue of interest to probes on a small glass or plastic slide. The characteristics of these data include a fair amount of random noise, a predictor dimension in the thousand, and a sample noise in the dozens. One of the most exciting areas to which microarray technology has been applied is the challenge of deciphering complex disease such as cancer. In these studies, samples are taken from two or more groups of individuals with heterogeneous phenotypes, pathologies, or clinical outcomes. these samples are hybridized to microarrays in an effort to find a small number of genes which are strongly correlated with the group of individuals. Eventhough today methods to analyse the data are welle developed and close to reach a standard organization (through the effort of preposed International project like Microarray Gene Expression Data -MGED- Society [1]) it is not unfrequant to stumble in a clinician's question that do not have a compelling statistical method that could permit to answer it.The contribution of this dissertation in deciphering disease regards the development of new approaches aiming at handle open problems posed by clinicians in handle specific experimental designs. In Chapter 1 starting from a biological necessary introduction, we revise the microarray tecnologies and all the important steps that involve an experiment from the production of the array, to the quality controls ending with preprocessing steps that will be used into the data analysis in the rest of the dissertation. While in Chapter 2 a critical review of standard analysis methods are provided stressing most of problems that In Chapter 3 is introduced a method to adress the issue of unbalanced design of miacroarray experiments. In microarray experiments, experimental design is a crucial starting-point for obtaining reasonable results. In a two-class problem, an equal or similar number of samples it should be collected between the two classes. However in some cases, e.g. rare pathologies, the approach to be taken is less evident. We propose to address this issue by applying a modified version of SAM [2]. MultiSAM consists in a reiterated application of a SAM analysis, comparing the less populated class (LPC) with 1,000 random samplings of the same size from the more populated class (MPC) A list of the differentially expressed genes is generated for each SAM application. After 1,000 reiterations, each single probe given a "score" ranging from 0 to 1,000 based on its recurrence in the 1,000 lists as differentially expressed. The performance of MultiSAM was compared to the performance of SAM and LIMMA [3] over two simulated data sets via beta and exponential distribution. The results of all three algorithms over low- noise data sets seems acceptable However, on a real unbalanced two-channel data set reagardin Chronic Lymphocitic Leukemia, LIMMA finds no significant probe, SAM finds 23 significantly changed probes but cannot separate the two classes, while MultiSAM finds 122 probes with score >300 and separates the data into two clusters by hierarchical clustering. We also report extra-assay validation in terms of differentially expressed genes Although standard algorithms perform well over low-noise simulated data sets, multi-SAM seems to be the only one able to reveal subtle differences in gene expression profiles on real unbalanced data. In Chapter 4 a method to adress similarities evaluation in a three-class prblem by means of Relevance Vector Machine [4] is described. In fact, looking at microarray data in a prognostic and diagnostic clinical framework, not only differences could have a crucial role. In some cases similarities can give useful and, sometimes even more, important information. The goal, given three classes, could be to establish, with a certain level of confidence, if the third one is similar to the first or the second one. In this work we show that Relevance Vector Machine (RVM) [2] could be a possible solutions to the limitation of standard supervised classification. In fact, RVM offers many advantages compared, for example, with his well-known precursor (Support Vector Machine - SVM [3]). Among these advantages, the estimate of posterior probability of class membership represents a key feature to address the similarity issue. This is a highly important, but often overlooked, option of any practical pattern recognition system. We focused on Tumor-Grade-three-class problem, so we have 67 samples of grade I (G1), 54 samples of grade 3 (G3) and 100 samples of grade 2 (G2). The goal is to find a model able to separate G1 from G3, then evaluate the third class G2 as test-set to obtain the probability for samples of G2 to be member of class G1 or class G3. The analysis showed that breast cancer samples of grade II have a molecular profile more similar to breast cancer samples of grade I. Looking at the literature this result have been guessed, but no measure of significance was gived before.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This work provides a forward step in the study and comprehension of the relationships between stochastic processes and a certain class of integral-partial differential equation, which can be used in order to model anomalous diffusion and transport in statistical physics. In the first part, we brought the reader through the fundamental notions of probability and stochastic processes, stochastic integration and stochastic differential equations as well. In particular, within the study of H-sssi processes, we focused on fractional Brownian motion (fBm) and its discrete-time increment process, the fractional Gaussian noise (fGn), which provide examples of non-Markovian Gaussian processes. The fGn, together with stationary FARIMA processes, is widely used in the modeling and estimation of long-memory, or long-range dependence (LRD). Time series manifesting long-range dependence, are often observed in nature especially in physics, meteorology, climatology, but also in hydrology, geophysics, economy and many others. We deepely studied LRD, giving many real data examples, providing statistical analysis and introducing parametric methods of estimation. Then, we introduced the theory of fractional integrals and derivatives, which indeed turns out to be very appropriate for studying and modeling systems with long-memory properties. After having introduced the basics concepts, we provided many examples and applications. For instance, we investigated the relaxation equation with distributed order time-fractional derivatives, which describes models characterized by a strong memory component and can be used to model relaxation in complex systems, which deviates from the classical exponential Debye pattern. Then, we focused in the study of generalizations of the standard diffusion equation, by passing through the preliminary study of the fractional forward drift equation. Such generalizations have been obtained by using fractional integrals and derivatives of distributed orders. In order to find a connection between the anomalous diffusion described by these equations and the long-range dependence, we introduced and studied the generalized grey Brownian motion (ggBm), which is actually a parametric class of H-sssi processes, which have indeed marginal probability density function evolving in time according to a partial integro-differential equation of fractional type. The ggBm is of course Non-Markovian. All around the work, we have remarked many times that, starting from a master equation of a probability density function f(x,t), it is always possible to define an equivalence class of stochastic processes with the same marginal density function f(x,t). All these processes provide suitable stochastic models for the starting equation. Studying the ggBm, we just focused on a subclass made up of processes with stationary increments. The ggBm has been defined canonically in the so called grey noise space. However, we have been able to provide a characterization notwithstanding the underline probability space. We also pointed out that that the generalized grey Brownian motion is a direct generalization of a Gaussian process and in particular it generalizes Brownain motion and fractional Brownain motion as well. Finally, we introduced and analyzed a more general class of diffusion type equations related to certain non-Markovian stochastic processes. We started from the forward drift equation, which have been made non-local in time by the introduction of a suitable chosen memory kernel K(t). The resulting non-Markovian equation has been interpreted in a natural way as the evolution equation of the marginal density function of a random time process l(t). We then consider the subordinated process Y(t)=X(l(t)) where X(t) is a Markovian diffusion. The corresponding time-evolution of the marginal density function of Y(t) is governed by a non-Markovian Fokker-Planck equation which involves the same memory kernel K(t). We developed several applications and derived the exact solutions. Moreover, we considered different stochastic models for the given equations, providing path simulations.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

[EN] This paper deals with the study of some new properties of the intrinsic order graph. The intrinsic order graph is the natural graphical representation of a complex stochastic Boolean system (CSBS). A CSBS is a system depending on an arbitrarily large number n of mutually independent random Boolean variables. The intrinsic order graph displays its 2n vertices (associated to the CSBS) from top to bottom, in decreasing order of their occurrence probabilities. New relations between the intrinsic ordering and the Hamming weight (i.e., the number of 1-bits in a binary n-tuple) are derived. Further, the distribution of the weights of the 2n nodes in the intrinsic order graph is analyzed…

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In the present work we perform an econometric analysis of the Tribal art market. To this aim, we use a unique and original database that includes information on Tribal art market auctions worldwide from 1998 to 2011. In Literature, art prices are modelled through the hedonic regression model, a classic fixed-effect model. The main drawback of the hedonic approach is the large number of parameters, since, in general, art data include many categorical variables. In this work, we propose a multilevel model for the analysis of Tribal art prices that takes into account the influence of time on artwork prices. In fact, it is natural to assume that time exerts an influence over the price dynamics in various ways. Nevertheless, since the set of objects change at every auction date, we do not have repeated measurements of the same items over time. Hence, the dataset does not constitute a proper panel; rather, it has a two-level structure in that items, level-1 units, are grouped in time points, level-2 units. The main theoretical contribution is the extension of classical multilevel models to cope with the case described above. In particular, we introduce a model with time dependent random effects at the second level. We propose a novel specification of the model, derive the maximum likelihood estimators and implement them through the E-M algorithm. We test the finite sample properties of the estimators and the validity of the own-written R-code by means of a simulation study. Finally, we show that the new model improves considerably the fit of the Tribal art data with respect to both the hedonic regression model and the classic multilevel model.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This thesis deals with heterogeneous architectures in standard workstations. Heterogeneous architectures represent an appealing alternative to traditional supercomputers because they are based on commodity components fabricated in large quantities. Hence their price-performance ratio is unparalleled in the world of high performance computing (HPC). In particular, different aspects related to the performance and consumption of heterogeneous architectures have been explored. The thesis initially focuses on an efficient implementation of a parallel application, where the execution time is dominated by an high number of floating point instructions. Then the thesis touches the central problem of efficient management of power peaks in heterogeneous computing systems. Finally it discusses a memory-bounded problem, where the execution time is dominated by the memory latency. Specifically, the following main contributions have been carried out: A novel framework for the design and analysis of solar field for Central Receiver Systems (CRS) has been developed. The implementation based on desktop workstation equipped with multiple Graphics Processing Units (GPUs) is motivated by the need to have an accurate and fast simulation environment for studying mirror imperfection and non-planar geometries. Secondly, a power-aware scheduling algorithm on heterogeneous CPU-GPU architectures, based on an efficient distribution of the computing workload to the resources, has been realized. The scheduler manages the resources of several computing nodes with a view to reducing the peak power. The two main contributions of this work follow: the approach reduces the supply cost due to high peak power whilst having negligible impact on the parallelism of computational nodes. from another point of view the developed model allows designer to increase the number of cores without increasing the capacity of the power supply unit. Finally, an implementation for efficient graph exploration on reconfigurable architectures is presented. The purpose is to accelerate graph exploration, reducing the number of random memory accesses.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The ability to represent the transport and fate of an oil slick at the sea surface is a formidable task. By using an accurate numerical representation of oil evolution and movement in seawater, the possibility to asses and reduce the oil-spill pollution risk can be greatly improved. The blowing of the wind on the sea surface generates ocean waves, which give rise to transport of pollutants by wave-induced velocities that are known as Stokes’ Drift velocities. The Stokes’ Drift transport associated to a random gravity wave field is a function of the wave Energy Spectra that statistically fully describe it and that can be provided by a wave numerical model. Therefore, in order to perform an accurate numerical simulation of the oil motion in seawater, a coupling of the oil-spill model with a wave forecasting model is needed. In this Thesis work, the coupling of the MEDSLIK-II oil-spill numerical model with the SWAN wind-wave numerical model has been performed and tested. In order to improve the knowledge of the wind-wave model and its numerical performances, a preliminary sensitivity study to different SWAN model configuration has been carried out. The SWAN model results have been compared with the ISPRA directional buoys located at Venezia, Ancona and Monopoli and the best model settings have been detected. Then, high resolution currents provided by a relocatable model (SURF) have been used to force both the wave and the oil-spill models and its coupling with the SWAN model has been tested. The trajectories of four drifters have been simulated by using JONSWAP parametric spectra or SWAN directional-frequency energy output spectra and results have been compared with the real paths traveled by the drifters.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This thesis deals with three different physical models, where each model involves a random component which is linked to a cubic lattice. First, a model is studied, which is used in numerical calculations of Quantum Chromodynamics.In these calculations random gauge-fields are distributed on the bonds of the lattice. The formulation of the model is fitted into the mathematical framework of ergodic operator families. We prove, that for small coupling constants, the ergodicity of the underlying probability measure is indeed ensured and that the integrated density of states of the Wilson-Dirac operator exists. The physical situations treated in the next two chapters are more similar to one another. In both cases the principle idea is to study a fermion system in a cubic crystal with impurities, that are modeled by a random potential located at the lattice sites. In the second model we apply the Hartree-Fock approximation to such a system. For the case of reduced Hartree-Fock theory at positive temperatures and a fixed chemical potential we consider the limit of an infinite system. In that case we show the existence and uniqueness of minimizers of the Hartree-Fock functional. In the third model we formulate the fermion system algebraically via C*-algebras. The question imposed here is to calculate the heat production of the system under the influence of an outer electromagnetic field. We show that the heat production corresponds exactly to what is empirically predicted by Joule's law in the regime of linear response.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In questa tesi si è studiato l’insorgere di eventi critici in un semplice modello neurale del tipo Integrate and Fire, basato su processi dinamici stocastici markoviani definiti su una rete. Il segnale neurale elettrico è stato modellato da un flusso di particelle. Si è concentrata l’attenzione sulla fase transiente del sistema, cercando di identificare fenomeni simili alla sincronizzazione neurale, la quale può essere considerata un evento critico. Sono state studiate reti particolarmente semplici, trovando che il modello proposto ha la capacità di produrre effetti "a cascata" nell’attività neurale, dovuti a Self Organized Criticality (auto organizzazione del sistema in stati instabili); questi effetti non vengono invece osservati in Random Walks sulle stesse reti. Si è visto che un piccolo stimolo random è capace di generare nell’attività della rete delle fluttuazioni notevoli, in particolar modo se il sistema si trova in una fase al limite dell’equilibrio. I picchi di attività così rilevati sono stati interpretati come valanghe di segnale neurale, fenomeno riconducibile alla sincronizzazione.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We propose a novel methodology to generate realistic network flow traces to enable systematic evaluation of network monitoring systems in various traffic conditions. Our technique uses a graph-based approach to model the communication structure observed in real-world traces and to extract traffic templates. By combining extracted and user-defined traffic templates, realistic network flow traces that comprise normal traffic and customized conditions are generated in a scalable manner. A proof-of-concept implementation demonstrates the utility and simplicity of our method to produce a variety of evaluation scenarios. We show that the extraction of templates from real-world traffic leads to a manageable number of templates that still enable accurate re-creation of the original communication properties on the network flow level.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We use a conceptual model to investigate how randomly varying building heights within a city affect the atmospheric drag forces and the aerodynamic roughness length of the city. The model is based on the assumptions regarding wake spreading and mutual sheltering effects proposed by Raupach (Boundary-Layer Meteorol 60:375-395, 1992). It is applied both to canopies having uniform building heights and to those having the same building density and mean height, but with variability about the mean. For each simulated urban area, a correction is determined, due to height variability, to the shear stress predicted for the uniform building height case. It is found that u (*)/u (*R) , where u (*) is the friction velocity and u (*R) is the friction velocity from the uniform building height case, is expressed well as an algebraic function of lambda and sigma (h) /h (m) , where lambda is the frontal area index, sigma (h) is the standard deviation of the building height, and h (m) is the mean building height. The simulations also resulted in a simple algebraic relation for z (0)/z (0R) as a function of lambda and sigma (h) /h (m) , where z (0) is the aerodynamic roughness length and z (0R) is z (0) found from the original Raupach formulation for a uniform canopy. Model results are in keeping with those of several previous studies.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Full axon counting of optic nerve cross-sections represents the most accurate method to quantify axonal damage, but such analysis is very labour intensive. Recently, a new method has been developed, termed targeted sampling, which combines the salient features of a grading scheme with axon counting. Preliminary findings revealed the method compared favourably with random sampling. The aim of the current study was to advance our understanding of the effect of sampling patterns on axon counts by comparing estimated axon counts from targeted sampling with those obtained from fixed-pattern sampling in a large collection of optic nerves with different severities of axonal injury.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Optical coherence tomography (OCT) is a well-established image modality in ophthalmology and used daily in the clinic. Automatic evaluation of such datasets requires an accurate segmentation of the retinal cell layers. However, due to the naturally low signal to noise ratio and the resulting bad image quality, this task remains challenging. We propose an automatic graph-based multi-surface segmentation algorithm that internally uses soft constraints to add prior information from a learned model. This improves the accuracy of the segmentation and increase the robustness to noise. Furthermore, we show that the graph size can be greatly reduced by applying a smart segmentation scheme. This allows the segmentation to be computed in seconds instead of minutes, without deteriorating the segmentation accuracy, making it ideal for a clinical setup. An extensive evaluation on 20 OCT datasets of healthy eyes was performed and showed a mean unsigned segmentation error of 3.05 ±0.54 μm over all datasets when compared to the average observer, which is lower than the inter-observer variability. Similar performance was measured for the task of drusen segmentation, demonstrating the usefulness of using soft constraints as a tool to deal with pathologies.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A cascading failure is a failure in a system of interconnected parts, in which the breakdown of one element can lead to the subsequent collapse of the others. The aim of this paper is to introduce a simple combinatorial model for the study of cascading failures. In particular, having in mind particle systems and Markov random fields, we take into consideration a network of interacting urns displaced over a lattice. Every urn is Pólya-like and its reinforcement matrix is not only a function of time (time contagion) but also of the behavior of the neighboring urns (spatial contagion), and of a random component, which can represent either simple fate or the impact of exogenous factors. In this way a non-trivial dependence structure among the urns is built, and it is used to study default avalanches over the lattice. Thanks to its flexibility and its interesting probabilistic properties, the given construction may be used to model different phenomena characterized by cascading failures such as power grids and financial networks.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The problem of estimating the numbers of motor units N in a muscle is embedded in a general stochastic model using the notion of thinning from point process theory. In the paper a new moment type estimator for the numbers of motor units in a muscle is denned, which is derived using random sums with independently thinned terms. Asymptotic normality of the estimator is shown and its practical value is demonstrated with bootstrap and approximative confidence intervals for a data set from a 31-year-old healthy right-handed, female volunteer. Moreover simulation results are presented and Monte-Carlo based quantiles, means, and variances are calculated for N in{300,600,1000}.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Despite the widespread popularity of linear models for correlated outcomes (e.g. linear mixed modesl and time series models), distribution diagnostic methodology remains relatively underdeveloped in this context. In this paper we present an easy-to-implement approach that lends itself to graphical displays of model fit. Our approach involves multiplying the estimated marginal residual vector by the Cholesky decomposition of the inverse of the estimated marginal variance matrix. Linear functions or the resulting "rotated" residuals are used to construct an empirical cumulative distribution function (ECDF), whose stochastic limit is characterized. We describe a resampling technique that serves as a computationally efficient parametric bootstrap for generating representatives of the stochastic limit of the ECDF. Through functionals, such representatives are used to construct global tests for the hypothesis of normal margional errors. In addition, we demonstrate that the ECDF of the predicted random effects, as described by Lange and Ryan (1989), can be formulated as a special case of our approach. Thus, our method supports both omnibus and directed tests. Our method works well in a variety of circumstances, including models having independent units of sampling (clustered data) and models for which all observations are correlated (e.g., a single time series).