897 results for Transformation-based semi-parametric estimators



Relevance:

100.00%

Publisher:

Abstract:

Semi-supervised learning is an important topic in machine learning, concerned with pattern classification when only a small subset of the data is labeled. In this paper, a new network-based (or graph-based) semi-supervised classification model is proposed. It employs a combined random-greedy walk of particles, with competition and cooperation mechanisms, to propagate class labels to the whole network. Due to the competition mechanism, the proposed model spreads labels in a local fashion: each particle visits only a portion of the nodes potentially belonging to it, and is not allowed to visit nodes definitely occupied by particles of other classes. In this way, a "divide-and-conquer" effect is naturally embedded in the model. As a result, the proposed model achieves a good classification rate while exhibiting low computational complexity in comparison to other network-based semi-supervised algorithms. Computer simulations on synthetic and real-world data sets quantify the performance of the method.
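The competition dynamics described in this abstract can be illustrated with a short sketch. The following is a hypothetical, heavily simplified Python implementation (the function name, parameters, and update rule are assumptions, not the authors' specification; the cooperation and node-ownership rules of the published model are omitted): each labeled node spawns a particle, particles mix greedy and random moves, and visited nodes shift their per-class domination levels.

```python
import numpy as np

def particle_competition(adj, labels, n_iters=20000, p_greedy=0.6, delta=0.1, seed=0):
    """Toy particle-competition label propagation (illustrative sketch only).

    adj    : (n, n) 0/1 adjacency matrix of the network
    labels : length-n int array; class id for labeled nodes, -1 otherwise
    """
    rng = np.random.default_rng(seed)
    classes = np.unique(labels[labels >= 0])
    k = len(classes)
    cls_idx = {c: i for i, c in enumerate(classes)}
    n = adj.shape[0]
    domination = np.full((n, k), 1.0 / k)    # per-node, per-class domination level
    particle_class = [cls_idx[l] for l in labels if l >= 0]
    positions = [i for i, l in enumerate(labels) if l >= 0]

    for _ in range(n_iters):
        for p, ci in enumerate(particle_class):
            nbrs = np.flatnonzero(adj[positions[p]])
            if nbrs.size == 0:
                continue
            if rng.random() < p_greedy:      # greedy move: prefer nodes we dominate
                nxt = nbrs[np.argmax(domination[nbrs, ci])]
            else:                            # random move: explore the network
                nxt = rng.choice(nbrs)
            if labels[nxt] < 0:              # labeled nodes are never contested
                dom = domination[nxt]
                dom -= delta / max(k - 1, 1)               # competition: weaken rivals
                dom[ci] += delta + delta / max(k - 1, 1)   # ...and strengthen the owner
                np.clip(dom, 0.0, 1.0, out=dom)
            positions[p] = nxt

    pred = labels.copy()
    unl = labels < 0
    pred[unl] = classes[np.argmax(domination[unl], axis=1)]
    return pred
```

On a graph with two labeled communities, the domination levels of unlabeled nodes drift toward the nearest seeded class, which is the "divide-and-conquer" effect the abstract describes.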

Relevance:

100.00%

Publisher:

Abstract:

Semi-supervised learning techniques have gained increasing attention in the machine learning community, driven by two main factors: (1) the amount of available data is growing exponentially; (2) labeling data is cumbersome and expensive, as it involves human experts. In this paper, we propose a network-based semi-supervised learning method inspired by the modularity greedy algorithm, which was originally devised for unsupervised learning. The modularity-maximization process is modified so that the model propagates labels throughout the network. Furthermore, a network reduction technique is introduced, together with an extensive analysis of its impact on the network. Computer simulations are performed on artificial and real-world databases, providing a quantitative assessment of the performance of the proposed method.
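To make the modularity adaptation concrete, here is a hedged sketch of one way labels could be spread by greedy modularity maximisation. It uses the standard Louvain-style gain formula and is not the authors' algorithm; all names are assumptions, and the paper's network-reduction step is not reproduced.

```python
import numpy as np

def modularity_label_spread(adj, labels):
    """Greedy, modularity-driven label spreading (hypothetical sketch).

    adj    : (n, n) 0/1 adjacency matrix
    labels : length-n int array; class id for labeled nodes, -1 otherwise
    """
    assign = labels.copy()                        # -1 marks unassigned nodes
    deg = adj.sum(axis=1)
    m = adj.sum() / 2.0                           # number of edges
    classes = np.unique(labels[labels >= 0])
    sigma_tot = {c: deg[assign == c].sum() for c in classes}

    while (assign < 0).any():
        best_gain, best_node, best_cls = -np.inf, None, None
        for i in np.flatnonzero(assign < 0):
            nbr_cls = assign[np.flatnonzero(adj[i])]
            for c in np.unique(nbr_cls[nbr_cls >= 0]):
                k_in = adj[i, assign == c].sum()  # edges from i into community c
                # Louvain-style modularity gain of absorbing node i into class c
                gain = k_in / m - sigma_tot[c] * deg[i] / (2.0 * m ** 2)
                if gain > best_gain:
                    best_gain, best_node, best_cls = gain, i, c
        if best_node is None:                     # frontier exhausted (disconnected)
            break
        assign[best_node] = best_cls
        sigma_tot[best_cls] += deg[best_node]
    return assign
```

Each iteration absorbs the single unlabeled node whose assignment yields the largest modularity gain, mimicking one greedy maximisation step per label assignment.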

Relevance:

100.00%

Publisher:

Abstract:

We present a new strategy, based on the meccano method [1, 2, 3], to construct a T-spline parameterization of 2D geometries for the application of isogeometric analysis. The proposed method demands only a boundary representation of the geometry as input data. The algorithm produces a high-quality parametric transformation between the 2D object and the parametric domain, the unit square. The key to the method lies in defining an isomorphic transformation between the parametric and physical T-meshes, finding the optimal positions of the interior nodes by applying a new T-mesh untangling and smoothing procedure. The bivariate T-spline representation is calculated by imposing interpolation conditions at points located both in the interior and on the boundary of the geometry…
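The untangling and smoothing step can be pictured with a much simpler stand-in: plain Laplacian smoothing of interior nodes with the boundary held fixed. The actual method optimises a mesh-quality functional; this sketch (all names hypothetical) only conveys the idea of relocating interior nodes while the boundary parameterization stays put.

```python
import numpy as np

def smooth_interior_nodes(coords, edges, is_boundary, n_iters=100, step=0.5):
    """Laplacian smoothing of a 2D mesh: a simplified stand-in for the
    T-mesh untangling/smoothing procedure described above.

    coords      : (n, 2) node positions in the parametric domain
    edges       : iterable of (i, j) node-index pairs
    is_boundary : length-n boolean mask; boundary nodes stay fixed
    """
    coords = coords.astype(float).copy()
    nbrs = [[] for _ in range(len(coords))]
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)
    interior = np.flatnonzero(~is_boundary)
    for _ in range(n_iters):
        for i in interior:
            if nbrs[i]:
                # move each interior node toward the centroid of its neighbours
                centroid = coords[nbrs[i]].mean(axis=0)
                coords[i] += step * (centroid - coords[i])
    return coords
```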

Relevance:

100.00%

Publisher:

Abstract:

The thesis studies the economic and financial conditions of Italian households, using microeconomic data from the Survey on Household Income and Wealth (SHIW) over the period 1998-2006. It develops along two lines of enquiry. First, it studies the determinants of households' holdings of assets and liabilities and estimates their degree of correlation. After a review of the literature, it estimates two non-linear multivariate models of the interactions between assets and liabilities on repeated cross-sections. Second, it analyses households' financial difficulties. It defines a quantitative measure of financial distress and tests, by means of non-linear dynamic probit models, whether the probability of experiencing financial difficulties is persistent over time.

Chapter 1 provides a critical review of the theoretical and empirical literature on the estimation of asset and liability holdings, on their interactions and on household net wealth. The review stresses that much of the literature explains households' debt holdings as a function of, among other factors, net wealth, an assumption that raises potential endogeneity problems.

Chapter 2 defines two non-linear multivariate models to study the interactions between assets and liabilities held by Italian households, estimated on pooled SHIW cross-sections. The first model is a bivariate tobit that estimates the factors affecting assets and liabilities and their degree of correlation, with results consistent with theoretical expectations. Because non-normality and heteroskedasticity in the error term render the tobit estimators inconsistent, semi-parametric estimates are also provided; they confirm the results of the tobit model. The second model is a quadrivariate probit on three different assets (safe, risky and real) and total liabilities; the results show the patterns of interdependence suggested by theoretical considerations.

Chapter 3 reviews the methodologies for estimating non-linear dynamic panel data models, drawing attention to the problems that must be dealt with to obtain consistent estimators. Specific attention is given to the initial-conditions problem raised by including the lagged dependent variable among the explanatory variables. The advantage of dynamic panel data models is that they simultaneously account for true state dependence, via the lagged variable, and for unobserved heterogeneity, via the specification of individual effects.

Chapter 4 applies the models reviewed in Chapter 3 to analyse the financial difficulties of Italian households, using information on net wealth from the panel component of the SHIW. The aim is to test whether households persistently experience financial difficulties over time. A thorough discussion is provided of the alternative approaches proposed in the literature (subjective/qualitative indicators versus quantitative indexes) for identifying households in financial distress. Households in financial difficulty are identified as those holding net wealth below the first quartile of the net wealth distribution. Estimation is conducted with four different methods: the pooled probit model, the random-effects probit model with exogenous initial conditions, the Heckman model and the recently developed Wooldridge model. Results from all estimators support the hypothesis of true state dependence and show that, in line with the literature, the less sophisticated models, namely the pooled and exogenous-initial-conditions models, over-estimate this persistence.
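As an illustration of the estimation strategies compared in Chapter 4, the sketch below fits a pooled dynamic probit and a Wooldridge-style variant on mock panel data. All column names and data are invented, and the random-effects and Heckman estimators are not reproduced; this is a sketch of the general recipe, not the thesis's specification.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Hypothetical panel: one row per household-wave, binary distress indicator.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "household": np.repeat(np.arange(200), 4),
    "wave":      np.tile([1998, 2000, 2002, 2004], 200),
    "distress":  rng.integers(0, 2, 800),
    "income":    rng.normal(30, 8, 800),
}).sort_values(["household", "wave"])
df["distress_lag"] = df.groupby("household")["distress"].shift(1)
df["distress_0"] = df.groupby("household")["distress"].transform("first")
df["income_mean"] = df.groupby("household")["income"].transform("mean")
est = df.dropna(subset=["distress_lag"])

# Pooled probit with the lagged outcome: ignores unobserved heterogeneity,
# which is why it tends to over-state persistence.
pooled = sm.Probit(est["distress"],
                   sm.add_constant(est[["distress_lag", "income"]])).fit(disp=0)

# Wooldridge-style controls: the initial condition and within-household
# covariate means enter as extra regressors to absorb individual effects.
wool = sm.Probit(est["distress"],
                 sm.add_constant(est[["distress_lag", "income",
                                      "distress_0", "income_mean"]])).fit(disp=0)
print(pooled.params["distress_lag"], wool.params["distress_lag"])
```

Comparing the coefficient on the lagged indicator across the two fits mirrors the thesis's comparison of naive and heterogeneity-robust estimates of state dependence.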

Relevance:

100.00%

Publisher:

Abstract:

We analyze three sets of doubly-censored cohort data on incubation times, estimating incubation distributions using semi-parametric methods and assessing the comparability of the estimates. Weibull models appear to be inappropriate for at least one of the cohorts, and the estimates for the different cohorts are substantially different. We use these estimates as inputs for backcalculation, using a nonparametric method based on maximum penalized likelihood. The different incubation distributions all fit the reported AIDS counts as well as a nonstationary incubation distribution that models treatment effects, but the estimated infection curves are very different. We also develop a method for estimating nonstationarity as part of the backcalculation procedure and find that such estimates also depend heavily on the assumed incubation distribution. We conclude that incubation distributions are so uncertain that meaningful error bounds are difficult to place on backcalculated estimates, and that backcalculation may be too unreliable to be used without supplementary sources of information on HIV prevalence and incidence.
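The backcalculation logic, and its sensitivity to the assumed incubation distribution, can be sketched as a discrete convolution. The code below uses toy distributions and a roughness-penalised non-negative least squares as a crude stand-in for the paper's maximum penalized likelihood estimator; every number and distribution here is invented for illustration.

```python
import numpy as np
from scipy.optimize import nnls

# Forward model: expected AIDS counts are the convolution of the infection
# curve h with the incubation pmf f,  E[a_t] = sum_{s<=t} h_s * f_{t-s}.
T = 40
t = np.arange(T)
f = np.diff(1 - np.exp(-((t + 1) / 10.0) ** 2), prepend=0.0)  # toy incubation pmf
h_true = 100 * np.exp(-0.5 * ((t - 15) / 5.0) ** 2)           # toy infection curve
F = np.array([[f[i - j] if i >= j else 0.0 for j in range(T)] for i in range(T)])
a = F @ h_true                                                # expected AIDS counts

# Penalised recovery of h from a: a second-difference roughness penalty
# stabilises the notoriously ill-posed inversion.
lam = 5.0
D = np.diff(np.eye(T), 2, axis=0)            # (T-2, T) second-difference matrix
A_aug = np.vstack([F, lam * D])
b_aug = np.concatenate([a, np.zeros(T - 2)])
h_hat, _ = nnls(A_aug, b_aug)
# Re-running with a different f reproduces the paper's point: the fitted
# counts barely change, but the recovered infection curve h_hat does.
```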

Relevance:

100.00%

Publisher:

Abstract:

The objectives of this study were to identify and measure the average outcomes of the Open Door Mission's nine-month community-based substance abuse treatment program, identify predictors of successful outcomes, and make recommendations to the Open Door Mission for improving its treatment program. The Mission's program is exclusive to adult men with limited financial resources, most of whom were homeless or dependent on parents or other family members for basic living needs. Many, but not all, of these men are either chemically dependent or have a history of substance abuse. This study tracked a cohort of the Mission's graduates over the one-year study period and identified various indicators of success at short-term intervals, which may be predictive of longer-term outcomes. We tracked various levels of 12-step program involvement, as well as other social and spiritual activities, such as church affiliation and recovery support. Twenty-four of the 66 subjects (36%) met the Mission's requirements for success. With respect to the individual success criteria: fifty-four (82%) reported affiliation with a home church; twenty-six (39%) reported full-time employment; sixty-one (92%) did not report and were not identified as having any post-treatment arrests or incarceration; and forty (61%) reported continuous abstinence from both drugs and alcohol. Five research-based hypotheses were developed and tested. The primary analysis tool was B-Course, a web-based non-parametric dependency-modeling tool, which revealed strong associations among certain variables and helped the researchers generate and test several data-driven hypotheses. Full-time employment was the strongest predictor of abstinence: 95% of those who reported full-time employment also reported continuous post-treatment abstinence, compared with 50% of those working part-time and 29% of those with no employment. Working with a 12-step sponsor, attending aftercare, and service with others were also identified as predictors of abstinence. This study demonstrates that abstinence and the Mission's success criteria are not associated with any single social or behavioral factor; rather, these relationships are interdependent, and abstinence is achieved and maintained through a combination of several 12-step recovery activities. The study used a simple assessment methodology that demonstrated strong associations across variables and outcomes, with practical applicability to the Open Door Mission for improving its treatment program. By leveraging the predictive capability of the success-determination methodologies developed throughout this study, accurate outcomes can be identified with both validity and reliability. The assessment instrument can also be used as an intervention that, if administered to the Mission's clients during the primary treatment program, may measurably improve the effectiveness and outcomes of the Open Door Mission.
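The conditional abstinence rates quoted above amount to a simple cross-tabulation. The sketch below reproduces the reported percentages from mocked-up data; the counts are invented to match the published rates, since the study data are not available here.

```python
import pandas as pd

# Mock data chosen so the group rates match the reported 95% / 50% / 29%.
df = pd.DataFrame({
    "employment": ["full-time"] * 20 + ["part-time"] * 10 + ["none"] * 14,
    "abstinent":  [True] * 19 + [False] * 1 +
                  [True] * 5 + [False] * 5 +
                  [True] * 4 + [False] * 10,
})
rates = df.groupby("employment")["abstinent"].mean().round(2)
print(rates)   # conditional abstinence rate per employment category
```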

Relevance:

100.00%

Publisher:

Abstract:

The study of the reliability of components and systems is of great importance in several fields of engineering, and particularly in computer science. When analysing the lifetimes of the items in a sample, one must take into account the items that do not fail during the experiment, as well as those that fail for causes other than the one under study. New sampling schemes have therefore arisen to cover these cases; the most general of them, censored sampling, is the one considered in this work. Under this scheme, both the time to failure and the censoring time are random variables. Under the hypothesis that both times are exponentially distributed, Professor Hurt studied the asymptotic behaviour of the maximum likelihood estimator of the reliability function. Bayesian methods are in principle attractive for reliability studies because they incorporate into the analysis the prior information normally available in real problems. We therefore consider two Bayes estimators of the reliability of an exponential distribution: the mean and the mode of the posterior distribution. We calculate the asymptotic expansions of the mean, variance and mean squared error of both estimators when the censoring distribution is exponential, and we also obtain the asymptotic distribution of the estimators for the more general case of a Weibull censoring distribution. Two types of large-sample confidence intervals are proposed for each estimator. The results are compared with those of the maximum likelihood estimator and with those of two non-parametric estimators, the product-limit and a Bayesian estimator, with one of our estimators showing the best behaviour. Finally, we verify by simulation that our estimators are robust against the assumed censoring distribution, and that one of the proposed confidence intervals performs well in small samples; this simulation study also confirms the better behaviour of one of our estimators.
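The conjugate-prior calculation behind such Bayes estimators can be sketched directly: with exponential lifetimes and a Gamma prior on the failure rate, the posterior is again Gamma, and the posterior mean of R(t) = exp(-λt) has a closed form. Function and parameter names below are assumptions, and the thesis's asymptotic expansions are not reproduced; the mode-based estimate shown is a plug-in at the posterior mode of λ, one common reading of a posterior-mode estimator.

```python
import numpy as np

def bayes_reliability(times, observed, t, a=1.0, b=1.0):
    """Bayes estimates of R(t) = exp(-lambda * t) for exponential lifetimes
    under random censoring (conjugate-prior sketch).

    times    : observed durations (failure or censoring times)
    observed : 1 if the unit failed, 0 if it was censored
    a, b     : Gamma(a, b) prior on the failure rate (rate parametrisation)
    """
    d = np.sum(observed)               # number of observed failures
    tau = np.sum(times)                # total time on test
    a_post, b_post = a + d, b + tau    # conjugate Gamma posterior for lambda
    # Posterior mean of R(t): E[exp(-lambda t)] under Gamma(a_post, b_post),
    # which is the Gamma moment generating function evaluated at -t.
    mean_R = (b_post / (b_post + t)) ** a_post
    # Plug-in estimate at the posterior mode of lambda (requires a_post > 1).
    mode_R = np.exp(-t * (a_post - 1) / b_post) if a_post > 1 else np.nan
    return mean_R, mode_R
```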

Relevance:

100.00%

Publisher:

Abstract:

We report on a detailed study of the application and effectiveness of program analysis based on abstract interpretation in automatic program parallelization. We study the case of parallelizing logic programs using the notion of strict independence. We first propose, and prove correct, a methodology for applying the information inferred by abstract interpretation to the parallelization task, using a parametric domain. The methodology is generic in that it allows the use of different analysis domains. A number of well-known approximation domains are then studied, and their transformation into the parametric domain is defined. The transformation directly illustrates the relevance and applicability of each abstract domain for the application. Both local and global analyzers are then built using these domains and embedded in a complete parallelizing compiler. The performance of the domains in this context is then assessed through a number of experiments. A comparatively wide range of aspects is studied, from the time and memory needed by the analyzers to the actual benefits obtained from the inferred information. These benefits are evaluated both in terms of the characteristics of the parallelized code and of the actual speedups obtained from it. The results show that data-flow analysis plays an important role in achieving efficient parallelizations, and that the cost of such analysis can be reasonable even for quite sophisticated abstract domains. Furthermore, the results offer significant insight into the characteristics of the domains, the demands of the application, and the trade-offs involved.
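Strict independence itself is easy to state in code: two goals may run in parallel when every variable they share is known to be ground, so neither can bind a variable the other reads. The sketch below is a hypothetical helper that uses exact variable sets; real systems replace these with the sharing/freeness abstractions inferred by abstract interpretation.

```python
def strictly_independent(goal1_vars, goal2_vars, ground_vars):
    """Strict-independence test used to decide goal parallelisation
    (simplified sketch over exact variable sets).

    Two goals are strictly independent if every variable they share
    is known to be ground at the time both goals are reached.
    """
    shared = set(goal1_vars) & set(goal2_vars)
    return shared <= set(ground_vars)

# Usage: p(X, Y) and q(Y, Z) are independent only if Y is ground.
print(strictly_independent({"X", "Y"}, {"Y", "Z"}, ground_vars={"Y"}))  # True
print(strictly_independent({"X", "Y"}, {"Y", "Z"}, ground_vars=set()))  # False
```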

Relevance:

100.00%

Publisher:

Abstract:

The power transformer is an important piece of equipment in the electric power system, responsible for transferring electric energy or power from one circuit to another and for transforming the voltages and currents of an electric circuit. Power transformers have wide application and can be used in generation, transmission and distribution substations. In this context, recent changes in the Brazilian electric system, driven mainly by considerable load growth and by technological development, have led to the manufacture of transformers built with advanced technology, increasing the reliability of this equipment while reducing its overall cost. Traditionally, transformers are manufactured with an insulation system that combines solid insulating materials and cellulose, both immersed in insulating mineral oil, a construction that limits the continuous operating temperature. However, by replacing this cellulose-paper and mineral-oil insulation system with a semi-hybrid one, namely NOMEX paper combined with insulating vegetable oil, the load capacity of the transformer can be increased, since the insulation withstands higher temperatures. Long-term ageing of the insulation system can thus be significantly reduced. This technique of raising the transformer's thermal limits can essentially eliminate the thermal restrictions associated with cellulose insulation, providing an economical way to optimise the use of power transformers and increase their operational reliability. In addition, fibre-optic sensors, replacing thermal-image sensors for monitoring the transformer's internal temperatures, are an important option for modelling the thermal behaviour of the transformer.

Relevance:

100.00%

Publisher:

Abstract:

The aim of this study is to analyse the impact that the announcement of obtaining a quality certificate (ISO 9000) has on the market value of the firm and on the volatility of its share price. The sample includes all companies that, having obtained a quality certificate, were listed on the Spanish secondary stock market between 1993 and 1999. To measure the impact of certification on performance, excess returns were analysed, while the change in volatility was assessed with four tests: two parametric, one non-parametric, and a proposed semi-parametric test. The results indicate that the capital market reacts positively to the certification, which is also accompanied by an increase in the volatility of share prices.
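The event-study machinery referred to above follows a standard market-model recipe. The sketch below is a generic implementation, not the paper's exact specification: it computes abnormal returns and uses a simple pre/post variance ratio in place of the paper's battery of four volatility tests; the function name and window lengths are assumptions.

```python
import numpy as np

def event_study(stock_ret, market_ret, event_idx, est_win=120, ev_win=10):
    """Market-model event study around a certification announcement.

    Abnormal return AR_t = R_t - (alpha + beta * Rm_t), with alpha and beta
    fitted on a pre-event estimation window. The volatility shift is
    summarised by the ratio of post-event to pre-event AR variance.
    """
    pre = slice(event_idx - est_win, event_idx)
    beta, alpha = np.polyfit(market_ret[pre], stock_ret[pre], 1)
    ar = stock_ret - (alpha + beta * market_ret)
    car = ar[event_idx:event_idx + ev_win].sum()   # cumulative abnormal return
    f_ratio = ar[event_idx:event_idx + ev_win].var(ddof=1) / ar[pre].var(ddof=1)
    return car, f_ratio
```

A positive cumulative abnormal return together with a variance ratio above 1 would correspond to the pattern the study reports: a positive price reaction accompanied by higher volatility.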

Relevance:

100.00%

Publisher:

Abstract:

The aim of this study is to analyse the impact that the announcement of obtaining a quality certificate (ISO 9000) has on the market value of the firm and on the volatility of its share price. In addition, several factors that determine the impact of certification on returns are examined. The sample includes all companies that, having been listed on the Spanish continuous market between 1993 and 1999, obtained a quality certificate. To measure the impact of certification on performance, excess returns were analysed, while the change in volatility was assessed with four tests: two parametric, one non-parametric, and a proposed semi-parametric test. The results indicate that the market reacts positively to the certification, which is also accompanied by an increase in the volatility of share prices.

Relevance:

100.00%

Publisher:

Abstract:

Background: The controversy surrounding the non-uniqueness of predictive gene lists (PGLs), small selected subsets of genes drawn from the very large pools of candidates available in DNA microarray experiments, is now widely acknowledged [1]. Many of these studies have focused on constructing discriminative semi-parametric models and, as such, are also subject to the spurious correlations that arise from sparse model selection in high-dimensional spaces. In this work we outline a different approach, based on an unsupervised, patient-specific, nonlinear topographic projection of predictive gene lists. Methods: We construct nonlinear topographic projection maps based on inter-patient gene-list relative dissimilarities. The Neuroscale, Stochastic Neighbor Embedding (SNE) and Locally Linear Embedding (LLE) techniques are used to construct two-dimensional projective visualisation plots of the 70-dimensional PGLs of each patient. Classifiers are also constructed on the resulting projections to identify each patient's prognosis indicator and to investigate whether, a posteriori, the two prognosis groups are separable on the evidence of the gene lists. A literature-proposed predictive gene list for breast cancer is benchmarked against a separate gene list using the above methods. Generalisation ability is investigated by using the mapping capability of Neuroscale to visualise the follow-up study based on the projections derived from the original dataset. Results: The results indicate that small patient-specific PGLs carry insufficient prognostic dissimilarity to permit a distinction between the two prognosis groups. Uncertainty and diversity across multiple gene expressions prevent unambiguous, or even confident, patient grouping. Comparative projections across different PGLs yield similar results. Conclusion: Selecting small subsets from very high-dimensional, interrelated gene expression profiles induces random correlations with an arbitrary outcome, so any resulting classification carries substantial uncertainty; this uncertainty precludes the construction of discriminative classifiers. However, a patient's gene expression profile could possibly be used in treatment planning, based on knowledge of other patients' responses. We conclude that many of the patients involved in such medical studies are intrinsically unclassifiable on the basis of the PGL evidence provided. This additional category of 'unclassifiable' should be accommodated within medical decision support systems if serious errors and unnecessary adjuvant therapy are to be avoided.
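Two of the three projection techniques named above have off-the-shelf implementations in scikit-learn. The sketch below uses mock data, with t-SNE standing in for SNE (it is a variant, not the identical method) and Neuroscale omitted, since scikit-learn does not provide it; sizes and parameters are illustrative assumptions.

```python
import numpy as np
from sklearn.manifold import TSNE, LocallyLinearEmbedding

# Mock expression matrix: 97 patients x 70-gene predictive gene lists.
rng = np.random.default_rng(0)
pgl = rng.normal(size=(97, 70))

# Two-dimensional topographic projections of the per-patient PGLs.
tsne_2d = TSNE(n_components=2, perplexity=20, random_state=0).fit_transform(pgl)
lle_2d = LocallyLinearEmbedding(n_components=2, n_neighbors=12).fit_transform(pgl)
# A classifier trained on these 2D coordinates can then probe whether the
# good/poor prognosis groups separate, which the study found they do not.
```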

Relevance:

100.00%

Publisher:

Abstract:

2000 Mathematics Subject Classification: 62N01, 62N05, 62P10, 92D10, 92D30.

Relevance:

100.00%

Publisher:

Abstract:

For climate risk management, cumulative distribution functions (CDFs) are an important source of information. They are ideally suited to compare probabilistic forecasts of primary (e.g. rainfall) or secondary data (e.g. crop yields). Summarised as CDFs, such forecasts allow an easy quantitative assessment of possible, alternative actions. Although the degree of uncertainty associated with CDF estimation could influence decisions, such information is rarely provided. Hence, we propose Cox-type regression models (CRMs) as a statistical framework for making inferences on CDFs in climate science. CRMs were designed for modelling probability distributions rather than just mean or median values. This makes the approach appealing for risk assessments where probabilities of extremes are often more informative than central tendency measures. CRMs are semi-parametric approaches originally designed for modelling risks arising from time-to-event data. Here we extend this original concept beyond time-dependent measures to other variables of interest. We also provide tools for estimating CDFs and surrounding uncertainty envelopes from empirical data. These statistical techniques intrinsically account for non-stationarities in time series that might be the result of climate change. This feature makes CRMs attractive candidates to investigate the feasibility of developing rigorous global circulation model (GCM)-CRM interfaces for provision of user-relevant forecasts. To demonstrate the applicability of CRMs, we present two examples for El Niño/Southern Oscillation (ENSO)-based forecasts: the onset date of the wet season (Cairns, Australia) and total wet season rainfall (Quixeramobim, Brazil). This study emphasises the methodological aspects of CRMs rather than discussing merits or limitations of the ENSO-based predictors.
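The Cox-type workflow for forecast CDFs can be sketched with the lifelines library: fit a proportional-hazards model with an ENSO covariate and convert the predicted survival curves into CDFs. All data, column names and parameters below are synthetic illustrations, not the paper's; the paper's uncertainty envelopes and GCM interface are not reproduced.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

# Synthetic data: "time to event" is the day of wet-season onset, which we
# make depend on a stand-in ENSO index (soi); all events are observed.
rng = np.random.default_rng(42)
n = 300
soi = rng.normal(0, 1, n)
onset = rng.gamma(9, 10 * np.exp(-0.2 * soi))
df = pd.DataFrame({"onset_day": onset, "soi": soi, "observed": 1})

# Fit the semi-parametric Cox proportional-hazards model.
cph = CoxPHFitter()
cph.fit(df, duration_col="onset_day", event_col="observed")

# Survival curve S(t | soi); the forecast CDF is 1 - S(t | soi),
# giving P(onset by day t) under each ENSO scenario.
scenarios = pd.DataFrame({"soi": [-1.0, 0.0, 1.0]})
cdf = 1.0 - cph.predict_survival_function(scenarios)
```

Comparing the three CDF columns shows how a shift in the ENSO covariate moves the whole onset-date distribution, which is precisely the decision-relevant output the abstract advocates.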