972 results for r-functions
Abstract:
Interdependence is the main feature of dyadic relationships, and in recent years various statistical procedures have been proposed for quantifying and testing this social attribute in different dyadic designs. The purpose of this paper is to implement several functions for these statistical tests in an R package, known as nonindependence, for use by applied social researchers. A Graphical User Interface (GUI) is also developed to facilitate the use of the functions included in this package. Examples drawn from psychological research and simulated data illustrate how the software works.
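A common index of the dyadic nonindependence this package tests is the pairwise intraclass correlation computed on double-entered dyads. The sketch below illustrates that idea in Python; the function name is illustrative and is not taken from the nonindependence package itself.

```python
# Minimal sketch, assuming exchangeable dyads: each dyad (a, b) is
# double-entered as (a, b) and (b, a), and a Pearson correlation is
# computed over the doubled data. Illustrative only, not the package API.
from statistics import mean

def dyadic_icc(dyads):
    """Pairwise intraclass correlation via the double-entry method."""
    xs = [a for a, b in dyads] + [b for a, b in dyads]
    ys = [b for a, b in dyads] + [a for a, b in dyads]
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Perfectly matched partners give an ICC of 1.0:
print(dyadic_icc([(1, 1), (2, 2), (3, 3)]))  # → 1.0
```

A value near zero would indicate that treating the two dyad members as independent observations is defensible; values near ±1 indicate strong nonindependence.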
Abstract:
MSC 2010: 30C10, 32A30, 30G35
Abstract:
Background: Gene expression analysis has emerged as a major biological research area, with real-time quantitative reverse transcription PCR (RT-QPCR) being one of the most accurate and widely used techniques for expression profiling of selected genes. In order to obtain results that are comparable across assays, a stable normalization strategy is required. In general, the normalization of PCR measurements between different samples uses one to several control genes (e.g. housekeeping genes), from which a baseline reference level is constructed. Thus, the choice of the control genes is of utmost importance, yet there is not a generally accepted standard technique for screening a large number of candidates and identifying the best ones. Results: We propose a novel approach for scoring and ranking candidate genes for their suitability as control genes. Our approach relies on publicly available microarray data and allows the combination of multiple data sets originating from different platforms and/or representing different pathologies. The use of microarray data allows the screening of tens of thousands of genes, producing very comprehensive lists of candidates. We also provide two lists of candidate control genes: one which is breast cancer-specific and one with more general applicability. Two genes from the breast cancer list which had not been previously used as control genes are identified and validated by RT-QPCR. Open source R functions are available at http://www.isrec.isb-sib.ch/~vpopovic/research/ Conclusion: We proposed a new method for identifying candidate control genes for RT-QPCR which was able to rank thousands of genes according to some predefined suitability criteria and we applied it to the case of breast cancer. We also empirically showed that translating the results from microarray to PCR platform was achievable.
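The core idea of ranking candidates by expression stability across samples can be sketched simply. The example below is a hedged illustration of the general principle, not the authors' actual scoring function: it ranks genes by coefficient of variation, where a lower value means more stable expression and hence a better control-gene candidate. Gene names and values are synthetic.

```python
# Minimal sketch: rank candidate control genes by expression stability
# across samples, using the coefficient of variation (stdev / mean).
# Illustrative only; the paper's actual scoring criteria differ.
from statistics import mean, stdev

def rank_by_stability(expr):
    """expr maps gene -> list of expression values across samples.
    Returns gene names sorted from most stable to least stable."""
    cv = {g: stdev(v) / mean(v) for g, v in expr.items()}
    return sorted(cv, key=cv.get)

expr = {
    "GAPDH": [10.1, 10.0, 9.9, 10.2],   # nearly constant across samples
    "ESR1":  [2.0, 8.0, 1.5, 9.0],      # highly variable
    "ACTB":  [12.0, 11.8, 12.3, 12.1],
}
print(rank_by_stability(expr))  # → ['GAPDH', 'ACTB', 'ESR1']
```

With tens of thousands of microarray probes, the same ranking pass produces the comprehensive candidate lists the abstract describes.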
Abstract:
With the growth of new technologies, using online tools has become part of everyday life. This has a particular impact on researchers, as the data obtained from various experiments needs to be analyzed, and knowledge of programming has become mandatory even for pure biologists. Hence, VTT developed a new tool, R Executables (REX), a web application designed to provide a graphical interface for biological data functions such as image analysis, gene expression data analysis, plotting, and disease and control studies, which employs R functions to produce results. REX provides an interactive application that lets biologists enter values directly and run the required analysis with a single click. The program processes the given data in the background and returns results rapidly. Due to the growth of data and the load on the server, the interface developed problems concerning time consumption, a poor GUI, data storage issues, security, a minimally interactive user experience, and crashes with large amounts of data. This thesis describes the methods by which these problems were resolved and REX was made a better application for the future. The old REX was developed using Python Django; the new version is built with Vaadin, a Java framework for developing web applications whose programming model is extremely similar to Java and ships rich new components. Vaadin provides better security, better speed, and a good, interactive interface. In this thesis, a subset of REX functionalities, including IST bulk plotting and image segmentation, was selected and implemented using Vaadin. I wrote 662 lines of code, using Vaadin as the front-end handler while the R language was used for back-end data retrieval, computation and plotting. The application is optimized to allow further functionalities to be migrated with ease from the old REX.
Future development will focus on including high-throughput screening functions along with gene expression database handling.
Abstract:
The effects of moisture, cation concentration, density, temperature and grain size on the electrical resistivity of soils are examined using laboratory-prepared soils. An inexpensive method for preparing soils of different compositions was developed by mixing various size fractions in the laboratory. Moisture and cation concentration are related to soil resistivity by power functions, whereas soil resistivity and temperature, density, % gravel, sand, silt, and clay are related by exponential functions. A total of 1066 cases (8528 data) from all the experiments were used in a step-wise multiple linear regression to determine the effect of each variable on soil resistivity. Six of the eight variables studied account for 92.5% of the total variance in soil resistivity, with a correlation coefficient of 0.96. The other two variables (silt and gravel) did not increase the variance. Moisture content was found to be the most important variable affecting soil resistivity, followed by % clay. These two variables account for 90.81% of the total variance in soil resistivity, with a correlation coefficient of 0.95. Based on these results, an equation to predict soil resistivity using moisture and % clay is developed. To test the predicted equation, resistivity measurements were made on natural soils both in situ and in the laboratory. The data show that field and laboratory measurements are comparable. The predicted regression line closely coincides with resistivity data from area A and area B soils (clayey and silty-clayey sands). Resistivity data and the predicted regression line in the case of clayey soils (clays > 40%) do not coincide, especially at less than 15% moisture. The regression equation overestimates the resistivity of soils from area C and underestimates it for area D soils. Laboratory-prepared high-clay soils give similar trends.
The deviations are probably caused by the heterogeneous distribution of moisture and differences in the type of clays present in these soils.
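The power-function relation between moisture and resistivity reported above can be fitted by ordinary least squares on log-transformed data. The sketch below shows that standard technique with synthetic values; the coefficients used (500, -1.5) are illustrative assumptions, not the thesis's actual regression results.

```python
# Minimal sketch: fit rho = a * m**b by linear least squares in log-log
# space, since log(rho) = log(a) + b*log(m). Synthetic data, not the
# thesis's measurements.
import math

def fit_power_law(m, rho):
    """Least-squares estimates of (a, b) in rho = a * m**b."""
    lx = [math.log(v) for v in m]
    ly = [math.log(v) for v in rho]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    b = sum((x - mx) * (y - my) for x, y in zip(lx, ly)) \
        / sum((x - mx) ** 2 for x in lx)
    a = math.exp(my - b * mx)
    return a, b

# Data generated from rho = 500 * m**-1.5 is recovered exactly:
m = [5.0, 10.0, 20.0, 40.0]
rho = [500 * v ** -1.5 for v in m]
a, b = fit_power_law(m, rho)
print(round(a), round(b, 2))  # → 500 -1.5
```

A negative exponent matches the physical expectation that resistivity falls as moisture content rises.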
Abstract:
The main aim of this Ph.D. dissertation is the study of clustering dependent data by means of copula functions, with particular emphasis on microarray data. Copula functions are a popular multivariate modeling tool in every field where multivariate dependence is of great interest, yet their use in clustering has not previously been investigated. The first part of this work reviews the literature on clustering methods, copula functions and microarray experiments. Attention focuses on the K-means (Hartigan, 1975; Hartigan and Wong, 1979), hierarchical (Everitt, 1974) and model-based (Fraley and Raftery, 1998, 1999, 2000, 2007) clustering techniques, because their performance is compared. Then, the probabilistic interpretation of Sklar's theorem (Sklar, 1959), estimation methods for copulas such as Inference for Margins (Joe and Xu, 1996), and the Archimedean and Elliptical copula families are presented. Finally, applications of clustering methods and copulas to genetic and microarray experiments are highlighted. The second part contains the original contribution. A simulation study is performed to evaluate the performance of K-means and hierarchical bottom-up clustering in identifying clusters according to the dependence structure of the data generating process. Different simulations are performed by varying different conditions (e.g., the kind of margins (distinct, overlapping and nested) and the value of the dependence parameter), and the results are evaluated by means of different measures of performance. In light of the simulation results and of the limits of the two investigated clustering methods, a new clustering algorithm based on copula functions ('CoClust' in brief) is proposed. The basic idea, the iterative procedure of the CoClust and a description of the written R functions with their output are given.
The CoClust algorithm is tested on simulated data (by varying the number of clusters, the copula models, the dependence parameter value and the degree of overlap of margins) and is compared with model-based clustering using different measures of performance, such as the percentage of well-identified numbers of clusters and the percentage of non-rejection of H0. It is shown that the CoClust algorithm overcomes all observed limits of the other investigated clustering techniques and is able to identify clusters according to the dependence structure of the data, independently of the degree of overlap of margins and the strength of the dependence. The CoClust uses a criterion based on the maximized log-likelihood function of the copula and can virtually account for any possible dependence relationship between observations. Many distinctive characteristics of the CoClust are shown, e.g. its capability of identifying the true number of clusters and the fact that it does not require a starting classification. Finally, the CoClust algorithm is applied to the real microarray data of Hedenfalk et al. (2001), both to the gene expressions observed in three different cancer samples and to the columns (tumor samples) of the whole data matrix.
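One building block of copula-based methods like those above is estimating a copula's dependence parameter. The sketch below is not the CoClust algorithm itself; it illustrates the standard inversion of Kendall's tau for a Clayton (Archimedean) copula, where theta = 2*tau / (1 - tau). The sample data is synthetic.

```python
# Minimal sketch: estimate the Clayton copula parameter by inverting
# sample Kendall's tau. Not the CoClust procedure, which maximizes the
# copula log-likelihood; this is the simpler moment-style estimator.
def kendall_tau(x, y):
    """Sample Kendall's tau: (concordant - discordant) / total pairs."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            s += (x[i] - x[j]) * (y[i] - y[j]) > 0   # concordant pair
            s -= (x[i] - x[j]) * (y[i] - y[j]) < 0   # discordant pair
    return 2 * s / (n * (n - 1))

def clayton_theta(x, y):
    """Closed-form inversion theta = 2*tau / (1 - tau)."""
    tau = kendall_tau(x, y)
    return 2 * tau / (1 - tau)

x = [1, 2, 3, 4]
y = [1, 3, 2, 4]          # one discordant pair out of six
print(round(clayton_theta(x, y), 6))  # → 4.0
```

Larger theta means stronger (lower-tail) dependence, which is the kind of structure the CoClust groups observations by.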
Abstract:
Second-order functions are increasingly used in the analysis of ecological processes. In this paper we present two recently developed second-order functions that allow the spatio-temporal interaction between two species or functional types of individuals to be analyzed. These functions were developed for the study of interactions between species in forest stands based on the current diameter distribution of the trees. The first is the bivariate function for marked point processes, Krsmm, which analyzes the spatial correlation of a variable between individuals belonging to two species as a function of distance. The second is the replacement function, which analyzes the association between individuals belonging to two species as a function of the difference between their diameters or another variable associated with those individuals. To show the behavior of both functions in the analysis of forest systems in which different ecological processes operate, three case studies are presented: a mixed stand of Pinus pinea L. and Pinus pinaster Ait. on the Northern Plateau, a cloud forest of the Tropical Andean Region, and the ecotone between stands of Quercus pyrenaica Willd. and Pinus sylvestris L. in the Sistema Central. In these, both the Krsmm function and the r function are used to analyze forest dynamics from experimental plots with all trees mapped and from inventory plots.
Abstract:
In the process of engineering design of structural shapes, flat-plate analysis results can be generalized to predict the behavior of complete structural shapes. Accordingly, the purpose of this project is to analyze a thin flat plate under conductive heat transfer and to simulate the temperature distribution, thermal stresses, total displacements, and buckling deformations. The current approach in these cases has been the Finite Element Method (FEM), which is based on the construction of a conforming mesh. In contrast, this project uses the mesh-free Scan Solve Method, which eliminates the meshing limitation by using a non-conforming mesh. I implemented this modeling process by developing numerical algorithms and software tools to model thermally induced buckling. In addition, a convergence analysis was performed, and the results were compared with FEM. In conclusion, the results demonstrate that the method gives solutions similar in quality to FEM while requiring less computation time.
Abstract:
Starting from an analysis of the problems encountered in the conceptual design phase, the different three-dimensional modeling techniques are presented, with particular attention to the subdivision method and the algorithms that govern it (Chaikin, Doo-Sabin). Some application examples of free-form and skeleton modeling are then proposed, with a subsequent comparison, on the same models, of the sequences and operations required with traditional parametric modeling techniques. An example of the use of the IronCAD software, the first software to combine parametric and direct modeling, is given. The limitations of parametric and history-free modeling in the conceptual phase of a project are described, leading to a definition of the characteristics of hybrid modeling, a new approach to modeling. The prototype under development, which attempts to apply the concepts of hybrid modeling concretely and is intended as the starting point for a new generation of CAD software, is briefly presented. Finally, the possibility of obtaining real-time simulations on models undergoing topological modifications is presented. Real-time simulation is made possible by the parametric reformulation of the linear elastic problem, which is then solved by the joint application of R-Functions and the PGD method. Examples of real-time simulation follow.
Abstract:
In this paper we claim that capital is as important in the production of ideas as in the production of final goods. Hence, we introduce capital in the production of knowledge and discuss the associated problems arising from the public-good nature of knowledge. We show that although population growth can affect economic growth, it is not necessary for growth to arise. We derive both the social-planner and the decentralized-economy growth rates and show the optimal subsidy that decentralizes the former. We also show numerically that the effects of population growth on the market growth rate, the optimal growth rate and the optimal subsidy are small. In addition, we find that physical capital is more important for the production of knowledge than for the production of goods.