983 results for Software Packages
Abstract:
Biplots are graphical displays of data matrices based on the decomposition of a matrix as the product of two matrices. Elements of these two matrices are used as coordinates for the rows and columns of the data matrix, with an interpretation of the joint presentation that relies on the properties of the scalar product. Because the decomposition is not unique, there are several alternative ways to scale the row and column points of the biplot, which can cause confusion amongst users, especially when software packages are not united in their approach to this issue. We propose a new scaling of the solution, called the standard biplot, which applies equally well to a wide variety of analyses such as correspondence analysis, principal component analysis, log-ratio analysis and the graphical results of a discriminant analysis/MANOVA; in fact, to any method based on the singular-value decomposition. The standard biplot also handles data matrices with widely different levels of inherent variance. Two concepts taken from correspondence analysis are important to this idea: the weighting of row and column points, and the contributions made by the points to the solution. In the standard biplot one set of points, usually the rows of the data matrix, optimally represents the positions of the cases or sample units, which are weighted and usually standardized in some way unless the matrix contains values that are comparable in their raw form. The other set of points, usually the columns, is represented in accordance with their contributions to the low-dimensional solution. As for any biplot, the projections of the row points onto vectors defined by the column points approximate the centred and (optionally) standardized data. The method is illustrated with several examples to demonstrate how the standard biplot copes in different situations to give a joint map which needs only one common scale on the principal axes, thus avoiding the problem of enlarging or contracting the scale of one set of points to make the biplot readable. The proposal also solves the problem in correspondence analysis of low-frequency categories that are located on the periphery of the map, giving the false impression that they are important.
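The decomposition the abstract refers to can be made concrete with a small sketch. Below is a minimal principal-component biplot built from the singular-value decomposition in Python, with illustrative data; it shows the generic SVD construction, not the standard-biplot scaling the paper proposes.

```python
import numpy as np

# Toy data matrix: 6 cases (rows) by 4 variables (columns)
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))

# Centre the columns (standardizing would also divide by column std)
Xc = X - X.mean(axis=0)

# SVD: Xc = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# One common scaling: rows in principal coordinates,
# columns in standard coordinates
F = U * s        # row (case) coordinates
G = Vt.T         # column (variable) coordinates

# Scalar products of the first two coordinate pairs approximate Xc,
# which is the biplot property the abstract describes
approx = F[:, :2] @ G[:, :2].T
print(np.abs(Xc - approx).max())   # rank-2 approximation error
```

Distributing the singular values differently between F and G yields the alternative scalings that, as the abstract notes, packages do not handle uniformly.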
Abstract:
The Microbe browser is a web server providing comparative microbial genomics data. It offers comprehensive, integrated data from GenBank, RefSeq, UniProt, InterPro, Gene Ontology and the Orthologs Matrix Project (OMA) database, displayed along with gene predictions from five software packages. The Microbe browser is updated daily from the source databases and includes all completely sequenced bacterial and archaeal genomes. The data are displayed in an easy-to-use, interactive website based on Ensembl software. The Microbe browser is available at http://microbe.vital-it.ch/. Programmatic access is available through the OMA application programming interface (API) at http://microbe.vital-it.ch/api.
Abstract:
Although correspondence analysis (CA) is now widely available in statistical software packages and applied in a variety of contexts, notably the social and environmental sciences, there are still some misconceptions about this method as well as unresolved issues which remain controversial to this day. In this paper we hope to settle these matters, namely (i) the way CA measures variance in a two-way table and how to compare variances between tables of different sizes, (ii) the influence, or rather lack of influence, of outliers in the usual CA maps, (iii) the scaling issue and the biplot interpretation of maps, (iv) whether or not to rotate a solution, and (v) statistical significance of results.
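As a concrete reference for point (i), the variance of a two-way table in CA is its total inertia. A minimal sketch with illustrative counts follows; this is the textbook definition, not code from the paper.

```python
import numpy as np

# Toy two-way contingency table
N = np.array([[30, 10, 5],
              [10, 40, 15],
              [ 5, 15, 20]], dtype=float)

P = N / N.sum()             # correspondence matrix
r = P.sum(axis=1)           # row masses
c = P.sum(axis=0)           # column masses
E = np.outer(r, c)          # expected proportions under independence

# Total inertia = chi-square statistic divided by the grand total
inertia = ((P - E) ** 2 / E).sum()
print(inertia)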
Abstract:
BACKGROUND: Finding genes that are differentially expressed between conditions is an integral part of understanding the molecular basis of phenotypic variation. In the past decades, DNA microarrays have been used extensively to quantify the abundance of mRNA corresponding to different genes, and more recently high-throughput sequencing of cDNA (RNA-seq) has emerged as a powerful competitor. As the cost of sequencing decreases, it is conceivable that the use of RNA-seq for differential expression analysis will increase rapidly. To exploit the possibilities and address the challenges posed by this relatively new type of data, a number of software packages have been developed especially for differential expression analysis of RNA-seq data. RESULTS: We conducted an extensive comparison of eleven methods for differential expression analysis of RNA-seq data. All methods are freely available within the R framework and take as input a matrix of counts, i.e. the number of reads mapping to each genomic feature of interest in each of a number of samples. We evaluated the methods on both simulated data and real RNA-seq data. CONCLUSIONS: Very small sample sizes, which are still common in RNA-seq experiments, pose problems for all evaluated methods, and any results obtained under such conditions should be interpreted with caution. For larger sample sizes, the methods combining a variance-stabilizing transformation with the 'limma' method for differential expression analysis perform well under many different conditions, as does the nonparametric SAMseq method.
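To make the input format and the variance-stabilizing idea concrete, here is a hedged sketch in Python rather than R: it builds a count matrix and applies a log counts-per-million transform of the general shape used ahead of 'limma'. The data and offsets are illustrative assumptions; a real analysis would use the packages under comparison.

```python
import numpy as np

# Count matrix: rows = genomic features (genes), columns = samples
rng = np.random.default_rng(1)
counts = rng.poisson(lam=50, size=(1000, 6))

lib_sizes = counts.sum(axis=0)   # total reads per sample

# Simple log counts-per-million with small offsets to avoid log(0);
# limma-style pipelines apply a transform of this general shape
log_cpm = np.log2((counts + 0.5) / (lib_sizes + 1) * 1e6)

print(log_cpm.shape, log_cpm.mean())
```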
Abstract:
Introduction. This paper studies the situation of research on Catalan literature between 1976 and 2003 by carrying out a bibliometric and social network analysis of PhD theses defended in Spain. It has a dual aim: to present results of interest for the discipline and to demonstrate the methodological efficacy of scientometric tools in the humanities, a field in which they are often neglected due to the difficulty of gathering data. Method. The analysis was performed on 151 records obtained from the TESEO database of PhD theses. The quantitative analysis included the use of the UCINET and Pajek software packages. Authority control was performed on the records. Analysis. Descriptive statistics were used to describe the sample and the distribution of responses to each question. Sex differences on key questions were analysed using the chi-squared test. Results. The value of the figures obtained is demonstrated. The information obtained on the topics and periods studied in the theses, and on the actors involved (doctoral students, thesis supervisors and members of defence committees), provides important insights into the mechanisms of humanities disciplines. The main research tendencies in Catalan literature are identified. The composition of the thesis defence committees is observed to follow Lotka's Law. Conclusions. Bibliometric analysis and social network analysis may be especially useful in the humanities and in other fields that lack scientometric data in comparison with the experimental sciences.
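For the chi-squared analysis of sex differences, a minimal sketch with a hypothetical contingency table (the actual tables are in the paper, not reproduced here):

```python
from scipy.stats import chi2_contingency

# Hypothetical 2x2 table: sex (rows) vs. response to a key question (columns)
table = [[34, 18],
         [22, 31]]

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2={chi2:.2f}, p={p:.4f}, dof={dof}")
```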
Abstract:
BACKGROUND: Today, recognition and classification of sequence motifs and protein folds is a mature field, thanks to the availability of numerous comprehensive, easy-to-use software packages and web-based services. Recognition of structural motifs, by comparison, is less well developed and much less frequently used, possibly due to a lack of easily accessible, easy-to-use software. RESULTS: In this paper, we describe an extension of DeepView/Swiss-PdbViewer through which structural motifs may be defined and searched for in large protein structure databases, and we show that common structural motifs involved in stabilizing protein folds are present in evolutionarily and structurally unrelated proteins, including in deeply buried locations that are not obviously related to protein function. CONCLUSIONS: The possibility of defining custom motifs and searching for their occurrence in other proteins permits the identification of recurrent arrangements of residues that could have structural implications. The possibility of doing so without having to maintain a complex software/hardware installation on site brings this technology to experts and non-experts alike.
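The abstract does not detail DeepView's matching algorithm, but one generic way to quantify whether two arrangements of residues coincide is superposition RMSD via the Kabsch algorithm. A self-contained sketch with made-up coordinates:

```python
import numpy as np

def kabsch_rmsd(A, B):
    """RMSD of point sets A, B (n x 3 arrays) after optimal rotation."""
    A = A - A.mean(axis=0)               # centre both sets
    B = B - B.mean(axis=0)
    U, _, Vt = np.linalg.svd(A.T @ B)
    d = np.sign(np.linalg.det(U @ Vt))   # guard against improper rotation
    R = U @ np.diag([1.0, 1.0, d]) @ Vt  # optimal rotation matrix
    return np.sqrt(((A @ R - B) ** 2).sum() / len(A))

# Two hypothetical 4-residue motifs (e.g., C-alpha coordinates)
motif1 = np.random.default_rng(2).normal(size=(4, 3))
rot = np.array([[0, 1, 0], [-1, 0, 0], [0, 0, 1.0]])
motif2 = motif1 @ rot + 5.0              # rotated and translated copy

print(kabsch_rmsd(motif1, motif2))       # ~0 for a pure rigid motion
```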
Abstract:
PURPOSE: Mutations in IDH3B, an enzyme participating in the Krebs cycle, have recently been found to cause autosomal recessive retinitis pigmentosa (arRP). The MDH1 gene maps within the RP28 arRP linkage interval and encodes cytoplasmic malate dehydrogenase, an enzyme functionally related to IDH3B. As a proof of concept for candidate gene screening to be routinely performed by ultra-high-throughput sequencing (UHTS), we analyzed MDH1 in a patient from each of the two families described so far to show linkage between arRP and RP28. METHODS: With genomic long-range PCR, we amplified all introns and exons of the MDH1 gene (23.4 kb). PCR products were then sequenced by short-read UHTS with no further processing. Computer-based mapping of the reads and mutation detection were performed by three independent software packages. RESULTS: Despite the intrinsic complexity of human genome sequences, reads were easily mapped and analyzed, and all algorithms used provided the same results. The two patients were homozygous for all DNA variants identified in the region, which confirms previous linkage and homozygosity mapping results, but had different haplotypes, indicating genetic or allelic heterogeneity. None of the DNA changes detected could be associated with the disease. CONCLUSIONS: The MDH1 gene is not the cause of RP28-linked arRP. Our experimental strategy shows that long-range genomic PCR followed by UHTS provides an excellent system for the thorough screening of candidate genes for hereditary retinal degeneration.
Abstract:
Three pavement design software packages were compared with regard to how they determine design input parameters and how those parameters influence the pavement thickness. StreetPave designs the concrete pavement thickness based on the PCA method and the equivalent asphalt pavement thickness. The WinPAS software designs both concrete and asphalt pavements following the AASHTO 1993 design method. The APAI software designs asphalt pavements based on pre-mechanistic/empirical AASHTO methodology. First, the following four critical design input parameters were identified: traffic, subgrade strength, reliability, and design life. A sensitivity analysis of these four parameters was performed with the three pavement design software packages to identify which inputs require the most attention during pavement design. Based on the current pavement design procedures and the sensitivity analysis results, a prototype pavement design and sensitivity analysis (PD&SA) software package was developed to retrieve the pavement thickness design value for a given condition and to allow a user to perform a pavement design sensitivity analysis. The prototype PD&SA software is a computer program that stores pavement design results in a database, designed so that the user can enter design data from a variety of design programs and query design results for given conditions, thereby demonstrating the concept of retrieving pavement design results from the database for a design sensitivity analysis. This final report does not include the prototype software, which will be validated and tested during the next phase.
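A minimal sketch of the store-and-query concept behind the prototype PD&SA database; the schema, column names and thickness values are hypothetical illustrations, not the actual software's design or outputs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE design_results (
    program TEXT, traffic_esal REAL, subgrade_cbr REAL,
    reliability REAL, design_life_yr INTEGER, thickness_in REAL)""")

# Store results produced by different design programs (values invented)
rows = [("StreetPave", 5e6, 6.0, 0.90, 20, 9.5),
        ("WinPAS",     5e6, 6.0, 0.90, 20, 10.0),
        ("APAI",       5e6, 6.0, 0.90, 20, 9.0)]
conn.executemany("INSERT INTO design_results VALUES (?,?,?,?,?,?)", rows)

# Query the designed thickness for a given condition across programs
for row in conn.execute(
        """SELECT program, thickness_in FROM design_results
           WHERE traffic_esal = ? AND design_life_yr = ?""", (5e6, 20)):
    print(row)
```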
Abstract:
With nearly 2,000 free and open source software (FLOSS) licenses, software license proliferation can be a major headache for software development organizations trying to speed development through software component reuse, as well as for companies redistributing software packages as components of their products. Scope is one problem: from the Free Beer license to the GPL family of licenses to platform-specific licenses such as Apache and Eclipse, the number and variety of licenses make it difficult for companies to "do the right thing" with respect to the software components in their products and applications. In addition to the sheer number of licenses, each license carries within it the author's specific definition of how the software can be used and re-used. Permissive licenses like BSD and MIT make it easy; software can be redistributed and developers can modify code without the requirement of making changes publicly available. Reciprocal licenses, on the other hand, place varying restrictions on re-use and redistribution. Woe to the developer who snags a bit of code after a simple web search without understanding the ramifications of license restrictions.
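As a toy illustration of the permissive/reciprocal distinction applied to a dependency list (the categories below are simplified assumptions; real compliance work needs the full license texts and legal review):

```python
# Simplified license categories (illustrative, not legal advice)
PERMISSIVE = {"MIT", "BSD-2-Clause", "BSD-3-Clause", "Apache-2.0"}
RECIPROCAL = {"GPL-2.0", "GPL-3.0", "LGPL-2.1", "EPL-2.0", "MPL-2.0"}

def flag_dependencies(deps):
    """Flag components whose licenses condition re-use and redistribution."""
    for name, license_id in deps.items():
        if license_id in RECIPROCAL:
            print(f"REVIEW:  {name} ({license_id}) places re-use conditions")
        elif license_id in PERMISSIVE:
            print(f"ok:      {name} ({license_id})")
        else:
            print(f"UNKNOWN: {name} ({license_id}) - read the license text")

flag_dependencies({"libfoo": "MIT", "libbar": "GPL-3.0", "libbaz": "WTFPL"})
```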
Abstract:
OBJECTIVES: The purpose of this study was to compare myocardial blood flow (MBF) and myocardial flow reserve (MFR) estimates from rubidium-82 positron emission tomography ((82)Rb PET) data using 10 software packages (SPs) based on 8 tracer kinetic models. BACKGROUND: It is unknown how well MBF and MFR values from existing SPs agree for (82)Rb PET. METHODS: Rest and stress (82)Rb PET scans of 48 patients with suspected or known coronary artery disease were analyzed in 10 centers. Each center used 1 of 10 SPs to analyze global and regional MBF using the different kinetic models implemented. Values were considered to agree if they simultaneously had an intraclass correlation coefficient >0.75 and a difference <20% of the median across all programs. RESULTS: The most common model evaluated was the Ottawa Heart Institute 1-tissue compartment model (OHI-1-TCM). MBF values from 7 of the 8 SPs implementing this model agreed best. Values from 2 other models (alternative 1-TCM and axially distributed) also agreed well, with occasional differences. The MBF results from other models (e.g., 2-TCM and retention) agreed less well with values from OHI-1-TCM. CONCLUSIONS: SPs using the most common kinetic model, OHI-1-TCM, provided consistent results in measuring global and regional MBF values, suggesting that they may be used interchangeably to process data acquired with a common imaging protocol.
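A sketch of the stated agreement criterion; since the exact ICC variant is not given in the abstract, a one-way random-effects ICC is assumed, and the "<20% of the median" check is one plausible reading. All values are invented.

```python
import numpy as np

def icc_oneway(Y):
    """One-way random-effects ICC(1,1); Y is subjects x raters."""
    n, k = Y.shape
    grand = Y.mean()
    msb = k * ((Y.mean(axis=1) - grand) ** 2).sum() / (n - 1)        # between-subject
    msw = ((Y - Y.mean(axis=1, keepdims=True)) ** 2).sum() / (n * (k - 1))  # within
    return (msb - msw) / (msb + (k - 1) * msw)

# Hypothetical MBF values (ml/min/g): 5 patients x 2 software packages
Y = np.array([[0.9, 1.0], [1.8, 1.7], [2.4, 2.5], [1.2, 1.1], [3.0, 2.9]])

icc = icc_oneway(Y)
diff_ok = np.abs(Y[:, 0] - Y[:, 1]).mean() < 0.20 * np.median(Y)
print(f"ICC={icc:.2f}, agree={icc > 0.75 and diff_ok}")
```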
Abstract:
Background: The repertoire of statistical methods for the descriptive analysis of the burden of a disease has expanded and been implemented in statistical software packages in recent years. The purpose of this paper is to present a web-based tool, REGSTATTOOLS (http://regstattools.net), intended to provide analysis of the burden of cancer or of other disease registry data. Three software applications are included in REGSTATTOOLS: SART (analysis of disease rates and their time trends), RiskDiff (analysis of percent changes in rates due to demographic factors and to the risk of developing or dying from a disease) and WAERS (relative survival analysis). Results: We show a real-data application through the assessment of the burden of tobacco-related cancer incidence in two Spanish regions in the period 1995-2004. Using SART, we show that lung cancer is the most common of these cancers, with rising incidence trends among women. We compared 2000-2004 data with those of 1995-1999 to assess percent changes in the number of cases as well as relative survival, using RiskDiff and WAERS, respectively. We show that the net increase in lung cancer cases among women was mainly attributable to an increased risk of developing lung cancer, whereas in men it was attributable to the increase in population size. Among men, lung cancer relative survival was higher in 2000-2004 than in 1995-1999, whereas it was similar among women when these time periods were compared. Conclusions: Unlike other similar applications, REGSTATTOOLS does not require local software installation; it is simple to use, fast, and easy to interpret. It is a set of web-based statistical tools intended for the automated calculation of population indicators that any professional in the health or social sciences may require.
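A crude sketch of the kind of decomposition RiskDiff performs, here ignoring age structure (which the real tool accounts for); all numbers are invented for illustration.

```python
# Observed cases and person-years in two periods (illustrative numbers)
cases_old, pop_old = 400, 2_000_000     # 1995-1999
cases_new, pop_new = 520, 2_150_000     # 2000-2004

rate_old = cases_old / pop_old
rate_new = cases_new / pop_new

# Net change splits exactly into a population-size term and a risk term:
# rate_new*pop_new - rate_old*pop_old
#   = rate_old*(pop_new - pop_old) + (rate_new - rate_old)*pop_new
net_change = cases_new - cases_old
due_to_population = rate_old * (pop_new - pop_old)
due_to_risk = (rate_new - rate_old) * pop_new

print(f"net={net_change}, population={due_to_population:.1f}, "
      f"risk={due_to_risk:.1f}")
```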
Abstract:
The aim of this work is to study the flow properties at a pipe T-junction, the pressure loss suffered by the flow after passing through the junction, and the reliability of the classical engineering formulas used to find the head loss for a T-junction of pipes. We compared results from CFD software packages with the classical formulas and attempted to determine the accuracy of those formulas. We studied the head loss in a T-junction with various inlet velocities, the head loss when the angle of the junction deviates slightly from 90 degrees, and T-junctions with different cross-sectional areas of the main pipe and the branch pipe. We simulated the flow at the T-junction with FLUENT and COMSOL Multiphysics, observed the flow properties inside the junction, and studied the head loss suffered by the fluid flow after passing through it. We also compared the pressure (head) losses obtained from the classical formulas of A. Vazsonyi and Andrew Gardel, from formulas obtained by treating the T-junction as a combination of other pipe components, and from the software experiments. One purpose of this study is also to examine how the pressure loss changes with the angle of the T-junction. Using software, we can obtain a better view of the flow inside the junction and study turbulence, kinetic energy, pressure loss, etc. Such simulations save a great deal of time and can be performed without actually doing the experiment. No real-life experiments were made; the results obtained rely entirely on the accuracy of the software and the numerical methods used.
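The classical junction-loss formulas reduce to a loss coefficient applied to the velocity head. A minimal sketch using the generic minor-loss form h = K v^2 / (2g); the K values below are placeholders, not Vazsonyi's or Gardel's actual coefficients, which depend on the flow split, junction angle and area ratio.

```python
G = 9.81  # gravitational acceleration, m/s^2

def head_loss(K, v):
    """Minor (local) head loss in metres for loss coefficient K
    and mean velocity v in m/s: h = K * v**2 / (2*g)."""
    return K * v ** 2 / (2 * G)

# Placeholder coefficients for the run and branch legs of a 90-degree tee
for v in (1.0, 2.0, 3.0):
    print(f"v={v} m/s  run: {head_loss(0.9, v):.3f} m  "
          f"branch: {head_loss(2.0, v):.3f} m")
```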
Abstract:
The purpose of this work is to demonstrate the usefulness of low-cost, high-performance computers. Techniques and software packages used by computational chemists are presented. Access to high-performance computing power remains crucial for much of computational quantum chemistry. This work therefore introduces the concept of the PC cluster, an economical computing platform.
Abstract:
The development of new tools for chemoinformatics, allied to the use of different algorithms and computer programs for the structure elucidation of organic compounds, is growing fast worldwide. Massive research and development efforts are currently being pursued both by academia and by the so-called chemistry software development companies. The demystification of this environment, brought about by the availability of software packages and a vast array of publications, exerts a positive impact on chemistry. In this work, an overview of the more classical approaches as well as new strategies in computer-based tools for the structure elucidation of organic compounds is presented. Historical background is also taken into account, since these techniques began to develop around four decades ago. Attention is paid to companies that develop, distribute or commercialize software, as well as to web-based and open-access tools currently available to chemists.
Abstract:
The focus of this thesis is to explore options for building systems for business-critical web applications. Business criticality here includes requirements for data protection and system availability. The focus is on open source software. The goals are to identify robust technologies and engineering practices for implementing such systems. The research methods include experiments with sample systems built around chosen software packages that represent certain technologies. The main research focused on finding a good method for database replication, a key piece of functionality for high-availability, database-driven web applications. The research also included gathering engineering best practices from books written by administrators of high-traffic web applications. The database replication experiment showed that the block-level synchronous replication offered by the DRBD replication software provides considerably more robust data protection and high-availability functionality than the leading open source database product MySQL with its built-in asynchronous replication. For master-master database setups, block-level replication is the more advisable way to build high availability into the system. Based on the thesis research, building high-availability web applications is possible using a combination of open source software and engineering best practices for data protection, availability planning and scaling.
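A minimal sketch of what block-level synchronous replication looks like in DRBD terms: a resource using protocol C, where writes are acknowledged only after reaching both nodes. Hostnames, devices and addresses are placeholders, and the syntax follows DRBD 8-style drbd.conf conventions.

```
resource r0 {
    protocol C;                  # synchronous: write completes on both nodes
    on node1 {
        device    /dev/drbd0;
        disk      /dev/sdb1;     # backing block device
        address   10.0.0.1:7788;
        meta-disk internal;
    }
    on node2 {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   10.0.0.2:7788;
        meta-disk internal;
    }
}
```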