856 resultados para regression algorithm


Relevância:

20.00% 20.00%

Publicador:

Resumo:

The atomic force microscope is not only a very convenient tool for studying the topography of different samples, but it can also be used to measure specific binding forces between molecules. For this purpose, one type of molecule is attached to the tip and the other one to the substrate. Approaching the tip to the substrate allows the molecules to bind together. Retracting the tip breaks the newly formed bond. The rupture of a specific bond appears in the force-distance curves as a spike from which the binding force can be deduced. In this article we present an algorithm to automatically process force-distance curves in order to obtain bond strength histograms. The algorithm is based on a fuzzy logic approach that permits an evaluation of "quality" for every event and makes the detection procedure much faster compared to a manual selection. In this article, the software has been applied to measure the binding strength between tubuline and microtubuline associated proteins.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Background: The first AO comprehensive pediatric long bone fracture classification system has been established following a structured path of development and validation with experienced pediatric surgeons. Methods: A follow-up series of agreement studies was applied to specify and evaluate a grading system for displacement of pediatric supracondylar fractures. An iterative process comprising an international group of 5 experienced pediatric surgeons (Phase 1) followed by a pragmatic multicenter agreement study involving 26 raters (Phase 2) was used. The last evaluations were conducted on a consecutive collection of 154 supracondylar fractures documented by standard anteroposterior and lateral radiographs. Results: Fractures were classified according to 1 of 4 grades: I = incomplete fracture with no or minimal displacement; II = Incomplete fracture with continuity of the posterior (extension fracture) or anterior cortex (flexion fracture); III = lack of bone continuity (broken cortex), but still some contact between the fracture planes; IV = complete fracture with no bone continuity (broken cortex), and no contact between the fracture planes. A diagnostic algorithm to support the practical application of the grading system in a clinical setting, as well as an aid using a circle placed over the capitellum was proposed. The overall kappa coefficients were 0.68 and 0.61 in the Phase 1 and Phase 2 studies, respectively. In the Phase 1 study, fracture grades I, II, III, and IV were classified with median accuracies of 91%, 82%, 83%, and 99.5%, respectively. Similar median accuracies of 86% (Grade I), 73% (Grade II), 83%(Grade III), and 92% were reported for the Phase 2 study. Reliability was high in distinguishing complete, unstable fractures from stable injuries [ie, kappa coefficients of 0.84 (Phase 1) and 0.83 (Phase 2) were calculated]; in Phase 2, surgeons' accuracies in classifying complete fractures were all above 85%. Conclusions: With clear and unambiguous definition, this new grading system for supracondylar fracture displacement has proved to be sufficiently reliable and accurate when applied by pediatric surgeons in the framework of clinical routine as well as research.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Background: Research in epistasis or gene-gene interaction detection for human complex traits has grown over the last few years. It has been marked by promising methodological developments, improved translation efforts of statistical epistasis to biological epistasis and attempts to integrate different omics information sources into the epistasis screening to enhance power. The quest for gene-gene interactions poses severe multiple-testing problems. In this context, the maxT algorithm is one technique to control the false-positive rate. However, the memory needed by this algorithm rises linearly with the amount of hypothesis tests. Gene-gene interaction studies will require a memory proportional to the squared number of SNPs. A genome-wide epistasis search would therefore require terabytes of memory. Hence, cache problems are likely to occur, increasing the computation time. In this work we present a new version of maxT, requiring an amount of memory independent from the number of genetic effects to be investigated. This algorithm was implemented in C++ in our epistasis screening software MBMDR-3.0.3. We evaluate the new implementation in terms of memory efficiency and speed using simulated data. The software is illustrated on real-life data for Crohn’s disease. Results: In the case of a binary (affected/unaffected) trait, the parallel workflow of MBMDR-3.0.3 analyzes all gene-gene interactions with a dataset of 100,000 SNPs typed on 1000 individuals within 4 days and 9 hours, using 999 permutations of the trait to assess statistical significance, on a cluster composed of 10 blades, containing each four Quad-Core AMD Opteron(tm) Processor 2352 2.1 GHz. In the case of a continuous trait, a similar run takes 9 days. Our program found 14 SNP-SNP interactions with a multiple-testing corrected p-value of less than 0.05 on real-life Crohn’s disease (CD) data. Conclusions: Our software is the first implementation of the MB-MDR methodology able to solve large-scale SNP-SNP interactions problems within a few days, without using much memory, while adequately controlling the type I error rates. A new implementation to reach genome-wide epistasis screening is under construction. In the context of Crohn’s disease, MBMDR-3.0.3 could identify epistasis involving regions that are well known in the field and could be explained from a biological point of view. This demonstrates the power of our software to find relevant phenotype-genotype higher-order associations.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The relationship between inflammation and cancer is well established in several tumor types, including bladder cancer. We performed an association study between 886 inflammatory-gene variants and bladder cancer risk in 1,047 cases and 988 controls from the Spanish Bladder Cancer (SBC)/EPICURO Study. A preliminary exploration with the widely used univariate logistic regression approach did not identify any significant SNP after correcting for multiple testing. We further applied two more comprehensive methods to capture the complexity of bladder cancer genetic susceptibility: Bayesian Threshold LASSO (BTL), a regularized regression method, and AUC-Random Forest, a machine-learning algorithm. Both approaches explore the joint effect of markers. BTL analysis identified a signature of 37 SNPs in 34 genes showing an association with bladder cancer. AUC-RF detected an optimal predictive subset of 56 SNPs. 13 SNPs were identified by both methods in the total population. Using resources from the Texas Bladder Cancer study we were able to replicate 30% of the SNPs assessed. The associations between inflammatory SNPs and bladder cancer were reexamined among non-smokers to eliminate the effect of tobacco, one of the strongest and most prevalent environmental risk factor for this tumor. A 9 SNP-signature was detected by BTL. Here we report, for the first time, a set of SNP in inflammatory genes jointly associated with bladder cancer risk. These results highlight the importance of the complex structure of genetic susceptibility associated with cancer risk.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

It is well known the relationship between source separation and blind deconvolution: If a filtered version of an unknown i.i.d. signal is observed, temporal independence between samples can be used to retrieve the original signal, in the same manner as spatial independence is used for source separation. In this paper we propose the use of a Genetic Algorithm (GA) to blindly invert linear channels. The use of GA is justified in the case of small number of samples, where other gradient-like methods fails because of poor estimation of statistics.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

BACKGROUND: The clinical course of HIV-1 infection is highly variable among individuals, at least in part as a result of genetic polymorphisms in the host. Toll-like receptors (TLRs) have a key role in innate immunity and mutations in the genes encoding these receptors have been associated with increased or decreased susceptibility to infections. OBJECTIVES: To determine whether single-nucleotide polymorphisms (SNPs) in TLR2-4 and TLR7-9 influenced the natural course of HIV-1 infection. METHODS: Twenty-eight SNPs in TLRs were analysed in HAART-naive HIV-positive patients from the Swiss HIV Cohort Study. The SNPs were detected using Sequenom technology. Haplotypes were inferred using an expectation-maximization algorithm. The CD4 T cell decline was calculated using a least-squares regression. Patients with a rapid CD4 cell decline, less than the 15th percentile, were defined as rapid progressors. The risk of rapid progression associated with SNPs was estimated using a logistic regression model. Other candidate risk factors included age, sex and risk groups (heterosexual, homosexual and intravenous drug use). RESULTS: Two SNPs in TLR9 (1635A/G and +1174G/A) in linkage disequilibrium were associated with the rapid progressor phenotype: for 1635A/G, odds ratio (OR), 3.9 [95% confidence interval (CI),1.7-9.2] for GA versus AA and OR, 4.7 (95% CI,1.9-12.0) for GG versus AA (P = 0.0008). CONCLUSION: Rapid progression of HIV-1 infection was associated with TLR9 polymorphisms. Because of its potential implications for intervention strategies and vaccine developments, additional epidemiological and experimental studies are needed to confirm this association.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A statewide study was performed to develop regional regression equations for estimating selected annual exceedance- probability statistics for ungaged stream sites in Iowa. The study area comprises streamgages located within Iowa and 50 miles beyond the State’s borders. Annual exceedanceprobability estimates were computed for 518 streamgages by using the expected moments algorithm to fit a Pearson Type III distribution to the logarithms of annual peak discharges for each streamgage using annual peak-discharge data through 2010. The estimation of the selected statistics included a Bayesian weighted least-squares/generalized least-squares regression analysis to update regional skew coefficients for the 518 streamgages. Low-outlier and historic information were incorporated into the annual exceedance-probability analyses, and a generalized Grubbs-Beck test was used to detect multiple potentially influential low flows. Also, geographic information system software was used to measure 59 selected basin characteristics for each streamgage. Regional regression analysis, using generalized leastsquares regression, was used to develop a set of equations for each flood region in Iowa for estimating discharges for ungaged stream sites with 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent annual exceedance probabilities, which are equivalent to annual flood-frequency recurrence intervals of 2, 5, 10, 25, 50, 100, 200, and 500 years, respectively. A total of 394 streamgages were included in the development of regional regression equations for three flood regions (regions 1, 2, and 3) that were defined for Iowa based on landform regions and soil regions. Average standard errors of prediction range from 31.8 to 45.2 percent for flood region 1, 19.4 to 46.8 percent for flood region 2, and 26.5 to 43.1 percent for flood region 3. The pseudo coefficients of determination for the generalized leastsquares equations range from 90.8 to 96.2 percent for flood region 1, 91.5 to 97.9 percent for flood region 2, and 92.4 to 96.0 percent for flood region 3. The regression equations are applicable only to stream sites in Iowa with flows not significantly affected by regulation, diversion, channelization, backwater, or urbanization and with basin characteristics within the range of those used to develop the equations. These regression equations will be implemented within the U.S. Geological Survey StreamStats Web-based geographic information system tool. StreamStats allows users to click on any ungaged site on a river and compute estimates of the eight selected statistics; in addition, 90-percent prediction intervals and the measured basin characteristics for the ungaged sites also are provided by the Web-based tool. StreamStats also allows users to click on any streamgage in Iowa and estimates computed for these eight selected statistics are provided for the streamgage.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Context: Ovarian tumors (OT) typing is a competency expected from pathologists, with significant clinical implications. OT however come in numerous different types, some rather rare, with the consequence of few opportunities for practice in some departments. Aim: Our aim was to design a tool for pathologists to train in less common OT typing. Method and Results: Representative slides of 20 less common OT were scanned (Nano Zoomer Digital Hamamatsu®) and the diagnostic algorithm proposed by Young and Scully applied to each case (Young RH and Scully RE, Seminars in Diagnostic Pathology 2001, 18: 161-235) to include: recognition of morphological pattern(s); shortlisting of differential diagnosis; proposition of relevant immunohistochemical markers. The next steps of this project will be: evaluation of the tool in several post-graduate training centers in Europe and Québec; improvement of its design based on evaluation results; diffusion to a larger public. Discussion: In clinical medicine, solving many cases is recognized as of utmost importance for a novice to become an expert. This project relies on the virtual slides technology to provide pathologists with a learning tool aimed at increasing their skills in OT typing. After due evaluation, this model might be extended to other uncommon tumors.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Traditionally, the Iowa Department of Transportation has used the Iowa Runoff Chart and single-variable regional-regression equations (RREs) from a U.S. Geological Survey report (published in 1987) as the primary methods to estimate annual exceedance-probability discharge (AEPD) for small (20 square miles or less) drainage basins in Iowa. With the publication of new multi- and single-variable RREs by the U.S. Geological Survey (published in 2013), the Iowa Department of Transportation needs to determine which methods of AEPD estimation provide the best accuracy and the least bias for small drainage basins in Iowa. Twenty five streamgages with drainage areas less than 2 square miles (mi2) and 55 streamgages with drainage areas between 2 and 20 mi2 were selected for the comparisons that used two evaluation metrics. Estimates of AEPDs calculated for the streamgages using the expected moments algorithm/multiple Grubbs-Beck test analysis method were compared to estimates of AEPDs calculated from the 2013 multivariable RREs; the 2013 single-variable RREs; the 1987 single-variable RREs; the TR-55 rainfall-runoff model; and the Iowa Runoff Chart. For the 25 streamgages with drainage areas less than 2 mi2, results of the comparisons seem to indicate the best overall accuracy and the least bias may be achieved by using the TR-55 method for flood regions 1 and 3 (published in 2013) and by using the 1987 single-variable RREs for flood region 2 (published in 2013). For drainage basins with areas between 2 and 20 mi2, results of the comparisons seem to indicate the best overall accuracy and the least bias may be achieved by using the 1987 single-variable RREs for the Southern Iowa Drift Plain landform region and for flood region 3 (published in 2013), by using the 2013 multivariable RREs for the Iowan Surface landform region, and by using the 2013 or 1987 single-variable RREs for flood region 2 (published in 2013). For all other landform or flood regions in Iowa, use of the 2013 single-variable RREs may provide the best overall accuracy and the least bias. An examination was conducted to understand why the 1987 single-variable RREs seem to provide better accuracy and less bias than either of the 2013 multi- or single-variable RREs. A comparison of 1-percent annual exceedance-probability regression lines for hydrologic regions 1–4 from the 1987 single-variable RREs and for flood regions 1–3 from the 2013 single-variable RREs indicates that the 1987 single-variable regional-regression lines generally have steeper slopes and lower discharges when compared to 2013 single-variable regional-regression lines for corresponding areas of Iowa. The combination of the definition of hydrologic regions, the lower discharges, and the steeper slopes of regression lines associated with the 1987 single-variable RREs seem to provide better accuracy and less bias when compared to the 2013 multi- or single-variable RREs; better accuracy and less bias was determined particularly for drainage areas less than 2 mi2, and also for some drainage areas between 2 and 20 mi2. The 2013 multi- and single-variable RREs are considered to provide better accuracy and less bias for larger drainage areas. Results of this study indicate that additional research is needed to address the curvilinear relation between drainage area and AEPDs for areas of Iowa.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, a hybrid simulation-based algorithm is proposed for the StochasticFlow Shop Problem. The main idea of the methodology is to transform the stochastic problem into a deterministic problem and then apply simulation to the latter. In order to achieve this goal, we rely on Monte Carlo Simulation and an adapted version of a deterministic heuristic. This approach aims to provide flexibility and simplicity due to the fact that it is not constrained by any previous assumption and relies in well-tested heuristics.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

3 Summary 3. 1 English The pharmaceutical industry has been facing several challenges during the last years, and the optimization of their drug discovery pipeline is believed to be the only viable solution. High-throughput techniques do participate actively to this optimization, especially when complemented by computational approaches aiming at rationalizing the enormous amount of information that they can produce. In siiico techniques, such as virtual screening or rational drug design, are now routinely used to guide drug discovery. Both heavily rely on the prediction of the molecular interaction (docking) occurring between drug-like molecules and a therapeutically relevant target. Several softwares are available to this end, but despite the very promising picture drawn in most benchmarks, they still hold several hidden weaknesses. As pointed out in several recent reviews, the docking problem is far from being solved, and there is now a need for methods able to identify binding modes with a high accuracy, which is essential to reliably compute the binding free energy of the ligand. This quantity is directly linked to its affinity and can be related to its biological activity. Accurate docking algorithms are thus critical for both the discovery and the rational optimization of new drugs. In this thesis, a new docking software aiming at this goal is presented, EADock. It uses a hybrid evolutionary algorithm with two fitness functions, in combination with a sophisticated management of the diversity. EADock is interfaced with .the CHARMM package for energy calculations and coordinate handling. A validation was carried out on 37 crystallized protein-ligand complexes featuring 11 different proteins. The search space was defined as a sphere of 15 R around the center of mass of the ligand position in the crystal structure, and conversely to other benchmarks, our algorithms was fed with optimized ligand positions up to 10 A root mean square deviation 2MSD) from the crystal structure. This validation illustrates the efficiency of our sampling heuristic, as correct binding modes, defined by a RMSD to the crystal structure lower than 2 A, were identified and ranked first for 68% of the complexes. The success rate increases to 78% when considering the five best-ranked clusters, and 92% when all clusters present in the last generation are taken into account. Most failures in this benchmark could be explained by the presence of crystal contacts in the experimental structure. EADock has been used to understand molecular interactions involved in the regulation of the Na,K ATPase, and in the activation of the nuclear hormone peroxisome proliferatoractivated receptors a (PPARa). It also helped to understand the action of common pollutants (phthalates) on PPARy, and the impact of biotransformations of the anticancer drug Imatinib (Gleevec®) on its binding mode to the Bcr-Abl tyrosine kinase. Finally, a fragment-based rational drug design approach using EADock was developed, and led to the successful design of new peptidic ligands for the a5ß1 integrin, and for the human PPARa. In both cases, the designed peptides presented activities comparable to that of well-established ligands such as the anticancer drug Cilengitide and Wy14,643, respectively. 3.2 French Les récentes difficultés de l'industrie pharmaceutique ne semblent pouvoir se résoudre que par l'optimisation de leur processus de développement de médicaments. Cette dernière implique de plus en plus. de techniques dites "haut-débit", particulièrement efficaces lorsqu'elles sont couplées aux outils informatiques permettant de gérer la masse de données produite. Désormais, les approches in silico telles que le criblage virtuel ou la conception rationnelle de nouvelles molécules sont utilisées couramment. Toutes deux reposent sur la capacité à prédire les détails de l'interaction moléculaire entre une molécule ressemblant à un principe actif (PA) et une protéine cible ayant un intérêt thérapeutique. Les comparatifs de logiciels s'attaquant à cette prédiction sont flatteurs, mais plusieurs problèmes subsistent. La littérature récente tend à remettre en cause leur fiabilité, affirmant l'émergence .d'un besoin pour des approches plus précises du mode d'interaction. Cette précision est essentielle au calcul de l'énergie libre de liaison, qui est directement liée à l'affinité du PA potentiel pour la protéine cible, et indirectement liée à son activité biologique. Une prédiction précise est d'une importance toute particulière pour la découverte et l'optimisation de nouvelles molécules actives. Cette thèse présente un nouveau logiciel, EADock, mettant en avant une telle précision. Cet algorithme évolutionnaire hybride utilise deux pressions de sélections, combinées à une gestion de la diversité sophistiquée. EADock repose sur CHARMM pour les calculs d'énergie et la gestion des coordonnées atomiques. Sa validation a été effectuée sur 37 complexes protéine-ligand cristallisés, incluant 11 protéines différentes. L'espace de recherche a été étendu à une sphère de 151 de rayon autour du centre de masse du ligand cristallisé, et contrairement aux comparatifs habituels, l'algorithme est parti de solutions optimisées présentant un RMSD jusqu'à 10 R par rapport à la structure cristalline. Cette validation a permis de mettre en évidence l'efficacité de notre heuristique de recherche car des modes d'interactions présentant un RMSD inférieur à 2 R par rapport à la structure cristalline ont été classés premier pour 68% des complexes. Lorsque les cinq meilleures solutions sont prises en compte, le taux de succès grimpe à 78%, et 92% lorsque la totalité de la dernière génération est prise en compte. La plupart des erreurs de prédiction sont imputables à la présence de contacts cristallins. Depuis, EADock a été utilisé pour comprendre les mécanismes moléculaires impliqués dans la régulation de la Na,K ATPase et dans l'activation du peroxisome proliferatoractivated receptor a (PPARa). Il a également permis de décrire l'interaction de polluants couramment rencontrés sur PPARy, ainsi que l'influence de la métabolisation de l'Imatinib (PA anticancéreux) sur la fixation à la kinase Bcr-Abl. Une approche basée sur la prédiction des interactions de fragments moléculaires avec protéine cible est également proposée. Elle a permis la découverte de nouveaux ligands peptidiques de PPARa et de l'intégrine a5ß1. Dans les deux cas, l'activité de ces nouveaux peptides est comparable à celles de ligands bien établis, comme le Wy14,643 pour le premier, et le Cilengitide (PA anticancéreux) pour la seconde.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, a hybrid simulation-based algorithm is proposed for the StochasticFlow Shop Problem. The main idea of the methodology is to transform the stochastic problem into a deterministic problem and then apply simulation to the latter. In order to achieve this goal, we rely on Monte Carlo Simulation and an adapted version of a deterministic heuristic. This approach aims to provide flexibility and simplicity due to the fact that it is not constrained by any previous assumption and relies in well-tested heuristics.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We describe the case of a man with a history of complex partial seizures and severe language, cognitive and behavioural regression during early childhood (3.5 years), who underwent epilepsy surgery at the age of 25 years. His early epilepsy had clinical and electroencephalogram features of the syndromes of epilepsy with continuous spike waves during sleep and acquired epileptic aphasia (Landau-Kleffner syndrome), which we considered initially to be of idiopathic origin. Seizures recurred at 19 years and presurgical investigations at 25 years showed a lateral frontal epileptic focus with spread to Broca's area and the frontal orbital regions. Histopathology revealed a focal cortical dysplasia, not visible on magnetic resonance imaging. The prolonged but reversible early regression and the residual neuropsychological disorders during adulthood were probably the result of an active left frontal epilepsy, which interfered with language and behaviour during development. Our findings raise the question of the role of focal cortical dysplasia as an aetiology in the syndromes of epilepsy with continuous spike waves during sleep and acquired epileptic aphasia.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Spatial data analysis mapping and visualization is of great importance in various fields: environment, pollution, natural hazards and risks, epidemiology, spatial econometrics, etc. A basic task of spatial mapping is to make predictions based on some empirical data (measurements). A number of state-of-the-art methods can be used for the task: deterministic interpolations, methods of geostatistics: the family of kriging estimators (Deutsch and Journel, 1997), machine learning algorithms such as artificial neural networks (ANN) of different architectures, hybrid ANN-geostatistics models (Kanevski and Maignan, 2004; Kanevski et al., 1996), etc. All the methods mentioned above can be used for solving the problem of spatial data mapping. Environmental empirical data are always contaminated/corrupted by noise, and often with noise of unknown nature. That's one of the reasons why deterministic models can be inconsistent, since they treat the measurements as values of some unknown function that should be interpolated. Kriging estimators treat the measurements as the realization of some spatial randomn process. To obtain the estimation with kriging one has to model the spatial structure of the data: spatial correlation function or (semi-)variogram. This task can be complicated if there is not sufficient number of measurements and variogram is sensitive to outliers and extremes. ANN is a powerful tool, but it also suffers from the number of reasons. of a special type ? multiplayer perceptrons ? are often used as a detrending tool in hybrid (ANN+geostatistics) models (Kanevski and Maignank, 2004). Therefore, development and adaptation of the method that would be nonlinear and robust to noise in measurements, would deal with the small empirical datasets and which has solid mathematical background is of great importance. The present paper deals with such model, based on Statistical Learning Theory (SLT) - Support Vector Regression. SLT is a general mathematical framework devoted to the problem of estimation of the dependencies from empirical data (Hastie et al, 2004; Vapnik, 1998). SLT models for classification - Support Vector Machines - have shown good results on different machine learning tasks. The results of SVM classification of spatial data are also promising (Kanevski et al, 2002). The properties of SVM for regression - Support Vector Regression (SVR) are less studied. First results of the application of SVR for spatial mapping of physical quantities were obtained by the authorsin for mapping of medium porosity (Kanevski et al, 1999), and for mapping of radioactively contaminated territories (Kanevski and Canu, 2000). The present paper is devoted to further understanding of the properties of SVR model for spatial data analysis and mapping. Detailed description of the SVR theory can be found in (Cristianini and Shawe-Taylor, 2000; Smola, 1996) and basic equations for the nonlinear modeling are given in section 2. Section 3 discusses the application of SVR for spatial data mapping on the real case study - soil pollution by Cs137 radionuclide. Section 4 discusses the properties of the modelapplied to noised data or data with outliers.