783 results for grid, clustering, statistical, clustering


Relevance:

30.00% 30.00%

Publisher:

Abstract:

A comparison study was carried out between a wireless sensor node with a flip-chip-mounted bare die and its reference board with a BGA-packaged transceiver chip. The main focus is the return loss (S-parameter S11) at the antenna connector, which depends strongly on the impedance mismatch. The different interconnect technologies, substrate properties and passive components were modeled, and the system was simulated in Ansoft Designer software. Statistical methods, such as standard deviation and regression, were applied to the RF performance analysis to assess the impact of the different parameters on the return loss. An extreme value search, building on this analysis, provides the parameter values for the minimum return loss. Measurements agreed well with the analysis and simulation and showed a large improvement in return loss, from -5 dB to -25 dB, for the target wireless sensor node.
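The link between impedance mismatch and return loss mentioned in this abstract can be illustrated with the standard reflection-coefficient relation. The sketch below is not taken from the paper: the 50-ohm reference impedance, the function name and the load values are assumptions chosen only for illustration.

```python
import math

def return_loss_db(z_load: complex, z0: float = 50.0) -> float:
    """S11 magnitude in dB for a load z_load seen from a z0 reference (assumed 50 ohm)."""
    gamma = (z_load - z0) / (z_load + z0)  # reflection coefficient
    if gamma == 0:
        return float("-inf")  # perfect match: no reflected power
    return 20.0 * math.log10(abs(gamma))

# Hypothetical loads: the closer the load is to z0, the more negative S11 becomes.
for z in (complex(150, 0), complex(55, 5), complex(50.5, 0)):
    print(z, round(return_loss_db(z), 1))
```

A more negative value means less power is reflected at the antenna connector, which is the sense in which the abstract reports an improvement.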

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Lipoprotein-associated phospholipase A(2) (Lp-PLA(2)) is an emerging risk factor and therapeutic target for cardiovascular disease. The activity and mass of this enzyme are heritable traits, but major genetic determinants have not been explored in a systematic, genome-wide fashion. We carried out a genome-wide association study of Lp-PLA(2) activity and mass in 6,668 Caucasian subjects from the population-based Framingham Heart Study. Clinical data and genotypes from the Affymetrix 550K SNP array were obtained from the open-access Framingham SHARe project. Each polymorphism that passed quality control was tested for associations with Lp-PLA(2) activity and mass using linear mixed models implemented in the R statistical package, accounting for familial correlations, and controlling for age, sex, smoking, lipid-lowering-medication use, and cohort. For Lp-PLA(2) activity, polymorphisms at four independent loci reached genome-wide significance, including the APOE/APOC1 region on chromosome 19 (p = 6 × 10^-24); CELSR2/PSRC1 on chromosome 1 (p = 3 × 10^-15); SCARB1 on chromosome 12 (p = 1 × 10^-8); and ZNF259/BUD13 in the APOA5/APOA1 gene region on chromosome 11 (p = 4 × 10^-8). All of these remained significant after accounting for associations with LDL cholesterol, HDL cholesterol, or triglycerides. For Lp-PLA(2) mass, 12 SNPs achieved genome-wide significance, all clustering in a region on chromosome 6p12.3 near the PLA2G7 gene. Our analyses demonstrate that genetic polymorphisms may contribute to inter-individual variation in Lp-PLA(2) activity and mass.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Query processing over the Internet involving autonomous data sources is a major task in data integration. It requires the estimated costs of possible queries in order to select the best one that has the minimum cost. In this context, the cost of a query is affected by three factors: network congestion, server contention state, and complexity of the query. In this paper, we study the effects of both the network congestion and server contention state on the cost of a query. We refer to these two factors together as system contention states. We present a new approach to determining the system contention states by clustering the costs of a sample query. For each system contention state, we construct two cost formulas for unary and join queries respectively using the multiple regression process. When a new query is submitted, its system contention state is estimated first using either the time slides method or the statistical method. The cost of the query is then calculated using the corresponding cost formulas. The estimated cost of the query is further adjusted to improve its accuracy. Our experiments show that our methods can produce quite accurate cost estimates of the submitted queries to remote data sources over the Internet.
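The two steps described in this abstract, clustering sample-query costs into contention states and fitting a cost formula per state, can be sketched as follows. This is a minimal illustration, not the authors' method: it uses a plain one-dimensional k-means and a single-variable least-squares fit in place of the multiple regression mentioned above, and all function names and numbers are hypothetical.

```python
from statistics import mean

def kmeans_1d(values, k, iters=50):
    """Plain Lloyd's k-means on scalars; returns the sorted cluster centers."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centers across the sorted sample.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        updated = [mean(g) if g else centers[j] for j, g in enumerate(groups)]
        if updated == centers:
            break
        centers = updated
    return sorted(centers)

def contention_state(cost, centers):
    """Index of the contention state whose center is closest to a probe cost."""
    return min(range(len(centers)), key=lambda j: abs(cost - centers[j]))

def fit_line(xs, ys):
    """Ordinary least squares for cost = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - slope * mx, slope

# Hypothetical costs (seconds) of one sample query probed at different times.
probe_costs = [1.0, 1.1, 0.9, 1.2, 4.8, 5.1, 5.0, 9.7, 10.2, 10.0]
centers = kmeans_1d(probe_costs, k=3)

# Hypothetical (result size, observed cost) pairs collected while in state 1.
intercept, slope = fit_line([100, 200, 300], [12.0, 22.0, 32.0])

state = contention_state(5.3, centers)  # a fresh probe cost of 5.3 s
print(state, intercept + slope * 250)   # estimated cost of a size-250 query
```

In practice one formula would be fitted per contention state, and the state estimated for a new query selects which formula to apply.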

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Contemporary genetic structure of Atlantic salmon (Salmo salar L.) in the River Moy in Ireland is shown here to be strongly related to landscape features and population demographics, with populations being defined largely by their degree of physical isolation and their size. Samples of juvenile salmon were collected from the 17 major spawning areas on the River Moy and from one spawning area in each of five smaller nearby rivers. No temporal allele frequency differences were observed within locations for 12 microsatellite loci, whereas nearly all spatial samples differed significantly, suggesting that each was a separate population. Bayesian clustering and landscape genetic analyses suggest that these populations can be combined hierarchically into five genetically informative larger groupings. Lakes were found to be the single most important determinant of the observed population structure. Spawning area size was also an important factor. The salmon population of the closest nearby river resembled genetically the largest Moy population grouping. In addition, we showed that anthropogenic influences on spawning habitats, in this case arterial drainage, can affect relationships between populations. Our results show that Atlantic salmon biodiversity can be largely defined by geography, and thus, knowledge of landscape features (for example, as characterized within Geographical Information Systems) has the potential to predict population structure in other rivers without an intensive genetic survey, or at least to help direct sampling. This approach of combining genetics and geography, for sampling and in subsequent statistical analyses, has wider application to the investigation of population structure in other freshwater/anadromous fish species and possibly in marine fish and other organisms.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The identification and classification of network traffic and protocols is a vital step in many quality of service and security systems. Traffic classification strategies must evolve, alongside the protocols utilising the Internet, to overcome the use of ephemeral or masquerading port numbers and transport layer encryption. This research expands the concept of using machine learning on the initial statistics of a flow of packets to determine its underlying protocol. Recognising the need for efficient training/retraining of a classifier and the requirement for fast classification, the authors investigate a new application of k-means clustering referred to as 'two-way' classification. The 'two-way' classification uniquely analyses a bidirectional flow as two unidirectional flows and is shown, through experiments on real network traffic, to improve classification accuracy by as much as 18% when measured against similar proposals. It achieves this accuracy while generating fewer clusters, that is, fewer comparisons are needed to classify a flow. A 'two-way' classification offers a new way to improve accuracy and efficiency of machine learning statistical classifiers while still maintaining the fast training times associated with k-means.
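The idea of treating a bidirectional flow as two unidirectional flows can be sketched with a nearest-centroid lookup in each direction. Everything concrete here is an assumption: the abstract does not state which features are used or how the two directions are combined, so the feature pairs, centroid values, labels and the summed-distance rule below are hypothetical placeholders.

```python
import math

# Hypothetical centroids, as if produced by k-means on per-direction features
# (mean packet size in bytes, mean inter-arrival time in ms) of the first packets.
CLIENT_TO_SERVER = {"http": (420.0, 12.0), "ssh": (96.0, 45.0)}
SERVER_TO_CLIENT = {"http": (1300.0, 8.0), "ssh": (128.0, 40.0)}

def classify_two_way(c2s_features, s2c_features):
    """Score each label by its summed centroid distance over both directions."""
    scores = {
        label: math.dist(c2s_features, CLIENT_TO_SERVER[label])
        + math.dist(s2c_features, SERVER_TO_CLIENT[label])
        for label in CLIENT_TO_SERVER
    }
    return min(scores, key=scores.get)  # label with the smallest combined distance

print(classify_two_way((400.0, 10.0), (1250.0, 9.0)))
print(classify_two_way((100.0, 50.0), (120.0, 42.0)))
```

Classification costs one distance computation per centroid and direction, which is why fewer clusters translate directly into fewer comparisons per flow.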

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Clinical and pathological heterogeneity of breast cancer hinders selection of appropriate treatment for individual cases. Molecular profiling at gene or protein levels may elucidate the biological variance of tumors and provide a new classification system that correlates better with biological, clinical and prognostic parameters. We studied the immunohistochemical profile of a panel of seven important biomarkers using tumor tissue arrays. The tumor samples were then classified with a monothetic (binary variables) clustering algorithm. Two distinct groups of tumors are characterized by the estrogen receptor (ER) status and tumor grade (p = 0.0026). Four biomarkers, c-erbB2, Cox-2, p53 and VEGF, were significantly overexpressed in tumors with the ER-negative (ER-) phenotype. Eight subsets of tumors were further identified according to the expression status of VEGF, c-erbB2 and p53. The malignant potential of the ER-/VEGF+ subgroup was associated with the strong correlations of Cox-2 and c-erbB2 with VEGF. Our results indicate that this molecular classification system, based on the statistical analysis of immunohistochemical profiling, is a useful approach for tumor grouping. Some of these subgroups have a relative genetic homogeneity that may allow further study of specific genetically-controlled metabolic pathways. This approach may hold great promise in rationalizing the application of different therapeutic strategies for different subgroups of breast tumors. (C) 2003 Elsevier Inc. All rights reserved.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Background: Evidence suggests that in prokaryotes sequence-dependent transcriptional pauses affect the dynamics of transcription and translation, as well as of small genetic circuits. So far, a few pause-prone sequences have been identified from in vitro measurements of transcription elongation kinetics.

Results: Using a stochastic model of gene expression at the nucleotide and codon levels with realistic parameter values, we investigate three different but related questions and present statistical methods for their analysis. First, we show that information from in vivo RNA and protein temporal numbers is sufficient to discriminate between models with and without a pause site in their coding sequence. Second, we demonstrate that it is possible to separate a large variety of models from each other with pauses of various durations and locations in the template by means of hierarchical clustering and a random forest classifier. Third, we introduce an approximate likelihood function that allows the location of a pause site to be estimated.

Conclusions: This method can aid in detecting unknown pause-prone sequences from temporal measurements of RNA and protein numbers at a genome-wide scale and thus elucidate possible roles that these sequences play in the dynamics of genetic networks and phenotype.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The aim of this study was to characterize the transcriptome of a balanced polymorphism, under the regulation of a single gene, for phosphate fertilizer responsiveness/arsenate tolerance in wild grass Holcus lanatus genotypes screened from the same habitat.

De novo transcriptome sequencing, RNAseq (RNA sequencing) and single nucleotide polymorphism (SNP) calling were conducted on RNA extracted from H. lanatus. Roche 454 sequencing data were assembled into c. 22 000 isotigs, and paired-end Illumina reads for phosphorus-starved (P-) and phosphorus-treated (P+) genovars of tolerant (T) and nontolerant (N) phenotypes were mapped to this reference transcriptome.

Heatmaps of the gene expression data showed strong clustering of each P+/P- treated genovar, as well as clustering by N/T phenotype. Statistical analysis identified 87 isotigs to be significantly differentially expressed between N and T phenotypes and 258 between P+ and P- treated plants. SNPs and transcript expression that systematically differed between N and T phenotypes had regulatory function, namely proteases, kinases and ribonuclear RNA-binding protein and transposable elements.

A single gene for arsenate tolerance led to distinct phenotype transcriptomes and SNP profiles, with large differences in upstream post-translational and post-transcriptional regulatory genes rather than in genes directly involved in P nutrition transport and metabolism per se.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Background: There is no method routinely used to predict response to anthracycline and cyclophosphamide–based chemotherapy in the clinic; therefore patients often receive treatment for breast cancer with no benefit. Loss of the Fanconi anemia/BRCA (FA/BRCA) DNA damage response (DDR) pathway occurs in approximately 25% of breast cancer patients through several mechanisms and results in sensitization to DNA-damaging agents. The aim of this study was to develop an assay to detect DDR-deficient tumors associated with loss of the FA/BRCA pathway, for the purpose of treatment selection.

Methods: DNA microarray data from 21 FA patients and 11 control subjects were analyzed to identify genetic processes associated with a deficiency in DDR. Unsupervised hierarchical clustering was then performed using 60 BRCA1/2 mutant and 47 sporadic tumor samples, and a molecular subgroup was identified that was defined by the molecular processes represented within FA patients. A 44-gene microarray-based assay (the DDR deficiency assay) was developed to prospectively identify this subgroup from formalin-fixed, paraffin-embedded samples. All statistical tests were two-sided.

Results: In a publicly available independent cohort of 203 patients, the assay predicted complete pathologic response vs residual disease after neoadjuvant DNA-damaging chemotherapy (5-fluorouracil, anthracycline, and cyclophosphamide) with an odds ratio of 3.96 (95% confidence interval [CI] = 1.67 to 9.41; P = .002). In a new independent cohort of 191 breast cancer patients treated with adjuvant 5-fluorouracil, epirubicin, and cyclophosphamide, a positive assay result predicted 5-year relapse-free survival with a hazard ratio of 0.37 (95% CI = 0.15 to 0.88; P = .03) compared with the assay negative population.

Conclusions: A formalin-fixed, paraffin-embedded tissue-based assay has been developed and independently validated as a predictor of response and prognosis after anthracycline/cyclophosphamide–based chemotherapy in the neoadjuvant and adjuvant settings. These findings warrant further validation in a prospective clinical study.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In recent years, wide-field sky surveys providing deep multi-band imaging have presented a new path for indirectly characterizing the progenitor populations of core-collapse supernovae (SN): systematic light curve studies. We assemble a set of 76 grizy-band Type IIP SN light curves from Pan-STARRS1, obtained over a constant survey program of 4 years and classified using both spectroscopy and machine learning-based photometric techniques. We develop and apply a new Bayesian model for the full multi-band evolution of each light curve in the sample. We find no evidence of a sub-population of fast-declining explosions (historically referred to as "Type IIL" SNe). However, we identify a highly significant relation between the plateau phase decay rate and peak luminosity among our SNe IIP. These results argue in favor of a single parameter, likely determined by initial stellar mass, predominantly controlling the explosions of red supergiants. This relation could also be applied for supernova cosmology, offering a standardizable candle good to an intrinsic scatter of 0.2 mag. We compare each light curve to physical models from hydrodynamic simulations to estimate progenitor initial masses and other properties of the Pan-STARRS1 Type IIP SN sample. We show that correction of systematic discrepancies between modeled and observed SN IIP light curve properties and an expanded grid of progenitor properties are needed to enable robust progenitor inferences from multi-band light curve samples of this kind. This work will serve as a pathfinder for photometric studies of core-collapse SNe to be conducted through future wide field transient searches.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Despite recent therapeutic improvements, the clinical course of diffuse large B-cell lymphoma (DLBCL) still differs considerably among patients. We conducted this retrospective multi-centre study to evaluate the impact of genomic aberrations detected using a high-density genome-wide single nucleotide polymorphism-based array on clinical outcome in a population of DLBCL patients treated with R-CHOP-21 (rituximab, cyclophosphamide, doxorubicin, vincristine and prednisone repeated every 21 d). 166 DNA samples were analysed using the GeneChip Human Mapping 250K NspI. Genomic anomalies were analysed regarding their impact on the clinical course of 124 patients treated with R-CHOP-21. Unsupervised clustering was performed to identify genetically related subgroups of patients with different clinical outcomes. Twenty recurrent genetic lesions showed an impact on the clinical course. Loss of genomic material at 8p23.1 showed the strongest statistical significance and was associated with additional aberrations, such as 17p- and 15q-. Unsupervised clustering identified five DLBCL clusters with distinct genetic profiles, clinical characteristics and outcomes. Genetic features and clusters, associated with a different outcome in patients treated with R-CHOP, have been identified by arrayCGH.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Repeated recolonization of freshwater environments following Pleistocene glaciations has played a major role in the evolution and adaptation of anadromous taxa. Located at the western fringe of Europe, Ireland and Britain were likely recolonized rapidly by anadromous fishes from the North Atlantic following the last glacial maximum (LGM). While the presence of unique mitochondrial haplotypes in Ireland suggests that a cryptic northern refugium may have played a role in recolonization, no explicit test of this hypothesis has been conducted. The three-spined stickleback is native and ubiquitous to aquatic ecosystems throughout Ireland, making it an excellent model species with which to examine the biogeographical history of anadromous fishes in the region. We used mitochondrial and microsatellite markers to examine the presence of divergent evolutionary lineages and to assess broad-scale patterns of geographical clustering among postglacially isolated populations. Our results confirm that Ireland is a region of secondary contact for divergent mitochondrial lineages and that endemic haplotypes occur in populations in Central and Southern Ireland. To test whether a putative Irish lineage arose from a cryptic Irish refugium, we used approximate Bayesian computation (ABC). However, we found no support for this hypothesis. Instead, the Irish lineage likely diverged from the European lineage as a result of postglacial isolation of freshwater populations by rising sea levels. These findings emphasize the need to rigorously test biogeographical hypotheses and contribute further evidence that postglacial processes may have shaped genetic diversity in temperate fauna.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

We present a novel method for the light-curve characterization of Pan-STARRS1 Medium Deep Survey (PS1 MDS) extragalactic sources into stochastic variables (SVs) and burst-like (BL) transients, using multi-band image-differencing time-series data. We select detections in difference images associated with galaxy hosts using a star/galaxy catalog extracted from the deep PS1 MDS stacked images, and adopt a maximum a posteriori formulation to model their difference-flux time-series in four Pan-STARRS1 photometric bands gP1, rP1, iP1, and zP1. We use three deterministic light-curve models to fit BL transients: a Gaussian, a Gamma distribution, and an analytic supernova (SN) model, and one stochastic light-curve model, the Ornstein-Uhlenbeck process, in order to fit variability that is characteristic of active galactic nuclei (AGNs). We assess the quality of fit of the models band-wise and source-wise, using their estimated leave-one-out cross-validation likelihoods and corrected Akaike information criteria. We then apply a K-means clustering algorithm to these statistics, to determine the source classification in each band. The final source classification is derived as a combination of the individual filter classifications, resulting in two measures of classification quality, from the averages across the photometric filters of (1) the classifications determined from the closest K-means cluster centers, and (2) the square distances from the clustering centers in the K-means clustering spaces. For a verification set of AGNs and SNe, we show that SV and BL occupy distinct regions in the plane constituted by these measures. We use our clustering method to characterize 4361 extragalactic image difference detected sources, in the first 2.5 yr of the PS1 MDS, into 1529 BL, and 2262 SV, with a purity of 95.00% for AGNs, and 90.97% for SN based on our verification sets.
We combine our light-curve classifications with their nuclear or off-nuclear host galaxy offsets, to define a robust photometric sample of 1233 AGNs and 812 SNe. With these two samples, we characterize their variability and host galaxy properties, and identify simple photometric priors that would enable their real-time identification in future wide-field synoptic surveys.
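One of the fit-quality statistics named in this abstract, the corrected Akaike information criterion, has a standard closed form that is easy to compute once a model's maximized log-likelihood is known. The sketch below shows only that textbook formula; the function name and the example numbers are hypothetical and do not come from the paper.

```python
def aicc(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Corrected AIC: AIC plus a small-sample penalty term."""
    if n_obs - n_params - 1 <= 0:
        raise ValueError("AICc needs n_obs > n_params + 1")
    aic = 2 * n_params - 2 * log_likelihood
    return aic + (2 * n_params * (n_params + 1)) / (n_obs - n_params - 1)

# Hypothetical fits of two models to the same 20-point light curve.
print(aicc(log_likelihood=-10.0, n_params=2, n_obs=20))
print(aicc(log_likelihood=-9.5, n_params=5, n_obs=20))
```

Lower values indicate a better trade-off between fit and complexity, so per-model values like these are the kind of statistic that could then be fed to a clustering step.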

Relevance:

30.00% 30.00%

Publisher:

Abstract:

PURPOSE: This systematic review reports on the survival of feldspathic porcelain veneers.

MATERIALS AND METHODS: The Cochrane Library, MEDLINE (OVID), Embase, Web of Knowledge, selected journals, clinical trials registers, and conference proceedings were searched independently by two reviewers. Academic colleagues were also contacted to identify relevant research. Inclusion criteria were human cohort studies (prospective and retrospective) and controlled trials assessing outcomes of feldspathic porcelain veneers in more than 15 patients and with at least some of the veneers in situ for 5 years. Of 4,294 articles identified, 116 studies underwent full-text screenings and 69 were further reviewed for eligibility. Of these, 11 were included in the qualitative analysis and 6 (5 cohorts) were included in meta-analyses. Estimated cumulative survival and standard error for each study were assessed and used for meta-, sensitivity, and post hoc analyses. The I² statistic and the Cochran Q test and its associated P value were used to evaluate statistical heterogeneity, with a random-effects meta-analysis used when the P value for heterogeneity was less than .1. Galbraith, forest, and funnel plots explored heterogeneity, publication patterns, and small study biases.

RESULTS: The estimated cumulative survival for feldspathic porcelain veneers was 95.7% (95% confidence interval [CI]: 92.9% to 98.4%) at 5 years and ranged from 64% to 95% at 10 years across three studies. A post hoc meta-analysis indicated that the 10-year best estimate may approach 95.6% (95% CI: 93.8% to 97.5%). High levels of statistical heterogeneity were found.

CONCLUSIONS: When bonded to enamel substrate, feldspathic porcelain veneers have a very high 10-year survival rate that may approach 95%. Clinical heterogeneity is associated with differences in reported survival rates. Use of clinically relevant survival definitions and careful reporting of tooth characteristics, censorship, clustering, and precise results in future research would improve meta-analytic estimates and aid treatment decisions.
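The heterogeneity measures named in the methods of this abstract, Cochran's Q and the I² statistic, follow directly from per-study estimates and their standard errors. The sketch below uses the usual inverse-variance definitions; the function name is an assumption and the survival estimates are made-up numbers, not data from the review.

```python
def cochran_q_i2(estimates, std_errors):
    """Return (fixed-effect pooled estimate, Cochran's Q, I-squared in percent)."""
    weights = [1.0 / se ** 2 for se in std_errors]  # inverse-variance weights
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, q, i2

# Hypothetical cumulative survival estimates and standard errors from three studies.
pooled, q, i2 = cochran_q_i2([0.90, 0.96, 0.99], [0.02, 0.02, 0.02])
print(round(pooled, 3), round(q, 2), round(i2, 1))
```

Q would then be compared against a chi-squared distribution with k - 1 degrees of freedom to obtain the P value used to choose between fixed-effect and random-effects pooling.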

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Single component geochemical maps are the most basic representation of spatial elemental distributions and are commonly used in environmental and exploration geochemistry. However, the compositional nature of geochemical data imposes several limitations on how the data should be presented. The problems relate to the constant sum problem (closure) and to the inherently multivariate, relative information conveyed by compositional data. A well-known example is the tendency of all heavy metals to show lower values in soils with significant contributions of diluting elements (e.g., the quartz dilution effect); the contrary effect is the apparent enrichment in many elements due to removal of potassium during weathering. The validity of classical single component maps is thus investigated, and reasonable alternatives that honour the compositional character of geochemical concentrations are presented. The first recommended method relies on knowledge-driven log-ratios, chosen to highlight certain geochemical relations or to filter known artefacts (e.g., dilution with SiO2 or volatiles). This is similar to the classical approach of normalisation to a single element. The second approach uses so-called log-contrasts, which employ suitable statistical methods (such as classification techniques, regression analysis, principal component analysis, clustering of variables, etc.) to extract potentially interesting geochemical summaries. The caution from this work is that if a compositional approach is not used, it becomes difficult to guarantee that any identified pattern, trend or anomaly is not an artefact of the constant sum constraint. In summary, the authors recommend a chain of enquiry that involves searching for the appropriate statistical method that can answer the required geological or geochemical question whilst maintaining the integrity of the compositional nature of the data.
The required log-ratio transformations should be applied followed by the chosen statistical method. Interpreting the results may require a closer working relationship between statisticians, data analysts and geochemists.
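The dilution artefact and the log-ratio remedy described in this abstract can be made concrete with a small numeric sketch. It assumes strictly positive parts (zeros would need separate treatment), and the three-part composition, its values and the helper names are hypothetical illustrations rather than anything taken from the paper.

```python
import math

def closure(parts):
    """Rescale positive parts so they sum to 1 (the constant sum constraint)."""
    total = sum(parts)
    return [p / total for p in parts]

def clr(parts):
    """Centred log-ratio: log of each part relative to the geometric mean."""
    logs = [math.log(p) for p in parts]
    centre = sum(logs) / len(logs)
    return [v - centre for v in logs]

# Hypothetical (Cu, Zn, SiO2) amounts; the second sample has extra quartz.
original = closure([2.0, 4.0, 94.0])
diluted = closure([2.0, 4.0, 194.0])

print(original[0], diluted[0])              # raw Cu share drops on dilution
print(math.log(original[0] / original[1]),  # log(Cu/Zn) is the same in both
      math.log(diluted[0] / diluted[1]))
print(clr(original))                        # clr coordinates sum to zero
```

A single-element map of the raw Cu share would show a spurious low in the quartz-rich sample, whereas a map of log(Cu/Zn) would not.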