12 results for Analysis of multiple regression
Abstract:
There has been an increasing interest in the development of new methods using Pareto optimality to deal with multi-objective criteria (for example, accuracy and time complexity). Once one has developed an approach to a problem of interest, the problem is then how to compare it with the state of the art. In machine learning, algorithms are typically evaluated by comparing their performance on different data sets by means of statistical tests. Standard tests used for this purpose can neither consider several performance measures jointly nor compare multiple competitors at once. The aim of this paper is to resolve these issues by developing statistical procedures that are able to account for multiple competing measures at the same time and to compare multiple algorithms simultaneously. In particular, we develop two tests: a frequentist procedure based on the generalized likelihood-ratio test and a Bayesian procedure based on a multinomial-Dirichlet conjugate model. We further extend them by discovering conditional independences among measures to reduce the number of parameters of such models, since the number of studied cases is usually very small in such comparisons. Data from a comparison among general-purpose classifiers is used to show a practical application of our tests.
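As a rough illustration of the conjugate update behind a multinomial-Dirichlet model, the sketch below counts, over a set of data sets, how often one algorithm dominates another on two measures, and updates a Dirichlet prior with those counts. The three outcome categories, the uniform prior, and the counts are all hypothetical and are not taken from the paper; the paper's actual test may be constructed differently.

```python
import random


def dirichlet_posterior(counts, prior=1.0):
    """Conjugate update: Dirichlet(prior, ...) prior plus multinomial counts."""
    return [prior + c for c in counts]


def posterior_mean(alpha):
    """Mean of a Dirichlet distribution with parameters alpha."""
    total = sum(alpha)
    return [a / total for a in alpha]


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalising independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def prob_first_exceeds_second(alpha, n_samples=20000, seed=0):
    """Monte Carlo estimate of P(theta[0] > theta[1]) under Dirichlet(alpha)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        theta = sample_dirichlet(alpha, rng)
        if theta[0] > theta[1]:
            hits += 1
    return hits / n_samples


# Hypothetical outcomes over 20 data sets (made-up numbers):
# [A dominates B on both measures, B dominates A, neither dominates]
counts = [12, 3, 5]
alpha = dirichlet_posterior(counts)
mean = posterior_mean(alpha)
p_a_better = prob_first_exceeds_second(alpha)
```

Here `p_a_better` estimates the posterior probability that "A dominates B" is a more likely outcome than "B dominates A"; with more measures or more algorithms the number of categories grows quickly, which is the motivation the abstract gives for reducing the number of parameters.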
Abstract:
We performed fluorescent in situ hybridization (FISH) for 16q23 abnormalities in 861 patients with newly diagnosed multiple myeloma and identified deletion of 16q [del(16q)] in 19.5%. In 467 cases in which demographic and survival data were available, del(16q) was associated with a worse overall survival (OS). It was an independent prognostic marker and conferred additional adverse survival impact in cases with the known poor-risk cytogenetic factors t(4;14) and del(17p). Gene expression profiling and gene mapping using 500K single-nucleotide polymorphism (SNP) mapping arrays revealed loss of heterozygosity (LOH) involving 3 regions: the whole of 16q, a region centered on 16q12 (the location of CYLD), and a region centered on 16q23 (the location of the WW domain-containing oxidoreductase gene WWOX). CYLD is a negative regulator of the NF-kappaB pathway, and cases with low expression of CYLD were used to define a "low-CYLD signature." Cases with 16q LOH or t(14;16) had significantly reduced WWOX expression. WWOX, the site of the translocation breakpoint in t(14;16) cases, is a known tumor suppressor gene involved in apoptosis, and we were able to generate a "low-WWOX signature" defined by WWOX expression. These 2 genes and their corresponding pathways provide an important insight into the potential mechanisms by which 16q LOH confers poor prognosis.
Abstract:
BACKGROUND AND OBJECTIVE: Molecular analysis by PCR of monoclonally rearranged immunoglobulin (Ig) genes can be used for diagnosis in B-cell lymphoproliferative disorders (LPD), as well as for monitoring minimal residual disease (MRD) after treatment. This technique carries the risk of false-positive results due to the "background" amplification of similar rearrangements derived from polyclonal B-cells. This problem can be resolved in advance by additional analyses that discern between polyclonal and monoclonal PCR products, such as heteroduplex analysis. A second problem is that PCR frequently fails to amplify the junction regions, mainly due to somatic mutations frequently present in mature (post-follicular) B-cell lymphoproliferations. The use of additional targets (e.g. Ig light chain genes) can avoid this problem. DESIGN AND METHODS: We studied the specificity of heteroduplex PCR analysis of several Ig junction regions to detect monoclonal products in samples from 84 MM patients and 24 patients with polyclonal B-cell disorders. RESULTS: Using two distinct VH consensus primers (FR3 and FR2) in combination with one JH primer, 79% of the MM cases displayed monoclonal products. The percentage of positive cases was increased by amplification of the Vlambda-Jlambda junction regions or kappa(de) rearrangements, using two or five pairs of consensus primers, respectively. After including these targets in the heteroduplex PCR analysis, 93% of MM cases displayed monoclonal products. None of the polyclonal samples analyzed resulted in monoclonal products. Dilution experiments showed that monoclonal rearrangements could be detected with a sensitivity of at least 10(-2) in a background with >30% polyclonal B-cells, the sensitivity increasing up to 10(-3) when the polyclonal background was
Abstract:
This paper is part of a special issue of Applied Geochemistry focusing on reliable applications of compositional multivariate statistical methods. This study outlines the application of compositional data analysis (CoDa) to calibration of geochemical data and multivariate statistical modelling of geochemistry and grain-size data from a set of Holocene sedimentary cores from the Ganges-Brahmaputra (G-B) delta. Over the last two decades, understanding near-continuous records of sedimentary sequences has required the use of core-scanning X-ray fluorescence (XRF) spectrometry, for both terrestrial and marine sedimentary sequences. Initial XRF data are generally unusable in ‘raw-format’, requiring data processing in order to remove instrument bias, as well as informed sequence interpretation. The applicability of these conventional calibration equations to core-scanning XRF data is further limited by the constraints posed by unknown measurement geometry and specimen homogeneity, as well as matrix effects. Log-ratio based calibration schemes have been developed and applied to clastic sedimentary sequences focusing mainly on energy dispersive-XRF (ED-XRF) core-scanning. This study has applied high-resolution core-scanning XRF to Holocene sedimentary sequences from the tidal-dominated Indian Sundarbans (Ganges-Brahmaputra delta plain). The Log-Ratio Calibration Equation (LRCE) was applied to a sub-set of core-scan and conventional ED-XRF data to quantify elemental composition. This provides a robust calibration scheme using reduced major axis regression of log-ratio transformed geochemical data. Through partial least squares (PLS) modelling of geochemical and grain-size data, it is possible to derive robust proxy information for the Sundarbans depositional environment. The application of these techniques to Holocene sedimentary data offers an improved methodological framework for unravelling Holocene sedimentation patterns.
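The calibration idea mentioned above, reduced major axis (RMA) regression on log-ratio transformed data, can be sketched in a few lines. This is only an illustration under assumptions: it supposes a straight-line relation between the log-ratio of two scanner intensities and the log-ratio of the corresponding reference concentrations, and the element pair (Ca, Ti) and all numbers are made up rather than taken from the study.

```python
import math
import statistics


def log_ratio(numerator, denominator):
    """Element-wise natural log of numerator/denominator (a pairwise log-ratio)."""
    return [math.log(n / d) for n, d in zip(numerator, denominator)]


def rma_fit(x, y):
    """Reduced major axis regression: slope = sign(cov) * sd(y) / sd(x)."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sd_x, sd_y = statistics.stdev(x), statistics.stdev(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)
    sign = 1.0 if cov >= 0 else -1.0
    slope = sign * sd_y / sd_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical core-scanner intensities (counts) for two elements
ca_counts = [1200.0, 1500.0, 900.0, 2000.0, 1100.0]
ti_counts = [400.0, 380.0, 450.0, 350.0, 420.0]
# Hypothetical reference concentrations (wt%) for the same samples
ca_conc = [3.1, 3.9, 2.2, 5.4, 2.8]
ti_conc = [0.50, 0.47, 0.56, 0.43, 0.52]

x = log_ratio(ca_counts, ti_counts)   # log-ratio of intensities
y = log_ratio(ca_conc, ti_conc)       # log-ratio of concentrations
slope, intercept = rma_fit(x, y)
```

Unlike ordinary least squares, the RMA slope treats both variables as subject to error, which is one reason it is often preferred when both the scanner intensities and the reference measurements are noisy.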
Abstract:
Huntington’s disease (HD) is an autosomal neurodegenerative disorder affecting approximately 5-10 persons per 100,000 worldwide. The pathophysiology of HD is not fully understood, but the age of onset is known to be highly dependent on the number of CAG triplet repeats in the huntingtin gene. Using 1H NMR spectroscopy, this study biochemically profiled 39 brain metabolites in post-mortem striatum (n=14) and frontal lobe (n=14) from HD sufferers and controls (n=28). Striatum metabolites were more perturbed, with 15 significantly affected in HD cases, compared with only 4 in frontal lobe (P<0.05; q<0.3). The metabolite that changed most overall was urea, which decreased 3.25-fold in striatum (P<0.01). Four metabolites were consistently affected in both brain regions. These included the neurotransmitter precursors tyrosine and L-phenylalanine, which were significantly depleted by 1.55-1.58-fold and 1.48-1.54-fold in striatum and frontal lobe, respectively (P=0.02-0.03). They also included L-leucine, which was reduced 1.54-1.69-fold (P=0.04-0.09), and myo-inositol, which was increased 1.26-1.37-fold (P<0.01). Logistic regression analyses performed with MetaboAnalyst demonstrated that data obtained from striatum produced models which were profoundly more sensitive and specific than those produced from frontal lobe. The brain metabolite changes uncovered in this first 1H NMR investigation of human HD offer new insights into the disease pathophysiology. Further investigations of striatal metabolite disturbances are clearly warranted.
Abstract:
The ability to rearrange the germ-line DNA to generate antibody diversity is an essential prerequisite for the production of a functional repertoire. While this is essential to prevent infections, it also represents the "Achilles heel" of the B-cell lineage, occasionally leading to malignant transformation of these cells by translocation of protooncogenes into the immunoglobulin (Ig) loci. However, in evolutionary terms this is a small price to pay for a functional immune system. The study of the configuration and rearrangements of the Ig gene loci has contributed extensively to our understanding of the natural history of development of myeloma. In addition, the analysis of Ig gene rearrangements in B-cell neoplasms provides information about the clonal origin of the disease and its prognosis, as well as a clinically useful tool for clonality detection and minimal residual disease monitoring. Herein, we review the data currently available on both Ig gene rearrangements and protein patterns seen in myeloma with the aim of illustrating how this knowledge has contributed to our understanding of the pathobiology of myeloma.
Abstract:
Multiple myeloma is characterized by genomic alterations frequently involving gains and losses of chromosomes. Single nucleotide polymorphism (SNP)-based mapping arrays allow the identification of copy number changes at the sub-megabase level and the identification of loss of heterozygosity (LOH) due to monosomy and uniparental disomy (UPD). We have found that SNP-based mapping array data and fluorescence in situ hybridization (FISH) copy number data correlated well, making the technique robust as a tool to investigate myeloma genomics. The most frequently identified alterations are located at 1p, 1q, 6q, 8p, 13, and 16q. LOH is found in these large regions and also in smaller regions throughout the genome with a median size of 1 Mb. We have identified that UPD is prevalent in myeloma and occurs through a number of mechanisms including mitotic nondisjunction and mitotic recombination. For the first time in myeloma, integration of mapping and expression data has allowed us to reduce the complexity of standard gene expression data and identify candidate genes important in both the transition from normal to monoclonal gammopathy of unknown significance (MGUS) to myeloma and in different subgroups within myeloma. We have documented these genes, providing a focus for further studies to identify and characterize those that are key in the pathogenesis of myeloma.
Abstract:
Purpose: Our purpose in this report was to define genes and pathways dysregulated as a consequence of the t(4;14) in myeloma, and to gain insight into the downstream functional effects that may explain the different prognosis of this subgroup. Experimental Design: Fibroblast growth factor receptor 3 (FGFR3) overexpression, the presence of immunoglobulin heavy chain-multiple myeloma SET domain (IgH-MMSET) fusion products and the identification of t(4;14) breakpoints were determined in a series of myeloma cases. Differentially expressed genes were identified between cases with (n = 55) and without (n = 24) a t(4;14) by using global gene expression analysis. Results: Cases with a t(4;14) have a distinct expression pattern compared with other cases of myeloma. A total of 127 genes were identified as being differentially expressed including MMSET and cyclin D2, which have been previously reported as being associated with this translocation. Other important functional classes of genes include cell signaling, apoptosis and related genes, oncogenes, chromatin structure, and DNA repair genes. Interestingly, 25% of myeloma cases lacking evidence of this translocation had up-regulation of the MMSET transcript to the same level as cases with a translocation. Conclusions: t(4;14) cases form a distinct subgroup of myeloma cases with a unique gene signature that may account for their poor prognosis. A number of non-t(4;14) cases also express MMSET consistent with this gene playing a role in myeloma pathogenesis.
Abstract:
Contaminating tumour cells in apheresis products have been shown to influence the outcome of patients with multiple myeloma (MM) undergoing autologous stem cell transplantation (APBSCT). The gene scanning of clonally rearranged VDJ segments of the heavy chain immunoglobulin gene (VDJH) is a reproducible and easy-to-perform technique that can be optimised for clinical laboratories. We used it to analyse the aphereses of 27 MM patients undergoing APBSCT with clonally detectable VDJH segments, and 14 of them yielded monoclonal peaks in at least one apheresis product. The presence of positive results was not related to any pre-transplant characteristics, except the age at diagnosis (lower in patients with negative products, P = 0.04). Moreover, a better pre-transplant response tended to be associated with a negative result (P = 0.069). Patients with clonally free products were more likely to obtain a better response to transplant (complete remission, 54% vs 28%; >90% reduction in the M-component, 93% vs 43%, P = 0.028). In addition, patients transplanted with polyclonal products had longer progression-free survival (39 vs 19 months, P = 0.037) and overall survival (81% vs 28% at 5 years, P = 0.045) than those transplanted with monoclonal apheresis. In summary, the gene scanning of apheresis products is a useful and clinically relevant technique in MM transplanted patients.
Abstract:
This paper studies the energy efficiency (EE) of a point-to-point rank-1 Ricean fading multiple-input-multiple-output (MIMO) channel. In particular, a tight lower bound and an asymptotic approximation for the EE of the considered MIMO system are presented, under the assumption that the channel is unknown at the transmitter and perfectly known at the receiver. Moreover, the effects of different system parameters, namely, transmit power, spectral efficiency (SE), and number of transmit and receive antennas, on the EE are analytically investigated. An important observation is that, in the high signal-to-noise ratio regime and with the other system parameters fixed, the optimal transmit power that maximizes the EE increases as the Ricean-K factor increases. On the contrary, the optimal SE and the optimal number of transmit antennas decrease as K increases.
Abstract:
Although microturbines (MTs) are among the most successfully commercialized distributed energy resources, their long-term effects on the distribution network have not been fully investigated due to the complex thermo-fluid-mechanical energy conversion processes. This is further complicated by the fact that the parameters and internal data of MTs are not always available to the electric utility, due to different ownerships and confidentiality concerns. To address this issue, a general modeling approach for MTs is proposed in this paper, which allows for the long-term simulation of the distribution network with multiple MTs. First, the feasibility of deriving a simplified MT model for long-term dynamic analysis of the distribution network is discussed, based on the physical understanding of dynamic processes that occur within MTs. Then a three-stage identification method is developed in order to obtain a piecewise MT model and predict electro-mechanical system behaviors with saturation. Next, assisted with the electric power flow calculation tool, a fast simulation methodology is proposed to evaluate the long-term impact of multiple MTs on the distribution network. Finally, the model is verified by using Capstone C30 microturbine experiments, and further applied to the dynamic simulation of a modified IEEE 37-node test feeder with promising results.