48 results for Failure time data analysis
Abstract:
Quantile normalization (QN) is a technique for microarray data processing and is the default normalization method in the Robust Multi-array Average (RMA) procedure, which was primarily designed for analysing gene expression data from Affymetrix arrays. Given the abundance of Affymetrix microarrays and the popularity of the RMA method, it is crucially important that the normalization procedure is applied appropriately. In this study we carried out simulation experiments and analysed real microarray data to investigate the suitability of RMA when it is applied to datasets containing different groups of biological samples. Our experiments showed that RMA with QN does not preserve the biological signal within each group, but rather mixes the signals between the groups. We also showed that the Median Polish method in the summarization step of RMA has a similar mixing effect. RMA is one of the most widely used methods in microarray data processing and has been applied to a vast volume of data in biomedical research. The problematic behaviour of this method suggests that previous studies employing RMA could have been adversely affected. We therefore think it is crucially important that the research community recognizes the issue and starts to address it. The two core elements of the RMA method, quantile normalization and Median Polish, both have the undesirable effect of mixing biological signals between different sample groups, which can be detrimental to drawing valid biological conclusions and to any subsequent analyses. Based on the evidence presented here and in the literature, we recommend exercising caution when using RMA to process microarray gene expression data, particularly where there are likely to be unknown subgroups of samples.
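To make the mixing mechanism concrete, here is a minimal sketch of quantile normalization in plain Python (the function name and the toy two-sample input are illustrative, not taken from the study): each sample's values are replaced by the mean of the values sharing that rank across samples, forcing every sample onto an identical distribution regardless of group membership.

```python
def quantile_normalize(samples):
    """Quantile-normalize a list of equal-length samples (one list per array)."""
    n = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    # Reference distribution: mean of each rank position across all samples.
    ref = [sum(s[i] for s in sorted_samples) / len(samples) for i in range(n)]
    normalized = []
    for s in samples:
        order = sorted(range(n), key=lambda i: s[i])  # indices by ascending value
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = ref[rank]  # each value replaced by the reference value at its rank
        normalized.append(out)
    return normalized
```

After normalization every sample carries exactly the same set of values (only their order differs), which is why group-specific distributional signal cannot survive the transformation.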
Abstract:
Statistics are regularly used to make some form of comparison between trace evidence or to deploy the exclusionary principle (Morgan and Bull, 2007) in forensic investigations. Trace evidence is routinely the result of particle size, chemical, or modal analyses and as such constitutes compositional data. The issue is that compositional data, including percentages, parts per million, etc., carry only relative information. This may be problematic where a comparison of percentages and other constrained/closed data is deemed a statistically valid and appropriate way to present trace evidence in a court of law. Despite awareness of the constant sum problem since the seminal works of Pearson (1896) and Chayes (1960), and the introduction of log-ratio techniques (Aitchison, 1986; Pawlowsky-Glahn and Egozcue, 2001; Pawlowsky-Glahn and Buccianti, 2011; Tolosana-Delgado and van den Boogaart, 2013), it is all too often not acknowledged in the statistical treatment of trace evidence that a constant sum destroys the potential independence of variances and covariances required for correlation, regression analysis, and empirical multivariate methods (principal component analysis, cluster analysis, discriminant analysis, canonical correlation). Yet the need for a robust treatment of forensic trace evidence analyses is obvious. This research examines the issues and potential pitfalls for forensic investigators if the constant sum constraint is ignored in the analysis and presentation of forensic trace evidence. Forensic case studies involving particle size and mineral analyses as trace evidence are used to demonstrate a compositional data approach using a centred log-ratio (clr) transformation and multivariate statistical analyses.
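As a minimal illustration of the centred log-ratio transformation the abstract refers to (a sketch assuming a simple list of strictly positive parts; the function name is ours):

```python
import math

def clr(parts):
    """Centred log-ratio: log of each part relative to the geometric mean."""
    geometric_mean = math.exp(sum(math.log(p) for p in parts) / len(parts))
    return [math.log(p / geometric_mean) for p in parts]
```

clr scores sum to zero by construction; the transformation maps the closed composition into real space so that standard covariance-based multivariate statistics can be applied without the spurious correlations induced by the constant sum.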
Abstract:
This case study deals with the role of time series analysis in sociology and its relationship with the wider literature and methodology of comparative case study research. Time series analysis is now well represented in top-ranked sociology journals, often in the form of ‘pooled time series’ research designs. These studies typically pool multiple countries into a time series cross-section panel in order to provide a larger sample for more robust and comprehensive analysis. This approach is well suited to exploring trans-national phenomena and to elaborating useful macro-level theories specific to social structures, national policies, and long-term historical processes. It is less suited, however, to understanding how these global social processes work in different countries. As such, the complexities of individual countries, which often display dynamics very different from or even contradictory to those suggested in pooled studies, are subsumed. Meanwhile, a robust literature on comparative case-based methods exists in the social sciences, where researchers focus on differences between cases and the complex ways in which they co-evolve or diverge over time. A good example is the inequality literature: although panel studies suggest a general trend of rising inequality driven by the weakening power of labour, the marketisation of welfare, and the rising power of capital, some countries have still managed to remain resilient. This case study takes a closer look at what can be learned by applying the insights of case-based comparative research to the method of time series analysis. Taking international income inequality as its point of departure, it argues that we have much to learn about the viability of different combinations of policy options by examining how they work in different countries over time.
By taking representative cases from different welfare systems (liberal, social democratic, corporatist, or antipodean), we can better sharpen our theories of how policies can be more specifically engineered to offset rising inequality. This involves a fundamental realignment of the strategy of time series analysis, grounding it instead in a qualitative appreciation of the historical context of cases, as a basis for comparing effects between different countries.
Abstract:
This paper is part of a special issue of Applied Geochemistry focusing on reliable applications of compositional multivariate statistical methods. This study outlines the application of compositional data analysis (CoDa) to the calibration of geochemical data and to multivariate statistical modelling of geochemistry and grain-size data from a set of Holocene sedimentary cores from the Ganges-Brahmaputra (G-B) delta. Over the last two decades, understanding near-continuous records of sedimentary sequences has required the use of core-scanning X-ray fluorescence (XRF) spectrometry, for both terrestrial and marine sedimentary sequences. Initial XRF data are generally unusable in ‘raw’ format, requiring data processing to remove instrument bias, as well as informed sequence interpretation. The applicability of conventional calibration equations to core-scanning XRF data is further limited by the constraints posed by unknown measurement geometry and specimen homogeneity, as well as matrix effects. Log-ratio based calibration schemes have been developed and applied to clastic sedimentary sequences, focusing mainly on energy dispersive XRF (ED-XRF) core-scanning. This study applied high-resolution core-scanning XRF to Holocene sedimentary sequences from the tide-dominated Indian Sundarbans (Ganges-Brahmaputra delta plain). The Log-Ratio Calibration Equation (LRCE) was applied to a subset of core-scan and conventional ED-XRF data to quantify elemental composition, providing a robust calibration scheme using reduced major axis regression of log-ratio transformed geochemical data. Through partial least squares (PLS) modelling of geochemical and grain-size data, it is possible to derive robust proxy information for the Sundarbans depositional environment. The application of these techniques to Holocene sedimentary data offers an improved methodological framework for unravelling Holocene sedimentation patterns.
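A sketch of the reduced major axis fit at the heart of this kind of log-ratio calibration (a hypothetical helper in plain Python; in practice x and y would be log-ratio transformed scanner intensities and reference concentrations):

```python
import math
import statistics

def rma_fit(x, y):
    """Reduced major axis (geometric mean) regression: slope = sign(cov) * sy/sx."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sx, sy = statistics.stdev(x), statistics.stdev(y)
    # Sample covariance determines only the sign of the slope.
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    slope = math.copysign(sy / sx, cov)
    intercept = my - slope * mx
    return slope, intercept
```

Reduced major axis regression is preferred here because both variables carry measurement error; ordinary least squares, which treats x as error-free, would bias the calibration slope.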
Abstract:
Modeling of on-body propagation channels is of paramount importance to those wishing to evaluate radio channel performance for wearable devices in body area networks (BANs). Difficulties in modeling arise due to the highly variable channel conditions related to changes in the user's state and local environment. This study characterizes these influences by using time-series analysis to examine and model signal characteristics for on-body radio channels in user stationary and mobile scenarios in four different locations: anechoic chamber, open office area, hallway, and outdoor environment. Autocorrelation and cross-correlation functions are reported and shown to be dependent on body state and surroundings. Autoregressive (AR) transfer functions are used to perform time-series analysis and develop models for fading in various on-body links. Due to the non-Gaussian nature of the logarithmically transformed observed signal envelope in the majority of mobile user states, a simple method for reproducing the fading based on lognormal and Nakagami statistics is proposed. The validity of the AR models is evaluated using hypothesis testing based on the Ljung-Box statistic, and the estimated distributional parameters of the simulator output are compared with those from experimental results.
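The AR modelling step can be sketched for the simplest AR(1) case (illustrative code, not the authors' implementation): the Yule-Walker equations reduce to estimating the model coefficient by the lag-1 autocorrelation of the series.

```python
def autocorrelation(x, lag):
    """Sample autocorrelation of series x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag)) / n
    return cov / var

def fit_ar1(x):
    """Yule-Walker estimate for AR(1): x[t] = phi * x[t-1] + noise."""
    return autocorrelation(x, 1)
```

Higher-order AR transfer functions generalise this to a small linear system in the autocorrelations at several lags; model adequacy is then checked, as in the study, by a portmanteau test such as Ljung-Box on the residuals.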
Abstract:
Wavelet transforms provide basis functions for time-frequency analysis and have properties that are particularly useful for the compression of analogue point-on-wave transient and disturbance power system signals. This paper evaluates the compression properties of the discrete wavelet transform using actual power system data. The results presented in the paper indicate that reduction ratios of up to 10:1 with acceptable distortion are achievable. The paper discusses the application of the reduction method for expedient fault analysis and protection assessment.
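A single-level Haar transform is enough to show the compression idea (a sketch under simplifying assumptions: one decomposition level, even-length signal, hard thresholding of detail coefficients; not the paper's actual wavelet or threshold choice):

```python
def haar_dwt(signal):
    """One level of the Haar DWT: pairwise averages and differences."""
    half = len(signal) // 2
    approx = [(signal[2 * i] + signal[2 * i + 1]) / 2 for i in range(half)]
    detail = [(signal[2 * i] - signal[2 * i + 1]) / 2 for i in range(half)]
    return approx, detail

def compress(signal, threshold):
    """Zero out detail coefficients whose magnitude falls below the threshold."""
    approx, detail = haar_dwt(signal)
    return approx, [d if abs(d) > threshold else 0.0 for d in detail]

def reconstruct(approx, detail):
    """Inverse single-level Haar transform."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([a + d, a - d])
    return out
```

Slowly varying portions of a waveform produce near-zero details that compress away, while fast point-on-wave transients concentrate their energy in a few large coefficients that survive thresholding, which is what makes high reduction ratios achievable with acceptable distortion.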
Abstract:
The purpose of this study was to explore the care processes experienced by community-dwelling adults dying from advanced heart failure, their family caregivers, and their health-care providers. A descriptive qualitative design was used to guide data collection, analysis, and interpretation. The sample comprised 8 patients, 10 informal caregivers, 11 nurses, 3 physicians, and 3 pharmacists. Data analysis revealed that palliative care was influenced by unique contextual factors (i.e., cancer model of palliative care, limited access to resources, prognostication challenges). Patients described choosing interventions and living with fatigue, pain, shortness of breath, and functional decline. Family caregivers described surviving caregiver burden and drawing on their faith. Health professionals described their role as trying to coordinate care, building expertise, managing medications, and optimizing interprofessional collaboration. Participants strove towards 3 outcomes: effective symptom management, satisfaction with care, and a peaceful death. © McGill University School of Nursing.
Abstract:
The problem of model selection for a univariate long memory time series is investigated once a semiparametric estimator for the long memory parameter has been used. Standard information criteria are not consistent in this case. A Modified Information Criterion (MIC) that overcomes these difficulties is introduced, and proofs of its asymptotic validity are provided. The results are general and cover a wide range of short memory processes. Simulation evidence compares the new and existing methodologies, and empirical applications to monthly inflation and daily realized volatility are presented.
Abstract:
Identifying differential expression of genes in psoriatic and healthy skin by microarray data analysis is a key approach to understanding the pathogenesis of psoriasis. Analysing more than one dataset to identify commonly upregulated genes reduces the likelihood of false positives and narrows down the possible signature genes. Genes controlling the critical balance between T helper 17 and regulatory T cells are of special interest in psoriasis. Our objective was to identify genes that are consistently upregulated in lesional skin across three published microarray datasets. We reanalysed gene expression data extracted from three experiments on samples from psoriatic and nonlesional skin using the same stringency threshold and software, and further compared the expression levels of 92 genes related to the T helper 17 and regulatory T cell signaling pathways. We found 73 probe sets representing 57 genes commonly upregulated in lesional skin in all datasets. These included 26 probe sets representing 20 genes with no previous link to the etiopathogenesis of psoriasis. These genes may represent novel therapeutic targets but need more rigorous experimental testing to be validated. Our analysis also identified 12 of the 92 genes known to be related to the T helper 17 and regulatory T cell signaling pathways, and these were found to be differentially expressed in the lesional skin samples.
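The cross-dataset filtering step amounts to a set intersection over per-dataset calls; a toy sketch (the dataset labels and gene lists here are placeholders, not the study's actual results):

```python
# Upregulated calls from three hypothetical reanalysed datasets.
upregulated = {
    "dataset_1": {"IL17A", "DEFB4A", "S100A7", "KRT16"},
    "dataset_2": {"IL17A", "DEFB4A", "S100A7", "CCL20"},
    "dataset_3": {"IL17A", "DEFB4A", "S100A7", "KRT6A"},
}
# Only genes called upregulated in every dataset survive the filter.
common = set.intersection(*upregulated.values())
```

Requiring agreement across all datasets is what reduces false positives at the cost of missing genes that fail the stringency threshold in any single experiment.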
Abstract:
Wavelet transforms provide basis functions for time-frequency analysis and have properties that are particularly useful for the compression of analogue point-on-wave transient and disturbance power system signals. This paper evaluates the reduction properties of the wavelet transform using real power system data and discusses the application of the reduction method for information transfer in network communications.
Abstract:
Recent technological advances have increased the quantity of movement data being recorded. While valuable knowledge can be gained by analysing such data, its sheer volume creates challenges. Geovisual analytics, which helps the human cognition process by using tools to reason about data, offers powerful techniques to resolve these challenges. This paper introduces such a geovisual analytics environment for exploring movement trajectories, which provides visualisation interfaces, based on the classic space-time cube. Additionally, a new approach, using the mathematical description of motion within a space-time cube, is used to determine the similarity of trajectories and forms the basis for clustering them. These techniques were used to analyse pedestrian movement. The results reveal interesting and useful spatiotemporal patterns and clusters of pedestrians exhibiting similar behaviour.
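One simple way to score trajectory similarity in a space-time cube, assuming time-aligned sampling (a simplification of the motion-based measure the paper describes; the names are ours):

```python
import math

def space_time_distance(traj_a, traj_b):
    """Mean pointwise distance between two time-aligned (x, y, t) trajectories."""
    assert len(traj_a) == len(traj_b), "trajectories must share sample times"
    return sum(math.dist(p, q) for p, q in zip(traj_a, traj_b)) / len(traj_a)
```

A pairwise distance matrix built this way can feed any standard clustering algorithm to group pedestrians exhibiting similar movement behaviour.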
Abstract:
Retrospective clinical datasets are often characterized by a relatively small sample size and many missing data. In this case, a common way of handling the missingness consists in discarding from the analysis patients with missing covariates, further reducing the sample size. Alternatively, if the mechanism that generated the missingness allows, incomplete data can be imputed on the basis of the observed data, avoiding the reduction of the sample size and allowing complete-data methods to be applied later on. Moreover, methodologies for data imputation might depend on the particular purpose and might achieve better results by considering specific characteristics of the domain. The problem of missing data treatment is studied in the context of survival tree analysis for the estimation of a prognostic patient stratification. Survival tree methods usually address this problem by using surrogate splits, that is, splitting rules that use other variables yielding similar results to the original ones. Instead, our methodology consists in modeling the dependencies among the clinical variables with a Bayesian network, which is then used to perform data imputation, thus allowing the survival tree to be applied on the completed dataset. The Bayesian network is learned directly from the incomplete data using a structural expectation-maximization (EM) procedure in which the maximization step is performed with an exact anytime method, so that the only source of approximation is the EM formulation itself. On both simulated and real data, our proposed methodology usually outperformed several existing methods for data imputation, and the imputation so obtained improved the stratification estimated by the survival tree (especially with respect to using surrogate splits).
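As a much-simplified contrast to discarding incomplete patients, the toy sketch below fills a missing covariate with its conditional mean given one observed variable; the Bayesian-network approach in the study generalises this to the full joint dependency structure among the clinical variables (field names here are invented):

```python
import statistics

def impute_conditional_mean(records, target, group):
    """Fill missing `target` values with the observed mean within each `group` stratum."""
    observed = {}
    for r in records:
        if r[target] is not None:
            observed.setdefault(r[group], []).append(r[target])
    means = {g: statistics.mean(vals) for g, vals in observed.items()}
    completed = []
    for r in records:
        r = dict(r)  # leave the input records untouched
        if r[target] is None:
            r[target] = means[r[group]]
        completed.append(r)
    return completed
```

Even this crude scheme keeps every patient in the analysis; imputing from a learned joint model instead of a single stratum mean is what lets the survival tree exploit the full sample without resorting to surrogate splits.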
Abstract:
The predominant fear in capital markets is that of a price spike. Commodity markets differ in that there is a fear of both upward and downward jumps, which results in implied volatility curves displaying distinct shapes when compared to equity markets. A novel functional data analysis (FDA) approach provides a framework to produce and interpret functional objects that characterise the underlying dynamics of oil future options. We use the FDA framework to examine implied volatility, jump risk, and pricing dynamics within crude oil markets. Examining a WTI crude oil sample for the 2007–2013 period, which includes the global financial crisis and the Arab Spring, we find strong evidence of converse jump dynamics during periods of demand- and supply-side weakness. This is used as a basis for an FDA-derived Merton (1976) jump diffusion optimised delta hedging strategy, which exhibits superior portfolio management results over traditional methods.
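For orientation, the baseline delta such a strategy adjusts is the Black-Scholes call delta (a sketch; the paper's hedge modifies this with FDA-derived Merton (1976) jump-diffusion parameters, which are not reproduced here):

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call_delta(spot, strike, rate, sigma, maturity):
    """Black-Scholes call delta N(d1) under pure diffusion (no jumps)."""
    d1 = (math.log(spot / strike) + (rate + 0.5 * sigma ** 2) * maturity) / (
        sigma * math.sqrt(maturity)
    )
    return norm_cdf(d1)
```

Under a jump-diffusion model the hedge ratio becomes a jump-probability-weighted mixture of such deltas, which is where the FDA-estimated jump dynamics enter.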
Abstract:
Despite fractured hard rock aquifers underlying over 65% of Ireland, knowledge of the key processes controlling groundwater recharge in these bedrock systems is inadequately constrained. In this study, we examined 19 groundwater-level hydrographs from two Irish hillslope sites underlain by hard rock aquifers. Water-level time series in clustered monitoring wells completed in the subsoil, the soil/bedrock interface, and the shallow and deep bedrock were monitored hourly over two hydrological years. Correlation methods were applied to investigate groundwater-level response to rainfall, as well as its seasonal variations. The results reveal that direct groundwater recharge to the shallow and deep bedrock on the hillslope is very limited; water-level variations within these geological units are likely dominated by slow flow through rock matrix storage. Rapid responses to rainfall (⩽2 h) with little seasonal variation were observed in the monitoring wells installed at the subsoil and soil/bedrock interface, as well as in those in the shallow or deep bedrock at the base of the hillslope, suggesting that direct recharge takes place within these units. An automated time-series procedure using the water-table fluctuation method was developed to estimate groundwater recharge from the water-level and rainfall data. Results show annual recharge rates of 42–197 mm/yr in the subsoil and soil/bedrock interface, which represent 4–19% of the annual rainfall. Statistical analysis of the relationship between rainfall intensity and water-table rise reveals that the low rainfall intensity group (⩽1 mm/h) has a greater impact on the groundwater recharge rate than the other groups (>1 mm/h). This study shows that the combination of time-series analysis and the water-table fluctuation method can be a useful approach for investigating groundwater recharge in fractured hard rock aquifers in Ireland.
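The water-table fluctuation method estimates recharge as the specific yield times the cumulative water-table rise; a minimal sketch (the function name and the constant specific yield are illustrative, and the study's automated procedure additionally uses the rainfall record to attribute rises to recharge events):

```python
def wtf_recharge(levels, specific_yield):
    """Recharge ~= Sy * sum of water-table rises between successive readings."""
    rises = [b - a for a, b in zip(levels, levels[1:]) if b > a]
    return specific_yield * sum(rises)
```

With hourly levels, summing only the rising limbs approximates the total rise that would have occurred in the absence of drainage, which the method scales by the specific yield to obtain a recharge depth.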
Abstract:
A first stage collision database is assembled which contains electron-impact excitation, ionization, and recombination rate coefficients for B, B+, B2+, B3+, and B4+. The first stage database is constructed using the R-matrix with pseudostates, time-dependent close-coupling, and perturbative distorted-wave methods. A second stage collision database is then assembled which contains generalized collisional-radiative ionization, recombination, and power loss rate coefficients as a function of both temperature and density. The second stage database is constructed by solution of the collisional-radiative equations in the quasi-static equilibrium approximation using the first stage database. Both collision database stages reside in electronic form at the IAEA Labeled Atomic Data Interface (ALADDIN) database and the Atomic Data Analysis Structure (ADAS) open database.