948 results for Cryptography Statistical methods
Abstract:
Background: Detection rates for adenoma and early colorectal cancer (CRC) are unsatisfactory because of low compliance with invasive screening procedures such as colonoscopy. There is a large unmet screening need calling for an accurate, non-invasive and cost-effective test to screen for early neoplastic and pre-neoplastic lesions. Our goal is to identify effective biomarker combinations to develop a screening test aimed at detecting precancerous lesions and early CRC stages, based on a multigene assay performed on peripheral blood mononuclear cells (PBMC). Methods: A pilot study was conducted on 92 subjects. Colonoscopy revealed 21 CRC, 30 adenomas larger than 1 cm and 41 healthy controls. A panel of 103 biomarkers was selected by two approaches: a candidate gene approach based on literature review and whole transcriptome analysis of a subset of this cohort by Illumina TAG profiling. Blood samples were taken from each patient and PBMC were purified. Total RNA was extracted and the 103 biomarkers were tested by multiplex RT-qPCR on the cohort. Different univariate and multivariate statistical methods were applied to the PCR data, and 60 biomarkers with a significant p-value (< 0.01) for most of the methods were selected. Results: The 60 biomarkers are involved in several different biological functions, such as cell adhesion, cell motility, cell signaling, cell proliferation, development and cancer. Two distinct molecular signatures derived from the biomarker combinations were established based on penalized logistic regression to separate patients without lesion from those with CRC or adenoma. These signatures were validated using a bootstrap method, leading to a separation of patients without lesion from those with CRC (Se 67%, Sp 93%, AUC 0.87) and from those with adenoma larger than 1 cm (Se 63%, Sp 83%, AUC 0.77).
In addition, the organ and disease specificity of these signatures was confirmed by means of patients with other cancer types and inflammatory bowel diseases. Conclusions: The two defined biomarker combinations effectively detect the presence of CRC and adenomas larger than 1 cm with high sensitivity and specificity. A prospective, multicentric, pivotal study is underway in order to validate these results in a larger cohort.
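The performance figures quoted above (Se, Sp, AUC) can be illustrated with a minimal Python sketch. It only shows how such metrics are computed from signature scores; the labels, scores, and threshold are hypothetical, and the penalized regression and bootstrap validation of the study are not reproduced.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Return (sensitivity, specificity) when score >= threshold is called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, scores):
    """AUC as the probability that a random case outscores a random control (ties count 0.5)."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if c > k else 0.5 if c == k else 0.0 for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


# Hypothetical signature scores: 1 = lesion, 0 = healthy control
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
se, sp = sensitivity_specificity(labels, scores, threshold=0.5)
area = auc(labels, scores)
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarizes performance over all thresholds.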
Abstract:
At present it is difficult to discuss statistical procedures for quantitative data analysis without referring to computing applied to research. These computing resources often rely on software packages designed to help the researcher during the data analysis phase. One of the most refined and complete packages currently available is SPSS (Statistical Package for the Social Sciences). SPSS is a suite of programs for carrying out statistical analysis of data. It is a very powerful statistical application, of which several versions have been developed since its beginnings in the 1970s. The computer output shown in this guide corresponds to version 11.0.1. Although its appearance has changed since the early versions, it works in much the same way across versions. Before starting to use the SPSS applications, it is important to become familiar with some of the windows we will use most. The first thing we see on opening SPSS is the data editor. This window basically displays the data as we enter them. The data editor offers two options: the data view and the variable view. These can be selected using the two tabs at the bottom. The data view contains the general menu and the data matrix. This matrix is structured with cases in rows and variables in columns.
Abstract:
The present study explores the statistical properties of a randomization test based on the random assignment of the intervention point in a two-phase (AB) single-case design. The focus is on randomization distributions constructed with the values of the test statistic for all possible random assignments and used to obtain p-values. The shape of those distributions is investigated for each specific data division defined by the moment at which the intervention is introduced. Another aim of the study was to test the detection of nonexistent effects (i.e., the production of false alarms) in autocorrelated data series, in which the assumption of exchangeability between observations may be untenable. In this way, it was possible to compare nominal and empirical Type I error rates in order to obtain evidence on the statistical validity of the randomization test for each individual data division. The results suggest that when either of the two phases has considerably fewer measurement times, Type I errors may be too probable and, hence, the decision-making process carried out by applied researchers may be jeopardized.
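The randomization distribution described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study's code: the test statistic (difference between phase means), the minimum phase length of three, and the data series are all hypothetical choices.

```python
def mean(xs):
    return sum(xs) / len(xs)


def ab_randomization_test(series, intervention_index, min_phase_length=3):
    """One-sided p-value for mean(B) - mean(A) over all admissible intervention points."""
    n = len(series)
    admissible = range(min_phase_length, n - min_phase_length + 1)
    if intervention_index not in admissible:
        raise ValueError("intervention point violates the minimum phase length")
    # Test statistic for every possible assignment of the intervention point
    distribution = [mean(series[k:]) - mean(series[:k]) for k in admissible]
    observed = mean(series[intervention_index:]) - mean(series[:intervention_index])
    p_value = sum(1 for t in distribution if t >= observed) / len(distribution)
    return observed, p_value


series = [2, 3, 2, 3, 2, 7, 8, 7, 8, 7]  # hypothetical measurements
observed, p = ab_randomization_test(series, intervention_index=5)
```

With only five admissible intervention points, the smallest attainable p-value here is 1/5, so short series limit how much evidence the test can provide.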
Abstract:
Interdependence is the main feature of dyadic relationships and, in recent years, various statistical procedures have been proposed for quantifying and testing this social attribute in different dyadic designs. The purpose of this paper is to develop several functions for these statistical tests in an R package, named nonindependence, for use by applied social researchers. A Graphical User Interface (GUI) is also developed to facilitate the use of the functions included in this package. Examples drawn from psychological research and simulated data are used to illustrate how the software works.
Abstract:
The present work focuses on the skew-symmetry index as a measure of social reciprocity. This index is based on the correspondence between the amount of behaviour that each individual addresses to its partners and what it receives from them in return. Although the skew-symmetry index enables researchers to describe social groups, statistical inferential tests are required. The main aim of the present study is to propose an overall statistical technique for testing symmetry in experimental conditions, calculating the skew-symmetry statistic (Φ) at group level. Sampling distributions for the skew-symmetry statistic have been estimated by means of a Monte Carlo simulation in order to allow researchers to make statistical decisions. Furthermore, this study will allow researchers to choose the optimal experimental conditions for carrying out their research, as the power of the statistical test has been estimated. This statistical test could be used in experimental social psychology studies in which researchers may control the group size and the number of interactions within dyads.
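A minimal Python sketch of the idea follows. The abstract does not state the formula, so the code assumes a definition often used for this index, Φ = tr(KᵀK) / tr(XᵀX) with K = (X − Xᵀ)/2 the skew-symmetric part of the interaction matrix X. The null model (each interaction in a dyad goes either way with probability 0.5), the matrix, and all function names are likewise assumptions made for illustration.

```python
import random


def skew_symmetry_index(x):
    """Phi = sum(k_ij^2) / sum(x_ij^2), with K = (X - X') / 2 the skew-symmetric part of X."""
    n = len(x)
    total = sum(x[i][j] ** 2 for i in range(n) for j in range(n))
    skew = sum(((x[i][j] - x[j][i]) / 2) ** 2 for i in range(n) for j in range(n))
    return skew / total


def simulate_null_phi(n_individuals, interactions_per_dyad, n_samples, seed=0):
    """Monte Carlo distribution of Phi when each interaction goes either way with p = 0.5."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_samples):
        x = [[0] * n_individuals for _ in range(n_individuals)]
        for i in range(n_individuals):
            for j in range(i + 1, n_individuals):
                # Split the dyad's interactions between i -> j and j -> i
                x[i][j] = sum(rng.random() < 0.5 for _ in range(interactions_per_dyad))
                x[j][i] = interactions_per_dyad - x[i][j]
        values.append(skew_symmetry_index(x))
    return values


# Hypothetical 3-person group with 10 interactions per dyad
observed = skew_symmetry_index([[0, 9, 8], [1, 0, 7], [2, 3, 0]])
null = simulate_null_phi(n_individuals=3, interactions_per_dyad=10, n_samples=2000)
p_value = sum(v >= observed for v in null) / len(null)
```

Under this definition Φ is 0 for a perfectly symmetric matrix and 0.5 when all behaviour within every dyad flows in one direction.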
Abstract:
The present work deals with quantifying group characteristics. Specifically, dyadic measures of interpersonal perceptions were used to forecast group performance. Forty-six groups of students, 24 of four people and 22 of five, were studied in a real educational assignment context, and marks were gathered as an indicator of group performance. Our results show that dyadic measures of interpersonal perceptions account for final marks. By means of linear regression analysis, 85% and 85.6% of group performance were explained for group sizes of four and five, respectively. By contrast, values reported in the scientific literature for the individualistic approach do not exceed 18%. The results of the present study support the utility of dyadic approaches for predicting group performance in social contexts.
Abstract:
Workgroup diversity can be conceptualized as variety, separation, or disparity. Thus, the proper operationalization of diversity depends on how a diversity dimension has been defined. Analytically, minimal diversity must be obtained when there are no differences on an attribute among the members of a group; however, maximal diversity has a different shape for each conceptualization of diversity. Previous work on diversity indexes indicated maximum values for variety (e.g., Blau's index and Teachman's index), separation (e.g., standard deviation and mean Euclidean distance), and disparity (e.g., coefficient of variation and the Gini coefficient of concentration), although these maximum values are not valid for all group characteristics (i.e., group size and group size parity) and attribute scales (i.e., number of categories). We demonstrate analytically appropriate upper boundaries for conditional diversity determined by some specific group characteristics, avoiding the bias related to absolute diversity. This will allow applied researchers to make better interpretations regarding the relationship between group diversity and group outcomes.
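The gap between absolute and conditional maxima can be seen in a small Python sketch for Blau's index. This illustrates the general idea only and is not the paper's derivation; the function names and the example group are hypothetical.

```python
from collections import Counter


def blau_index(categories):
    """Blau's index: 1 minus the sum of squared category proportions."""
    n = len(categories)
    return 1 - sum((count / n) ** 2 for count in Counter(categories).values())


def max_blau(group_size, n_categories):
    """Largest Blau value attainable: members spread as evenly as possible."""
    base, extra = divmod(group_size, n_categories)
    counts = [base + 1] * extra + [base] * (n_categories - extra)
    return 1 - sum((c / group_size) ** 2 for c in counts)


absolute_max = 1 - 1 / 3          # (K - 1) / K for K = 3 categories
conditional_max = max_blau(4, 3)  # four members cannot split evenly into three
```

For a group of four and three categories the attainable maximum is 0.625, below the absolute maximum of about 0.667, so rescaling by the absolute maximum would understate how diverse such a group is.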
Abstract:
This paper briefly reviews the problem of signal estimation in time series of data obtained from ERP recordings. It focuses on the lowest-frequency components, such as the CNV. As an alternative, it proposes using the smoothing techniques of Exploratory Data Analysis (EDA) to improve the estimate obtained, compared with simple averaging across trials.
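As a rough illustration of the contrast drawn in this abstract, the Python sketch below averages a few trials and then applies running medians of three, one of the simplest EDA smoothers. The abstract does not say which smoothers were used, so this choice and the data are assumptions.

```python
from statistics import median


def average_trials(trials):
    """Point-by-point mean across trials of equal length."""
    return [sum(samples) / len(samples) for samples in zip(*trials)]


def running_median_3(series):
    """Running medians of three; the two end values are copied unchanged."""
    if len(series) < 3:
        return list(series)
    middle = [median(series[i - 1:i + 2]) for i in range(1, len(series) - 1)]
    return [series[0]] + middle + [series[-1]]


trials = [
    [1, 2, 11, 2, 2, 1],  # hypothetical trial with an artifact spike at index 2
    [1, 2, 2, 2, 2, 1],
    [1, 2, 2, 2, 2, 1],
]
averaged = average_trials(trials)       # the spike survives averaging as a bump
smoothed = running_median_3(averaged)   # the median smoother removes the bump
```

With few trials, a single outlier still distorts the simple average, while the median-based smoother is resistant to it.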
Abstract:
When a social survey is carried out over a large territory, there is always a desire to apply analyses similar to those performed on the survey to smaller populations or territories, naturally using the survey's own data. The aim of this article is to show how each stratum of a stratified sample can serve as a sampling base for carrying out such analyses with full guarantees of precision or, at least, with calculable and acceptable guarantees, without increasing the sample size of the general survey.
Abstract:
It is estimated that around 230 people die each year due to radon (222Rn) exposure in Switzerland. 222Rn occurs mainly in closed environments like buildings and originates primarily from the subjacent ground. It therefore depends strongly on geology and shows substantial regional variations. Correct identification of these regional variations would lead to a substantial reduction of the population's 222Rn exposure through appropriate construction of new buildings and mitigation of existing ones. Prediction of indoor 222Rn concentrations (IRC) and identification of 222Rn-prone areas are, however, difficult, since IRC depend on a variety of variables such as building characteristics, meteorology, geology and anthropogenic factors. The present work aims at the development of predictive models and the understanding of IRC in Switzerland, taking into account a maximum of information in order to minimize the prediction uncertainty. The predictive maps will be used as a decision-support tool for 222Rn risk management. The construction of these models is based on different data-driven statistical methods, in combination with geographical information systems (GIS). In a first phase we performed univariate analysis of IRC for different variables, namely the detector type, building category, foundation, year of construction, the average outdoor temperature during measurement, altitude and lithology. All variables showed significant associations with IRC. Buildings constructed after 1900 showed significantly lower IRC compared to earlier constructions. We observed a further drop of IRC after 1970. In addition, we found an association of IRC with altitude. With regard to lithology, we observed the lowest IRC in sedimentary rocks (excluding carbonates) and sediments and the highest IRC in the Jura carbonates and igneous rock. The IRC data was systematically analyzed for potential bias due to spatially unbalanced sampling of measurements.
In order to facilitate the modeling and the interpretation of the influence of geology on IRC, we developed an algorithm based on k-medoids clustering which makes it possible to define coherent geological classes in terms of IRC. We performed a soil gas 222Rn concentration (SRC) measurement campaign in order to determine the predictive power of SRC with respect to IRC. We found that the use of SRC is limited for IRC prediction. The second part of the project was dedicated to predictive mapping of IRC using models which take into account the multidimensionality of the process of 222Rn entry into buildings. We used kernel regression and ensemble regression trees for this purpose. We could explain up to 33% of the variance of the log-transformed IRC all over Switzerland. This is a good performance compared to former attempts at IRC modeling in Switzerland. As predictor variables we considered geographical coordinates, altitude, outdoor temperature, building type, foundation, year of construction and detector type. Ensemble regression trees like random forests make it possible to determine the role of each IRC predictor in a multidimensional setting. We found spatial information like geology, altitude and coordinates to have stronger influences on IRC than building-related variables like foundation type, building type and year of construction. Based on kernel estimation we developed an approach to determine the local probability of IRC exceeding 300 Bq/m³. In addition, we developed a confidence index in order to provide an estimate of the uncertainty of the map. All methods allow an easy creation of tailor-made maps for different building characteristics. Our work is an essential step towards a 222Rn risk assessment which accounts at the same time for different architectural situations as well as geological and geographical conditions. For the communication of 222Rn hazard to the population we recommend using the probability map based on kernel estimation.
The communication of 222Rn hazard could, for example, be implemented via a web interface where users specify the characteristics and coordinates of their home in order to obtain the probability of exceeding a given IRC, together with a corresponding index of confidence. Taking into account the health effects of 222Rn, our results have the potential to substantially improve the estimation of the effective dose from 222Rn delivered to the Swiss population.
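A kernel-based exceedance probability of the kind mentioned in this abstract can be sketched as a kernel-weighted share of nearby measurements above the threshold. This Python sketch is only one plausible reading: the Gaussian kernel, the bandwidth, the purely spatial predictors, and the measurements are assumptions, and the confidence index of the original work is not shown.

```python
import math


def exceedance_probability(x, y, measurements, threshold=300.0, bandwidth=1.0):
    """Kernel-weighted share of neighbouring measurements above the threshold.

    measurements: list of (x, y, concentration) tuples; coordinates and bandwidth
    share the same (hypothetical) unit, e.g. km.
    """
    weighted_exceedances = 0.0
    total_weight = 0.0
    for mx, my, concentration in measurements:
        squared_distance = (mx - x) ** 2 + (my - y) ** 2
        weight = math.exp(-squared_distance / (2 * bandwidth ** 2))  # Gaussian kernel
        total_weight += weight
        if concentration > threshold:
            weighted_exceedances += weight
    return weighted_exceedances / total_weight


# Hypothetical indoor measurements in Bq/m3
measurements = [(0.0, 0.0, 450.0), (0.5, 0.0, 120.0), (5.0, 5.0, 80.0)]
p_near = exceedance_probability(0.0, 0.0, measurements)
p_far = exceedance_probability(5.0, 5.0, measurements)
```

The bandwidth controls how local the estimate is; a larger value pulls in more distant measurements and smooths the resulting map.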
Abstract:
Sickness absence (SA) is an important social, economic and public health issue. Identifying and understanding the determinants, whether biological, regulatory, or health services-related, of variability in SA duration is essential for better management of SA. The conditional frailty model (CFM) is useful when repeated SA events occur within the same individual, as it allows simultaneous analysis of event dependence and heterogeneity due to unknown, unmeasured, or unmeasurable factors. However, its use may encounter computational limitations when applied to very large data sets, as may frequently occur in the analysis of SA duration. To overcome the computational issue, we propose a Poisson-based conditional frailty model (CFPM) for repeated SA events that accounts for both event dependence and heterogeneity. To demonstrate the usefulness of the proposed model in the SA duration context, we used data from all non-work-related SA episodes that occurred in Catalonia (Spain) in 2007, initiated by a diagnosis of either neoplasm or mental and behavioral disorders. As expected, the CFPM results were very similar to those of the CFM for both diagnosis groups. The CPU time for the CFPM was substantially shorter than that for the CFM. The CFPM is a suitable alternative to the CFM in survival analysis with recurrent events, especially with large databases.
Abstract:
Highway agencies spend millions of dollars to ensure safe and efficient winter travel. However, the effectiveness of winter-weather maintenance practices on safety and mobility is somewhat difficult to quantify. Safety and Mobility Impacts of Winter Weather - Phase 1 investigated opportunities for improving traffic safety on state-maintained roads in Iowa during winter-weather conditions. In Phase 2, three Iowa Department of Transportation (DOT) high-priority sites were evaluated and realistic maintenance and operations mitigation strategies were also identified. In this project, site prioritization techniques for identifying roadway segments with the potential for safety improvements related to winter-weather crashes were developed through traditional naïve statistical methods, using raw crash data for seven winter seasons and previously developed metrics. Additionally, crash frequency models were developed using integrated crash data for four winter seasons, with the objective of identifying factors that affect crash frequency during winter seasons and screening roadway segments using the empirical Bayes technique. Based on these prioritization techniques, 11 sites were identified and analyzed in conjunction with input from Iowa DOT district maintenance managers and snowplow operators and the Iowa DOT Road Weather Information System (RWIS) coordinator.
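Empirical Bayes screening of the kind mentioned above is commonly computed by blending each segment's observed crash count with the count predicted by a crash frequency model. The Python sketch below uses that common formulation, with weight 1 / (1 + k × predicted) for a negative binomial overdispersion parameter k. The segments, counts, and k are hypothetical, and the report's exact formulation may differ.

```python
def empirical_bayes_estimate(observed, predicted, overdispersion):
    """Blend observed and model-predicted crash counts for one segment."""
    weight = 1.0 / (1.0 + overdispersion * predicted)  # weight on the model prediction
    return weight * predicted + (1.0 - weight) * observed


# Hypothetical segments: (name, observed crashes, predicted crashes)
segments = [("A", 12, 4.0), ("B", 5, 4.5), ("C", 9, 8.0)]
k = 0.5  # assumed negative binomial overdispersion parameter

# Rank segments by how far the EB estimate exceeds the model prediction
ranked = sorted(
    segments,
    key=lambda s: empirical_bayes_estimate(s[1], s[2], k) - s[2],
    reverse=True,
)
```

Ranking by the excess over the prediction, rather than by raw counts, reduces the influence of random fluctuation in short crash histories.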
Abstract:
The aim of this work is to build an application for online multivariate statistical control of an SBR plant. The tool must allow a complete multivariate statistical analysis of the batch in process, of the last completed batch, and of the other batches processed at the plant. The application is to be built in the LabVIEW environment. The choice of this software is driven by the update of the plant's monitoring module, which is being developed in the same environment.
Abstract:
In the context of observed climate change impacts and their effect on agriculture and crop production, this study intends to assess the vulnerability of rural livelihoods through a case study in Karnataka, India. The social approach to climate change vulnerability in this case study includes defining and exploring factors that determine farmers' vulnerability in four villages. Key informant interviews, farmer workshops and structured household interviews were used for data collection. To analyse the data, we adapted and applied three vulnerability indices: Livelihood Vulnerability Index (LVI), LVI-IPCC and the Livelihood Effect Index (LEI), and used descriptive statistical methods. The data was analysed at two scales: whole-sample level and household level. The results from applying the indices at the whole-sample level show that this community's vulnerability to climate change is moderate, whereas the household-level results show that the vulnerability of most households is high to very high. In addition, 15 key drivers of vulnerability were identified. Results and limitations of the study are discussed under the rural livelihoods framework, on which the indices are based, allowing a better understanding of the social behavioural trends, as well as a holistic and integrated view of the climate change, agriculture, and livelihoods processes shaping vulnerability. We conclude that these indices, although a straightforward method to assess vulnerability, have limitations that could account for inaccuracies and an inability to be standardised for benchmarking; we therefore stress the need for further research.
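An index of this family is usually built by min-max standardizing each sub-component, averaging within each major component, and then weighting the major components by their number of sub-components. The Python sketch below follows that commonly used formulation. The component names and values are hypothetical, and the adapted indices in the study may differ.

```python
def standardize(value, minimum, maximum):
    """Min-max standardization of one sub-component to the 0-1 range."""
    return (value - minimum) / (maximum - minimum)


def livelihood_vulnerability_index(components):
    """components maps a major component name to its standardized sub-component values."""
    component_means = {name: sum(v) / len(v) for name, v in components.items()}
    total_subcomponents = sum(len(v) for v in components.values())
    # Each major component is weighted by its number of sub-components
    weighted_sum = sum(component_means[name] * len(v) for name, v in components.items())
    return weighted_sum / total_subcomponents


components = {  # hypothetical standardized values for one household
    "water": [standardize(40, 0, 100), 0.6],
    "health": [0.2, 0.3, 0.7],
}
lvi = livelihood_vulnerability_index(components)
```

Because every sub-component is scaled by the sample's own minimum and maximum, the resulting scores are relative to the surveyed group, which is one reason such indices are hard to benchmark across studies.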
Abstract:
A plant species' genetic population structure is the result of a complex combination of its life history, ecological preferences, position in the ecosystem and historical factors. As a result, many different statistical methods exist that measure different aspects of species' genetic structure. However, little is known about how these methods are interrelated and how they are related to a species' ecology and life history. In this study, we used the IntraBioDiv amplified fragment length polymorphisms data set from 27 high-alpine species to calculate eight genetic summary statistics that we jointly correlated with a set of six ecological and life-history traits. We found that there is a large amount of redundancy among the calculated summary statistics and that there is a significant association with the matrix of species traits. In a multivariate analysis, two main aspects of population structure were visible among the 27 species. The first aspect is related to the species' dispersal capacities and the second is most likely related to the species' postglacial recolonization of the Alps. Furthermore, we found that some summary statistics, most importantly Mantel's r and Jost's D, behave differently than expected from theory. We therefore advise caution against drawing overly strong conclusions from these statistics.