Digital Library

993 results for cosmologia, clustering, AP-test

Two way clustering of Microarray Data using a Hybrid Approach

Relevance: 30.00%

Abstract:

The microarray technique is powerful, as it allows up to thousands of genes to be tested at a time, but it produces an overwhelming set of data files containing huge amounts of data, which are difficult to pre-process, separate, classify and correlate so that interesting conclusions can be extracted. Modern machine learning, data mining and clustering techniques based on information theory are needed to read and interpret the information buried in those large data sets. The Independent Component Analysis method can be used to correct data affected by corruption processes, or to filter out the uncorrectable data, and clustering methods can then group similar genes or classify samples. In this paper a hybrid approach is used to obtain a two-way unsupervised clustering of corrected microarray data.

Novel fast random search clustering approach for mixing matrix identification in MIMO linear blind inverse problems with sparse inputs

Relevance: 30.00%

Abstract:

In this paper we propose a novel fast random search clustering (RSC) algorithm for mixing matrix identification in multiple input multiple output (MIMO) linear blind inverse problems with sparse inputs. The proposed approach is based on the clustering of the observations around the directions given by the columns of the mixing matrix, which typically occurs for sparse inputs. Exploiting this fact, the RSC algorithm proceeds by parameterizing the mixing matrix using hyperspherical coordinates, randomly selecting candidate basis vectors (i.e. clustering directions) from the observations, and accepting or rejecting them according to a binary hypothesis test based on the Neyman–Pearson criterion. The RSC algorithm is not tailored to any specific distribution for the sources, can deal with an arbitrary number of inputs and outputs (thus solving the difficult under-determined problem), and is applicable to both instantaneous and convolutive mixtures. Extensive simulations for synthetic and real data with different numbers of inputs and outputs, data sizes, sparsity factors of the inputs and signal to noise ratios confirm the good performance of the proposed approach under moderate/high signal to noise ratios. SUMMARY. Blind source separation method for sparse signals based on identification of the mixing matrix by means of random clustering techniques.

Using multiple landscape genetic approaches to test the validity of genetic clusters in a species characterized by an isolation-by-distance pattern

Relevance: 30.00%

Abstract:

Bayesian clustering methods are typically used to identify barriers to gene flow, but they are prone to deduce artificial subdivisions in a study population characterized by an isolation-by-distance pattern (IbD). Here we analysed the landscape genetic structure of a population of wild boars (Sus scrofa) from south-western Germany. Two clustering methods inferred the presence of the same genetic discontinuity. However, the population in question was characterized by a strong IbD pattern. While landscape-resistance modelling failed to identify landscape features that influenced wild boar movement, partial Mantel tests and multiple regression of distance matrices (MRDMs) suggested that the empirically inferred clusters were separated by a genuine barrier. When simulating random lines bisecting the study area, 60% of the unique barriers represented, according to partial Mantel tests and MRDMs, significant obstacles to gene flow. By contrast, the random-lines simulation showed that the boundaries of the inferred empirical clusters corresponded to the most important genetic discontinuity in the study area. Given the degree of habitat fragmentation separating the two empirical partitions, it is likely that the clustering programs correctly identified a barrier to gene flow. The differing results between the work published here and other studies suggest that it will be very difficult to draw general conclusions about habitat permeability in wild boar from individual studies.
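To illustrate the kind of matrix-correlation test this abstract refers to, here is a minimal sketch of a simple Mantel permutation test between a genetic and a geographic distance matrix, the usual way to check for isolation by distance. It is not the study's analysis: the partial Mantel tests and MRDMs named in the abstract additionally control for a third matrix, which this sketch omits. The function names (`upper_triangle`, `pearson`, `mantel_test`) and the permutation count are hypothetical choices made for illustration.

```python
import math
import random


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel_test(genetic, geographic, n_perm=999, seed=0):
    """Simple (non-partial) Mantel test.

    Returns the observed matrix correlation and a one-sided permutation
    p-value obtained by jointly permuting rows and columns of `genetic`.
    """
    rng = random.Random(seed)
    n = len(genetic)
    geo_vec = upper_triangle(geographic)
    r_obs = pearson(upper_triangle(genetic), geo_vec)
    order = list(range(n))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(order)
        permuted = [[genetic[order[i]][order[j]] for j in range(n)]
                    for i in range(n)]
        if pearson(upper_triangle(permuted), geo_vec) >= r_obs:
            hits += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return r_obs, (hits + 1) / (n_perm + 1)
```

A strong positive correlation with a small p-value is consistent with isolation by distance. Deciding whether a barrier adds structure beyond distance needs the partial version, which is why the abstract relies on partial Mantel tests and MRDMs.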

On a resampling approach for tests on the number of clusters with mixture model-based clustering of tissue samples

Relevance: 30.00%

Abstract:

We consider the problem of assessing the number of clusters in a limited number of tissue samples containing gene expressions for possibly several thousands of genes. It is proposed to use a normal mixture model-based approach to the clustering of the tissue samples. One advantage of this approach is that the question on the number of clusters in the data can be formulated in terms of a test on the smallest number of components in the mixture model compatible with the data. This test can be carried out on the basis of the likelihood ratio test statistic, using resampling to assess its null distribution. The effectiveness of this approach is demonstrated on simulated data and on some microarray datasets, as considered previously in the bioinformatics literature. (C) 2004 Elsevier Inc. All rights reserved.
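The resampling idea in this abstract can be sketched as a parametric bootstrap of the likelihood ratio statistic. The example below is only an illustration under strong simplifying assumptions: one-dimensional data, a test of one versus two normal components, and a basic EM fit written in plain Python. The function names, the variance floor and the number of bootstrap replicates are hypothetical and are not taken from the paper.

```python
import math
import random


def normal_logpdf(x, mu, var):
    """Log density of a normal distribution with mean mu and variance var."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)


def fit_one(data):
    """Maximum-likelihood fit of one normal; returns (loglik, mu, var)."""
    n = len(data)
    mu = sum(data) / n
    var = max(sum((x - mu) ** 2 for x in data) / n, 1e-6)
    return sum(normal_logpdf(x, mu, var) for x in data), mu, var


def loglik_two(data, iters=100, var_floor=1e-3):
    """Log-likelihood of a two-component normal mixture fitted by EM."""
    xs = sorted(data)
    half = len(xs) // 2
    _, mu1, v1 = fit_one(xs[:half])  # initialise from the lower half
    _, mu2, v2 = fit_one(xs[half:])  # and from the upper half
    v1, v2 = max(v1, var_floor), max(v2, var_floor)
    w, n, loglik = 0.5, len(data), -math.inf
    for _ in range(iters):
        # E-step in log space; also accumulates the current log-likelihood.
        resp, loglik = [], 0.0
        for x in data:
            a = math.log(w) + normal_logpdf(x, mu1, v1)
            b = math.log(1 - w) + normal_logpdf(x, mu2, v2)
            m = max(a, b)
            lse = m + math.log(math.exp(a - m) + math.exp(b - m))
            resp.append(math.exp(a - lse))
            loglik += lse
        n1 = sum(resp)
        n2 = n - n1
        if n1 < 1e-8 or n2 < 1e-8:
            break  # one component has collapsed; keep the last estimate
        # M-step: weighted means, floored variances, mixing weight.
        mu1 = sum(r * x for r, x in zip(resp, data)) / n1
        mu2 = sum((1 - r) * x for r, x in zip(resp, data)) / n2
        v1 = max(sum(r * (x - mu1) ** 2 for r, x in zip(resp, data)) / n1,
                 var_floor)
        v2 = max(sum((1 - r) * (x - mu2) ** 2 for r, x in zip(resp, data)) / n2,
                 var_floor)
        w = n1 / n
    return loglik


def bootstrap_lrt(data, n_boot=19, seed=0):
    """Test one component (null) against two; returns (statistic, p-value)."""
    rng = random.Random(seed)
    ll1, mu, var = fit_one(data)
    observed = max(0.0, 2 * (loglik_two(data) - ll1))
    sd = math.sqrt(var)
    exceed = 0
    for _ in range(n_boot):
        # Simulate from the fitted null model and refit both models.
        sample = [rng.gauss(mu, sd) for _ in data]
        s_ll1, _, _ = fit_one(sample)
        if max(0.0, 2 * (loglik_two(sample) - s_ll1)) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_boot + 1)
```

Resampling is needed because the usual chi-squared approximation for the likelihood ratio statistic does not hold when testing the number of mixture components. Applying the same idea repeatedly (g versus g+1 components) would give the sequential test on the smallest compatible number of components that the abstract describes.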

On the statistical analysis of the GS-NS0 cell proteome: Imputation, clustering and variability testing

Relevance: 30.00%

Abstract:

We have undertaken two-dimensional gel electrophoresis proteomic profiling on a series of cell lines with different recombinant antibody production rates. Due to the nature of gel-based experiments not all protein spots are detected across all samples in an experiment, and hence datasets are invariably incomplete. New approaches are therefore required for the analysis of such graduated datasets. We approached this problem in two ways. Firstly, we applied a missing value imputation technique to calculate missing data points. Secondly, we combined a singular value decomposition based hierarchical clustering with the expression variability test to identify protein spots whose expression correlates with increased antibody production. The results have shown that while imputation of missing data was a useful method to improve the statistical analysis of such data sets, this was of limited use in differentiating between the samples investigated, and highlighted a small number of candidate proteins for further investigation. (c) 2006 Elsevier B.V. All rights reserved.

Clustering and spatial correlations of the neuronal cytoplasmic inclusions, astrocytic plaques and ballooned neurons in corticobasal degeneration

Relevance: 30.00%

Abstract:

This study tested three hypotheses: (1) that there is clustering of the neuronal cytoplasmic inclusions (NCI), astrocytic plaques (AP) and ballooned neurons (BN) in corticobasal degeneration (CBD), (2) that the clusters of NCI and BN are not spatially correlated, and (3) that the lesions are correlated with disease ‘stage’. In 50% of the regions, clusters of lesions were 400–800 µm in diameter and regularly distributed parallel to the tissue boundary. Clusters of NCI and BN were larger in laminae II/III and V/VI, respectively. In a third of regions, the clusters of BN and NCI were negatively spatially correlated. Cluster size of the BN in the parahippocampal gyrus (PHG) was positively correlated with disease ‘stage’. The data suggest the following: (1) degeneration of the cortico-cortical pathways in CBD, (2) clusters of NCI and BN may affect different anatomical pathways and (3) BN may develop after the NCI in the PHG.

Is the clustering of neurofibrillary tangles in Alzheimer's patients related to the cells of origin of specific cortico-cortical projections?

Relevance: 30.00%

Abstract:

The spatial pattern of cellular neurofibrillary tangles (NFT) was studied in the supra- and infragranular layers of various cortical regions in cases of Alzheimer's disease (AD). The objective was to test the hypothesis that NFT formation was associated with the cells of origin of specific cortico-cortical projections. The novel feature of the study was that pattern analysis enabled the dimension and spacing of NFT clusters along the cortical ribbon to be estimated. In the majority of brain regions studied, NFT occurred in clusters of neurons which were regularly spaced along the cortical strip. This pattern is consistent with the predicted distribution of the cells of origin of specific cortico-cortical projections. Mean NFT cluster size varied from 250 to > 12800 microns in different cortical tissues, suggesting either variation in the size of the cell clusters or a dynamic process in the development of NFT in relation to these cell clusters. The formation of NFT in cell clusters which may give rise to the feed-forward and feed-back cortico-cortical projections suggests a possible route of spread of NFT pathology in AD between cortical regions and from the cortex to subcortical areas.

Clustering patterns of neurofibrillary tangles in Alzheimer’s disease

Relevance: 30.00%

Abstract:

Clustering of cellular neurofibrillary tangles (NFT) was studied in the cerebral cortex and hippocampus in cases of Alzheimer’s disease (AD) using a regression method. The objective of the study was to test the hypothesis that clustering of NFTs reflects the degeneration of the cortico-cortical pathways. In 25/38 (66%) of analyses of individual brain areas, a significant peak to trough and peak to peak distance was obtained suggesting that the clusters of NFTs were regularly distributed in bands parallel to the tissue boundary. In analyses of cortical tissues with regularly distributed clusters, peak to peak distance was between 1000 and 1600 microns in 13/24 (54%) of analyses, >1600 microns in 10/24 (42%) and <1000 microns in 1/24 (4%) of analyses. A regular distribution of NFT clusters was less evident in the CA sectors of the hippocampus than in the cortex. Hence, in a significant proportion of brain areas, the spacing of NFT clusters along the cerebral cortex was consistent with the predicted distribution of the cells of origin of specific cortico-cortical projections. However, in many brain regions, the sizes of the NFT clusters were larger than predicted which may be attributable to the spread of NFTs to adjacent groups of cells as the disease progresses.

A Bimodality Test in High Dimensions

Relevance: 30.00%

Abstract:

We present a test for identifying clusters in high dimensional data based on the k-means algorithm when the null hypothesis is spherical normal. We show that projection techniques used for evaluating validity of clusters may be misleading for such data. In particular, we demonstrate that increasingly well-separated clusters are identified as the dimensionality increases, when no such clusters exist. Furthermore, in a case of true bimodality, increasing the dimensionality makes identifying the correct clusters more difficult. In addition to the original conservative test, we propose a practical test with the same asymptotic behavior that performs well for a moderate number of points and moderate dimensionality. ACM Computing Classification System (1998): I.5.3.

The relationship between selected standardized test scores and performance in advanced placement math and science exams: Analyzing the differential effectiveness of scores for course identification and placement

Relevance: 30.00%

Abstract:

There is a national need to increase the STEM-related workforce. Factors leading towards STEM careers include the number of advanced high school mathematics and science courses students complete. Florida's enrollment patterns in STEM-related Advanced Placement (AP) courses, however, reveal that only a small percentage of students enroll in these classes. Therefore, screening tools are needed to find more students for these courses who are academically ready yet have not been identified. The purpose of this study was to investigate the extent to which scores from a national standardized test, the Preliminary Scholastic Assessment Test/National Merit Scholarship Qualifying Test (PSAT/NMSQT), in conjunction with and compared to a state-mandated standardized test, the Florida Comprehensive Assessment Test (FCAT), are related to selected AP exam performance in Seminole County Public Schools. An ex post facto correlational study was conducted using 6,189 student records from the 2010 - 2012 academic years. Multiple regression analyses using simultaneous Full Model testing showed differential moderate to strong relationships between scores in eight of the nine AP courses (i.e., Biology, Environmental Science, Chemistry, Physics B, Physics C Electrical, Physics C Mechanical, Statistics, Calculus AB and BC) examined. For example, the significant unique contribution to overall variance in AP scores was a linear combination of PSAT Math (M), Critical Reading (CR) and FCAT Reading (R) for Biology and Environmental Science. Moderate relationships for Chemistry included a linear combination of PSAT M, W (Writing) and FCAT M; a combination of FCAT M and PSAT M was most significantly associated with Calculus AB performance. These findings have implications for both research and practice. FCAT scores, in conjunction with PSAT scores, can potentially be used for specific STEM-related AP courses, as part of a systematic approach towards AP course identification and placement.
For courses with moderate to strong relationships, validation studies and development of expectancy tables, which estimate the probability of successful performance on these AP exams, are recommended. Also, findings established a need to examine other related research issues including, but not limited to, extensive longitudinal studies and analyses of other available or prospective standardized test scores.

Assessment of the EEE-4 oral test: a discourse analysis based on complex networks.

Relevance: 30.00%

Abstract:

With the development of information technology, the theory and methodology of complex networks have been introduced into language research, transforming the system of language into a complex network composed of nodes and edges for the quantitative analysis of language structure. The development of dependency grammar provides theoretical support for the construction of a treebank corpus, making a statistical analysis of complex networks possible. This paper introduces the theory and methodology of complex networks and builds dependency syntactic networks based on the treebank of speeches from the EEE-4 oral test. Through an analysis of the overall characteristics of the networks, including the number of edges, the number of nodes, the average degree, the average path length, the network centrality and the degree distribution, it aims to find potential differences and similarities in the networks between various grades of speaking performance. Through clustering analysis, this research intends to demonstrate the discriminating power of the network parameters and provide a potential reference for scoring speaking performance.
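Several of the network parameters listed in this abstract are straightforward to compute from an edge list. The sketch below assumes an undirected, unweighted network in which each dependency relation links two word nodes; the function name `network_summary` and the example sentence are hypothetical and are not drawn from the EEE-4 treebank.

```python
from collections import deque


def network_summary(edges):
    """Basic statistics of an undirected network given as (node, node) pairs.

    Returns the number of nodes, the number of distinct edges, the average
    degree, and the average shortest-path length over connected node pairs.
    """
    adjacency = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    n_nodes = len(adjacency)
    n_edges = sum(len(nbrs) for nbrs in adjacency.values()) // 2
    avg_degree = 2 * n_edges / n_nodes if n_nodes else 0.0

    # Breadth-first search from every node gives all shortest paths.
    total_length = 0
    n_pairs = 0
    for source in adjacency:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total_length += sum(dist.values())
        n_pairs += len(dist) - 1
    avg_path = total_length / n_pairs if n_pairs else 0.0

    return {
        "nodes": n_nodes,
        "edges": n_edges,
        "average_degree": avg_degree,
        "average_path_length": avg_path,
    }


# Hypothetical dependency edges for "the student speaks English".
example_edges = [("speaks", "student"), ("student", "the"), ("speaks", "English")]
summary = network_summary(example_edges)
```

Computing such a summary for each speech would give one feature vector per speaker, which is the kind of input a clustering analysis of performance grades could take.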

Aerobic and anaerobic test performance among elite male football players in different team positions

Relevance: 30.00%

Abstract:

The purpose was to determine the magnitude of aerobic and anaerobic performance factors among elite male football players in different team positions. Thirty-nine players from the highest Swedish division, classified as defenders (n=18), midfield players (n=12) or attackers (n=9), participated. Their mean (± SD) age, height and body mass (bm) were 24.4 (±4.7) years, 1.80 (±5.9) m and 79 (±7.6) kg, respectively. Running economy (RE) and anaerobic threshold (AT) were determined at 10, 12, 14 and 16 km/h, followed by tests of maximal oxygen uptake (VO2max). Maximal strength (1RM) and average power output (AP) were measured in squat lifting. Squat jump (SJ), counter-movement jump with free arm swing (CMJa), 45 m maximal sprint and the Wingate test were also performed. Average VO2max for the whole population (WP) was 57.0 mL O2·kg^-1·min^-1. The average AT occurred at about 84% of VO2max. 1RM per kg bm^0.67 was 11.9 ± 1.3 kg. Average squat power in the whole population at 40% 1RM was 70 ± 9.5 W per kg bm^0.67. SJ and CMJa were 38.6 ± 3.8 cm and 48.9 ± 4.4 cm, respectively. The average sprint time (45 m) was 5.78 ± 0.16 s. The AP in the Wingate test was 10.6 ± 0.9 W·kg^-1. The average maximal oxygen uptake among players in the highest Swedish division was lower than that of international elite players, but the Swedish players performed better on the anaerobic threshold and in the anaerobic tests. No significant differences were found between defenders, midfielders and attackers in the parameters tested.

Exploiting the clustering of cosmic voids as a novel cosmological probe

Relevance: 30.00%

Abstract:

Investigations of the large-scale structure of our Universe provide us with extremely powerful tools to shed light on some of the open issues of the currently accepted Standard Cosmological Model. Until recently, constraining the cosmological parameters from cosmic voids was almost infeasible, because the amount of data in void catalogues was not enough to ensure statistically relevant samples. The increasingly wide and deep fields in present and upcoming surveys have made cosmic voids promising probes, despite the fact that there is not yet a unique and generally accepted definition for them. In this thesis we address the two-point statistics of cosmic voids, in the very first attempt to model its features for cosmological purposes. To this end, we implement an improved version of the void power spectrum presented by Chan et al. (2014). We have been able to build an exceptionally robust method to tackle the void clustering statistics, by proposing a functional form that is entirely based on first principles. We extract our data from a suite of high-resolution N-body simulations both in the LCDM and alternative modified gravity scenarios. To accurately compare the data to the theory, we calibrate the model by accounting for a free parameter in the void radius that enters the theory of void exclusion. We then constrain the cosmological parameters by means of a Bayesian analysis. As long as the modified gravity effects are limited, our model is a reliable method to constrain the main LCDM parameters. By contrast, it cannot be used to model the void clustering in the presence of stronger modifications of gravity. In future work, we will further develop our analysis of the void clustering statistics by testing our model on large, high-resolution simulations and on real data, also addressing the void clustering in the halo distribution. Finally, we also plan to combine these constraints with those of other cosmological probes.

Adapting The Sniffin' Sticks Olfactory Test To Diagnose Parkinson's Disease In Estonia.

Relevance: 20.00%

Abstract:

The aim of the study was to develop a culturally adapted translation of the 12-item smell identification test from Sniffin' Sticks (SS-12) for the Estonian population in order to help diagnose Parkinson's disease (PD). A standard translation of the SS-12 was created and 150 healthy Estonians were questioned about the smells used as response options in the test. Unfamiliar smells were replaced by culturally familiar options. The adapted SS-12 was applied to 70 controls in all age groups, and thereafter to 50 PD patients and 50 age- and sex-matched controls. Fourteen of the 48 response options used in the SS-12 were replaced with familiar smells in the adapted version, in which the mean rate of correct response was 87% (range 73-99) compared to 83% with the literal translation (range 50-98). In PD patients, the average adapted SS-12 score (5.4/12) was significantly lower than in controls (average score 8.9/12), p < 0.0001. A multiple linear regression using the score in the SS-12 as the outcome measure showed that diagnosis and age independently influenced the result of the SS-12. A logistic regression using the SS-12 and age as covariates showed that the SS-12 (but not age) correctly classified 79.0% of subjects into the PD and control categories; using a cut-off of <7 gave a sensitivity of 76% and a specificity of 86% for the diagnosis of PD. The developed SS-12 cultural adaptation is appropriate for testing olfaction in Estonia for the purpose of PD diagnosis.
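For readers unfamiliar with how a score cut-off translates into sensitivity and specificity, the following is a small sketch. It assumes that a score strictly below the cut-off counts as a positive (PD) result, matching the "<7" rule in the abstract. The function name and the example scores are hypothetical and do not reproduce the study data.

```python
def sensitivity_specificity(scores, has_condition, cutoff=7):
    """Sensitivity and specificity of the rule "score < cutoff means positive".

    scores        -- test scores, one per subject
    has_condition -- booleans, True for patients and False for controls
    """
    tp = fn = tn = fp = 0
    for score, diseased in zip(scores, has_condition):
        positive = score < cutoff
        if diseased and positive:
            tp += 1
        elif diseased:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one patient and one control")
    sensitivity = tp / (tp + fn)  # share of patients flagged by the test
    specificity = tn / (tn + fp)  # share of controls correctly cleared
    return sensitivity, specificity


# Hypothetical scores out of 12: four patients followed by four controls.
example_scores = [3, 5, 8, 6, 9, 10, 4, 11]
example_labels = [True, True, True, True, False, False, False, False]
sens, spec = sensitivity_specificity(example_scores, example_labels)
```

Raising the cut-off flags more subjects as positive, which tends to increase sensitivity at the cost of specificity. Any reported pair of values is therefore tied to the specific threshold chosen.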
