971 results for ALS data-set
Abstract:
Explanation of the Minimum Data Set (MDS), implementation of Section Q, overview of the program, local contacts and their functions, Referral Agency information, and the role of and assistance provided by the Long-Term Care Ombudsman.
Abstract:
Information regarding possible questions on Section Q within the Minimum Data Set.
Abstract:
Dataset for publication in PLOS ONE
Abstract:
Mass spectrometry (MS)-based proteomics has seen significant technical advances during the past two decades, and mass spectrometry has become a central tool in many biosciences. Despite the popularity of MS-based methods, handling the systematic non-biological variation in the data remains a common problem. This biasing variation can arise from several sources, ranging from sample handling to differences caused by the instrumentation. Normalization is the procedure that aims to account for this biasing variation and make samples comparable. Many normalization methods commonly used in proteomics have been adapted from the DNA-microarray world. Studies comparing normalization methods on proteomics data sets using variability measures exist. However, a more thorough comparison looking at the quantitative and qualitative differences in the performance of the different normalization methods, and at their ability to preserve the true differential expression signal of proteins, is lacking. In this thesis, several popular and widely used normalization methods (linear regression normalization, local regression normalization, variance stabilizing normalization, quantile normalization, median central tendency normalization, and variants of some of the aforementioned methods), representing different normalization strategies, are compared and evaluated with a benchmark spike-in proteomics data set. The normalization methods are evaluated in several ways. Their performance is assessed qualitatively and quantitatively, both on a global scale and in pairwise comparisons of sample groups. In addition, it is investigated whether performing the normalization globally on the whole data set or pairwise for the comparison pairs examined affects a method's ability to normalize the data and preserve the true differential expression signal. Both major and minor differences in the performance of the different normalization methods were found. The way in which the normalization was performed (global normalization of the whole data set or pairwise normalization of the comparison pair) also affected the performance of some of the methods in pairwise comparisons, and differences among variants of the same methods were observed.
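As a hedged illustration of two of the strategies named above (not code from the thesis), the following Python sketch shows median central-tendency and quantile normalization applied to a proteins-by-samples matrix of log intensities; the array and variable names are invented and missing values are assumed absent.

```python
import numpy as np

def median_normalize(X):
    """Median central-tendency normalization: shift each sample (column)
    so that all samples share the same median log-intensity."""
    col_medians = np.median(X, axis=0)
    return X - col_medians + col_medians.mean()

def quantile_normalize(X):
    """Quantile normalization: force every sample (column) to have the
    same empirical distribution (the mean of the sorted columns)."""
    order = np.argsort(X, axis=0)
    ranks = np.argsort(order, axis=0)          # rank of each value within its column
    mean_sorted = np.mean(np.sort(X, axis=0), axis=1)
    return mean_sorted[ranks]

# Illustrative use: 1000 proteins, 6 samples, with an artificial per-sample bias.
rng = np.random.default_rng(0)
X = rng.normal(20, 2, size=(1000, 6)) + rng.normal(0, 0.5, size=6)
X_median = median_normalize(X)
X_quantile = quantile_normalize(X)
```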
Abstract:
A significant amount of Expendable Bathythermograph (XBT) data has been collected in the Mediterranean Sea since 1999 in the framework of operational oceanography activities. The management and storage of such a volume of data poses significant challenges and opportunities. The SeaDataNet project, a pan-European infrastructure for marine data diffusion, provides a convenient way to avoid dispersion of these vertical temperature profiles and to facilitate access for a wider public. The XBT data flow, along with recent improvements in the quality-check procedures and the consistency of the available historical data set, is described. The main features of the SeaDataNet services and the advantages of using this system for long-term data archiving are presented. Finally, a focus on the Ligurian Sea is included in order to provide an example of the kind of information and final products, devoted to different users, that can easily be derived from the SeaDataNet web portal.
Abstract:
The effect of the number of samples and the selection of data for analysis on the calculation of surface motor unit potential (SMUP) size in the statistical method of motor unit number estimates (MUNE) was determined in 10 normal subjects and 10 subjects with amyotrophic lateral sclerosis (ALS). We recorded 500 sequential compound muscle action potentials (CMAPs) at three different stable stimulus intensities (10–50% of maximal CMAP). Estimated mean SMUP sizes were calculated using Poisson statistical assumptions from the variance of the 500 sequential CMAPs obtained at each stimulus intensity. The results with the full 500 data points were compared with smaller subsets (50–80% of the data points) drawn from the same data set. The effect of restricting analysis to data between 5% and 20% of the CMAP and to standard deviation limits was also assessed. No differences in mean SMUP size were found with stimulus intensity or with the use of different ranges of data. Consistency improved with a greater number of samples. Restricting data to within 5% of CMAP size gave both increased consistency and reduced mean SMUP size in many subjects, but excluded valid responses present at that stimulus intensity. These changes were more prominent in ALS patients, in whom the presence of isolated SMUP responses was a striking difference from normal subjects. Noise, spurious data, and large SMUPs limited the Poisson assumptions. When these factors are considered, consistent statistical MUNE can be calculated from a continuous sequence of data points. A 2 to 2.5 SD window or a 10% window is a reasonable way of limiting data for analysis. Muscle Nerve 27: 320–331, 2003
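For readers unfamiliar with the statistical method, the Python sketch below shows one common formulation of the Poisson-based SMUP estimate and of a 10%-of-maximal-CMAP data window. It is an illustration of the general idea only, not the authors' analysis code, and the function and variable names are invented.

```python
import numpy as np

def poisson_smup_estimate(cmap_amplitudes, max_cmap):
    """If the number of units activated per stimulus follows a Poisson
    distribution and each unit contributes a fixed SMUP size s, then
    variance = lambda * s**2 and mean increment = lambda * s, so
    s = variance / mean increment (increments measured above the
    smallest response in the series)."""
    x = np.asarray(cmap_amplitudes, dtype=float)
    increments = x - x.min()
    smup = increments.var(ddof=1) / increments.mean()
    mune = max_cmap / smup
    return smup, mune

def restrict_window(cmap_amplitudes, max_cmap, frac=0.10):
    """Keep only responses within frac (e.g. 10%) of maximal CMAP
    around the series mean, one way of limiting data for analysis."""
    x = np.asarray(cmap_amplitudes, dtype=float)
    keep = np.abs(x - x.mean()) <= frac * max_cmap
    return x[keep]
```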
Abstract:
A biplot, which is the multivariate generalization of the two-variable scatterplot, can be used to visualize the results of many multivariate techniques, especially those that are based on the singular value decomposition. We consider data sets consisting of continuous-scale measurements, their fuzzy coding and the biplots that visualize them, using a fuzzy version of multiple correspondence analysis. Of special interest is the way quality of fit of the biplot is measured, since it is well-known that regular (i.e., crisp) multiple correspondence analysis seriously under-estimates this measure. We show how the results of fuzzy multiple correspondence analysis can be defuzzified to obtain estimated values of the original data, and prove that this implies an orthogonal decomposition of variance. This permits a measure of fit to be calculated in the familiar form of a percentage of explained variance, which is directly comparable to the corresponding fit measure used in principal component analysis of the original data. The approach is motivated initially by its application to a simulated data set, showing how the fuzzy approach can lead to diagnosing nonlinear relationships, and finally it is applied to a real set of meteorological data.
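The Python sketch below illustrates the kind of fuzzy coding of continuous measurements referred to above, using triangular membership functions with three categories and hinge points at the minimum, median and maximum of a variable. It is a simplified illustration, not the paper's implementation; the defuzzification shown is the membership-weighted average of the hinge points.

```python
import numpy as np

def fuzzy_code(x, hinges=None):
    """Triangular three-category fuzzy coding of a continuous variable.
    Each value becomes three memberships (low, medium, high) that are
    non-negative and sum to 1; hinges default to (min, median, max)
    and are assumed to be strictly increasing."""
    x = np.asarray(x, dtype=float)
    if hinges is None:
        hinges = (x.min(), np.median(x), x.max())
    lo, mid, hi = hinges
    coded = np.zeros((x.size, 3))
    below = x <= mid
    t = (x[below] - lo) / (mid - lo)
    coded[below, 0] = 1 - t
    coded[below, 1] = t
    above = ~below
    t = (x[above] - mid) / (hi - mid)
    coded[above, 1] = 1 - t
    coded[above, 2] = t
    return coded

def defuzzify(coded, hinges):
    """Recover an estimate of the original values as the
    membership-weighted average of the hinge points."""
    return coded @ np.asarray(hinges, dtype=float)
```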
Abstract:
The singular value decomposition and its interpretation as a linear biplot has proved to be a powerful tool for analysing many forms of multivariate data. Here we adapt biplot methodology to the specific case of compositional data consisting of positive vectors each of which is constrained to have unit sum. These relative variation biplots have properties relating to special features of compositional data: the study of ratios, subcompositions and models of compositional relationships. The methodology is demonstrated on a data set consisting of six-part colour compositions in 22 abstract paintings, showing how the singular value decomposition can achieve an accurate biplot of the colour ratios and how possible models interrelating the colours can be diagnosed.
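As a hedged sketch of the underlying computation (assuming the relative variation biplot is built from log-transformed, double-centred compositions followed by a singular value decomposition; the row/column scalings follow one common convention and are not necessarily those of the paper):

```python
import numpy as np

def relative_variation_biplot(comps):
    """comps: samples x parts matrix of strictly positive compositions
    (each row summing to 1). Log-transform, double-centre, then take
    the SVD; distances between part points then approximate log-ratio
    variability between those parts."""
    L = np.log(comps)
    Z = L - L.mean(axis=1, keepdims=True)   # centre within each composition
    Z = Z - Z.mean(axis=0, keepdims=True)   # centre each part across samples
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    row_coords = U * s                      # samples in principal coordinates
    col_coords = Vt.T                       # parts in standard coordinates
    explained = s**2 / (s**2).sum()
    return row_coords[:, :2], col_coords[:, :2], explained[:2]
```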
Abstract:
Structural equation models (SEM) are commonly used to analyze the relationship between variables, some of which may be latent, such as individual "attitude" to and "behavior" concerning specific issues. A number of difficulties arise when we want to compare a large number of groups, each with a large sample size, and the manifest variables are distinctly non-normally distributed. Using a specific data set, we evaluate the appropriateness of the following alternative SEM approaches: multiple group versus MIMIC models, continuous versus ordinal variable estimation methods, and normal theory versus non-normal estimation methods. The approaches are applied to the ISSP-1993 Environmental data set, with the purpose of exploring variation in the mean level of variables of "attitude" to and "behavior" concerning environmental issues and their mutual relationship across countries. Issues of both theoretical and practical relevance arise in the course of this application.
Abstract:
It is shown how correspondence analysis may be applied to a subset of response categories from a questionnaire survey, for example the subset of undecided responses or the subset of responses for a particular category. The idea is to maintain the original relative frequencies of the categories and not re-express them relative to totals within the subset, as would normally be done in a regular correspondence analysis of the subset. Furthermore, the masses and chi-square metric assigned to the data subset are the same as those in the correspondence analysis of the whole data set. This variant of the method, called Subset Correspondence Analysis, is illustrated on data from the ISSP survey on Family and Changing Gender Roles.
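A minimal Python sketch of the computation as described above, in which the row and column masses and the chi-square metric of the full table are retained and only the residual matrix is restricted to the chosen subset of columns; this is an illustration of the idea, not the authors' software.

```python
import numpy as np

def subset_ca(N, subset_cols):
    """N: full contingency table (rows x response categories).
    subset_cols: indices of the categories to analyse (e.g. the
    'undecided' responses). Masses come from the FULL table."""
    N = np.asarray(N, dtype=float)
    P = N / N.sum()                   # correspondence matrix of the full table
    r = P.sum(axis=1)                 # row masses (full table)
    c = P.sum(axis=0)                 # column masses (full table)
    J = np.asarray(subset_cols)
    expected = np.outer(r, c[J])
    S = (P[:, J] - expected) / np.sqrt(expected)   # standardized residuals
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    row_coords = (U * s) / np.sqrt(r)[:, None]     # row principal coordinates
    col_coords = (Vt.T * s) / np.sqrt(c[J])[:, None]
    return row_coords, col_coords, s**2            # s**2: principal inertias
```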
Abstract:
Panel data can be arranged into a matrix in two ways, called 'long' and 'wide' formats (LF and WF). The two formats suggest two alternative model approaches for analyzing panel data: (i) univariate regression with varying intercept; and (ii) multivariate regression with latent variables (a particular case of structural equation model, SEM). The present paper compares the two approaches, showing in which circumstances they yield equivalent (in some cases, even numerically equal) results. We show that the univariate approach gives results equivalent to the multivariate approach when restrictions of time invariance (in the paper, the TI assumption) are imposed on the parameters of the multivariate model. It is shown that the restrictions implicit in the univariate approach can be assessed by chi-square difference testing of two nested multivariate models. In addition, common tests encountered in the econometric analysis of panel data, such as the Hausman test, are shown to have an equivalent representation as chi-square difference tests. Commonalities and differences between the univariate and multivariate approaches are illustrated using an empirical panel data set of firms' profitability as well as a simulated panel data set.
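A small pandas illustration of the two arrangements (the firm/profit columns are invented and unrelated to the paper's data): the wide format keeps one column per period, as the multivariate (SEM) approach expects, while the long format stacks firm-year observations, as the univariate regression approach expects.

```python
import pandas as pd

# Wide format (WF): one row per firm, one column per year.
wide = pd.DataFrame({
    "firm": ["A", "B", "C"],
    "profit_2001": [1.2, 0.8, 2.1],
    "profit_2002": [1.5, 0.7, 2.4],
    "profit_2003": [1.4, 0.9, 2.2],
})

# Long format (LF): one row per firm-year observation.
long = wide.melt(id_vars="firm", var_name="year", value_name="profit")
long["year"] = long["year"].str.replace("profit_", "", regex=False).astype(int)

# And back from long to wide.
wide_again = long.pivot(index="firm", columns="year", values="profit")
```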
Abstract:
The present study analyzes the suitability of the woodpecker community as an ecological indicator group and, in light of the results, formulates demands and recommendations for woodpecker-friendly forest management. The foraging habitat use of seven woodpecker species was systematically observed over a period of two years in primeval-forest-like and managed stands of different forest communities. The study area is the Bialowieza Forest in the far east of Poland, where natural and managed forest plots could be studied in close spatial proximity. Observations took place between early March 1999 and the end of February 2001 and were carried out in all seasons. Four of the six study plots represent the most important deciduous forest community of the area, the Tilio-Carpinetum; the remaining two represent the most important coniferous forest community, the Peucedano-Pinetum. Half of the plots, each between 42 and 54 ha in size, were located in the strictly protected primeval forest reserve of the Bialowieza National Park, the others in managed forest stands. In addition, a 2.5 km long transect through managed alder-ash floodplain forest and near-natural alder carr was surveyed. The plots were divided into a grid of 50 x 50 m quadrants. To observe foraging woodpeckers, 21 visits were made to each plot or transect. For this purpose, the plots were walked along parallel lines spaced 100 m apart; starting point and starting direction were varied. To characterize the vegetation and stand structure, surveys of tree species composition, size-class distribution of trees, dead-wood proportion and herb-layer vegetation were carried out. A total of 1332 observations of foraging woodpeckers could be evaluated. The Great Spotted Woodpecker was seen most frequently in all plots. Middle Spotted, White-backed and Lesser Spotted Woodpeckers were observed mainly in the Tilio-Carpinetum stands, more frequently in the natural forests than in the managed stands. The Three-toed Woodpecker was found in coniferous forest and in deciduous forest with a higher admixture of spruce. For the Black and Grey-headed Woodpeckers, no clear preference for particular forest communities could be determined. The Great Spotted Woodpecker fed mainly on fat-rich seeds, especially in autumn and winter, and was then usually observed working spruce or pine cones in anvils. The Middle Spotted Woodpecker, as a gleaning species, searched for food mainly on the surfaces of trunks and branches. Lesser Spotted, White-backed, Three-toed and Black Woodpeckers appeared as excavating foragers. The few data on the Grey-headed Woodpecker are not sufficient to determine its preferred foraging technique. For the Great Spotted, Middle Spotted and White-backed Woodpeckers, a clear preference for pedunculate oak as a foraging tree was demonstrated. The Three-toed Woodpecker, however, is the only observed species with a far-reaching specialization on a particular tree species; in all forest communities it mostly used spruce. Overall, the woodpeckers preferred trees with large trunk diameters, although this preference was only weakly pronounced in the Lesser Spotted Woodpecker. Dead wood was preferred for foraging by the White-backed, Three-toed and Lesser Spotted Woodpeckers, but used only occasionally by the Middle Spotted Woodpecker. For the Great Spotted Woodpecker, the proportion of dead-wood use differed considerably between tree species. Compared with standing dead wood and dead parts of living trees, lying dead wood played only a minor role for foraging woodpeckers in the Tilio-Carpinetum stands.
Abstract:
Data deduplication describes a class of approaches that reduce the storage capacity needed to store data or the amount of data that has to be transferred over a network. These approaches detect coarse-grained redundancies within a data set, e.g. a file system, and remove them.

One of the most important applications of data deduplication is backup storage systems, where these approaches are able to reduce the storage requirements to a small fraction of the logical backup data size. This thesis introduces multiple new extensions of so-called fingerprinting-based data deduplication. It starts with the presentation of a novel system design, which allows using a cluster of servers to perform exact data deduplication with small chunks in a scalable way.

Afterwards, a combination of compression approaches for an important, but often overlooked, data structure in data deduplication systems, so-called block and file recipes, is introduced. Using these compression approaches, which exploit unique properties of data deduplication systems, the size of these recipes can be reduced by more than 92% in all investigated data sets. As file recipes can occupy a significant fraction of the overall storage capacity of data deduplication systems, the compression enables significant savings.

A technique to increase the write throughput of data deduplication systems, based on the aforementioned block and file recipes, is introduced next. The novel Block Locality Caching (BLC) uses properties of block and file recipes to overcome the chunk lookup disk bottleneck of data deduplication systems, which limits either the scalability or the throughput of such systems. The presented BLC overcomes the disk bottleneck more efficiently than existing approaches. Furthermore, it is shown to be less prone to aging effects.

Finally, it is investigated whether large HPC storage systems contain redundancies that can be found by fingerprinting-based data deduplication. Over 3 PB of HPC storage data from different data sets have been analyzed. In most data sets, between 20% and 30% of the data can be classified as redundant. According to these results, future work should further investigate how data deduplication can be integrated into future HPC storage systems.

This thesis presents important novel work in different areas of data deduplication research.
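A toy Python sketch of the fingerprinting idea underlying such systems: it uses fixed-size chunks and SHA-256 fingerprints purely for illustration, whereas the systems described in the thesis rely on more elaborate chunking, indexing, block/file recipes and recipe compression.

```python
import hashlib

def deduplicate(stream, chunk_size=8192):
    """Split the input bytes into fixed-size chunks, fingerprint each
    chunk with SHA-256, and store a chunk's payload only the first time
    its fingerprint is seen. The 'recipe' is the ordered fingerprint
    list from which the original data can be reconstructed."""
    chunk_store = {}      # fingerprint -> chunk payload (unique chunks only)
    recipe = []           # ordered fingerprints describing the original stream
    for offset in range(0, len(stream), chunk_size):
        chunk = stream[offset:offset + chunk_size]
        fp = hashlib.sha256(chunk).hexdigest()
        recipe.append(fp)
        chunk_store.setdefault(fp, chunk)
    return chunk_store, recipe

def restore(chunk_store, recipe):
    """Rebuild the original byte stream from the recipe."""
    return b"".join(chunk_store[fp] for fp in recipe)
```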
Abstract:
Due to the imprecise nature of biological experiments, biological data sets are often characterized by the presence of redundant and noisy data. This may be due to errors that occurred during data collection, such as contamination of laboratory samples. This is the case for gene expression data, where the equipment and tools currently used frequently produce noisy data. Machine Learning algorithms have been successfully used in gene expression data analysis. Although many Machine Learning algorithms can deal with noise, detecting and removing noisy instances from the training data set can help the induction of the target hypothesis. This paper evaluates the use of distance-based pre-processing techniques for noise detection in gene expression data classification problems. The evaluation analyzes the effectiveness of the investigated techniques in removing noisy data, measured by the accuracy obtained by different Machine Learning classifiers on the pre-processed data.
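As a hedged illustration of a distance-based noise filter (similar in spirit to edited-nearest-neighbour filtering, but not necessarily one of the techniques evaluated in the paper), the following Python sketch drops training instances whose nearest neighbours mostly disagree with their class label; all names are illustrative.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_noise_filter(X, y, k=5, min_agreement=0.5):
    """Flag a training instance as noisy when fewer than `min_agreement`
    of its k nearest neighbours share its class label, then drop the
    flagged instances from the training set."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)                 # idx[:, 0] is the instance itself
    neighbour_labels = y[idx[:, 1:]]
    agreement = (neighbour_labels == y[:, None]).mean(axis=1)
    keep = agreement >= min_agreement
    return X[keep], y[keep]

# Illustrative use on a (samples x genes) expression matrix X with labels y:
# X_clean, y_clean = knn_noise_filter(X, y, k=5)
```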