989 results for Unevenly sampled data
Abstract:
"Research was supported by the United States Air Force through the Air Force Office of Scientific Research, Air Research and Development Command."
Abstract:
Mode of access: Internet.
Abstract:
Includes bibliography.
Abstract:
Includes bibliography.
Abstract:
Digital signal processing (DSP) aims to extract specific information from digital signals. Digital signals are, by definition, physical quantities represented by a sequence of discrete values, and from these sequences it is possible to extract and analyze the desired information. Unevenly sampled data cannot be properly analyzed using standard digital signal processing techniques. This work aimed to adapt a DSP technique, multiresolution analysis, to unevenly sampled data, in support of studies at the CoRoT laboratory at UFRN. The process is based on re-indexing the wavelet transform to handle unevenly sampled data properly. The method was effective and produced satisfactory results.
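As a rough illustration of how a wavelet step can account for uneven spacing, the sketch below applies one level of an interval-weighted (unbalanced) Haar transform. It is not the re-indexing method of the work above, whose details the abstract does not give. The function names, the choice of weights, and the assumption of an even number of samples are all made for illustration only.

```python
def sample_weights(times):
    """Weight each sample by the time span it represents (assumed: half the gap to each neighbour)."""
    n = len(times)
    weights = []
    for i in range(n):
        # At the ends, reuse the single available gap on both sides.
        left = times[i] - times[i - 1] if i > 0 else times[1] - times[0]
        right = times[i + 1] - times[i] if i < n - 1 else times[-1] - times[-2]
        weights.append(0.5 * (left + right))
    return weights


def haar_level(values, weights):
    """One analysis level: weighted pair averages (approximation) and pair differences (detail)."""
    approx, detail, merged = [], [], []
    for i in range(0, len(values) - 1, 2):
        w0, w1 = weights[i], weights[i + 1]
        approx.append((w0 * values[i] + w1 * values[i + 1]) / (w0 + w1))
        detail.append(values[i + 1] - values[i])
        merged.append(w0 + w1)  # weight of the merged coarse sample
    return approx, detail, merged


def inverse_haar_level(approx, detail, weights):
    """Rebuild the fine samples from one level, given the original per-sample weights."""
    values = []
    for k, (a, d) in enumerate(zip(approx, detail)):
        w0, w1 = weights[2 * k], weights[2 * k + 1]
        values.append(a - d * w1 / (w0 + w1))
        values.append(a + d * w0 / (w0 + w1))
    return values
```

Calling `haar_level` again on the returned approximation and merged weights gives the next coarser level, which is the basic idea of a multiresolution decomposition.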
Abstract:
The primary goal of this dissertation is the study of patterns of viral evolution inferred from serially-sampled sequence data, i.e., sequence data obtained from strains isolated at consecutive time points from a single patient or host. RNA viral populations have an extremely high genetic variability, largely due to their astronomical population sizes within host systems, high replication rate, and short generation time. It is this aspect of their evolution that demands special attention and a different approach when studying the evolutionary relationships of serially-sampled sequence data. New methods that analyze serially-sampled data were developed shortly after a groundbreaking HIV-1 study of several patients from which viruses were isolated at recurring intervals over a period of 10 or more years. These methods assume a tree-like evolutionary model, while many RNA viruses have the capacity to exchange genetic material with one another using a process called recombination.
A genealogy involving recombination is best described by a network structure. A more general approach was implemented in a new computational tool, Sliding MinPD, one that is mindful of the sampling times of the input sequences and that reconstructs the viral evolutionary relationships in the form of a network structure with implicit representations of recombination events. The underlying network organization reveals unique patterns of viral evolution and could help explain the emergence of disease-associated mutants and drug-resistant strains, with implications for patient prognosis and treatment strategies. In order to comprehensively test the developed methods and to carry out comparison studies with other methods, synthetic data sets are critical. Therefore, appropriate sequence generators were also developed to simulate the evolution of serially-sampled recombinant viruses, new and more thorough evaluation criteria for recombination detection methods were established, and three major comparison studies were performed. The newly developed tools were also applied to "real" HIV-1 sequence data and it was shown that the results represented within an evolutionary network structure can be interpreted in biologically meaningful ways.
Abstract:
The structural modeling of spatial dependence, using a geostatistical approach, is an indispensable tool for determining the parameters that define this structure, which are then used to interpolate values at unsampled points by kriging. However, the estimation of parameters can be greatly affected by the presence of atypical observations in sampled data. The purpose of this study was to use diagnostic techniques in Gaussian spatial linear models in geostatistics to evaluate the sensitivity of maximum likelihood and restricted maximum likelihood estimators to small perturbations in these data. For this purpose, studies with simulated and experimental data were conducted. Results with simulated data showed that the diagnostic techniques were effective at identifying the perturbation in the data. The results with real data indicated that atypical values among the sampled data may have a strong influence on thematic maps, thus changing the spatial dependence structure. The application of diagnostic techniques should be part of any geostatistical analysis, to ensure a better quality of the information from thematic maps.
Abstract:
Providing reliable estimates for mapping soil properties for precision agriculture requires intensive sampling and costly laboratory analyses. If the spatial structure of ancillary data, such as yield, digital information from aerial photographs, and soil electrical conductivity (EC) measurements, relates to that of soil properties, they could be used to guide the sampling intensity for soil surveys. Variograms of permanent soil properties at two study sites on different parent materials were compared with each other and with those for ancillary data. The ranges of spatial dependence identified by the variograms of both sets of properties are of similar orders of magnitude for each study site. Maps of the ancillary data appear to show similar patterns of variation and these seem to relate to those of the permanent properties of the soil. Correlation analysis has confirmed these relations. Maps of kriged estimates from sub-sampled data and the original variograms showed that the main patterns of variation were preserved when a sampling interval of less than half the average variogram range of ancillary data was used. Digital data from aerial photographs for different years and EC appear to show a more consistent relation with the soil properties than does yield. Aerial photographs, in particular those of bare soil, seem to be the most useful ancillary data and they are often cheaper to obtain than yield and EC data.
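The two quantities this abstract relies on, the variogram and the "less than half the average range" rule for the sampling interval, can be sketched as follows. This is a minimal illustration for equally spaced samples along a single transect, not the procedure used in the study. The function names are hypothetical, and the variogram ranges passed to the second function are assumed to have been obtained separately, for example by fitting a model to the empirical values.

```python
def empirical_semivariogram(values, max_lag):
    """Semivariance at lags 1..max_lag for equally spaced samples along a transect."""
    gamma = {}
    for lag in range(1, max_lag + 1):
        pairs = [(values[i], values[i + lag]) for i in range(len(values) - lag)]
        if pairs:
            # Classical estimator: half the mean squared difference of all pairs at this lag.
            gamma[lag] = sum((a - b) ** 2 for a, b in pairs) / (2 * len(pairs))
    return gamma


def max_sampling_interval(ancillary_ranges):
    """Upper bound on the sampling interval: half the mean variogram range of the ancillary data."""
    return 0.5 * sum(ancillary_ranges) / len(ancillary_ranges)
```

With hypothetical ancillary ranges of 100 m and 140 m, `max_sampling_interval([100.0, 140.0])` gives 60 m, so the rule would call for a sampling interval below that value.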
Abstract:
The node-density effect is an artifact of phylogeny reconstruction that can cause branch lengths to be underestimated in areas of the tree with fewer taxa. Webster, Payne, and Pagel (2003, Science 301:478) introduced a statistical procedure (the "delta" test) to detect this artifact, and here we report the results of computer simulations that examine the test's performance. In a sample of 50,000 random data sets, we find that the delta test detects the artifact in 94.4% of cases in which it is present. When the artifact is not present (n = 10,000 simulated data sets) the test showed a type I error rate of approximately 1.69%, incorrectly reporting the artifact in 169 data sets. Three measures of tree shape or "balance" failed to predict the size of the node-density effect. This may reflect the relative homogeneity of our randomly generated topologies, but emphasizes that nearly any topology can suffer from the artifact, the effect not being confined only to highly unevenly sampled or otherwise imbalanced trees. The ability to screen phylogenies for the node-density artifact is important for phylogenetic inference and for researchers using phylogenetic trees to infer evolutionary processes, including their use in molecular clock dating. [Delta test; molecular clock; molecular evolution; node-density effect; phylogenetic reconstruction; speciation; simulation.]
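The type I error rate quoted above follows directly from the counts in the abstract (169 false detections in 10,000 artifact-free data sets). The short sketch below reproduces that arithmetic and adds a normal-approximation standard error, which is an addition for illustration and is not reported in the abstract. The function names are hypothetical.

```python
import math


def false_positive_rate(false_positives, n_trials):
    """Fraction of artifact-free data sets in which the test reported the artifact."""
    return false_positives / n_trials


def binomial_standard_error(rate, n_trials):
    """Normal-approximation standard error of an estimated proportion."""
    return math.sqrt(rate * (1.0 - rate) / n_trials)


rate = false_positive_rate(169, 10_000)       # 0.0169, i.e. 1.69%
stderr = binomial_standard_error(rate, 10_000)
```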
Abstract:
When an appropriate fish host is selected, analysis of its parasites offers a useful, reliable, economical, telescoped indication or monitor of environmental health. The value of that information increases when corroborated by another non-parasitological technique. The analysis of parasites is not necessarily simple because not all hosts serve as good models and because the number of species, presence of specific species, intensity of infections, life histories of species, location of species in hosts, and host response for each parasitic species have to be addressed individually to assure usefulness of the tool. Also, different anthropogenic contaminants act in a distinct manner relative to hosts, parasites, and each other as well as being influenced by natural environmental conditions. Total values for all parasitic species infecting a sample cannot necessarily be grouped together. For example, an abundance of numbers of either species or individuals can indicate either a healthy or an unhealthy environment, depending on the species of parasite. Moreover, depending on the parasitic species, its infection, and the time chosen for collection/examination, the assessment may indicate a chronic or acute state of the environmental health. For most types of analyses, the host should be one that has a restricted home range, can be infected by numerous species of parasites, many of which have a variety of additional hosts in their life cycles, and can be readily sampled. Data on parasitic infections in the western mosquitofish (Gambusia affinis), a fish that meets the criteria in two separate studies, illustrate the usefulness of that host as a model to indicate both healthy and detrimentally influenced environments. In those studies, species richness, intensity of select species, host resistance, other hosts involved in life cycles, and other factors all relate to site and contaminating discharge.
Abstract:
Purpose: Development of an interpolation algorithm for re-sampling spatially distributed CT data with the following features: global and local integral conservation, avoidance of negative interpolation values for positively defined datasets, and the ability to control re-sampling artifacts. Method and Materials: The interpolation can be separated into two steps. First, the discrete CT data have to be continuously distributed by an analytic function that respects the boundary conditions. Generally, this function is determined by piecewise interpolation. Instead of using linear or high-order polynomial interpolations, which do not fulfill all of the above-mentioned features, a special form of Hermitian curve interpolation is used to solve the interpolation problem with respect to the required boundary conditions. A single parameter is determined, by which the behavior of the interpolation function is controlled. Second, the interpolated data have to be re-distributed with respect to the requested grid. Results: The new algorithm was compared with commonly used interpolation functions based on linear and second-order polynomials. It is demonstrated that these interpolation functions may over- or underestimate the source data by about 10%–20%, while the parameter of the new algorithm can be adjusted to significantly reduce these interpolation errors. Finally, the performance and accuracy of the algorithm were tested by re-gridding a series of X-ray CT images. Conclusion: Inaccurate sampling values may occur due to the lack of integral conservation. Re-sampling algorithms using high-order polynomial interpolation functions may result in significant artifacts in the re-sampled data. Such artifacts can be avoided by using the new algorithm based on Hermitian curve interpolation.
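To make the two requirements of integral conservation and non-negativity concrete, here is a minimal one-dimensional sketch of conservative re-gridding. It deliberately swaps in the simplest possible model, piecewise-constant cells redistributed by overlap, and is not the Hermitian curve interpolation described in the abstract. The function names and the cell-edge representation are assumptions made for illustration.

```python
def conservative_regrid(src_edges, src_values, dst_edges):
    """Redistribute cell-averaged values onto a new grid while preserving the integral.

    src_values[i] is the mean value over [src_edges[i], src_edges[i + 1]].
    Both grids are assumed to cover the same interval.
    """
    dst_values = []
    for j in range(len(dst_edges) - 1):
        lo, hi = dst_edges[j], dst_edges[j + 1]
        total = 0.0
        for i, value in enumerate(src_values):
            # Length of the overlap between source cell i and destination cell j.
            overlap = min(hi, src_edges[i + 1]) - max(lo, src_edges[i])
            if overlap > 0:
                total += value * overlap
        dst_values.append(total / (hi - lo))
    return dst_values


def integral(edges, values):
    """Integral of a piecewise-constant function given by cell edges and cell means."""
    return sum(v * (edges[i + 1] - edges[i]) for i, v in enumerate(values))
```

Because every destination value is an overlap-weighted average of source values, non-negative input cannot produce negative output, and the total integral is unchanged. The price is a blocky result, which is the kind of limitation a smoother interpolant with the same guarantees is meant to overcome.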
Abstract:
Clinical Research Data Quality Literature Review and Pooled Analysis
We present a literature review and secondary analysis of data accuracy in clinical research and related secondary data uses. A total of 93 papers meeting our inclusion criteria were categorized according to the data processing methods. Quantitative data accuracy information was abstracted from the articles and pooled. Our analysis demonstrates that the accuracy associated with data processing methods varies widely, with error rates ranging from 2 errors per 10,000 fields to 5019 errors per 10,000 fields. Medical record abstraction was associated with the highest error rates (70–5019 errors per 10,000 fields). Data entered and processed at healthcare facilities had comparable error rates to data processed at central data processing centers. Error rates for data processed with single entry in the presence of on-screen checks were comparable to double entered data. While data processing and cleaning methods may explain a significant amount of the variability in data accuracy, additional factors not resolvable here likely exist.
Defining Data Quality for Clinical Research: A Concept Analysis
Despite notable previous attempts by experts to define data quality, the concept remains ambiguous and subject to the vagaries of natural language. This current lack of clarity continues to hamper research related to data quality issues. We present a formal concept analysis of data quality, which builds on and synthesizes previously published work. We further posit that discipline-level specificity may be required to achieve the desired definitional clarity. To this end, we combine work from the clinical research domain with findings from the general data quality literature to produce a discipline-specific definition and operationalization for data quality in clinical research. While the results are helpful to clinical research, the methodology of concept analysis may be useful in other fields to clarify data quality attributes and to achieve operational definitions.
Medical Record Abstractor’s Perceptions of Factors Impacting the Accuracy of Abstracted Data
Medical record abstraction (MRA) is known to be a significant source of data errors in secondary data uses. Factors impacting the accuracy of abstracted data are not reported consistently in the literature. Two Delphi processes were conducted with experienced medical record abstractors to assess abstractors’ perceptions about the factors. The Delphi process identified 9 factors that were not found in the literature, and differed from the literature by 5 factors in the top 25%. The Delphi results refuted seven factors reported in the literature as impacting the quality of abstracted data. The results provide insight into and indicate content validity of a significant number of the factors reported in the literature. Further, the results indicate general consistency between the perceptions of clinical research medical record abstractors and registry and quality improvement abstractors.
Distributed Cognition Artifacts on Clinical Research Data Collection Forms
Medical record abstraction, a primary mode of data collection in secondary data use, is associated with high error rates. Distributed cognition in medical record abstraction has not been studied as a possible explanation for abstraction errors. We employed the theory of distributed representation and representational analysis to systematically evaluate cognitive demands in medical record abstraction and the extent of external cognitive support employed in a sample of clinical research data collection forms. We show that the cognitive load required for abstraction in 61% of the sampled data elements was high, exceedingly so in 9%. Further, the data collection forms did not support external cognition for the most complex data elements. High working memory demands are a possible explanation for the association of data errors with data elements requiring abstractor interpretation, comparison, mapping or calculation. The representational analysis used here can be used to identify data elements with high cognitive demands.
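The error rates in the pooled analysis are expressed as errors per 10,000 fields. The sketch below shows that normalization and one plausible way of pooling counts across studies. The abstract does not state how the pooling was actually done, so summing errors and fields before normalizing is an assumption, and the function names are hypothetical.

```python
def errors_per_10k(errors, fields):
    """Normalize an error count to errors per 10,000 fields."""
    return 10_000 * errors / fields


def pooled_errors_per_10k(studies):
    """Pool (errors, fields) pairs by summing counts before normalizing.

    This weights each study by the number of fields it inspected,
    unlike a plain average of the per-study rates.
    """
    total_errors = sum(errors for errors, _ in studies)
    total_fields = sum(fields for _, fields in studies)
    return errors_per_10k(total_errors, total_fields)
```

For two hypothetical studies that each inspected 10,000 fields and found 2 and 70 errors, `pooled_errors_per_10k([(2, 10_000), (70, 10_000)])` gives 36 errors per 10,000 fields.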
Abstract:
Modeling the spatial dependence structure with a geostatistical approach is fundamental for determining the parameters that define this structure, which are used to interpolate values at unsampled locations by kriging. However, parameter estimation can be strongly affected by the presence of atypical observations in the sampled data. The objective of this work was to use local influence diagnostic techniques in Gaussian spatial linear models, as used in geostatistics, to evaluate the sensitivity of the maximum likelihood and restricted maximum likelihood estimators in the presence of outlying data. Studies with experimental data showed that both atypical values and values identified as influential by the diagnostic analysis can strongly influence thematic maps, thereby altering the spatial dependence structure. Local influence diagnostic techniques should be part of every geostatistical analysis, to ensure that the information contained in thematic maps is of higher quality and can be used with greater confidence by the farmer.
Abstract:
This report describes recent updates to the custom-built data-acquisition hardware operated by the Center for Hypersonics. In 2006, an ISA-to-USB bridging card was developed as part of Luke Hillyard's final-year thesis. This card allows the hardware to be connected to any recent personal computer via a serial port (USB or RS232), and it provides a number of simple text-based commands for control of the hardware. A graphical user interface program was also updated to help the experimenter manage the data acquisition functions. Sampled data are stored in text files that have been compressed in the gzip format. To simplify the later archiving or transport of the data, all files specific to a shot are stored in a single directory. This includes a text file for the run description, the signal configuration file, and the individual sampled-data files, one for each signal that was recorded.
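Gzip-compressed text files of the kind described above can be written and read directly with Python's standard `gzip` module. The sketch below assumes one sample per line and a `<signal>.txt.gz` naming scheme inside the shot directory. Both are hypothetical, since the abstract does not give the actual file layout or naming convention.

```python
import gzip
import tempfile
from pathlib import Path


def write_signal(shot_dir, name, samples):
    """Store one signal as gzip-compressed text, one sample per line (assumed layout)."""
    path = Path(shot_dir) / f"{name}.txt.gz"
    with gzip.open(path, "wt", encoding="utf-8") as f:
        for sample in samples:
            f.write(f"{sample}\n")
    return path


def read_signal(path):
    """Read a gzip-compressed text file back into a list of floats, skipping blank lines."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return [float(line) for line in f if line.strip()]


# Example: a temporary directory stands in for the per-shot directory.
with tempfile.TemporaryDirectory() as shot_dir:
    signal_path = write_signal(shot_dir, "channel0", [0.5, 1.25, -3.0])
    restored = read_signal(signal_path)
```

Keeping every file for a shot in one directory means the whole shot can be archived or moved as a unit, which matches the motivation given in the report.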