969 resultados para Medicine--Data processing
Resumo:
The sparsely spaced highly permeable fractures of the granitic rock aquifer at Stang-er-Brune (Brittany, France) form a well-connected fracture network of high permeability but unknown geometry. Previous work based on optical and acoustic logging together with single-hole and cross-hole flowmeter data acquired in 3 neighbouring boreholes (70-100 m deep) has identified the most important permeable fractures crossing the boreholes and their hydraulic connections. To constrain possible flow paths by estimating the geometries of known and previously unknown fractures, we have acquired, processed and interpreted multifold, single- and cross-hole GPR data using 100 and 250 MHz antennas. The GPR data processing scheme consisting of timezero corrections, scaling, bandpass filtering and F-X deconvolution, eigenvector filtering, muting, pre-stack Kirchhoff depth migration and stacking was used to differentiate fluid-filled fracture reflections from source generated noise. The final stacked and pre-stack depth-migrated GPR sections provide high-resolution images of individual fractures (dipping 30-90°) in the surroundings (2-20 m for the 100 MHz antennas; 2-12 m for the 250 MHz antennas) of each borehole in a 2D plane projection that are of superior quality to those obtained from single-offset sections. Most fractures previously identified from hydraulic testing can be correlated to reflections in the single-hole data. Several previously unknown major near vertical fractures have also been identified away from the boreholes.
Resumo:
BACKGROUND: Solexa/Illumina short-read ultra-high throughput DNA sequencing technology produces millions of short tags (up to 36 bases) by parallel sequencing-by-synthesis of DNA colonies. The processing and statistical analysis of such high-throughput data poses new challenges; currently a fair proportion of the tags are routinely discarded due to an inability to match them to a reference sequence, thereby reducing the effective throughput of the technology. RESULTS: We propose a novel base calling algorithm using model-based clustering and probability theory to identify ambiguous bases and code them with IUPAC symbols. We also select optimal sub-tags using a score based on information content to remove uncertain bases towards the ends of the reads. CONCLUSION: We show that the method improves genome coverage and number of usable tags as compared with Solexa's data processing pipeline by an average of 15%. An R package is provided which allows fast and accurate base calling of Solexa's fluorescence intensity files and the production of informative diagnostic plots.
Resumo:
Gaia is the most ambitious space astrometry mission currently envisaged and is a technological challenge in all its aspects. We describe a proposal for the payload data handling system of Gaia, as an example of a high-performance, real-time, concurrent, and pipelined data system. This proposal includes the front-end systems for the instrumentation, the data acquisition and management modules, the star data processing modules, and the payload data handling unit. We also review other payload and service module elements and we illustrate a data flux proposal.
Resumo:
Background Nowadays, combining the different sources of information to improve the biological knowledge available is a challenge in bioinformatics. One of the most powerful methods for integrating heterogeneous data types are kernel-based methods. Kernel-based data integration approaches consist of two basic steps: firstly the right kernel is chosen for each data set; secondly the kernels from the different data sources are combined to give a complete representation of the available data for a given statistical task. Results We analyze the integration of data from several sources of information using kernel PCA, from the point of view of reducing dimensionality. Moreover, we improve the interpretability of kernel PCA by adding to the plot the representation of the input variables that belong to any dataset. In particular, for each input variable or linear combination of input variables, we can represent the direction of maximum growth locally, which allows us to identify those samples with higher/lower values of the variables analyzed. Conclusions The integration of different datasets and the simultaneous representation of samples and variables together give us a better understanding of biological knowledge.
Resumo:
DnaSP is a software package for a comprehensive analysis of DNA polymorphism data. Version 5 implements a number of new features and analytical methods allowing extensive DNA polymorphism analyses on large datasets. Among other features, the newly implemented methods allow for: (i) analyses on multiple data files; (ii) haplotype phasing; (iii) analyses on insertion/deletion polymorphism data; (iv) visualizing sliding window results integrated with available genome annotations in the UCSC browser.
Resumo:
Peer-reviewed
Resumo:
Gaia is the most ambitious space astrometry mission currently envisaged and is a technological challenge in all its aspects. We describe a proposal for the payload data handling system of Gaia, as an example of a high-performance, real-time, concurrent, and pipelined data system. This proposal includes the front-end systems for the instrumentation, the data acquisition and management modules, the star data processing modules, and the payload data handling unit. We also review other payload and service module elements and we illustrate a data flux proposal.
Resumo:
Statistics has become an indispensable tool in biomedical research. Thanks, in particular, to computer science, the researcher has easy access to elementary "classical" procedures. These are often of a "confirmatory" nature: their aim is to test hypotheses (for example the efficacy of a treatment) prior to experimentation. However, doctors often use them in situations more complex than foreseen, to discover interesting data structures and formulate hypotheses. This inverse process may lead to misuse which increases the number of "statistically proven" results in medical publications. The help of a professional statistician thus becomes necessary. Moreover, good, simple "exploratory" techniques are now available. In addition, medical data contain quite a high percentage of outliers (data that deviate from the majority). With classical methods it is often very difficult (even for a statistician!) to detect them and the reliability of results becomes questionable. New, reliable ("robust") procedures have been the subject of research for the past two decades. Their practical introduction is one of the activities of the Statistics and Data Processing Department of the University of Social and Preventive Medicine, Lausanne.
Resumo:
Background Nowadays, combining the different sources of information to improve the biological knowledge available is a challenge in bioinformatics. One of the most powerful methods for integrating heterogeneous data types are kernel-based methods. Kernel-based data integration approaches consist of two basic steps: firstly the right kernel is chosen for each data set; secondly the kernels from the different data sources are combined to give a complete representation of the available data for a given statistical task. Results We analyze the integration of data from several sources of information using kernel PCA, from the point of view of reducing dimensionality. Moreover, we improve the interpretability of kernel PCA by adding to the plot the representation of the input variables that belong to any dataset. In particular, for each input variable or linear combination of input variables, we can represent the direction of maximum growth locally, which allows us to identify those samples with higher/lower values of the variables analyzed. Conclusions The integration of different datasets and the simultaneous representation of samples and variables together give us a better understanding of biological knowledge.
Resumo:
One of the fundamental problems with image processing of petrographic thin sections is that the appearance (colour I intensity) of a mineral grain will vary with the orientation of the crystal lattice to the preferred direction of the polarizing filters on a petrographic microscope. This makes it very difficult to determine grain boundaries, grain orientation and mineral species from a single captured image. To overcome this problem, the Rotating Polarizer Stage was used to replace the fixed polarizer and analyzer on a standard petrographic microscope. The Rotating Polarizer Stage rotates the polarizers while the thin section remains stationary, allowing for better data gathering possibilities. Instead of capturing a single image of a thin section, six composite data sets are created by rotating the polarizers through 900 (or 1800 if quartz c-axes measurements need to be taken) in both plane and cross polarized light. The composite data sets can be viewed as separate images and consist of the average intensity image, the maximum intensity image, the minimum intensity image, the maximum position image, the minimum position image and the gradient image. The overall strategy used by the image processing system is to gather the composite data sets, determine the grain boundaries using the gradient image, classify the different mineral species present using the minimum and maximum intensity images and then perform measurements of grain shape and, where possible, partial crystallographic orientation using the maximum intensity and maximum position images.
Resumo:
The telemetry data processing operation intended for a given mission are pre-defined by an onboard telemetry configuration, mission trajectory and overall telemetry methodology have stabilized lately for ISRO vehicles. The given problem on telemetry data processing is reduced through hierarchical problem reduction whereby the sequencing of operations evolves as the control task and operations on data as the function task. The function task Input, Output and execution criteria are captured into tables which are examined by the control task and then schedules when the function task when the criteria is being met.
Resumo:
Analysis by reduction is a linguistically motivated method for checking correctness of a sentence. It can be modelled by restarting automata. In this paper we propose a method for learning restarting automata which are strictly locally testable (SLT-R-automata). The method is based on the concept of identification in the limit from positive examples only. Also we characterize the class of languages accepted by SLT-R-automata with respect to the Chomsky hierarchy.
Resumo:
Die stereoskopische 3-D-Darstellung beruht auf der naturgetreuen Präsentation verschiedener Perspektiven für das rechte und linke Auge. Sie erlangt in der Medizin, der Architektur, im Design sowie bei Computerspielen und im Kino, zukünftig möglicherweise auch im Fernsehen, eine immer größere Bedeutung. 3-D-Displays dienen der zusätzlichen Wiedergabe der räumlichen Tiefe und lassen sich grob in die vier Gruppen Stereoskope und Head-mounted-Displays, Brillensysteme, autostereoskopische Displays sowie echte 3-D-Displays einteilen. Darunter besitzt der autostereoskopische Ansatz ohne Brillen, bei dem N≥2 Perspektiven genutzt werden, ein hohes Potenzial. Die beste Qualität in dieser Gruppe kann mit der Methode der Integral Photography, die sowohl horizontale als auch vertikale Parallaxe kodiert, erreicht werden. Allerdings ist das Verfahren sehr aufwendig und wird deshalb wenig genutzt. Den besten Kompromiss zwischen Leistung und Preis bieten präzise gefertigte Linsenrasterscheiben (LRS), die hinsichtlich Lichtausbeute und optischen Eigenschaften den bereits früher bekannten Barrieremasken überlegen sind. Insbesondere für die ergonomisch günstige Multiperspektiven-3-D-Darstellung wird eine hohe physikalische Monitorauflösung benötigt. Diese ist bei modernen TFT-Displays schon recht hoch. Eine weitere Verbesserung mit dem theoretischen Faktor drei erreicht man durch gezielte Ansteuerung der einzelnen, nebeneinander angeordneten Subpixel in den Farben Rot, Grün und Blau. Ermöglicht wird dies durch die um etwa eine Größenordnung geringere Farbauflösung des menschlichen visuellen Systems im Vergleich zur Helligkeitsauflösung. Somit gelingt die Implementierung einer Subpixel-Filterung, welche entsprechend den physiologischen Gegebenheiten mit dem in Luminanz und Chrominanz trennenden YUV-Farbmodell arbeitet. Weiterhin erweist sich eine Schrägstellung der Linsen im Verhältnis von 1:6 als günstig. Farbstörungen werden minimiert, und die Schärfe der Bilder wird durch eine weniger systematische Vergrößerung der technologisch unvermeidbaren Trennelemente zwischen den Subpixeln erhöht. Der Grad der Schrägstellung ist frei wählbar. In diesem Sinne ist die Filterung als adaptiv an den Neigungswinkel zu verstehen, obwohl dieser Wert für einen konkreten 3-D-Monitor eine Invariante darstellt. Die zu maximierende Zielgröße ist der Parameter Perspektiven-Pixel als Produkt aus Anzahl der Perspektiven N und der effektiven Auflösung pro Perspektive. Der Idealfall einer Verdreifachung wird praktisch nicht erreicht. Messungen mit Hilfe von Testbildern sowie Schrifterkennungstests lieferten einen Wert von knapp über 2. Dies ist trotzdem als eine signifikante Verbesserung der Qualität der 3-D-Darstellung anzusehen. In der Zukunft sind weitere Verbesserungen hinsichtlich der Zielgröße durch Nutzung neuer, feiner als TFT auflösender Technologien wie LCoS oder OLED zu erwarten. Eine Kombination mit der vorgeschlagenen Filtermethode wird natürlich weiterhin möglich und ggf. auch sinnvoll sein.
Resumo:
Data mining means to summarize information from large amounts of raw data. It is one of the key technologies in many areas of economy, science, administration and the internet. In this report we introduce an approach for utilizing evolutionary algorithms to breed fuzzy classifier systems. This approach was exercised as part of a structured procedure by the students Achler, Göb and Voigtmann as contribution to the 2006 Data-Mining-Cup contest, yielding encouragingly positive results.