968 resultados para Speech synthesis Data processing


Relevância:

100.00% 100.00%

Publicador:

Resumo:

This research presents several components encompassing the scope of the objective of Data Partitioning and Replication Management in Distributed GIS Database. Modern Geographic Information Systems (GIS) databases are often large and complicated. Therefore data partitioning and replication management problems need to be addresses in development of an efficient and scalable solution. ^ Part of the research is to study the patterns of geographical raster data processing and to propose the algorithms to improve availability of such data. These algorithms and approaches are targeting granularity of geographic data objects as well as data partitioning in geographic databases to achieve high data availability and Quality of Service(QoS) considering distributed data delivery and processing. To achieve this goal a dynamic, real-time approach for mosaicking digital images of different temporal and spatial characteristics into tiles is proposed. This dynamic approach reuses digital images upon demand and generates mosaicked tiles only for the required region according to user's requirements such as resolution, temporal range, and target bands to reduce redundancy in storage and to utilize available computing and storage resources more efficiently. ^ Another part of the research pursued methods for efficient acquiring of GIS data from external heterogeneous databases and Web services as well as end-user GIS data delivery enhancements, automation and 3D virtual reality presentation. ^ There are vast numbers of computing, network, and storage resources idling or not fully utilized available on the Internet. Proposed "Crawling Distributed Operating System "(CDOS) approach employs such resources and creates benefits for the hosts that lend their CPU, network, and storage resources to be used in GIS database context. ^ The results of this dissertation demonstrate effective ways to develop a highly scalable GIS database. The approach developed in this dissertation has resulted in creation of TerraFly GIS database that is used by US government, researchers, and general public to facilitate Web access to remotely-sensed imagery and GIS vector information. ^

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The microarray technology provides a high-throughput technique to study gene expression. Microarrays can help us diagnose different types of cancers, understand biological processes, assess host responses to drugs and pathogens, find markers for specific diseases, and much more. Microarray experiments generate large amounts of data. Thus, effective data processing and analysis are critical for making reliable inferences from the data. ^ The first part of dissertation addresses the problem of finding an optimal set of genes (biomarkers) to classify a set of samples as diseased or normal. Three statistical gene selection methods (GS, GS-NR, and GS-PCA) were developed to identify a set of genes that best differentiate between samples. A comparative study on different classification tools was performed and the best combinations of gene selection and classifiers for multi-class cancer classification were identified. For most of the benchmarking cancer data sets, the gene selection method proposed in this dissertation, GS, outperformed other gene selection methods. The classifiers based on Random Forests, neural network ensembles, and K-nearest neighbor (KNN) showed consistently god performance. A striking commonality among these classifiers is that they all use a committee-based approach, suggesting that ensemble classification methods are superior. ^ The same biological problem may be studied at different research labs and/or performed using different lab protocols or samples. In such situations, it is important to combine results from these efforts. The second part of the dissertation addresses the problem of pooling the results from different independent experiments to obtain improved results. Four statistical pooling techniques (Fisher inverse chi-square method, Logit method. Stouffer's Z transform method, and Liptak-Stouffer weighted Z-method) were investigated in this dissertation. These pooling techniques were applied to the problem of identifying cell cycle-regulated genes in two different yeast species. As a result, improved sets of cell cycle-regulated genes were identified. The last part of dissertation explores the effectiveness of wavelet data transforms for the task of clustering. Discrete wavelet transforms, with an appropriate choice of wavelet bases, were shown to be effective in producing clusters that were biologically more meaningful. ^

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This dissertation established a software-hardware integrated design for a multisite data repository in pediatric epilepsy. A total of 16 institutions formed a consortium for this web-based application. This innovative fully operational web application allows users to upload and retrieve information through a unique human-computer graphical interface that is remotely accessible to all users of the consortium. A solution based on a Linux platform with My-SQL and Personal Home Page scripts (PHP) has been selected. Research was conducted to evaluate mechanisms to electronically transfer diverse datasets from different hospitals and collect the clinical data in concert with their related functional magnetic resonance imaging (fMRI). What was unique in the approach considered is that all pertinent clinical information about patients is synthesized with input from clinical experts into 4 different forms, which were: Clinical, fMRI scoring, Image information, and Neuropsychological data entry forms. A first contribution of this dissertation was in proposing an integrated processing platform that was site and scanner independent in order to uniformly process the varied fMRI datasets and to generate comparative brain activation patterns. The data collection from the consortium complied with the IRB requirements and provides all the safeguards for security and confidentiality requirements. An 1-MR1-based software library was used to perform data processing and statistical analysis to obtain the brain activation maps. Lateralization Index (LI) of healthy control (HC) subjects in contrast to localization-related epilepsy (LRE) subjects were evaluated. Over 110 activation maps were generated, and their respective LIs were computed yielding the following groups: (a) strong right lateralization: (HC=0%, LRE=18%), (b) right lateralization: (HC=2%, LRE=10%), (c) bilateral: (HC=20%, LRE=15%), (d) left lateralization: (HC=42%, LRE=26%), e) strong left lateralization: (HC=36%, LRE=31%). Moreover, nonlinear-multidimensional decision functions were used to seek an optimal separation between typical and atypical brain activations on the basis of the demographics as well as the extent and intensity of these brain activations. The intent was not to seek the highest output measures given the inherent overlap of the data, but rather to assess which of the many dimensions were critical in the overall assessment of typical and atypical language activations with the freedom to select any number of dimensions and impose any degree of complexity in the nonlinearity of the decision space.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A substantial amount of information on the Internet is present in the form of text. The value of this semi-structured and unstructured data has been widely acknowledged, with consequent scientific and commercial exploitation. The ever-increasing data production, however, pushes data analytic platforms to their limit. This thesis proposes techniques for more efficient textual big data analysis suitable for the Hadoop analytic platform. This research explores the direct processing of compressed textual data. The focus is on developing novel compression methods with a number of desirable properties to support text-based big data analysis in distributed environments. The novel contributions of this work include the following. Firstly, a Content-aware Partial Compression (CaPC) scheme is developed. CaPC makes a distinction between informational and functional content in which only the informational content is compressed. Thus, the compressed data is made transparent to existing software libraries which often rely on functional content to work. Secondly, a context-free bit-oriented compression scheme (Approximated Huffman Compression) based on the Huffman algorithm is developed. This uses a hybrid data structure that allows pattern searching in compressed data in linear time. Thirdly, several modern compression schemes have been extended so that the compressed data can be safely split with respect to logical data records in distributed file systems. Furthermore, an innovative two layer compression architecture is used, in which each compression layer is appropriate for the corresponding stage of data processing. Peripheral libraries are developed that seamlessly link the proposed compression schemes to existing analytic platforms and computational frameworks, and also make the use of the compressed data transparent to developers. The compression schemes have been evaluated for a number of standard MapReduce analysis tasks using a collection of real-world datasets. In comparison with existing solutions, they have shown substantial improvement in performance and significant reduction in system resource requirements.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Cloud computing offers massive scalability and elasticity required by many scien-tific and commercial applications. Combining the computational and data handling capabilities of clouds with parallel processing also has the potential to tackle Big Data problems efficiently. Science gateway frameworks and workflow systems enable application developers to implement complex applications and make these available for end-users via simple graphical user interfaces. The integration of such frameworks with Big Data processing tools on the cloud opens new oppor-tunities for application developers. This paper investigates how workflow sys-tems and science gateways can be extended with Big Data processing capabilities. A generic approach based on infrastructure aware workflows is suggested and a proof of concept is implemented based on the WS-PGRADE/gUSE science gateway framework and its integration with the Hadoop parallel data processing solution based on the MapReduce paradigm in the cloud. The provided analysis demonstrates that the methods described to integrate Big Data processing with workflows and science gateways work well in different cloud infrastructures and application scenarios, and can be used to create massively parallel applications for scientific analysis of Big Data.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper is part of a special issue of Applied Geochemistry focusing on reliable applications of compositional multivariate statistical methods. This study outlines the application of compositional data analysis (CoDa) to calibration of geochemical data and multivariate statistical modelling of geochemistry and grain-size data from a set of Holocene sedimentary cores from the Ganges-Brahmaputra (G-B) delta. Over the last two decades, understanding near-continuous records of sedimentary sequences has required the use of core-scanning X-ray fluorescence (XRF) spectrometry, for both terrestrial and marine sedimentary sequences. Initial XRF data are generally unusable in ‘raw-format’, requiring data processing in order to remove instrument bias, as well as informed sequence interpretation. The applicability of these conventional calibration equations to core-scanning XRF data are further limited by the constraints posed by unknown measurement geometry and specimen homogeneity, as well as matrix effects. Log-ratio based calibration schemes have been developed and applied to clastic sedimentary sequences focusing mainly on energy dispersive-XRF (ED-XRF) core-scanning. This study has applied high resolution core-scanning XRF to Holocene sedimentary sequences from the tidal-dominated Indian Sundarbans, (Ganges-Brahmaputra delta plain). The Log-Ratio Calibration Equation (LRCE) was applied to a sub-set of core-scan and conventional ED-XRF data to quantify elemental composition. This provides a robust calibration scheme using reduced major axis regression of log-ratio transformed geochemical data. Through partial least squares (PLS) modelling of geochemical and grain-size data, it is possible to derive robust proxy information for the Sundarbans depositional environment. The application of these techniques to Holocene sedimentary data offers an improved methodological framework for unravelling Holocene sedimentation patterns.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Thesis (Ph.D.)--University of Washington, 2016-08

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Background Dementia is a global issue, with increasing prevalence rates impacting on health services internationally. People with dementia are frequently admitted to hospital, an environment that may not be suited to their needs. While many initiatives have been developed to improve their care in the acute setting, there is a lack of cohesive understanding of how staff experience and perceive the care they give to people with dementia in the acute setting. Objectives The aim of this qualitative synthesis was to explore health care staffs’ experiences and perceptions of caring for people with dementia in the acute setting. Qualitative synthesis can bring together isolated findings in a meaningful way that can inform policy development. Settings A screening process, using inclusion/exclusion criteria, identified qualitative studies that focused on health care staff caring for people with dementia in acute settings. Participants Twelve reports of nine studies were included for synthesis. Data extraction was conducted on each report by two researchers. Methods Framework synthesis was employed using VIPS framework, using Values, Individualised, Perspective and Social and psychological as concepts to guide synthesis. The VIPS framework has previously been used for exploring approaches to caring for people with dementia. Quality appraisal was conducted using Critical Appraisal Skills Programme (CASP) and NVivo facilitated sensitivity analysis to ensure confidence in the findings. Results Key themes, derived from VIPS, included a number of specific subthemes that examined: infrastructure and care pathways, person-centred approaches to care, how the person interacts with their environment and other patients, and family involvement in care decisions. The synthesis identified barriers to appropriate care for the person with dementia. These include ineffective pathways of care, unsuitable environments, inadequate resources and staffing levels and lack of emphasis on education and training for staff caring for people with dementia. Conclusions This review has identified key issues in the care of people with dementia in the acute setting: improving pathways of care, creating suitable environments, addressing resources and staffing levels and placing emphasis on the education for staff caring for people with dementia. Recommendations are made for practice consideration, policy development and future research. Leadership is required to instil the values needed to care for this client group in an effective and personcentred way. Qualitative evidence synthesis can inform policy and in this case, recommends VIPS as a suitable framework for guiding decisions around care for people with dementia in acute settings.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Recent advances in the massively parallel computational abilities of graphical processing units (GPUs) have increased their use for general purpose computation, as companies look to take advantage of big data processing techniques. This has given rise to the potential for malicious software targeting GPUs, which is of interest to forensic investigators examining the operation of software. The ability to carry out reverse-engineering of software is of great importance within the security and forensics elds, particularly when investigating malicious software or carrying out forensic analysis following a successful security breach. Due to the complexity of the Nvidia CUDA (Compute Uni ed Device Architecture) framework, it is not clear how best to approach the reverse engineering of a piece of CUDA software. We carry out a review of the di erent binary output formats which may be encountered from the CUDA compiler, and their implications on reverse engineering. We then demonstrate the process of carrying out disassembly of an example CUDA application, to establish the various techniques available to forensic investigators carrying out black-box disassembly and reverse engineering of CUDA binaries. We show that the Nvidia compiler, using default settings, leaks useful information. Finally, we demonstrate techniques to better protect intellectual property in CUDA algorithm implementations from reverse engineering.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We propose a study of the mathematical properties of voice as an audio signal -- This work includes signals in which the channel conditions are not ideal for emotion recognition -- Multiresolution analysis- discrete wavelet transform – was performed through the use of Daubechies Wavelet Family (Db1-Haar, Db6, Db8, Db10) allowing the decomposition of the initial audio signal into sets of coefficients on which a set of features was extracted and analyzed statistically in order to differentiate emotional states -- ANNs proved to be a system that allows an appropriate classification of such states -- This study shows that the extracted features using wavelet decomposition are enough to analyze and extract emotional content in audio signals presenting a high accuracy rate in classification of emotional states without the need to use other kinds of classical frequency-time features -- Accordingly, this paper seeks to characterize mathematically the six basic emotions in humans: boredom, disgust, happiness, anxiety, anger and sadness, also included the neutrality, for a total of seven states to identify

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A particular problem for the automatic prediction of prosody in speech synthesis is the realisation of accented syllables since these are affected by many parameters and are perceptually very salient. For the Portuguese language, in Europe, a set of comprehensive quantitative characterisation data and rules is totally lacking. The present paper is intended to be a quantitative contribution to the solution of this problem. In this paper, a preliminary modelling of duration, intensity and variation of F0 in the tonic syllable will be presented. The dependencies of the model with the syllable position in the word and the word position in the phrase are also presented.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A flexible and multipurpose bio-inspired hierarchical model for analyzing musical timbre is presented in this paper. Inspired by findings in the fields of neuroscience, computational neuroscience, and psychoacoustics, not only does the model extract spectral and temporal characteristics of a signal, but it also analyzes amplitude modulations on different timescales. It uses a cochlear filter bank to resolve the spectral components of a sound, lateral inhibition to enhance spectral resolution, and a modulation filter bank to extract the global temporal envelope and roughness of the sound from amplitude modulations. The model was evaluated in three applications. First, it was used to simulate subjective data from two roughness experiments. Second, it was used for musical instrument classification using the k-NN algorithm and a Bayesian network. Third, it was applied to find the features that characterize sounds whose timbres were labeled in an audiovisual experiment. The successful application of the proposed model in these diverse tasks revealed its potential in capturing timbral information.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The production and perception of music is a multimodal activity involving auditory, visual and conceptual processing, integrating these with prior knowledge and environmental experience. Musicians utilise expressive physical nuances to highlight salient features of the score. The question arises within the literature as to whether performers’ non-technical, non-sound-producing movements may be communicatively meaningful and convey important structural information to audience members and co-performers. In the light of previous performance research (Vines et al., 2006, Wanderley, 2002, Davidson, 1993), and considering findings within co-speech gestural research and auditory and audio-visual neuroscience, this thesis examines the nature of those movements not directly necessary for the production of sound, and their particular influence on audience perception. Within the current research 3D performance analysis is conducted using the Vicon 12- camera system and Nexus data-processing software. Performance gestures are identified as repeated patterns of motion relating to music structure, which not only express phrasing and structural hierarchy but are consistently and accurately interpreted as such by a perceiving audience. Gestural characteristics are analysed across performers and performance style using two Chopin preludes selected for their diverse yet comparable structures (Opus 28:7 and 6). Effects on perceptual judgements of presentation modes (visual-only, auditory-only, audiovisual, full- and point-light) and viewing conditions are explored. This thesis argues that while performance style is highly idiosyncratic, piano performers reliably generate structural gestures through repeated patterns of upper-body movement. The shapes and locations of phrasing motions are identified particular to the sample of performers investigated. Findings demonstrate that despite the personalised nature of the gestures, performers use increased velocity of movements to emphasise musical structure and that observers accurately and consistently locate phrasing junctures where these patterns and variation in motion magnitude, shape and velocity occur. By viewing performance motions in polar (spherical) rather than cartesian coordinate space it is possible to get mathematically closer to the movement generated by each of the nine performers, revealing distinct patterns of motion relating to phrasing structures, regardless of intended performance style. These patterns are highly individualised both to each performer and performed piece. Instantaneous velocity analysis indicates a right-directed bias of performance motion variation at salient structural features within individual performances. Perceptual analyses demonstrate that audience members are able to accurately and effectively detect phrasing structure from performance motion alone. This ability persists even for degraded point-light performances, where all extraneous environmental information has been removed. The relative contributions of audio, visual and audiovisual judgements demonstrate that the visual component of a performance does positively impact on the over- all accuracy of phrasing judgements, indicating that receivers are most effective in their recognition of structural segmentations when they can both see and hear a performance. Observers appear to make use of a rapid online judgement heuristics, adjusting response processes quickly to adapt and perform accurately across multiple modes of presentation and performance style. In line with existent theories within the literature, it is proposed that this processing ability may be related to cognitive and perceptual interpretation of syntax within gestural communication during social interaction and speech. Findings of this research may have future impact on performance pedagogy, computational analysis and performance research, as well as potentially influencing future investigations of the cognitive aspects of musical and gestural understanding.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The only method used to date to measure dissolved nitrate concentration (NITRATE) with sensors mounted on profiling floats is based on the absorption of light at ultraviolet wavelengths by nitrate ion (Johnson and Coletti, 2002; Johnson et al., 2010; 2013; D’Ortenzio et al., 2012). Nitrate has a modest UV absorption band with a peak near 210 nm, which overlaps with the stronger absorption band of bromide, which has a peak near 200 nm. In addition, there is a much weaker absorption due to dissolved organic matter and light scattering by particles (Ogura and Hanya, 1966). The UV spectrum thus consists of three components, bromide, nitrate and a background due to organics and particles. The background also includes thermal effects on the instrument and slow drift. All of these latter effects (organics, particles, thermal effects and drift) tend to be smooth spectra that combine to form an absorption spectrum that is linear in wavelength over relatively short wavelength spans. If the light absorption spectrum is measured in the wavelength range around 217 to 240 nm (the exact range is a bit of a decision by the operator), then the nitrate concentration can be determined. Two different instruments based on the same optical principles are in use for this purpose. The In Situ Ultraviolet Spectrophotometer (ISUS) built at MBARI or at Satlantic has been mounted inside the pressure hull of a Teledyne/Webb Research APEX and NKE Provor profiling floats and the optics penetrate through the upper end cap into the water. The Satlantic Submersible Ultraviolet Nitrate Analyzer (SUNA) is placed on the outside of APEX, Provor, and Navis profiling floats in its own pressure housing and is connected to the float through an underwater cable that provides power and communications. Power, communications between the float controller and the sensor, and data processing requirements are essentially the same for both ISUS and SUNA. There are several possible algorithms that can be used for the deconvolution of nitrate concentration from the observed UV absorption spectrum (Johnson and Coletti, 2002; Arai et al., 2008; Sakamoto et al., 2009; Zielinski et al., 2011). In addition, the default algorithm that is available in Satlantic sensors is a proprietary approach, but this is not generally used on profiling floats. There are some tradeoffs in every approach. To date almost all nitrate sensors on profiling floats have used the Temperature Compensated Salinity Subtracted (TCSS) algorithm developed by Sakamoto et al. (2009), and this document focuses on that method. It is likely that there will be further algorithm development and it is necessary that the data systems clearly identify the algorithm that is used. It is also desirable that the data system allow for recalculation of prior data sets using new algorithms. To accomplish this, the float must report not just the computed nitrate, but the observed light intensity. Then, the rule to obtain only one NITRATE parameter is, if the spectrum is present then, the NITRATE should be recalculated from the spectrum while the computation of nitrate concentration can also generate useful diagnostics of data quality.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The CATARINA Leg1 cruise was carried out from June 22 to July 24 2012 on board the B/O Sarmiento de Gamboa, under the scientific supervision of Aida Rios (CSIC-IIM). It included the occurrence of the OVIDE hydrological section that was performed in June 2002, 2004, 2006, 2008 and 2010, as part of the CLIVAR program (name A25) ), and under the supervision of Herlé Mercier (CNRSLPO). This section begins near Lisbon (Portugal), runs through the West European Basin and the Iceland Basin, crosses the Reykjanes Ridge (300 miles north of Charlie-Gibbs Fracture Zone, and ends at Cape Hoppe (southeast tip of Greenland). The objective of this repeated hydrological section is to monitor the variability of water mass properties and main current transports in the basin, complementing the international observation array relevant for climate studies. In addition, the Labrador Sea was partly sampled (stations 101-108) between Greenland and Newfoundland, but heavy weather conditions prevented the achievement of the section south of 53°40’N. The quality of CTD data is essential to reach the first objective of the CATARINA project, i.e. to quantify the Meridional Overturning Circulation and water mass ventilation changes and their effect on the changes in the anthropogenic carbon ocean uptake and storage capacity. The CATARINA project was mainly funded by the Spanish Ministry of Sciences and Innovation and co-funded by the Fondo Europeo de Desarrollo Regional. The hydrological OVIDE section includes 95 surface-bottom stations from coast to coast, collecting profiles of temperature, salinity, oxygen and currents, spaced by 2 to 25 Nm depending on the steepness of the topography. The position of the stations closely follows that of OVIDE 2002. In addition, 8 stations were carried out in the Labrador Sea. From the 24 bottles closed at various depth at each stations, samples of sea water are used for salinity and oxygen calibration, and for measurements of biogeochemical components that are not reported here. The data were acquired with a Seabird CTD (SBE911+) and an SBE43 for the dissolved oxygen, belonging to the Spanish UTM group. The software SBE data processing was used after decoding and cleaning the raw data. Then, the LPO matlab toolbox was used to calibrate and bin the data as it was done for the previous OVIDE cruises, using on the one hand pre and post-cruise calibration results for the pressure and temperature sensors (done at Ifremer) and on the other hand the water samples of the 24 bottles of the rosette at each station for the salinity and dissolved oxygen data. A final accuracy of 0.002°C, 0.002 psu and 0.04 ml/l (2.3 umol/kg) was obtained on final profiles of temperature, salinity and dissolved oxygen, compatible with international requirements issued from the WOCE program.