947 results for Data pre-processing
Abstract:
Quantile computation has many applications, including data mining and financial data analysis. It has been shown that an ε-approximate summary can be maintained so that, given a quantile query d (φ, ε), the data item at rank [φN] may be approximately obtained within the rank error precision εN over all N data items in a data stream or in a sliding window. However, scalable online processing of massive continuous quantile queries with different φ and ε poses a new challenge because the summary is continuously updated with new arrivals of data items. In this paper, we first aim to dramatically reduce the number of distinct query results by grouping a set of different queries into a cluster so that they can be processed virtually as a single query while the precision requirements from users are retained. Second, we aim to minimize the total query processing costs. Efficient algorithms are developed to minimize the total number of times clusters are reprocessed and to produce the minimum number of clusters, respectively. The techniques are extended to maintain near-optimal clustering when queries are registered and removed in an arbitrary fashion against whole data streams or sliding windows. In addition to theoretical analysis, our performance study indicates that the proposed techniques are indeed scalable with respect to the number of input queries as well as the number of items and the item arrival rate in a data stream.
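The rank-error guarantee described in this abstract can be illustrated with a small check. This is only a sketch of the guarantee itself, not of the paper's summary structure or clustering algorithms, which the abstract does not specify. The function name `is_eps_approximate` is hypothetical, and the rounding of the target rank to the ceiling of φN is an assumption.

```python
import math
from bisect import bisect_left, bisect_right


def is_eps_approximate(data, answer, phi, eps):
    """Check whether `answer` is within rank error eps*N of the phi-quantile.

    Assumption: the target rank is ceil(phi * N), using 1-based ranks.
    """
    items = sorted(data)
    n = len(items)
    target = math.ceil(phi * n)
    # An item that occurs several times occupies a range of ranks [lo, hi].
    lo = bisect_left(items, answer) + 1
    hi = bisect_right(items, answer)
    if hi < lo:
        return False  # the answer is not one of the data items
    # Distance from the target rank to the nearest rank held by the answer.
    gap = max(lo - target, target - hi, 0)
    return gap <= eps * n


if __name__ == "__main__":
    stream = list(range(1, 101))  # N = 100 items with ranks 1..100
    print(is_eps_approximate(stream, 53, phi=0.5, eps=0.05))
    print(is_eps_approximate(stream, 60, phi=0.5, eps=0.05))
```

With N = 100, φ = 0.5 and ε = 0.05, any item whose rank lies within 5 positions of rank 50 is an acceptable answer, which is why queries with nearby φ values can plausibly share one result.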
Abstract:
Traditional vegetation mapping methods use high-cost, labour-intensive aerial photography interpretation. This approach can be subjective and is limited by factors such as the extent of remnant vegetation, and the differing scale and quality of aerial photography over time. An alternative approach is proposed which integrates a data model, a statistical model and an ecological model using sophisticated Geographic Information Systems (GIS) techniques and rule-based systems to support fine-scale vegetation community modelling. This approach is based on a more realistic representation of vegetation patterns with transitional gradients from one vegetation community to another. Arbitrary, though often unrealistic, sharp boundaries can be imposed on the model by the application of statistical methods. This GIS-integrated multivariate approach is applied to the problem of vegetation mapping in the complex vegetation communities of the Innisfail Lowlands in the Wet Tropics bioregion of Northeastern Australia. The paper presents the full cycle of this vegetation modelling approach, including sampling sites, variable selection, model selection, model implementation, internal model assessment, model prediction assessments, integration of discrete vegetation community models to generate a composite pre-clearing vegetation map, independent data set model validation and scale assessments of model predictions. An accurate pre-clearing vegetation map of the Innisfail Lowlands was generated (r² = 0.83) through GIS integration of 28 separate statistical models. This modelling approach has good potential for wider application, including provision of vital information for conservation planning and management; a scientific basis for rehabilitation of disturbed and cleared areas; and a viable method for the production of adequate vegetation maps for conservation and forestry planning of poorly studied areas. (c) 2006 Elsevier B.V. All rights reserved.
Abstract:
A progressive spatial query retrieves spatial data based on previous queries (e.g., to fetch data in a more restricted area with higher resolution). A direct query, on the other hand, is defined as an isolated window query. A multi-resolution spatial database system should support both progressive queries and traditional direct queries. It is conceptually challenging to support both types of query at the same time, as direct queries favour location-based data clustering, whereas progressive queries require fragmented data clustered by resolution. Two new scaleless data structures are proposed in this paper. Experimental results using both synthetic and real-world datasets demonstrate that the query processing time of the new multi-resolution approaches is comparable to, and often better than, that of multi-representation data structures for both types of queries.
Abstract:
Blood type determination is essential for safe blood transfusions. In emergency situations, the “universal donor” blood type is administered. However, this blood type can sometimes cause incompatibilities in the transfusion recipient. A mechatronic prototype was developed to solve this problem. The prototype was built to meet specific goals, incorporating all the necessary components. The obtained solution is close to the final system that will be produced later, at industrial scale, as a medical device. The prototype is a portable, low-cost device and can be used in remote locations. A previously developed computer application is used to operate the mechatronic prototype and to obtain test results automatically. It allows image acquisition, processing and analysis, based on Computer Vision algorithms, Machine Learning algorithms and deterministic algorithms. The Machine Learning algorithms enable classification of the occurrence, or lack, of agglutination in the mixture (blood/reagents), and provide a more reliable and safer methodology, as test data are stored in a database. The work developed allows the administration of a compatible blood type in emergency situations, avoiding disruption of “universal donor” blood type stocks and reducing the occurrence of human errors in transfusion practice.
Abstract:
Photonic technologies for data processing in the optical domain are expected to play a major role in future high-speed communications. Nonlinear effects in optical fibres have many attractive features and great, but not yet fully explored, potential for optical signal processing. Here we provide an overview of our recent advances in developing novel techniques and approaches to all-optical processing based on fibre nonlinearities.
Abstract:
Recent advances in technology have produced a significant increase in the availability of free sensor data over the Internet. With affordable weather monitoring stations now available to individual meteorology enthusiasts, a reservoir of real-time data such as temperature, rainfall and wind speed can now be obtained for most of the United States and Europe. Despite the abundance of available data, obtaining useable information about the weather in your local neighbourhood requires complex processing that poses several challenges. This paper discusses a collection of technologies and applications that harvest, refine and process this data, culminating in information that has been tailored toward the user. In this case we are particularly interested in allowing a user to make direct queries about the weather at any location, even when this is not directly instrumented, using interpolation methods. We also consider how the uncertainty that the interpolation introduces can then be communicated to the user of the system, using UncertML, a developing standard for uncertainty representation.
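The idea of querying the weather at a location without its own station can be sketched with a simple interpolation. The abstract does not say which interpolation method is used, so inverse distance weighting is swapped in here purely as an illustration. The function name `idw_estimate`, the planar (x, y) coordinates and the `power` parameter are all assumptions, and the sketch does not attempt to model the uncertainty or the UncertML encoding.

```python
import math


def idw_estimate(stations, x, y, power=2.0):
    """Estimate a value at (x, y) by inverse distance weighting.

    `stations` is an iterable of (x, y, value) tuples in planar coordinates
    (an assumption; real station data would use latitude and longitude).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in stations:
        distance = math.hypot(sx - x, sy - y)
        if distance == 0.0:
            return value  # the query point coincides with a station
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("at least one station is required")
    return weighted_sum / weight_total


if __name__ == "__main__":
    # Two hypothetical stations reporting temperature in degrees Celsius.
    readings = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
    print(idw_estimate(readings, 1.0, 0.0))
```

Nearer stations receive larger weights, so the estimate at a point midway between two stations is their average, while an estimate far from every station rests on weak evidence and is the kind of result whose uncertainty should be reported to the user.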
Abstract:
The possibility that developmental dyslexia results from low-level sensory processing deficits has received renewed interest in recent years. Opponents of such sensory-based explanations argue that dyslexia arises primarily from phonological impairments. However, many behavioural correlates of dyslexia cannot be explained sufficiently by cognitive-level accounts and there is anatomical, psychometric and physiological evidence of sensory deficits in the dyslexic population. This thesis aims to determine whether the low-level (pre-attentive) processing of simple auditory stimuli is disrupted in compensated adult dyslexics. Using psychometric and neurophysiological measures, the nature of auditory processing abnormalities is investigated. Group comparisons are supported by analysis of individual data in order to address the issue of heterogeneity in dyslexia. The participant pool consisted of seven compensated dyslexic adults and seven age- and IQ-matched controls. The dyslexic group was impaired, relative to the control group, on measures of literacy, phonological awareness, working memory and processing speed. Magnetoencephalographic recordings were conducted during processing of simple, non-speech, auditory stimuli. Results confirm that low-level auditory processing deficits are present in compensated dyslexic adults. The amplitude of N1m responses to tone pair stimuli was reduced in the dyslexic group. However, there was no evidence that manipulating either the silent interval or the frequency separation between the tones had a greater detrimental effect on dyslexic participants specifically. Abnormal MMNm responses were recorded in response to frequency deviant stimuli in the dyslexic group. In addition, complete stimulus omissions, which evoked MMNm responses in all control participants, failed to elicit significant MMNm responses in all but one of the dyslexic individuals.
The data indicate both a deficit of frequency resolution at a local level of auditory processing and a higher-level deficit relating to the grouping of auditory stimuli, relevant for auditory scene analysis. Implications and directions for future research are outlined.
Abstract:
We present the first experimental implementation of a recently designed quasi-lossless fiber span with strongly reduced signal power excursion. The resulting fiber waveguide medium can be advantageously used both in lightwave communications and in all-optical nonlinear data processing.
Abstract:
We present the first experimental implementation of a recently designed quasi-lossless fibre span with strongly reduced signal power excursion. The resulting fibre waveguide medium can be advantageously used both in lightwave communications and in all-optical nonlinear data processing.
Abstract:
Adults show great variation in their auditory skills, such as being able to discriminate between foreign speech-sounds. Previous research has demonstrated that structural features of auditory cortex can predict auditory abilities; here we are interested in the maturation of 2-Hz frequency-modulation (FM) detection, a task thought to tap into mechanisms underlying language abilities. We hypothesized that an individual's FM threshold will correlate with gray-matter density in left Heschl's gyrus, and that this function-structure relationship will change through adolescence. To test this hypothesis, we collected anatomical magnetic resonance imaging data from participants who were tested and scanned at three time points: at 10, 11.5 and 13 years of age. Participants judged which of two tones contained FM; the modulation depth was adjusted using an adaptive staircase procedure and their threshold was calculated based on the geometric mean of the last eight reversals. Using voxel-based morphometry, we found that FM threshold was significantly correlated with gray-matter density in left Heschl's gyrus at the age of 10 years, but that this correlation weakened with age. While there were no differences between girls and boys at Times 1 and 2, at Time 3 there was a relationship between FM threshold and gray-matter density in left Heschl's gyrus in boys but not in girls. Taken together, our results confirm that the structure of the auditory cortex can predict temporal processing abilities, namely that gray-matter density in left Heschl's gyrus can predict 2-Hz FM detection threshold. This ability is dependent on the processing of sounds changing over time, a skill believed necessary for speech processing. We tested this assumption and found that FM threshold significantly correlated with spelling abilities at Time 1, but that this correlation was found only in boys.
This correlation decreased at Time 2, and at Time 3 we found a significant correlation between reading and FM threshold, but again, only in boys. We examined the sex differences in both the imaging and behavioral data taking into account pubertal stages, and found that the correlation between FM threshold and spelling was strongest pre-pubertally, and the correlation between FM threshold and gray-matter density in left Heschl's gyrus was strongest mid-pubertally.
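The threshold rule mentioned in this abstract, the geometric mean of the last eight reversals of an adaptive staircase, can be sketched as follows. The abstract does not give the step rule or the units of modulation depth, so the track of depth values, the helper names `reversal_points` and `staircase_threshold`, and the definition of a reversal as a change in the direction of the track are assumptions made for illustration.

```python
import math


def reversal_points(track):
    """Return the values at which a staircase track changes direction."""
    reversals = []
    direction = 0  # +1 while rising, -1 while falling, 0 before any change
    for previous, current in zip(track, track[1:]):
        step = (current > previous) - (current < previous)
        if step == 0:
            continue  # a repeated level is not a change of direction
        if direction != 0 and step != direction:
            reversals.append(previous)
        direction = step
    return reversals


def staircase_threshold(reversals, n_last=8):
    """Geometric mean of the last `n_last` reversal values (all positive)."""
    last = reversals[-n_last:]
    if len(last) < n_last:
        raise ValueError("not enough reversals to estimate a threshold")
    return math.exp(sum(math.log(value) for value in last) / n_last)


if __name__ == "__main__":
    # A hypothetical track of modulation depths presented on successive trials.
    depths = [8, 4, 2, 4, 2, 1, 2]
    print(reversal_points(depths))
    print(staircase_threshold([4.0, 1.0] * 4))
```

A geometric rather than arithmetic mean is a natural choice when the staircase changes the stimulus level by multiplicative steps, because it averages the reversal values on a logarithmic scale.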
Abstract:
All-optical technologies for data processing and signal manipulation are expected to play a major role in future optical communications. Nonlinear phenomena occurring in optical fibre have many attractive features and great, but not yet fully exploited, potential in optical signal processing. Here, we give an overview of our recent results and advances in developing novel photonic techniques and approaches to all-optical processing based on fibre nonlinearities. Amongst other topics, we will discuss phase-preserving optical 2R regeneration, the possibility of using parabolic/flat-top pulses for optical signal processing and regeneration, and nonlinear optical pulse shaping. A method for passive nonlinear pulse shaping based on pulse pre-chirping and propagation in a normally dispersive fibre will be presented. The approach provides a simple way of generating various temporal waveforms of fundamental and practical interest. Particular emphasis will be given to the formation and characterization of pulses with a triangular intensity profile. A new technique of doubling/copying optical pulses in both the frequency and time domains using triangular-shaped pulses will also be introduced.