784 resultados para statistical lip modelling
An approach to statistical lip modelling for speaker identification via chromatic feature extraction
Resumo:
This paper presents a novel technique for the tracking of moving lips for the purpose of speaker identification. In our system, a model of the lip contour is formed directly from chromatic information in the lip region. Iterative refinement of contour point estimates is not required. Colour features are extracted from the lips via concatenated profiles taken around the lip contour. Reduction of order in lip features is obtained via principal component analysis (PCA) followed by linear discriminant analysis (LDA). Statistical speaker models are built from the lip features based on the Gaussian mixture model (GMM). Identification experiments performed on the M2VTS1 database, show encouraging results
Resumo:
This thesis targets on a challenging issue that is to enhance users' experience over massive and overloaded web information. The novel pattern-based topic model proposed in this thesis can generate high-quality multi-topic user interest models technically by incorporating statistical topic modelling and pattern mining. We have successfully applied the pattern-based topic model to both fields of information filtering and information retrieval. The success of the proposed model in finding the most relevant information to users mainly comes from its precisely semantic representations to represent documents and also accurate classification of the topics at both document level and collection level.
Resumo:
Current guidelines on clear zone selection and roadside hazard management adopt the US approach based on the likelihood of roadside encroachment by drivers. This approach is based on the available research conducted in the 1960s and 70s. Over time, questions have been raised regarding the robustness and applicability of this research in Australasia in 2010 and in the Safe System context. This paper presents a review of the fundamental research relating to selection of clear zones. Results of extensive rural highway statistical data modelling suggest that a significant proportion of run-off-road to the left casualty crashes occurs in clear zones exceeding 13 m. They also show that the risk of run-off-road to the left casualty crashes was 21% lower where clear zones exceeded 8 m when compared with clear zones in the 4 – 8 m range. The paper discusses a possible approach to selection of clear zones based on managing crash outcomes, rather than on the likelihood of roadside encroachment which is the basis for the current practice. It is expected that this approach would encourage selection of clear zones wider than 8 m when the combination of other road features suggests higher than average casualty crash risk.
Resumo:
This research constructed a readability measurement for French speakers who view English as a second language. It identified the true cognates, which are the similar words from these two languages, as an indicator of the difficulty of an English text for French people. A multilingual lexical resource is used to detect true cognates in text, and Statistical Language Modelling to predict the predict the readability level. The proposed enhanced statistical language model is making a step in the right direction by improving the accuracy of readability predictions for French speakers by up to 10% compared to state of the art approaches. The outcome of this study could accelerate the learning process for French speakers who are studying English. More importantly, this study also benefits the readability estimation research community, presenting an approach and evaluation at sentence level as well as innovating with the use of cognates as a new text feature.
Resumo:
The recent development of indoor wireless local area network (WLAN) standards at 2.45 GHz and 5 GHz has led to increased interest in propagation studies at these frequency bands. Within the indoor environment, human body effects can strongly reduce the quality of wireless communication systems. Human body effects can cause temporal variations and shadowing due to pedestrian movement and antenna- body interaction with portable terminals. This book presents a statistical characterisation, based on measurements, of human body effects on indoor narrowband channels at 2.45 GHz and at 5.2 GHz. A novel cumulative distribution function (CDF) that models the 5 GHz narrowband channel in populated indoor environments is proposed. This novel CDF describes the received envelope in terms of pedestrian traffic. In addition, a novel channel model for the populated indoor environment is proposed for the Multiple-Input Multiple-Output (MIMO) narrowband channel in presence of pedestrians at 2.45 GHz. Results suggest that practical MIMO systems must be sufficiently adaptive if they are to benefit from the capacity enhancement caused by pedestrian movement.
Resumo:
In a seminal data mining article, Leo Breiman [1] argued that to develop effective predictive classification and regression models, we need to move away from the sole dependency on statistical algorithms and embrace a wider toolkit of modeling algorithms that include data mining procedures. Nevertheless, many researchers still rely solely on statistical procedures when undertaking data modeling tasks; the sole reliance on these procedures has lead to the development of irrelevant theory and questionable research conclusions ([1], p.199). We will outline initiatives that the HPC & Research Support group is undertaking to engage researchers with data mining tools and techniques; including a new range of seminars, workshops, and one-on-one consultations covering data mining algorithms, the relationship between data mining and the research cycle, and limitations and problems with these new algorithms. Organisational limitations and restrictions to these initiatives are also discussed.
Resumo:
Concerns regarding groundwater contamination with nitrate and the long-term sustainability of groundwater resources have prompted the development of a multi-layered three dimensional (3D) geological model to characterise the aquifer geometry of the Wairau Plain, Marlborough District, New Zealand. The 3D geological model which consists of eight litho-stratigraphic units has been subsequently used to synthesise hydrogeological and hydrogeochemical data for different aquifers in an approach that aims to demonstrate how integration of water chemistry data within the physical framework of a 3D geological model can help to better understand and conceptualise groundwater systems in complex geological settings. Multivariate statistical techniques(e.g. Principal Component Analysis and Hierarchical Cluster Analysis) were applied to groundwater chemistry data to identify hydrochemical facies which are characteristic of distinct evolutionary pathways and a common hydrologic history of groundwaters. Principal Component Analysis on hydrochemical data demonstrated that natural water-rock interactions, redox potential and human agricultural impact are the key controls of groundwater quality in the Wairau Plain. Hierarchical Cluster Analysis revealed distinct hydrochemical water quality groups in the Wairau Plain groundwater system. Visualisation of the results of the multivariate statistical analyses and distribution of groundwater nitrate concentrations in the context of aquifer lithology highlighted the link between groundwater chemistry and the lithology of host aquifers. The methodology followed in this study can be applied in a variety of hydrogeological settings to synthesise geological, hydrogeological and hydrochemical data and present them in a format readily understood by a wide range of stakeholders. This enables a more efficient communication of the results of scientific studies to the wider community.
Resumo:
A wireless sensor network system must have the ability to tolerate harsh environmental conditions and reduce communication failures. In a typical outdoor situation, the presence of wind can introduce movement in the foliage. This motion of vegetation structures causes large and rapid signal fading in the communication link and must be accounted for when deploying a wireless sensor network system in such conditions. This thesis examines the fading characteristics experienced by wireless sensor nodes due to the effect of varying wind speed in a foliage obstructed transmission path. It presents extensive measurement campaigns at two locations with the approach of a typical wireless sensor networks configuration. The significance of this research lies in the varied approaches of its different experiments, involving a variety of vegetation types, scenarios and the use of different polarisations (vertical and horizontal). Non–line of sight (NLoS) scenario conditions investigate the wind effect based on different vegetation densities including that of the Acacia tree, Dogbane tree and tall grass. Whereas the line of sight (LoS) scenario investigates the effect of wind when the grass is swaying and affecting the ground-reflected component of the signal. Vegetation type and scenarios are envisaged to simulate real life working conditions of wireless sensor network systems in outdoor foliated environments. The results from the measurements are presented in statistical models involving first and second order statistics. We found that in most of the cases, the fading amplitude could be approximated by both Lognormal and Nakagami distribution, whose m parameter was found to depend on received power fluctuations. Lognormal distribution is known as the result of slow fading characteristics due to shadowing. This study concludes that fading caused by variations in received power due to wind in wireless sensor networks systems are found to be insignificant. There is no notable difference in Nakagami m values for low, calm, and windy wind speed categories. It is also shown in the second order analysis, the duration of the deep fades are very short, 0.1 second for 10 dB attenuation below RMS level for vertical polarization and 0.01 second for 10 dB attenuation below RMS level for horizontal polarization. Another key finding is that the received signal strength for horizontal polarisation demonstrates more than 3 dB better performances than the vertical polarisation for LoS and near LoS (thin vegetation) conditions and up to 10 dB better for denser vegetation conditions.
Resumo:
The Clarence-Moreton Basin (CMB) covers approximately 26000 km2 and is the only sub-basin of the Great Artesian Basin (GAB) in which there is flow to both the south-west and the east, although flow to the south-west is predominant. In many parts of the basin, including catchments of the Bremer, Logan and upper Condamine Rivers in southeast Queensland, the Walloon Coal Measures are under exploration for Coal Seam Gas (CSG). In order to assess spatial variations in groundwater flow and hydrochemistry at a basin-wide scale, a 3D hydrogeological model of the Queensland section of the CMB has been developed using GoCAD modelling software. Prior to any large-scale CSG extraction, it is essential to understand the existing hydrochemical character of the different aquifers and to establish any potential linkage. To effectively use the large amount of water chemistry data existing for assessment of hydrochemical evolution within the different lithostratigraphic units, multivariate statistical techniques were employed.
Resumo:
Electricity network investment and asset management require accurate estimation of future demand in energy consumption within specified service areas. For this purpose, simple models are typically developed to predict future trends in electricity consumption using various methods and assumptions. This paper presents a statistical model to predict electricity consumption in the residential sector at the Census Collection District (CCD) level over the state of New South Wales, Australia, based on spatial building and household characteristics. Residential household demographic and building data from the Australian Bureau of Statistics (ABS) and actual electricity consumption data from electricity companies are merged for 74 % of the 12,000 CCDs in the state. Eighty percent of the merged dataset is randomly set aside to establish the model using regression analysis, and the remaining 20 % is used to independently test the accuracy of model prediction against actual consumption. In 90 % of the cases, the predicted consumption is shown to be within 5 kWh per dwelling per day from actual values, with an overall state accuracy of -1.15 %. Given a future scenario with a shift in climate zone and a growth in population, the model is used to identify the geographical or service areas that are most likely to have increased electricity consumption. Such geographical representation can be of great benefit when assessing alternatives to the centralised generation of energy; having such a model gives a quantifiable method to selecting the 'most' appropriate system when a review or upgrade of the network infrastructure is required.
Resumo:
For clinical use, in electrocardiogram (ECG) signal analysis it is important to detect not only the centre of the P wave, the QRS complex and the T wave, but also the time intervals, such as the ST segment. Much research focused entirely on qrs complex detection, via methods such as wavelet transforms, spline fitting and neural networks. However, drawbacks include the false classification of a severe noise spike as a QRS complex, possibly requiring manual editing, or the omission of information contained in other regions of the ECG signal. While some attempts were made to develop algorithms to detect additional signal characteristics, such as P and T waves, the reported success rates are subject to change from person-to-person and beat-to-beat. To address this variability we propose the use of Markov-chain Monte Carlo statistical modelling to extract the key features of an ECG signal and we report on a feasibility study to investigate the utility of the approach. The modelling approach is examined with reference to a realistic computer generated ECG signal, where details such as wave morphology and noise levels are variable.
Resumo:
This chapter addresses data modelling as a means of promoting statistical literacy in the early grades. Consideration is first given to the importance of increasing young children’s exposure to statistical reasoning experiences and how data modelling can be a rich means of doing so. Selected components of data modelling are then reviewed, followed by a report on some findings from the third-year of a three-year longitudinal study across grades one through three.