555 resultados para statistical distribution
Resumo:
Perhaps the most innovative of all independent OLD ventures specialising in ROW content is Jaman. Founded in 2007 by IT entrepreneur Gaurav Dhillon, and based in San Mateo, California, Jaman is a quality specialist distributor of non-Hollywood films. As of late 2010, Jaman had 1.8 million registered users and attracts viewers from most countries in the world. 75% of all use is generated from outside the U.S. Jaman does very well in English speaking parts of the world, particularly current and former Commonwealth countries. The United Kingdom accounts for 29% of users, North America (U.S. and Canada) 26%, and India represents 23%. Jaman is sometimes referred to as ‘social cinema’: a website which brings together the critique and review of a cinephile website (the forums of Rue-morgue.com for cinefantastique movie fans for example) with the social interaction, community and functionality of a social media site (for example Facebook.com). Jaman could be considered a pioneer in this space; a first mover in wrapping commercial movie downloading in an interactive social experience.
Resumo:
A classical condition for fast learning rates is the margin condition, first introduced by Mammen and Tsybakov. We tackle in this paper the problem of adaptivity to this condition in the context of model selection, in a general learning framework. Actually, we consider a weaker version of this condition that allows one to take into account that learning within a small model can be much easier than within a large one. Requiring this “strong margin adaptivity” makes the model selection problem more challenging. We first prove, in a general framework, that some penalization procedures (including local Rademacher complexities) exhibit this adaptivity when the models are nested. Contrary to previous results, this holds with penalties that only depend on the data. Our second main result is that strong margin adaptivity is not always possible when the models are not nested: for every model selection procedure (even a randomized one), there is a problem for which it does not demonstrate strong margin adaptivity.
Resumo:
The measurement error model is a well established statistical method for regression problems in medical sciences, although rarely used in ecological studies. While the situations in which it is appropriate may be less common in ecology, there are instances in which there may be benefits in its use for prediction and estimation of parameters of interest. We have chosen to explore this topic using a conditional independence model in a Bayesian framework using a Gibbs sampler, as this gives a great deal of flexibility, allowing us to analyse a number of different models without losing generality. Using simulations and two examples, we show how the conditional independence model can be used in ecology, and when it is appropriate.
Resumo:
Multivariate volatility forecasts are an important input in many financial applications, in particular portfolio optimisation problems. Given the number of models available and the range of loss functions to discriminate between them, it is obvious that selecting the optimal forecasting model is challenging. The aim of this thesis is to thoroughly investigate how effective many commonly used statistical (MSE and QLIKE) and economic (portfolio variance and portfolio utility) loss functions are at discriminating between competing multivariate volatility forecasts. An analytical investigation of the loss functions is performed to determine whether they identify the correct forecast as the best forecast. This is followed by an extensive simulation study examines the ability of the loss functions to consistently rank forecasts, and their statistical power within tests of predictive ability. For the tests of predictive ability, the model confidence set (MCS) approach of Hansen, Lunde and Nason (2003, 2011) is employed. As well, an empirical study investigates whether simulation findings hold in a realistic setting. In light of these earlier studies, a major empirical study seeks to identify the set of superior multivariate volatility forecasting models from 43 models that use either daily squared returns or realised volatility to generate forecasts. This study also assesses how the choice of volatility proxy affects the ability of the statistical loss functions to discriminate between forecasts. Analysis of the loss functions shows that QLIKE, MSE and portfolio variance can discriminate between multivariate volatility forecasts, while portfolio utility cannot. An examination of the effective loss functions shows that they all can identify the correct forecast at a point in time, however, their ability to discriminate between competing forecasts does vary. That is, QLIKE is identified as the most effective loss function, followed by portfolio variance which is then followed by MSE. The major empirical analysis reports that the optimal set of multivariate volatility forecasting models includes forecasts generated from daily squared returns and realised volatility. Furthermore, it finds that the volatility proxy affects the statistical loss functions’ ability to discriminate between forecasts in tests of predictive ability. These findings deepen our understanding of how to choose between competing multivariate volatility forecasts.
Resumo:
The CDKN2 gene, encoding the cyclin-dependent kinase inhibitor p16, is a tumour suppressor gene that maps to chromosome band 9p21-p22. The most common mechanism of inactivation of this gene in human cancers is through homozygous deletion; however, in a smaller proportion of tumours and tumour cell lines intragenic mutations occur. In this study we have compiled a database of over 120 published point mutations in the CDKN2 gene from a wide variety of tumour types. A further 50 deletions, insertions, and splice mutations in CDKN2 have also been compiled. Furthermore, we have standardised the numbering of all mutations according to the full-length 156 amino acid form of p16. From this study we are able to define several hot spots, some of which occur at conserved residues within the ankyrin domains of p16. While many of the hotspots are shared by a number of cancers, the relative importance of each position varies, possibly reflecting the role of different carcinogens in the development of certain tumours. As reported previously, the mutational spectrum of CDKN2 in melanomas differs from that of internal malignancies and supports the involvement of UV in melanoma tumorigenesis. Notably, 52% of all substitutions in melanoma-derived samples occurred at just six nucleotide positions. Nonsense mutations comprise a comparatively high proportion of mutations present in the CDKN2 gene, and possible explanations for this are discussed.
Resumo:
This paper employs the industry of origin approach to compare value added and productivity of Singapore and Hong Kong's Distribution Trade Sector for the period 2001-2008. The direct comparison between these two economies was motivated by the statements of the Singapore government: Its services sector, especially in Retail Trade, lags behind Hong Kong's productivity levels. The results show that since 2005, Singapore's Distribution performance in terms of labour productivity was below Hong Kong's level, which was largely due to poor performance in its Retail Trade sector arising from an influx of foreign workers. Results from total factor productivity (TFP) between these two economies also suggest that Hong Kong's better performance (since 2005) was largely due to its ability to employ more educated and trained workers with limited use of capital. The results suggest that polices that worked in Hong Kong may not work for Singapore because its population is more diverse which poses a challenge to policy-makers in raising its productivity level.
Resumo:
This thesis investigates profiling and differentiating customers through the use of statistical data mining techniques. The business application of our work centres on examining individuals’ seldomly studied yet critical consumption behaviour over an extensive time period within the context of the wireless telecommunication industry; consumption behaviour (as oppose to purchasing behaviour) is behaviour that has been performed so frequently that it become habitual and involves minimal intentions or decision making. Key variables investigated are the activity initialised timestamp and cell tower location as well as the activity type and usage quantity (e.g., voice call with duration in seconds); and the research focuses are on customers’ spatial and temporal usage behaviour. The main methodological emphasis is on the development of clustering models based on Gaussian mixture models (GMMs) which are fitted with the use of the recently developed variational Bayesian (VB) method. VB is an efficient deterministic alternative to the popular but computationally demandingMarkov chainMonte Carlo (MCMC) methods. The standard VBGMMalgorithm is extended by allowing component splitting such that it is robust to initial parameter choices and can automatically and efficiently determine the number of components. The new algorithm we propose allows more effective modelling of individuals’ highly heterogeneous and spiky spatial usage behaviour, or more generally human mobility patterns; the term spiky describes data patterns with large areas of low probability mixed with small areas of high probability. Customers are then characterised and segmented based on the fitted GMM which corresponds to how each of them uses the products/services spatially in their daily lives; this is essentially their likely lifestyle and occupational traits. Other significant research contributions include fitting GMMs using VB to circular data i.e., the temporal usage behaviour, and developing clustering algorithms suitable for high dimensional data based on the use of VB-GMM.
Resumo:
It is important to promote a sustainable development approach to ensure that economic, environmental and social developments are maintained in balance. Sustainable development and its implications are not just a global concern, it also affects Australia. In particular, rural Australian communities are facing various economic, environmental and social challenges. Thus, the need for sustainable development in rural regions is becoming increasingly important. To promote sustainable development, proper frameworks along with the associated tools optimised for the specific regions, need to be developed. This will ensure that the decisions made for sustainable development are evidence based, instead of subjective opinions. To address these issues, Queensland University of Technology (QUT), through an Australian Research Council (ARC) linkage grant, has initiated research into the development of a Rural Statistical Sustainability Framework (RSSF) to aid sustainable decision making in rural Queensland. This particular branch of the research developed a decision support tool that will become the integrating component of the RSSF. This tool is developed on the web-based platform to allow easy dissemination, quick maintenance and to minimise compatibility issues. The tool is developed based on MapGuide Open Source and it follows the three-tier architecture: Client tier, Web tier and the Server tier. The developed tool is interactive and behaves similar to a familiar desktop-based application. It has the capability to handle and display vector-based spatial data and can give further visual outputs using charts and tables. The data used in this tool is obtained from the QUT research team. Overall the tool implements four tasks to help in the decision-making process. These are the Locality Classification, Trend Display, Impact Assessment and Data Entry and Update. The developed tool utilises open source and freely available software and accounts for easy extensibility and long-term sustainability.
Resumo:
Determination of the placement and rating of transformers and feeders are the main objective of the basic distribution network planning. The bus voltage and the feeder current are two constraints which should be maintained within their standard range. The distribution network planning is hardened when the planning area is located far from the sources of power generation and the infrastructure. This is mainly as a consequence of the voltage drop, line loss and system reliability. Long distance to supply loads causes a significant amount of voltage drop across the distribution lines. Capacitors and Voltage Regulators (VRs) can be installed to decrease the voltage drop. This long distance also increases the probability of occurrence of a failure. This high probability leads the network reliability to be low. Cross-Connections (CC) and Distributed Generators (DGs) are devices which can be employed for improving system reliability. Another main factor which should be considered in planning of distribution networks (in both rural and urban areas) is load growth. For supporting this factor, transformers and feeders are conventionally upgraded which applies a large cost. Installation of DGs and capacitors in a distribution network can alleviate this issue while the other benefits are gained. In this research, a comprehensive planning is presented for the distribution networks. Since the distribution network is composed of low and medium voltage networks, both are included in this procedure. However, the main focus of this research is on the medium voltage network planning. The main objective is to minimize the investment cost, the line loss, and the reliability indices for a study timeframe and to support load growth. The investment cost is related to the distribution network elements such as the transformers, feeders, capacitors, VRs, CCs, and DGs. The voltage drop and the feeder current as the constraints are maintained within their standard range. In addition to minimizing the reliability and line loss costs, the planned network should support a continual growth of loads, which is an essential concern in planning distribution networks. In this thesis, a novel segmentation-based strategy is proposed for including this factor. Using this strategy, the computation time is significantly reduced compared with the exhaustive search method as the accuracy is still acceptable. In addition to being applicable for considering the load growth, this strategy is appropriate for inclusion of practical load characteristic (dynamic), as demonstrated in this thesis. The allocation and sizing problem has a discrete nature with several local minima. This highlights the importance of selecting a proper optimization method. Modified discrete particle swarm optimization as a heuristic method is introduced in this research to solve this complex planning problem. Discrete nonlinear programming and genetic algorithm as an analytical and a heuristic method respectively are also applied to this problem to evaluate the proposed optimization method.
Resumo:
In this paper a new graph-theory and improved genetic algorithm based practical method is employed to solve the optimal sectionalizer switch placement problem. The proposed method determines the best locations of sectionalizer switching devices in distribution networks considering the effects of presence of distributed generation (DG) in fitness functions and other optimization constraints, providing the maximum number of costumers to be supplied by distributed generation sources in islanded distribution systems after possible faults. The proposed method is simulated and tested on several distribution test systems in both cases of with DG and non DG situations. The results of the simulations validate the proposed method for switch placement of the distribution network in the presence of distributed generation.
Resumo:
Vehicle emitted particles are of significant concern based on their potential to influence local air quality and human health. Transport microenvironments usually contain higher vehicle emission concentrations compared to other environments, and people spend a substantial amount of time in these microenvironments when commuting. Currently there is limited scientific knowledge on particle concentration, passenger exposure and the distribution of vehicle emissions in transport microenvironments, partially due to the fact that the instrumentation required to conduct such measurements is not available in many research centres. Information on passenger waiting time and location in such microenvironments has also not been investigated, which makes it difficult to evaluate a passenger’s spatial-temporal exposure to vehicle emissions. Furthermore, current emission models are incapable of rapidly predicting emission distribution, given the complexity of variations in emission rates that result from changes in driving conditions, as well as the time spent in driving condition within the transport microenvironment. In order to address these scientific gaps in knowledge, this work conducted, for the first time, a comprehensive statistical analysis of experimental data, along with multi-parameter assessment, exposure evaluation and comparison, and emission model development and application, in relation to traffic interrupted transport microenvironments. The work aimed to quantify and characterise particle emissions and human exposure in the transport microenvironments, with bus stations and a pedestrian crossing identified as suitable research locations representing a typical transport microenvironment. Firstly, two bus stations in Brisbane, Australia, with different designs, were selected to conduct measurements of particle number size distributions, particle number and PM2.5 concentrations during two different seasons. Simultaneous traffic and meteorological parameters were also monitored, aiming to quantify particle characteristics and investigate the impact of bus flow rate, station design and meteorological conditions on particle characteristics at stations. The results showed higher concentrations of PN20-30 at the station situated in an open area (open station), which is likely to be attributed to the lower average daily temperature compared to the station with a canyon structure (canyon station). During precipitation events, it was found that particle number concentration in the size range 25-250 nm decreased greatly, and that the average daily reduction in PM2.5 concentration on rainy days compared to fine days was 44.2 % and 22.6 % at the open and canyon station, respectively. The effect of ambient wind speeds on particle number concentrations was also examined, and no relationship was found between particle number concentration and wind speed for the entire measurement period. In addition, 33 pairs of average half-hourly PN7-3000 concentrations were calculated and identified at the two stations, during the same time of a day, and with the same ambient wind speeds and precipitation conditions. The results of a paired t-test showed that the average half-hourly PN7-3000 concentrations at the two stations were not significantly different at the 5% confidence level (t = 0.06, p = 0.96), which indicates that the different station designs were not a crucial factor for influencing PN7-3000 concentrations. A further assessment of passenger exposure to bus emissions on a platform was evaluated at another bus station in Brisbane, Australia. The sampling was conducted over seven weekdays to investigate spatial-temporal variations in size-fractionated particle number and PM2.5 concentrations, as well as human exposure on the platform. For the whole day, the average PN13-800 concentration was 1.3 x 104 and 1.0 x 104 particle/cm3 at the centre and end of the platform, respectively, of which PN50-100 accounted for the largest proportion to the total count. Furthermore, the contribution of exposure at the bus station to the overall daily exposure was assessed using two assumed scenarios of a school student and an office worker. It was found that, although the daily time fraction (the percentage of time spend at a location in a whole day) at the station was only 0.8 %, the daily exposure fractions (the percentage of exposures at a location accounting for the daily exposure) at the station were 2.7% and 2.8 % for exposure to PN13-800 and 2.7% and 3.5% for exposure to PM2.5 for the school student and the office worker, respectively. A new parameter, “exposure intensity” (the ratio of daily exposure fraction and the daily time fraction) was also defined and calculated at the station, with values of 3.3 and 3.4 for exposure to PN13-880, and 3.3 and 4.2 for exposure to PM2.5, for the school student and the office worker, respectively. In order to quantify the enhanced emissions at critical locations and define the emission distribution in further dispersion models for traffic interrupted transport microenvironments, a composite line source emission (CLSE) model was developed to specifically quantify exposure levels and describe the spatial variability of vehicle emissions in traffic interrupted microenvironments. This model took into account the complexity of vehicle movements in the queue, as well as different emission rates relevant to various driving conditions (cruise, decelerate, idle and accelerate), and it utilised multi-representative segments to capture the accurate emission distribution for real vehicle flow. This model does not only helped to quantify the enhanced emissions at critical locations, but it also helped to define the emission source distribution of the disrupted steady flow for further dispersion modelling. The model then was applied to estimate particle number emissions at a bidirectional bus station used by diesel and compressed natural gas fuelled buses. It was found that the acceleration distance was of critical importance when estimating particle number emission, since the highest emissions occurred in sections where most of the buses were accelerating and no significant increases were observed at locations where they idled. It was also shown that emissions at the front end of the platform were 43 times greater than at the rear of the platform. The CLSE model was also applied at a signalled pedestrian crossing, in order to assess increased particle number emissions from motor vehicles when forced to stop and accelerate from rest. The CLSE model was used to calculate the total emissions produced by a specific number and mix of light petrol cars and diesel passenger buses including 1 car travelling in 1 direction (/1 direction), 14 cars / 1 direction, 1 bus / 1 direction, 28 cars / 2 directions, 24 cars and 2 buses / 2 directions, and 20 cars and 4 buses / 2 directions. It was found that the total emissions produced during stopping on a red signal were significantly higher than when the traffic moved at a steady speed. Overall, total emissions due to the interruption of the traffic increased by a factor of 13, 11, 45, 11, 41, and 43 for the above 6 cases, respectively. In summary, this PhD thesis presents the results of a comprehensive study on particle number and mass concentration, together with particle size distribution, in a bus station transport microenvironment, influenced by bus flow rates, meteorological conditions and station design. Passenger spatial-temporal exposure to bus emitted particles was also assessed according to waiting time and location along the platform, as well as the contribution of exposure at the bus station to overall daily exposure. Due to the complexity of the interrupted traffic flow within the transport microenvironments, a unique CLSE model was also developed, which is capable of quantifying emission levels at critical locations within the transport microenvironment, for the purpose of evaluating passenger exposure and conducting simulations of vehicle emission dispersion. The application of the CLSE model at a pedestrian crossing also proved its applicability and simplicity for use in a real-world transport microenvironment.
Resumo:
Background: Rapid weight gain in infancy is an important predictor of obesity in later childhood. Our aim was to determine which modifiable variables are associated with rapid weight gain in early life. Methods: Subjects were healthy infants enrolled in NOURISH, a randomised, controlled trial evaluating an intervention to promote positive early feeding practices. This analysis used the birth and baseline data for NOURISH. Birthweight was collected from hospital records and infants were also weighed at baseline assessment when they were aged 4-7 months and before randomisation. Infant feeding practices and demographic variables were collected from the mother using a self administered questionnaire. Rapid weight gain was defined as an increase in weight-for-age Z-score (using WHO standards) above 0.67 SD from birth to baseline assessment, which is interpreted clinically as crossing centile lines on a growth chart. Variables associated with rapid weight gain were evaluated using a multivariable logistic regression model. Results: Complete data were available for 612 infants (88% of the total sample recruited) with a mean (SD) age of 4.3 (1.0) months at baseline assessment. After adjusting for mother's age, smoking in pregnancy, BMI, and education and infant birthweight, age, gender and introduction of solid foods, the only two modifiable factors associated with rapid weight gain to attain statistical significance were formula feeding [OR=1.72 (95%CI 1.01-2.94), P= 0.047] and feeding on schedule [OR=2.29 (95%CI 1.14-4.61), P=0.020]. Male gender and lower birthweight were non-modifiable factors associated with rapid weight gain. Conclusions: This analysis supports the contention that there is an association between formula feeding, feeding to schedule and weight gain in the first months of life. Mechanisms may include the actual content of formula milk (e.g. higher protein intake) or differences in feeding styles, such as feeding to schedule, which increase the risk of overfeeding. Trial Registration: Australian Clinical Trials Registry ACTRN12608000056392
Resumo:
Protection of a distribution network in the presence of distributed generators (DGs) using overcurrent relays is a challenging task due to the changes in fault current levels and reverse power flow. Specifically, in the presence of current limited converter interfaced DGs, overcurrent relays may fail to isolate the faulted section either in grid connected or islanded mode of operation. In this paper, a new inverse type relay is presented to protect a distribution network, which may have several DG connections. The new relay characteristic is designed based on the measured admittance of the protected line. The relay is capable of detecting faults under changing fault current levels. The relay performance is evaluated using PSCAD simulation and laboratory experiments.
Resumo:
The potential of distributed reactive power control to improve the voltage profile of radial distribution feeders has been reported in literature by few authors. However, the multiple inverters injecting or absorbing reactive power across a distribution feeder may introduce control interactions and/or voltage instability. Such controller interactions can be alleviated if the inverters are allowed to operate on voltage droop. First, the paper demonstrates that a linear shallow droop line can maintain the steady state voltage profile close to reference, up to a certain level of loading. Then, impacts of the shallow droop line control on line losses and line power factors are examined. Finally, a piecewise linear droop line which can achieve reduced line losses and close to unity power factor at the feeder source is proposed.