122 results for Academic performance prediction
Abstract:
Real-world data mining applications generally do not end with the creation of models. Using the model is the final purpose, especially in prediction tasks. A problem arises when the model is built from much more information than the user can provide when using it. As a result, the performance of the model drops drastically because many attribute values are missing. This paper develops a new learning system framework, called the User Query Based Learning System (UQBLS), for building data mining models best suited to users' needs. We demonstrate its deployment in a real-world application: the lifetime prediction of metallic components in buildings.
Abstract:
Research Background - Young people with negative experiences of mainstream education often display low levels of traditional academic achievement. These young people tend to display considerable cultural and social resources developed through their repeated experiences of adversity. Education research has a duty to provide these young people with opportunities to showcase, assess and translate their social and cultural resources into symbolic forms of capital. This creative work addresses the research question: how can educators maximise the social and cultural capital they help young people acquire through live music performances and studio recordings? Research Contribution - This live music performance, built on the existing artistic reputations of the artists, saw the lads support one of their local heroes from the Brisbane Hip Hop music scene. In doing so they showcased what their three years of concerted musical engagement can achieve within supportive flexible learning environments. The new knowledge derived from this research focuses on the academic and self-confidence benefits for disengaged young people using festival performances as authentic learning activities. Research Significance - This research is significant because it aims to maximise the number of tangible outcomes related to a school-based arts project. The young participants gained technical, artistic, social and commercial status during this project. Individual performances were distributed and downloaded via Creative Commons licences at the Australian Creative Resource Archive. This performance also contributed to their certified qualifications and acted as pilot research data for two competitively funded ARC grants (DP0209421 & LP0883643).
Abstract:
Harmful Algal Blooms (HABs) are a worldwide problem that has been increasing in frequency and extent over the past several decades. HABs severely damage aquatic ecosystems by destroying benthic habitat, reducing invertebrate and fish populations and affecting larger species such as dugong that rely on seagrasses for food. Few statistical models for predicting HAB occurrences have been developed and, in common with most predictive models in ecology, those that have been developed do not fully account for uncertainties in parameters and model structure. This makes management decisions based on these predictions more risky than might be supposed. We used a probit time series model and Bayesian Model Averaging (BMA) to predict occurrences of blooms of Lyngbya majuscula, a toxic cyanophyte, in Deception Bay, Queensland, Australia. We found a suite of useful predictors for HAB occurrence, with temperature figuring prominently in the models holding the majority of posterior support; a model consisting of the single covariate average monthly minimum temperature showed by far the greatest posterior support. Two alternative model averaging strategies were compared: one using the full posterior distribution, and a simpler approach that used the majority of the posterior distribution for predictions but with vastly fewer models. Both BMA approaches showed excellent predictive performance with little difference in their predictive capacity. Applications of BMA are still rare in ecology, particularly in management settings. This study demonstrates the power of BMA as an important management tool that is capable of high predictive performance while fully accounting for both parameter and model uncertainty.
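The two averaging strategies compared in this abstract can be illustrated with a minimal sketch. It assumes posterior model probabilities are approximated from BIC values, which is a common approximation and not necessarily the procedure used in the study. The function names, BIC values and per-model bloom probabilities are all hypothetical.

```python
import numpy as np

def bma_weights(bic):
    """Approximate posterior model probabilities from BIC values (assumption)."""
    bic = np.asarray(bic, dtype=float)
    delta = bic - bic.min()            # subtract the minimum for numerical stability
    w = np.exp(-0.5 * delta)
    return w / w.sum()

def bma_predict(weights, probs, mass=1.0):
    """Average per-model predicted probabilities.

    With mass < 1, only the highest-weight models whose cumulative
    weight first reaches `mass` are kept, and their weights renormalised.
    Returns the averaged prediction and the number of models used.
    """
    weights = np.asarray(weights, dtype=float)
    probs = np.asarray(probs, dtype=float)
    order = np.argsort(weights)[::-1]                  # best model first
    cum = np.cumsum(weights[order])
    n_keep = int(np.searchsorted(cum, mass - 1e-12)) + 1
    keep = order[:n_keep]
    w = weights[keep] / weights[keep].sum()
    return float(w @ probs[keep]), len(keep)

# Hypothetical BIC values and predicted bloom probabilities for four models
bic = [100.0, 102.0, 110.0, 125.0]
probs = [0.80, 0.60, 0.30, 0.10]

w = bma_weights(bic)
full, n_full = bma_predict(w, probs)                   # full posterior
reduced, n_reduced = bma_predict(w, probs, mass=0.95)  # majority of posterior mass
print(n_full, round(full, 4), n_reduced, round(reduced, 4))
```

In this toy setting the reduced average uses two of the four models and differs from the full average only in the third decimal place. That is the kind of behaviour the abstract reports, though nothing here reproduces its data.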
Abstract:
The research on project learning has recognised the significance of knowledge transfer in project-based organisations (PBOs). Effective knowledge transfer across projects avoids reinvention, enhances knowledge creation and saves time, which is crucial in a project environment. To facilitate knowledge transfer, many PBOs have invested substantial financial and human resources in implementing IT-based knowledge repositories. However, some empirical studies have found that employees would rather turn to colleagues for knowledge despite having ready access to an IT-based knowledge repository. It is therefore apparent that social networks play a pivotal role in knowledge transfer across projects. Some scholars have attempted to explore the effect of network structure on knowledge transfer and performance, but they focused only on egocentric networks and the groups’ internal social networks. It has been found that the project’s external social network is also critical, in that team members cannot handle critical situations and accomplish projects on time without assistance and knowledge from external sources. To date, the influence of the structure of a project team’s internal and external social networks on project performance, and the interrelation between the two networks, are barely known. To obtain such knowledge, this paper explores the interrelation between the structure of a project team’s internal and external social networks, and their effect on the project team’s performance. Data are gathered through a survey questionnaire distributed online to respondents. The collected data are analysed using social network analysis (SNA) tools and SPSS. The theoretical contribution of this paper is the knowledge of the interrelation between the structure of a project team’s internal and external social networks and their influence on the project team’s performance.
The practical contribution lies in the guideline to be proposed for constructing the structure of project team’s internal and external social networks.
Abstract:
This review explores the question whether chemometrics methods enhance the performance of electroanalytical methods. Electroanalysis has long benefited from the well-established techniques such as potentiometric titrations, polarography and voltammetry, and the more novel ones such as electronic tongues and noses, which have enlarged the scope of applications. The electroanalytical methods have been improved with the application of chemometrics for simultaneous quantitative prediction of analytes or qualitative resolution of complex overlapping responses. Typical methods include partial least squares (PLS), artificial neural networks (ANNs), and multiple curve resolution methods (MCR-ALS, N-PLS and PARAFAC). This review aims to provide the practising analyst with a broad guide to electroanalytical applications supported by chemometrics. In this context, after a general consideration of the use of a number of electroanalytical techniques with the aid of chemometrics methods, several overviews follow with each one focusing on an important field of application such as food, pharmaceuticals, pesticides and the environment. The growth of chemometrics in conjunction with electronic tongue and nose sensors is highlighted, and this is followed by an overview of the use of chemometrics for the resolution of complicated profiles for qualitative identification of analytes, especially with the use of the MCR-ALS methodology. Finally, the performance of electroanalytical methods is compared with that of some spectrophotometric procedures on the basis of figures-of-merit. This showed that electroanalytical methods can perform as well as the spectrophotometric ones. PLS-1 appears to be the method of practical choice if a % relative prediction error of approximately ±10% is acceptable.
Abstract:
Successful project delivery of construction projects depends on many factors. With regard to the construction of a facility, selecting a competent contractor for the job is paramount. As such, various approaches have been advanced to facilitate tender award decisions. Essentially, this type of decision involves the prediction of a bidder’s performance based on information available at the tender stage. A neural network based prediction model was developed and presented in this paper. Project data for the study were obtained from the Hong Kong Housing Department. Information from the tender reports was used as input variables and performance records of the successful bidder during construction were used as output variables. It was found that the networks for the prediction of performance scores for Works gave the highest hit rate. In addition, the two most sensitive input variables toward such prediction are “Difference between Estimate” and “Difference between the next closest bid”. Both input variables are price related, thus suggesting the importance of tender sufficiency for the assurance of quality production.
Abstract:
This study examined the utility of self-efficacy as a predictor of social activity and mood control in multiple sclerosis (MS). Seventy-one subjects with MS were recruited from people attending an MS centre or from a mailing list and were examined on two occasions that were two months apart. Clinic patients were more disabled than patients who completed assessments by post, but they were of higher socioeconomic status and were less dysphoric. We attempted to predict self-reported performance of mood control and social activity at two months, from self-efficacy or performance on these tasks at pretest. Demographic variables, disorder status, disability, self-esteem and depression were also allowed to compete for entry into multiple regressions. Substantial stability in mood, performance and disability was observed over the two months. In both mood control and social activity, past performance was the strongest predictor of later performance, but self-efficacy also contributed significantly to the prediction. The disability level entered the prediction of social activity, but no other variables predicted either type of performance. A secondary analysis predicting self-esteem at two months also included self-efficacy for social activity, illustrating the contribution of perceived capability to later assessments of self-worth. The study provided support for self-efficacy as a predictor of later behavioural outcomes and self-esteem in multiple sclerosis.
Abstract:
Work-integrated learning in the form of internships is increasingly important for universities as they seek to compete for students and to build links with industry. Yet there is surprisingly little empirical research on the details of internships: (1) What should they accomplish? (2) How should they be structured? (3) How should student performance be assessed? There is also surprisingly little conceptual analysis of these key issues, either for business internships in general or for marketing internships in particular. Furthermore, the "answers" on these issues may differ depending upon the perspective of the three stakeholders: students, business managers and university academics. There is no study in the marketing literature which surveys all three groups on these important aspects of internships. To fill these gaps, this paper discusses and analyses internship goals, internship structure, and internship assessment for undergraduate marketing internships, and then reports on a survey of the views of all three stakeholder groups on these issues. There is a considerable variety of approaches to internships, but generally there is consensus among the stakeholder groups, with some notable differences. Managerial implications include recognition of the importance of having academic aspects in internships; mutual understanding concerning needs and constraints; and the requirement that companies, students, and academics take a long-term view of internship programs to achieve mutually beneficial outcomes.
Abstract:
This paper presents preliminary results in establishing a strategy for predicting Zenith Tropospheric Delay (ZTD) and relative ZTD (rZTD) between Continuous Operating Reference Stations (CORS) in near real-time. It is anticipated that the predicted ZTD or rZTD can assist network-based Real-Time Kinematic (RTK) performance over long inter-station distances, ultimately enabling a cost-effective method of delivering precise positioning services to sparsely populated regional areas, such as Queensland. This research first investigates two ZTD solutions: 1) the post-processed IGS ZTD solution and 2) the near Real-Time ZTD solution. The near Real-Time solution is obtained through the GNSS processing software package (Bernese) that has been deployed for this project. The predictability of the near Real-Time Bernese solution is analyzed and compared to the post-processed IGS solution, which acts as the benchmark solution. The predictability analyses were conducted with prediction intervals of 15, 30, 45, and 60 minutes to determine the error with respect to timeliness. The predictability of ZTD and rZTD is determined (or characterized) by using the previously estimated ZTD as the predicted ZTD of the current epoch. This research has shown that both the ZTD and rZTD prediction errors are random in nature; the STD grows from a few millimeters to sub-centimeter level as the prediction interval increases from 15 to 60 minutes. Additionally, the rZTD predictability shows very little dependency on the length of the tested baselines of up to 1000 kilometers. Finally, the comparison of the near Real-Time Bernese solution with the IGS solution has shown a slight degradation in prediction accuracy. The less accurate NRT solution has an STD error of 1 cm within a delay of 50 minutes. However, some larger errors of up to 10 cm are observed.
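The predictability measure described in this abstract, taking a previously estimated ZTD as the prediction for the current epoch, can be sketched as follows. The series here is a hypothetical random walk sampled every 15 minutes, not IGS or Bernese output, and the function name is an illustrative assumption.

```python
import numpy as np

def persistence_error_std(ztd, lag):
    """STD of the error when ZTD(t - lag) is used as the prediction of ZTD(t)."""
    ztd = np.asarray(ztd, dtype=float)
    err = ztd[lag:] - ztd[:-lag]       # actual value minus persisted value
    return float(err.std())

# Hypothetical ZTD series in metres, one sample every 15 minutes
rng = np.random.default_rng(0)
ztd = 2.4 + np.cumsum(rng.normal(0.0, 0.002, size=2000))

# Prediction intervals of 15, 30, 45 and 60 minutes correspond to lags 1..4
stds = {15 * k: persistence_error_std(ztd, k) for k in (1, 2, 3, 4)}
for minutes, s in stds.items():
    print(f"{minutes:2d} min: STD = {1000 * s:.2f} mm")
```

For a random-walk series the error STD grows with the prediction interval, which mirrors the qualitative trend reported above. Real ZTD series need not behave like a random walk.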
Abstract:
An adaptive agent improves its performance by learning from experience. This paper describes an approach to adaptation based on modelling dynamic elements of the environment in order to make predictions of likely future state. This approach is akin to an elite sports player being able to “read the play”, allowing for decisions to be made based on predictions of likely future outcomes. Modelling of the agent’s likely future state is performed using Markov Chains and a technique called “Motion and Occupancy Grids”. The experiments in this paper compare the performance of the planning system with and without the use of this predictive model. The results of the study demonstrate a surprising decrease in performance when using the predictions of agent occupancy. The results are derived from statistical analysis of the agent’s performance in a high fidelity simulation of a world leading real robot soccer team.
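The generic Markov chain step behind such occupancy predictions can be sketched as below. This shows only the standard propagation of a distribution through a transition matrix. It is not the paper's “Motion and Occupancy Grids” technique, whose details the abstract does not give. The three-cell grid and its transition probabilities are hypothetical.

```python
import numpy as np

def predict_occupancy(p0, T, steps):
    """Propagate a distribution over grid cells through a Markov chain."""
    p = np.asarray(p0, dtype=float)
    T = np.asarray(T, dtype=float)
    for _ in range(steps):
        p = p @ T                      # one time step: p_next[j] = sum_i p[i] * T[i, j]
    return p

# Hypothetical 3-cell corridor; row i holds P(next cell | current cell i)
T = np.array([[0.7, 0.3, 0.0],
              [0.1, 0.6, 0.3],
              [0.0, 0.2, 0.8]])
p0 = np.array([1.0, 0.0, 0.0])         # agent is known to be in cell 0

p2 = predict_occupancy(p0, T, steps=2)
print(p2)                              # predicted occupancy two steps ahead
```

A planner could then weight candidate actions by the predicted probability that another agent occupies each cell.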
Abstract:
The high morbidity and mortality associated with atherosclerotic coronary vascular disease (CVD) and its complications are being lessened by the increased knowledge of risk factors, effective preventative measures and proven therapeutic interventions. However, significant CVD morbidity remains and sudden cardiac death continues to be a presenting feature for some subsequently diagnosed with CVD. Coronary vascular disease is also the leading cause of anaesthesia related complications. Stress electrocardiography/exercise testing is predictive of 10 year risk of CVD events and the cardiovascular variables used to score this test are monitored peri-operatively. Similar physiological time-series datasets are being subjected to data mining methods for the prediction of medical diagnoses and outcomes. This study aims to find predictors of CVD using anaesthesia time-series data and patient risk factor data. Several pre-processing and predictive data mining methods are applied to this data. Physiological time-series data related to anaesthetic procedures are subjected to pre-processing methods for removal of outliers, calculation of moving averages as well as data summarisation and data abstraction methods. Feature selection methods of both wrapper and filter types are applied to derived physiological time-series variable sets alone and to the same variables combined with risk factor variables. The ability of these methods to identify subsets of highly correlated but non-redundant variables is assessed. The major dataset is derived from the entire anaesthesia population and subsets of this population are considered to be at increased anaesthesia risk based on their need for more intensive monitoring (invasive haemodynamic monitoring and additional ECG leads). 
Because of the unbalanced class distribution in the data, majority class under-sampling and Kappa statistic together with misclassification rate and area under the ROC curve (AUC) are used for evaluation of models generated using different prediction algorithms. The performance of models derived from feature-reduced datasets reveals the filter method, Cfs subset evaluation, to be most consistently effective, although Consistency-derived subsets tended to give slightly increased accuracy but markedly increased complexity. The use of misclassification rate (MR) for model performance evaluation is influenced by class distribution. This influence could be eliminated by consideration of the AUC or Kappa statistic as well as by evaluation of subsets with under-sampled majority class. The noise and outlier removal pre-processing methods produced models with MR ranging from 10.69 to 12.62 with the lowest value being for data from which both outliers and noise were removed (MR 10.69). For the raw time-series dataset, MR is 12.34. Feature selection results in reduction in MR to 9.8 to 10.16 with time segmented summary data (dataset F) MR being 9.8 and raw time-series summary data (dataset A) being 9.92. However, for all time-series only based datasets, the complexity is high. For most pre-processing methods, Cfs could identify a subset of correlated and non-redundant variables from the time-series alone datasets but models derived from these subsets are of one leaf only. MR values are consistent with class distribution in the subset folds evaluated in the n-fold cross-validation method. For models based on Cfs selected time-series derived and risk factor (RF) variables, the MR ranges from 8.83 to 10.36 with dataset RF_A (raw time-series data and RF) being 8.85 and dataset RF_F (time segmented time-series variables and RF) being 9.09.
The models based on counts of outliers and counts of data points outside normal range (Dataset RF_E) and derived variables based on time series transformed using Symbolic Aggregate Approximation (SAX) with associated time-series pattern cluster membership (Dataset RF_G) perform the least well with MR of 10.25 and 10.36 respectively. For coronary vascular disease prediction, nearest neighbour (NNge) and the support vector machine based method, SMO, have the highest MR of 10.1 and 10.28 while logistic regression (LR) and the decision tree (DT) method, J48, have MR of 8.85 and 9.0 respectively. DT rules are most comprehensible and clinically relevant. The predictive accuracy increase achieved by addition of risk factor variables to time-series variable based models is significant. The addition of time-series derived variables to models based on risk factor variables alone is associated with a trend to improved performance. Data mining of feature reduced, anaesthesia time-series variables together with risk factor variables can produce compact and moderately accurate models able to predict coronary vascular disease. Decision tree analysis of time-series data combined with risk factor variables yields rules which are more accurate than models based on time-series data alone. The limited additional value provided by electrocardiographic variables when compared to use of risk factors alone is similar to recent suggestions that exercise electrocardiography (exECG) under standardised conditions has limited additional diagnostic value over risk factor analysis and symptom pattern. The pre-processing used in this study had limited effect when time-series variables and risk factor variables are used as model input.
In the absence of risk factor input, the use of time-series variables after outlier removal and time series variables based on physiological variable values’ being outside the accepted normal range is associated with some improvement in model performance.
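The abstract's point that misclassification rate depends on class distribution, while the Kappa statistic corrects for it, can be shown with a short sketch. The confusion-matrix counts below are hypothetical and are not taken from the thesis. The first case mimics a one-leaf tree that always predicts the majority class.

```python
def misclassification_rate(tp, fn, fp, tn):
    """Fraction of cases assigned to the wrong class."""
    total = tp + fn + fp + tn
    return (fp + fn) / total

def cohen_kappa(tp, fn, fp, tn):
    """Agreement between predictions and truth, corrected for chance agreement."""
    total = tp + fn + fp + tn
    observed = (tp + tn) / total
    # Chance agreement from the row (actual) and column (predicted) marginals
    expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / total**2
    return (observed - expected) / (1 - expected)

# Hypothetical data: 1000 cases, 100 positive.
# (a) one-leaf model that predicts "negative" for every case
mr_a = misclassification_rate(tp=0, fn=100, fp=0, tn=900)
kappa_a = cohen_kappa(tp=0, fn=100, fp=0, tn=900)

# (b) model that actually separates some positives
mr_b = misclassification_rate(tp=60, fn=40, fp=50, tn=850)
kappa_b = cohen_kappa(tp=60, fn=40, fp=50, tn=850)

print(mr_a, kappa_a)
print(mr_b, round(kappa_b, 3))
```

Both models have a misclassification rate near 10%, but only the second has a Kappa well above zero. This is why MR alone can be misleading on unbalanced data.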
Abstract:
The performance of an adaptive filter may be studied through the behaviour of the optimal and adaptive coefficients in a given environment. This thesis investigates the performance of finite impulse response adaptive lattice filters for two classes of input signals: (a) frequency modulated signals with polynomial phases of order p in complex Gaussian white noise (as nonstationary signals), and (b) the impulsive autoregressive processes with alpha-stable distributions (as non-Gaussian signals). Initially, an overview is given for linear prediction and adaptive filtering. The convergence and tracking properties of the stochastic gradient algorithms are discussed for stationary and nonstationary input signals. It is explained that the stochastic gradient lattice algorithm has many advantages over the least-mean square algorithm. Some of these advantages are a modular structure, easily guaranteed stability, less sensitivity to the eigenvalue spread of the input autocorrelation matrix, and easy quantization of filter coefficients (normally called reflection coefficients). We then characterize the performance of the stochastic gradient lattice algorithm for the frequency modulated signals through the optimal and adaptive lattice reflection coefficients. This is a difficult task due to the nonlinear dependence of the adaptive reflection coefficients on the preceding stages and the input signal. To ease the derivations, we assume that reflection coefficients of each stage are independent of the inputs to that stage. Then the optimal lattice filter is derived for the frequency modulated signals. This is performed by computing the optimal values of residual errors, reflection coefficients, and recovery errors. Next, we show the tracking behaviour of adaptive reflection coefficients for frequency modulated signals. This is carried out by computing the tracking model of these coefficients for the stochastic gradient lattice algorithm on average.
The second-order convergence of the adaptive coefficients is investigated by modeling the theoretical asymptotic variance of the gradient noise at each stage. The accuracy of the analytical results is verified by computer simulations. Using the previous analytical results, we show a new property, the polynomial order reducing property of adaptive lattice filters. This property may be used to reduce the order of the polynomial phase of input frequency modulated signals. Considering two examples, we show how this property may be used in processing frequency modulated signals. In the first example, a detection procedure is carried out on a frequency modulated signal with a second-order polynomial phase in complex Gaussian white noise. We show that using this technique a better probability of detection is obtained for the reduced-order phase signals compared to that of the traditional energy detector. Also, it is empirically shown that the distribution of the gradient noise in the first adaptive reflection coefficients approximates the Gaussian law. In the second example, the instantaneous frequency of the same observed signal is estimated. We show that by using this technique a lower mean square error is achieved for the estimated frequencies at high signal-to-noise ratios in comparison to that of the adaptive line enhancer. The performance of adaptive lattice filters is then investigated for the second type of input signals, i.e., impulsive autoregressive processes with alpha-stable distributions. The concept of alpha-stable distributions is first introduced. We discuss that the stochastic gradient algorithm, which gives desirable results for finite variance input signals (like frequency modulated signals in noise), does not converge quickly for infinite variance stable processes (because it uses the minimum mean-square error criterion).
To deal with such problems, the concepts of the minimum dispersion criterion and fractional lower order moments, together with recently developed algorithms for stable processes, are introduced. We then study the possibility of using the lattice structure for impulsive stable processes. Accordingly, two new algorithms, the least-mean P-norm lattice algorithm and its normalized version, are proposed for lattice filters based on the fractional lower order moments. Simulation results show that, using the proposed algorithms, faster convergence is achieved for parameter estimation of autoregressive stable processes with low to moderate degrees of impulsiveness in comparison to many other algorithms. Also, we discuss the effect of the impulsiveness of stable processes in generating some misalignment between the estimated parameters and the true values. Due to the infinite variance of stable processes, the performance of the proposed algorithms is only investigated using extensive computer simulations.
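The idea of adapting with a fractional lower order moment instead of the squared error can be sketched with the least-mean p-norm (LMP) update for a single transversal predictor coefficient. This is not the lattice algorithm proposed in the thesis. It assumes a one-tap AR(1) predictor, and for reproducibility it drives the process with Gaussian innovations rather than alpha-stable ones. The function name, step size and value of p are hypothetical.

```python
import numpy as np

def lmp_ar1(x, p=1.2, mu=0.002):
    """Track an AR(1) coefficient with the least-mean p-norm (LMP) update.

    Minimises E|e|^p by stochastic gradient descent; p = 2 reduces to LMS
    (up to a factor of 2 absorbed in the step size).
    """
    a = 0.0
    history = np.empty(len(x) - 1)
    for n in range(1, len(x)):
        e = x[n] - a * x[n - 1]                              # one-step prediction error
        a += mu * p * abs(e) ** (p - 1) * np.sign(e) * x[n - 1]
        history[n - 1] = a
    return history

# Hypothetical AR(1) process x[n] = 0.5 x[n-1] + v[n] with Gaussian v (assumption)
rng = np.random.default_rng(1)
v = rng.normal(size=20000)
x = np.zeros_like(v)
for n in range(1, len(x)):
    x[n] = 0.5 * x[n - 1] + v[n]

a_hat = lmp_ar1(x)[-5000:].mean()      # average the estimate after convergence
print(round(a_hat, 3))
```

Because the update scales with |e|^(p-1) rather than |e|, a single very large error moves the coefficient less than it would under LMS. That is the motivation for using p < 2 with impulsive inputs.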
Abstract:
Predicting safety on roadways is standard practice for road safety professionals and has a corresponding extensive literature. The majority of safety prediction models are estimated using roadway segment and intersection (microscale) data, while more recently efforts have been undertaken to predict safety at the planning level (macroscale). Safety prediction models typically include roadway, operations, and exposure variables, factors known to affect safety in fundamental ways. Environmental variables, in particular variables attempting to capture the effect of rain on road safety, are difficult to obtain and have rarely been considered. In the few cases where weather variables have been included, historical averages have been used rather than the actual weather conditions during which crashes were observed. Without the inclusion of weather-related variables, researchers have had difficulty explaining regional differences in the safety performance of various entities (e.g. intersections, road segments, highways, etc.). As part of the NCHRP 8-44 research effort, researchers developed PLANSAFE, or planning level safety prediction models. These models make use of socio-economic, demographic, and roadway variables for predicting planning level safety. Accounting for regional differences, similar to the experience for microscale safety models, has been problematic during the development of planning level safety prediction models. More specifically, without weather-related variables there is an insufficient set of variables for explaining safety differences across regions and states. Furthermore, omitted variable bias resulting from excluding these important variables may adversely impact the coefficients of included variables, thus contributing to difficulty in model interpretation and accuracy.
This paper summarizes the results of an effort to include weather-related variables, particularly various measures of rainfall, in models predicting accident frequency and the frequency of fatal and/or injury crashes. The purpose of the study was to determine whether these variables do in fact improve the overall goodness of fit of the models, to determine whether these variables may explain some or all of the observed regional differences, and to identify the estimated effects of rainfall on safety. The models are based on Traffic Analysis Zone level datasets from Michigan, and Pima and Maricopa Counties in Arizona. Numerous rain-related variables were found to be statistically significant, selected rain-related variables improved the overall goodness of fit, and inclusion of these variables reduced the portion of the model explained by the constant in the base models without weather variables. Rain tends to diminish safety, as expected, in fairly complex ways, depending on rain frequency and intensity.