991 resultados para Data-cleaning


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Background When large scale trials are investigating the effects of interventions on appetite, it is paramount to efficiently monitor large amounts of human data. The original hand-held Electronic Appetite Ratings System (EARS) was designed to facilitate the administering and data management of visual analogue scales (VAS) of subjective appetite sensations. The purpose of this study was to validate a novel hand-held method (EARS II (HP® iPAQ)) against the standard Pen and Paper (P&P) method and the previously validated EARS. Methods Twelve participants (5 male, 7 female, aged 18-40) were involved in a fully repeated measures design. Participants were randomly assigned in a crossover design, to either high fat (>48% fat) or low fat (<28% fat) meal days, one week apart and completed ratings using the three data capture methods ordered according to Latin Square. The first set of appetite sensations was completed in a fasted state, immediately before a fixed breakfast. Thereafter, appetite sensations were completed every thirty minutes for 4h. An ad libitum lunch was provided immediately before completing a final set of appetite sensations. Results Repeated measures ANOVAs were conducted for ratings of hunger, fullness and desire to eat. There were no significant differences between P&P compared with either EARS or EARS II (p > 0.05). Correlation coefficients between P&P and EARS II, controlling for age and gender, were performed on Area Under the Curve ratings. R2 for Hunger (0.89), Fullness (0.96) and Desire to Eat (0.95) were statistically significant (p < 0.05). Conclusions EARS II was sensitive to the impact of a meal and recovery of appetite during the postprandial period and is therefore an effective device for monitoring appetite sensations. This study provides evidence and support for further validation of the novel EARS II method for monitoring appetite sensations during large scale studies. The added versatility means that future uses of the system provides the potential to monitor a range of other behavioural and physiological measures often important in clinical and free living trials.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The discovery of protein variation is an important strategy in disease diagnosis within the biological sciences. The current benchmark for elucidating information from multiple biological variables is the so called “omics” disciplines of the biological sciences. Such variability is uncovered by implementation of multivariable data mining techniques which come under two primary categories, machine learning strategies and statistical based approaches. Typically proteomic studies can produce hundreds or thousands of variables, p, per observation, n, depending on the analytical platform or method employed to generate the data. Many classification methods are limited by an n≪p constraint, and as such, require pre-treatment to reduce the dimensionality prior to classification. Recently machine learning techniques have gained popularity in the field for their ability to successfully classify unknown samples. One limitation of such methods is the lack of a functional model allowing meaningful interpretation of results in terms of the features used for classification. This is a problem that might be solved using a statistical model-based approach where not only is the importance of the individual protein explicit, they are combined into a readily interpretable classification rule without relying on a black box approach. Here we incorporate statistical dimension reduction techniques Partial Least Squares (PLS) and Principal Components Analysis (PCA) followed by both statistical and machine learning classification methods, and compared them to a popular machine learning technique, Support Vector Machines (SVM). Both PLS and SVM demonstrate strong utility for proteomic classification problems.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The research team recognized the value of network-level Falling Weight Deflectometer (FWD) testing to evaluate the structural condition trends of flexible pavements. However, practical limitations due to the cost of testing, traffic control and safety concerns and the ability to test a large network may discourage some agencies from conducting the network-level FWD testing. For this reason, the surrogate measure of the Structural Condition Index (SCI) is suggested for use. The main purpose of the research presented in this paper is to investigate data mining strategies and to develop a prediction method of the structural condition trends for network-level applications which does not require FWD testing. The research team first evaluated the existing and historical pavement condition, distress, ride, traffic and other data attributes in the Texas Department of Transportation (TxDOT) Pavement Maintenance Information System (PMIS), applied data mining strategies to the data, discovered useful patterns and knowledge for SCI value prediction, and finally provided a reasonable measure of pavement structural condition which is correlated to the SCI. To evaluate the performance of the developed prediction approach, a case study was conducted using the SCI data calculated from the FWD data collected on flexible pavements over a 5-year period (2005 – 09) from 354 PMIS sections representing 37 pavement sections on the Texas highway system. The preliminary study results showed that the proposed approach can be used as a supportive pavement structural index in the event when FWD deflection data is not available and help pavement managers identify the timing and appropriate treatment level of preventive maintenance activities.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Although accountability in the form of high stakes testing is in favour in the contemporary Australian educational context, this practice remains a highly contested source of debate. Proponents for high stakes tests claim that higher standards in teaching and learning result from their implementation, whereas others believe that this type of testing regime is not required and may even in fact be counterproductive. Regardless of what side of the debate you sit on, the reality is that at present, high stakes testing appears to be here to stay. It could therefore be argued it is essential that teachers understand accountability and possess the specific skills to interpret and use test data beneficially.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper demonstrates the affordances of the work diary as a data collection tool for both pilot studies and qualitative research of social interactions. Observation is the cornerstone of many qualitative, ethnographic research projects (Creswell, 2008). However, determining through observation, the activities of busy school teams could be likened to joining dots of a child’s drawing activity to reveal a complex picture of interactions. Teachers, leaders and support personnel are in different locations within a school, performing diverse tasks for a variety of outcomes, which hopefully achieve a common goal. As a researcher, the quest to observe these busy teams and their interactions with each other was daunting and perhaps unrealistic. The decision to use a diary as part of a wider research project was to overcome the physical impossibility of simultaneously observing multiple team members. One reported advantage of the use of the diary in research was its suitability as a substitute for lengthy researcher observation, because multiple data sets could be collected at once (Lewis et al, 2005; Marelli, 2007).

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A substantial body of literature exists identifying factors contributing to under-performing Enterprise Resource Planning systems (ERPs), including poor communication, lack of executive support and user dissatisfaction (Calisir et al., 2009). Of particular interest is Momoh et al.’s (2010) recent review identifying poor data quality (DQ) as one of nine critical factors associated with ERP failure. DQ is central to ERP operating processes, ERP facilitated decision-making and inter-organizational cooperation (Batini et al., 2009). Crucial in ERP contexts is that the integrated, automated, process driven nature of ERP data flows can amplify DQ issues, compounding minor errors as they flow through the system (Haug et al., 2009; Xu et al., 2002). However, the growing appreciation of the importance of DQ in determining ERP success lacks research addressing the relationship between stakeholders’ requirements and perceptions of ERP DQ, perceived data utility and the impact of users’ treatment of data on ERP outcomes.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This study determined the rate and indication for revision between cemented, uncemented, hybrid and resurfacing groups from NJR (6 th edition) data. Data validity was determined by interrogating for episodes of misclassification. We identified 6,034 (2.7%) misclassified episodes, containing 97 (4.3%) revisions. Kaplan-Meier revision rates at 3 years were 0.9% cemented, 1.9% for uncemented, 1.2% for hybrids and 3.0% for resurfacings (significant difference across all groups, p<0.001, with identical pattern in patients <55 years). Regression analysis indicated both prosthesis group and age significantly influenced failure (p<0.001). Revision for pain, aseptic loosening, and malalignment were highest in uncemented and resurfacing arthroplasty. Revision for dislocation was highest in uncemented hips (significant difference between groups, p<0.001). Feedback to the NJR on data misclassification has been made for future analysis. © 2012 Wichtig Editore.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Background Bactrocera dorsalis s.s. is a pestiferous tephritid fruit fly distributed from Pakistan to the Pacific, with the Thai/Malay peninsula its southern limit. Sister pest taxa, B. papayae and B. philippinensis, occur in the southeast Asian archipelago and the Philippines, respectively. The relationship among these species is unclear due to their high molecular and morphological similarity. This study analysed population structure of these three species within a southeast Asian biogeographical context to assess potential dispersal patterns and the validity of their current taxonomic status. Results Geometric morphometric results generated from 15 landmarks for wings of 169 flies revealed significant differences in wing shape between almost all sites following canonical variate analysis. For the combined data set there was a greater isolation-by-distance (IBD) effect under a ‘non-Euclidean’ scenario which used geographical distances within a biogeographical ‘Sundaland context’ (r2 = 0.772, P < 0.0001) as compared to a ‘Euclidean’ scenario for which direct geographic distances between sample sites was used (r2 = 0.217, P < 0.01). COI sequence data were obtained for 156 individuals and yielded 83 unique haplotypes with no correlation to current taxonomic designations via a minimum spanning network. BEAST analysis provided a root age and location of 540kya in northern Thailand, with migration of B. dorsalis s.l. into Malaysia 470kya and Sumatra 270kya. Two migration events into the Philippines are inferred. Sequence data revealed a weak but significant IBD effect under the ‘non-Euclidean’ scenario (r2 = 0.110, P < 0.05), with no historical migration evident between Taiwan and the Philippines. Results are consistent with those expected at the intra-specific level. Conclusions Bactrocera dorsalis s.s., B. papayae and B. philippinensis likely represent one species structured around the South China Sea, having migrated from northern Thailand into the southeast Asian archipelago and across into the Philippines. No migration is apparent between the Philippines and Taiwan. This information has implications for quarantine, trade and pest management.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This article presents a methodology that integrates cumulative plots with probe vehicle data for estimation of travel time statistics (average, quartile) on urban networks. The integration reduces relative deviation among the cumulative plots so that the classical analytical procedure of defining the area between the plots as the total travel time can be applied. For quartile estimation, a slicing technique is proposed. The methodology is validated with real data from Lucerne, Switzerland and it is concluded that the travel time estimates from the proposed methodology are statistically equivalent to the observed values.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Structural health monitoring (SHM) refers to the procedure used to assess the condition of structures so that their performance can be monitored and any damage can be detected early. Early detection of damage and appropriate retrofitting will aid in preventing failure of the structure and save money spent on maintenance or replacement and ensure the structure operates safely and efficiently during its whole intended life. Though visual inspection and other techniques such as vibration based ones are available for SHM of structures such as bridges, the use of acoustic emission (AE) technique is an attractive option and is increasing in use. AE waves are high frequency stress waves generated by rapid release of energy from localised sources within a material, such as crack initiation and growth. AE technique involves recording these waves by means of sensors attached on the surface and then analysing the signals to extract information about the nature of the source. High sensitivity to crack growth, ability to locate source, passive nature (no need to supply energy from outside, but energy from damage source itself is utilised) and possibility to perform real time monitoring (detecting crack as it occurs or grows) are some of the attractive features of AE technique. In spite of these advantages, challenges still exist in using AE technique for monitoring applications, especially in the area of analysis of recorded AE data, as large volumes of data are usually generated during monitoring. The need for effective data analysis can be linked with three main aims of monitoring: (a) accurately locating the source of damage; (b) identifying and discriminating signals from different sources of acoustic emission and (c) quantifying the level of damage of AE source for severity assessment. In AE technique, the location of the emission source is usually calculated using the times of arrival and velocities of the AE signals recorded by a number of sensors. But complications arise as AE waves can travel in a structure in a number of different modes that have different velocities and frequencies. Hence, to accurately locate a source it is necessary to identify the modes recorded by the sensors. This study has proposed and tested the use of time-frequency analysis tools such as short time Fourier transform to identify the modes and the use of the velocities of these modes to achieve very accurate results. Further, this study has explored the possibility of reducing the number of sensors needed for data capture by using the velocities of modes captured by a single sensor for source localization. A major problem in practical use of AE technique is the presence of sources of AE other than crack related, such as rubbing and impacts between different components of a structure. These spurious AE signals often mask the signals from the crack activity; hence discrimination of signals to identify the sources is very important. This work developed a model that uses different signal processing tools such as cross-correlation, magnitude squared coherence and energy distribution in different frequency bands as well as modal analysis (comparing amplitudes of identified modes) for accurately differentiating signals from different simulated AE sources. Quantification tools to assess the severity of the damage sources are highly desirable in practical applications. Though different damage quantification methods have been proposed in AE technique, not all have achieved universal approval or have been approved as suitable for all situations. The b-value analysis, which involves the study of distribution of amplitudes of AE signals, and its modified form (known as improved b-value analysis), was investigated for suitability for damage quantification purposes in ductile materials such as steel. This was found to give encouraging results for analysis of data from laboratory, thereby extending the possibility of its use for real life structures. By addressing these primary issues, it is believed that this thesis has helped improve the effectiveness of AE technique for structural health monitoring of civil infrastructures such as bridges.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Many common diseases, such as the flu and cardiovascular disease, increase markedly in winter and dip in summer. These seasonal patterns have been part of life for millennia and were first noted in ancient Greece by both Hippocrates and Herodotus. Recent interest has focused on climate change, and the concern that seasons will become more extreme with harsher winter and summer weather. We describe a set of R functions designed to model seasonal patterns in disease. We illustrate some simple descriptive and graphical methods, a more complex method that is able to model non-stationary patterns, and the case–crossover for controlling for seasonal confounding.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Advances in information and communication technologies have brought about an information revolution, leading to fundamental changes in the way that information is collected or generated, shared and distributed. The importance of establishing systems in which research findings can be readily made available to and used by other researchers has long been recognized in international scientific collaborations. If the data access principles adopted by international scientific collaborations are to be effectively implemented they must be supported by the national policies and laws in place in the countries in which participating researchers are operating.