Abstract:
The accuracy of data derived from linked-segment models depends on how well the system has been represented. Previous investigations describing the gait of persons with partial foot amputation did not account for the unique anthropometry of the residuum or the inclusion of a prosthesis and footwear in the model and, as such, are likely to have underestimated the magnitude of the peak joint moments and powers. This investigation determined the effect of inaccuracies in the anthropometric input data on the kinetics of gait. Toward this end, a geometric model was developed and validated to estimate body segment parameters of various intact and partial feet. These data were then incorporated into customized linked-segment models, and the kinetic data were compared with those obtained from conventional models. Results indicate that accurate modeling increased the magnitude of the peak hip and knee joint moments and powers during terminal swing. Conventional inverse dynamic models are sufficiently accurate for research questions relating to stance phase. More accurate models that account for the anthropometry of the residuum, prosthesis, and footwear better reflect the work of the hip extensors and knee flexors to decelerate the limb during terminal swing phase.
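As a rough illustration of why the residuum, prosthesis and footwear matter for swing-phase kinetics, the sketch below compares the inertial moment of a shank-foot segment about the proximal joint with and without added distal mass, using the parallel-axis theorem. All masses, lengths and the angular deceleration are illustrative assumptions, not values from the study.

```python
# Minimal sketch: effect of distal mass on the inertial moment about the knee.
# All numbers are illustrative assumptions, not values from the study.
m_seg = 3.5        # intact shank+foot mass (kg), assumed
L = 0.43           # segment length (m), assumed
I_rod = m_seg * L**2 / 3           # slender-rod inertia about the proximal joint

m_distal = 1.2     # assumed prosthesis + footwear mass, lumped at the distal end (kg)
I_total = I_rod + m_distal * L**2  # parallel-axis term for a distal point mass

alpha = 30.0       # assumed angular deceleration in terminal swing (rad/s^2)
print(f"inertial moment without distal mass: {I_rod * alpha:.1f} N.m")
print(f"inertial moment with distal mass:    {I_total * alpha:.1f} N.m")
```

In this toy example the added distal mass roughly doubles the inertial component of the joint moment, consistent with the abstract's point that conventional models underestimate peak swing-phase moments.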
Abstract:
By using the Rasch model, much detailed diagnostic information is available to developers of survey and assessment instruments and to the researchers who use them. We outline an approach to the analysis of data obtained from the administration of survey instruments that can enable researchers to recognise and diagnose difficulties with those instruments and then to suggest remedial actions that can improve the measurement properties of the scales included in questionnaires. We illustrate the approach using examples drawn from recent research and demonstrate how the approach can be used to generate figures that make the results of Rasch analyses accessible to non-specialists.
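For readers unfamiliar with the model behind these diagnostics, a minimal sketch follows: the dichotomous Rasch probability and an infit mean-square style item statistic computed from simulated responses. The ability and difficulty values are invented for illustration and are not drawn from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def rasch_p(theta, b):
    """P(endorse/correct) under the dichotomous Rasch model."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

theta = rng.normal(0, 1, 500)   # person abilities (assumed distribution)
b = 0.4                         # item difficulty (assumed)

p = rasch_p(theta, b)
x = (rng.random(500) < p).astype(float)   # simulated item responses

# Infit mean-square: information-weighted squared residuals;
# values far from 1 flag items that misfit the model.
infit = np.sum((x - p) ** 2) / np.sum(p * (1 - p))
print(f"infit mean-square: {infit:.2f}")   # ~1 when the model fits
```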
Abstract:
Research on social networking sites like Facebook is emerging but sparse. This exploratory study investigates the value users derive from self-described ‘cool’ Facebook applications and explores the features that either encourage or discourage users from recommending applications to their friends. Thus the concepts of value and cool are explored in a social networking setting. Our qualitative data show that consumers derive a combination of functional value along with either social or emotional value from the applications. Female Facebook users indicated self-expression as important, while males tend to use Facebook applications to socially compete. Three broad categories emerged for application features: symmetrical features can both encourage and discourage recommendation; asymmetrical features can encourage or discourage, but not both; and polar features are those where different levels of the same feature encourage or discourage. Recommending or not recommending an application tends to be the result of a combination of features rather than one feature in isolation.
Abstract:
The paper analyses the expected value of OD volumes from probes under three error models: fixed error, error proportional to zone size, and error inversely proportional to zone size. To add realism to the analysis, real trip ODs in the Tokyo Metropolitan Region are synthesised. The results show that for small zone coding with an average radius of 1.1 km and a fixed measurement error of 100 m, an accuracy of 70% can be expected. The equivalent accuracy for medium zone coding with an average radius of 5 km would translate into a fixed error of approximately 300 m. As expected, small zone coding is more sensitive than medium zone coding, as the chances of the probe error envelope falling into adjacent zones are higher. For the same error radii, error proportional to zone size would deliver a higher level of accuracy. As over half (54.8%) of trip ends start or end at zones with an equivalent radius of ≤ 1.2 km and only 13% of trip ends occurred at zones with an equivalent radius of ≥ 2.5 km, measurement error that is proportional to zone size, such as that of a mobile phone, would deliver a higher level of accuracy. The synthesis of real ODs with different probe error characteristics has shown that an expected value of >85% is difficult to achieve for small zone coding with an average radius of 1.1 km. For most transport applications, an OD matrix at medium zone coding is sufficient for transport management. From this study it can be drawn that GPS, with an error range between 2 and 5 m, at medium zone coding (average radius of 5 km) would provide OD estimates greater than 90% of the expected value. However, for a typical mobile phone operating error range at medium zone coding, the expected value would be lower than 85%. This paper assumes transmission of one origin and one destination position from the probe. However, if multiple positions within the origin and destination zones were transmitted, map matching to the transport network could be performed, which would greatly improve the accuracy of the probe data.
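The zone-size versus error-radius trade-off can be reproduced in spirit with a small Monte Carlo sketch: a trip end is placed uniformly in a circular zone and displaced by a fixed-magnitude error in a random direction. Circular zones are an idealisation, so the sketch gives an optimistic upper bound rather than the paper's figures, which rest on real Tokyo zone geometries and synthesised ODs; the radii and error values below simply mirror the abstract's numbers.

```python
import numpy as np

rng = np.random.default_rng(1)

def zone_accuracy(R, err, n=200_000):
    """Share of probe observations that remain inside their true circular zone."""
    r = R * np.sqrt(rng.random(n))          # true trip end, uniform in a disc of radius R
    th = 2 * np.pi * rng.random(n)
    x, y = r * np.cos(th), r * np.sin(th)
    phi = 2 * np.pi * rng.random(n)         # fixed-magnitude error, random direction
    xm, ym = x + err * np.cos(phi), y + err * np.sin(phi)
    return np.mean(xm**2 + ym**2 <= R**2)

print(zone_accuracy(1100, 100))   # small zones (1.1 km) with 100 m error
print(zone_accuracy(5000, 300))   # medium zones (5 km) with 300 m error
```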
Abstract:
This paper presents a model to estimate travel time using cumulative plots. Three different cases are considered: i) case-Det, using only detector data; ii) case-DetSig, using detector data and signal controller data; and iii) case-DetSigSFR, using detector data, signal controller data and saturation flow rate. The performance of the model for different detection intervals is evaluated. It is observed that the detection interval is not critical if signal timings are available. Comparable accuracy can be obtained from a larger detection interval with signal timings or from a shorter detection interval without signal timings. The performance for case-DetSig and case-DetSigSFR is consistent, with accuracy generally greater than 95%, whereas case-Det is highly sensitive to the signal phases in the detection interval and its performance is uncertain if the detection interval is an integral multiple of signal cycles.
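The core idea of estimating travel time from cumulative plots is that the nth vehicle's travel time is the horizontal gap between the upstream and downstream cumulative count curves at height n (assuming first-in, first-out). A self-contained sketch with synthetic counts and an assumed constant travel time follows; it illustrates the principle only, not the paper's case-Det/DetSig/DetSigSFR formulations.

```python
import numpy as np

rng = np.random.default_rng(2)

t = np.arange(0.0, 600.0, 1.0)             # 1 s timestamps over 10 minutes
arrivals = rng.poisson(0.3, t.size)        # vehicle counts at the upstream detector
A = np.cumsum(arrivals)                    # upstream cumulative curve

delay = 45                                 # assumed constant travel time (s)
D = np.concatenate([np.zeros(delay, dtype=int), A[:-delay]])  # downstream cumulative curve

n = np.arange(1, D[-1] + 1)                # vehicle ranks observed downstream
tA = t[np.searchsorted(A, n)]              # time the nth vehicle passes upstream
tD = t[np.searchsorted(D, n)]              # time it passes downstream
print((tD - tA).mean())                    # recovers the 45 s delay under FIFO
```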
Abstract:
Light Detection and Ranging (LIDAR) has great potential to assist vegetation management in power line corridors by providing more accurate geometric information about the power line assets and vegetation along the corridors. However, the development of algorithms for the automatic processing of LIDAR point cloud data, in particular for feature extraction and classification of raw point cloud data, is still in its infancy. In this paper, we take advantage of LIDAR intensity and classify ground and non-ground points by statistically analyzing the skewness and kurtosis of the intensity data. Moreover, the Hough transform is employed to detect power lines from the filtered object points. The experimental results show the effectiveness of our methods and indicate that better results were obtained by using LIDAR intensity data than elevation data.
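One common way to turn the skewness observation into a ground/non-ground filter is "skewness balancing": iteratively peel off the highest-intensity returns until the skewness of the remainder drops to zero, treating the peeled points as objects. The sketch below applies that idea to synthetic intensity values; whether this matches the authors' exact procedure is an assumption, and the Hough line-detection stage is omitted.

```python
import numpy as np

def skewness(x):
    x = np.asarray(x, dtype=float)
    return np.mean((x - x.mean()) ** 3) / x.std() ** 3

def skewness_balancing(intensity):
    """Split returns into ground-like and object-like points by removing
    the highest intensities until the remainder is no longer right-skewed."""
    order = np.argsort(intensity)          # indices, ascending intensity
    vals = intensity[order]
    k = len(vals)
    while k > 2 and skewness(vals[:k]) > 0:
        k -= 1                             # drop the current highest intensity
    return order[:k], order[k:]            # (ground indices, object indices)

rng = np.random.default_rng(3)
# synthetic mixture: many low-intensity ground returns plus a high-intensity tail
inten = np.concatenate([rng.normal(30, 5, 900), rng.normal(80, 10, 100)])
ground, objects = skewness_balancing(inten)
print(len(ground), "ground-like,", len(objects), "object-like")
```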
Abstract:
The explosive growth of the World-Wide-Web and the emergence of ecommerce are the two major factors that have led to the development of recommender systems (Resnick and Varian, 1997). The main task of recommender systems is to learn from users and recommend items (e.g. information, products or books) that match the users' personal preferences. Recommender systems have been an active research area for more than a decade. Many different techniques and systems with distinct strengths have been developed to generate better quality recommendations. One of the main factors that affects a recommender's recommendation quality is the amount of information resources available to it. The main feature of recommender systems is their ability to make personalised recommendations for different individuals. However, many ecommerce sites find it difficult to obtain sufficient knowledge about their users. Hence, the recommendations they provide to their users are often poor and not personalised. This information insufficiency problem is commonly referred to as the cold-start problem. Most existing research on recommender systems focuses on developing techniques to better utilise the available information resources to achieve better recommendation quality. However, while the amount of available data and information remains insufficient, these techniques can only provide limited improvements to the overall recommendation quality. In this thesis, a novel and intuitive approach towards improving recommendation quality and alleviating the cold-start problem is attempted: enriching the information resources. It can easily be observed that when there is a sufficient information and knowledge base to support recommendation making, even the simplest recommender systems can outperform sophisticated ones that have limited information resources. Two strategies are suggested in this thesis to achieve the proposed information enrichment for recommenders:
• The first strategy suggests that information resources can be enriched by considering other information or data facets. Specifically, a taxonomy-based recommender, the Hybrid Taxonomy Recommender (HTR), is presented in this thesis (see the sketch after this abstract). HTR exploits the relationship between users' taxonomic preferences and item preferences from the combination of widely available product taxonomic information and existing user rating data, and then utilises this relation between taxonomic preferences and item preferences to generate high quality recommendations.
• The second strategy suggests that information resources can be enriched simply by obtaining them from other parties. In this thesis, a distributed recommender framework, the Ecommerce-oriented Distributed Recommender System (EDRS), is proposed. The proposed EDRS allows multiple recommenders from different parties (i.e. organisations or ecommerce sites) to share recommendations and information resources with each other in order to improve their recommendation quality.
Based on the results obtained from the experiments conducted in this thesis, the proposed systems and techniques achieve substantial improvements in both making quality recommendations and alleviating the cold-start problem.
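To make the first strategy concrete, here is a minimal, hypothetical sketch of a taxonomy-driven scorer in the spirit of HTR: user ratings are projected onto product categories, and unrated items are scored by the user's affinity for their categories. The data structures and weighting are invented for illustration and are far simpler than the thesis's actual HTR model.

```python
from collections import defaultdict

# hypothetical inputs: ratings on a 1-5 scale and an item -> categories taxonomy
ratings = {"alice": {"item1": 5, "item2": 2}, "bob": {"item2": 4, "item3": 5}}
taxonomy = {"item1": ["sci-fi", "novel"], "item2": ["cookbook"],
            "item3": ["cookbook", "baking"], "item4": ["sci-fi"]}

def taxonomy_profile(user):
    """Average the user's ratings over the categories of the items they rated."""
    totals, counts = defaultdict(float), defaultdict(int)
    for item, r in ratings[user].items():
        for cat in taxonomy[item]:
            totals[cat] += r
            counts[cat] += 1
    return {c: totals[c] / counts[c] for c in totals}

def score(user, item):
    """Score an unrated item by the user's mean affinity for its categories."""
    prof = taxonomy_profile(user)
    known = [prof[c] for c in taxonomy[item] if c in prof]
    return sum(known) / len(known) if known else 0.0

print(score("alice", "item4"))   # alice liked sci-fi, so item4 scores highly
print(score("bob", "item4"))     # bob has no sci-fi history, so 0.0 (cold spot)
```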
Abstract:
Only a few studies have examined interactive effects between air pollution and temperature on health outcomes. This study examines whether temperature modified the effects of ozone on cardiovascular mortality in 95 large US cities. A nonparametric and a parametric regression model were used separately to explore the interactive effects of temperature and ozone on cardiovascular mortality from May to October during 1987-2000. A Bayesian meta-analysis was used to pool estimates. Both models illustrate that temperature enhanced the ozone effects on mortality in the northern region, but not obviously in the southern region. A 10-ppb increment in ozone was associated with 0.41% (95% posterior interval (PI): -0.19%, 0.93%), 0.27% (95% PI: -0.44%, 0.87%) and 1.68% (95% PI: 0.07%, 3.26%) increases in daily cardiovascular mortality corresponding to low, moderate and high levels of temperature, respectively. We conclude that temperature modified the effects of ozone, particularly in the northern region.
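As a simplified stand-in for the Bayesian meta-analysis used to pool the city-specific estimates, the sketch below applies fixed-effect inverse-variance pooling to hypothetical city-level estimates (percent increase in mortality per 10 ppb ozone). The numbers are invented and the method is deliberately simpler than a full Bayesian hierarchical model.

```python
import numpy as np

# hypothetical city-level estimates: % mortality increase per 10 ppb ozone
est = np.array([0.8, 0.3, 1.2, -0.1, 0.5])
se = np.array([0.4, 0.3, 0.6, 0.5, 0.2])   # their standard errors

w = 1.0 / se**2                             # inverse-variance weights
pooled = np.sum(w * est) / np.sum(w)
pooled_se = np.sqrt(1.0 / np.sum(w))
print(f"pooled: {pooled:.2f}% (95% CI {pooled - 1.96 * pooled_se:.2f}, "
      f"{pooled + 1.96 * pooled_se:.2f})")
```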
Abstract:
Total deposition of petrol, diesel and environmental tobacco smoke (ETS) aerosols in the human respiratory tract for nasal breathing conditions was computed for 14 nonsmoking volunteers, considering the specific anatomical and respiratory parameters of each volunteer and the specific size distribution for each inhalation experiment. Theoretical predictions were 34.6% for petrol, 24.0% for diesel, and 18.5% for ETS particles. Compared to the experimental results, predicted deposition values were consistently smaller than the measured data (41.4% for petrol, 29.6% for diesel, and 36.2% for ETS particles). The apparent discrepancy between experimental data on total deposition and modeling results may be reconciled by accounting for the non-spherical shape of the test aerosols through diameter-dependent dynamic shape factors, which capture the differences between mobility-equivalent and volume-equivalent or thermodynamic diameters. While the application of dynamic shape factors is able to explain the observed differences for petrol and diesel particles, additional mechanisms may be required for ETS particle deposition, such as size reduction upon inspiration by evaporation of volatile compounds and/or condensation-induced restructuring, and, possibly, electrical charge effects.
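The reconciliation step rests on the standard drag-equivalence relation between the mobility-equivalent diameter d_B and the volume-equivalent diameter d_ve through a dynamic shape factor chi: chi * d_ve / Cc(d_ve) = d_B / Cc(d_B), where Cc is the Cunningham slip correction. A fixed-point sketch of that conversion follows; the slip-correction constants are common literature values, and the chi used is an arbitrary example, not a value fitted by the authors.

```python
import math

def cunningham(d, mfp=68e-9):
    """Cunningham slip correction; mfp = mean free path of air (~68 nm, assumed)."""
    kn = 2.0 * mfp / d
    return 1.0 + kn * (1.257 + 0.4 * math.exp(-1.1 / kn))

def volume_equivalent_diameter(d_b, chi, iters=100):
    """Solve chi*d_ve/Cc(d_ve) = d_b/Cc(d_b) for d_ve by fixed-point iteration."""
    d_ve = d_b
    for _ in range(iters):
        d_ve = d_b * cunningham(d_ve) / (chi * cunningham(d_b))
    return d_ve

d_b = 200e-9   # mobility diameter, 200 nm (example)
chi = 1.5      # assumed dynamic shape factor, example only
print(volume_equivalent_diameter(d_b, chi) * 1e9, "nm")  # smaller than d_b for chi > 1
```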
Abstract:
Random Indexing K-tree is the combination of two algorithms suited to large-scale document clustering.
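Random Indexing builds reduced-dimension document vectors by summing sparse ternary index vectors for the terms a document contains; the resulting dense vectors are then what the K-tree clusters. A minimal sketch of the indexing half, with assumed dimensionality and sparsity:

```python
import numpy as np

rng = np.random.default_rng(4)
DIM, NNZ = 1000, 10          # vector width and non-zeros per term (assumed)
term_index = {}

def index_vector():
    """Sparse ternary random index vector with NNZ entries of +/-1."""
    v = np.zeros(DIM)
    pos = rng.choice(DIM, size=NNZ, replace=False)
    v[pos] = rng.choice([-1.0, 1.0], size=NNZ)
    return v

def doc_vector(tokens):
    """Document vector = sum of the index vectors of its terms."""
    v = np.zeros(DIM)
    for tok in tokens:
        if tok not in term_index:
            term_index[tok] = index_vector()
        v += term_index[tok]
    return v

d1 = doc_vector("large scale document clustering".split())
d2 = doc_vector("document clustering at scale".split())
sim = d1 @ d2 / (np.linalg.norm(d1) * np.linalg.norm(d2))
print(f"cosine similarity: {sim:.2f}")   # shared terms give similar vectors
```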
Abstract:
The wide range of contributing factors and circumstances surrounding crashes on road curves suggests that no single intervention can prevent these crashes. This paper presents a novel methodology, based on data mining techniques, to identify contributing factors and the relationships between them. It identifies contributing factors that influence the risk of a crash. Incident records from a large insurance company, described using free text, were analysed with rough set theory. Rough set theory was used to discover dependencies among data and to reason with the vague, uncertain and imprecise information that characterised the insurance dataset. The results show that male drivers between 50 and 59 years old, driving during evening peak hours and involved in a collision, had the lowest crash risk. Drivers between 25 and 29 years old, driving from around midnight to 6 am and in a new car, have the highest risk. The analysis of the most significant contributing factors on curves suggests that drivers with 25 to 42 years of driving experience, who are driving a new vehicle, have the highest crash cost risk, characterised by the vehicle running off the road and hitting a tree. This research complements existing statistically based approaches to analysing road crashes. Our data mining approach is supported by proven theory and will allow road safety practitioners to effectively understand the dependencies between contributing factors and crash types with a view to designing tailored countermeasures.
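The dependency discovery described above can be illustrated with the classical rough-set dependency degree: the share of records whose condition-attribute class determines the decision unambiguously. The toy records below are invented; the real study mined free-text insurance narratives.

```python
from collections import defaultdict

# toy incident records: (sex, age band, time of day, crash-risk decision)
records = [
    ("male",   "50-59", "evening", "low"),
    ("male",   "50-59", "evening", "low"),
    ("male",   "25-29", "night",   "high"),
    ("female", "25-29", "night",   "high"),
    ("male",   "25-29", "night",   "low"),   # inconsistent with row 3
]

def dependency(records, cond_idx, dec_idx):
    """Rough-set dependency degree gamma(C, D) = |positive region| / |U|."""
    classes = defaultdict(list)
    for row in records:
        key = tuple(row[i] for i in cond_idx)
        classes[key].append(row[dec_idx])
    # a class is in the positive region only if its decision is unambiguous
    positive = sum(len(d) for d in classes.values() if len(set(d)) == 1)
    return positive / len(records)

print(dependency(records, cond_idx=(0, 1, 2), dec_idx=3))   # 0.6 here
```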
Abstract:
Objective: To summarise the extent to which narrative text fields in administrative health data are used to gather information about the event resulting in presentation to a health care provider for treatment of an injury, and to highlight best practice approaches to conducting narrative text interrogation for injury surveillance purposes.
Design: Systematic review.
Data sources: Electronic databases searched included CINAHL, Google Scholar, Medline, Proquest, PubMed and PubMed Central. Snowballing strategies were employed by searching the bibliographies of retrieved references to identify relevant associated articles.
Selection criteria: Papers were selected if the study used a health-related database and if the study objectives were to (a) use text fields to identify injury cases or to extract additional information on injury circumstances not available from coded data, (b) use text fields to assess the accuracy of coded data fields for injury-related cases, or (c) describe methods or approaches for extracting injury information from text fields.
Methods: The papers identified through the search were independently screened by two authors for inclusion, resulting in 41 papers selected for review. Due to heterogeneity between studies, meta-analysis was not performed.
Results: The majority of papers reviewed focused on describing injury epidemiology trends using coded data and text fields to supplement coded data (28 papers), with these studies demonstrating the value of text data for providing more specific information beyond what had been coded, enabling case selection or providing circumstantial information. Caveats were expressed in terms of the consistency and completeness of recording of text information, resulting in underestimates when using these data. Four coding validation papers were reviewed, with these studies showing the utility of text data for validating and checking the accuracy of coded data. Seven studies (9 papers) described methods for interrogating injury text fields for systematic extraction of information, with a combination of manual and semi-automated methods used to refine and develop algorithms for extraction and classification of coded data from text. Quality assurance approaches to assessing the robustness of the methods for extracting text data were discussed in only 8 of the epidemiology papers and 1 of the coding validation papers. All of the text interrogation methodology papers described systematic approaches to ensuring the quality of the approach.
Conclusions: Manual review and coding approaches, text search methods, and statistical tools have been utilised to extract data from narrative text and translate it into useable, detailed injury event information. These techniques can be, and have been, applied to administrative datasets to identify specific injury types and add value to previously coded injury datasets. Only a few studies thoroughly described the methods used for text mining, and less than half of the studies reviewed used or described quality assurance methods for ensuring the robustness of the approach. New techniques utilising semi-automated computerised approaches and Bayesian/clustering statistical methods offer the potential to further develop and standardise the analysis of narrative text for injury surveillance.
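At its simplest, the text-search approach the review describes amounts to pattern matching over narrative fields to flag candidate injury mechanisms. A hypothetical keyword-based sketch follows; the studies in the review layer manual review, algorithm refinement and quality assurance on top of this kind of first pass.

```python
import re

# hypothetical mechanism patterns; real surveillance lexicons are far larger
PATTERNS = {
    "fall":     r"\b(fell|fall(en)?|slip(ped)?|trip(ped)?)\b",
    "burn":     r"\b(burn(ed|t)?|scald(ed)?)\b",
    "dog_bite": r"\bdog\b.*\bbit(e|ten)?\b",
}

def flag_mechanisms(narrative):
    """Return the mechanism labels whose pattern matches the free text."""
    text = narrative.lower()
    return [label for label, pat in PATTERNS.items() if re.search(pat, text)]

print(flag_mechanisms("Pt slipped on wet floor and fell, striking head"))
print(flag_mechanisms("Neighbour's dog bit patient on left hand"))
```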
Abstract:
A data-driven background dataset refinement technique was recently proposed for SVM based speaker verification. This method selects a refined SVM background dataset from a set of candidate impostor examples after individually ranking examples by their relevance. This paper extends this technique to the refinement of the T-norm dataset for SVM-based speaker verification. The independent refinement of the background and T-norm datasets provides a means of investigating the sensitivity of SVM-based speaker verification performance to the selection of each of these datasets. Using refined datasets provided improvements of 13% in min. DCF and 9% in EER over the full set of impostor examples on the 2006 SRE corpus, with the majority of these gains due to refinement of the T-norm dataset. Similar trends were observed for the unseen data of the NIST 2008 SRE.
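The refinement technique ranks each candidate impostor example by its relevance and keeps only the top-ranked subset. As a generic stand-in (not the paper's exact ranking metric), the sketch below scores candidates by mean cosine similarity to a development set of examples and returns the most relevant ones.

```python
import numpy as np

def refine_dataset(candidates, dev, keep):
    """Rank candidate impostor examples by mean cosine similarity to a
    development set and return the `keep` most relevant examples."""
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    d = dev / np.linalg.norm(dev, axis=1, keepdims=True)
    relevance = (c @ d.T).mean(axis=1)      # one relevance score per candidate
    top = np.argsort(relevance)[::-1][:keep]
    return candidates[top]

rng = np.random.default_rng(5)
candidates = rng.normal(size=(5000, 64))    # candidate impostor supervectors (stand-in)
dev = rng.normal(size=(200, 64))            # development data (stand-in)
background = refine_dataset(candidates, dev, keep=500)
print(background.shape)                     # (500, 64): the refined background set
```

The same selection could be run twice with different development targets to refine the background and T-norm datasets independently, mirroring the paper's experimental design.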
Abstract:
1. Ecological data sets often use clustered measurements or repeated sampling in a longitudinal design. Choosing the correct covariance structure is an important step in the analysis of such data, as the covariance describes the degree of similarity among the repeated observations.
2. Three methods for choosing the covariance are: the Akaike information criterion (AIC), the quasi-information criterion (QIC) and the deviance information criterion (DIC). We compared the methods using a simulation study and a data set that explored the effects of forest fragmentation on avian species richness over 15 years.
3. The overall success was 80.6% for the AIC, 29.4% for the QIC and 81.6% for the DIC. For the forest fragmentation study, the AIC and DIC selected the unstructured covariance, whereas the QIC selected the simpler autoregressive covariance. Graphical diagnostics suggested that the unstructured covariance was probably correct.
4. We recommend using the DIC for selecting the correct covariance structure.
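To show what choosing a covariance structure by information criterion looks like in practice, here is a sketch comparing AIC for an unstructured versus an AR(1) covariance on simulated repeated measures. The AR(1) parameters use quick moment estimates rather than full maximum likelihood, and the QIC/DIC calculations are omitted, so treat it as illustrative only.

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(6)
n, t, rho = 200, 5, 0.6
true_cov = np.fromfunction(lambda i, j: rho ** np.abs(i - j), (t, t))
Y = rng.multivariate_normal(np.zeros(t), true_cov, size=n)
Yc = Y - Y.mean(axis=0)                      # centre; focus on the covariance

def aic(cov, n_params):
    ll = multivariate_normal(np.zeros(t), cov).logpdf(Yc).sum()
    return -2.0 * ll + 2.0 * n_params

# unstructured: sample covariance, t(t+1)/2 free parameters
S = np.cov(Yc, rowvar=False)
# AR(1): variance and lag-1 correlation by moment estimates (not full MLE)
sig2 = Yc.var(axis=0).mean()
r1 = np.mean([np.corrcoef(Yc[:, k], Yc[:, k + 1])[0, 1] for k in range(t - 1)])
ar1 = sig2 * np.fromfunction(lambda i, j: r1 ** np.abs(i - j), (t, t))

print("AIC unstructured:", aic(S, t * (t + 1) // 2))
print("AIC AR(1):       ", aic(ar1, 2))     # fewer parameters; wins when data are AR(1)
```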