942 results for categorical and mix datasets
Abstract:
The technologies and methodologies of assembly design and evaluation in the early design stage are highly significant to product development. This paper looks at a promising technology to mix real components (e.g. physical prototypes, assembly tools, machines, etc.) with virtual components to create an Augmented Reality (AR) interface for assembly process evaluation. The goal of this paper is to clarify the methodologies and enabling technologies of how to establish an AR assembly simulation and evaluation environment. The architecture of an AR assembly system is proposed and the important functional modules including AR environment set-up, design for assembly (DFA) analysis and AR assembly sequence planning in an AR environment are discussed in detail.
Abstract:
The application of discriminant function analysis (DFA) is not a new idea in the study of tephrochronology. In this paper, DFA is applied to compositional datasets of two different types of tephras, from Mount Ruapehu in New Zealand and Mount Rainier in the USA. The canonical variables from the analysis are further investigated with a statistical methodology for change-point problems in order to gain a better understanding of the change in compositional pattern over time. Finally, a special case of segmented regression is proposed to model both the time of change and the change in pattern. This model can be used to estimate the age of unknown tephras using Bayesian statistical calibration.
Abstract:
By using suitable parameters, we present a unified approach for describing four methods for representing categorical data in a contingency table. These methods are: correspondence analysis (CA), the alternative approach using Hellinger distance (HD), the log-ratio (LR) alternative, which is appropriate for compositional data, and the so-called non-symmetrical correspondence analysis (NSCA). We then compare these four methods and give some illustrative examples. Some approaches based on cumulative frequencies are also linked and studied using matrices. Key words: Correspondence analysis, Hellinger distance, Non-symmetrical correspondence analysis, log-ratio analysis, Taguchi inertia
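As a rough illustration of two of the distances named in this abstract, the sketch below compares row profiles of a small contingency table using the chi-square distance (the one underlying CA) and the Hellinger distance. It is not the paper's unified parametrisation: the function names and the counts are made up for illustration, the Hellinger distance is written without the 1/sqrt(2) factor that some texts include, and every row and column total is assumed to be non-zero.

```python
import math

def row_profiles(table):
    """Divide each row of a contingency table by its row total."""
    return [[cell / sum(row) for cell in row] for row in table]

def column_masses(table):
    """Column totals divided by the grand total."""
    total = sum(sum(row) for row in table)
    return [sum(row[j] for row in table) / total for j in range(len(table[0]))]

def chi_square_distance(table, i, k):
    """Chi-square distance between the profiles of rows i and k."""
    profiles, masses = row_profiles(table), column_masses(table)
    return math.sqrt(sum((a - b) ** 2 / c
                         for a, b, c in zip(profiles[i], profiles[k], masses)))

def hellinger_distance(table, i, k):
    """Hellinger distance between the profiles of rows i and k (no 1/sqrt(2) factor)."""
    profiles = row_profiles(table)
    return math.sqrt(sum((math.sqrt(a) - math.sqrt(b)) ** 2
                         for a, b in zip(profiles[i], profiles[k])))

# Made-up counts: rows 0 and 1 share the same profile, row 2 differs.
counts = [[10, 20], [20, 40], [30, 10]]
print(chi_square_distance(counts, 0, 1), hellinger_distance(counts, 0, 1))
print(chi_square_distance(counts, 0, 2), hellinger_distance(counts, 0, 2))
```

Both distances depend only on the row profiles (and, for the chi-square distance, the column masses), so rows with proportional counts are at distance zero from each other.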
Abstract:
The theory of compositional data analysis is often focused on the composition only. However, in practical applications we often treat a composition together with covariables measured on some other scale. This contribution systematically gathers and develops statistical tools for this situation. For instance, for the graphical display of the dependence of a composition on a categorical variable, a colored set of ternary diagrams might be a good idea for a first look at the data, but it will quickly hide important aspects if the composition has many parts or takes extreme values. On the other hand, colored scatterplots of ilr components may not be very instructive for the analyst if the conventional, black-box ilr is used. Thinking in terms of the Euclidean structure of the simplex, we suggest setting up appropriate projections, which on the one hand show the compositional geometry and on the other hand are still comprehensible to a non-expert analyst and readable for all locations and scales of the data. This is done, for example, by defining special balance displays with carefully selected axes. Following this idea, we need to ask systematically how to display, explore, describe, and test the relation to complementary or explanatory data of categorical, real, ratio or again compositional scales. This contribution shows that it is sufficient to use some basic concepts and very few advanced tools from multivariate statistics (principal covariances, multivariate linear models, trellis or parallel plots, etc.) to build appropriate procedures for all these combinations of scales. This has some fundamental implications for their software implementation and for how they might be taught to analysts who are not already experts in multivariate analysis.
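To make the idea of a balance with a chosen axis concrete, here is a minimal sketch of the standard balance coordinate between two groups of parts, sqrt(rs/(r+s)) times the log-ratio of their geometric means. The function name, the index-based interface and the sample values are assumptions for illustration and are not taken from the contribution itself; all parts are assumed to be strictly positive.

```python
import math

def balance(composition, numerator, denominator):
    """Balance between two groups of parts, given as lists of indices.

    Returns sqrt(r*s/(r+s)) * ln(gm(numerator parts) / gm(denominator parts)),
    where gm is the geometric mean and r, s are the group sizes.
    """
    r, s = len(numerator), len(denominator)
    log_gm_num = sum(math.log(composition[i]) for i in numerator) / r
    log_gm_den = sum(math.log(composition[i]) for i in denominator) / s
    return math.sqrt(r * s / (r + s)) * (log_gm_num - log_gm_den)

# Made-up three-part compositions labelled by a categorical variable.
samples = [("A", [0.2, 0.3, 0.5]), ("A", [0.1, 0.4, 0.5]), ("B", [0.6, 0.2, 0.2])]
for label, parts in samples:
    # One interpretable axis: parts 0 and 1 together against part 2.
    print(label, balance(parts, [0, 1], [2]))
```

Because the balance is a log-ratio, it does not change when the composition is rescaled, so raw amounts and closed proportions give the same coordinate.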
Abstract:
Given a set of images of scenes containing different object categories (e.g. grass, roads), our objective is to discover these objects in each image and to use these object occurrences to perform scene classification (e.g. beach scene, mountain scene). We achieve this by using a supervised learning algorithm that is able to learn from few images, which eases the user's task. We use a probabilistic model to recognise the objects, and we then classify the scene based on these object occurrences. Experimental results are shown and evaluated to assess the validity of our proposal. Object recognition performance is compared to the approaches of He et al. (2004) and Marti et al. (2001) using their own datasets. Furthermore, an unsupervised method is implemented in order to evaluate the advantages and disadvantages of our supervised classification approach versus an unsupervised one.
Abstract:
This paper addresses the application of PCA to categorical data prior to diagnosing a patient data set using a Case-Based Reasoning (CBR) system. The particularity is that standard PCA techniques are designed to deal with numerical attributes, but our medical data set contains many categorical attributes, so alternative methods such as RS-PCA are required. Thus, we propose to hybridize RS-PCA (Regular Simplex PCA) and a simple CBR. Results show that, when diagnosing a medical data set, the hybrid system produces results similar to those obtained using the original attributes. These results are quite promising since they allow diagnosis with less computational effort and memory storage.
Abstract:
This part contains geomorphological, hydrological and other information concerning the desktop research of the River Tyne catchment area.
Abstract:
PowerPoint Slides relating to theory and use of SPSS. Used in Research Skills for Biomedical Science
Abstract:
A list of many network datasets
Abstract:
Linux commands that are generally useful for analyzing data; it is very easy to reduce phenomena such as links, nodes, URLs or downloads to repeated identifiers and then sort them and count their appearances.
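The sort-and-count pattern described here is commonly written in a shell as `sort | uniq -c | sort -rn`. The sketch below reproduces the same idea in Python; the function name and the sample identifiers are invented for illustration and do not come from the resource itself.

```python
from collections import Counter

def count_identifiers(lines):
    """Count how often each non-empty identifier appears, most frequent first."""
    counts = Counter(line.strip() for line in lines if line.strip())
    # Sort by descending count, then alphabetically so ties have a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

# Made-up identifiers standing in for URLs or node names, one per line.
sample = ["a.org", "b.org", "a.org", "c.org", "a.org", "b.org"]
for identifier, n in count_identifiers(sample):
    print(n, identifier)
```

Passing an open file object instead of the sample list works the same way, since the function only iterates over lines.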
Predicting sense of community and participation by applying machine learning to open government data
Abstract:
Community capacity is used to monitor socio-economic development. It is composed of a number of dimensions, which can be measured to understand the possible issues in the implementation of a policy or the outcome of a project targeting a community. Measuring community capacity dimensions is usually expensive and time consuming, requiring locally organised surveys. Therefore, we investigate a technique to estimate them by applying the Random Forests algorithm on secondary open government data. This research focuses on the prediction of measures for two dimensions: sense of community and participation. The most important variables for this prediction were determined. The variables included in the datasets used to train the predictive models complied with two criteria: nationwide availability; sufficiently fine-grained geographic breakdown, i.e. neighbourhood level. The models explained 77% of the sense of community measures and 63% of participation. Due to the low geographic detail of the outcome measures available, further research is required to apply the predictive models to a neighbourhood level. The variables that were found to be more determinant for prediction were only partially in agreement with the factors that, according to the social science literature consulted, are the most influential for sense of community and participation. This finding should be further investigated from a social science perspective, in order to be understood in depth.
Abstract:
This paper uses a two-sided market model of hospital competition to study the implications of different remuneration schemes on the physicians' side. The two-sided market approach is characterized by the concept of common network externality (CNE) introduced by Bardey et al. (2010). This type of externality occurs when both sides value, possibly with different intensities, the same network externality. We explicitly introduce effort exerted by doctors. By increasing the number of medical acts (which involves a costly effort), the doctor can increase the quality of service offered to patients (over and above the level implied by the CNE). We first consider pure salary, capitation or fee-for-service schemes. Then, we study schemes that mix fee-for-service with either salary or capitation payments. We show that salary schemes (either pure or in combination with fee-for-service) are more patient friendly than (pure or mixed) capitation schemes. This comparison is exactly reversed on the providers' side. Quite surprisingly, patients always lose when a fee-for-service scheme is introduced (pure or mixed). This is true even though fee-for-service is the only way to induce the providers to exert effort, and it holds whatever the patients' valuation of this effort. In other words, the increase in quality brought about by fee-for-service is more than offset by the increase in fees faced by patients.
Abstract:
Measuring inequality of opportunity with the PISA databases involves several limitations: (i) the sample represents only a limited fraction of the cohorts of 15-year-olds in developing countries, and (ii) these fractions are not uniform across countries or across periods. This raises doubts about the reliability of these measurements when they are used for international comparisons: greater equity may be the result of a more restricted and more homogeneous sample. Unlike previous approaches based on reconstructing the samples, the approach of this paper is to provide a two-dimensional index that includes achievement and access as its dimensions. Several aggregation methods are used, and considerable changes are observed in the rankings of (in)equity of opportunity between the case where only achievement is observed and the case where both dimensions are observed in the PISA 2006/2009 tests. Finally, a generalization of the approach is proposed that allows additional dimensions and other weights in the aggregation.
Abstract:
Multicultural leadership is a topic of great interest in today's globalized work environment. Colombia emerges as an attractive marketplace with appealing business opportunities, especially for German enterprises. After presenting Colombia’s current political, social and economic situation, the thesis elaborates on the complex subject of cultural differences while focusing on the peculiarities of German and Colombian national cultures. The resulting implications for a team’s collaboration and leader effectiveness are theoretically supported with reference to the landmark studies of Hofstede and GLOBE. Using semi-structured interview techniques, a qualitative study enriches the previous findings and gives an all-encompassing insight into German-Colombian teamwork. The investigation identifies distinctive behavioral patterns and relations, which imply challenges and success factors for multicultural team leaders. Finally, a categorical analysis examines the influence of cultural traits on team performance and evaluates the effectiveness of the applied leadership style.
Abstract:
Scene modelling is key to a wide range of applications, from map generation to augmented reality. This thesis presents a complete solution for creating textured 3D models. First, a sequential Structure from Motion method is presented, in which the 3D model of the environment is updated as new visual information is acquired. The proposal is more accurate and robust than the state of the art. An online method based on visual bag-of-words has also been developed for efficient loop detection. Being a fully sequential and automatic technique, it allows drift to be reduced, improving navigation and map building. To build maps of large areas, a 3D model simplification algorithm oriented to online applications is proposed. The efficiency of the proposals has been compared with other methods using several underwater and terrestrial datasets.