866 resultados para Compositional data analysis-roots in geosciences
Resumo:
En los últimos años la sociedad está experimentando una serie de cambios. Uno de estos cambios es la datificación (“datafication” en inglés). Este término puede ser definido como la transformación sistemática de aspectos de la vida cotidiana de las personas en datos procesados por ordenadores. Cada día, a cada minuto y a cada segundo, cada vez que alguien emplea un dispositivo digital,hay datos siendo guardados en algún lugar. Se puede tratar del contenido de un correo electrónico pero también puede ser el número de pasos que esa persona ha caminado o su historial médico. El simple almacenamiento de datos no proporciona un valor añadido por si solo. Para extraer conocimiento de los datos, y por tanto darles un valor, se requiere del análisis de datos. La ciencia de los datos junto con el análisis de datos se está volviendo cada vez más popular. Hoy en día, se pueden encontrar millones de web APIs estadísticas; estas APIs ofrecen la posibilidad de analizar tendencias o sentimientos presentes en las redes sociales o en internet en general. Una de las redes sociales más populares, Twitter, es pública. Cada mensaje, o tweet, publicado puede ser visto por cualquier persona en el mundo, siempre y cuando posea una conexión a internet. Esto hace de Twitter un medio interesante a la hora de analizar hábitos sociales o perfiles de consumo. Es en este contexto en que se engloba este proyecto. Este trabajo, combinando el análisis estadístico de datos y el análisis de contenido, trata de extraer conocimiento de tweets públicos de Twitter. En particular tratará de establecer si el género es un factor influyente en las relaciones entre usuarios de Twitter. Para ello, se analizará una base de datos que contiene casi 2.000 tweets. En primer lugar se determinará el género de los usuarios mediante web APIs. En segundo lugar se empleará el contraste de hipótesis para saber si el género influye en los usuarios a la hora de relacionarse con otros usuarios. Finalmente se construirá un modelo estadístico para predecir el comportamiento de los usuarios de Twitter en relación a su género.
Resumo:
Participation trends in 6-hour ultra-marathons held word-wide were investigated to gain basic demographic data on 6-hour ultra-marathoners and where these races took place. Participation trends and the association between nationality and race performance were investigated in all 6-hour races held worldwide between 1991 and 2010. Participation increased linearly in both women and men across years. The annual number of finishes was significantly higher in men than in women (P=0.013). The male-to-female ratio remained stable at ~4 since 1991. Runners in age group 45-49 years showed the largest increase in participation for both men (800 participants in 18 years) and women (208 participants in 16 years). Europe attracted most of the runners from other continents (166 runners), more than all other continents combined (55 runners). European runners also showed the best top ten performances (73±3 km for women and 77±11 km for men), while African (with 65±9 km for men) and South American (54±4 km for women and 65±2 km for men) runners showed the weakest. To summarize, participation in 6-hour ultra-marathons increased across years. Most of the development took place in Europe and in athletes in the age group 45-49 years. Europe also attracted the most diverse field of athletes with runners from all other continents. European runners accounted for the most runners and achieved the best top ten performances.
Resumo:
Complex systems in causal relationships are known to be circular rather than linear; this means that a particular result is not produced by a single cause, but rather that both positive and negative feedback processes are involved. However, although interpreting systemic interrelationships requires a language formed by circles, this has only been developed at the diagram level, and not from an axiomatic point of view. The first difficulty encountered when analysing any complex system is that usually the only data available relate to the various variables, so the first objective was to transform these data into cause-and-effect relationships. Once this initial step was taken, our discrete chaos theory could be applied by finding the causal circles that will form part of the system attractor and allow their behavior to be interpreted. As an application of the technique presented, we analyzed the system associated with the transcription factors of inflammatory diseases.
Resumo:
The aim of this paper is to propose a mathematical model to determine invariant sets, set covering, orbits and, in particular, attractors in the set of tourism variables. Analysis was carried out based on a pre-designed algorithm and applying our interpretation of chaos theory developed in the context of General Systems Theory. This article sets out the causal relationships associated with tourist flows in order to enable the formulation of appropriate strategies. Our results can be applied to numerous cases. For example, in the analysis of tourist flows, these findings can be used to determine whether the behaviour of certain groups affects that of other groups and to analyse tourist behaviour in terms of the most relevant variables. Unlike statistical analyses that merely provide information on current data, our method uses orbit analysis to forecast, if attractors are found, the behaviour of tourist variables in the immediate future.
Resumo:
National Highway Traffic Safety Administration, Washington, D.C.
Resumo:
The mechanical behavior of the vertebrate skull is often modeled using free-body analysis of simple geometric structures and, more recently, finite-element (FE) analysis. In this study, we compare experimentally collected in vivo bone strain orientations and magnitudes from the cranium of the American alligator with those extrapolated from a beam model and extracted from an FE model. The strain magnitudes predicted from beam and FE skull models bear little similarity to relative and absolute strain magnitudes recorded during in vivo biting experiments. However, quantitative differences between principal strain orientations extracted from the FE skull model and recorded during the in vivo experiments were smaller, and both generally matched expectations from the beam model. The differences in strain magnitude between the data sets may be attributable to the level of resolution of the models, the material properties used in the FE model, and the loading conditions (i.e., external forces and constraints). This study indicates that FE models and modeling of skulls as simple engineering structures may give a preliminary idea of how these structures are loaded, but whenever possible, modeling results should be verified with either in vitro or preferably in vivo testing, especially if precise knowledge of strain magnitudes is desired. (c) 2005 Wiley-Liss, Inc.
Resumo:
Exploratory analysis of data seeks to find common patterns to gain insights into the structure and distribution of the data. In geochemistry it is a valuable means to gain insights into the complicated processes making up a petroleum system. Typically linear visualisation methods like principal components analysis, linked plots, or brushing are used. These methods can not directly be employed when dealing with missing data and they struggle to capture global non-linear structures in the data, however they can do so locally. This thesis discusses a complementary approach based on a non-linear probabilistic model. The generative topographic mapping (GTM) enables the visualisation of the effects of very many variables on a single plot, which is able to incorporate more structure than a two dimensional principal components plot. The model can deal with uncertainty, missing data and allows for the exploration of the non-linear structure in the data. In this thesis a novel approach to initialise the GTM with arbitrary projections is developed. This makes it possible to combine GTM with algorithms like Isomap and fit complex non-linear structure like the Swiss-roll. Another novel extension is the incorporation of prior knowledge about the structure of the covariance matrix. This extension greatly enhances the modelling capabilities of the algorithm resulting in better fit to the data and better imputation capabilities for missing data. Additionally an extensive benchmark study of the missing data imputation capabilities of GTM is performed. Further a novel approach, based on missing data, will be introduced to benchmark the fit of probabilistic visualisation algorithms on unlabelled data. Finally the work is complemented by evaluating the algorithms on real-life datasets from geochemical projects.
Resumo:
DUE TO INCOMPLETE PAPERWORK, ONLY AVAILABLE FOR CONSULTATION AT ASTON UNIVERSITY LIBRARY AND INFORMATION SERVICES WITH PRIOR ARRANGEMENT
Resumo:
A tanulmány arra a feltevésre épül, hogy minél erősebb a bizalomra méltóság szintje egy adott üzleti kapcsolatban, annál inkább igaz, hogy nagy kockázatú tevékenységek mennek végbe benne. Ilyen esetekben a bizalomra méltóság a kapcsolatban zajló események, cselekvések irányítási eszközévé válik, és az üzleti kapcsolatban megjelenik a cselekvési hajlandóságként értelmezett bizalom. A tanulmány felhívja a figyelmet a bizalom és a bizalomra méltóság fogalmai közötti különbségre, szisztematikus különválasztásuk fontosságára. Bemutatja az úgynevezett diadikus adatelemzés gazdálkodástudományi alkalmazását. Empirikus eredményei is igazolják, hogy ezzel a módszerrel az üzleti kapcsolatok társas jellemzőinek (köztük a bizalomnak) és a közöttük lévő kapcsolatoknak mélyebb elemzésére nyílik lehetőség. ____ The paper rests on the behavioral interpretation of trust, making a clear distinction between trustworthiness (honesty) and trust interpreted as willingness to engage in risky situations with specific partners. The hypothesis tested is that in a business relation marked by high levels of trustworthiness as perceived by the opposite parties, willingness to be involved in risky situations is higher than it is in relations where actors do not believe their partners to be highly trustworthy. Testing this hypothesis clearly calls for dyadic operationalization, measurement, and analysis. The authors present the first economic application of a newly developed statistical technique called dyadic data analysis, which has already been applied in social psychology. It clearly overcomes the problem of single-ended research in business relations analysis and allows a deeper understanding of any dyadic phenomenon, including trust/trustworthiness as a governance mechanism.
Resumo:
A compositional multivariate approach is used to analyse regional scale soil geochemical data obtained as part of the Tellus Project generated by the Geological Survey Northern Ireland (GSNI). The multi-element total concentration data presented comprise XRF analyses of 6862 rural soil samples collected at 20cm depths on a non-aligned grid at one site per 2 km2. Censored data were imputed using published detection limits. Using these imputed values for 46 elements (including LOI), each soil sample site was assigned to the regional geology map provided by GSNI initially using the dominant lithology for the map polygon. Northern Ireland includes a diversity of geology representing a stratigraphic record from the Mesoproterozoic, up to and including the Palaeogene. However, the advance of ice sheets and their meltwaters over the last 100,000 years has left at least 80% of the bedrock covered by superficial deposits, including glacial till and post-glacial alluvium and peat. The question is to what extent the soil geochemistry reflects the underlying geology or superficial deposits. To address this, the geochemical data were transformed using centered log ratios (clr) to observe the requirements of compositional data analysis and avoid closure issues. Following this, compositional multivariate techniques including compositional Principal Component Analysis (PCA) and minimum/maximum autocorrelation factor (MAF) analysis method were used to determine the influence of underlying geology on the soil geochemistry signature. PCA showed that 72% of the variation was determined by the first four principal components (PC’s) implying “significant” structure in the data. Analysis of variance showed that only 10 PC’s were necessary to classify the soil geochemical data. To consider an improvement over PCA that uses the spatial relationships of the data, a classification based on MAF analysis was undertaken using the first 6 dominant factors. Understanding the relationship between soil geochemistry and superficial deposits is important for environmental monitoring of fragile ecosystems such as peat. To explore whether peat cover could be predicted from the classification, the lithology designation was adapted to include the presence of peat, based on GSNI superficial deposit polygons and linear discriminant analysis (LDA) undertaken. Prediction accuracy for LDA classification improved from 60.98% based on PCA using 10 principal components to 64.73% using MAF based on the 6 most dominant factors. The misclassification of peat may reflect degradation of peat covered areas since the creation of superficial deposit classification. Further work will examine the influence of underlying lithologies on elemental concentrations in peat composition and the effect of this in classification analysis.
Resumo:
Market research is often conducted through conventional methods such as surveys, focus groups and interviews. But the drawbacks of these methods are that they can be costly and timeconsuming. This study develops a new method, based on a combination of standard techniques like sentiment analysis and normalisation, to conduct market research in a manner that is free and quick. The method can be used in many application-areas, but this study focuses mainly on the veganism market to identify vegan food preferences in the form of a profile. Several food words are identified, along with their distribution between positive and negative sentiments in the profile. Surprisingly, non-vegan foods such as cheese, cake, milk, pizza and chicken dominate the profile, indicating that there is a significant market for vegan-suitable alternatives for such foods. Meanwhile, vegan-suitable foods such as coconut, potato, blueberries, kale and tofu also make strong appearances in the profile. Validation is performed by using the method on Volkswagen vehicle data to identify positive and negative sentiment across five car models. Some results were found to be consistent with sales figures and expert reviews, while others were inconsistent. The reliability of the method is therefore questionable, so the results should be used with caution.