948 results for statistical methods


Relevance:

60.00%

Publisher:

Abstract:

The thesis is an investigation of the principle of least effort (Zipf 1949 [1972]). The principle is simple (all effort should be least) and universal (it governs the totality of human behavior). Since the principle is also functional, the thesis adopts a functional theory of language as its theoretical framework, i.e. Natural Linguistics. The explanatory system of Natural Linguistics posits that higher principles govern preferences, which, in turn, manifest themselves as concrete, specific processes in a given language. Therefore, the thesis’ aim is to investigate the principle of least effort on the basis of external evidence from English. The investigation falls into the following three strands: the investigation of the principle itself, the investigation of its application in articulatory effort and the investigation of its application in phonological processes. The structure of the thesis reflects the division of its broad aims. The first part of the thesis presents its theoretical background (Chapter One and Chapter Two), the second part of the thesis deals with the application of least effort in articulatory effort (Chapter Three and Chapter Four), whereas the third part discusses the principle of least effort in phonological processes (Chapter Five and Chapter Six). Chapter One serves as an introduction, examining various aspects of the principle of least effort such as its history, literature, operation and motivation. It overviews various names which denote least effort, explains the origins of the principle and reviews the literature devoted to the principle of least effort in chronological order. The chapter also discusses the nature and operation of the principle, providing numerous examples of the principle at work.
It emphasizes the universal character of the principle, with examples from both the linguistic field (low-level phonetic processes and language universals) and non-linguistic fields (physics, biology, psychology and the cognitive sciences), arguing that the principle governs human behavior and choices. Chapter Two provides the theoretical background of the thesis in terms of its theoretical framework and discusses the terms used in the thesis’ title, i.e. hierarchy and preference. It justifies the selection of Natural Linguistics as the thesis’ theoretical framework by outlining its major assumptions and demonstrating its explanatory power. As far as the concepts of hierarchy and preference are concerned, the chapter provides their definitions and reviews their various understandings via decision theories and linguistic preference-based theories. Since the thesis investigates the principle of least effort in language and speech, Chapter Three considers the articulatory aspect of effort. It reviews the notion of easy and difficult sounds and discusses the concept of articulatory effort, overviewing its literature and its various understandings in chronological fashion. The chapter also presents the concept of articulatory gestures within the framework of Articulatory Phonology. Because the thesis aims to investigate the principle of least effort on the basis of external evidence, Chapters Four and Six provide such evidence: three experiments and text-message studies (Chapter Four) and phonological processes in English (Chapter Six). Chapter Four contains evidence for the principle of least effort in articulation on the basis of experiments. It describes the experiments in terms of their predictions and methodology. In particular, it discusses the adopted measure of effort established by means of the effort parameters as well as their status. The statistical methods of the experiments are also clarified.
The chapter reports on the results of the experiments, presenting them graphically, and discusses their relation to the tested predictions. Chapter Four establishes a hierarchy of speakers’ preferences with reference to articulatory effort (Figures 30, 31). The thesis investigates the principle of least effort in phonological processes, thus Chapter Five is devoted to the discussion of phonological processes in Natural Phonology. The chapter explains the general nature and motivation of processes as well as the development of processes in child language. It also discusses the organization of processes in terms of their typology as well as the order in which processes apply. The chapter characterizes the semantic properties of processes and overviews Luschützky’s (1997) contribution to Natural Phonology with respect to processes, in terms of their typology and the incorporation of articulatory gestures into the concept of a process. Chapter Six investigates phonological processes. In particular, it addresses the issues of lenition/fortition definition and process typology by presenting the current approaches to process definitions and their typology. Since the chapter concludes that no coherent definition of lenition/fortition exists, it develops alternative lenition/fortition definitions. The chapter also revises the typology of phonological processes under effort management, which is an extended version of the principle of least effort. Chapter Seven concludes the thesis with a list of the concepts discussed in the thesis, enumerates the proposals made in discussing those concepts and presents some questions for future research which have emerged in the course of the investigation. The chapter also specifies the extent to which the investigation of the principle of least effort is a meaningful contribution to phonology.
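Zipf's principle is commonly illustrated by the rank-frequency regularity he derived from it: in natural text, a word's frequency is roughly inversely proportional to its rank. A minimal sketch (not from the thesis; the toy corpus is invented):

```python
from collections import Counter

# Toy corpus; natural-language samples tend to show a heavy-tailed
# rank-frequency curve (Zipf 1949): frequency roughly proportional to 1/rank.
text = (
    "the quick brown fox jumps over the lazy dog the fox and the dog "
    "ran over the hill and the fox saw the dog"
).split()

freqs = Counter(text)
ranked = freqs.most_common()  # [(word, count), ...] in descending frequency

for rank, (word, count) in enumerate(ranked, start=1):
    # Under an ideal Zipfian distribution, rank * count is roughly constant.
    print(rank, word, count, rank * count)
```

With a corpus this small the constancy of rank times count is only suggestive; the regularity emerges clearly in large samples.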

Abstract:

The aim of this study was to examine how different knowledge-management practices affect learning, renewal and a firm's innovation capability. The work focuses in particular on knowledge-management practices that promote learning and renewal in firms. Statistical methods, including factor analysis, correlation analysis and regression, were used to analyse survey data collected from 259 Finnish firms on their knowledge-management practices and intangible capital. The analysis shows that several knowledge-management practices have a positive effect on a firm's renewal and, through it, on its innovation capability. Personnel training and the collection and application of best practices within the firm are positively associated with innovation capability. Personnel training has the most significant direct effect on innovation capability, and this work proposes that the greatest impact of providing training is the development of a learning-friendly culture in firms, rather than merely the improvement of task-related skills and knowledge. Personnel training, best practices, and the exchange of knowledge and forming of relationships through socialisation have a positive effect on renewal capital. Based on the results, renewal capital plays a significant role in the emergence of innovations in firms. Renewal capital mediates the effect of training, best practices and possibly also socialisation on innovation capability and is thus a significant part of how innovations arise in firms. Understanding the components of innovation capability can help managers and supervisors focus their attention on specific knowledge-management practices to promote innovation in the firm, instead of merely trying to influence the innovation process.

Abstract:

The study of ichthyoplankton stages and their relations with the environment and other organisms is therefore crucial for the correct use of fishery resources. In this context, the extraction and analysis of the contents of the digestive tract is a key method for identifying the diet in early larval stages, determining the resources the larvae rely on and, possibly, comparing their diet with that of other species. This approach could also be useful in determining whether competition between species occurs. The technique is preceded by the analysis of morphometric data (Blackith & Reyment, 1971; Marcus, 1990), that is, the acquisition of quantitative variables measured on the morphology of the object of study: linear distances, counts, angles and ratios. The subsequent application of multivariate statistical methods aims to quantify the changes in morphological measures between and within groups, relating them to the type and size of prey, and to evaluate whether food choices change as the larvae grow.
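As a toy sketch of the multivariate comparison described, the snippet below contrasts two larval groups on three hypothetical morphometric variables; the values are invented, and an actual study would use formal methods such as MANOVA or ordination:

```python
import random

random.seed(0)

# Hypothetical morphometric variables for two larval groups (mm):
# body length, head width, mouth gape. Group B larvae are slightly larger.
group_a = [[random.gauss(5.0, 0.3), random.gauss(1.2, 0.1), random.gauss(0.4, 0.05)]
           for _ in range(30)]
group_b = [[random.gauss(5.8, 0.3), random.gauss(1.4, 0.1), random.gauss(0.5, 0.05)]
           for _ in range(30)]

def centroid(rows):
    """Mean vector of a group of morphometric measurements."""
    n = len(rows)
    return [sum(r[i] for r in rows) / n for i in range(3)]

ca, cb = centroid(group_a), centroid(group_b)

# Euclidean distance between group centroids: a crude multivariate summary
# of the morphological shift between the two groups.
dist = sum((a - b) ** 2 for a, b in zip(ca, cb)) ** 0.5
print(ca, cb, dist)
```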

Abstract:

Fish meat has a particular chemical composition that gives it high nutritional value. However, it is highly perishable, an aspect often cited as a barrier to fish consumption. The southwestern Paraná region, mirroring the country as a whole, is characterized by low fish consumption, and one strategy for increasing the consumption of this important protein source is to encourage the production of species other than tilapia. Within this context, knowledge of the meat characteristics of these species is necessary. The objective of this study was therefore to evaluate the technological potential of the pacu, grass carp and catfish species. First, the chemical composition and biometry of the three species were assessed under two distinct descriptive statistical methods, and the discriminating capacity of these variables was evaluated. Second, the effects of two different washing processes (acid and alkaline) were evaluated with respect to the removal of nitrogen compounds and pigments and the emulsifying ability of the proteins contained in the protein base obtained. Finally, in the third phase, a GC-MS methodology was optimized for the analysis of geosmin and MIB (2-methylisoborneol), the compounds responsible for the earthy and moldy taste/smell in freshwater fish. The results showed a high protein and low lipid content for the three species. The comparison between means and medians revealed symmetry only for protein values and biometric measurements; lipid levels, when evaluated by means alone, were overestimated for all species. Correlations between body measurements and fillet yield were low, regardless of the species analyzed, and the best prediction equation relates total weight to fillet weight. The biometric variables were the best discriminators among the species.
In the evaluation of the washings, the acid and alkaline processes were found to be equally efficient (p ≥ 0.05) in removing nitrogen compounds from the fish pulps. Regarding the extraction of pigments, effective removal was recorded only for the pacu species when the data were assessed by the L*, a*, b* parameters. When evaluated by the total color difference (ΔE) before and after washing, both processes (acid/alkaline) produced a ΔE perceptible to the naked eye for all species. The catfish presented the lightest meat, with the alkaline washing considered the most effective in removing pigments for this species. Protein bases obtained by alkaline washing had higher emulsifying capacity (p ≤ 0.05) than unwashed pulps and those washed by the acid process. The methodology applied for the quantification of MIB and geosmin established that the method used to extract and purify the analytes had low recovery, and future studies should be developed for the identification and quantification of MIB and geosmin in fish samples.
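The total color difference used above is, in its simplest (CIE76) formulation, the Euclidean distance between two points in L*a*b* space. A short sketch with invented before/after readings (the study's actual measurements are not reproduced here):

```python
import math

def delta_e(lab1, lab2):
    """CIE76 total color difference between two CIELAB colors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(lab1, lab2)))

# Hypothetical L*, a*, b* readings for a fish pulp before and after washing.
before = (52.0, 4.5, 10.2)
after = (58.3, 2.1, 7.9)

de = delta_e(before, after)
print(round(de, 2))  # -> 7.12

# A ΔE of roughly 3 or more is commonly taken as perceptible to the naked eye.
print(de > 3.0)
```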

Abstract:

Decision analysis in production processes currently involves a level of detail in which the problem is subdivided so that it can be analyzed from different, often conflicting, points of view. Multi-criteria analysis has become an important tool supporting assertive decisions related to the production process, and it has been incorporated into various areas of production engineering through the application of multi-criteria methods to problems in the productive sector. This research presents a statistical study on the use of multi-criteria methods in the areas of Production Engineering, in which 935 papers were filtered from 20,663 publications in scientific journals, considering publication quality based on the impact factor published in the JCR between 2010 and 2015. Descriptive statistics are used to summarize the volume of applications of the methods. Relevant results were found with respect to the number of advanced methods being applied and the areas of Production Engineering in which they are used. This information may support researchers preparing a multi-criteria application, making it possible to check on which problems, and how often, other authors have used multi-criteria methods.

Abstract:

When designing systems that are complex, dynamic and stochastic in nature, simulation is generally recognised as one of the best design support technologies, and a valuable aid in the strategic and tactical decision making process. A simulation model consists of a set of rules that define how a system changes over time, given its current state. Unlike analytical models, a simulation model is not solved but is run, and the changes of system states can be observed at any point in time. This provides an insight into system dynamics rather than just predicting the output of a system based on specific inputs. Simulation is not a decision making tool but a decision support tool, allowing better informed decisions to be made. Due to the complexity of the real world, a simulation model can only be an approximation of the target system. The essence of the art of simulation modelling is abstraction and simplification. Only those characteristics that are important for the study and analysis of the target system should be included in the simulation model. The purpose of simulation is either to better understand the operation of a target system, or to make predictions about a target system’s performance. It can be viewed as an artificial white-room which allows one to gain insight but also to test new theories and practices without disrupting the daily routine of the focal organisation. What you can expect to gain from a simulation study is very well summarised by FIRMA (2000): if the theory that has been framed about the target system holds, and if this theory has been adequately translated into a computer model, this would allow you to answer some of the following questions:

· Which kind of behaviour can be expected under arbitrarily given parameter combinations and initial conditions?
· Which kind of behaviour will a given target system display in the future?
· Which state will the target system reach in the future?
The required accuracy of the simulation model very much depends on the type of question one is trying to answer. In order to be able to respond to the first question the simulation model needs to be an explanatory model, which requires less data accuracy. In comparison, the simulation model required to answer the latter two questions has to be predictive in nature and therefore needs highly accurate input data to achieve credible outputs. These predictions involve showing trends, rather than giving precise and absolute predictions of the target system performance. The numerical results of a simulation experiment on their own are most often not very useful and need to be rigorously analysed with statistical methods. These results then need to be considered in the context of the real system and interpreted in a qualitative way to make meaningful recommendations or compile best practice guidelines. One needs a good working knowledge about the behaviour of the real system to be able to fully exploit the understanding gained from simulation experiments. The goal of this chapter is to introduce the newcomer to what we think is a valuable asset to the toolset of analysts and decision makers. We will give you a summary of information we have gathered from the literature and of the first-hand experience we have gained during the last five years, whilst obtaining a better understanding of this exciting technology. We hope that this will help you to avoid some pitfalls that we have unwittingly encountered. Section 2 is an introduction to the different types of simulation used in Operational Research and Management Science with a clear focus on agent-based simulation. In Section 3 we outline the theoretical background of multi-agent systems and their elements to prepare you for Section 4, where we discuss how to develop a multi-agent simulation model. Section 5 outlines a simple example of a multi-agent system.
Section 6 provides a collection of resources for further studies and finally in Section 7 we will conclude the chapter with a short summary.
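The point that a simulation model is run rather than solved, and that its numerical output needs statistical treatment, can be sketched as follows. The toy single-queue model and all parameters are illustrative, not taken from the chapter:

```python
import random
import statistics

random.seed(42)

def run_queue_sim(arrival_rate, service_rate, horizon=1000.0):
    """Toy M/M/1-style simulation: returns the time-average number in system.
    A stand-in for any stochastic simulation model."""
    t, in_system, area = 0.0, 0, 0.0
    while t < horizon:
        rate = arrival_rate + (service_rate if in_system else 0.0)
        dt = random.expovariate(rate)
        area += in_system * dt  # accumulate state-time product
        t += dt
        if random.random() < arrival_rate / rate:
            in_system += 1      # next event is an arrival
        elif in_system:
            in_system -= 1      # next event is a service completion
    return area / t

# A single run is one sample from a distribution, so simulation output is
# analysed statistically: replicate, then report a mean and a confidence
# half-width rather than a single number.
reps = [run_queue_sim(0.5, 1.0) for _ in range(20)]
mean = statistics.mean(reps)
half_width = 1.96 * statistics.stdev(reps) / len(reps) ** 0.5
print(f"avg number in system: {mean:.2f} +/- {half_width:.2f}")
```

For this utilisation (0.5), queueing theory gives a long-run average of 1.0 in system, so the replicated estimate can be sanity-checked against the analytical value.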

Abstract:

The aim of this thesis is to review and augment the theory and methods of optimal experimental design. In Chapter 1 the scene is set by considering the possible aims of an experimenter prior to an experiment, the statistical methods one might use to achieve those aims and how experimental design might aid this procedure. It is indicated that, given a criterion for design, a priori optimal design will only be possible in certain instances and that, otherwise, some form of sequential procedure would seem to be indicated. In Chapter 2 an exact experimental design problem is formulated mathematically and is compared with its continuous analogue. Motivation is provided for the solution of this continuous problem, and the remainder of the chapter concerns this problem. A necessary and sufficient condition for optimality of a design measure is given. Problems which might arise in testing this condition are discussed, in particular with respect to possible non-differentiability of the criterion function at the design being tested. Several examples are given of optimal designs which may be found analytically and which illustrate the points discussed earlier in the chapter. In Chapter 3 numerical methods of solution of the continuous optimal design problem are reviewed. A new algorithm is presented with illustrations of how it should be used in practice. It is shown that, for reasonably large sample size, continuously optimal designs may be approximated well by an exact design. In situations where this is not satisfactory, algorithms for the improvement of this design are reviewed. Chapter 4 consists of a discussion of sequentially designed experiments, with regard to both the philosophies underlying statistical inference and the application of its methods. In Chapter 5 we constructively criticise previous suggestions for fully sequential design procedures. Alternative suggestions are made, along with conjectures as to how these might improve performance.
Chapter 6 presents a simulation study, the aim of which is to investigate the conjectures of Chapter 5. The results of this study provide empirical support for these conjectures. In Chapter 7 examples are analysed. These suggest aids to sequential experimentation by means of reduction of the dimension of the design space and the possibility of experimenting semi-sequentially. Further examples are considered which stress the importance of the use of prior information in situations of this type. Finally we consider the design of experiments when semi-sequential experimentation is mandatory because of the necessity of taking batches of observations at the same time. In Chapter 8 we look at some of the assumptions which have been made and indicate what may go wrong where these assumptions no longer hold.
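The design criterion at the heart of such problems can be made concrete for the simplest case. For straight-line regression on [-1, 1], the D-optimality criterion det(X'X) is maximised by placing observations at the endpoints of the interval; the sketch below (designs chosen purely for illustration) compares an endpoint design with a spread-out one:

```python
def det_info_matrix(xs):
    """det(X'X) for the simple linear model y = b0 + b1*x + error.
    X'X = [[n, sum x], [sum x, sum x^2]], so the determinant is n*sxx - sx^2."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    return n * sxx - sx * sx

# D-optimality: choose the design (set of x points in [-1, 1]) maximising
# det(X'X). For straight-line regression the optimum puts half the
# observations at each end of the interval.
endpoints = [-1, -1, -1, 1, 1, 1]
spread = [-1, -0.5, 0, 0, 0.5, 1]

print(det_info_matrix(endpoints))  # 36: larger determinant, smaller
print(det_info_matrix(spread))     # confidence region for (b0, b1)
```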

Abstract:

A human genome contains more than 20 000 protein-encoding genes. The human proteome, by contrast, has been estimated to be far more complex and dynamic. The most powerful tool for studying proteins today is mass spectrometry (MS). MS-based proteomics rests on measuring the masses of charged peptide ions in the gas phase. The peptide amino acid sequence can be deduced, and matching proteins found, using software that correlates MS data with sequence database information. Quantitative proteomics allows the estimation of the absolute or relative abundance of a given protein in a sample. Label-free quantification methods use the intrinsic MS peptide signals in the calculation of the quantitative values, enabling the comparison of peptide signals across numerous patient samples. In this work, a quantitative MS methodology was established to study aromatase-overexpressing (AROM+) male mouse liver and ovarian endometriosis tissue samples. The label-free quantitative proteomics workflow was optimized for sensitivity and robustness, allowing the quantification of 1500 proteins with a low coefficient of variation in both sample types. Additionally, five statistical methods were evaluated for use with label-free quantitative proteomics data. The proteome data were integrated with other omics datasets, such as mRNA microarray and metabolite datasets. As a result, an altered lipid metabolism was discovered in the liver of male AROM+ mice. The results suggest reduced beta oxidation of long-chain phospholipids in the liver and increased levels of pro-inflammatory fatty acids in the circulation in these mice. Conversely, in the endometriosis tissues, a set of proteins highly specific to ovarian endometrioma was discovered, many of them under the regulation of the growth factor TGF-β1. This finding supports subsequent biomarker verification in a larger number of endometriosis patient samples.
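A first-pass differential-abundance test of the kind used in label-free quantification can be sketched as follows. The intensities are invented, Welch's t statistic stands in for the study's five evaluated methods, and a real analysis would also correct for multiple testing across all quantified proteins:

```python
import math
import statistics

# Hypothetical log2 label-free intensities of one protein in two groups
# (e.g. AROM+ vs wild-type liver, five replicates each).
arom = [21.4, 21.9, 21.6, 22.0, 21.7]
wild = [20.1, 20.4, 19.9, 20.3, 20.2]

# Effect size: difference of group means on the log2 scale.
log2_fold_change = statistics.mean(arom) - statistics.mean(wild)

# Welch's t statistic: a standard first-pass test for differential abundance
# that does not assume equal variances in the two groups.
va, vw = statistics.variance(arom), statistics.variance(wild)
t = log2_fold_change / math.sqrt(va / len(arom) + vw / len(wild))

print(round(log2_fold_change, 2), round(t, 2))
```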

Abstract:

This MA thesis studied test anxiety related to English exams among Finnish upper secondary school students, along with the ways students try to cope with it. The purpose of the study was to investigate gender differences in test anxiety, the effects of test anxiety on academic performance, and the relationships between test anxiety, academic performance and coping strategies. Test anxiety and coping strategies were analysed as scores of questionnaire responses. Coping strategies comprised three categories: task-orientation and preparation, seeking social support, and avoidance. Academic performance was analysed as teacher ratings of general performance in English exams. In total, 67 subjects were studied, all Finnish general upper secondary school students. The data were collected using online questionnaires and were mainly quantitative, with some qualitative elements. The quantitative data were analysed using statistical methods. The results showed that females experienced statistically significantly more test anxiety than males. In addition, a statistically significant correlation was found between test anxiety levels and the academic performance ratings of the subjects: the higher the test anxiety score, the lower the academic performance rating. A meaningful correlation was found between test anxiety and seeking social support as a coping strategy: a higher test anxiety score was related to using social support as a coping strategy. However, no relationships were found between academic performance and the three coping strategies when the quantitative and qualitative data were analysed. Therefore, different coping strategies per se did not seem to be related to academic performance; instead, it was assumed that the effectiveness of coping strategies depends on individual differences.
In order to obtain more generalisable results and to gain more understanding of test anxiety and of coping with it, a larger number of subjects from different areas of Finland and of different ages could be examined in future studies. Moreover, cross-national and cross-cultural studies could provide valuable information. As a practical recommendation for educational purposes, the results of this study indicate that a more individualised approach is needed.
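The central quantitative step, correlating questionnaire anxiety scores with teacher performance ratings, can be sketched with invented data (the study's actual scores are not reproduced here):

```python
def pearson(x, y):
    """Pearson correlation coefficient, computed from first principles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical data: questionnaire anxiety scores and teacher ratings (1-5).
anxiety = [12, 25, 31, 8, 19, 27, 15, 33, 10, 22]
rating = [5, 3, 2, 5, 4, 3, 4, 2, 5, 3]

r = pearson(anxiety, rating)
print(round(r, 2))  # negative: higher anxiety, lower rated performance
```

A negative r of this kind is the pattern the study reports; significance would then be judged against the sample size.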

Abstract:

The prognosis of tooth loss is one of the main problems in clinical dental practice, and one of the main prognostic factors is the amount of bone support of the tooth, defined by the area of the intraosseous root surface. This quantity has been estimated by different research methodologies, with heterogeneous results. In this work we used planimetry with microtomography to calculate the root surface area (ASR) of a sample of five mandibular second premolars obtained from the Portuguese population, with the final objective of creating a statistical model to estimate the intraosseous root surface area from clinical indicators of bone loss. Finally, we propose a method for applying the results in practice. Data on root surface area, total tooth length (CT) and maximum mesiodistal crown dimension (MDeq) were used to establish the statistical relationships between variables and to define a multivariate normal distribution. A sample of 37 simulated observations, statistically identical to the data from the five-tooth sample, was then generated from the defined multivariate normal distribution. Five generalised linear models were fitted to the simulated data. The statistical model was selected according to criteria of fit, predictability, statistical power, parameter accuracy and information loss, and was validated by graphical residual analysis. Based on the results, we propose a three-stage method for estimating lost/remaining root surface area. In the first stage the statistical model is used to estimate the root surface area; in the second, the proportion (in deciles) of intraosseous root is estimated using an adapted Schei ruler; and in the third, the value obtained in the first stage is multiplied by a coefficient representing the proportion of root lost (ASRp) or remaining (ASRr) for the decile estimated in the second stage. The strength of this study was the application of validated statistical methodology to operationalise clinical data in the estimation of lost bone support. Its weaknesses are that the results apply only to mandibular second premolars and that clinical validation is lacking.
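The estimation step of the proposed method can be sketched as a straight-line fit. The measurements below are invented, and a single predictor (CT) stands in for the study's generalised linear models:

```python
# Hypothetical measurements for mandibular second premolars:
# total tooth length CT (mm) and root surface area ASR (mm^2).
ct = [20.5, 21.0, 21.8, 22.3, 23.1]
asr = [210.0, 218.0, 231.0, 240.0, 252.0]

# Ordinary least squares for ASR = intercept + slope * CT.
n = len(ct)
mx, my = sum(ct) / n, sum(asr) / n
slope = (sum((x - mx) * (y - my) for x, y in zip(ct, asr))
         / sum((x - mx) ** 2 for x in ct))
intercept = my - slope * mx

def predict_asr(tooth_length):
    """Stage-one estimate of total root surface area from tooth length."""
    return intercept + slope * tooth_length

print(round(slope, 1), round(predict_asr(22.0), 1))
```

In the proposed three-stage method, this stage-one estimate would then be multiplied by the decile coefficient (ASRp or ASRr) from the adapted Schei ruler.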

Abstract:

In order to address the increasing stakeholder requirements for environmentally sustainable products and processes, firms often need the participation of their supply chain partners. Green supply chain management has emerged as a set of managerial practices that integrate environmental issues into supply chain management. If implemented successfully, green supply chain management can be a way to achieve competitive advantage while enhancing the environmental sustainability of the firm. The overall purpose of this dissertation is to contribute to the discussion on green supply chain management practices from the perspective of their drivers and performance implications. The theoretical background arises from the literature on competitive strategy, firm performance and green supply chain management. The research questions are addressed by analysing firm-level data from manufacturing, trading and logistics firms operating in Finland. The empirical data comes from two consecutive Finland State of Logistics surveys in 2012 and 2014, combined with financial reporting data from external databases. The data is analysed with multiple statistical methods. First, the thesis contributes to the discussion of the drivers of GSCM practices. To enhance the understanding of the relationship between competitive strategy and GSCM practices, a conceptual tool to describe generic competitive strategy approaches was developed. The findings suggest that firms pursuing marketing differentiation are more likely to be able to compete by having only small environmental effects and by adopting a more advanced form of external green supply chain management, such as a combination of strong environmental collaboration and the increased environmental monitoring of suppliers. Furthermore, customer requirements for environmental sustainability are found to be an important driver in the implementation of internal GSCM practices. 
Firms can respond to this customer pressure by passing environmental requirements on to their suppliers, either through environmental collaboration or environmental monitoring. Second, this thesis adds value to the existing literature on the effects of green supply chain management practices on firm performance. The thesis provides support for the idea that there is a positive relationship between GSCM practices and firm performance and enhances the understanding of how different types of GSCM practices are related to 1) financial, 2) operational and 3) environmental performance in manufacturing and logistics. The empirical results suggest that while internal GSCM practices have the strongest effect on environmental performance, environmental collaboration with customers seems to be the most effective way to improve financial performance. In terms of operational performance, the findings were more mixed, suggesting that the operational performance of firms is more likely to be affected by firm characteristics than by the choices they make regarding their environmental collaboration. This thesis is also one of the first attempts to empirically analyse the relationship between GSCM practices and performance among logistics service providers. The findings also have managerial relevance. Management, especially in the manufacturing and logistics industries, may benefit from knowledge about which types of GSCM practice could provide the largest benefits in terms of different performance dimensions. This thesis also has implications for policy-makers and regulators regarding how to promote environmentally friendly activities among 1) manufacturing, 2) trading and 3) logistics firms.

Abstract:

In the early 1990s, documentalists began to take an interest in applying mathematical and statistical methods to bibliographic units. In 1917, F. J. Coles and Nellie B. Eales carried out the first study of a group of document titles, analysing them by country of origin (White, p. 35). In 1923, E. Wyndham Hulme was the first person to use the term "statistical bibliography", proposing the use of statistical methods to obtain parameters for understanding the process of written communication and the nature and course of the development of a discipline. To this end, he began by counting a number of documents and analysing several facets of the written communication employed in them (Ferrante, p. 201). In a paper written in 1969, Alan Pritchard proposed the term bibliometrics to replace Hulme's "statistical bibliography", arguing that the latter term was ambiguous, not very descriptive, and liable to be confused with pure statistics or with statistics about bibliographies. He defined bibliometrics as the application of mathematical and statistical methods to books and other documents (p. 348-349), and the term has been in use ever since.

Abstract:

Sequences of timestamped events are currently being generated across nearly every domain of data analytics, from e-commerce web logging to the electronic health records used by doctors and medical researchers. Every day, this data type is reviewed by humans who apply statistical tests, hoping to learn everything they can about how these processes work, why they break, and how they can be improved. To further uncover how these processes work the way they do, researchers often compare two groups, or cohorts, of event sequences to find the differences and similarities between outcomes and processes. With temporal event sequence data, this task is complex because of the variety of ways single events and sequences of events can differ between the two cohorts of records: the structure of the event sequences (e.g., event order, co-occurring events, or frequencies of events), the attributes of the events and records (e.g., gender of a patient), or metrics about the timestamps themselves (e.g., duration of an event). Running statistical tests to cover all these cases and determining which results are significant becomes cumbersome. Current visual analytics tools for comparing groups of event sequences emphasize a purely statistical or purely visual approach to comparison. Visual analytics tools leverage humans' ability to easily see patterns and anomalies that they were not expecting, but are limited by uncertainty in their findings. Statistical tools emphasize finding significant differences in the data, but often require researchers to have a concrete question and do not facilitate more general exploration of the data. Combining visual analytics tools with statistical methods leverages the benefits of both approaches for quicker and easier insight discovery.
Integrating statistics into a visualization tool presents many challenges on the frontend (e.g., displaying the results of many different metrics concisely) and in the backend (e.g., scalability challenges with running various metrics on multi-dimensional data at once). I begin by exploring the problem of comparing cohorts of event sequences and understanding the questions that analysts commonly ask in this task. From there, I demonstrate that combining automated statistics with an interactive user interface amplifies the benefits of both types of tools, thereby enabling analysts to conduct quicker and easier data exploration, hypothesis generation, and insight discovery. The direct contributions of this dissertation are: (1) a taxonomy of metrics for comparing cohorts of temporal event sequences, (2) a statistical framework for exploratory data analysis with a method I refer to as high-volume hypothesis testing (HVHT), (3) a family of visualizations and guidelines for interaction techniques that are useful for understanding and parsing the results, and (4) a user study, five long-term case studies, and five short-term case studies which demonstrate the utility and impact of these methods in various domains: four in the medical domain, one in web log analysis, two in education, and one each in social networks, sports analytics, and security. My dissertation contributes an understanding of how cohorts of temporal event sequences are commonly compared and the difficulties associated with applying and parsing the results of these metrics. It also contributes a set of visualizations, algorithms, and design guidelines for balancing automated statistics with user-driven analysis to guide users to significant, distinguishing features between cohorts. 
This work opens avenues for future research in comparing two or more groups of temporal event sequences, opening traditional machine learning and data mining techniques to user interaction, and extending the principles found in this dissertation to data types beyond temporal event sequences.
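A minimal sketch of the kind of cohort comparison this abstract describes: test every event type's prevalence (the fraction of records containing it) across two cohorts, then correct for the many simultaneous tests. All names here are illustrative, and a two-proportion z-test with a Bonferroni correction is a simplification, not the dissertation's actual high-volume hypothesis testing framework.

```python
import math
from collections import Counter

def two_proportion_pvalue(k1, n1, k2, n2):
    """Two-sided z-test for a difference in event prevalence between cohorts."""
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0  # identical prevalence in both cohorts
    z = (p1 - p2) / se
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value

def compare_cohorts(cohort_a, cohort_b, alpha=0.05):
    """For each event type, test whether its prevalence differs between the
    cohorts; apply a Bonferroni correction across all event types tested.
    Each cohort is a list of event sequences (lists of event labels)."""
    count_a = Counter(e for seq in cohort_a for e in set(seq))
    count_b = Counter(e for seq in cohort_b for e in set(seq))
    events = sorted(set(count_a) | set(count_b))
    threshold = alpha / len(events)  # Bonferroni-corrected significance level
    results = {}
    for ev in events:
        p = two_proportion_pvalue(count_a[ev], len(cohort_a),
                                  count_b[ev], len(cohort_b))
        results[ev] = (p, p < threshold)
    return results
```

Even this toy version shows why the task becomes cumbersome by hand: the number of tests grows with the number of event types, and each additional metric (order, co-occurrence, duration) multiplies the battery of tests again.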

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The main aim of this study was to determine the impact of innovation on productivity in service sector companies, especially those in the hospitality sector, that regard the reduction of environmental impact as relevant to the innovation process. We used a structural analysis model based on the one developed by Crépon, Duguet, and Mairesse (1998), known as the CDM model (an acronym of the authors' surnames). Their work builds on seminal studies of the relationship between innovation and productivity (see Griliches 1979; Pakes and Griliches 1980). The main advantage of the CDM model is its ability to integrate the process of innovation and business productivity from an empirical perspective.
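The CDM approach links innovation inputs, innovation output, and productivity through a system of equations estimated in stages. A minimal two-stage sketch on synthetic firm data, assuming plain least squares throughout (the actual CDM model uses probit/Tobit stages with selection correction, and all variable names here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic firm-level data (log scale); coefficients chosen for illustration.
rd_spend = rng.normal(2.0, 0.5, n)   # log R&D expenditure
size = rng.normal(4.0, 1.0, n)       # log number of employees
innovation = 0.6 * rd_spend + 0.2 * size + rng.normal(0, 0.3, n)
productivity = 0.4 * innovation + 0.3 * size + rng.normal(0, 0.3, n)

# Stage 1: knowledge-production equation, innovation output on its inputs.
X1 = np.column_stack([np.ones(n), rd_spend, size])
b1, *_ = np.linalg.lstsq(X1, innovation, rcond=None)

# Stage 2: productivity equation. Feeding in *predicted* innovation output
# (rather than observed innovation) is what makes the system structural:
# the innovation term is purged of shocks it may share with productivity.
innovation_hat = X1 @ b1
X2 = np.column_stack([np.ones(n), innovation_hat, size])
b2, *_ = np.linalg.lstsq(X2, productivity, rcond=None)
```

With the synthetic data above, `b1[1]` recovers the R&D elasticity of innovation (about 0.6) and `b2[1]` the innovation elasticity of productivity (about 0.4), illustrating how the staged estimation separates the two relationships.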