846 results for Exploratory analysis
Abstract:
In 2003, Eurostat published an 'experimental' dataset on regional innovation levels derived from the Second Community Innovation Survey. This dataset, part of the European Innovation Scoreboard, also contains a range of regional labour market indicators. In this paper, we report an exploratory analysis of this data, focussing on how the labour market characteristics of regions shape regions' absorptive capacity (RACAP) and their ability to assimilate knowledge from public and externally conducted R&D. In particular, we aim to establish whether labour market aspects of RACAP are more important for innovation in prosperous or lagging regions of the European Union (EU). © Springer-Verlag 2006.
Abstract:
Very large spatially-referenced datasets, for example, those derived from satellite-based sensors which sample across the globe or large monitoring networks of individual sensors, are becoming increasingly common and more widely available for use in environmental decision making. In large or dense sensor networks, huge quantities of data can be collected over small time periods. In many applications the generation of maps, or predictions at specific locations, from the data in (near) real-time is crucial. Geostatistical operations such as interpolation are vital in this map-generation process, and in emergency situations the resulting predictions need to be available almost instantly, so that decision makers can make informed decisions and define risk and evacuation zones. It is also helpful when analysing data in less time-critical applications, for example when interacting directly with the data for exploratory analysis, that the algorithms are responsive within a reasonable time frame. Performing geostatistical analysis on such large spatial datasets can present a number of problems, particularly where maximum likelihood methods are used. Although the storage requirements only scale linearly with the number of observations in the dataset, the computational complexity in terms of memory and speed scales quadratically and cubically respectively. Most modern commodity hardware has at least two processor cores. Other mechanisms for allowing parallel computation, such as Grid-based systems, are also becoming increasingly available. However, currently there seems to be little interest in exploiting this extra processing power within the context of geostatistics. In this paper we review the existing parallel approaches for geostatistics. 
By recognising that different natural parallelisms exist and can be exploited depending on whether the dataset is sparsely or densely sampled with respect to the range of variation, we introduce two contrasting novel implementations of parallel algorithms based on approximating the data likelihood, extending the methods of Vecchia [1988] and Tresp [2000]. Using parallel maximum likelihood variogram estimation and parallel prediction algorithms we show that computational time can be significantly reduced. We demonstrate this with both sparsely sampled data and densely sampled data on a variety of architectures ranging from the common dual core processor, found in many modern desktop computers, to large multi-node supercomputers. To highlight the strengths and weaknesses of the different methods we employ synthetic data sets and go on to show how the methods allow maximum likelihood based inference on the exhaustive Walker Lake data set.
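The scaling argument in this abstract can be made concrete with a short worked formula. What follows is only a sketch under stated assumptions, not the authors' formulation: it assumes a Gaussian process model with observations $z \in \mathbb{R}^n$, mean $\mu$ and covariance matrix $K_\theta$ (these symbols do not appear in the abstract), and shows the general form of a Vecchia-style approximation in which each observation is conditioned on a small set $N(i)$ of at most $m$ earlier observations.

```latex
% Exact Gaussian log-likelihood: storing K_\theta costs O(n^2) memory,
% and factorising it costs O(n^3) time.
\ell(\theta) = -\tfrac{1}{2}\Big[\, n \log 2\pi + \log\lvert K_\theta\rvert
             + (z-\mu)^{\top} K_\theta^{-1} (z-\mu) \Big]

% Vecchia-style approximation (assumed notation): the exact chain-rule
% factorisation is truncated to small conditioning sets N(i).
p(z \mid \theta) = \prod_{i=1}^{n} p\big(z_i \mid z_1,\dots,z_{i-1},\theta\big)
  \;\approx\; \prod_{i=1}^{n} p\big(z_i \mid z_{N(i)},\theta\big),
  \qquad N(i) \subseteq \{1,\dots,i-1\},\ \lvert N(i)\rvert \le m
```

Each factor on the right involves a covariance matrix of size at most $(m+1) \times (m+1)$, and the factors can be evaluated independently, which is one reason approximations of this kind lend themselves to parallel evaluation.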
Abstract:
Enterprise Risk Management (ERM) and Knowledge Management (KM) both encompass top-down and bottom-up approaches to developing and embedding risk knowledge concepts and processes in strategy, policies, risk appetite definition, the decision-making process and business processes. The capacity to transfer risk knowledge affects all stakeholders, and understanding of the risk knowledge about the enterprise's value is a key requirement in order to identify protection strategies for business sustainability. There are various factors that affect this capacity for transferring and understanding. Previous work has established that there is a difference between the influence of KM variables on Risk Control and on the perceived value of ERM. Communication among groups appears as a significant variable in improving Risk Control but only as a weak factor in improving the perceived value of ERM. However, implementing the ERM mandate requires a clear understanding of risk management (RM) policies, actions and results, and the use of the integral view of RM as a governance and compliance program to support the value-driven management of the organization. Furthermore, ERM implementation demands better capabilities for unification of the criteria of risk analysis and alignment of policies and protection guidelines across the organization. These capabilities can be affected by risk knowledge sharing between the RM group and the Board of Directors and other executives in the organization. This research presents an exploratory analysis of risk knowledge transfer variables used in risk management practice. A survey of risk management executives from 65 firms in various industries was undertaken and 108 answers were analyzed. Potential relationships among the variables are investigated using descriptive statistics and multivariate statistical models. 
The level of understanding of risk management policies and reports by the board is related to the quality of the flow of communication in the firm and perceived level of integration of the risk policy in the business processes.
Abstract:
Visualising data for exploratory analysis is a major challenge in many applications. Visualisation allows scientists to gain insight into the structure and distribution of the data, for example finding common patterns and relationships between samples as well as variables. Typically, visualisation methods like principal component analysis and multi-dimensional scaling are employed. These methods are favoured because of their simplicity, but they cannot cope with missing data and it is difficult to incorporate prior knowledge about properties of the variable space into the analysis; this is particularly important in the high-dimensional, sparse datasets typical in geochemistry. In this paper we show how to utilise a block-structured correlation matrix using a modification of a well known non-linear probabilistic visualisation model, the Generative Topographic Mapping (GTM), which can cope with missing data. The block structure supports direct modelling of strongly correlated variables. We show that by including prior structural information it is possible to improve both the data visualisation and the model fit. These benefits are demonstrated on artificial data as well as a real geochemical dataset used for oil exploration, where the proposed modifications improved the missing data imputation results by 3 to 13%.
Abstract:
Exploratory analysis of petroleum geochemical data seeks to find common patterns to help distinguish between different source rocks, oils and gases, and to explain their source, maturity and any intra-reservoir alteration. However, at the outset, one is typically faced with (a) a large matrix of samples, each with a range of molecular and isotopic properties, (b) a spatially and temporally unrepresentative sampling pattern, (c) noisy data and (d) often, a large number of missing values. This inhibits analysis using conventional statistical methods. Typically, visualisation methods like principal components analysis are used, but these methods are not easily able to deal with missing data nor can they capture non-linear structure in the data. One approach to discovering complex, non-linear structure in the data is through the use of linked plots, or brushing, while ignoring the missing data. In this paper we introduce a complementary approach based on a non-linear probabilistic model. Generative topographic mapping enables the visualisation of the effects of very many variables on a single plot, while also dealing with missing data. We show how using generative topographic mapping also provides an optimal method with which to replace missing values in two geochemical datasets, particularly where a large proportion of the data is missing.
Abstract:
Managing supply chains effectively has become a critical element in enhancing company profitability and has been identified as the new frontier of competitive advantage. An important element of effective supply chain management is the strategic positioning of the company. The strategic positioning process is concerned with the choice of production-centred activities a company carries out internally and those provided externally. Strategic positioning within manufacturing supply chains, however, is a relatively recent research topic, with apparently few articles currently available that address the associated issues directly. Moreover, there is no previous research on the strategic positioning of manufacturing operations in a global context. The purpose of this paper is therefore to explore strategic positioning within global supply chains. The paper is based on three cases drawn from manufacturing companies across industry sectors. It describes an exploratory analysis aimed at gaining insight into the success factors for forming a strategic position within global supply chains.
Abstract:
To be competitive in contemporary turbulent environments, firms must be capable of processing huge amounts of information and effectively converting it into actionable knowledge. This is particularly the case in the marketing context, where problems are also usually highly complex, unstructured and ill-defined. In recent years, the development of marketing management support systems has paralleled this evolution in informational problems faced by managers, leading to a growth in the study (and use) of artificial intelligence and soft computing methodologies. Here, we present and implement a novel intelligent system that incorporates fuzzy logic and genetic algorithms to operate in an unsupervised manner. This approach allows the discovery of interesting association rules, which can be linguistically interpreted, in large scale databases (KDD, or Knowledge Discovery in Databases). We then demonstrate its application to a distribution channel problem. It is shown how the proposed system is able to return a number of novel and potentially interesting associations among variables. Thus, it is argued that our method has significant potential to improve the analysis of marketing and business databases in practice, especially in non-programmed decisional scenarios, as well as to assist scholarly researchers in their exploratory analysis. © 2013 Elsevier Inc.
Abstract:
Accessing knowledge from external organisations is important to firms, especially entrepreneurial ones, which often cannot generate internally all the knowledge necessary for innovation. There is, however, a lack of evidence concerning the association between the evolution of firms and the evolution of their networks. The aim of this paper is to begin to fill this gap by undertaking an exploratory analysis of the relationship between the vintage of firms and their knowledge sourcing networks. Drawing on an analysis of firms in the UK, the paper finds some evidence of a U-shaped relationship between firm age and the frequency of accessing knowledge from certain sources. Emerging entrepreneurial firms tend to be highly active with regard to accessing knowledge from a range of sources and geographic locations, with the rate of networking dropping somewhat during the period of peak firm growth. For instance, it is found that firms tend to access knowledge sources such as universities and research institutes in their own region less frequently during a stage of peak turnover growth. Overall, the results suggest a complex relationship between the lifecycle of the firm and its networking patterns. It is concluded that policymakers need to become more aware that network formation and utilisation by firms is likely to vary depending upon their lifecycle position.
Abstract:
Background and aims: Lixisenatide, a once-daily prandial glucagon-like peptide-1 receptor agonist, reduces postprandial (PP) glycaemic excursions and HbA1c. We report an exploratory analysis of the GetGoal-M and S trials in patients with type 2 diabetes mellitus (T2DM) with different changes in PP glucagon levels in response to lixisenatide treatment. Materials and methods: Patients (n=423) were stratified by their change in 2 hour PP glucagon level between baseline evaluation and Week 24 of treatment with lixisenatide as add-on to oral antidiabetics (OADs) into groups of Greater Change (GC; n=213) or Smaller Change (SC; n=210) in plasma glucagon levels (median change -23.57 ng/L). ANOVA and Chi-squared tests were used for the comparison of continuous and categorical variables, respectively. Baseline and endpoint continuous measurements in each group were compared using paired t-tests. Results: Mean change from baseline in 2 hour PP glucagon levels for the GC vs SC groups was -47.19 vs -0.59 ng/L (p<0.0001), respectively. Patients in the GC group had a shorter mean duration of diabetes (7.3 vs 9.0 years; p=0.0036) and a shorter duration of OAD use (4.5 vs 5.7 years; p=0.0092) than those in the SC group. Patients in the GC group had a greater mean reduction in HbA1c (-1.10 vs -0.67%; p<0.0001), fasting plasma glucose (FPG; -25.20 vs -9.30 mg/dL [p<0.0001]), PP plasma glucose (PPG; -129.40 vs -78.22 mg/dL [p<0.0001]), and a greater drop in weight (-2.27 vs -1.17 kg; p=0.0002) and body mass index (-0.84 vs -0.44 kg/m²; p=0.0002) than those in the SC group. More patients in the GC group also achieved composite endpoints, including HbA1c <7% with no symptomatic hypoglycaemia and no weight gain (40.38 vs 19.52%; p<0.0001), than in the SC group. 
Conclusion: Greater reductions in PP glucagon associated with lixisenatide as add-on to OADs in patients with T2DM are also associated with greater reductions in HbA1c, FPG, PPG, and greater weight loss, highlighting the importance of glucagon suppression on therapeutic response. Clinical Trial Registration Number: NCT00712673; NCT00713830. Supported by: Sanofi
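The categorical comparison described in the methods can be illustrated with a small calculation. This is a minimal sketch, not the trial's analysis: it assumes an uncorrected Pearson chi-squared test on a 2x2 table, and it back-calculates the responder counts from the reported percentages (40.38% of 213 and 19.52% of 210), which the abstract does not state directly. The function name `pearson_chi2_2x2` is hypothetical.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic and p-value for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail probability of a chi-squared
    # variable equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Group sizes are taken from the abstract; the responder counts are an
# assumption, recovered by rounding the reported percentages.
gc_total, sc_total = 213, 210
gc_yes = round(0.4038 * gc_total)  # 86 patients in the Greater Change group
sc_yes = round(0.1952 * sc_total)  # 41 patients in the Smaller Change group

chi2, p_value = pearson_chi2_2x2(gc_yes, gc_total - gc_yes,
                                 sc_yes, sc_total - sc_yes)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2e}")
```

Under these assumptions the p-value falls below 0.0001, which is consistent with the value reported for the composite endpoint.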
Abstract:
The primary aim of this dissertation is to develop data mining tools for knowledge discovery in biomedical data when multiple (homogeneous or heterogeneous) sources of data are available. The central hypothesis is that, when information from multiple sources of data is used appropriately and effectively, knowledge discovery can be better achieved than what is possible from only a single source. Recent advances in high-throughput technology have enabled biomedical researchers to generate large volumes of diverse types of data on a genome-wide scale. These data include DNA sequences, gene expression measurements, and much more; they provide the motivation for building analysis tools to elucidate the modular organization of the cell. The challenges include efficiently and accurately extracting information from the multiple data sources, representing the information effectively, developing analytical tools, and interpreting the results in the context of the domain. The first part considers the application of feature-level integration to design classifiers that discriminate between soil types. The machine learning tools, SVM and KNN, were used to successfully distinguish between several soil samples. The second part considers clustering using multiple heterogeneous data sources. The resulting Multi-Source Clustering (MSC) algorithm was shown to have a better performance than clustering methods that use only a single data source or a simple feature-level integration of heterogeneous data sources. The third part proposes a new approach to effectively incorporate incomplete data into clustering analysis. Adapted from the K-means algorithm, the Generalized Constrained Clustering (GCC) algorithm makes use of incomplete data in the form of constraints to perform exploratory analysis. Novel approaches for extracting constraints were proposed. For sufficiently large constraint sets, the GCC algorithm outperformed the MSC algorithm. 
The last part considers the problem of providing a theme-specific environment for mining multi-source biomedical data. The database called PlasmoTFBM, focusing on gene regulation of Plasmodium falciparum, contains diverse information and has a simple interface to allow biologists to explore the data. It provided a framework for comparing different analytical tools for predicting regulatory elements and for designing useful data mining tools. The conclusion is that the experiments reported in this dissertation strongly support the central hypothesis.
Abstract:
In outsourcing relationships with China, the Electronic Manufacturing (EM) and Information Technology Services (ITS) industry in Taiwan may possess such advantages as the continuing growth of its production value, complete manufacturing supply chain, low production cost and a large-scale Chinese market, and language and culture similarity compared to outsourcing to other countries. Nevertheless, the Council for Economic Planning and Development of Executive Yuan (CEPD) found that Taiwan's IT services outsourcing to China is subject to certain constraints and might not be as successful as the EM outsourcing (Aggarwal, 2003; CEPD, 2004a; CIER, 2003; Einhorn and Kriplani, 2003; Kumar and Zhu, 2006; Li and Gao, 2003; MIC, 2006). Some studies examined this issue, but failed to (1) provide statistical evidence about lower prevalence rates of IT services outsourcing, and (2) clearly explain the lower prevalence rates of IT services outsourcing by identifying similarities and differences between both types of outsourcing contexts. This research seeks to fill that gap and possibly provide potential strategic guidelines to ITS firms in Taiwan. This study adopts Transaction Cost Economics (TCE) as the theoretical basis. The basic premise is that different types of outsourcing activities may incur differing transaction costs and realize varying degrees of outsourcing success due to differential attributes of the transactions in the outsourcing process. 
Using primary data gathered from questionnaire surveys of ninety-two firms, the results from exploratory analysis and binary logistic regression indicated that (1) when outsourcing to China, Taiwanese firms' ITS outsourcing tends to have a higher level of asset specificity, uncertainty and technical skills relative to EM outsourcing, and these features indirectly reduce firms' outsourcing prevalence rates via their direct positive impacts on transaction costs; (2) Taiwanese firms' ITS outsourcing tends to have a lower level of transaction structurability relative to EM outsourcing, and this feature indirectly increases firms' outsourcing prevalence rates via its direct negative impacts on transaction costs; (3) frequency does influence firms' transaction costs in ITS outsourcing positively, but does not affect their outsourcing prevalence rates; (4) relatedness does influence firms' transaction costs positively and prevalence rates negatively in ITS outsourcing, but its impacts on the prevalence rates are not caused by the mediation effects of transaction costs; and (5) the firm size of the outsourcing provider does not affect firms' transaction costs, but does affect their outsourcing prevalence rates in ITS outsourcing directly and positively. Using primary data gathered from face-to-face interviews of executives from seven firms, the results from inductive analysis indicated that (1) IT services outsourcing has lower prevalence rates than EM outsourcing, and (2) this result is mainly attributed to Taiwan's core competence in manufacturing and management and the higher overall transaction costs of IT services outsourcing. Specifically, there is not much difference between the two types of outsourcing context in the transaction characteristics of reputation and most aspects of overall comparison. Although there are some differences in the firm size of the outsourcing provider, the difference does not have an apparent impact on firms' overall transaction costs. 
The medium or above medium difference in the transaction characteristics of asset specificity, uncertainty, frequency, technical skills, transaction structurability, and relatedness has caused higher overall transaction costs for IT services outsourcing. This higher cost might cause lower prevalence rates for ITS outsourcing relative to EM outsourcing. Overall, the interview results are consistent with the statistical analyses and provide support to my expectation that in outsourcing to China, Taiwan's electronic manufacturing firms do have lower prevalence rates of IT services outsourcing relative to EM outsourcing due to higher transaction costs caused by certain attributes. To solve this problem, firms' management should aim at identifying alternative strategies and strive to reduce their overall transaction costs of IT services outsourcing by initiating appropriate strategies which fit their environment and needs.
Abstract:
This study aimed to assess ambient air quality in an urban area of Natal, capital of Rio Grande do Norte (latitude 5°49'29'' S and longitude 35°13'34'' W), by determining the metal concentrations in atmospheric particulate matter (PM10 and PM2.5) in the urban area of the city. Data were acquired from January to December 2012. Samples were collected on glass fiber filters by means of two high-volume samplers, one for PM2.5 (AGV PM2.5) and another for PM10 (AGV PM10). PM10 monthly averages ranged from 8.92 to 19.80 µg.m-3, with an annual average of 16.21 µg.m-3, and PM2.5 monthly averages ranged from 2.84 to 7.89 µg.m-3, with an annual average of 5.61 µg.m-3. The PM2.5 and PM10 concentrations were related to meteorological variables, and to obtain information on the effects of these variables on the concentration of PM, an exploratory analysis of the data using Principal Component Analysis (PCA) was performed. The results of the PCA showed that the concentration of PM decreases with increasing barometric pressure, wind direction, rainfall and relative humidity, and that the weekday variable has little influence compared with the meteorological variables. Filters containing particulate matter from six selected days were subjected to microwave digestion. After digestion, samples were analyzed by Inductively Coupled Plasma Mass Spectrometry (ICP-MS). The concentrations of the heavy metals Vanadium, Chromium, Manganese, Nickel, Copper, Arsenic and Lead were determined. The highest concentrations of metals were for Pb and Cu, whose average values were, respectively, 5.34 and 2.34 ng.m-3 in PM10 and 4.68 and 2.95 ng.m-3 in PM2.5. Concentrations of the metals V, Cr, Mn, Ni, and Cd were respectively 0.13, 0.39, 0.48, 0.45 and 0.03 ng.m-3 for the PM10 fraction and 0.05, 0.10, 0.10, 0.34 and 0.01 ng.m-3 for the PM2.5 fraction. The concentration of As was null for the two fractions.
Abstract:
Background The chronic cumulative nature of caries makes treatment needs a severe problem in adults. Despite the fact that oral diseases occur in social contexts, there are few studies using multilevel analyses focusing on treatment needs. Thus, considering the importance of context in explaining oral health related inequalities, this study aims to evaluate the social determinants of dental treatment needs in 35–44 year old Brazilian adults, assessing whether inequalities in needs are expressed at individual and contextual levels. Methods The dependent variables were based on the prevalence of normative dental treatment needs in adults: (a) restorative treatment; (b) tooth extraction and (c) prosthetic treatment. The independent variables at the first level were household income, formal education level, sex and race. At the second level, they were income, sanitation, infrastructure and house conditions. The city-level variables were the Human Development Index (HDI) and indicators related to health services. Exploratory analysis was performed evaluating the effect of each level by calculating Prevalence Ratios (PR). In addition, a three-level multilevel model was constructed for all outcomes to verify the effect of individual characteristics and also the influence of context. Results In relation to the need for restorative treatment, the main factors implicated were related to individual socioeconomic position; however, the city-level contextual effect should also be considered. Regarding the need for tooth extraction, the contextual effect does not seem to be important and, in relation to the need for prosthetic treatment, the final model showed effects at the individual level and the city level. Variables related to health services did not show significant effects. 
Conclusions Dental treatment needs related to primary care (restoration and tooth extraction) and secondary care (prosthesis) were strongly associated with individual socioeconomic position, mainly income and education, in Brazilian adults. In addition to this individual effect, a city-level contextual effect, represented by HDI, was also observed for need for restorations and prosthesis, but not for tooth extractions. These findings have important implications for the health policy especially for financing and planning, since the distribution of oral health resources must consider the inequalities in availability and affordability of dental care for all.
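The Prevalence Ratio used in the exploratory step of this study is a simple quotient of two prevalences. The sketch below only illustrates that arithmetic; the counts are hypothetical and are not taken from the study, and the function name `prevalence_ratio` is an assumption made for illustration.

```python
def prevalence_ratio(cases_exposed, n_exposed, cases_reference, n_reference):
    """Ratio of the outcome prevalence in an exposed group to that in a reference group."""
    prevalence_exposed = cases_exposed / n_exposed
    prevalence_reference = cases_reference / n_reference
    return prevalence_exposed / prevalence_reference


# Hypothetical counts: adults needing restorative treatment in a
# lower-income group versus a higher-income reference group.
pr = prevalence_ratio(cases_exposed=120, n_exposed=200,
                      cases_reference=90, n_reference=300)
print(f"PR = {pr:.2f}")
```

A PR above 1 means the outcome is more prevalent in the exposed group than in the reference group; a multilevel model is still needed to separate individual effects from contextual ones.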
Abstract:
UNLABELLED: Infants born to HIV-1-infected mothers in resource-limited areas where replacement feeding is unsafe and impractical are repeatedly exposed to HIV-1 throughout breastfeeding. Despite this, the majority of infants do not contract HIV-1 postnatally, even in the absence of maternal antiretroviral therapy. This suggests that immune factors in breast milk of HIV-1-infected mothers help to limit vertical transmission. We compared the HIV-1 envelope-specific breast milk and plasma antibody responses of clade C HIV-1-infected postnatally transmitting and nontransmitting mothers in the control arm of the Malawi-based Breastfeeding Antiretrovirals and Nutrition Study using multivariable logistic regression modeling. We found no association between milk or plasma neutralization activity, antibody-dependent cell-mediated cytotoxicity, or HIV-1 envelope-specific IgG responses and postnatal transmission risk. While the envelope-specific breast milk and plasma IgA responses also did not reach significance in predicting postnatal transmission risk in the primary model after correction for multiple comparisons, subsequent exploratory analysis using two distinct assay methodologies demonstrated that the magnitudes of breast milk total and secretory IgA responses against a consensus HIV-1 envelope gp140 (B.con env03) were associated with reduced postnatal transmission risk. These results suggest a protective role for mucosal HIV-1 envelope-specific IgA responses in the context of postnatal virus transmission. This finding supports further investigations into the mechanisms by which mucosal IgA reduces risk of HIV-1 transmission via breast milk and into immune interventions aimed at enhancing this response. IMPORTANCE: Infants born to HIV-1-infected mothers are repeatedly exposed to the virus in breast milk. Remarkably, the transmission rate is low, suggesting that immune factors in the breast milk of HIV-1-infected mothers help to limit transmission. 
We compared the antibody responses in plasma and breast milk of HIV-1-transmitting and -nontransmitting mothers to identify responses that correlated with reduced risk of postnatal HIV-1 transmission. We found that neither plasma nor breast milk IgG antibody responses were associated with risk of HIV-1 transmission. In contrast, the magnitudes of the breast milk IgA and secretory IgA responses against HIV-1 envelope proteins were associated with reduced risk of postnatal HIV-1 transmission. The results of this study support further investigations of the mechanisms by which mucosal IgA may reduce the risk of HIV-1 transmission via breastfeeding and the development of strategies to enhance milk envelope-specific IgA responses to reduce mother-to-child HIV transmission and promote an HIV-free generation.
Abstract:
The complexity of modern geochemical data sets is increasing in several aspects (number of available samples, number of elements measured, number of matrices analysed, geological-environmental variability covered, etc), hence it is becoming increasingly necessary to apply statistical methods to elucidate their structure. This paper presents an exploratory analysis of one such complex data set, the Tellus geochemical soil survey of Northern Ireland (NI). This exploratory analysis is based on one of the most fundamental exploratory tools, principal component analysis (PCA) and its graphical representation as a biplot, albeit in several variations: the set of elements included (only major oxides vs. all observed elements), the prior transformation applied to the data (none, a standardization or a logratio transformation) and the way the covariance matrix between components is estimated (classical estimation vs. robust estimation). Results show that a log-ratio PCA (robust or classical) of all available elements is the most powerful exploratory setting, providing the following insights: the first two processes controlling the whole geochemical variation in NI soils are peat coverage and a contrast between “mafic” and “felsic” background lithologies; peat covered areas are detected as outliers by a robust analysis, and can be then filtered out if required for further modelling; and peat coverage intensity can be quantified with the %Br in the subcomposition (Br, Rb, Ni).
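Two operations mentioned in this abstract, a log-ratio transformation and the percentage of one element within a subcomposition, can be sketched in a few lines. This is an illustration under assumptions: the abstract does not say which log-ratio transformation was used, so the centred log-ratio (clr) is assumed here, and the concentrations are hypothetical values rather than Tellus survey data.

```python
import math


def closure(parts, total=100.0):
    """Rescale positive parts so that they sum to `total` (a subcomposition)."""
    s = sum(parts)
    return [total * p / s for p in parts]


def clr(parts):
    """Centred log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


# Hypothetical concentrations (mg/kg) for one soil sample.
sample = {"Br": 60.0, "Rb": 30.0, "Ni": 10.0}

subcomposition = closure(list(sample.values()))
pct_br = subcomposition[0]  # %Br within the (Br, Rb, Ni) subcomposition
clr_coords = clr(list(sample.values()))

print(f"%Br in subcomposition: {pct_br:.1f}")
print("clr coordinates:", [round(c, 3) for c in clr_coords])
```

The clr coordinates sum to zero and do not change when the sample is rescaled, so a PCA on them reflects relative rather than absolute abundances.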