932 results for data complexity
Abstract:
Solving pharmaceutical crystal structures from powder diffraction data is discussed in terms of the methodologies that have been applied and the complexity of the structures that have been solved. The principles underlying these methodologies are summarized and representative examples of polymorph, solvate, salt and cocrystal structure solutions are provided, together with examples of some particularly challenging structure determinations.
Abstract:
For users of climate services, the ability to quickly determine the datasets that best fit one's needs would be invaluable. The volume, variety and complexity of climate data make this judgment difficult. The ambition of CHARMe ("Characterization of metadata to enable high-quality climate services") is to give a wider interdisciplinary community access to a range of supporting information, such as journal articles, technical reports or feedback on previous applications of the data. The capture and discovery of this "commentary" information, often created by data users rather than data providers, and currently not linked to the data themselves, has not been significantly addressed previously. CHARMe applies the principles of Linked Data and open web standards to associate, record, search and publish user-derived annotations in a way that can be read both by users and automated systems. Tools have been developed within the CHARMe project that enable annotation capability for data delivery systems already in wide use for discovering climate data. In addition, the project has developed advanced tools for exploring data and commentary in innovative ways, including an interactive data explorer and comparator ("CHARMe Maps") and a tool for correlating climate time series with external "significant events" (e.g. instrument failures or large volcanic eruptions) that affect the data quality. Although the project focuses on climate science, the concepts are general and could be applied to other fields. All CHARMe system software is open-source, released under a liberal licence, permitting future projects to re-use the source code as they wish.
Abstract:
With the increase in e-commerce and the digitisation of design data and information, the construction sector has become reliant upon IT infrastructure and systems. The design and production process is more complex, more interconnected, and reliant upon greater information mobility, with seamless exchange of data and information in real time. Construction small and medium-sized enterprises (CSMEs), in particular the speciality contractors, can effectively utilise cost-effective collaboration-enabling technologies, such as cloud computing, to help in the effective transfer of information and data to improve productivity. The system dynamics (SD) approach offers a perspective and tools to enable a better understanding of the dynamics of complex systems. This research focuses upon system dynamics methodology as a modelling and analysis tool in order to understand and identify the key drivers in the absorption of cloud computing for CSMEs. The aim of this paper is to determine how the use of system dynamics (SD) can improve the management of information flow through collaborative technologies, leading to improved productivity. The data supporting the use of system dynamics was obtained through a pilot study consisting of questionnaires and interviews from five CSMEs in the UK house-building sector.
Abstract:
Conceptualisations of disability that emphasise the contextual and cultural nature of disability and the embodiment of these within a national system of data collection present a number of challenges especially where this process is devolved to schools. The requirement for measures based on contextual and subjective experiences gives rise to particular difficulties in achieving parity in the way data is analysed and reported. This paper presents an account of the testing of a tool intended for use by schools as they collect data from parents to identify children who meet the criteria of disability established in Disability Discrimination Acts (DDAs). Data were validated through interviews with parents and teachers and observations of children and highlighted the pivotal role of the criterion of impact. The findings are set in the context of schools meeting their legal duties to identify disabled children and their support needs in a way that captures the complexity of disabled children’s school lives and provides useful and useable data.
Abstract:
The size and complexity of data sets generated within ecosystem-level programmes merit their capture, curation, storage and analysis, synthesis and visualisation using Big Data approaches. This review looks at previous attempts to organise and analyse such data through the International Biological Programme and draws on the mistakes made and the lessons learned for effective Big Data approaches to current Research Councils United Kingdom (RCUK) ecosystem-level programmes, using Biodiversity and Ecosystem Service Sustainability (BESS) and Environmental Virtual Observatory Pilot (EVOp) as exemplars. The challenges raised by such data are identified and explored, and suggestions are made for the two major issues of extending analyses across different spatio-temporal scales and effectively integrating quantitative and qualitative data.
Abstract:
In Information Visualization, adding and removing data elements can strongly impact the underlying visual space. We have developed an inherently incremental technique (incBoard) that maintains a coherent disposition of elements from a dynamic multidimensional data set on a 2D grid as the set changes. Here, we introduce a novel layout that uses pairwise similarity from grid neighbors, as defined in incBoard, to reposition elements on the visual space, free from constraints imposed by the grid. The board continues to be updated and can be displayed alongside the new space. As similar items are placed together, while dissimilar neighbors are moved apart, it supports users in the identification of clusters and subsets of related elements. Densely populated areas identified in the incSpace can be efficiently explored with the corresponding incBoard visualization, which is not susceptible to occlusion. The solution remains inherently incremental and maintains a coherent disposition of elements, even for fully renewed sets. The algorithm considers relative positions for the initial placement of elements, and raw dissimilarity to fine tune the visualization. It has low computational cost, with complexity depending only on the size of the currently viewed subset, V. Thus, a data set of size N can be sequentially displayed in O(N) time, reaching O(N^2) only if the complete set is simultaneously displayed.
Complexity and anisotropy in host morphology make populations less susceptible to epidemic outbreaks
Abstract:
One of the challenges in epidemiology is to account for the complex morphological structure of hosts such as plant roots, crop fields, farms, cells, animal habitats and social networks, when the transmission of infection occurs between contiguous hosts. Morphological complexity brings an inherent heterogeneity in populations and affects the dynamics of pathogen spread in such systems. We have analysed the influence of realistically complex host morphology on the threshold for invasion and epidemic outbreak in an SIR (susceptible-infected-recovered) epidemiological model. We show that disorder expressed in the host morphology and anisotropy reduces the probability of epidemic outbreak and thus makes the system more resistant to epidemic outbreaks. We obtain general analytical estimates for minimally safe bounds for an invasion threshold and then illustrate their validity by considering an example of host data for branching hosts (salamander retinal ganglion cells). Several spatial arrangements of hosts with different degrees of heterogeneity have been considered in order to separately analyse the role of shape complexity and anisotropy in the host population. The estimates for invasion threshold are linked to morphological characteristics of the hosts that can be used for determining the threshold for invasion in practical applications.
Abstract:
We report 6 K-Ar ages and paleomagnetic data from 28 sites collected in Jurassic, Lower Cretaceous and Paleocene rocks of the Santa Marta massif, to test previous hypotheses of rotations and translations of this massif, whose rock assemblage differs from other basement-cored ranges adjacent to the Guyana margin. Three magnetic components were identified in this study. A first component has a direction parallel to the present magnetic field and was uncovered in all units (D = 352, I = 25.6, k = 57.35, a95 = 5.3, N = 12). A second component was isolated in Cretaceous limestone and Jurassic volcaniclastic rocks (D = 8.8, I = 8.3, k = 24.71, a95 = 13.7, N = 6), and it is interpreted as being of Early Cretaceous age. In Jurassic sites with this component, Early Cretaceous K-Ar ages obtained from this and previous studies are interpreted as reset ages. The third component was uncovered in eight sites of Jurassic volcaniclastic rocks, and its direction indicates negative shallow to moderate inclinations and northeastward declinations. K-Ar ages in these sites are of Early (196.5 +/- 4.9 Ma) to early Late Jurassic age (156.6 +/- 8.9 Ma). Due to local structural complexity and too few Cretaceous outcrops to perform a reliable unconformity test, we only used two sites with (1) K-Ar ages, (2) less structural complexity, and (3) reliable structural data for Jurassic and Cretaceous rocks. The mean direction of the Jurassic component is (D = 20.4, I = -18.2, k = 46.9, a95 = 5.1, n = 18 specimens from two sites). These paleomagnetic data support previous models of northward along-margin translations of Grenvillian-cored massifs. Additionally, clockwise vertical-axis rotation of this massif, with respect to the stable craton, is also documented; the sense of rotation is similar to that proposed for the Perija Range and other ranges of the southern Caribbean margin. More data are needed to confirm the magnitudes of rotations and translations. (C) 2009 Elsevier Ltd. All rights reserved.
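The quantities reported in parentheses above (D, I, k, a95, N) are the kind of summary that Fisher statistics produce for a set of directions. The abstract does not say how the authors computed them, so the following is only a minimal sketch of the standard Fisher (1953) mean, assuming declination/inclination pairs in degrees. The function name `fisher_mean` and the sample directions in the demo are hypothetical and are not data from the study.

```python
import math


def fisher_mean(directions, p=0.05):
    """Fisher mean of (declination, inclination) pairs given in degrees.

    Returns (mean_dec, mean_inc, k, alpha), where k is the precision
    parameter estimate and alpha is the half-angle of the (1 - p)
    confidence cone in degrees (a95 when p = 0.05).
    """
    n = len(directions)
    if n < 2:
        raise ValueError("need at least two directions")

    # Sum the unit vectors (x = north, y = east, z = down).
    x = y = z = 0.0
    for dec, inc in directions:
        d, i = math.radians(dec), math.radians(inc)
        x += math.cos(i) * math.cos(d)
        y += math.cos(i) * math.sin(d)
        z += math.sin(i)

    r = math.sqrt(x * x + y * y + z * z)  # resultant length, 0 <= r <= n
    if r == 0.0:
        raise ValueError("directions cancel out; mean is undefined")

    mean_dec = math.degrees(math.atan2(y, x)) % 360.0
    mean_inc = math.degrees(math.asin(max(-1.0, min(1.0, z / r))))

    if n - r < 1e-12:
        # All directions identical: infinite precision, zero-width cone.
        return mean_dec, mean_inc, math.inf, 0.0

    k = (n - 1) / (n - r)
    cos_alpha = 1.0 - ((n - r) / r) * ((1.0 / p) ** (1.0 / (n - 1)) - 1.0)
    alpha = math.degrees(math.acos(max(-1.0, min(1.0, cos_alpha))))
    return mean_dec, mean_inc, k, alpha


if __name__ == "__main__":
    # Hypothetical site directions, for illustration only.
    sample = [(15.0, -20.0), (22.0, -17.0), (25.0, -15.0), (18.0, -21.0)]
    print(fisher_mean(sample))
```

Tightly grouped directions give a large k and a small confidence cone, which is how the k and a95 values quoted for each component should be read.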
Abstract:
The advancement of GPS technology enables GPS devices not only to be used as orientation and navigation tools, but also to track travelled routes. GPS tracking data provides essential information for a broad range of urban planning applications such as transportation routing and planning, traffic management and environmental control. This paper describes the processing of data collected by tracking the cars of 316 volunteers over a seven-week period. Detailed information is extracted from these data. The processed data is further connected to the underlying road network by means of maps. Geographical maps are applied to check how the car movements match the road network. The maps capture the complexity of the car movements in the urban area. The results show that 90% of the trips on the plane match the road network within a tolerance.
Abstract:
Good data quality with high complexity is often considered important. Intuition says that the more accurate and complex the data are, the better the analytic solutions become, provided the increased computing time can be handled. However, for most practical computational problems, high-complexity data mean that computation times become too long or that the heuristics used to solve the problem have difficulty reaching good solutions. This is exacerbated as the size of the combinatorial problem increases. Consequently, we often need simplified data to deal with complex combinatorial problems. In this study we address the question of how the complexity and accuracy of a network affect the quality of heuristic solutions for different sizes of the combinatorial problem. We evaluate this question by applying the commonly used p-median model, which is used to find optimal locations in a network for p supply points that serve n demand points. To do so, we vary both the accuracy (the number of nodes) of the network and the size of the combinatorial problem (p). The investigation is conducted by means of a case study in Dalecarlia, a region in Sweden with an asymmetrically distributed population (15,000 weighted demand points). To locate 5 to 50 supply points we use the national transport administration's official road network (NVDB). The road network consists of 1.5 million nodes. To find the optimal location we start with 500 candidate nodes in the network and increase the number of candidate nodes in steps up to 67,000 (aggregated from the 1.5 million nodes). To find the optimal solution we use a simulated annealing algorithm with adaptive tuning of the temperature. The results show that there is limited improvement in the optimal solutions when the accuracy of the road network increases and the combinatorial problem is simple (low p). When the combinatorial problem is complex (large p), the improvements from increasing the accuracy of the road network are much larger. The results also show that the choice of the best network accuracy depends on the complexity of the combinatorial problem (varying p).
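To make the p-median setup above concrete, here is a minimal sketch of simulated annealing on a toy instance. It is not the authors' implementation: it uses a fixed geometric cooling schedule in place of the adaptive temperature tuning the abstract mentions, and the names `total_cost` and `p_median_sa`, the parameter defaults, and the six-point example are all assumptions made for illustration.

```python
import math
import random


def total_cost(medians, weights, dist):
    """Weighted sum of each demand point's distance to its nearest median.

    dist[i][j] is the distance from demand point i to candidate node j.
    """
    return sum(w * min(dist[i][m] for m in medians)
               for i, w in enumerate(weights))


def p_median_sa(dist, weights, p, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Heuristic p-median via simulated annealing with single-swap moves."""
    rng = random.Random(seed)
    n_candidates = len(dist[0])
    if not 0 < p < n_candidates:
        raise ValueError("p must be between 1 and the number of candidates - 1")

    current = rng.sample(range(n_candidates), p)
    cur_cost = total_cost(current, weights, dist)
    best, best_cost = list(current), cur_cost
    t = t0

    for _ in range(steps):
        # Neighbour: replace one chosen median with an unused candidate.
        unused = [c for c in range(n_candidates) if c not in current]
        trial = list(current)
        trial[rng.randrange(p)] = rng.choice(unused)
        trial_cost = total_cost(trial, weights, dist)

        # Always accept improvements; accept worse moves with
        # probability exp(-delta / t), which shrinks as t cools.
        delta = trial_cost - cur_cost
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, cur_cost = trial, trial_cost
            if cur_cost < best_cost:
                best, best_cost = list(current), cur_cost

        t = max(t * cooling, 1e-9)  # geometric cooling (an assumption)

    return sorted(best), best_cost


if __name__ == "__main__":
    # Toy instance: six points on a line, each both a demand point and
    # a candidate node, all with unit weight.
    pos = [0, 1, 2, 10, 11, 12]
    dist = [[abs(a - b) for b in pos] for a in pos]
    print(p_median_sa(dist, [1] * len(pos), p=2))
```

Each iteration evaluates the full objective, so the cost per step grows with both the number of demand points and the number of candidate nodes. That is the trade-off between network accuracy and computing time that the study examines.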
Abstract:
Delineation of commuting regions has always been based on statistical units, often municipalities or wards. However, using these units has certain disadvantages as their land areas differ considerably. Much information is lost in the larger spatial base units and distortions in self-containment values, the main criterion in rule-based delineation procedures, occur. Alternatively, one can start from relatively small standard size units such as hexagons. In this way, much greater detail in spatial patterns is obtained. In this paper, regions are built by means of intrazonal maximization (Intramax) on the basis of hexagons. The use of geoprocessing tools, specifically developed for the processing of commuting data, speeds up processing time considerably. The results of the Intramax analysis are evaluated with travel-to-work area constraints, and comparisons are made with commuting fields, accessibility to employment, commuting flow density and network commuting flow size. From selected steps in the regionalization process, a hierarchy of nested commuting regions emerges, revealing the complexity of commuting patterns.
Abstract:
As scientific workflows and the data they operate on grow in size and complexity, the task of defining how those workflows should execute (which resources to use, where the resources must be in readiness for processing, etc.) becomes proportionally more difficult. While "workflow compilers", such as Pegasus, reduce this burden, a further problem arises: since specifying details of execution is now automatic, a workflow's results are harder to interpret, as they are partly due to specifics of execution. By automating steps between the experiment design and its results, we lose the connection between them, hindering interpretation of results. To reconnect the scientific data with the original experiment, we argue that scientists should have access to the full provenance of their data, including not only parameters, inputs and intermediary data, but also the abstract experiment, refined into a concrete execution by the "workflow compiler". In this paper, we describe preliminary work on adapting Pegasus to capture the process of workflow refinement in the PASOA provenance system.
Abstract:
In this research the 3DVAR data assimilation scheme is implemented in the numerical model DIVAST in order to optimize the performance of the numerical model by selecting an appropriate turbulence scheme and tuning its parameters. Two turbulence closure schemes, the Prandtl mixing length model and the two-equation k-ε model, were incorporated into DIVAST and examined with respect to their universality of application, complexity of solutions, computational efficiency and numerical stability. A square harbour with one symmetrical entrance subject to tide-induced flows was selected to investigate the structure of turbulent flows. The experimental part of the research was conducted in a tidal basin. A significant advantage of such a laboratory experiment is a fully controlled environment where domain setup and forcing are user-defined. The research shows that the Prandtl mixing length model and the two-equation k-ε model, with default parameterization predefined according to literature recommendations, overestimate eddy viscosity, which in turn results in a significant underestimation of velocity magnitudes in the harbour. The data assimilation of the model-predicted velocity and laboratory observations significantly improves model predictions for both turbulence models by adjusting modelled flows in the harbour to match de-errored observations. 3DVAR also makes it possible to identify and quantify shortcomings of the numerical model. Such comprehensive analysis gives an optimal solution based on which numerical model parameters can be estimated. The process of turbulence model optimization by reparameterization and tuning towards an optimal state led to new constants that may potentially be applied to complex turbulent flows, such as rapidly developing flows or recirculating flows.
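For orientation, 3DVAR is usually stated as the minimization of a cost function that balances the model prediction against the observations. The abstract does not give the formulation used with DIVAST, so the textbook form below is offered only as an assumed reference point. All symbols are notation introduced here: x is the state being estimated (for example the velocity field), x_b the model-predicted background state, y the laboratory observations, H the operator mapping the state to observation space, and B and R the background and observation error covariance matrices.

```latex
J(\mathbf{x}) =
  \tfrac{1}{2}\,(\mathbf{x}-\mathbf{x}_b)^{\mathsf{T}}\,\mathbf{B}^{-1}\,(\mathbf{x}-\mathbf{x}_b)
  + \tfrac{1}{2}\,\bigl(\mathbf{y}-H(\mathbf{x})\bigr)^{\mathsf{T}}\,\mathbf{R}^{-1}\,\bigl(\mathbf{y}-H(\mathbf{x})\bigr)
```

The minimizer of J is the analysis state. The first term penalizes departure from the model prediction and the second penalizes misfit to the observations, each weighted by the inverse of its error covariance.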
Abstract:
Online geographic databases have been growing steadily as they have become a crucial source of information for both social networks and safety-critical systems. Since the quality of such applications is largely related to the richness and completeness of their data, it becomes imperative to develop adaptable and persistent storage systems, able to make use of several sources of information as well as enabling the fastest possible response from them. This work will create a shared and extensible geographic model, able to retrieve and store information from the major spatial sources available. A geographic-based system also has very high requirements in terms of scalability, computational power and domain complexity, causing several difficulties for a traditional relational database as the number of results increases. NoSQL systems provide valuable advantages for this scenario, in particular graph databases, which are capable of modeling vast amounts of inter-connected data while providing a very substantial increase in performance for several spatial requests, such as finding shortest-path routes and performing relationship lookups with high concurrency. In this work, we will analyze the current state of geographic information systems and develop a unified geographic model, named GeoPlace Explorer (GE). GE is able to import and store spatial data from several online sources at a symbolic level in both a relational and a graph database, where several stress tests were performed in order to find the advantages and disadvantages of each database paradigm.
Abstract:
The increasing use of fossil fuels, in line with the demographic explosion of cities, has a huge environmental impact on society. To mitigate these social impacts, regulatory requirements have positively influenced the environmental consciousness of society as well as the strategic behavior of businesses. Along with this environmental awareness, regulatory bodies have formulated new laws to control potentially polluting activities, mostly in the gas station sector. Seeking greater market competitiveness, this sector needs to respond quickly to internal and external pressures, adapting strategically to the new standards required to obtain the Green Badge. Gas stations have incorporated new strategies to attract and retain customers, who present increasing social demands. In the social dimension, these projects help the local economy by generating jobs and income distribution. The present research aims to align the social, economic and environmental dimensions to set sustainable performance indicators for the gas station sector in the city of Natal/RN. The Sustainable Balanced Scorecard (SBSC) framework was created with a set of indicators for mapping the production process of gas stations. This mapping aimed at identifying operational inefficiencies through multidimensional indicators. To carry out this research, a system for evaluating sustainability performance was developed, applying Data Envelopment Analysis (DEA) through a quantitative approach to detect the system's efficiency level. In order to understand the systemic complexity, sub-organizational processes were analyzed with Network Data Envelopment Analysis (NDEA), mapping their micro-activities to identify and diagnose the real causes of overall inefficiency. The sample comprised 33 gas stations and the conceptual model included 15 indicators distributed across the three dimensions of sustainability: social, environmental and economic. These three dimensions were measured by means of classical input-oriented DEA-CCR models. To unify the performance scores of the individual dimensions, a single grouping index was designed based on two means: arithmetic and weighted. After this, another analysis was performed to measure the four perspectives of the SBSC (learning and growth, internal processes, customers, and financial), unified by averaging the performance scores. NDEA results showed that no company was assessed as excellent in sustainability performance. Some gas stations with higher NDEA efficiency proved to be inefficient under certain perspectives of the SBSC. Subsequently, a comparative sustainable performance assessment among the gas stations was carried out, enabling entrepreneurs to evaluate their performance against market competitors. Diagnoses were also obtained to support entrepreneurs' decision making in improving the management of organizational resources and to provide guidelines for regulators. Finally, the average index of sustainable performance was 69.42%, representing the gas stations' efforts toward environmental suitability. These results point to significant awareness in this segment, but further action is still needed to enhance sustainability in the long term.