851 results for data types and operators
Abstract:
The idea of extracting knowledge in process mining is a descendant of data mining. Both mining disciplines emphasise data flow and relations among elements in the data. Unfortunately, challenges have been encountered when working with the data flow and relations. One of the challenges is that the representation of the data flow between a pair of elements or tasks is insufficiently simplified and formulated, as it considers only a one-to-one data flow relation. In this paper, we discuss how the effectiveness of knowledge representation can be extended in both disciplines. To this end, we introduce a new representation of the data flow and dependency formulation using a flow graph. The flow graph solves the issue of the insufficiency of presenting other relation types, such as many-to-one and one-to-many relations. As an experiment, a new evaluation framework is applied to the Teleclaim process in order to show how this method can provide us with more precise results when compared with other representations.
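The difference between one-to-one and richer data-flow relations can be sketched in a few lines. This is only an illustration under stated assumptions, not the formulation used in the paper: the `DataFlow` class, the `flow` helper and the task names are hypothetical. Each relation links a set of source tasks to a set of target tasks, so one-to-many and many-to-one flows can be expressed as single edges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFlow:
    """One data-flow relation from a set of source tasks to a set of target tasks."""
    sources: frozenset
    targets: frozenset

    def kind(self) -> str:
        """Classify the relation by how many tasks sit on each side."""
        left = "many" if len(self.sources) > 1 else "one"
        right = "many" if len(self.targets) > 1 else "one"
        return f"{left}-to-{right}"


def flow(sources, targets):
    """Hypothetical helper: build a DataFlow from plain iterables of task names."""
    return DataFlow(frozenset(sources), frozenset(targets))


# Hypothetical task names, chosen only for illustration.
flows = [
    flow({"register_claim"}, {"check_policy"}),
    flow({"register_claim"}, {"check_policy", "check_damage"}),
    flow({"check_policy", "check_damage"}, {"decide_claim"}),
]

kinds = [f.kind() for f in flows]
print(kinds)
```

A representation restricted to pairs of tasks would need two separate edges for the second and third relations and would lose the fact that the tasks take part in the flow together.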
Abstract:
Defining types of seafloor substrate and relating them to the distribution of fish and invertebrates is an important but difficult goal. An examination of the processing steps of a commercial acoustics analyzing software program, as well as the data values produced by the proprietary first echo measurements, revealed potential benefits and drawbacks for distinguishing acoustically distinct seafloor substrates. The positive aspects were convenient processing steps such as gain adjustment, accurate bottom picking, ease of bad data exclusion, and the ability to average across successive pings in order to increase the signal-to-noise ratio. A noteworthy drawback with the processing was the potential for accidental inclusion of a second echo as if it were part of the first echo. Detailed examination of the echogram measurements quantified the amount of collinearity, revealed the lack of standardization (subtraction of mean, division by standard deviation) before principal components analysis (PCA), and showed correlations of individual echogram measurements with depth and seafloor slope. Despite the facility of the software, these previously unknown processing pitfalls and echogram measurement characteristics may have created data artifacts that generated user-derived substrate classifications, rather than actual seafloor substrate types.
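The standardization step mentioned above (subtract the mean, divide by the standard deviation) can be illustrated with a small sketch. It does not reproduce the software's processing or run a full PCA. It only shows how a variable on a large scale dominates the total variance that PCA decomposes unless the columns are standardized first. The variable names and values are hypothetical.

```python
import statistics


def standardize(column):
    """Subtract the mean and divide by the sample standard deviation."""
    mean = statistics.mean(column)
    sd = statistics.stdev(column)
    return [(x - mean) / sd for x in column]


def variance_shares(columns):
    """Share of the total variance contributed by each column."""
    variances = [statistics.variance(c) for c in columns]
    total = sum(variances)
    return [v / total for v in variances]


# Hypothetical measurements on very different scales (illustrative values only).
echo_energy = [1200.0, 1500.0, 900.0, 1800.0, 1100.0]
echo_shape = [0.12, 0.15, 0.11, 0.18, 0.10]

raw_shares = variance_shares([echo_energy, echo_shape])
std_shares = variance_shares([standardize(echo_energy), standardize(echo_shape)])

print(raw_shares)  # the large-scale variable carries almost all of the variance
print(std_shares)  # after standardization each variable contributes equally
```

Without standardization, the first principal component would mostly track whichever measurement happens to have the largest numeric range.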
Abstract:
CAD software can be structured as a set of modular 'software tools' only if there is some agreement on the data structures which are to be passed between tools. Beyond this basic requirement, it is desirable to give the agreed structures the status of 'data types' in the language used for interactive design. The ultimate refinement is to have a data management capability which 'understands' how to manipulate such data types. In this paper the requirements of CACSD are formulated from the point of view of Database Management Systems. Progress towards meeting these requirements in both the DBMS and the CACSD community is reviewed. The conclusion reached is that there has been considerable movement towards the realisation of software tools for CACSD, but that this owes more to modern ideas about programming languages, than to DBMS developments. The DBMS field has identified some useful concepts, but further significant progress is expected to come from the exploitation of concepts such as object-oriented programming, logic programming, or functional programming.
Abstract:
An experiment was undertaken in order to investigate the use of sago palm starch, gum arabic and carrageenans as binders in prawn diets. Water stability data are presented; EPT-2 carrageenan was found to be the best binder for both steamed and unsteamed pellets.
Abstract:
Knowledge-elicitation is a common technique used to produce rules about the operation of a plant from the knowledge that is available from human expertise. Similarly, data-mining is becoming a popular technique to extract rules from the data available from the operation of a plant. In the work reported here knowledge was required to enable the supervisory control of an aluminium hot strip mill by the determination of mill set-points. A method was developed to fuse knowledge-elicitation and data-mining to incorporate the best aspects of each technique, whilst avoiding known problems. Utilisation of the knowledge was through an expert system, which determined schedules of set-points and provided information to human operators. The results show that the method proposed in this paper was effective in producing rules for the on-line control of a complex industrial process.
Abstract:
This study focuses on the occurrence and type of clouds observed in West Africa, a subject which has neither been much documented nor quantified. It takes advantage of data collected above Niamey in 2006 with the ARM mobile facility. A survey of cloud characteristics inferred from ground measurements is presented with a focus on their seasonal evolution and diurnal cycle. Four types of clouds are distinguished: high-level clouds, deep convective clouds, shallow convective clouds and mid-level clouds. A frequent occurrence of the latter clouds located at the top of the Saharan Air Layer is highlighted. High-level clouds are ubiquitous throughout the period whereas shallow convective clouds are mainly noticeable during the core of the monsoon. The diurnal cycle of each cloud category and its seasonal evolution is investigated. CloudSat and CALIPSO data are used in order to demonstrate that these four cloud types (in addition to stratocumulus clouds over the ocean) are not a particularity of the Niamey region and that mid-level clouds are present over the Sahara during most of the Monsoon season. Moreover, using complementary data sets, the radiative impact of each type of clouds at the surface level has been quantified in the shortwave and longwave domain. Mid-level clouds and anvil clouds have the largest impact respectively in longwave (about 15 W m−2) and the shortwave (about 150 W m−2). Furthermore, mid-level clouds exert a strong radiative forcing in Spring at a time when the other cloud types are less numerous.
Abstract:
OBJECTIVES: The prediction of protein structure and the precise understanding of protein folding and unfolding processes remain among the greatest challenges in structural biology and bioinformatics. Computer simulations based on molecular dynamics (MD) are at the forefront of the effort to gain a deeper understanding of these complex processes. Currently, these MD simulations are usually on the order of tens of nanoseconds, generate a large amount of conformational data and are computationally expensive. More and more groups run such simulations and generate a myriad of data, which raises new challenges in managing and analyzing these data. Because of the vast range of proteins researchers want to study and simulate, the computational effort needed to generate data, the large data volumes involved, and the different types of analyses scientists need to perform, it is desirable to provide a public repository allowing researchers to pool and share protein unfolding data. METHODS: To adequately organize, manage, and analyze the data generated by unfolding simulation studies, we designed a data warehouse system that is embedded in a grid environment to facilitate the seamless sharing of available computer resources and thus enable many groups to share complex molecular dynamics simulations on a more regular basis. RESULTS: To gain insight into the conformational fluctuations and stability of the monomeric forms of the amyloidogenic protein transthyretin (TTR), molecular dynamics unfolding simulations of the monomer of human TTR have been conducted. Trajectory data and meta-data of the wild-type (WT) protein and the highly amyloidogenic variant L55P-TTR represent the test case for the data warehouse. CONCLUSIONS: Web and grid services, especially pre-defined data mining services that can run on or 'near' the data repository of the data warehouse, are likely to play a pivotal role in the analysis of molecular dynamics unfolding data.
Abstract:
The occurrence of wind storms in Central Europe is investigated with respect to large-scale atmospheric flow and local wind speeds in the investigation area. Two different methods of storm identification are applied for Central Europe as the target region: one based on characteristics of large-scale flow (circulation weather types, CWT) and the other on the occurrence of extreme wind speeds. The identified events are examined with respect to the NAO phases and CWTs under which they occur. Pressure patterns, wind speeds and cyclone tracks are investigated for storms assigned to different CWTs. Investigations are based on ERA40 reanalysis data. It is shown that about 80% of the storm days in Central Europe are connected with westerly flow and that Central European storm events primarily occur during a moderately positive NAO phase, while strongly positive NAO phases (6.4% of all days) account for more than 20% of the storms. A storm occurs over Central Europe during about 10% of the days with a strong positive NAO index. The most frequent pathway of cyclone systems associated with storms over Central Europe leads from the North Atlantic over the British Isles, North Sea and southern Scandinavia into the Baltic Sea. The mean intensity of the systems typically reaches its maximum near the British Isles. Differences between the characteristics for storms identified from the CWT identification procedure (gale days, based on MSLP fields) and those from extreme winds at Central European grid points are small, even though only 70% of the storm days agree. While most storms occur during westerly flow situations, specific characteristics of storms during the other CWTs are also considered. Copyright © 2009 Royal Meteorological Society
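The three percentages reported for the strongly positive NAO phase can be combined with Bayes' rule as a rough consistency check. This is a back-of-the-envelope sketch, not a result from the paper. It assumes that "more than 20%" and "about 10%" can be read as 0.20 and 0.10, and the variable names are hypothetical. Because the abstract gives 20% as a lower bound, the implied storm-day frequency is best read as an approximate upper bound.

```python
# Figures quoted in the abstract, rounded as stated in the lead-in (assumption).
p_nao = 0.064             # share of all days with a strongly positive NAO phase
p_nao_given_storm = 0.20  # share of storms occurring in that phase ("more than 20%")
p_storm_given_nao = 0.10  # share of strongly positive NAO days with a storm ("about 10%")

# Bayes' rule: P(storm) = P(storm | NAO) * P(NAO) / P(NAO | storm)
p_storm = p_storm_given_nao * p_nao / p_nao_given_storm

print(f"Implied share of storm days: {p_storm:.3f}")
```

Under these rounded inputs, storm days would make up roughly 3% of all days.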
Abstract:
The aim of this in vitro study was to evaluate the effect of different bur types and acid etching protocols on the shear bond strength (SBS) of a resin-modified glass ionomer cement (RM-GIC) to primary dentin. Forty-eight clinically sound human primary molars were selected and randomly assigned to four groups (n=12). In G1, the lingual surface of the teeth was cut with a carbide bur until a 2.0-mm-diameter dentin area was exposed, followed by the application of RM-GIC (Vitremer - 3M/ESPE) prepared according to the manufacturer's instructions. The specimens of G2 received the same treatment as G1, except that the dentin was conditioned with phosphoric acid. In groups G3 and G4, the same procedures as in G1 and G2, respectively, were followed, but the dentin was cut with a diamond bur. The specimens were stored in distilled water at 37 °C for 24 h and then tested in a universal testing machine. SBS data were submitted to 2-way ANOVA (α = 5%) and indicated that SBS values of RM-GIC bonded to primary dentin cut with different burs were not statistically different, but the specimens conditioned with phosphoric acid presented SBS values significantly higher than those without conditioning. To observe the micromorphologic effects of cutting the dentin surface with diamond or carbide rotary instruments and of the conditioning treatment, some specimens were examined by scanning electron microscopy. A smear layer was present in all specimens regardless of the type of rotary instrument used for dentin cutting, and specimens etched with phosphoric acid presented more effective removal of the smear layer. It was concluded that the SBS of an RM-GIC to primary dentin was affected by acid conditioning, but the bur type had no influence.
Abstract:
Each plasma physics laboratory has its own proprietary control and data acquisition system, which usually differs from one laboratory to another. This means that each laboratory has its own way of controlling the experiment and retrieving data from the database. Fusion research relies to a great extent on international collaboration, and such private systems make it difficult to follow the work remotely. The TCABR data analysis and acquisition system has been upgraded to support a joint research programme using remote participation technologies. The choice of MDSplus (Model Driven System plus) is justified by the fact that it is widely used, so scientists from different institutions can use the same system in different experiments on different tokamaks without needing to know how each one handles data acquisition and data analysis. Another important point is that MDSplus has a library system that allows communication between different programming languages (Java, Fortran, C, C++, Python) and programs such as MATLAB, IDL and Octave. In the case of the TCABR tokamak, interfaces (the subject of this paper) between the system already in use and MDSplus were developed, instead of using MDSplus at all stages, from control and data acquisition to data analysis. This was done to preserve a complex system already in operation, which would otherwise take a long time to migrate. This implementation also allows new components to be added that use MDSplus fully at all stages. (c) 2012 Elsevier B.V. All rights reserved.
Abstract:
We describe the current status of and provide performance results for a prototype compiler of Prolog to C, ciaocc. ciaocc is novel in that it is designed to accept different kinds of high-level information, typically obtained via an automatic analysis of the initial Prolog program and expressed in a standardized language of assertions. This information is used to optimize the resulting C code, which is then processed by an off-the-shelf C compiler. The basic translation process essentially mimics the unfolding of a bytecode emulator with respect to the particular bytecode corresponding to the Prolog program. This is facilitated by a flexible design of the instructions and their lower-level components. This approach allows reusing a sizable amount of the machinery of the bytecode emulator: predicates already written in C, data definitions, memory management routines and areas, etc., as well as mixing emulated bytecode with native code in a relatively straightforward way. We report on the performance of programs compiled by the current version of the system, both with and without analysis information.