949 resultados para data types and operators


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Data in germplasm collections contain a mixture of data types; binary, multistate and quantitative. Given the multivariate nature of these data, the pattern analysis methods of classification and ordination have been identified as suitable techniques for statistically evaluating the available diversity. The proximity (or resemblance) measure, which is in part the basis of the complementary nature of classification and ordination techniques, is often specific to particular data types. The use of a combined resemblance matrix has an advantage over data type specific proximity measures. This measure accommodates the different data types without manipulating them to be of a specific type. Descriptors are partitioned into their data types and an appropriate proximity measure is used on each. The separate proximity matrices, after range standardisation, are added as a weighted average and the combined resemblance matrix is then used for classification and ordination. Germplasm evaluation data for 831 accessions of groundnut (Arachis hypogaea L.) from the Australian Tropical Field Crops Genetic Resource Centre, Biloela, Queensland were examined. Data for four binary, five ordered multistate and seven quantitative descriptors have been documented. The interpretative value of different weightings - equal and unequal weighting of data types to obtain a combined resemblance matrix - was investigated by using principal co-ordinate analysis (ordination) and hierarchical cluster analysis. Equal weighting of data types was found to be more valuable for these data as the results provided a greater insight into the patterns of variability available in the Australian groundnut germplasm collection. The complementary nature of pattern analysis techniques enables plant breeders to identify relevant accessions in relation to the descriptors which distinguish amongst them. This additional information may provide plant breeders with a more defined entry point into the germplasm collection for identifying sources of variability for their plant improvement program, thus improving the utilisation of germplasm resources.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper we continue our investigation into the development of computational-science software based on the identification and formal specification of Abstract Data Types (ADTs) and their implementation in Fortran 90. In particular, we consider the consequences of using pointers when implementing a formally specified ADT in Fortran 90. Our aim is to highlight the resulting conflict between the goal of information hiding, which is central to the ADT methodology, and the space efficiency of the implementation. We show that the issue of storage recovery cannot be avoided by the ADT user, and present a range of implementations of a simple ADT to illustrate various approaches towards satisfactory storage management. Finally, we propose a set of guidelines for implementing ADTs using pointers in Fortran 90. These guidelines offer a way gracefully to provide disposal operations in Fortran 90. Such an approach is desirable since Fortran 90 does not provide automatic garbage collection which is offered by many object-oriented languages including Eiffel, Java, Smalltalk, and Simula.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Analysis and fusion of social measurements is important to understand what shapes the public’s opinion and the sustainability of the global development. However, modeling data collected from social responses is challenging as the data is typically complex and heterogeneous, which might take the form of stated facts, subjective assessment, choices, preferences or any combination thereof. Model-wise, these responses are a mixture of data types including binary, categorical, multicategorical, continuous, ordinal, count and rank data. The challenge is therefore to effectively handle mixed data in the a unified fusion framework in order to perform inference and analysis. To that end, this paper introduces eRBM (Embedded Restricted Boltzmann Machine) – a probabilistic latent variable model that can represent mixed data using a layer of hidden variables transparent across different types of data. The proposed model can comfortably support largescale data analysis tasks, including distribution modelling, data completion, prediction and visualisation. We demonstrate these versatile features on several moderate and large-scale publicly available social survey datasets.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

提出一种移动对象数据库模型——Dynamic Transportation Network Based Moving Objects Database(简称DTNMOD),并给出了DTNMOD中基于移动对象时空轨迹的网络实时动态交通流分析方法.在DTNMOD中,交通网络被表示成动态的时空网络,可以描述交通状态、拓扑结构以及交通参数随时间的变化过程;网络受限的移动对象则用网络移动点表示.DTNMOD模型包含了完整的数据类型和查询操作的定义,因此可以在任何可扩充数据库(如PostgreSQL或SECONDO)中实现,从而得到完整的数据库模型和查询语言.为了对相关模型的性能进行比较与分析,基于PostgreSQL实现了一个原型系统并进行了一系列的实验.实验结果表明,DTNMOD提供了良好的区域查询及连接查询性能.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Compared with structured data sources that are usually stored and analyzed in spreadsheets, relational databases, and single data tables, unstructured construction data sources such as text documents, site images, web pages, and project schedules have been less intensively studied due to additional challenges in data preparation, representation, and analysis. In this paper, our vision for data management and mining addressing such challenges are presented, together with related research results from previous work, as well as our recent developments of data mining on text-based, web-based, image-based, and network-based construction databases.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Thesis (M.A.)--University of Illinois at Urbana-Champaign.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A preliminary set of heavy metal analyses from surface sediment samples covering the whole Adriatic Basin is presented, and their significance in terms of pollution is discussed. The core samples were analysed for Fe, Mn, Cr, Cu, Ni, Pb, Zn, P, organic carbon, Ca- and Mg-carbonate, and their mineralogical composition and grain size distribution were determined. All heavy metal concentrations found can be attributed to natural sedimentological processes and are not necessarily to be interpreted as indications of pollution.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Traffic congestion has a significant impact on the economy and environment. Encouraging the use of multimodal transport (public transport, bicycle, park’n’ride, etc.) has been identified by traffic operators as a good strategy to tackle congestion issues and its detrimental environmental impacts. A multi-modal and multi-objective trip planner provides users with various multi-modal options optimised on objectives that they prefer (cheapest, fastest, safest, etc) and has a potential to reduce congestion on both a temporal and spatial scale. The computation of multi-modal and multi-objective trips is a complicated mathematical problem, as it must integrate and utilize a diverse range of large data sets, including both road network information and public transport schedules, as well as optimising for a number of competing objectives, where fully optimising for one objective, such as travel time, can adversely affect other objectives, such as cost. The relationship between these objectives can also be quite subjective, as their priorities will vary from user to user. This paper will first outline the various data requirements and formats that are needed for the multi-modal multi-objective trip planner to operate, including static information about the physical infrastructure within Brisbane as well as real-time and historical data to predict traffic flow on the road network and the status of public transport. It will then present information on the graph data structures representing the road and public transport networks within Brisbane that are used in the trip planner to calculate optimal routes. This will allow for an investigation into the various shortest path algorithms that have been researched over the last few decades, and provide a foundation for the construction of the Multi-modal Multi-objective Trip Planner by the development of innovative new algorithms that can operate the large diverse data sets and competing objectives.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The quality of data collection methods selected and the integrity of the data collected are integral tot eh success of a study. This chapter focuses on data collection and study validity. After reading the chapter, readers should be able to define types of data collection methods in quantitative research; list advantages and disadvantages of each method; discuss factors related to internal and external validity; critically evaluate data collection methods and discuss the need to operationalise variables of interest for data collection.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper firstly presents the benefits and critical challenges on the use of Bluetooth and Wi-Fi for crowd data collection and monitoring. The major challenges include antenna characteristics, environment’s complexity and scanning features. Wi-Fi and Bluetooth are compared in this paper in terms of architecture, discovery time, popularity of use and signal strength. Type of antennas used and the environment’s complexity such as trees for outdoor and partitions for indoor spaces highly affect the scanning range. The aforementioned challenges are empirically evaluated by “real” experiments using Bluetooth and Wi-Fi Scanners. The issues related to the antenna characteristics are also highlighted by experimenting with different antenna types. Novel scanning approaches including Overlapped Zones and Single Point Multi-Range detection methods will be then presented and verified by real-world tests. These novel techniques will be applied for location identification of the MAC IDs captured that can extract more information about people movement dynamics.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

As a sequel to a paper that dealt with the analysis of two-way quantitative data in large germplasm collections, this paper presents analytical methods appropriate for two-way data matrices consisting of mixed data types, namely, ordered multicategory and quantitative data types. While various pattern analysis techniques have been identified as suitable for analysis of the mixed data types which occur in germplasm collections, the clustering and ordination methods used often can not deal explicitly with the computational consequences of large data sets (i.e. greater than 5000 accessions) with incomplete information. However, it is shown that the ordination technique of principal component analysis and the mixture maximum likelihood method of clustering can be employed to achieve such analyses. Germplasm evaluation data for 11436 accessions of groundnut (Arachis hypogaea L.) from the International Research Institute of the Semi-Arid Tropics, Andhra Pradesh, India were examined. Data for nine quantitative descriptors measured in the post-rainy season and five ordered multicategory descriptors were used. Pattern analysis results generally indicated that the accessions could be distinguished into four regions along the continuum of growth habit (or plant erectness). Interpretation of accession membership in these regions was found to be consistent with taxonomic information, such as subspecies. Each growth habit region contained accessions from three of the most common groundnut botanical varieties. This implies that within each of the habit types there is the full range of expression for the other descriptors used in the analysis. Using these types of insights, the patterns of variability in germplasm collections can provide scientists with valuable information for their plant improvement programs.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A tag-based item recommendation method generates an ordered list of items, likely interesting to a particular user, using the users past tagging behaviour. However, the users tagging behaviour varies in different tagging systems. A potential problem in generating quality recommendation is how to build user profiles, that interprets user behaviour to be effectively used, in recommendation models. Generally, the recommendation methods are made to work with specific types of user profiles, and may not work well with different datasets. In this paper, we investigate several tagging data interpretation and representation schemes that can lead to building an effective user profile. We discuss the various benefits a scheme brings to a recommendation method by highlighting the representative features of user tagging behaviours on a specific dataset. Empirical analysis shows that each interpretation scheme forms a distinct data representation which eventually affects the recommendation result. Results on various datasets show that an interpretation scheme should be selected based on the dominant usage in the tagging data (i.e. either higher amount of tags or higher amount of items present). The usage represents the characteristic of user tagging behaviour in the system. The results also demonstrate how the scheme is able to address the cold-start user problem.