864 resultados para Tree based intercrop systems
Resumo:
This paper presents a multi-class AdaBoost based on incorporating an ensemble of binary AdaBoosts which is organized as Binary Decision Tree (BDT). It is proved that binary AdaBoost is extremely successful in producing accurate classification but it does not perform very well for multi-class problems. To avoid this performance degradation, the multi-class problem is divided into a number of binary problems and binary AdaBoost classifiers are invoked to solve these classification problems. This approach is tested with a dataset consisting of 6500 binary images of traffic signs. Haar-like features of these images are computed and the multi-class AdaBoost classifier is invoked to classify them. A classification rate of 96.7% and 95.7% is achieved for the traffic sign boarders and pictograms, respectively. The proposed approach is also evaluated using a number of standard datasets such as Iris, Wine, Yeast, etc. The performance of the proposed BDT classifier is quite high as compared with the state of the art and it converges very fast to a solution which indicates it as a reliable classifier.
Resumo:
In today’s big data world, data is being produced in massive volumes, at great velocity and from a variety of different sources such as mobile devices, sensors, a plethora of small devices hooked to the internet (Internet of Things), social networks, communication networks and many others. Interactive querying and large-scale analytics are being increasingly used to derive value out of this big data. A large portion of this data is being stored and processed in the Cloud due the several advantages provided by the Cloud such as scalability, elasticity, availability, low cost of ownership and the overall economies of scale. There is thus, a growing need for large-scale cloud-based data management systems that can support real-time ingest, storage and processing of large volumes of heterogeneous data. However, in the pay-as-you-go Cloud environment, the cost of analytics can grow linearly with the time and resources required. Reducing the cost of data analytics in the Cloud thus remains a primary challenge. In my dissertation research, I have focused on building efficient and cost-effective cloud-based data management systems for different application domains that are predominant in cloud computing environments. In the first part of my dissertation, I address the problem of reducing the cost of transactional workloads on relational databases to support database-as-a-service in the Cloud. The primary challenges in supporting such workloads include choosing how to partition the data across a large number of machines, minimizing the number of distributed transactions, providing high data availability, and tolerating failures gracefully. I have designed, built and evaluated SWORD, an end-to-end scalable online transaction processing system, that utilizes workload-aware data placement and replication to minimize the number of distributed transactions that incorporates a suite of novel techniques to significantly reduce the overheads incurred both during the initial placement of data, and during query execution at runtime. In the second part of my dissertation, I focus on sampling-based progressive analytics as a means to reduce the cost of data analytics in the relational domain. Sampling has been traditionally used by data scientists to get progressive answers to complex analytical tasks over large volumes of data. Typically, this involves manually extracting samples of increasing data size (progressive samples) for exploratory querying. This provides the data scientists with user control, repeatable semantics, and result provenance. However, such solutions result in tedious workflows that preclude the reuse of work across samples. On the other hand, existing approximate query processing systems report early results, but do not offer the above benefits for complex ad-hoc queries. I propose a new progressive data-parallel computation framework, NOW!, that provides support for progressive analytics over big data. In particular, NOW! enables progressive relational (SQL) query support in the Cloud using unique progress semantics that allow efficient and deterministic query processing over samples providing meaningful early results and provenance to data scientists. NOW! enables the provision of early results using significantly fewer resources thereby enabling a substantial reduction in the cost incurred during such analytics. Finally, I propose NSCALE, a system for efficient and cost-effective complex analytics on large-scale graph-structured data in the Cloud. The system is based on the key observation that a wide range of complex analysis tasks over graph data require processing and reasoning about a large number of multi-hop neighborhoods or subgraphs in the graph; examples include ego network analysis, motif counting in biological networks, finding social circles in social networks, personalized recommendations, link prediction, etc. These tasks are not well served by existing vertex-centric graph processing frameworks whose computation and execution models limit the user program to directly access the state of a single vertex, resulting in high execution overheads. Further, the lack of support for extracting the relevant portions of the graph that are of interest to an analysis task and loading it onto distributed memory leads to poor scalability. NSCALE allows users to write programs at the level of neighborhoods or subgraphs rather than at the level of vertices, and to declaratively specify the subgraphs of interest. It enables the efficient distributed execution of these neighborhood-centric complex analysis tasks over largescale graphs, while minimizing resource consumption and communication cost, thereby substantially reducing the overall cost of graph data analytics in the Cloud. The results of our extensive experimental evaluation of these prototypes with several real-world data sets and applications validate the effectiveness of our techniques which provide orders-of-magnitude reductions in the overheads of distributed data querying and analysis in the Cloud.
Resumo:
Authentication plays an important role in how we interact with computers, mobile devices, the web, etc. The idea of authentication is to uniquely identify a user before granting access to system privileges. For example, in recent years more corporate information and applications have been accessible via the Internet and Intranet. Many employees are working from remote locations and need access to secure corporate files. During this time, it is possible for malicious or unauthorized users to gain access to the system. For this reason, it is logical to have some mechanism in place to detect whether the logged-in user is the same user in control of the user's session. Therefore, highly secure authentication methods must be used. We posit that each of us is unique in our use of computer systems. It is this uniqueness that is leveraged to "continuously authenticate users" while they use web software. To monitor user behavior, n-gram models are used to capture user interactions with web-based software. This statistical language model essentially captures sequences and sub-sequences of user actions, their orderings, and temporal relationships that make them unique by providing a model of how each user typically behaves. Users are then continuously monitored during software operations. Large deviations from "normal behavior" can possibly indicate malicious or unintended behavior. This approach is implemented in a system called Intruder Detector (ID) that models user actions as embodied in web logs generated in response to a user's actions. User identification through web logs is cost-effective and non-intrusive. We perform experiments on a large fielded system with web logs of approximately 4000 users. For these experiments, we use two classification techniques; binary and multi-class classification. We evaluate model-specific differences of user behavior based on coarse-grain (i.e., role) and fine-grain (i.e., individual) analysis. A specific set of metrics are used to provide valuable insight into how each model performs. Intruder Detector achieves accurate results when identifying legitimate users and user types. This tool is also able to detect outliers in role-based user behavior with optimal performance. In addition to web applications, this continuous monitoring technique can be used with other user-based systems such as mobile devices and the analysis of network traffic.
Resumo:
In this paper, we present a case-based reasoning (CBR) approach solving educational time-tabling problems. Following the basic idea behind CBR, the solutions of previously solved problems are employed to aid finding the solutions for new problems. A list of feature-value pairs is insufficient to represent all the necessary information. We show that attribute graphs can represent more information and thus can help to retrieve re-usable cases that have similar structures to the new problems. The case base is organised as a decision tree to store the attribute graphs of solved problems hierarchically. An example is given to illustrate the retrieval, re-use and adaptation of structured cases. The results from our experiments show the effectiveness of the retrieval and adaptation in the proposed method.
Resumo:
The structured representation of cases by attribute graphs in a Case-Based Reasoning (CBR) system for course timetabling has been the subject of previous research by the authors. In that system, the case base is organised as a decision tree and the retrieval process chooses those cases which are sub attribute graph isomorphic to the new case. The drawback of that approach is that it is not suitable for solving large problems. This paper presents a multiple-retrieval approach that partitions a large problem into small solvable sub-problems by recursively inputting the unsolved part of the graph into the decision tree for retrieval. The adaptation combines the retrieved partial solutions of all the partitioned sub-problems and employs a graph heuristic method to construct the whole solution for the new case. We present a methodology which is not dependant upon problem specific information and which, as such, represents an approach which underpins the goal of building more general timetabling systems. We also explore the question of whether this multiple-retrieval CBR could be an effective initialisation method for local search methods such as Hill Climbing, Tabu Search and Simulated Annealing. Significant results are obtained from a wide range of experiments. An evaluation of the CBR system is presented and the impact of the approach on timetabling research is discussed. We see that the approach does indeed represent an effective initialisation method for these approaches.
Resumo:
We show a simulation model for capacity analysis in mobile systems using a geographic information system (GIS) based tool, used for coverage calculations and frequency assignment, and MATLAB. The model was developed initially for “narrowband” CDMA and TDMA, but was modified for WCDMA. We show also some results for a specific case in “narrowband” CDMA
Resumo:
The structured representation of cases by attribute graphs in a Case-Based Reasoning (CBR) system for course timetabling has been the subject of previous research by the authors. In that system, the case base is organised as a decision tree and the retrieval process chooses those cases which are sub attribute graph isomorphic to the new case. The drawback of that approach is that it is not suitable for solving large problems. This paper presents a multiple-retrieval approach that partitions a large problem into small solvable sub-problems by recursively inputting the unsolved part of the graph into the decision tree for retrieval. The adaptation combines the retrieved partial solutions of all the partitioned sub-problems and employs a graph heuristic method to construct the whole solution for the new case. We present a methodology which is not dependant upon problem specific information and which, as such, represents an approach which underpins the goal of building more general timetabling systems. We also explore the question of whether this multiple-retrieval CBR could be an effective initialisation method for local search methods such as Hill Climbing, Tabu Search and Simulated Annealing. Significant results are obtained from a wide range of experiments. An evaluation of the CBR system is presented and the impact of the approach on timetabling research is discussed. We see that the approach does indeed represent an effective initialisation method for these approaches.
Resumo:
Agricultural land has been identified as a potential source of greenhouse gas emissions offsets through biosequestration in vegetation and soil. In the extensive grazing land of Australia, landholders may participate in the Australian Government’s Emissions Reduction Fund and create offsets by reducing woody vegetation clearing and allowing native woody plant regrowth to grow. This study used bioeconomic modelling to evaluate the trade-offs between an existing central Queensland grazing operation, which has been using repeated tree clearing to maintain pasture growth, and an alternative carbon and grazing enterprise in which tree clearing is reduced and the additional carbon sequestered in trees is sold. The results showed that ceasing clearing in favour of producing offsets produces a higher net present value over 20 years than the existing cattle enterprise at carbon prices, which are close to current (2015) market levels (~$13 t–1 CO2-e). However, by modifying key variables, relative profitability did change. Sensitivity analysis evaluated key variables, which determine the relative profitability of carbon and cattle. In order of importance these were: the carbon price, the gross margin of cattle production, the severity of the tree–grass relationship, the area of regrowth retained, the age of regrowth at the start of the project, and to a lesser extent the cost of carbon project administration, compliance and monitoring. Based on the analysis, retaining regrowth to generate carbon income may be worthwhile for cattle producers in Australia, but careful consideration needs to be given to the opportunity cost of reduced cattle income.
Resumo:
Accurate estimation of road pavement geometry and layer material properties through the use of proper nondestructive testing and sensor technologies is essential for evaluating pavement’s structural condition and determining options for maintenance and rehabilitation. For these purposes, pavement deflection basins produced by the nondestructive Falling Weight Deflectometer (FWD) test data are commonly used. The nondestructive FWD test drops weights on the pavement to simulate traffic loads and measures the created pavement deflection basins. Backcalculation of pavement geometry and layer properties using FWD deflections is a difficult inverse problem, and the solution with conventional mathematical methods is often challenging due to the ill-posed nature of the problem. In this dissertation, a hybrid algorithm was developed to seek robust and fast solutions to this inverse problem. The algorithm is based on soft computing techniques, mainly Artificial Neural Networks (ANNs) and Genetic Algorithms (GAs) as well as the use of numerical analysis techniques to properly simulate the geomechanical system. A widely used pavement layered analysis program ILLI-PAVE was employed in the analyses of flexible pavements of various pavement types; including full-depth asphalt and conventional flexible pavements, were built on either lime stabilized soils or untreated subgrade. Nonlinear properties of the subgrade soil and the base course aggregate as transportation geomaterials were also considered. A computer program, Soft Computing Based System Identifier or SOFTSYS, was developed. In SOFTSYS, ANNs were used as surrogate models to provide faster solutions of the nonlinear finite element program ILLI-PAVE. The deflections obtained from FWD tests in the field were matched with the predictions obtained from the numerical simulations to develop SOFTSYS models. The solution to the inverse problem for multi-layered pavements is computationally hard to achieve and is often not feasible due to field variability and quality of the collected data. The primary difficulty in the analysis arises from the substantial increase in the degree of non-uniqueness of the mapping from the pavement layer parameters to the FWD deflections. The insensitivity of some layer properties lowered SOFTSYS model performances. Still, SOFTSYS models were shown to work effectively with the synthetic data obtained from ILLI-PAVE finite element solutions. In general, SOFTSYS solutions very closely matched the ILLI-PAVE mechanistic pavement analysis results. For SOFTSYS validation, field collected FWD data were successfully used to predict pavement layer thicknesses and layer moduli of in-service flexible pavements. Some of the very promising SOFTSYS results indicated average absolute errors on the order of 2%, 7%, and 4% for the Hot Mix Asphalt (HMA) thickness estimation of full-depth asphalt pavements, full-depth pavements on lime stabilized soils and conventional flexible pavements, respectively. The field validations of SOFTSYS data also produced meaningful results. The thickness data obtained from Ground Penetrating Radar testing matched reasonably well with predictions from SOFTSYS models. The differences observed in the HMA and lime stabilized soil layer thicknesses observed were attributed to deflection data variability from FWD tests. The backcalculated asphalt concrete layer thickness results matched better in the case of full-depth asphalt flexible pavements built on lime stabilized soils compared to conventional flexible pavements. Overall, SOFTSYS was capable of producing reliable thickness estimates despite the variability of field constructed asphalt layer thicknesses.