Digital Library

4 results for tree-ensemble models

in Digital Commons at Florida International University

Oxygen isotope ratios of cellulose-derived phenylglucosazone: An improved paleoclimate indicator of environmental water and relative humidity

Relevance:

80.00%

Abstract:

Oxygen atoms within fossil wood provide high-resolution records of climate change, particularly for the Quaternary. However, current methods for analyzing fossil cellulose do not differentiate between the positions of the oxygen atoms. Here, we propose a refinement to tree-cellulose paleoclimatology modeling that uses the cellulose-derived compound phenylglucosazone as the isotopic substrate. Stem samples were collected from trees at northern latitudes ranging from 24°37′N to 69°00′N. We extracted stem water and cellulose from each stem sample and analyzed them for their 18O content. In addition, we derivatized the cellulose to phenylglucosazone, a compound that lacks the oxygen attached to the second carbon of the cellulose–glucose moieties. Oxygen isotope analysis of phenylglucosazone allowed us to calculate the 18O content of the oxygen attached to the second carbon of the cellulose–glucose moieties. With these analyses, we tested two hypotheses: first, that the 18O content of the oxygen attached to the second carbon more closely reflects the 18O content of the stem water and does not resemble the 18O content of either cellulose or its derivative phenylglucosazone; second, that tree-ring models that incorporate the variable oxygen isotope fractionation shown here and elsewhere are more accurate than those that do not. Our first hypothesis was rejected because the oxygen isotope ratios of the oxygen attached to the second carbon of the glucose moieties had a noisy isotopic signal with a large standard deviation and gave the poorest correlation with the oxygen isotope ratios of stem water. Related to this isotopic noise, we observed that the correlations of the oxygen isotope ratios of phenylglucosazone with both stem water and relative humidity were higher than those observed for cellulose.
Our second hypothesis, about tree-ring models that account for changes in oxygen isotopic fractionation during cellulose synthesis, was supported only for the 18O content of phenylglucosazone. We showed that the tree-ring model based on the 18O content of phenylglucosazone was an improvement over existing models that are based on whole cellulose. Additionally, this approach may be used in other cellulose-based archives such as peat deposits and lacustrine sediments.
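
The abstract does not spell out how the 18O content of the second-carbon oxygen is calculated. A minimal sketch of one plausible mass balance follows, assuming five oxygen atoms per cellulose glucose unit and four in phenylglucosazone, with isotope values expressed as delta values (per mil) that combine linearly. The function names, the example numbers, and the independent-error assumption are illustrative only and are not taken from the study.

```python
import math

# Assumed oxygen counts (not stated in the abstract): a cellulose glucose
# unit carries 5 oxygens; phenylglucosazone keeps 4 (it lacks the C-2 oxygen).
N_CELLULOSE = 5
N_PHENYLGLUCOSAZONE = 4


def delta_second_carbon_oxygen(delta_cellulose, delta_pg):
    """Mass balance: 5 * d_cell = 4 * d_pg + d_O2, solved for d_O2."""
    return N_CELLULOSE * delta_cellulose - N_PHENYLGLUCOSAZONE * delta_pg


def std_second_carbon_oxygen(sd_cellulose, sd_pg):
    """Propagated standard deviation, assuming independent measurement errors."""
    return math.sqrt((N_CELLULOSE * sd_cellulose) ** 2
                     + (N_PHENYLGLUCOSAZONE * sd_pg) ** 2)


if __name__ == "__main__":
    # Hypothetical delta values in per mil, chosen only for illustration.
    print(delta_second_carbon_oxygen(27.0, 26.0))
    print(round(std_second_carbon_oxygen(0.2, 0.2), 3))
```

Under these assumptions, small measurement errors in the two inputs are multiplied by 5 and 4 before being combined, which is one possible reason a difference-based estimate like this would be noisier than either measured quantity.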

Development of prediction models for freeway incident durations using data mining techniques

Relevance:

30.00%

Abstract:

The nation's freeway systems are becoming increasingly congested. Traffic incidents are a major contributor to this congestion. They are non-recurring events, such as accidents or stranded vehicles, that temporarily reduce roadway capacity, and they can account for as much as 60 percent of all traffic congestion on freeways. One major freeway incident management strategy involves diverting traffic away from incident locations by relaying timely information through Intelligent Transportation Systems (ITS) devices such as dynamic message signs or real-time traveler information systems. The decision to divert traffic depends foremost on the expected duration of an incident, which is difficult to predict. In addition, the duration of an incident is affected by many contributing factors. Identifying and understanding these factors can help in developing better strategies to reduce incident durations and alleviate traffic congestion. A number of research studies have attempted to develop models to predict incident durations, but with limited success.
This dissertation research attempts to improve on these previous efforts by applying data mining techniques to a comprehensive incident database maintained by the District 4 ITS Office of the Florida Department of Transportation (FDOT). Two categories of incident duration prediction models were developed: "offline" models designed for use in the performance evaluation of incident management programs, and "online" models for real-time prediction of incident duration to aid decisions on traffic diversion during an ongoing incident. Multiple data mining analysis techniques were applied and evaluated in the research. Multiple linear regression analysis and a decision-tree-based method were applied to develop the offline models, and a rule-based method and a tree algorithm called M5P were used to develop the online models.
The results show that the models in general can achieve high prediction accuracy within acceptable time intervals of the actual durations. The research also identifies some new contributing factors that have not been examined in past studies. As part of the research effort, software code was developed to implement the models in the existing software system of District 4 FDOT for actual applications.
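
The abstract reports accuracy "within acceptable time intervals of the actual durations" without defining the metric. A minimal sketch of one common reading follows: the share of predictions that fall within a fixed tolerance of the observed duration. The function name, the 15-minute tolerance, and the sample durations are hypothetical and are not taken from the dissertation.

```python
def share_within_tolerance(predicted, actual, tolerance_minutes):
    """Fraction of predicted durations within +/- tolerance of the actual ones."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one incident is required")
    hits = sum(1 for p, a in zip(predicted, actual)
               if abs(p - a) <= tolerance_minutes)
    return hits / len(actual)


if __name__ == "__main__":
    # Hypothetical incident durations in minutes, for illustration only.
    predicted = [30, 45, 90, 20]
    actual = [25, 70, 85, 22]
    print(share_within_tolerance(predicted, actual, tolerance_minutes=15))
```

A tolerance-based score of this kind matches the diversion use case described above, where a prediction only needs to be close enough to support the decision, not exact.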

Ensemble Stream Model for Data-Cleaning in Sensor Networks

Relevance:

30.00%

Abstract:

Ensemble stream modeling and data cleaning are sensor information processing systems that have different training and testing methods by which their goals are cross-validated. This research examines a mechanism that seeks to extract novel patterns by generating ensembles from data. The main goal of label-less stream processing is to process the sensed events so as to eliminate uncorrelated noise and to choose the most likely model without overfitting, thus obtaining higher model confidence. Higher-quality streams can be realized by combining many short streams into an ensemble that has the desired quality. The framework for the investigation is an existing data mining tool. First, to accommodate feature extraction for an event such as a bush fire or natural forest fire, we take the burnt area (BA*), the sensed ground truth obtained from logs, as our target variable. Even though this is an obvious model choice, the results are disappointing, for two reasons: first, the histogram of fire activity is highly skewed; second, the measured sensor parameters are highly correlated. Since using non-descriptive features does not yield good results, we resort to temporal features. By doing so we carefully eliminate the averaging effects; the resulting histogram is more satisfactory, and conceptual knowledge is learned from sensor streams. Second is the process of feature induction by cross-validating attributes with single or multi-target variables to minimize training error. We use the F-measure score, which combines precision and recall, to determine the false alarm rate of fire events. The multi-target data-cleaning trees use the information purity of the target leaf nodes to learn higher-order features. A sensitive variance measure such as the F-test is performed during each node's split to select the best attribute. The ensemble stream model approach improved when complicated features were used with a simpler tree classifier.
The ensemble framework for data cleaning and the enhancements to quantify quality of fitness (30% spatial, 10% temporal, and 90% mobility reduction) of sensors led to the formation of streams for sensor-enabled applications. This further motivates the novelty of stream quality labeling and its importance in handling the vast amounts of real-time mobile streams generated today.
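
For reference, the F-measure mentioned in the abstract is conventionally defined from precision and recall. A minimal sketch of that standard definition follows, computed from counts of true positives, false positives (false alarms), and false negatives (missed events). The function name and the example counts are hypothetical, and the thesis may compute its score differently.

```python
def f_measure(true_pos, false_pos, false_neg, beta=1.0):
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    detected = true_pos + false_pos   # events the model flagged
    occurred = true_pos + false_neg   # events that actually happened
    precision = true_pos / detected if detected else 0.0
    recall = true_pos / occurred if occurred else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    # Hypothetical counts: 8 fires detected correctly, 2 false alarms, 2 missed.
    print(f_measure(true_pos=8, false_pos=2, false_neg=2))
```

Because precision falls as false alarms rise, the score penalizes a detector that flags fire events too readily as well as one that misses them.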
