933 results for SLIM-TREES
Abstract:
Decision trees need training samples from the training data set to derive classification rules. If the training set is too small, important information may be missed and the model cannot explain the classification rules of the data. However, a large training set is not guaranteed to produce a good model either. This paper analyzes the relationship between decision trees and the scale of the training data. We use nine decision tree algorithms to test the accuracy, complexity, and robustness of decision tree algorithms. Some results are demonstrated.
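The kind of experiment the abstract describes can be sketched as a learning curve: train on sets of increasing size and score each model on a fixed test set. This is a minimal sketch under stated assumptions, not the paper's setup. It uses a single depth-1 tree (a decision stump) rather than any of the nine algorithms, and the synthetic one-dimensional task, the noise level, and all function names are hypothetical.

```python
import random

def make_data(n, rng, noise=0.1):
    """Synthetic 1-D task (assumed): label is 1 when x > 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < noise:
            y = 1 - y
        data.append((x, y))
    return data

def fit_stump(train):
    """Depth-1 tree: pick the threshold with the fewest training errors."""
    xs = sorted(x for x, _ in train)
    # Candidate thresholds: both ends of the range plus midpoints between neighbours.
    candidates = [0.0] + [(a + b) / 2 for a, b in zip(xs, xs[1:])] + [1.0]

    def errors(t):
        return sum(int(x > t) != y for x, y in train)

    return min(candidates, key=errors)

def accuracy(threshold, test):
    """Fraction of test points classified correctly by the rule x > threshold."""
    return sum(int(x > threshold) == y for x, y in test) / len(test)

def learning_curve(sizes, seed=0, noise=0.1, n_test=2000):
    """Map each training-set size to test accuracy on one shared test set."""
    rng = random.Random(seed)
    test = make_data(n_test, rng, noise)
    return {n: accuracy(fit_stump(make_data(n, rng, noise)), test) for n in sizes}

if __name__ == "__main__":
    for n, acc in learning_curve([5, 20, 100, 500]).items():
        print(n, round(acc, 3))
```

With label noise in the data, accuracy cannot exceed roughly one minus the noise rate however large the training set becomes, which is one simple way a large training set can fail to yield a better model.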
Abstract:
Progress report for the Trees and Tweets project, Digging into Data Challenge round 3.
Abstract:
We analyzed catches per unit of effort (CPUE) from the Japanese longline fishery for bigeye tuna (Thunnus obesus) in the central and eastern Pacific Ocean (EPO) with regression tree methods. Regression trees have not previously been used to estimate time series of abundance indices from CPUE data. The "optimally sized" tree had 139 parameters; year, month, latitude, and longitude interacted to affect bigeye CPUE. The trend in tree-based abundance indices for the EPO was similar to trends estimated from a generalized linear model and from an empirical model that combines oceanographic data with information on the distribution of fish relative to environmental conditions. The regression tree was more parsimonious and would be easier to implement than the other two models, but the tree provided no information about the mechanisms that caused bigeye CPUEs to vary in time and space. Bigeye CPUEs increased sharply during the mid-1980s and were more variable at the northern and southern edges of the fishing grounds. Both of these results can be explained by changes in actual abundance and changes in catchability. Results from a regression tree that was fitted to a subset of the data indicated that, in the EPO, bigeye are about equally catchable with regular and deep longlines. This is not consistent with observations that bigeye are more abundant at depth and indicates that classification by gear type (regular or deep longline) may not provide a good measure of capture depth. A simulated annealing algorithm was used to summarize the tree-based results by partitioning the fishing grounds into regions where trends in bigeye CPUE were similar. Simulated annealing can be useful for designing spatial strata in future sampling programs. (PDF contains 45 pages.)
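For readers unfamiliar with regression trees, the core step is choosing a binary split on one predictor that most reduces the sum of squared errors of the response. The sketch below illustrates that single step only. It is an assumption-laden illustration, not the authors' procedure: the records are made up, and the field names (`year`, `latitude`, `cpue`) and function names are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(rows, feature):
    """Return (threshold, sse_after_split) for the best binary split on one feature.

    rows: list of dicts holding predictor fields and a "cpue" response.
    The threshold is None when no split improves on the unsplit node.
    """
    levels = sorted({r[feature] for r in rows})
    best = (None, sse([r["cpue"] for r in rows]))
    for lo, hi in zip(levels, levels[1:]):
        t = (lo + hi) / 2  # midpoint between adjacent observed values
        left = [r["cpue"] for r in rows if r[feature] <= t]
        right = [r["cpue"] for r in rows if r[feature] > t]
        total = sse(left) + sse(right)
        if total < best[1]:
            best = (t, total)
    return best

# Made-up records for illustration only (not data from the study).
rows = [
    {"year": 1980, "latitude": -10, "cpue": 1.0},
    {"year": 1982, "latitude": 5, "cpue": 1.2},
    {"year": 1986, "latitude": -10, "cpue": 2.9},
    {"year": 1988, "latitude": 5, "cpue": 3.1},
]

if __name__ == "__main__":
    print(best_split(rows, "year"))
    print(best_split(rows, "latitude"))
```

In these toy records the split on year removes almost all of the squared error while the split on latitude removes very little, so a tree would split on year first. A full regression tree repeats this search over all predictors at every node and then prunes to an "optimally sized" tree.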
Abstract:
The δD values of nitrated cellulose from a variety of trees covering a wide geographic range have been measured. These measurements have been used to ascertain which factors are likely to cause δD variations in cellulose C-H hydrogen.
It is found that a primary source of tree δD variation is the δD variation of the environmental precipitation. Superimposed on this are isotopic variations caused by the transpiration of the leaf water incorporated by the tree. The magnitude of this transpiration effect appears to be related to relative humidity.
Within a single tree, it is found that the hydrogen isotope variations which occur for a ring sequence in one radial direction may not be exactly the same as those which occur in a different direction. Such heterogeneities appear most likely to occur in trees with asymmetric ring patterns that contain reaction wood. In the absence of reaction wood such heterogeneities do not seem to occur. Thus, hydrogen isotope analyses of tree ring sequences should be performed on trees which do not contain reaction wood.
Comparisons of tree δD variations with variations in local climate are performed on two levels: spatial and temporal. It is found that the δD values of 20 North American trees from a wide geographic range are reasonably well correlated with the corresponding average annual temperature. The correlation is similar to that observed for a comparison of the δD values of annual precipitation of 11 North American sites with annual temperature. However, it appears that this correlation is significantly disrupted by trees which grew on poorly drained sites such as those in stagnant marshes. Therefore, site selection may be important in choosing trees for climatic interpretation of δD values, although proper sites do not seem to be uncommon.
The measurement of δD values in 5-year samples from the tree ring sequences of 13 trees from 11 North American sites reveals a variety of relationships with local climate. As with the spatial δD vs climate comparison, site selection is also apparently important for temporal tree δD vs climate comparisons. Again, it seems that poorly drained sites are to be avoided. For nine trees from different "well-behaved" sites, it was found that the local climatic variable best related to the δD variations was not the same for all sites.
Two of these trees showed a strong negative correlation with the amount of local summer precipitation. Consideration of factors likely to influence the isotopic composition of summer rain suggests that rainfall intensity may be important. The higher the intensity, the lower the δD value. Such an effect might explain the negative correlation of δD vs summer precipitation amount for these two trees. A third tree also exhibited a strong correlation with summer climate, but in this instance it was a positive correlation of δD with summer temperature.
The remaining six trees exhibited the best correlation between δD values and local annual climate. However, in none of these six cases was annual temperature the most important variable. In fact, annual temperature commonly showed no relationship at all with tree δD values. Instead, it was found that a simple mass balance model incorporating two basic assumptions yielded parameters which produced the best relationships with tree δD values. First, it was assumed that the δD values of these six trees reflected the δD values of annual precipitation incorporated by these trees. Second, it was assumed that the δD value of the annual precipitation was a weighted average of two seasonal isotopic components: summer and winter. Mass balance equations derived from these assumptions yielded combinations of variables that commonly showed a relationship with tree δD values where none had previously been discerned.
It was found for these "well-behaved" trees that not all sample intervals in a δD vs local climate plot fell along a well-defined trend. These departures from the local δD vs climate norm were defined as "anomalous". Some of these anomalous intervals were common to trees from different locales. When such widespread commonality of an anomalous interval occurred, it was observed that the interval corresponded to an interval in which drought had existed in the North American Great Plains.
Consequently, there appears to be a combination of both local and large-scale climatic information in the δD variations of tree cellulose C-H hydrogen.
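The two-component mass balance mentioned in this abstract can be illustrated with a short calculation. The abstract states only that the annual value is a weighted average of summer and winter components. Weighting by seasonal precipitation amount is an assumption made here for illustration, as are the function name, the parameter names, and the example numbers.

```python
def annual_delta_d(delta_summer, delta_winter, precip_summer, precip_winter):
    """Amount-weighted mean of two seasonal hydrogen isotope components.

    delta_summer, delta_winter: seasonal isotope values (per mil).
    precip_summer, precip_winter: seasonal precipitation amounts in any
    common unit. Using amounts as the weights is an assumption, not
    something the abstract specifies.
    """
    total = precip_summer + precip_winter
    if total <= 0:
        raise ValueError("total precipitation must be positive")
    return (precip_summer * delta_summer + precip_winter * delta_winter) / total

if __name__ == "__main__":
    # Hypothetical values: summer rain at -50 per mil, winter snow at -110 per mil.
    print(annual_delta_d(-50.0, -110.0, 300.0, 200.0))  # -74.0
```

Under this form the annual value can shift without any change in temperature, simply because the summer-to-winter precipitation ratio changes. That is consistent with the abstract's observation that annual temperature alone often showed no relationship with the tree values.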