875 results for ENVIRONMENT DATA
Abstract:
The primary aim of this dissertation is to develop data mining tools for knowledge discovery in biomedical data when multiple (homogeneous or heterogeneous) sources of data are available. The central hypothesis is that, when information from multiple sources of data is used appropriately and effectively, knowledge discovery can be better achieved than is possible from a single source alone.

Recent advances in high-throughput technology have enabled biomedical researchers to generate large volumes of diverse types of data on a genome-wide scale. These data include DNA sequences, gene expression measurements, and much more; they provide the motivation for building analysis tools to elucidate the modular organization of the cell. The challenges include efficiently and accurately extracting information from the multiple data sources; representing the information effectively; developing analytical tools; and interpreting the results in the context of the domain.

The first part considers the application of feature-level integration to design classifiers that discriminate between soil types. The machine learning tools SVM and KNN were used to successfully distinguish between several soil samples.

The second part considers clustering using multiple heterogeneous data sources. The resulting Multi-Source Clustering (MSC) algorithm was shown to perform better than clustering methods that use only a single data source or a simple feature-level integration of heterogeneous data sources.

The third part proposes a new approach to effectively incorporate incomplete data into cluster analysis. Adapted from the K-means algorithm, the Generalized Constrained Clustering (GCC) algorithm makes use of incomplete data in the form of constraints to perform exploratory analysis. Novel approaches for extracting constraints were proposed. For sufficiently large constraint sets, the GCC algorithm outperformed the MSC algorithm.
The last part considers the problem of providing a theme-specific environment for mining multi-source biomedical data. The database, called PlasmoTFBM and focused on gene regulation of Plasmodium falciparum, contains diverse information and has a simple interface that allows biologists to explore the data. It provided a framework for comparing different analytical tools for predicting regulatory elements and for designing useful data mining tools.

The conclusion is that the experiments reported in this dissertation strongly support the central hypothesis.
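The GCC algorithm described above adapts K-means to honor constraints derived from incomplete data. As an illustrative sketch only (not the dissertation's implementation), a constrained K-means in the COP-KMeans style can enforce must-link and cannot-link pairs during assignment; all names and parameters here are invented for the example:

```python
import random

def violates(i, cluster, assign, must, cannot):
    """Check whether putting point i in `cluster` breaks a constraint,
    given the partial assignment built so far."""
    for a, b in must:
        other = b if a == i else a if b == i else None
        if other is not None and other in assign and assign[other] != cluster:
            return True
    for a, b in cannot:
        other = b if a == i else a if b == i else None
        if other is not None and assign.get(other) == cluster:
            return True
    return False

def constrained_kmeans(points, k, must=(), cannot=(), iters=20, seed=0):
    """K-means that respects must-link / cannot-link pairs during
    assignment (COP-KMeans-style sketch; hypothetical parameters)."""
    rng = random.Random(seed)
    centroids = list(rng.sample(points, k))
    dim = len(points[0])
    assign = {}
    for _ in range(iters):
        assign = {}
        for i, p in enumerate(points):
            # consider clusters nearest-first, take the first feasible one
            order = sorted(range(k),
                           key=lambda c: sum((p[d] - centroids[c][d]) ** 2
                                             for d in range(dim)))
            for c in order:
                if not violates(i, c, assign, must, cannot):
                    assign[i] = c
                    break
            else:                      # no feasible cluster: fall back
                assign[i] = order[0]
        for c in range(k):             # recompute centroids
            members = [points[i] for i, a in assign.items() if a == c]
            if members:
                centroids[c] = tuple(sum(m[d] for m in members) / len(members)
                                     for d in range(dim))
    return assign
```

With an empty constraint set this reduces to ordinary K-means; adding a must-link pair forces the two points into the same cluster regardless of distance.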
Abstract:
With the advent of peer-to-peer networks, and more importantly sensor networks, the desire to extract useful information from continuous and unbounded streams of data has become more prominent. For example, in tele-health applications, sensor-based data streaming systems are used to continuously and accurately monitor Alzheimer's patients and their surrounding environment. Typically, the requirements of such applications necessitate the cleaning and filtering of continuous, corrupted, and incomplete data streams gathered wirelessly in dynamically varying conditions. Yet existing data stream cleaning and filtering schemes are incapable of capturing the dynamics of the environment while simultaneously suppressing the losses and corruption introduced by uncertain environmental, hardware, and network conditions. Consequently, existing data cleaning and filtering paradigms are being challenged. This dissertation develops novel schemes for cleaning data streams received from a wireless sensor network operating under non-linear and dynamically varying conditions. The study establishes a paradigm for validating spatio-temporal associations among data sources to enhance data cleaning. To simplify the complexity of the validation process, the developed solution maps the requirements of the application onto a geometric space and identifies the potential sensor nodes of interest. Additionally, this dissertation models a wireless sensor network data reduction system, demonstrating that segregating the data adaptation and prediction processes augments data reduction rates. The schemes presented in this study are evaluated using simulation and information theory concepts. The results demonstrate that dynamic conditions of the environment are better managed when validation is used for data cleaning. They also show that when a fast-convergent adaptation process is deployed, data reduction rates are significantly improved.
Targeted applications of the developed methodology include machine health monitoring, tele-health, environment and habitat monitoring, intermodal transportation and homeland security.
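The spatio-temporal validation idea above — checking a sensor's reading against spatially associated neighbors before accepting it — can be illustrated with a minimal rule. The median/MAD statistic and the threshold are assumptions for this sketch, not the dissertation's scheme:

```python
from statistics import median

def clean_reading(value, neighbor_values, threshold=3.0):
    """Validate one sensor reading against spatially associated neighbors.

    If the reading deviates from the neighbors' median by more than
    `threshold` median absolute deviations, treat it as corrupted and
    substitute the median. Returns (cleaned_value, accepted_flag).
    The rule and threshold are illustrative, not the study's scheme.
    """
    m = median(neighbor_values)
    mad = median(abs(v - m) for v in neighbor_values) or 1e-9
    if abs(value - m) > threshold * mad:
        return m, False    # rejected: likely corruption or loss
    return value, True     # accepted as consistent with neighbors
```

A reading of 100.0 against neighbors clustered near 20 would be rejected and replaced by the neighbors' median, while a reading of 20.3 would pass through unchanged.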
Abstract:
Diminishing cultural and biological diversity is a current global crisis. Tropical forests and indigenous peoples are adversely affected by social and environmental changes caused by global political and economic systems. The purpose of this thesis was to investigate environmental and livelihood challenges, as well as medicinal plant knowledge, in a Yagua village in the Peruvian Amazon. Indigenous peoples’ relationships with the environment are an important topic in environmental anthropology, and traditional botanical knowledge is an integral component of ethnobotany. Political ecology provides a useful theoretical perspective for understanding the economic and political dimensions of environmental and social conditions. This research utilized a variety of ethnographic, ethnobotanical, and community-involved methods. Findings include data and analyses about the community’s culture, subsistence and natural resource needs, organizations and institutions, and medicinal plant use. The conclusion discusses the case study in terms of the disciplinary framework and offers suggestions for research and application.
Abstract:
Prior research has established that the idiosyncratic volatility of securities prices exhibits a positive trend. This trend and other factors have made the merits of investment diversification and portfolio construction more compelling.

A new optimization technique, a greedy algorithm, is proposed to optimize the weights of assets in a portfolio. The main benefits of using this algorithm are to: (a) increase the efficiency of the portfolio optimization process, (b) implement large-scale optimizations, and (c) improve the resulting optimal weights. In addition, the technique utilizes a novel approach in the construction of a time-varying covariance matrix. This involves the application of a modified integrated dynamic conditional correlation GARCH (IDCC-GARCH) model to account for the dynamics of the conditional covariance matrices that are employed.

The stochastic aspects of the expected returns of the securities are integrated into the technique through Monte Carlo simulation. Instead of representing the expected returns as deterministic values, they are assigned simulated values based on their historical measures. The time series of the securities are fitted to a probability distribution that matches the time-series characteristics using the Anderson-Darling goodness-of-fit criterion. Simulated and actual data sets are used to further generalize the results. Employing the S&P 500 securities as the base, 2000 simulated data sets are created using Monte Carlo simulation. In addition, the Russell 1000 securities are used to generate 50 sample data sets.

The results indicate an increase in risk-return performance. Choosing Value-at-Risk (VaR) as the criterion and the Crystal Ball portfolio optimizer, a commercial product currently available on the market, as the benchmark for comparison, the new greedy technique clearly outperforms the alternatives on samples of the S&P 500 and Russell 1000 securities.
The resulting improvements in performance are consistent among five securities selection methods (maximum, minimum, random, absolute minimum, and absolute maximum) and three covariance structures (unconditional, orthogonal GARCH, and integrated dynamic conditional GARCH).
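A greedy weight-construction loop of the general kind described — building the portfolio in small increments and giving each increment to the asset that most improves the objective — might be sketched as follows. A mean-variance objective stands in for the VaR criterion, and the asset names, step size, and risk-aversion parameter are all illustrative:

```python
def greedy_weights(returns, step=0.05, risk_aversion=2.0):
    """Build portfolio weights greedily in `step` increments.

    `returns` maps asset name -> list of historical period returns.
    Each increment goes to the asset whose addition maximizes
    mean - risk_aversion * variance of the portfolio return series.
    (The dissertation optimizes VaR; this objective is a stand-in.)
    """
    assets = list(returns)
    T = len(next(iter(returns.values())))
    port = [0.0] * T                    # running portfolio return series
    weights = {a: 0.0 for a in assets}
    for _ in range(round(1.0 / step)):
        best, best_score, best_port = None, None, None
        for a in assets:
            cand = [p + step * r for p, r in zip(port, returns[a])]
            mu = sum(cand) / T
            var = sum((c - mu) ** 2 for c in cand) / T
            score = mu - risk_aversion * var
            if best_score is None or score > best_score:
                best, best_score, best_port = a, score, cand
        weights[best] += step
        port = best_port
    return weights
```

Given two assets with equal mean return but very different volatility, the loop allocates essentially all weight to the steadier asset, which is the behavior one would expect from any risk-penalized greedy rule.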
Abstract:
We completed a synoptic survey of iron, phosphorus, and sulfur concentrations in shallow marine carbonate sediments from south Florida. Total extracted iron concentrations typically were 50 μmol g⁻¹ dry weight (DW) and tended to decrease away from the Florida mainland, whereas total extracted phosphorus concentrations mostly were 10 μmol g⁻¹ DW and tended to decrease from west to east across Florida Bay. Concentrations of reduced sulfur compounds, up to 40 μmol g⁻¹ DW, tended to covary with sediment iron concentrations, suggesting that sulfide mineral formation was iron-limited. An index of iron availability derived from sediment data was negatively correlated with chlorophyll a concentrations in surface waters, demonstrating the close coupling of sediment-water column processes. Eight months after applying a surface layer of iron oxide granules to experimental plots, sediment iron, phosphorus, and sulfur were elevated to a depth of 10 cm relative to control plots. Biomass of the seagrass Thalassia testudinum was not different between control and iron addition plots, but individual shoot growth rates were significantly higher in experimental plots after 8 months. Although the iron content of leaf tissues was significantly higher in iron addition plots, no difference in phosphorus content of T. testudinum leaves was observed. Iron addition altered plant exposure to free sulfide, documented by a significantly higher δ³⁴S of leaf tissue from experimental plots relative to controls. Iron as a buffer to toxic sulfides may promote individual shoot growth, but phosphorus availability to plants still appears to limit production in carbonate sediments.
Abstract:
Road pricing has emerged as an effective means of managing road traffic demand while simultaneously raising additional revenue for transportation agencies. Research on the factors that govern travel decisions has shown that user preferences may be a function of the demographic characteristics of the individuals and the perceived trip attributes. However, it is not clear what the actual trip attributes considered in the travel decision-making process are, how these attributes are perceived by travelers, and how the set of trip attributes changes as a function of the time of day or from day to day. In this study, operational Intelligent Transportation Systems (ITS) archives are mined and the aggregated preferences for a priced system are extracted at a fine time-aggregation level for an extended number of days. The resulting information is related to corresponding time-varying trip attributes such as travel time, travel time reliability, charged toll, and other parameters. The time-varying user preferences and trip attributes are linked together by means of a binary choice model (logit) with a linear utility function on trip attributes. The trip-attribute weights in the utility function are then dynamically estimated for each time of day by means of an adaptive, limited-memory discrete Kalman filter (ALMF). The relationship between traveler choices and travel time is assessed using different rules to capture the logic that best represents traveler perception and the effect of real-time information on the observed preferences. The impact of travel time reliability on traveler choices is investigated considering its multiple definitions. It can be concluded from the results that the ALMF algorithm allows a robust estimation of time-varying weights in the utility function at fine time-aggregation levels. The high correlations among the trip attributes severely constrain the simultaneous estimation of their weights in the utility function.
Despite the data limitations, it is found that the ALMF algorithm can provide stable estimates of the choice parameters for some periods of the day. Finally, it is found that the daily variation of the user sensitivities for different periods of the day resembles a well-defined normal distribution.
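The ALMF used in the study is an adaptive, limited-memory variant; the core of any such estimator is the discrete Kalman update. A plain Kalman step for a random-walk vector of utility weights, with illustrative noise parameters (not the study's), looks like this:

```python
def kalman_step(x, P, z, H, R, Q):
    """One discrete Kalman filter step for a random-walk state.

    x, P : state estimate (list) and covariance (list of lists)
    z    : scalar measurement (e.g., an observed choice share)
    H    : measurement row vector linking weights to the measurement
    R, Q : measurement and process noise (illustrative values)
    """
    n = len(x)
    # predict: random-walk state, so the estimate carries over and
    # the covariance grows by the process noise Q on the diagonal
    P = [[P[i][j] + (Q if i == j else 0.0) for j in range(n)] for i in range(n)]
    y = z - sum(H[i] * x[i] for i in range(n))                   # innovation
    HP = [sum(H[i] * P[i][j] for i in range(n)) for j in range(n)]  # H·P
    S = sum(HP[j] * H[j] for j in range(n)) + R                  # innovation var
    K = [sum(P[i][j] * H[j] for j in range(n)) / S for i in range(n)]  # gain
    x = [x[i] + K[i] * y for i in range(n)]                      # correct state
    P = [[P[i][j] - K[i] * HP[j] for j in range(n)] for i in range(n)]
    return x, P
```

Fed a constant measurement, the estimate converges toward it; the adaptive, limited-memory refinements in the dissertation modify how the noise terms and the history window are handled, not this core update.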
Abstract:
With the exponentially increasing demands on and uses of GIS data visualization systems, in applications such as urban planning, environment and climate change monitoring, weather simulation, and hydrographic gauging, research on and applications of geospatial vector and raster data visualization have become prevalent. However, current web GIS techniques are suitable only for static vector and raster data with no dynamically overlaid layers. While it is desirable to enable visual exploration of large-scale dynamic vector and raster geospatial data in a web environment, improving the performance between backend datasets and the vector and raster applications remains a challenging technical issue. This dissertation addresses these open problems: how to provide a large-scale dynamic vector and raster data visualization service with dynamically overlaid layers, accessible from various client devices through a standard web browser, and how to make that dynamic service as fast as its static counterpart. To accomplish this, a large-scale dynamic vector and raster data visualization geographic information system based on parallel map tiling, together with a comprehensive performance-improvement solution, is proposed, designed, and implemented. The components include: quadtree-based indexing and parallel map tiling; the Legend String; vector data visualization with dynamic layer overlaying; vector data time-series visualization; an algorithm for vector data rendering; an algorithm for raster data re-projection; an algorithm for the elimination of superfluous levels of detail; an algorithm for vector data gridding and re-grouping; and server-side cluster caching of vector and raster data.
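The quadtree-based tile indexing mentioned above is commonly realized with quadkeys over standard Web Mercator tiling (the usual slippy-map scheme; the dissertation's exact indexing may differ). Each zoom level appends one base-4 digit, so a tile's key is a prefix of all its children's keys:

```python
import math

def quadkey(lon, lat, zoom):
    """Quadtree key of the Web Mercator tile containing (lon, lat).

    One base-4 digit per zoom level; a tile's key is a prefix of all
    its children's keys, which is what makes quadtrees convenient for
    map-tile indexing and caching.
    """
    lat = max(min(lat, 85.05112878), -85.05112878)   # Mercator clamp
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.log(math.tan(lat_rad) + 1.0 / math.cos(lat_rad))
             / math.pi) / 2.0 * n)
    digits = []
    for z in range(zoom, 0, -1):
        mask = 1 << (z - 1)
        digits.append(str((1 if x & mask else 0) + (2 if y & mask else 0)))
    return "".join(digits)
```

The prefix property is what lets a tile server shard, cache, and parallelize tiling work by key range.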
Abstract:
The hydrologic regime of Shark Slough, the most extensive long-hydroperiod marsh in Everglades National Park, is largely controlled by the location, volume, and timing of water delivered to it through several control structures from Water Conservation Areas north of the Park. Where natural or anthropogenic barriers to water flow are present, water management practices in this highly regulated system may result in an uneven distribution of water in the marsh, which may impact regional vegetation patterns. In this paper, we use data from 569 sampling locations along five cross-Slough transects to examine regional vegetation distribution, and to test and describe the association of marsh vegetation with several hydrologic and edaphic parameters. Analysis of vegetation:environment relationships yielded estimates of both mean and variance in soil depth, as well as annual hydroperiod, mean water depth, and 30-day maximum water depth within each cover type during the 1990s. We found that rank abundances of the three major marsh cover types (Tall Sawgrass, Sparse Sawgrass, and Spikerush Marsh) were identical in all portions of Shark Slough, but regional trends in the relative abundance of individual communities were present. Analysis also indicated clear and consistent differences in the hydrologic regime of the three marsh cover types, with hydroperiod and water depths increasing in the order Tall Sawgrass < Sparse Sawgrass < Spikerush Marsh. In contrast, soil depth decreased in the same order. Locally, these differences were quite subtle; within a management unit of Shark Slough, mean annual values for the two water depth parameters varied less than 15 cm among types, and hydroperiods varied by 65 days or less. More significantly, regional variation in hydrology equaled or exceeded the variation attributable to cover type within a small area.
For instance, estimated hydroperiods for Tall Sawgrass in Northern Shark Slough were longer than for Spikerush Marsh in any of the other regions. Although some of this regional variation may reflect a natural gradient within the Slough, a large proportion is the result of compartmentalization due to current water management practices within the marsh. We conclude that hydroperiod or water depth is the most important influence on vegetation within management units, and attribute larger-scale differences in vegetation pattern to the interactions among soil development, hydrology, and fire regime in this pivotal portion of the Everglades.
Abstract:
The rapid growth of virtualized data centers and cloud hosting services is making the management of physical resources such as CPU, memory, and I/O bandwidth in data center servers increasingly important. Server management now involves dealing with multiple dissimilar applications with varying Service-Level Agreements (SLAs) and multiple resource dimensions. The multiplicity and diversity of resources and applications are rendering administrative tasks more complex and challenging. This thesis aimed to develop a framework and techniques that would help substantially reduce data center management complexity.

We specifically addressed two crucial data center operations. First, we precisely estimated the capacity requirements of client virtual machines (VMs) when renting server space in a cloud environment. Second, we proposed a systematic process to efficiently allocate physical resources to hosted VMs in a data center. To realize these dual objectives, accurately capturing the effects of resource allocations on application performance is vital. The benefits of accurate application performance modeling are manifold. Cloud users can size their VMs appropriately and pay only for the resources that they need; service providers can also offer a new charging model based on the VMs' performance instead of their configured sizes. As a result, clients will pay exactly for the performance they actually experience; on the other hand, administrators will be able to maximize their total revenue by utilizing application performance models and SLAs.

This thesis made the following contributions. First, we identified resource control parameters crucial for distributing physical resources and characterizing contention for virtualized applications in a shared hosting environment.
Second, we explored several modeling techniques and confirmed the suitability of two machine learning tools, the Artificial Neural Network and the Support Vector Machine, to accurately model the performance of virtualized applications. Moreover, we suggested and evaluated modeling optimizations necessary to improve prediction accuracy when using these tools. Third, we presented an approach to optimal VM sizing by employing the performance models we created. Finally, we proposed a revenue-driven resource allocation algorithm which maximizes the SLA-generated revenue for a data center.
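The performance models described here map a vector of resource allocations to a predicted performance value. As a toy stand-in for the ANN and SVM models (illustrative only: a linear hypothesis fit by stochastic gradient descent, with invented sample data), such a mapping can be fit like this:

```python
def fit_perf_model(samples, lr=0.05, epochs=2000):
    """Fit performance ≈ w·resources + b by stochastic gradient descent.

    `samples` is a list of (resource_vector, measured_performance)
    pairs; a toy linear stand-in for the thesis's ANN/SVM models.
    """
    n = len(samples[0][0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, y in samples:
            err = b + sum(wi * xi for wi, xi in zip(w, x)) - y
            b -= lr * err                                   # bias step
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]  # weight step
    return w, b

def predict(w, b, x):
    """Predicted performance for a resource allocation vector x."""
    return b + sum(wi * xi for wi, xi in zip(w, x))
```

Once fitted, the same model can be inverted in the other direction — searching over resource vectors for the cheapest allocation meeting an SLA target — which is the VM-sizing use described in the thesis.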
Abstract:
Public opinion polls in the United States reveal that a great majority of Americans are aware of and show concern about ecological issues and the need to preserve natural areas. In South Florida, natural resources have been subjected to enormous strain as the pressure to accommodate a growing population has led to rapid development. Suburbs have been built on areas that were once natural wetlands and farmlands, and the impact today shows a landscape where natural places have all but disappeared. This dissertation examines how the perceptions that individuals living in the South Florida region hold of the natural environment and local ecological problems intersect with the places where their relationship to nature takes place.

The research is based upon both quantitative and qualitative data. The principal methodology used in this research is the ethnographic method, which employed the data-gathering techniques of in-depth interviewing and participant observation. The objective of the qualitative portion of the study was to determine how people perceive and relate to their immediate environment. The quantitative portion of the study employed telephone survey data from the FIU/Florida Poll 2000. Data collected through this survey provided the basis to statistically test responses to the research questions.

The findings show that people in South Florida have a general idea of the relationship between the human population and the environment but very little knowledge of how they individually affect each other. The experience of private places and public spaces in everyday life permits people to compartmentalize cultural values and understandings of the natural world in separate cognitive schemas. The appreciation of the natural world has almost no connection to their personal sense of obligation to preserve the environment.
That obligation is only felt in their home space, even though the South Florida environment overall struggles desperately with water shortages, land encroachment, and a rapidly expanding human population whose activities continuously aggravate an already delicate natural balance.
Abstract:
The deployment of wireless communications, coupled with the popularity of portable devices, has led to significant research in the area of mobile data caching. Prior research has focused on the development of solutions that allow applications to run in wireless environments using proxy-based techniques. Most of these approaches are semantic-based and do not provide adequate support for representing the context of a user (i.e., the interpreted human intention). Although the context may be treated implicitly, it is still crucial to data management. To address this challenge, this dissertation focuses on two questions: how to predict (i) the future location of the user and (ii) the locations of the fetched data for which the queried data item has valid answers. Using this approach, more complete information about the dynamics of an application environment is maintained.

The contribution of this dissertation is a novel data caching mechanism for pervasive computing environments that can adapt dynamically to a mobile user's context. In this dissertation, we design and develop a conceptual model and context-aware protocols for wireless data caching management. Our replacement policy uses the validity of the data fetched from the server and the neighboring locations to decide which of the cache entries is least likely to be needed in the future, and is therefore a good candidate for eviction when cache space is needed. The context-aware prefetching algorithm exploits the query context to effectively guide the prefetching process. The query context is defined using a mobile user's movement pattern and the context of the requested information. Numerical results and simulations show that the proposed prefetching and replacement policies significantly outperform conventional ones.

Anticipated applications of these solutions include biomedical engineering, tele-health, medical information systems, and business.
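The replacement policy's idea — evicting the entry least likely to be needed given the user's predicted location and each item's region of validity — can be sketched with an invented scoring rule. The entry fields and the distance-minus-radius score are assumptions for illustration, not the dissertation's policy:

```python
import math

def evict_candidate(cache, predicted_pos):
    """Choose the cache entry least likely to be needed next.

    Each entry records the center and radius of the region where its
    answer remains valid (hypothetical fields). The entry whose valid
    region lies farthest from the user's predicted position is
    evicted; ties go to the smaller valid region.
    """
    def score(entry):
        (cx, cy), r = entry["valid_center"], entry["valid_radius"]
        d = math.hypot(cx - predicted_pos[0], cy - predicted_pos[1])
        return (d - r, -r)          # farther beyond validity = evict first
    return max(cache, key=lambda name: score(cache[name]))
```

A prefetcher would use the mirror-image rule: fetch items whose valid regions lie along the user's predicted trajectory.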
Abstract:
Amidst concerns about achieving high levels of technology to remain competitive in the global market without compromising economic development, national economies are experiencing a high demand for human capital. As higher education is assumed to be the main source of human capital, this analysis focused on a more specific and less explored aspect of the generally accepted idea that higher education contributes to economic growth. The purpose of this study, therefore, was to find whether higher education also contributes to economic development, and whether that contribution is more substantial in a globalized context.

Consequently, a multiple linear regression analysis was conducted to answer, with statistical significance, the research question: Does higher education contribute to economic development in the context of globalization? The information analyzed was obtained from historical data on 91 selected countries, and the period of the study was 10 years (1990–2000). Some variables, however, were lagged back 5, 10, or 15 years along a 15-year timeframe (1975–1990). The resulting comparative static model was based on the Cobb-Douglas production function and the Solow model to specify economic growth as a function of physical capital, labor, technology, and productivity. Then, formal education, economic development, and globalization were added to the equation.

The findings of this study supported the assumption that the independent contribution of changes in higher education completion and globalization to changes in economic growth is more substantial than the contribution of their interaction. The results also suggested that changes in higher and secondary education completion contribute much more to changes in economic growth in less developed countries than in their more developed counterparts.
As a conclusion, based on the results of this study, I proposed the implementation of public policy in less developed countries to promote and expand adequate secondary and higher education systems, with the purpose of helping to achieve economic development. I also recommended further research on this topic to emphasize the contribution of education to the economy, mainly in less developed countries.
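The comparative static model described above builds on the Cobb-Douglas production function. In its standard log-linear form (textbook symbols, not necessarily the study's exact specification, with the education and globalization terms added schematically):

```latex
Y = A\,K^{\alpha}L^{\beta}
\quad\Longrightarrow\quad
\ln Y = \ln A + \alpha \ln K + \beta \ln L
\quad\longrightarrow\quad
\ln Y = \ln A + \alpha \ln K + \beta \ln L + \gamma\,\mathit{Educ} + \delta\,\mathit{Glob} + \varepsilon
```

Taking logs turns the multiplicative production function into a linear equation whose coefficients can be estimated by the multiple linear regression the study describes.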
Abstract:
The purpose of this study was to ascertain the perceptions of educators at one elementary school regarding the changes in the teaching and learning environment, and their related effects, following the implementation of Florida's A+ high-stakes accountability system. This study also assessed whether these changes were identified by participants as meaningful and enduring, in terms of the definition by Lieberman and Miller (1999). Twenty-one educators, including 17 teachers and four administrators, at Blue Ribbon Elementary School were interviewed. Data were inductively coded and categorized into four major themes: (a) teaching and learning environment consistency, (b) changes in the teaching and learning environment since the implementation of A+, (c) effects of the changes, and (d) significant and enduring change. Findings fell into three categories: (a) identified changes since A+ implementation, (b) effects of changes, and (c) what participants believed was significant and long-term change, which included those characteristics of the school that had been identified as consistent in the teaching and learning environment. Statements of the participants explained their perceptions about what instructional decisions were made in response to the A+ Plan, including the modification of curriculum, the addition or omission of subject matter taught, and the positive or negative impact these decisions had on the teaching and learning environment. It was found that study participants felt all changes and their effects were a direct result of the A+ Plan and viewed many of the changes as being neither significant nor long-term. Analysis of the educators' perceptions of the changes they experienced revealed the overall feeling that the changes were not indicative of what was necessary to make a school successful. For the participants, the changes lacked the characteristics that they had described as vital to what constituted success.
This led to the conclusion that, by Lieberman and Miller's definition, the majority of changes and effects implemented at the school as a result of the mandated A+ Plan were not meaningful and enduring for effective school reform.