12 resultados para High-dimensional Data
em Digital Commons at Florida International University
Resumo:
Due to the rapid advances in computing and sensing technologies, enormous amounts of data are being generated everyday in various applications. The integration of data mining and data visualization has been widely used to analyze these massive and complex data sets to discover hidden patterns. For both data mining and visualization to be effective, it is important to include the visualization techniques in the mining process and to generate the discovered patterns for a more comprehensive visual view. In this dissertation, four related problems: dimensionality reduction for visualizing high dimensional datasets, visualization-based clustering evaluation, interactive document mining, and multiple clusterings exploration are studied to explore the integration of data mining and data visualization. In particular, we 1) propose an efficient feature selection method (reliefF + mRMR) for preprocessing high dimensional datasets; 2) present DClusterE to integrate cluster validation with user interaction and provide rich visualization tools for users to examine document clustering results from multiple perspectives; 3) design two interactive document summarization systems to involve users efforts and generate customized summaries from 2D sentence layouts; and 4) propose a new framework which organizes the different input clusterings into a hierarchical tree structure and allows for interactive exploration of multiple clustering solutions.
Resumo:
Lake Analyzer is a numerical code coupled with supporting visualization tools for determining indices of mixing and stratification that are critical to the biogeochemical cycles of lakes and reservoirs. Stability indices, including Lake Number, Wedderburn Number, Schmidt Stability, and thermocline depth are calculated according to established literature definitions and returned to the user in a time series format. The program was created for the analysis of high-frequency data collected from instrumented lake buoys, in support of the emerging field of aquatic sensor network science. Available outputs for the Lake Analyzer program are: water temperature (error-checked and/or down-sampled), wind speed (error-checked and/or down-sampled), metalimnion extent (top and bottom), thermocline depth, friction velocity, Lake Number, Wedderburn Number, Schmidt Stability, mode-1 vertical seiche period, and Brunt-Väisälä buoyancy frequency. Secondary outputs for several of these indices delineate the parent thermocline depth (seasonal thermocline) from the shallower secondary or diurnal thermocline. Lake Analyzer provides a program suite and best practices for the comparison of mixing and stratification indices in lakes across gradients of climate, hydro-physiography, and time, and enables a more detailed understanding of the resulting biogeochemical transformations at different spatial and temporal scales.
Resumo:
With the recent explosion in the complexity and amount of digital multimedia data, there has been a huge impact on the operations of various organizations in distinct areas, such as government services, education, medical care, business, entertainment, etc. To satisfy the growing demand of multimedia data management systems, an integrated framework called DIMUSE is proposed and deployed for distributed multimedia applications to offer a full scope of multimedia related tools and provide appealing experiences for the users. This research mainly focuses on video database modeling and retrieval by addressing a set of core challenges. First, a comprehensive multimedia database modeling mechanism called Hierarchical Markov Model Mediator (HMMM) is proposed to model high dimensional media data including video objects, low-level visual/audio features, as well as historical access patterns and frequencies. The associated retrieval and ranking algorithms are designed to support not only the general queries, but also the complicated temporal event pattern queries. Second, system training and learning methodologies are incorporated such that user interests are mined efficiently to improve the retrieval performance. Third, video clustering techniques are proposed to continuously increase the searching speed and accuracy by architecting a more efficient multimedia database structure. A distributed video management and retrieval system is designed and implemented to demonstrate the overall performance. The proposed approach is further customized for a mobile-based video retrieval system to solve the perception subjectivity issue by considering individual user's profile. Moreover, to deal with security and privacy issues and concerns in distributed multimedia applications, DIMUSE also incorporates a practical framework called SMARXO, which supports multilevel multimedia security control. SMARXO efficiently combines role-based access control (RBAC), XML and object-relational database management system (ORDBMS) to achieve the target of proficient security control. A distributed multimedia management system named DMMManager (Distributed MultiMedia Manager) is developed with the proposed framework DEMUR; to support multimedia capturing, analysis, retrieval, authoring and presentation in one single framework.
Resumo:
The need for a high quality tourism database is well known. For example, planners and managers need high quality data for budgeting, forecasting, planning marketing and advertising strategies, and staffing. Thus the concepts of quality and need are intertwined to pose a problem to the tourism professional, be they private sector or public sector employees. One could argue that collaboration by public and private sector tourism professionals could provide the best sources and uses of high quality tourism data. This discussion proposes just such a collaboration and a detailed methodology for operationalizing this arrangement.
Resumo:
An automated on-line SPE-LC-MS/MS method was developed for the quantitation of multiple classes of antibiotics in environmental waters. High sensitivity in the low ng/L range was accomplished by using large volume injections with 10-mL of sample. Positive confirmation of analytes was achieved using two selected reaction monitoring (SRM) transitions per antibiotic and quantitation was performed using an internal standard approach. Samples were extracted using online solid phase extraction, then using column switching technique; extracted samples were immediately passed through liquid chromatography and analyzed by tandem mass spectrometry. The total run time per each sample was 20 min. The statistically calculated method detection limits for various environmental samples were between 1.2 and 63 ng/L. Furthermore, the method was validated in terms of precision, accuracy and linearity. ^ The developed analytical methodology was used to measure the occurrence of antibiotics in reclaimed waters (n=56), surface waters (n=53), ground waters (n=8) and drinking waters (n=54) collected from different parts of South Florida. In reclaimed waters, the most frequently detected antibiotics were nalidixic acid, erythromycin, clarithromycin, azithromycin trimethoprim, sulfamethoxazole and ofloxacin (19.3-604.9 ng/L). Detection of antibiotics in reclaimed waters indicates that they can't be completely removed by conventional wastewater treatment process. Furthermore, the average mass loads of antibiotics released into the local environment through reclaimed water were estimated as 0.248 Kg/day. Among the surface waters samples, Miami River (reaching up to 580 ng/L) and Black Creek canal (up to 124 ng/L) showed highest concentrations of antibiotics. No traces of antibiotics were found in ground waters. On the other hand, erythromycin (monitored as anhydro erythromycin) was detected in 82% of the drinking water samples (n.d-66 ng/L). The developed approach is suitable for both research and monitoring applications.^ Major metabolites of antibiotics in reclaimed wates were identified and quantified using high resolution benchtop Q-Exactive orbitrap mass spectrometer. A phase I metabolite of erythromycin was tentatively identified in full scan based on accurate mass measurement. Using extracted ion chromatogram (XIC), high resolution data-dependent MS/MS spectra and metabolic profiling software the metabolite was identified as desmethyl anhydro erythromycin with molecular formula C36H63NO12 and m/z 702.4423. The molar concentration of the metabolite to erythromycin was in the order of 13 %. To my knowledge, this is the first known report on this metabolite in reclaimed water. Another compound acetyl-sulfamethoxazole, a phase II metabolite of sulfamethoxazole was also identified in reclaimed water and mole fraction of the metabolite represent 36 %, of the cumulative sulfamethoxazole concentration. The results were illustrating the importance to include metabolites also in the routine analysis to obtain a mass balance for better understanding of the occurrence, fate and distribution of antibiotics in the environment. ^ Finally, all the antibiotics detected in reclaimed and surface waters were investigated to assess the potential risk to the aquatic organisms. The surface water antibiotic concentrations that represented the real time exposure conditions revealed that the macrolide antibiotics, erythromycin, clarithromycin and tylosin along with quinolone antibiotic, ciprofloxacin were suspected to induce high toxicity to aquatic biota. Preliminary results showing that, among the antibiotic groups tested, macrolides posed the highest ecological threat, and therefore, they may need to be further evaluated with, long-term exposure studies considering bioaccumulation factors and more number of species selected. Overall, the occurrence of antibiotics in aquatic environment is posing an ecological health concern.^
Resumo:
An automated on-line SPE-LC-MS/MS method was developed for the quantitation of multiple classes of antibiotics in environmental waters. High sensitivity in the low ng/L range was accomplished by using large volume injections with 10-mL of sample. Positive confirmation of analytes was achieved using two selected reaction monitoring (SRM) transitions per antibiotic and quantitation was performed using an internal standard approach. Samples were extracted using online solid phase extraction, then using column switching technique; extracted samples were immediately passed through liquid chromatography and analyzed by tandem mass spectrometry. The total run time per each sample was 20 min. The statistically calculated method detection limits for various environmental samples were between 1.2 and 63 ng/L. Furthermore, the method was validated in terms of precision, accuracy and linearity. The developed analytical methodology was used to measure the occurrence of antibiotics in reclaimed waters (n=56), surface waters (n=53), ground waters (n=8) and drinking waters (n=54) collected from different parts of South Florida. In reclaimed waters, the most frequently detected antibiotics were nalidixic acid, erythromycin, clarithromycin, azithromycin trimethoprim, sulfamethoxazole and ofloxacin (19.3-604.9 ng/L). Detection of antibiotics in reclaimed waters indicates that they can’t be completely removed by conventional wastewater treatment process. Furthermore, the average mass loads of antibiotics released into the local environment through reclaimed water were estimated as 0.248 Kg/day. Among the surface waters samples, Miami River (reaching up to 580 ng/L) and Black Creek canal (up to 124 ng/L) showed highest concentrations of antibiotics. No traces of antibiotics were found in ground waters. On the other hand, erythromycin (monitored as anhydro erythromycin) was detected in 82% of the drinking water samples (n.d-66 ng/L). The developed approach is suitable for both research and monitoring applications. Major metabolites of antibiotics in reclaimed wates were identified and quantified using high resolution benchtop Q-Exactive orbitrap mass spectrometer. A phase I metabolite of erythromycin was tentatively identified in full scan based on accurate mass measurement. Using extracted ion chromatogram (XIC), high resolution data-dependent MS/MS spectra and metabolic profiling software the metabolite was identified as desmethyl anhydro erythromycin with molecular formula C36H63NO12 and m/z 702.4423. The molar concentration of the metabolite to erythromycin was in the order of 13 %. To my knowledge, this is the first known report on this metabolite in reclaimed water. Another compound acetyl-sulfamethoxazole, a phase II metabolite of sulfamethoxazole was also identified in reclaimed water and mole fraction of the metabolite represent 36 %, of the cumulative sulfamethoxazole concentration. The results were illustrating the importance to include metabolites also in the routine analysis to obtain a mass balance for better understanding of the occurrence, fate and distribution of antibiotics in the environment. Finally, all the antibiotics detected in reclaimed and surface waters were investigated to assess the potential risk to the aquatic organisms. The surface water antibiotic concentrations that represented the real time exposure conditions revealed that the macrolide antibiotics, erythromycin, clarithromycin and tylosin along with quinolone antibiotic, ciprofloxacin were suspected to induce high toxicity to aquatic biota. Preliminary results showing that, among the antibiotic groups tested, macrolides posed the highest ecological threat, and therefore, they may need to be further evaluated with, long-term exposure studies considering bioaccumulation factors and more number of species selected. Overall, the occurrence of antibiotics in aquatic environment is posing an ecological health concern.
Resumo:
This work is the first work using patterned soft underlayers in multilevel three-dimensional vertical magnetic data storage systems. The motivation stems from an exponentially growing information stockpile, and a corresponding need for more efficient storage devices with higher density. The world information stockpile currently exceeds 150EB (ExaByte=1x1018Bytes); most of which is in analog form. Among the storage technologies (semiconductor, optical and magnetic), magnetic hard disk drives are posed to occupy a big role in personal, network as well as corporate storage. However; this mode suffers from a limit known as the Superparamagnetic limit; which limits achievable areal density due to fundamental quantum mechanical stability requirements. There are many viable techniques considered to defer superparamagnetism into the 100's of Gbit/in2 such as: patterned media, Heat-Assisted Magnetic Recording (HAMR), Self Organized Magnetic Arrays (SOMA), antiferromagnetically coupled structures (AFC), and perpendicular magnetic recording. Nonetheless, these techniques utilize a single magnetic layer; and can thusly be viewed as two-dimensional in nature. In this work a novel three-dimensional vertical magnetic recording approach is proposed. This approach utilizes the entire thickness of a magnetic multilayer structure to store information; with potential areal density well into the Tbit/in2 regime. ^ There are several possible implementations for 3D magnetic recording; each presenting its own set of requirements, merits and challenges. The issues and considerations pertaining to the development of such systems will be examined, and analyzed using empirical and numerical analysis techniques. Two novel key approaches are proposed and developed: (1) Patterned soft underlayer (SUL) which allows for enhanced recording of thicker media, (2) A combinatorial approach for 3D media development that facilitates concurrent investigation of various film parameters on a predefined performance metric. A case study is presented using combinatorial overcoats of Tantalum and Zirconium Oxides for corrosion protection in magnetic media. ^ Feasibility of 3D recording is demonstrated, and an emphasis on 3D media development is emphasized as a key prerequisite. Patterned SUL shows significant enhancement over conventional "un-patterned" SUL, and shows that geometry can be used as a design tool to achieve favorable field distribution where magnetic storage and magnetic phenomena are involved. ^
Resumo:
Recent advances in airborne Light Detection and Ranging (LIDAR) technology allow rapid and inexpensive measurements of topography over large areas. Airborne LIDAR systems usually return a 3-dimensional cloud of point measurements from reflective objects scanned by the laser beneath the flight path. This technology is becoming a primary method for extracting information of different kinds of geometrical objects, such as high-resolution digital terrain models (DTMs), buildings and trees, etc. In the past decade, LIDAR gets more and more interest from researchers in the field of remote sensing and GIS. Compared to the traditional data sources, such as aerial photography and satellite images, LIDAR measurements are not influenced by sun shadow and relief displacement. However, voluminous data pose a new challenge for automated extraction the geometrical information from LIDAR measurements because many raster image processing techniques cannot be directly applied to irregularly spaced LIDAR points. ^ In this dissertation, a framework is proposed to filter out information about different kinds of geometrical objects, such as terrain and buildings from LIDAR automatically. They are essential to numerous applications such as flood modeling, landslide prediction and hurricane animation. The framework consists of several intuitive algorithms. Firstly, a progressive morphological filter was developed to detect non-ground LIDAR measurements. By gradually increasing the window size and elevation difference threshold of the filter, the measurements of vehicles, vegetation, and buildings are removed, while ground data are preserved. Then, building measurements are identified from no-ground measurements using a region growing algorithm based on the plane-fitting technique. Raw footprints for segmented building measurements are derived by connecting boundary points and are further simplified and adjusted by several proposed operations to remove noise, which is caused by irregularly spaced LIDAR measurements. To reconstruct 3D building models, the raw 2D topology of each building is first extracted and then further adjusted. Since the adjusting operations for simple building models do not work well on 2D topology, 2D snake algorithm is proposed to adjust 2D topology. The 2D snake algorithm consists of newly defined energy functions for topology adjusting and a linear algorithm to find the minimal energy value of 2D snake problems. Data sets from urbanized areas including large institutional, commercial, and small residential buildings were employed to test the proposed framework. The results demonstrated that the proposed framework achieves a very good performance. ^
Resumo:
Currently the data storage industry is facing huge challenges with respect to the conventional method of recording data known as longitudinal magnetic recording. This technology is fast approaching a fundamental physical limit, known as the superparamagnetic limit. A unique way of deferring the superparamagnetic limit incorporates the patterning of magnetic media. This method exploits the use of lithography tools to predetermine the areal density. Various nanofabrication schemes are employed to pattern the magnetic material are Focus Ion Beam (FIB), E-beam Lithography (EBL), UV-Optical Lithography (UVL), Self-assembled Media Synthesis and Nanoimprint Lithography (NIL). Although there are many challenges to manufacturing patterned media, the large potential gains offered in terms of areal density make it one of the most promising new technologies on the horizon for future hard disk drives. Thus, this dissertation contributes to the development of future alternative data storage devices and deferring the superparamagnetic limit by designing and characterizing patterned magnetic media using a novel nanoimprint replication process called "Step and Flash Imprint lithography". As opposed to hot embossing and other high temperature-low pressure processes, SFIL can be performed at low pressure and room temperature. Initial experiments carried out, consisted of process flow design for the patterned structures on sputtered Ni-Fe thin films. The main one being the defectivity analysis for the SFIL process conducted by fabricating and testing devices of varying feature sizes (50 nm to 1 μm) and inspecting them optically as well as testing them electrically. Once the SFIL process was optimized, a number of Ni-Fe coated wafers were imprinted with a template having the patterned topography. A minimum feature size of 40 nm was obtained with varying pitch (1:1, 1:1.5, 1:2, and 1:3). The Characterization steps involved extensive SEM study at each processing step as well as Atomic Force Microscopy (AFM) and Magnetic Force Microscopy (MFM) analysis.
Resumo:
Fisheries independent data on relatively unstudied nekton communities were used to explore the efficacy of new tools to be applied in the investigation of shallow coastal coral reef habitats. These data obtained through concurrent diver visual and acoustic surveys provided descriptions of spatial community distribution patterns across seasonal temporal scales in a previously undocumented region. Fish density estimates by both diver and acoustic methodologies showed a general agreement in ability to detect distributional patterns across reef tracts, though magnitude of density estimates were different. Fish communities in southeastern Florida showed significant trends in spatial distribution and seasonal abundance, with higher estimates of biomass obtained in the dry season. Further, community composition shifted across reef tracts and seasons as a function of the movements of several key reef species.
Resumo:
Buildings and other infrastructures located in the coastal regions of the US have a higher level of wind vulnerability. Reducing the increasing property losses and causalities associated with severe windstorms has been the central research focus of the wind engineering community. The present wind engineering toolbox consists of building codes and standards, laboratory experiments, and field measurements. The American Society of Civil Engineers (ASCE) 7 standard provides wind loads only for buildings with common shapes. For complex cases it refers to physical modeling. Although this option can be economically viable for large projects, it is not cost-effective for low-rise residential houses. To circumvent these limitations, a numerical approach based on the techniques of Computational Fluid Dynamics (CFD) has been developed. The recent advance in computing technology and significant developments in turbulence modeling is making numerical evaluation of wind effects a more affordable approach. The present study targeted those cases that are not addressed by the standards. These include wind loads on complex roofs for low-rise buildings, aerodynamics of tall buildings, and effects of complex surrounding buildings. Among all the turbulence models investigated, the large eddy simulation (LES) model performed the best in predicting wind loads. The application of a spatially evolving time-dependent wind velocity field with the relevant turbulence structures at the inlet boundaries was found to be essential. All the results were compared and validated with experimental data. The study also revealed CFD’s unique flow visualization and aerodynamic data generation capabilities along with a better understanding of the complex three-dimensional aerodynamics of wind-structure interactions. With the proper modeling that realistically represents the actual turbulent atmospheric boundary layer flow, CFD can offer an economical alternative to the existing wind engineering tools. CFD’s easy accessibility is expected to transform the practice of structural design for wind, resulting in more wind-resilient and sustainable systems by encouraging optimal aerodynamic and sustainable structural/building design. Thus, this method will help ensure public safety and reduce economic losses due to wind perils.