7 results for COMBINING DATA
in Digital Commons at Florida International University
Abstract:
This dissertation analyzes both the economics of the defense contracting process and the impact of total dollar obligations on the economies of U.S. states. Using various econometric techniques, I estimate relationships across individual contracts, state-level output, and income inequality, primarily through the use of a dataset on individual contract obligations. The first essay catalogs the distribution of contracts and isolates aspects of the process that contribute to contract dollar obligations. Accordingly, this study describes several characteristics of individual defense contracts from 1966 to 2006: (i) the distribution of contract dollar obligations is extremely rightward skewed; (ii) contracts are unevenly distributed geographically across the United States; (iii) a 10 percent increase in contract duration is associated with a 4 percent increase in costs; (iv) competition does not seem to affect dollar obligations in a substantial way; (v) contract pre-payment financing increases obligations by anywhere from 62 to 380 percent over non-financed contracts. The second essay turns to an aggregate focus and looks at the impact of defense spending on state economic output. The analysis in chapter two estimates the state-level fiscal multiplier, deploying difference-in-differences estimation to filter out potential endogeneity bias. Interstate variation in procurement spending facilitates a natural experiment, focusing on the spike in relative spending in 1982. The state-level relative multiplier estimate here is 1.19, capturing the short-run, impact effect of the 1982 spending spike. Finally, I examine the relationship between defense contracting and income inequality. Military spending has typically been observed to have a negative relationship with income inequality.
The third chapter examines this relationship, combining data on defense procurement with data on income inequality at the state level in a longitudinal analysis across the United States. While the estimates do not suggest a significant relationship for the income share of the top ten percent of households, there is a significant positive relationship between increases in defense procurement and the income share of the top one percent of households.
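The difference-in-differences logic behind the chapter-two multiplier estimate can be sketched in a few lines: compare the change in output for high-procurement states around the spending spike against the change for low-procurement states. The numbers below are synthetic illustrations, not the dissertation's data or its 1.19 estimate.

```python
# Minimal difference-in-differences sketch for a state-level spending shock.
# All figures are hypothetical; they only illustrate the estimator's form.

def did_estimate(y_treat_pre, y_treat_post, y_ctrl_pre, y_ctrl_post):
    """DiD = (treated post - treated pre) - (control post - control pre)."""
    mean = lambda xs: sum(xs) / len(xs)
    return (mean(y_treat_post) - mean(y_treat_pre)) - \
           (mean(y_ctrl_post) - mean(y_ctrl_pre))

# Hypothetical state output growth (%) before/after a 1982-style spending spike.
high_procurement_pre  = [2.0, 2.1, 1.9]
high_procurement_post = [3.4, 3.6, 3.5]
low_procurement_pre   = [2.0, 2.2, 2.1]
low_procurement_post  = [2.3, 2.4, 2.2]

print(did_estimate(high_procurement_pre, high_procurement_post,
                   low_procurement_pre, low_procurement_post))
```

Differencing against the control states is what filters out the common shocks that would otherwise bias a simple before/after comparison.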
Abstract:
This study analyzed the health and overall landcover of citrus crops in Florida. The analysis was completed using Landsat satellite imagery available free of charge from the University of Maryland Global Land Cover Facility. The project hypothesized that combining citrus production (economic) data with citrus area per county derived from spectral signatures would yield correlations between observable spectral reflectance throughout the year and the fiscal impact of citrus on local economies. A positive correlation between these two data types would allow us to predict the economic impact of citrus using spectral data analysis to determine final crop harvests.
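The hypothesized check reduces to a correlation between two per-county series: citrus area derived from spectral classification and citrus production value. A minimal sketch, with entirely made-up county figures:

```python
# Pearson correlation between spectrally derived citrus area and production
# value. The per-county numbers are hypothetical illustrations only.
import math

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-county values: spectral citrus area (ha), production ($M).
area  = [120, 340, 80, 560, 210]
value = [15, 48, 9, 61, 30]
print(pearson_r(area, value))
```

A value near +1 would support using the spectral series as an economic predictor; a value near 0 would not.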
Abstract:
Modern data centers host hundreds of thousands of servers to achieve economies of scale. Such a huge number of servers creates challenges for the data center network (DCN) to provide proportionally large bandwidth. In addition, the deployment of virtual machines (VMs) in data centers raises the requirements for efficient resource allocation and fine-grained resource sharing. Further, the large number of servers and switches in the data center consumes significant amounts of energy. Even though servers become more energy efficient with various energy saving techniques, the DCN still accounts for 20% to 50% of the energy consumed by the entire data center. The objective of this dissertation is to enhance DCN performance as well as its energy efficiency by conducting optimizations on both the host and network sides. First, as the DCN demands huge bisection bandwidth to interconnect all the servers, we propose a parallel packet switch (PPS) architecture that directly processes variable-length packets without segmentation-and-reassembly (SAR). The proposed PPS achieves large bandwidth by combining the switching capacities of multiple fabrics, and it further improves switch throughput by avoiding the padding bits of SAR. Second, since certain resource demands of the VM are bursty and demonstrate a stochastic nature, to satisfy both deterministic and stochastic demands in VM placement, we propose the Max-Min Multidimensional Stochastic Bin Packing (M3SBP) algorithm. M3SBP calculates an equivalent deterministic value for the stochastic demands, and maximizes the minimum resource utilization ratio of each server. Third, to provide necessary traffic isolation for VMs that share the same physical network adapter, we propose the Flow-level Bandwidth Provisioning (FBP) algorithm. By reducing the flow scheduling problem to multiple stages of packet queuing problems, FBP guarantees the provisioned bandwidth and delay performance for each flow.
Finally, while DCNs are typically provisioned with full bisection bandwidth, DCN traffic demonstrates fluctuating patterns; we therefore propose a joint host-network optimization scheme to enhance the energy efficiency of DCNs during off-peak traffic hours. The proposed scheme utilizes a unified representation method that converts the VM placement problem to a routing problem and employs depth-first and best-fit search to find efficient paths for flows.
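Two of the ideas above can be sketched briefly: collapsing a stochastic VM demand into an "equivalent deterministic" value, and best-fit placement of VMs onto servers. The deterministic-equivalent formula used here (mean plus a quantile multiple of the standard deviation) is one common choice, not necessarily the dissertation's exact M3SBP formulation, and all demand figures are hypothetical.

```python
# Sketch: deterministic equivalents for stochastic VM demands, then
# best-fit placement onto unit-capacity servers. Formula and numbers
# are illustrative assumptions, not the dissertation's exact method.

def equivalent_demand(mean, std, z=1.65):
    """Deterministic stand-in for a stochastic demand: mean + z * std
    (z = 1.65 roughly corresponds to a 95% Gaussian quantile)."""
    return mean + z * std

def best_fit(demands, capacity):
    """Place each demand on the feasible server with the least remaining
    room, opening a new server when none fits. Returns server loads."""
    servers = []
    for d in demands:
        feasible = [i for i, load in enumerate(servers) if load + d <= capacity]
        if feasible:
            i = max(feasible, key=lambda i: servers[i])  # tightest fit
            servers[i] += d
        else:
            servers.append(d)
    return servers

# Hypothetical (mean, stddev) CPU demands for four VMs.
vm_specs = [(0.30, 0.05), (0.20, 0.10), (0.40, 0.02), (0.10, 0.05)]
demands = [equivalent_demand(m, s) for m, s in vm_specs]
print(best_fit(demands, capacity=1.0))
```

Provisioning for a high quantile rather than the mean is what lets a deterministic packer still cover bursty demand most of the time.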
Abstract:
Ensemble stream modeling and data-cleaning are sensor information processing systems that have different training and testing methods by which their goals are cross-validated. This research examines a mechanism that seeks to extract novel patterns by generating ensembles from data. The main goal of label-less stream processing is to process the sensed events to eliminate the noises that are uncorrelated, and to choose the most likely model without overfitting, thus obtaining higher model confidence. Higher-quality streams can be realized by combining many short streams into an ensemble that has the desired quality. The framework for the investigation is an existing data mining tool. First, to accommodate feature extraction for events such as a bush or natural forest fire, we take the burnt area (BA*), a sensed ground truth obtained from logs, as our target variable. Even though this is an obvious model choice, the results are disappointing, for two reasons: one, the histogram of fire activity is highly skewed; two, the measured sensor parameters are highly correlated. Since using non-descriptive features does not yield good results, we resort to temporal features. By doing so we carefully eliminate the averaging effects; the resulting histogram is more satisfactory, and conceptual knowledge is learned from sensor streams. Second is the process of feature induction by cross-validating attributes with single or multi-target variables to minimize training error. We use the F-measure score, which combines precision and recall, to determine the false alarm rate of fire events. The multi-target data-cleaning trees use the information purity of the target leaf nodes to learn higher-order features. A sensitive variance measure such as the F-test is performed at each node's split to select the best attribute. The ensemble stream model approach proved to improve when using complicated features with a simpler tree classifier.
The ensemble framework for data-cleaning and the enhancements to quantify quality of fitness (30% spatial, 10% temporal, and 90% mobility reduction) of sensors led to the formation of streams for sensor-enabled applications. This further motivates the novelty of stream quality labeling and its importance in handling the vast amounts of real-time mobile streams generated today.
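The F-measure mentioned above is the harmonic mean of precision and recall. A minimal sketch, with made-up detection counts for a fire-event classifier:

```python
# F-measure (F1): harmonic mean of precision and recall.
# The detection counts below are hypothetical illustrations.

def f_measure(tp, fp, fn):
    precision = tp / (tp + fp)   # fraction of alarms that were real fires
    recall = tp / (tp + fn)      # fraction of real fires that were detected
    return 2 * precision * recall / (precision + recall)

# Hypothetical detector: 40 fire events correctly flagged, 10 false alarms,
# 10 real events missed.
print(f_measure(tp=40, fp=10, fn=10))
```

Because the harmonic mean punishes imbalance, a detector cannot score well by trading many false alarms for a few extra detections, which is why the score suits skewed event histograms like fire activity.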