876 results for Distributed data
Abstract:
In WSNs the communication traffic is often correlated in time and space, with multiple nodes in close proximity starting to transmit simultaneously. Such a situation is known as spatially correlated contention. Random access methods for resolving such contention suffer from a high collision rate, whereas traditional distributed TDMA scheduling techniques primarily try to improve network capacity by reducing the schedule length. Usually, spatially correlated contention persists only for a short duration, so generating an optimal or suboptimal schedule is not very useful. Additionally, if an algorithm takes a very long time to produce a schedule, it not only introduces additional delay in the data transfer but also consumes more energy. In this paper, we present a distributed TDMA slot scheduling (DTSS) algorithm that considerably reduces the time required to perform scheduling, while restricting the schedule length to the maximum degree of the interference graph. The DTSS algorithm supports unicast, multicast, and broadcast scheduling simultaneously, without any modification to the protocol. We have analyzed the protocol's average-case performance and also simulated it using the Castalia simulator to evaluate its runtime performance. Both analytical and simulation results show that our protocol considerably reduces the time required for scheduling.
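The schedule-length bound the abstract mentions can be illustrated with a simple greedy slot assignment on an interference graph: no node ever needs a slot index above its own degree, so at most max_degree + 1 slots are used. This is a centralized, hypothetical stand-in for illustration, not the distributed DTSS protocol itself; the graph below is invented.

```python
# Greedy TDMA slot assignment on an interference graph (illustrative
# sketch, not the DTSS algorithm): each node takes the smallest slot
# not used by an interfering neighbor, so schedule length is bounded
# by max_degree + 1.

def assign_slots(interference):
    """interference: dict node -> set of interfering nodes."""
    slots = {}
    # Visit high-degree nodes first (a common greedy heuristic).
    for node in sorted(interference, key=lambda n: -len(interference[n])):
        taken = {slots[nbr] for nbr in interference[node] if nbr in slots}
        slot = 0
        while slot in taken:
            slot += 1
        slots[node] = slot
    return slots

graph = {
    'a': {'b', 'c'},
    'b': {'a', 'c'},
    'c': {'a', 'b', 'd'},
    'd': {'c'},
}
schedule = assign_slots(graph)
# No two interfering nodes share a slot, and at most max_degree + 1
# (here 3 + 1 = 4) slots are used.
assert all(schedule[u] != schedule[v] for u in graph for v in graph[u])
assert len(set(schedule.values())) <= 4
```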
Abstract:
Opportunistic selection in multi-node wireless systems improves system performance by selecting the "best" node and using it for data transmission. In these systems, each node has a real-valued local metric, which is a measure of its ability to improve system performance. Our goal is to identify the best node, i.e., the one with the largest metric. We propose, analyze, and optimize a new distributed, yet simple, node selection scheme that combines the timer scheme with power control. In it, each node sets a timer and a transmit power level as a function of its metric. The power control is designed such that the best node is captured even if other nodes transmit simultaneously with it. We develop several structural properties of the optimal metric-to-timer-and-power mapping, which maximizes the probability of selecting the best node. These significantly reduce the computational complexity of finding the optimal mapping and yield valuable insights about it. We show that the proposed scheme is scalable and significantly outperforms the conventional timer scheme. We also investigate the effect of the number of receive power levels. Furthermore, we find that the practical peak power constraint has a negligible impact on the performance of the scheme.
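The conventional timer scheme that the proposed method builds on can be sketched in a few lines: each node maps its metric to a timer through a monotonically decreasing function, so the node with the largest metric transmits first. The power-control layer the paper adds is not modeled here; the mapping, `T_MAX`, and the node metrics are all illustrative assumptions.

```python
# Minimal sketch of the timer-based selection scheme (without the
# paper's power control). Larger metric -> shorter timer, so the best
# node's transmission is heard first. All names/values are illustrative.

T_MAX = 1.0  # assumed maximum timer duration

def metric_to_timer(metric):
    # Monotonically decreasing mapping; metric assumed in (0, 1].
    return T_MAX * (1.0 - metric)

metrics = {'n1': 0.3, 'n2': 0.9, 'n3': 0.6}
timers = {node: metric_to_timer(m) for node, m in metrics.items()}
winner = min(timers, key=timers.get)  # first timer to expire
assert winner == 'n2'  # the node with the largest metric is selected
```

In the real scheme the mapping is quantized (timers can only be distinguished up to a resolution), which is exactly what makes optimizing the metric-to-timer mapping non-trivial.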
Abstract:
Time division multiple access (TDMA) based channel access mechanisms perform better than contention-based mechanisms in terms of channel utilization, reliability, and power consumption, especially for high data rate applications in wireless sensor networks (WSNs). Most existing distributed TDMA scheduling techniques can be classified as either static or dynamic. The primary purpose of static TDMA scheduling algorithms is to improve channel utilization by generating a schedule of smaller length. However, they usually take longer to produce a schedule and hence are not suitable for WSNs in which the network topology changes dynamically. On the other hand, dynamic TDMA scheduling algorithms generate a schedule quickly, but they are not efficient in terms of the generated schedule length. In this paper, we propose a novel scheme for TDMA scheduling in WSNs that can generate a compact schedule similar to static scheduling algorithms, while its runtime performance matches that of dynamic scheduling algorithms. Furthermore, the proposed distributed TDMA scheduling algorithm can trade off schedule length against the time required to generate the schedule. This allows WSN developers to tune the performance according to the requirements of the prevalent WSN applications and the need to perform re-scheduling. Finally, the proposed TDMA scheduling is tolerant to packet loss caused by an erroneous wireless channel. The algorithm has been simulated using the Castalia simulator to compare its performance with that of others in terms of generated schedule length and the time required to generate the TDMA schedule. Simulation results show that the proposed algorithm generates a compact schedule in very little time.
Abstract:
In the current state of the art, it remains an open problem to detect damage from partial ultrasonic scan data, and from measurements at a coarser spatial scale, when the location of damage is not known. In the present paper, a recently developed finite element based model reduction scheme in the frequency domain, which employs master degrees of freedom covering the surface scan region of interest, is reported in the context of non-contact ultrasonic guided wave based inspection. The surface scan region of interest is partitioned into master and slave degrees of freedom. An element-wise damage factor is derived that represents the damage state over distributed areas or sharp discontinuities at inter-element boundaries (for cracks). Laser Doppler Vibrometer (LDV) scan data obtained from a plate-type structure with an inaccessible surface line crack are considered, along with the developed reduced-order damage model, to analyze the extent of scan data dimensional reduction. The proposed technique is useful in problems where non-contact monitoring of complex structural parts is extremely important and, at the same time, LDV scans can be performed on accessible surfaces only.
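The master/slave partitioning described above is the same partitioning used in classical static (Guyan-type) condensation, which can be sketched as a Schur complement of the stiffness matrix. The paper's frequency-domain reduction is more involved; this numpy sketch shows only the static step, with an invented 4-DOF stiffness matrix.

```python
# Static (Guyan-type) condensation onto master DOFs: a minimal sketch
# of the master/slave partitioning idea. K and the DOF split are
# illustrative, not taken from the paper.
import numpy as np

K = np.array([[ 4., -1.,  0., -1.],
              [-1.,  4., -1.,  0.],
              [ 0., -1.,  4., -1.],
              [-1.,  0., -1.,  4.]])
master = [0, 1]   # DOFs kept (e.g. on the scanned surface)
slave = [2, 3]    # DOFs condensed out

Kmm = K[np.ix_(master, master)]
Kms = K[np.ix_(master, slave)]
Ksm = K[np.ix_(slave, master)]
Kss = K[np.ix_(slave, slave)]

# Condensed stiffness: K_red = K_mm - K_ms @ inv(K_ss) @ K_sm
K_red = Kmm - Kms @ np.linalg.solve(Kss, Ksm)
assert K_red.shape == (2, 2)
```

The Schur complement of a symmetric positive definite matrix stays symmetric positive definite, so the reduced model remains a valid stiffness operator on the master DOFs.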
Abstract:
To be in compliance with the Endangered Species Act and the Marine Mammal Protection Act, the United States Department of the Navy is required to assess the potential environmental impacts of conducting at-sea training operations on sea turtles and marine mammals. Limited recent and area-specific density data of sea turtles and dolphins exist for many of the Navy’s operations areas (OPAREAs), including the Marine Corps Air Station (MCAS) Cherry Point OPAREA, which encompasses portions of Core and Pamlico Sounds, North Carolina. Aerial surveys were conducted to document the seasonal distribution and estimated density of sea turtles and dolphins within Core Sound and portions of Pamlico Sound, and coastal waters extending one mile offshore. Sea Surface Temperature (SST) data for each survey were extracted from 1.4 km/pixel resolution Advanced Very High Resolution Radiometer remote images. A total of 92 turtles and 1,625 dolphins were sighted during 41 aerial surveys, conducted from July 2004 to April 2006. In the spring (March – May; 7.9°C to 21.7°C mean SST), the majority of turtles sighted were along the coast, mainly from the northern Core Banks northward to Cape Hatteras. By the summer (June – Aug.; 25.2°C to 30.8°C mean SST), turtles were fairly evenly dispersed along the entire survey range of the coast and Pamlico Sound, with only a few sightings in Core Sound. In the autumn (Sept. – Nov.; 9.6°C to 29.6°C mean SST), the majority of turtles sighted were along the coast and in eastern Pamlico Sound; however, fewer turtles were observed along the coast than in the summer. No turtles were seen during the winter surveys (Dec. – Feb.; 7.6°C to 11.2°C mean SST). The estimated mean surface density of turtles was highest along the coast in the summer of 2005 (0.615 turtles/km², SE = 0.220). In Core and Pamlico Sounds the highest mean surface density occurred during the autumn of 2005 (0.016 turtles/km², SE = 0.009). 
The mean seasonal abundance estimates were always highest in the coastal region, except in the winter when turtles were not sighted in either region. For Pamlico Sound, surface densities were always greater in the eastern than western section. The range of mean temperatures at which turtles were sighted was 9.68°C to 30.82°C. The majority of turtles sighted were within water ≥ 11°C. Dolphins were observed within estuarine waters and along the coast year-round; however, there were some general seasonal movements. In particular, during the summer sightings decreased along the coast and dolphins were distributed throughout Core and Pamlico Sounds, while in the winter the majority of dolphins were located along the coast and in southeastern Pamlico Sound. Although relative numbers changed seasonally between these areas, the estimated mean surface density of dolphins was highest along the coast in the spring of 2006 (9.564 dolphins/km², SE = 5.571). In Core and Pamlico Sounds the highest mean surface density occurred during the autumn of 2004 (0.192 dolphins/km², SE = 0.066). The estimated mean surface density of dolphins was lowest along the coast in the summer of 2004 (0.461 dolphins/km², SE = 0.294). The estimated mean surface density of dolphins was lowest in Core and Pamlico Sounds in the summer of 2005 (0.024 dolphins/km², SE = 0.011). In Pamlico Sound, estimated surface densities were greater in the eastern section except in the autumn. Dolphins were sighted throughout the entire range of mean SST (7.60°C to 30.82°C), with a tendency towards fewer dolphins sighted as water temperatures increased. Based on the findings of this study, sea turtles are most likely to be encountered within the OPAREAs when SST is ≥ 11°C. Since sea turtle distributions are generally limited by water temperature, knowing the SST of a given area is a useful predictor of sea turtle presence. 
Since dolphins were observed within estuarine waters year-round and throughout the entire range of mean SSTs, they could likely be encountered in the OPAREAs at any time of the year. Although our findings indicated that the greatest number of dolphins are present in the winter and the fewest in the summer, their movements may also be related to other factors, such as the availability of prey.
Abstract:
Two high-frequency (HF) radar stations were installed on the coast of the south-eastern Bay of Biscay in 2009, providing high spatial and temporal resolution and large spatial coverage of currents in the area for the first time. This has made it possible to quantitatively assess the air-sea interaction patterns and timescales for the period 2009-2010. The analysis was conducted using the Barnett-Preisendorfer approach to canonical correlation analysis (CCA) of reanalysis surface winds and HF radar-derived surface currents. The CCA yields two canonical patterns: the first wind-current interaction pattern corresponds to the classical Ekman drift at the sea surface, whilst the second describes an anticyclonic/cyclonic surface circulation. The results obtained demonstrate that local winds play an important role in driving the upper water circulation. The wind-current interaction timescales are mainly related to diurnal breezes and synoptic variability. In particular, the breezes force diurnal currents in waters of the continental shelf and slope of the south-eastern Bay. It is concluded that the breezes may force diurnal currents over considerably wider areas than that covered by the HF radar, considering that the northern and southern continental shelves of the Bay exhibit stronger diurnal than annual wind amplitudes.
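The Barnett–Preisendorfer approach pre-filters both fields with PCA before the canonical correlation step; the core CCA computation it builds on can be sketched directly with numpy. The code below whitens each (synthetic, invented) data set and reads the canonical correlations off the singular values of the whitened cross-covariance; it is a generic CCA sketch, not the wind/current analysis itself.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations between data matrices X (n x p), Y (n x q)."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Cxx = X.T @ X / (n - 1)
    Cyy = Y.T @ Y / (n - 1)
    Cxy = X.T @ Y / (n - 1)

    def inv_sqrt(C):
        w, V = np.linalg.eigh(C)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    # Singular values of the whitened cross-covariance = canonical correlations.
    M = inv_sqrt(Cxx) @ Cxy @ inv_sqrt(Cyy)
    return np.linalg.svd(M, compute_uv=False)

# Synthetic "wind" and "current" features sharing one underlying signal z.
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 1))
X = np.hstack([z + 0.1 * rng.normal(size=(500, 1)),
               rng.normal(size=(500, 1))])
Y = np.hstack([z + 0.1 * rng.normal(size=(500, 1)),
               rng.normal(size=(500, 1))])
rho = canonical_correlations(X, Y)
assert rho[0] > 0.9  # the shared signal appears as the leading canonical mode
```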
Abstract:
Hyper-spectral data allow the construction of more robust statistical models of material properties than the standard tri-chromatic color representation. However, because of the large dimensionality and complexity of hyper-spectral data, the extraction of robust features (image descriptors) is not a trivial issue. Thus, to facilitate efficient feature extraction, decorrelation techniques are commonly applied to reduce the dimensionality of the hyper-spectral data, with the aim of generating compact and highly discriminative image descriptors. Current methodologies for data decorrelation, such as principal component analysis (PCA), linear discriminant analysis (LDA), wavelet decomposition (WD), or band selection methods, require complex and subjective training procedures; in addition, the compressed spectral information is not directly related to the physical (spectral) characteristics of the analyzed materials. The major objective of this article is to introduce and evaluate a new data decorrelation methodology using an approach that closely emulates human vision. The proposed data decorrelation scheme has been employed to optimally minimize the amount of redundant information contained in the highly correlated hyper-spectral bands and has been comprehensively evaluated in the context of non-ferrous material classification.
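PCA, the first baseline the article names, decorrelates bands by projecting pixels onto the eigenvectors of the band covariance matrix. A minimal sketch with synthetic, invented "bands" (this illustrates the baseline, not the proposed vision-inspired scheme):

```python
import numpy as np

# PCA decorrelation of hyper-spectral pixels: after projection onto the
# covariance eigenvectors, the transformed bands are uncorrelated.
rng = np.random.default_rng(1)
mixing = np.array([[1.0, 0.8, 0.6],
                   [0.0, 0.5, 0.4],
                   [0.0, 0.0, 0.2]])
pixels = rng.normal(size=(1000, 3)) @ mixing   # correlated synthetic "bands"

X = pixels - pixels.mean(axis=0)
cov = X.T @ X / (X.shape[0] - 1)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]          # strongest components first
scores = X @ eigvecs[:, order]             # decorrelated representation

# Off-diagonal covariance of the scores is (numerically) zero.
C = scores.T @ scores / (scores.shape[0] - 1)
assert np.allclose(C - np.diag(np.diag(C)), 0, atol=1e-8)
```

The article's criticism applies here directly: the principal components are defined by the data's statistics, so they carry no fixed physical (spectral) meaning across materials.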
Abstract:
Smartphones and other powerful sensor-equipped consumer devices make it possible to sense the physical world at an unprecedented scale. Nearly 2 million Android and iOS devices are activated every day, each carrying numerous sensors and a high-speed internet connection. Whereas traditional sensor networks have typically deployed a fixed number of devices to sense a particular phenomenon, community networks can grow as additional participants choose to install apps and join the network. In principle, this allows networks of thousands or millions of sensors to be created quickly and at low cost. However, making reliable inferences about the world using so many community sensors involves several challenges, including scalability, data quality, mobility, and user privacy.
This thesis focuses on how learning at both the sensor- and network-level can provide scalable techniques for data collection and event detection. First, this thesis considers the abstract problem of distributed algorithms for data collection, and proposes a distributed, online approach to selecting which set of sensors should be queried. In addition to providing theoretical guarantees for submodular objective functions, the approach is also compatible with local rules or heuristics for detecting and transmitting potentially valuable observations. Next, the thesis presents a decentralized algorithm for spatial event detection, and describes its use in detecting strong earthquakes within the Caltech Community Seismic Network. Despite the fact that strong earthquakes are rare and complex events, and that community sensors can be very noisy, our decentralized anomaly detection approach obtains theoretical guarantees for event detection performance while simultaneously limiting the rate of false alarms.
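The submodular-objective setting mentioned above is the one where greedy selection carries its classical (1 - 1/e) approximation guarantee for monotone functions. A centralized greedy sketch on an invented coverage objective (the thesis's algorithm is distributed and online; this only illustrates the setting):

```python
# Greedy sensor selection for a monotone submodular coverage objective:
# repeatedly pick the sensor with the largest marginal gain. Sensor
# coverage sets below are illustrative.

def greedy_select(coverage, k):
    """coverage: dict sensor -> set of covered events; pick k sensors."""
    chosen, covered = [], set()
    for _ in range(k):
        best = max(coverage,
                   key=lambda s: len(coverage[s] - covered)
                   if s not in chosen else -1)
        chosen.append(best)
        covered |= coverage[best]
    return chosen, covered

sensors = {
    's1': {1, 2, 3},
    's2': {3, 4},
    's3': {4, 5, 6},
    's4': {1, 6},
}
chosen, covered = greedy_select(sensors, 2)
assert covered == {1, 2, 3, 4, 5, 6}  # s1 and s3 together cover everything
```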
Abstract:
In the first part of the thesis we explore three fundamental questions that arise naturally when we conceive a machine learning scenario where the training and test distributions can differ. Contrary to conventional wisdom, we show that mismatched training and test distributions can in fact yield better out-of-sample performance. This optimal performance can be obtained by training with the dual distribution. This optimal training distribution depends on the test distribution set by the problem, but not on the target function that we want to learn. We show how to obtain this distribution in both discrete and continuous input spaces, as well as how to approximate it in a practical scenario. The benefits of using this distribution are exemplified in both synthetic and real data sets.
In order to apply the dual distribution in the supervised learning scenario where the training data set is fixed, it is necessary to use weights to make the sample appear as if it came from the dual distribution. We explore the negative effect that weighting a sample can have. The theoretical decomposition of the use of weights regarding its effect on the out-of-sample error is easy to understand but not actionable in practice, as the quantities involved cannot be computed. Hence, we propose the Targeted Weighting algorithm that determines if, for a given set of weights, the out-of-sample performance will improve or not in a practical setting. This is necessary as the setting assumes there are no labeled points distributed according to the test distribution, only unlabeled samples.
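The reweighting step described above is the standard importance-weight construction from the covariate shift literature: weight each training point x by q(x)/p(x) so the sample behaves as if drawn from the target distribution q. In this sketch both densities are known Gaussians for illustration; in practice they must be estimated, which is where the negative effects the thesis studies arise.

```python
# Importance weights for covariate shift: w(x) = q(x) / p(x).
# Distributions here are illustrative known Gaussians.
import math

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

train_x = [-1.0, 0.0, 0.5, 1.0, 2.0]
p = lambda x: normal_pdf(x, mu=0.0, sigma=1.0)   # training distribution
q = lambda x: normal_pdf(x, mu=1.0, sigma=1.0)   # test (target) distribution

weights = [q(x) / p(x) for x in train_x]
# Points nearer the test mean (x = 1) get larger weight; for this
# Gaussian shift the ratio reduces to exp(x - 0.5).
assert weights[3] > weights[0]
```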
Finally, we propose a new class of matching algorithms that can be used to match the training set to a desired distribution, such as the dual distribution (or the test distribution). These algorithms can be applied to very large datasets, and we show how they lead to improved performance in a large real dataset such as the Netflix dataset. Their computational complexity is the main reason for their advantage over previous algorithms proposed in the covariate shift literature.
In the second part of the thesis we apply machine learning to the problem of behavior recognition. We develop a specific behavior classifier to study fly aggression, and we develop a system that allows analyzing behavior in videos of animals with minimal supervision. The system, which we call CUBA (Caltech Unsupervised Behavior Analysis), detects movemes, actions, and stories from time series describing the position of animals in videos. The method summarizes the data and provides biologists with a mathematical tool to test new hypotheses. Other benefits of CUBA include finding classifiers for specific behaviors without the need for annotation, as well as providing a means to discriminate groups of animals, for example according to their genetic line.
Abstract:
We are at the cusp of a historic transformation of both communication system and electricity system. This creates challenges as well as opportunities for the study of networked systems. Problems of these systems typically involve a huge number of end points that require intelligent coordination in a distributed manner. In this thesis, we develop models, theories, and scalable distributed optimization and control algorithms to overcome these challenges.
This thesis focuses on two specific areas: multi-path TCP (Transmission Control Protocol) and electricity distribution system operation and control. Multi-path TCP (MP-TCP) is a TCP extension that allows a single data stream to be split across multiple paths. MP-TCP has the potential to greatly improve the reliability as well as the efficiency of communication devices. We propose a fluid model for a large class of MP-TCP algorithms and identify design criteria that guarantee the existence, uniqueness, and stability of the system equilibrium. We clarify how algorithm parameters impact TCP-friendliness, responsiveness, and window oscillation, and demonstrate an inevitable tradeoff among these properties. We discuss the implications of these properties for the behavior of existing algorithms and motivate a new algorithm, Balia (balanced linked adaptation), which generalizes existing algorithms and strikes a good balance among TCP-friendliness, responsiveness, and window oscillation. We have implemented Balia in the Linux kernel and use our prototype to compare it with existing MP-TCP algorithms.
Our second focus is on designing computationally efficient algorithms for electricity distribution system operation and control. First, we develop efficient algorithms for feeder reconfiguration in distribution networks. The feeder reconfiguration problem chooses the on/off status of the switches in a distribution network in order to minimize a certain cost, such as power loss. It is a mixed integer nonlinear program and hence hard to solve. We propose a heuristic algorithm based on the recently developed convex relaxation of the optimal power flow problem. The algorithm is efficient and successfully computes an optimal configuration on all networks that we have tested. Moreover, we prove that the algorithm solves the feeder reconfiguration problem optimally under certain conditions. We also propose an even more efficient algorithm that incurs a loss in optimality of less than 3% on the test networks.
Second, we develop efficient distributed algorithms that solve the optimal power flow (OPF) problem on distribution networks. The OPF problem determines a network operating point that minimizes a certain objective, such as generation cost or power loss. Traditionally, OPF is solved in a centralized manner. With the increasing penetration of volatile renewable energy resources in distribution systems, we need faster and distributed solutions for real-time feedback control. This is difficult because the power flow equations are nonlinear and Kirchhoff's law is global. We propose solutions for both balanced and unbalanced radial distribution networks. They exploit recent results showing that a globally optimal solution of OPF over a radial network can be obtained through a second-order cone program (SOCP) or semi-definite program (SDP) relaxation. Our distributed algorithms are based on the alternating direction method of multipliers (ADMM), but unlike standard ADMM-based distributed OPF algorithms that require solving optimization subproblems using iterative methods, the proposed solutions exploit problem structure to greatly reduce the computation time. Specifically, for balanced networks, our decomposition allows us to derive closed-form solutions for these subproblems, which speeds up convergence by 1000x in simulations. For unbalanced networks, the subproblems reduce to either closed-form solutions or eigenvalue problems whose size remains constant as the network scales up, and the computation time is reduced by 100x compared with iterative methods.
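The structural point above, that ADMM becomes fast when every subproblem has a closed-form update, can be shown on a toy consensus problem. The objective below is an invented quadratic, sum_i (x - a_i)^2 / 2 (whose minimizer is the mean of the a_i), not the OPF formulation; it only illustrates the closed-form-update pattern.

```python
# Consensus ADMM with closed-form local updates (scaled dual form).
# Each agent i holds a_i and a local copy x_i; z is the consensus
# variable, u_i the scaled dual. All subproblem solves are one-liners.

a = [1.0, 3.0, 5.0, 7.0]       # local data held by each agent
rho = 1.0                      # ADMM penalty parameter
x = [0.0] * len(a)             # local variables
u = [0.0] * len(a)             # scaled dual variables
z = 0.0                        # consensus variable

for _ in range(100):
    # x-update: argmin (x - a_i)^2/2 + (rho/2)(x - z + u_i)^2, in closed form
    x = [(a_i + rho * (z - u_i)) / (1 + rho) for a_i, u_i in zip(a, u)]
    # z-update: average of x_i + u_i
    z = sum(x_i + u_i for x_i, u_i in zip(x, u)) / len(a)
    # dual update
    u = [u_i + x_i - z for x_i, u_i in zip(x, u)]

assert abs(z - 4.0) < 1e-6  # converges to the mean of a
```

Because no inner iterative solver is needed, each ADMM sweep is O(n) arithmetic, which is the kind of saving the thesis reports when OPF subproblems admit closed-form or small eigenvalue-problem solutions.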
Abstract:
Background: Recently, with access to low-toxicity biological and targeted therapies, evidence of the existence of a long-term survival subpopulation of cancer patients is appearing. We studied an unselected population with advanced lung cancer to look for evidence of multimodality in the survival distribution and to estimate the proportion of long-term survivors. Methods: We used survival data from 4944 patients with non-small-cell lung cancer (NSCLC), stages IIIb-IV at diagnosis, registered in the National Cancer Registry of Cuba (NCRC) between January 1998 and December 2006. We fitted a one-component survival model and two-component mixture models to identify short- and long-term survivors. The Bayesian information criterion was used for model selection. Results: For all of the selected parametric distributions, the two-component model presented the best fit. The short-term survival subpopulation (median survival of almost 4 months) represented 64% of patients. The long-term survival subpopulation included 35% of patients and showed a median survival of around 12 months. None of the short-term survivors was still alive at month 24, while 10% of the long-term survivors died after that point. Conclusions: There is a subgroup showing long-term evolution among patients with advanced lung cancer. As survival rates continue to improve with the new generation of therapies, prognostic models considering short- and long-term survival subpopulations should be considered in clinical research.
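A two-component survival mixture of the kind described can be fitted by EM. The abstract does not name the parametric family it selected; the sketch below uses exponential components (chosen for their closed-form M-steps), synthetic survival times mimicking the reported ~4 and ~12 month median subgroups with a 64/36 split, and ignores censoring for simplicity.

```python
import numpy as np

# EM for a two-component exponential survival mixture:
# f(t) = pi * l1 * exp(-l1 t) + (1 - pi) * l2 * exp(-l2 t).
rng = np.random.default_rng(42)
lam_short, lam_long = np.log(2) / 4, np.log(2) / 12   # rate = ln2 / median
t = np.concatenate([rng.exponential(1 / lam_short, 640),
                    rng.exponential(1 / lam_long, 360)])

pi, l1, l2 = 0.5, 0.5, 0.05   # initial guesses
loglik = []
for _ in range(200):
    # E-step: responsibility of the short-term component for each time
    f1 = pi * l1 * np.exp(-l1 * t)
    f2 = (1 - pi) * l2 * np.exp(-l2 * t)
    r = f1 / (f1 + f2)
    loglik.append(np.log(f1 + f2).sum())
    # M-step: closed-form updates for an exponential mixture
    pi = r.mean()
    l1 = r.sum() / (r * t).sum()
    l2 = (1 - r).sum() / ((1 - r) * t).sum()

# EM never decreases the log-likelihood (up to float tolerance).
assert all(b - a > -1e-6 for a, b in zip(loglik, loglik[1:]))
```

Model selection between the one- and two-component fits would then compare BIC = k·ln(n) - 2·loglik, as the paper does.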
Abstract:
A new supervised burned area mapping software package, BAMS (Burned Area Mapping Software), is presented in this paper. The tool was built from standard ArcGIS (TM) libraries. It computes several of the spectral indices most commonly used in burned area detection and implements a two-phase supervised strategy to map areas burned between two multitemporal Landsat images. The only input required from the user is the visual delimitation of a few burned areas, from which burned perimeters are extracted. After the discrimination of burned patches, the user can visually assess the results and iteratively select additional sample burned areas to improve the extent of the burned patches. The final result of the BAMS program is a polygon vector layer containing three categories: (a) burned perimeters, (b) unburned areas, and (c) non-observed areas. The latter correspond to clouds or sensor observation errors. Outputs of the BAMS code meet the file format and structure requirements of standard validation protocols. This paper presents the tool's structure and technical basis. The program has been tested in six areas located in the United States, covering various ecosystems and land covers, and compared against the National Monitoring Trends in Burn Severity (MTBS) Burned Area Boundaries Dataset.
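One of the spectral indices commonly used for burned area detection is the Normalized Burn Ratio (NBR), with its multitemporal difference (dNBR) between the two Landsat dates. The pixel reflectances and the dNBR threshold below are illustrative values, not parameters taken from BAMS.

```python
# NBR = (NIR - SWIR) / (NIR + SWIR); burns lower NIR and raise SWIR
# reflectance, so dNBR = NBR_pre - NBR_post is large over burned areas.

def nbr(nir, swir):
    return (nir - swir) / (nir + swir)

# Reflectances for one pixel before and after a fire (synthetic values).
pre = nbr(nir=0.45, swir=0.15)    # healthy vegetation: high NBR
post = nbr(nir=0.20, swir=0.35)   # burned surface: negative NBR
dnbr = pre - post

assert pre > 0 and post < 0
assert dnbr > 0.6  # a large positive dNBR suggests a burn
```

A supervised tool like BAMS effectively calibrates such thresholds from the user-delimited sample burned areas rather than fixing them a priori.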
Abstract:
Several alpine vertebrates share a distribution pattern that extends across the South-western Palearctic but is limited to the main mountain massifs. Although they are usually regarded as cold-adapted species, the range of many alpine vertebrates also includes relatively warm areas, suggesting that factors beyond climatic conditions may be driving their distribution. In this work we first identify the species belonging to this biogeographic group and, based on an environmental niche analysis of Plecotus macrobullaris, we identify and characterize the environmental factors constraining their ranges. A distribution overlap analysis of 504 European vertebrates was performed using the Sørensen Similarity Index, and we identified four birds and one mammal that share their distribution with P. macrobullaris. We generated 135 environmental niche models, including different variable combinations and regularization values, for P. macrobullaris at two different scales and resolutions. After selecting the best models, we observed that topographic variables outperformed climatic predictors, and that the abruptness of the landscape showed better predictive ability than elevation. The best explanatory climatic variable was mean summer temperature, which showed that P. macrobullaris can cope with mean temperature ranges spanning up to 16 °C. The models showed that the distribution of P. macrobullaris is mainly shaped by topographic factors that provide rock-abundant and open-space habitats rather than by climatic determinants, and that the species is not cold-adapted but rather a cold-tolerant eurythermic organism. P. macrobullaris shares its distribution pattern as well as several ecological features with five other alpine vertebrates, suggesting that the conclusions obtained from this study might extend to them. We conclude that rock-dwelling, open-space foraging vertebrates with broad temperature tolerance are the best candidates to show a wide alpine distribution in the Western Palearctic.