890 results for "historical data"
Abstract:
Cell transition data is obtained from a cellular phone that switches its current serving cell tower. The data consists of a sequence of transition events, which are pairs of cell identifiers and transition times. The focus of this thesis is applying data mining methods to such data, developing new algorithms, and extracting knowledge that will be a solid foundation on which to build location-aware applications. In addition to a thorough exploration of the features of the data, the tools and methods developed in this thesis provide solutions to three distinct research problems. First, we develop clustering algorithms that produce a reliable mapping between cell transitions and physical locations observed by users of mobile devices. The main clustering algorithm operates in online fashion, and we also consider a number of offline clustering methods for comparison. Second, we define the concept of significant locations, known as bases, and give an online algorithm for determining them. Finally, we consider the task of predicting the movement of the user based on historical data. We develop a prediction algorithm that considers paths of movement in their entirety, instead of just the most recent movement history. All of the presented methods are evaluated with a significant body of real cell transition data, collected from about one hundred different individuals. The algorithms developed in this thesis are designed to be implemented on a mobile device, and require no extra hardware sensors or network infrastructure. By not relying on external services and keeping the user information as much as possible on the user's own personal device, we avoid privacy issues and let the users control the disclosure of their location information.
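The clustering idea can be illustrated with a minimal, hypothetical sketch (this is not the thesis's actual algorithm): a phone near one physical location often oscillates rapidly between a handful of nearby towers, so transitions separated by only a short time gap can be merged into one location cluster, while a long gap starts a new one. The `gap` threshold below is an invented parameter.

```python
def cluster_transitions(transitions, gap=60):
    """Merge cells seen in rapid succession into one location cluster.

    transitions: time-ordered list of (cell_id, timestamp_seconds) pairs.
    Consecutive transitions less than `gap` seconds apart are treated as
    oscillation between towers serving one physical location; a longer
    gap closes the current cluster and opens a new one.
    """
    clusters, current = [], set()
    prev_t = None
    for cell, t in transitions:
        if prev_t is not None and t - prev_t > gap and current:
            clusters.append(current)
            current = set()
        current.add(cell)
        prev_t = t
    if current:
        clusters.append(current)
    return clusters
```

For example, a phone flipping between towers A and B within seconds, then later between C and D, would yield the two clusters {A, B} and {C, D}.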
Abstract:
Knowledge of the distribution and biology of the ragfish, Icosteus aenigmaticus, an aberrant deepwater perciform of the North Pacific Ocean, has increased slowly since the first description of the species in the 1880s, which was based on specimens retrieved from a fish monger’s table in San Francisco, Calif. Because the ragfish is historically rare, subjectively unattractive in appearance, and of no commercial value, ichthyologists have studied it only from specimens caught and donated by fishermen or by the general public. Since 1958, I have accumulated catch records of >825 ragfish. Specimens were primarily from commercial fishermen and research personnel trawling for bottom and demersal species on the continental shelves of the eastern North Pacific Ocean, Gulf of Alaska, Bering Sea, and the western Pacific Ocean, as well as from gillnet fisheries for Pacific salmon, Oncorhynchus spp., in the north central Pacific Ocean. Available records came from four separate sources: 1) historical data based primarily on published and unpublished literature (1876–1990); 2) ragfish delivered fresh to Humboldt State University, or records available from the California Department of Fish and Game, of ragfish caught in northern California and southern Oregon bottom trawl fisheries (1950–99); 3) incidental catches of ragfish observed and recorded by scientific observers of the commercial fisheries of the eastern Pacific Ocean, and catches in National Marine Fisheries Service trawl surveys studying these fisheries from 1976 to 1999; and 4) Japanese government research on nearshore fisheries of the northwestern Pacific Ocean (1950–99). Limited data on individual ragfish allowed mainly qualitative analysis, although some quantitative analysis could be made with ragfish data from northern California and southern Oregon.
This paper includes a history of taxonomic and common names of the ragfish, types of fishing gear and other techniques recovering ragfish, a chronology of range extensions into the North Pacific and Bering Sea, reproductive biology of ragfish caught by trawl fisheries off northern California and southern Oregon, and topics dealing with early, juvenile, and adult life history, including age and growth, food habits, and ecology. Recommendations for future study are proposed, especially on the life history of juvenile ragfish (5–30 cm FL) which remains enigmatic.
Abstract:
Reducing energy consumption is a major challenge for "energy-intensive" industries such as papermaking. A commercially viable energy saving solution is to employ data-based optimization techniques to obtain a set of "optimized" operational settings that satisfy certain performance indices. The difficulties of this are: 1) problems of this type are inherently multicriteria, in the sense that improving one performance index might compromise other important measures; 2) practical systems often exhibit unknown complex dynamics and several interconnections, which make the modeling task difficult; and 3) as the models are acquired from existing historical data, they are valid only locally, and extrapolations carry a risk of increasing process variability. To overcome these difficulties, this paper presents a new decision support system for robust multiobjective optimization of interconnected processes. The plant is first divided into serially connected units to model the process, product quality, energy consumption, and corresponding uncertainty measures. A multiobjective gradient descent algorithm is then used to solve the problem in line with the user's preference information. Finally, the optimization results are visualized for analysis and decision making. In practice, if further iterations of the optimization algorithm are considered, the validity of the local models must be checked before proceeding. The method is implemented in DataExplorer, a MATLAB-based interactive tool supporting a range of data analysis, modeling, and multiobjective optimization techniques. The proposed approach was tested in two U.K.-based commercial paper mills, where the aim was to reduce steam consumption and increase productivity while maintaining product quality by optimizing vacuum pressures in the forming and press sections. The experimental results demonstrate the effectiveness of the method.
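The preference-weighted gradient step at the heart of such a scheme can be sketched in a few lines (a toy illustration, not the paper's algorithm; the two objectives and the weights below are invented):

```python
def weighted_multiobjective_step(x, objective_grads, weights, lr=0.1):
    """One preference-weighted gradient-descent step: blend the gradients
    of the individual objectives with the user's preference weights and
    move against the combined direction."""
    blended = sum(w * g(x) for w, g in zip(weights, objective_grads))
    return x - lr * blended

# Toy objectives: steam use ~ (x - 1)^2, quality loss ~ (x - 3)^2,
# where x stands for a single operational setting (e.g. a vacuum pressure).
steam_grad = lambda x: 2.0 * (x - 1.0)
quality_grad = lambda x: 2.0 * (x - 3.0)

x = 0.0
for _ in range(200):
    x = weighted_multiobjective_step(x, [steam_grad, quality_grad], [0.5, 0.5])
# with equal weights the iteration settles at the compromise point x = 2
```

Shifting weight toward one objective moves the solution toward that objective's minimum, which is how user preference information steers the trade-off.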
Abstract:
This paper challenges the recent suggestion that a new financial elite has evolved which is able to capture substantial profit shares for itself. Specifically, it questions the assumption that new groups of financial intermediaries have increased in significance primarily because there is evidence that various types of financial speculators have played a similarly extensive role at several junctures of economic development. The paper then develops the alternative hypothesis that, rather than being a recent development, the rise of these financial intermediaries is a cyclical phenomenon which is linked to specific regimes of capital accumulation. The hypothesis is underpinned by historical data from the US National Income and Product Accounts for the period from 1930 to 2000, which suggest that the activities of `mainstream' financial intermediaries have been accompanied by the frequently countercyclical activities of a `speculative' sector of security and commodity brokers. Based on the combination of this qualitative and quantitative evidence, the paper concludes that the rise of a speculative financial sector is a potentially recurrent phenomenon which is linked to periods of economic restructuring and turmoil.
Abstract:
In this article we review recent work on the history of French negation in relation to three key issues in socio-historical linguistics: identifying appropriate sources, interpreting scant or anomalous data, and interpreting generational differences in historical data. We then turn to a new case study, that of verbal agreement with la plupart, to see whether this can shed fresh light on these issues. We argue that organising data according to the author’s date of birth is methodologically sounder than according to date of publication. We explore the extent to which different genres and text types reflect changing patterns of usage and suggest that additional, different case-studies are required in order to make more secure generalisations about the reliability of different sources.
Abstract:
In recent years external beam radiotherapy (EBRT) has been proposed as a treatment for the wet form of age-related macular degeneration (AMD) where choroidal neovascularization (CNV) is the hallmark. While the majority of pilot (Phase I) studies have reported encouraging results, a few have found no benefit, i.e. EBRT was not found to result in either improvement or stabilization of visual acuity of the treated eye. The natural history of visual loss in untreated CNV of AMD is highly variable. Loss of vision is influenced mainly by the presenting acuity, and size and composition of the lesion, and to a lesser extent by a variety of other factors. Thus the variable outcome reported by the small Phase I studies of EBRT published to date may simply reflect the variation in baseline factors. We therefore obtained information on 409 patients treated with EBRT from eight independent centres, which included details of visual acuity at baseline and at subsequent follow-up visits. Analysis of the data showed that 22.5% and 14.9% of EBRT-treated eyes developed moderate and severe loss of vision, respectively, during an average follow-up of 13 months. Initial visual acuity, which explained 20.5% of the variation in visual loss, was the most important baseline factor studied. Statistically significant differences in loss of vision were observed between centres, after considering the effects of case mix factors. Comparisons with historical data suggested that while moderate visual loss was similar to that of the natural history of the disease, the likelihood of suffering severe visual loss was halved. However, the benefit in terms of maintained/improved vision in the treated eye was modest.
Abstract:
The Irish and UK governments, along with other countries, have made a commitment to limit the concentrations of greenhouse gases in the atmosphere by reducing emissions from the burning of fossil fuels. This can be achieved (in part) through increasing the sequestration of CO2 from the atmosphere, including monitoring the amount stored in vegetation and soils. A large proportion of soil carbon is held within peat due to the relatively high carbon density of peat and organic-rich soils. This is particularly important for a country such as Ireland, where some 16% of the land surface is covered by peat. For Northern Ireland, it has been estimated that the total amount of carbon stored in vegetation is 4.4 Mt compared to 386 Mt stored within peat and soils. As a result it has become increasingly important to measure and monitor changes in stores of carbon in soils. The conservation and restoration of peat covered areas, although ongoing for many years, has become increasingly important. This is summed up in current EU policy outlined by the European Commission (2012), which seeks to assess the relative contributions of the different inputs and outputs of organic carbon and organic matter to and from soil. Results are presented from the EU-funded Tellus Border Soil Carbon Project (2011 to 2013), which aimed to improve current estimates of carbon in soil and peat across Northern Ireland and the bordering counties of the Republic of Ireland.
Historical reports and previous surveys provide baseline data. To monitor change in peat depth and soil organic carbon, these historical data are integrated with more recently acquired airborne geophysical (radiometric) data and ground-based geochemical data generated by two surveys, the Tellus Project (2004-2007: covering Northern Ireland) and the EU-funded Tellus Border project (2011-2013) covering the six bordering counties of the Republic of Ireland, Donegal, Sligo, Leitrim, Cavan, Monaghan and Louth. The concept being applied is that saturated organic-rich soil and peat attenuate gamma-radiation from underlying soils and rocks. This research uses the degree of spatial correlation (coregionalization) between peat depth, soil organic carbon (SOC) and the attenuation of the radiometric signal to update a limited sampling regime of ground-based measurements with remotely acquired data. To comply with the compositional nature of the SOC data (perturbations of loss on ignition [LOI] data), a compositional data analysis approach is investigated. Contemporaneous ground-based measurements allow corroboration for the updated mapped outputs. This provides a methodology that can be used to improve estimates of soil carbon with minimal impact to sensitive habitats (like peat bogs), but with maximum output of data and knowledge.
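The physical idea behind the radiometric approach can be made concrete with a small sketch. If saturated peat attenuates the gamma signal roughly exponentially with depth, a measured count rate can be inverted for a rough depth estimate. This is illustrative only: the attenuation coefficient `mu` below is an invented value, and the project's actual method exploits the spatial coregionalization between peat depth, SOC and the radiometric signal rather than a direct point-wise inversion.

```python
import math

def peat_depth_from_radiometric(signal, bare_soil_signal, mu=0.05):
    """Invert a simple exponential attenuation model,
        signal = bare_soil_signal * exp(-mu * depth),
    for peat depth. `mu` is an assumed effective attenuation
    coefficient per cm of saturated peat (hypothetical value)."""
    return math.log(bare_soil_signal / signal) / mu
```

Under these assumptions, a cell whose radiometric signal is half the bare-soil level would carry roughly ln(2)/0.05 ≈ 13.9 cm of peat.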
Abstract:
In this paper, we analyze the behavior of real interest rates over the long run using historical data for nine developed economies, to assess the extent to which the recent decline observed in most advanced countries is at odds with past data, as suggested by the Secular Stagnation hypothesis. Using data going back to 1703 and performing stationarity and structural break tests, we find that the recent decline in interest rates is not explained by a structural break in the time series. Our results also show that considering long-run data leads to different conclusions than using short-run data.
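A Chow-type structural-break statistic of the kind used in such analyses can be sketched, for the simplest case of a single break in the mean, as follows (illustrative only; the paper's actual test battery is not reproduced here):

```python
def chow_f_stat(series, break_idx):
    """Chow-type F statistic for a single break in the mean: compare the
    residual sum of squares of one overall mean against separate means
    fitted before and after break_idx."""
    def rss(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs)
    n, k = len(series), 1  # k = parameters per regime (just the mean)
    rss_pooled = rss(series)
    rss_split = rss(series[:break_idx]) + rss(series[break_idx:])
    return ((rss_pooled - rss_split) / k) / (rss_split / (n - 2 * k))
```

A series whose mean jumps at the candidate break date yields a large F value; a series with no level shift yields a value near zero, consistent with the paper's finding that the recent decline does not register as a break.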
Abstract:
Stock markets employ specialized traders, market-makers, designed to provide liquidity and volume to the market by continually quoting both the buy and the sell side. In this paper, we demonstrate a novel method for modeling the market as a dynamic system, together with a reinforcement learning algorithm that learns profitable market-making strategies when run on this model. We model the order flow, the sequence of buys and sells for a particular stock, as an Input-Output Hidden Markov Model fit to historical data. When combined with the dynamics of the order book, this creates a highly non-linear and difficult dynamic system. Our reinforcement learning algorithm, based on likelihood ratios, is run on this partially observable environment. We demonstrate learning results for two separate real stocks.
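The likelihood-ratio flavor of the learning step can be sketched with a bare-bones REINFORCE update for a two-action policy (a toy stand-in: the paper's environment, an IOHMM-driven order book, is far richer, and the parameterization below is invented):

```python
import math

def reinforce_update(theta, episodes, lr=0.05):
    """Likelihood-ratio (REINFORCE) policy-gradient step for a toy
    two-action market-making policy: action 1 ("quote") is chosen with
    probability sigmoid(theta), action 0 ("stay out") otherwise.

    episodes: list of (action, reward) pairs sampled from the policy.
    The gradient of log pi(action | theta) -- the score function --
    weights each reward, and theta moves in the ascent direction."""
    sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))
    grad = 0.0
    for action, reward in episodes:
        p = sigmoid(theta)
        score = (1.0 - p) if action == 1 else -p  # d/dtheta log pi
        grad += score * reward
    return theta + lr * grad / len(episodes)
```

Rewarded actions become more probable: if quoting is consistently profitable, theta (and hence the quoting probability) rises over repeated updates.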
Abstract:
Previous studies of the place of Property in the multi-asset portfolio have generally relied on historical data, and have been concerned with the supposed risk reduction effects that Property would have on such portfolios. In this paper a different approach has been taken. Not only are expectations data used, but we have also concentrated upon the required return that Property would have to offer to achieve a holding of 15% in typical UK pension fund portfolios. Using two benchmark portfolios for pension funds, we have shown that Property's required return is less than that expected, and therefore it could justify a 15% holding.
Abstract:
The Enriquillo and Azuei are saltwater lakes located in a closed water basin in the southwestern region of the island of La Hispaniola; they have been experiencing dramatic changes in total lake-surface area during the period 1980-2012. Lake Enriquillo had a surface area of approximately 276 km2 in 1984, gradually decreasing to 172 km2 in 1996. The surface area of the lake reached its lowest point in the satellite observation record in 2004, at 165 km2. The lake then began to grow, reaching its 1984 size by 2006. Based on surface area measurements for June and July 2013, Lake Enriquillo has a surface area of ~358 km2. Lake Azuei's sizes at both ends of the record are 116 km2 in 1984 and 134 km2 in 2013, an overall 15.8% increase in 30 years. Determining the causes of lake surface area changes is of extreme importance due to its environmental, social, and economic impacts. The overall goal of this study is to quantify the changing water balance in these lakes and their catchment area using satellite and ground observations and a regional atmospheric-hydrologic modeling approach. Data analyses of environmental variables in the region reflect a hydrological imbalance of the lakes due to changing regional hydro-climatic conditions. Historical data show precipitation, land surface temperature and humidity, and sea surface temperature (SST) increasing over the region during the past decades. Salinity levels have also been decreasing by more than 30% from previously reported baseline levels. Here we present a summary of the historical data obtained, the new sensors deployed in the surrounding sierras and on the lakes, and the integrated modeling exercises, as well as the challenges of gathering, storing, sharing, and analyzing this large volume of data from such a diverse number of sources at a remote location.
Abstract:
Data visualization techniques are powerful in the handling and analysis of multivariate systems. One such technique, known as parallel coordinates, was used to support the diagnosis of an event, detected by a neural network-based monitoring system, in a boiler at a Brazilian Kraft pulp mill. Its attraction is that several variables can be visualized simultaneously. The diagnostic procedure was carried out step by step, moving through exploratory, explanatory, confirmatory, and communicative goals. This tool allowed the boiler dynamics to be visualized more easily than with commonly used univariate trend plots. In addition, it facilitated analysis of other aspects, namely relationships among process variables, distinct modes of operation, and discrepant data. The whole analysis revealed, firstly, that the period involving the detected event was associated with a transition between two distinct normal modes of operation, and secondly, the presence of unusual changes in process variables at that time.
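The coordinate transform behind a parallel-coordinates display is just a per-variable rescaling, so that each observation becomes a polyline across equally spaced vertical axes. A minimal sketch (the plotting itself, e.g. with a charting library, is omitted):

```python
def parallel_coords(rows):
    """Min-max scale each variable (column) to [0, 1] so that every
    observation (row) becomes a polyline across equally spaced axes --
    the coordinate transform behind a parallel-coordinates plot.
    Constant columns are pinned to the axis midpoint 0.5."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        [(v - l) / (h - l) if h > l else 0.5 for v, l, h in zip(row, lo, hi)]
        for row in rows
    ]
```

Operating modes then show up as bundles of nearly parallel polylines, and discrepant data as lines that cross the bundle, which is what makes the technique useful for the kind of diagnosis described above.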
Volcanic forcing for climate modeling: a new microphysics-based data set covering years 1600–present
Abstract:
As the understanding and representation of the impacts of volcanic eruptions on climate have improved in recent decades, uncertainties in the stratospheric aerosol forcing from large eruptions are now linked not only to visible optical depth estimates on a global scale but also to details of the size, latitude and altitude distributions of the stratospheric aerosols. Based on our understanding of these uncertainties, we propose a new model-based approach to generating a volcanic forcing for general circulation model (GCM) and chemistry–climate model (CCM) simulations. This new volcanic forcing, covering the period 1600–present, uses an aerosol microphysical model to provide a realistic, physically consistent treatment of the stratospheric sulfate aerosols. Twenty-six eruptions were modeled individually using the latest available ice-core aerosol mass estimates and historical data on the latitude and date of the eruptions. The evolution of the aerosol spatial and size distributions after the sulfur dioxide discharge is hence characterized for each volcanic eruption. Large variations are seen in hemispheric partitioning and size distributions in relation to the location/date of eruptions and the injected SO2 masses. Results for recent eruptions show reasonable agreement with observations. By providing these new estimates of the spatial distributions of shortwave and longwave radiative perturbations, this volcanic forcing may help to better constrain climate model responses to volcanic eruptions in the 1600–present period. The final data set consists of 3-D values (with constant longitude) of spectrally resolved extinction coefficients, single scattering albedos and asymmetry factors, calculated for different wavelength bands upon request. Surface area densities for heterogeneous chemistry are also provided.
Abstract:
Historical information is always relevant for clinical trial design. Additionally, if incorporated in the analysis of a new trial, historical data make it possible to reduce the number of subjects. This decreases costs and trial duration, facilitates recruitment, and may be more ethical. Yet, under prior-data conflict, an overly optimistic use of historical data may be inappropriate. We address this challenge by deriving a Bayesian meta-analytic-predictive prior from historical data, which is then combined with the new data. This prospective approach is equivalent to a meta-analytic-combined analysis of historical and new data if parameters are exchangeable across trials. The prospective Bayesian version requires a good approximation of the meta-analytic-predictive prior, which is not available analytically. We propose two- or three-component mixtures of standard priors, which allow for good approximations and, for the one-parameter exponential family, straightforward posterior calculations. Moreover, since one of the mixture components is usually vague, mixture priors will often be heavy-tailed and therefore robust. Further robustness and a more rapid reaction to prior-data conflicts can be achieved by adding an extra weakly informative mixture component. Use of historical prior information is particularly attractive for adaptive trials, as the randomization ratio can then be changed in case of prior-data conflict. Both frequentist operating characteristics and posterior summaries for various data scenarios show that these designs have desirable properties. We illustrate the methodology for a phase II proof-of-concept trial with historical controls from four studies. Robust meta-analytic-predictive priors alleviate prior-data conflicts; they should encourage better and more frequent use of historical data in clinical trials.
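The mixture-prior update is conjugate component-by-component. For a binomial endpoint with a mixture-of-Beta prior it can be sketched as follows (an illustration of the general mechanism; the component parameters and data below are invented, not taken from the paper's case study):

```python
from math import lgamma, exp

def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def beta_binom_marginal(a, b, y, n):
    """Marginal likelihood of y successes in n trials under a Beta(a, b)
    prior (the n-choose-y term is dropped; it cancels in the weights)."""
    return exp(log_beta(a + y, b + n - y) - log_beta(a, b))

def update_mixture(prior, y, n):
    """Posterior of a mixture-of-Beta prior after observing y successes
    in n trials: each component (w, a, b) is updated conjugately, and
    the weights are rebalanced by how well each component predicted
    the data."""
    post = [(w * beta_binom_marginal(a, b, y, n), a + y, b + n - y)
            for w, a, b in prior]
    z = sum(w for w, _, _ in post)
    return [(w / z, a, b) for w, a, b in post]
```

With a 90/10 mixture of an informative Beta(20, 80) component and a vague Beta(1, 1) component, data that conflict with the informative component (say 50 successes in 100 trials, against an informative prior mean of 0.2) shift most of the posterior weight onto the vague component, which is precisely the self-robustifying behavior the abstract describes.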