1000 resultados para Discrepant data
Resumo:
Many fisheries worldwide have adopted vessel monitoring systems (VMS) for compliance purposes. An added benefit of these systems is that they collect a large amount of data on vessel locations at very fine spatial and temporal scales. This data can provide a wealth of information for stock assessment, research, and management. However, since most VMS implementations record vessel location at set time intervals with no regard to vessel activity, some methodology is required to determine which data records correspond to fishing activity. This paper describes a probabilistic approach, based on hidden Markov models (HMMs), to determine vessel activity. A HMM provides a natural framework for the problem and, by definition, models the intrinsic temporal correlation of the data. The paper describes the general approach that was developed and presents an example of this approach applied to the Queensland trawl fishery off the coast of eastern Australia. Finally, a simulation experiment is presented that compares the misallocation rates of the HMM approach with other approaches.
Resumo:
Standardised time series of fishery catch rates require collations of fishing power data on vessel characteristics. Linear mixed models were used to quantify fishing power trends and study the effect of missing data encountered when relying on commercial logbooks. For this, Australian eastern king prawn (Melicertus plebejus) harvests were analysed with historical (from vessel surveys) and current (from commercial logbooks) vessel data. Between 1989 and 2010, fishing power increased up to 76%. To date, both forward-filling and, alternatively, omitting records with missing vessel information from commercial logbooks produce broadly similar fishing power increases and standardised catch rates, due to the strong influence of years with complete vessel data (16 out of 23 years of data). However, if gaps in vessel information had not originated randomly and skippers from the most efficient vessels were the most diligent at filling in logbooks, considerable errors would be introduced. Also, the buffering effect of complete years would be short lived as years with missing data accumulate. Given ongoing changes in fleet profile with high-catching vessels fishing proportionately more of the fleet’s effort, compliance with logbook completion, or alternatively ongoing vessel gear surveys, is required for generating accurate estimates of fishing power and standardised catch rates.
Resumo:
This paper presents a Chance-constraint Programming approach for constructing maximum-margin classifiers which are robust to interval-valued uncertainty in training examples. The methodology ensures that uncertain examples are classified correctly with high probability by employing chance-constraints. The main contribution of the paper is to pose the resultant optimization problem as a Second Order Cone Program by using large deviation inequalities, due to Bernstein. Apart from support and mean of the uncertain examples these Bernstein based relaxations make no further assumptions on the underlying uncertainty. Classifiers built using the proposed approach are less conservative, yield higher margins and hence are expected to generalize better than existing methods. Experimental results on synthetic and real-world datasets show that the proposed classifiers are better equipped to handle interval-valued uncertainty than state-of-the-art.
Resumo:
A compilation of crystal structure data on deoxyribo- and ribonucleosides and their higher derivatives is presented. The aim of this paper is to highlight the flexibility of deoxyribose and ribose rings. So far, the conformational parameters of nucleic acids constituents of ribose and deoxyribose have not been analysed separately. This paper aims to correlate the conformational parameters with the nature and puckering of the sugar. Deoxyribose puckering occurs in the C2′ endo region while ribose puckering is observed both in the C3′ endo and C2′ endo regions. A few endocyclic and exocyclic bond angles depend on the puckering and the nature of the sugar. The majority of structures have an anti conformation about the glycosyl bond. There appears to be a puckering dependence on the torsion angle about the C4′---C5′ bonds. Such stereochemical information is useful in model building studies of polynucleotides and nucleic acids.
Resumo:
During the last decades there has been a global shift in forest management from a focus solely on timber management to ecosystem management that endorses all aspects of forest functions: ecological, economic and social. This has resulted in a shift in paradigm from sustained yield to sustained diversity of values, goods and benefits obtained at the same time, introducing new temporal and spatial scales into forest resource management. The purpose of the present dissertation was to develop methods that would enable spatial and temporal scales to be introduced into the storage, processing, access and utilization of forest resource data. The methods developed are based on a conceptual view of a forest as a hierarchically nested collection of objects that can have a dynamically changing set of attributes. The temporal aspect of the methods consists of lifetime management for the objects and their attributes and of a temporal succession linking the objects together. Development of the forest resource data processing method concentrated on the extensibility and configurability of the data content and model calculations, allowing for a diverse set of processing operations to be executed using the same framework. The contribution of this dissertation to the utilisation of multi-scale forest resource data lies in the development of a reference data generation method to support forest inventory methods in approaching single-tree resolution.
Resumo:
Patterns of movement in aquatic animals reflect ecologically important behaviours. Cyclical changes in the abiotic environment influence these movements, but when multiple processes occur simultaneously, identifying which is responsible for the observed movement can be complex. Here we used acoustic telemetry and signal processing to define the abiotic processes responsible for movement patterns in freshwater whiprays (Himantura dalyensis). Acoustic transmitters were implanted into the whiprays and their movements detected over 12 months by an array of passive acoustic receivers, deployed throughout 64 km of the Wenlock River, Qld, Australia. The time of an individual's arrival and departure from each receiver detection field was used to estimate whipray location continuously throughout the study. This created a linear-movement-waveform for each whipray and signal processing revealed periodic components within the waveform. Correlation of movement periodograms with those from abiotic processes categorically illustrated that the diel cycle dominated the pattern of whipray movement during the wet season, whereas tidal and lunar cycles dominated during the dry season. The study methodology represents a valuable tool for objectively defining the relationship between abiotic processes and the movement patterns of free-ranging aquatic animals and is particularly expedient when periods of no detection exist within the animal location data.
Resumo:
Abstract of Macbeth, G. M., Broderick, D., Buckworth, R. & Ovenden, J. R. (In press, Feb 2013). Linkage disequilibrium estimation of effective population size with immigrants from divergent populations: a case study on Spanish mackerel (Scomberomorus commerson). G3: Genes, Genomes and Genetics. Estimates of genetic effective population size (Ne) using molecular markers are a potentially useful tool for the management of endangered through to commercial species. But, pitfalls are predicted when the effective size is large, as estimates require large numbers of samples from wild populations for statistical validity. Our simulations showed that linkage disequilibrium estimates of Ne up to 10,000 with finite confidence limits can be achieved with sample sizes around 5000. This was deduced from empirical allele frequencies of seven polymorphic microsatellite loci in a commercially harvested fisheries species, the narrow barred Spanish mackerel (Scomberomorus commerson). As expected, the smallest standard deviation of Ne estimates occurred when low frequency alleles were excluded. Additional simulations indicated that the linkage disequilibrium method was sensitive to small numbers of genotypes from cryptic species or conspecific immigrants. A correspondence analysis algorithm was developed to detect and remove outlier genotypes that could possibly be inadvertently sampled from cryptic species or non-breeding immigrants from genetically separate populations. Simulations demonstrated the value of this approach in Spanish mackerel data. When putative immigrants were removed from the empirical data, 95% of the Ne estimates from jacknife resampling were above 24,000.
Resumo:
Tetrapeptide sequences of the type Z-Pro-Y-X were obtained from the crystal structure data on 34 globular proteins, and used in an analysis of the positional preferences of the individual amino acid residues in the β-turn conformation. The effect of fixing proline as the second position residue in the tetrapeptide sequence was studied by comparing the data obtained on the positional preferences with the corresponding data obtained by Chou and Fasman using the Z-R-Y-X sequence, where no particular residue was fixed in any of the four positions. While, in general, several amino acid residues having relatively very high or very low preferences for specific positions were found to be common to both the Z-Pro-Y-X and Z-R-Y-X sequences, many significant differences were found between the two sets of data, which are to be attributed to specific interactions arising from the presence of the proline residue.
Resumo:
Data flow computers are high-speed machines in which an instruction is executed as soon as all its operands are available. This paper describes the EXtended MANchester (EXMAN) data flow computer which incorporates three major extensions to the basic Manchester machine. As extensions we provide a multiple matching units scheme, an efficient, implementation of array data structure, and a facility to concurrently execute reentrant routines. A simulator for the EXMAN computer has been coded in the discrete event simulation language, SIMULA 67, on the DEC 1090 system. Performance analysis studies have been conducted on the simulated EXMAN computer to study the effectiveness of the proposed extensions. The performance experiments have been carried out using three sample problems: matrix multiplication, Bresenham's line drawing algorithm, and the polygon scan-conversion algorithm.
Resumo:
This thesis examines the feasibility of a forest inventory method based on two-phase sampling in estimating forest attributes at the stand or substand levels for forest management purposes. The method is based on multi-source forest inventory combining auxiliary data consisting of remote sensing imagery or other geographic information and field measurements. Auxiliary data are utilized as first-phase data for covering all inventory units. Various methods were examined for improving the accuracy of the forest estimates. Pre-processing of auxiliary data in the form of correcting the spectral properties of aerial imagery was examined (I), as was the selection of aerial image features for estimating forest attributes (II). Various spatial units were compared for extracting image features in a remote sensing aided forest inventory utilizing very high resolution imagery (III). A number of data sources were combined and different weighting procedures were tested in estimating forest attributes (IV, V). Correction of the spectral properties of aerial images proved to be a straightforward and advantageous method for improving the correlation between the image features and the measured forest attributes. Testing different image features that can be extracted from aerial photographs (and other very high resolution images) showed that the images contain a wealth of relevant information that can be extracted only by utilizing the spatial organization of the image pixel values. Furthermore, careful selection of image features for the inventory task generally gives better results than inputting all extractable features to the estimation procedure. When the spatial units for extracting very high resolution image features were examined, an approach based on image segmentation generally showed advantages compared with a traditional sample plot-based approach. Combining several data sources resulted in more accurate estimates than any of the individual data sources alone. The best combined estimate can be derived by weighting the estimates produced by the individual data sources by the inverse values of their mean square errors. Despite the fact that the plot-level estimation accuracy in two-phase sampling inventory can be improved in many ways, the accuracy of forest estimates based mainly on single-view satellite and aerial imagery is a relatively poor basis for making stand-level management decisions.
Resumo:
An ecological risk assessment of the East Coast Otter Trawl Fishery in the Great Barrier Reef Region was undertaken in 2010 and 2011. It assessed the risks posed by this fishery to achieving fishery-related and broader ecological objectives of both the Queensland and Australian governments, including risks to the values and integrity of the Great Barrier Reef World Heritage Area. The risks assessed included direct and indirect effects on the species caught in the fishery as well as on the structure and functioning of the ecosystem. This ecosystem-based approach included an assessment of the impacts on harvested species, by-catch, species of conservation concern, marine habitats, species assemblages and ecosystem processes. The assessment took into account current management arrangements and fishing practices at the time of the assessment. The main findings of the assessment were: Current risk levels from trawling activities are generally low. Some risks from trawling remain. Risks from trawling have reduced in the Great Barrier Reef Region. Trawl fishing effort is a key driver of ecological risk. Zoning has been important in reducing risks. Reducing identified unacceptable risks requires a range of management responses. The commercial fishing industry is supportive and being proactive. Further reductions in trawl by-catch, high compliance with rules and accurate information from ongoing risk monitoring are important. Trawl fishing is just one of the sources of risk to the Great Barrier Reef.
Resumo:
The aim of this thesis is to develop a fully automatic lameness detection system that operates in a milking robot. The instrumentation, measurement software, algorithms for data analysis and a neural network model for lameness detection were developed. Automatic milking has become a common practice in dairy husbandry, and in the year 2006 about 4000 farms worldwide used over 6000 milking robots. There is a worldwide movement with the objective of fully automating every process from feeding to milking. Increase in automation is a consequence of increasing farm sizes, the demand for more efficient production and the growth of labour costs. As the level of automation increases, the time that the cattle keeper uses for monitoring animals often decreases. This has created a need for systems for automatically monitoring the health of farm animals. The popularity of milking robots also offers a new and unique possibility to monitor animals in a single confined space up to four times daily. Lameness is a crucial welfare issue in the modern dairy industry. Limb disorders cause serious welfare, health and economic problems especially in loose housing of cattle. Lameness causes losses in milk production and leads to early culling of animals. These costs could be reduced with early identification and treatment. At present, only a few methods for automatically detecting lameness have been developed, and the most common methods used for lameness detection and assessment are various visual locomotion scoring systems. The problem with locomotion scoring is that it needs experience to be conducted properly, it is labour intensive as an on-farm method and the results are subjective. A four balance system for measuring the leg load distribution of dairy cows during milking in order to detect lameness was developed and set up in the University of Helsinki Research farm Suitia. The leg weights of 73 cows were successfully recorded during almost 10,000 robotic milkings over a period of 5 months. The cows were locomotion scored weekly, and the lame cows were inspected clinically for hoof lesions. Unsuccessful measurements, caused by cows standing outside the balances, were removed from the data with a special algorithm, and the mean leg loads and the number of kicks during milking was calculated. In order to develop an expert system to automatically detect lameness cases, a model was needed. A probabilistic neural network (PNN) classifier model was chosen for the task. The data was divided in two parts and 5,074 measurements from 37 cows were used to train the model. The operation of the model was evaluated for its ability to detect lameness in the validating dataset, which had 4,868 measurements from 36 cows. The model was able to classify 96% of the measurements correctly as sound or lame cows, and 100% of the lameness cases in the validation data were identified. The number of measurements causing false alarms was 1.1%. The developed model has the potential to be used for on-farm decision support and can be used in a real-time lameness monitoring system.
Resumo:
As for other complex diseases, linkage analyses of schizophrenia (SZ) have produced evidence for numerous chromosomal regions, with inconsistent results reported across studies. The presence of locus heterogeneity appears likely and may reduce the power of linkage analyses if homogeneity is assumed. In addition, when multiple heterogeneous datasets are pooled, inter-sample variation in the proportion of linked families (alpha) may diminish the power of the pooled sample to detect susceptibility loci, in spite of the larger sample size obtained. We compare the significance of linkage findings obtained using allele-sharing LOD scores (LOD(exp))-which assume homogeneity-and heterogeneity LOD scores (HLOD) in European American and African American NIMH SZ families. We also pool these two samples and evaluate the relative power of the LOD(exp) and two different heterogeneity statistics. One of these (HLOD-P) estimates the heterogeneity parameter alpha only in aggregate data, while the second (HLOD-S) determines alpha separately for each sample. In separate and combined data, we show consistently improved performance of HLOD scores over LOD(exp). Notably, genome-wide significant evidence for linkage is obtained at chromosome 10p in the European American sample using a recessive HLOD score. When the two samples are combined, linkage at the 10p locus also achieves genome-wide significance under HLOD-S, but not HLOD-P. Using HLOD-S, improved evidence for linkage was also obtained for a previously reported region on chromosome 15q. In linkage analyses of complex disease, power may be maximised by routinely modelling locus heterogeneity within individual datasets, even when multiple datasets are combined to form larger samples.
Resumo:
For zygosity diagnosis in the absence of genotypic data, or in the recruitment phase of a twin study where only single twins from same-sex pairs are being screened, or to provide a test for sample duplication leading to the false identification of a dizygotic pair as monozygotic, the appropriate analysis of respondents' answers to questions about zygosity is critical. Using data from a young adult Australian twin cohort (N = 2094 complete pairs and 519 singleton twins from same-sex pairs with complete responses to all zygosity items), we show that application of latent class analysis (LCA), fitting a 2-class model, yields results that show good concordance with traditional methods of zygosity diagnosis, but with certain important advantages. These include the ability, in many cases, to assign zygosity with specified probability on the basis of responses of a single informant (advantageous when one zygosity type is being oversampled); and the ability to quantify the probability of misassignment of zygosity, allowing prioritization of cases for genotyping as well as identification of cases of probable laboratory error. Out of 242 twins (from 121 like-sex pairs) where genotypic data were available for zygosity confirmation, only a single case was identified of incorrect zygosity assignment by the latent class algorithm. Zygosity assignment for that single case was identified by the LCA as uncertain (probability of being a monozygotic twin only 76%), and the co-twin's responses clearly identified the pair as dizygotic (probability of being dizygotic 100%). In the absence of genotypic data, or as a safeguard against sample duplication, application of LCA for zygosity assignment or confirmation is strongly recommended.