87 results for Carnap Entropy
Abstract:
Models of the mammalian clock have traditionally been based around two feedback loops: the self-repression of Per/Cry by interfering with activation by BMAL/CLOCK, and the repression of Bmal/Clock by the REV-ERB proteins. Recent experimental evidence suggests that the D-box, a transcription factor binding site associated with daytime expression, plays a larger role in clock function than has previously been understood. We present a simplified clock model that highlights the role of the D-box and illustrate an approach for finding maximum-entropy ensembles of model parameters, given experimentally imposed constraints. Parameter variability can be mitigated using prior probability distributions derived from genome-wide studies of cellular kinetics. Our model reproduces predictions concerning the dual regulation of Cry1 by the D-box and Rev-ErbA/ROR response element (RRE) promoter elements and allows for ensemble-based predictions of phase response curves (PRCs). Nonphotic signals such as Neuropeptide Y (NPY) may act by promoting Cry1 expression, whereas photic signals likely act by stimulating expression from the E/E' box. Ensemble generation with parameter probability restraints reveals more about a model's behavior than a single optimal parameter set.
Abstract:
In this paper, we propose a highly reliable fault diagnosis scheme for incipient low-speed rolling element bearing failures. The scheme consists of fault feature calculation, discriminative fault feature analysis, and fault classification. The proposed approach first computes wavelet-based fault features, including the respective relative wavelet packet node energy and entropy, by applying a wavelet packet transform to an incoming acoustic emission signal. The most discriminative fault features are then filtered from the originally produced feature vector by using discriminative fault feature analysis based on a binary bat algorithm (BBA). Finally, the proposed approach employs one-against-all multiclass support vector machines to identify multiple low-speed rolling element bearing defects. This study compares the proposed BBA-based dimensionality reduction scheme with four other dimensionality reduction methodologies in terms of classification performance. Experimental results show that the proposed methodology is superior to other dimensionality reduction approaches, yielding an average classification accuracy of 94.9%, 95.8%, and 98.4% under bearing rotational speeds at 20 revolutions-per-minute (RPM), 80 RPM, and 140 RPM, respectively.
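The feature-calculation step described above can be illustrated with a minimal sketch. It assumes a hand-written Haar wavelet packet transform and one common definition of node entropy (the Shannon entropy of the normalized squared coefficients within a node); the abstract names neither the wavelet nor the exact entropy formula, so the function names and both choices are hypothetical. The signal length is assumed to be divisible by 2 raised to the number of levels.

```python
import math

def haar_step(x):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail

def wavelet_packet_nodes(signal, levels):
    """Split every node at every level (full packet tree); return the terminal nodes."""
    nodes = [list(signal)]
    for _ in range(levels):
        split = []
        for node in nodes:
            approx, detail = haar_step(node)
            split.extend([approx, detail])
        nodes = split
    return nodes

def relative_energies(nodes):
    """Energy of each terminal node divided by the total energy over all nodes."""
    energies = [sum(c * c for c in node) for node in nodes]
    total = sum(energies)
    return [e / total if total > 0 else 0.0 for e in energies]

def node_entropy(node):
    """Shannon entropy (nats) of the normalized squared coefficients in one node."""
    energy = sum(c * c for c in node)
    if energy <= 0:
        return 0.0
    probs = [c * c / energy for c in node if c != 0]
    return sum(p * math.log(1.0 / p) for p in probs)
```

For a constant signal, all energy ends up in the first (approximation) node, which is a quick sanity check on the decomposition. A feature vector would then concatenate the relative energy and entropy of each node before any feature selection is applied.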
Abstract:
Soundscape assessment has been proposed as a remote ecological monitoring tool for measuring biodiversity, but few studies have examined how soundscape patterns vary with landscape configuration and condition. The goal of our study was to examine a suite of published acoustic indices to determine whether they provide comparable results relative to varying levels of landscape fragmentation and ecological condition in nineteen forest sites in eastern Australia. Our comparison of six acoustic indices according to time of day revealed that two indices, the acoustic complexity and the bioacoustic index, presented a similar pattern that was linked to avian song intensity, but was not related to landscape and biodiversity attributes. The diversity indices, acoustic entropy and acoustic diversity, and the normalized difference soundscape index revealed high nighttime sound, as well as a dawn and dusk chorus. These indices appear to be sensitive to nocturnal biodiversity, which is abundant in warm, subtropical environments. We argue that there is a need to better understand temporal partitioning of the soundscape by specific taxonomic groups, and this should involve integrated research on amphibians, insects and birds during a 24 h cycle. The three indices that best connected the soundscape with landscape characteristics, ecological condition and bird species richness were acoustic entropy, acoustic evenness and the normalized difference soundscape index. This study has demonstrated that remote soundscape assessment can be implemented as an ecological monitoring tool in fragmented Australian forest landscapes. However, further investigation should be dedicated to refining and/or combining existing acoustic indices and to determining whether these indices are appropriate in other landscapes and for other survey purposes.
Abstract:
Bioacoustic monitoring has become a significant research topic for species diversity conservation. Due to the development of sensing techniques, acoustic sensors are widely deployed in the field to record animal sounds over large spatial and temporal scales. With large volumes of collected audio data, it is essential to develop semi-automatic or automatic techniques to analyse the data. This can help ecologists make decisions on how to protect and promote species diversity. This paper presents generic features to characterize a range of bird species for vocalisation retrieval. In the implementation, audio recordings are first converted to spectrograms using the short-time Fourier transform, then a ridge detection method is applied to the spectrogram to detect points of interest. Based on the detected points, a new region representation is explored for describing various bird vocalisations, and a local descriptor including temporal entropy, frequency bin entropy and a histogram of counts of four ridge directions is calculated for each sub-region. To speed up the retrieval process, indexing is carried out and the retrieved results are ranked according to similarity scores. The experimental results show that our proposed feature set achieves a retrieval success rate of 0.71, which outperforms spectral ridge features alone (0.55) and Mel frequency cepstral coefficients (0.36).
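The two entropy terms of the local descriptor can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not define the entropies, so the sketch assumes each is the Shannon entropy of the energy distribution over time frames or over frequency bins of a sub-region, normalized to the range 0 to 1. The function names and the `region[f][t]` layout are hypothetical.

```python
import math

def normalized_entropy(weights):
    """Shannon entropy of non-negative weights, scaled to [0, 1] by log(len(weights))."""
    total = sum(weights)
    n = len(weights)
    if total <= 0 or n < 2:
        return 0.0
    probs = [w / total for w in weights if w > 0]
    return sum(p * math.log(1.0 / p) for p in probs) / math.log(n)

def region_entropies(region):
    """region[f][t] is the spectrogram magnitude at frequency bin f and time frame t.

    Returns (temporal entropy, frequency bin entropy) for the sub-region.
    """
    n_frames = len(region[0])
    energy_per_frame = [sum(row[t] ** 2 for row in region) for t in range(n_frames)]
    energy_per_bin = [sum(v ** 2 for v in row) for row in region]
    return normalized_entropy(energy_per_frame), normalized_entropy(energy_per_bin)
```

A sustained pure tone (constant energy in one bin across all frames) gives a temporal entropy near 1 and a frequency bin entropy of 0, while a short broadband click gives the opposite pattern.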
Abstract:
Stochastic (or random) processes are inherent to numerous fields of human endeavour including engineering, science, and business and finance. This thesis presents multiple novel methods for quickly detecting and estimating uncertainties in several important classes of stochastic processes. The significance of these novel methods is demonstrated by employing them to detect aircraft manoeuvres in video signals in the important application of autonomous mid-air collision avoidance.
Abstract:
Japanese encephalitis (JE) is the most common cause of viral encephalitis and an important public health concern in the Asia-Pacific region, particularly in China where 50% of global cases are notified. To explore the association between environmental factors and human JE cases and identify the high risk areas for JE transmission in China, we used annual notified data on JE cases at the center of administrative township and environmental variables with a pixel resolution of 1 km×1 km from 2005 to 2011 to construct models using ecological niche modeling (ENM) approaches based on maximum entropy. These models were then validated by overlaying reported human JE case localities from 2006 to 2012 onto each prediction map. ENMs had good discriminatory ability, with an area under the curve (AUC) of the receiver operating characteristic (ROC) curve of 0.82-0.91, and a low extrinsic omission rate of 5.44-7.42%. Resulting maps showed JE to be present extensively throughout southwestern and central China, with local spatial variations in probability influenced by minimum temperatures, human population density, mean temperatures, and elevation, with contributions of 17.94%-38.37%, 15.47%-21.82%, 3.86%-21.22%, and 12.05%-16.02%, respectively. Approximately 60% of JE cases occurred in predicted high risk areas, which covered less than 6% of areas in mainland China. Our findings will help inform optimal geographical allocation of the limited resources available for JE prevention and control in China, find hidden high-risk areas, and increase the effectiveness of public health interventions against JE transmission.
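The two validation measures quoted above can be computed from model scores alone. The sketch below is a generic illustration, not the study's workflow: it uses the rank-based (Mann-Whitney) form of the AUC, and it takes the omission rate to be the fraction of independent test presences scored below a chosen suitability threshold. The function names, the example scores and the threshold are all hypothetical.

```python
def auc(presence_scores, background_scores):
    """Probability that a random presence outscores a random background point (ties count half)."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))

def omission_rate(test_presence_scores, threshold):
    """Fraction of held-out presence localities predicted as unsuitable at this threshold."""
    missed = sum(1 for s in test_presence_scores if s < threshold)
    return missed / len(test_presence_scores)

# Hypothetical scores, for illustration only.
presences = [0.9, 0.8, 0.4]
background = [0.5, 0.3, 0.2]
print(auc(presences, background))      # 8 of 9 presence/background pairs ranked correctly
print(omission_rate(presences, 0.5))   # 1 of 3 presences falls below the threshold
```

The pairwise loop is quadratic in the number of points; for large background samples a rank-based implementation would be preferable.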
Abstract:
This paper presents a system to analyze long field recordings with low signal-to-noise ratio (SNR) for bio-acoustic monitoring. A method based on spectral peak track, Shannon entropy, harmonic structure and oscillation structure is proposed to automatically detect anuran (frog) calling activity. A Gaussian mixture model (GMM) is introduced to model those features. Four anuran species widespread in Queensland, Australia, are selected to evaluate the proposed system. A visualization method based on extracted indices is employed to detect anuran calling activity and achieves high accuracy.
Abstract:
Closed Systems was an exhibition of sculptural works held at MetroArts, Brisbane in the group exhibition 'Platform 2012' curated by Jan Manton. My contribution to the group show was a series of 13 bronze sculptures, produced using a lost-wax casting process. Each form was derived from a sapling that had been manipulated so that branches and roots were interconnected. The resulting 'looped' forms evoked notions of narcissism, self-absorption and introversion. The works were subsequently exhibited at the National Arts School in Sydney as part of the Kodak Minolta Art Prize (2013).
Abstract:
Species distribution modelling (SDM) typically analyses species’ presence together with some form of absence information. Ideally absences comprise observations or are inferred from comprehensive sampling. When such information is not available, then pseudo-absences are often generated from the background locations within the study region of interest containing the presences, or else absence is implied through the comparison of presences to the whole study region, e.g. as is the case in Maximum Entropy (MaxEnt) or Poisson point process modelling. However, the choice of which absence information to include can be both challenging and highly influential on SDM predictions (e.g. Oksanen and Minchin, 2002). In practice, the use of pseudo- or implied absences often leads to an imbalance where absences far outnumber presences. This leaves analysis highly susceptible to ‘naughty-noughts’: absences that occur beyond the envelope of the species, which can exert strong influence on the model and its predictions (Austin and Meyers, 1996). Also known as ‘excess zeros’, naughty noughts can be estimated via an overall proportion in simple hurdle or mixture models (Martin et al., 2005). However, absences, especially those that occur beyond the species envelope, can often be more diverse than presences. Here we consider an extension to excess zero models. The two-staged approach first exploits the compartmentalisation provided by classification trees (CTs) (as in O’Leary, 2008) to identify multiple sources of naughty noughts and simultaneously delineate several species envelopes. Then SDMs can be fit separately within each envelope, and for this stage, we examine both CTs (as in Falk et al., 2014) and the popular MaxEnt (Elith et al., 2006). We introduce a wider range of model performance measures to improve treatment of naughty noughts in SDM. 
We retain an overall measure of model performance, the area under the curve (AUC) of the Receiver-Operating Curve (ROC), but focus on its constituent measures of false negative rate (FNR) and false positive rate (FPR), and how these relate to the threshold in the predicted probability of presence that delimits predicted presence from absence. We also propose error rates more relevant to users of predictions: the false omission rate (FOR), the chance that a predicted absence corresponds to (and hence wastes) an observed presence, and the false discovery rate (FDR), reflecting those predicted (or potential) presences that correspond to absence. A high FDR may be desirable since it could help target future search efforts, whereas zero or low FOR is desirable since it indicates none of the (often valuable) presences have been ignored in the SDM. For illustration, we chose Bradypus variegatus, a species that has previously been published as an exemplar species for MaxEnt, proposed by Phillips et al. (2006). We used CTs to increasingly refine the species envelope, starting with the whole study region (E0) and eliminating more and more potential naughty noughts (E1–E3). When combined with an SDM fit within the species envelope, the best CT SDM had similar AUC and FPR to the best MaxEnt SDM, but otherwise performed better. The FNR and FOR were greatly reduced, suggesting that CTs handle absences better. Interestingly, MaxEnt predictions showed low discriminatory performance, with the most common predicted probability of presence being in the same range (0.00-0.20) for both true absences and presences. In summary, this example shows that SDMs can be improved by introducing an initial hurdle to identify naughty noughts and partition the envelope before applying SDMs. This improvement was barely detectable via AUC and FPR yet visible in FOR, FNR, and the comparison of the distribution of predicted probability of presence for presences and absences.
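The four error rates named above follow directly from the confusion matrix at a given threshold. The sketch below uses the standard definitions, which match the verbal descriptions in the abstract (FOR is conditioned on predicted absences, FDR on predicted presences); the function name and the example data are hypothetical.

```python
def error_rates(observed, predicted_prob, threshold):
    """Return FNR, FPR, FOR and FDR when presence is predicted for probability >= threshold.

    observed: iterable of 1 (presence) or 0 (absence).
    predicted_prob: predicted probability of presence for each observation.
    """
    tp = fp = tn = fn = 0
    for y, p in zip(observed, predicted_prob):
        predicted_presence = p >= threshold
        if y == 1 and predicted_presence:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_presence:
            fp += 1
        else:
            tn += 1

    def ratio(num, den):
        # Undefined rates (empty denominator) are reported as NaN rather than 0.
        return num / den if den else float("nan")

    return {
        "FNR": ratio(fn, fn + tp),  # observed presences predicted absent
        "FPR": ratio(fp, fp + tn),  # observed absences predicted present
        "FOR": ratio(fn, fn + tn),  # predicted absences that are observed presences
        "FDR": ratio(fp, fp + tp),  # predicted presences that are observed absences
    }

# Hypothetical data: 3 presences and 5 absences.
obs = [1, 1, 1, 0, 0, 0, 0, 0]
prob = [0.9, 0.6, 0.1, 0.7, 0.2, 0.1, 0.05, 0.3]
print(error_rates(obs, prob, threshold=0.5))
```

Because FNR and FPR are conditioned on the observations while FOR and FDR are conditioned on the predictions, the two pairs can diverge sharply when absences far outnumber presences.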
Abstract:
The quality of species distribution models (SDMs) relies to a large degree on the quality of the input data, from bioclimatic indices to environmental and habitat descriptors (Austin, 2002). Recent reviews of SDM techniques have sought to optimize predictive performance (e.g. Elith et al., 2006). In general, SDMs employ one of three approaches to variable selection. The simplest approach relies on the expert to select the variables, as in environmental niche models (Nix, 1986) or a generalized linear model without variable selection (Miller and Franklin, 2002). A second approach explicitly incorporates variable selection into model fitting, which allows examination of particular combinations of variables. Examples include generalized linear or additive models with variable selection (Hastie et al. 2002), or classification trees with complexity or model-based pruning (Breiman et al., 1984; Zeileis, 2008). A third approach uses model averaging to summarize the overall contribution of a variable, without considering particular combinations. Examples include neural networks, boosted or bagged regression trees and Maximum Entropy, as compared in Elith et al. 2006. Typically, users of SDMs will either consider a small number of variable sets, via the first approach, or else supply all of the candidate variables (often numbering more than a hundred) to the second or third approaches. Bayesian SDMs exist, with several methods for eliciting and encoding priors on model parameters (see review in Low Choy et al. 2010). However, few methods have been published for informative variable selection; one example is Bayesian trees (O’Leary 2008). Here we report an elicitation protocol that helps make explicit a priori expert judgements on the quality of candidate variables. This protocol can be flexibly applied to any of the three approaches to variable selection described above, Bayesian or otherwise.
We demonstrate how this information can be obtained and then used to guide variable selection in classical or machine learning SDMs, or to define priors within Bayesian SDMs.
Abstract:
Glycosaminoglycans (GAGs) are complex highly charged linear polysaccharides that have a variety of roles in biological processes. We report the first use of molecular dynamics (MD) free energy calculations using the MM/PBSA method to investigate the binding of GAGs to protein molecules, namely the platelet endothelial cell adhesion molecule 1 (PECAM-1) and annexin A2. Calculations of the free energy of the binding of heparin fragments of different sizes reveal the existence of a region of low GAG-binding affinity in domains 5-6 of PECAM-1 and a region of high affinity in domains 2-3, consistent with experimental data and ligand-protein docking studies. A conformational hinge movement between domains 2 and 3 was observed, which allows the binding of heparin fragments of increasing size (pentasaccharides to octasaccharides) with an increasingly higher binding affinity. Similar simulations of the binding of a heparin fragment to annexin A2 reveal the optimization of electrostatic and hydrogen bonding interactions with the protein and protein-bound calcium ions. In general, these free energy calculations reveal that the binding of heparin to protein surfaces is dominated by strong electrostatic interactions for longer fragments, with equally important contributions from van der Waals interactions and vibrational entropy changes, against a large unfavorable desolvation penalty due to the high charge density of these molecules.
Abstract:
Generating discriminative input features is a key requirement for achieving highly accurate classifiers. The process of generating features from raw data is known as feature engineering and it can take significant manual effort. In this paper we propose automated feature engineering to derive a suite of additional features from a given set of basic features, with the aim of both improving classifier accuracy through discriminative features and assisting data scientists through automation. Our implementation is specific to HTTP computer network traffic. To measure the effectiveness of our proposal, we compare the performance of a supervised machine learning classifier built with automated feature engineering versus one using human-guided features. The classifier addresses a problem in computer network security, namely the detection of HTTP tunnels. We use Bro to process network traffic into base features and then apply automated feature engineering to calculate a larger set of derived features. The derived features are calculated without favour to any base feature and include entropy, length and N-grams for all string features, and counts and averages over time for all numeric features. Feature selection is then used to find the most relevant subset of these features. Testing showed that both classifiers achieved a detection rate above 99.93% at a false positive rate below 0.01%. For our datasets, we conclude that automated feature engineering can increase classifier development speed and reduce the technical difficulty of development by removing manual feature engineering, while maintaining classification accuracy.
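The string-derived features listed above (entropy, length and N-grams) can be sketched in a few lines. This is a minimal illustration assuming character-level Shannon entropy in bits and character N-grams; the abstract does not say whether the authors work at the character or byte level, and the function name and example string are hypothetical.

```python
import math
from collections import Counter

def string_features(value, n=2):
    """Derive length, character entropy (bits) and character n-gram counts from one string."""
    length = len(value)
    counts = Counter(value)
    # Shannon entropy of the character distribution; 0.0 for an empty string.
    entropy = sum((c / length) * math.log2(length / c) for c in counts.values()) if length else 0.0
    # Overlapping n-grams; empty when the string is shorter than n.
    ngrams = Counter(value[i:i + n] for i in range(length - n + 1))
    return {"length": length, "entropy": entropy, "ngrams": ngrams}

# Hypothetical base feature value, e.g. a requested URI.
features = string_features("/index.html")
print(features["length"], round(features["entropy"], 3))
print(features["ngrams"].most_common(3))
```

Applying the same function to every string-valued base feature, with no hand-picked subset, mirrors the "without favour to any base feature" idea; a later feature-selection step would then discard the derived features that carry no signal.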