925 results for complexity regularization
Abstract:
The increase of publicly available sequencing data has allowed for rapid progress in our understanding of genome composition. As new information becomes available we should constantly be updating and reanalyzing existing and newly acquired data. In this report we focus on transposable elements (TEs) which make up a significant portion of nearly all sequenced genomes. Our ability to accurately identify and classify these sequences is critical to understanding their impact on host genomes. At the same time, as we demonstrate in this report, problems with existing classification schemes have led to significant misunderstandings of the evolution of both TE sequences and their host genomes. In a pioneering publication Finnegan (1989) proposed classifying all TE sequences into two classes based on transposition mechanisms and structural features: the retrotransposons (class I) and the DNA transposons (class II). We have retraced how ideas regarding TE classification and annotation in both prokaryotic and eukaryotic scientific communities have changed over time. This has led us to observe that: (1) a number of TEs have convergent structural features and/or transposition mechanisms that have led to misleading conclusions regarding their classification, (2) the evolution of TEs is similar to that of viruses by having several unrelated origins, (3) there might be at least 8 classes and 12 orders of TEs including 10 novel orders. In an effort to address these classification issues we propose: (1) the outline of a universal TE classification, (2) a set of methods and classification rules that could be used by all scientific communities involved in the study of TEs, and (3) a 5-year schedule for the establishment of an International Committee for Taxonomy of Transposable Elements (ICTTE).
Abstract:
Maximum entropy modeling (Maxent) is a widely used algorithm for predicting species distributions across space and time. Properly assessing the uncertainty in such predictions is non-trivial and requires validation with independent datasets. Notably, model complexity (number of model parameters) remains a major concern in relation to overfitting and, hence, transferability of Maxent models. An emerging approach is to validate the cross-temporal transferability of model predictions using paleoecological data. In this study, we assess the effect of model complexity on the performance of Maxent projections across time using two European plant species (Alnus glutinosa (L.) Gaertn. and Corylus avellana L.) with an extensive late Quaternary fossil record in Spain as a case study. We fit 110 models with different levels of complexity under present time and tested model performance using AUC (area under the receiver operating characteristic curve) and AICc (corrected Akaike Information Criterion) through the standard procedure of randomly partitioning current occurrence data. We then compared these results to an independent validation by projecting the models to mid-Holocene (6000 years before present) climatic conditions in Spain to assess their ability to predict fossil pollen presence-absence and abundance. We find that calibrating Maxent models with default settings results in overly complex models. While model performance increased with model complexity when predicting current distributions, it was highest at intermediate complexity when predicting mid-Holocene distributions. Hence, models of intermediate complexity offered the best trade-off for predicting species distributions across time. Reliable temporal model transferability is especially relevant for forecasting species distributions under future climate change.
Consequently, species-specific model tuning should be used to find the best modeling settings to control for complexity, notably with paleoecological data to independently validate model projections. For cross-temporal projections of species distributions for which paleoecological data are not available, models of intermediate complexity should be selected.
Abstract:
This paper analyses the effects of manipulating the cognitive complexity of L2 oral tasks on language production. It specifically focuses on self-repairs, which are taken as a measure of accuracy since they denote both attention to form and an attempt at being accurate. By means of a repeated measures design, 42 lower-intermediate students were asked to perform three different task types (a narrative task, an instruction-giving task, and a decision-making task) for which two degrees of cognitive complexity were established. The narrative task was manipulated along +/− Here-and-Now, the instruction-giving task along +/− elements, and the decision-making task along +/− reasoning demands. Repeated measures ANOVAs are used to calculate differences between degrees of complexity and among task types. One-way ANOVAs are used to detect potential differences between low-proficiency and high-proficiency participants. Results show an overall effect of Task Complexity on self-repair behavior across task types, with different behaviors existing among the three task types. No differences are found in self-repair behavior between the low- and high-proficiency groups. Results are discussed in the light of theories of cognition and L2 performance (Robinson 2001a, 2001b, 2003, 2005, 2007), L1 and L2 language production models (Levelt 1989, 1993; Kormos 2000, 2006), and attention during L2 performance (Skehan 1998; Robinson 2002).
Abstract:
Experimental animal models are essential to obtain basic knowledge of the underlying biological mechanisms in human diseases. Here, we review major contributions to biomedical research and discoveries that were obtained in the mouse model by using forward genetics approaches and that provided key insights into the biology of human diseases and paved the way for the development of novel therapeutic approaches.
Abstract:
Although fetal anatomy can be adequately viewed in new multi-slice MR images, many critical limitations remain for quantitative data analysis. To this end, several research groups have recently developed advanced image processing methods, often denoted super-resolution (SR) techniques, to reconstruct a high-resolution (HR) motion-free volume from a set of clinical low-resolution (LR) images. The reconstruction is usually modeled as an inverse problem in which the regularization term plays a central role in the reconstruction quality. The literature has largely favored Total Variation energies because of their edge-preserving ability, but only standard explicit steepest-gradient techniques have been applied for optimization. In preliminary work, it was shown that novel fast convex optimization techniques could be successfully applied to design an efficient Total Variation optimization algorithm for the super-resolution problem. In this work, two major contributions are presented. Firstly, we briefly review the Bayesian and Variational dual formulations of current state-of-the-art methods dedicated to fetal MRI reconstruction. Secondly, we present an extensive quantitative evaluation of our previously introduced SR algorithm on both simulated fetal and real clinical data (with both normal and pathological subjects). Specifically, we study the robustness of regularization terms in the presence of residual registration errors, and we also present a novel strategy for automatically selecting the weight of the regularization term relative to the data fidelity term. Our results show that our TV implementation is highly robust to motion artifacts and that it offers the best trade-off between speed and accuracy for fetal MRI recovery in comparison with state-of-the-art methods.
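The role of the Total Variation regularizer in this kind of inverse problem can be sketched in miniature. The example below is not the paper's super-resolution algorithm: it is 1D denoising with plain gradient descent on a smoothed TV energy (rather than the fast convex optimization methods the work advocates), with illustrative parameter values, showing why TV flattens noise while preserving edges.

```python
import numpy as np

def tv_denoise_1d(f, lam=0.3, eps=1e-3, step=0.02, iters=1000):
    """Gradient descent on E(u) = 0.5*||u - f||^2 + lam * sum_i sqrt((u[i+1]-u[i])^2 + eps).

    eps smooths the absolute value so the energy is differentiable everywhere.
    """
    u = f.copy()
    for _ in range(iters):
        d = np.diff(u)                    # forward differences D u
        w = d / np.sqrt(d * d + eps)      # derivative of the smoothed |.|
        dtw = np.zeros_like(u)            # adjoint: D^T w
        dtw[1:] += w
        dtw[:-1] -= w
        u -= step * ((u - f) + lam * dtw)
    return u

# Toy example: a noisy step edge; TV flattens the noise but keeps the jump.
rng = np.random.default_rng(0)
clean = np.concatenate([np.zeros(50), 2.0 * np.ones(50)])
noisy = clean + rng.normal(scale=0.3, size=100)
denoised = tv_denoise_1d(noisy)
```

The weight `lam` plays the role of the regularization weight the paper proposes to select automatically: too small and noise survives, too large and genuine structure is smoothed away.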
Abstract:
This article examines the mainstream categorical definition of coreference as "identity of reference." It argues that coreference is best handled when identity is treated as a continuum, ranging from full identity to non-identity, with room for near-identity relations to explain currently problematic cases. This middle ground is needed to account for those linguistic expressions in real text that stand in relations that are neither full coreference nor non-coreference, a situation that has led to contradictory treatment of cases in previous coreference annotation efforts. We discuss key issues for coreference such as conceptual categorization, individuation, criteria of identity, and the discourse model construct. We redefine coreference as a scalar relation between two (or more) linguistic expressions that refer to discourse entities considered to be at the same granularity level relevant to the linguistic and pragmatic context. We view coreference relations in terms of mental space theory and discuss a large number of real life examples that show near-identity at different degrees.
Abstract:
CONTEXT: Complex steroid disorders such as P450 oxidoreductase deficiency or apparent cortisone reductase deficiency may be recognized by steroid profiling using chromatographic mass spectrometric methods. These methods are highly specific and sensitive, and provide a complete spectrum of steroid metabolites in a single measurement of one sample which makes them superior to immunoassays. The steroid metabolome during the fetal-neonatal transition is characterized by (a) the metabolites of the fetal-placental unit at birth, (b) the fetal adrenal androgens until its involution 3-6 months postnatally, and (c) the steroid metabolites produced by the developing endocrine organs. All these developmental events change the steroid metabolome in an age- and sex-dependent manner during the first year of life. OBJECTIVE: The aim of this study was to provide normative values for the urinary steroid metabolome of healthy newborns at short time intervals in the first year of life. METHODS: We conducted a prospective, longitudinal study to measure 67 urinary steroid metabolites in 21 male and 22 female term healthy newborn infants at 13 time-points from week 1 to week 49 of life. Urine samples were collected from newborn infants before discharge from hospital and from healthy infants at home. Steroid metabolites were measured by gas chromatography-mass spectrometry (GC-MS) and steroid concentrations corrected for urinary creatinine excretion were calculated. RESULTS: 61 steroids showed age and 15 steroids sex specificity. Highest urinary steroid concentrations were found in both sexes for progesterone derivatives, in particular 20α-DH-5α-DH-progesterone, and for highly polar 6α-hydroxylated glucocorticoids. The steroids peaked at week 3 and decreased by ∼80% at week 25 in both sexes. 
The decline of progestins, androgens and estrogens was more pronounced than that of glucocorticoids, whereas the excretion of corticosterone and its metabolites and of mineralocorticoids remained constant during the first year of life. CONCLUSION: The urinary steroid profile changes dramatically during the first year of life and correlates with the physiologic developmental changes during the fetal-neonatal transition. Thus, detailed normative data during this time period permit the use of steroid profiling as a powerful diagnostic tool.
Abstract:
With qualitative methods being increasingly used in health science fields, numerous grids proposing criteria to evaluate the quality of this type of research have been produced. Expert evaluators deem that there is a lack of consensual tools to evaluate qualitative research. Based on the review of 133 quality criteria grids for qualitative research in health sciences, the authors present the results of a computerized lexicometric analysis, which confirms the variety of intra- and inter-grid constructions, including within the same field. This variety is linked to the authors' paradigmatic references underlying the criteria proposed. These references seem to be built intuitively, reflecting internal representations of qualitative research, thus making the grids and their criteria hard to compare. Consequently, the consensus on the definitions and the number of criteria becomes problematic. The paradigmatic and theoretical references of the grids should be specified so that users could better assess their contributions and limitations.
Abstract:
This study extends the standard econometric treatment of appellate court outcomes by 1) considering the role of decision-maker effort and case complexity, and 2) adopting a multi-categorical selection process of appealed cases. We find evidence of appellate courts being affected by both the effort made by first-stage decision makers and case complexity. This illustrates the value of widening the narrowly defined focus on heterogeneity in individual-specific preferences that characterises many applied studies on legal decision-making. Further, the majority of appealed cases represent non-random sub-samples and the multi-categorical selection process appears to offer advantages over the more commonly used dichotomous selection models.
Abstract:
Climate change affects the rate of insect invasions as well as the abundance, distribution and impacts of such invasions on a global scale. Among the principal analytical approaches to predicting and understanding future impacts of biological invasions are Species Distribution Models (SDMs), typically in the form of correlative Ecological Niche Models (ENMs). An underlying assumption of ENMs is that species-environment relationships remain preserved during extrapolations in space and time, although this is widely criticised. The semi-mechanistic modelling platform, CLIMEX, employs a top-down approach using species ecophysiological traits and is able to avoid some of the issues of extrapolation, making it highly applicable to investigating biological invasions in the context of climate change. The tephritid fruit flies (Diptera: Tephritidae) comprise some of the most successful invasive species and serious economic pests around the world. Here we project 12 tephritid species CLIMEX models into future climate scenarios to examine overall patterns of climate suitability and forecast potential distributional changes for this group. We further compare the aggregate response of the group against species-specific responses. We then consider additional drivers of biological invasions to examine how invasion potential is influenced by climate, fruit production and trade indices. Considering the group of tephritid species examined here, climate change is predicted to decrease global climate suitability and to shift the cumulative distribution poleward. However, when examining species-level patterns, the predominant directionality of range shifts for 11 of the 12 species is eastward. Most notably, management will need to consider regional changes in fruit fly species invasion potential where high fruit production, trade indices and predicted distributions of these flies overlap.
Abstract:
Learning of preference relations has recently received significant attention in the machine learning community. It is closely related to classification and regression analysis and can be reduced to these tasks. However, preference learning involves prediction of an ordering of the data points rather than prediction of a single numerical value, as in regression, or a class label, as in classification. Therefore, studying preference relations within a separate framework not only facilitates a better theoretical understanding of the problem, but also motivates the development of efficient algorithms for the task. Preference learning has many applications in domains such as information retrieval, bioinformatics, and natural language processing. For example, algorithms that learn to rank are frequently used in search engines for ordering the documents retrieved by a query. Preference learning methods have also been applied to collaborative filtering problems for predicting individual customer choices from the vast amount of user-generated feedback. In this thesis we propose several algorithms for learning preference relations. These algorithms stem from the well-founded and robust class of regularized least-squares methods and have many attractive computational properties. In order to improve the performance of our methods, we introduce several non-linear kernel functions. Thus, the contribution of this thesis is twofold: kernel functions for structured data that take advantage of various non-vectorial data representations, and preference learning algorithms that are suitable for different tasks, namely efficient learning of preference relations, learning with large amounts of training data, and semi-supervised preference learning. The proposed kernel-based algorithms and kernels are applied to the parse ranking task in natural language processing, document ranking in information retrieval, and remote homology detection in bioinformatics.
Training of kernel-based ranking algorithms can be infeasible when the size of the training set is large. This problem is addressed by proposing a preference learning algorithm whose computational complexity scales linearly with the number of training data points. We also introduce a sparse approximation of the algorithm that can be efficiently trained with large amounts of data. For situations where a small amount of labeled data but a large amount of unlabeled data is available, we propose a co-regularized preference learning algorithm. To conclude, the methods presented in this thesis address not only the efficient training of the algorithms but also fast regularization parameter selection, multiple output prediction, and cross-validation. Furthermore, the proposed algorithms lead to notably better performance in many of the preference learning tasks considered.
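The regularized least-squares approach to preference learning that this thesis builds on can be sketched in a few lines. The example below is a generic linear pairwise sketch, not the thesis's actual (kernel-based) algorithms: each preference pair (i preferred to j) contributes a least-squares constraint w·(x_i − x_j) ≈ 1, and an L2 penalty controls model complexity. The toy data and the margin threshold 0.1 are illustrative.

```python
import numpy as np

def fit_preferences(X, pairs, lam=1.0):
    """Regularized least-squares scoring function for pairwise preferences.

    Each (i, j) in `pairs` means item i is preferred to item j; we fit w so
    that w.(x_i - x_j) is close to 1, with an L2 penalty lam*||w||^2.
    The minimizer has the closed form w = (D^T D + lam*I)^-1 D^T 1,
    where the rows of D are the difference vectors x_i - x_j.
    """
    D = np.array([X[i] - X[j] for i, j in pairs])
    A = D.T @ D + lam * np.eye(X.shape[1])
    return np.linalg.solve(A, D.T @ np.ones(len(pairs)))

# Toy example: preferences are driven entirely by the first feature.
rng = np.random.default_rng(1)
X = rng.normal(size=(20, 3))
pairs = [(i, j) for i in range(20) for j in range(20)
         if X[i, 0] > X[j, 0] + 0.1]       # i preferred when feature 0 is larger
w = fit_preferences(X, pairs, lam=1.0)
ranking = np.argsort(-(X @ w))             # predicted order, best item first
```

The closed form above is what makes the least-squares family computationally attractive; the kernelized and linear-time variants discussed in the thesis replace this dense solve with more scalable machinery.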
Abstract:
As a result of the growing interest in studying employee well-being as a complex process that shows high levels of within-individual variability and evolves over time, the present study considers the experience of flow in the workplace from a nonlinear dynamical systems approach. Our goal is to offer new ways to move the study of employee well-being beyond linear approaches. With nonlinear dynamical systems theory as the backdrop, we conducted a longitudinal study using the experience sampling method and qualitative semi-structured interviews for data collection; 6981 records of data were collected from a sample of 60 employees. The obtained time series were analyzed using various techniques derived from nonlinear dynamical systems theory (i.e., recurrence analysis and surrogate data) and multiple correspondence analyses. The results revealed the following: 1) flow in the workplace presents a high degree of within-individual variability; this variability is characterized as chaotic in most cases (75%); 2) high levels of flow are associated with chaos; and 3) different dimensions of the flow experience (e.g., merging of action and awareness) as well as individual (e.g., age) and job characteristics (e.g., job tenure) are associated with the emergence of different dynamic patterns (chaotic, linear and random).
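The recurrence analysis mentioned in this abstract can be illustrated with a minimal recurrence-rate computation on toy signals (not the study's data or its full recurrence quantification): two time points "recur" when their values fall within a tolerance eps, and structured signals recur far more often than unstructured ones.

```python
import numpy as np

def recurrence_rate(series, eps):
    """Fraction of distinct point pairs closer than eps (basic recurrence analysis)."""
    x = np.asarray(series, dtype=float)
    dist = np.abs(x[:, None] - x[None, :])   # pairwise distance matrix
    recurrences = dist < eps
    n = len(x)
    return (recurrences.sum() - n) / (n * (n - 1))  # exclude the diagonal

rng = np.random.default_rng(0)
t = np.arange(200)
periodic = np.sin(2 * np.pi * t / 20)   # highly structured signal
noise = rng.normal(size=200)            # unstructured signal

rr_periodic = recurrence_rate(periodic, eps=0.1)
rr_noise = recurrence_rate(noise, eps=0.1)
```

Full recurrence quantification (as used with the surrogate-data tests in the study) additionally embeds the series in a delay-coordinate space and measures diagonal-line structure; this sketch shows only the underlying recurrence idea.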
Abstract:
A novel unsymmetric dinucleating ligand (LN3N4) combining a tridentate and a tetradentate binding site linked through a m-xylyl spacer was synthesized as a ligand scaffold for preparing homo- and heterodimetallic complexes, where the two metal ions are bound in two different coordination environments. Site-selective binding of different metal ions is demonstrated. LN3N4 is able to discriminate between CuI and a complementary metal (M′ = CuI, ZnII, FeII, CuII, or GaIII) so that pure heterodimetallic complexes with the general formula [CuIM′(LN3N4)]n+ are synthesized. Reaction of the dicopper(I) complex [CuI2(LN3N4)]2+ with O2 leads to the formation of two different copper-dioxygen (Cu2O2) intermolecular species (O and TP) between two copper atoms located in the same site of different complex molecules. Taking advantage of this feature, reaction of the heterodimetallic complexes [CuIM′(LN3N4)]n+ with O2 at low temperature is used as a tool to determine the final position of the CuI center in the system, because only one of the two Cu2O2 species is formed.