990 results for "Down-sample algorithm"


Relevance: 30.00%

Abstract:

The Vapnik-Chervonenkis (VC) dimension is a combinatorial measure of a certain class of machine learning problems, which may be used to obtain upper and lower bounds on the number of training examples needed to learn to prescribed levels of accuracy. Most of the known bounds apply to the Probably Approximately Correct (PAC) framework, which is the framework within which we work in this paper. For a learning problem with some known VC dimension, much is known about the order of growth of the sample-size requirement of the problem, as a function of the PAC parameters. The exact value of the sample-size requirement is, however, less well known, and depends heavily on the particular learning algorithm being used. This is a major obstacle to the practical application of the VC dimension. Hence it is important to know exactly how the sample-size requirement depends on the VC dimension, and with that in mind, we describe a general algorithm for learning problems having VC dimension 1. Its sample-size requirement is minimal (as a function of the PAC parameters), and turns out to be the same for all non-trivial learning problems having VC dimension 1. While the method used cannot be naively generalised to higher VC dimension, it suggests that optimal algorithm-dependent bounds may improve substantially on current upper bounds.
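
For orientation, the standard order-of-growth bounds for PAC learning a class of VC dimension d (general results, not taken from this paper) bracket the sample-size requirement m as a function of the accuracy parameter ε and the confidence parameter δ:

    m = O( (1/ε) · ( d · log(1/ε) + log(1/δ) ) )      (classical upper bound)
    m = Ω( (1/ε) · ( d + log(1/δ) ) )                 (classical lower bound)

The algorithm-dependent gap between such bounds is precisely what the abstract addresses for the d = 1 case.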

Relevance: 30.00%

Abstract:

The size frequency distributions of diffuse, primitive and classic β-amyloid (Aβ) deposits were studied in single sections of cortical tissue from patients with Alzheimer's disease (AD) and Down's syndrome (DS) and compared with those predicted by the log-normal model. In a sample of brain regions, these size distributions were compared with those obtained by serial reconstruction through the tissue, and the data were used to adjust the size distributions obtained in single sections. The adjusted size distributions of the diffuse, primitive and classic deposits deviated significantly from a log-normal model in AD and DS, the greatest deviations from the model being observed in AD. More Aβ deposits were observed close to the mean, and fewer in the larger size classes, than predicted by the model. Hence, the growth of Aβ deposits in AD and DS does not strictly follow the log-normal model, deposits growing to within a more restricted size range than predicted. However, Aβ deposits grow to a larger size in DS compared with AD, which may reflect differences in the mechanism of Aβ formation.
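
As a hedged illustration only (not the authors' stereological protocol), comparing an observed size frequency distribution with a log-normal model can be sketched as a distribution fit followed by a goodness-of-fit test; the deposit_sizes array below is a synthetic placeholder.

    # Illustrative sketch: testing deposit sizes against a log-normal model.
    # The sizes are synthetic placeholders, not study data, and the test is
    # only approximate because the parameters are estimated from the same data.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    deposit_sizes = rng.lognormal(mean=3.0, sigma=0.5, size=200)  # stand-in for measured sizes

    shape, loc, scale = stats.lognorm.fit(deposit_sizes, floc=0)  # fit with location fixed at 0
    ks_stat, p_value = stats.kstest(deposit_sizes, 'lognorm', args=(shape, loc, scale))

    print(f"fitted sigma = {shape:.3f}, median size = {scale:.1f}")
    print(f"KS statistic = {ks_stat:.3f}, p = {p_value:.3f}")  # small p suggests departure from log-normal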

Relevance: 30.00%

Abstract:

This thesis applies a hierarchical latent trait model system to a large quantity of data. The motivation was the lack of viable approaches for analysing High Throughput Screening datasets, which may include thousands of high-dimensional data points. High Throughput Screening (HTS) is an important tool in the pharmaceutical industry for discovering leads which can be optimised and further developed into candidate drugs. Since the development of new robotic technologies, the ability to test the activities of compounds has increased considerably in recent years. Traditional methods, which rely on tables and graphical plots to analyse relationships between measured activities and the structure of compounds, are not feasible for a large HTS dataset. Instead, data visualisation provides a method for analysing such large, high-dimensional datasets. So far, a few visualisation techniques for drug design have been developed, but most of them can cope with only a few properties of compounds at a time. We believe that a latent variable model with a non-linear mapping from the latent space to the data space is a preferred choice for visualising a complex high-dimensional data set. As a type of latent variable model, the latent trait model (LTM) can deal with either continuous or discrete data, which makes it particularly useful in this domain. In addition, with the aid of differential geometry, we can examine the distribution of the data through magnification factor and curvature plots. Rather than obtaining the useful information from a single plot, a hierarchical LTM arranges a set of LTMs and their corresponding plots in a tree structure. We model the whole data set with an LTM at the top level, which is broken down into clusters at deeper levels of the hierarchy. In this manner, refined visualisation plots can be displayed at deeper levels and sub-clusters may be found. The hierarchy of LTMs is trained using the expectation-maximisation (EM) algorithm to maximise its likelihood with respect to the data sample. Training proceeds interactively in a recursive, top-down fashion: the user subjectively identifies interesting regions on the visualisation plot that they would like to model in greater detail. At each stage of hierarchical LTM construction, the EM algorithm alternates between the E-step and the M-step. Another problem that can occur when visualising a large data set is that there may be significant overlaps of data clusters, making it very difficult for the user to judge where the centres of regions of interest should be placed. We address this problem by employing the minimum message length technique, which helps the user decide the optimal structure of the model. In this thesis we also demonstrate the applicability of the hierarchy of latent trait models in the field of document data mining.
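
To make the E-step/M-step alternation concrete, the following is a deliberately simplified stand-in: EM for a one-dimensional, two-component Gaussian mixture. It is not the latent trait model or the hierarchical training scheme of the thesis, only a sketch of the alternation with which each model in such a hierarchy is trained.

    # Simplified stand-in for the E-/M-step alternation: EM for a 1-D,
    # two-component Gaussian mixture (not the latent trait model itself).
    import numpy as np

    def em_gmm_1d(x, n_iter=50):
        mu = np.array([x.min(), x.max()], dtype=float)   # crude initialisation
        var = np.array([x.var(), x.var()])
        pi = np.array([0.5, 0.5])
        for _ in range(n_iter):
            # E-step: responsibility of each component for each point
            dens = np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
            resp = pi * dens
            resp /= resp.sum(axis=1, keepdims=True)
            # M-step: re-estimate parameters to increase the likelihood
            nk = resp.sum(axis=0)
            mu = (resp * x[:, None]).sum(axis=0) / nk
            var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
            pi = nk / len(x)
        return mu, var, pi

    rng = np.random.default_rng(1)
    x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 0.5, 200)])
    print(em_gmm_1d(x))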

Relevance: 30.00%

Abstract:

This study compared the molecular lipidomic profile of LDL in patients with nondiabetic advanced renal disease and no evidence of CVD to that of age-matched controls, with the hypothesis that it would reveal proatherogenic lipid alterations. LDL was isolated from 10 normocholesterolemic patients with stage 4/5 renal disease and 10 controls, and lipids were analyzed by accurate mass LC/MS. Top-down lipidomics analysis and manual examination of the data identified 352 lipid species, and automated comparative analysis demonstrated alterations in the lipid profile in disease. The total lipid and cholesterol content was unchanged, but levels of triacylglycerides and N-acyltaurines were significantly increased, while phosphatidylcholines, plasmenyl ethanolamines, sulfatides, ceramides, and cholesterol sulfate were significantly decreased in chronic kidney disease (CKD) patients. Chemometric analysis of individual lipid species showed very good discrimination of control and disease samples despite the small cohorts, and identified individual unsaturated phospholipids and triglycerides as mainly responsible for the discrimination. These findings illustrate the point that although the clinical biochemistry parameters may not appear abnormal, there may be important underlying lipidomic changes that contribute to disease pathology. The lipidomic profile of CKD LDL offers potential for new biomarkers and novel insights into lipid metabolism and cardiovascular risk in this disease. -Reis, A., A. Rudnitskaya, P. Chariyavilaskul, N. Dhaun, V. Melville, J. Goddard, D. J. Webb, A. R. Pitt, and C. M. Spickett. Top-down lipidomics of low density lipoprotein reveal altered lipid profiles in advanced chronic kidney disease. J. Lipid Res. 2015.
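
A hedged sketch of the kind of multivariate discrimination described: project a samples-by-lipid-species intensity matrix onto principal components and inspect the separation of control and CKD samples. The matrix and labels below are synthetic, and PCA is used only as a generic stand-in for whichever chemometric method the authors actually employed.

    # Sketch of chemometric discrimination on a lipid-intensity matrix.
    # Data are synthetic placeholders; the paper's actual method may differ.
    import numpy as np
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(2)
    n_control, n_ckd, n_lipids = 10, 10, 352
    X = rng.normal(size=(n_control + n_ckd, n_lipids))
    X[n_control:, :20] += 1.5                      # pretend 20 lipid species shift in disease
    labels = ["control"] * n_control + ["CKD"] * n_ckd

    scores = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(X))
    for label, (pc1, pc2) in zip(labels, scores):
        print(f"{label:7s} PC1 = {pc1:6.2f}  PC2 = {pc2:6.2f}")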

Relevance: 30.00%

Abstract:

In this paper, a learning algorithm for adjusting the weight coefficients of the Cascade Neo-Fuzzy Neural Network (CNFNN) in sequential mode is introduced. The architecture is similar in structure to the Cascade-Correlation Learning Architecture proposed by S. E. Fahlman and C. Lebiere, but differs from it in the type of artificial neurons used. The CNFNN consists of neo-fuzzy neurons, which can be adjusted using high-speed linear learning procedures. The proposed CNFNN is characterized by a high learning rate and a small required learning sample, and its operation can be described by fuzzy linguistic "if-then" rules, providing "transparency" of the results compared with conventional neural networks. The online learning algorithm allows input data to be processed sequentially in real time.
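
As a rough sketch of the sequential linear learning idea (not the CNFNN's actual learning rule or cascade structure), a single neo-fuzzy-style synapse with triangular membership functions can be adjusted online with an LMS-type update; the number of memberships, the learning rate, and the sine target are illustrative assumptions.

    # Online (LMS-style) adjustment of one neo-fuzzy-style synapse:
    # triangular memberships over a scalar input with linear weights,
    # updated sample by sample. Illustrative only.
    import numpy as np

    centers = np.linspace(-1.0, 1.0, 7)       # membership centres (assumed)
    width = centers[1] - centers[0]
    weights = np.zeros_like(centers)
    eta = 0.2                                 # learning rate (assumed)

    def memberships(x):
        # overlapping triangles; at most two are non-zero for any x
        return np.clip(1.0 - np.abs(x - centers) / width, 0.0, None)

    rng = np.random.default_rng(3)
    for _ in range(2000):                     # sequential, sample-by-sample training
        x = rng.uniform(-1.0, 1.0)
        mu = memberships(x)
        error = np.sin(np.pi * x) - mu @ weights
        weights += eta * error * mu           # linear (LMS) weight update

    print("prediction at x = 0.5:", memberships(0.5) @ weights)
    print("target sin(pi * 0.5):", np.sin(np.pi * 0.5))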

Relevance: 30.00%

Abstract:

Background: Vigabatrin (VGB) is an anti-epileptic medication which has been linked to peripheral constriction of the visual field. Documenting the natural history associated with continued VGB exposure is important when making decisions about the risks and benefits associated with the treatment. Due to its speed, the Swedish Interactive Threshold Algorithm (SITA) has become the algorithm of choice when carrying out Full Threshold automated static perimetry. SITA uses prior distributions of normal and glaucomatous visual field behaviour to estimate threshold sensitivity. As the abnormal model is based on glaucomatous behaviour, this algorithm has not been validated for VGB recipients. We aim to assess the clinical utility of the SITA algorithm for accurately mapping VGB-attributed field loss. Methods: The sample comprised one randomly selected eye of each of 16 patients diagnosed with epilepsy and exposed to VGB therapy. A clinical diagnosis of VGB-attributed visual field loss was documented in 44% of the group. The mean age was 39.3 ± 14.5 years and the mean deviation was −4.76 ± 4.34 dB. Each patient was examined with the Full Threshold, SITA Standard and SITA Fast algorithms. Results: SITA Standard was on average approximately twice as fast (7.6 minutes) and SITA Fast approximately three times as fast (4.7 minutes) as examinations completed using the Full Threshold algorithm (15.8 minutes). In the clinical environment, the visual field outcome with both SITA algorithms was equivalent to visual field examination using the Full Threshold algorithm in terms of visual inspection of the grey scale plots, defect area and defect severity. Conclusions: Our research shows that both SITA algorithms are able to accurately map visual field loss attributed to VGB. As patients diagnosed with epilepsy are often vulnerable to fatigue, the time saving offered by SITA Fast means that this algorithm has a significant advantage for use with VGB recipients.

Relevance: 30.00%

Abstract:

Long-term foetal surveillance is often recommended, and fully non-invasive acoustic recording through the maternal abdomen represents a valuable alternative to ultrasonic cardiotocography. Unfortunately, the recorded heart sound signal is heavily corrupted by noise, so determination of the foetal heart rate raises serious signal processing issues. In this paper, we present a new algorithm for foetal heart rate estimation from foetal phonocardiographic recordings. Filtering is employed as the first step of the algorithm to reduce the background noise. A block for enhancing the first heart sounds is then used to further attenuate the other components of the foetal heart sound signal. A complex logic block, guided by a number of rules concerning foetal heart beat regularity, is then used to select the most probable first heart sounds from several candidates. A final block provides exact first heart sound timing and, in turn, foetal heart rate estimation. The filtering and enhancing blocks are implemented by means of different techniques, so that different processing paths are proposed. Furthermore, a reliability index is introduced to quantify the consistency of the estimated foetal heart rate and, based on statistical parameters, a software quality index is designed to indicate the most reliable analysis procedure (that is, the combination of processing path and first heart sound time mark that provides the lowest estimation errors). The algorithm's performance was tested on phonocardiographic signals recorded in a local private gynaecology practice from a sample group of about 50 pregnant women. Phonocardiographic signals were recorded simultaneously with ultrasonic cardiotocographic signals in order to compare the two foetal heart rate series (the one estimated by our algorithm and the one provided by the cardiotocographic device). Our results show that the proposed algorithm, in particular some analysis procedures, provides reliable foetal heart rate signals, very close to the reference cardiotocographic recordings. © 2010 Elsevier Ltd. All rights reserved.
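
The general pipeline can be sketched as follows; the cut-off frequencies, the Hilbert-envelope enhancement, the refractory period and the synthetic test signal are illustrative assumptions, not the paper's actual processing blocks or decision rules.

    # Sketch of a phonocardiogram heart-rate pipeline: band-pass filtering,
    # envelope enhancement, peak detection, rate estimation. All parameter
    # values are illustrative assumptions.
    import numpy as np
    from scipy.signal import butter, filtfilt, hilbert, find_peaks

    fs = 1000.0                                   # sampling rate (Hz), assumed
    rng = np.random.default_rng(4)
    t = np.arange(0, 10, 1 / fs)

    # toy signal: short "first heart sound" bursts at 140 bpm buried in noise
    signal = np.zeros_like(t)
    for bt in np.arange(0, 10, 60.0 / 140.0):
        idx = (t >= bt) & (t < bt + 0.04)
        signal[idx] += np.sin(2 * np.pi * 45 * (t[idx] - bt))
    signal += 0.5 * rng.normal(size=t.size)

    # 1) band-pass filter around an assumed first-heart-sound band
    b, a = butter(4, [25 / (fs / 2), 65 / (fs / 2)], btype="band")
    filtered = filtfilt(b, a, signal)

    # 2) enhancement via the analytic-signal envelope (one possible choice)
    envelope = np.abs(hilbert(filtered))

    # 3) peak detection with a refractory period (~0.3 s, i.e. at most 200 bpm)
    peaks, _ = find_peaks(envelope, distance=int(0.3 * fs),
                          height=envelope.mean() + envelope.std())

    # 4) heart rate from the median inter-beat interval
    ibi = np.diff(peaks) / fs
    print("estimated foetal heart rate (bpm):", 60.0 / np.median(ibi))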

Relevance: 30.00%

Abstract:

2000 Mathematics Subject Classification: 91B28, 65C05.

Relevance: 30.00%

Abstract:

The connectivity between the fish community of estuarine mangroves and that of freshwater habitats upstream remains poorly understood. In the Florida Everglades, mangrove-lined creeks link freshwater marshes to estuarine habitats downstream and may act as dry-season refuges for freshwater fishes. We examined seasonal dynamics in the fish community of ecotonal creeks in the southwestern region of Everglades National Park, specifically Rookery Branch and the North and Watson rivers. Twelve low-order creeks were sampled via electrofishing, gill nets, and minnow traps during the wet season, transition period, and dry season in 2004-2005. Catches were greater in Rookery Branch than in the North and Watson rivers, particularly during the transition period. Community composition varied seasonally in Rookery Branch, and to a greater extent for the larger species, reflecting a pulse of freshwater taxa into creeks as marshes upstream dried periodically. The pulse was short-lived: a later sample showed substantial decreases in freshwater fish numbers. No evidence of a similar influx was seen in the North and Watson rivers, which drain shorter-hydroperiod marshes and exhibit higher salinities. These results suggest that headwater creeks can serve as important dry-season refugia. Increased freshwater flow resulting from Everglades restoration may enhance this connectivity.

Relevance: 30.00%

Abstract:

Inadequate eating habits are now recognised as a serious problem; the issue is multifactorial and difficult to address, given its many nuances and causes. Individuals with Down syndrome are particularly exposed to the harm arising from poor eating habits, both because of aspects inherent to the condition itself and because of eating-related maladjustments, which makes weight control a necessary measure for proper development. This study therefore aimed to develop and evaluate a nutrition-education proposal, based on the assumptions of mediated learning, with children aged three to four years with Down syndrome. The participants were five children (four girls and one boy), together with their parents and/or guardians. Data collection involved eight play-based workshops with the children and their nutritional evaluation, five meetings with the parents, and three home visits to each participating family. We sought to build, with these children and their families, a nutrition education that could inform their daily food choices. Through listening, observation and questionnaires, in addition to the play-based interventions, it was observed that the first meaning of the act of eating is built within the family and reinforced by social life. Overall, the characteristics of our sample appear to agree with the literature. During the intervention, the children were attentive but showed little understanding of the content. With the mothers, the results were different: they reflected on inadequate feeding practices, including the type and quantity of food offered and the way it is offered, and over the course of the interventions they made changes in their lifestyle, such as recognising the influence they have on the formation of their children's eating habits and offering soft drinks and sweets less frequently. Nutritional interventions and mediations conducted with the mothers therefore appear to be effective strategies against obesity. In view of this, we see the importance of implementing intervention measures to combat and prevent overweight and obesity from childhood, particularly among children with Down syndrome. Childhood obesity should be prevented through educational and informational measures from birth, involving the family and each child, through primary health care and schools.

Relevance: 30.00%

Abstract:

We present a study of the star-forming properties of a stellar mass-selected sample of galaxies in the GOODS (Great Observatories Origins Deep Survey) NICMOS Survey (GNS), based on deep Hubble Space Telescope (HST) imaging of the GOODS North and South fields. Using a stellar mass-selected sample, combined with HST/ACS and Spitzer data to measure both ultraviolet (UV) and infrared-derived star formation rates (SFRs), we investigate the star-forming properties of a complete sample of ∼1300 galaxies down to log M_* = 9.5 at redshifts 1.5 < z < 3. Eight per cent of the sample is made up of massive galaxies with M_* ≥ 10^11 M_⊙. We derive optical colours, dust extinctions and UV and infrared SFRs to determine how the SFR changes as a function of both stellar mass and time. Our results show that SFR increases at higher stellar mass such that massive galaxies nearly double their stellar mass from star formation alone over the redshift range studied, but the average value of SFR for a given stellar mass remains constant over this ∼2 Gyr period. Furthermore, we find no strong evolution in the SFR for our sample as a function of mass over our redshift range of interest; in particular we do not find a decline in the SFR among massive galaxies, as is seen at z < 1. The most massive galaxies in our sample (log M_* ≥ 11) have high average SFRs, with SFR_UV,corr = 103 ± 75 M_⊙ yr^−1, and yet exhibit red rest-frame (U−B) colours at all redshifts. We conclude that the majority of these red high-redshift massive galaxies are red due to dust extinction. We find that A_2800 increases with stellar mass, and show that between 45 and 85 per cent of massive galaxies harbour dusty star formation. These results show that even just a few Gyr after the first galaxies appear, there are strong relations between the global physical properties of galaxies, driven by stellar mass or another underlying feature of galaxies strongly related to the stellar mass.

Relevance: 30.00%

Abstract:

Testing for two-sample differences is challenging when the differences are local and only involve a small portion of the data. To solve this problem, we apply a multi-resolution scanning framework that performs dependent local tests on subsets of the sample space. We use a nested dyadic partition of the sample space to obtain a collection of windows and test for sample differences within each window. We place a joint prior on the states of the local hypotheses that allows both vertical and horizontal message passing along the partition tree, to reflect the spatial dependency among windows. This information-passing framework is critical for detecting local sample differences. We use both the loopy belief propagation algorithm and MCMC to obtain the posterior null probability for each window. These probabilities are then used to report sample differences based on decision procedures. Simulation studies are conducted to illustrate the performance. Multiple testing adjustment and convergence of the algorithms are also discussed.
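
A minimal sketch of the windowing step only, assuming a one-dimensional sample space rescaled to [0, 1): build the nested dyadic partition and tabulate per-window counts from the two samples. The joint prior, the message passing and the posterior computation described above are not reproduced here.

    # Nested dyadic partition of [0, 1) with per-window counts for two samples.
    # (The prior, message passing and posterior inference are not implemented.)
    import numpy as np

    rng = np.random.default_rng(5)
    x = rng.uniform(size=500)                      # sample 1
    y = np.concatenate([rng.uniform(size=450),     # sample 2: local excess in [0.25, 0.30)
                        rng.uniform(0.25, 0.30, size=50)])

    max_depth = 4
    for depth in range(max_depth + 1):
        edges = np.linspace(0.0, 1.0, 2 ** depth + 1)
        counts_x, _ = np.histogram(x, bins=edges)
        counts_y, _ = np.histogram(y, bins=edges)
        print(f"depth {depth}: x = {counts_x.tolist()}  y = {counts_y.tolist()}")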

Relevance: 30.00%

Abstract:

Fitting statistical models is computationally challenging when the sample size or the dimension of the dataset is huge. An attractive approach for down-scaling the problem size is to first partition the dataset into subsets and then fit using distributed algorithms. The dataset can be partitioned either horizontally (in the sample space) or vertically (in the feature space), and the challenge lies in defining an algorithm with low communication, theoretical guarantees and excellent practical performance in general settings. For sample space partitioning, I propose a MEdian Selection Subset AGgregation Estimator (message) algorithm for solving these issues. The algorithm applies feature selection in parallel for each subset using regularized regression or a Bayesian variable selection method, calculates the 'median' feature inclusion index, estimates coefficients for the selected features in parallel for each subset, and then averages these estimates. The algorithm is simple, involves very minimal communication, scales efficiently in sample size, and has theoretical guarantees. I provide extensive experiments to show excellent performance in feature selection, estimation, prediction, and computation time relative to the usual competitors.
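
A hedged sketch of the sample-space-partitioning idea, using the lasso as a stand-in selector: split the rows into subsets, select features on each, take the median inclusion indicator across subsets, refit on the selected features within each subset, and average the coefficients. The thesis' exact selector, tuning and aggregation details may differ, and the data here are simulated.

    # Sketch of median-selection subset aggregation in the spirit of "message".
    # The lasso, tuning and aggregation details are illustrative assumptions.
    import numpy as np
    from sklearn.linear_model import LassoCV, LinearRegression

    rng = np.random.default_rng(6)
    n, p, n_subsets = 3000, 50, 5
    beta = np.zeros(p)
    beta[:4] = [3.0, -2.0, 1.5, 1.0]                  # a few truly relevant features
    X = rng.normal(size=(n, p))
    y = X @ beta + rng.normal(size=n)

    subsets = np.array_split(rng.permutation(n), n_subsets)

    # 1) feature selection on each subset (conceptually in parallel)
    inclusion = np.zeros((n_subsets, p))
    for i, rows in enumerate(subsets):
        lasso = LassoCV(cv=5).fit(X[rows], y[rows])
        inclusion[i] = np.abs(lasso.coef_) > 1e-8

    # 2) 'median' feature inclusion index across subsets
    selected = np.where(np.median(inclusion, axis=0) >= 0.5)[0]

    # 3) refit on the selected features per subset, then average the estimates
    coefs = np.array([LinearRegression().fit(X[rows][:, selected], y[rows]).coef_
                      for rows in subsets])
    print("selected features:", selected.tolist())
    print("averaged coefficients:", np.round(coefs.mean(axis=0), 2).tolist())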

While sample space partitioning is useful in handling datasets with large sample size, feature space partitioning is more effective when the data dimension is high. Existing methods for partitioning features, however, are either vulnerable to high correlations or inefficient in reducing the model dimension. In this thesis, I propose a new embarrassingly parallel framework named DECO for distributed variable selection and parameter estimation. In DECO, variables are first partitioned and allocated to m distributed workers. The decorrelated subset data within each worker are then fitted via any algorithm designed for high-dimensional problems. We show that by incorporating the decorrelation step, DECO can achieve consistent variable selection and parameter estimation on each subset with (almost) no assumptions. In addition, the convergence rate is nearly minimax optimal for both sparse and weakly sparse models and does not depend on the partition number m. Extensive numerical experiments are provided to illustrate the performance of the new framework.

For datasets with both large sample size and high dimensionality, I propose a new "divide-and-conquer" framework, DEME (DECO-message), that leverages both the DECO and message algorithms. The new framework first partitions the dataset in the sample space into row cubes using message and then partitions the feature space within each cube using DECO. This procedure is equivalent to partitioning the original data matrix into multiple small blocks, each of a size that can be stored and fitted on a single machine in parallel. The results are then synthesized via the DECO and message algorithms in reverse order to produce the final output. The whole framework is extremely scalable.

Relevance: 30.00%

Abstract:

Personalised diets based on people's existing food choices, and/or phenotypic, and/or genetic information hold potential to improve public dietary-related health. The aim of this analysis, therefore, was to examine the degree to which the factors that determine uptake of personalised nutrition vary between EU countries, to better target policies to encourage uptake and optimise the health benefits of personalised nutrition technology. A questionnaire developed from previous qualitative research was used to survey nationally representative samples from 9 EU countries (N = 9381). Perceived barriers to the uptake of personalised nutrition comprised three factors (data protection, the eating context, and societal acceptance). Trust in sources of information comprised four factors (commerce and media, practitioners, government, and family and friends). Benefits comprised a single factor. Analysis of Variance (ANOVA) was employed to compare differences in responses between the United Kingdom, Ireland, Portugal, Poland, Norway, the Netherlands, Germany, Greece and Spain. The results indicated that respondents in Greece, Poland, Ireland, Portugal and Spain rated the benefits of personalised nutrition highest, suggesting a particular readiness in these countries to adopt personalised nutrition interventions. Greek participants were more likely to perceive the social context of eating as a barrier to adoption of personalised nutrition, implying a need for support in negotiating social situations while on a prescribed diet. Those in Spain, Germany, Portugal and Poland scored highest on perceived barriers related to data protection. Government was more trusted than commerce to deliver and provide information on personalised nutrition overall. This was particularly the case in Ireland, Portugal and Greece, indicating an imperative to build trust, particularly in the ability of commercial service providers to deliver personalised dietary regimes effectively in these countries. These findings, obtained from a nationally representative sample of EU citizens, imply that a parallel, integrated, public-private delivery system would capture the needs of most potential consumers.

Relevance: 30.00%

Abstract:

Undergraduate project presented to Universidade Fernando Pessoa as part of the requirements for obtaining the licentiate degree (Licenciatura) in Nursing.