855 results for optimal feature selection


Relevance:

80.00% 80.00%

Publisher:

Abstract:

The real Earth is far from an ideal elastic sphere. The movement of structures or fluids and scattering by thin layers inevitably affect seismic wave propagation, mainly in the form of non-geometrical energy attenuation. Today, most theoretical research and applications assume that all media studied are fully elastic. Ignoring viscoelasticity can, in some circumstances, lead to amplitude and phase distortion, which indirectly affects the traveltimes and waveforms extracted for imaging and inversion. To investigate the response of seismic wave propagation and improve imaging and inversion quality in complex media, we need not only to take the attenuation of real media into account but also to implement it with efficient numerical methods and imaging techniques. In numerical modeling, the most widely used methods, such as finite-difference, finite-element and pseudospectral algorithms, have difficulty improving accuracy and computational efficiency at the same time. To partially overcome this difficulty, this paper devises a matrix differentiator method and an optimal convolutional differentiator method based on staggered-grid Fourier pseudospectral differentiation, and a staggered-grid optimal Shannon singular kernel convolutional differentiator derived from function distribution theory, which are then used to study seismic wave propagation in viscoelastic media. Comparisons and accuracy analysis demonstrate that the optimal convolutional differentiator methods reconcile accuracy with efficiency and are almost twice as accurate as finite differences of the same length. They efficiently reduce dispersion and provide high-precision waveform data.
On the basis of frequency-domain wavefield modeling, we discuss how to solve the linear equations directly and point out that, compared with time-domain methods, frequency-domain methods handle the multi-source problem more conveniently and incorporate medium attenuation much more easily. We also demonstrate the equivalence of the time- and frequency-domain methods through numerical tests when assumptions about the non-relaxation modulus and quality factor are made, and analyze the cause of the waveform differences. In frequency-domain waveform inversion, experiments have been conducted with transmission, crosshole and reflection data. Using the relation between media scales and characteristic frequencies, we analyze the capacity of the frequency-domain sequential inversion method to resist noise and to deal with the non-uniqueness of nonlinear optimization. In crosshole experiments, we identify the main sources of inversion error and determine how an incorrect quality factor affects the inverted results. For surface reflection data, several frequencies are chosen with an optimal frequency selection strategy and used to carry out sequential and simultaneous inversions, in order to verify how important low-frequency data are to the inverted results and how well simultaneous inversion resists noise. Finally, I draw some conclusions about the work done in this dissertation and discuss in detail its existing and potential problems. I also point out possible directions and theories to pursue further, which, to some extent, may provide a helpful reference for researchers interested in seismic wave propagation and imaging in complex media.
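As a rough illustration of what a short differentiator on a staggered grid looks like, the sketch below applies an antisymmetric stencil to samples and returns derivative estimates at the half-grid points. The abstract does not list its optimized convolutional coefficients, so the standard fourth-order staggered-grid finite-difference weights (9/8 and -1/24) are used here as a stand-in; the function name and parameters are illustrative assumptions, not the authors' code.

```python
import math

# Standard fourth-order staggered-grid weights. These are an assumption that
# stands in for the optimized coefficients, which the abstract does not give.
COEFFS = (9.0 / 8.0, -1.0 / 24.0)


def staggered_derivative(samples, spacing, coeffs=COEFFS):
    """Approximate df/dx at half-grid points with an antisymmetric stencil.

    With M = len(coeffs), entry j of the result is the derivative estimate at
    the half-point between samples[j + M - 1] and samples[j + M].
    """
    half_width = len(coeffs)
    result = []
    for i in range(half_width - 1, len(samples) - half_width):
        total = 0.0
        for m, c in enumerate(coeffs, start=1):
            # Pair the sample m steps to the right with its mirror image.
            total += c * (samples[i + m] - samples[i - m + 1])
        result.append(total / spacing)
    return result


# Check against the exact derivative of sin(x), which is cos(x).
h = 0.1
xs = [k * h for k in range(64)]
approx = staggered_derivative([math.sin(x) for x in xs], h)
exact = [math.cos(xs[j + 1] + h / 2) for j in range(len(approx))]
max_error = max(abs(a - e) for a, e in zip(approx, exact))
```

Swapping in a longer or differently optimized coefficient tuple changes only `coeffs`; the stencil loop stays the same, which is what makes comparisons between operators of equal length straightforward.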

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Humans recognize optical reflectance properties of surfaces such as metal, plastic, or paper from a single image without knowledge of illumination. We develop a machine vision system to perform similar recognition tasks automatically. Reflectance estimation under unknown, arbitrary illumination proves highly underconstrained due to the variety of potential illumination distributions and surface reflectance properties. We have found that the spatial structure of real-world illumination possesses some of the statistical regularities observed in the natural image statistics literature. A human or computer vision system may be able to exploit this prior information to determine the most likely surface reflectance given an observed image. We develop an algorithm for reflectance classification under unknown real-world illumination, which learns relationships between surface reflectance and certain features (statistics) computed from a single observed image. We also develop an automatic feature selection method.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

We consider the problem of detecting a large number of different classes of objects in cluttered scenes. Traditional approaches require applying a battery of different classifiers to the image, at multiple locations and scales. This can be slow and can require a lot of training data, since each classifier requires the computation of many different image features. In particular, for independently trained detectors, the (run-time) computational complexity and the (training-time) sample complexity scale linearly with the number of classes to be detected. It seems unlikely that such an approach will scale up to allow recognition of hundreds or thousands of objects. We present a multi-class boosting procedure (joint boosting) that reduces the computational and sample complexity by finding common features that can be shared across the classes (and/or views). The detectors for each class are trained jointly, rather than independently. For a given performance level, the total number of features required, and therefore the computational cost, is observed to scale approximately logarithmically with the number of classes. The jointly selected features are closer to edges and generic features typical of many natural structures than to specific object parts. These generic features generalize better and considerably reduce the computational cost of multi-class object detection.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

R. Jensen and Q. Shen. Semantics-Preserving Dimensionality Reduction: Rough and Fuzzy-Rough Based Approaches. IEEE Transactions on Knowledge and Data Engineering, 16(12): 1457-1471. 2004.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Q. Shen and R. Jensen, 'Selecting Informative Features with Fuzzy-Rough Sets and its Application for Complex Systems Monitoring,' Pattern Recognition, vol. 37, no. 7, pp. 1351-1363, 2004.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

P. Lingras and R. Jensen, 'Survey of Rough and Fuzzy Hybridization,' Proceedings of the 16th International Conference on Fuzzy Systems (FUZZ-IEEE'07), pp. 125-130, 2007.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

R. Jensen, Q. Shen and A. Tuson, 'Finding Rough Set Reducts with SAT,' Proceedings of the 10th International Conference on Rough Sets, Fuzzy Sets, Data Mining and Granular Computing, LNAI 3641, pp. 194-203, 2005.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

R. Jensen, Q. Shen, Data Reduction with Rough Sets, In: Encyclopedia of Data Warehousing and Mining - 2nd Edition, Vol. II, 2008.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Enot, D. P., Beckmann, M., Overy, D., Draper, J. (2006). Predicting interpretability of metabolome models based on behavior, putative identity, and biological relevance of explanatory signals. Proceedings of the National Academy of Sciences of the USA, 103(40), 14865-14870. Sponsorship: BBSRC RAE2008

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Overlay networks have been used for adding and enhancing functionality to the end-users without requiring modifications in the Internet core mechanisms. Overlay networks have been used for a variety of popular applications including routing, file sharing, content distribution, and server deployment. Previous work has focused on devising practical neighbor selection heuristics under the assumption that users conform to a specific wiring protocol. This is not a valid assumption in highly decentralized systems like overlay networks. Overlay users may act selfishly and deviate from the default wiring protocols by utilizing knowledge they have about the network when selecting neighbors to improve the performance they receive from the overlay. This thesis goes against the conventional thinking that overlay users conform to a specific protocol. The contributions of this thesis are threefold. It provides a systematic evaluation of the design space of selfish neighbor selection strategies in real overlays, evaluates the performance of overlay networks that consist of users that select their neighbors selfishly, and examines the implications of selfish neighbor and server selection to overlay protocol design and service provisioning respectively. This thesis develops a game-theoretic framework that provides a unified approach to modeling Selfish Neighbor Selection (SNS) wiring procedures on behalf of selfish users. The model is general, and takes into consideration costs reflecting network latency and user preference profiles, the inherent directionality in overlay maintenance protocols, and connectivity constraints imposed on the system designer. Within this framework the notion of user’s "best response" wiring strategy is formalized as a k-median problem on asymmetric distance and is used to obtain overlay structures in which no node can re-wire to improve the performance it receives from the overlay. 
Evaluation results presented in this thesis indicate that selfish users can reap substantial performance benefits when connecting to overlay networks composed of non-selfish users. In addition, in overlays that are dominated by selfish users, the resulting stable wirings are optimized to such a great extent that even non-selfish newcomers can extract near-optimal performance through naïve wiring strategies. To capitalize on the performance advantages of optimal neighbor selection strategies and the emergent global wirings that result, this thesis presents EGOIST: an SNS-inspired overlay network creation and maintenance routing system. Through an extensive measurement study on the deployed prototype, results presented in this thesis show that EGOIST’s neighbor selection primitives outperform existing heuristics on a variety of performance metrics, including delay, available bandwidth, and node utilization. Moreover, these results demonstrate that EGOIST is competitive with an optimal but unscalable full-mesh approach, remains highly effective under significant churn, is robust to cheating, and incurs minimal overheads. This thesis also studies selfish neighbor selection strategies for swarming applications. The main focus is on n-way broadcast applications where each of the n overlay users wants to push its own distinct file to all other destinations as well as download their respective data files. Results presented in this thesis demonstrate that the performance of our swarming protocol for n-way broadcast on top of overlays of selfish users is far superior to the performance on top of existing overlays. In the context of service provisioning, this thesis examines the use of distributed approaches that enable a provider to determine the number and location of servers for optimal delivery of content or services to its selfish end-users.
To leverage recent advances in virtualization technologies, this thesis develops and evaluates a distributed protocol to migrate servers based on end-user demand and only on local topological knowledge. Results under a range of network topologies and workloads suggest that the performance of the distributed deployment is comparable to that of the optimal but unscalable centralized deployment.
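The "best response" wiring described in the abstract, a k-median problem on asymmetric distances, can be sketched by brute force for a very small overlay. This is only an illustration under stated assumptions: the cost model (direct link cost plus the shortest-path cost through the rest of the overlay, summed with equal weight over all destinations), the function names, and the toy topology are all hypothetical, and exhaustive search is not how a deployed system would scale.

```python
from itertools import combinations

INF = float("inf")


def shortest_paths(dist):
    """Floyd-Warshall over a dense matrix of directed link costs."""
    n = len(dist)
    d = [row[:] for row in dist]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


def best_response(node, k, link_cost, residual):
    """Choose the k out-neighbours minimising node's total cost to all others.

    link_cost[node][w] is the direct cost from node to w. residual[w][t] is
    the shortest-path cost from w to t in the overlay without node's links.
    """
    others = [v for v in range(len(link_cost)) if v != node]
    best = None
    for wiring in combinations(others, k):
        cost = sum(
            min(link_cost[node][w] + residual[w][t] for w in wiring)
            for t in others
        )
        if best is None or cost < best[0]:
            best = (cost, wiring)
    return best


# Toy overlay: nodes 1 -> 2 -> 3 -> 1 form a directed ring; node 0 is joining.
overlay = [
    [0, INF, INF, INF],
    [INF, 0, 1, INF],
    [INF, INF, 0, 1],
    [INF, 1, INF, 0],
]
# Direct (asymmetric) link costs; only row 0 matters for node 0's choice.
link_cost = [
    [0, 5, 1, 2],
    [5, 0, 1, 2],
    [1, 2, 0, 1],
    [2, 1, 2, 0],
]
residual = shortest_paths(overlay)
choice = best_response(0, 1, link_cost, residual)
```

In this toy case node 0 connects to node 2, because the cheap direct link outweighs the extra hop needed to reach node 1. A wiring is stable in the abstract's sense when repeating this computation for every node changes nobody's choice.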

Relevance:

80.00% 80.00%

Publisher:

Abstract:

This article presents a new method for predicting viral resistance to seven protease inhibitors from the HIV-1 genotype, and for identifying the positions in the protease gene at which the specific nature of the mutation affects resistance. The neural network Analog ARTMAP predicts protease inhibitor resistance from viral genotypes. A feature selection method detects genetic positions that contribute to resistance both alone and through interactions with other positions. This method has identified positions 35, 37, 62, and 77, where traditional feature selection methods have not detected a contribution to resistance. At several positions in the protease gene, mutations confer differing degrees of resistance, depending on the specific amino acid to which the sequence has mutated. To find these positions, an Amino Acid Space is introduced to represent genes in a vector space that captures the functional similarity between amino acid pairs. Feature selection identifies several new positions, including 36, 37, and 43, with amino acid-specific contributions to resistance. Analog ARTMAP networks applied to inputs that represent specific amino acids at these positions perform better than networks that use only mutation locations.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

A system for the identification of power quality violations is proposed. It is a two-stage system that exploits the wavelet transform and adaptive neurofuzzy networks. In the first stage, wavelet multiresolution signal analysis is used to denoise and then decompose the monitored signals of the power quality events in order to extract their detailed information. A new optimal feature vector is suggested and adopted in training the neurofuzzy classifier, which greatly reduces the amount of training data needed. A modified organisation map of the neurofuzzy classifier has significantly improved the diagnosis efficiency. Simulation results confirm the aptness and capability of the proposed system in power quality violation detection and automatic diagnosis.
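One common way to turn a multiresolution decomposition into a compact feature vector is to record the energy in each detail band. The sketch below does this with the orthonormal Haar wavelet. It is a hedged illustration only: the abstract does not say which wavelet family or which features the authors use, so the Haar choice, the energy features, and the function names are all assumptions.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def energy_features(signal, levels):
    """Energy of the detail band at each level, then of the final approximation."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    features = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_step(current)
        features.append(sum(d * d for d in detail))
    features.append(sum(a * a for a in current))
    return features


# A clean sinusoid and the same waveform with a one-sample spike (a transient).
clean = [math.sin(2 * math.pi * k / 32) for k in range(128)]
disturbed = list(clean)
disturbed[40] += 1.0

clean_features = energy_features(clean, levels=3)
disturbed_features = energy_features(disturbed, levels=3)
```

The spike raises the energy in the finest detail band, which is the kind of difference a downstream classifier could learn from. Because the transform is orthonormal, the features sum to the total signal energy, so nothing is lost or double counted.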

Relevance:

80.00% 80.00%

Publisher:

Abstract:

A silicon implementation of the Approximate Rotations algorithm capable of carrying the computational load of algorithms such as QRD and SVD, within the real-time realisation of applications such as Adaptive Beamforming, is described. A modification to the original Approximate Rotations algorithm to simplify the method of optimal angle selection is proposed. Analysis shows that fewer iterations of the Approximate Rotations algorithm are required compared with the conventional CORDIC algorithm to achieve similar degrees of accuracy. The silicon design studies undertaken provide direct practical evidence of superior performance with the Approximate Rotations algorithm, requiring approximately 40% of the total computation time of the conventional CORDIC algorithm, for a similar silicon area cost. © 2004 IEEE.
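For context on the baseline the abstract compares against, the following is a minimal sketch of conventional rotation-mode CORDIC. It does not implement the Approximate Rotations algorithm or its optimal angle selection, whose details the abstract does not give. Floating-point arithmetic is used for readability, where hardware would use fixed-point shifts and adds; the function name and iteration count are illustrative assumptions.

```python
import math


def cordic_rotate(angle, iterations=16):
    """Return (cos(angle), sin(angle)) for |angle| <= pi/2 via rotation-mode CORDIC."""
    # Elementary rotation angles atan(2**-i), normally stored in a small table.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Each micro-rotation stretches the vector; pre-scale to cancel the gain.
    gain = 1.0
    for i in range(iterations):
        gain /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = gain, 0.0, angle
    for i in range(iterations):
        # Rotate towards zero residual angle; in hardware 2**-i is a bit shift.
        direction = 1.0 if z >= 0.0 else -1.0
        shift = 2.0 ** -i
        x, y = x - direction * y * shift, y + direction * x * shift
        z -= direction * angles[i]
    return x, y


cos_estimate, sin_estimate = cordic_rotate(0.6)
```

Each iteration adds roughly one bit of accuracy, so the iteration count is the main cost driver. That is why a scheme needing fewer iterations for similar accuracy, as the abstract reports, translates directly into shorter computation time.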

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Background: Ineffective risk stratification can delay diagnosis of serious disease in patients with hematuria. We applied a systems biology approach to analyze clinical, demographic and biomarker measurements (n = 29) collected from 157 hematuric patients: 80 urothelial cancer (UC) and 77 controls with confounding pathologies.

Methods: On the basis of biomarkers, we conducted agglomerative hierarchical clustering to identify patient and biomarker clusters. We then explored the relationship between the patient clusters and clinical characteristics using Chi-square analyses. We determined classification errors and areas under the receiver operating curve of Random Forest Classifiers (RFC) for patient subpopulations using the biomarker clusters to reduce the dimensionality of the data.

Results: Agglomerative clustering identified five patient clusters and seven biomarker clusters. Final diagnosis categories were non-randomly distributed across the five patient clusters. In addition, two of the patient clusters were enriched with patients with ‘low cancer-risk’ characteristics. The biomarkers which contributed to the diagnostic classifiers for these two patient clusters were similar. In contrast, three of the patient clusters were significantly enriched with patients harboring ‘high cancer-risk’ characteristics including proteinuria, aggressive pathological stage and grade, and malignant cytology. Patients in these three clusters included controls, that is, patients with other serious disease and patients with cancers other than UC. Biomarkers which contributed to the diagnostic classifiers for the largest ‘high cancer-risk’ cluster were different from those contributing to the classifiers for the ‘low cancer-risk’ clusters. Biomarkers which contributed to subpopulations that were split according to smoking status, gender and medication were different.

Conclusions: The systems biology approach applied in this study allowed the hematuric patients to cluster naturally on the basis of the heterogeneity within their biomarker data, into five distinct risk subpopulations. Our findings highlight an approach with the promise to unlock the potential of biomarkers. This will be especially valuable in the field of diagnostic bladder cancer where biomarkers are urgently required. Clinicians could interpret risk classification scores in the context of clinical parameters at the time of triage. This could reduce cystoscopies and enable priority diagnosis of aggressive diseases, leading to improved patient outcomes at reduced costs. © 2013 Emmert-Streib et al; licensee BioMed Central Ltd.