962 results for Clustering a large document collection
Abstract:
We present a new domain of preferences under which the majority relation is always quasi-transitive and thus Condorcet winners always exist. We model situations where a set of individuals must choose one individual in the group. Agents are connected through some relationship that can be interpreted as expressing neighborhood, and which is formalized by a graph. Our restriction on preferences is as follows: each agent can freely rank his immediate neighbors, but then he is indifferent between each neighbor and all other agents that this neighbor "leads to". Hence, agents can be highly perceptive regarding their neighbors, while being insensitive to the differences between these and other agents which are further removed from them. We show quasi-transitivity of the majority relation when the graph expressing the neighborhood relation is a tree. We also discuss a further restriction that allows the result to be extended to more general graphs. Finally, we compare the proposed restriction with others in the literature, to conclude that it is independent of any previously discussed domain restriction.
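The preference restriction described above can be illustrated with a small sketch. It is only an illustration under stated assumptions, not the authors' construction: the helper names (`branch_ranks`, `strict_majority`, `is_quasi_transitive`, `condorcet_winners`), the four-agent tree, and the neighbor orderings are all hypothetical, and the abstract does not say where an agent places himself, so the sketch assumes each agent ranks himself first and orders his neighbors strictly.

```python
from itertools import permutations

def branch_ranks(tree, agent, neighbor_order, self_rank=0):
    """Ranks (lower = better) for one agent on a tree.

    Every agent reached through a neighbor inherits that neighbor's rank,
    so the agent is indifferent within each branch.
    Assumption (not in the abstract): the agent ranks himself first.
    """
    ranks = {agent: self_rank}
    for position, neighbor in enumerate(neighbor_order, start=1):
        ranks[neighbor] = position
        stack = [neighbor]
        while stack:  # walk the branch behind this neighbor
            node = stack.pop()
            for nxt in tree[node]:
                if nxt not in ranks:
                    ranks[nxt] = position
                    stack.append(nxt)
    return ranks

def strict_majority(profile, candidates):
    """Pairs (a, b) such that more agents strictly prefer a to b than b to a."""
    beats = set()
    for a, b in permutations(candidates, 2):
        pro = sum(1 for r in profile if r[a] < r[b])
        con = sum(1 for r in profile if r[b] < r[a])
        if pro > con:
            beats.add((a, b))
    return beats

def is_quasi_transitive(beats):
    """True if the strict majority relation is transitive."""
    return all((a, c) in beats
               for a, b in beats for b2, c in beats if b == b2)

def condorcet_winners(beats, candidates):
    """Candidates that no other candidate beats by strict majority."""
    return {c for c in candidates if not any((d, c) in beats for d in candidates)}

# Hypothetical tree: b is connected to a, c and d.
tree = {"a": ["b"], "b": ["a", "c", "d"], "c": ["b"], "d": ["b"]}
orders = {"a": ["b"], "b": ["c", "a", "d"], "c": ["b"], "d": ["b"]}
profile = [branch_ranks(tree, agent, orders[agent]) for agent in tree]
beats = strict_majority(profile, list(tree))
print(sorted(beats), is_quasi_transitive(beats), condorcet_winners(beats, list(tree)))
```

Changing `orders` or the tree lets one probe other profiles in the same spirit. The check itself is generic and does not depend on how the ranks were built.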
Abstract:
In this paper we prove that the solution of a backward stochastic differential equation, which involves a subdifferential operator and is associated with a family of reflecting diffusion processes, converges to the solution of a deterministic backward equation and satisfies a large deviation principle.
Abstract:
One of the main implications of the efficient market hypothesis (EMH) is that expected future returns on financial assets are not predictable if investors are risk neutral. In this paper we argue that financial time series offer more information than this hypothesis seems to supply. In particular we postulate that runs of very large returns can be predictable for small time periods. In order to prove this we propose a TAR(3,1)-GARCH(1,1) model that is able to describe two different types of extreme events: a first type generated by large uncertainty regimes where runs of extremes are not predictable and a second type where extremes come from isolated dread/joy events. This model is new in the literature on nonlinear processes. Its novelty resides in two features that make it different from previous TAR methodologies: the regimes are motivated by the occurrence of extreme values, and the threshold variable is defined by the shock affecting the process in the preceding period. In this way the model is able to uncover dependence and clustering of extremes in high as well as in low volatility periods. The model is tested with data from General Motors stock prices corresponding to two crises that had a substantial impact on financial markets worldwide: Black Monday of October 1987 and September 11th, 2001. By analyzing the periods around these crises we find evidence of statistical significance of our model and thereby of predictability of extremes for September 11th but not for Black Monday. These findings support the hypotheses of a big negative event producing runs of negative returns in the first case, and of the burst of a worldwide stock market bubble in the second example. JEL classification: C12; C15; C22; C51 Keywords and Phrases: asymmetries, crises, extreme values, hypothesis testing, leverage effect, nonlinearities, threshold models
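To make the structure named in the abstract concrete, the following sketch simulates a generic three-regime threshold AR(1) process with GARCH(1,1) errors, where the regime at time t is selected by the previous period's shock. It is not the authors' specification: the function name `simulate_tar_garch`, the coefficient values, the thresholds, and the Gaussian innovations are all assumptions chosen for illustration.

```python
import math
import random

def simulate_tar_garch(n, phi=(0.3, 0.0, 0.3), thresholds=(-2.0, 2.0),
                       omega=0.05, alpha=0.10, beta=0.85, seed=0):
    """Simulate y_t = phi[k_t] * y_{t-1} + eps_t with GARCH(1,1) shocks.

    k_t is 0, 1 or 2 depending on whether the previous shock eps_{t-1}
    is below, between, or above the two thresholds.
    All default parameter values are hypothetical.
    """
    if alpha + beta >= 1:
        raise ValueError("alpha + beta must be < 1 for a finite variance")
    rng = random.Random(seed)
    low, high = thresholds
    sigma2 = [omega / (1 - alpha - beta)]  # start at the unconditional variance
    eps = [math.sqrt(sigma2[0]) * rng.gauss(0, 1)]
    y = [eps[0]]
    regime = [1]
    for _ in range(1, n):
        # GARCH(1,1) recursion for the conditional variance
        s2 = omega + alpha * eps[-1] ** 2 + beta * sigma2[-1]
        shock = math.sqrt(s2) * rng.gauss(0, 1)
        # The threshold variable is the shock of the preceding period
        if eps[-1] < low:
            k = 0
        elif eps[-1] > high:
            k = 2
        else:
            k = 1
        y.append(phi[k] * y[-1] + shock)
        sigma2.append(s2)
        eps.append(shock)
        regime.append(k)
    return y, eps, sigma2, regime

y, eps, sigma2, regime = simulate_tar_garch(1000)
print({k: regime.count(k) for k in (0, 1, 2)})
```

With these hypothetical defaults the outer regimes are entered only after a shock larger than two unconditional standard deviations, which is one simple way to tie the regimes to extreme values.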
Abstract:
BACKGROUND: Up to 60% of syncopal episodes remain unexplained. We report the results of a standardized, stepwise evaluation of patients referred to an ambulatory clinic for unexplained syncope. METHODS AND RESULTS: We studied 939 consecutive patients referred for unexplained syncope, who underwent a standardized evaluation, including history, physical examination, electrocardiogram, head-up tilt testing (HUTT), carotid sinus massage (CSM) and hyperventilation testing (HYV). Echocardiogram and stress test were performed when underlying heart disease was initially suspected. Electrophysiological study (EPS) and implantable loop recorder (ILR) were used only in patients with underlying structural heart disease or major unexplained syncope. We identified a cause of syncope in 66% of patients, including 27% vasovagal, 14% psychogenic, 6% arrhythmias, and 6% hypotension. Noninvasive testing identified 92% and invasive testing an additional 8% of the causes. HUTT yielded 38%, CSM 28%, HYV 49%, EPS 22%, and ILR 56% of diagnoses. On average, patients with arrhythmic causes were older, had a lower functional capacity, longer P-wave duration, and presented with fewer prodromes than patients with vasovagal or psychogenic syncope. CONCLUSIONS: A standardized stepwise evaluation emphasizing noninvasive tests yielded 2/3 of causes in patients referred to an ambulatory clinic for unexplained syncope. Neurally mediated and psychogenic mechanisms were behind >50% of episodes, while cardiac arrhythmias were uncommon. Sudden syncope, particularly in older patients with functional limitations or a prolonged P-wave, suggests an arrhythmic cause.
Abstract:
The long-term goal of this research is to develop a program able to produce an automatic segmentation and categorization of textual sequences into discourse types. In this preliminary contribution, we present the construction of an algorithm which takes a segmented text as input and attempts to produce a categorization of sequences, such as narrative, argumentative, descriptive and so on. This work also aims to investigate a possible convergence between the typological approach developed in particular in the field of text and discourse analysis in French by Adam (2008) and Bronckart (1997) and unsupervised statistical learning.
Abstract:
PURPOSE: The Cancer Vaccine Consortium of the Cancer Research Institute (CVC-CRI) conducted a multicenter HLA-peptide multimer proficiency panel (MPP) with a group of 27 laboratories to assess the performance of the assay. EXPERIMENTAL DESIGN: Participants used commercially available HLA-peptide multimers and a well characterized common source of peripheral blood mononuclear cells (PBMC). The frequency of CD8+ T cells specific for two HLA-A2-restricted model antigens was measured by flow cytometry. The panel design allowed participants to use their preferred staining reagents and locally established protocols for cell labeling, data acquisition, and analysis. RESULTS: We observed significant differences in both the performance characteristics of the assay and the reported frequencies of specific T cells across laboratories. These results emphasize the need to identify the critical variables responsible for the observed variability to allow for harmonization of the technique across institutions. CONCLUSIONS: Three key recommendations emerged that would likely reduce assay variability and thus move toward harmonizing this assay: (1) use more than two colors for the staining, (2) collect at least 100,000 CD8 T cells, and (3) use a background control sample to appropriately set the analytical gates. We also provide more insight into the limitations of the assay and identify additional protocol steps that potentially impact the quality of data generated and therefore should serve as primary targets for systematic analysis in future panels. Finally, we propose initial guidelines for harmonizing assay performance, which include the introduction of standard operating protocols to allow for adequate training of technical staff and auditing of test analysis procedures.
Abstract:
We show that H-spaces with finitely generated cohomology, as an algebra or as an algebra over the Steenrod algebra, have homotopy exponents at all primes. This provides a positive answer to a question of Stanley.
Abstract:
The high density of slope failures in western Norway is due to the steep relief and to the concentration of various structures that followed protracted ductile and brittle tectonics. Of the 72 investigated rock slope instabilities, 13 were developed in soft weathered mafic and phyllitic allochthons. Only the intrinsic weakness of such rocks increases the susceptibility to gravitational deformation. In contrast, the gravitational structures in the hard gneisses reactivate prominent ductile and/or brittle fabrics. At 30 rockslides along cataclinal slopes, weak mafic layers of foliation are reactivated as basal planes. Slope-parallel steep foliation forms back-cracks of unstable columns. Folds are specifically present in the Storfjord area, together with a clustering of potential slope failures. Folding increases the probability of having favourably orientated planes with respect to the gravitational forces and the slope. High water pressure is believed to seasonally build up along the shallow-dipping Caledonian detachments and may contribute to destabilization of the rock slope upwards. Regional cataclastic faults localized the gravitational structures at 45 sites. The volume of the slope instabilities tends to increase with the amount of reactivated prominent structures, and the spacing of the latter controls the size of instabilities.
Abstract:
Report prepared following a stay at the Proteus project at New York University between April and June 2007. Clustering techniques can help reduce the supervision needed in pattern acquisition processes for Information Extraction. However, algorithms suited to documents are required, and these algorithms need appropriate measures of similarity between patterns. Kernels can offer a solution to these problems, but unsupervised learning requires more astute strategies than supervised learning in order to incorporate a larger amount of information. In this report, the result of my stay from April to June 2007 at the Proteus project at New York University, several kernels over patterns are proposed and evaluated. First, kernels with a restricted family of patterns are studied, and then kernels already used in supervised Information Extraction tasks are applied. Because clustering performance degrades when irrelevant information is added, the kernels are simplified and strategies are sought to incorporate semantics selectively. Finally, the effect of clustering the semantic knowledge as a step prior to pattern clustering is studied. The various strategies are evaluated on document and pattern clustering tasks using real data.
Abstract:
BACKGROUND: Low 24-hour urine volume (24 UV) may be a significant risk factor for decline in kidney function. We therefore aimed to study associated markers and possible determinants of 24 UV in a sample of the Swiss population. METHODS: The cross-sectional Swiss Salt Study included a population-based sample of 1535 (746 men and 789 women) individuals from three linguistic regions of Switzerland. Data from 1300 subjects were available for the present analysis. 24 UV was measured using 24-hour urine collection. Determinants of 24 UV were identified using multivariable linear regression models. RESULTS: In bivariate analysis, 24 UV was higher in women compared to men (2000 ml/24 h [interquartile range (IQR): 1354, 2562] versus 1780 ml/24 h [IQR: 1244, 2360], p = 0.002). In multivariable regression analyses, independent associated markers of 24 UV were female sex (β = 280, 95% confidence interval [CI]: 174, 386, p < 0.0001), fluid intake (β = 604, 95% CI: 539, 670, p < 0.0001), sodium excretion (β = 4.2, 95% CI: 3.4, 4.9, p < 0.0001), age (β = 6.6, CI: 3.4, 9.7, p < 0.0001), creatinine clearance (β = 2.4, CI: 0.2, 4.6, p = 0.04), living in the German-speaking part of Switzerland (β = 124, CI: 29, 219, p = 0.01), alcohol consumption (β = 41, CI: 9, 73, p = 0.01 for increasing categories of alcohol consumption), body mass index (β = -32, CI: -45, -18, p < 0.0001), current smoking (β = -146, CI: -265, -26, p = 0.02), and consumption of meat and cold cuts (β = -56, CI: -108, -5, p = 0.03). CONCLUSION: In this large population-based, cross-sectional study, we found several strong and independent correlates of 24 UV. These findings may be important to improve our understanding of the development of chronic kidney disease.
Abstract:
Reticulitermes santonensis is a subterranean termite that invades urban areas in France and elsewhere where it causes damage to human-built structures. We investigated the breeding system, colony and population genetic structure, and mode of dispersal of two French populations of R. santonensis. Termite workers were sampled from 43 and 31 collection points, respectively, from a natural population in west-central France (in and around the island of Oleron) and an urban population (Paris). Ten to 20 workers per collection point were genotyped at nine variable microsatellite loci to determine colony identity and to infer colony breeding structure. There was a total of 26 colonies, some of which were spatially expansive, extending up to 320 linear metres. Altogether, the analysis of genotype distribution, F-statistics and relatedness coefficients suggested that all colonies were extended families headed by numerous neotenics (nonwinged precocious reproductives) probably descended from pairs of primary (winged) reproductives. Isolation by distance among collection points within two large colonies from both populations suggested spatially separated reproductive centres with restricted movement of workers and neotenics. There was a moderate level of genetic differentiation (F(ST) = 0.10) between the Oleron and Paris populations, and the number of alleles was significantly higher in Oleron than in Paris, as expected if the Paris population went through bottlenecks when it was introduced from western France. We hypothesize that the diverse and flexible breeding systems found in subterranean termites pre-adapt them to invade new or marginal habitats. Considering that R. santonensis may be an introduced population of the North American species R. flavipes, a breeding system consisting primarily of extended family colonies containing many neotenic reproductives may facilitate human-mediated spread and establishment of R. santonensis in urban areas with harsh climates.
Abstract:
Concerns about the clustering of retail industries and professional services in main streets have traditionally been the public interest rationale for supporting distance regulations. Although many geographic restrictions have been suppressed, deregulation has hinged mostly upon theoretical results on the natural tendency of outlets to differentiate spatially. Empirical evidence has so far offered mixed results. Using the case of deregulation of pharmacy establishment in a region of Spain, we empirically show how pharmacy locations scatter, and that there is no rationale for distance regulation apart from the underlying private interest of very few incumbents.
Abstract:
The enhanced flow in carbon nanotubes is explained using a mathematical model that includes a depletion layer with reduced viscosity near the wall. In the limit of large tubes the model predicts no noticeable enhancement. For smaller tubes the model predicts enhancement that increases as the radius decreases. An analogy between the reduced viscosity and slip-length models shows that the term slip-length is misleading and that on surfaces which are smooth at the nanoscale it may be thought of as a length-scale associated with the size of the depletion region and viscosity ratio. The model therefore provides a physical interpretation of the classical Navier slip condition and explains why 'slip-lengths' may be greater than the tube radius.
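The qualitative claims above can be reproduced with a textbook two-layer Poiseuille calculation. This is a sketch under assumptions, not necessarily the paper's exact formulation: it assumes steady axisymmetric pressure-driven flow with bulk viscosity in the core, a lower constant viscosity in a wall layer of fixed thickness, and no slip at the wall. Integrating the velocity profile gives a flux ratio relative to Hagen-Poiseuille of (a/R)^4 + (mu_bulk/mu_layer)(1 - (a/R)^4), where a = R - delta. The function names and the numerical values (thickness 0.7, viscosity ratio 50, in arbitrary consistent length units) are hypothetical.

```python
def enhancement(radius, delta, visc_ratio):
    """Flux of the two-layer flow divided by the Hagen-Poiseuille flux.

    radius     : tube radius R
    delta      : thickness of the reduced-viscosity layer at the wall
    visc_ratio : bulk viscosity divided by depletion-layer viscosity
    """
    if not 0 <= delta <= radius:
        raise ValueError("delta must lie between 0 and radius")
    core = (radius - delta) / radius  # a / R
    return core ** 4 + visc_ratio * (1 - core ** 4)

def equivalent_slip_length(radius, delta, visc_ratio):
    """Slip length giving the same flux under the Navier slip condition.

    With slip, the flux ratio is 1 + 4 * L_s / R, so L_s follows directly.
    """
    return radius * (enhancement(radius, delta, visc_ratio) - 1) / 4

# Hypothetical values: layer thickness 0.7 and viscosity ratio 50.
for radius in (1.0, 10.0, 100.0, 10000.0):
    print(radius,
          enhancement(radius, 0.7, 50.0),
          equivalent_slip_length(radius, 0.7, 50.0))
```

In this sketch the enhancement tends to 1 for large tubes and grows as the radius shrinks, and for wide tubes the equivalent slip length approaches delta * (visc_ratio - 1), a length set by the layer thickness and the viscosity ratio. For the smallest radius in the loop the equivalent slip length exceeds the radius itself.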
Abstract:
I study large random assignment economies with a continuum of agents and a finite number of object types. I consider the existence of weak priorities discriminating among agents with respect to their rights concerning the final assignment. The respect for priorities ex ante (ex-ante stability) usually precludes ex-ante envy-freeness. Therefore I define a new concept of fairness, called no unjustified lower chances: priorities with respect to one object type cannot justify different achievable chances regarding another object type. This concept, which applies to the assignment mechanism rather than to the assignment itself, implies ex-ante envy-freeness among agents of the same priority type. I propose a variation of Hylland and Zeckhauser's (1979) pseudomarket that meets ex-ante stability, no unjustified lower chances and ex-ante efficiency among agents of the same priority type. Assuming enough richness in preferences and priorities, the converse is also true: any random assignment with these properties could be achieved through an equilibrium in a pseudomarket with priorities. If priorities are acyclical (the ordering of agents is the same for each object type), this pseudomarket achieves ex-ante efficient random assignments.