852 results for Initial data problem
Abstract:
The telemetry data processing operations intended for a given mission are pre-defined by the onboard telemetry configuration and the mission trajectory, and the overall telemetry methodology has stabilized lately for ISRO vehicles. The telemetry data processing problem is reduced through hierarchical problem reduction, whereby the sequencing of operations emerges as the control task and the operations on data as the function task. The input, output and execution criteria of each function task are captured in tables; the control task examines these tables and schedules a function task when its criteria are met.
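The table-driven split between control task and function tasks can be sketched as follows. This is a minimal illustration only: the class `FunctionTask`, the function `control_task`, the task names and the operations are all hypothetical, and the execution criterion is simplified to "all inputs are available".

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class FunctionTask:
    """One table row: what a function task needs, produces and does."""
    name: str
    inputs: List[str]            # data items that must exist before the task may run
    output: str                  # data item the task produces
    run: Callable[..., object]   # the operation on the data


def control_task(table: List[FunctionTask], data: Dict[str, object]) -> List[str]:
    """Scan the table repeatedly and run each task once its inputs are available."""
    order = []
    pending = list(table)
    progress = True
    while pending and progress:
        progress = False
        for task in list(pending):
            # Simplified execution criterion (assumption): all inputs present.
            if all(key in data for key in task.inputs):
                data[task.output] = task.run(*(data[k] for k in task.inputs))
                order.append(task.name)
                pending.remove(task)
                progress = True
    return order


# Hypothetical table; the rows are deliberately listed out of execution order.
table = [
    FunctionTask("scale", ["decoded"], "engineering", lambda xs: [0.5 * x for x in xs]),
    FunctionTask("decode", ["raw"], "decoded", lambda xs: [x & 0xFF for x in xs]),
]
data = {"raw": [0x1FF, 0x102]}
order = control_task(table, data)
print(order)
```

The sequencing is not coded anywhere explicitly; it follows from the input and output columns of the table, which is the point of separating the control task from the function tasks.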
Abstract:
In a number of situations the observations are directions. The first inferential question to answer when dealing with such data is, “Are they isotropic, i.e. uniformly distributed?” The answer to this question has a long history, which we retrace briefly before providing exact and approximate solutions to this so-called “Pearson’s Random Walk” problem.
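As a rough illustration of the isotropy question for planar directions, the sketch below computes the mean resultant length of n unit vectors (the end point of a Pearson random walk with unit steps) and the standard large-sample Rayleigh approximation, P(n R̄² > z) ≈ exp(−z). The function name `rayleigh_test` is hypothetical, and the approximation shown is the textbook first-order one, not necessarily the solution derived in the paper.

```python
import math


def rayleigh_test(angles):
    """Return (mean resultant length, Rayleigh statistic, approximate p-value).

    angles: directions in radians. Under uniformity, 2*n*R_bar**2 is
    approximately chi-square with 2 degrees of freedom for large n,
    which gives the p-value exp(-z) with z = n * R_bar**2.
    """
    n = len(angles)
    c = sum(math.cos(a) for a in angles)
    s = sum(math.sin(a) for a in angles)
    r_bar = math.hypot(c, s) / n      # length of the resultant, scaled by n
    z = n * r_bar ** 2
    p_approx = math.exp(-z)           # first-order large-sample approximation
    return r_bar, z, p_approx


# Eight evenly spaced directions: the resultant vanishes, no evidence against isotropy.
even = [2 * math.pi * k / 8 for k in range(8)]
print(rayleigh_test(even))

# Eight identical directions: the resultant is maximal, isotropy is clearly rejected.
print(rayleigh_test([0.3] * 8))
```

For small samples the approximation is crude, which is one reason an exact treatment of the resultant-length distribution is of interest.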
Abstract:
In a business environment that is characterized by intense competition, building customer loyalty has become a key area of focus for most financial institutions. The explosion of the services sector, changing customer demographics, deregulation and the emergence of new technology in the financial services industry have had a critical impact on consumers’ financial services buying behaviour. The changes have forced banks to modify their service offerings to customers so as to ensure high levels of customer satisfaction and also high levels of customer retention. Banks have historically had difficulty distinguishing their products from one another because of their relative homogeneity; with increasing competition, the problem has only intensified, with no coherent distinguishing theme. Rising wealth, product proliferation, regulatory changes and newer technologies are together making bank switching easier for customers. In order to remain competitive, it is important for banks to retain their customer base. The financial services sector is the foundation for any economy and mobilizes and allocates resources. The retail banking sector in India has emerged as one of the major drivers of the overall banking industry and has witnessed enormous growth. Switching behaviour has a negative impact on the banks’ market share and profitability, as the costs of acquiring customers are much higher than the costs of retaining them. When customers switch, the business loses the potential for additional profits from the customer, and the initial costs the business invested in the customer are lost. The objective of the thesis was to examine the relationship among triggers that customers experience, their perceptions of service quality, consumers’ commitment and behavioral intentions in the contemporary Indian retail banking context through the eyes of the customer.
To understand customers’ perception of these aspects, data were collected from retail banking customers alone for the purpose of analysis, though the banks’ views were considered during the qualitative work carried out prior to the main study. No respondent who was an employee of a banking organization was considered for the final study, to avoid the possibility of any bias that could affect the results adversely. The data for the study were collected from customers who had switched banks and from non-switchers. The study attempted to develop and validate a multidimensional construct of service quality for retail banking from the consumer’s perspective. A major conclusion from the empirical research was the confirmation of the multidimensional construct for perceived service quality in the banking context. Switching can be viewed as an optimization problem for customers: they weigh the potential gains of switching to another service provider against the costs of leaving the current one. As banks do not provide tangible products, their service quality is usually assessed through the service provider’s relationship with customers. Thus, banks should pay attention to their employees’ skills and knowledge, to assessing customers’ needs and to offering fast and efficient services.
Abstract:
Soil organic matter (SOM) vitally impacts all soil functions and plays a key role in the global carbon (C) cycle. More than 70% of the terrestrial C stocks that participate in the active C cycle are stored in the soil. Therefore, quantitative knowledge of the rates of C incorporation into SOM fractions of different residence time is crucial to understand and predict the sequestration and stabilization of soil organic carbon (SOC). Consequently, there is a need for fractionation procedures that are capable of isolating functional SOM fractions, i.e. fractions that are defined by their stability. The literature generally refers to three main mechanisms of SOM stabilization: protection of SOM from decomposition by (i) its structural composition, i.e. recalcitrance, (ii) spatial inaccessibility and/or (iii) interaction with soil minerals and metal ions. One of the difficulties in developing fractionation procedures for the isolation of functional SOM fractions is the marked heterogeneity of the soil environment with its various stabilization mechanisms (often several mechanisms operating simultaneously) in soils and soil horizons of different texture and mineralogy. The overall objective of the present thesis was to evaluate present fractionation techniques and to get a better understanding of the factors of SOM sequestration and stabilization. The first part of this study addresses the structural composition of SOM. Using 13C cross-polarization magic-angle spinning (CPMAS) nuclear magnetic resonance (NMR) spectroscopy, (i) the effect of land use on SOM composition was investigated and (ii) it was examined whether SOM composition contributes to the different stability of SOM in density and aggregate fractions. The second part of the present work deals with the mineral-associated SOM fraction.
The aim was (iii) to evaluate the suitability of chemical fractionation procedures used in the literature for the isolation of stable SOM pools (stepwise hydrolysis, treatments using oxidizing agents like Na2S2O8, H2O2, and NaOCl as well as demineralization of the residue obtained by the NaOCl treatment using HF (NaOCl+HF)) by pool sizes, 13C and 14C data. Further, (iv) the isolated SOM fractions were compared to the inert organic matter (IOM) pool obtained for the investigated soils using the Rothamsted Carbon Model and isotope data in order to see whether the tested chemical fractionation methods produce SOM fractions capable of representing this pool. Besides chemical fractionation, (v) the suitability of thermal oxidation at different temperatures for obtaining stable SOC pools was evaluated. Finally, (vi) the short-term aggregate dynamics and the factors that impact macroaggregate formation and C stabilization were investigated by means of an incubation study using treatments with and without application of 15N-labeled maize straw of different degradability (leaves and coarse roots). All treatments were conducted with and without the addition of fungicide. Two study sites with different soil properties and land management were chosen for these investigations. The first one, located at Rotthalmünster, is a Stagnic Luvisol (silty loam) under different land use regimes. The Ah horizons of a spruce forest and continuous grassland and the Ap and E horizons of two plots with arable crops (continuous maize and wheat cropping) were examined. The soil of the second study site, located at Halle, is a Haplic Phaeozem (loamy sand) where the Ap horizons of two plots with arable crops (continuous maize and rye cropping) were investigated. Both study sites had a C3-/C4-vegetational change on the maize plot for the purpose of tracing the incorporation of the younger, maize-derived C into different SOM fractions and calculating the apparent C turnover times of these fractions.
The Halle site is located near a train station and industrial areas, which caused a contamination with high amounts of fossil C. The investigation of aggregate and density fractions by 13C CPMAS NMR spectroscopy revealed that density fractionation isolated SOM fractions of different composition. The consumption of a considerable part (10–20%) of the easily available O-alkyl-C and the selective preservation of the more recalcitrant alkyl-C when passing from litter to the different particulate organic matter (POM) fractions suggest that density fractionation was able to isolate SOM fractions with different degrees of decomposition. The spectra of the aggregate fractions resembled those of the mineral-associated SOM fraction obtained by density fractionation, and no considerable differences were observed between aggregate size classes. Comparison of plant litter, density and aggregate size fractions from soil under different land use showed that the type of land use markedly influenced the composition of SOM. While SOM of the acid forest soil was characterized by a large content (> 50%) of POM, which contained high amounts of spruce-litter derived alkyl-C, the organic matter in the biologically more active grassland and arable soils was dominated by mineral-associated SOM (> 95%). This SOM fraction comprised greater proportions of aryl- and carbonyl-C and is considered to contain a higher amount of microbially-derived organic substances. Land use can alter both the structure and the stability of SOM fractions. All applied chemical treatments induced considerable SOC losses (> 70–95% of mineral-associated SOM) in the investigated soils. The proportion of residual C after chemical fractionation was largest in the arable Ap and E horizons and increased with decreasing C content in the initial SOC after stepwise hydrolysis as well as after the oxidative treatments with H2O2 and Na2S2O8.
This can be expected for a functional stable pool of SOM, because it is assumed that the more easily available part of SOC is consumed first if C inputs decrease. All chemical treatments led to a preferential loss of the younger, maize-derived SOC, but this was most pronounced after the treatments with Na2S2O8 and H2O2. After all chemical fractionations, the mean 14C ages of SOC were higher than in the mineral-associated SOM fraction for both study sites and increased in the order: NaOCl < NaOCl+HF ≤ stepwise hydrolysis << H2O2 ≈ Na2S2O8. The results suggest that all treatments were capable of isolating a more stable SOM fraction, but the treatments with H2O2 and Na2S2O8 were the most efficient ones. However, none of the chemical fractionation methods was able to fit the IOM pool calculated using the Rothamsted Carbon Model and isotope data. In the evaluation of thermal oxidation for obtaining stable C fractions, SOC losses increased with temperature from 24–48% (200°C) to 100% (500°C). In the Halle maize Ap horizon, losses of the young, maize-derived C were considerably higher than losses of the older C3-derived C, leading to an increase in the apparent C turnover time from 220 years in mineral-associated SOC to 1158 years after thermal oxidation at 300°C. Most likely, the preferential loss of maize-derived C in the Halle soil was caused by the presence of the high amounts of fossil C mentioned above, which make up a relatively large thermally stable C3-C pool in this soil. This agrees with lower overall SOC losses for the Halle Ap horizon compared to the Rotthalmünster Ap horizon. In the Rotthalmünster soil only slightly more maize-derived than C3-derived SOC was removed by thermal oxidation. Apparent C turnover times increased slightly from 58 years in mineral-associated SOC to 77 years after thermal oxidation at 300°C in the Rotthalmünster Ap and from 151 to 247 years in the Rotthalmünster E horizon. 
This led to the conclusion that thermal oxidation of SOM was not capable of isolating SOM fractions of considerably higher stability. The incubation experiment showed that macroaggregates develop rapidly after the addition of easily available plant residues. Within the first four weeks of incubation, the maximum aggregation was reached in all treatments without addition of fungicide. The formation of water-stable macroaggregates was related to the size of the microbial biomass pool and its activity. Furthermore, fungi were found to be crucial for the development of soil macroaggregates, as the formation of water-stable macroaggregates was significantly delayed in the fungicide-treated soils. The C concentration in the obtained aggregate fractions decreased with decreasing aggregate size class, which is in line with the aggregate hierarchy postulated by several authors for soils with SOM as the major binding agent. Macroaggregation involved incorporation of large amounts of maize-derived organic matter, but macroaggregates did not play the most important role in the stabilization of maize-derived SOM because of their relatively low amount (less than 10% of the soil mass). Furthermore, the maize-derived organic matter was quickly incorporated into all aggregate size classes. The microaggregate fraction stored the largest quantities of maize-derived C and N; up to 70% of the residual maize-C and -N were stored in this fraction.
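The apparent C turnover times mentioned above rest on the C3-/C4-vegetation change. A common way to compute them, sketched here under stated assumptions, is a two-source δ13C mixing calculation followed by a first-order decay model at steady state. The function names and all numeric values below are hypothetical illustrations, and the thesis may use a variant of this calculation.

```python
import math


def maize_fraction(delta_sample, delta_reference, delta_maize, delta_c3_plant):
    """Fraction of maize-derived C in a SOM fraction (two-source mixing).

    delta_sample:    delta 13C of the fraction from the maize plot
    delta_reference: delta 13C of the same fraction from the C3 reference plot
    delta_maize, delta_c3_plant: delta 13C of the respective plant residues
    """
    return (delta_sample - delta_reference) / (delta_maize - delta_c3_plant)


def apparent_turnover_time(years_since_change, fraction_new):
    """Apparent turnover time in years, assuming first-order kinetics and steady state."""
    if not 0.0 < fraction_new < 1.0:
        raise ValueError("fraction_new must lie strictly between 0 and 1")
    return -years_since_change / math.log(1.0 - fraction_new)


# Made-up delta values (per mil) and a made-up 40 years of maize cropping.
f = maize_fraction(-24.0, -26.0, -12.0, -28.0)
t = apparent_turnover_time(40.0, f)
print(f, t)
```

Under this model, a treatment that preferentially removes the younger, maize-derived C lowers the maize fraction of the residue and therefore lengthens its apparent turnover time.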
Abstract:
In this thesis, non-overlapping domain decomposition methods are generalized with respect to the classes of problems to be solved and are also considered in contexts not previously studied. The focus is on functional-analytic investigations of well-posedness, unique solvability and convergence. The first part treats linear elliptic Dirichlet boundary value problems, admitting not only problems with a dominant principal part but also problems in which the principal part is singularly perturbed, such as convection- or reaction-dominated problems. The second part deals with (uniformly) monotone coercive quasilinear elliptic Dirichlet boundary value problems. In both cases the Lipschitz domain is decomposed into finitely many Lipschitz subdomains, where in particular cross points and subdomains without an outer boundary are permitted. Transmission problems with freely selectable $L^{\infty}$ parameter functions are then derived, with the conormal derivatives interpreted as functionals on suitable function spaces over the boundary segments ($H_{00}^{1/2}(\Gamma)$). Solving these transmission problems iteratively with an approach due to Deng leads to a substructuring method with Robin-type transmission conditions in which, thanks to a clever update of the Robin data, the conormal derivatives need not be evaluated (in particular, the well-known Robin-Robin method of Lions is contained as a special case). Convergence with respect to a partitioned $H^1$ norm is shown for both problem classes. No regularity requirements beyond $H^1$ are imposed on the solutions, and the domains need not satisfy any additional smoothness assumptions. The last chapter investigates non-monotone coercive quasilinear problems, where the underlying domain is decomposed into only two Lipschitz subdomains.
The associated nonlinear transmission problem is converted by Kirchhoff transformation into linear subproblems with nonlinear coupling conditions. An optimization-based solution approach, which minimizes a suitable distance between the back-transformed Dirichlet data of the linear subproblems on the boundary segments, leads to an optimal control problem. The resulting regularized unconstrained minimization problems are solved with a gradient method under minimal smoothness requirements on the nonlinearities. Under additional smoothness assumptions on the nonlinearities and further technical assumptions on the solution of the original quasilinear problem, quadratic convergence of Newton's method can also be guaranteed.
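For readers unfamiliar with the Kirchhoff transformation, the following sketch shows the idea on a model problem. The specific equation, the coefficients $k_i$, the right-hand side $f$ and the two-subdomain notation are assumptions made for illustration and are not taken from the thesis.

```latex
% Model quasilinear problem on subdomain \Omega_i (assumed form),
% with a coefficient depending on the solution only:
\begin{align*}
  -\nabla\cdot\bigl(k_i(u_i)\,\nabla u_i\bigr) &= f
      \quad\text{in } \Omega_i, \qquad k_i \ge k_0 > 0,\ i = 1, 2. \\
% Kirchhoff transformation and its gradient:
  w_i = K_i(u_i) &:= \int_0^{u_i} k_i(s)\,\mathrm{d}s
      \quad\Longrightarrow\quad \nabla w_i = k_i(u_i)\,\nabla u_i . \\
% Each transformed subproblem is linear:
  -\Delta w_i &= f \quad\text{in } \Omega_i . \\
% Coupling on the interface \Gamma: the flux condition stays linear,
% continuity of u becomes nonlinear in w:
  \frac{\partial w_1}{\partial n} &= \frac{\partial w_2}{\partial n},
  \qquad K_1^{-1}(w_1) = K_2^{-1}(w_2) \quad\text{on } \Gamma .
\end{align*}
```

In this sketch the nonlinearity moves entirely into the interface condition $K_1^{-1}(w_1) = K_2^{-1}(w_2)$, which is why a distance between back-transformed Dirichlet data is a natural quantity to minimize.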
Abstract:
Medicinal herbs are exposed to numerous influences during the drying process that decisively affect the quality of the end product. This research deals with the drying of lemon balm (Melissa officinalis L.) into a high-quality end product. Drying strategies are proposed that incorporate experimental and mathematical aspects in order to achieve, at adequate productivity, the required quality characteristics with regard to color change and essential oil content. Because of various problems in the drying process, dried lemon balm currently cannot always meet the market's high quality requirements. There is no standardized information on the individual and complex drying parameters. In practice, drying is based on empirical values, or procedures used for drying other plants are copied, and drying is often not reproducible or rests on subjective approximations. As a consequence of this poorly adapted choice of drying parameters, problems often arise such as overdrying, which leads to increased breakage losses of leaf mass, or insufficient drying, which in turn results in too high a final moisture content in the product. This in turn inevitably leads to an unacceptable color change and an excessive loss of essential oils. Because of the different thermal and mechanical properties of leaves and stems, uneven drying is the rule. In addition, an unnecessarily long drying time is observed, which leads to increased energy consumption. Drying in solar tunnel dryers entails the following problem: because of the uncontrolled incidence of radiation, it is difficult to regulate the drying temperature. The radiation also affects the color of the product through photochemical reactions.
In addition, the strong fluctuations in radiation, temperature and air humidity create unstable conditions for uniform and controllable drying. In view of the problems mentioned, this work sets the following research priorities: new strategies for improving quality are developed, with the aim of reducing drying time and energy consumption. To propose a methodology based on optimal drying parameters, temperature and air humidity were considered as variables in relation to drying time, essential oil content, color change and required energy. The effects of these parameters on the quality characteristics in solar tunnel dryers were also analyzed. Different approaches were pursued to achieve these goals. The sorption isotherms and drying kinetics of lemon balm were determined and fitted to various mathematical models. An alternative staged drying process in graduated steps was also carried out in order to increase the quality of the end product while lowering total energy consumption. In addition, a statistical experimental design based on central composite design (CCD) and response surface methodology (RSM) was proposed in order to achieve the desired quality characteristics and the necessary energy input as a function of air temperature and air humidity. Regression models were generated from the data obtained, and the behavior of the drying process was described. Finally, a statistical design of experiments (DOE) was applied to assess the influence of the parameters on the achievable product quality in a solar tunnel dryer.
The effects of shading, position in the tunnel, filling level and air velocity on drying time, color change and essential oil content were analyzed. Corresponding regression models for use in solar tunnel dryers were also developed. The main results are analyzed with respect to optimal drying parameters in terms of quality and energy consumption.
Abstract:
Rhizome rot disease caused by Erwinia spp. is emerging as a major problem in banana nurseries and young plantations worldwide. Management of the disease is possible only in the initial stages of development. Currently no method is available for rescuing plant material already infected with this pathogen. A total of 95 Nanjanagud Rasabale and 212 Elakki Bale suckers were collected from different growing regions of Karnataka, India. During nursery maintenance of these lines, severe Erwinia infection was noticed. We present a method to rescue infected plants and establish them under field conditions. Differences were noticed in infection severity amongst the varieties and their accessions. Field data revealed good establishment and growth of most rescued plants under field conditions. The discussed rescue protocol coupled with good field management practices resulted in 89.19 and 82.59 percent field establishment of previously infected var. Nanjanagud Rasabale and var. Elakki Bale plants, respectively.
Abstract:
This study investigated the relationship between higher education and the requirements of the world of work, with an emphasis on the effect of problem-based learning (PBL) on graduates' competencies. Implementing the full PBL method is costly (Albanese & Mitchell, 1993; Berkson, 1993; Finucane, Shannon, & McGrath, 2009). However, implementing PBL in a less than curriculum-wide mode is more achievable in a broader context (Albanese, 2000). This means that higher education institutions implement only a few PBL components in the curriculum, or that a teacher implements a few PBL components at the course level. For this kind of implementation there is a need to identify PBL components and their effects on particular educational outputs (Hmelo-Silver, 2004; Newman, 2003). So far, however, there has been little research on this topic. The main aims of this study were: (1) to identify each of the PBL components, which were manifested in the development of a valid and reliable PBL implementation questionnaire, and (2) to determine the effect of each identified PBL component on specific graduates' competencies. The analysis was based on quantitative data collected in a survey of medicine graduates of Gadjah Mada University, Indonesia. A total of 225 graduates responded to the survey. The results of confirmatory factor analysis (CFA) showed that all individual constructs of PBL and graduates' competencies had acceptable goodness-of-fit (GOF) values. Additionally, the values of the factor loadings (standardized loading estimates), the average variance extracted (AVE), construct reliability (CR) and average shared squared variance (ASV) provided evidence of convergent and discriminant validity. All values indicated valid and reliable measurements. The investigation of the effects of PBL showed that each PBL component had specific effects on graduates' competencies.
Interpersonal competencies were affected by Student-centred learning (β = .137; p < .05) and Small group components (β = .078; p < .05). Problem as stimulus affected Leadership (β = .182; p < .01). Real-world problems affected Personal and organisational competencies (β = .140; p < .01) and Interpersonal competencies (β = .114; p < .05). Teacher as facilitator affected Leadership (β = .142; p < .05). Self-directed learning affected Field-related competencies (β = .080; p < .05). These results can help higher education institutions and educators make informed choices about the implementation of PBL components. With this information, higher education institutions and educators could fulfil their educational goals while staying within their limited resources. This study seeks to improve on the research methods of prior studies in four major ways: (1) by identifying PBL components based on theory and empirical data; (2) by using latent variables in the structural equation modelling instead of using a variable as a proxy for a construct; (3) by using CFA to validate the latent structure of the measurement, thus providing better evidence of validity; and (4) by using graduate survey data, which are suitable for analysing PBL effects in the framework of the relationship between higher education and the world of work.
Abstract:
Talk given at the meeting of the Lions Club Kassel Brüder Grimm on 20 August 1999. In the period before the turn of the millennium, there were concerns that the usual representation of the year with only two digits would cause major problems, because computers could not distinguish between 1900 and 2000. One example cited was elevators that had supposedly not been serviced for 100 years and would therefore stop. In the end very little happened; whether or not this was due to the lively discussion beforehand is disputed. The talk takes a very thorough look at the technical problems that arise with the representation of time on computers.
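The two-digit year problem described in the talk can be reproduced in a few lines. The sketch below is illustrative only: both function names are hypothetical, and the pivot year 50 used for the windowing fix is an arbitrary assumption, not something stated in the talk.

```python
def years_since_two_digit(service_yy, current_yy):
    """Elapsed years as computed by a program that stores only two year digits."""
    return current_yy - service_yy


def years_since_windowed(service_yy, current_yy, pivot=50):
    """Same calculation with a fixed window: 00-49 -> 20xx, 50-99 -> 19xx.

    The pivot value is an assumption chosen for this example.
    """
    def expand(yy):
        return 2000 + yy if yy < pivot else 1900 + yy

    return expand(current_yy) - expand(service_yy)


# Last service in 1999 (stored as 99), current year 2000 (stored as 00).
print(years_since_two_digit(99, 0))   # nonsensical negative interval
print(years_since_windowed(99, 0))    # one year, as intended
```

Windowing only postpones the ambiguity to the pivot year; storing the full four-digit year removes it.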
Abstract:
Enhanced reality visualization is the process of enhancing an image by adding to it information which is not present in the original image. A wide variety of information can be added to an image ranging from hidden lines or surfaces to textual or iconic data about a particular part of the image. Enhanced reality visualization is particularly well suited to neurosurgery. By rendering brain structures which are not visible, at the correct location in an image of a patient's head, the surgeon is essentially provided with X-ray vision. He can visualize the spatial relationship between brain structures before he performs a craniotomy and during the surgery he can see what's under the next layer before he cuts through. Given a video image of the patient and a three dimensional model of the patient's brain the problem enhanced reality visualization faces is to render the model from the correct viewpoint and overlay it on the original image. The relationship between the coordinate frames of the patient, the patient's internal anatomy scans and the image plane of the camera observing the patient must be established. This problem is closely related to the camera calibration problem. This report presents a new approach to finding this relationship and develops a system for performing enhanced reality visualization in a surgical environment. Immediately prior to surgery a few circular fiducials are placed near the surgical site. An initial registration of video and internal data is performed using a laser scanner. Following this, our method is fully automatic, runs in nearly real-time, is accurate to within a pixel, allows both patient and camera motion, automatically corrects for changes to the internal camera parameters (focal length, focus, aperture, etc.) and requires only a single image.
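The rendering step described above, drawing a point of the 3D model at the correct image location, can be illustrated with an ideal pinhole camera. This is a sketch under assumptions: the function `project_point`, the parameter names and the numeric values are hypothetical, lens distortion is ignored, and the report's actual contribution (estimating the rotation, translation and internal parameters from fiducials) is not shown.

```python
def project_point(point_model, rotation, translation, focal, cx, cy):
    """Project a 3D model point into pixel coordinates with a pinhole camera.

    rotation (3x3, row-major lists) and translation (length 3) map model
    coordinates to camera coordinates; focal is in pixels and (cx, cy) is
    the principal point. All of these are assumed to be known here.
    """
    cam = [
        sum(rotation[i][j] * point_model[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]
    if cam[2] <= 0:
        raise ValueError("point is behind the camera")
    u = focal * cam[0] / cam[2] + cx   # perspective division, then shift to pixels
    v = focal * cam[1] / cam[2] + cy
    return u, v


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(project_point([1, 2, 10], identity, [0, 0, 0], focal=500, cx=320, cy=240))
```

Any error in the rotation, translation or focal length shifts the overlay relative to the patient, which is why the relationship between the coordinate frames has to be established accurately and updated as the patient or camera moves.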
Abstract:
This paper considers the problem of language change. Linguists must explain not only how languages are learned but also how and why they have evolved along certain trajectories and not others. While the language learning problem has focused on the behavior of individuals and how they acquire a particular grammar from a class of grammars $\mathcal{G}$, here we consider a population of such learners and investigate the emergent, global population characteristics of linguistic communities over several generations. We argue that language change follows logically from specific assumptions about grammatical theories and learning paradigms. In particular, we are able to transform parameterized theories and memoryless acquisition algorithms into grammatical dynamical systems, whose evolution depicts a population's evolving linguistic composition. We investigate the linguistic and computational consequences of this model, showing that the formalization allows one to ask questions about diachronic change that one otherwise could not ask, such as the effect of varying initial conditions on the resulting diachronic trajectories. From a more programmatic perspective, we give an example of how the dynamical system model for language change can serve as a way to distinguish among alternative grammatical theories, introducing a formal diachronic adequacy criterion for linguistic theories.
Abstract:
Real-world learning tasks often involve high-dimensional data sets with complex patterns of missing features. In this paper we review the problem of learning from incomplete data from two statistical perspectives---the likelihood-based and the Bayesian. The goal is two-fold: to place current neural network approaches to missing data within a statistical framework, and to describe a set of algorithms, derived from the likelihood-based framework, that handle clustering, classification, and function approximation from incomplete data in a principled and efficient manner. These algorithms are based on mixture modeling and make two distinct appeals to the Expectation-Maximization (EM) principle (Dempster, Laird, and Rubin 1977)---both for the estimation of mixture components and for coping with the missing data.
Abstract:
Modeling and predicting co-occurrences of events is a fundamental problem of unsupervised learning. In this contribution we develop a statistical framework for analyzing co-occurrence data in a general setting where elementary observations are joint occurrences of pairs of abstract objects from two finite sets. The main challenge for statistical models in this context is to overcome the inherent data sparseness and to estimate the probabilities for pairs which were rarely observed or even unobserved in a given sample set. Moreover, it is often of considerable interest to extract grouping structure or to find a hierarchical data organization. A novel family of mixture models is proposed which explain the observed data by a finite number of shared aspects or clusters. This provides a common framework for statistical inference and structure discovery and also includes several recently proposed models as special cases. Adopting the maximum likelihood principle, EM algorithms are derived to fit the model parameters. We develop improved versions of EM which largely avoid overfitting problems and overcome the inherent locality of EM-based optimization. Among the broad variety of possible applications, e.g., in information retrieval, natural language processing, data mining, and computer vision, we have chosen document retrieval, the statistical analysis of noun/adjective co-occurrence and the unsupervised segmentation of textured images to test and evaluate the proposed algorithms.
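To make the idea of explaining pair counts by shared aspects concrete, here is a minimal sketch of plain maximum-likelihood EM for a mixture of the form P(x, y) = Σ_a P(a) P(x|a) P(y|a). This particular factorization, the function name `fit_aspect_model` and the toy count matrix are assumptions for illustration; the sketch is ordinary EM and does not include the improved variants the abstract refers to.

```python
import math
import random


def fit_aspect_model(counts, n_aspects, n_iter=50, seed=0):
    """Fit P(a), P(x|a), P(y|a) to a count matrix counts[x][y] by plain EM.

    Returns the three parameter sets and the log-likelihood per iteration.
    """
    rng = random.Random(seed)
    nx, ny = len(counts), len(counts[0])

    def normalized(values):
        total = sum(values)
        return [v / total for v in values]

    # Random strictly positive initialization.
    p_a = normalized([rng.random() + 0.1 for _ in range(n_aspects)])
    p_x = [normalized([rng.random() + 0.1 for _ in range(nx)]) for _ in range(n_aspects)]
    p_y = [normalized([rng.random() + 0.1 for _ in range(ny)]) for _ in range(n_aspects)]

    history = []
    for _ in range(n_iter):
        acc_a = [0.0] * n_aspects
        acc_x = [[0.0] * nx for _ in range(n_aspects)]
        acc_y = [[0.0] * ny for _ in range(n_aspects)]
        loglik = 0.0
        for x in range(nx):
            for y in range(ny):
                n = counts[x][y]
                if n == 0:
                    continue
                joint = [p_a[a] * p_x[a][x] * p_y[a][y] for a in range(n_aspects)]
                total = sum(joint)
                loglik += n * math.log(total)
                for a in range(n_aspects):
                    # E-step: expected number of (x, y) observations due to aspect a.
                    r = n * joint[a] / total
                    acc_a[a] += r
                    acc_x[a][x] += r
                    acc_y[a][y] += r
        history.append(loglik)
        # M-step: renormalize the expected counts.
        p_x = [[v / acc_a[a] for v in acc_x[a]] for a in range(n_aspects)]
        p_y = [[v / acc_a[a] for v in acc_y[a]] for a in range(n_aspects)]
        p_a = normalized(acc_a)
    return p_a, p_x, p_y, history


# Toy count matrix with two visible blocks (made-up data).
counts = [[5, 4, 0, 0], [6, 5, 0, 0], [0, 0, 4, 6], [0, 1, 5, 5]]
p_a, p_x, p_y, history = fit_aspect_model(counts, n_aspects=2)
print(history[0], history[-1])
```

Because every pair shares the same few aspects, a pair that was never observed can still receive nonzero probability, which is how such models address data sparseness. Plain EM only finds a local optimum that depends on the initialization.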
Abstract:
The COntext INterchange (COIN) strategy is an approach to solving the problem of interoperability of semantically heterogeneous data sources through context mediation. COIN has used its own notation and syntax for representing ontologies. More recently, the OWL Web Ontology Language is becoming established as the W3C recommended ontology language. We propose the use of the COIN strategy to solve context disparity and ontology interoperability problems in the emerging Semantic Web – both at the ontology level and at the data level. In conjunction with this, we propose a version of the COIN ontology model that uses OWL and the emerging rules interchange language, RuleML.
Abstract:
One of the disadvantages of old age is that there is more past than future: this, however, may be turned into an advantage if the wealth of experience and, hopefully, wisdom gained in the past can be reflected upon and throw some light on possible future trends. To an extent, then, this talk is necessarily personal, certainly nostalgic, but also self critical and inquisitive about our understanding of the discipline of statistics. A number of almost philosophical themes will run through the talk: search for appropriate modelling in relation to the real problem envisaged, emphasis on sensible balances between simplicity and complexity, the relative roles of theory and practice, the nature of communication of inferential ideas to the statistical layman, the inter-related roles of teaching, consultation and research. A list of keywords might be: identification of sample space and its mathematical structure, choices between transform and stay, the role of parametric modelling, the role of a sample space metric, the underused hypothesis lattice, the nature of compositional change, particularly in relation to the modelling of processes. While the main theme will be relevance to compositional data analysis we shall point to substantial implications for general multivariate analysis arising from experience of the development of compositional data analysis…