884 results for Connectivity, Connected Car, Big Data, KPI
Abstract:
SOA (Service-Oriented Architecture), workflow, the Semantic Web, and Grid computing are key enabling information technologies in the development of increasingly sophisticated e-Science infrastructures and application platforms. While the emergence of Cloud computing as a new computing paradigm has provided new directions and opportunities for e-Science infrastructure development, it also presents some challenges. Scientific research is increasingly finding it difficult to handle “big data” using traditional data processing techniques. Such challenges demonstrate the need for a comprehensive analysis of how the above-mentioned informatics techniques can be used to develop appropriate e-Science infrastructures and platforms in the context of Cloud computing. This survey paper describes recent research advances in applying informatics techniques to facilitate scientific research, particularly from the Cloud computing perspective. Our contributions include identifying the associated research challenges and opportunities, presenting lessons learned, and describing our future vision for applying Cloud computing to e-Science. We believe our findings can help indicate the future trend of e-Science and can inform funding and research directions on how to employ computing technologies more appropriately in scientific research. We point out open research issues in the hope of sparking new development and innovation in the e-Science field.
Abstract:
Automatic generation of classification rules has become an increasingly popular technique in commercial applications such as Big Data analytics, rule-based expert systems, and decision-making systems. However, a principal problem that arises with most methods for generating classification rules is the overfitting of training data. When Big Data is dealt with, this may result in the generation of a large number of complex rules, which may not only increase computational cost but also lower the accuracy in predicting further unseen instances. This has led to the necessity of developing pruning methods for the simplification of rules. In addition, once their generation is complete, classification rules are used to make predictions. Where efficiency is concerned, the first rule that fires should be found as quickly as possible when searching through a rule set, so a suitable structure is required to represent the rule set effectively. In this chapter, the authors introduce a unified framework for the construction of rule-based classification systems consisting of three operations on Big Data: rule generation, rule simplification, and rule representation. The authors also review some existing methods and techniques used for each of the three operations and highlight their limitations. They then introduce some novel methods and techniques they have developed recently. These methods and techniques are discussed in comparison with existing ones with respect to efficient processing of Big Data.
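To make the rule representation and first-fire prediction ideas above concrete, here is a minimal sketch in Python; the Rule class, the attribute names, and the linear search over an ordered rule list are illustrative assumptions, not the authors' actual data structures.

```python
# Illustrative sketch (not the authors' implementation): a rule set is stored as an
# ordered list of conjunctive rules, and prediction returns the first rule that fires.

from dataclasses import dataclass

@dataclass
class Rule:
    conditions: dict   # e.g. {"outlook": "sunny", "humidity": "high"}
    label: str         # class predicted when all conditions hold

def fires(rule, instance):
    """A rule fires when every attribute-value condition matches the instance."""
    return all(instance.get(attr) == value for attr, value in rule.conditions.items())

def predict(rule_set, instance, default="unknown"):
    """Return the label of the first firing rule; fall back to a default class."""
    for rule in rule_set:
        if fires(rule, instance):
            return rule.label
    return default

# Example usage with two toy rules.
rules = [
    Rule({"outlook": "sunny", "humidity": "high"}, "no"),
    Rule({"outlook": "overcast"}, "yes"),
]
print(predict(rules, {"outlook": "overcast", "humidity": "normal"}))  # -> "yes"
```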
Abstract:
The induction of classification rules from previously unseen examples is one of the most important data mining tasks in science as well as in commercial applications. In order to reduce the influence of noise in the data, ensemble learners are often applied. However, most ensemble learners are based on decision tree classifiers, which are affected by noise. The Random Prism classifier has recently been proposed as an alternative to the popular Random Forests classifier, which is based on decision trees. Random Prism is based on the Prism family of algorithms, which is more robust to noise. However, like most ensemble classification approaches, Random Prism does not scale well on large training data. This paper presents a thorough discussion of Random Prism and a recently proposed parallel version of it called Parallel Random Prism, which is based on the MapReduce programming paradigm. The paper provides, for the first time, a theoretical analysis of the proposed technique and an in-depth experimental study showing that Parallel Random Prism scales well with the number of training examples, the number of data features, and the number of processors. The expressiveness of the decision rules that our technique produces makes it a natural choice for Big Data applications where informed decision making increases the user’s trust in the system.
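As an illustration of the MapReduce-style ensemble pattern that Parallel Random Prism builds on, here is a minimal, hypothetical sketch in which each map task trains one base classifier on a bootstrap sample and the reduce step combines predictions by majority vote; the placeholder base learner and all function names are assumptions and do not implement the Prism algorithm itself.

```python
# Hypothetical sketch of the MapReduce-style ensemble pattern (bagging + majority
# vote); it does NOT implement the Prism rule induction algorithm itself.

import random
from collections import Counter

def map_train(data, seed, base_learner):
    """Map phase: train one base classifier on a bootstrap sample of the data."""
    rng = random.Random(seed)
    sample = [rng.choice(data) for _ in range(len(data))]
    return base_learner(sample)          # returns a callable classifier

def reduce_vote(classifiers, instance):
    """Reduce phase: combine the base classifiers' predictions by majority vote."""
    votes = Counter(clf(instance) for clf in classifiers)
    return votes.most_common(1)[0][0]

# Example with a trivial placeholder base learner (majority-class classifier).
def majority_class_learner(sample):
    label = Counter(y for _, y in sample).most_common(1)[0][0]
    return lambda instance: label

data = [({"x": 1}, "a"), ({"x": 2}, "a"), ({"x": 3}, "b")]
ensemble = [map_train(data, seed, majority_class_learner) for seed in range(10)]
print(reduce_vote(ensemble, {"x": 2}))   # most likely "a", the majority class
```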
Abstract:
This study assesses Autism-Spectrum Quotient (AQ) scores in a ‘big data’ sample collected through the UK Channel 4 television website, following the broadcasting of a medical education program. We examine correlations between the AQ and age, sex, occupation, and UK geographic region in 450,394 individuals. We predicted that age and geography would not be correlated with AQ, whilst sex and occupation would be. The mean AQ score for the total sample was m = 19.83 (SD = 8.71), slightly higher than in a previous systematic review of 6,900 individuals in non-clinical samples (mean of means = 16.94). This likely reflects the fact that this big-data sample includes individuals with autism, who in the systematic review scored much higher (mean of means = 35.19). As predicted, sex and occupation differences were observed: on average, males (m = 21.55, SD = 8.82) scored higher than females (m = 18.95, SD = 8.52), and individuals working in a STEM career (m = 21.92, SD = 8.92) scored higher than individuals in non-STEM careers (m = 18.92, SD = 8.48). Also as predicted, age and geographic region were not meaningfully correlated with AQ. These results support previous findings relating to sex and STEM careers in the largest set of individuals for which AQ scores have been reported, and they suggest the AQ is a useful self-report measure of autistic traits.
Abstract:
This introduction to the Virtual Special Issue surveys the development of spatial housing economics from its roots in neo-classical theory, through more recent developments in social interactions modelling, touching on the role of institutions, path dependence, and economic history. The survey also points to some of the more promising future directions for the subject that are beginning to appear in the literature. It covers hedonic models, spatial econometrics, neighbourhood models, housing market areas, housing supply, models of segregation, migration, housing tenure, sub-national house price modelling (including the so-called ripple effect), and agent-based models. Possible future directions are set in the context of a selection of recent papers that have appeared in Urban Studies. Nevertheless, there are still important gaps in the literature that merit further attention, arising at least partly from emerging policy problems. These include more research on housing and biodiversity, the relationship between housing and civil unrest, the effects of changing age distributions - notably housing for the elderly - and the impact of different international institutional structures. Methodologically, developments in Big Data provide an exciting framework for future work.
Abstract:
I consider the case for genuinely anonymous web searching. Big data seems to have it in for privacy. The story is well known, particularly since the dawn of the web. Vastly more personal information, monumental and quotidian, is gathered than in the pre-digital days. Once gathered it can be aggregated and analyzed to produce rich portraits, which in turn permit unnerving prediction of our future behavior. The new information can then be shared widely, limiting prospects and threatening autonomy. How should we respond? Following Nissenbaum (2011) and Brunton and Nissenbaum (2011 and 2013), I will argue that the proposed solutions—consent, anonymity as conventionally practiced, corporate best practices, and law—fail to protect us against routine surveillance of our online behavior. Brunton and Nissenbaum rightly maintain that, given the power imbalance between data holders and data subjects, obfuscation of one’s online activities is justified. Obfuscation works by generating “misleading, false, or ambiguous data with the intention of confusing an adversary or simply adding to the time or cost of separating good data from bad,” thus decreasing the value of the data collected (Brunton and Nissenbaum, 2011). The phenomenon is as old as the hills. Natural selection evidently blundered upon the tactic long ago. Take a savory butterfly whose markings mimic those of a toxic cousin. From the point of view of a would-be predator the data conveyed by the pattern is ambiguous. Is the bug lunch or potential last meal? In the light of the steep costs of a mistake, the savvy predator goes hungry. Online obfuscation works similarly, attempting for instance to disguise the surfer’s identity (Tor) or the nature of her queries (Howe and Nissenbaum 2009). Yet online obfuscation comes with significant social costs. First, it implies free riding. If I’ve installed an effective obfuscating program, I’m enjoying the benefits of an apparently free internet without paying the costs of surveillance, which are shifted entirely onto non-obfuscators. Second, it permits sketchy actors, from child pornographers to fraudsters, to operate with near impunity. Third, online merchants could plausibly claim that, when we shop online, surveillance is the price we pay for convenience. If we don’t like it, we should take our business to the local brick-and-mortar and pay with cash. Brunton and Nissenbaum have not fully addressed the last two costs. Nevertheless, I think the strict defender of online anonymity can meet these objections. Regarding the third, the future doesn’t bode well for offline shopping. Consider music and books. Intrepid shoppers can still find most of what they want in a book or record store. Soon, though, this will probably not be the case. And then there are those who, for perfectly good reasons, are sensitive about doing some of their shopping in person, perhaps because of their weight or sexual tastes. I argue that consumers should not have to pay the price of surveillance every time they want to buy that catchy new hit, that New York Times bestseller, or a sex toy.
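As a purely hypothetical illustration of the obfuscation tactic described above (in the spirit of tools such as TrackMeNot), the sketch below interleaves each real query with randomly chosen decoy queries so that an observer cannot cheaply separate signal from noise; the decoy vocabulary, batch size, and function names are invented for this example.

```python
# Hypothetical illustration of query obfuscation: interleave each real query with
# randomly chosen decoy queries so the observed stream is ambiguous to a tracker.

import random

DECOY_VOCABULARY = ["weather forecast", "pasta recipe", "train timetable",
                    "movie reviews", "guitar chords", "local news"]

def obfuscated_stream(real_queries, decoys_per_query=3, rng=random.Random(0)):
    """Yield real queries mixed with decoys, shuffled within each batch."""
    for query in real_queries:
        batch = [query] + rng.sample(DECOY_VOCABULARY, decoys_per_query)
        rng.shuffle(batch)
        yield from batch

# Example usage: one real query hidden among three decoys.
for q in obfuscated_stream(["new york times bestseller list"]):
    print(q)
```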
Abstract:
Orange may be the new black, but as I have seen only five minutes of that show, I can’t really use it here. Besides, based on the five minutes I saw, I would assume it is a series written by males. Not since the Victoria’s Secret catalog have I seen so many women wearing fewer clothes, or engaging in so many unmentionable acts. I’ll stop there because my Victorianism is showing, I’m sure.
Abstract:
Brazilians' growing access to the internet and the increasing volume of content disseminated on the web have drawn attention to big data analytics. Businesses, the press, and governments increasingly rely on network analysis techniques to support decisions. But this practice carries risks of manipulation that receive little consideration. DAPP/FGV, GLOBO's partner in network monitoring, has developed its own filtering mechanisms and identified at least 25% "online junk" in studies conducted over the past two weeks.
Abstract:
In the modern Knowledge Economy, in the Era of Big Data, correctly understanding the use and management of Information and Communication Technology (ICT), grounded in the academic field of Information Systems (IS) studies, becomes increasingly relevant and strategic for organizations that intend to remain in business, be able to meet new (internal and external) demands, and face complex changes in market competition. This research draws on stages-of-growth theory, founded on Richard L. Nolan's studies in the 1970s. The academic literature on stages-of-growth models and the context of the IS field provide the conceptual basis of this study. The research identifies a model whose constructs relate to the growth stages of organizational ICT/IS initiatives, starting from Nolan's second-level benchmark variables, and proposes its operationalization through the creation and development of a scale. Exploratory and descriptive in character, the research makes a theoretical contribution to the stages-of-growth paradigm by adding a new growth process to its conceptual structure. As a result, in addition to a bilingual (Portuguese and English) scale instrument, it provides recommendations and rules for applying a survey-type research instrument in the continuation of this study. As a general implication, it is expected that using the instrument to assess the ICT/IS stage level in organizations can assist two profiles of individuals: academics who study this topic, as well as practitioners seeking answers for their practical actions in the organizations where they work.
Abstract:
Quantifying country risk, and political risk in particular, raises several difficulties for companies, institutions, and investors. Because economic indicators are updated far less frequently than Facebook, understanding, and more precisely measuring, what is happening on the ground in real time can be a challenge for political risk analysts. However, with the growing availability of "big data" from social media tools such as Twitter, now is an opportune moment to examine the kinds of social media metrics that are available and the limitations of applying them to country risk analysis, especially during episodes of political violence. Using the qualitative method of bibliographic research, this study maps the current landscape of data available from Twitter, examines current and potential methods of analysis, and discusses their possible application in the field of political risk analysis. After a thorough review of the field to date, and taking into account the technological advances expected in the short and medium term, this study concludes that, despite obstacles such as the cost of storing information, the limitations of real-time analysis, and the potential for data manipulation, the potential benefits of applying social media metrics to the field of political risk analysis, particularly for structured-qualitative and quantitative models, clearly outweigh the challenges.
Abstract:
Our focus is on information in expectation surveys that can now be built on thousands (or millions) of respondents on an almost continuous-time basis (big data) and in continuous macroeconomic surveys with a limited number of respondents. We show that, under standard microeconomic and econometric techniques, survey forecasts are an affine function of the conditional expectation of the target variable. This is true whether or not the survey respondent knows the data-generating process (DGP) of the target variable or the econometrician knows the respondent's individual loss function. If the econometrician has a mean-squared-error risk function, we show that asymptotically efficient forecasts of the target variable can be built using Hansen's (Econometrica, 1982) generalized method of moments in a panel-data context, when N and T diverge or when T diverges with N fixed. Sequential asymptotic results are obtained using Phillips and Moon's (Econometrica, 1999) framework. Possible extensions are also discussed.
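To spell out the central relationship in notation not taken from the paper, the sketch below writes respondent i's forecast as an affine function of the conditional expectation of the target variable and states one panel moment condition that could identify the affine coefficients by GMM; the symbols f_{i,t}, y_{t+h}, k_i, and beta_i are illustrative assumptions.

```latex
% Illustrative notation (not the paper's own); requires amsmath and amssymb.
% Respondent i's forecast of y_{t+h}, formed at time t, is affine in the
% conditional expectation of the target variable:
\[
  f_{i,t} \;=\; k_i \;+\; \beta_i\,\mathbb{E}\!\left[\,y_{t+h}\mid\mathcal{F}_t\,\right].
\]
% Hence the rescaled forecast error is orthogonal to any instrument z_t known at time t,
\[
  \mathbb{E}\!\left[\Bigl(y_{t+h}-\tfrac{f_{i,t}-k_i}{\beta_i}\Bigr)\,z_t\right] \;=\; 0,
  \qquad z_t \in \mathcal{F}_t,
\]
% which identifies (k_i, \beta_i) by GMM as N and/or T grow large.
```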
Abstract:
It is no news that the president of the United States, Barack Obama, is a great technology enthusiast. The first to hire a big data specialist for the White House, Obama has made great strides in the government's institutional digital communication and, last Monday (the 18th), innovated once again by opening a personal Twitter account (@POTUS, short for "President of the United States"). A popular figure, the American head of state entered the record books as the fastest to reach 1 million followers, in just 5 hours.