The big-data era fuels an "alternative data" boom, with the data-analysis industry close behind

Header image: The Economist

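Results of the 2018 gaokao (China's national college entrance exam) are being released region by region. Among the undergraduate majors newly established in 2017, "Data Science and Big Data Technology" was the clear trend: Renmin University, Tongji University, Xiamen University and many others all launched it. Perhaps the "alternative" data discussed in this article will soon not be "alternative" at all.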

# The Economist #

6.23- 6.29 / 2018

Section | Finance and Economics


Alternative data: search with caution

"QUANT" (quantitative) hedge funds, which craft elaborate algorithms to make trading decisions, rely on access to information. That used to mean market data, such as prices and trading volume. But some now seek an edge in novel sources. An industry has sprung up to serve them with, and help them analyse, "alternative" data, such as those gleaned from satellite images or by scraping websites. Many of these data firms have been founded by entrepreneurs, but some quant funds themselves are getting involved. Winton, a large London-based fund, is spinning off Hivemind, a data-analysis unit. A full-time management team was announced on June 18th.

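Quantitative hedge funds depend on access to information, to which they apply elaborate algorithms to reach trading decisions. That information has traditionally been market data such as prices and trading volumes. Now some funds are turning to new sources of information for an advantage. To meet that demand, a new industry has emerged to supply and help analyse "alternative" data, for example data gathered from satellite images or scraped from websites. Many of these data firms were founded by entrepreneurs, but some quant funds are getting involved themselves. Winton, a fund based in London, is spinning off its data-analysis unit, Hivemind, whose full-time management team was announced on June 18th.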

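[Extension 1] Quant hedge funds use quantitative investing: they build financial models, analyse the data numerically, and base investment decisions on the results.

[Extension 2] Put simply, a classic hedge-fund trade buys one underlying asset and sells another, profiting from the relationship between the two. Hedging commonly means taking a position in one market or asset to offset risk in another market or asset.

[Extension 3] "Alternative data" parallels the concept of "alternative investments": both sit outside the traditional mainstream. In time, though, "alternative data" may well become mainstream data.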

craft: v. to make or produce sth skilfully

edge: n. an advantage that makes sb. or sth. more successful than other people or things

glean: v. obtain (information) from various sources, often with difficulty

scrape: v. copy (data) from a website using a computer program

e.g. All search engines scrape content from sites without permission and display it on their own sites.

spin off: to create sth new based on sth else that already exists

e.g. to be spun off into a separate company

For funds making macroeconomic bets by trading in, say, currencies or government bonds, real-time measures of inflation (scraped from e-commerce sites) or trade flows (from shipping data) can be better and more timely than the output of national statistics agencies. Funds trading in individual firms' shares can infer information on sales from satellite photos of their car parks, and on footfall in shops from data bought from mobile-phone and credit-card companies, rather than having to rely on company reports or quarterly earnings statements. Many of these datasets are fine-grained. Quandl, a data provider, sells information on the number of Tesla cars sold each day, broken down by each American state.

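For funds that make macro bets by trading currencies or government bonds, real-time inflation measures (scraped from e-commerce sites) or trade flows (from shipping data) can be more useful and more timely than the figures published by national statistics agencies. With such alternative data, funds that invest in individual companies' shares need not wait for annual reports or quarterly earnings statements: they can infer sales from satellite photos of car parks, and estimate shop footfall from data bought from mobile-phone and credit-card companies. Many of these datasets are highly detailed. Quandl, a data provider, sells figures on daily Tesla sales, broken down by American state.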

footfall: n. the number of people entering a shop or shopping area in a given time

e.g. a drive to improve footfall in individual branches

fine-grained: adj. involving great attention to detail

e.g. fine-grained analysis

But amid the new wave of data vendors are some that are pushing the boundaries of legality. John Funge of Winton's San Francisco office says some are careless about privacy. Anoosh Lachin of Aspect Capital, another London quant fund, was once offered data by a former employee of the American government, who founded a firm to "predict" the statistics released by the agency he had worked for. Jonathan Streeter of Dechert, a law firm, says hedge funds are waking up to the risks of potentially suspect data. The main pitfalls are privacy laws and insider-trading rules.

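But among the new wave of data vendors, some are testing the limits of the law. John Funge of Winton's San Francisco office says some are careless about privacy. Anoosh Lachin of Aspect Capital, another London quant fund, was once offered data by a former American government employee who had founded a firm to "predict" the statistics released by his old agency. Jonathan Streeter of the law firm Dechert says hedge funds are becoming aware of the risks of potentially suspect data. The main hazards are privacy laws and insider-trading rules.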

legality /liːˈɡælɪti/: n. the quality or state of being in accordance with the law

pitfall: n. a hidden or unsuspected danger or difficulty

e.g. the pitfalls of buying goods at public auctions

The biggest risk is reputational; only egregious transgressions are likely to lead to penalties. In America, a conviction for insider trading requires not only proof that the information is material and non-public, but also proof of a "breach of duty"; that it was obtained without the owner's consent, for example. Since many phone and credit-card companies include clauses in their contracts allowing them to sell information, that condition is rarely fulfilled. In Europe, though no breach of duty is needed to prove insider trading, the bar is higher in other ways. But privacy is a much greater concern. A new EU-wide data-protection law is backed by hefty fines.

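The biggest risk is to reputation; only the most serious violations are likely to be punished. In America, a conviction for insider trading requires proof not only that the information is material and non-public but also of a "breach of duty", for example that it was obtained without the owner's consent. Because many phone and credit-card companies write clauses into their contracts permitting them to sell customer information, that condition is rarely met. In Europe, proving insider trading does not require a breach of duty, but the standard is higher in other respects. Privacy, however, is a much bigger concern: the new EU-wide data-protection law carries heavy fines.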

egregious /ɪˈɡriːdʒəs/: adj. outstandingly bad; shocking

e.g. egregious abuses of copyright

transgression: n. an act that goes against a law, rule, or code of conduct

material: adj. significant; important

hefty: adj. large and heavy; (of a sum of money) considerable

Some funds are seeking ways to explore new datasets without breaching privacy. In a pilot project intended eventually to feed into its trading algorithms, Winton has worked with researchers at the University of California, Berkeley, to use "differential-privacy" techniques to analyse datasets that Winton was wary of looking at alone. Differential privacy works by adding noise to data, thus obscuring personal, identifiable information without destroying the dataset's useful features. It is already used by tech firms including Google and Apple, and by America's Census Bureau.

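Some funds are looking for ways to explore new datasets without violating privacy. In a pilot project intended eventually to feed its trading algorithms, Winton worked with researchers at the University of California, Berkeley, using "differential privacy" techniques to analyse datasets it was too cautious to examine alone. Differential privacy works by adding noise to data, obscuring personally identifiable information without destroying the dataset's useful features. The technique is already used by tech firms including Google and Apple, and by America's Census Bureau.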
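To make "adding noise to data" concrete, here is a minimal Python sketch of one textbook approach, the Laplace mechanism applied to a counting query. It is an illustration only: the article does not say which technique Winton or the Berkeley researchers used, and the function names, the toy `visits` data and the `epsilon` value are all assumptions made up for this example.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential draws with mean `scale`
    is Laplace-distributed with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count of the records that satisfy `predicate`.

    Adding or removing one person changes a count by at most 1 (sensitivity 1),
    so Laplace noise with scale 1/epsilon yields epsilon-differential privacy.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical toy data: (shopper_id, visited_store) pairs; 100 of 300 visited.
visits = [(i, i % 3 == 0) for i in range(300)]
rng = random.Random(0)  # seeded only so the sketch is repeatable
noisy = private_count(visits, lambda r: r[1], epsilon=0.5, rng=rng)
print(f"true count = 100, noisy count = {noisy:.1f}")
```

A smaller `epsilon` means more noise and stronger privacy. The aggregate (roughly 100 visitors) stays useful, while any single shopper's presence is masked by the noise.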

The nascent industry is cleaning up its act, too. Emmett Kilduff of Eagle Alpha, an alternative-data provider, points to the Investment Data Standards Organisation, a non-profit body set up earlier this year. It is unsurprising that firms such as Eagle Alpha and Quandl are moving into analysis, rather than merely providing raw data. Amid the proliferation, the need to sort useful from pointless, and legal from dubious, has never been greater.

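The fledgling industry is also cleaning up its act. Emmett Kilduff of Eagle Alpha, an alternative-data provider, points to the Investment Data Standards Organisation, a non-profit body set up earlier this year. It is no surprise that providers such as Eagle Alpha and Quandl are moving into analysis rather than merely supplying raw data. As data proliferate, the need to separate the useful from the pointless, and the legal from the dubious, has never been greater.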

sort: arrange systematically in groups; separate according to type

e.g. Once the data is collected, the computer will sort it by date.

- The End -
