
Neural network AI is simple. So... Stop pretending you are a genius.

This post may come off as a rant, but that is not its intent; the point is to explain why we went from having very few AI experts to having so many in so little time.

About the author: Brandon Wirtz, founder and CEO of Recognant.


On a regular basis people tell me about their impressive achievements using AI. 99% of these things are completely stupid. This post may come off as a rant, but that's not so much its intent as it is to point out why we went from having very few AI experts to having so many in so little time. Also to convey that most of these experts only seem experty because so few people know how to call them on their bullshit.


So you built a neural network from scratch… And it runs on a phone…

Great. So you converted 11 lines of Python that would fit on a t-shirt to Java or C or C++. You have mastered what a cross compiler can do in 3 seconds.

Most people don't know that a neural network is so simple. They think it is super complex. Like fractals, a neural network can do things that seem complex, but that complexity comes from repetition and a random number generator.
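
For reference, here is a minimal sketch in the spirit of the widely circulated 11-line example: a tiny two-layer network trained by backpropagation on toy data. The data and variable names are illustrative, not the author's code.

import numpy as np

X = np.array([[0,0,1],[0,1,1],[1,0,1],[1,1,1]])  # four toy training inputs
y = np.array([[0,1,1,0]]).T                      # target outputs
syn0 = 2*np.random.random((3,4)) - 1             # input-to-hidden weights (note: seed not set)
syn1 = 2*np.random.random((4,1)) - 1             # hidden-to-output weights
for j in range(60000):
    l1 = 1/(1+np.exp(-X.dot(syn0)))              # hidden layer, sigmoid activation
    l2 = 1/(1+np.exp(-l1.dot(syn1)))             # output layer, sigmoid activation
    l2_delta = (y - l2)*(l2*(1-l2))              # output error scaled by sigmoid gradient
    l1_delta = l2_delta.dot(syn1.T)*(l1*(1-l1))  # error pushed back through syn1
    syn1 += l1.T.dot(l2_delta)                   # weight updates
    syn0 += X.T.dot(l1_delta)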


So you built a neural network that is 20 layers deep…

Congrats! You took the above code, and looped the loop again. That must have been so hard, deciding where to put another For and a Colon.

"Deep Learning" and n layers of depth are just a neural network that runs its output through itself. This is called a recursive neural network (RNN), because you loop the loop.

This is similar to learning to drive, and only being able to make right turns. You can get to almost anywhere doing this. It may not be the most efficient, but it is easier than making left turns.
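
To make "looping the loop" concrete, here is a hedged sketch of how the same code generalizes to n layers: keep a list of weight matrices and add one more For loop in each direction. An illustration, not the author's implementation; the layer sizes are arbitrary.

import numpy as np

def sigmoid(z):
    return 1/(1 + np.exp(-z))

X = np.array([[0.,0,1],[0,1,1],[1,0,1],[1,1,1]])
y = np.array([[0.,1,1,0]]).T
sizes = [3, 4, 4, 4, 1]                          # add numbers here to go "deeper"
weights = [2*np.random.random((a, b)) - 1 for a, b in zip(sizes, sizes[1:])]

for _ in range(60000):
    acts = [X]
    for w in weights:                            # forward pass: one For per layer
        acts.append(sigmoid(acts[-1].dot(w)))
    delta = (y - acts[-1])*acts[-1]*(1 - acts[-1])
    for i in reversed(range(len(weights))):      # backward pass: loop the loop
        grad = acts[i].T.dot(delta)              # gradient before the update
        if i:
            delta = delta.dot(weights[i].T)*acts[i]*(1 - acts[i])
        weights[i] += grad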


So you trained a neural network using Nvidia GPUs and moved it to the phone…

In those 11 lines of code above, one thing that is wrong (or not implemented) is that the seed is not set. Without setting the seed, I can't guarantee that I will get the same random numbers in a second pass as in the first, so I could get dramatically different results. Since your phone and your desktop won't give the same random numbers, and different phone chips could all produce different random numbers, a network trained on a GPU-based system has a high probability of not working when moved to a mobile system.
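
The fix being described is a single line; a hedged NumPy example is below. Note that a fixed seed only guarantees the same stream from the same generator on the same platform and library version, which is exactly the cross-device problem.

import numpy as np

np.random.seed(42)   # same initial weights on every run of *this* stack;
                     # a different RNG implementation (another chip, another
                     # library build) can still produce a different stream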

Since training can take millions to billions of times longer than classifying in a locked system, building a neural network for a phone is pretty much impossible. There will always be differences between devices. Plus or minus 5% is not a big deal for voice recognition. It is a big deal for things like cancer detection/diagnosis.


So you trained a neural network to do something no human has been able to do… like detect whether people are gay just from a photo.

No. No you didn't. Neural networks are dumb black-box systems. If you torture them enough you can get a great fit on test data, but you won't get great results from randomly sourced tests. AI is really good at spurious correlations. The marriage rate in Kentucky is not driving the drowning rate.

Nor is the fact that a picture was taken close up proof that the animal in the photo is a cat instead of a lion. The shape of the horizon didn't cause something to be a lion or a cat.
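
One way to see "great fit on test data, no real signal" is to train a small network on pure noise; a sketch using scikit-learn (assumed available; any small MLP behaves the same way):

import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 50))            # 100 samples of pure noise
y = rng.integers(0, 2, size=100)          # random labels: there is nothing to learn
clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=5000).fit(X, y)
print(clf.score(X, y))                    # near 1.0: the network memorized the noise
X2 = rng.normal(size=(100, 50))
y2 = rng.integers(0, 2, size=100)
print(clf.score(X2, y2))                  # near 0.5: the "fit" was spurious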

People want to ascribe magic powers to AI, but for the most part AI can't do anything a human can't. There are some exceptions, but only for transparent AI. Neural networks aren't transparent, and even in the transparent systems a human would be able to replicate the final result.


So you use TensorFlow to…

Remember those 11 lines from above? TensorFlow is just a wrapper for those 11 lines. What it does well is help you visualize what is happening in those 11 lines. In many ways it is like Google Analytics. All of the data to do what Google Analytics does is probably in your server log, but looking at those logs is hard, and looking at Google Analytics is easy. At the same time, while Google Analytics will tell you that your server is slow, it won't tell you why.
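
As a hedged sketch of the "wrapper" point, the same toy network in TensorFlow/Keras is a few declarative lines; the training loop, the weight updates, and the sigmoids are all still there, just hidden inside fit():

import numpy as np
import tensorflow as tf

X = np.array([[0,0,1],[0,1,1],[1,0,1],[1,1,1]], dtype="float32")
y = np.array([[0],[1],[1],[0]], dtype="float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(3,)),
    tf.keras.layers.Dense(4, activation="sigmoid"),   # same hidden layer as syn0
    tf.keras.layers.Dense(1, activation="sigmoid"),   # same output layer as syn1
])
model.compile(optimizer="sgd", loss="mse")
model.fit(X, y, epochs=5000, verbose=0)               # the For loop, wrapped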

Those of us who understand neural networks don't want or need TensorFlow, because we visualize the data without their fancy charts and animations, and because we look at the data and code raw, we can figure out the equivalent of why the server is slow.


So you use neural networks to do NLP/NLU…

Common sense, people. Neural networks are not simulating much more than a slug's level of intelligence. What are the odds you taught a slug to understand English?

Building a neural network with one trait for every word in the English language would require a network that used as much computing power as all of Google. Upping that to one trait for each word sense in the English language would take all of the computing in all of the cloud services on the planet.
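
The scale claim can at least be sanity-checked with back-of-envelope arithmetic; the vocabulary and sense counts below are rough assumptions, not measurements:

vocab = 170_000                  # rough count of English words (assumption)
senses = 5 * vocab               # rough count of distinct word senses (assumption)
print(f"{vocab * vocab:.1e}")    # ~2.9e10 weights for one dense layer with
                                 # one unit ("trait") per word
print(f"{senses * senses:.1e}")  # ~7.2e11 weights if every word sense gets a unit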

AI can be built to do great things. Neural networks have limitations.


So you have a self-defining neural network…

Congrats, you know how to wrap the 11 lines of neural network code in the 9 lines of code for a genetic algorithm. Or the 44 lines for a distributed evolutionary algorithm. Write a press release because your 55 lines of code are going to... Oh, wait...
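
For concreteness, here is a hedged sketch of that wrapping: a bare-bones genetic algorithm that evolves the weight matrices of the toy network instead of backpropagating. Population size, mutation rate, and the fitness function are arbitrary choices for illustration.

import numpy as np

X = np.array([[0.,0,1],[0,1,1],[1,0,1],[1,1,1]])
y = np.array([[0.,1,1,0]]).T

def predict(w, X):
    l1 = 1/(1 + np.exp(-X.dot(w[0])))
    return 1/(1 + np.exp(-l1.dot(w[1])))

def fitness(w):
    return -np.mean((y - predict(w, X))**2)           # higher is better

pop = [[np.random.randn(3, 4), np.random.randn(4, 1)] for _ in range(50)]
for _ in range(200):
    pop.sort(key=fitness, reverse=True)               # selection: best first
    parents = pop[:10]
    children = [[w + 0.1*np.random.randn(*w.shape)    # mutation only, no crossover
                 for w in parents[np.random.randint(10)]]
                for _ in range(40)]
    pop = parents + children                          # elitism plus offspring
print(fitness(pop[0]))                                # best evolved network's score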


So you trained a neural network to…anything.

Congrats, you are a data wrangler. While that sounds impressive, you are a dog trainer. Only your dog has the brains of a slug, and the only thing it has going for it is that you can make lots of them. There is no magic in owning a training set. It might have been hard to track down, but don't fool yourself (or others) into thinking you are anything more than a glorified slug trainer.


So you combined neural networks and blockchain…

Congrats, you know how to make a hype stack. Unfortunately, hash mining and neural networks don't have anything in common, and trying to run all of a data set through all of the nodes of a blockchain farm wouldn't work. Neural networks start to have problems when you "slice" the load more than about 16 ways with data sets of normal size. You can go larger if you have billions of records, or if you are doing back propagation and want to test multiple orders of data presentation, but these techniques don't scale to thousands or millions of nodes.
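
To pin down what "slicing" means here: the standard way to spread training over nodes is data parallelism, where each node computes a gradient on its shard and the results are averaged. Below is a hedged sketch of just the mechanics, with a simple linear model standing in for the network:

import numpy as np

def grad(w, Xs, ys):                         # MSE gradient on one shard
    return 2*Xs.T.dot(Xs.dot(w) - ys)/len(ys)

rng = np.random.default_rng(0)
X, y = rng.normal(size=(1024, 8)), rng.normal(size=1024)
w = np.zeros(8)
shards = 16                                  # each extra shard shrinks the
Xs = np.array_split(X, shards)               # per-node batch; past a point the
ys = np.array_split(y, shards)               # per-shard gradients are mostly noise
for _ in range(100):
    g = np.mean([grad(w, xi, yi) for xi, yi in zip(Xs, ys)], axis=0)
    w -= 0.01*g                              # averaged update, as one node would apply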


I don"t do much with neural networks.

There is neural network code in my toolbox, but that is what it should be: a tool in the selection, not the basis for an entire product. Most of my work is in epistemology and self-defining heuristics. The combination of technologies is called Mind Simulation because, rather than neural networks, which are supposed to be modeled after the hardware of the brain in software (which they aren't), Mind Simulation is about modeling the software of the brain in software. A brain emulator, as it were. Mind Simulation has only been a thing for about 10 years, whereas neural networks have been around for 50+. Mind Simulation also differs in that it is transparent, and takes millions of lines of code, not dozens.

To learn more about AI that isn't neural network based, check out my follow-up article:

https://www.linkedin.com/pulse/8-ai-technologies-aint-neural-networks-brandon-wirtz/

