Early last year, you might recall, Target found itself at the center of a storm of outrage. The retailer's number crunchers had come up with a statistical method for predicting which of its customers were most likely to become pregnant in the near future, giving Target's marketers a head start on pitching them baby products.
The model worked: Target expanded its customer base for pregnancy and infant-care products by about 30%. But the media brouhaha, with everyone from The New York Times to Fox News accusing the company of "spying" on shoppers, took weeks to die down.
If Target's success at setting its sights on potential moms-to-be gives you the creeps, Eric Siegel's new book could ruin your whole day. Siegel is a former Columbia professor whose company, Predictive Impact, builds mathematical models that cull valuable nuggets of data from floods of raw information. Companies use the tools to forecast everything from what we'll shop for, to which movies we'll watch, to how likely we are to be in a car accident or default on our credit cards.
In Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die, Siegel explains how these models work and where the pitfalls are, in clear, colorful terms. Simply put, predictive analytics, or PA, is the science of learning from experience. Starting with data about the past and current behavior of a given group of people -- whether customers, patients, prison inmates up for parole, voters, or employees -- analysts can predict what they'll probably do next.
This kind of high-tech crystal ball is behind "the growing trend to make decisions more 'data driven,'" Siegel writes. "In fact, an organization that doesn't leverage its data in this way is like a person with a photographic memory who never bothers to think."
Predictive Analytics is packed with examples of how Citi, Facebook, Ford, IBM, Google, Netflix, PayPal and many other businesses and government agencies have put PA to work. Pfizer, for instance, has a predictive model to foretell the likelihood that a patient will respond to a given new drug within three weeks. LinkedIn uses PA to pinpoint the fellow members you might want as connections. At the IRS, a mathematical ranking system applied to past tax returns "empowered IRS analysts to find 25 times more tax evasion, without increasing the number of investigations."
And then there's Hewlett-Packard. A couple of years ago, alarmed by annual turnover rates in some divisions as high as 20%, HP decided to try anticipating which of its 330,000 employees worldwide were most likely to quit. Beginning with reams of data on things like salaries, raises, promotions, and job rotations, a team of analysts correlated that information with detailed employment records of people who had already left. Based on the similarities they found, the researchers assigned each current employee a Flight Risk score.
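To make the idea of a similarity-based score concrete, here is a minimal sketch in Python. It is not HP's model, which the article does not describe in any detail. It assumes a simple nearest-centroid approach: each current employee is compared with the "typical" past leaver and the "typical" stayer, and the score rises the closer the employee sits to the leavers. The field names (`raise_pct`, `promotions`, `rotations`) and all the figures are invented for illustration.

```python
from dataclasses import dataclass
from math import dist
from statistics import mean


@dataclass(frozen=True)
class Record:
    """Hypothetical HR features; the article names categories, not fields."""
    raise_pct: float   # most recent raise, in percent (assumed feature)
    promotions: int    # promotions in recent years (assumed feature)
    rotations: int     # job rotations in recent years (assumed feature)

    def vector(self):
        return (self.raise_pct, float(self.promotions), float(self.rotations))


def centroid(records):
    """Average feature vector of a group of records."""
    return tuple(mean(col) for col in zip(*(r.vector() for r in records)))


def flight_risk(employee, leavers, stayers):
    """Score in [0, 1]: closer to the typical leaver means a higher score."""
    d_leave = dist(employee.vector(), centroid(leavers))
    d_stay = dist(employee.vector(), centroid(stayers))
    if d_leave + d_stay == 0:
        return 0.5  # the two groups are indistinguishable for this employee
    return d_stay / (d_leave + d_stay)


# Made-up historical records, for illustration only.
leavers = [Record(1.0, 0, 0), Record(2.0, 0, 1)]
stayers = [Record(5.0, 1, 2), Record(6.0, 2, 1)]

current = {"A": Record(1.5, 0, 0), "B": Record(5.5, 2, 2)}
scores = {name: flight_risk(rec, leavers, stayers) for name, rec in current.items()}

# Rank current employees from highest to lowest estimated flight risk.
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy data, employee A (small raise, no promotions) lands near the past leavers and ranks first. A real system would need far more features, scaling so that no single feature dominates the distance, and validation against people who actually left.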