
Suppose a friend told you that he was planning on doing a TED Talk, and he asked your advice on how to make his talk one of the most popular TED Talks out there. What would you tell him?

This is exactly the type of question Data Scientists seek to answer. The way Data Scientists approach such a problem is to gather information on past TED Talks and analyze that information to see which factors distinguish the most popular TED Talks from the less popular ones. For our purposes, we’ll define “popular” TED Talks as Talks that generate a lot of views.

So then following the Data Scientists’ route, we obtain a database that contains all TED Talks posted on the TED website from its inception in June 2006 through September 2017. There are 2,550 talks. The distribution of views per talk across all the different talks is presented in Figure 1.

Figure 1: Distribution of TED Talks by views

The distribution of TED Talks by views shows that (i) a small number of talks have received over 5 million views each, (ii) a large number of talks have received several million views each, and (iii) a large number of talks have received fewer than a million views. We can present the same information contained in Figure 1 a little differently, as shown in Figure 2.

Figure 2: Distribution of talks and views

Figure 2 shows that the 4% of talks that each had more than 5 million views collectively accounted for 25% of the total number of views for all TED Talks from June 2006 through September 2017. That is, a small number of talks generated a large portion of the total views for all TED Talks. What we want to know is: what characteristics do those top 4% of talks share that the other talks don’t?
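The concentration figures above (share of talks versus share of views) come from a simple calculation. Here is a minimal sketch of it, assuming we have a plain list of view counts. The numbers in `sample_views` are made up for illustration and are not the TED data; the function name `share_above` is likewise my own.

```python
def share_above(views, threshold):
    """Return (share of talks above threshold, share of total views they hold)."""
    top = [v for v in views if v > threshold]
    return len(top) / len(views), sum(top) / sum(views)


# Hypothetical view counts for ten talks (NOT the real TED dataset).
sample_views = [
    12_000_000, 6_000_000, 2_000_000, 1_500_000, 900_000,
    800_000, 700_000, 600_000, 300_000, 200_000,
]

talk_share, view_share = share_above(sample_views, 5_000_000)
print(f"{talk_share:.0%} of talks account for {view_share:.0%} of views")
```

Run against the real dataset, the same calculation would be expected to give the 4% and 25% figures quoted above.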

The database of information on TED Talks contains information on characteristics of speakers and their talks, including, for example: date posted; duration of talk; title and description of talk; identity and occupation of speakers; number of languages for which transcripts of the talk were provided; number of comments by other viewers; and tags (or themes) for each talk. There is also a rating system TED provides for viewers to rate the talks. Viewers are given a set of 14 descriptors from which they can choose up to 3 to describe a particular talk: Beautiful, Confusing, Courageous, Fascinating, Funny, Informative, Ingenious, Inspiring, Jaw-dropping, Longwinded, Obnoxious, OK, Persuasive, or Unconvincing.

The tags for each talk appear to be inconsistently assigned. For the 2,550 TED Talks, there are 19,098 total tags, of which 417 are unique. The most popular tags are Technology, Science, Global Issues, Culture, TEDx, and Design, which collectively account for 16% of the total tags assigned. The number of tags assigned per talk varies from 1 to 32 in a seemingly unsystematic manner (see Figure 3). This inconsistency in the assignment of tags suggests that the tags variable would not be a good predictor of TED Talk popularity. The analyses were run both with and without information on tags, and, as suspected, the tags didn’t provide any additional information.

Figure 3: Distribution of tags per talk
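The tag statistics quoted above (total tags, unique tags, share held by the most common tags, and tags per talk) can all be produced with a counter. This is a rough sketch only: the `talk_tags` list is a hypothetical stand-in for the real data, and the variable names are mine.

```python
from collections import Counter

# Hypothetical tag lists for four talks (NOT the real TED dataset).
talk_tags = [
    ["technology", "science", "design"],
    ["culture", "global issues"],
    ["technology", "TEDx", "science", "culture"],
    ["technology"],
]

# Count every tag occurrence across all talks.
tag_counts = Counter(tag for tags in talk_tags for tag in tags)

total_tags = sum(tag_counts.values())   # all tag assignments
unique_tags = len(tag_counts)           # distinct tags
top_tags = tag_counts.most_common(2)    # the two most frequent tags
top_two_share = sum(count for _, count in top_tags) / total_tags

# Number of tags on each talk, to see how widely it varies.
tags_per_talk = [len(tags) for tags in talk_tags]

print(total_tags, unique_tags, top_tags[0], top_two_share)
print(min(tags_per_talk), max(tags_per_talk))
```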

Jumping Right In!

So, now, if we jump right to the analysis, what does it tell us? If we regress the number of views a TED Talk receives on the various data elements in the dataset, across all 2,550 TED Talks, we get the results presented in Figure 4. To be conservative, I’m labeling as “statistically significant” those results that are significant at the 1% level (i.e., p-value ≤ 0.01). Variables with statistically significant coefficients have been highlighted in yellow.

Figure 4: Regression results, all talks

The first observation from the analysis is that the adjusted R² is 0.82. That leaves a good amount of the variation in the views a Talk generates, 18%, that isn’t captured by the variables included in the regression.
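For readers who want to see where an adjusted R² comes from, here is a small sketch of the standard textbook adjustment. The observation count matches the 2,550 talks in the dataset, but the raw R² and the number of predictors below are assumptions chosen for illustration; the article does not report them.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Standard adjustment of R^2 for the number of predictors in the model."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# n_obs is the number of talks in the article's dataset.
# r2=0.8214 and n_predictors=20 are made-up inputs, not reported values.
adj = adjusted_r2(r2=0.8214, n_obs=2550, n_predictors=20)
unexplained = 1 - adj  # rough share of variation the model leaves unexplained

print(round(adj, 2), round(unexplained, 2))
```

With thousands of observations and a modest number of predictors, the adjustment is small, so the raw and adjusted values are nearly the same.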

The second observation is that the impact of languages is large and positive. So, talks posted in more languages generate more views. Or talks that generate more views are posted in more languages. This is correlation, not necessarily causation.

The third observation is that the year the talk was presented has by far the largest impact on the number of views a talk has received, where talks given in later years are more popular. We’ll explore this more in a minute, but first, let’s go through the other variables in the regression.

The fourth observation is that talks with more ratings generate fewer views. Before we interpret this unintuitive result, let’s consider the impacts of the individual ratings descriptors. It turns out that the ratings descriptors that generate the most views are Confusing and OK, not particularly favorable descriptors. The way I interpret the information on ratings is that it’s the less popular talks that viewers give ratings to, and those ratings are not favorable. So ratings reflect people voicing dissatisfaction with the talk, and people who enjoy talks simply don’t provide ratings.

So now let’s return to the strong relationship between Year and the number of views a TED Talk receives. Consider the pattern in Views per Talk over time, presented in Figure 5.

Figure 5: Talks and Views per Talk by year

Talks during the first year received a lot of Views, but there were relatively few talks that year, so those large Views per Talk get less weight in the analysis. Views per Talk peaked in 2013, but they were also relatively high for 2014 and 2015. Also, there were enough talks presented during those years to give the large Views per Talk large weight in the analysis. So it looks like the large positive impact of Year on Views reflects the fact that talks in 2013 through 2015, which are later years in the analysis, generated more views. Again, this is correlation, not causation.
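The point about weight is easier to see when the number of talks and the Views per Talk are tabulated side by side for each year. A minimal sketch, assuming each talk is a `(year, views)` pair; the records below are invented and only mimic the shape of the pattern described above.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (year, views) records (NOT the real TED dataset).
talks = [
    (2006, 3_000_000), (2006, 5_000_000),
    (2013, 2_000_000), (2013, 9_000_000), (2013, 1_000_000),
    (2016, 800_000),
]

# Group view counts by the year of the talk.
views_by_year = defaultdict(list)
for year, views in talks:
    views_by_year[year].append(views)

# For each year: (number of talks, average views per talk).
summary = {
    year: (len(counts), mean(counts))
    for year, counts in sorted(views_by_year.items())
}

for year, (n_talks, avg_views) in summary.items():
    print(year, n_talks, avg_views)
```

In this toy data, 2006 and 2013 have the same Views per Talk, but 2013 has more talks behind that average, so it would carry more weight in a regression across all talks.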

Let’s take one deeper dive and compare the distributions over time of Talks with fewer than 5 million views and Talks with more than 5 million views. That is, we’re splitting the blue line in Figure 5 into two sub-components. The distribution of Talks over time for Talks with more than 5 million views and Talks with fewer than 5 million views is presented in Figure 6.

Figure 6: Talks by year, above and below 5 million views

It turns out that of the 99 talks in the dataset with more than 5 million views, 22 of them were presented in 2013. So what the large positive coefficient in the regression on Year is saying is that talks that were presented in later years, particularly 2013, generated more views. Again, this is correlation, not causation. It doesn’t say if you want to generate more views, then present your talk in 2013. Rather, it says that talks that generated more views took place in 2013. Correlation, not causation.

So now recall the distribution of Views per Talk in Figure 1. The distribution is nonlinear for talks with more than 5 million views. So then what happens if we look at the analysis of talk characteristics that affect Views separately for the two subgroups? That is, what happens if we subdivide the talks into those with less than 5 million views and those with more than 5 million views, and then we run the analysis separately for each subgroup? Are there differences in the patterns of characteristics that predict numbers of views for the two different groups of talks?
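Splitting the talks into the two subgroups is a one-line filter on each side of the 5 million threshold. The sketch below uses invented `(year, views)` records and also counts the popular talks by year, which is the kind of tally behind the "22 of 99 in 2013" figure. All names and values here are illustrative assumptions.

```python
from collections import Counter

THRESHOLD = 5_000_000  # the cutoff used in the article

# Hypothetical (year, views) records (NOT the real TED dataset).
talks = [
    (2006, 7_200_000), (2012, 900_000), (2013, 8_100_000),
    (2013, 6_400_000), (2013, 1_200_000), (2015, 400_000),
]

# Two subgroups, each of which would be analyzed separately.
popular = [talk for talk in talks if talk[1] > THRESHOLD]
others = [talk for talk in talks if talk[1] <= THRESHOLD]

# How many of the popular talks fall in each year.
popular_by_year = Counter(year for year, _ in popular)

print(len(popular), len(others), popular_by_year.most_common(1))
```

Each subgroup can then be fed into the same regression as the full dataset.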

Talks with Less Than 5 Million Views

Let’s first take a look at what the analysis says for talks with fewer than 5 million views, which is presented in Figure 7.

Figure 7: Regression results, talks with fewer than 5 million views

The results of the analysis for talks with fewer than 5 million views show the same pattern as those for all talks combined. This suggests that the weird pattern we saw for the Ratings variables in the analysis of all talks, where people tended to submit more ratings for talks they don’t like, pertained to the less popular talks, that is, talks that had fewer than 5 million views.

Talks with More Than 5 Million Views

So what do the results have to say about the talks with more than 5 million views?

Figure 8: Regression results, talks with more than 5 million views

As Figure 8 shows, for the most popular talks, none of the characteristics of the talks are significant predictors of views. In other words, if you ask, “What are the characteristics of the most popular TED Talks?” the answer is, “There is no predictor.”

So What’s Going On?

Here’s my hypothesis.

TED Talks can be viewed on TED’s website, but they can also be viewed on YouTube and other platforms, such as Facebook, iTunes, and Hulu. Which talks are people most inclined to view? Do they go to the TED website and start with the most recently presented TED Talks? I don’t think so. I posit that most TED Talks are viewed through either (i) a link sent to people by friends, (ii) a link others posted on social media, or (iii) talks posted under a label of “Top 10 TED Talks,” “Most watched TED Talks,” or some other such label.

In other words, I posit that the most popular TED Talks are the ones that have been caught up in a success-breeds-success loop, which has been facilitated or fostered by choice architecture, so as to propel those Talks into the group of most popular.

Success-breeds-success phenomena occur when things that are popular become even more popular, because they are given more chances to succeed. For example, once a piece of content has garnered enough clicks, other people will click on it simply because many others have also done so.
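To make the idea concrete, here is a toy simulation of a success-breeds-success loop. It is only an illustration of the mechanism, not an analysis of the TED data: the function names, the starting view of 1 per talk, and the talk and view counts are all assumptions. With `feedback=True`, each new view goes to a talk with probability proportional to the views it already has; with `feedback=False`, each view goes to a talk chosen at random.

```python
import random


def simulate_views(n_talks, n_views, feedback, seed=0):
    """Hand out views one at a time and return the final view count per talk."""
    rng = random.Random(seed)
    views = [1] * n_talks  # assumption: every talk starts with a single view
    for _ in range(n_views):
        if feedback:
            # Popular talks are more likely to receive the next view.
            chosen = rng.choices(range(n_talks), weights=views)[0]
        else:
            # Every talk is equally likely to receive the next view.
            chosen = rng.randrange(n_talks)
        views[chosen] += 1
    return views


def share_of_top(views, k):
    """Share of all views held by the k most-viewed talks."""
    return sum(sorted(views, reverse=True)[:k]) / sum(views)


with_feedback = share_of_top(simulate_views(100, 20_000, feedback=True), k=4)
without_feedback = share_of_top(simulate_views(100, 20_000, feedback=False), k=4)
print(f"top 4 talks, with feedback: {with_feedback:.0%}")
print(f"top 4 talks, without feedback: {without_feedback:.0%}")
```

Even though all talks in this toy model are identical in quality, the feedback version concentrates a much larger share of views in a handful of talks, which is the pattern the hypothesis describes.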

Wikipedia defines choice architecture as the design of different ways choices can be presented to consumers, and the impact of that presentation on consumer decision-making. In other words, choice architecture recognizes that the way you present choices to people can affect which of the options they choose.

Choice architecture feeds success-breeds-success phenomena by labeling certain content as “Top 10,” “Most Viewed,” “Now Trending,” etc. People will tend to skip individual pieces of content posted on the site in favor of what’s most popular. Either they view what’s popular as a proxy for high quality content, or they fear missing out (FOMO) on what so many others have experienced.

So, I posit that the most popular TED Talks are not viewed through a visit to TED’s website. Rather, I propose that the most popular TED Talks are more likely to be viewed because they either serendipitously end up in the path of viewers or they appear under a label of “Most Popular.” A TED Talk becomes among the most popular when it starts to gain momentum in views, gets passed around more on social media, makes it into a Top 20 list and continues to become ever more popular because it’s popular.

The other contributing factor that might make the most popular TED Talks so popular is that they exhibit some intangible quality about the speaker or the talk that appeals to viewers, but that hasn’t been captured in the database of TED Talks data. The 14 ratings descriptors capture some elements of this, such as Ingenious or Inspiring or Funny. But they don’t capture information, for example, about speakers who are dynamic, or wry, or captivating. It’s also possible that intangible characteristics lead the most popular TED Talks to gain the initial momentum they need to get caught up in a success-breeds-success loop, which then propels them into the top Talks.

So What Does This Mean?

The first implication is that the key information we need to answer the questions we seek to answer is often not captured in the data we have. Sure, we might be able to get some small scraps of understanding from the information we have. But relative to the primary understanding we actually seek, the scraps are often irrelevant. However, we won’t understand what we’re missing, unless we have some understanding of the dynamics that drive the situation. In other words, if we jump right into the TED Talk data without first thinking about what might drive popularity of TED Talks, then we’re very likely to completely miss the big picture. We won’t know what we’re missing, unless we take time before jumping into the data to try to understand what really drives the situation.

The second implication is this. In a world flooded with information, in which everyone is vying for our attention, success-breeds-success phenomena and choice architecture are increasingly determining which content ends up becoming popular or successful. That is, a product’s success is increasingly determined not so much by the nature or quality of the product itself as by how well the product is propelled into success through extrinsic factors. Merit won’t necessarily win the day. Is that what we want?

What Makes the Most Popular TED Talks So Popular?

19-03-2018 - Ruth Fisher
