Playing the Open Source AI Game, Part 1
Written on 01 January 2016
by Ruth Fisher, PhD
Current AI Ecosystem
Categorization of AI Technologies
Organization of Companies in the AI Ecosystem
A copy of the full analysis can be downloaded by clicking on the link at the bottom of this blog entry.
OpenAI, the organization recently cofounded by Elon Musk, has been receiving a lot of press lately. The company was introduced as follows:
OpenAI is a non-profit artificial intelligence research company. Our goal is to advance digital intelligence in the way that is most likely to benefit humanity as a whole, unconstrained by a need to generate financial return.
Since our research is free from financial obligations, we can better focus on a positive human impact. We believe AI should be an extension of individual human wills and, in the spirit of liberty, as broadly and evenly distributed as possible.
Two issues in particular have been generating most of the attention surrounding the founding of the new organization:
- OpenAI will focus its research on discoveries that will have positive benefits for society; and
- OpenAI will be open source, that is, its discoveries will be freely available to all.
Recent advancements in AI have enabled researchers to provide valuable new products and services in the marketplace, and the promise of continuing advancements suggests that even more valuable discoveries are on the horizon. Given that, what motivations lie behind the decision of Elon Musk and his cofounders to make their new organization open source, rather than establishing it as a for-profit company? They have said that their intent is to provide discoveries that benefit humanity. But are the founders really as altruistic as they themselves and the media have made them out to be?
This analysis attempts to map the dynamics underlying the AI ecosystem in order to better understand what motivated the founders of OpenAI to designate the organization as open source, and whether agendas other than pure altruism may be at play.
Let’s start by trying to define what artificial intelligence (AI) is. In “Demystifying Artificial Intelligence,” David Schatsky, Craig Muraskin, & Ragu Gurumurthy indicate that a clear and unanimously accepted definition of AI has been elusive.
The field of AI suffers from both too few and too many definitions. Nils Nilsson, one of the founding researchers in the field, has written that AI “may lack an agreed-upon definition …” A well-respected AI textbook, now in its third edition, offers eight definitions, and declines to prefer one over the other. For us, a useful definition of AI is the theory and development of computer systems able to perform tasks that normally require human intelligence. Examples include tasks such as visual perception, speech recognition, decision making under uncertainty, learning, and translation between languages.
Wikipedia defines artificial intelligence as follows.
Artificial intelligence (AI) is the intelligence exhibited by machines or software. It is also the name of the academic field of study which studies how to create computers and computer software that are capable of intelligent behavior... John McCarthy, who coined the term in 1955, defines it as "the science and engineering of making intelligent machines".
The central problems (or goals) of AI research include reasoning, knowledge, planning, learning, natural language processing (communication), perception and the ability to move and manipulate objects.
Kris Hammond provides a succinct summary of AI in “Artificial Intelligence Today and Tomorrow”:
Generally, we can break intelligence or cognition into three main categories: sensing, reasoning and communicating... In other words, cognition breaks down to taking stuff in, thinking about it and then telling someone what you have concluded.
And Tim Urban, in “The AI Revolution: The Road to Superintelligence,” defines the three major categories of AI:
… [T]he critical categories we need to think about are based on an AI’s caliber. There are three major AI caliber categories:
AI Caliber 1) Artificial Narrow Intelligence (ANI): Sometimes referred to as Weak AI, Artificial Narrow Intelligence is AI that specializes in one area…
AI Caliber 2) Artificial General Intelligence (AGI): Sometimes referred to as Strong AI, or Human-Level AI, Artificial General Intelligence refers to a computer that is as smart as a human across the board—a machine that can perform any intellectual task that a human being can...
AI Caliber 3) Artificial Superintelligence (ASI): Oxford philosopher and leading AI thinker Nick Bostrom defines superintelligence as “an intellect that is much smarter than the best human brains in practically every field, including scientific creativity, general wisdom and social skills.”
A subfield of AI that’s currently being hotly pursued is machine learning. Wikipedia defines machine learning as follows.
Machine learning explores the study and construction of algorithms that can learn from and make predictions on data. Such algorithms operate by building a model from example inputs in order to make data-driven predictions or decisions…
… Example applications include spam filtering, optical character recognition (OCR), search engines and computer vision…
Tom M. Mitchell provided a widely quoted, more formal definition: "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E".
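To make Mitchell's definition concrete, here is a minimal sketch of a toy learner. Everything in it is a hypothetical illustration rather than anything taken from the sources quoted here: the task T is deciding whether a number from 0 to 100 lies at or above a hidden cutoff, the experience E is a list of labeled examples, and the performance measure P is accuracy on a fixed set of test points.

```python
# Toy illustration of Mitchell's definition (all names and numbers are hypothetical).
# Task T: classify integers 0..100 as at/above or below a hidden cutoff.
# Experience E: labeled examples the learner is shown.
# Performance P: fraction of test points classified correctly.

HIDDEN_THRESHOLD = 62  # assumption: the "truth" the learner never sees directly


def true_label(x):
    """Ground-truth label for a point."""
    return x >= HIDDEN_THRESHOLD


def learn_threshold(examples):
    """Estimate the cutoff as the midpoint between the largest example
    labeled False and the smallest example labeled True."""
    lo, hi = 0, 100
    for x in examples:
        if true_label(x):
            hi = min(hi, x)
        else:
            lo = max(lo, x)
    return (lo + hi) / 2


def accuracy(estimate):
    """Performance measure P: accuracy on the test points 0..100."""
    test_points = range(101)
    correct = sum((x >= estimate) == true_label(x) for x in test_points)
    return correct / len(test_points)


# Experience E, consumed in order.
EXPERIENCE = [90, 10, 70, 40, 65, 55, 60, 63]


def performance_after(n):
    """Accuracy of the learner after seeing the first n examples."""
    return accuracy(learn_threshold(EXPERIENCE[:n]))


if __name__ == "__main__":
    for n in (0, 4, 8):
        print(n, "examples -> accuracy", round(performance_after(n), 3))
```

With this particular list of examples, accuracy rises as more of the list is consumed, which is what "learning" means in Mitchell's sense. With a midpoint estimate like this one, accuracy is not guaranteed to improve after every single example; the improvement shows up over the course of the experience.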
Deep learning is a subfield of machine learning. From Wikipedia:
Deep learning (deep machine learning, or deep structured learning, or hierarchical learning, or sometimes DL) is a branch of machine learning based on a set of algorithms that attempt to model high-level abstractions in data by using multiple processing layers with complex structures, or otherwise composed of multiple non-linear transformations.
Machine learning forms the basis of (much of) the research currently being conducted by OpenAI, Google, Facebook, and others.
“AI is not a new idea. Indeed, the term itself dates from the 1950s.” So then why is there so much hype now surrounding the field?
Currently, one of the hottest areas of AI is machine learning. Advancements in machine learning are driven by improvements in any of three essential components (see Figure 1): (i) algorithms (software) created by researchers, (ii) data (big data) used to “train” the algorithms, and (iii) computer processing power (hardware/platforms). And it just so happens that recently, big advances in all three of these areas have taken the technology to whole new levels.
The relationships between the components in Figure 1 will be discussed in more detail later in the analysis.
David Schatsky, Craig Muraskin, & Ragu Gurumurthy provide a brief history of AI, together with details about recent advancements in the area.
AI is not a new idea. Indeed, the term itself dates from the 1950s. The history of the field is marked by “periods of hype and high expectations alternating with periods of setback and disappointment,” as a recent apt summation puts it. After articulating the bold goal of simulating human intelligence in the 1950s, researchers developed a range of demonstration programs through the 1960s and into the ’70s that showed computers able to accomplish a number of tasks once thought to be solely the domain of human endeavor... But simplistic algorithms, poor methods for handling uncertainty (a surprisingly ubiquitous fact of life), and limitations on computing power stymied attempts to tackle harder or more diverse problems. Amid disappointment with a lack of continued progress, AI fell out of fashion by the mid-1970s.
In the early 1980s, Japan launched a program to develop an advanced computer architecture that could advance the field of AI. Western anxiety about losing ground to Japan contributed to decisions to invest anew in AI. The 1980s saw the launch of commercial vendors of AI technology products… By the end of the 1980s, perhaps half of the Fortune 500 were developing or maintaining “expert systems,” an AI technology that models human expertise with a knowledge base of facts and rules. High hopes for the potential of expert systems were eventually tempered as their limitations, including a glaring lack of common sense, the difficulty of capturing experts’ tacit knowledge, and the cost and complexity of building and maintaining large systems, became widely recognized. AI ran out of steam again.
In the 1990s, technical work on AI continued with a lower profile. Techniques such as neural networks and genetic algorithms received fresh attention, in part because they avoided some of the limitations of expert systems and partly because new algorithms made them more effective...
By the late 2000s, a number of factors helped renew progress in AI, particularly in a few key technologies.
Moore’s Law… Advanced system designs that might have worked in principle were in practice off limits just a few years ago because they required computer power that was cost-prohibitive or just didn’t exist. Today, the power necessary to implement these designs is readily available…
Big data. Thanks in part to the Internet, social media, mobile devices, and low-cost sensors, the volume of data in the world is increasing rapidly... Big data has been a boon to the development of AI. The reason is that some AI techniques use statistical models for reasoning probabilistically about data such as images, text, or speech. These models can be improved, or “trained,” by exposing them to large sets of data, which are now more readily available than ever.
The Internet and the cloud. Closely related to the big data phenomenon, the Internet and cloud computing can be credited with advances in AI for two reasons. First, they make available vast amounts of data and information to any Internet-connected computing device. This has helped propel work on AI approaches that require large data sets. Second, they have provided a way for humans to collaborate—sometimes explicitly and at other times implicitly—in helping to train AI systems…
New algorithms… In recent years, new algorithms have been developed that dramatically improve the performance of machine learning
Since its inception, AI technology has always been enveloped by controversy: Will man always be able to keep technology under his control, or will machines eventually become master over man? At the heart of the controversy is a notion known as The Singularity. Wikipedia describes the notion as follows.
The technological singularity is a hypothetical event related to the advent of genuine artificial general intelligence (also known as "strong AI"). Such a computer, computer network, or robot would theoretically be capable of recursive self-improvement (redesigning itself), or of designing and building computers or robots better than itself on its own. Repetitions of this cycle would likely result in a runaway effect – an intelligence explosion – where smart machines design successive generations of increasingly powerful machines, creating intelligence far exceeding human intellectual capacity and control. Because the capabilities of such a superintelligence may be impossible for a human to comprehend, the technological singularity is the point beyond which events may become unpredictable or even unfathomable to human intelligence.
As to whether or not man will always be able to keep machines under his control, Paul Ford, in “Our Fear of Artificial Intelligence”, notes that this fear dates back to the inception of AI:
The question “Can a machine think?” has shadowed computer science from its beginnings. Alan Turing proposed in 1950 that a machine could be taught like a child; John McCarthy, inventor of the programming language LISP, coined the term “artificial intelligence” in 1955. As AI researchers in the 1960s and 1970s began to use computers to recognize images, translate between languages, and understand instructions in normal language and not just code, the idea that computers would eventually develop the ability to speak and think—and thus to do evil—bubbled into mainstream culture. Even beyond the oft-referenced HAL from 2001: A Space Odyssey, the 1970 movie Colossus: The Forbin Project featured a large blinking mainframe computer that brings the world to the brink of nuclear destruction; a similar theme was explored 13 years later in WarGames. The androids of 1973’s Westworld went crazy and started killing.
Paul Ford goes on to contrast experts’ views on the subject.
Nick Bostrom, a philosopher who directs the Future of Humanity Institute at the University of Oxford … does believe that superintelligence could emerge, and while it could be great, he thinks it could also decide it doesn’t need humans around. Or do any number of other things that destroy the world…
Critics such as the robotics pioneer Rodney Brooks say that people who fear a runaway AI misunderstand what computers are doing when we say they’re thinking or getting smart. From this perspective, the putative superintelligence Bostrom describes is far in the future and perhaps impossible.
Whereas Turing had posited a humanlike intelligence, Vinge, Moravec, and Kurzweil were thinking bigger: when a computer became capable of independently devising ways to achieve goals, it would very likely be capable of introspection—and thus able to modify its software and make itself more intelligent. In short order, such a computer would be able to design its own hardware.
As Kurzweil described it, this would begin a beautiful new era. Such machines would have the insight and patience (measured in picoseconds) to solve the outstanding problems of nanotechnology and spaceflight; they would improve the human condition and let us upload our consciousness into an immortal digital form. Intelligence would spread throughout the cosmos.
You can also find the exact opposite of such sunny optimism. Stephen Hawking has warned that because people would be unable to compete with an advanced AI, it “could spell the end of the human race.” Upon reading Superintelligence, the entrepreneur Elon Musk tweeted: “Hope we’re not just the biological boot loader for digital superintelligence. Unfortunately, that is increasingly probable.” Musk then followed with a $10 million grant to the Future of Life Institute… this is an organization that says it is “working to mitigate existential risks facing humanity,” the ones that could arise “from the development of human-level artificial intelligence.”
More on the issue from John Markoff in “The Coming Superbrain”:
The notion that a self-aware computing system would emerge spontaneously from the interconnections of billions of computers and computer networks goes back in science fiction at least as far as Arthur C. Clarke’s “Dial F for Frankenstein.” A prescient short story that appeared in 1961, it foretold an ever-more-interconnected telephone network that spontaneously acts like a newborn baby and leads to global chaos as it takes over financial, transportation and military systems.
… [T]here is a hot debate here over whether such machines might be the “machines of loving grace,” of the Richard Brautigan poem, or something far darker, of the “Terminator” ilk.
Concerned about the same potential outcome, the A.I. researcher Eliezer S. Yudkowsky, an employee of the Singularity Institute, has proposed the idea of “friendly artificial intelligence,” an engineering discipline that would seek to ensure that future machines would remain our servants or equals rather than our masters.
Recent advances in AI have led to an increasing awareness of the potential for further developments in AI to lead to dangerous ends. In response, a group of prominent AI scientists posted an open letter in January 2015, calling for participants in the field to consider their research carefully and to limit it to responsible and socially beneficial areas of advancement:
The progress in AI research makes it timely to focus research not only on making AI more capable, but also on maximizing the societal benefit of AI ... We recommend expanded research aimed at ensuring that increasingly capable AI systems are robust and beneficial: our AI systems must do what we want them to do. The attached research priorities document gives many examples of such research directions that can help maximize the societal benefit of AI...
Currently, over 7,800 people have signed the letter.
Current AI Ecosystem
Before moving on to the discussion, let me first provide a general layout of the current AI ecosystem.
Categories of AI Technologies
As cited above, Kris Hammond divides intelligence into three main categories: (i) sensing (taking stuff in), (ii) reasoning (thinking about stuff), and (iii) communicating (telling someone what you have concluded).
If we consider that the means of “taking stuff in” – accomplished mainly through speech and vision, and also through sensors and gestures – is similar to the means of “telling someone what you have concluded,” then Kris Hammond’s three categories may be reduced to two: (i) communicating information from and to a source and (ii) reasoning about that information.
As of August 2015, Venture Scanner, in “Making Sense of the Artificial Intelligence Ecosystem,” reports that it is “tracking 855 Artificial Intelligence companies across 13 categories,” as presented in Figure 2. Note that the number of companies listed in Figure 2 adds up to 923, which suggests that while the majority of companies are working on technologies concentrated within a single category, some of the companies in the space are, in fact, working on technologies in more than one category.
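The gap between the two totals quoted above can be checked with a little arithmetic. The sketch below assumes that Figure 2 counts a company once for every category it is listed in; that is an assumption about how the figure was built, not something Venture Scanner states. Under that assumption, each multi-category company adds at least one extra listing, which puts an upper bound on how many such companies there can be.

```python
# Back-of-the-envelope check on the Venture Scanner totals quoted above.
# Assumption: a company appears once in Figure 2 for each category it works in.

TRACKED_COMPANIES = 855   # companies Venture Scanner says it tracks
CATEGORY_LISTINGS = 923   # sum of the per-category counts in Figure 2

# Listings beyond one per company.
extra_listings = CATEGORY_LISTINGS - TRACKED_COMPANIES

# Each multi-category company accounts for at least one extra listing,
# so there can be at most this many of them.
max_multi_category = extra_listings

# Therefore at least this many companies sit in exactly one category.
min_single_category = TRACKED_COMPANIES - max_multi_category
min_single_share = min_single_category / TRACKED_COMPANIES

if __name__ == "__main__":
    print("extra listings:", extra_listings)
    print("at most", max_multi_category, "multi-category companies")
    print("at least", min_single_category, "single-category companies",
          f"({min_single_share:.0%} of the total)")
```

On these numbers, at least 787 of the 855 companies (about 92%) work in a single category, which is consistent with the observation that most companies concentrate on one area.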
Venture Scanner describes the categories depicted in Figure 2 as follows.
Deep Learning/Machine Learning: Machine learning is the technology of computer algorithms that operate based on its learning from existing data. Deep learning is a subset of machine learning that focuses on deeply layered neural networks.
Computer Vision/Image Recognition: Computer vision is the method of processing and analyzing images to understand and produce information from them. Image recognition is the process of scanning images to identify objects and faces.
Natural Language Processing: Natural language processing is the method through which computers process human language input and convert into understandable representations to derive meaning from them. Speech recognition is a subset of natural language processing that focuses on processing a sound clip of human speech and deriving meaning from it.
Speech to Speech Translation: Speech to speech translation is the process through which human speech in one language is processed by the computer and translated into another language instantly.
Smart Robots: Smart robot companies build robots that can learn from their experience and act and react autonomously based on the conditions of their environment.
Virtual Personal Assistants: Virtual personal assistants are software agents that use artificial intelligence to perform tasks and services for an individual, such as customer service, etc.
Recommendation Engines and Collaborative Filtering: Recommendation engines are systems that predict the preferences and interests of users for certain items (movies, restaurants) and deliver personalized recommendations to them.
Gesture Control: Gesture control is the process through which humans interact and communicate with computers with their gestures, which are recognized and interpreted by the computers.
Video Automatic Content Recognition: Video automatic content recognition is the process through which the computer compares a sampling of video content with a source content file to identify what the content is through its unique characteristics.
Context Aware Computing: Context aware computing is the process through which computers become aware of their environment and their context of use, such as location, orientation, lighting and adapt their behavior accordingly.
To get a better understanding of the distribution of companies across the space, I grouped the categories and counts reported in Figure 2 into three coarser sections:
i. Taking stuff in, in which I include companies in the areas of visual, speech, gesture control, and context awareness;
ii. Learning, in which I include companies in the areas of machine learning and recommendations; and
iii. Human Assistants, in which I include virtual personal assistants and robots.
The results of my coarser categorization are presented in Figure 3.
Figure 3 suggests that roughly half (48%) of companies in the AI space are working on taking stuff in and communicating, over a third (38%) of companies are working on making sense of information, and about 14% are involved in machines that do both.
Organization of Companies in the AI Ecosystem
There are two important takeaways from the analysis presented in the previous section:
- Companies working in the AI space are involved both in furthering the technology in general, as well as using specific technologies to provide particular products and services.
- Most companies in the AI space are working within a specific category of AI, that is, they are working on artificial narrow intelligence.
However, the AI ecosystem is a bit more complicated than how it is portrayed in Figure 2. Return to Figure 1, reposted here for convenience:
The vast majority of the companies contained in the company counts in Figure 2 are likely companies working on AI engines, that is, software components. There are other researchers/organizations working on AI engines that are not included in Figure 2, including Federal research labs, colleges and universities, and private/public company research labs.
There are also companies in the AI ecosystem that provide data services and others that provide platform services. Companies in these two areas provide access to big data, access to the platforms and tools used to process big data, or both. Scientists and researchers use these products and services to analyze big data and to test or train algorithms. In particular, Bernard Marr, in “Big Data-As-a-Service Is Next Big Thing,” reports:
Big Data as a Service (or BDaaS, and it’s pronounced how you will) … might not be a term you’re familiar with yet, but it suitably describes a fast-growing new market. In the last few years many businesses have sprung up offering cloud-based Big Data services to help other companies and organizations solve their data dilemmas.
At the moment, BDaaS is a somewhat nebulous term often used to describe a wide variety of outsourcing of various Big Data functions to the cloud. This can range from the supply of data, to the supply of analytical tools with which to interrogate the data … to carrying out the actual analysis and providing reports.
Players in the BDaaS market that provide datasets include, for example, Amazon, IBM, Google, and Facebook.
Companies in the BDaaS market providing hardware components (e.g., GPU chips) and systems (platforms) on which the AI engines are run include Hortonworks, Microsoft, Amazon, and Altiscale.
The companies that are most relevant to the analysis at issue (Why did OpenAI make its organization open source?) are companies that are working on artificial general intelligence and perhaps artificial superintelligence. These are the big tech companies that are using machine learning to compete with each other across the different AI categories (software, platforms, and data). As Charles Clover notes in “China’s Baidu Searches for AI Edge”:
Other Chinese companies including Alibaba and Tencent are also making advances in AI, but thanks largely to Mr Ng’s [chief scientist for Baidu] reputation Baidu is now judged by industry experts to be ahead of its domestic peers, ranking up alongside US rivals Facebook, Google and IBM.
“Artificial intelligence is an oligopoly,” says Yang Jing, founder of AI Era, an association for the artificial intelligence industry in China. “It’s a game for the titans.”
I put together a table listing components of some of the larger companies’ AI systems, which is displayed in Figure 4.
Go to Part 2