Chapter 2 - Computers and Intelligence

Author: Winfred Phillips


One requirement for the extraordinary future is that computers will be as smart as humans. Actually, the authors who present the extraordinary future clearly think that within the next century computers will far surpass humans in intelligence. In this chapter I describe their reasons for making this claim and consider whether it is plausible. In order to do this I have to consider related issues such as the nature of human intelligence, how the brain works, how computers work, realistic projections of increases in computer processing speed, and different understandings of the concept of thought.

Computer Intelligence in the Extraordinary Future

Our authors think computers will get very smart, even smarter than humans are, during the next century. They approach the question of how smart that is by trying to estimate the computing power of the human brain. It is perhaps surprising that in considering the question of how smart humans are, our authors have very little to say about human intelligence. Rather they discuss how big and powerful the brain is because they tend to think of human intellectual performance in terms of the "computing power" of the brain.

To understand the computing power of the brain, they think, you need to understand a little bit about brain anatomy. Paul and Cox note that the brain has about 10^12 cells, but fewer than 2*10^11 are neurons (other estimates, such as that from Kurzweil, say closer to 10^11). Axons and dendrites from the neurons connect to other neurons via gaps called "synapses." The neurons send chemicals (or electro-chemical pulses) across the synapses. Paul and Cox point out that these pulse signals are both digital and analog. The digital aspect is that they are either on or off; the analog aspect is in the fact that the signals fluctuate in peaks and valleys. Pulses across synapses, then, can come in different strengths. The synapses store memories as chemical and structural changes that equate to changes in the strength of the connections (Paul & Cox, 1996, pp. 136-140).

Paul and Cox estimate that a single neuron connects to 10^4 neurons, though typically other estimates are for about 10^3 connections per average neuron. Paul and Cox estimate a total of about 10^15 synapses for the brain (Paul & Cox, 1996, pp. 136-137). Our other authors estimate a total of about 10^14 synapses for the brain neurons but these seem to be associated with an estimate of only 10^3 connections per neuron.

So estimates of the size of the brain (number of neurons, synapses, etc.) can be different (by a factor of ten, for instance). Estimates of the computing power of the human brain also vary and there are even two approaches to estimating. One approach is based on size and the other approach is based on functions. On the first approach, the size of the brain is estimated (as above), along with the number of connections among neurons, neurons in use at one time, and neuron speed, and from these factors one then estimates the number of calculations the brain can perform (per unit time). The second method is to base the brain power estimate on an estimate of the computer power needed to do the brain's functions. On this latter approach one might consider a particular function of the brain, such as visual pattern recognition, and note how much computing power a computer takes to do this. Well, if a computer takes this much power to do this one function, and the brain can do X such functions, then how much computing power would a computer need to do these X functions? Our answer is the computing power of the brain.

Let's start with the first approach, based on the size of the brain. Modern digital computers have hardware both for processing and for memory/storage (for example, RAM). The brain seems to have its memory/storage mixed in with its processing hardware, but processing and storage might be considered distinct for purposes of estimating. Here is Kurzweil's reasoning. He estimates the human brain has about 10^11 neurons. With each neuron having an average of 10^3 connections between it and other neurons, the total connections are about 10^14. (Note that these last two numbers are 10% of those of Paul and Cox.) If each connection is considered a calculation, this allows 10^14 simultaneous calculations. (It is not made clear why each connection is a calculation though.) Neurons can handle 200 calculations per second, so this gives 2*10^16 total calculations a second. Kurzweil's estimate of the memory capacity is a little more puzzling. He says that the memory capacity of the brain is about 10^14 total synapse strengths, but this is the same number he used for the total of synapses. This would mean each synapse has only one strength, when in reality each synapse has many strengths. What he probably means is that there are 10^14 total synapses, but the total strengths of all these synapses together should be represented by the equivalent of 10^15 bits, which implies about 10 strengths per synapse (with a bit for each strength). So Kurzweil arrives at an estimate of 10^15 bits' worth of total brain memory (Kurzweil, 1999, pp. 103-104).
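The arithmetic behind this size-based estimate can be sketched in a few lines of Python. The figures are the ones reported above from Kurzweil; the variable and function names are illustrative only, and the 10 bits per synapse is the surmise made in the text, not a number Kurzweil states outright.

```python
# Figures as reported in the text (Kurzweil, 1999); names are illustrative.
NEURONS = 10**11                 # estimated neurons in the brain
CONNECTIONS_PER_NEURON = 10**3   # average connections per neuron
CALCS_PER_SECOND = 200           # calculations per connection per second
BITS_PER_SYNAPSE = 10            # assumption: the text's surmise of ~10 strengths


def brain_calcs_per_second(neurons, connections_per_neuron, rate):
    """Size-based estimate: each connection counts as one calculation."""
    total_connections = neurons * connections_per_neuron
    return total_connections * rate


total_synapses = NEURONS * CONNECTIONS_PER_NEURON
speed = brain_calcs_per_second(NEURONS, CONNECTIONS_PER_NEURON, CALCS_PER_SECOND)
memory_bits = total_synapses * BITS_PER_SYNAPSE

print(f"synapses: {total_synapses:.0e}")        # 1e+14
print(f"calculations/sec: {speed:.0e}")         # 2e+16
print(f"memory (bits): {memory_bits:.0e}")      # 1e+15
```

Substituting the Paul and Cox figure of 10^4 connections per neuron raises every result by a factor of ten, which is the kind of spread between estimators discussed in this section.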

Moravec, using the second method mentioned above, estimates brainpower by basing it on the computer power that would be needed to accomplish the brain's functions. He considers how much computing power is required for visual pattern recognition, and then extrapolates this to how much computing power would be required for all that the brain can do. The brain, Moravec thinks, would have to operate at about 10^8 MIPS. Considering the memory required to give acceptable performance, he notes that one megabyte per one MIPS is a good rule of thumb. So the brain should have about 10^8 megabytes of memory. This is roughly the same as Kurzweil's number of 10^9 megabits (10^15 bits). Another way to see this is to note, as already mentioned, that the synapses' memory capacity comes from their ability to be in a number of distinct states or strengths through "molecular adjustment." Moravec guesses that each synapse can be in a byte's worth of states, and this is close to what we surmise must be Kurzweil's conjecture of 10 bits. So 10^14 synapses is equivalent to 10^14 bytes of memory, which is the same as the 10^8 megabyte figure mentioned earlier (Moravec, 1999, pp. 54-56).
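A short sketch of the unit conversions may help in checking that Moravec's two routes to a memory figure agree. It assumes decimal megabytes (10^6 bytes) and 8 bits per byte; the names are illustrative and not Moravec's.

```python
# Moravec's function-based estimate, as reported in the text.
BRAIN_MIPS = 10**8          # estimated processing rate
MB_PER_MIPS = 1             # rule of thumb: one megabyte per MIPS
BYTES_PER_MB = 10**6        # assumption: decimal megabytes
BITS_PER_BYTE = 8

# Route 1: memory from the megabyte-per-MIPS rule of thumb.
memory_bytes = BRAIN_MIPS * MB_PER_MIPS * BYTES_PER_MB
memory_bits = memory_bytes * BITS_PER_BYTE

# Route 2: memory from one byte of state per synapse.
SYNAPSES = 10**14
BYTES_PER_SYNAPSE = 1       # Moravec's guess
synapse_bytes = SYNAPSES * BYTES_PER_SYNAPSE

print(memory_bytes == synapse_bytes)    # True
print(f"{memory_bits:.0e} bits")        # 8e+14 bits
```

At 8 bits per byte the total comes to 8*10^14 bits, so the match with Kurzweil's 10^15 bits is approximate rather than exact.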

We have already seen some variance in the size estimates of the brain, based on the estimator. Brain size can be 10^11 to 2*10^11 neurons; synapses per neuron can be 10^3 to 10^4; total synapses can be 10^14 to 10^15. Kurzweil puts the speed of a neuron at 200 firings a second, Paul and Cox think it 100, and Moravec seems to think it can go up to 1000. Paul and Cox note that at most only 10% of neurons are firing at any one time, which is not mentioned by other authors but which may not be relevant, since most estimates of computing power are based on the number of firings per second, and all neurons could fire repeatedly in one second without all of them firing at once. Paul and Cox note that estimates of total brain speed range from 10^13 to 10^15 calculations a second, but we have already seen that Kurzweil estimates it at 2*10^16 calculations a second (Paul & Cox, 1996, p. 139). Both Kurzweil and Moravec hold to an estimate of memory in the range of 10^14 bytes/10^15 bits, which Paul and Cox agree with. However, Paul and Cox also note that this may underestimate memory, since the same synapse may be involved in many different remembered situations, with different sets of other neurons for each situation, for example (Paul & Cox, 1996, p. 140).

So in terms of neurons and connections the brain is very large, but in terms of speed each neuron is comparatively slow, with neurons able to fire only about 100-200 cycles per second. Nerves themselves can manage to send signals at only about 100 meters a second. What makes the brain powerful is not neuron speed but the extensive parallel processing of billions of neurons involving trillions of connections (Paul & Cox, 1996, p. 139). Paul and Cox take it to be a massively parallel machine, since they define a machine as something "that uses energy to do work or that processes information according to an internally logical set of rules" (Paul & Cox, 1996, p. 140).

Having some rough idea of the computing power of the brain, our authors can tackle the issue of when it is that computers will be able to match and even exceed this computing power. There seem to be two main lines of argument used to attempt to show that this event will take place early in the next century. First, by examining advances since the turn of this century, proponents find that computing power is advancing at an exponential rate. This is expressed in Moore's Law. According to our authors, this exponential growth rate can be safely extrapolated into the future--proponents find no compelling reason to think that it will stop anytime soon. And advanced technologies for building such computers and for building robot bodies will be developed. This line of reasoning seems to be the primary argument for the claim that computers and robots will meet and exceed human intelligence during the next century, but sometimes a second argument for computer advancement appears, one that involves claims about the nature of evolution and the universe. More about this second argument later.

Let's describe the first argument. The most widely known depiction of the growth of computing power is known as Moore's Law, after Gordon Moore, an inventor of the integrated circuit and a former chairman of Intel. It seems as if everybody these days goes around quoting Moore's Law, so it's ironic that there seems to be a little controversy over exactly what the law states! In 1965 Moore noted that the surface area of a transistor (when etched on an integrated circuit) was undergoing a 50% reduction of size every twelve months. This 50% size reduction correlates with a doubling in speed. In 1975 he was reported as revising that time period to eighteen months, but Kurzweil notes that Moore himself later claims he had revised it to twenty-four months! Whatever the exact rate, Kurzweil notes, many engineers currently believe that this growth rate cannot be sustained indefinitely and probably not beyond 2020 or even earlier. By that time the transistor insulators will be only a few atoms thick and further shrinking by conventional means will be impossible (Kurzweil, 1999, pp. 20-21).

If Moore's Law is going to run out of steam in the next dozen to twenty years, how can our authors believe that computers will continue to advance well into the second half of the next century? Kurzweil points out that the exponential growth of computing did not begin with Intel or Moore's Law. Going back to early computing machines such as a 1900 Analytical Engine and a 1908 Hollerith tabulator, he claims that one can observe essentially exponential growth in computing speed per unit cost during the entire century. At the start of the century the rate is one of doubling every three years, while now it is a doubling about every twelve months. The fact that the exponential growth of computing did not begin with Moore's Law suggests to Kurzweil that it will not stop with the end of Moore's Law (Kurzweil, 1999, pp. 20-25).

We have already seen that the exact rate of growth is not clear, though it is now commonly placed at a doubling of speed every twelve months to two years. For predictions about exactly where computing will be at various points in the next century, our authors for the most part just draw a graph showing a curve or line representing the growth and extrapolate that curve or line into the future. Since no one knows exactly how limitations of current chip technology will be overcome to pass the hurdle many envision for 2020 or earlier, none of the authors can prove that the exponential growth rate will continue or say exactly how it can be sustained. They do however suggest currently undeveloped technologies (such as nanotechnology and atomic computers) that they think might be used to meet the expected challenges.
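How much the choice of doubling period matters for a long extrapolation can be seen with a small calculation. The sketch below is not taken from any of the authors; it simply compounds a doubling over a hypothetical 30-year horizon for the three periods mentioned in connection with Moore's Law.

```python
def growth_factor(years, doubling_months):
    """Total multiplication in computing power after `years` of steady doubling."""
    doublings = years * 12 / doubling_months
    return 2 ** doublings


# Illustrative 30-year horizon; the horizon is an assumption, not a source figure.
for months in (12, 18, 24):
    print(f"doubling every {months} months: x{growth_factor(30, months):.2e}")
# doubling every 12 months: x1.07e+09
# doubling every 18 months: x1.05e+06
# doubling every 24 months: x3.28e+04
```

Over three decades the twelve-month and twenty-four-month assumptions differ by a factor of more than 30,000, so a prediction of when computers reach a given level depends heavily on which rate is extrapolated.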

Let's turn to specific predictions. Moravec notes that current supercomputers can manage only a few million MIPS, whereas we have seen that to match the human brain we need 100 million MIPS. In Moravec's sketch of future robot generations to be developed during the next century, fourth generation robots will arrive in 2040 and have a processing power of 100 million MIPS. A major feature of these robots that will distinguish them from earlier robots is that they will be able to reason. They will be able to simultaneously simulate the world and reason about the simulation, and they will understand natural languages (Moravec, 1999, pp. 108-110). So Moravec thinks small computers will match human brainpower by 2040.

Kurzweil claims that computer speed at the turn of the century doubled every three years, then every two years in the fifties and sixties, and now every twelve months (Kurzweil, 1999, pp. 2-3). In 1997, $2000 of neural computer chips would get us 2*10^9 calculations per second, but by the year 2020 this will have doubled 23 times and reach 2*10^16 calculations per second, which is in the range of the brain. Memory prices halve every eighteen months, and the requisite 10^14 total synapse strengths should be available at the right price in 2023 or even sooner. So a $1000 personal computer in about 2020 should match the power of the human brain. (Supercomputers should be there even earlier, by 2010, but these are physically too big for a robot body). So by about 2020 small computers will be able to read and understand written documents; they will then gather knowledge on their own by reading, and so learn both all human-acquired and machine-acquired knowledge. Doubling power per price every 12 months, a personal computer will match a small village by 2030, the population of the US by 2048, and a trillion human brains by 2060 (Kurzweil, 1999, pp. 4-5, 103-104).
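Kurzweil's projection is a straightforward compounding exercise, and the numbers can be checked with a brief sketch. It assumes a steady twelve-month doubling from the 1997 starting point and parity with one human brain in 2020, as the text reports; the function names are illustrative.

```python
START_YEAR = 1997
START_CPS = 2e9       # calculations per second for $2000 of chips in 1997


def cps_in(year, doubling_years=1.0):
    """Projected calculations per second, assuming steady doubling."""
    return START_CPS * 2 ** ((year - START_YEAR) / doubling_years)


def brains_in(year, parity_year=2020):
    """Human-brain equivalents, assuming parity in 2020 and yearly doubling."""
    return 2 ** (year - parity_year)


print(f"2020: {cps_in(2020):.2e} calculations/sec")   # 1.68e+16
for year in (2030, 2048, 2060):
    print(year, f"{brains_in(year):.2e} brains")
# 2030 1.02e+03 brains
# 2048 2.68e+08 brains
# 2060 1.10e+12 brains
```

Twenty-three doublings give about 1.7*10^16, which Kurzweil rounds to 2*10^16. The later milestones line up with roughly a thousand brains (a small village), about 270 million (the US population at the time of writing), and about a trillion. If the doubling period were two years instead, `cps_in(2020, 2.0)` would come to only a few trillion calculations per second, far short of the brain estimate.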

Paul and Cox estimate the human brain processing speed to be from 10 to 1000 teraflops (a teraflop is 10^12 calculations per second). We are already at the 1 teraflop level, and they estimate that it should take 20 to 25 years to build a petaflop (10^15) machine the size of a mainframe, and about 35 to 45 years for this to be in a small computer. Small ten teraflop machines should be available by 2020. Memory requirements (internal RAM type) might be about 10^15 bits. Computer memory capacity improves at about the same pace as processing speeds, so the memories of powerful large computers should reach the human range at about 2015, with two decades later this much available in small computers. Memory space on peripherals will move from magnetic tape or disks to something three-dimensional, like the brain, preferably in a holographic manner (Paul & Cox, 1996, pp. 203-206).
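The doubling period implied by the Paul and Cox timetable can be backed out from their teraflop-to-petaflop estimate. This is a derived figure, not one they state; the sketch assumes steady exponential growth over the whole interval.

```python
import math


def implied_doubling_years(factor, years):
    """Doubling period implied by growing `factor`-fold in `years`."""
    return years / math.log2(factor)


# One teraflop to one petaflop is a 1000-fold increase.
for years in (20, 25):
    period = implied_doubling_years(1000, years)
    print(f"{years} years: doubling every {period:.2f} years")
# 20 years: doubling every 2.01 years
# 25 years: doubling every 2.51 years
```

On this reading Paul and Cox are assuming a doubling roughly every two to two and a half years for large machines, noticeably slower than the twelve-month doubling Kurzweil uses.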

We earlier saw that estimates of brain size, brain power, and brain memory varied, and now we have seen similar variation in estimates of when computers will match the computing power of the brain. Kurzweil may be more optimistic than Moravec and Paul and Cox, but all the authors would agree that by 2040 there will exist robots whose computer brains are the equal of the human brain. After that the computing power of robots will quickly grow to far exceed that of humans.

Note that in these estimates there is no insistence that these robots be massively parallel processors like the human brain. The concern is with sheer computing power. The prediction is only that the computers in question will match the overall power of the human brain in terms of total calculations or instructions per second. Because of this, the assumption seems to be, the two will be equivalent in terms of function.

In the coming century and beyond, how will such technological feats be accomplished? Paul and Cox claim that when smaller chips of silicon reach their limit, "top-down silicon etching" will be succeeded by "bottom-up nanotechnology" (Paul & Cox, 1996, p. 210). Kurzweil thinks that advances in chips may occur through incorporating a third dimension in chip design. Improvements in semiconductor materials (with superconducting circuits that do not generate heat) will allow chips with thousands of layers of circuitry. Combine this with smaller component geometries, and computing power will improve by many millions. Other technologies that may play a part include nanotube, optical, crystalline, DNA, and quantum (Kurzweil, 1999, pp. 33-35).

Beyond the issue of chip advances, the breakthroughs needed to create intelligent robots will come from combining advances in artificial intelligence, artificial life, the aforementioned nanotechnology, and transmutation of elements into other elements. Each of these technologies alone will experience limitations. Artificial intelligence will find it increasingly difficult to carry out the extreme complexity involved in programming high-level intelligence (long since beyond simple human programmers). Artificial life can evolve to high complexity, but so far it is inside computers that are not of high enough complexity. Nanotechnology will be able to build complex hardware, but it will be dumb. Transmutation will produce raw exotic materials, but it cannot put them together. But connections among all these technologies will enable them to work together. Artificial life will be the means by which artificial intelligence can evolve beyond the limits of human programmers. Combining nanotechnology and artificial life will allow growing the complex computers to run at a high intelligence. Heavy elements on stellar scales could be provided by transmutation (Paul & Cox, 1996, pp. 124-125).

Above I mentioned that the concern about advances in future computers was more with overall processing speed than with building into them massive parallelism. But there is acknowledgement among our authors that some parallelism might be needed. To think like a brain, it might be that computers have to work somewhat like a brain. They will have to have complex parallel processors running many algorithms at once. Programming these computers will be incredibly complex, and as mentioned a combination of conventional programming (using human and artificial intelligence) and artificial life may succeed. There is also agreement that such a computer will have to be a self-learning neural network storing holographic memories by strengthening and weakening connections. But these computers may not have to be as complex as the brain. Brains have more than a hundred billion neurons and trillions of interconnections because the neurons are so slow. A faster computer of a million cycles a second could use only millions of neural circuits and match the total speed of the brain (Paul & Cox, 1996, pp. 216-220). Note here the implicit recognition that the real target of robot brain development is the matching of the total computing power of the human brain. The assumption seems to be that the human brain's primary reason for being so parallel is because human wetware is so slow, not because massive parallelism has other needed virtues.

Now for the second argument for computer advancement that we mentioned earlier. The proponent here is Kurzweil, who in this second argument makes sweeping claims about the nature of evolution. He draws upon several laws or principles he believes hold for evolutionary processes and other processes. Moore's Law is not just a set of industry expectations but part of a deeper phenomenon. The "Law of Time and Chaos" holds that in a process such as evolution, the time interval between critical events (that significantly affect the future of the process) increases or decreases with the amount of chaos. Chaos is the quantity of disordered, random events relevant to the process. Order increases as chaos decreases. So a sublaw of the above law is the "Law of Accelerating Returns," which is that as order increases exponentially, the time interval between critical events grows shorter (or as Kurzweil puts it, "time exponentially speeds up") (Kurzweil, 1999, pp. 27-30). In accordance with the "Law of Accelerating Returns," when Moore's Law gives out by 2020 another computational technology will have arrived.

As one can see from the above description, the authors in question think that it is clear that humans will soon be matched and then quickly outclassed by robot intelligence. At this point I would like to turn from exposition and description to evaluation and appraisal. I think our authors are overly optimistic in their estimate of how easily and how soon robot intelligence will come. In fact, I think there may be such severe problems in getting a robot to be as smart as a human that we can put no realistic timetable on when it will happen (if ever). To support this less than rosy appraisal I will suggest the following:

1.     While it is true that computing power has advanced, we cannot say with any confidence that this will continue at any particular rate. See "Advances in Computing Power" below.

2.     Reliance on future technologies as the magic bullet is highly speculative. See "Reliance on Future Technologies" below.

3.     While our authors realize that the brain appears to operate via something like a massive parallelism rather than by sequential processing, they do not fully realize that this may make the need for faster computers less relevant than the proper style of processing. See "Computing Power Alone May Not Be the Answer" below.

4.     It may be a mistake to characterize humans primarily in terms of the "computing power" of the brain rather than in terms of intelligence. It may be very difficult to get a computer to display humanlike intelligence. See "The Many Aspects of Intelligence" below.

5.     The Turing Test is not an adequate test of intelligence or adequate for purposes of evaluating robots for human-computer mind transfer. See "The Turing Test" below.

6.     We may not be able to do anything more than merely guess that a computer can really understand anything. See "Computers and Understanding" below.

Advances in Computing Power

Claims about computing power advancing at exponential type rates are argued in two ways by the authors in question. First, there is the extrapolated line approach that graphs computing density, speed, or some such proxy for computing power over an extended historical period and projects the line or curve into the future. The other approach is represented by the second argument of Kurzweil, who claims to discern fundamental laws of the universe about evolution, chaos, and order that show computing power must advance and in an exponential fashion. This second approach is seen to even encompass Moore's Law, which makes up a major part of the first approach.

Let's quickly tackle the second approach first. Kurzweil claims to discern such laws, but it's hard to take this claim seriously. Kurzweil's a smart guy, but here he comes off as discerning fundamental rules of the universe that almost no one else can see. Perhaps he's the only one who can see them because they are mostly imaginative speculation. As far as I am aware, the laws he claims to discern are not considered established scientific laws, in the manner for example of the laws of Newtonian or quantum physics accepted by the majority of reputable physicists. Kurzweil does not provide decisive or what one would consider even substantial evidence for their truth. They seem to be rather part of Kurzweil's personal metaphysical or even religious views. As such, while granting their intelligibility for the sake of argument, I can't see that they constitute legitimate support for predictions of the extraordinary future any more than other religious views might be thought to do so. For all I know, or anyone else does, these "laws" might be true, and they might be false. But since insufficient evidence to establish their truth has been presented, they don't give any real support to other claims about advancing computer intelligence. Any plausibility for the claim that computer performance will continue to increase at a particular exponential rate must be provided by the first line of argument instead.

I can spend a lot more time on the first approach, which claims basically that Moore's Law or something bigger than Moore's Law can be extrapolated into the future. It's true that computing power (in some sense) has been growing at something like an accelerating or exponential rate for many decades and one can extend this to a larger time period. But I cannot agree with the claim that this rate can be known with any very good degree of precision, or that it is precise at all for extended periods of time. If you look closely at graphs by Moravec and Kurzweil that attempt to display a precise line or curve, you will grant that a precise line or curve is present only if you are willing to accept a variety of datapoints that are only approximately near the line or curve rather than exactly on it. It does appear that computing power or something like it, suitably defined, has been growing at a roughly exponential rate, perhaps since the turn of the century, but the rate seems to fluctuate too much to permit precise definition for long periods of time.

I will not attempt to analyze the truth of the claim that computing power has been proceeding at a particular rate for the last hundred years or longer because I think the notion of computing power applying to time periods prior to the development of digital computers may be suspect or at least hard to measure. Furthermore, while our authors see Moore's Law as only part of a wider phenomenon, Moore's Law is as clearly defined as this rate is likely to get, so a defense of Moore's Law is probably the best chance anyone has of making the case for this whole extrapolation approach. So how well does Moore's Law hold up?

Just about everyone in computing knows about Moore's Law. Moore's Law is frequently cited by many in the computer industry, not just our authors. I have great respect for Gordon Moore, but when we take a look at the much-ballyhooed Moore's Law, it turns out that the marketing hype has gone beyond the reality. What was originally an insightful (and perhaps lucky) prediction on the part of Moore has been blown up into something perhaps far beyond his original intent. It's not even clear what Moore's Law is anymore.

The original prediction of Gordon Moore was made in a short article in a magazine called Electronics. The article was called "Cramming more components onto integrated circuits" (Moore, 1965). At the time Moore was director of research and development laboratories for Fairchild Semiconductor, and he had been asked to predict the next ten years in the semiconductor industry (Schaller, 1997). In this article, Moore was concerned to emphasize that integrated electronics, as opposed to using discrete components, would be increasingly used in all sorts of electronic devices because it offered a number of advantages, including reduced cost, increased reliability, and increased performance. Integrated electronics would be increasingly used in computers to soon enable home computers. With respect to semiconductors used in computers, increased integration meant placing more components on a single chip (on a single integrated circuit) (Moore, 1965, pp. 114-115).

Given that the number of components on a chip was increasing, Moore then tried to predict at what rate this increase was occurring and would continue to occur. I don't want to take away anything from Moore's insight here, but apparently he was not the first or only one to think that computing was advancing at a fast pace. It has been claimed that already by the mid-1960's there existed in the semiconductor industry the general understanding that innovation was proceeding exponentially; it turns out that Moore was just the most articulate spokesman for this position (Schaller, 1996).

Note that Moore decided to discuss the state of the art for semiconductor integration in terms of the number of components that should be "crammed" onto a chip for the minimum average per component cost. Moore's discussion is in terms of "device yields" but I think I can paraphrase it in terms of the performance increase brought by adding a component to the circuit, though Moore clearly does not phrase it explicitly in terms of processing speed performance. When adding components to a chip, initially each component added lowers the average "per component" cost. That is, initially the performance added to the circuit by the additional component is relatively large compared to the additional cost to add that component. In terms of an analogy with what we might have learned in an economics course, during this phase the marginal revenue exceeds the marginal cost. Eventually, though, adding more components actually starts to raise the per component cost because the performance each additional component adds to the chip becomes relatively small compared to the additional cost of adding that component. To return to the economics analogy, the marginal cost starts exceeding the marginal revenue. Now, the crucial point is that for any given time in the development of technology (any given date in time), there will be an optimal number of components that can be placed on a chip to give the minimum average component cost. In terms of our analogy, this "optimal number" would be analogous to that point at which the marginal revenue equals the marginal cost. At the time Moore wrote, in 1965, the minimum average component cost was reached at about 50 components per circuit, but Moore could see that the number of components at which this minimum average cost was reached was rising as technology advanced (Moore, 1965, p. 115).

For example, it would cost less per component to produce a 50 component chip in 1966 than it did per component to produce a 50 component chip in 1965, but more importantly, because production technology had advanced, in 1966 the lowest average cost per component would no longer be found on a 50 component chip but on a 100 component chip. Note also that Moore clearly had production chips in mind, not just special expensive prototypes that would never go into production, because his discussion is in terms of the complexity the chip should have for the minimum average component manufacturing cost (Moore, 1965, p. 115).

To make an estimate of the future rate of increase in the number of components for minimum average component manufacturing cost per chip, Moore would extrapolate from what he saw as the then current trend. To determine that then current trend, Moore chose five datapoints. The first datapoint was the 1959 production of the first planar transistor. The second, third, and fourth datapoints together comprised what was really more like a set of datapoints that represented the first few integrated circuits of the early 1960's, including the 1964 production of IC's with 32 components. The last datapoint was the soon to be released (in 1965) IC with 64 components. Moore plotted the points on logarithmic paper and connected them with a straight line, and then extended the line to 1975. Reading from the line, which shows a doubling every year, he predicted that a chip would attain 65,000 components by 1975 (Schaller, 1997). As Moore put it, "The complexity for minimum component costs has increased at a rate of roughly a factor of two per year" (Moore, 1965, p. 115).
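Moore's 1975 figure follows directly from extending the yearly doubling ten years past the 64-component chip of 1965. The sketch below reproduces that extrapolation; the function name and the alternative two-year rate are illustrative additions, not part of Moore's article.

```python
def components(year, base_year=1965, base_components=64, doubling_years=1.0):
    """Components per chip, extrapolated from the 1965 datapoint."""
    return base_components * 2 ** ((year - base_year) / doubling_years)


print(int(components(1975)))                       # 65536
print(int(components(1975, doubling_years=2.0)))   # 2048
```

Ten doublings of 64 give 65,536, which matches the roughly 65,000 components Moore predicted. With a two-year doubling the same line would have reached only about 2,000 components by 1975, which shows how much the later revisions of the rate change the picture.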

A little background will help us understand what was happening technologically at this time. Miniaturization was a major theme, as it is today. William Shockley and his colleagues at Bell Labs had invented the transistor ("transfer resistor") in 1947. The transistor was based on the discovery that by adding impurities to a solid such as silicon the flow of electricity through it could be controlled. The transistor could be made smaller, more reliable, and less power-hungry than the vacuum tube it would replace. By the late 1950's engineers at Fairchild had developed the transistor in a plane (a planar transistor), which Moore would later say was the beginning of the law of density doubling. Later came the first planar integrated circuit, which enabled the extending of the cost and operating benefits of transistors to mass-produced electronic circuits (Schaller, 1997).

As semiconductors developed, advances in production technologies were as much an influence as technological inventions such as transistors and integrated circuits. For example, the development of a diffusion and oxide-masking process enabled the diffusion of impurities (dopants) directly into the semiconductor surface, eliminating the tedious process of adding conducting and insulating material layers on top of a substrate. Sophisticated photographic techniques then enabled the laying of intricate patterns in the semiconductor so that only desired areas lay open to dopants. Production reliability and accuracy increased and the process moved from individual fabrication to batch production. Another production technology was the planar process itself, which, by replacing three-dimensional transistors with flat surface planar transistors, made them easier to manufacture and easier to shrink. Because the electrical connections were now flattened, they could be made by evaporating metal film onto the semiconductor wafer in appropriate areas instead of having to make them by hand. A photolithographic process was used to etch regions on the chip, which were plated and laid on top of one another on the silicon wafer. Circuits could be integrated with one another on a single substrate because electrical circuits were now internal to the chip (Schaller, 1997). I point out the significance of production technology advances because, when one is talking not of what goes on in the lab but of the kind of production chips needed for the extraordinary future, production advances play as much a part in advancing chip densities as any other technological breakthroughs.

Moore's 1965 prediction for 1975 turned out to be accurate; in a 1975 paper presented at an IEEE meeting, Moore noted a memory chip of the proper density was in production at Intel, where Moore was now President and CEO. (The original article talked of memory density, which has not always increased at the exact same rate as microprocessor density, though the discussion is often loose enough that authors perhaps unfortunately shift back and forth between the two types of chip.) Moore claimed that up until 1975 there were three key reasons for the rate of growth. First, due to the use of optical projection in place of contact printing for the lithography masks on wafers, they could make bigger chips with fewer defects. Second, they had ever-finer rendering of images and line widths. Third, "circuit device cleverness" enabled manufacturers to use more of the total wafer area. But this third factor, "circuit device cleverness," was ending with the charge-coupled device (CCD) for which a new technique of "doping" semiconductors used controlled light beams rather than chemical means. So, going forward, the industry would have to rely on only the first two factors, bigger dice and finer dimensions (Schaller, 1997). In other words, to get more transistors on chips, they would have to resort to making the chips bigger and shrinking the lines used on the chips so that the transistors would be closer together.

So now Moore wished to make a change to his prediction of the rate of future growth going forward. No longer would it be a doubling every 12 months; it would have to be slower. But here is where some controversy enters, as Kurzweil notes. It is clear that Moore redrew his line from 1975 onward to have a gentler slope (Schaller, 1997). But there is some controversy over whether Moore changed the doubling period to 18 months or 24 months. I can find no record of publication of the paper he read at the conference, and later accounts of what Moore claimed do seem to vary. There are reports that Moore himself later claimed he changed it in 1975 to 24 months, but most observers instead report he changed it to 18 months (Rosenberg, 1999). One might guess, as some have, that the 18 month figure is really a conflation of the 12 month and 24 month numbers. But Schaller (1996) claims Moore presented his law mathematically at the 1975 conference as "(Circuits per chip) = 2**((year-1975)/1.5)," which is based on an 18 month period.
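To see how much turns on the choice among 12, 18, and 24 months, the formula Schaller reports can be evaluated with each period in turn. This is a sketch under stated assumptions: the 1.5 in the exponent is treated as an adjustable doubling period, the result is read as growth relative to 1975, and 1999 is chosen as the comparison year purely for illustration.

```python
def growth_factor(year, base_year=1975, doubling_years=1.5):
    """Growth relative to base_year: 2 ** ((year - base_year) / doubling_years)."""
    return 2 ** ((year - base_year) / doubling_years)

# Compare the three candidate doubling periods over the 24 years from 1975 to 1999.
for period in (1.0, 1.5, 2.0):
    print(period, growth_factor(1999, doubling_years=period))
# 1.0 16777216.0
# 1.5 65536.0
# 2.0 4096.0
```

Over 24 years the 18 month version predicts sixteen times the growth of the 24 month version, so the disagreement about what Moore said in 1975 is far from a rounding matter.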

Later still, in 1997, Moore himself referred to the 30 year compounded growth rate as a doubling every 18 months, thus blending the two separate rates into one (Moore, 1997). So Kurzweil is entirely correct to point out that there is controversy over exactly what rate Moore changed his law to in 1975. In any event, shortly after the 1975 conference the prediction was called "Moore's Law" (Schaller, 1997).

So our first lesson is that there is no one rate called "Moore's Law." It was originally one rate, then changed by Moore to another (and we're not even sure what that new rate was). Some even claim that it has changed again. Kurzweil, for instance, claims that Moore's Law has now picked up to a rate of doubling every twelve months, which I do not see supported by anything Moore says or by the majority of industry pundits. Moore himself slowed down the rate in 1975 rather than sped it up. Sure, you could claim the real rate was some kind of average of the two earlier figures, but to do this you would have to fudge the data points. Remember that we are supposed to be talking of a rate that holds constant over its life, not just an average applied to a large time period but not useful for predicting particular intermediate points.

Confusion over which rate to use prefigures the larger question of whether the rate is really all that exact anyway, even for a limited time period, because there is a lot of ambiguity about what the subject is. Are we talking of microprocessor chips used for CPU's or memory? Are we talking of chip density for chips already in production or for those just about to enter production? A few months' difference between the two could change the rate. Are we talking of chips available at any price or chips available at a competitive price? (Moore seemed to mean the latter, since his original discussion is in terms of manufacturing costs.) Are we even still talking of chip density? Note that chip density as used in this context seems to mean transistors per chip and not transistors per unit area of chip. When you double the total transistors by doubling the chip area, the chip density in this sense of "density" has doubled, rather than remained constant. However, some versions of the law claim it really is about a more traditional "units per area" kind of density. (For example, the Web-based "jargon dictionary," in an entry from an unknown author, states that Moore's Law holds that the "logic density" of silicon integrated circuits has followed the curve "(bits per square inch) = 2^((t-1962))" where t is current year (Moore's Law, 1995, italics mine).)

Generally, chip performance in terms of processing speed will improve with density increases gained by smaller dimensions, but chip size (the other way to gain "density") may not improve speed to the same degree. Moore usually talked of chip density, or the number of components on a chip, especially with respect to that needed to attain the minimum average component manufacturing cost, but later versions of the law often prefer vaguer terms such as "complexity," "speed," or "performance." This might have been expected; Moore's Law does seem less relevant if chip speed, performance, etc. do not track increases in chip density. Even Moore was interested in performance of course; in a 1997 update he noted the speed gains from smaller chips and shorter distances, something like a 20,000 fold increase from the 1971 4004 chip to the 1997 Pentium (Moore, 1997).

One guesses that most of the variation in rate calculations and reporting must come from honest mistakes. Of course "fudging" the data to make it fit the rate is possible too. In the 1997 IEEE Spectrum article by Schaller cited above there appeared a chart showing rates for CPU microprocessor and memory chip density increases. However, after scrutinizing the datapoints on the chart, an observant reader pointed out that while the article and chart caption claimed a doubling every 18 months, the actual datapoints supported a different rate (a doubling every 22 months for DRAM and a little over 24 months for Intel microprocessors)! This reader even charged Intel with "revisionist history" on its Web site because the Web site allegedly claimed that in 1965 Moore predicted doubling every two years (which he did not) (Kane, 1997). One wonders: if Intel can't get it right, who can?

Currently (as of 1999) on Intel's official Web site, they provide a description of Moore's Law that is safely vaguer. They imply that Moore's Law is the claim that computing power, or capacity, would rise exponentially over relatively brief periods of time, roughly twice as much over a period of eighteen to twenty-four months. Intel claims the law is still accurate: in 26 years the number of transistors on a chip has increased more than 3200 times, from 2300 on the 4004 in 1971 to 7.5 million on the Pentium II processor of 1998 (Intel Corporation, 1999).
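The figures Intel cites can be turned into an implied doubling period with a short calculation. The following is a minimal sketch that takes the transistor counts and the 26-year span exactly as quoted above and assumes smooth exponential growth between the two endpoints; the function name is an illustrative choice.

```python
import math

def doubling_time_months(start_count, end_count, years):
    """Doubling period implied by growth from start_count to end_count over `years`."""
    doublings = math.log2(end_count / start_count)
    return 12 * years / doublings

# 2300 transistors (4004) to 7.5 million (Pentium II) over the quoted 26 years.
months = doubling_time_months(2300, 7_500_000, 26)
print(round(months, 1))  # about 26.7
```

On these assumptions the implied period is closer to 27 months than to 18, which sits at or slightly beyond the slow end of the eighteen to twenty-four month range.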

If one interprets Moore's Law as the claim merely that computing power will increase exponentially, with no time period specified, then the claim becomes so vague as to be almost meaningless. Certainly Moore did not understand it as vague in this fashion. He would hardly have found it necessary to "correct" the rate in 1975 if his claim were only that computing power will increase exponentially over some time interval or other.

So what is the status of Moore's Law, assuming we can agree on a version of what it is? It is hardly a scientific law or theory like Newton's "force equals mass times acceleration," for instance. For one, it is not a law just about nature but involves contingent events and decisions in the business of making production chips. I have detailed above the particular technological breakthroughs, both production and otherwise, that enabled chips to advance at the rate they did during the sixties and seventies, and these breakthroughs did not happen by accident but by devoting time and financial resources to research projects and deciding to build plants to produce chips. The breakthroughs would not have happened without particular decisions to carry out the research, and the plants would not have been built without particular business decisions to go after profits. There is no guarantee similar decisions will be made in the future or that similar breakthroughs will occur even if the effort is expended. I emphasize that keeping up with Moore's Law depends not just on advances in chip technology and production technology but also on the decision and financial ability of companies in free market systems to afford incorporating advances in this technology into the production of computer chips. If a broad recession or depression were to hit, as in the 1930's, it is not clear that computer companies would continue churning out advanced chips just to keep up with the law.

There have been admissions (even by Moore himself) that the law has become self-fulfilling. Moore pointed out that once the law became publicized and accepted, it became "more or less a self-fulfilling prophecy. The Semiconductor Industry Association puts out a technology roadmap, which continues this generation [turnover] every three years. Everyone in the industry recognizes that if you don't stay on essentially that curve they will fall behind. So it sort of drives itself" (Schaller, 1997). Of course Moore's Law seems to hold, this line of argument claims, because companies put just that much effort into chip technology and production needed to meet what the law predicts they should have next year. It's as much a business target or objective as it is a "law." For instance, a recent article by Meieran, of Intel, allowed that the industry is expending "enormous resources" to meet the predictions of Moore's Law (Meieran, 1998). Imagine the absurdity of claiming that the industry is expending tremendous resources to make sure that on the macroscopic level "force equals mass times acceleration" continues to hold in the future, and you see how far Moore's Law is from being a scientific law.

Of course one of the things that drives the need for greater density is the need for the greater processing speed brought with it, and one of the things that drives this is the increasing size and complexity of software. Nathan Myhrvold, of Microsoft, points out that the Basic language had 4000 lines of code in 1975 and about half a million two decades later. Microsoft Word had 27,000 lines of code in its first version but by 1995 had grown to two million. In a process of reciprocal reinforcement, software makers consume new microprocessor capability as soon as the chip makers make it available, encouraging chip makers to continue increasing speed (Schaller, 1997). George Gilder claims that Bill Gates follows the dictum "waste transistors"; "every time Andy Grove makes a faster chip, Bill uses all of it" (Schaller, 1996).

The above considerations demonstrate to my satisfaction that it is difficult to find enough precision, objectivity, and grounding in science in Moore's Law to use it as a basis for anything more than a rough conjecture about the future. But let's suppose for the sake of argument that there is some interpretation of Moore's Law such that the law has proven precise and accurate. Assuming Moore's Law has been precise and correct so far, what then bodes for the future? Should we just confidently draw the line out through the next century, as our authors would have us do?

When we go beyond the comments of our authors, we find that opinions run the gamut. Pessimists believe the law is running on borrowed time and will soon play out. Some even believe we have failed to keep up with it already. Optimists believe that it will continue a long time into the next century (obviously believers in the extraordinary future are in this camp). Some extreme optimists claim we have advanced beyond it. (Note that this would falsify the law as much as if we had failed to keep up with it.) Given that the law itself appears in slightly different forms, one might expect some disagreement, but the range of predictions is so diverse as to indicate a real difference of opinion about the future of the law.

Let's talk about the pessimists first. Keep in mind that authors such as Kurzweil, Moravec, and Paul & Cox believe that computing power will continue to increase approximately in accordance with Moore's Law well into and past the middle of the 21st century. In this context, any estimate that Moore's Law will end significantly before this time can thus be considered pessimistic. Because of this stringent standard, there are far more pessimists than optimists.

There have been claims that we have fallen behind Moore's Law already. For example, one article claimed Intel was actually falling behind Moore's Law in the introduction of chip improvements with the planned P7 introduction around the millennium. Moore's Law predicted 170 million transistors per chip when only 10 million would be attainable (Rosenberg, 1999).

Going forward, those who think time is running out for Moore's Law cite technological reasons or economic/business reasons. The technological reasons have to do with problems in making chips ever smaller. Economic/business reasons involve the falling price of chips and the increasing cost of chip plants.

Moore himself comes off as a pessimist, because he seems to think his law will be defunct within two decades. In a 1997 speech, Moore told his audience that his law was going to come into conflict with the laws of nature, namely the finite size of atomic particles, by about 2017. As chip production processes get smaller, more transistors can be put onto a chip, offering the addition of new performance features. Since the distance between transistors is reduced, the speed increases. Intel is currently (as of 1997) using a .35 micron process, will be moving to a .25 micron process, and then will go to .18 microns for 1000-MHz machines. (As of the date of this thesis we are almost at the 1000-MHz machine level.) This latter process doubles the size of the processor and takes 40 watts of power, which generates problematic amounts of heat, so voltage levels would have to be reduced from 3.3 volts to half a volt, which is "not fun." Moving functions that are now off of the chip (such as modems, graphics chips, and memory control) onto the chip attracts the interest of the Federal Trade Commission in possible anticompetitive practices (Kanellos, 1999). Then there is the issue of the cost of chip plants. In 1995 Moore pointed out that capital requirements for fabrication plants rise exponentially along with component densities. In 1966 a new fabrication plant cost $14 million, by 1995 it cost $1.5 billion, and by 1998 it would cost about $3 billion (Schaller, 1997). And costs of plants keep increasing; more complex chips require great capital to build plants, up to $4 billion per plant for the .18 micron chips, and it's not clear that rivals such as AMD will be able to keep up (Kanellos, 1999). What is sometimes called "Moore's Second Law" is his claim that the cost of semiconductor plants doubles every three to four years (Geppert, 1998).
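The plant cost figures quoted above can be compared with the "three to four years" doubling claim by the same kind of calculation. This is only a sketch: it assumes smooth exponential growth between the quoted datapoints, makes no adjustment for inflation, and uses an illustrative function name.

```python
import math

def doubling_time_years(start_cost, end_cost, years):
    """Doubling period implied by growth from start_cost to end_cost over `years`."""
    return years / math.log2(end_cost / start_cost)

# $14 million in 1966 to $1.5 billion in 1995.
long_run = doubling_time_years(14e6, 1.5e9, 1995 - 1966)
# $1.5 billion in 1995 to about $3 billion in 1998.
recent = doubling_time_years(1.5e9, 3e9, 1998 - 1995)
print(round(long_run, 1), round(recent, 1))  # 4.3 3.0
```

The long-run figure comes out a little over four years per doubling and the most recent interval at three years, which is broadly in line with the three to four year claim and suggests that plant costs were, if anything, doubling faster in the 1990's.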

But we may not even make it to 2017. In an announcement on June 25, 1999, Yahoo news reported that Bell Labs had announced in Nature magazine that they estimated chips would run into the wall in 2012. The problem would be the insulating material known as the gate oxide, which is the smallest feature on the chip. As the chips shrink, this has to shrink, and since it is already the smallest it can be, it becomes the limiting factor as chips use the .06 micron process, at which point it will be 5 atoms thick. New materials would be needed to go any farther. But the article also noted that 2012 would actually be an extension beyond the 2005 or so estimate of some previous scientists (Lemos, 1999). One of these other scientists was Gordon Bell, who actually predicts Moore's line will flatten out around 2003 (Geppert, 1998).

So there are a large number of pessimists. A 1996 poll of 11 industry executives had allowed Moore's Law an average of only 14 more years--until 2010. Several reasons were cited by these executives. For example, optical lithography may not be able to provide increasing density in a cost-effective fashion. Transistors could become so cheap that there would be no profit in making them smaller instead of just using more. As already mentioned above, it may become too costly to make them faster. Such cost increases are not a problem when chip improvements come even faster, but this is not expected to continue. Given a time period in which costs double, chip improvements are not expected to double (Schaller, 1997). Even if these numbers have not been adjusted for inflation, the point is well taken that continued increases in the speed and density of production chips may prove too costly to implement to allow Moore's Law to continue. Another problem cited is the anticipated difficulty of testing integrated circuits with many millions of gates (Geppert, 1998).

The executives and others who claim that Moore's Law will be maintained until 2010 or so might ordinarily be interpreted as optimists, because they think it will at least continue for a while, but in our context they are pessimists. It is truly rare to find anyone other than our authors who really expects Moore's Law to continue past the middle of the next century. One who is almost as optimistic is Nathan Myhrvold, quoted above, who claimed in 1997 that he thought it would last for another 40 years (Geppert, 1998). But even this is more pessimistic than our authors.

Of course whenever a breakthrough is announced, there may be a few optimistic comments appearing that claim we are already ahead of the pace Moore's Law dictates. For example, in late 1997 Intel announced that a technological breakthrough would soon enable engineers to put twice as much information (moving to 2 bit in the space of 1 bit) on flash memory chips. Apparently at this time Intel then announced something to the effect that Moore's Law was over--not because of technology pooping out but because the rate was going to be beaten (Geppert, 1998)! Another story in 1997 told of how IBM engineers had found a way to substitute copper for aluminum in chip circuitry, which would allow greater miniaturization. Copper is better than aluminum at the "below .20 micron" levels needed for new chips. (IBM would soon put this in play on their chips with Intel following by 2002) (Rupley, 1997). These news stories were even sometimes interpreted as showing that this meant a return to the original 12-month period for Moore's Law (Rosenberg, 1999). Of course, since the rate itself is not precise, the same event may to one person be the signal of a return to an earlier version and to another person the signal that the rate has been beaten.

In response to pessimism about chip costs, and in line with the self-fulfilling feature of the law, one possibility suggested for semiconductor manufacturers is to team up with customers, competitors, suppliers, or even governments to share construction and R&D costs. Advanced DRAMs were developed by a joint effort of IBM, Siemens, and Toshiba. South Korean and Singaporean enterprises appear to have state support (Schaller, 1997). Note the emphasis on the role of business to do its part to make Moore's Law happen, rather than just letting things play out and seeing whether the law holds.

If Moore's Law can't provide the realistic expectation of a particular rate of computing power growth, then most likely the wider phenomenon of which it is a part will not do so either, since Moore's Law, vague as it is, seems stated more precisely and is better defended or justified than other parts of the phenomenon. Given the vagueness and pessimism we have seen above, however, we cannot have confidence in Moore's Law continuing far enough into the future to still hold at the time computer intelligence is alleged to match and surpass that of humans, roughly sometime between 2020 and 2050. It may or may not "give out" or "slow down" in a few years; we just do not know. (It is an interesting question how much it would have to slow down to be considered false. The fact that this notion is confusing is further evidence that it is not a conventional scientific law.) Moore's Law is not a scientific law in physics or chemistry but a self-fulfilling prophecy powered by economic and market forces as well as breakthroughs in technology. For any number of reasons, such as the limits of silicon circuit widths, the too-great expense of semiconductor plants, or the commoditization of computer chips and computers, the exponential rate of growth might change significantly, as Moore thinks it already did in 1975. I conclude that, as far as has been shown by our authors, Moore's Law or anything larger that is like it does not provide confidence that the requisite computing speed and power will be available when our authors predict it will.

It is interesting to consider in what sense the extraordinary future might be possible for the very lucky or the very rich even were Moore's Law to fail to hold. Recall that Moore's Law is about component densities for minimum average component manufacturing cost. Moore's Law has been generally interpreted as a yardstick of the computing speed and power available in the marketplace at any given point in time, which is in the direction of Moore's original understanding. But, strictly speaking, the original formulation of Moore's Law says nothing about the possibility that very dense or very fast laboratory prototype chips might be available that far surpass what could be produced in a manufacturing environment at minimum cost. This possibility might not seem likely, since when talking of reaching the limit of circuit widths at the .06 micron level, for instance, this seems a limit on chips per se and not just on minimum component cost chips. But at some point in time there might be developed an exotic chip technology that was too expensive to mass produce but which was nevertheless possible to make in a research lab as a one-off prototype or the equivalent. So even were Moore's Law to fail to hold far enough into the future for low-priced robots to be available for the average person to engage in mind transfer, that failure would not preclude the possibility that high-priced prototype robots could be made by research scientists, possibly for the very few rich who could afford maximum computing performance at any price. Thus the failure of Moore's Law would not preclude the possibility of human-computer mind transfer becoming available for computer scientists working on such exotic robot brains or the few rich people who might be able to buy such machines.
Of course, this mere theoretical possibility does not imply that it would actually occur; our authors do not provide any more evidence for this possibility than they do for the possibility of mind transfer being available to the general public. And our authors do intend the extraordinary future to encompass the possibility of widespread mind transfer, not just mind transfer for a lucky few.

Reliance on Future Technologies

Of course the authors depicting the extraordinary future believe that new technologies will come to the rescue of Moore's Law. They cannot spell out exactly how this will happen in any detail, however, because technologies such as nanotechnology are still in their infancy. The willingness of the authors to believe in the power of future technology to come to the rescue, even without much evidence how or when it would do so, appears based in their belief that Moore's Law or the wider trend of which it is a part is some inexorable force. So instead of nanotechnology (or the other new technologies cited) being an established field with a proven track record, which can then be used to buttress the shaky claim that Moore's Law will continue indefinitely, it seems the belief in the inexorability of Moore's Law (or the wider trend of which it is the latest manifestation) is used to buttress the claim that nanotechnology will produce what its proponents say it will. We have a lot of shaky arguments trying to support one another. Such faith reminds one more of religious fervor than of scientific impartiality.

Just what are these technologies? We have seen several mentioned. With respect to the route to increased computing power, nanotechnology is mentioned, as well as atomic computing, which might be considered to be part of nanotechnology. Nanotechnology concerns the technology and production of machines on the molecular or nanometer scale, though atomic computing involving quantum physics would seem to deal with what is even smaller still. Other buzzwords thrown around when convenient to be tossed into the equation include artificial intelligence, artificial life, and the transmutation of elements. While our authors realize that each of these technologies individually has limits, somehow the combination of them all will allow them to complement one another in ways that will overcome limitations. Interestingly, the one "technology" here that has shown any concrete results at all, artificial intelligence, faces significant problems that are scarcely mentioned or instead glossed over by our authors. More about this later.

At this point, nanotechnology is more of a research program than an established engineering field. We are nowhere near producing any little self-replicating machines. A recent critical article (Stix, 1996) in Scientific American brings up some of the relevant problems with nanotechnology. Manufacturing at the nano level would require treating individual atoms and molecules as construction elements in a tiny erector set. It's true some preliminary efforts along these lines have succeeded. For example, researchers manipulated 35 xenon atoms to form the letters "IBM." They did this with a scanning tunneling microscope, which dragged the xenon atoms across a nickel surface. But this is a long way from creating a self-replicating nanoassembler. Proponents usually fail to mention that the xenon experiment was done in a high vacuum at a supercooled temperature, but both of these conditions are impractical for everyday nanomanufacturing of the kind envisioned in the extraordinary future. Atoms that can be manipulated in this special environment are much too reactive with ambient atmospheric and other environmental elements to allow such manipulation to occur in normal environments. Other scientists accuse nanoenthusiasts of failing to provide crucial details about basic engineering that would need to be done in nano manufacturing. For example, how close are we to realizing one of the exciting aspects predicted of nanotechnology, the ability of nanomachines ("nanoassemblers") to reproduce by self-replication? At least one prominent chemist charges that this is still sheer science fiction (Stix, 1996). There does seem to just be this blind faith that we will be able to manipulate all kinds of particles at will to form what we want. But chemical compounds follow the rules of chemistry and behave in well known ways, including potentially reacting with everything around them, and it's not as simple as just grabbing parts of molecules and building whatever you want.

The Scientific American article of course generated critical responses from nanoenthusiasts. One can follow the discussion on the Web pages of The Foresight Institute, founded by K. Eric Drexler, a prominent proponent of nanotechnology.

I don't know that nanotechnology (or any of these other technologies) won't work; in fact, I hope it does. But how much evidence is provided by hope? While undoubtedly development of such new technologies will proceed and they may in fact allow the production of smaller machines of various sorts, the hodgepodge of terms thrown around in this context seems more like the result of fervor or desperation than insight. None of our authors nor anyone else really knows what different form computers will take, if any, after silicon materials have played their last card. One reads encouraging reports of experiments involving some of these technologies, but we appear to be nowhere near ready to announce the arrival of any serious alternative kind of computer to the ones we have now. Just because scientists in the lab can get a few atoms to line up to spell "IBM" does not mean nanoassemblers and atomic computers are just around the corner. But 2020 really is sort of just around the corner.

The authors in question rely on nanotechnology not just for new generations of computers, but also for the robot bodies that such computers will inhabit. Again, details are lacking because no one yet knows how to manufacture truly functional humanlike flesh (skin, muscle, nerve, and brain tissues, etc.) out of chemicals. If we were anywhere close to this development severe burn victims would not be forced to die while medical teams tried to cover large portions of their bodies with skin grafts and bandages. I read a news item the other day that reported an enterprising physician was using the knee joints of Barbie-dolls as artificial finger joints for people who had lost fingers and hands. A sympathetic Mattel toy company had at the doctor's request sent him a box full of Barbie legs. Barbie legs? Mattel? Where are all the artificial organs and other replacement parts that we were supposed to have perfected by now? Human hearts and baboon livers seem to play more of a role in transplants than the synthetic models (artificial hearts are used as temporary stopgap measures only). We seem pretty far off from the imminent arrival of a practical macrotechnology for human parts, much less nanotechnology (remember that according to our authors, supersmart computers may arrive as early as 2020). Even the ultimate nanoenthusiast K. Eric Drexler refuses to predict exactly when nanotechnology will be able to fulfill all of the predictions for it. Perhaps some important and encouraging research has been done in these areas, but we are too far away to confidently predict when we will be able to make a humanlike body, much less something better.

Nanotechnology advocates often cite a lecture that Richard Feynman gave in 1959 pointing to the possibilities of something akin to nanotechnology. In that lecture Feynman allowed that he saw nothing in physical law that precluded making computers enormously smaller than they were then. While one wonders if what he had in mind was partially fulfilled in the silicon advances in microchips of the following decades, he does leave open the possibility of manipulation at the molecular and atomic level (Feynman, 1959). But the success of nanotechnology requires a lot more than just the pronouncement of an eminent scientist that it is not physically impossible! Just because Feynman declared that such technology is not physically impossible does not in itself give us concrete reasons to think that it will be available anytime soon or even at all.

I do not want to sound too negative here. The fact that nanotechnology currently resembles a religion or cult as much as it does a scientific endeavor does not mean that it won't produce some amazing breakthroughs. I am certainly not interested in mounting an ad hominem attack against Drexler. But we need to be realistic about nanotechnology prospects before relying on it or any other technology as the "magic silver bullet" that will jump in right when we need it to allow us to make supersmart robots with synthetic bodies.

I won't continue my description of the debate on nanotechnology and its future beyond pointing out that nanotechnology and related technologies are still largely theoretical and speculative. The authors depicting the extraordinary future might be right that nanotechnology will rescue computing from its current silicon limits and provide robot bodies, but right now those claims are as much science speculation as is the depiction of the extraordinary future itself.

Computing Power Alone May Not Be the Answer

While our authors realize that the brain appears to operate via something like a massive parallelism rather than by sequential processing, they do not seem to fully appreciate that this may make the need for faster computers less relevant than the proper style of processing. Their preferred view seems to be that as long as the overall processing power of a robot brain is the same as that of the human brain, the robot will be as smart and capable as the human brain, even if the robot brain operates with less parallelism than does the human brain. But the fact that this assumption may not be correct could have significant implications. This issue I will explore in this section.

By and large the authors' discussions of human brain size and anatomy seem to be in the ballpark. No one appears to have any basis for more precise estimates than a brain size of around 10^11 neurons and thousands of connections per neuron. Perhaps surprisingly, it doesn't seem to me that their estimates have to be all that accurate. If computing power is advancing exponentially as fast as they claim, then even if they underestimate the brain's computing power by a factor of ten, say, that means only a few more years of development time to get a computer to that power. Their estimates vary among them by this much anyway. Of course, if they are wrong about the exponential growth of computing power, with it actually advancing in the future at an incrementally slow rate, then an accurate estimate of brain size would be more important. But if the predicted rate is off by this much, their predictions will be off the mark by decades or centuries anyway.
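
To make the "only a few more years" point concrete, here is a minimal sketch of the arithmetic. The doubling periods are illustrative assumptions of mine, not figures taken from the authors, and the function name is hypothetical.

```python
import math

def extra_years(underestimate_factor: float, doubling_years: float) -> float:
    """Years of further exponential growth needed to make up a shortfall.

    If power doubles every `doubling_years`, a shortfall of
    `underestimate_factor` takes log2(factor) doublings to close.
    """
    return math.log2(underestimate_factor) * doubling_years

# Assumed doubling periods (in years); chosen only for illustration.
for doubling_years in (1.0, 1.5, 2.0):
    years = extra_years(10, doubling_years)
    print(f"doubling every {doubling_years} years: about {years:.1f} extra years")
```

On any of these assumed doubling periods a tenfold underestimate costs well under a decade, which is why the accuracy of the brain estimate matters less than the assumption of exponential growth itself.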

The most significant problem with the viewpoint of our authors in this context, however, may be the use they make of their understanding of brain anatomy and physiology. The authors focus on estimating the number of neurons and connections in the brain in order to estimate the computing power of the brain. Discussions of the computing power of the brain emphasize the total number of brain events or neuron to neuron firings per second. But this focus may represent a failure to fully appreciate the significance of the distinctive way the brain works. The authors are of course aware that the brain appears to operate by a massive parallelism rather than sequential or serial processing, but their talk about translating brain size into calculations per second or total storage in terms of bytes or bits seems to place more emphasis on getting an equivalent in raw computing power than in getting an equivalent in terms of massive parallelism. So my criticism here is that in spite of some of their comments, for the most part our authors seem to have their emphasis in the wrong place. In order to clarify what I mean, we need to discuss the difference between the traditional symbol processing model of computing and more recently developed connectionist models.

Digital computers have become popular during the last decade or so as very many people use personal computers in their office or home. Of course they have been around longer than that, since around the forties and fifties, when they were too big and expensive for individuals to own. The earliest computers were hardwired to run a particular program, but since then everyone has become familiar with the distinction between the program or software run by a computer and the hardware that runs the software. People are now familiar with the idea of being able to run different programs on the same computer, and some people are familiar with the idea of being able to run the same program on different computers, whether loading the exact same program on two similar personal computers or running "equivalent" versions of a program on two different computers (such as an IBM-compatible PC and a Macintosh computer). In short, the distinction between software and hardware has become common knowledge. The idea that a computer could have a memory space that could hold different programs at different times was developed by von Neumann and known as the "stored program" concept, such machines sometimes being called "von Neumann computers." A von Neumann machine is a sequential processor (carrying out operations in serial fashion); it stores data at particular memory locations, accesses this data via the addresses of such locations, and features the CPU as the single locus of control (Copeland, 1993, pp. 192-194).

Now, the brain doesn't seem very much like a von Neumann machine for many reasons. First of all is the massive parallelism of the brain enabled by the great number of connections among neurons. As we have mentioned, a neuron may be connected to ten thousand others (or even a hundred thousand), and if the average is a thousand, and we have 10^12 total neurons, we have 10^15 total connections (Copeland, 1993, pp. 182-183).

Memory storage also appears to be different. Modern computers separate the CPU, which does the processing, from primary storage (RAM). Some memory may remain on the CPU to speed up processing as a cache, but in any event storage and processing occur in different places. It is not clear that the brain works like this, with memories seemingly distributed among synapses that may also be used for processing. In other words, computer memory stores a datum at a specific address, whereas the human brain distributes a single memory over many sites (distributed storage) (Copeland, 1993, pp. 190-192). Our authors seem to realize this but this crucial difference with modern digital computers doesn't appear to unduly concern them.

There seems to be another disanalogy between computer and brain memory access. Computers store data so as to be accessible by address. The brain, in contrast, can access memory via content (content-addressable). Human recall of an event can be initiated by any number of relevant triggers--far from finding it by specifying a unique address where it is stored. Computers can retrieve data via something like content-addressable techniques, as in examining all address contents (way too inefficient) or, even better, in the use of hash tables. But the disanalogy here is that the programmer must set up the hashing technique in advance, whereas human memory recalls events from any number of triggers, such as an associated smell or sound, various words or phrases, connections to other concrete and abstract ideas, etc. This open-endedness is not captured by the use of a hashing technique (Copeland, 1993, pp. 188-189).
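
The three kinds of access just described can be sketched as follows. This is only an illustration: the records, cue names, and function names are hypothetical, and nothing here is meant as a model of how the brain actually retrieves memories.

```python
# Hypothetical memory records; the contents are made up for illustration.
memories = [
    {"event": "beach trip", "cues": {"salt smell", "gulls", "sunburn"}},
    {"event": "grandmother's kitchen", "cues": {"cinnamon smell", "radio", "apron"}},
    {"event": "first day of school", "cues": {"chalk smell", "bell", "new shoes"}},
]

def by_address(address):
    """Address-based access: we must already know where the item is stored."""
    return memories[address]["event"]

# Hash-table access: fast, but only for the kind of key the programmer
# decided on in advance (here, an assumed choice: cues that are smells).
by_smell = {}
for record in memories:
    for cue in record["cues"]:
        if cue.endswith("smell"):
            by_smell[cue] = record["event"]

def by_any_cue(cue):
    """Exhaustive scan: works for any cue, but examines every record."""
    return [record["event"] for record in memories if cue in record["cues"]]

print(by_address(1))                   # needs the storage location
print(by_smell.get("cinnamon smell"))  # a key that was planned for
print(by_smell.get("radio"))           # a trigger nobody planned for
print(by_any_cue("radio"))             # found, at the cost of a full scan
```

The hash table answers instantly for smells because smells were chosen as keys beforehand; a sound, which was not anticipated, finds nothing unless every record is examined.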

Considerations such as these have led many in AI to look to parallel distributed processing rather than sequential processing as a model of how the brain works. The basic ideas are as follows. Parallel distributed processing (PDP) networks (connectionist networks) are built of a dense interconnected mass of simple switch-like units, artificial neurons as it were, that to use the simplest example, are each either on or off. A neuron fires if a sufficient number of neurons connected to it are themselves firing. Neural connections have different strengths (weights or conduction factors). The threshold of a neuron is the minimum input that will cause it to turn on. Some neural connections are excitatory and some are inhibitory. The networks switch themselves on and off in response to the stimulation they receive from their neighbors. When the total input meets or exceeds the threshold, it switches on, and when it drops below the threshold, it switches off. The patterns become complex, however, because the switching on of any single neuron may have an effect on a great number of others (Copeland, 1993, pp. 208-210).
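
A single unit of the simplest kind described above can be sketched in a few lines. The weights and threshold are arbitrary values I have assumed for illustration, and the function name is hypothetical.

```python
def unit_fires(inputs, weights, threshold):
    """Return True if the weighted input meets or exceeds the threshold.

    inputs  -- 1 for each connected neighbor that is on, 0 if it is off
    weights -- connection strengths: positive is excitatory, negative inhibitory
    """
    total_input = sum(x * w for x, w in zip(inputs, weights))
    return total_input >= threshold

# Assumed example: two excitatory connections and one inhibitory connection.
weights = [1.0, 1.0, -2.0]
threshold = 2.0

print(unit_fires([1, 1, 0], weights, threshold))  # both excitatory neighbors on
print(unit_fires([1, 0, 0], weights, threshold))  # too little input
print(unit_fires([1, 1, 1], weights, threshold))  # inhibitory neighbor also on
```

The unit switches on only in the first case; in the third, the inhibitory connection cancels the excitatory input and keeps the total below the threshold.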

The network is divided into an input layer, an output layer, and "hidden" layers in between. The input layers are set up so that they can be switched on or off in a particular pattern irrespective of the influence of their neighbors. This pattern then is the input. The repercussions of this input pattern rebound throughout the network until stability is reached, with some of the neurons permanently on and others permanently off. The output can then be read off of a bottom layer, which represents one edge of the stable pattern. Units can also be set up to act probabilistically to threshold values. Changing the strength of a connection can throw the network out of its current stability into more activity. The operator can produce a desired output from a given input by adjusting the strengths of the connections; one way to do this is called "training." This is a systematic way of locating output units that ought to be different and adjusting the relevant connection strengths. Since each change usually induces other changes, this can be a lengthy process (Copeland, 1993, pp. 210-214).
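
The idea of training can be illustrated with a deliberately tiny case: one output unit with two inputs, whose connection strengths are adjusted whenever its output differs from the desired one. This sketch uses the classic perceptron learning rule, which is my substitution for illustration and not necessarily the procedure Copeland describes; the learning rate, starting values, and target pattern (logical OR) are all assumptions.

```python
def output(inputs, weights, threshold):
    """Output of a single on/off threshold unit: 1 if it fires, else 0."""
    total_input = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total_input >= threshold else 0

def train(examples, n_inputs, rate=0.5, max_passes=50):
    """Adjust strengths until every example gives its desired output."""
    weights = [0.0] * n_inputs
    threshold = 0.5
    for _ in range(max_passes):
        changed = False
        for inputs, desired in examples:
            error = desired - output(inputs, weights, threshold)
            if error != 0:
                # Strengthen or weaken only the connections that were active.
                weights = [w + rate * error * x for w, x in zip(weights, inputs)]
                threshold -= rate * error
                changed = True
        if not changed:  # a full pass with no corrections: training is done
            break
    return weights, threshold

# Assumed target behavior: the unit should be on if either input is on.
examples = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights, threshold = train(examples, n_inputs=2)
print(weights, threshold)
```

Even here a correction for one example can undo an earlier one, so several passes are needed; in a network with hidden layers and many units the process is correspondingly longer.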

The above description fits the simplest example of a network, with neuron values either on or off. It is also possible to have the neurons select from a range of values between fully on and fully off. Meeting one of various thresholds may cause the unit to jump from one level to the next. The input and output patterns for a network composed of such units will be sequences of real numbers rather than strings of zeroes and ones (Copeland, 1993, p. 220).

So PDP-connectionist networks are of interest because they seem closer to the brain than do von Neumann architectures. There are other key differences between a PDP-connectionist network and a traditional computer using the von Neumann machine architecture. The traditional computer operates by manipulating symbols, but the network consists of units exciting and inhibiting one another rather than program-governed manipulation of stored symbolic expressions. But note that a simple network with on or off inputs can be regarded as a programless bit-manipulator, with the input on and off states representing bit values. A coordinated array of networks can even simulate a von Neumann machine, with one network trained to perform assembly language level shifts, another to perform compare-rights, and so on (Copeland, 1993, pp. 219-220).
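
To see how on/off units can act as a programless bit-manipulator, consider the following sketch, in which a handful of threshold units are wired together to add two bits. The weights and thresholds are set by hand here rather than trained, which is a simplifying assumption of mine, and the function names are hypothetical.

```python
def unit(inputs, weights, threshold):
    """A single on/off threshold unit: 1 if weighted input meets the threshold."""
    return 1 if sum(x * w for x, w in zip(inputs, weights)) >= threshold else 0

# Hand-set connection strengths (assumed values) that make units behave as gates.
def AND(a, b):
    return unit([a, b], [1, 1], 2)

def OR(a, b):
    return unit([a, b], [1, 1], 1)

def NOT(a):
    return unit([a], [-1], 0)

def half_adder(a, b):
    """Add two bits using nothing but threshold units; returns (sum, carry)."""
    carry = AND(a, b)
    total = AND(OR(a, b), NOT(carry))
    return total, carry

for a in (0, 1):
    for b in (0, 1):
        print(a, b, half_adder(a, b))
```

No stored program is consulted at any point; the "addition" is simply the pattern of units that end up on or off for a given input pattern.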

Now my claim is that while our authors concede that the brain uses a massive parallelism, closer to the PDP-connectionist model discussed above than to the sequential processing of a von Neumann machine, their picture of the mind seems to see it more as the stored program type of software running on a modern digital computer of the von Neumann type. The distinction between software and hardware, and the idea that the software can be uploaded onto different computers, has become for some a way (a model or metaphor) of thinking of the relation between the human mind and the human brain. Certainly it has our authors in its grip. While they don't want to say that the mind is a "substance" distinct from the brain, and wish rather to think of the mind as in some sense nothing more than the brain or even identical to the brain, they also tend to think of it as the program that the brain runs when it is operating. On this analogy, the mind is to the brain as a program is to a computer and as software is to hardware. We need to keep this metaphor or model in mind in trying to understand the viewpoint of our authors. Our authors have humans in the future being able to "port" themselves all over the place from machine to machine as the need arises. If they don't think of the mind as a piece of software, of the type of stored program that runs in a von Neumann architecture, it is hard to see how they could envision this sort of thing happening. But the mind as software is more than a metaphor to them. Their predominant view is that the mind is literally a program. For example, Moravec (1999, p. 210) thinks that humans will exist in the extraordinary future as artificial intelligence programs running on platforms other than the human brain. As programs, our minds might be laser-beamed at the speed of light to inhabit distant robot brains (Moravec, 1988, p. 214).

This equation of the mind with software might be the cause of an overly hasty identification of a brain neuron firing with a calculation. Recall that they attempt to determine the computational power of the brain in terms of total number of neuron firings per second, as if this should be merely duplicated in a computer almost irrespective of the way the brain is structured or carries out its neuron firing. Looking at the brain itself, it is not clear in what sense a single neuron firing represents anything. But the tendency of our authors is to equate a single neuron firing with a digital computer carrying out a single calculation. I cannot find any argument even attempting to justify this identification. One can understand the natural tendency of trying to identify a neuron firing with something specific that can be represented in a digital computer, whether it is a calculation, an instruction, a cycle, etc. But such identification really does seem to be just an assumption, and a questionable one at that. The representation or representations occurring when a computer carries out a calculation, whether just moving a bit into a register or something more complex, may or may not be what is occurring in a neuron firing. It certainly has not been proved that a neuron firing corresponds to a representation of a proposition or the meaning of a word, and many think that any representation in the brain must be on a more global scale than a single neuron event.

Besides the view of mind as software, our authors may be in the grip of a view of the brain that sees it as a symbol processor, which fits in well with the former view since when a computer runs a software program it is manipulating symbols. To see how this view develops, consider how a computer works in general terms. The transistors of digital computers can be put in various electrical states of different voltage levels that can be thought of as representing "off" and "on," with off and on standing for the zero and one values of a binary digit. The computer works by carrying out operations (dictated by the application and operating systems programs) that ultimately shuffle these voltage levels around in a way that is interpreted as carrying out operations on numbers. These numbers can be thought of as representing other characters, such as ordinary letters and decimal numbers (this is ASCII code). So, for example, a computer spreadsheet appears to us to be able to add 2 to 1 and get 3, but what happens to accomplish this is that the program translates the decimal numbers and the plus sign ultimately into machine level binary number values and electrical states, carries out the requisite processing by changing its electrical states in accordance with the instructions of the program, and then translates the relevant resulting electrical state back into a display of the sum. The shuffling of symbols in accordance with rules, carried out by the computer in its electrical state manipulation, comes to be seen as computation.
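
The spreadsheet example can be traced in a rough sketch. Real spreadsheets and processors differ in many details; the function names below are hypothetical, and the bitwise adder merely stands in for what the hardware does with its electrical states.

```python
def add_bits(a, b):
    """Add two non-negative integers using only bitwise operations."""
    while b:
        carry = (a & b) << 1  # positions where both bits are on carry leftward
        a = a ^ b             # add without carrying
        b = carry
    return a

def add_text(left, right):
    """Go from typed characters to numbers, add, and back to characters."""
    return str(add_bits(int(left), int(right)))

# What the user types is stored as character codes (ASCII), shown here as bits.
for ch in "2+1":
    print(repr(ch), ord(ch), format(ord(ch), "08b"))

result = add_text("2", "1")
print(result, format(ord(result), "08b"))
```

At no stage does the machine need to "know" that it is adding; it only changes bit patterns according to fixed rules, and we interpret the final pattern as the character "3".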

An important assumption of what might be termed "traditional artificial intelligence" is the symbol-system hypothesis (SSH), which stems from this view of computers as engaged in shuffling symbols (Copeland, 1993, pp. 58-60). We can talk of a weak version of the symbol-system hypothesis (just abbreviated SSH) and a strong version (SSSH). The weak version holds that a symbol-system (such as a computer) can think, though there might be other things that think (Copeland, 1993, p. 82). While digital computers are engaged in a kind of thinking, other things that are not universal symbol manipulators might also carry out thinking. It may be that humans are such things, and so the fact that we can think does not prove we are computers or fundamentally just symbol-manipulators. In contrast, the strong version of the symbol-system hypothesis (SSSH) holds that computing is necessary for thinking, all thinking is computing, and only computers of some sort can think.

On SSSH, the computer becomes a model for the human brain, the computer software becomes a model for the human mind, and thinking is seen as nothing but a type of computation analogous to what goes on in common digital computers. It's perhaps inevitable that this would happen; big old mainframes were referred to as "electronic brains" decades ago. But "digital computer" here comes to refer to a class of objects that includes more than modern digital computers made of silicon and metal. Consider that computers might run in different ways that are equivalent in terms of producing the same output for a given input. This is the idea that you can run the same program on different computers. Going even further, it would even be possible to "run" the same "program" on something that is not an electronic computer as long as the relevant symbol manipulation were carried out. So we arrive at an abstract notion of a computer as a universal symbol manipulator. This allows humans to be seen as computers. Since humans think, they must be doing so by computing in the above sense. Though humans don't work quite like modern digital computers do in the sense of manipulating voltage levels of silicon planar transistors, on this view our thinking does amount to some sort of manipulation of symbols. On SSSH then, not only is computation a metaphor for human thinking, humans (their brains or minds) are computers in this sense of universal symbol manipulator.

Leading advocates of SSSH who view mental processing as computation include Zenon Pylyshyn and Jerry Fodor. On this view, human thinking just is computation. The brain encodes its knowledge in the same fashion that a computer may be said to use symbol-structures to encode semantic content (representations of things in the world). Human mental activity is the manipulation of sentence-like symbolic expressions that compose an internal mental language or "Mentalese" (Copeland, 1993, p. 181). Fodor holds that our brains have mental representations that are sort of like sentences or propositions. The sentences "It is raining" or "Il pleut" or "Es regnet" in particular languages can express the fact that it is raining; or rather, all of these sentences in particular languages express the same proposition that you believe. While the proposition is not literally in your brain or mind, a mental representation corresponding to it is, and this mental representation can cause you to act in response, such as by putting up an umbrella (this theory is called the "representational theory of mind"). These mental representations have a structure similar to language (the "language of thought" hypothesis). I manipulate these mental representations purely formally just as a computer manipulates symbols, and this manipulation is truth preserving (the so-called "computational theory of mind").

Now the question arises of whether our authors hold to SSSH or merely SSH. They don't explicitly use these phrases, but given their predominant view of the mind as a piece of software, it is quite natural to interpret them as holding to SSSH. While I don't want to accuse our authors of being Fodor disciples, if they do hold to SSSH this might be what causes them to neglect the particular way the human brain "wetware" works. The massive parallelism of the human brain might be seen as just the way our brains implement symbol manipulation; a robot brain could have the same mind and just implement the symbol manipulation in a different fashion. Our brains and the robot brains are both computers, but different computers can run equivalent programs, with in a sense the "same" program running on different computers, so they see no problem if the robot brain doesn't work in quite the same way the human brain does. As long as the symbols get manipulated in equivalent ways, thinking will occur. As long as the overall computational power is the same, other differences between hardware and wetware probably won't matter. Occasionally one does see the comment, as mentioned earlier from Paul and Cox, that parallelism may be important, and perhaps some robot brain parallelism will be needed for computers to do what human brains do. But this may be more an acknowledgement that perhaps some form of parallelism is the best way to implement symbol manipulation, not that the essence of human thought might be radically different than symbol manipulation. And even then I see no insistence that, for all we know, the parallelism of the robot brain might have to be as extensive as that of the human brain for it to be able to do what the human brain does.

Our authors would certainly hold at least to SSH, that computing is thinking, even if they might not unequivocally embrace SSSH if pressed about it. But adopting SSH might not entail that just any symbol manipulation is thinking. It is not clear how much symbol manipulation must be done for our authors to consider it thinking. The assumption seems to be that once the computer can match the processing speed and power of the human mind, then it will be thinking. But they don't seem ready to embrace every low-level type of calculating as an instance of thought. They are not necessarily all "panpsychists."

Copeland mentions three positions on the relation between the symbol-processing model and connectionism. Implementationalism holds that PDP is just how symbolic computation (manipulation) is realized in the human brain. So the brain, although it consists of PDP networks, is still a computer in the sense of a universal symbol system. This seems to be the position of our authors, though if so they don't seem to be aware that they are just implicitly assuming the truth of this position rather than arguing for it. They also don't seem to be aware that there are alternatives that argue that implementationalism is incorrect or even impossible. What is known as eliminativism (in this context; it is not exactly the same as eliminative materialism later in this thesis) holds that eventually we will see that the brain is not a symbol processor; we will be able to explain all forms of cognition without the use of any type of brain code, such as the Mentalese assumed by Fodor (explained above). This seems pretty far from what our authors are assuming, since they see such a parallel between the software that is running on digital computers and what is going on in our minds. A more eclectic position is that of moderatism, which sees a variety of theories as necessary to explain the brain. Some processes such as language processing and logical reasoning will yield to the symbolic approach, while others such as face recognition and associative memory will be amenable only to pure PDP (Copeland, 1993, pp. 244-247). Our authors might consider being open to this latter position, and to the weak form of SSH. But if so they would have to be careful. An assumption of our authors is clearly that for human-computer mind transfer to work the human mind, if it is not already just a piece of symbol-manipulating software, must be translatable somehow into software of some sort, so that it can be ported around to different platforms. 
Thus for our authors to be moderatists, the moderatist must allow that it makes sense to talk of the software of the processes that are amenable only to pure PDP. If moderatism allows this, then they could be open to such a position. But it may be that the types of human brain processes that can't be captured by the symbolic approach are not readily thought of in terms of software at all, and these processes might then prove not amenable to transfer throughout the many machines humans as software would want to port themselves to.

It seems that on any position that seeks to combine symbol manipulation with parallelism, such as the implementationalism discussed above, inevitably there is going to be unclarity about how these two aspects work together, given our present state of ignorance of exactly what the mind and brain do in the activity of thinking. Though networks can be used as symbol manipulators, they seem very different. However, the difference between PDP networks and symbol-manipulators in general is not clear-cut. This is because the nature of symbol-manipulators (as opposed to one kind called von Neumann machines) is open-ended. SSSH allows the brain's symbol-manipulating operations to be different in radical and presently unknowable ways from those of a von Neumann machine (Copeland, 1993, pp. 220-221). It is not precluded that the brain may be using some kind of PDP architecture to manipulate symbols in some way. However, it could also be that the brain uses its PDP architecture with no symbols and no programs (Copeland, 1993, p. 221). Thus the dispute between implementationalism and its opponents remains unresolved.

My point in this section is not about whether our authors hold to SSSH or SSH, or even whether they are implementationalists or moderatists, but about how our authors may be in trouble by not taking seriously enough the fact that the human brain is so massively parallel. As I stated, our authors are aware of current theories of the brain that have it operating on a massively parallel scale rather than like a sequential processing computer, and they talk of the possible need to build a computer that uses parallel processing. On some mind transfer scenarios presented, by Moravec for instance, the transfer is conducted by building a robot brain isomorphic in some sense to the structure and functioning of the human brain (though it is not clear how fine-grained this isomorphism is supposed to be). But they do often allow that less parallelism in the robot brain than in the human brain could likely be good enough. This is what leads one to think they are implementationalists at heart--the parallelism of the human brain is just the way the symbol manipulation of thinking occurs in humans, but in robot brains it could make do with less parallelism or perhaps none at all.

When I claim that it seems to me that they have failed to take the brain's massive parallelism seriously enough, by this I mean that they fail to appreciate that maybe the brain can do what it does just because it uses this type of processing, and so making do with less parallelism may not be an option at all. They under-emphasize the point that porting oneself around the universe into various platforms, some with less parallelism than the human brain or no parallelism at all, may not be possible. This would explain their concern with Moore's Law continuing to hold well into the next century. Moore's Law, as it has evolved, is about increases in chip density that are relevant to processing power, and in terms of the kind of computers that are being developed to keep Moore's Law in force, this power is in terms of sequential speed. Strictly speaking, as Moore understood it, Moore's Law doesn't have anything to say about parallel processing within a chip, as long as the components on the chip make up an integrated circuit. But Moore's Law as originally understood is certainly not about parallel processing among a large number of chips. It is about getting the most computing power out of a single chip. Thus the concern with moving to shorter paths via narrower circuit widths. Thus the concern with continuing to pack more and more transistors on a chip. Of course our authors would reply that while they talk of Moore's Law, which is not about parallel processing among large numbers of chips, they think Moore's Law is just an instance of the wider phenomenon of exponential growth in computing power per se, which would cover parallel processing among large numbers of chips. But as I have mentioned, the further one gets from Moore's Law, the harder it gets to provide evidence of any kind of real "law" at work moving us to more powerful computers. It is difficult enough to make the case that Moore's Law is really a law.

Given the way computer chip technology is advancing, if Moore's Law does remain in force and powerful computers of the type our authors envisage do come about within the next half-century, it looks very likely that such computers will be sequential processors or at least make do with substantially less parallelism than that present in the human brain. But consider that a modern computer can already process serially many more times per second than a human brain neuron can fire. We already seem to have more than enough speed now to match what human wetware can do in terms of a neuron firing. Moore's Law could fail right now and we would have more than enough speed to match human wetware in this regard--in fact, we've had it for a long time. So why the need for still more processing power, in terms of more sequential speed, if we have already far surpassed neuron processing speed? Obviously because we haven't yet matched the total processing power of the human brain, even though our current digital computers can far surpass the brain in terms of sheer serial speed. More processing power, in terms of still faster computers, would be useful only if one were trying to accomplish by some sort of sequential processing or minimal parallel processing what the human brain does by massive parallel processing. Because the human brain is so massively parallel, the total number of neurons firing per second is huge, even though individual neurons may not fire very many times per second. Instead of taking existing computer processing speed and recreating the human brain's massive parallelism, it would seem that at least much of the time our authors envisage trying to get the same overall speed by using much faster processors operating in a much less parallel fashion.
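
The contrast between serial speed and total throughput can be put in rough numbers. The neuron count follows the estimate used in this chapter; the firing rate and the processor speed are assumptions I have chosen only to illustrate the shape of the comparison, and counting one firing as one operation is exactly the identification questioned earlier.

```python
NEURONS = 10**11                     # estimate of brain size used in this chapter
FIRINGS_PER_NEURON_PER_SECOND = 200  # assumed firing rate, for illustration only
PROCESSOR_OPS_PER_SECOND = 10**9     # assumed serial processor speed

# How much faster one serial processor is than one neuron.
serial_speed_ratio = PROCESSOR_OPS_PER_SECOND / FIRINGS_PER_NEURON_PER_SECOND

# Total firings per second across the whole brain, all neurons in parallel.
brain_firings_per_second = NEURONS * FIRINGS_PER_NEURON_PER_SECOND

# How many such processors would be needed to match that total.
processors_needed = brain_firings_per_second / PROCESSOR_OPS_PER_SECOND

print(f"one processor vs one neuron: {serial_speed_ratio:,.0f} times faster")
print(f"whole brain: {brain_firings_per_second:.1e} firings per second")
print(f"processors needed to match the total: {processors_needed:,.0f}")
```

On these assumed figures a single processor outruns a single neuron by a factor of millions, yet it would take tens of thousands of such processors, or one processor tens of thousands of times faster, to match the brain's total.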

To take the brain's parallelism seriously would be to try to estimate what it would take to recreate it, and to also estimate realistically when this might be possible. It would mean not just assuming that we could probably translate the brain's parallelism into some form of sequential computing. It would mean placing less emphasis on Moore's Law as such, or an equivalent trend that focuses just on faster or more powerful processing per se, and instead placing more emphasis on advances in parallel computing. What needs to be considered is not just Moore's Law but any similar trend discernable with respect to increasing levels of parallelism in computing. Consider that the capabilities of the human brain and mind in terms of thinking abilities and consciousness may arise from some physical properties of the human brain related to what it is made of, from the way that the brain is organized and functions in a massively parallel fashion, or from something else we haven't thought of. I think it's safe to say that at this point we just don't know. The robot brain won't be made of the same stuff as the human brain, so we may lose the possibility of such thinking and consciousness in robots if that is the relevant aspect that provides them in humans. But suppose it is not and that fine-grained structural, organizational, and functional identity (in terms of parallelism) is instead what would provide for thought and consciousness in the robot. In such a case, if we don't have the same parallelism, we lose that kind of identity, and so then lose the possibility of the robot being able to think and be conscious like we are. As I discuss later, it may be that we can't ensure that robots are conscious even if their brains duplicate the parallelism of the human brain on a very fine-grained level. But to abandon such fine-grained parallelism really might make it even less likely that the robot would be capable of our kind of thought and consciousness. 
So in the end trying to build the sequential processing robot equivalent of a fundamentally parallel processing human brain, even if that robot uses some parallelism, means not only that it may not work but that we may have lost our only hope of providing for thought and consciousness.

The problem with the approach that focuses on obtaining just raw processing power in whatever form is that it neglects the possibility that perhaps humans are able to do what they do because of the specific way the brain works. The brain uses organic materials put together in a particular way and apparently operating in a massively parallel fashion. It doesn't seem to separate memory storage and processing components in the way a modern digital computer does. We do not know whether and how human capabilities and thought are enabled by such factors. But when designing a computer to match the intelligence and capabilities of humans, the further one departs from the actual design and materials of the brain, the greater the risk of failing to match human brain functioning in all its varied aspects. To take the brain's parallelism seriously, our authors should consider that if we can't match the way the brain works in a robot brain, we might not be able to provide the kind of robot we need. Since, as explained more fully below, we don't seem to be anywhere near being able to match the brain's parallelism in a PDP computer, much less on a von Neumann sequential processor, caution would seem to be more in order than their naive optimism that given enough of Moore's Law, our computers will be powerful enough to allow for the necessary robot mentality.

Taking our authors' comments as a composite picture distilled into a common perspective, I think that their position is sometimes perplexing and perhaps inconsistent. Maybe this is a little unfair, since they don't always all speak with one voice, but I think all of them are more concerned to point to such trends as Moore's Law than they are to figure out whether we will really be able to crack the riddles of the brain's type of massive parallel processing. On the one hand, as we have seen above, they worry about getting greater overall processing power. But the way they see us getting more processing power is to extrapolate trends to pack in more chip density, etc. They don't claim that current chip technology should be overthrown and remodeled on the way the human brain works; rather, they seem to be concerned with continuing down that road until speed increases allow the computer to match the brain. Thus the concern with Moore's Law and with switching to atomic computing when silicon etching reaches its limits. But on the other hand, elsewhere, they sometimes talk of building a robot brain by "modeling" it on the human brain. In fact, as we shall see in upcoming chapters, on some scenarios (but by no means all) the robot brain is pretty much an electronic copy, neuron by neuron, of the human brain. Here presumably massive parallelism would be a prominent feature of the robot brain. These two views seem hard to reconcile. The tension shows that they really don't have a definite picture of how to pull off the project of a smart robot that will accept the transfer of a human mind. They have just the vague idea that it will be done, but they don't know how much parallelism will be involved, if any at all.
But my advice is that if you really want to make a robot brain copy of the human brain that has a good chance of doing what the human brain does, forget Moore's Law and focus on figuring out how the brain's parallelism works and whether we have any realistic chance of advancing in that area with computer parallelism. We already have enough speed to match the brain--what we need is to figure out its organization and how to duplicate it exactly. But they have limited interest in this.

I fault our authors for being too optimistic that sheer computing power alone without sufficient parallelism will make for the smart robots that we need for mind transfer. But it's not as if they should assume all problems have been solved if they just insist that we use a massively parallel architecture. Apparently we still have a long way to go in getting PDP-connectionist models to match the brain. Copeland (1993, pp. 221-225, 245) summarizes the similarities and differences between the brain and current PDP networks. Similarities include the following:

1.     Individual units in a network are somewhat analogous to neurons.

2.     Human learning appears to involve modification of connection strengths in the brain.

3.     Neurons behave in a roughly similar fashion to networks using input, excitation/inhibition, and output.

4.     Both networks and the brain store information in a non-localized or distributed manner.

5.     Human memory works through content-addressability, and networks can function this way (if part of a remembered pattern is given as input the completed pattern is generated as output).

6.     Networks and the human brain "degrade gracefully."
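
Content-addressability, the fifth similarity above, can be illustrated with a tiny network. The following is a minimal sketch using a Hopfield-style network with Hebbian weights; that choice of model, the eight-unit size, and the two stored patterns are my own illustrative assumptions, not an example taken from Copeland. Given only part of a stored pattern (unknown units set to 0), the network settles on the complete pattern.

```python
def train(patterns):
    """Build Hebbian connection strengths from patterns of +1/-1 values."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += pattern[i] * pattern[j]
    return weights


def recall(weights, cue, max_steps=10):
    """Update all units together until the state stops changing."""
    state = list(cue)
    for _ in range(max_steps):
        new_state = []
        for i, row in enumerate(weights):
            # Net input: positive terms excite the unit, negative terms inhibit it.
            net = sum(w * s for w, s in zip(row, state))
            if net > 0:
                new_state.append(1)
            elif net < 0:
                new_state.append(-1)
            else:
                new_state.append(state[i])  # tie: leave the unit unchanged
        if new_state == state:
            break
        state = new_state
    return state


# Two hypothetical stored "memories".
PATTERN_A = [1, 1, 1, 1, -1, -1, -1, -1]
PATTERN_B = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train([PATTERN_A, PATTERN_B])

# Only the first five units of PATTERN_A are supplied; 0 marks "unknown".
partial_cue = [1, 1, 1, 1, -1, 0, 0, 0]
completed = recall(weights, partial_cue)
print(completed == PATTERN_A)
```

Note that the memory is not stored at any one location: each stored pattern contributes to every connection strength, which is the distributed storage mentioned in the fourth similarity.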

But there are important differences between current networks and the brain:

1.     The brain features a diversity among neurons not matched in PDP networks.

2.     Neurons are either excitatory or inhibitory, but PDP units have both functions.

3.     It may be that the extensive repetition needed in an artificial network to train it up (give it the appropriate strengths) is not needed in human learning (it is at least debatable to what extent human learning involves such extensive repetition).

4.     Network "learning" requires a trainer, specification of the desired output, and numerous adjustments after observing the wrong output; but nothing quite like this happens when a human learns.

5.     The brain uses fifty different types of neurotransmitter (chemicals carrying signals), which is unmatched in a network.

6.     The brain has an elaborate geometry of connections among neurons to near and far neighbors which is unmatched in a network.

7.     Humans are good at inference, and von Neumann machines are good at inference, but parallel networks are relatively bad at inference.

The suggestion I am making is that our authors not just mention parallel processing but take it seriously as the way to build a computer suitable for human-computer mind transfer. The relevant question then is not how far we can extend Moore's Law, seemingly so as to build a quasi-von Neumann type machine, but how soon we can hope to have the sort of massive parallelism in a computer similar to that operating in the brain. What they should be looking to find evidence for is some kind of "Moore's Law for Parallel Processing."

The Many Aspects of Intelligence

Recall that we are discussing the question of whether computers will soon be as smart as humans. In arguing that they will, our authors quickly launch into discussions of computing power advances and try to estimate the computing power of the brain. But this is not necessarily the obvious place to start such a discussion. If the question is one of how soon computers will become as smart as humans, perhaps our authors should first or at least eventually consider what it is to be humanly smart rather than merely focusing on the recreation of the total processing speed of the brain in a computer. In trying to design a computer/robot for use in human-computer mind transfer, I would think it important to discuss the nature of the human intelligence you are trying to build the robot to have. I find this topic curiously neglected by our authors. Some of them do mention the question of what intelligence is, but it gets short shrift. For example, Kurzweil starts a section entitled "What Is Intelligence?" with the claim that intelligence is "the ability to use optimally limited resources" (Kurzweil, 1999, p. 73). But then he quickly changes the subject to one of how to solve intelligent problems (the answer is to combine simple methods with massive amounts of computation), as if intelligence is basically the kind of problem solving a computer can do. So much for considering intelligence in humans!

I think the reason the subject of human intelligence is neglected is that our authors just assume that once you build a computer with the same overall processing power (or "overall" speed) as the brain, it will be as smart as a human. It's as if intelligence is simply how quickly you think, so matching the overall speed will enable the computer to think as quickly as a human. But if this is their assumption, it is obviously flawed. With respect to overall computational speed in terms of total computations or firings per second (which they sometimes equate in a very questionable fashion as I have mentioned), our authors would estimate that the typical human brain is still far ahead of the ordinary desktop computer. Yet the ordinary desktop computer can do arithmetic on gigantic numbers far quicker than can the average human, and these examples could be extended to cover complicated calculations and feats of memory (and now even world-class chess if one considers Kasparov's defeat at the hands of an atypical computer). So how can the computer be smarter when it is slower (in terms of overall speed)? Obviously total processing speed in this sense does not track some types of intelligence. Also consider that the brain achieves this massive overall speed by use of neurons that are individually quite slow compared to electronic circuits. Furthermore, it should be obvious to anyone in computer science that hardware speed is only part of the story. Two software programs running on the same platform can produce identical output for the same input and yet vary in speed.
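
The last point, that software matters as well as hardware, is easy to demonstrate. The following is a minimal sketch: two hypothetical functions compute the same Fibonacci number on the same machine, and each counts the steps it takes rather than measuring clock time, so that the comparison does not depend on any particular hardware. The choice of Fibonacci numbers is purely illustrative.

```python
def fib_recursive(n, counter):
    """Naive recursion: recomputes the same subproblems many times."""
    counter[0] += 1  # count every call
    if n < 2:
        return n
    return fib_recursive(n - 1, counter) + fib_recursive(n - 2, counter)


def fib_iterative(n, counter):
    """Simple loop: each value is computed exactly once."""
    a, b = 0, 1
    for _ in range(n):
        counter[0] += 1  # count every loop pass
        a, b = b, a + b
    return a


recursive_steps = [0]
iterative_steps = [0]
slow_answer = fib_recursive(20, recursive_steps)
fast_answer = fib_iterative(20, iterative_steps)

print("same output:", slow_answer == fast_answer)
print("recursive steps:", recursive_steps[0])
print("iterative steps:", iterative_steps[0])
```

Both functions return the same answer, but the recursive version needs vastly more steps to get there. Raw hardware speed alone therefore says little about how quickly a given task will be completed.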

Again, if you are trying to create a computer/robot to be as smart as a human, wouldn't it make sense to try to understand and characterize what human intelligence is, in all its varied facets? Anyone growing up in a public school system quickly learns that some of the kids in the class are better in math than others, that some of the kids are better readers, writers, or speakers than others, that some of the kids are better athletes, dancers, or actors than others, and that these superior groups do not always consist of the same kids. Intelligence may be more complicated than it first appears.

Our authors should realize that in their endeavor, human intelligence needs more attention than they give it because it is the proper measuring stick of relevant computer intelligence. This is for two reasons. First, the authors depicting an extraordinary future pick the intelligence of humans as the basis of comparison when they say computers will get really smart in the next century. In spite of the fact that computers can beat human performance in some respects--calculating, as in the above example, and perhaps even chess nowadays--there seems to be the implicit assumption that overall humans are still the smartest beings we know about. The second reason human intelligence is the measuring stick for computer intelligence in our context is that it is humans who are going to be doing the mind transfer into computers, and so we want the recipient of the transfer to come out of it at least as smart as when he or she went in.

It should be clear that I think our authors have neglected adequate treatment of the subject of intelligence. They might claim in defense that though this is true, the neglect has not harmed their analysis. I wish to argue that it has hurt their analysis by misleading them about how easy it would be to make a robot as smart as a human. To show this I need to show that human intelligence might be more than just calculating and solving problems. I'll talk about a narrow understanding of intelligence, a broad understanding of intelligence, and some problems in getting a machine to achieve human intelligence.

The narrow understanding of intelligence is what many people commonly assume intelligence to be. It seems to be the position of our authors. On a common sense level many people would say that intelligence, or how smart you are, involves reasoning ability, calculating ability, the ability to understand, and so forth. We have what are commonly perceived as intelligence tests given to us while in school or when preparing to enter college or graduate school. The Stanford-Binet Intelligence Scale and the Wechsler Intelligence Scales have been available for years, and the Scholastic Aptitude Test and Graduate Record Exam incorporate many features found in IQ tests (Yam, 1998, p. 7). So since we have an intuitive understanding of intelligence, and we have tests to measure it, we might surmise there should be no real problem coming up with a definition of intelligence and a way to test how much of it robots have.

It is not quite as simple as this. There is no universal consensus on what intelligence is, and if we can't agree on what it is, we may not agree on what tests measure it. It may seem hard to believe that the notion of intelligence and its tests can be so controversial, but consider what your view is of some aspects of Sir Francis Galton's test for intelligence, administered between 1884 and 1890. What did Galton use? One test used a whistle to ascertain the highest pitch a person could perceive. He also had people pick up cases of gun cartridges (with the cartridges filled with either shot, wool, or wadding) to see how well they could sort the cases by weight. Another test involved the factor of one's sensitivity to the fragrance of roses (Sternberg, 1998, p. 12). To us these tests probably seem of dubious relevance, but to Galton these tests were good measures of intelligence. We can't say this is because Galton was just stupid; a 1926 estimate of Galton's IQ put it at 200, and even if the method used of estimating his IQ is questionable, he probably was no slouch. This difference of opinion about the relevance of these sorts of tests seems to reflect the fact that we don't all agree on what intelligence is or how to test for it.

Definitions and descriptions of intelligence in the AI literature vary. Sometimes the author of a paper just provides his or her favorite definition, with little or no justification. At other times the approach is more one of "here are several definitions, pick the one you like." A recent AI work by Kelly notes the diverse abilities falling under the rubric of intelligence: ability to reason, use heuristics, do and know what is being done and why, make use of knowledge, accept and interpret information, select appropriate information and apply it to problem-solving tasks, recognize and learn from mistakes, deal with unexpected or unusual situations, learn and adapt to different circumstances, understand, reason, perceive, have insight, be aware of relevance, form adaptive responses, perform tests or tasks involving the grasping of relationships, meet novel situations, carry on abstract thinking, etc. (Kelly, 1993, pp. 38-41). Another approach, noting that the concept of intelligence may not be one thing for all creatures, prefers to think of intelligence as present whenever an organism has a quorum of the following attributes: verbal fluency, verbal comprehension, spatial visualization, perceptual speed, memory, reasoning, sensorimotor intelligence, symbolic thought, concrete operational thought, formal operations, knowledge how, knowledge that, ability to generalize, ability to learn from the past, ability to act purposefully, creativity, and the ability to notice significant facts (Kelly, 1993, p. 67).

All these approaches above are attempts to define what I think of as a narrow understanding of intelligence. A narrow understanding sees intelligence as a single kind of thing and attempts to define it. As we have seen, definitions vary, but most of them stress things like abilities in reasoning, learning, or adapting.

Besides the above definitions, and the many definitions given over the years in psychology and education textbooks and the like, psychologists have at least several times surveyed experts specifically on the proper definition of intelligence. A famous symposium in 1921 in the Journal of Educational Psychology produced fourteen definitions, many of which emphasized learning and adaptation rather than abstract reasoning alone: the range or flexibility of association, the ability to learn to adjust oneself to the environment, the ability to adapt oneself adequately to relatively new situations in life, the capacity for knowledge, and the capacity to learn or profit by experience. A 1986 update on the 1921 symposium solicited essays by experts in the field of intelligence including the invitation to again try to define its nature. The understandings of intelligence discussed in this update were more diverse still than those of the earlier symposium, but some common themes hold between the two. Adaptation to the environment, basic mental processes, and higher order thinking (such as reasoning, problem solving, and decision making) were present in both. But important new emphases in the update include metacognition (knowledge about and control of cognition), the role of knowledge and the interaction between knowledge and mental processes, and the role of cultural context (Sternberg, 1990, pp. 35-36, 49-53).

Unfortunately the discussion above may be naive in simply assuming that with intelligence we are dealing with the individual. While nonspecialists may assume intelligence characterizes an individual, wider emphases from the updated symposium are instructive of the diversity of alternative understandings. Three main "loci" of intelligence were involved: intelligence within the individual, intelligence within the environment, and intelligence in the interaction between individual and environment. But numerous distinctions were made even further within each of these areas. Individual intelligence was discussed on the biological level, the molar level (the molecular level in the sense of the parts or components making it up), and the behavioral level. Biologically, comparisons can be involved within or among species or generations of species, and involving interaction with the environment. Within organisms the role of structural and evolutionary aspects of the brain or process aspects of the neurons may be discussed. At the molar level, the emphasis may be on cognitive or motivational factors. Cognitive processes mentioned include selective attention, learning, reasoning, problem solving, and decision making. Motivation theorists argue that motivation is involved in intelligence as much as cognition. Motivation to cognize may be relevant to the quantity and quality of cognition. As one might guess, the behavioral level of analysis focuses on what a person does rather than on what is being thought. A major controversy is about the breadth of the domain of intelligence: for example, is artistic or dancing behavior part of intelligence or within another domain? Furthermore, theorists do not agree on the relevance of everyday tasks to the issue of intelligence, some saying they are irrelevant to anything important and others arguing that a true understanding of intelligence is found in such mundane behavior.
Some theorists view intelligence as residing not in the individual but in the environment as a function of one's culture and society or of one's niche within these. Some would argue that intelligence is wholly relativistic with respect to culture, and so it is impossible to understand intelligence without understanding the culture. Through its labeling and attributional processes, the culture determines the nature of intelligence and who has how much of it. What the culture, society, or niche deems intelligent will generally be a function of the demands of the environment in which the people live, the values held by the people, and the interaction between demands and values. Societal values that are in demand but not easily filled may come to be valued highly. Other theorists would claim that intelligence resides not wholly within the individual or environment but within the interaction between the two. A person may be differentially intelligent within different environments (Sternberg & Detterman, 1986, pp. 3-9).

The above discussion presents a glimpse of a larger understanding of intelligence, one that sees it as more than just one thing. But just to show you that the narrow understanding is still popular, let me highlight a recent article from a special issue of Scientific American on intelligence. This article, by Linda Gottfredson, claims that there is a general intelligence ("g") that is depicted by a person's intelligence quotient (IQ). Gottfredson claims there is clearly a general mental ability, a "global factor that permeates all aspects of cognition," that is measured by IQ tests and acts to predict success and performance in life. It was long ago recognized that in mental tests designed to measure particular domains of cognition, such as verbal fluency, mathematical skill, spatial visualization, or memory, people doing well on one test tend to do so on the others, and similarly for those doing poorly. The overlap or intercorrelation suggests that the tests are measuring some global element of intellectual ability. This general factor, or g, isolated by statistical factor analysis, is now used as the working definition of intelligence by most intelligence experts. Particular tests also measure specific abilities, but these impurities can be statistically separated from g. The g factor is especially important in behaviors such as reasoning, problem solving, abstract thinking, and quick learning. While the concept of intelligence and how people in a society are ranked according to it could be "social artifacts," the fact that g is not specific to any particular domain of knowledge or mental skill suggests that g is independent of cultural content including beliefs about intelligence (Gottfredson, 1998, pp. 24-27).
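
The statistical idea behind g can be sketched briefly. In the following minimal example the correlation matrix is invented purely for illustration (it is not data from Gottfredson or from any real test battery), and the method is a simplification: it extracts the first principal component of the correlations by power iteration, which is a rough stand-in for, not an implementation of, the factor analysis psychometricians actually use. When every pair of tests correlates positively, a single component with positive weight on every test accounts for a large share of the total variance.

```python
import math

# HYPOTHETICAL correlations among four mental tests (invented for illustration).
TESTS = ["verbal", "math", "spatial", "memory"]
CORRELATIONS = [
    [1.0, 0.6, 0.5, 0.4],
    [0.6, 1.0, 0.5, 0.4],
    [0.5, 0.5, 1.0, 0.3],
    [0.4, 0.4, 0.3, 1.0],
]


def multiply(matrix, vec):
    """Multiply a square matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def leading_component(matrix, iterations=200):
    """Power iteration: returns the largest eigenvalue and its unit eigenvector."""
    vec = [1.0] * len(matrix)
    for _ in range(iterations):
        nxt = multiply(matrix, vec)
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    # Rayleigh quotient gives the eigenvalue for the converged unit vector.
    eigenvalue = sum(v * m for v, m in zip(vec, multiply(matrix, vec)))
    return eigenvalue, vec


eigenvalue, loadings = leading_component(CORRELATIONS)
share = eigenvalue / len(CORRELATIONS)  # fraction of total variance

for name, weight in zip(TESTS, loadings):
    print(name, round(weight, 2))
print("share of variance on first component:", round(share, 2))
```

Whether such a shared component deserves to be called "intelligence" is, of course, exactly what the narrow and broad views dispute; the arithmetic shows only that the tests overlap.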

I turn now from the narrow type of view to consider the broad type of view. Most of the above theories saw intelligence as just one thing; they just differed on what that thing is. The broad view sees intelligence as not just one thing. Howard Gardner, for instance, thinks that human intelligence encompasses a set of competencies far wider than those captured in the notion of g.

Gardner holds that intelligence consists of a wider and more universal set of competencies. He refers to them as multiple intelligences. He developed his view after working with both gifted children and adults who had suffered strokes that shut down particular capacities while leaving others untouched. Each of the people he saw from either group had various strengths and weaknesses, and he noticed that a strength or weakness could exist simultaneously with varied sets of abilities and disabilities in the same individual. Gardner began to think that humans are more accurately thought of as possessing a number of relatively independent faculties instead of having an "intellectual horsepower" that can be channeled in one direction or other (Gardner, 1998, p. 20).

Gardner thinks his view fits in well with developments in various related sciences. Neuroscience knows of the modular nature of the brain, and evolutionary psychology holds that different capacities have evolved in particular environments for specific purposes (Gardner, 1998, p. 21).

Gardner also makes two strong claims about multiple intelligences. First, all humans possess all of them, but second, not everyone has them in the same proportions; we all have different profiles (Gardner, 1998, p. 21).

Of course one could try to reduce Gardner's many intelligences to one thing, as something in common or behind all the different intelligences. Gardner himself says intelligence is "a psychobiological potential to solve problems or to fashion products that are valued in at least one cultural context." But this misses the thrust of his approach. How many intelligences are there? In considering whether a candidate capacity is an intelligence, Gardner drew upon work in psychology, case studies of learners, anthropology, cultural studies, and the biological sciences, and he chose as criteria eight different factors. For a candidate capacity to be considered a type of intelligence, it would have to possess many of these factors. These factors are: potential isolation by brain damage, the existence of prodigies and savants with the capacity, an identifiable core operation or set (such as a musician's sensitivity to melody and rhythm), a distinctive developmental history within an individual of the capacity and a definable nature of expert performance of it, an evolutionary history and plausibility, support from tests in experimental psychology and from psychometric findings, and susceptibility to encoding in a symbol system. He finds there to be eight distinct and independent forms of intelligence or such ability: linguistic, logical-mathematical, spatial, bodily-kinesthetic, musical, interpersonal, intrapersonal, and naturalist. Existential intelligence is under current consideration (Gardner, 1998, pp. 20-21).

Linguistic intelligence concerns the ability to acquire, form, and process language. Included here are abilities in symbolic reasoning, reading, and writing (Wilson, 1998a). High intelligence in this domain means a "mastery and love of language and words with a desire to explore them" (Gardner, 1998, p. 22). Logical/mathematical intelligence concerns the ability to think logically (especially inductively and to some degree deductively, which also seems to involve linguistic intelligence), to recognize patterns (geometric and numerical), and to work with abstract concepts (Wilson, 1998a). It also involves discerning the relations and underlying principles behind objects and abstractions (Gardner, 1998, p. 22). Spatial intelligence concerns the ability to perceive images, recall visually, and imagine visually (Wilson, 1998a). Musical intelligence concerns the ability to create and interpret music and discern differences in speech patterns and accents (Wilson, 1998a). Thus listening is involved, not only composing and performing (Gardner, 1998, p. 22). Bodily/kinesthetic intelligence concerns the ability to control fine and large muscle movement, and to create and interpret gestures and communicate through body language (Wilson, 1998a). Interpersonal intelligence concerns the ability to understand and communicate with others and facilitate group processes. Intrapersonal intelligence concerns the ability to have a strong sense of self, leadership abilities, intuitive feelings, a feeling of inner wisdom, and precognition (Wilson, 1998a). Naturalist intelligence, recently added as an eighth type of intelligence alongside the original seven, concerns the ability to cope with the environment in the sense of identifying and classifying natural patterns, such as those in flora, fauna, and weather patterns (Wilson, 1998b).
Existential intelligence, under consideration, captures the human tendency to raise and ponder fundamental questions about existence, life, death, and finitude (Gardner, 1998, p. 21).

Gardner's work is controversial. Some would consider his intelligences merely different talents. To Gardner, this devalues musical or bodily-kinesthetic abilities by implying that orchestra conductors and dancers are "talented but not smart." "In my view, it would be all right to call those abilities talents, so long as logical reasoning and linguistic facility are then also termed talents" (Gardner, 1998, p. 21). Gardner is recommending a wide notion of intelligence, one that incorporates a "range of human computational capacities, including those that deal with music, other persons and skill in deciphering the natural world." But he still wishes it not be conflated with other virtues such as creativity, wisdom, or morality (Gardner, 1998, pp. 21, 23).

If the above long discussion of intelligence proves nothing else, it shows that we should not take for granted that there is a consensus on the meaning of the term "intelligence." As mentioned, several standard tests have been developed that test for IQ, but whether these are good tests for intelligence depends on one's understanding of intelligence. If you think intelligence involves g, then IQ tests will be good tests for intelligence. If, like Gardner, you think that intelligence is far more diverse, then IQ tests will not be good tests for all the varied kinds of intelligence. If, with some of the approaches mentioned above, you think that intelligence involves the environment and society as much as the individual, then you will probably think the standard types of IQ test do not test for the right kinds of traits and abilities and may be culturally biased in favor of a particular society's notion of intelligence, and they may neglect the importance of the role of motivational factors.

Can our authors get away with insufficiently considering the subject of intelligence? I tend to think not, but I'm not entirely sure. It depends on what is going to happen to bring about the extraordinary future. Part of the extraordinary future involves creating smart robots. If this is to come about through traditional artificial intelligence means, then it would seem relevant to know what human intelligence is and how to test for it. Otherwise, how do you know what to build into the robot? As we saw above, traditional artificial intelligence has thought it necessary to grapple with the notion of intelligence at least to some extent, whether or not it has advanced the issue.

On the other hand, some depictions of the extraordinary future see us creating smart robots on the fly as we transfer our minds into them. The way this happens is that an electronic (or whatever) copy is made of the human brain as the brain is scanned. We will see in a later chapter that the subject of exactly how this is supposed to occur is complicated, but on some versions we don't really have to solve the problems faced by traditional artificial intelligence efforts in creating smart computers. We may not even have to know what human intelligence is. All we have to do is to translate the programs running in the human brain into their equivalent in a robot brain. (This assumes there are such programs.) Then again, in the testing of this equivalence we might be faced with questions of what exactly we need to test, and then the question of intelligence might rear its head after all.

In one respect we certainly need to get a handle on what intelligence is and how to test for it. Our authors frequently claim that robots will not only equal human intelligence but will far surpass it. I don't know how you could show this, or even provide the statement with a concrete meaning if challenged, without coming up with some characterization of intelligence and how to test for it. Giving the robot one of the standard IQ tests obviously presupposes that such a test tests for intelligence. But could you really give a robot a standard IQ test and derive meaningful results? Standard IQ tests are of course very anthropocentric, expecting you to have grown up in a world of human culture, understand human language, etc. So they don't rate you on any absolute scale of intelligence that would, for example, let you know how much smarter you are than non-human creatures. In other words, I might know that I am smarter than the next guy, for instance, but I don't know how much smarter I am than a particular chimpanzee, or how much dumber or smarter I am than my PC. Likewise the standard IQ test might not let me know how much smarter than I the computer is. Even Gardner's wide notion of intelligence is meant to characterize human intelligence--he is not considering the intelligence of other types of beings. In the above discussion of diverse approaches to intelligence we observed that some psychologists were interested in comparisons of intelligence among species, but there is probably not some clear-cut test available whose results allow us to compare diverse creatures. We could specify an ad hoc indicator such as the ability to use language, but this would probably be seen as presupposing a particular narrow conception of intelligence that might be biased against particular species. 
It does not seem that there has been much demand for the development of such a cross-species test, though I guess someone or other working in the field of animal intelligence has tried to devise one. So while it is or should have been the intent of IQ tests to refrain from cultural, ethnic, or racial bias, one can hardly fault the authors of such tests for leaving them biased species-wise. Computer keyboards might be a real problem for other species to master too, but there hasn't been much demand for typing parrots, for example, so one can't really fault keyboard designers for biasing their products against birds. But if we get the robots of the extraordinary future, won't we want to try to work on this issue?

There is still a further reason for our authors to pay attention to intelligence. Human-computer mind transfer is seen as a way to personal immortality, but it is also seen as a way to make one smarter. If I am transferring my mind into a new brain, why not make it a better brain, one that makes me smarter? Our authors discuss this at great length as their imaginations run wild. After the transfer you can give your brain all of human knowledge, and spend your time calculating things that were previously way beyond your abilities. We will see such talk in later chapters. Obviously the improvements we would wish to make to the robot brain, vis a vis the human brain it replaces, would have to do with making the person transferring in to be smarter. But how could we know in what ways to change the robot brain to make it smarter without knowing what intelligence is or how to test for it?

So where are we at this point? I have discussed the many definitions of intelligence above, especially Gardner's, to emphasize the varied types of intelligence (or the varied talents and capabilities in which one uses intelligence) humans possess. Certainly the assumption should be that any robot we build for human-computer mind transfer be capable of exercising these kinds of intelligence. And any robot presented for inspection as a candidate should be able to pass tests to show this, if such a thing is possible. Certainly if we make the claim that computers will be as smart as humans in 2020, for instance, we should back this up with an explanation of what we mean by "intelligence" and how we will know when computers have it. This goes as well for any claims that robots will be smarter than humans or that we will be able to arrange it that humans after the transfer will be smarter than they were before.

The Turing Test

The gist of the above discussion is that it is probably important for our authors to arrive at the proper characterization of intelligence. But some AI theorists believe it is not important at all. Haugeland asks "How shall we define intelligence? Doesn't everything turn on this?" and then answers "Surprisingly, perhaps, very little seems to turn on it." This is because "for practical purposes" we already have all the test we need in the Turing test, a criterion which "satisfies nearly everyone" (Haugeland, 1986, p. 6). I think Haugeland is being too optimistic here. To claim that the Turing test is a good practical test of intelligence clearly presupposes a particular view of intelligence that may not be correct, whether for the purposes of our authors or artificial intelligence efforts in general. For instance, it may not be a very good practical test for Gardner's musical intelligence.

In some cases we can imagine how to test computers or robots against humans to determine whether one has more of something. There could be a calculating contest, for example, that might easily show who is best. But in other cases, it's not as clear, particularly if the human and robot do not have the same type of bodies. For example, there are certain industrial machines, whether fork lifts, robotic arms, or cranes, that are clearly stronger than humans when it comes to lifting heavy objects. They can lift thousands of pounds. The most a human can lift varies with the type of lift. Consider the type of lift known as a "deadlift," where you just stand up from a crouch while picking up a heavy barbell from the ground without trying to lift it over your head or even to your chest. The most the strongest human can manage is going to be on the order of 900 pounds or so, and the average human will struggle with a few hundred pounds. But while a machine is often clearly stronger than a human, you might have trouble coming up with a suitable test to demonstrate this. How exactly does a forklift do a dead lift when it can't even crouch? While we are not interested in strength tests per se, but rather intelligence, we might run into similar problems devising tests for showing that a computer has all of the various types of human intelligence if the computer does not have a very human-like robot body. Consider Gardner's notion of bodily/kinesthetic intelligence as the ability to control fine and large muscle movement, and to create and interpret gestures and communicate through body language. Can a computer without a human-like body have this type of intelligence? If it could, how do you demonstrate this? No one considers a PC especially talented at providing an interpretive dance, for instance. Given the apparent multiple types or aspects to intelligence, it's certainly difficult to imagine one test that would do the trick. 
Gardner himself is very skeptical of the ability of any standard type of intelligence test to accurately measure intelligence.

Alan Turing imagined a test that might help us out. This test has of course come to be called the Turing test, so we should consider whether it fits our needs. (We could use it to determine whether a computer is intelligent enough for us to use it for mind-transfer, for instance, and then use the test again on the post-transfer being to see whether it really is still intelligent and thinks.) What has come to be called the "Turing test" first appeared in a very famous paper by Alan Turing in which he discussed the question of whether machines could think. That particular discussion of whether machines can think is very short--Turing spends one paragraph on this question before replacing it with another. The question of whether machines can think apparently had terms too vague or ambiguous (such as "think" and even "machine") for Turing's patience. Perhaps it's just as well, because it's not clear to me what Turing had in mind by the term "think." He might have meant "think" in the sense of having consciousness. I wish he had considered the original question, which I think may be important to answer if "think" involves consciousness. In any event, later in the article he declares the question too meaningless for discussion. The new question he replaces it with has to do with whether a machine could fool humans in a variation on a certain sort of "imitation game" that was popular at the time.

The imitation game Turing describes is played with three people: a man, a woman, and an interrogator. The object is for the interrogator, who is in a separate room from the man and woman, to determine by the end of the game which is the man and which is the woman by asking questions of them (via teletype or some other means that conceals their voices). One player is supposed to try to help the interrogator guess correctly (in Turing's example, the woman) and the other is supposed to try to help the interrogator guess incorrectly. Turing thinks that the player trying to trick the interrogator will want to lie, if that helps, but the other one will likely tell the truth (Turing, 1950, pp. 53-54).

Now what Turing wonders is what will happen if a machine (computer, for our purposes) takes the place of one of the players (in Turing's example the man). The question "Can machines think?" is now replaced with "Will the interrogator decide wrongly as often when the game is played like this as he does when the game is played between a man and a woman?" From Turing's description, one wonders if perhaps the machine's job is to try to trick the interrogator into thinking the machine is male, rather than female, but other remarks suggest that the interrogator is rather trying to guess which one is the machine and which the human. In Turing's example, the man is trying to fool the interrogator into thinking he is a woman, and so when he is replaced, we have the machine trying to fool the interrogator into thinking it is a human. The best strategy for the machine is probably to try to provide answers that a human would give; obviously showing amazing speed in calculations would catch the machine out pretty quickly (Turing, 1950, pp. 54-55).

How will the machine win the game? Turing's above comment suggests that it will win when it can match the frequency with which humans fool the interrogator about their sex. I do not know what that frequency is. One would think that the interrogator would have a fifty percent chance of getting it right on guessing alone, so if the interrogator got it right sixty percent of the time, for example, the player would have done a pretty good job of fooling him or her. Turing's opinion was that in about fifty years from when the article was written (it was published in 1950), computers would be able to play the imitation game so well that an average interrogator would not have more than a seventy percent chance of making the correct identification after five minutes of questioning. By that time we will be attributing thought to machines (Turing, 1950, p. 57). I do not know how he came up with these numbers, but I imagine that with any questions fair game to be asked of the machine, if the machine can fool the interrogator the requisite 30% of the time it is doing a pretty good job of at least imitating thinking, if not downright thinking.
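The arithmetic behind Turing's criterion can be made concrete. What follows is only a minimal sketch in Python, not anything found in Turing's article; the function names, the record of trials as a list of true/false values, and the sample data are all assumptions made for illustration. The five-minute limit is assumed to have been enforced when the trials were run.

```python
def identification_rate(outcomes):
    """Fraction of trials in which the interrogator identified the machine correctly."""
    if not outcomes:
        raise ValueError("need at least one trial")
    return sum(outcomes) / len(outcomes)


def machine_passes(outcomes, max_correct_rate=0.70):
    """True if the interrogator was right no more than max_correct_rate of the time.

    0.70 is the figure from Turing's prediction; 0.50 would correspond to an
    interrogator doing no better than guessing.
    """
    return identification_rate(outcomes) <= max_correct_rate


# Hypothetical record of ten five-minute sessions: True means the
# interrogator correctly picked out the machine.
trials = [True, True, False, True, True, False, True, True, False, True]

print(identification_rate(trials))   # 7 of 10 identifications were correct
print(machine_passes(trials))        # judged against the seventy percent limit
print(machine_passes(trials, 0.50))  # judged against a chance-level limit
```

On this made-up record the machine just meets the seventy percent limit but would fail if the interrogator were required to do no better than guessing.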

Turing considers a number of objections to his claim that we will appropriately use the term "thought" to refer to such machines as can win the imitation game. Many of them are the kind of objections to machine thinking that are sort of old hat nowadays. I will describe a few that I will refer to later in the thesis.

The theological objection claims that machines can't think because thinking is a function of the soul, and God gives souls only to humans. Turing is "unable to accept any part of this" but tries to reply to this camp on its own (theological) ground. Wouldn't this restrict God's omnipotence? If a mutated elephant had a capable brain, wouldn't God have the freedom to give the creature a soul were he wont to do so? Why not for a computer then? We wouldn't be usurping God's power to create souls any more than we are when we take the appropriate steps to give rise to a child (Turing, 1950, pp. 57-58).

The argument from consciousness claims that machines will not be able to match our power to be conscious, or at least that we will never know that they will, and so they cannot be said properly to think. Turing thinks that we will not need to answer the question of machine consciousness to decide the issue of whether a computer thinks, partly because the only way we could ever really know whether anyone is conscious would be to be that person, which we cannot do with other people (Turing, 1950, pp. 59-61). I agree that we won't have to know about computer consciousness to decide the question of computer thought, but I don't think it has anything to do with Turing's reason. Rather we can give some meaning to the notion of unconscious thought. More on this later.

Lady Lovelace's objection amounts to the charge that a machine can never do anything original. Turing notes that we can't be sure that originality on the part of humans is anything more than humans producing unexpected outcomes from the application of earlier teachings or general principles (Turing, 1950, pp. 63-64). Again, I think Turing is right here, and will bring this up in a later chapter.

The argument from informality of behavior claims that there is a crucial difference between humans and machines with respect to following rules or laws of behavior. We cannot be machines because our behavior can never be captured by a system of rules, which is what machines must follow. Turing thinks we cannot be sure there are no such rules yet undiscovered by science (Turing, 1950, pp. 65-66). Whether human mentality can be captured in a set of rules seems to be a significant topic in artificial intelligence but we will not have time to go into it.

To my knowledge no computer has yet passed the Turing test, though of course some computers have fooled some people, as in the case of ELIZA. Given this fact, two questions come to mind. First, will a computer ever pass it? The second question is "What would it show were a computer to pass it (what is it a test for and is it a good one for that)?"

But before we get to these questions, we might pause to consider how seriously we should take Turing's claims. Turing seemed to think that a computer passing such a test would be considered to have been thinking. This would mean that passing the test was a sufficient condition of possessing thought or intelligence, or whatever he considered the test to be a test for. And he predicted that this would happen by the turn of the century. Since no machine has really passed the test as he conceived it, and we still seem years away from having a machine that will, he was wildly off in this prediction. By all accounts Turing was a brilliant fellow, but the utter failure of his prediction to come true makes him look a little foolish. Maybe there is a lesson for our authors here.

But it is not clear how seriously we should take the prediction. A friend of Turing who discussed the article with him claims that Turing thought of the article as basically propaganda (Gandy, 1996, p. 125). Turing was enthusiastic about the notion of machine intelligence and wanted to spread this enthusiasm. As well, it should be noted that at the time Turing wrote the article there were only four computers in existence: in Britain the Manchester Mark I and Cambridge EDSAC, and in America the ENIAC and the BINAC (Copeland, 1993, p. 9). How accurate and even how serious could such a prediction have been given the primitive state of computers when it was made? Consider the seriousness of someone making an analogous prediction that by the end of the century a land vehicle would break the speed of sound by going over seven hundred miles per hour, if the prediction were made at the turn of the century when only a few cars were around and the (then) current record was sixty-five miles per hour. How could someone have enough evidence upon which to base such a prediction? (Incidentally, the speed record I refer to is held by a jet-powered car, while the record at the turn of the century was held by an electric car. Many records in between were held by cars powered by internal combustion engines. In 1900, who could have predicted jet engines?)

Just as we shouldn't make a big deal about the inaccuracy of Turing's prediction, perhaps we shouldn't worry too much about some details of the test. Just where did Turing get those numbers of seventy percent and five minutes? Why not sixty percent and ten minutes, which would be a little harder for the machine?

Note also that Turing did not say that if a machine passed the test this would show the machine could think like or as well as a human, only that we should say it could think. But in the case of human-computer mind transfer and predictions about computers equaling and even surpassing human intelligence, we really are interested in whether the computer could think as well as a human. (As I argued above, we need to know something about intelligence to provide for this.) So we might wish to make the test even harder--the interrogator must guess correctly only fifty percent of the time in a test extending for an indefinitely long period--say, several days! We want a test such that passing it will show the machine can think (or possess intelligence) at least as well as a human, not just that it can think.

But the Turing test is widely discussed and so I will consider it to see if it is what we want. Our first question, then, is that of whether the Turing test will be passed by a computer. Our authors think so, and think it will be passed within a few decades. Though this thesis considers human-computer mind transfer and not computer intelligence per se, if the extraordinary future comes true when our authors say it will, then the Turing test will soon be passed. On the other hand, if their claims that smart robots will be around soon are a bit optimistic, as I argue, then who knows when the Turing test will be passed, if ever?

Why is it that no machine has yet passed the Turing test? There are several reasons. First, the computer would have to "speak" a natural language in order to converse. Getting a computer to understand and generate sentences in a natural language is an extremely difficult task. Just consider problems in understanding the meaning of a sentence. The computer has to take each word and determine the part of speech. Some words can be any of several different parts of speech, so the larger context must be used to eliminate this ambiguity. The sentence must be classified as a statement, question, exclamation, etc. All these steps must be done to arrive at the syntax of the sentence. But what does it mean? The computer must have some way of determining the semantic meaning of each word. Again, many words have numerous meanings, so context must be used to disambiguate the sentence in this regard. The words must be put together to determine the semantic meaning of the sentence. But this may require a consideration of the overall context--the place of the sentence in the larger conversation. In a typical conversation, we assume that the other conversant has background common-sense knowledge of the world, and so our comments are not always completely explicit. We often leave things unstated, assuming the listener will supply what is needed to make sense of the remark.
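The ambiguity over parts of speech can be illustrated with a toy example. This is a sketch only, written in Python; the five-word lexicon and the function name are hypothetical, and a real system would need a far larger lexicon plus some way of using context to choose among the candidates, which is exactly the hard part.

```python
from itertools import product

# Hypothetical toy lexicon: each word maps to the parts of speech it can take.
LEXICON = {
    "time": ["noun", "verb"],
    "flies": ["noun", "verb"],
    "like": ["verb", "preposition"],
    "an": ["determiner"],
    "arrow": ["noun"],
}


def candidate_taggings(sentence):
    """Return every combination of parts of speech the lexicon allows."""
    words = sentence.lower().split()
    options = [LEXICON[w] for w in words]
    return [list(zip(words, tags)) for tags in product(*options)]


readings = candidate_taggings("Time flies like an arrow")
print(len(readings))  # 2 * 2 * 2 * 1 * 1 candidate taggings
for reading in readings:
    print(reading)
```

Even this five-word sentence yields eight candidate assignments before any question of word meaning or conversational context has been raised.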

We do the above process constantly in understanding the utterance of a sentence, but how do we get a computer to do it? We can't just supply a dictionary and leave it at that, for this will not enable the computer to do the requisite disambiguation. We need to provide the relevant background knowledge. But it is not clear how to do this. How do we organize all the facts about the world that we already know and present them to the computer? How does the computer determine which facts it needs for disambiguation? How does the computer find the facts it needs when it needs them in a split second (without searching through all of them)?

Assuming we could do all this, we have to provide the computer with some mechanism by which it can generate sentences to keep up its end of the conversation. It might have a lot of facts, but will it be able to respond to questions and comments the way a human does? Humans have first hand experience of the world, their feelings, etc. Would this be in the database that the computer can access? Can how a toothache feels be put in the database for a computer to access? Will it understand what the pain is like if it can't feel (we're not saying it can or can't)? Will it really know what words refer to if it has never had any first hand experience of the world? Maybe what needs to happen is for the computer to become a subject in its own right, and for us to train it about the world through some massive learning process that recreates what we all go through in childhood. Do we need to provide different kinds of sensory apparatus for it to do this? Even were we to get a computer to "speak" a natural language, it might not be able to pass the Turing test unless it could learn first hand and gain experience of the world. We might think of all these hurdles that need to be overcome for a robot or computer to really be as smart and able as a human as the "classic" or traditional problems in AI.

The above considerations would seem to apply if we are to get robots smart enough to pass the Turing test. Our authors provide no insight on how to overcome these classic problems in artificial intelligence. They seem pretty unconcerned about them. But we do have to distinguish between robots who get smart on their own and robots who are smart because their brains are copies of the human brain. It does seem that our authors believe that robots will be smart before we do any such copying, though it is still not clear how this will occur. But conceivably if somehow we could create a robot simply by copying a human, so it might be assumed, we could do so without solving such problems. If a robot is created by copying a human, then we don't have to solve the classic problems in AI, because they will have been solved in whatever way the human brain has already solved them.

Here we see what seems to me a tension within the depictions of the extraordinary future by our authors. Through all the varied discussions there seem to appear two distinct methods of getting a smart robot. First is the method of gradually solving the problems faced in AI research in the process of building better and better robots. The robots gradually get more capable until sometime between 2020 and 2050 they are as capable as humans, after which they get even smarter. The second method is just to create a robot by modeling the robot brain on that of a human. There are various scenarios for this that we will discuss in a later chapter. But basically we start with a detailed knowledge of the human brain and how it works, reverse engineer the software running on it, and then write the equivalent robot brain software as we build equivalent robot brain parts. What we get on this view is sort of an electronic (if that is what the robot brain medium winds up being) copy or replica that is equivalent in output, function, structure, algorithms, etc. to the human brain. It might be smarter because it will be running on better hardware (faster circuits, etc.). Or in the copying process, we might be able to improve performance in other ways, such as writing more efficient algorithms. Just as an aside, it is not clear that just replacing human wetware with some sort of electronic or atomic equivalent is going to make the robot smarter. We all have the experience of knowing people who seem smarter or dumber than we are, but we don't think this is due to such people having "circuits" that run faster or slower than do those of our brains. So likewise, faster circuits may or may not be what makes for more intelligence.

The tension between the two methods above for creating a smart robot is that we seem to need each method to be finished in order to get the other one working, which is impossible. Our authors don't really supply crucial details about how we are to do either, much less both. But they do jump around from method to method, and among various options within each method, without saying how exactly the details are to be pulled off.

Consider how each method is to work. On the first method, we solve the classic problems of AI. How is this accomplished? Of course our authors don't know, because if they knew, they would publish the results and build the robots. Paul and Cox mention the possibility of a bottom-up approach that recreates evolution, but details are not provided. So instead we get a depiction of gradual developments during the next several decades showing the arrival of computers and robots that are more and more capable, without any concrete details of how we are going to accomplish the amazing feats we will have to accomplish. Moore's Law, etc. says computing power will increase, and nanotechnology will come to the rescue of silicon limits, etc. so we will just be able to do it. But we don't really have any reason to know that what seem insuperable problems in AI are just going to be solved that easily, do we? What we need, of course, is the supersmart robots to already be here to show us how to do it. So we really need the other (second) method already done to give us the robots to overcome these AI and materials problems of the first method.

On the other approach, we just copy a human brain and make a robot equivalent. This bypasses the need for solving the problems of AI, because in whatever way our brain does this, the robot will do this. So on this approach we don't have to slog through decades of AI breakthroughs. But how on earth are we going to pull off this second method? Where do we get that incredible knowledge of the human brain functioning and software design to reverse engineer the code and write the robot brain equivalent? It may be that extensive knowledge of the brain alone, even assuming we had this, would not do it, because we need to write the software of the brain and then the robot equivalent. But how could we possibly know what kinds of code do what kinds of things in the brain (to enable humans to do certain things, like access knowledge when we want it), and then know what kinds of code are going to do the equivalent in a robot brain made of different materials, without having worked through the solving of the classic problems of AI? I'm not sure we could. You can see a human do something, and observe neural activity in the brain when he or she does this, but that alone doesn't tell you what kind of code should be written for the brain or for the robot equivalent. This is to assume we need to write any software at all. The software in the human brain, if there is any, seems to be already hardwired into its structure. So if we somehow create a robot equivalent, there might not be any additional software needed. But on one of Moravec's scenarios for mind transfer, we have a supersmart robot surgeon who instantly analyzes chunks of human brain, reverse engineers the code, and then writes the equivalent code for parts of the robot brain that are being created and installed as we go. Why do we need that robot? Apparently it is too complex for a human to do. Where does that supersmart robot surgeon come from?
Or at least where did the knowledge come from of what code would allow a certain brain part to enable a certain human skill, and what equivalent code should be written for the robot, so that we can give such knowledge to the robot surgeon? If we know so much about the code, how did we arrive at that knowledge? Or if we know so little about the human brain that the supersmart robot has to reverse engineer the code and then write the robot brain equivalent, then how did we create that supersmart robot in the first place? Here, it seems to me, there may be the implicit assumption that the first method can come to the rescue--we already created that robot through the gradual process that overcame all the classic AI problems! Or at least we got to be such good coders by solving those classic AI problems and learning to build supersmart robots gradually.

But this seems to be going around almost in a circle. When we press for details about how we are going to do the first method, we get no reply. Though the authors don't explicitly say this, what we need is the help of the supersmart robot that we will be creating, or at least the supersmart robot created by the second method. When we ask about how we are going to do the second method, it is assumed that we will be able to do it because we will have learned crucial details by already having done the first method. I may be exaggerating a little here to make this seem a circle, but not much. There do seem to be these two methods under discussion at the same time. Of course, some of this is to be expected, because we are really trying to blend the views of three different sets of authors, and they don't all say the same thing. But no one really explains how each method is going to be successful without the results of the other.

To return to our questions, what would it show were a computer to pass the Turing test (in the original form or our more difficult amended version)? This is our second question. Passing the Turing test is not supposed to be a necessary condition of having or displaying intelligence (perhaps some intelligent things couldn't pass it), but it is supposed to be at least a sufficient condition of something. I'll make three points about this. The Turing test is not a good test for our purposes because, first, strictly speaking, passing it really can't be a sufficient condition of any kind of intelligence; second, even if it were a good indicator of intelligence, it would be so only in a certain sense of "intelligence"; and third, if it really is to replace the question of whether a computer can think, we want it to resolve many questions we have about whether computers can think, and it does not do that.

One reason that passing the Turing test or any similar test can't be a sufficient condition of intelligence is that it is logically possible to pass it by sheer chance. This is a trivial point but I have not seen it mentioned. A computer could through sheer chance produce just the right responses during its trials with any number of guessers, say 100, so that it could pass any test like this that you could throw at it. It is logically and physically possible that such a lucky computer could pass such a test for days on end through chance, but clearly it would not be intelligent. But this does not really show that the test is seriously flawed. In any kind of test we take in school it is logically possible for students to score a hundred percent through sheer guessing, even for tests that require a written response (guess the letters for the perfect answer). But no one considers such tests flawed on account of this reason. This way of passing a test is so highly improbable that no one worries about it, of course. But it is not impossible, so passing the test does not absolutely rule out lack of intelligence.
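Just how improbable guessing the letters would be can be estimated with a little arithmetic. The following Python sketch rests on assumed figures that appear nowhere in the sources discussed here: a single reply of fifty characters, typed by choosing uniformly at random among twenty-six letters and the space.

```python
import math


def log10_chance_of_exact_reply(reply_length, alphabet_size=27):
    """Base-10 logarithm of the chance that uniformly random typing
    produces one particular reply of the given length."""
    return -reply_length * math.log10(alphabet_size)


# Assumed figures: a 50-character reply over 26 letters plus the space.
print(log10_chance_of_exact_reply(50))
```

The result is below -71, which is to say less than one chance in 10^71 for a single short reply, and a whole conversation would require many such replies in a row. The possibility is real but can be safely ignored in practice.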

Even if the above remote possibility does not preclude the Turing test from being a good test, there are other reasons to think passing it is not sufficient to indicate intelligence. One such reason is that a Block machine could pass the test and not be intelligent. A Block machine contains a great number of possible questions and plausible responses to those questions. When the Block machine is presented with a question, it searches through its database and finds the question, and then randomly chooses one of the possible responses. When the guesser replies to that response, the Block machine goes through the same process to find a possible response, etc. Since the Block machine contains all possible questions (say, those up to 100 words in length), and numerous responses to these, the machine could fool the guesser indefinitely. The Block machine could pass the Turing test through this sort of brute force method, but since it clearly is not intelligent, so the objection goes, the Turing test is flawed. This may seem to be the same objection that I made above, but it is not, since a Block machine is not a random guesser.
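The mechanism of a Block machine is simple enough to sketch in a few lines. The Python below is an illustration only; the table contents and the function name are hypothetical, and the table is drastically shortened from what the thought experiment imagines.

```python
import random

# Hypothetical, drastically shortened table. A real Block machine would need
# an entry for every question up to some length.
RESPONSE_TABLE = {
    "how are you today?": ["Fine, thanks.", "A bit tired, to be honest."],
    "what is your favorite color?": ["Blue, I suppose.", "I never could decide."],
}


def block_machine_reply(question, table=RESPONSE_TABLE, rng=random):
    """Look the question up and pick one stored response at random.

    No parsing or reasoning takes place; a question missing from the table
    raises KeyError, which the full table is imagined never to do.
    """
    return rng.choice(table[question.strip().lower()])


print(block_machine_reply("How are you today?"))
```

Nothing in the lookup involves understanding the question, which is why the objection holds that such a machine would not be intelligent however well it conversed.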

Actually, though, what is controversial is whether a Block machine could pass the test, due to physical and technological limitations. Robinson claims that given the number of responses the Block machine would have to contain, and how fast it would have to operate to locate a plausible response in a very short interval of time, the machine could never really pass the test! If a fast enough machine could be built, it would pass the test, but since such a machine is technically infeasible and can't be built, we don't have to worry about an unintelligent Block machine passing the test. So the Turing test could still be a good test for intelligence. (Note that the Turing test is supposed to be a test or criterion of intelligence, not a definition of intelligence.) Robinson's argument might be correct for current technology. One might try to argue that future technology might enable such a machine to be built, but Robinson thinks not (Robinson, 1992).
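A rough count suggests why the size of the table is a serious obstacle. The Python sketch below is not Robinson's own calculation; it assumes a vocabulary of 10,000 words (my figure, chosen only for illustration) and counts every string of up to 100 words, grammatical or not, so it gives an upper bound on the number of questions.

```python
def questions_up_to(max_words, vocabulary_size):
    """Count every word sequence of length 1..max_words over the vocabulary.

    This counts all strings of words, grammatical or not, so it is an upper
    bound on the number of genuine questions.
    """
    return sum(vocabulary_size ** n for n in range(1, max_words + 1))


# Assumed vocabulary of 10,000 words; the 100-word limit is from the text.
total = questions_up_to(100, 10_000)
print(len(str(total)))  # number of decimal digits in the count
```

With these assumed figures the count runs to about four hundred decimal digits, whereas common estimates put the number of atoms in the observable universe on the order of 10^80. Even if only a minute fraction of those strings were sensible questions, the table would be far too large to store.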

While Robinson claims the Block machine to be impossible, Copeland is not so sure that this means the Block machine example is irrelevant. Even if it were not technically possible to build such a test-passing Block machine, the fact that it seems theoretically possible for a Block machine to pass the Turing test would show that the test is not quite right. I tend to agree and think this is another reason to think the Turing test is not a good test. Copeland calls the Block machine possibility the "black box objection." What matters for intelligence is not only that the Turing test is passed, but how it is passed, and the "how" should become part of the test. The original version of the test considers conversational output as the only criterion. Copeland thinks this output criterion needs to be supplemented by a design criterion, which comes in two versions. One version is that the program should do the things it does in a way broadly similar to the way those things are done by the human brain. In other words, there should be a high-level equivalence between the way the computer does things and the way the human brain does them. The equivalence does not have to go all the way down. At this high level, for instance, one could say that a modern computer and an old valve machine are executing the same program (Copeland, 1993, pp. 50-51).

This requirement for the computer to be of anthropocentric design is the strong version of the design criterion. Copeland defends it on the grounds that, first, the Turing test is not supposed to be a litmus test for thought, specifying only a sufficient condition, not a necessary one. If chimpanzees and some computers fail, this does not show they can't think. Second, Turing's test is anthropocentric anyway in that it specifies the test for thought in terms of producing outward behavior indistinguishable from that of a human, so the strong version of the design criterion is in keeping with this spirit (Copeland, 1993, pp. 50-51).

I'm not sure what to make of this strong design criterion. Whether or not we make the Turing test this strong depends on how strict we want to be about the term "thought" and our test for it. We run the risk of excluding some intelligent beings who don't think like humans do. Should we allow the term "thought" to apply in such a case? This question will reappear in a different form when we consider the mind-body problem, when we wonder whether we wish to attribute mental states to a robot on the basis of its output being similar to that of a human (functionalism) or allow that it doesn't have particular mental states unless its brain states are the same as human brain states (type-identity materialism). The general view of our authors seems to be that the robot brain will run in a roughly similar fashion to the human brain (parallelism), so perhaps it will pass this strong version of the test. It depends on how far down the similarity has to hold. We have seen above that though our authors have not been explicit about the degree of parallel processing, what they have in mind may or may not hold too far down, depending on the scenario. And their obsession with speed may be an indication that they do not necessarily envision robot and human brain similarities holding very far down at all.

The weak version of Copeland's design criterion is that the program or computer must be of modular design. This means that it must in principle be capable of being "zipped whole" and piggybacked onto other programs, for example, those running the sensory systems of a robot, with the new whole forming a functioning program of greater complexity. This is weaker than the insistence that the program operate analogously to a human brain, but it is strong enough to guard against passing the Turing test through trickery along the lines of a Block machine (Copeland, 1993, p. 51). The robots envisioned by our authors in the extraordinary future probably would pass this requirement.

My second point above is that even passing the Turing test would indicate only certain kinds of intelligence. Let's suppose that a computer passes the test, and passes it by operating a certain way analogous to the way humans operate (and not like a Block machine). This may indicate only that the machine has a particular kind of intelligence, or intelligence in a certain domain. The Turing test is all about conversational ability. If intelligence is as broad as Gardner thinks it is, then some kinds of intelligence may not be tested for adequately using the Turing test. As I already questioned, how would this test determine the extent to which one was a dancer, or had any bodily-kinesthetic intelligence at all, for instance? It seems a disembodied computer might be able to converse about dancing, but that does not seem to be what Gardner means by bodily-kinesthetic intelligence. In this case passing the test would not be sufficient to indicate the requisite possession of the relevant type of intelligence. The ability to talk about dance is not the same as the ability to dance, so instead of trying to make the Turing test be all things for all purposes, perhaps one should consider another type of test. I guess one cannot fault Turing for failing to consider this--the kind of intelligence he had in mind was associated with thinking and the use of language. (In fact, he never even claimed the Turing test was a test of intelligence, but rather a substitute for discussions of whether the computer could think.) But Gardner thinks all humans possess all kinds of intelligence to some degree, so what we want is a test such that passing it is sufficient to indicate the possession of all types of intelligence, and the Turing test does not do that. I am not arguing here that Gardner is correct, but we don't want to rule out his position from the start, which we would do if we used an intelligence test that could not test for all the types of intelligence he mentions. 
Maybe we need to supplement the Turing test with another test.

My third point about the Turing test is that it is not clear that it is a good test for our purposes. If it really is to replace the question of whether a computer can think, we want it to resolve many questions we have about whether computers can think, and it does not do that. For instance, even if a computer passes the Turing test in the right way, it may be that we still will consider open such questions as whether it is consciously thinking, or whether it "really" has original intentionality. To consider this objection I need to discuss more fully some distinctions related to computers and thinking.

As humans, we commonly take ourselves to engage in the activity called "thinking," which might be taken to include matters such as believing something, desiring something, and so forth. We take ourselves to engage in conscious thought, and many people believe we also engage in unconscious thought. Turing apparently did not believe it was fruitful to continue to discuss the issue of whether a computer could think. His suggestion that we should consider this question to be resolved affirmatively in the case of a computer that passed the Turing test might indicate that one should take "thinking" to be the carrying out of a certain function on the part of the thinker. On the other hand, one might take "thinking" to refer instead, or in addition, to the having of certain inner states of conscious awareness or the like. We will discuss this a little more in the next chapter of this thesis.

But one thing I want to address here is what it might mean to say that whether a computer thinks is an issue that calls for a decision rather than a discovery. It is not clear to me what this means because it could mean any of several things.

For example, Copeland argues that the question of whether a computer can think can remain open even after all the relevant facts are known because it is an instance of an application of a concept to a relatively new sort of case. This sort of case was not envisaged when the concept was framed. Our concept of thinking was formed in the context of application to natural living organisms rather than artifacts. What we need to do is decide whether the notion of a computer thinking best fits the purposes for which the concept of thinking is employed (Copeland, 1993, pp. 52-54).

What Copeland seems to mean is that "thinking" is linguistically indeterminate (for want of a better phrase) when applied to computers. But there seem to be other kinds of indeterminacy that might be meant when it is said that a decision rather than a discovery is needed. To say that a decision is called for rather than a discovery suggests either that (a) the issue is metaphysically indeterminate, (b) the issue is metaphysically determinate but epistemologically indeterminate, or (c) the issue is metaphysically and epistemologically determinate but linguistically indeterminate. To illustrate my use of these phrases, consider the question of whether "Katie is imagining a pink elephant" at a particular time (assume we agree about which person named "Katie" we are talking about). Suppose I say the issue of whether Katie is imagining a pink elephant calls for a decision rather than a discovery. What could I possibly mean? (a) If what I have in mind is that the issue is metaphysically indeterminate, I mean that though we all agree on the meanings of the terms "pink," "elephant," "imagining," etc., and we can ask Katie whatever we want, with nothing lacking in the way of her answering our questions, there is no fact of the matter about the issue. So, it could be argued, this situation would call for a decision rather than a discovery. But this is not how we usually think about things in the world. In everyday life we would claim that the question about Katie is metaphysically determinate; it's a fact that she either is or isn't imagining such a thing. (b) If I say that the issue is metaphysically determinate but epistemologically indeterminate, then I mean that though we all agree on the meanings of the terms involved, and though there is a fact of the matter about whether Katie is thinking of such a thing, we cannot find it out. Perhaps we aren't sure whether to believe her answers. 
In such a situation it might be said that since a discovery of the truth is not possible, a decision is called for. (c) If I say that the issue is metaphysically and epistemologically determinate but linguistically indeterminate, then I mean that though the physical facts about Katie are all fully determinate, and we can know what these are, we are just unsure about whether the term "pink" or perhaps "elephant" applies in this case. Here too it might be said that a decision, rather than a discovery, is called for.

So in what sense does Copeland think that the question whether a robot thinks calls for a decision rather than a discovery? His comments above seem to indicate that he thinks the issue is linguistically indeterminate, at least until we decide that we can apply the phrase to computers. To see this, note some further comments he makes about the concept of thinking. Copeland considers three possible uses for the concept of thinking. The first is to pick out entities having an "inner awareness." Involved here is a phenomenological distinction. But Copeland thinks that this is not a good definition of thought because of the coherence of the notion of unconscious thought, which is thought without that inner awareness. (I am not objecting here to Copeland's view that humans do engage in unconscious thought.) The second possibility is for "thought" to refer to a biological distinction, with it picking out those organisms with higher brain processes. But this won't work because there could be extraterrestrial thinkers having no such brain processes. (More is involved here than Copeland lets on, such as whether one's position on the relation of thought to brain processes is that of a functionalist or a type-identity materialist, but this discussion will have to wait until a later chapter.) The third possibility, which Copeland likes, is that thought distinguishes organisms with flexible inner processes allowing plasticity in response to the environment from those whose behavior is instead rigid. The former have processes that are "massively adaptable," and they can "form plans, analyze situations, deliberate, reason, exploit analogies, revise beliefs in the light of experience, weigh up conflicting interests, formulate hypotheses and match them against evidence, make reasonable decisions on the basis of imperfect information, and so forth" (Copeland, 1993, pp. 55-56). 
Now in line with his earlier rejection of the first option, presumably these inner processes need not be conscious. This understanding of the term "thought" seems to characterize it partly as the possessing of a kind of function or capacity, though not solely in these terms, since Copeland does mention that it involves having inner processes that provide for such functionality.

Copeland thinks that probably the most important role for the concept of thinking is in the explanation and prediction of behavior. Someone doing something is explained on the basis of that person having thought such and such rather than in terms of electrical or biochemical activity in the brain (though if type-identity materialism is correct, then such thoughts just are brain processes). Such explanations are intentional explanations. If AI research were to produce massively adaptable programs in robots that roughly matched those of humans, we could apply intentional explanation to them, and such robots could be said to think. "It is clear, then, that the purposes for which we employ the concept 'thinking' would be best served by a decision amongst the linguistic community to count these robots as thinking things," literally, and not just metaphorically (Copeland, 1993, pp. 56-57).

I agree that if thinking in terms of having states of phenomenal consciousness is left out of the picture, we could say that such robots (if and when they are built) think, in his sense of thinking. (We don't know that they do have phenomenal consciousness, and we don't know that they don't, as I will argue in a few chapters.) But it might not be entirely clear why a decision rather than a discovery is called for. It would seem that whether such robots think would be a (metaphysically) determinate matter, given his definition; they either have flexible inner processes allowing plasticity or not. If thinking doesn't have to involve conscious awareness, then these robots could be unconscious or nonconscious zombies and still be thinking. And if it is a metaphysically determinate matter, then we should seek to discover whether they do such thinking. The reason he holds that a decision is called for can't be that we will never agree on what the terms mean (linguistic indeterminacy), because he just defined them. What Copeland seems to mean is that the term "thinking" is currently linguistically indeterminate, but it will cease to be so after he defines it in the context of robots. But then wouldn't the issue, once defined, call for a discovery rather than a decision? So both decision and discovery might be involved after all.

Another matter Copeland is not clear on is whether he thinks intentionality involves conscious awareness--it looks like he holds it doesn't. If it does not, then perhaps we could make sense of the concept of robot intentionality, in the relevant sense of "original" intentionality. The claim is sometimes made that robots could have only derivative but not original intentionality. Intentionality characterizes human mental states and refers to the fact that they are about something. For example, my belief is about a book, a person, an airplane, etc. My belief is not about my own mental state but about something in the world. This is original intentionality. But consider the words and sentences in a book. Are they about anything? They are only if we make them be about something. In other words, the sentences in the book are not about anything from the perspective of the book--the sentences are not the book's beliefs; the book has no beliefs to be intentional. Any intentionality of the sentences in the book must be derivative on some agent who has original intentionality.

Well, the question arises as to whether the symbols of a computer have any original intentionality, which is the question of whether from the perspective of the computer they are about anything. It seems that to have original intentionality, the computer would have to be a subject. Then its symbols could represent external matters to the computer. One might also try to argue that to have original intentionality the computer would have to be phenomenally conscious, though this is more controversial. If we can make sense of the notion of unconscious thought, then we might be able to make sense of the notion of something being a subject and engaging in intentional behavior without it being phenomenally conscious.

One view is that in an important sense the computer really is just like the book: the computer's symbols would be about something only if we interpreted them as about something. That is, any intentionality would be derivative on our original intentionality. The computer has no ability to give its symbols any representational meaning. (As we shall see, this is Searle's view, which may or may not depend on a view that computers are not conscious.) A different view is that the computer, if it attains subjectivity (whether or not conscious), can give its symbols intentionality because they can represent things in the world to the computer. They can have meaning, even if the computer is not conscious. Not all meaning has to be conscious meaning. Copeland seems to take this latter view.

I think that even if a computer passed the Turing test, even in Copeland's amended version that takes into account the way it is passed, we might still have questions about thinking that could be said to call for discovery rather than mere decision. We might allow that the computer could think in Copeland's sense, and yet still wonder whether its thinking is conscious and whether its thinking involves original intentionality. I see no a priori reason why any type of indeterminacy should be assumed to hold in the case of computers any more than it does in the case of humans. So one might plausibly claim that there would still be important matters, possibly metaphysically and linguistically determinate but epistemologically indeterminate, that call for discovery. It seems the only reason Copeland could have for saying a decision is called for would be if we thought we just wouldn't be able to discover the truth and for practical purposes had to decide. But, depending on our view of the relation of the brain and mind, maybe we could discover it. If thoughts are just brain processes, for example (a "type identity" version of materialism discussed later), then couldn't we discover whether or not the robot brains in question had the relevant processes?

Turing considers his Turing test a substitute for the issue of whether a computer can think. With respect to Copeland's third sense of "thinking" above (massively adaptable and possibly unconscious processes of reasoning, etc.), it may appear to be a good test. But even here we seem to have to modify it to incorporate the matter of how the computer passes the test, and it doesn't resolve questions about in what sense computers are intentional or conscious. In the sense in which "thought" can mean "conscious thought," passing the test may have no bearing on whether a computer thinks. It may be possible for something to pass the test and yet not be conscious at all in the sense of phenomenal consciousness. "Phenomenal consciousness" will be discussed more fully in a later chapter, but basically it's the phenomenal awareness of things like pain that you have but an unfeeling zombie would not. Copeland above allows that thought could be unconscious, which may be true, but his argument is irrelevant to the question of whether the test is a good test for thought if what Turing meant was "conscious thought." But even if that is not what Turing meant, we would still want to know that a robot is conscious before transferring our mind into it. So if the Turing test is trotted out as an appropriate all-purpose test to give to a robot to test for what you want the robot to have in the way of a mental life, then we do want it to test for consciousness. A robot who thinks unconsciously could still pass the Turing test and not be conscious, and this makes the test not what we want.

I argued above that it is not clear when, if ever, a computer will be able to pass an appropriate version of the Turing test, and that even were this to happen, we still might not be satisfied that we have no need to further discuss the issue of whether it can think. John Searle also claims that the Turing test leaves something to be desired as a test of machine intelligence and thought.

Computers and Understanding

John Searle claims that no matter how intelligent computers appear to be, they are not truly smart in any way analogous to humans. In a famous line of argument involving a Chinese translation room, John Searle objects to the claim that computers will ever be able to understand anything. Even if a computer were to pass the Turing test, it would not understand anything it does. The implication is that a machine that can understand nothing is not intelligent in our sense of the term.

Searle attributes to strong AI the claim that an appropriately programmed computer really is a mind; the right computer with the right program can understand and have other cognitive states (Searle, 1980, p. 353). He examines the work of Roger Schank, whose computers are provided with "scripts" that fill in background, common sense information about various situations and scenarios. We have already seen the need of robots for such background knowledge. This information enables the computer, when presented with a story involving such a situation or scenario, to "infer" things it has not been explicitly told or that strictly speaking can't be deduced from just what it has been told. Strong AI proponents, according to Searle, hold that the computer is not merely simulating human cognitive abilities. They would claim the computer literally understands the story and that the way the machine understands here explains how it is that we understand such a story. Searle thinks both claims are false (Searle, 1980, p. 354).

Strong AI holds that the human mind works on principles that are operative in such computer "understanding," so Searle invokes a thought-experiment to show what it would be like if our minds really did work like strong AI claims they do. Searle's thought experiment involves the Chinese language, about which he knows nothing. Searle is locked in a room and supplied with Chinese writings (the script) to translate, another batch of Chinese writings (the story), and rules (in English) for formally relating (by the shapes of the letters) the second batch to the original Chinese writings. He is then given a third batch of Chinese symbols (the questions) and English rules that tell him how to correlate the third batch with the first two and give back Chinese characters in response to the third batch. The Chinese characters he gives back are considered the "answers," and the English rules are the "program." He gets so good at doing this "translation" that his answers are like those of a native Chinese speaker. He is also given similar writings and questions in English, and since he understands English, he has no trouble answering these questions (Searle, 1980, p. 355).

Searle claims that it is obvious he does not understand the Chinese stories, even though his answers are on par with his answers to the English scenario questions. The English he understands, but in the case of the Chinese he merely manipulates uninterpreted formal symbols. In this regard, performing computational operations on formally specified elements, he behaves like a computer and is the "instantiation" of a computer. So likewise Schank's computer doesn't understand its stories, since the computer has nothing more than Searle does when he understands nothing. Furthermore, there is no reason to think this type of symbol manipulation serves to shed any light on how humans do understand. What's going on in the computer, as in the case of the Chinese "translation," is not sufficient for understanding, and it has not been shown to be a necessary or even a significant contribution to understanding either. If he were to do a similar procedure with English characters, he would understand what they meant, but this is not the case with the Chinese. The computer can process characters syntactically, but it does not know their semantic meaning, and so it has no understanding, much in the same way that he would have no understanding of the Chinese characters no matter how good he became at this "translating." So when he really does understand the English story, the claim of strong AI that he is really just doing more of the kind of thing the computer does and what he himself does in the Chinese case is "incredible." He admits, though, that he hasn't shown this claim to be false (Searle, 1980, pp. 355-356).
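The purely formal character of the procedure Searle describes can be illustrated with a minimal sketch. Everything in it is hypothetical: the rulebook entries and the shape names are invented placeholders, standing in for characters whose meaning the operator does not know.

```python
# Hypothetical rulebook: keys and values are opaque shape names, not words.
RULEBOOK = {
    ("shape-17", "shape-4", "shape-9"): ("shape-2", "shape-30"),
    ("shape-8", "shape-4"): ("shape-11",),
}


def room_operator(question_symbols, rulebook):
    """Return the answer symbols the rules pair with the question symbols."""
    # The lookup compares shapes for identity; nothing here represents meaning.
    return rulebook[tuple(question_symbols)]


answer = room_operator(["shape-17", "shape-4", "shape-9"], RULEBOOK)
```

The function produces the answer the rules dictate without any step that consults what the symbols stand for, which is the sense in which the procedure is syntactic. Whether a far more elaborate program of this general kind could nonetheless amount to understanding is exactly what is in dispute, and the sketch does not settle that question either way.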

In the Chinese case he has everything a computer does, and yet understands nothing, so there is no reason to think that his understanding in the English case has anything to do with computer programs (computational operations on purely formally specified elements). The example, he thinks, suggests that such operations have "no interesting connection with understanding." No reason has been given to show that in understanding English he is operating with any formal program (Searle, 1980, pp. 356-357).

Searle points out that the issue here has nothing to do with any vagueness in the concept of "understanding." Whatever confusions there may be about borderline cases, there are two absolutely clear cases here of understanding (the English translation) and not understanding (the Chinese translation) (Searle, 1980, pp. 357-358).

Searle already tries to fend off certain types of criticism in the original article by considering several possible replies to the argument. The systems reply claims that though Searle doesn't understand, he is only part of the system, and the system understands. Searle replies that even if he were to internalize the elements of the system by memorization he still wouldn't understand. "If he doesn't understand, then there is no way the system could understand because the system is just part of him." But, he thinks, the systems reply is absurd anyway, for how is it that Searle alone wouldn't understand but the conjunction of Searle and bits of paper would understand? It could be claimed that the man is really two formal symbol manipulation (sub) systems, one understanding English, and the other understanding Chinese. But in the case of subsystems, Searle claims the one manipulating Chinese is no better off than was the man himself. One can even imagine the Chinese subsystem passing the Turing test, which shows that passing the test is not sufficient for having understanding (Searle, 1980, pp. 358-360).

The robot reply holds that what is needed is to put a computer in some kind of robot body so that it has "perceptual" input from the world and "acts" in the world (behavioral output), rather than just having the input and output of formal symbols. Obviously this is what the extraordinary future is supposed to bring. Searle replies on several levels. First, as pointed out by Fodor, this would tacitly concede that cognition involves causal relations with the outside world rather than being just formal symbol manipulation. But even so, putting a computer in a robot body adds nothing significant in the way of understanding or intentionality. If Searle and his room were in the robot, so that some of the messages from outside came via the television camera and some of the symbols he gave out moved robot arms and legs, there would have been nothing added to create understanding where there was none before. "The robot has no intentional states at all," and by instantiating the program Searle would not have any intentional states of the relevant type. He is still merely manipulating symbols, and it's still syntax without semantics (Searle, 1980, pp. 362-363). Note that Searle here links intentional states to understanding.

The brain simulator reply invokes the example of building something to simulate the actual sequence of neuron firings at the synapses of the brain of a native Chinese speaker when he understands and responds to the story in Chinese. This system even operates by parallel processing. Searle notes this to be an odd reply for AI. Strong AI takes the mind's relation to the brain to be like software is to hardware, with the essence of the mental to be computational processes over formal elements. The whole idea is that we don't really have to understand how the hardware works if we know the software, but here in this reply we have to recreate in detail the low level workings of the brain. Even so, to Searle the reply fails. Add to his Chinese translation story a series of valved water pipes and connections analogous to neurons and synapses, with the man opening and closing valves in response to instructions. The output pops out as water from the pipes if all the right faucets are turned on. This simulates the formal structure of a Chinese speaker's brain, but the man doesn't understand, the water pipes don't, and neither does the conjunction of the man and the water pipes even if the man were to do all the firings "internally" by imagining the water pipe operations. The simulation is of the wrong brain properties. What we need is not the formal structure of the sequence of neuron firings but the causal properties that produce intentional states. The example shows that the formal properties are not sufficient for the causal properties (Searle, 1980, pp. 363-364).

The combination reply claims that understanding would come from putting all of the previous replies' aspects together. A brain-shaped computer is in a robot body, and it's programmed with all the synapses of the human brain. Its behavior is indistinguishable from that of a human. Searle thinks we would be tempted, and find it irresistible, to attribute intentionality to it, but this would be for reasons irrelevant to strong AI. If we knew nothing more about the robot we would assume it had intentionality because of its looks and behavior. But this would not show that instantiating a formal program was constitutive of intentionality. Once we found out how it worked, the game would be over, and we would see it as analogous to an ingenious dummy (Searle, 1980, pp. 364-365). Contrast this view with that of Copeland above, who was willing to grant intentionality on the basis of external behavior of sufficient complexity.

Searle considers other replies that I'm not going to recount in detail. The other minds reply claims that we should allow the same claims about the computer as we do to other humans when we attribute minds to them, since we have similar evidence from their behavior, but Searle thinks this confuses the epistemological issue of how we know about other people with the metaphysical issue of what cognitive states really amount to. The many-mansions reply is that in the future we might be able to create different computers that will understand (though we don't know how these might work), but to Searle this is just to redefine AI in an ad hoc way as whatever in the end will produce real intentionality. This shows nothing about the current claim under consideration (Searle, 1980, p. 366).

At the end of his famous article Searle tries to state his position more explicitly with respect to a number of issues important to us. First, Searle is not arguing against materialism or the view that machines can think. For one, he thinks we are machines that can think. It is an empirical question whether there could be an artifact, a man-made machine, that could think. There is something about Searle (or any human) that the computers in question lack that allows him to understand English--why not give this to the computers? If we were able to build a machine out of the same material of which we are made (neurons, etc.), then we would have duplicated that in us which has the requisite causal powers and so it would obviously think. (Note here that Searle has shifted talk from "understanding" to "thinking" and seems to mean conscious thinking.) However, if the question is whether something could think or understand solely by virtue of being the right sort of program, the answer is no. This would not be likely where the operation of the machine is defined solely in terms of computational processes over formally defined elements (the instantiation of a computer program). It is not because he is an instantiation of a computer program that Searle can understand English and have other intentional states. It is because of his biological structure, and it is this that would have to be given to the computer. Only something that has the causal powers of this structure could have intentionality. It is an empirical question whether other things like Martians could be made of the kind of stuff that has such a structure. What matters about the brain is not the formal sequence of synapses but the actual properties of the sequences. The formal symbol manipulations of computers are not even real symbol manipulations, since they don't symbolize anything. Any intentionality involved is in the minds of the programmers and interpreters (Searle, 1980, pp. 367-369).

Searle thinks there are severe problems with the assumption that the mind is to the brain as a computer program is to the computer hardware. The same program could have many different realizations in different entities and systems that have no intentionality. He mentions Weizenbaum's example of how to construct a computer out of toilet paper and stones, which to Searle clearly would lack intentionality even though it could instantiate a formal program. Likewise for his Chinese translation system using water pipes. And intentional states are not just purely formal in the way a program is. Intentional states have content--the belief that it is raining is not defined as a particular formal shape but as mental content with conditions of satisfaction. Also, mental states and events are a product of the operation of the brain in a way that a program is not a product of the operation of the computer (Searle, 1980, p. 369).

People have been fooled by the notion of simulation into thinking a computer simulation of an event or process is the real thing, but computer simulations aren't confined to simulations of thinking. "No one supposes that computer simulations of a five-alarm fire will burn the neighborhood down or that a computer simulation of a rainstorm will leave us drenched. Why on earth would anyone suppose that a computer simulation of understanding actually understood anything?" Searle notes that it is sometimes bemoaned that it will be difficult to get a computer to feel pain, or fall in love, but these would be no harder than cognition or anything else. Simulation just involves the transformation of some input to output by a program, but this is all the computer does with anything, even a simulation of thinking. The assumption is based on a confusion of simulation with duplication (Searle, 1980, pp. 369-370).

Searle accuses strong AI of assuming a version of dualism. This is not a substance dualism, but rather the view that the mind is really independent of the brain in the sense that it could be realized as a system of computational symbol manipulation operations in any number of media, just as programs are independent of their realization in machines. Strong AI workers often rail against dualism without realizing that they have their own form in holding that what is specifically mental about the mind has no intrinsic connection with the actual properties of the brain (Searle, 1980, pp. 371-372).

Searle's original paper generated a storm of controversy. Some readers agreed with Searle and considered his argument decisive--digital computers made of silicon will never be able to think, at least not in virtue of their formal symbol manipulation. Other readers, including many in AI, thought Searle guilty of some sort of illusion. The claim was that his thought-experiments were completely unrealistic and therefore led his readers to gloss over important leaps in the line of reasoning--and this sleight of hand vitiated the force of his argument. Furthermore, there seems to be controversy over who has the burden of proof. Does Searle succeed if he shows that we can't assume that machines (in the sense of modern digital computers) can think? Or does he actually have to prove they can't? Do Searle's critics succeed if they show that Searle hasn't proved that machines can't think? Or do they have to prove that machines can?

As a way of appraising Searle's views, I want to distinguish between Searle's arguments and Searle's position. Of course, Searle's arguments are intended to convince you of the correctness of his position, but they are distinct. As I will discuss below, his arguments have been severely criticized, perhaps effectively, but even if his arguments are undermined we will still want to consider whether his position is correct. Part of what I'm saying here is just the well known point that the invalidity of an argument does not prove the conclusion false. I may put forth the most comically fallacious argument to prove to you that George Washington was President, but the fact that the argument is invalid does not prove that Washington wasn't President. So in the comments that follow, I'll discuss Searle's arguments, but in the end I'm more interested in whether his position is correct than whether his arguments work. And I'm not really interested in reconstructing the historical Searle either. If his comments lead one to a particular interpretation of his argument such that the argument is invalid, I would be interested in finding out whether his claims and position could be reinterpreted as a different argument that might be insightful in its own right.

Much ado is made about how what Searle describes--the massive project of translation by a single individual--would be impossible to pull off. It would take days, weeks, months, years, etc. of effort. Similarly, the variation on the story that involves the individual memorizing of millions or billions of rules describes a task that could never be done (for example, see Hofstadter, 1981, pp. 373-375). Personally I do not see that the fact that the scenarios in the thought experiments describe unrealistic situations automatically entails that Searle's reliance on them is faulty. The extremely common philosophical use of thought-experiments usually assumes only that the depicted scenarios be logically possible (not violating the laws of logic and thus inherently self-contradictory, for example), not that they be physically possible (obeying all the laws of physics) or technologically possible (possible given current technology). So it is not obvious to me why Searle's thought-experiments have to obey restrictions not followed by countless bizarre body, brain, and mind-switching experiments and instances of teletransportation and brain fission depicted in the literature on personal identity, for example.

Let's look at some specific criticisms of Searle's argument, first with respect to the systems reply type of objection and then later with respect to the robot reply type of objection. Copeland thinks Searle's argument that the whole system doesn't understand is fallacious. (Copeland discusses the version involving Sam the story understanding program, but his remarks would apply to Searle's earlier versions as well, since Searle has never changed the essentials of this critical line of reasoning.) Basically Copeland thinks Searle is focusing on the wrong participant in his Chinese translation scenarios. Sure, the person doing the translation does not understand Chinese, because he is merely a laboring cog in the whole process. But if you could ask the voice of the whole system or program (or process of translation), this voice would reply that she understands what the words and sentences mean (whether or not she does, she will say that she does). Searle moves fallaciously from the assumption that no amount of symbol manipulation by a part of the system will enable that part to understand the input to the conclusion that no amount of symbol manipulation by a part of the system will enable the wider system (of which the part is a component) to understand the input (Copeland, 1993, p. 125). And of course strictly speaking such a conclusion does not follow from this premise. Copeland is correct that if this is the whole of the argument it is deductively invalid. You can't legitimately infer from the fact that part of something has a particular property that the whole has that property. This is an old logical fallacy known as the "part-whole" fallacy. Copeland assumes Searle has the burden of proof and his opponents win if they show his arguments are faulty. And if his argument is intended as a deductive argument, and it is invalid, then it is faulty.

As Copeland points out, Searle explicitly considers this part-whole distinction under the rubric of the systems reply and thinks it silly to claim that the part alone doesn't understand but the part conjoined with slips of paper does understand. (Here again we have the burden of proof issue. Is Searle's opponent claiming only that the whole system may understand (Searle not having shown that it can't) or that the system does understand (and then where is the argument for this?)?) The critic claims the silliness comes not from the notion of the whole system understanding but from the fact that Searle's scenario is so ridiculous--a man trying to translate by manipulating untold numbers of rules and slips of paper (Copeland, 1993, p. 126).

But let's take the comments and reinterpret the debate. If the strong AI proponent is merely asserting that the system understands, then this does seem to be question begging, as Searle argues and Copeland agrees (Copeland, 1993, p. 127). But let's be more charitable than that to strong AI and take it that the strong AI proponent is claiming that we can assert that computers can or will think by using an analogy with humans. Just as we attribute thought to other humans on the basis of their having features and behavior that are relevantly similar to us (though we can't "be them" to really prove to ourselves that they think), so likewise we can take it that computers think if we see such computers exhibiting relevantly similar behavior (like passing the Turing test, or in robot bodies conversing with us). This seems to be an argument from analogy on the part of the proponent of strong AI, and though it is not strictly speaking deductively valid, it might be accepted as a decent inductive argument (if the analogy really holds). Copeland seems to agree that strong AI might be using this strategy: "In my view it is as obvious as anything can be in philosophy that if we are ever confronted with a robot like Turbo Sam, we ought to say it thinks," and, "there can be no point in refusing to say he understands the language he speaks. For in my imaginary scenario Turbo Sam is every bit as skilful as a human in using words to ease and ornament his path through the world" (Copeland, 1993, p. 132). And I do think that strong AI seeks to build an analogy between humans and computers--proponents think that the way a computer works sheds light on the way the human mind works.

Searle can be interpreted as calling into question such an analogy. He is pointing out that the analogy holds among humans, since humans are made of the same kind of stuff. But computers, built out of different stuff, are not similar in this respect, and thus there is a disanalogy between humans and computers. So the strong AI attempt to show computers can or will understand by using an argument from analogy does not succeed.

Why should we think the kind of stuff involved is relevant and breaks the analogy? Because we can conceive of a situation (the Chinese room) in which symbol manipulation would not seem to produce understanding purely by virtue of the symbol manipulation. This seems to me to be the real merit of the Chinese translation examples--to try to break the analogy between humans and computers. Searle thinks that the Chinese translation scenario describes a situation that is relevantly similar to what goes on in computers and shows that if humans worked like computers, they wouldn't understand. If humans operated like computers, and "understood" purely on the basis of symbol manipulation (the thesis of strong AI), they wouldn't really understand. But we do understand, so our understanding can't come from doing what computers do and computers are not enough like us to be analogous. Of course Searle appeals to our intuitions at this point. The translator doesn't understand and neither does the whole that includes the translator and the slips of paper. I must admit that Searle's scenario strikes me as powerful and effective, since I have trouble believing the Chinese room system understands--while I can more easily wonder whether a future supercomputer understands, my intuition is that the system he describes in the Chinese room experiment doesn't understand. And does anybody not trying just to win the argument seriously think the system he describes would understand, though the translator wouldn't? Very few people would go as far as to really think the Chinese room system understands. Churchland doesn't, and he has been one of Searle's major opponents over the years. Even Copeland, who thinks Searle's argument is wrong, admits that this type of system (in the form of the Sam program) wouldn't understand (Copeland, 1993, p. 128). So why is Searle wrong to appeal to this intuition? If he thinks he has a good argument that proves that the system wouldn't understand because the part doesn't, he is wrong. But if what he is doing is showing that the relevant analogy with a computer is a Chinese room and not a human, and appealing to our intuition that the Chinese room wouldn't understand, I don't see that his position is all that faulty. If we agree, then Searle has broken the analogy between computers and humans, and the Chinese translation scenario does seem relevantly analogous to computer processing. Those who would argue otherwise would claim that the Chinese translation situation is not analogous to computer processing because...? Because computers would process faster? Because the individual involved would become exhausted? Searle thinks that these issues are irrelevant, and I personally fail to see that such things show the situation is not relevantly analogous to computer processing. Speed and resource exhaustion seem irrelevant to symbol processing per se. Strong AI holds that understanding is a function of computation, not computational speed per se or whether parts of the computer become "exhausted" during the processing. To recap, the similarities on which strong AI bases the analogy between humans and computers are first, functional output in response to input, and second, information processing by symbol manipulation. Since in the Chinese translation room scenario both of these occur but without the presence of understanding, and since in humans there clearly is understanding, there seems to be a respect in which computers and humans are relevantly different. Any reliance on an argument from analogy on the part of strong AI is therefore called into question.

But it does seem to come down to one's intuitions about the Chinese room scenario. Few strong AI proponents seem willing to allow that thermostats or smoke detectors or even today's PCs literally understand, but somehow when the computers get more sophisticated then they start understanding. (And to win the argument they will claim that, sure, the Chinese room system understands.) Searle's position is that nothing relevant would have changed between today's machines and those in the future, and so the disanalogy between humans and computers would still hold. But if your intuitions are that today's PCs do understand, and you really think that in the Chinese room scenario the man conjoined to the bits of paper would understand, or if you think that processing speed really does make the relevant difference, then you will not buy the claim that the analogy between humans and computers has been broken and Searle will not succeed with you.

On the other hand, Searle's amazement that people mistake the simulation for the reality is a little hard to swallow. Searle claims that while a real fire is hot, no one takes a simulation of a fire to be hot, and so likewise no one should mistake a simulation of thinking for actual thinking. The reality and its simulation are distinct, but is it really so surprising that one could think that a simulation of thinking is thinking when it (on the hypothesis of a functioning robot) results in the same type of output for input? If you feed a real fire wood it produces heat, and if you fed a simulation of a fire real wood, and as output you got actual heat, you might think that the simulation was as good as the reality and in fact was a fire. So when you talk to a robot and it responds in exactly the same way that a real person would respond to that input, then you might say that this simulation of thinking is as good as the thinking it simulates and might really be the same thing.

Searle has not retreated, despite much criticism (and some misunderstanding) from the AI community. In a recent book he reiterates his earlier point (and of course claims he refuted strong AI with his original argument): "if I do not understand Chinese solely on the basis of implementing a computer program for understanding Chinese, then neither does any other digital computer solely on that basis, because no digital computer has anything I do not have" (Searle, 1997, p. 11). The structure of the argument is summarized as: "1. Programs are entirely syntactical. 2. Minds have a semantics. 3. Syntax is not the same as, nor by itself sufficient for, semantics. Therefore, programs are not minds. Q.E.D." (Searle, 1997, pp. 11-12).

I am sympathetic to Searle's claim that it can't be in virtue of the sheer fact of computation that we get understanding. It must be in virtue of something else. He thinks that this something else has to do with the causal powers of the brain. Presumably if it is not in virtue of the sheer fact of computation in the case of the brain, then of course it must be something else, and if one wants to call that mysterious something else the "causal powers" then I have no objection. But Searle leaves the impression he thinks the causal powers have something to do with the fact that the brain is wetware rather than hardware. We certainly don't know what this something else is, or that it is some property of wetware per se. It might have to do with the organization, structure, or mode of computation of the brain rather than just the sheer fact of computation itself. And surely there is the possibility that this could be recreated in a computer. Searle recognizes that he hasn't shown that future robots won't understand, only that if they do it won't be in virtue of the sheer fact of computation over formal symbols that they do so.

I am also sympathetic to a more general point that we can learn from Searle's comments, namely that there is a lot going on in the human brain and mind and we shouldn't just assume that computation captures all of it, assuming it even captures any of it. We can do calculations, but this doesn't mean that a calculator captures what is going on in the brain or mind when we calculate, much less when we perceive and appreciate a sunset, love another human being, feel joy at the birth of a son or daughter, etc. To assume that all the many subjective experiences of humans are captured in the notion of computation seems naive.

If my interpretation or reinterpretation is correct, then though Searle hasn't shown that the system doesn't understand, he has called into question the analogy on which strong AI builds its case that the system does understand. But the above discussion focuses on the systems reply. What of the robot reply? Maybe the reason that the man, or the system, in the Chinese translation scenario fails to understand is that he (or the system), like isolated computers, has no way of connecting to the external world. This is the essence of the robot reply to Searle.

Recall that Searle's response to the robot reply is twofold. First, to claim that robot interaction in the form of input and output involving the external world is necessary for understanding is to concede that computation alone is insufficient. This seems correct, but it may be attacking a straw man. Strong AI proponents do claim that sophisticated computers will think and understand and do so in the way we do. But not all would claim that anything that computes thinks--the amount and type of computation may be relevant, or other organizational factors from the brain. The sheer fact of computation is not enough. So many might already grant that in some sense computation alone is not enough--it must be a certain kind of computation, etc. Someone like Schank, at least, doesn't think his early programs were thinkers, though he saw potential future versions of this type of program as thinking. But it might be argued by Searle that even allowing that the amount and type of computation are relevant is still to hold the position that computation alone is sufficient for thinking. Specifying the amount and type of computation is not to add anything to the computation. Besides, who has offered to draw a distinction between not enough computation and enough computation, or between the right kind and the wrong kind?

The second reply of Searle is that even with robot interaction with the world there will not be understanding because the input and output will still just be the manipulation of symbols. As has been remarked, the success of such a response of course assumes that Searle has already shown with the Chinese room argument that manipulation of symbols is not sufficient for understanding (Copeland, 1993, p. 132). His response here is thus parasitic on his earlier argument. As I have argued, even if Searle has not shown this, he may have called into question the analogy on which strong AI builds its argument that computation is sufficient for understanding, and so his response to the robot reply may succeed if this earlier argument has succeeded.

But here we may be able to at last sort out exactly what Searle thinks the computer is missing. A computer embodied in the form of a robot will be able to match some symbols (for example those composing names) with other symbols (the binary digits that it processes as it receives input from its television cameras, auditory devices, and so forth). Similarly, on the symbol or binary digit level, it will match certain instructions it gives with body movement input and other "sensory" input from its external environment. Now why would anyone advancing the robot reply think that this addition to the computer would result in understanding if its previous symbol manipulation, no matter how extensive and speedy, did not? (Note that the robot reply tacitly acknowledges that Searle has been correct all along in thinking that the non-robot computer cannot understand! Searle does seem right about this. By allowing that a computer must have some sort of robot body to understand, it concedes that syntax alone will not give the computer semantics. Searle should do more to play off one AI camp against the other. How can it be so obvious that computers do or will understand if the robot reply camp tacitly concedes that something more is needed?) The robot reply position must be that adding "sensory" input, etc. would provide for semantics instead of just syntax. Semantics has to do with the meaning of sentences, as opposed to the correctness of their structure grammatically. "Sensory" input would presumably allow the computer to match its words with things in the world, and thus attain the "word-world" connections needed in semantics. So to Searle's claim that the computer has syntax without semantics, the robot reply is that here is a way to get the semantics.

How can Searle still be so dissatisfied with the conceptual equipment of the computer-as-robot as to continue to deny that it understands when it has just the word-world connections that would seem to provide what is needed for semantics? Here is where we see that what Searle really may be driving at with his charge that computers lack intentionality and understanding is the claim that they lack any sort of conscious awareness. (Recall that intentionality refers to the fact that our beliefs and wishes, for example, are "about" something.) It seems that all any of us have in going from syntax to semantics is this word-world connection that such robots will have, except for one thing--conscious awareness of meaning. When I use the term "tree" I know what it means because I have the experience of trees in various ways, sensations of trees, of pictures of trees, etc. This comes through my sense organs or vicariously through someone else's sense organs. Now the robot can match its symbols for the term "tree" to other symbols for what comes through its "sense organs," in a way perhaps analogous to what goes on in humans. Why then would Searle not think this is enough for semantics? Because the robot has no conscious awareness of anything--treeness, tree sensations, etc. Any representation, any "intentionality" on the part of the robot is a purely "nonconscious" mapping of some symbols onto other symbols, and in understanding the meaning of words and sentences we do more than this. To Searle the robot has no original intentionality. This is why when he looks at the Chinese translation scenario involving a robot, with all the word-world connections one could ask for, Searle still refuses to concede it understands. Symbol manipulation, whether mapping words to things or not, will not create conscious awareness.

It certainly seems as if at the root of Searle's problems with the notion of computer understanding is his belief that computers have no conscious awareness. But this is not entirely certain. First, he doesn't focus on conscious awareness as the missing extra when he talks of computers not being able to understand. If that is what it is, why not come out and say so? Second, in other places he distinguishes between consciousness and intentionality, and the notion of understanding that computers don't have seems to be more a denial of intentionality to them. It is not obvious that all he means by intentionality is conscious representation. As well, since he sometimes equates the notion of a computer understanding with the notion of a computer thinking, the claim that conscious awareness is a necessary component of understanding means that it is a necessary component of thinking, and if this is his view he cannot allow the possibility of unconscious thought. I'm not sure that he holds such a view.

What can we make of the plausibility of Searle's objection to the sufficiency of robot word-world connections for semantics and therefore understanding? Here we are back to Copeland's earlier discussion of intentionality in the context of the Turing test. We find some support for Searle's view in Robinson. But first we must note a distinction between word-world connections as a necessary condition of understanding and such connections as a sufficient condition of understanding. The robot reply argues that what is lacking in computers (and Searle's Chinese room) is word-world connections, and so providing the computer with these (via a robot body, etc.) is sufficient for understanding, given all else that is going on with the computer. Searle objects that since there would still be no understanding such word-world connections are not sufficient. Whether he is correct or not, this is a different issue from the one of whether such connections are necessary for understanding.

An argument that robots probably won't have conscious experience comes from Robinson, who asks us to imagine a series of robot scenarios that start with a simple computer. In each scenario the computer gains more sophistication, a body, more integration, etc. The point of all these robot scenarios is to try to flush out our intuitions about whether adding various body associations, including processing of external input and the ability to act, enables it to experience pain, understand, and have its actions mean something to it. A crucial point is reached with robot "Frp," who is embodied and can act in response to commands, for example "Go to the drugstore and bring back the prescription that the druggist will have for me." By carrying out such a command, does this embodied, acting robot show that it understands? (Recall that Searle would reply that such activity does not show that it understands, and furthermore it cannot understand because it is just manipulating symbols without the presence of semantic meaning.) Robinson's answer is that it depends on what you mean by the term "understanding." Here Robinson distinguishes between understanding in the sense connected to appropriate action and understanding in the sense connected to sensation and feelings. In the first sense of "understanding," Robinson thinks it does understand, because it engages in appropriate action. This is a functional sense of understanding--the robot can engage in the appropriate output given a particular input, and so we can say that it understands. Not understanding in this sense would be if it failed to produce the desired output, as if you told it to go to the drugstore, and instead it went to the supermarket. (Clearly this sense of understanding would not satisfy Searle.) But in the second sense of "understanding," robot Frp, who has no sensations or feelings, does not understand. According to Robinson, this is because having no sensations and feelings, its actions ultimately have no point for it (Robinson, 1992, pp. 39-54). This is a somewhat odd way to put it. I would have expected Robinson to say that it lacks this kind of understanding because it has no conscious awareness.

So it seems that the only sense of understanding relevant to robots is the weak sense of performance, not the strong sense involving sensations and feelings. The strong sense of understanding does seem to involve some sort of conscious awareness, for it is the sense in which a congenitally blind person, never having experienced the color of anything, does not understand "Chlorophyll is green" and "Strawberries are red" (Robinson, 1992, p. 53). This is what Chalmers would call a phenomenal sense of understanding, as opposed to a psychological sense. Robinson's position then supports Searle's claim that even if a robot has input and output involving the external world, if it doesn't have sensations and feelings as we know them, then it will not understand in any strong phenomenal sense. Word-world connections will give it semantic meaning in the weak, functional sense of understanding, but there is no reason to believe, and reason to doubt, that it is enough to give it semantics in the sense of conscious awareness of meaning and therefore understanding in the strong sense.

I conclude that while Searle may not have proved that computers and robots have no understanding in the sense of conscious awareness, his story might give us pause in merely assuming they do. A robot provided with a sophisticated sensory capacity for interacting with the world may provide for semantics and not just syntax, but whether this is enough for it to be consciously aware of its surroundings is an open question. The question of robots and consciousness will continue in the next chapter.

Copyright: 2000