《人机共生》

Man-Computer Symbiosis

J. C. R. Licklider
IRE Transactions on Human Factors in Electronics,
volume HFE-1, pages 4-11, March 1960

Summary

Man-computer symbiosis is an expected development in cooperative interaction between men and electronic computers. It will involve very close coupling between the human and the electronic members of the partnership. The main aims are 1) to let computers facilitate formulative thinking as they now facilitate the solution of formulated problems, and 2) to enable men and computers to cooperate in making decisions and controlling complex situations without inflexible dependence on predetermined programs. In the anticipated symbiotic partnership, men will set the goals, formulate the hypotheses, determine the criteria, and perform the evaluations. Computing machines will do the routinizable work that must be done to prepare the way for insights and decisions in technical and scientific thinking. Preliminary analyses indicate that the symbiotic partnership will perform intellectual operations much more effectively than man alone can perform them. Prerequisites for the achievement of the effective, cooperative association include developments in computer time sharing, in memory components, in memory organization, in programming languages, and in input and output equipment.

1 Introduction

1.1 Symbiosis

The fig tree is pollinated only by the insect Blastophaga grossorum. The larva of the insect lives in the ovary of the fig tree, and there it gets its food. The tree and the insect are thus heavily interdependent: the tree cannot reproduce without the insect; the insect cannot eat without the tree; together, they constitute not only a viable but a productive and thriving partnership. This cooperative “living together in intimate association, or even close union, of two dissimilar organisms” is called symbiosis [27].

Man-computer symbiosis is a subclass of man-machine systems. There are many man-machine systems. At present, however, there are no man-computer symbioses. The purposes of this paper are to present the concept and, hopefully, to foster the development of man-computer symbiosis by analyzing some problems of interaction between men and computing machines, calling attention to applicable principles of man-machine engineering, and pointing out a few questions to which research answers are needed. The hope is that, in not too many years, human brains and computing machines will be coupled together very tightly, and that the resulting partnership will think as no human brain has ever thought and process data in a way not approached by the information-handling machines we know today.

1.2 Between “Mechanically Extended Man” and “Artificial Intelligence”

As a concept, man-computer symbiosis is different in an important way from what North [21] has called “mechanically extended man.” In the man-machine systems of the past, the human operator supplied the initiative, the direction, the integration, and the criterion. The mechanical parts of the systems were mere extensions, first of the human arm, then of the human eye. These systems certainly did not consist of “dissimilar organisms living together…” There was only one kind of organism, man, and the rest was there only to help him.

In one sense of course, any man-made system is intended to help man, to help a man or men outside the system. If we focus upon the human operator within the system, however, we see that, in some areas of technology, a fantastic change has taken place during the last few years. “Mechanical extension” has given way to replacement of men, to automation, and the men who remain are there more to help than to be helped. In some instances, particularly in large computer-centered information and control systems, the human operators are responsible mainly for functions that it proved infeasible to automate. Such systems (“humanly extended machines,” North might call them) are not symbiotic systems. They are “semi-automatic” systems, systems that started out to be fully automatic but fell short of the goal.

Man-computer symbiosis is probably not the ultimate paradigm for complex technological systems. It seems entirely possible that, in due course, electronic or chemical “machines” will outdo the human brain in most of the functions we now consider exclusively within its province. Even now, Gelernter’s IBM-704 program for proving theorems in plane geometry proceeds at about the same pace as Brooklyn high school students, and makes similar errors [12]. There are, in fact, several theorem-proving, problem-solving, chess-playing, and pattern-recognizing programs (too many for complete reference [1, 2, 5, 8, 11, 13, 17, 18, 19, 22, 23, 25]) capable of rivaling human intellectual performance in restricted areas; and Newell, Simon, and Shaw’s [20] “general problem solver” may remove some of the restrictions. In short, it seems worthwhile to avoid argument with (other) enthusiasts for artificial intelligence by conceding dominance in the distant future of cerebration to machines alone. There will nevertheless be a fairly long interim during which the main intellectual advances will be made by men and computers working together in intimate association. A multidisciplinary study group, examining future research and development problems of the Air Force, estimated that it would be 1980 before developments in artificial intelligence make it possible for machines alone to do much thinking or problem solving of military significance. That would leave, say, five years to develop man-computer symbiosis and 15 years to use it. The 15 may be 10 or 500, but those years should be intellectually the most creative and exciting in the history of mankind.

2 Aims of Man-Computer Symbiosis

Present-day computers are designed primarily to solve preformulated problems or to process data according to predetermined procedures. The course of the computation may be conditional upon results obtained during the computation, but all the alternatives must be foreseen in advance. (If an unforeseen alternative arises, the whole process comes to a halt and awaits the necessary extension of the program.) The requirement for preformulation or predetermination is sometimes no great disadvantage. It is often said that programming for a computing machine forces one to think clearly, that it disciplines the thought process. If the user can think his problem through in advance, symbiotic association with a computing machine is not necessary.

However, many problems that can be thought through in advance are very difficult to think through in advance. They would be easier to solve, and they could be solved faster, through an intuitively guided trial-and-error procedure in which the computer cooperated, turning up flaws in the reasoning or revealing unexpected turns in the solution. Other problems simply cannot be formulated without computing-machine aid. Poincaré anticipated the frustration of an important group of would-be computer users when he said, “The question is not, ‘What is the answer?’ The question is, ‘What is the question?’” One of the main aims of man-computer symbiosis is to bring the computing machine effectively into the formulative parts of technical problems.

The other main aim is closely related. It is to bring computing machines effectively into processes of thinking that must go on in “real time,” time that moves too fast to permit using computers in conventional ways. Imagine trying, for example, to direct a battle with the aid of a computer on such a schedule as this. You formulate your problem today. Tomorrow you spend with a programmer. Next week the computer devotes 5 minutes to assembling your program and 47 seconds to calculating the answer to your problem. You get a sheet of paper 20 feet long, full of numbers that, instead of providing a final solution, only suggest a tactic that should be explored by simulation. Obviously, the battle would be over before the second step in its planning was begun. To think in interaction with a computer in the same way that you think with a colleague whose competence supplements your own will require much tighter coupling between man and machine than is suggested by the example and than is possible today.

3 Need for Computer Participation in Formulative and Real-Time Thinking

The preceding paragraphs tacitly made the assumption that, if they could be introduced effectively into the thought process, the functions that can be performed by data-processing machines would improve or facilitate thinking and problem solving in an important way. That assumption may require justification.

3.1 A Preliminary and Informal Time-and-Motion Analysis of Technical Thinking

Despite the fact that there is a voluminous literature on thinking and problem solving, including intensive case-history studies of the process of invention, I could find nothing comparable to a time-and-motion-study analysis of the mental work of a person engaged in a scientific or technical enterprise. In the spring and summer of 1957, therefore, I tried to keep track of what one moderately technical person actually did during the hours he regarded as devoted to work. Although I was aware of the inadequacy of the sampling, I served as my own subject.

It soon became apparent that the main thing I did was to keep records, and the project would have become an infinite regress if the keeping of records had been carried through in the detail envisaged in the initial plan. It was not. Nevertheless, I obtained a picture of my activities that gave me pause. Perhaps my spectrum is not typical (I hope it is not, but I fear it is).

About 85 per cent of my “thinking” time was spent getting into a position to think, to make a decision, to learn something I needed to know. Much more time went into finding or obtaining information than into digesting it. Hours went into the plotting of graphs, and other hours into instructing an assistant how to plot. When the graphs were finished, the relations were obvious at once, but the plotting had to be done in order to make them so. At one point, it was necessary to compare six experimental determinations of a function relating speech-intelligibility to speech-to-noise ratio. No two experimenters had used the same definition or measure of speech-to-noise ratio. Several hours of calculating were required to get the data into comparable form. When they were in comparable form, it took only a few seconds to determine what I needed to know.

Throughout the period I examined, in short, my “thinking” time was devoted mainly to activities that were essentially clerical or mechanical: searching, calculating, plotting, transforming, determining the logical or dynamic consequences of a set of assumptions or hypotheses, preparing the way for a decision or an insight. Moreover, my choices of what to attempt and what not to attempt were determined to an embarrassingly great extent by considerations of clerical feasibility, not intellectual capability.

The main suggestion conveyed by the findings just described is that the operations that fill most of the time allegedly devoted to technical thinking are operations that can be performed more effectively by machines than by men. Severe problems are posed by the fact that these operations have to be performed upon diverse variables and in unforeseen and continually changing sequences. If those problems can be solved in such a way as to create a symbiotic relation between a man and a fast information-retrieval and data-processing machine, however, it seems evident that the cooperative interaction would greatly improve the thinking process.

It may be appropriate to acknowledge, at this point, that we are using the term “computer” to cover a wide class of calculating, data-processing, and information-storage-and-retrieval machines. The capabilities of machines in this class are increasing almost daily. It is therefore hazardous to make general statements about capabilities of the class. Perhaps it is equally hazardous to make general statements about the capabilities of men. Nevertheless, certain genotypic differences in capability between men and computers do stand out, and they have a bearing on the nature of possible man-computer symbiosis and the potential value of achieving it.

As has been said in various ways, men are noisy, narrow-band devices, but their nervous systems have very many parallel and simultaneously active channels. Relative to men, computing machines are very fast and very accurate, but they are constrained to perform only one or a few elementary operations at a time. Men are flexible, capable of “programming themselves contingently” on the basis of newly received information. Computing machines are single-minded, constrained by their “pre-programming.” Men naturally speak redundant languages organized around unitary objects and coherent actions and employing 20 to 60 elementary symbols. Computers “naturally” speak nonredundant languages, usually with only two elementary symbols and no inherent appreciation either of unitary objects or of coherent actions.

To be rigorously correct, those characterizations would have to include many qualifiers. Nevertheless, the picture of dissimilarity (and therefore potential supplementation) that they present is essentially valid. Computing machines can do readily, well, and rapidly many things that are difficult or impossible for man, and men can do readily and well, though not rapidly, many things that are difficult or impossible for computers. That suggests that a symbiotic cooperation, if successful in integrating the positive characteristics of men and computers, would be of great value. The differences in speed and in language, of course, pose difficulties that must be overcome.

4 Separable Functions of Men and Computers in the Anticipated Symbiotic Association

It seems likely that the contributions of human operators and equipment will blend together so completely in many operations that it will be difficult to separate them neatly in analysis. That would be the case if, in gathering data on which to base a decision, for example, both the man and the computer came up with relevant precedents from experience and if the computer then suggested a course of action that agreed with the man’s intuitive judgment. (In theorem-proving programs, computers find precedents in experience, and in the SAGE System, they suggest courses of action. The foregoing is not a far-fetched example.) In other operations, however, the contributions of men and equipment will be to some extent separable.

Men will set the goals and supply the motivations, of course, at least in the early years. They will formulate hypotheses. They will ask questions. They will think of mechanisms, procedures, and models. They will remember that such-and-such a person did some possibly relevant work on a topic of interest back in 1947, or at any rate shortly after World War II, and they will have an idea in what journals it might have been published. In general, they will make approximate and fallible, but leading, contributions, and they will define criteria and serve as evaluators, judging the contributions of the equipment and guiding the general line of thought.

In addition, men will handle the very-low-probability situations when such situations do actually arise. (In current man-machine systems, that is one of the human operator’s most important functions. The sum of the probabilities of very-low-probability alternatives is often much too large to neglect.) Men will fill in the gaps, either in the problem solution or in the computer program, when the computer has no mode or routine that is applicable in a particular circumstance.

The information-processing equipment, for its part, will convert hypotheses into testable models and then test the models against data (which the human operator may designate roughly and identify as relevant when the computer presents them for his approval). The equipment will answer questions. It will simulate the mechanisms and models, carry out the procedures, and display the results to the operator. It will transform data, plot graphs (“cutting the cake” in whatever way the human operator specifies, or in several alternative ways if the human operator is not sure what he wants). The equipment will interpolate, extrapolate, and transform. It will convert static equations or logical statements into dynamic models so the human operator can examine their behavior. In general, it will carry out the routinizable, clerical operations that fill the intervals between decisions.
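
To make the "conversion of static equations into dynamic models" concrete, here is a minimal sketch in modern Python, purely for illustration (nothing like it appears in the original paper): the operator supplies a static relation, here dx/dt = -k * x, and a routine integrates it forward in time so its behavior can be examined.

```python
# Hypothetical sketch of one routinizable service described above: turning a
# static statement (dx/dt = -k * x) into a dynamic model whose behavior the
# operator can inspect as a short table of values over time.

def simulate(derivative, x0, dt, steps):
    """Integrate dx/dt = derivative(x) with the forward Euler method."""
    x, trajectory = x0, [x0]
    for _ in range(steps):
        x = x + dt * derivative(x)
        trajectory.append(x)
    return trajectory

if __name__ == "__main__":
    k = 0.5
    # Static equation supplied by the operator: dx/dt = -k * x
    path = simulate(lambda x: -k * x, x0=10.0, dt=0.1, steps=50)
    for step, value in enumerate(path[::10]):       # sample every 1.0 time unit
        print(f"t = {step * 1.0:4.1f}  x = {value:6.3f}")
```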

In addition, the computer will serve as a statistical-inference, decision-theory, or game-theory machine to make elementary evaluations of suggested courses of action whenever there is enough basis to support a formal statistical analysis. Finally, it will do as much diagnosis, pattern-matching, and relevance-recognizing as it profitably can, but it will accept a clearly secondary status in those areas.
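
As a hedged illustration of the "decision-theory machine" role, the sketch below ranks suggested courses of action by expected value, given rough probabilities and payoffs supplied by the human operator. The course names, probabilities, and payoffs are invented placeholders, not anything from the paper.

```python
# Toy decision-theory evaluation: rank candidate courses of action by expected
# value computed from operator-supplied (probability, payoff) estimates.

def expected_value(outcomes):
    """outcomes: list of (probability, payoff) pairs for one course of action."""
    return sum(p * v for p, v in outcomes)

courses = {
    "hold position":   [(0.7, 10), (0.3, -20)],
    "commit reserves": [(0.5, 40), (0.5, -30)],
}

for name, outcomes in sorted(courses.items(),
                             key=lambda kv: expected_value(kv[1]),
                             reverse=True):
    print(f"{name:15s} expected value = {expected_value(outcomes):6.1f}")
```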

5 Prerequisites for Realization of Man-Computer Symbiosis

The data-processing equipment tacitly postulated in the preceding section is not available. The computer programs have not been written. There are in fact several hurdles that stand between the nonsymbiotic present and the anticipated symbiotic future. Let us examine some of them to see more clearly what is needed and what the chances are of achieving it.

5.1 Speed Mismatch Between Men and Computers

Any present-day large-scale computer is too fast and too costly for real-time cooperative thinking with one man. Clearly, for the sake of efficiency and economy, the computer must divide its time among many users. Time-sharing systems are currently under active development. There are even arrangements to keep users from “clobbering” anything but their own personal programs.
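
As a rough illustration of the time-sharing idea (a modern toy model, not a description of any system under development in 1960), the sketch below divides one machine's attention among several users in fixed quanta, round-robin fashion, so that each user sees an apparently continuous service.

```python
# Toy round-robin scheduler: one fast machine serving several users in turn.

from collections import deque

def round_robin(jobs, quantum):
    """jobs: dict of user -> remaining work units; quantum: units per turn."""
    queue = deque(jobs.items())
    while queue:
        user, remaining = queue.popleft()
        work = min(quantum, remaining)
        print(f"serving {user} for {work} unit(s)")
        if remaining - work > 0:
            queue.append((user, remaining - work))   # not finished: back of the line

round_robin({"user-a": 3, "user-b": 5, "user-c": 2}, quantum=2)
```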

It seems reasonable to envision, for a time 10 or 15 years hence, a “thinking center” that will incorporate the functions of present-day libraries together with anticipated advances in information storage and retrieval and the symbiotic functions suggested earlier in this paper. The picture readily enlarges itself into a network of such centers, connected to one another by wide-band communication lines and to individual users by leased-wire services. In such a system, the speed of the computers would be balanced, and the cost of the gigantic memories and the sophisticated programs would be divided by the number of users.

5.2 Memory Hardware Requirements

When we start to think of storing any appreciable fraction of a technical literature in computer memory, we run into billions of bits and, unless things change markedly, billions of dollars.

The first thing to face is that we shall not store all the technical and scientific papers in computer memory. We may store the parts that can be summarized most succinctly (the quantitative parts and the reference citations) but not the whole. Books are among the most beautifully engineered, and human-engineered, components in existence, and they will continue to be functionally important within the context of man-computer symbiosis. (Hopefully, the computer will expedite the finding, delivering, and returning of books.)

The second point is that a very important section of memory will be permanent: part indelible memory and part published memory. The computer will be able to write once into indelible memory, and then read back indefinitely, but the computer will not be able to erase indelible memory. (It may also over-write, turning all the 0’s into 1’s, as though marking over what was written earlier.) Published memory will be “read-only” memory. It will be introduced into the computer already structured. The computer will be able to refer to it repeatedly, but not to change it. These types of memory will become more and more important as computers grow larger. They can be made more compact than core, thin-film, or even tape memory, and they will be much less expensive. The main engineering problems will concern selection circuitry.
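
One way to read the "indelible memory" proposal is as write-once storage whose only permitted change after writing is marking over, i.e., driving every bit to 1. The following toy model is an interpretation of that behavior in Python, not hardware described in the paper.

```python
# Interpretive model of indelible memory: each cell may be written once and
# read indefinitely; erasure is impossible, and "over-writing" is modeled as
# forcing every bit to 1, as though marking over what was written earlier.

class IndelibleMemory:
    def __init__(self, size):
        self.cells = [None] * size        # None means "never written"

    def write(self, address, word):
        if self.cells[address] is not None:
            raise ValueError("cell already written; indelible memory cannot be erased")
        self.cells[address] = word

    def overwrite(self, address):
        """Mark over an already-written cell by setting every bit to 1."""
        self.cells[address] = 0b11111111

    def read(self, address):
        return self.cells[address]

mem = IndelibleMemory(4)
mem.write(0, 0b01010101)
print(bin(mem.read(0)))   # 0b1010101
mem.overwrite(0)
print(bin(mem.read(0)))   # 0b11111111
```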

In so far as other aspects of memory requirement are concerned, we may count upon the continuing development of ordinary scientific and business computing machines. There is some prospect that memory elements will become as fast as processing (logic) elements. That development would have a revolutionary effect upon the design of computers.

5.3 Memory Organization Requirements

Implicit in the idea of man-computer symbiosis are the requirements that information be retrievable both by name and by pattern and that it be accessible through procedures much faster than serial search. At least half of the problem of memory organization appears to reside in the storage procedure. Most of the remainder seems to be wrapped up in the problem of pattern recognition within the storage mechanism or medium. Detailed discussion of these problems is beyond the present scope. However, a brief outline of one promising idea, “trie memory,” may serve to indicate the general nature of anticipated developments.

Trie memory is so called by its originator, Fredkin [10], because it is designed to facilitate retrieval of information and because the branching storage structure, when developed, resembles a tree. Most common memory systems store functions of arguments at locations designated by the arguments. (In one sense, they do not store the arguments at all. In another and more realistic sense, they store all the possible arguments in the framework structure of the memory.) The trie memory system, on the other hand, stores both the functions and the arguments. The argument is introduced into the memory first, one character at a time, starting at a standard initial register. Each argument register has one cell for each character of the ensemble (e.g., two for information encoded in binary form) and each character cell has within it storage space for the address of the next register. The argument is stored by writing a series of addresses, each one of which tells where to find the next. At the end of the argument is a special “end-of-argument” marker. Then follow directions to the function, which is stored in one or another of several ways, either further trie structure or “list structure” often being most effective.

The trie memory scheme is inefficient for small memories, but it becomes increasingly efficient in using available storage space as memory size increases. The attractive features of the scheme are these: 1) The retrieval process is extremely simple. Given the argument, enter the standard initial register with the first character, and pick up the address of the second. Then go to the second register, and pick up the address of the third, etc. 2) If two arguments have initial characters in common, they use the same storage space for those characters. 3) The lengths of the arguments need not be the same, and need not be specified in advance. 4) No room in storage is reserved for or used by any argument until it is actually stored. The trie structure is created as the items are introduced into the memory. 5) A function can be used as an argument for another function, and that function as an argument for the next. Thus, for example, by entering with the argument, “matrix multiplication,” one might retrieve the entire program for performing a matrix multiplication on the computer. 6) By examining the storage at a given level, one can determine what thus-far similar items have been stored. For example, if there is no citation for Egan, J. P., it is but a step or two backward to pick up the trail of Egan, James … .
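
For readers who want the scheme in executable form, here is a compact trie sketch in Python that mirrors the description above: arguments are entered one character at a time, arguments with common initial characters share storage, an end-of-argument marker leads to the stored function, and thus-far-similar items under a common prefix can be listed. It is an illustrative reconstruction, not Fredkin's design in detail.

```python
# Minimal trie ("trie memory") sketch in the spirit of the scheme summarized
# above; the hardware registers of the 1960 proposal are not modeled.

END = "<end-of-argument>"

def store(trie, argument, function):
    node = trie
    for ch in argument:                 # one register per character
        node = node.setdefault(ch, {})
    node[END] = function                # end-of-argument marker -> function

def retrieve(trie, argument):
    node = trie
    for ch in argument:
        if ch not in node:
            return None                 # argument never stored
        node = node[ch]
    return node.get(END)

def entries_with_prefix(trie, prefix):
    """List thus-far-similar items, e.g. all stored names beginning 'Egan'."""
    node = trie
    for ch in prefix:
        if ch not in node:
            return []
        node = node[ch]
    found = []
    def walk(n, suffix):
        for key, child in n.items():
            if key == END:
                found.append(prefix + suffix)
            else:
                walk(child, suffix + key)
    walk(node, "")
    return found

memory = {}
store(memory, "matrix multiplication", "<program for matrix multiplication>")
store(memory, "Egan, James", "<citation>")
print(retrieve(memory, "matrix multiplication"))
print(entries_with_prefix(memory, "Egan"))
```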

The properties just described do not include all the desired ones, but they bring computer storage into resonance with human operators and their predilection to designate things by naming or pointing.

5.4 The Language Problem

The basic dissimilarity between human languages and computer languages may be the most serious obstacle to true symbiosis. It is reassuring, however, to note what great strides have already been made, through interpretive programs and particularly through assembly or compiling programs such as FORTRAN, to adapt computers to human language forms. The “Information Processing Language” of Shaw, Newell, Simon, and Ellis [24] represents another line of rapprochement. And, in ALGOL and related systems, men are proving their flexibility by adopting standard formulas of representation and expression that are readily translatable into machine language.

For the purposes of real-time cooperation between men and computers, it will be necessary, however, to make use of an additional and rather different principle of communication and control. The idea may be highlighted by comparing instructions ordinarily addressed to intelligent human beings with instructions ordinarily used with computers. The latter specify precisely the individual steps to take and the sequence in which to take them. The former present or imply something about incentive or motivation, and they supply a criterion by which the human executor of the instructions will know when he has accomplished his task. In short: instructions directed to computers specify courses; instructions directed to human beings specify goals.

Men appear to think more naturally and easily in terms of goals than in terms of courses. True, they usually know something about directions in which to travel or lines along which to work, but few start out with precisely formulated itineraries. Who, for example, would depart from Boston for Los Angeles with a detailed specification of the route? Instead, to paraphrase Wiener, men bound for Los Angeles try continually to decrease the amount by which they are not yet in the smog.

Computer instruction through specification of goals is being approached along two paths. The first involves problem-solving, hill-climbing, self-organizing programs. The second involves real-time concatenation of preprogrammed segments and closed subroutines which the human operator can designate and call into action simply by name.

Along the first of these paths, there has been promising exploratory work. It is clear that, working within the loose constraints of predetermined strategies, computers will in due course be able to devise and simplify their own procedures for achieving stated goals. Thus far, the achievements have not been substantively important; they have constituted only “demonstration in principle.” Nevertheless, the implications are far-reaching.
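
The flavor of "instruction through specification of goals" can be suggested with a toy hill climber: the operator states only the goal (a score to maximize), and the program devises its own sequence of steps. This is a modern illustrative sketch, not drawn from any program cited in the paper.

```python
# Toy hill climbing toward a stated goal: only the scoring criterion is given;
# the sequence of moves is worked out by the program itself.

def hill_climb(score, start, step=1.0, iterations=100):
    """Greedily move to whichever neighbor improves the score."""
    x = start
    for _ in range(iterations):
        candidates = (x - step, x, x + step)
        x = max(candidates, key=score)
    return x

target = 42.0
goal = lambda x: -abs(x - target)       # higher score = closer to the goal
print(hill_climb(goal, start=0.0))      # approaches 42.0
```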

Although the second path is simpler and apparently capable of earlier realization, it has been relatively neglected. Fredkin’s trie memory provides a promising paradigm. We may in due course see a serious effort to develop computer programs that can be connected together like the words and phrases of speech to do whatever computation or control is required at the moment. The consideration that holds back such an effort, apparently, is that the effort would produce nothing that would be of great value in the context of existing computers. It would be unrewarding to develop the language before there are any computing machines capable of responding meaningfully to it.
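
A minimal sketch of the second path, under the assumption that the "preprogrammed segments" behave like self-contained routines the operator chains together by name, like the words and phrases of speech; the segment names and the pipeline below are invented for illustration.

```python
# Real-time concatenation of named, preprogrammed segments: each segment is a
# self-contained routine, and the operator composes them by name at run time.

SEGMENTS = {
    "load":      lambda data: [1.0, 4.0, 9.0, 16.0],   # stand-in for data retrieval
    "sqrt":      lambda data: [x ** 0.5 for x in data],
    "normalize": lambda data: [x / max(data) for x in data],
}

def run(segment_names, data=None):
    """Call the named segments in order, feeding each one's output to the next."""
    for name in segment_names:
        data = SEGMENTS[name](data)
    return data

print(run(["load", "sqrt", "normalize"]))   # [0.25, 0.5, 0.75, 1.0]
```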

5.5 Input and Output Equipment

The department of data processing that seems least advanced, in so far as the requirements of man-computer symbiosis are concerned, is the one that deals with input and output equipment or, as it is seen from the human operator’s point of view, displays and controls. Immediately after saying that, it is essential to make qualifying comments, because the engineering of equipment for high-speed introduction and extraction of information has been excellent, and because some very sophisticated display and control techniques have been developed in such research laboratories as the Lincoln Laboratory. By and large, in generally available computers, however, there is almost no provision for any more effective, immediate man-machine communication than can be achieved with an electric typewriter.

Displays seem to be in a somewhat better state than controls. Many computers plot graphs on oscilloscope screens, and a few take advantage of the remarkable capabilities, graphical and symbolic, of the charactron display tube. Nowhere, to my knowledge, however, is there anything approaching the flexibility and convenience of the pencil and doodle pad or the chalk and blackboard used by men in technical discussion.

1) Desk-Surface Display and Control: Certainly, for effective man-computer interaction, it will be necessary for the man and the computer to draw graphs and pictures and to write notes and equations to each other on the same display surface. The man should be able to present a function to the computer, in a rough but rapid fashion, by drawing a graph. The computer should read the man’s writing, perhaps on the condition that it be in clear block capitals, and it should immediately post, at the location of each hand-drawn symbol, the corresponding character as interpreted and put into precise type-face. With such an input-output device, the operator would quickly learn to write or print in a manner legible to the machine. He could compose instructions and subroutines, set them into proper format, and check them over before introducing them finally into the computer’s main memory. He could even define new symbols, as Gilmore and Savell [14] have done at the Lincoln Laboratory, and present them directly to the computer. He could sketch out the format of a table roughly and let the computer shape it up with precision. He could correct the computer’s data, instruct the machine via flow diagrams, and in general interact with it very much as he would with another engineer, except that the “other engineer” would be a precise draftsman, a lightning calculator, a mnemonic wizard, and many other valuable partners all in one.

2) Computer-Posted Wall Display: In some technological systems, several men share responsibility for controlling vehicles whose behaviors interact. Some information must be presented simultaneously to all the men, preferably on a common grid, to coordinate their actions. Other information is of relevance only to one or two operators. There would be only a confusion of uninterpretable clutter if all the information were presented on one display to all of them. The information must be posted by a computer, since manual plotting is too slow to keep it up to date.

The problem just outlined is even now a critical one, and it seems certain to become more and more critical as time goes by. Several designers are convinced that displays with the desired characteristics can be constructed with the aid of flashing lights and time-sharing viewing screens based on the light-valve principle.

The large display should be supplemented, according to most of those who have thought about the problem, by individual display-control units. The latter would permit the operators to modify the wall display without leaving their locations. For some purposes, it would be desirable for the operators to be able to communicate with the computer through the supplementary displays and perhaps even through the wall display. At least one scheme for providing such communication seems feasible.

The large wall display and its associated system are relevant, of course, to symbiotic cooperation between a computer and a team of men. Laboratory experiments have indicated repeatedly that informal, parallel arrangements of operators, coordinating their activities through reference to a large situation display, have important advantages over the arrangement, more widely used, that locates the operators at individual consoles and attempts to correlate their actions through the agency of a computer. This is one of several operator-team problems in need of careful study.

3) Automatic Speech Production and Recognition: How desirable and how feasible is speech communication between human operators and computing machines? That compound question is asked whenever sophisticated data-processing systems are discussed. Engineers who work and live with computers take a conservative attitude toward the desirability. Engineers who have had experience in the field of automatic speech recognition take a conservative attitude toward the feasibility. Yet there is continuing interest in the idea of talking with computing machines. In large part, the interest stems from realization that one can hardly take a military commander or a corporation president away from his work to teach him to type. If computing machines are ever to be used directly by top-level decision makers, it may be worthwhile to provide communication via the most natural means, even at considerable cost.

Preliminary analysis of his problems and time scales suggests that a corporation president would be interested in a symbiotic association with a computer only as an avocation. Business situations usually move slowly enough that there is time for briefings and conferences. It seems reasonable, therefore, for computer specialists to be the ones who interact directly with computers in business offices.

The military commander, on the other hand, faces a greater probability of having to make critical decisions in short intervals of time. It is easy to overdramatize the notion of the ten-minute war, but it would be dangerous to count on having more than ten minutes in which to make a critical decision. As military system ground environments and control centers grow in capability and complexity, therefore, a real requirement for automatic speech production and recognition in computers seems likely to develop. Certainly, if the equipment were already developed, reliable, and available, it would be used.

In so far as feasibility is concerned, speech production poses less severe problems of a technical nature than does automatic recognition of speech sounds. A commercial electronic digital voltmeter now reads aloud its indications, digit by digit. For eight or ten years, at the Bell Telephone Laboratories, the Royal Institute of Technology (Stockholm), the Signals Research and Development Establishment (Christchurch), the Haskins Laboratory, and the Massachusetts Institute of Technology, Dunn [6], Fant [7], Lawrence [15], Cooper [3], Stevens [26], and their co-workers, have demonstrated successive generations of intelligible automatic talkers. Recent work at the Haskins Laboratory has led to the development of a digital code, suitable for use by computing machines, that makes an automatic voice utter intelligible connected discourse [16].

The feasibility of automatic speech recognition depends heavily upon the size of the vocabulary of words to be recognized and upon the diversity of talkers and accents with which it must work. Ninety-eight per cent correct recognition of naturally spoken decimal digits was demonstrated several years ago at the Bell Telephone Laboratories and at the Lincoln Laboratory [4], [9]. To go a step up the scale of vocabulary size, we may say that an automatic recognizer of clearly spoken alpha-numerical characters can almost surely be developed now on the basis of existing knowledge. Since untrained operators can read at least as rapidly as trained ones can type, such a device would be a convenient tool in almost any computer installation.

For real-time interaction on a truly symbiotic level, however, a vocabulary of about 2000 words, e.g., 1000 words of something like basic English and 1000 technical terms, would probably be required. That constitutes a challenging problem. In the consensus of acousticians and linguists, construction of a recognizer of 2000 words cannot be accomplished now. However, there are several organizations that would happily undertake to develop an automatic recognizer for such a vocabulary on a five-year basis. They would stipulate that the speech be clear speech, dictation style, without unusual accent.

Although detailed discussion of techniques of automatic speech recognition is beyond the present scope, it is fitting to note that computing machines are playing a dominant role in the development of automatic speech recognizers. They have contributed the impetus that accounts for the present optimism, or rather for the optimism presently found in some quarters. Two or three years ago, it appeared that automatic recognition of sizeable vocabularies would not be achieved for ten or fifteen years; that it would have to await much further, gradual accumulation of knowledge of acoustic, phonetic, linguistic, and psychological processes in speech communication. Now, however, many see a prospect of accelerating the acquisition of that knowledge with the aid of computer processing of speech signals, and not a few workers have the feeling that sophisticated computer programs will be able to perform well as speech-pattern recognizers even without the aid of much substantive knowledge of speech signals and processes. Putting those two considerations together brings the estimate of the time required to achieve practically significant speech recognition down to perhaps five years, the five years just mentioned.

References

[1] A. Bernstein and M. deV. Roberts, “Computer versus chess-player,” Scientific American, vol. 198, pp. 96-98; June, 1958.

[2] W. W. Bledsoe and I. Browning, “Pattern Recognition and Reading by Machine,” presented at the Eastern Joint Computer Conf, Boston, Mass., December, 1959.

[3] F. S. Cooper, et al., “Some experiments on the perception of synthetic speech sounds,” J. Acoust Soc. Amer., vol.24, pp.597-606; November, 1952.

[4] K. H. Davis, R. Biddulph, and S. Balashek, “Automatic recognition of spoken digits,” in W. Jackson, Communication Theory, Butterworths Scientific Publications, London, Eng., pp. 433-441; 1953.

[5] G. P. Dinneen, “Programming pattern recognition,” Proc. WJCC, pp. 94-100; March, 1955.

[6] H. K. Dunn, “The calculation of vowel resonances, and an electrical vocal tract,” J. Acoust Soc. Amer., vol. 22, pp.740-753; November, 1950.

[7] G. Fant, “On the Acoustics of Speech,” paper presented at the Third Internatl. Congress on Acoustics, Stuttgart, Ger.; September, 1959.

[8] B. G. Farley and W. A. Clark, “Simulation of self-organizing systems by digital computers,” IRE Trans. on Information Theory, vol. IT-4, pp. 76-84; September, 1954.

[9] J. W. Forgie and C. D. Forgie, “Results obtained from a vowel recognition computer program,” J. Acoust Soc. Amer., vol. 31, pp. 1480-1489; November, 1959.

[10] E. Fredkin, “Trie memory,” Communications of the ACM, pp. 490-499; September, 1960.

[11] R. M. Friedberg, “A learning machine: Part I,” IBM J. Res. & Dev., vol.2, pp.2-13; January, 1958.

[12] H. Gelernter, “Realization of a Geometry Theorem Proving Machine.” Unesco, NS, ICIP, 1.6.6, Internatl. Conf. on Information Processing, Paris, France; June, 1959.

[13] P. C. Gilmore, “A Program for the Production of Proofs for Theorems Derivable Within the First Order Predicate Calculus from Axioms,” Unesco, NS, ICIP, 1.6.14, Internatl. Conf. on Information Processing, Paris, France; June, 1959.

[14] J. T. Gilmore and R. E. Savell, “The Lincoln Writer,” Lincoln Laboratory, M. I. T., Lexington, Mass., Rept. 51-8; October, 1959.

[15] W. Lawrence, et al., “Methods and Purposes of Speech Synthesis,” Signals Res. and Dev. Estab., Ministry of Supply, Christchurch, Hants, England, Rept. 56/1457; March, 1956.

[16] A. M. Liberman, F. Ingemann, L. Lisker, P. Delattre, and F. S. Cooper, “Minimal rules for synthesizing speech,” J. Acoust Soc. Amer., vol. 31, pp. 1490-1499; November, 1959.

[17] A. Newell, “The chess machine: an example of dealing with a complex task by adaptation,” Proc. WJCC, pp. 101-108; March, 1955.

[18] A. Newell and J. C. Shaw, “Programming the logic theory machine.” Proc. WJCC, pp. 230-240; March, 1957.

[19] A. Newell, J. C. Shaw, and H. A. Simon, “Chess-playing programs and the problem of complexity,” IBM J. Res. & Dev., vol. 2, pp. 320-335; October, 1958.

[20] A. Newell, H. A. Simon, and J. C. Shaw, “Report on a general problem-solving program,” Unesco, NS, ICIP, 1.6.8, Internatl. Conf. on Information Processing, Paris, France; June, 1959.

[21] J. D. North, “The rational behavior of mechanically extended man”, Boulton Paul Aircraft Ltd., Wolverhampton, Eng.; September, 1954.

[22] O. G. Selfridge, “Pandemonium, a paradigm for learning,” Proc. Symp. Mechanisation of Thought Processes, Natl. Physical Lab., Teddington, Eng.; November, 1958.

[23] C. E. Shannon, “Programming a computer for playing chess,” Phil. Mag., vol.41, pp.256-75; March, 1950.

[24] J. C. Shaw, A. Newell, H. A. Simon, and T. O. Ellis, “A command structure for complex information processing,” Proc. WJCC, pp. 119-128; May, 1958.

[25] H. Sherman, “A Quasi-Topological Method for Recognition of Line Patterns,” Unesco, NS, ICIP, H.L.5, Internatl. Conf. on Information Processing, Paris, France; June, 1959

[26] K. N. Stevens, S. Kasowski, and C. G. Fant, “Electric analog of the vocal tract,” J. Acoust. Soc. Amer., vol. 25, pp. 734-742; July, 1953.

[27] Webster’s New International Dictionary, 2nd ed., G. and C. Merriam Co., Springfield, Mass., p. 2555; 1958.


