Wednesday, July 11, 2012

Computer Science

  At the risk of boring some readers of The Berkeley Write, I've decided to review the remarkable history of computer science, from its barely existing a half century ago to being at the core of almost everything now.  That history is remarkable not merely because computers have become so ubiquitous, but especially because computer science has fundamentally changed the way we solve problems. 

  When I was an electrical engineering undergraduate at MIT in the late 1940s, I was dumbfounded to find that a classmate was studying Boolean algebra, an algebra of logic.  As one who was steeped in the mathematics of the continuum, I asked him why he was interested in an algebra having only two values—0 and 1.  He told me that he wanted to study digital computers. 

  I couldn't understand why he would be wasting his time.  Although I had worked on electro-mechanical analog computers during the summer after my freshman year, I scarcely knew what a digital computer was, and had no inkling of the revolution they would shortly ignite.  In my own exculpation, I should say that there were only a handful of them in existence at the time, and that even Thomas Watson, the longtime head of IBM, had a few years earlier estimated their ultimate world-wide commercial market to be about five machines.  (To paraphrase John Kenneth Galbraith's remark about economic forecasting: The only function of engineering forecasting is to make astrology look respectable.)

  In 1960, when I joined the faculty at UC Berkeley, digital computers were much more widespread, but the term "computer science" was still an oxymoron: few could perceive any science there at all.  The study of computers at Cal, as elsewhere, was embedded in the department of electrical engineering, where the few courses on computers covered only the design of their electronic circuitry and peripherals, along with elementary programming.  Later in the decade, a few faculty members started going beyond those engineering issues, into basic research on the limits of digital computation and the theory of programming languages—i.e., computer science.

  Computer science has since come a very long way, now sharing equally with mathematics the role of foundation for all the sciences, and being central to many non-science disciplines as well.  Most particularly, it has largely changed our way of solving problems, from classical analysis to algorithmic procedures.

  In mathematics and the sciences, classical analysis is characterized by theorem proving and by the formulation of problems so that they can be solved by equations.  You had doses of that way of thinking in your high school geometry, algebra and physics courses, if not further.  For example, you encountered the quadratic equation y = ax² + bx + c (often arising in the physics of motion), and were asked to find the values of x for which y = 0.  The two solutions of that problem turned out to be x = −b/(2a) ± √(b² − 4ac)/(2a).  That's pure analysis.
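
  For readers who want to see where that formula comes from, the standard route is completing the square; this is textbook material rather than anything specific to this essay:

\[
ax^2 + bx + c = 0
\;\Longrightarrow\;
\left(x + \frac{b}{2a}\right)^2 = \frac{b^2 - 4ac}{4a^2}
\;\Longrightarrow\;
x = -\frac{b}{2a} \pm \frac{\sqrt{b^2 - 4ac}}{2a}.
\]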

  On the other hand, algorithmic thinking is characterized by a step-by-step, iterative procedure, the modus operandi of computers.  The process for such thinking is often described by a flow chart such as that below, which is taken from one of the forms of the federal income tax return.  You start at the box at the upper left, and step through the chart by answering a series of yes/no questions.  If you get to the box labeled "You must figure your penalty," you then proceed to a set of forms that, rather than giving a formula for calculation of the penalty, guides you through a sequence of "if-then" statements to make the calculation, one step at a time.  That's a pure algorithmic procedure.
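
  To make the contrast with the quadratic formula concrete, here is a minimal sketch of my own (an illustrative example, not drawn from the tax form): the first function is pure analysis, a closed-form formula, while the second pins down a root by answering the same yes/no question over and over, one step at a time.

```python
def solve_by_formula(a, b, c):
    """Classical analysis: the closed-form quadratic formula."""
    root = (b * b - 4 * a * c) ** 0.5
    return (-b + root) / (2 * a), (-b - root) / (2 * a)


def solve_by_bisection(a, b, c, lo, hi, tol=1e-9):
    """Algorithmic thinking: repeatedly ask one yes/no question
    ("does the sign change on the left half of the interval?")
    and halve the interval accordingly, one step at a time."""
    def f(x):
        return a * x * x + b * x + c

    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if f(lo) * f(mid) <= 0:   # yes: a root lies in the left half
            hi = mid
        else:                     # no: it lies in the right half
            lo = mid
    return (lo + hi) / 2.0


print(solve_by_formula(1, -3, 2))            # (2.0, 1.0)
print(solve_by_bisection(1, -3, 2, 0, 1.5))  # approximately 1.0
```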


  Computer science now occupies a central role in the physical sciences and elsewhere because many problems are too complex for mathematical analysis—for example, the behavior of chaotic systems such as the weather or population dynamics, or of economic systems with a large number of variables and participants.  To investigate such a problem, a sophisticated algorithmic procedure is designed to model the system at hand, and the procedure is then run on a computer to provide insight into the system's behavior.  It's not just a matter of using the computer as a glorified calculator; to get accurate results in a reasonable amount of time, and in a usable form, investigators must be as steeped in the fundamentals of computer science as their predecessors were in mathematics.
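
  As one small illustration (my own example, chosen because population dynamics is mentioned above): the logistic map, a standard toy model of chaotic population growth, has no general closed-form solution, yet a few lines of iteration reveal its behavior, including the extreme sensitivity to initial conditions that makes such systems resistant to classical analysis.

```python
def logistic_trajectory(r, x0, steps):
    """Iterate the logistic map x_next = r * x * (1 - x), a standard
    toy model of population dynamics (x is the population expressed
    as a fraction of its maximum)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs


# In the chaotic regime (r = 4), two nearly identical starting
# populations end up in completely different places after a few
# dozen steps -- the hallmark of chaos.
a = logistic_trajectory(4.0, 0.200000, 50)
b = logistic_trajectory(4.0, 0.200001, 50)
print(a[-1], b[-1])
```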

  Even though computer science didn't start flourishing until the 1960s, it had its ur-moment in 1935 in England in the work of Alan Turing.  He posited a machine that manipulates symbols in a step-by-step fashion.  The device, now called a Turing machine, is quintessentially algorithmic.  Turing never built one.  His genius consisted in specifying an iterative procedure that even now models any computer program and therefore any computer.
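
  Turing's machine is simple enough to simulate in a few lines.  The sketch below is my own minimal rendering, with a made-up example machine: a table of rules tells the device what symbol to write, which way to move, and what state to enter next, one cell of the tape at a time.

```python
def run_turing_machine(rules, tape, state, head=0, max_steps=10_000):
    """Simulate a one-tape Turing machine.  `rules` maps
    (state, symbol) -> (symbol_to_write, move, next_state),
    where move is -1 for left and +1 for right."""
    cells = dict(enumerate(tape))        # sparse tape: position -> symbol
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = cells.get(head, "_")    # "_" stands for a blank cell
        write, move, state = rules[(state, symbol)]
        cells[head] = write
        head += move
    return "".join(cells[i] for i in sorted(cells))


# A tiny example machine: scan rightward, flipping each bit, and halt
# when the blank at the end of the input is reached.
flip_bits = {
    ("scan", "0"): ("1", +1, "scan"),
    ("scan", "1"): ("0", +1, "scan"),
    ("scan", "_"): ("_", -1, "halt"),
}
print(run_turing_machine(flip_bits, "10110", "scan"))   # prints 01001_
```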

  Turing, a superbly talented mathematician, also used the concept of his machine to address a question his contemporary Kurt Gödel had tackled: the completeness of mathematics.  Gödel's Incompleteness Theorem showed that some true mathematical statements can't be mathematically proved to be true.  Turing approached the question differently, from the viewpoint of computability: whether a problem can be solved by an algorithm at all.  In a result very much related to Gödel's theorem, he showed that some precisely posed problems cannot be solved by any algorithm in any finite amount of time.
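
  The flavor of Turing's argument can be conveyed in a few lines of code.  What follows is my rendering of the standard textbook diagonal argument; the oracle halts() is hypothetical, and showing that it cannot exist is the whole point.

```python
def halts(program_source, argument):
    """Hypothetical oracle that always decides, in finite time, whether
    the given program halts on the given input.  Turing's theorem says
    no such function can exist; this stub exists only to make the
    sketch below well-formed."""
    raise NotImplementedError


def contrarian(program_source):
    """Do the opposite of whatever halts() predicts the program will do
    when fed its own source code."""
    if halts(program_source, program_source):
        while True:        # predicted to halt?  then loop forever
            pass
    return "halted"        # predicted to loop forever?  then halt at once


# Now ask: does contrarian halt when given its own source code?
# If halts() answers yes, contrarian loops forever; if it answers no,
# contrarian halts immediately.  Either answer is wrong, so no
# correct halts() can be written -- the problem is uncomputable.
```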

  The question of computability is now front and center in computer science.  Even if a problem is theoretically computable in a finite amount of time, the time needed may be unreasonably large, given the exabyte (quintillion byte) data sets that are becoming common in such scientific and commercial operations as the Large Hadron Collider, the Sloan Digital Sky Survey, the World Data Centre for Climate, Amazon, YouTube, Google, etc.  (The first of these generates a quadrillion bytes of data—more than the information in all the world's libraries—every second!)  This "big data" challenge goes to the heart of computer science: how to effectively organize huge data sets, and how to formulate algorithms that handle them both efficiently in computation time and accurately as the size of the data set increases. 
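
  To give a rough sense of why the efficiency of an algorithm matters at this scale, here is a back-of-the-envelope calculation of my own, assuming (purely for illustration) a machine that performs a billion elementary operations per second:

```python
import math

n = 10 ** 18                    # items in an exabyte-scale data set
ops_per_second = 10 ** 9        # assumed: one billion operations per second
seconds_per_year = 3600 * 24 * 365

for label, total_ops in [("n log n (a good sorting algorithm)", n * math.log2(n)),
                         ("n squared (a naive pairwise comparison)", n * n)]:
    years = total_ops / ops_per_second / seconds_per_year
    print(f"{label}: roughly {years:.1e} years on one machine")
```

  Even the "good" algorithm calls for thousands of machine-years, which is why such computations are spread across many machines in parallel; the naive one would outlast the universe.  That gap is what the search for efficient algorithms is about.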

  An indication of the current importance of big data is the establishment this year of a major research institute at UC Berkeley with a private $60-million grant from the Simons Foundation.  The institute will bring together researchers from around the world to study efficient algorithmic approaches to such important big-data subjects as evolution, the immune system, maintaining privacy while doing data analysis, and climate modeling. You will surely hear much more about big-data research in the next decade.

  So, to repeat: in my own professional lifetime, computer science has come from barely existing at all to being "at the core of almost everything," even of disciplines that were not previously thought of as particularly quantitative.  I find it a stunning story.