Introduction

What is a computer?

A computer is a device that systematically transforms and manipulates data by executing a sequence of user-supplied instructions. These instructions constitute a program. Unlike a mere calculator, a computer can store and execute its program autonomously, carrying out logical decisions based on the current state of its data.

Calculating machines of considerable sophistication have been found dating back at least 2000 years. The Antikythera mechanism, for example, discovered in a shipwreck off the Greek coast, was an analogue calculator that could determine the positions of astronomical bodies. As far as we know, the first truly programmable computer was the Analytical Engine, envisioned by Charles Babbage in the 1830s but never actually built. Babbage’s design called for a machine consisting of tens of thousands of kilograms of high-precision gears and switches whose operation was programmable using punch cards. By this reckoning, the very first programmer was Ada Lovelace, daughter of Lord Byron and herself a mathematician and friend of Babbage. She devised a program for Babbage’s machine that would calculate the Bernoulli numbers.

Despite this history, our common understanding of computation as executed by machines is rather recent. Many of the fundamental algorithms for numerical computation were developed by Newton and his contemporaries and were intended to be carried out by hand. Indeed, the word computer in English originally referred to a person who performed computations. During World War II, for example, huge teams of human computers—mostly women, working in parallel with slide rules and adding machines—performed the tedious nuclear fission calculations required by the Manhattan Project.

Mechanical and human computers, which were both slow and error-prone, were eventually superseded by electronic devices. Early experiments with analogue-electronic computers were abandoned in favour of a digital architecture that relies on discrete rather than continuous internal states. Since the 1940s, almost all computers have been based on binary (two-state digital) logic, realized first with relays and vacuum tubes and later made cheap and reliable by the invention of the transistor. A computer today is built from integrated circuits etched into a semiconductor wafer. It consists of a large bank of digital memory (storage units that each hold an on/off state) and one or more processing units that carry out logical, bit-wise, and arithmetic transformations on the data in memory.

A modern computer is a general-purpose computing device, and the limits to what it can compute are entirely practical in nature. Alan Turing, a 20th-century British mathematician, was one of the first to formalize the notion of computability. He introduced the concept of the Turing machine: an infinite strip of cells containing symbols (the tape), operated on by a device (the head) that can move along the tape and read, write, and erase symbols according to a set of instructions. Turing demonstrated that his idealized machine could compute anything that is in principle computable. This insight is relevant because a real computer is equivalent to a Turing machine, apart from its limitations of storage and speed. Thus, given unlimited memory and unlimited time to compute, it can perform any computation you like.
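To make this concrete, here is a minimal sketch of a Turing machine simulator in Python. The state names, the blank symbol, and the bit-flipping transition table are invented for this illustration; any real machine would supply its own.

    # Minimal Turing machine simulator (illustrative sketch).
    # A transition table maps (state, symbol) -> (symbol to write, move, next state).

    def run_turing_machine(transitions, tape, state="start", halt="halt", max_steps=10_000):
        cells = dict(enumerate(tape))     # sparse tape; missing cells read as blank "_"
        head = 0
        for _ in range(max_steps):
            if state == halt:
                break
            symbol = cells.get(head, "_")
            write, move, state = transitions[(state, symbol)]
            cells[head] = write                      # write (or erase) a symbol
            head += 1 if move == "R" else -1         # move the head one cell
        return "".join(cells[i] for i in sorted(cells))

    # Example: a machine that flips every bit of a binary string, then halts.
    flip_bits = {
        ("start", "0"): ("1", "R", "start"),
        ("start", "1"): ("0", "R", "start"),
        ("start", "_"): ("_", "R", "halt"),
    }
    print(run_turing_machine(flip_bits, "10110"))    # prints 01001_

In principle, enlarging the transition table with more states and symbols is all that is needed to express any computation, however elaborate.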

In practice, the constraints imposed by the computer’s finite resources are quite severe. The art of scientific computing is to design programs that overcome these hurdles through a judicious choice of algorithms and data structures. What considerations go into such a choice will be an important emphasis of this course.

Programming languages

Every computer must provide some method by which the user can specify the set of instructions to be executed. In the past, this might have involved setting switches or preparing punch cards. Today, computer programs are typically stored on disk (magnetic or solid-state media) and loaded into memory when they are to be run. The instructions themselves are bit patterns (machine code) that the processor associates with various operations; assembly language is a human-readable notation for that machine code.

Programming languages occupy an intermediate space between the abstract algorithm and the machine code. Most languages consist of a small set of English keywords and punctuation symbols arranged according to prescribed rules. Another program—either a compiler or an interpreter—translates the program into machine code that can be executed. (In the case of a compiler, the translation is carried out in advance; an interpreter translates on the fly, although these days the distinction is beginning to blur.) Programming languages have several advantages: they are human-readable and portable (from machine to machine), and they abstract away the low-level operations of the processor.

Keep in mind that a language merely defines a syntax. The programmer has to provide the content. In any language it is always possible to construct grammatically correct nonsense: “Colourless green ideas sleep furiously.” Analogously, a computer program can be valid but not correct. In other words, a program may compile but not behave as you intended.
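As a toy illustration of this distinction, the Python function below is perfectly legal, yet it does not compute what its name promises (the function and numbers are invented for this example):

    # A legal but incorrect program: it runs without complaint, yet the loop
    # stops one element early, so the "average" silently ignores the last value.
    def average(values):
        total = 0.0
        for i in range(len(values) - 1):   # bug: off-by-one
            total += values[i]
        return total / len(values)

    print(average([2.0, 4.0, 6.0]))        # prints 2.0, not the intended 4.0

No compiler or interpreter will catch this kind of mistake; only careful reasoning and testing will.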

Many languages have been developed over the last sixty years. A few of the more popular ones are listed below.

1950s: FORTRAN, LISP, COBOL
1960s: ALGOL, Simula, BASIC
1970s: Forth, Pascal, Smalltalk, C, Prolog
1980s: Ada, C++, Eiffel, Miranda, Objective-C, Perl, Erlang
1990s: Haskell, Python, Lua, Java, PHP, Ruby
2000s: D, C#, Scala, F#, Clojure, Go
2010s: Julia, Rust

The progression of languages follows the introduction of new organizational concepts, such as structured, functional, and object-oriented programming; new expressive patterns, such as generics, closures, meta-programming, and mix-ins; and new language guarantees for concurrency and memory safety. Many of these languages have continued to evolve over time. FORTRAN, in particular, has a long history as a language for scientific computing. In its modern incarnation, it bears only a passing resemblance to the original.

There are two primary dichotomies that divide languages into groups: languages are either strongly or weakly typed, and either interpreted or compiled. Loosely speaking, the weakly typed, interpreted ones—this includes most scripting languages—offer the greatest flexibility and ease of use. Python is a good example. Languages that are strongly typed and compiled, on the other hand, tend to emphasize code correctness and performance. C, C++, D, Go, and Rust belong to this latter category. Julia is an example of a language that somewhat transcends these dichotomies.
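The short Python snippet below gives a feel for the interpreted, dynamically typed end of the spectrum (the function and values are invented for illustration). The same function happily accepts numbers or strings, but a type mistake surfaces only when the offending line actually runs; a strongly typed, compiled language such as C or Rust would reject it before the program ever started.

    def double(x):
        return x + x            # no type declarations: works for anything supporting +

    print(double(3))            # 6
    print(double("ab"))         # 'abab': the flexibility of dynamic typing
    print(double(3) + "!")      # TypeError, discovered only at run time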

How does computation relate to physics?

The working premise in physics is that our universe is governed by laws, expressible in mathematical form. Fundamental physical laws typically describe point particles or continuous fields that evolve in space and time according to ordinary or partial differential equations. In complex systems, especially large aggregates of many interacting particles, the effective degrees of freedom may be quite unusual and are described by emergent laws that may even be stochastic or rule-based. One aspect of physics is the discovery and formulation of such laws. Another—which comprises much of the work that goes on in the discipline—is the application of established laws to describe and predict the behaviour of particular physical systems.

For practical purposes, the laws governing matter from the size of the atomic nucleus up to the size of our galaxy are known. For almost any system of interest, we can write down equations (of varying degrees of sophistication) that model its behaviour to the desired accuracy. Solving those equations is another matter. For only a handful of idealized systems, e.g., a harmonic oscillator or a gas of non-interacting particles, can we find a closed-form, analytic solution.

In response to this challenge, theorists have developed a toolbox of controlled approximations: systematic expansions around a known solution (perturbation theory), integrating out fast or short-length-scale degrees of freedom to derive a simpler, low-energy model (effective field theory or the renormalization group), and so on. Sometimes, however, these approaches are inconclusive (giving nonsensical or mutually contradictory results) or unsatisfactory (unable to answer the questions we are most interested in), and the only alternative is to resort to brute force: i.e., to solve the equations numerically on a computer. As computers become increasingly powerful, this option becomes more and more appealing. Lattice quantum chromodynamics (QCD) and simulations of plasmas in a tokamak are good examples of systems that are largely inaccessible by any other means.
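To give a small taste of what “brute force” means in practice, the sketch below integrates the equation of motion of a large-amplitude pendulum, whose restoring force goes as sin θ and which has no elementary closed-form solution, using a simple forward-Euler time step. The parameters, step size, and initial conditions are chosen arbitrarily for illustration.

    import math

    # Brute-force numerics (illustrative sketch): step the pendulum equation
    # theta'' = -(g / L) * sin(theta) forward in time with the Euler method.
    g, L = 9.81, 1.0            # gravitational acceleration, pendulum length
    dt = 1.0e-4                 # time step
    theta, omega = 1.0, 0.0     # initial angle (rad) and angular velocity

    for step in range(int(2.0 / dt)):          # integrate for two seconds
        alpha = -(g / L) * math.sin(theta)     # angular acceleration
        theta += omega * dt                    # update angle using current velocity
        omega += alpha * dt                    # then update the velocity

    print(f"theta(t = 2 s) is approximately {theta:.4f} rad")

A production calculation would use a higher-order or symplectic integrator, but the point stands: a few lines of code stand in for an analytic solution that does not exist.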

Over the last 30 years, computational physics has developed into an important branch of physics, alongside traditional theory and experiment. Realistic ab initio computational studies are now seen as complementary to experiment and can be used to simulate time-scales, length-scales, or parameter regimes (e.g., conditions of extreme temperature or pressure) that are inaccessible in the lab. On the other hand, simulations of semi-realistic toy models have also played an important role in studying the universal properties of various phenomena.

In setting out to “do physics on the computer,” there are some important issues that arise. A modern computer is a digital state machine. That is, it is a device that moves from one discrete internal state to another according to specific rules. Somehow, the mathematical problem at hand must be translated into a procedure that the computer can execute. In attempting this translation, one runs into some fundamental limitations:

  1. A computer is finite. It cannot perform algebraic operations to unlimited accuracy. It cannot even represent all the real numbers! The algorithm and data structures you devise must be efficient enough that the computer can complete its calculation in a reasonable amount of time and within the bounds of the available memory. (See the first sketch after this list.)

  2. A computer is deterministic. Barring some malfunction, the same machine given the same input will step through the same sequence of states each time. One consequence is that it cannot produce any truly random process; the best it can do is generate pseudo-random sequences. (See the second sketch after this list.)

  3. A computer knows nothing about mathematics. Every mathematical concept, operation, or procedure must first be recast in terms of machine structures and operations.

  4. A computer blindly follows its instructions. It will not object to flawed but legal instructions. It is up to the programmer to ensure that the algorithm is conceptually correct and that its implementation is sound. Good coding practices can help prevent “bugs” from creeping into your programs.
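To illustrate the first point, standard double-precision floating-point numbers carry roughly sixteen significant decimal digits, so even familiar decimal values and identities cannot be represented exactly:

    # Finite precision in action: none of 0.1, 0.2, or 0.3 is exactly
    # representable in binary, and adding 1 to 1e16 falls below its precision.
    print(0.1 + 0.2 == 0.3)       # False
    print(0.1 + 0.2)              # 0.30000000000000004
    print(1e16 + 1.0 == 1e16)     # True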
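To illustrate the second point, the “random” numbers produced by a deterministic machine are really pseudo-random: re-seeding the generator with the same value replays exactly the same sequence.

    import random

    # Determinism in action: the same seed reproduces the same "random" draws.
    random.seed(42)
    first = [random.random() for _ in range(3)]
    random.seed(42)
    second = [random.random() for _ in range(3)]
    print(first == second)        # True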

All of these are important issues that will be discussed in these notes.