The following glossary has been put together to attempt to clarify expressions commonly used in and around parallel computing for people who are not familiar with the area. Experts will notice that it over-simplifies many of the definitions, and some are technically not quite true. It should not be used as more than an indication of what the phrases mean. [ Aside: for example, if anyone can provide a mathematically precise definition of either causal consistency or strong memory consistency that doesn't classify at least some designs 'the wrong way', I should very much like to see it! ]

Access -- For data, any action on it: reading, writing or updating
ACML -- AMD Core Math Library - AMD's mathematical library, including tuned versions of the BLAS and LAPACK (see also BLAS, LAPACK, MKL and NAG)
Ada -- A programming language, originally designed for high reliability for the DoD, named after Ada, Countess of Lovelace
Altivec -- The SIMD instruction set used on IBM's POWER systems (see also SSE)
AMD -- The main maker of Intel-compatible CPUs (see also Intel)
AMD64 -- The CPU architecture that almost all modern systems use (see x86 for more information)
API -- Application Programming Interface (the syntax, names and purpose needed for a programmer to use a facility, but usually omitting any detailed specification)
Acquire -- See release-acquire
Affinity -- Either when some memory, a device or other resource is associated with a system thread, or when a logical (program) thread or process is associated with a CPU core
Alias/Aliasing -- When two names, pointers etc. refer to the same or overlapping objects, especially when they are in different threads
Architecture -- The abstract design of something, usually computer hardware but sometimes programming interfaces
ASCI -- Accelerated Strategic Computing Initiative (a USA slush fund for HPC computing, set up to simulate nuclear bomb testing)
Asynchronous/Asynchronism -- Performing operations at an unspecified time, which may or may not be in parallel (see also synchronous)
Attached Processor -- A separate CPU which is attached to (say) a workstation to deliver special functionality or high-performance facilities
Atomic -- An action which, once it starts, happens completely: no intermediate state can be observed and no external change will affect it; also a variable for which all accesses are atomic (see the sketch below) - note that many people use it to imply coherence, but that is not always the case
Autoparallelisation -- When a compiler takes a serial program and makes it run in parallel, with no code changes needed by the user, invariably using some form of threading
AVX -- Advanced Vector Extensions - the latest and most powerful SIMD extensions in the x86 instruction set, currently available on the Xeon Phi and latest Intel CPUs - the first such extension was MMX, followed by several versions of SSE, both of which are still supported
Background Process -- A process that is run asynchronously, leaving the initiating process free to do something else - in Unix, started by a command that ends in an '&' and often incorrectly called a job
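Example: a minimal sketch of the 'Atomic' entry above, assuming a C++11 (or later) compiler; the names are illustrative only.

    #include <atomic>
    #include <iostream>
    #include <thread>

    std::atomic<long> atomic_count{0};   // increments are indivisible; a plain
                                         // 'long' here would give a data race

    void worker()
    {
        for (int i = 0; i < 100000; ++i)
            ++atomic_count;              // safe: an atomic read-modify-write
    }

    int main()
    {
        std::thread a(worker), b(worker);
        a.join();
        b.join();
        std::cout << atomic_count << "\n";   // always prints 200000
    }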
Batch -- When a job is executed without interacting with anything other than files and similar devices - i.e. not with a user, the network, other jobs or the application that started it
Binding -- A specification of an abstract (semi-formal) interface design, in terms of an explicit API for a particular programming language
BLAS -- Basic Linear Algebra Subprograms - a standard interface for basic operations on real and complex vectors and matrices (see also ACML, LAPACK, MKL and NAG, and the sketch below)
Block/Blocking -- In I/O and message passing, when a transfer does not return until the data has been copied - for writes, it may be in a system buffer and may not have reached its destination
BSP -- Bulk Synchronous Parallel - a very simple parallel model developed by Leslie Valiant
C++ Threads -- The threading facilities built into the C++ language from C++11 onwards (see also POSIX threads)
C++/C++03/C++11 -- The C++ programming language and the various versions of the standard defined by ISO SC22/WG21
C/C90/C99/C11 -- The C programming language and the various versions of the standard defined by ISO SC22/WG14
Cache -- A faster form of memory, used to keep a copy of the most recently used locations in main memory
Cache line -- The unit of memory that is copied into or out of the cache - typically 32-128 bytes, occasionally 256 or more
Causal consistency -- For data access, the property that apparent 'time travel' cannot occur (see also sequential consistency); note that causal consistency is not a well-defined term
Child -- A thread, process or program that was created and (usually) is controlled by its parent
Cilk/Cilk Plus -- Intel's language extensions to C++ intended for shared-memory parallel programming; also the compiler for them
Client -- A program that makes requests of a server - think of a Web browser or FTP command
Clock rate -- The frequency at which a CPU starts instructions or a memory controller accesses data - the number of instructions actually issued per second may be larger, if several are issued at once
Cluster -- A system built up of multiple workstations or small servers, typically connected by Ethernet
Coarrays -- See Fortran coarrays
Coherence -- For data access, the property that each thread will see simultaneous parallel actions on data occur in some unspecified order; note that it does not mean that all threads see the same order (see consistency)
Communication -- Any form of data transfer or signalling between two CPUs, processes etc.
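Example: a hedged sketch of calling the BLAS through the common C interface (CBLAS), illustrating the 'BLAS' entry above; the header name and link line vary between implementations (ACML, MKL, OpenBLAS and so on), so treat those details as assumptions.

    #include <cblas.h>
    #include <iostream>

    int main()
    {
        // C = alpha*A*B + beta*C for 2x2 row-major matrices
        double A[] = {1.0, 2.0,
                      3.0, 4.0};
        double B[] = {5.0, 6.0,
                      7.0, 8.0};
        double C[] = {0.0, 0.0,
                      0.0, 0.0};
        cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                    2, 2, 2,        // M, N, K
                    1.0, A, 2,      // alpha, A, lda
                    B, 2,           // B, ldb
                    0.0, C, 2);     // beta, C, ldc
        std::cout << C[0] << " " << C[1] << "\n"
                  << C[2] << " " << C[3] << "\n";   // 19 22 / 43 50
    }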
Condition variables -- See locking
Condor -- A widely-used job scheduler
Consistency -- For data access, the property that all threads' views of parallel data accesses obey some consistency rule (see also causal consistency and sequential consistency)
Controller -- A program or piece of hardware that controls the execution of other programs or hardware (see also harness)
CORBA -- Common Object Request Broker Architecture - a networking interface widely used in commercial applications
Core -- A unit of a CPU that executes a single thread or process
CPU -- Central Processing Unit - usually used to refer to the processing hardware of a computer system (see also GPU and memory)
Cray -- A USA manufacturer of supercomputer systems, specialising in DoD (USA Department of Defense) contracts
Cray SHMEM -- A semi-shared memory communication mechanism, requiring RDMA
Critical Sections -- Sections of code that are automatically locked, because otherwise they might cause a data race
CS -- Computer Science
CUDA -- Compute Unified Device Architecture - NVIDIA's interface specification for using its GPUs as compute processors, available on its GeForce and TESLA ranges of GPUs
DAG -- See Directed Acyclic Graph
Data Affinity -- When some data locations are bound more closely to particular CPU cores on an SMP system than to others
Data distribution -- How the program's data is distributed across multiple processes (or, sometimes, threads or CPU cores)
Dataflow -- An execution design where the programming model describes how data are filtered through actions, rather than specifying an order of execution of actions
Data race -- Non-atomic access to the same or overlapping data by two threads with no intervening synchronisation (see also race condition)
Deadlock -- When a set of threads or processes are stuck, because none can proceed until one of the others has (see also livelock, and the sketch below)
Directed Acyclic Graph -- A mathematical directed graph is a set of nodes connected by one-directional links; a graph is acyclic if there is no possible path from any node back to itself
Distributed memory -- A programming or hardware model where multiple processes run with no shared data, and communicate by message passing or I/O
DMA -- See RDMA
DoD -- USA Department of Defense
Double precision -- For floating-point, roughly twice as much precision as the 'basic' precision - nowadays, typically taking 8 bytes and giving 15-16 significant digits
Duplex I/O -- A connection where data can be passed in both directions
Dynamic process -- A process that is created and destroyed as part of the execution of a program
Email -- Electronic mail
Embarrassingly parallel -- An application that can be parallelised by running multiple separate threads, with very little or no communication between them; the term originated because these applications perform well in parallel benchmarking, no matter how slow the interconnect
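Example: a hedged sketch of the classic lock-ordering deadlock described under 'Deadlock' above, together with one way to avoid it; it assumes a C++17 compiler for std::scoped_lock, and the names are illustrative only.

    #include <mutex>
    #include <thread>

    std::mutex lock_a, lock_b;

    // Deadlock-prone pattern: if one thread takes lock_a then lock_b while
    // another takes lock_b then lock_a, each can end up waiting for ever for
    // the lock that the other already holds, so neither can proceed.

    void safe_worker()
    {
        // One fix: acquire both locks together (or always in the same order).
        std::scoped_lock both(lock_a, lock_b);
        // ... work on the data protected by the two locks ...
    }

    int main()
    {
        std::thread t1(safe_worker), t2(safe_worker);
        t1.join();
        t2.join();
    }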
Encapsulate -- To ensure that all accesses to some data or actions of a particular form (e.g. I/O) are through a small number of interfaces
Erlang -- A programming language with built-in parallelism
Ethernet -- The currently dominant network hardware specification (see also InfiniBand)
Event -- The name used for semaphores in the forthcoming Fortran coarray extension
Farmable -- Like embarrassingly parallel, but with no communication between threads or processes, and none with the harness except reading the parameters and input, and writing the output
Fence -- A very low-level synchronisation mechanism, used to construct higher-level ones; fences must be executed by both threads to get any synchronisation between them
FIFO -- First In, First Out - another name for a queue, usually used under POSIX systems to indicate devices like sockets, pipes and named FIFOs
FFTW -- The Fastest Fourier Transform in the West - a widely used and portable open-source fast Fourier transform library
Firmware -- Software that is stored in read-only memory, and that appears to programmers to be part of the hardware (see also hardware and software)
Fortran coarrays -- The Fortran 2008 parallel programming facility; it is a PGAS model
Fortran/Fortran 66/Fortran 77/Fortran 90/Fortran 2003/Fortran 2008 -- The Fortran programming language and the various versions of the standard defined by ISO SC22/WG5
FPS -- Floating Point Systems - the maker of attached SIMD units that were widely used in the 1980s to enhance the performance of minicomputers and workstations for scientific calculations (see also GPU)
FTP -- File Transfer Protocol - a widely used Internet protocol for transferring files between systems
Future -- In C++, a handle through which the result of an asynchronous task can be collected later (see the sketch below)
Gang scheduling -- When all threads or processes either execute on separate, dedicated CPU cores or none execute
GeForce -- NVIDIA's range of ordinary video cards that can also be used for GPU computation
GNU -- The brand name for software produced by the Free Software Foundation
GNU Ada -- The Ada compiler that is part of the GNU compiler suite, which also includes gcc, g++ and gfortran
GPU -- Graphics Processing Unit; while mainly used to deliver smooth effects for video and gaming, many modern ones (like NVIDIA's) can be used as high-performance SIMD attached processors
GridEngine -- A widely-used job scheduler
GUI -- Graphical User Interface - the sort of interface available on almost all modern computers used for interactive work
Handshake -- A barrier that involves only two agents
'Happens After'/'Happens Before' -- In Java, C++ etc., when two actions are required to occur in a specific order by the language rules
Hardware -- The physical components of a computer system (see also firmware and software)
Harness -- A program, script or other framework that is used to control the execution of other processes (see also controller)
High Performance Computing -- Computing that is limited by the availability of resources, where the primary objective is to do larger calculations, faster
HPC -- High Performance Computing
HPF -- High Performance Fortran - one of the earlier PGAS designs, no longer available, and superseded by either OpenMP or Fortran coarrays
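Example: a minimal sketch of the 'Future' and 'Task' entries, assuming C++11; std::async may run the function on another thread, and get() collects the result later.

    #include <future>
    #include <iostream>

    long sum_up_to(long n)                // the 'task': an ordinary function
    {
        long total = 0;
        for (long i = 1; i <= n; ++i)
            total += i;
        return total;
    }

    int main()
    {
        // Launch the task, potentially on another thread.
        std::future<long> result =
            std::async(std::launch::async, sum_up_to, 1000000L);

        // ... the caller is free to do other work here ...

        std::cout << result.get() << "\n";   // blocks until the task finishes
    }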
I/O -- Input/output - when a program reads or writes data to a file or other file-like device (e.g. a socket or terminal)
IA/IA-32/IA-64 -- Intel Architecture; IA-32 usually means x86 and IA-64 usually means Itanium
IBM -- International Business Machines (see also POWER and Altivec)
Image -- The name used for a thread/process in the Fortran standard
InfiniBand -- The leading 'non-proprietary' specification for HPC interconnects -- it is faster and more expensive than Ethernet
Intel -- The maker of most CPUs currently used for general computing (see also AMD)
Interconnect -- The network used to link a cluster together and enable fast message passing or other communication
Internet servers/Internet services -- The server systems and services accessible via the Internet, such as online shopping ones
ISO -- The International Organization for Standardization, the body responsible for standards like Fortran, C and C++
Itanium -- A computer architecture developed by HP and Intel, intended to replace x86, but which is fast disappearing
Java -- The widely used object-oriented programming language designed by Sun
Job -- A set of commands that specifies the execution of one or more processes and the location of their data (see also task); Unix incorrectly used it for background processes, and that use is widespread
Job scheduler -- A system program that controls where and when jobs are executed
Kernel scheduler -- The part of the operating system kernel that controls where and when threads and processes are executed on the CPU cores it manages
KISS -- Keep It Simple and Stupid - an age-old engineering principle, and perhaps the most important one in computing - the acronym was coined by Kelly Johnson of the Skunk Works and is commonly misquoted as Keep It Simple, Stupid
LAPACK -- Linear Algebra Package - portable, high-quality, open source code for matrix decomposition, solution of simultaneous linear equations and eigensystems (see also ACML, BLAS, MKL and NAG)
Linux -- The Unix-like operating system that is currently used for most scientific computing
Livelock -- When a set of threads or processes are stuck in an infinite loop, all waiting for one of the others to do something (see also deadlock)
Lock/Locking -- To prevent any other thread, process or command getting access to an item of data or facility until the lock is released (unlocked) - there are numerous different forms of this, including condition variables, mutexes, readers/writers locks and semaphores, which are sometimes provided under other names
LSF -- Load Sharing Facility - a widely-used job scheduler
Master/Master-Worker -- An application design where one process (the master) parcels out work to the other processes (the workers)
Matlab -- The widely used matrix programming package
MCS -- Managed Cluster Service - the teaching systems supported by the University of Cambridge's University Information Services
Memory -- The component of a computer used to store the data that is being worked on by a process
Memory Consistency/Memory Model -- The rules stating what guarantees are given for parallel accesses to the same or overlapping data locations, and what rules a program must obey to get defined behaviour
Message passing -- A communication model that involves one process sending messages to another (a bit like a sort of internal Email); see the sketch below
Message Passing Interface -- See MPI
MIC -- Many Integrated Core Architecture (see Xeon Phi)
Microsoft -- The well-known computer company, and its software
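Example: a hedged sketch of the 'Message passing' entry above, using the MPI library (defined below); it assumes an MPI implementation such as MPICH or OpenMPI, compiled with mpicxx and run with something like 'mpirun -np 2'.

    #include <mpi.h>
    #include <iostream>

    int main(int argc, char *argv[])
    {
        MPI_Init(&argc, &argv);

        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   // which process am I?
        MPI_Comm_size(MPI_COMM_WORLD, &size);   // how many processes in total?

        if (rank == 0 && size > 1) {
            double value = 42.0;
            MPI_Send(&value, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            double value;
            MPI_Recv(&value, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            std::cout << "process 1 received " << value << "\n";
        }

        MPI_Finalize();
    }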
MIMD -- Multiple Instruction Multiple Data - when multiple processes execute independently on different data - it is often incorrectly used to mean distributed memory with message passing (see also MPMD)
MKL -- Math Kernel Library - Intel's tuned mathematical library, including tuned versions of the BLAS and LAPACK (see also ACML, BLAS, LAPACK and NAG)
MMX -- Multi-Media Extensions - see AVX
MPI -- Message Passing Interface - the name of the distributed memory message passing library standard that dominates HPC programming on clusters (see also OpenMPI and MPICH)
MPICH -- A widely-used open source MPI implementation (see also OpenMPI)
MPMD -- Multiple Program Multiple Data - when multiple processes execute separate programs on different data
MTBF -- Mean Time Between Failures - a measure of failure rate used when analysing non-repeatable failures
Mutex -- See locking
NAG -- Numerical Algorithms Group - a not-for-profit commercial company, producing probably the most general and high-quality numerical library
NAG SMP -- The version of the NAG library that can make use of multiple cores for extra performance
Named FIFO -- A FIFO that is accessed by a filename (see also FIFO)
Nesting -- When an instance of one construct occurs inside another instance; e.g. when lock B is used in code that already holds lock A
NUMA -- Non-Uniform Memory Architecture - a form of SMP where it takes longer to access some data locations than others from a particular thread -- almost all SMP CPUs nowadays are NUMA (see also data affinity)
NVIDIA/NVIDIA Fermi/NVIDIA Kepler/NVIDIA Tesla -- The well-known manufacturer of video cards, and the names for its high-end GPUs
OpenACC -- A directive-based extension to Fortran, C and C++ for offloading work to attached accelerators such as GPUs; related to, but distinct from, OpenMP's accelerator support
OpenCL -- The most widely-used interface used to program GPUs, callable from C and C++ and indirectly from other languages
OpenMP -- A language extension for Fortran, C and C++ that is commonly used for SMP programming using threads (see the sketch below) - not to be confused with OpenMPI
OpenMPI -- Possibly the most widely-used open source MPI implementation (see also MPICH) - not to be confused with OpenMP
Packet -- A unit of data sent over a network
Parallel Genetic Algorithms -- Parallel search methods based on a simplification of a biological model of genetics
Parallel Global Array Storage -- See PGAS
Parameter Space Searching -- Global optimisation to find the best combination of a set of parameters for some calculation
Parent -- A thread, process or program that creates and (usually) controls its children
Partitioned Global Address Space -- See PGAS
PBS -- Portable Batch System - a widely-used job scheduler
PC -- Personal Computer
PDE -- Partial Differential Equation
Perl -- A widely-used but low-level and complicated scripting language (see also Python)
PGAS -- Partitioned Global Address Space or Parallel Global Array Storage - a hybrid distributed/shared memory model where some arrays can be accessed semi-directly from all processes
Pipe -- A Unix (POSIX) mechanism for one process to pass data to another, using normal I/O facilities (see also FIFO)
POSIX/POSIX standard -- The specification of the Unix-like operating system interface that is fairly closely followed by Linux and other modern Unix systems
POSIX mmap/POSIX shmat/POSIX shmem -- The shared memory segment APIs defined by POSIX
POSIX threads -- The APIs for executing multiple threads with shared memory defined by POSIX
POWER -- IBM's proprietary computer architecture
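Example: a minimal sketch of shared-memory threading with OpenMP, illustrating the 'OpenMP' entry above; it assumes a compiler option such as -fopenmp, and the reduction clause avoids a data race on the shared total.

    #include <omp.h>
    #include <cstdio>

    int main()
    {
        const int n = 1000000;
        double total = 0.0;

        #pragma omp parallel for reduction(+:total)
        for (int i = 0; i < n; ++i)
            total += 1.0 / (i + 1.0);   // each thread sums part of the range,
                                        // and the partial sums are combined

        std::printf("%f computed with up to %d threads\n",
                    total, omp_get_max_threads());
    }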
Process -- In modern usage, the unit of execution that is protected against other processes (whether owned by the same or other users), including not being able to access their data directly; each process may have multiple threads executing
Prolog -- A declarative (logic programming) language, whose execution model has something in common with dataflow designs
Prototype -- A preliminary implementation of a design, intended to find problems that have been missed up to that stage
PVM -- Parallel Virtual Machine - a distributed memory programming environment for multiple workstations, now superseded by MPI
PWF -- Personal Workstation Facility, later Public Workstation Facility - this is now called the MCS
PWF Condor -- The version of Condor used on the PWF
Python -- A widely-used, simple and recommended scripting language
Queue -- A data structure where entries are appended to the end and taken off the beginning; job schedulers often use it for jobs (see also FIFO)
Race Condition -- Two conflicting actions that happen with no intervening synchronisation; this is a generalisation of a data race
RAS -- Reliability, Availability and Serviceability - i.e. the property that a system rarely crashes, misbehaves or is inaccessible
RDMA -- Remote Direct Memory Access - a facility for one distributed memory process to access the memory of another, without that other process needing to do anything
Readers/writers locks -- See locking
Reduction -- A parallel mechanism by which data is (for example) summed across threads or processes
Release/Release-acquire -- A pairwise synchronisation mechanism in which a release operation by one thread synchronises with a matching acquire operation by another, so that everything written before the release is visible after the acquire (see the sketch below)
SC22 -- The ISO subcommittee responsible for all programming languages and programming interfaces
ScaLAPACK -- A parallel form of LAPACK (Linear Algebra Package), based on MPI
Scheduler -- See job scheduler and kernel scheduler
Semaphore -- See locking
Semantics -- The meaning of a program and the restrictions on what is permitted, as distinct from its syntax (i.e. the rules for writing its text)
Sequential -- Occurring in a single order in time, which may be explicit or unspecified; the opposite of parallel (see also synchronous and asynchronous)
Sequential Consistency -- For data access, the property that all parallel data accesses appear to have occurred in some serial order (see 'causal consistency')
Serial -- See sequential
Serial Debugger -- A debugger written to handle only serial programs
Shared Memory -- A programming or hardware model where multiple threads can access each other's data as if it were their own; it is often incorrectly assumed to mean coherence and even consistency
Shared Memory/Shared Memory Processor -- See SMP
Shared Memory Segment -- An area of memory that can be shared between two processes which otherwise do not share memory
Shared Memory Threading -- See threading
SHMEM -- See Cray SHMEM
Sibling -- Two threads, processes or programs with the same parent
SIGPIPE -- A signal sent to indicate that a pipe is broken under POSIX systems
SIMD -- Single Instruction Multiple Data - when a single operation acts on a large number of data items at once, usually some form of vector operation
Simplex I/O -- A connection where data can be passed in only one direction (see also streaming I/O)
Smalltalk -- A language with built-in parallelism and message passing that had a major influence on later parallel designs
SMP -- Shared Memory Processor - a system capable of sharing memory between multiple CPU cores (and hence threads) - originally, it meant Symmetric Multi-Processor, but that usage has disappeared
SOAP -- Simple Object Access Protocol - a networking interface widely used in commercial applications
Socket -- A mechanism for one process to pass data to another, not necessarily on the same system, using normal I/O facilities (see also FIFO)
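Example: a minimal sketch of release-acquire synchronisation in C++11, illustrating the 'Release-acquire' entry above; the release store 'synchronises with' the acquire load, so the consumer is guaranteed to see the payload written before the release. The names are illustrative only.

    #include <atomic>
    #include <iostream>
    #include <thread>

    int payload = 0;                       // ordinary (non-atomic) data
    std::atomic<bool> ready{false};

    void producer()
    {
        payload = 42;                                  // written first
        ready.store(true, std::memory_order_release);  // then published
    }

    void consumer()
    {
        while (!ready.load(std::memory_order_acquire))
            ;                                          // spin until published
        std::cout << payload << "\n";                  // guaranteed to print 42
    }

    int main()
    {
        std::thread c(consumer), p(producer);
        p.join();
        c.join();
    }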
Software -- The programs of a computer system, including the operating system itself (see also firmware and hardware)
Solaris dtrace -- A system debugging mechanism that allows an ordinary user to trace some events that happen in the kernel and are related to their processes
Son of Star Wars -- See ASCI
Spawn -- To create a new (child) process
Specification -- The description of something, in this context often a programming interface
Spin loop -- Where one thread or process waits for another by testing an atomic variable in a tight loop, and exiting when it changes value
SPMD -- Single Program Multiple Data - a form of MIMD where each thread or process runs the same executable
SSE -- Streaming SIMD Extensions - see AVX
Standard -- A specification that is produced by some official body (e.g. ISO) or is widely accepted as the interface to design to
STL -- (C++) Standard Template Library - the old name for the standard library, especially the containers, iterators and algorithms
Strong Memory Model -- Typically, a memory model that guarantees causal consistency
Streaming -- When a program takes a sequence of inputs (e.g. lines), processes them, and writes them as soon as they are ready; many basic Unix utilities (e.g. grep, cat, tr) are streaming (see the sketch below)
Streaming I/O -- Simplex I/O that does not have any form of repositioning (including being closed and reopened)
Sun -- A computer company, now taken over by Oracle
SVD -- Singular Value Decomposition
Synchronisation -- Actions to ensure that parallel operations occur in a particular order (see also asynchronous)
Synchronous -- Performing operations at a specified time; in a parallel context, this implies that they are executed in some serial order
Syntax -- The text forms that are valid in a programming language
SysV shmem -- See POSIX shmem
Task -- A unit of work, such as a procedure (function) call, but more general than that; in threading, usually an asynchronous procedure call that is run as a separate thread
TANSTAAFL -- There Ain't No Such Thing As A Free Lunch - an acronym popularised by the science fiction writer Robert A. Heinlein to indicate that everything has its disadvantages
TCP/IP -- Transmission Control Protocol/Internet Protocol - the currently most widely used low-level interface used to transfer data between systems
TESLA -- A range of NVIDIA's GPUs designed for high-end computation (models include FERMI and KEPLER)
Thread -- In modern usage, a unit of execution that can run asynchronously from other threads, but can share data with them; threads usually exist in the context of a process
Thread Interference -- When two threads do something that conflicts, so that the program misbehaves
Thread pool -- A collection of threads that can be used to provide a thread when one is needed to execute a task
Transaction -- An action (usually involving multiple agents or threads) that is packaged to be atomic - i.e. it is independent of all other transactions and has no effect if it is cancelled
UPC -- Unified Parallel C - a widely-hyped PGAS model which does not seem to be much used, and is not recommended
Update -- Any access that can potentially change the contents of a data location, whether it does or not
USA -- United States of America
Variable -- The name of a data location that can be updated
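Example: a minimal sketch of a streaming program in the Unix-filter style, illustrating the 'Streaming' entry above; it reads lines as they arrive, converts each to upper case, and writes it out immediately rather than waiting for all of the input.

    #include <cctype>
    #include <iostream>
    #include <string>

    int main()
    {
        std::string line;
        while (std::getline(std::cin, line)) {
            for (char &c : line)
                c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
            std::cout << line << "\n";   // written as soon as it is ready
        }
    }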
Vector Hardware/Vector System -- A system which provides major SIMD capabilities - these were the dominant supercomputers of the 1970s and 1980s, but have been superseded by SSE etc. and GPUs
Vectorisation -- Ensuring that a program uses the SIMD facilities of its CPU - compilers often do this automatically for SSE etc. at high levels of optimisation (see the sketch below)
Virtual shared memory -- A programming model that makes multiple cores with distributed memory appear to the programmer as if they had shared memory
Weak Memory Model -- A memory model that is not a strong memory model (q.v.)
Web/World Wide Web -- The collection of information accessible via a browser over the Internet
Web of a Million Lies -- A gross underestimate; while the information on the Web is extremely useful, a high proportion of it is misleading or just plain wrong
Web pages -- Locations (i.e. views of data) on the Web
WG14 -- The ISO SC22 working group responsible for the C standard
WG21 -- The ISO SC22 working group responsible for the C++ standard
WG5 -- The ISO SC22 working group responsible for the Fortran standard
Worker -- See Master-Worker
x86/x86-64 -- The Intel architecture used by almost all modern systems larger than tablets; the 64-bit version was developed by AMD (see AMD64) and Intel calls it x86-64
Xeon Phi -- Intel's range of attached processors for x86 systems, using the MIC architecture, which provide a large number of CPU cores
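Example: a hedged sketch of a loop that a compiler can usually vectorise automatically for SSE or AVX (typically at -O2 or -O3), illustrating the 'Vectorisation' entry above; each iteration is independent, so several elements can be processed by a single SIMD instruction.

    #include <iostream>
    #include <vector>

    int main()
    {
        const std::size_t n = 1000;
        std::vector<float> x(n, 1.0f), y(n, 2.0f);
        const float a = 3.0f;

        for (std::size_t i = 0; i < n; ++i)
            y[i] = a * x[i] + y[i];    // SAXPY: no dependence between iterations

        std::cout << y[0] << "\n";     // prints 5
    }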