If a computer were human, then its central processing unit (CPU) would be its brain. A CPU is a microprocessor, a computing engine on a chip. Some computational problems take years to solve even on a powerful microprocessor, so computer scientists sometimes turn to parallel processing, an approach that spreads the work across several processors.
What Is Parallel Computing?
Parallel computing is a broad term for dividing a task into smaller parts that two or more processors work on simultaneously. Traditional sequential computing relies on a single processor that executes tasks one at a time. Parallel computing instead uses parallel programs and multiple processing units to reduce computation time. This approach is critical for handling complex problems and large datasets in modern computing, because it allows multiple tasks to execute concurrently.
What Does Parallel Processing Mean?
Parallel processing is a type of parallel computing. The concept is pretty simple: A computer scientist divides a complex problem into component parts using software designed for the task. They then assign each component part to a dedicated processor. Each processor solves its part of the overall computational problem. The software then reassembles the partial results into a solution to the original problem.
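The divide, solve and reassemble steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`split_into_parts`, `solve_part`, `parallel_sum_of_squares`) are made up for this example, the "problem" is a simple sum of squares, and threads stand in for dedicated processors. In standard CPython, threads don't speed up CPU-bound work, so a real program would more likely hand the parts to separate processes or machines.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_parts(data, parts):
    """Divide data into `parts` nearly equal contiguous slices (hypothetical helper)."""
    size, extra = divmod(len(data), parts)
    slices, start = [], 0
    for i in range(parts):
        # The first `extra` slices take one additional item each.
        end = start + size + (1 if i < extra else 0)
        slices.append(data[start:end])
        start = end
    return slices


def solve_part(part):
    """Each worker solves its own piece of the problem: here, a sum of squares."""
    return sum(x * x for x in part)


def parallel_sum_of_squares(data, workers=4):
    parts = split_into_parts(data, workers)
    # Threads stand in for dedicated processors in this sketch.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_results = list(pool.map(solve_part, parts))
    # Reassemble the partial answers into the overall result.
    return sum(partial_results)
```

Calling `parallel_sum_of_squares(list(range(10)))` splits the ten numbers into four slices, squares and sums each slice separately, and adds the four partial sums together.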
It's a high-tech way of saying that it's easier to get work done if you can share the load. You could divide the load up among different processors housed in the same computer or you could network several computers together and divide the load up among all of them. There are several ways to achieve the same goal.
Parallel Processing Approaches
To understand parallel processing, we need to look at the four basic models, a classification known as Flynn's taxonomy. Computer scientists define these models based on two factors: the number of instruction streams and the number of data streams the computer handles. An instruction stream is the sequence of instructions a processor executes. You can think of it as an algorithm, which is just a series of steps designed to solve a particular problem. Data streams are information pulled from computer memory and used as input values to the algorithms. The processor plugs the values from the data stream into the algorithms from the instruction stream. Then, it initiates the operation to obtain a result.
Single Instruction, Single Data (SISD) computers have one processor that handles one algorithm using one source of data at a time. The computer tackles and processes each task in order, so sometimes people use the word "sequential" to describe SISD computers. They aren't capable of performing parallel processing on their own.
Multiple Instruction, Single Data (MISD) computers have multiple processors. Each processor uses a different algorithm but uses the same shared input data. MISD computers can analyze the same set of data using several different operations at the same time. The number of operations depends upon the number of processors. There aren't many actual examples of MISD computers, partly because the problems an MISD computer can calculate are uncommon and specialized.
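As a rough software analogy for the MISD idea, the sketch below runs several different operations over one shared input at the same time. It is not a model of real MISD hardware. The name `run_misd` and the sample `readings` are hypothetical, and threads stand in for processors.

```python
from concurrent.futures import ThreadPoolExecutor


def run_misd(data, operations):
    """Apply every operation to the same shared, read-only input concurrently.

    `operations` maps a label to a function; both are illustrative assumptions.
    """
    with ThreadPoolExecutor(max_workers=len(operations)) as pool:
        # One worker per operation, all reading the same data.
        futures = {name: pool.submit(op, data) for name, op in operations.items()}
        return {name: future.result() for name, future in futures.items()}


readings = (3, 9, 4, 7)  # made-up sample data
summary = run_misd(readings, {"min": min, "max": max, "total": sum})
```

As in the description above, the number of simultaneous operations is tied to the number of workers: three operations, three workers.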
Single Instruction, Multiple Data (SIMD) computers have several processors that follow the same set of instructions, but each processor inputs different data into those instructions. SIMD computers run different data through the same algorithm. This can be useful for analyzing large chunks of data based on the same criteria. Many complex computational problems don't fit this model.
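The SIMD pattern, one instruction applied to many data items, can be emulated in plain Python. The function name `simd_apply` is hypothetical, and the loop below only imitates what SIMD hardware does in lockstep across its lanes, so treat it as a picture of the idea rather than a performance technique.

```python
import operator


def simd_apply(instruction, *operand_vectors):
    """Issue one instruction across every lane of equal-length vectors."""
    lengths = {len(vector) for vector in operand_vectors}
    if len(lengths) != 1:
        raise ValueError("all operand vectors must have the same number of lanes")
    # Real SIMD hardware processes the lanes together; this loop only emulates that.
    return [instruction(*lane) for lane in zip(*operand_vectors)]


# The same "add" instruction runs on four different pairs of values.
sums = simd_apply(operator.add, [1, 2, 3, 4], [10, 20, 30, 40])
```

Every lane gets the same instruction but its own data, which is why this model suits work such as applying one criterion to a large batch of values.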
Multiple Instruction, Multiple Data (MIMD) computers have multiple processors, each capable of accepting its own instruction stream independently from the others. Each processor also pulls data from a separate data stream. An MIMD computer can execute several different processes at once. MIMD computers are more flexible than SIMD or MISD computers, but it's more difficult to write the software that coordinates them. Single Program, Multiple Data (SPMD) systems are a subset of MIMD systems. An SPMD computer is structured like an MIMD computer, but every processor runs a copy of the same program on its own data. Unlike the processors in an SIMD computer, the processors don't have to move through that program in lockstep.
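The difference between MIMD and SPMD can be sketched in software as follows. The names `run_mimd` and `run_spmd` are hypothetical, threads stand in for processors, and the tiny built-in functions stand in for full programs.

```python
from concurrent.futures import ThreadPoolExecutor


def run_mimd(jobs):
    """MIMD sketch: each worker gets its own program and its own data.

    `jobs` is a list of (function, data) pairs, an assumption of this example.
    """
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        futures = [pool.submit(func, data) for func, data in jobs]
        # Results are collected in the order the jobs were submitted.
        return [future.result() for future in futures]


def run_spmd(program, datasets):
    """SPMD sketch: the same program on every worker, each with its own data."""
    return run_mimd([(program, data) for data in datasets])


mimd_results = run_mimd([(sum, [1, 2, 3]), (max, [4, 9, 2]), (sorted, [3, 1, 2])])
spmd_results = run_spmd(sum, [[1, 2], [3, 4], [5, 6]])
```

Here `run_spmd` is written as a special case of `run_mimd`, which mirrors the point that SPMD is a subset of MIMD.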
Out of these four, SIMD and MIMD computers are the most common models in parallel processing systems. While SISD computers aren't able to perform parallel processing on their own, it's possible to network several of them together into a cluster. Each computer's CPU can act as a processor in a larger parallel system. Together, the computers act like a single supercomputer. When the networked computers are loosely coupled, and possibly spread across different locations, the technique is usually called grid computing. Like MIMD computers, a grid computing system can be very flexible with the right software.
Parallel Processing Computations
Individually, each processor works the same as any other microprocessor. The processors act on machine instructions, which programmers can read and write in a human-readable form called assembly language. Based on these instructions, the processors perform mathematical operations on data pulled from computer memory. The processors can also move data to a different memory location.
In a sequential system, it's not a problem if data values change as a result of a processor operation. The processor can incorporate the new value into future processes and carry on. In a parallel system, changes in values can be problematic. If multiple processors are working from the same data but the data's values change over time, one processor may act on a value that another has already changed. The conflicting values can produce wrong results or cause the system to falter or crash. To prevent this, many parallel processing systems use some form of messaging between processors.
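To see why changing values cause trouble, the sketch below replays one unlucky ordering of two processors that each try to add 1 to the same value. It is a deterministic, single-threaded simulation, not real parallel code, and the function names `lost_update` and `coordinated_update` are hypothetical.

```python
def lost_update(initial):
    """Replay an unlucky interleaving of two processors incrementing one value."""
    shared = initial
    seen_by_a = shared        # processor A reads the value
    seen_by_b = shared        # processor B reads it before A writes back
    shared = seen_by_a + 1    # A writes its result
    shared = seen_by_b + 1    # B overwrites A's result using a stale value
    return shared


def coordinated_update(initial):
    """The same two increments when B waits until A's result is visible."""
    shared = initial
    shared = shared + 1       # A reads and writes as one uninterrupted step
    shared = shared + 1       # B reads the value A just wrote
    return shared
```

Starting from 0, the uncoordinated version ends at 1 because one increment is lost, while the coordinated version ends at 2. Keeping processors coordinated like this is the job of the messaging the article describes.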
Processors rely on software to send and receive messages. The software allows a processor to communicate information to other processors. By exchanging messages, processors can adjust data values and stay in sync with one another. This is important because once all processors finish their tasks, the system must reassemble all the individual solutions into an overall solution for the original computational problem. Think of it like a puzzle: if all the processors remain in sync, the pieces of the puzzle fit together seamlessly. If the processors aren't in sync, pieces of the puzzle might not fit together at all.
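Message passing can be sketched with Python's standard `queue` and `threading` modules. In this illustration the workers never touch shared data directly: they receive work as messages and send results back as messages. The names `worker` and `run_with_messages`, the use of `None` as a stop signal, and the index attached to each chunk are all assumptions of this example, and threads again stand in for processors.

```python
import queue
import threading


def worker(inbox, outbox):
    """Receive work as messages and send each result back as a message."""
    while True:
        message = inbox.get()
        if message is None:          # sentinel message: no more work
            break
        index, chunk = message
        outbox.put((index, sum(chunk)))


def run_with_messages(chunks, workers=2):
    inbox, outbox = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(inbox, outbox))
               for _ in range(workers)]
    for thread in threads:
        thread.start()
    for index, chunk in enumerate(chunks):
        inbox.put((index, chunk))
    for _ in threads:
        inbox.put(None)              # one stop message per worker
    for thread in threads:
        thread.join()
    # Results may arrive in any order; the index puts each piece back in place.
    partials = [None] * len(chunks)
    while not outbox.empty():
        index, value = outbox.get()
        partials[index] = value
    return partials, sum(partials)
```

Tagging each message with an index is what lets the pieces of the puzzle fit back together even though the workers finish in an unpredictable order.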
Two major factors affect the performance of a parallel system: latency and bandwidth. Latency refers to the amount of time it takes for a processor to transmit results back to the system. If a processor needs less time to run an algorithm than to transmit the resulting information back to the overall system, communication dominates and a sequential computer system would be more appropriate. Bandwidth refers to how much data the processor can transmit in a specific amount of time. A good parallel processing system will have both low latency and high bandwidth.
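A common back-of-the-envelope way to combine the two factors is to estimate transfer time as a fixed delay plus the payload size divided by the bandwidth. The sketch below uses that simplified model, which is an assumption of this example rather than something the article specifies. The function names and all the numbers are illustrative.

```python
def transfer_time(payload_bytes, latency_s, bandwidth_bytes_per_s):
    """Simplified model: fixed delay plus time to push the payload through the link."""
    return latency_s + payload_bytes / bandwidth_bytes_per_s


def worth_offloading(compute_time_s, payload_bytes, latency_s, bandwidth_bytes_per_s):
    """True when running the algorithm takes longer than sending back its result."""
    return compute_time_s > transfer_time(payload_bytes, latency_s, bandwidth_bytes_per_s)


# Made-up numbers: a 1 MB result, 1 ms latency, 1 GB/s bandwidth.
example = transfer_time(1_000_000, 0.001, 1_000_000_000)
```

With these made-up numbers the transfer takes about 2 milliseconds, so a task that computes for half a second is worth handing to another processor, while one that computes for half a millisecond is not.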
Sometimes, parallel processing isn't faster than sequential computing. If it takes too long for the computer's CPU to reassemble all the individual parallel processor solutions, a sequential computer might be the better choice. As computer scientists refine parallel processing techniques and programmers write effective software, this might become less of an issue.
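A toy cost model shows how reassembly time can cancel out the benefit of extra processors. The model below assumes the work divides evenly and that reassembly costs a fixed amount per part. Both are simplifying assumptions made for this example, and the function names are hypothetical.

```python
def parallel_time(work_s, processors, reassembly_s_per_part):
    """Toy model: evenly divided work plus a reassembly cost for each part."""
    return work_s / processors + reassembly_s_per_part * processors


def speedup(work_s, processors, reassembly_s_per_part):
    """Sequential time divided by parallel time; below 1 means parallel is slower."""
    return work_s / parallel_time(work_s, processors, reassembly_s_per_part)


big_job = parallel_time(100.0, 4, 1.0)    # 100 s of work on 4 processors
small_job = parallel_time(1.0, 4, 1.0)    # 1 s of work on 4 processors
```

Under these assumptions the 100-second job drops to 29 seconds on four processors, but the 1-second job takes 4.25 seconds, slower than running it sequentially.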
The Future of Computing With Parallel Systems
Looking ahead, the role of parallel computing in advancing technology and science is undeniable. As the complexity of computational problems grows, so does the need for more sophisticated parallel computer architectures and programming techniques. Modern computers, equipped with hardware that supports parallelism, are paving the way for innovations in everything from artificial intelligence to climate modeling. In this era, understanding and leveraging the power of parallel computing is essential for anyone looking to solve the most challenging problems of our time.
We updated this article in conjunction with AI technology, then made sure it was fact-checked and edited by a HowStuffWorks editor.