To install click the Add extension button. That's it.

The source code for the WIKI 2 extension is being checked by specialists of the Mozilla Foundation, Google, and Apple. You could also do it yourself at any point in time.

4,5
Kelly Slayton
Congratulations on this excellent venture… what a great idea!
Alexander Grigorievskiy
I use WIKI 2 every day and almost forgot how the original Wikipedia looks like.
Live Statistics
English Articles
Improved in 24 Hours
Added in 24 Hours
Languages
Recent
Show all languages
What we do. Every page goes through several hundred of perfecting techniques; in live mode. Quite the same Wikipedia. Just better.
.
Leo
Newton
Brights
Milds

Single instruction, multiple threads

From Wikipedia, the free encyclopedia

Single instruction, multiple threads (SIMT) is an execution model used in parallel computing where single instruction, multiple data (SIMD) is combined with multithreading. It is different from SPMD in that all instructions in all "threads" are executed in lock-step. The SIMT execution model has been implemented on several GPUs and is relevant for general-purpose computing on graphics processing units (GPGPU), e.g. some supercomputers combine CPUs with GPUs.

The processors, say a number p of them, seem to execute many more than p tasks. This is achieved by each processor having multiple "threads" (or "work-items" or "Sequence of SIMD Lane operations"), which execute in lock-step, and are analogous to SIMD lanes.[1]

The simplest way to understand SIMT is to imagine a multi-core system, where each core has its own register file, its own ALUs (both SIMD and Scalar) and its own data cache, but that unlike a standard multi-core system which has multiple independent instruction caches and decoders, as well as multiple independent Program Counter registers, the instructions are synchronously broadcast to all SIMT cores from a single unit with a single instruction cache and a single instruction decoder which reads instructions using a single Program Counter.

The key difference between SIMT and SIMD lanes is that each of the SIMT cores may have a completely different Stack Pointer (and thus perform computations on completely different data sets), whereas SIMD lanes are simply part of an ALU that knows nothing about memory per se.

YouTube Encyclopedic

  • 1/3
    Views:
    50 495
    3 959
    46 241
  • Difference between Multiprocessing and Multithreading
  • Multi Threading Performance - Georgia Tech - HPCA: Part 5
  • Process vs Thread

Transcription

History

SIMT was introduced by Nvidia in the Tesla GPU microarchitecture with the G80 chip.[2][3] ATI Technologies, now AMD, released a competing product slightly later on May 14, 2007, the TeraScale 1-based "R600" GPU chip.

Description

As access time of all the widespread RAM types (e.g. DDR SDRAM, GDDR SDRAM, XDR DRAM, etc.) is still relatively high, engineers came up with the idea to hide the latency that inevitably comes with each memory access. Strictly, the latency-hiding is a feature of the zero-overhead scheduling implemented by modern GPUs. This might or might not be considered to be a property of 'SIMT' itself.

SIMT is intended to limit instruction fetching overhead,[4] i.e. the latency that comes with memory access, and is used in modern GPUs (such as those of Nvidia and AMD) in combination with 'latency hiding' to enable high-performance execution despite considerable latency in memory-access operations. This is where the processor is oversubscribed with computation tasks, and is able to quickly switch between tasks when it would otherwise have to wait on memory. This strategy is comparable to multithreading in CPUs (not to be confused with multi-core).[5] As with SIMD, another major benefit is the sharing of the control logic by many data lanes, leading to an increase in computational density. One block of control logic can manage N data lanes, instead of replicating the control logic N times.

A downside of SIMT execution is the fact that thread-specific control-flow is performed using "masking", leading to poor utilization where a processor's threads follow different control-flow paths. For instance, to handle an IF-ELSE block where various threads of a processor execute different paths, all threads must actually process both paths (as all threads of a processor always execute in lock-step), but masking is used to disable and enable the various threads as appropriate. Masking is avoided when control flow is coherent for the threads of a processor, i.e. they all follow the same path of execution. The masking strategy is what distinguishes SIMT from ordinary SIMD, and has the benefit of inexpensive synchronization between the threads of a processor.[6]

SIMT Terminology
Nvidia CUDA OpenCL Hennessy & Patterson[7]
Thread Work-item Sequence of SIMD Lane operations
Warp Wavefront Thread of SIMD Instructions
Block Workgroup Body of vectorized loop
Grid NDRange Vectorized loop

See also

References

  1. ^ Michael McCool; James Reinders; Arch Robison (2013). Structured Parallel Programming: Patterns for Efficient Computation. Elsevier. p. 52.
  2. ^ "Nvidia Fermi Compute Architecture Whitepaper" (PDF). www.nvidia.com/. NVIDIA Corporation. 2009. Retrieved 2014-07-17.
  3. ^ Lindholm, Erik; Nickolls, John; Oberman, Stuart; Montrym, John (2008). "NVIDIA Tesla: A Unified Graphics and Computing Architecture". IEEE Micro. 28 (2): 6 (Subscription required.). doi:10.1109/MM.2008.31. S2CID 2793450.
  4. ^ Rul, Sean; Vandierendonck, Hans; D’Haene, Joris; De Bosschere, Koen (2010). An experimental study on performance portability of OpenCL kernels. Symp. Application Accelerators in High Performance Computing (SAAHPC). hdl:1854/LU-1016024.
  5. ^ "Advanced Topics in CUDA" (PDF). cc.gatech.edu. 2011. Retrieved 2014-08-28.
  6. ^ Michael McCool; James Reinders; Arch Robison (2013). Structured Parallel Programming: Patterns for Efficient Computation. Elsevier. pp. 209 ff.
  7. ^ John L. Hennessy; David A. Patterson (1990). Computer Architecture: A Quantitative Approach (6 ed.). Morgan Kaufmann. pp. 314 ff. ISBN 9781558600690.
This page was last edited on 26 January 2024, at 00:24
Basis of this page is in Wikipedia. Text is available under the CC BY-SA 3.0 Unported License. Non-text media are available under their specified licenses. Wikipedia® is a registered trademark of the Wikimedia Foundation, Inc. WIKI 2 is an independent company and has no affiliation with Wikimedia Foundation.