A Primer on Threads [Operating Systems]

What are threads? What is multithreading? Parallelism Vs Concurrency? Green threads?

Threads

A thread is an execution unit that is part of a process.

Singlethreaded Vs Multithreaded

[Figure: single-threaded vs multithreaded]

Threads Vs Process

Parameter | Process | Thread
Definition | A process is a program in execution. | A thread is a segment of a process.
Lightweight | A process is not lightweight. | Threads are lightweight.
Termination time | A process takes more time to terminate. | A thread takes less time to terminate.
Creation time | A process takes more time to create. | A thread takes less time to create.
Communication | Communication between processes takes more time than between threads. | Communication between threads takes less time than between processes.
Context switching time | Context switching takes more time. | Context switching takes less time.
Resources | A process consumes more resources. | A thread consumes fewer resources.
Treatment by OS | Different processes are treated separately by the OS. | All user-level peer threads are treated as a single task by the OS.
Memory | A process is mostly isolated. | Threads share memory.
Sharing | Processes do not share data by default. | Threads share data with each other.
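The "Memory" and "Sharing" rows can be illustrated with a minimal Python sketch. The names `shared` and `worker` are made up for this example; it only assumes the standard `threading` and `os` modules. Every thread writes into the same list and reports the same process ID, because all of them live inside one process.

```python
import os
import threading

shared = []  # one list object, visible to every thread in this process


def worker(name):
    # Each thread appends to the same list and records the process it runs in.
    shared.append((name, os.getpid()))


threads = [threading.Thread(target=worker, args=(f"t{i}",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# All entries ended up in the same list, written from the same process.
print(sorted(name for name, _ in shared))
print({pid for _, pid in shared} == {os.getpid()})
```

Separate processes would each get their own copy of `shared`, so they would need an explicit channel (a pipe, a queue, shared memory) to exchange the same data.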

Parallelism Vs Concurrency

Concurrency

Concurrency means that an application is making progress on more than one task - at the same time or at least seemingly at the same time (concurrently).
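As a rough illustration of "seemingly at the same time", here is a minimal sketch that interleaves two tasks on a single thread. The `task` generator and the `run_round_robin` scheduler are hypothetical helpers written for this example, not part of any library. Only one task runs at any instant, yet both make progress.

```python
from collections import deque


def task(name, steps, log):
    # A task that does a little work, then hands control back with `yield`.
    for i in range(steps):
        log.append(f"{name}{i}")
        yield


def run_round_robin(tasks):
    # Hypothetical mini-scheduler: give each task one step in turn
    # until every task has finished.
    queue = deque(tasks)
    while queue:
        current = queue.popleft()
        try:
            next(current)
            queue.append(current)  # not finished yet, back of the line
        except StopIteration:
            pass  # finished, drop it


log = []
run_round_robin([task("A", 3, log), task("B", 2, log)])
print(log)  # ['A0', 'B0', 'A1', 'B1', 'A2']
```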

[Figure: concurrent tasks]

Parallel Execution

Parallel execution is when a computer has more than one CPU or CPU core and makes progress on more than one task simultaneously. However, parallel execution does not refer to the same phenomenon as parallelism.

[Figure: parallel execution]

Parallel Concurrent Execution

It is possible to have parallel concurrent execution, where threads are distributed among multiple CPUs. Thus, the threads executed on the same CPU are executed concurrently, whereas threads executed on different CPUs are executed in parallel.

[Figure: parallel concurrent execution]

Parallelism

The term parallelism means that an application splits its tasks up into smaller subtasks which can be processed in parallel, for instance on multiple CPUs at the exact same time. Thus, parallelism does not refer to the same execution model as parallel concurrent execution - even if they may look similar on the surface.

To achieve true parallelism, your application must have more than one thread running, and each thread must run on a separate CPU, CPU core, GPU core, or similar.
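The "split into subtasks, then combine" shape can be sketched as follows. `chunk_sum` and `parallel_sum` are hypothetical names for this example. One caveat: the sketch uses `ThreadPoolExecutor` to stay short and portable, and in CPython the Global Interpreter Lock means these threads will not actually run Python code on several cores at once. Swapping in `ProcessPoolExecutor` (with the usual `if __name__ == "__main__":` guard) is one way to get real parallel execution for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_sum(chunk):
    # One subtask: sum a slice of the data.
    return sum(chunk)


def parallel_sum(data, workers=4):
    # Split the task into roughly equal subtasks, one per worker.
    size = max(1, (len(data) + workers - 1) // workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(chunk_sum, chunks))
    # Combine the partial results into the final answer.
    return sum(partials)


print(parallel_sum(list(range(1000))))  # 499500
```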

[Figure: parallelism]

Types of threads

User Level Threads

[Figure: user threads]

  • Scheduling can be application-specific with user-level threads.
  • They do not depend on the OS: the kernel does not know about user-level threads and manages the process as if it were single-threaded.
  • If one thread causes a page fault, the entire process is blocked.

Kernel Level Threads

  • Sometimes referred to as native threads or OS-level threads.
  • Since the kernel has full knowledge of the threads in the system, the scheduler may decide to give more time to processes that have a large number of threads.
  • Good for applications that frequently block.
  • Slower and less efficient than user-level threads, because thread operations go through the kernel.
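To see that the kernel knows about each thread individually, here is a small sketch assuming CPython 3.8 or later on a platform where `threading.get_native_id()` is available (Linux, macOS and Windows among others). The names `worker`, `records` and `barrier` are made up for the example. Each thread reports the process ID and its kernel-assigned thread ID.

```python
import os
import threading

N = 3
barrier = threading.Barrier(N)  # keeps all threads alive at the same time
records = []


def worker():
    # get_native_id() returns the thread ID assigned by the kernel.
    records.append((os.getpid(), threading.get_native_id()))
    barrier.wait()


threads = [threading.Thread(target=worker) for _ in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()

pids = {pid for pid, _ in records}
tids = {tid for _, tid in records}
print(len(pids), len(tids))  # one process, N distinct kernel thread IDs
```

The barrier is there only so that all threads exist simultaneously. Without it, a thread could finish early and the kernel could reuse its ID for the next one.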

Extra:

  • htop shows individual threads as separate processes by default, similarly to how ps -AL would.
    • Toggle capital H to show user threads (options are accessible under the settings menu - F2 then Display Options).
    • Toggle capital K for kernel threads.
    • Each thread’s summary information actually reflects the whole process, so they don’t have separate memory counts etc.
    • The threads will show R when active and S when not active, although because htop only refreshes every few seconds you can’t actually monitor their activity easily that way, as thread switches happen much faster than that.
    • Source:

Green Threads

  • Many programming languages provide their own special implementation of threads. Programming language-provided threads are known as green threads.

  • They go by various names depending on the language (coroutines in C, goroutines in Go, fibers in Ruby, etc.). These threads only exist inside your language and not in your OS. Because the language chooses where context switches happen (e.g. at the end of a statement), it prevents many subtle race conditions (such as seeing a partially-copied structure, or needing to lock most data structures). The programmer sees "blocking" calls (e.g. data = file.read() ), but the language translates them into async calls to the OS. The language then allows other green threads to run while waiting for the result.

  • Greenlets in Python

    pthreads | greenlets
    pthreads can switch between threads pre-emptively, passing control from a running thread to a non-running thread at any time. | greenlets only switch when a thread explicitly gives up control (by using yield() or wait()) or when it performs a blocking I/O operation such as a read or write.
    On multicore machines, pthreads can run more than one thread at once. However, the Global Interpreter Lock (GIL) of the CPython interpreter prevents parallelism, so concurrency is only effective for I/O-bound programs. | greenlets can only run on a single CPU and are useful for I/O-bound programs.
    Race conditions can occur in multithreaded code. Use locks (mutexes) to avoid them. | Two threads of control never access the same shared memory at the same time, so this kind of race condition does not occur.
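The cooperative switching described above can be sketched with Python's standard `asyncio` module. This is an analogue chosen for the example, not the greenlet library itself. The `fetch` coroutine and its delays are hypothetical. Each task looks like it blocks on a wait, but control is only handed over at the `await`, and the event loop runs the other task in the meantime, all on one thread.

```python
import asyncio


async def fetch(name, delay, log):
    log.append(f"{name} start")
    # Reads like a blocking wait, but the event loop runs other tasks meanwhile.
    await asyncio.sleep(delay)
    log.append(f"{name} done")


async def main():
    log = []
    # Both tasks run concurrently on a single thread.
    await asyncio.gather(fetch("slow", 0.2, log), fetch("fast", 0.05, log))
    return log


log = asyncio.run(main())
print(log)
```

Because a switch can only happen at an `await`, the code between two `await` points runs without interruption from the other task, which is the property the table attributes to greenlets.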

Pthreads

“Pthreads” is an interface, not a library, and it is based on the POSIX standard.

How a pthreads library is implemented will differ from platform to platform and library to library. Linux uses kernel threads; Windows uses Win32 threads, etc.

POSIX
The Portable Operating System Interface (POSIX) is a family of standards specified by the IEEE Computer Society for maintaining compatibility between operating systems. POSIX defines the application programming interface (API), along with command line shells and utility interfaces, for software compatibility with variants of Unix and other operating systems.

Thread safe

Thread-safe code is code that will work correctly even if many threads are executing it simultaneously.
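A minimal sketch of what that can look like in practice, assuming a lock is an acceptable way to protect the shared state. `SafeCounter` and `hammer` are hypothetical names for this example. The lock makes the read-modify-write in `increment` atomic with respect to other threads, so no updates are lost.

```python
import threading


class SafeCounter:
    # Hypothetical example class: a counter guarded by a lock.
    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time may run the read-modify-write below.
        with self._lock:
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value


def hammer(counter, times):
    # Call increment() repeatedly from one thread.
    for _ in range(times):
        counter.increment()


counter = SafeCounter()
threads = [threading.Thread(target=hammer, args=(counter, 10_000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.value)  # 40000
```

Without the lock, `self._value += 1` is a separate read and write, and two threads can interleave between them and overwrite each other's update.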