Threads, concurrency, and synchronization are not easy concepts to understand. When concurrency is involved in an application, it's hard to avoid making mistakes. Although Java provides mechanisms to deal with parallel programming, sometimes there are just too many options, and often some essential options are missing. For web applications, Jakarta EE provides a simplified programming model for parallel tasks. But to use it effectively and avoid mistakes, you need to understand the basic concepts, which I'd like to explain here.
Java provides many mechanisms to help with threads and concurrent tasks. As Java evolved, new mechanisms were added while the old ones stayed, and it's often not clear which of them are recommended for new applications. Jakarta EE builds on these features and makes them easier to understand and use. The standard Jakarta EE API intentionally specifies only interfaces and the essential conceptual behavior; much of the complexity is abstracted away and provided by Jakarta EE runtimes to keep things simple. As a result, developers get a concise set of features that is easy to learn and understand, yet enough to build applications.
What is a thread pool?
The basic concept behind threading and parallelism in Jakarta EE runtimes is the thread pool. It is closely connected with the request processing model: most tasks originate as a request from an external caller, are processed sequentially in a single thread, and produce a response that is sent back to the caller, persisted into a database, or sent as a message to another system. Many separate tasks can run in parallel, each in its own thread, so there's usually a simple mapping – one request needs one thread. After a task finishes, its thread doesn't have to be destroyed; it can be reused to run another task, which avoids creating and destroying threads too often.
Incoming tasks are not tied to any specific thread. A thread pool always makes sure there's a thread available for a task. This is called “scheduling” and is often referred to as “thread scheduling”. Tasks are scheduled and processed in one of these ways:
- by an existing thread that has finished its previous task,
- by a new thread, if no free thread is available,
- or, if all threads are busy, they are queued and wait until a thread becomes available.
Vice versa, threads aren't tied to tasks either. They simply take a task from a queue and process it. When a thread is done with its task, it picks up another task from the queue, or waits for one if the queue is empty. A group of such threads, together with the logic for how they are managed and scheduled, is called a thread pool.
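To make this concrete, here's a minimal sketch using the plain Java SE ExecutorService (in Jakarta EE you'd normally let the runtime manage the pool for you, but the underlying behavior is the same): ten tasks are submitted to a pool of four threads, so threads are reused and surplus tasks wait in the pool's queue.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadPoolDemo {
    public static void main(String[] args) {
        // A pool of 4 reusable threads; submitted tasks are queued
        // until one of the threads becomes free.
        ExecutorService pool = Executors.newFixedThreadPool(4);

        for (int i = 0; i < 10; i++) {
            final int taskId = i;
            pool.submit(() -> {
                // Each task runs on whichever pool thread picks it up.
                System.out.println("Task " + taskId + " on "
                        + Thread.currentThread().getName());
            });
        }

        // Stop accepting new tasks and let the queued ones finish.
        pool.shutdown();
    }
}
```

Running this prints only four distinct thread names, showing that the same threads are reused across all ten tasks.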
Why thread pools?
Thread pools are used because it's time-consuming to dispose of a thread and create a new one later. A new thread is created only when there's no idle thread already available, and threads are disposed of only when there are so many idle threads that some of them are likely not going to be needed for some time.
Another aspect of thread pools is that their size is limited and adapts to the load. This is mainly for three reasons:
- Each thread allocates memory for its stack and keeps that memory until the thread is disposed of
- A CPU can run only a very small number of threads at once; having too many threads can even lead to performance degradation
- Each task requires some heap memory; it makes sense to limit the maximum number of parallel tasks to avoid overwhelming the system
The memory argument is pretty clear, and it's often evident when it becomes an issue. There's always a limited amount of memory on the system, and when threads consume a big portion of it, it's easy to reach the limit. The default stack size for each thread is typically 1 MB, which means that each thread needs 1 MB of system memory just to exist. If there are 1000 threads in a JVM, they need roughly 1 GB of memory on top of the Java heap. You can probably imagine the consequences if there are even more threads.
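As a side note, the JDK even lets you suggest a different stack size for an individual thread via a Thread constructor. A small illustrative sketch (the requested size is only a hint – the JVM may round it or ignore it entirely, as the Thread javadoc warns):

```java
public class SmallStackThread {
    public static void main(String[] args) throws InterruptedException {
        // Request a 256 KB stack instead of the platform default (often 1 MB).
        // This is a hint only; the JVM may round or ignore it.
        Thread worker = new Thread(null,
                () -> System.out.println("running with a smaller stack"),
                "small-stack-worker",
                256 * 1024);
        worker.start();
        worker.join();
    }
}
```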
The CPU argument isn't so straightforward, but it's just as valid. Only a limited number of threads can run on a CPU at the same time (typically one per core, e.g. 8 on an 8-core CPU). With more threads, it's more likely that a thread will be suspended so another thread can be scheduled. This context switching is a relatively time-consuming operation and doesn't contribute to the computation. Therefore, having an excessive number of threads can actually decrease performance.
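A common rule of thumb for CPU-bound work is to size the pool to the number of available cores. A hedged sketch (I/O-heavy workloads often justify larger pools, since their threads spend most of their time waiting):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CpuSizedPool {
    public static void main(String[] args) {
        // One thread per hardware core: a common starting point
        // for CPU-bound work, not a universal rule.
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);
        System.out.println("Pool sized to " + cores + " threads");
        pool.shutdown();
    }
}
```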
The last argument is that executing too many tasks in parallel costs too much memory, as each task needs to store something on the heap even while it's waiting for an I/O operation or the CPU. As described above, more parallel tasks don't always lead to increased performance, but they definitely lead to increased memory consumption. Therefore it's better to limit how many tasks can run in parallel and queue the remaining tasks to be executed later.
For these reasons, it's good to have a reasonable number of threads ready to handle new tasks immediately, to limit the total number of threads to a reasonable amount, and to dispose of threads after some time if they really aren't needed. This makes it possible to keep the system from becoming thrashed and unusable under high load: if there are too many requests to handle, some of them simply wait while others are processed efficiently, which keeps the system usable at least for some requests and users.
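All three behaviors – keeping a core of ready threads, capping the total, and disposing of idle threads – map directly onto knobs of the JDK's ThreadPoolExecutor. A minimal sketch with made-up sizes (in a Jakarta EE runtime you would configure the server's pools instead of constructing one in code):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedPool {
    public static void main(String[] args) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                4,                               // core size: threads kept ready for new tasks
                16,                              // max size: hard limit on the number of threads
                60, TimeUnit.SECONDS,            // idle threads above the core size are disposed of after 60 s
                new ArrayBlockingQueue<>(100));  // excess tasks wait in a bounded queue

        pool.submit(() -> System.out.println(
                "processed by " + Thread.currentThread().getName()));
        pool.shutdown();
    }
}
```

One subtlety of this class: with the default policy, threads beyond the core size are only created once the queue is full, so the queue capacity also influences how quickly the pool grows.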
Sometimes a single request needs to be processed by multiple threads in parallel. This doesn't fit the simplified thread-per-request model, but it's also supported by Jakarta EE runtimes. Applications can use a specialized concurrency API, which allows splitting a task and executing each part in a separate thread. This again works on top of thread pools and retains all the advantages mentioned above.
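The standard entry point for this is the ManagedExecutorService from the Jakarta EE Concurrency API. A hedged sketch of splitting one request into two parallel parts – the ReportService class and the task bodies are invented for illustration, but injecting the default managed executor with @Resource is the standard mechanism:

```java
import jakarta.annotation.Resource;
import jakarta.enterprise.concurrent.ManagedExecutorService;

import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

public class ReportService {

    // The default managed executor provided by the Jakarta EE runtime;
    // its threads come from a container-managed thread pool.
    @Resource
    private ManagedExecutorService executor;

    public String buildReport() throws ExecutionException, InterruptedException {
        // Split the work into two parts and run them in parallel
        // (the String results here are placeholders for real work).
        Future<String> header = executor.submit(() -> "header");
        Future<String> body = executor.submit(() -> "body");

        // Combine the partial results once both threads are done.
        return header.get() + "\n" + body.get();
    }
}
```

Because the executor is container-managed, the parallel parts still run on pooled threads, so the sizing and reuse benefits described above apply to them as well.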