How to Make an Application Thread Safe

How to make an application thread safe?

There are several ways in which a function can be thread safe.

It can be reentrant. This means that a function has no state, and does not touch any global or static variables, so it can be called from multiple threads simultaneously. The term comes from allowing one thread to enter the function while another thread is already inside it.
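As a minimal sketch of what reentrant means in practice (written in Go purely for illustration; sumSquares is a made-up function), the function below reads only its argument and writes only to locals, so any number of goroutines may be inside it at once:

```go
package main

import (
	"fmt"
	"sync"
)

// sumSquares is reentrant: it reads only its argument and writes only
// to local variables, so concurrent calls cannot interfere.
func sumSquares(xs []int) int {
	total := 0
	for _, x := range xs {
		total += x * x
	}
	return total
}

func main() {
	var wg sync.WaitGroup
	results := make([]int, 4)
	for i := range results {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			// Each goroutine writes to its own slot, so no lock is needed.
			results[i] = sumSquares([]int{1, 2, 3})
		}(i)
	}
	wg.Wait()
	fmt.Println(results) // [14 14 14 14]
}
```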

It can have a critical section. This term gets thrown around a lot, but frankly I prefer critical data. A critical section occurs any time your code touches data that is shared across multiple threads. So I prefer to put the focus on that critical data.

If you use a mutex properly, you can synchronize access to the critical data, properly protecting from thread unsafe modifications. Mutexes and Locks are very useful, but with great power comes great responsibility. You must not lock the same mutex twice within the same thread (that is a self-deadlock). You must be careful if you acquire more than one mutex, as it increases your risk for deadlock. You must consistently protect your data with mutexes.
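One common way to reduce the deadlock risk when two mutexes are needed is to always acquire them in the same global order. The sketch below assumes a hypothetical account type with an id that exists only to define that order; it is an illustration in Go, not the only way to do it.

```go
package main

import (
	"fmt"
	"sync"
)

// account is a hypothetical type; id is used only to fix a lock order.
type account struct {
	id      int
	mu      sync.Mutex
	balance int
}

// transfer locks both accounts in ascending id order, so two opposite
// transfers can never each hold one lock while waiting for the other.
func transfer(from, to *account, amount int) {
	if from == to {
		return // locking the same mutex twice would self-deadlock
	}
	first, second := from, to
	if second.id < first.id {
		first, second = second, first
	}
	first.mu.Lock()
	defer first.mu.Unlock()
	second.mu.Lock()
	defer second.mu.Unlock()

	from.balance -= amount
	to.balance += amount
}

func main() {
	a := &account{id: 1, balance: 100}
	b := &account{id: 2, balance: 100}

	var wg sync.WaitGroup
	for i := 0; i < 1000; i++ {
		wg.Add(2)
		go func() { defer wg.Done(); transfer(a, b, 1) }()
		go func() { defer wg.Done(); transfer(b, a, 1) }()
	}
	wg.Wait()
	fmt.Println(a.balance, b.balance) // 100 100
}
```

Without the ordering step, one goroutine could hold a's lock while waiting for b's, and another could hold b's while waiting for a's.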

If all of your functions are thread safe, and all of your shared data properly protected, your application should be thread safe.

As Crazy Eddie said, this is a huge subject. I recommend reading up on boost threads, and using them accordingly.

Low-level caveat: compilers can reorder statements, which can break thread safety. With multiple cores, each core has its own cache, and the caches must be properly synchronized for code to be thread safe. Even if the compiler doesn't reorder statements, the hardware might. So in a language without a defined memory model (C++ before C++11, for example), the language itself cannot fully guarantee thread safety; you depend on the memory barriers that your threading library's mutexes and atomics provide. Since C++11 the standard defines a memory model, and std::mutex and std::atomic give those ordering guarantees portably.

Anyway, if you're looking for a checklist to make a class thread-safe:

Identify any data that is shared across threads (if you miss it, you can't protect it).
Create a member boost::mutex m_mutex and lock it whenever you access that shared member data (ideally the shared data is private to the class, so you can be more certain that you're protecting it properly).
Clean up globals. Globals are bad anyway, and it is very hard to do anything thread safe with them.
Beware the static keyword. Before C++11, initialization of a function-local static was not guaranteed to be thread safe, so a singleton built on one could be constructed twice or observed half-built. C++11 and later do guarantee thread-safe initialization, but later use of the object is still unprotected.
Beware double-checked locking. Most people who use it get it wrong in some subtle way, and it's prone to breakage by the low-level caveat.

That's an incomplete checklist. I'll add more if I think of it, but hopefully it's enough to get you started.
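For the singleton item in particular, many runtimes offer a ready-made "run this exactly once" primitive that avoids hand-written double-checked locking. The sketch below uses Go's sync.Once rather than boost, so it illustrates the idea and is not a drop-in for the C++ case; config, getConfig and initCount are made-up names.

```go
package main

import (
	"fmt"
	"sync"
)

// config is a hypothetical singleton type used only for illustration.
type config struct {
	name string
}

var (
	once      sync.Once
	instance  *config
	initCount int // counts how often the initializer ran
)

// getConfig returns the shared instance, creating it on first use.
// sync.Once runs the function passed to Do exactly once, and every
// caller that returns from Do sees the completed initialization.
func getConfig() *config {
	once.Do(func() {
		initCount++
		instance = &config{name: "default"}
	})
	return instance
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 50; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = getConfig()
		}()
	}
	wg.Wait()
	fmt.Println(initCount, getConfig().name) // 1 default
}
```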

Thread safety in web application

Your typical web application has objects like servlets, controllers, services, and data access objects, which have no conversational state and so can be accessed safely from concurrent threads. Then there are persistent entities that are created by a request thread and usually don't get passed to other threads; their scope is confined to the thread that created them.

There are some infrastructure objects, like the connection pool and the Hibernate session factory, that need to be designed to be threadsafe. But if you're using any kind of reasonable framework you usually don't have to create these kinds of things yourself.

The most likely source of errors for your application, assuming you manage to avoid keeping state inappropriately in things like services or controllers, is probably going to be having database operations interleaved in an unintended way due to developers who don't know how to use transactions. That's what I would look out for. So 3 things:

1) avoid conversational state in services, controllers, daos,

2) use a framework (spring is one example) that provides proven threadsafe infrastructure, and

3) learn about database transactions, and isolation levels, and optimistic locking, and use them to make sure data is accessed or changed by different threads without corruption.
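To make the third point a little more concrete, here is a minimal in-memory sketch of optimistic locking. It assumes a hypothetical row with a version column and mimics the usual "UPDATE ... WHERE id = ? AND version = ?" statement; a real application would let the database or ORM do this. The store, row and errStale names are invented for the example.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// row is a hypothetical record with a version column.
type row struct {
	value   string
	version int
}

// store stands in for a database table; the mutex plays the role of the
// database making each single statement atomic.
type store struct {
	mu   sync.Mutex
	rows map[int]row
}

var errStale = errors.New("row was changed by someone else")

func (s *store) read(id int) row {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.rows[id]
}

// update succeeds only if nobody has written the row since the caller
// read it, and bumps the version when it does succeed.
func (s *store) update(id int, value string, readVersion int) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	cur := s.rows[id]
	if cur.version != readVersion {
		return errStale
	}
	s.rows[id] = row{value: value, version: cur.version + 1}
	return nil
}

func main() {
	s := &store{rows: map[int]row{1: {value: "draft", version: 1}}}

	// Two requests read the same version of the row.
	r1 := s.read(1)
	r2 := s.read(1)

	fmt.Println(s.update(1, "edited by A", r1.version)) // <nil>
	fmt.Println(s.update(1, "edited by B", r2.version)) // row was changed by someone else
	fmt.Println(s.read(1).value)                        // edited by A
}
```

The second writer is told its data is stale instead of silently overwriting the first writer's change; it can then re-read and retry.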

Thread safe programming

Thread-safety is one aspect of a larger set of issues under the general heading of "Concurrent Programming". I'd suggest reading around that subject.

Your assumption that two threads cannot access the struct at the same time is not safe. First: today we have multi-core machines, so two threads can be running at exactly the same time. Second: even on a single-core machine the time slices given to any other thread are unpredictable. You have to anticipate that at any arbitrary time the "other" thread might be running. See my "window of opportunity" example below.

The concept of thread-safety is exactly to answer the question "is this dangerous in any way". The key question is whether it's possible for code running in one thread to get an inconsistent view of some data, that inconsistency happening because while it was running another thread was in the middle of changing data.

In your example, one thread is reading a structure and at the same time another is writing. Suppose that there are two related fields:

  { foreground: red; background: black }

and the writer is in the process of changing those

   foreground = black;
            <=== window of opportunity
   background = red;

If the reader reads the values at just that window of opportunity then it sees a "nonsense" combination

  { foreground: black; background: black }

The essence of this pattern is that for a brief time, while we are making a change, the system becomes inconsistent and readers should not use the values. As soon as we finish our changes it becomes safe to read again.

Hence we use the CriticalSection APIs mentioned by Stefan to prevent a thread seeing an inconsistent state.
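The same idea can be sketched with a plain mutex. The example below is in Go rather than the Win32 CriticalSection API, and the colors type and its methods are invented for illustration: the writer holds the lock across both assignments, and the reader takes the same lock, so it can never land in the window of opportunity.

```go
package main

import (
	"fmt"
	"sync"
)

// colors holds two related fields that must always be read and written
// together. The type and method names are illustrative.
type colors struct {
	mu         sync.Mutex
	foreground string
	background string
}

// swap exchanges the two fields while holding the lock.
func (c *colors) swap() {
	c.mu.Lock()
	defer c.mu.Unlock()
	old := c.foreground
	c.foreground = c.background
	// Window of opportunity: both fields are equal here,
	// but no reader can observe it while the lock is held.
	c.background = old
}

// snapshot returns both fields as they were at one single moment.
func (c *colors) snapshot() (fg, bg string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.foreground, c.background
}

func main() {
	c := &colors{foreground: "red", background: "black"}

	var wg sync.WaitGroup
	wg.Add(1)
	go func() { // writer
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			c.swap()
		}
	}()

	inconsistent := 0
	for i := 0; i < 1000; i++ { // reader
		if fg, bg := c.snapshot(); fg == bg {
			inconsistent++
		}
	}
	wg.Wait()
	fmt.Println("inconsistent reads:", inconsistent) // inconsistent reads: 0
}
```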

Thread safety in java web application?

In a typical web application the container treats a servlet as a singleton: one instance serves all requests. So if your servlet uses an instance variable that is not thread safe, requests that are served simultaneously will interfere with each other.

A Java servlet container / web server is typically multithreaded. That means that multiple requests to the same servlet may be executed at the same time. Therefore, you need to take concurrency into consideration when you implement your servlet.
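The same rule applies to any handler object that is shared by concurrent requests. As a rough analogy in Go (not servlet code; counterHandler and handle are hypothetical names), per-request data stays in local variables, and the one field that really is shared is updated atomically:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// counterHandler is a hypothetical stand-in for a servlet: one instance
// is shared by every request goroutine.
type counterHandler struct {
	hits int64 // shared across requests, so only touched via sync/atomic
}

// handle keeps all per-request data in locals; nothing about the
// current request is stored on the handler itself.
func (h *counterHandler) handle(user string) string {
	n := atomic.AddInt64(&h.hits, 1)
	greeting := "hello, " + user
	return fmt.Sprintf("%s (request %d)", greeting, n)
}

func main() {
	h := &counterHandler{}

	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			_ = h.handle(fmt.Sprintf("user%d", i))
		}(i)
	}
	wg.Wait()
	fmt.Println(atomic.LoadInt64(&h.hits)) // 100
}
```

Had the handler stored the current user in a field, two overlapping requests could each see the other's user.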

How to make writing method thread safe?

I am not well versed in Java so I am going to provide a language-agnostic answer.

What you want to do is to transform matrices into results, then format them as string and finally write them all into the stream.

Currently you are writing into the stream as soon as you process each result, so when you add multiple threads to your logic you end up with race conditions on your stream.

You already figured out that only the calls to ResultGenerator.getResult() should be done in parallel, whilst the stream still needs to be accessed sequentially.

Now you only need to put this in practice. Do it in order:

Build a list where each item is what you need to generate a result
Process this list in parallel thus generating all results (this is a map operation). Your list of items will become a list of results.
Now you already have your results so you can iterate over them sequentially to format and write them into the stream.

I suspect that Java 8 provides some tools to do all of this in a functional way, but as I said, I am not a Java guy so I cannot provide Java code samples. I hope this explanation will suffice.

@edit

This sample code in F# explains what I meant.

open System

// This is a pretty long and nasty operation!
let getResult doc =
    Threading.Thread.Sleep(1000)
    doc * 10

// This is writing into stdout, but it could be a stream...
let formatAndPrint =
    printfn "Got result: %O"

[<EntryPoint>]
let main argv =
    printfn "Starting..."

    [| 1 .. 10 |] // A list with some docs to be processed
    |> Array.Parallel.map getResult // Now that's doing the trick
    |> Array.iter formatAndPrint

    0
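For readers who do not use F#, the same three steps can be sketched in Go. This is an illustration under the same assumptions as the F# sample: getResult is a stand-in for the expensive call, and processAll is a made-up helper name.

```go
package main

import (
	"fmt"
	"os"
	"sync"
)

// getResult stands in for the slow, independent computation.
func getResult(doc int) int {
	return doc * 10
}

// processAll computes every result in its own goroutine. Each goroutine
// writes only to its own slot, so no lock is needed and the input order
// is preserved.
func processAll(docs []int) []int {
	results := make([]int, len(docs))
	var wg sync.WaitGroup
	for i, doc := range docs {
		wg.Add(1)
		go func(i, doc int) {
			defer wg.Done()
			results[i] = getResult(doc)
		}(i, doc)
	}
	wg.Wait()
	return results
}

func main() {
	docs := []int{1, 2, 3, 4, 5} // step 1: the inputs
	results := processAll(docs)  // step 2: the parallel map

	// Step 3: only this goroutine ever touches the stream.
	for _, r := range results {
		fmt.Fprintf(os.Stdout, "Got result: %d\n", r)
	}
}
```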

How to make application building blocks thread-safe?

What you're describing in "Update 2" is similar to the Actor model. If you're 100% sure that you don't care one tiny bit about performance--and I mean potentially really bad performance--and never will, then what you're suggesting is a fair solution, though your proposed implementation has problems around the locking (or lack thereof) around busy and the wait(). You may be better served by taking a look at Akka or another Actor framework.

Think of an Actor as something that runs in a single thread and has a FIFO queue that you can deliver units of work to. For each unit of work, the Actor processes it in some way and then sends back a reply, and you're guaranteed that units of work are processed serially and not in parallel.

What you've dubbed your "Backend" would be the code running in one or more Actors, each separate from the others. A framework like this would allow you to take an approach similar to what you've described, but with the possibility to scale up to increase performance without too much effort and without requiring you to manage the concurrency.
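To show the shape of the idea without any framework, here is a bare-bones sketch of a single actor in Go. It is not Akka and makes no claim about Akka's API: request, startCounter and add are invented names. One goroutine owns the state and drains a FIFO mailbox, so units of work are handled strictly one at a time.

```go
package main

import "fmt"

// request is one unit of work plus a channel for the reply.
type request struct {
	amount int
	reply  chan int
}

// startCounter launches a hypothetical "actor": a single goroutine that
// owns total and handles requests from its mailbox one at a time.
func startCounter() chan<- request {
	mailbox := make(chan request, 16) // FIFO queue of pending work
	go func() {
		total := 0 // only this goroutine ever touches total
		for req := range mailbox {
			total += req.amount
			req.reply <- total
		}
	}()
	return mailbox
}

// add sends one unit of work and waits for the actor's reply.
func add(mailbox chan<- request, amount int) int {
	reply := make(chan int, 1)
	mailbox <- request{amount: amount, reply: reply}
	return <-reply
}

func main() {
	counter := startCounter()
	fmt.Println(add(counter, 5))  // 5
	fmt.Println(add(counter, 10)) // 15
	close(counter)                // lets the actor goroutine exit
}
```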

How to make java class thread safe?

The way you presented it, if each thread has its own copy, then it can be called thread safe, because at most one thread ever accesses each instance.

Another thing: if you declare your fields as private, provide no setters, and hold the instance in a final variable (final User user = new User(...)), then it is effectively immutable. Without setters the object cannot be modified, and the final reference cannot be reassigned. If you want setter-like methods while keeping the immutability, they have to return a new instance of the object with the changed fields.

@markspace pointed out that a better approach is to declare the fields themselves as final, because the previous approach breaks down if User becomes a member of some other class (unless that member is final too).

How to make a variable thread-safe

synchronized in Java is a means of allowing only a single thread to execute a code block (at any given time).

In Go there are numerous constructs to achieve that (e.g. mutexes, channels, waitgroups, primitives in sync/atomic), but Go's proverb is: "Do not communicate by sharing memory; instead, share memory by communicating."

So instead of locking and sharing a variable, try to not do that but instead communicate the result between goroutines e.g. using channels (so you won't have to access shared memory). For details, see The Go Blog: Share Memory By Communicating.

Of course there may be cases when the simplest, direct solution is to use a mutex to protect concurrent access from multiple goroutines to a variable. When this is the case, this is how you can do that:

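import "sync"

var (
    mu        sync.Mutex // guards protectMe
    protectMe int
)

// getMe returns the current value while holding the lock.
func getMe() int {
    mu.Lock()
    me := protectMe
    mu.Unlock()
    return me
}

// setMe stores a new value while holding the lock.
func setMe(me int) {
    mu.Lock()
    protectMe = me
    mu.Unlock()
}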

The above solution could be improved in several areas:

Use sync.RWMutex instead of sync.Mutex, so that the getMe() may lock for reading only, so multiple concurrent readers would not block each other.
After a (successful) lock it is advisable to unlock using defer, so that if something bad happens in the subsequent code (e.g. a runtime panic), the mutex is still unlocked, avoiding resource leaks and deadlocks. That said, this example is so simple that nothing can go wrong between lock and unlock, so it does not by itself warrant deferred unlocking everywhere.
It is good practice to keep the mutex close to the data it is meant to protect. So "wrapping" protectMe and its mu in a struct is a good idea. And while we're at it, we may also use embedding, so locking / unlocking becomes more convenient (unless this functionality must not be exposed). For details, see When do you embed mutex in struct in Go?

So an improved version of the above example could look like this (try it on the Go Playground):

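package main

import (
    "fmt"
    "sync"
)

// Me bundles the value with the RWMutex that guards it.
type Me struct {
    sync.RWMutex
    me int
}

// Get takes only a read lock, so concurrent readers do not block each other.
func (m *Me) Get() int {
    m.RLock()
    defer m.RUnlock()
    return m.me
}

// Set takes the write lock, excluding all readers and other writers.
func (m *Me) Set(me int) {
    m.Lock()
    m.me = me
    m.Unlock()
}

var me = &Me{}

func main() {
    me.Set(2)
    fmt.Println(me.Get())
}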

This solution has another advantage: should you need multiple values of Me, each value automatically gets its own separate mutex (our initial solution would require creating a separate mutex manually for each new value).

Although this example is correct and valid, it may not be practical, because protecting a single integer does not really require a mutex. We could achieve the same using the sync/atomic package:

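import "sync/atomic"

var protectMe int32

// getMe atomically loads the current value.
func getMe() int32 {
    return atomic.LoadInt32(&protectMe)
}

// setMe atomically stores a new value.
func setMe(me int32) {
    atomic.StoreInt32(&protectMe, me)
}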

This solution is shorter, cleaner and faster. If your goal is only to protect a single value, this solution is preferred. If the data structure you need to protect is more complex, atomic may not even be viable, and using a mutex might be justified.

Now, after showing examples of sharing / protecting variables, we should also give an example of what we should aim for in order to live up to "Do not communicate by sharing memory; instead, share memory by communicating."

The situation is that you have multiple concurrent goroutines, and you use a variable where you store some state. One goroutine changes (sets) the state, and another reads (gets) the state. To access this state from multiple goroutines, access must be synchronized.

The idea is to not have a "shared" variable at all. Instead, the goroutine that would set the state sends it, and the goroutine that would read it receives it. So there is no shared state variable, only communication between the two goroutines. Go provides excellent support for this kind of "inter-goroutine" communication: channels. Support for channels is built into the language: there are send statements, receive operators and other support (e.g. you can loop over the values sent on a channel). For an intro and details, please check this answer: What are channels used for?
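A minimal sketch of that set/get pair rewritten as send/receive might look like this (producer and consume are illustrative names, and the states are just the integers 1 to 3):

```go
package main

import "fmt"

// producer sends each new state on the channel instead of storing it in
// a shared variable, then closes the channel to signal that it is done.
func producer(states chan<- int) {
	for s := 1; s <= 3; s++ {
		states <- s
	}
	close(states)
}

// consume receives states until the channel is closed and returns them.
func consume(states <-chan int) []int {
	var seen []int
	for s := range states {
		seen = append(seen, s)
	}
	return seen
}

func main() {
	states := make(chan int)
	go producer(states)
	fmt.Println(consume(states)) // [1 2 3]
}
```

Neither goroutine needs a mutex, because each value is owned by exactly one goroutine at any moment.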

Let's see a practical / real-life example: a "broker". A broker is an entity where "clients" (goroutines) may subscribe to receive messages / updates, and the broker is capable of broadcasting messages to subscribed clients. In a system where there are numerous clients that might subscribe / unsubscribe at any time, and there may be a need to broadcast messages at any time, synchronizing all this in a safe manner would be complex. With careful use of channels, the broker implementation is rather clean and simple. Please allow me to not repeat the code, but you can check it in this answer: How to broadcast message using channel. The implementation is perfectly safe for concurrent use, supports "unlimited" clients, and does not use a single mutex or shared variable, only channels.
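To give a feel for the shape of such a broker, here is a deliberately small sketch. It is not the code from the linked answer: it omits unsubscribing, and every name in it (broker, newBroker, run, Subscribe, Publish, Stop) is an assumption made for this illustration. The subscriber set lives inside one goroutine, and everything else talks to that goroutine through channels.

```go
package main

import "fmt"

// broker is a hypothetical, minimal broadcaster. All of its state lives
// inside run, so nothing is shared between goroutines except channels.
type broker struct {
	subscribe chan chan string
	publish   chan string
	stop      chan struct{}
}

func newBroker() *broker {
	return &broker{
		subscribe: make(chan chan string),
		publish:   make(chan string),
		stop:      make(chan struct{}),
	}
}

// run owns the subscriber set and must be started in its own goroutine.
func (b *broker) run() {
	subs := map[chan string]struct{}{}
	for {
		select {
		case ch := <-b.subscribe:
			subs[ch] = struct{}{}
		case msg := <-b.publish:
			for ch := range subs {
				ch <- msg // blocks if a subscriber's buffer is full
			}
		case <-b.stop:
			for ch := range subs {
				close(ch)
			}
			return
		}
	}
}

// Subscribe registers a new client and returns its message channel.
func (b *broker) Subscribe() <-chan string {
	ch := make(chan string, 8)
	b.subscribe <- ch
	return ch
}

// Publish hands msg to the broker for delivery to every subscriber.
func (b *broker) Publish(msg string) { b.publish <- msg }

// Stop shuts the broker down and closes all subscriber channels.
func (b *broker) Stop() { close(b.stop) }

func main() {
	b := newBroker()
	go b.run()

	first := b.Subscribe()
	second := b.Subscribe()
	b.Publish("hello")

	fmt.Println(<-first, <-second) // hello hello
	b.Stop()
}
```

A slow subscriber can stall this simplified version once its buffer fills; a production broker would need a policy for that.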

Also see related questions:

Reading values from a different thread