Should I Use Byte or Int

Using string as a lock to do thread synchronization

Strings like that (from the code) could be "interned". This means all instances of "ABC" point to the same object. Even across AppDomains you can point to the same object (thx Steven for the tip).

If you have a lot of string-mutexes, from different locations, but with the same text, they could all lock on the same object.

The intern pool conserves string storage. If you assign a literal string constant to several variables, each variable is set to reference the same constant in the intern pool instead of referencing several different instances of String that have identical values.

It's better to use:

 private static readonly object mutex = new object();

Also, since your string is not const or readonly, you can change it. So (in theory) it is possible to lock on your mutex. Change mutex to another reference, and then enter a critical section because the lock uses another object/reference. Example:

private static string mutex = "1";
private static string mutex2 = "1";  // for 'lock' mutex2 and mutex are the same

private static void CriticalButFlawedMethod() {
    lock(mutex) {
      mutex += "."; // Hey, now mutex points to another reference/object
      // You are free to re-enter
      ...
    }
}

Is it a good practice to create and use String objects for locking in thread synchronization?

You should not use Strings for lock objects especially not string literals.

String literals are from the String pool and each String literal which is the same string is the same reference. This means if 2 different threads use 2 "different" string literals are actually the same and hence deadlock can easily occur.

To demonstrate:

// Thread #1
String LOCK1 = "mylock";
synchronized (LOCK1) {
}

// Thread #2
String LOCK2 = "mylock";
synchronized (LOCK2) {
    // This is actually the SAME lock,
    // might cause deadlock between the 2 synchronized blocks!
    // Because LOCK1==LOCK2!
}

Best would be to synchronize on private objects which are not accessible from the "outside". If you use an Object for lock which is visible from "outside" (or returned by a method), that object is available to anyone to also use as a lock which you have no control over and may cause a deadlock with your internal synchronized block.

For example you can synchronize on the object you whish to guard if it is private, or create a private, internal lock Object:

private final Object LOCK = new Object();

// Later:
synchronized (LOCK) {
    // LOCK is not known to any "outsiders", safe to use it as internal lock
}

Is it OK to use a string as a lock object?

Locking on strings is discouraged, the main reason is that (because of string-interning) some other code could lock on the same string instance without you knowing this. Creating a potential for deadlock situations.

Now this is probably a far fetched scenario in most concrete situations. It's more a general rule for libraries.

But on the other hand, what is the perceived benefit of strings?

So, point for point:

Are there any problems with this approach?

Yes, but mostly theoretical.

Is it OK to lock on a string object in this way, and are there any thread safety issues in using the HashSet?

The HashSet<> is not involved in the thread-safety as long as the threads only read concurrently.

Is it better to, for example, create a Dictionary that creates a new lock object for each string instance?

Yes. Just to be on the safe side. In a large system the main aim for avoiding deadlock is to keep the lock-objects as local and private as possible. Only a limited amount of code should be able to access them.

Creating an object from string and using as monitor in synchronized block

I know that I can't use string directly and ints don't have monitor so
I can't use hashcode directly.

Yes, nonetheless, you could (but you should not) take advantage of the method intern(). From source one can read:

The java string intern() method returns the interned string. It
returns the canonical representation of string.
It can be used to return string from memory, if it is created by new
keyword. It creates exact copy of heap string object in string
constant pool.

String annotationString = // ...;
synchronized (annotationString.intern())

Notwithstanding, the problem of locking on strings is that other objects could inadvertently be locking on those same strings for entirely different purposes. In other words, strings have a global semantic scope, that you cannot control, i.e., ensure that those same strings are not being used elsewhere as a locking mechanism. From this SO Thread (among other points) one can read:

Locking on strings is discouraged, the main reason is that (because of
string-interning) some other code could lock on the same string
instance without you knowing this. Creating a potential for deadlock
situations.
Now this is probably a far fetched scenario in most concrete
situations. It's more a general rule for libraries.

An alternative solution would be to have a Map of <String, Object> where the key is the string read from the annotation, and the value would be an object which threads can use to synchronize.

The Map would be shared among threads, therefore one would have to ensure mutual exclusion when accessing such a Map. Every time a thread reads a string from an annotation adds that string to the Map if not yet added, otherwise synchronizes on the value associated with that string key.

What object is the lock on when a String parameter is used for locking?

The intrinsic lock on the String object is the lock that gets acquired. But whether locking works depends on if the string is always the same instance or not. String pools and interning will affect this.

That it is difficult to figure out if the same instance will be used is only one reason not to do this.

If two classes in the application use the same string instance then one of them can acquire the lock and shut the other out. So you can have conceptually unrelated objects affecting each other, contending for the same lock.

Also people are going to be confused into thinking they can use the string value to mean something and have code change the value of s inside the synchronized block or method. That will break all the locking. Locks are on the objects, not on the variables. Changing values means the thread currently holding the lock now has the old object but threads trying to get in are trying to get the lock on the new object, a thread can acquire the new object and start executing the synchronized code before the previous thread is done.

It might work by accident sometimes but it is a terrible idea. Use a dedicated object as a lock instead:

private final Object lock = new Object();

Thread synchronization- When does a thread release the lock on an object

A synchronized method will only stop other threads from executing it while it is being executed. As soon as it returns other threads can (and often will immediately) get access.

The scenario to get your 1 1 2 2 ... could be:

Thread 1 calls push(1) and is allowed in.
Thread 2 calls push(1) and is blocked while Thread 1 is using it.
Thread 1 exits push(1).
Thread 2 gains access to push and pushes 1 but at the same time Thread 1 calls push(2).

Result 1 1 2 - you can clearly see how it continues.

Why is reason that synchronized lock not working on String concatenation

The reason for this behaviour is that String concatenation, if the operands are not both compile-time constant expressions, results in a new instance of String being created. See here in the language spec:

The String object is newly created (§12.5) unless the expression is a constant expression (§15.29).

Method parameters are not compile-time constant expressions, since the method can be called with any value for the parameter.

And, because the current thread is the only thread which has access to this newly-created String, synchronizing on it has no effect.

Since you say you don't want to know the alternatives, I'll stop there.

You seem unconvinced, despite my pointing to the relevant part of the spec, that the strings you are synchronizing on are different. Try adding this line, right before the synchronized:

    System.out.println(System.identityHashCode(lock));  // Add this
    synchronized (lock) {
      // ...

This will print out the "identity hash code" of the lock, which isn't the same as lock.hashCode(), which is based on the value of the string. This will show that synchronized is synchronizing on different values (unless you're exceptionally lucky/unlucky, since hash collisions are unlikely but possible).

How does these lazy inline string-locks actually work, do they work, do they cause problem?

There are problems using strings as mutexes for locking and a very good answer here:

Using string as a lock to do thread synchronization

In addition to that, there may be an issue with your logic. If the 'SomeIdentifier' is mutable, then you run into the possible race condition where:

Thread A calls get on object instance 1

Thread B updates SomeIdentifer on object instance 1

Thread A locks and instantiates stuff on object instance 1

Thread B calls get on object instance 1 (locks on a different string)

Thread A calls DoLazyInitialization

Thread B returns 'stuff' (as it is not null) prior to DoLazyInitialization completing.

Synchronizing on String objects in Java

Without putting my brain fully into gear, from a quick scan of what you say it looks as though you need to intern() your Strings:

final String firstkey = "Data-" + email;
final String key = firstkey.intern();

Two Strings with the same value are otherwise not necessarily the same object.

Note that this may introduce a new point of contention, since deep in the VM, intern() may have to acquire a lock. I have no idea what modern VMs look like in this area, but one hopes they are fiendishly optimised.

I assume you know that StaticCache still needs to be thread-safe. But the contention there should be tiny compared with what you'd have if you were locking on the cache rather than just the key while calling getSomeDataForEmail.

Response to question update:

I think that's because a string literal always yields the same object. Dave Costa points out in a comment that it's even better than that: a literal always yields the canonical representation. So all String literals with the same value anywhere in the program would yield the same object.

Edit

Others have pointed out that synchronizing on intern strings is actually a really bad idea - partly because creating intern strings is permitted to cause them to exist in perpetuity, and partly because if more than one bit of code anywhere in your program synchronizes on intern strings, you have dependencies between those bits of code, and preventing deadlocks or other bugs may be impossible.

Strategies to avoid this by storing a lock object per key string are being developed in other answers as I type.

Here's an alternative - it still uses a singular lock, but we know we're going to need one of those for the cache anyway, and you were talking about 50 threads, not 5000, so that may not be fatal. I'm also assuming that the performance bottleneck here is slow blocking I/O in DoSlowThing() which will therefore hugely benefit from not being serialised. If that's not the bottleneck, then:

If the CPU is busy then this approach may not be sufficient and you need another approach.
If the CPU is not busy, and access to server is not a bottleneck, then this approach is overkill, and you might as well forget both this and per-key locking, put a big synchronized(StaticCache) around the whole operation, and do it the easy way.

Obviously this approach needs to be soak tested for scalability before use -- I guarantee nothing.

This code does NOT require that StaticCache is synchronized or otherwise thread-safe. That needs to be revisited if any other code (for example scheduled clean-up of old data) ever touches the cache.

IN_PROGRESS is a dummy value - not exactly clean, but the code's simple and it saves having two hashtables. It doesn't handle InterruptedException because I don't know what your app wants to do in that case. Also, if DoSlowThing() consistently fails for a given key this code as it stands is not exactly elegant, since every thread through will retry it. Since I don't know what the failure criteria are, and whether they are liable to be temporary or permanent, I don't handle this either, I just make sure threads don't block forever. In practice you may want to put a data value in the cache which indicates 'not available', perhaps with a reason, and a timeout for when to retry.

// do not attempt double-check locking here. I mean it.
synchronized(StaticObject) {
    data = StaticCache.get(key);
    while (data == IN_PROGRESS) {
        // another thread is getting the data
        StaticObject.wait();
        data = StaticCache.get(key);
    }
    if (data == null) {
        // we must get the data
        StaticCache.put(key, IN_PROGRESS, TIME_MAX_VALUE);
    }
}
if (data == null) {
    // we must get the data
    try {
        data = server.DoSlowThing(key);
    } finally {
        synchronized(StaticObject) {
            // WARNING: failure here is fatal, and must be allowed to terminate
            // the app or else waiters will be left forever. Choose a suitable
            // collection type in which replacing the value for a key is guaranteed.
            StaticCache.put(key, data, CURRENT_TIME);
            StaticObject.notifyAll();
        }
    }
}

Every time anything is added to the cache, all threads wake up and check the cache (no matter what key they're after), so it's possible to get better performance with less contentious algorithms. However, much of that work will take place during your copious idle CPU time blocking on I/O, so it may not be a problem.

This code could be commoned-up for use with multiple caches, if you define suitable abstractions for the cache and its associated lock, the data it returns, the IN_PROGRESS dummy, and the slow operation to perform. Rolling the whole thing into a method on the cache might not be a bad idea.

What is the use of taking a lock on an object in synchronized block, if it can be accessed in any other method?

If there is a synchronized block which is taking lock on an object, say StringBuilder sb, which one thread is executing this synchronized block in which sb is locked, suppose there is another thread which is calling another method which will try to change the value of sb(not in synchronized block), then, will it be allowed to do that?

Uh yes. I think you need to do some reading about what synchronized does. See the Java tutorial. It does not "lock" an object as in restrict other threads from operating on it. What it does is provide mutex for the surrounded block of code for threads that lock on the same object instance. The fields in an object are perfectly able to be mutated both inside or outside the synchronized block.

It is always a good idea to synchronize on a constant object so I tend to do something like:

 private final Object lock = new Object();
 ...
 synchronized (lock) {
    // only one thread allowed inside this block at a time
 }

If multiple threads are accessing some sort of thread-unsafe object, I will synchronize on that object and do the operations on the object inside the synchronized block:

 private final SomeUnsafeObject someObject = ...;
 ...
 synchronized (someObject) {
    // only one thread allowed inside this block at a time
    someObject.someUnsafeMethodCall(...);
 }
 // no accesses (read or write) outside of the synchronized block

I thought locking an object would not allow to change that object as well from being modified or accessed. But, from output I am seeing the locked object can be modified:

No, there is no implicit blocking of the object being changed. If you only access the object's fields inside of the synchronized block then you would have accomplished what you want.

public void m2() {
    ...
    abc.append("A");

Right, since you are not inside of a synchronized (abc) block here, there is nothing that stops the thread from calling abc.append(...).

I am not getting this, can anybody please explain this. What is the use of taking a lock on an object? Is it just because wait/notify/notifyAll methods?

Synchronization allows you to control access to a block of code to one thread at a time based on the lock object (or the monitor on that object to be precise). It also allows you to do lock.wait(...) and lock.notify(...) to control the threads operation and block/release them as well.

Synchronization also puts up memory barriers so that a thread will see changes stored to central memory when it enters a synchronized block and will see it's changes written to central memory when it leaves. Without these memory barriers, if other threads access the StringBuilder without synchronization then they may seem some partially updated portion of that class which can cause NPEs or other failures.

Should I Use Byte or Int

Using string as a lock to do thread synchronization

Is it a good practice to create and use String objects for locking in thread synchronization?

Is it OK to use a string as a lock object?

Creating an object from string and using as monitor in synchronized block

What object is the lock on when a String parameter is used for locking?

Thread synchronization- When does a thread release the lock on an object

Why is reason that synchronized lock not working on String concatenation

How does these lazy inline string-locks actually work, do they work, do they cause problem?

Synchronizing on String objects in Java

What is the use of taking a lock on an object in synchronized block, if it can be accessed in any other method?

Related Topics

Leave a reply