Is It Ok to Use a String as a Lock Object

Is it OK to use a string as a lock object?

Locking on strings is discouraged, the main reason is that (because of string-interning) some other code could lock on the same string instance without you knowing this. Creating a potential for deadlock situations.

Now this is probably a far fetched scenario in most concrete situations. It's more a general rule for libraries.

But on the other hand, what is the perceived benefit of strings?

So, point for point:

Are there any problems with this approach?

Yes, but mostly theoretical.

Is it OK to lock on a string object in this way, and are there any thread safety issues in using the HashSet?

The HashSet<> is not involved in the thread-safety as long as the threads only read concurrently.

Is it better to, for example, create a Dictionary that creates a new lock object for each string instance?

Yes. Just to be on the safe side. In a large system the main aim for avoiding deadlock is to keep the lock-objects as local and private as possible. Only a limited amount of code should be able to access them.

Is it a good practice to create and use String objects for locking in thread synchronization?

You should not use Strings for lock objects especially not string literals.

String literals are from the String pool and each String literal which is the same string is the same reference. This means if 2 different threads use 2 "different" string literals are actually the same and hence deadlock can easily occur.

To demonstrate:

// Thread #1
String LOCK1 = "mylock";
synchronized (LOCK1) {
}

// Thread #2
String LOCK2 = "mylock";
synchronized (LOCK2) {
    // This is actually the SAME lock,
    // might cause deadlock between the 2 synchronized blocks!
    // Because LOCK1==LOCK2!
}

Best would be to synchronize on private objects which are not accessible from the "outside". If you use an Object for lock which is visible from "outside" (or returned by a method), that object is available to anyone to also use as a lock which you have no control over and may cause a deadlock with your internal synchronized block.

For example you can synchronize on the object you whish to guard if it is private, or create a private, internal lock Object:

private final Object LOCK = new Object();

// Later:
synchronized (LOCK) {
    // LOCK is not known to any "outsiders", safe to use it as internal lock
}

Using string as a lock to do thread synchronization

Strings like that (from the code) could be "interned". This means all instances of "ABC" point to the same object. Even across AppDomains you can point to the same object (thx Steven for the tip).

If you have a lot of string-mutexes, from different locations, but with the same text, they could all lock on the same object.

The intern pool conserves string storage. If you assign a literal string constant to several variables, each variable is set to reference the same constant in the intern pool instead of referencing several different instances of String that have identical values.

It's better to use:

 private static readonly object mutex = new object();

Also, since your string is not const or readonly, you can change it. So (in theory) it is possible to lock on your mutex. Change mutex to another reference, and then enter a critical section because the lock uses another object/reference. Example:

private static string mutex = "1";
private static string mutex2 = "1";  // for 'lock' mutex2 and mutex are the same

private static void CriticalButFlawedMethod() {
    lock(mutex) {
      mutex += "."; // Hey, now mutex points to another reference/object
      // You are free to re-enter
      ...
    }
}

Locking by string. Is this safe/sane?

Another option is to get the HashCode of each URL, then divide it by a prime number and use it as an index into an array of locks. This will limit the number of locks you need while letting you control the probability of a “false locking” by choose the number of locks to use.

However the above is only worthwhile if it is too costly just have one lock per active url.

When String is the object of the lock, which is better String.intern() or Striped?

In the first version, you get a unique lock per device as distinguished by the deviceId string. The intern is necessary in this case.

In the second version, you have only (say) 8 locks and they are striped across all of the devices.

What is the difference?

The first version creates more primitive locks. But primitive locks are cheap unless there is contention on a lock. With this version you only get lock contention if two or more threads are really trying to do something with the same device.
The second version never creates more than 8 locks, and they are Lock objects. The Lock API provides more functionality than a primitive lock ... if that is useful to you.
But with this version you get more contention. For example, if two different devices use the same striped lock, you get mutual exclusion for those two devices ... which you probably don't want.

There is also a theoretical issue with using interned strings as lock identifiers. The space of interned strings is JVM wide, so if you have different parts of your application independently locking things this way (e.g. using the interned device id strings), they can potentially interfere. The two parts of the application might end up accidentally sharing locks. (This issue is discussed in more depth in https://stackoverflow.com/a/134154/5973816.)

There are ways to avoid this if it is a real issue. For example, the two parts of the application could add (different) prefixes to the strings before interning, etc.

Which is better?

Well it depends on 1) what you are optimizing for and 2) how long the locks are liable to be held; i.e. the cost of unwanted contention caused by striping.

Under most circumstances, the first version (a distinct lock per device) will be better. But, if you have a huge number of devices, the cost of a huge number of interned strings might be significant. In that case, AND if the locks are held for a very short time, then lock striping might be better.

What object is the lock on when a String parameter is used for locking?

The intrinsic lock on the String object is the lock that gets acquired. But whether locking works depends on if the string is always the same instance or not. String pools and interning will affect this.

That it is difficult to figure out if the same instance will be used is only one reason not to do this.

If two classes in the application use the same string instance then one of them can acquire the lock and shut the other out. So you can have conceptually unrelated objects affecting each other, contending for the same lock.

Also people are going to be confused into thinking they can use the string value to mean something and have code change the value of s inside the synchronized block or method. That will break all the locking. Locks are on the objects, not on the variables. Changing values means the thread currently holding the lock now has the old object but threads trying to get in are trying to get the lock on the new object, a thread can acquire the new object and start executing the synchronized code before the previous thread is done.

It might work by accident sometimes but it is a terrible idea. Use a dedicated object as a lock instead:

private final Object lock = new Object();

Is it bad to overwrite a lock object if it is the last statement in the lock?

So, to start with, no the specific solution that you have provided is not safe, due to the specifics of how you're accessing the field.

Getting a working solution is easy enough. Just don't create a new list; instead, clear it out and add the new items:

class Foo
{
    private List<string> lockedList = new List<string>();

    public void ReplaceList(IEnumerable<string> items)
    {
        lock (lockedList)
        {
            lockedList.Clear();
            lockedList.AddRange(items);
        }
    }

    public void Add(string newItem)
    {
        lock (lockedList)
        {
            lockedList.Add(newItem);
        }
    }

    public void Contains(string item)
    {
        lock (lockedList)
        {
            lockedList.Contains(item);
        }
    }
}

Now your field isn't actually changing, and you don't need to worry about all of the problems that that can cause.

As for how the code in the question can break, all it takes is a call to Add or Contains to read the field, get the list, lock the list, and then have another thread replace the field. The when you read the field for a second time, after already getting the value to lock on, the value may have changed, so you'll end up mutating or reading from a list that another caller won't be restricted from accessing.

All that said, while mutating the lockedList variable is a really bad idea, and you should unquestionably avoid mutating it, as shown above, you can also ensure that you only actually read the field once, rather than reading from it repeatedly, and you'll still ensure that each list is only ever accessed from a single thread at any one time:

class Foo
{
    private volatile List<string> lockedList = new List<string>();

    public void ReplaceList(IEnumerable<string> items)
    {
        lockedList = new List<string>(items);
    }

    public void Add(string newItem)
    {
        var localList = lockedList;
        lock (localList)
        {
            localList.Add(newItem);
        }
    }

    public void Contains(string item)
    {
        var localList = lockedList;
        lock (localList)
        {
            localList.Contains(item);
        }
    }
}

Notice here that the problem that this fixes isn't mutating the field that the object to lock on was fetched from (that's not inherently the problem, although is a very bad practice), but rather constantly getting new values from the field in all usages of it from inside of the lock statements and expecting that value to never change, when it can.

This is going to be much harder to maintain, is very fragile, and is much more difficult to understand or ensure the correctness of though, so again, do do things like this.

What are the implications of using lock(typeof(string))

If you lock on a Type it will mean that you have a mutual access exclusion based on that instance of the Type. The implications are that two threads in the application doing this will inadvertently block each other or cause unforeseen deadlocks.

Remember, typeof(someType) just returns a Type instance.

It is typically a best practice to dedicate an object to locking a complex process, such as declaring a readonly object in your class. If the lock just needs to go around some access to a private variable, say a collection, then locking that collection is quite fine.

Is It Ok to Use a String as a Lock Object