What Exactly Does the .join() Method Do

What exactly does the .join() method do?

Look carefully at your output:

5wlfgALGbXOahekxSs9wlfgALGbXOahekxSs5
^                 ^                 ^

I've highlighted the "5", "9", "5" of your original string. Python's join() is a string method: it takes an iterable of strings and joins them, inserting the string it was called on between each pair of elements. A simpler example might help explain:

>>> ",".join(["a", "b", "c"])
'a,b,c'

The "," is inserted between the elements of the given list. In your case, your "list" is the string "595", which join() iterates character by character, exactly as if it were the list ["5", "9", "5"].
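A minimal sketch of that behaviour, assuming Python 3. The "-" separator and the variable names are chosen only for illustration:

```python
# str.join() iterates over its argument; a string is iterated character by character.
separator = "-"
from_list = separator.join(["5", "9", "5"])
from_string = separator.join("595")  # same as joining ["5", "9", "5"]

print(from_list)    # 5-9-5
print(from_string)  # 5-9-5
```

Both calls give the same result, which is why the random letters ended up between the digits of the original string.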

It appears that you're looking for + instead:

print array.array('c', random.sample(string.ascii_letters, 20 - len(strid))).tostring() + strid
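The line above is Python 2 code. On Python 3, where the array 'c' type code is not available, the same idea could be sketched roughly as follows. The value of strid is assumed from the output discussed above, and padding and result are names introduced here for illustration:

```python
import random
import string

strid = "595"  # assumed example value, taken from the output discussed above

# Pick 20 - len(strid) distinct random letters, then append strid with +.
padding = "".join(random.sample(string.ascii_letters, 20 - len(strid)))
result = padding + strid

print(result)
```

Here "".join(...) is used in its proper role (gluing the random letters together with an empty separator), while + appends the ID unchanged.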

What exactly is Python multiprocessing Module's .join() Method Doing?

The join() method, when used with threading or multiprocessing, is not related to str.join(): it does not concatenate anything. Rather, it just means "wait for this [thread/process] to complete". The name join is used because the multiprocessing module's API is meant to look as similar as possible to the threading module's API, and the threading module uses join for its Thread object. Using the term join to mean "wait for a thread to complete" is common across many programming languages, so Python adopted it as well.

Now, the reason you see the 20 second delay both with and without the call to join() is that, by default, when the main process is ready to exit, it implicitly calls join() on all running multiprocessing.Process instances. This isn't as clearly stated in the multiprocessing docs as it should be, but it is mentioned in the Programming Guidelines section:

Remember also that non-daemonic processes will be automatically be
joined.

You can override this behavior by setting the daemon flag on the Process to True prior to starting the process:

p = Process(target=say_hello)
p.daemon = True
p.start()
# Both parent and child will exit here, since the main process has completed.

If you do that, the child process will be terminated as soon as the main process completes:

daemon

The process’s daemon flag, a Boolean value. This must be set before
start() is called.

The initial value is inherited from the creating process.

When a process exits, it attempts to terminate all of its daemonic
child processes.
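A small, self-contained sketch of the two points above (explicit join() and the daemon flag). The body of say_hello is a hypothetical stand-in for the function in the question, and make_process is a helper introduced only for this illustration:

```python
import multiprocessing
import time


def say_hello():
    # Hypothetical stand-in for the question's target function.
    time.sleep(0.2)
    print("hello from the child")


def make_process(daemon=None):
    # Create (but do not start) a process; daemon=None keeps the inherited flag.
    p = multiprocessing.Process(target=say_hello)
    if daemon is not None:
        p.daemon = daemon  # must be set before start()
    return p


if __name__ == "__main__":
    p = make_process()
    p.start()
    p.join()  # explicit wait; without it, the parent would still wait at exit
    print("child exit code:", p.exitcode)
```

Calling make_process(daemon=True) instead would give a child that the parent does not wait for and that is terminated when the parent exits.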

What is the use of join() in Python threading?

Some somewhat clumsy ASCII art to demonstrate the mechanism:
The join() is presumably called by the main-thread. It could also be called by another thread, but that would needlessly complicate the diagram.

Strictly speaking, the join() call belongs on the track of the main-thread, but to show which thread is being waited for and to keep the diagram as simple as possible, I chose to place it on the child-thread's track instead.

without join:
+---+---+------------------                     main-thread
    |   |
    |   +...........                            child-thread(short)
    +..................................         child-thread(long)

with join
+---+---+------------------***********+###      main-thread
    |   |                             |
    |   +...........join()            |         child-thread(short)
    +......................join()......         child-thread(long)

with join and daemon thread
+-+--+---+------------------***********+###     parent-thread
  |  |   |                             |
  |  |   +...........join()            |        child-thread(short)
  |  +......................join()......        child-thread(long)
  +,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,     child-thread(long + daemonized)

'-' main-thread/parent-thread/main-program execution
'.' child-thread execution
'#' optional parent-thread execution after the join()-blocked parent-thread
    could continue
'*' main-thread 'sleeping' in join-method, waiting for child-thread to finish
',' daemonized thread - 'ignores' lifetime of other threads;
    terminates when the main-program exits; is normally meant for
    join-independent tasks

So the reason you don't see any change is that your main-thread does nothing after your join.
You could say join is (only) relevant for the execution-flow of the main-thread.

If, for example, you want to concurrently download a bunch of pages to concatenate them into a single large page, you may start concurrent downloads using threads, but need to wait until the last page/thread is finished before you start assembling a single page out of many. That's when you use join().
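The download scenario can be sketched as follows. No network access is involved: fake_download, download_all, and the page names are hypothetical, and a short sleep stands in for the real fetch:

```python
import threading
import time

# Hypothetical page names; real code would fetch URLs instead of sleeping.
PAGES = ["page-1", "page-2", "page-3"]


def fake_download(index, name, results):
    time.sleep(0.05 * (len(PAGES) - index))  # later pages finish sooner, on purpose
    results[index] = f"<{name}>"


def download_all(names):
    results = [None] * len(names)
    threads = [
        threading.Thread(target=fake_download, args=(i, name, results))
        for i, name in enumerate(names)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # block until this download has finished
    # Every slot is filled here, because all threads have been joined.
    return "".join(results)


print(download_all(PAGES))  # <page-1><page-2><page-3>
```

Without the join() loop, the final "".join(results) could run while some slots are still None and fail with a TypeError.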

Why is the word join used for the Thread.join() method?

The word "join" comes from the Fork–Join model, where "fork" means to split the thread into multiple threads for parallel processing, and "join" means to wait for the parallel threads to complete their part, before continuing in a single thread.

How does join() work in Java? Does it guarantee the execution before main()?

Short answer

How does join() work in Java?

I grant you that the javadoc for join() is a little bit unclear, because it's not obvious who this refers to when you first read it.

It means that the thread calling t.join() blocks until the thread t has finished its execution. If t has already finished when the current thread calls t.join(), then the current thread does not stop and just keeps going.
The word this in the doc refers to t here, not to the thread that calls the method.
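The same two cases can be observed with a small sketch, written in Python to match the earlier examples on this page (Python's threading.Thread.join() blocks the caller in the same way). The names short_task, alive_after_first_join, and second_join_seconds are introduced only for illustration:

```python
import threading
import time


def short_task():
    time.sleep(0.1)


t = threading.Thread(target=short_task)
t.start()

t.join()  # the calling (main) thread blocks here until t is done
alive_after_first_join = t.is_alive()

start = time.monotonic()
t.join()  # t has already finished, so this returns right away
second_join_seconds = time.monotonic() - start

print(alive_after_first_join)  # False
```

The second call shows that joining a finished thread is harmless: the caller simply keeps going.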

Does it guarantee the execution before main()?

[...] if no join is used, main() executes anywhere b/w the execution of threads [...]

You shouldn't consider main() as a whole. Parts of main() are executed before the other threads, parts of it in parallel, and parts of it after. That's actually what start() and join() control. Let me explain below.

What happens in your main()

Here is the sequence of events regarding t1.start() and t1.join(). The same reasoning applies to t3.

  1. The instructions of main() preceding t1.start() are executed

  2. t1.start() starts the thread t1 (t1.run() might not start right away.)

  3. The instructions of main() between t1.start() and t1.join() are executed in parallel(*) with those of t1.run().

    Note: You have none in your example, so only t1.run() instructions are executed at this moment.

  4. t1.join():

    • if t1.run() has already finished, nothing happens and main() keeps going
    • if t1.run() has not finished yet, the main thread stops and waits until t1.run() finishes. Then t1.run() finishes, and then main() resumes.
  5. The instructions of main() after t1.join() are executed

Here you can see that:

  • the part of main() preceding t1.start() is guaranteed to be executed before t1.run()
  • the part of main() following t1.join() is guaranteed to be executed after t1.run()

(*) see below section about parallelism
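The two guarantees can be checked with a short sketch, again in Python rather than Java for consistency with the earlier examples. The events list and its labels are assumptions made for this illustration:

```python
import threading

events = []  # shared log; in CPython each list.append call is atomic


def run():
    events.append("t1.run")


events.append("main: before start")
t1 = threading.Thread(target=run)
t1.start()
events.append("main: between start and join")  # may land before or after "t1.run"
t1.join()
events.append("main: after join")

print(events)
```

Only the first and last entries have a fixed position; the two middle entries can appear in either order from run to run.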

What I mean by "executed in parallel"

Suppose you have these 2 sets of instructions being executed in 2 threads A and B:

// Thread A                   |     // Thread B
                              |
System.out.println("A1");     |     System.out.println("B1");
System.out.println("A2");     |     System.out.println("B2");
System.out.println("A3");     |     System.out.println("B3");

If these 2 threads are "executed in parallel", this means 3 things:

  • the order of execution of the instructions of thread A is guaranteed:

    A1 will execute before A2, and A2 before A3.

  • the order of execution of the instructions of thread B is guaranteed:

    B1 will execute before B2, and B2 before B3.

  • however, A's and B's instructions can be interleaved, which means all of the following are possible (and more):

A1, B1, A2, B2, B3, A3

B1, B2, A1, B3, A2, A3

A1, A2, A3, B1, B2, B3 // special case where A's are all executed before B's

B1, B2, B3, A1, A2, A3 // special case where B's are all executed before A's
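To make "and more" concrete, the following sketch enumerates every ordering that respects both per-thread orders. It is plain Python with no threads involved, and interleavings is a helper name introduced here:

```python
from itertools import combinations


def interleavings(a, b):
    """Yield every merge of a and b that keeps each list's own order."""
    total = len(a) + len(b)
    # Choose which positions of the merged sequence are taken by a's items.
    for a_slots in combinations(range(total), len(a)):
        slots = set(a_slots)
        a_iter, b_iter = iter(a), iter(b)
        yield [next(a_iter) if i in slots else next(b_iter) for i in range(total)]


all_orders = list(interleavings(["A1", "A2", "A3"], ["B1", "B2", "B3"]))
print(len(all_orders))  # 20
```

There are 20 such orderings for two threads of three instructions each (choosing 3 of the 6 positions for A's instructions), and the four listed above are among them.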


Note: this section dealt with parallelism as an illusion created by the OS to make the user feel like things run at the same time, where actually there is only one core executing instructions sequentially, jumping from one process/thread to another.

In fact, an A instruction and a B instruction could be executed simultaneously (real parallelism) on 2 separate cores. The 3 bullet points above still stand anyway. As @jameslarge pointed out, we usually model concurrency as a sequence of events, even on multicore machines. This leaves aside the notion of 2 events being simultaneous, which adds complications without bringing anything useful.

Java Multithreading concept and join() method

You must understand that thread scheduling is controlled by the thread scheduler. So, you cannot guarantee the order of execution of threads under normal circumstances.

However, you can use join() to wait for a thread to complete its work.

For example, in your case

ob1.t.join();

This statement will not return until thread t has finished running.

Try this,

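class Demo {
    public static void main(String[] args) {
        Thread t = new Thread(
            new Runnable() {
                public void run() {
                    // do something
                }
            }
        );
        Thread t1 = new Thread(
            new Runnable() {
                public void run() {
                    // do something
                }
            }
        );
        t.start(); // Line 15
        try {
            t.join(); // Line 16
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
        }
        t1.start(); // runs only after t has finished (or the wait was interrupted)
    }
}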

In the above example, the main thread is the one executing these statements. When it reaches line 15, thread t becomes available to the thread scheduler. As soon as the main thread reaches line 16, it waits for thread t to finish.

NOTE that t.join did not do anything to thread t or to thread t1. It only affected the thread that called it (i.e., the main() thread).
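A Python counterpart of the same start/join/start sequence, sketched here to show the resulting order. The worker function and the log list are assumptions for illustration, and the final t1.join() is added only so the log is complete before it is printed:

```python
import threading
import time

log = []


def worker(name):
    log.append(f"{name} started")
    time.sleep(0.05)
    log.append(f"{name} finished")


t = threading.Thread(target=worker, args=("t",))
t1 = threading.Thread(target=worker, args=("t1",))

t.start()
t.join()    # only the main thread waits here; t itself is unaffected
t1.start()  # reached only after t has finished
t1.join()   # added so the log is complete before printing

print(log)  # ['t started', 't finished', 't1 started', 't1 finished']
```

Because the main thread does not reach t1.start() until t.join() returns, the two workers never overlap in this arrangement.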

Edited:

t.join() throws the checked InterruptedException, so the call needs to be inside a try block (or in a method that declares the exception); otherwise you will get an error at compile time. So, it should be:

try {
    t.join();
} catch (InterruptedException e) {
    // ...
}

