How to Suspend and Resume a Sequence of Commands in Bash

How do I suspend and resume a sequence of commands in Bash?

Run it in a subshell:

(cmd1 && cmd2)

Example:

$ (sleep 5 && echo 1)                        # started the command chain
^Z
[1]+  Stopped         ( sleep 5 && echo 1 )  # stopped before `sleep 5` finished
$ fg                                         # resumed
( sleep 5 && echo 1 )
1                                            # `sleep 5` finished and `echo 1` ran
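
Inside a script there is no terminal to receive Ctrl+Z, so a rough non-interactive counterpart is to send SIGSTOP and SIGCONT to the subshell's PID. The sketch below rests on some assumptions: the temporary file and the variable names are illustrative, and only the subshell itself is stopped. A command that is already running inside it keeps running, but the next command in the chain cannot start until the subshell is resumed.

```shell
#!/bin/bash
# Sketch: pause and resume a command chain from a script (no Ctrl+Z, no fg).
out=$(mktemp)   # hypothetical scratch file used to observe the chain's progress

# The parentheses give the whole chain one subshell, and therefore one PID.
(sleep 1 && echo 1 > "$out") &
pid=$!

kill -STOP "$pid"             # suspend the subshell, much like Ctrl+Z at a prompt
sleep 2                       # the chain would have finished by now if it were running
while_stopped=$(cat "$out")   # still empty: `echo 1` has not been allowed to run

kill -CONT "$pid"             # resume the subshell, much like fg or bg
wait "$pid"                   # block until the chain completes
after_resume=$(cat "$out")
rm -f "$out"

echo "while stopped: '$while_stopped', after resume: '$after_resume'"
```

At an interactive prompt, Ctrl+Z and fg remain the simpler choice; this variant is only useful when the pause has to be triggered by another script.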

In bash, is there a way to suspend a script, let a user enter some commands, and then resume the script once they are done?

If the script is running interactively, simply invoke an editor on the file:

#!/bin/bash
echo "hello world" > myfile.txt
${EDITOR:-nano} myfile.txt

echo "Here's what you ended up saving:"
cat myfile.txt

This opens the user's preferred editor (falling back to nano if EDITOR is unset) and does not continue until the user quits the editor. This is similar to how git commit opens an editor.

You can similarly give the user an interactive shell:

#!/bin/bash
echo "Please run any commands you want, and 'exit' when done"
bash
echo "Ok, continuing"

In this case, a shell starts, the user can run whichever commands they want, and when they exit the shell using exit or Ctrl+D, the script continues.

Run one command after another, even if I suspend the first one (Ctrl-z)

The following should do it:

(command1; command2)

Note the added parentheses.
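
The separator inside the parentheses still matters once the group is resumed. As a small illustration (with `false` standing in for a hypothetical command1 that fails), `;` runs the second command regardless of the first one's exit status, while `&&` runs it only on success:

```shell
#!/bin/bash
# `false` stands in for a command1 that exits with a non-zero status.
with_semicolon=$( (false; echo "command2 ran") )
with_and=$( (false && echo "command2 ran") || true )

# ${var:-<nothing>} prints a placeholder when the variable is empty.
echo "';'  -> ${with_semicolon:-<nothing>}"
echo "'&&' -> ${with_and:-<nothing>}"
```

Use `;` when command2 must always follow command1, and `&&` when it should run only if command1 succeeded.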

Stop and restart a loop in Bash

As others mentioned, you need to start a new subshell: (for i in {1..10}; do sleep 3; echo $i; done)
You can suspend it with Ctrl+Z. If you run the jobs command, you should see the suspended job. Then resume it with the fg or bg command.

Jsfiddle:http://jsfiddle.net/lakshmipathi/chccLdLt/3/

Suspend, save to disk, restart long jobs on a supercomputer with PBS

I'm not sure which version of PBS you're using, but TORQUE offers integration with Berkeley Lab Checkpoint/Restart (BLCR). The most important requirement for BLCR is that all the nodes have exactly the same OS image. Setting it up is rather involved and is documented in the TORQUE docs.

Essentially, the pbs_mom daemons are configured to use BLCR, and whenever you stop a job the daemon uses BLCR to take a snapshot of the OS internal data structures to know the exact state of the process, making it able to restart the same process from exactly the same point.

How to suspend/resume a process in Windows?

You can't do it from the command line; you have to write some code (I assume you're not just looking for a utility, otherwise Super User may be a better place to ask). I also assume your application has all the required permissions to do it (the examples omit all error checking).

Hard Way

First get all the threads of the given process, then call the SuspendThread function on each one to stop it (and ResumeThread to resume). This works, but some applications may crash or hang because a thread can be stopped at any point and the order of suspend/resume is unpredictable (which may, for example, cause a deadlock). For a single-threaded application this may not be an issue.

#include <windows.h>
#include <tlhelp32.h>

void suspend(DWORD processId)
{
    // Snapshot of every thread in the system; filtered by owner below.
    HANDLE hThreadSnapshot = CreateToolhelp32Snapshot(TH32CS_SNAPTHREAD, 0);

    THREADENTRY32 threadEntry;
    threadEntry.dwSize = sizeof(THREADENTRY32);

    Thread32First(hThreadSnapshot, &threadEntry);

    do
    {
        if (threadEntry.th32OwnerProcessID == processId)
        {
            // THREAD_SUSPEND_RESUME is the only access right SuspendThread needs.
            HANDLE hThread = OpenThread(THREAD_SUSPEND_RESUME, FALSE,
                threadEntry.th32ThreadID);

            SuspendThread(hThread);
            CloseHandle(hThread);
        }
    } while (Thread32Next(hThreadSnapshot, &threadEntry));

    CloseHandle(hThreadSnapshot);
}

Please note that this function is quite naive: when resuming, you should skip threads that were already suspended before you started, and it is easy to cause a deadlock because of the suspend/resume order. For single-threaded applications it is verbose, but it works.

Undocumented way

Starting with Windows XP there is the NtSuspendProcess function, but it is undocumented. Read this post (or this article) for a code example (reference for undocumented functions: news://comp.os.ms-windows.programmer.win32).

typedef LONG (NTAPI *NtSuspendProcess)(IN HANDLE ProcessHandle);

void suspend(DWORD processId)
{
    HANDLE processHandle = OpenProcess(PROCESS_ALL_ACCESS, FALSE, processId);

    // GetModuleHandleA takes a narrow string, so this also compiles in Unicode builds.
    NtSuspendProcess pfnNtSuspendProcess = (NtSuspendProcess)GetProcAddress(
        GetModuleHandleA("ntdll"), "NtSuspendProcess");

    pfnNtSuspendProcess(processHandle);
    CloseHandle(processHandle);
}

"Debugger" Way

Suspending a program is what a debugger usually does. To do it, you can use the DebugActiveProcess function, which suspends the process's execution (all threads together). To resume, you can use DebugActiveProcessStop.

This function lets you stop a process given its process ID. The syntax is very simple: just pass the ID of the process you want to stop, et voilà. If you write a command-line application, you'll need to keep it running to keep the process suspended (otherwise the process will be terminated). See the Remarks section on MSDN for details.

void suspend(DWORD processId)
{
    DebugActiveProcess(processId);
}

From Command Line

As I said, the Windows command line has no utility to do that, but you can invoke a Windows API function from PowerShell. First install the Invoke-WindowsApi script, then you can write this:

Invoke-WindowsApi "kernel32" ([bool]) "DebugActiveProcess" @([int]) @(process_id_here)

Of course if you need it often you can make an alias for that.

Any way to exit bash script, but not quitting the terminal

The "problem" really is that you're sourcing and not executing the script. When you source a file, its contents will be executed in the current shell, instead of spawning a subshell. So everything, including exit, will affect the current shell.

Instead of using exit, you will want to use return.
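
If the same file may be either sourced or executed, a common idiom is to try return first and fall back to exit. The sketch below writes a hypothetical throwaway script to a temporary file and runs it both ways; the file contents and variable names are illustrative only.

```shell
#!/bin/bash
script=$(mktemp)   # hypothetical script file created just for this demonstration
cat > "$script" <<'EOF'
echo "before"
# `return` works only when the file is sourced; otherwise fall back to `exit`.
return 0 2>/dev/null || exit 0
echo "never reached"
EOF

# Sourced: `return` ends the file, and the sourcing shell carries on.
sourced_output=$(. "$script"; echo "shell still alive")

# Executed: `return` fails outside a sourced file, so `exit` ends the child process.
executed_output=$(bash "$script")
rm -f "$script"

printf '%s\n---\n%s\n' "$sourced_output" "$executed_output"
```

With a plain exit in place of the return line, the sourced run would end the sourcing shell before it could print its own message.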

Run a shell script and immediately background it, however keep the ability to inspect its output

To 'background' a process when you start it

Simply add an ampersand (&) after the command.

If the program writes to standard out, it will still write to your console / terminal.
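
If output arriving in the middle of your prompt is a nuisance, one alternative (not part of the answer above) is to redirect the job's output to a file and read that file whenever you like. A minimal sketch, assuming a temporary log file is acceptable; the file and variable names are illustrative:

```shell
#!/bin/bash
log=$(mktemp)   # hypothetical log file; pick a permanent path for real use

# Send stdout and stderr to the file so the output can be inspected at any time.
sh -c 'sleep 1 && echo I just woke up' > "$log" 2>&1 &
pid=$!

# While the job runs, `tail -f "$log"` in any terminal would follow the output.
wait "$pid"
captured=$(cat "$log")
rm -f "$log"
echo "$captured"
```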

To foreground the process

Simply use the fg command. You can see a list of jobs in the background with jobs.

For example:

sh -c 'sleep 3 && echo I just woke up' & jobs

To background a currently running process

If you have already started the process in the foreground, but you want to move it to the background, you can do the following:

Press Ctrl+z to put the current process to sleep and return to your shell. This process will be paused until you send it another signal.
Run the bg command to resume the process, but have it run in the background instead of the foreground.

How to make a program continue to run after log out from ssh?

Assuming that you have a program running in the foreground, press ctrl-Z, then:

[1]+  Stopped                 myprogram
$ disown -h %1
$ bg 1
[1]+ myprogram &
$ logout

If there is only one job, then you don't need to specify the job number. Just use disown -h and bg.

Explanation of the above steps:

You press ctrl-Z. The system suspends the running program, displays a job number and a "Stopped" message and returns you to a bash prompt.

You type the disown -h %1 command (here I've used 1, but you'd use the job number that was displayed in the Stopped message), which marks the job so that the shell does not send it SIGHUP when it exits (so logging out will not kill it).

Next, type the bg command using the same job number; this resumes the running of the program in the background and a message is displayed confirming that.

You can now log out and it will continue running.
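
If the program has not been started yet, nohup is a different technique with a similar effect: it starts the command with SIGHUP ignored, so no Ctrl+Z or disown is needed. The sketch below uses a short sh -c command as a stand-in for myprogram and a temporary output file; both are assumptions made for illustration.

```shell
#!/bin/bash
out=$(mktemp)   # hypothetical output file; use a permanent path for real use

# nohup starts the command with SIGHUP ignored. Redirecting all three standard
# streams means the command no longer depends on the terminal.
nohup sh -c 'sleep 1; echo finished' > "$out" 2>&1 < /dev/null &
pid=$!

# For this demonstration only: wait and show the result.
# In real use you would simply log out at this point.
wait "$pid"
result=$(cat "$out")
rm -f "$out"
echo "$result"
```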

Why can't I use job control in a bash script?

What he meant is that job control is by default turned off in non-interactive mode (i.e. in a script.)

From the bash man page:

JOB CONTROL
       Job  control refers to the ability to selectively stop (suspend)
       the execution of processes and continue (resume) their execution at a
       later point.
       A user typically employs this facility via an interactive interface
       supplied jointly by the system’s terminal driver and bash.

and

   set [--abefhkmnptuvxBCHP] [-o option] [arg ...]
      ...
      -m      Monitor mode.  Job control is enabled.  This option is on by
              default for interactive shells on systems that support it (see
              JOB CONTROL above).  Background processes run in a separate
              process group and a line containing their exit status  is
              printed  upon  their completion.
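
One way to observe this default is to look at the special parameter $-, which lists the current shell's option flags and contains m when monitor mode is on. A small sketch, with illustrative variable names, that compares a non-interactive bash with and without set -m:

```shell
#!/bin/bash
# $- lists the option flags of the shell; `m` means monitor mode is on.
default_flags=$(bash -c 'echo "$-"')
monitor_flags=$(bash -c 'set -m; echo "$-"')

case $default_flags in *m*) default_state=on ;; *) default_state=off ;; esac
case $monitor_flags in *m*) monitor_state=on ;; *) monitor_state=off ;; esac

echo "non-interactive default: $default_state"
echo "after set -m:            $monitor_state"
```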

When he said "is stupid", he meant two things:

Job control is meant mostly for facilitating interactive control (whereas a script can work directly with the PIDs).
To quote his original answer, the approach "... relies on the fact that you didn't start any other jobs previously in the script which is a bad assumption to make". Which is quite correct.
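
Working directly with PIDs, as mentioned in the first point, might look like the sketch below. The two background tasks and their exit codes are hypothetical; the point is that $! and wait identify each process exactly, no matter how many other jobs the script has started.

```shell
#!/bin/bash
# Two hypothetical background tasks; the exit codes 0 and 3 are arbitrary.
(sleep 1; exit 0) & pid_a=$!
(sleep 1; exit 3) & pid_b=$!

# `wait PID` returns that process's exit status, independent of job numbers.
status_a=0; wait "$pid_a" || status_a=$?
status_b=0; wait "$pid_b" || status_b=$?

echo "task a exited with $status_a, task b exited with $status_b"
```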

UPDATE

In answer to your comment: yes, nobody will stop you from using job control in your bash script. There is no hard case for forcefully disabling set -m (i.e. yes, job control from the script will work if you want it to). Remember that in the end, especially in scripting, there is always more than one way to skin a cat, but some ways are more portable, more reliable, and make it simpler to handle error cases, parse the output, etc.

Your particular circumstances may or may not warrant a way different from what lhunath (and other users) deem "best practices".