How to Link a Linux Thread TID and a pthread_t "Thread ID"

How do I get a thread ID from an arbitrary pthread_t?

Since pthreads do not need to be implemented with Linux threads (or kernel threads at all, for that matter), and some implementations are entirely user-level or mixed, the pthreads interface does not provide functions to access these implementation details, as those would not be portable (even across pthreads implementations on Linux). Thread libraries that use those could provide this as an extension, but there do not seem to be any that do.

Other than accessing internal data structures of the threading library (which you understandably do not want, although with your assumptions about processor affinity and Linux thread IDs, your code will not be portable anyway), you may be able to play a trick at creation time, if you control the code that creates the threads:

Give pthread_create() an entry function that calls gettid(), stores the result somewhere, and then calls the original entry function. (You will likely have to invoke it as syscall(SYS_gettid), because glibc versions before 2.30 do not export a gettid() wrapper.) If you have multiple threads with the same entry function, you can pass a pointer to successive elements of an array as the arg argument to pthread_create(); the entry function you created then stores the thread ID through that pointer. Store the pthread_t values that pthread_create() writes through its first argument in the same order, and you will be able to look up the Linux thread ID of any thread you created given its pthread_t value.
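The trick described above can be sketched roughly as follows. This is only one possible shape, not a library facility: struct thread_slot, trampoline(), create_in_slot(), tid_of() and the demo entry function store_own_tid() are all made-up names, and the sketch assumes Linux with glibc and that each slot outlives its thread.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical bookkeeping record: one slot per created thread. */
struct thread_slot {
    pthread_t handle;            /* filled in by pthread_create() */
    pid_t tid;                   /* stored by the new thread; 0 until then */
    void *(*entry)(void *);      /* the original entry function */
    void *arg;                   /* the original argument */
};

static pthread_mutex_t slot_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t slot_cond = PTHREAD_COND_INITIALIZER;

/* Entry function handed to pthread_create(): record the tid, then run
 * the original entry function. */
static void *trampoline(void *p)
{
    struct thread_slot *slot = p;

    pthread_mutex_lock(&slot_lock);
    slot->tid = (pid_t)syscall(SYS_gettid);
    pthread_cond_broadcast(&slot_cond);
    pthread_mutex_unlock(&slot_lock);

    return slot->entry(slot->arg);
}

/* Starts entry(arg) in a new thread tracked by *slot. Returns 0 or an
 * errno value, like pthread_create(). The slot must outlive the thread. */
static int create_in_slot(struct thread_slot *slot,
                          void *(*entry)(void *), void *arg)
{
    assert(slot != NULL && entry != NULL);
    slot->tid = 0;
    slot->entry = entry;
    slot->arg = arg;
    return pthread_create(&slot->handle, NULL, trampoline, slot);
}

/* Returns the Linux tid for a pthread_t created through create_in_slot(),
 * waiting until the thread has stored it; -1 if the handle is unknown.
 * Only the first `count` slots, all already created, are searched. */
static pid_t tid_of(struct thread_slot *slots, size_t count, pthread_t handle)
{
    pid_t tid = -1;

    for (size_t i = 0; i < count; i++) {
        if (!pthread_equal(slots[i].handle, handle))
            continue;
        pthread_mutex_lock(&slot_lock);
        while (slots[i].tid == 0)
            pthread_cond_wait(&slot_cond, &slot_lock);
        tid = slots[i].tid;
        pthread_mutex_unlock(&slot_lock);
        break;
    }
    return tid;
}

/* Demo entry function: reports the tid it observes through its argument. */
static void *store_own_tid(void *out)
{
    *(pid_t *)out = (pid_t)syscall(SYS_gettid);
    return NULL;
}
```

The condition variable is there because the creating thread may ask for the tid before the new thread has had a chance to store it. A caller would fill an array of slots with create_in_slot() and later call tid_of(slots, count, some_handle).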

Whether this trick is worth it depends on how important setting the CPU affinity is in your case, versus not accessing internal structures of the thread library or depending on a thread library that provides pthread_setaffinity_np.

Get pthread_t from thread ID

I don't think such a function exists but I would be happy to be corrected.

As a workaround I can create a table mapping pthread_t to pid_t and ensure that I always invoke pthread_create() via a wrapper that adds an entry to this table. This works very well and allows me to convert an OS thread ID to a pthread_t, which I can then terminate using pthread_cancel(). Here is a snippet of the mechanism:

typedef void* (*threadFunc)(void*);

static void* start_thread(void* arg)
{
  threadFunc threadRoutine = routine_to_start;
  record_thread_start(pthread_self(),syscall(SYS_gettid));
  routine_to_start = NULL; // let the creating thread know it's safe to continue
  return threadRoutine(arg);
}
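The snippet does not show record_thread_start() or the lookup side. A minimal sketch of what such a table might look like is below; the fixed capacity MAX_TRACKED, the struct layout, and the helpers forget_thread() and lookup_pthread() are my own assumptions, not the original author's code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/types.h>

#define MAX_TRACKED 64              /* arbitrary capacity for this sketch */

struct thread_entry {
    pthread_t handle;
    pid_t tid;
    int in_use;
};

static struct thread_entry thread_table[MAX_TRACKED];
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Called by the new thread itself. Returns 0, or -1 if the table is full. */
static int record_thread_start(pthread_t handle, pid_t tid)
{
    int rc = -1;

    pthread_mutex_lock(&table_lock);
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (!thread_table[i].in_use) {
            thread_table[i].handle = handle;
            thread_table[i].tid = tid;
            thread_table[i].in_use = 1;
            rc = 0;
            break;
        }
    }
    pthread_mutex_unlock(&table_lock);
    return rc;
}

/* Removes an entry once its thread has ended, so that a recycled tid
 * cannot match a stale pthread_t. */
static void forget_thread(pid_t tid)
{
    pthread_mutex_lock(&table_lock);
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (thread_table[i].in_use && thread_table[i].tid == tid)
            thread_table[i].in_use = 0;
    }
    pthread_mutex_unlock(&table_lock);
}

/* Looks up the pthread_t recorded for tid. Returns 0 and fills *out on
 * success, or -1 if the tid is not in the table. */
static int lookup_pthread(pid_t tid, pthread_t *out)
{
    int rc = -1;

    assert(out != NULL);
    pthread_mutex_lock(&table_lock);
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (thread_table[i].in_use && thread_table[i].tid == tid) {
            *out = thread_table[i].handle;
            rc = 0;
            break;
        }
    }
    pthread_mutex_unlock(&table_lock);
    return rc;
}
```

With something like this in place, converting an OS thread ID is a lookup_pthread(tid, &handle) call followed by pthread_cancel(handle). Entries should be dropped with forget_thread() when a thread ends, since both tids and pthread_t values can be reused.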

How to get thread id of a pthread in linux c program?

The pthread_self() function returns the thread ID of the calling thread.

pthread_t pthread_self(void);

The pthread_self() function returns the Pthread handle of the calling thread. It does NOT return the integral thread ID of the calling thread. On platforms that provide it (it is a non-portable _np extension and is not part of glibc on Linux), you must use pthread_getthreadid_np() to return an integral identifier for the thread.

NOTE:

pthread_id_np_t   tid;
tid = pthread_getthreadid_np();

is significantly faster than these calls, but provides the same behavior.

pthread_id_np_t   tid;
pthread_t         self;
self = pthread_self();
pthread_getunique_np(&self, &tid);
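Since the question is about Linux: as far as I know, glibc provides neither pthread_getthreadid_np() nor pthread_getunique_np(). The closest Linux equivalent for the calling thread is the kernel tid, which could be obtained with a small helper like the one below. The name current_tid() is made up for this sketch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns the kernel thread ID (tid) of the calling thread on Linux. */
static pid_t current_tid(void)
{
    return (pid_t)syscall(SYS_gettid);
}
```

In the initial thread of a process this value equals getpid(); every other thread gets its own distinct tid.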

Does POSIX thread ID have a one-to-one relation with Linux thread ID?

Yes.

But consider this an implementation detail. Other OSs might do this differently.

Regarding the remark that pthread_t "is usually defined as" some particular type: pthread_t is opaque. Likewise, do not make any assumptions about how it is implemented.

I found that one linux thread ID maps to several POSIX thread IDs

Really? I doubt this. At least not if all the POSIX thread IDs in question were valid, that is, the related thread had either not been joined yet or, if running detached, had not ended yet.

How to map pthread_t to pid (on Linux)

One (convoluted, non-portable, Linux-specific, lightly destructive) method of mapping pthread_t to tid without looking into struct pthread is as follows:

Use pthread_setname_np to set a thread name to something unique.
Iterate over subdirectories of /proc/self/task and read a line from a file named comm in each of those.
If the line equals the unique string just set, extract the tid from the last component of the subdirectory name. This is your answer.

The thread name is not used by the OS for anything, so it should be safe to change it. Nevertheless you probably want to set it back to the value it had originally (use pthread_getname_np to obtain it).
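A rough sketch of these steps, assuming Linux with glibc and a mounted /proc. The function name tid_from_pthread() and the "tidprobe" naming scheme are invented for illustration. Thread names are limited to 15 characters plus the terminating NUL, so the probe string is kept short, and a mutex serializes callers so that two probes are never active at once.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <dirent.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

static pthread_mutex_t probe_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned probe_seq;

/* Returns the Linux tid of `thread`, or -1 if it could not be determined. */
static pid_t tid_from_pthread(pthread_t thread)
{
    char saved[16], probe[16], path[288], line[32];
    pid_t found = -1;

    pthread_mutex_lock(&probe_lock);

    /* Remember the current name, then set a (hopefully) unique one. */
    if (pthread_getname_np(thread, saved, sizeof saved) != 0)
        goto out;
    snprintf(probe, sizeof probe, "tidprobe%05u", probe_seq++ % 100000u);
    if (pthread_setname_np(thread, probe) != 0)
        goto out;

    /* Scan /proc/self/task/<tid>/comm for the probe name. */
    DIR *dir = opendir("/proc/self/task");
    if (dir != NULL) {
        struct dirent *de;
        while (found < 0 && (de = readdir(dir)) != NULL) {
            if (de->d_name[0] == '.')
                continue;
            snprintf(path, sizeof path, "/proc/self/task/%s/comm", de->d_name);
            FILE *f = fopen(path, "r");
            if (f == NULL)
                continue;               /* the thread may have just exited */
            if (fgets(line, sizeof line, f) != NULL) {
                line[strcspn(line, "\n")] = '\0';   /* comm ends with '\n' */
                if (strcmp(line, probe) == 0)
                    found = (pid_t)strtol(de->d_name, NULL, 10);
            }
            fclose(f);
        }
        closedir(dir);
    }

    /* Put the original name back. */
    pthread_setname_np(thread, saved);
out:
    pthread_mutex_unlock(&probe_lock);
    return found;
}
```

A quick sanity check is to call tid_from_pthread(pthread_self()) and compare the result with syscall(SYS_gettid); the two should match.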

How to extract taskid(tid) of a pthread from the parent thread?

All threads have a unique id:

std::thread::id this_id = std::this_thread::get_id();

You can store it in a variable when the program starts and it'll be accessible from the other threads.

I understand what you mean when you say parent thread, but even though one thread gave birth to another, they are siblings.

If you want the master thread to be able to get the /proc path to each worker thread, you could wrap the worker thread object in a class that, when it starts the actual thread, creates a path property that the master can later get.

An example:

#include <unistd.h>
#include <sys/syscall.h>
#include <sys/types.h>

#include <chrono>
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <stdexcept>
#include <string>
#include <thread>

// A base class for thread object wrappers
class abstract_thread {
public:
    abstract_thread() {}

    abstract_thread(const abstract_thread&) = delete;
    abstract_thread(abstract_thread&& rhs) :
        m_th(std::move(rhs.m_th)), m_terminated(rhs.m_terminated), m_cv{}, m_mtx{} {}
    abstract_thread& operator=(const abstract_thread&) = delete;
    abstract_thread& operator=(abstract_thread&& rhs) {
        terminate();
        join();
        m_th = std::move(rhs.m_th);
        m_terminated = rhs.m_terminated;
        return *this;
    }

    virtual ~abstract_thread() {
        // make sure we don't destroy a running thread object
        terminate();
        join();
    }

    virtual void start() {
        if(joinable())
            throw std::runtime_error("thread already running");
        else {
            std::unique_lock<std::mutex> lock(m_mtx);
            m_terminated = true;
            // start thread and wait for it to signal that setup has been done
            m_th = std::thread(&abstract_thread::proxy, this);
            m_cv.wait(lock, [this] { return m_terminated == false; });
        }
    }
    inline bool joinable() const { return m_th.joinable(); }
    inline void join() {
        if(joinable()) {
            m_th.join();
        }
    }
    inline void terminate() { m_terminated = true; }
    inline bool terminated() const { return m_terminated; }

protected:
    // override if thread specific setup needs to be done before start() returns
    virtual void setup_in_thread() {}
    // must be overridden in derived classes
    virtual void execute() = 0;

private:
    std::thread m_th{};
    bool m_terminated{};
    std::condition_variable m_cv{};
    std::mutex m_mtx{};

    void proxy() {
        {
            std::unique_lock<std::mutex> lock(m_mtx);
            setup_in_thread(); // call setup function
            m_terminated = false;
            m_cv.notify_one();
        }
        execute(); // run thread code
    }
};

// an abstract thread wrapper capable of returning its /proc path
class proc_path_thread : public abstract_thread {
public:
    // function to call from master to get the path
    const std::string& get_proc_path() const { return m_proc_path; }

protected:
    void setup_in_thread() override {
        m_proc_path = "/proc/" + std::to_string(syscall(SYS_gettid));
    }

private:
    std::string m_proc_path{};
};

// two different thread wrapper classes. Just inherit proc_path_thread and implement
// "execute()". Loop until terminated() is true (or you're done with the work)
class AutoStartThread : public proc_path_thread {
public:
    AutoStartThread() { start(); }

private:
    void execute() override {
        while(!terminated()) {
            std::this_thread::sleep_for(std::chrono::milliseconds(500));
            std::cout << std::this_thread::get_id() << " AutoStartThread running\n";
        }
    }
};

class ManualStartThread : public proc_path_thread {
    void execute() override {
        std::this_thread::sleep_for(std::chrono::milliseconds(100));
        std::cout << std::this_thread::get_id() << " ManualStartThread running\n";
        std::this_thread::sleep_for(std::chrono::milliseconds(100));
    }
};

int main() {
    AutoStartThread a;
    std::cout << a.get_proc_path() << "\t// AutoStartThread, will have path\n";

    ManualStartThread b;
    std::cout << b.get_proc_path()
              << "\t// ManualStartThread not started, no path\n";
    b.start();
    std::cout << b.get_proc_path()
              << "\t// ManualStartThread will now have a path\n";
    b.join();

    std::this_thread::sleep_for(std::chrono::milliseconds(1500));
    // terminate() + join() are called automatically when abstract_thread descendants
    // go out of scope:
    //
    // a.terminate();
    // a.join();
}

Possible output:

/proc/38207 // AutoStartThread, will have path
    // ManualStartThread not started, no path
/proc/38208 // ManualStartThread will now have a path
139642064209664 ManualStartThread running
139642072602368 AutoStartThread running
139642072602368 AutoStartThread running
139642072602368 AutoStartThread running
139642072602368 AutoStartThread running

C: Printing out multiple threads' ID's before they execute anything?

One simple way to implement a thread pool in which each thread gets an opportunity to do preparatory work before all threads start the actual work is to have an additional startup mutex, an associated condition variable, and a counter of ready threads:

static pthread_mutex_t  prep_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t   prep_cond = PTHREAD_COND_INITIALIZER;
static volatile size_t  prep_count = 0;

static pthread_mutex_t  thread_mutex = PTHREAD_MUTEX_INITIALIZER;
static volatile size_t  thread_count = 0;

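/* glibc 2.30 and later declare gettid() themselves under _GNU_SOURCE; drop this helper there. */
static inline pid_t gettid(void) { return (pid_t)syscall(SYS_gettid); }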

static void *thread_function(void *idref)
{
    /* Get the number/index of this thread. */
    const size_t  num = *(pid_t *)idref;
    const pid_t   tid = gettid();

    /* Replace it with the Linux tid. */
    *(pid_t *)idref = tid;

    /* Block here until preparatory work can be done. */
    pthread_mutex_lock(&prep_mutex);

    printf("Thread %zu of %zu has started.\n", num + 1, thread_count);
    fflush(stdout);

    prep_count++;
    pthread_cond_signal(&prep_cond);
    pthread_mutex_unlock(&prep_mutex);

    /* Block until actual work can start. */
    pthread_mutex_lock(&thread_mutex);
    ...

When creating the thread pool, the creator holds both mutexes. When it is ready to let the threads start the prep work, it waits on prep_cond, which releases prep_mutex for the duration of the wait, until a sufficient number of threads have completed their preparatory work (prep_count is high enough). Then it releases thread_mutex.

int start_threads(size_t max_count)
{
    pthread_t       thread_id[max_count];
    pid_t           linux_tid[max_count];
    pthread_attr_t  attrs;
    int             err = 0;

    /* Assume no threads currently running */
    prep_count = thread_count = 0;

    /* Shrink stack size; the default is too large. */
    pthread_attr_init(&attrs);
    pthread_attr_setstacksize(&attrs, 2*PTHREAD_STACK_MIN);

    /* Grab the mutexes, so the threads will block initially. */
    pthread_mutex_lock(&prep_mutex);
    pthread_mutex_lock(&thread_mutex);

    while (thread_count < max_count) {
        linux_tid[thread_count] = thread_count;
        err = pthread_create(thread_id + thread_count, &attrs,
                             thread_function, linux_tid + thread_count);
        if (err)
            break;

        thread_count++;
    }

    /* Attributes are no longer needed. */
    pthread_attr_destroy(&attrs);

    /* No threads created at all? */
    if (thread_count < 1) {
        pthread_mutex_unlock(&prep_mutex);
        pthread_mutex_unlock(&thread_mutex);
        /* Return failure. */
        return err;
    }

    /* All threads have now been created; let them do prep work. */
    while (prep_count < thread_count) {
        pthread_cond_wait(&prep_cond, &prep_mutex);
    }
    pthread_mutex_unlock(&prep_mutex);

    /* All threads have finished their prep work; start working. */
    pthread_mutex_unlock(&thread_mutex);

    ...

The above creates up to max_count threads, with their pthread IDs in thread_id[], and their Linux tids in linux_tid[].

When the thread starts, it gets a pointer to the location to store the Linux tid to. Initially, that contains the index of the thread ("thread number", starting at 0), so thread_function grabs that as num, obtains the Linux tid as tid, and stores it to the location given as a parameter. This way, both the original thread and the created thread know the index (num), pthread ID (thread_id[], pthread_self()), and Linux tid (linux_tid[] and tid).

Note how the array elements are addressed in the pthread_create() call: if array is an array and index is an index into it, then &(array[index]) == array + index.

Above, the start_threads() function does not assume it can start all max_count threads. It returns a nonzero errno number if it cannot create any; but if it can create at least one thread, then by the time the threads can grab prep_mutex, thread_count reflects the correct number of threads in their gang. (There is no reason to fail just because you were only able to create 53 worker threads instead of 54, say.)