System Wide Keyboard Hook on X Under Linux

system wide keyboard hook on X under linux

XGrabKey on the root window is how xbindkeys does it. Make sure you have some alternative way of releasing the grab, though; it's very annoying to have to ssh into your own box from another machine just to kill that process. That's why, if it were me, xbindkeys plus "echo 'moo' > /tmp/moo-fifo" would be the way to do it. That way, you could also control it in any number of other ways you haven't thought of yet.
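The consuming side of that FIFO can stay small. The following is a minimal sketch, assuming the FIFO at /tmp/moo-fifo already exists (for example, created with mkfifo) and that xbindkeys writes one short word per line into it. The helper name read_command is hypothetical and not part of xbindkeys.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Read one newline-terminated command from fd into buf (capacity cap).
 * Returns the command length (0 for an empty line or end of file), or
 * -1 on a read error. The trailing newline is not stored. */
long read_command(int fd, char *buf, size_t cap)
{
    assert(buf != NULL && cap > 0);
    size_t len = 0;
    char ch;
    ssize_t n;

    /* Byte-at-a-time reads keep the sketch simple; commands are short. */
    while ((n = read(fd, &ch, 1)) == 1) {
        if (ch == '\n')
            break;
        if (len + 1 < cap)      /* drop bytes that do not fit in buf */
            buf[len++] = ch;
    }
    buf[len] = '\0';
    return (n < 0) ? -1 : (long)len;
}
```

A daemon would open the FIFO read-only, call read_command until it returns 0, and then reopen the FIFO, because each echo closes the writing end when it finishes.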

Error when trying to build a Global Keyboard Hook in Ubuntu Linux

Read more about X11 keyboard events. You will get them only from X11 windows that have the KeyPressMask or KeyReleaseMask bits set in their event mask, and such a window should be created as InputOnly or InputOutput.

You are apparently using Qt (which is a good idea). Then, stick to Qt key events.

(If you want to catch all X11 key events, use the root window of the display. But then you are interfering with your window manager, which is a bad idea; learn more about ICCCM and EWMH.)

Also, run xev in a terminal to understand more about X11 events.

How to Create Global Keyboard Hook with C# in Linux

Answering my own question here because I had to do a lot of research for it and I'm sure it will help someone else later. I went with option 1 because it seemed to be the easiest to implement.

Warning - There is going to be a lot of code in this post

Summary

For my intents and purposes, I wanted to have some code that would publish an event any time the user pressed a key anywhere on the system. While developing this, I found that I could also hook into mouse events as well.

It's important to note that the code here, like Linux itself, doesn't really distinguish between a keyboard key press and a mouse button press. To Linux, they're both just buttons.

With that in mind, you can expand this code to work with other devices such as gamepads and special input peripherals if you desire.

Additional Gotchas - As expressed in the question, this code will not block the device input to other programs. This can be problematic if you want to override the default functionality of say the power button or the volume buttons.

Setting Up Permissions

In order to run this code, the user that runs this program must be in the input group; otherwise it will throw an exception. Run this command to add the current user to that group, then log out and back in for the change to take effect.

 sudo gpasswd -a $USER input

EventType.cs

Since the folder /dev/input is essentially an event bus for the input/output devices that Linux knows about, there are a variety of event types that you may want to consume. Here is the enum that I was able to put together to make deciphering the event types a little easier.

public enum EventType
{
    /// <summary>
    /// Used as markers to separate events. Events may be separated in time or in space, such as with the multitouch protocol.
    /// </summary>
    EV_SYN = 0x00,

    /// <summary>
    /// Used to describe state changes of keyboards, buttons, or other key-like devices.
    /// </summary>
    EV_KEY = 0x01,

    /// <summary>
    /// Used to describe relative axis value changes, e.g. moving the mouse 5 units to the left.
    /// </summary>
    EV_REL = 0x02,

    /// <summary>
    /// Used to describe absolute axis value changes, e.g. describing the coordinates of a touch on a touchscreen.
    /// </summary>
    EV_ABS = 0x03,

    /// <summary>
    /// Used to describe miscellaneous input data that do not fit into other types.
    /// </summary>
    EV_MSC = 0x04,

    /// <summary>
    /// Used to describe binary state input switches.
    /// </summary>
    EV_SW = 0x05,

    /// <summary>
    /// Used to turn LEDs on devices on and off.
    /// </summary>
    EV_LED = 0x11,

    /// <summary>
    /// Used to output sound to devices.
    /// </summary>
    EV_SND = 0x12,

    /// <summary>
    /// Used for autorepeating devices.
    /// </summary>
    EV_REP = 0x14,

    /// <summary>
    /// Used to send force feedback commands to an input device.
    /// </summary>
    EV_FF = 0x15,

    /// <summary>
    /// A special type for power button and switch input.
    /// </summary>
    EV_PWR = 0x16,

    /// <summary>
    /// Used to receive force feedback device status.
    /// </summary>
    EV_FF_STATUS = 0x17,
}

KeyState.cs

Like many other event handling systems, there are multiple events each time the user presses a key: one when the key is pressed down, another when the key is released, and repeated ones while the user holds the key down. The order of the enum members matches the raw values 0, 1, and 2 that the kernel reports.

public enum KeyState
{
KeyUp,
KeyDown,
KeyHold
}

EventCode.cs

Each distinct button is associated with an event code. Whether it's a button on a keyboard or a button on a mouse, you'll probably be able to find it here. Here is a helper enum to make deciphering those codes easier.

/// <summary>
/// Mapping for this can be found here: https://github.com/torvalds/linux/blob/master/include/uapi/linux/input-event-codes.h
/// </summary>
public enum EventCode
{
Reserved = 0,
Esc = 1,
Num1 = 2,
Num2 = 3,
Num3 = 4,
Num4 = 5,
Num5 = 6,
Num6 = 7,
Num7 = 8,
Num8 = 9,
Num9 = 10,
Num0 = 11,
Minus = 12,
Equal = 13,
Backspace = 14,
Tab = 15,
Q = 16,
W = 17,
E = 18,
R = 19,
T = 20,
Y = 21,
U = 22,
I = 23,
O = 24,
P = 25,
LeftBrace = 26,
RightBrace = 27,
Enter = 28,
LeftCtrl = 29,
A = 30,
S = 31,
D = 32,
F = 33,
G = 34,
H = 35,
J = 36,
K = 37,
L = 38,
Semicolon = 39,
Apostrophe = 40,
Grave = 41,
LeftShift = 42,
Backslash = 43,
Z = 44,
X = 45,
C = 46,
V = 47,
B = 48,
N = 49,
M = 50,
Comma = 51,
Dot = 52,
Slash = 53,
RightShift = 54,
KpAsterisk = 55,
LeftAlt = 56,
Space = 57,
Capslock = 58,
F1 = 59,
F2 = 60,
F3 = 61,
F4 = 62,
F5 = 63,
F6 = 64,
F7 = 65,
F8 = 66,
F9 = 67,
F10 = 68,
Numlock = 69,
ScrollLock = 70,
Kp7 = 71,
Kp8 = 72,
Kp9 = 73,
KpMinus = 74,
Kp4 = 75,
Kp5 = 76,
Kp6 = 77,
KpPlus = 78,
Kp1 = 79,
Kp2 = 80,
Kp3 = 81,
Kp0 = 82,
KpDot = 83,

Zenkakuhankaku = 85,
//102ND = 86,
F11 = 87,
F12 = 88,
Ro = 89,
Katakana = 90,
Hiragana = 91,
Henkan = 92,
Katakanahiragana = 93,
Muhenkan = 94,
KpJpComma = 95,
KpEnter = 96,
RightCtrl = 97,
KpSlash = 98,
SysRq = 99,
RightAlt = 100,
LineFeed = 101,
Home = 102,
Up = 103,
Pageup = 104,
Left = 105,
Right = 106,
End = 107,
Down = 108,
Pagedown = 109,
Insert = 110,
Delete = 111,
Macro = 112,
Mute = 113,
VolumeDown = 114,
VolumeUp = 115,
Power = 116, // SC System Power Down
KpEqual = 117,
KpPlusMinus = 118,
Pause = 119,
Scale = 120, // AL Compiz Scale (Expose)

KpComma = 121,
Hangeul = 122,
Hanja = 123,
Yen = 124,
LeftMeta = 125,
RightMeta = 126,
Compose = 127,

Stop = 128, // AC Stop
Again = 129,
Props = 130, // AC Properties
Undo = 131, // AC Undo
Front = 132,
Copy = 133, // AC Copy
Open = 134, // AC Open
Paste = 135, // AC Paste
Find = 136, // AC Search
Cut = 137, // AC Cut
Help = 138, // AL Integrated Help Center
Menu = 139, // Menu (show menu)
Calc = 140, // AL Calculator
Setup = 141,
Sleep = 142, // SC System Sleep
Wakeup = 143, // System Wake Up
File = 144, // AL Local Machine Browser
Sendfile = 145,
DeleteFile = 146,
Xfer = 147,
Prog1 = 148,
Prog2 = 149,
Www = 150, // AL Internet Browser
MsDos = 151,
Coffee = 152, // AL Terminal Lock/Screensaver
RotateDisplay = 153, // Display orientation for e.g. tablets
CycleWindows = 154,
Mail = 155,
Bookmarks = 156, // AC Bookmarks
Computer = 157,
Back = 158, // AC Back
Forward = 159, // AC Forward
CloseCd = 160,
EjectCd = 161,
EjectCloseCd = 162,
NextSong = 163,
PlayPause = 164,
PreviousSong = 165,
StopCd = 166,
Record = 167,
Rewind = 168,
Phone = 169, // Media Select Telephone
Iso = 170,
Config = 171, // AL Consumer Control Configuration
Homepage = 172, // AC Home
Refresh = 173, // AC Refresh
Exit = 174, // AC Exit
Move = 175,
Edit = 176,
ScrollUp = 177,
ScrollDown = 178,
KpLeftParen = 179,
KpRightParen = 180,
New = 181, // AC New
Redo = 182, // AC Redo/Repeat

F13 = 183,
F14 = 184,
F15 = 185,
F16 = 186,
F17 = 187,
F18 = 188,
F19 = 189,
F20 = 190,
F21 = 191,
F22 = 192,
F23 = 193,
F24 = 194,

PlayCd = 200,
PauseCd = 201,
Prog3 = 202,
Prog4 = 203,
Dashboard = 204, // AL Dashboard
Suspend = 205,
Close = 206, // AC Close
Play = 207,
FastForward = 208,
BassBoost = 209,
Print = 210, // AC Print
Hp = 211,
Camera = 212,
Sound = 213,
Question = 214,
Email = 215,
Chat = 216,
Search = 217,
Connect = 218,
Finance = 219, // AL Checkbook/Finance
Sport = 220,
Shop = 221,
AltErase = 222,
Cancel = 223, // AC Cancel
BrightnessDown = 224,
BrightnessUp = 225,
Media = 226,

SwitchVideoMode = 227, // Cycle between available video outputs (Monitor/LCD/TV-out/etc)
KbdIllumToggle = 228,
KbdIllumDown = 229,
KbdIllumUp = 230,

Send = 231, // AC Send
Reply = 232, // AC Reply
ForwardMail = 233, // AC Forward Msg
Save = 234, // AC Save
Documents = 235,

Battery = 236,

Bluetooth = 237,
Wlan = 238,
Uwb = 239,

Unknown = 240,

VideoNext = 241, // drive next video source
VideoPrev = 242, // drive previous video source
BrightnessCycle = 243, // brightness up, after max is min
BrightnessAuto = 244, // Set Auto Brightness: manual brightness control is off, rely on ambient
DisplayOff = 245, // display device to off state

Wwan = 246, // Wireless WAN (LTE, UMTS, GSM, etc.)
RfKill = 247, // Key that controls all radios

MicMute = 248, // Mute / unmute the microphone
LeftMouse = 272,
RightMouse = 273,
MiddleMouse = 274,
MouseBack = 275,
MouseForward = 276,

ToolFinger = 325,
ToolQuintTap = 328,
Touch = 330,
ToolDoubleTap = 333,
ToolTripleTap = 334,
ToolQuadTap = 335,
Mic = 582
}

MouseAxis.cs

Mouse movements are expressed in an amount moved and an axis associated with that change. 0 represents movements on the X axis and 1 represents movements on the Y axis.

public enum MouseAxis
{
X,
Y
}

KeyPressEvent.cs

Here is the event that I use to process key press events.

public class KeyPressEvent : EventArgs
{
public KeyPressEvent(EventCode code, KeyState state)
{
Code = code;
State = state;
}

public EventCode Code { get; }

public KeyState State { get; }
}

MouseMoveEvent.cs

Here is the event that I use to process mouse movement updates.

public class MouseMoveEvent : EventArgs
{
public MouseMoveEvent(MouseAxis axis, int amount)
{
Axis = axis;
Amount = amount;
}

public MouseAxis Axis { get; }

public int Amount { get; set; }
}

InputReader.cs

This is where the bulk of the work happens. Here we have a class that takes the path to one of the event files and publishes an update whenever an event comes in. An example of such a file is "/dev/input/event0".

More research would be needed to support more event types, but I was only interested in keyboard and mouse input, so it serves my purposes. I also opted to drop the timestamp that is included with each event, but if you're interested, you can find it in the first 16 bytes of the buffer.

public class InputReader : IDisposable
{
public delegate void RaiseKeyPress(KeyPressEvent e);

public delegate void RaiseMouseMove(MouseMoveEvent e);

public event RaiseKeyPress OnKeyPress;
public event RaiseMouseMove OnMouseMove;

private const int BufferLength = 24;

private readonly byte[] _buffer = new byte[BufferLength];

private FileStream _stream;
private bool _disposing;

public InputReader(string path)
{
_stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite);

Task.Run(Run);
}

private void Run()
{
while (true)
{
if (_disposing)
break;

// Each record is 24 bytes: a 16-byte timestamp, then type, code and value.
_stream.Read(_buffer, 0, BufferLength);

var type = BitConverter.ToInt16(_buffer, 16);
var code = BitConverter.ToInt16(_buffer, 18);
var value = BitConverter.ToInt32(_buffer, 20);

var eventType = (EventType) type;

switch (eventType)
{
case EventType.EV_KEY:
HandleKeyPressEvent(code, value);
break;
case EventType.EV_REL:
var axis = (MouseAxis) code;
var e = new MouseMoveEvent(axis, value);
OnMouseMove?.Invoke(e);
break;
}
}
}

private void HandleKeyPressEvent(short code, int value)
{
var c = (EventCode) code;
var s = (KeyState) value;
var e = new KeyPressEvent(c, s);
OnKeyPress?.Invoke(e);
}

public void Dispose()
{
_disposing = true;
_stream.Dispose();
_stream = null;
}
}
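For readers who prefer C, the same record decoding can be sketched as follows. This is an illustration under the same assumption the C# code makes, namely a 24-byte record with a 16-byte timestamp as found on 64-bit Linux. The names decode_record, decoded_event, and key_state_name are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Size of one evdev record on 64-bit Linux: 16-byte timestamp,
 * 2-byte type, 2-byte code, 4-byte value. This layout is an assumption
 * of the sketch; 32-bit systems may use a smaller timestamp. */
enum { RECORD_SIZE = 24 };

enum { EV_TYPE_KEY = 1, EV_TYPE_REL = 2 };   /* EV_KEY and EV_REL */

struct decoded_event {
    uint16_t type;
    uint16_t code;
    int32_t  value;   /* for EV_KEY: 0 = up, 1 = down, 2 = hold */
};

/* Copy the fields out of a raw record in host byte order. */
struct decoded_event decode_record(const unsigned char *buf)
{
    assert(buf != NULL);
    struct decoded_event ev;
    memcpy(&ev.type,  buf + 16, sizeof ev.type);
    memcpy(&ev.code,  buf + 18, sizeof ev.code);
    memcpy(&ev.value, buf + 20, sizeof ev.value);
    return ev;
}

/* Name the three key states that the KeyState enum models. */
const char *key_state_name(int32_t value)
{
    switch (value) {
    case 0:  return "KeyUp";
    case 1:  return "KeyDown";
    case 2:  return "KeyHold";
    default: return "Unknown";
    }
}
```

Using memcpy instead of pointer casts avoids alignment problems, and the timestamp bytes are simply skipped, as in the C# version.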

AggregateInputReader.cs

Since I'm looking to handle input from every device on the system, I've put together this class to aggregate the input events from all the event files in the "/dev/input" folder.

Known issue - This code will throw an exception if a USB device is removed while it's running. I do intend to fix it in my own app implementation, but I don't have time to take care of it now.

public class AggregateInputReader : IDisposable
{
private List<InputReader> _readers = new();

public event InputReader.RaiseKeyPress OnKeyPress;

public AggregateInputReader()
{
var files = Directory.GetFiles("/dev/input/", "event*");

foreach (var file in files)
{
var reader = new InputReader(file);

reader.OnKeyPress += ReaderOnOnKeyPress;

_readers.Add(reader);
}
}

private void ReaderOnOnKeyPress(KeyPressEvent e)
{
OnKeyPress?.Invoke(e);
}

public void Dispose()
{
foreach (var d in _readers)
{
d.OnKeyPress -= ReaderOnOnKeyPress;
d.Dispose();
}

_readers = null;
}
}
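The directory scan itself can be sketched in C as well. This version is stricter than the "event*" pattern above: it only accepts names of the form "event" followed by digits, which is a design choice of this sketch. The function names is_event_node and count_event_nodes are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if name looks like "event" followed by one or more digits. */
int is_event_node(const char *name)
{
    assert(name != NULL);
    if (strncmp(name, "event", 5) != 0 || name[5] == '\0')
        return 0;
    for (const char *p = name + 5; *p != '\0'; ++p)
        if (!isdigit((unsigned char)*p))
            return 0;
    return 1;
}

/* Count matching entries in dir; returns -1 if dir cannot be opened. */
int count_event_nodes(const char *dir)
{
    DIR *d = opendir(dir);
    if (d == NULL)
        return -1;
    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(d)) != NULL)
        if (is_event_node(entry->d_name))
            ++count;
    closedir(d);
    return count;
}
```

Calling count_event_nodes("/dev/input") again later and comparing the result is one simple, if crude, way to notice that a device was added or removed.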

Example Usage

Not bad that this can now be accomplished in two lines of code.

public class Program
{
public static void Main(string[] args)
{
using var aggHandler = new AggregateInputReader();

aggHandler.OnKeyPress += (e) => { System.Console.WriteLine($"Code:{e.Code} State:{e.State}"); };

System.Console.ReadLine();
}
}

Thanks for sticking with this. I hope it works out for you!

Get Keyboard input c++ outside of terminal

If the application is running in the background as a daemon, you can use what on Windows is commonly called a "keyboard hook". This is done quite differently on Linux, though, and there are various methods you may want to look into.

It is discussed a bit in this SO question: system wide keyboard hook on X under linux

How can I capture a key stroke immediately in linux?

Basically, it depends heavily on how you define immediately.

There are two tasks here. The first is to disable the regular key echoing that is built into most C input libraries. The second is to print out the new character instead of the old one.

In pseudo code.

echo(off);
while (capturing && charIsAvailable()) {
    c = readOneChar();
    if (c == '\n') {
        capturing = false;
    }
    printf("%c", c + 1);
}
echo(on);

There are a number of systems communicating to capture a key press.

  1. The keyboard
  2. (possibly) a USB bus
  3. The CPU interrupt handler
  4. The operating system
  5. The X window server process
  6. The X "window" that has focus.

The last step is done with a program that runs a continuous loop that captures events from the X server and processes them. If you wanted to expand this program in certain ways (get the length of time the key was pressed) you need to tell the other programs that you want "raw" keyboard events, which means that you won't really be receiving fully "cooked" characters. As a result, you will have to keep track of which keys are up and down, and how long, and handle all the odd meta key behavior in your program (that's not an 'a' it's a 'A' because shift is down, etc).
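The bookkeeping for raw events described above (which keys are down, and for how long) can be sketched with a small table indexed by key code. This is only an illustration: the names are hypothetical, timestamps are assumed to arrive in milliseconds from whatever event source is used, and key codes are assumed to be below 256, as X11 keycodes are.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { MAX_KEYS = 256 };

struct key_tracker {
    uint8_t  down[MAX_KEYS];        /* 1 while the key is held */
    uint64_t pressed_at[MAX_KEYS];  /* timestamp of the press, in ms */
};

void tracker_init(struct key_tracker *t)
{
    assert(t != NULL);
    memset(t, 0, sizeof *t);
}

/* Record a press. Auto-repeat presses of an already-held key keep the
 * original timestamp so the measured duration covers the whole hold. */
void tracker_press(struct key_tracker *t, unsigned key, uint64_t now_ms)
{
    if (key >= MAX_KEYS || t->down[key])
        return;
    t->down[key] = 1;
    t->pressed_at[key] = now_ms;
}

/* Record a release and return how long the key was held in ms,
 * or 0 if no matching press was seen. */
uint64_t tracker_release(struct key_tracker *t, unsigned key, uint64_t now_ms)
{
    if (key >= MAX_KEYS || !t->down[key])
        return 0;
    t->down[key] = 0;
    return now_ms - t->pressed_at[key];
}
```

Modifier handling builds on the same table: whether a press means 'a' or 'A' depends on whether a shift key is marked as down at that moment.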

There are also other processing modes to consider, like canonical and non-canonical, which will control whether you wish the events to be received in line oriented chunks (line events) or character oriented chunks (character events). Again this is somewhat complicated by the need to make the upstream programs aware of the requirements of the downstream client.

Now that you have some idea of your environment, let's revisit the actual code needed to suppress character output.

// define a terminal configuration data structure
struct termios term;

// copy the stdin terminal configuration into term
tcgetattr( fileno(stdin), &term );

// turn off Canonical processing in term
term.c_lflag &= ~ICANON;

// turn off screen echo in term
term.c_lflag &= ~ECHO;

// set the terminal configuration for stdin according to term, now
tcsetattr( fileno(stdin), TCSANOW, &term);

// (fetch characters here, use printf to show whatever you like)

// turn on Canonical processing in term
term.c_lflag |= ICANON;

// turn on screen echo in term
term.c_lflag |= ECHO;

// set the terminal configuration for stdin according to term, now
tcsetattr( fileno(stdin), TCSANOW, &term);
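A variation on the snippet above is to keep a copy of the original settings and restore that copy afterwards, instead of switching the flags back on by hand, so that whatever state the terminal started in is preserved. The following is a minimal sketch of that idea; the function names are hypothetical, and setting VMIN and VTIME is an addition that the snippet above does not make.

```c
#include <assert.h>
#include <stddef.h>
#include <termios.h>

/* Return a copy of cfg with canonical mode and echo switched off.
 * The caller keeps the original to restore it later. */
struct termios without_echo(const struct termios *cfg)
{
    assert(cfg != NULL);
    struct termios raw = *cfg;
    raw.c_lflag &= ~(tcflag_t)(ICANON | ECHO);
    raw.c_cc[VMIN]  = 1;   /* read() returns after one byte */
    raw.c_cc[VTIME] = 0;   /* with no timeout */
    return raw;
}

/* Apply the no-echo settings to fd and store the previous settings in
 * *saved. Returns 0 on success, -1 if fd is not a terminal or the
 * settings cannot be changed. */
int enter_noecho(int fd, struct termios *saved)
{
    if (tcgetattr(fd, saved) != 0)
        return -1;
    struct termios raw = without_echo(saved);
    return tcsetattr(fd, TCSANOW, &raw);
}

/* Put back exactly what enter_noecho() saved. */
int leave_noecho(int fd, const struct termios *saved)
{
    return tcsetattr(fd, TCSANOW, saved);
}
```

A program would call enter_noecho on the file descriptor of stdin before reading characters and leave_noecho with the same saved structure before exiting.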

Even this is not immediate. To get immediate, you need to get closer to the source, which eventually means a kernel module (which still isn't as immediate as the keyboard micro-controller, which isn't as immediate as the moment the switch actually closes). With enough items in between the source and the destination, it eventually becomes possible to notice the difference. In practice, however, this code has been worked on a lot by people who are seeking the best tradeoff between performance and flexibility.

Listening to keyboard events without consuming them in X11 - Keyboard hooking

Here's my quick and dirty example

#include <X11/X.h>
#include <X11/Xlib.h>
#include <X11/Xutil.h>
#include <stdio.h>
#include <ctype.h>

int main ()
{
Display* d = XOpenDisplay(NULL);
Window root = DefaultRootWindow(d);
Window curFocus;
char buf[17];
KeySym ks;
XComposeStatus comp;
int len;
int revert;

XGetInputFocus (d, &curFocus, &revert);
XSelectInput(d, curFocus, KeyPressMask|KeyReleaseMask|FocusChangeMask);

while (1)
{
XEvent ev;
XNextEvent(d, &ev);
switch (ev.type)
{
case FocusOut:
printf ("Focus changed!\n");
printf ("Old focus is %d\n", (int)curFocus);
if (curFocus != root)
XSelectInput(d, curFocus, 0);
XGetInputFocus (d, &curFocus, &revert);
printf ("New focus is %d\n", (int)curFocus);
if (curFocus == PointerRoot)
curFocus = root;
XSelectInput(d, curFocus, KeyPressMask|KeyReleaseMask|FocusChangeMask);
break;

case KeyPress:
printf ("Got key!\n");
len = XLookupString(&ev.xkey, buf, 16, &ks, &comp);
if (len > 0 && isprint(buf[0]))
{
buf[len]=0;
printf("String is: %s\n", buf);
}
else
{
printf ("Key is: %d\n", (int)ks);
}
}

}
}

It's not reliable but most of the time it works. (It is showing keys I'm typing into this box right now). You may investigate why it does fail sometimes ;) Also it cannot show hotkeys in principle. Hotkeys are grabbed keys, and only one client can get a grabbed key. Absolutely nothing can be done here, short of loading a special X11 extension designed for this purpose (e.g. XEvIE).

Hook key & key combinations from keyboard with Qt 4.6

System-wide key grabbing is a tricky subject, but system-wide key hooking is even trickier. Every OS/GUI has its own solution, at least for grabbing. Qt4 doesn't expose such a feature, but the Qt eXTension library solves the problem with its QxtGlobalShortcut. It's a nice wrapper for:

  • XGrabKey()/XUngrabKey() in X11,
  • RegisterHotKey()/UnregisterHotKey() in Windows,
  • RegisterEventHotKey()/UnregisterEventHotKey() in Mac OS X.

So you can grab an explicit key combination, i.e. a particular key plus modifiers (XGrabKey() allows a bit more), that no other application will then receive. Key sequences, i.e. consecutive key combinations, are not supported here.


Keyboard hooking is much more powerful, because it allows peeking at the input events (or even filtering them). It's not only used by keyboard loggers, but they are a typical association here.

If you're into Windows, then you can read:

  • Hooks and DLLs by Joseph M. Newcomer,
  • Hooks.

In X11 it's much more complicated. There are at least two pages you may want to read:

  • X.Org Wiki - Development/Documentation/InputEventProcessing - to have some background,
  • Exploiting X11 to monitor keystrokes - to understand difficulties.

There was an X Event Interception Extension, but it wasn't maintained and was eventually removed.

Fortunately, it can be done without the help of the X11 infrastructure. In the Linux 2.6 kernel there is an "Event interface", known as evdev, that can be exploited here. Details can be found in the source code of the logkeys Linux keylogger. It can also be done with something similar in effect to evdev. See my PoC project:
kaos - Key Activity On-Screen display.

And I don't have a Mac, so no further references. ;)


