Where does output of print in kernel go?
It depends on the distribution, but many use klogd(8)
to get the messages from the kernel and will either log them to a file (sometimes /var/log/dmesg
or /var/log/kernel
) or to the system log via syslog(3)
. In the latter case, where the log entries end up will depend on the configuration of syslogd(8)
.
One note about the dmesg
command: Kernel messages are stored in a circular buffer, so large amounts of output will be overwritten.
Where does printk() print to?
dmesg
should display printk
messages.
Unusual kernel behaviour when outputting directly to text video memory
I was missing the fact that I need to compile 32-bit code delicately on a 64-bit system. I solved the problem by adding -m32
and -fno-PIC
flags to GCC, -melf_i386
to LD, and changing elf64
to elf32
in NASM in my Makefile.
What does DebugView's Enable Verbose Kernel Output do?
TL;DR: it calls a driver to repeatedly call NtSetDebugFilterState
on all kernel components, so that they are all able to print something on the debug output.
Program
Let start with the program itself; there's only one occurrence of the sentence "Enable Verbose Kernel Output"
:
mov [rsp+78h+mi.wID], 9C7Ch
lea rax, aEnableVerboseK ; "Enable &Verbose Kernel Output"
sbb ecx, ecx
mov [rsp+78h+mi.dwTypeData], rax
and ecx, 8
mov [rsp+78h+mi.fState], ecx
mov rcx, cs:hMenu ; hMenu
call cs:GetSubMenu
mov rcx, rax ; hmenu
lea r9, [rsp+78h+mi] ; lpmi
lea edx, [rdi+3] ; item
lea r8d, [rdi+1] ; fByPosition
call cs:InsertMenuItemA
The above code insert the sub-menu into the main menu. What's important here is the the menu ID, namely 0x9C7C
.
This menu ID is used only once more here:
movzx edx, al ; al can either be 0 or 1
xor edi, edi
mov qword ptr [rsp+830h+iNumButtons], rdi ; lpOverlapped
lea rax, [rsp+830h+BytesReturned]
mov [rsp+830h+lpButtons], rax ; lpBytesReturned
xor edx, 1
mov dword ptr [rsp+830h+wBMID], edi ; nOutBufferSize
xor r9d, r9d ; nInBufferSize
xor r8d, r8d ; lpInBuffer
mov [rsp+830h+dwInitParam], rdi ; lpOutBuffer
lea edx, ds:0FFFFFFFF8305003Ch[rdx*4] ; dwIoControlCode
call cs:DeviceIoControl
movzx eax, cs:byte_1400935A3
mov edx, 9C7Ch ; uIDCheckItem
mov rcx, cs:hMenu ; hMenu
mov cs:byte_1400A2776, al
neg al
sbb r8d, r8d
and r8d, 8 ; uCheck
call cs:CheckMenuItem
The above code calls DeviceIoControl
and then checks the menu item. The former means the program is actually talking with a device driver.
If we remove a bit of code we can see which IOCTL can be sent to the driver:
movzx edx, al ; al can either be 0 or 1
; snip
xor edx, 1 ; invert AL
; snip
lea edx, ds:0FFFFFFFF8305003Ch[rdx*4] ; dwIoControlCode
call cs:DeviceIoControl
Since RDX can be either 0 or 1 we end up with (base 10):
- [rdx*4-2096824260]
Thus:
- 4 - 2096824260 = -2096824256
- 0 - 2096824260 = -2096824260
Looking at the handles opened by dbgview64.exe
we can see a \Device\dbgv
is currently opened.
0: kd> !devobj \Device\dbgv
Device object (ffffd58a97007630) is for:
Dbgv \Driver\DBGV DriverObject ffffd58a8688aaa0
Current Irp 00000000 RefCount 0 Type 00008305 Flags 00000048
SecurityDescriptor ffffe58fb8bdeea0 DevExt 00000000 DevObjExt ffffd58a97007780
ExtensionFlags (0x00000800) DOE_DEFAULT_SD_PRESENT
Characteristics (0000000000)
Device queue is not busy.
0: kd> dt _driver_object ffffd58a8688aaa0
nt!_DRIVER_OBJECT
+0x000 Type : 0n4
+0x002 Size : 0n336
+0x008 DeviceObject : 0xffffd58a`97007630 _DEVICE_OBJECT
+0x010 Flags : 0x12
+0x018 DriverStart : 0xfffff800`dcf90000 Void
+0x020 DriverSize : 0x9000
+0x028 DriverSection : 0xffffd58a`a3ba9be0 Void
+0x030 DriverExtension : 0xffffd58a`8688abf0 _DRIVER_EXTENSION
+0x038 DriverName : _UNICODE_STRING "\Driver\DBGV"
+0x048 HardwareDatabase : 0xfffff800`8372e990 _UNICODE_STRING "\REGISTRY\MACHINE\HARDWARE\DESCRIPTION\SYSTEM"
+0x050 FastIoDispatch : (null)
+0x058 DriverInit : 0xfffff800`dcf97058 long +0
+0x060 DriverStartIo : (null)
+0x068 DriverUnload : (null)
+0x070 MajorFunction : [28] 0xfffff800`dcf91b80 long +0
0: kd> dt nt!_LDR_DATA_TABLE_ENTRY 0xffffd58a`a3ba9be0 Full*
+0x048 FullDllName : _UNICODE_STRING "\??\C:\WINDOWS\system32\Drivers\Dbgv.sys"
So the driver is currently loaded from C:\WINDOWS\system32\Drivers\Dbgv.sys
(or you can extract it from the .rsrc section...).
Driver
Looking at the driver, in the driver entry we spot the function used for IRP_MJ_DEVICE_CONTROL
:
lea rax, sub_180001B80
mov [rdi+0E0h], rax ; IRP_MJ_DEVICE_CONTROL
mov [rdi+80h], rax
mov [rdi+70h], rax
Inside that function we have the usual setup before calling the right IOCTL:
movzx eax, [rcx+_IO_STACK_LOCATION.MajorFunction]
mov r9d, [rcx+_IO_STACK_LOCATION.Parameters.DeviceIoControl.OutputBufferLength]
mov r10d, [rcx+_IO_STACK_LOCATION.Parameters.DeviceIoControl.IoControlCode]
test al, al ; IRP_MJ_CREATE
jz loc_180001C6C
cmp al, 2 ; IRP_MJ_CLOSE
jz short loc_180001C0C
cmp al, 0Eh ; IRP_MJ_DEVICE_CONTROL
jnz @@CompleteRequest
mov eax, r10d
and eax, 3
cmp al, METHOD_NEITHER
jnz short loc_180001BDF
mov rdx, [rdi+_IRP.UserBuffer]
loc_180001BDF:
mov [rsp+98h+do], r11 ; _DEVICE_OBJECT*
mov [rsp+98h+IoStatus], rbx ; IoStatus
mov [rsp+98h+ioctl], r10d ; IoCtl
mov [rsp+98h+OutputBufferLength], r9d ; OuputBufferLength
mov r9d, [rcx+_IO_STACK_LOCATION.Parameters.DeviceIoControl.InputBufferLength] ; int
mov rcx, [rcx+_IO_STACK_LOCATION.FileObject]
mov qword ptr [rsp+98h+Buffer], rdx ; Buffer
mov dl, 1 ; int
call sub_1800017E0
jmp @@CompleteRequest
Inside the call (sub_1800017E0
) we have a big switch for the IOCTL, here's the case -2096824260 (case -2096824256 is slightly different):
loc_1800018B9:
call sub_180002470 ; jumptable 000000018000182F case -2096824260
jmp loc_180001AEB
This function is mostly comprised of two loops:
loc_1800024A0:
xor ebx, ebx
@@LoopQuerySetDebugFilter:
mov edx, ebx
mov ecx, esi
call cs:qword_180005438 ; DbgQueryDebugFilterState
mov r8b, 1 ; State
mov edx, ebx ; Level (keeps incrementing up to 0x1E)
mov ecx, esi ; ComponentId (keeps incrementing up to 0x82)
mov [rdi], al ; save current state.
call cs:qword_180005440 ; DbgSetDebugFilterState
inc ebx
inc rdi
cmp ebx, 1Eh
jb short @@LoopQuerySetDebugFilter
inc esi
cmp esi, 82h ; '‚'
jb short loc_1800024A0
Both calls are on DbgQueryDebugFilterState
and DbgSetDebugFilterState
(reactos source)
which is just a minimal wrapper around NtSetDebugFilterState
(reactos source).
As far as we can see the debug filter state is queried, saved, and then set for all kernel components (following is the component tables from the kernel, there are a lot of them):
.rdata:00000001400073E0 KdComponentTable dq offset Kd_SYSTEM_Mask
.rdata:00000001400073E0 ; DATA XREF: NtQueryDebugFilterState+36↓o
.rdata:00000001400073E0 ; NtSetDebugFilterState+43↓o ...
.rdata:00000001400073E8 dq offset Kd_SMSS_Mask
.rdata:00000001400073F0 dq offset Kd_SETUP_Mask
.rdata:00000001400073F8 dq offset Kd_NTFS_Mask
.rdata:0000000140007400 dq offset Kd_FSTUB_Mask
.rdata:0000000140007408 dq offset Kd_CRASHDUMP_Mask
.rdata:0000000140007410 dq offset Kd_CDAUDIO_Mask
.rdata:0000000140007418 dq offset Kd_CDROM_Mask
.rdata:0000000140007420 dq offset Kd_CLASSPNP_Mask
....
Which finally means that all kernel components are able to print something to the debug output.
Note that the other IOCTL just reset the components masks to what they were before checking the menu in the main program.
CUDA kernel printf() produces no output in terminal, works in profiler
You are implicitly, and mistakenly, assuming a certain order of occurrences when main()
is done. Specifically, you're assuming that because the default stream is synchronous, everything having to do with your kernel is over and done with by the time the next line of code after the kernel launch gets executed. That is not 100% true - as @RobertCrovella suggests; specifically, it's not guaranteed that the device's printf()
buffer will be ferried back into host memory and dumped into the standard output stream before control returns to your program.
You will need to synchronize the (default, current) CUDA device with the host, i.e. execute:
cuda::device::current::get().synchronize();
or at least synchronize the device's default stream:
cuda::device::current::get().default_stream().synchronize();
and this will ensure the printf()
results make it to standard output.
Now, nvvp instruments your execution in some way (probably just by having the profiler running - but nvprof instruments the execution through the hooks which are the CUDA runtime API calls). So, the behavior is different when you run your program that way.
Somewhat-related question: The behavior of stream 0 (default) and other streams .
How to view print messages in Windbg when Windbg is in local kernel debugging mode?
obviously you need a two computer setup
local kernel debugging is similar to a memory snapshot it is somewhat stale or maybe intermittantly refreshed
you can try using a vm running inside a physical machine which you can kernel debug over network or serial VMware player does the job most of the time
Edit:
you cannot attach in kernel mode (attach is a term normally associated with debugging an already running user mode process )
under kernel mode debugging ([two computer] ,[ vm,vm ] , [physical,vm], [physical hypervisor] ) you can inspect state and set breakpoints on a live
kernel module
a critical kernel module like ntos , hal etc is associated with all the processes
Local kernel debugging does not do anything live
it is like a specimen testing on a pathological lab
(like a syringe full of blood to diagnose if there is a strain of some pandemical organisms you cannot treat the blood you need the physical person to treat the diagnosed problem)
you cannot step ,
you cannot view registers,
you cannot run
you cannot pagein
you cannot set breakpoints
I can't get output numbers with ctypes cuda
cudaMemcpy
is expecting pointers for dst
and src
.
In your function resoult
, h_answer
is a pointer to a long
allocated by the caller.
Since it's already the pointer where the data should be copied to, you should use it as is and not take it's address by using &h_answer
.
Therefore you need to change your cudaMemcpy
from:
cudaMemcpy(&h_answer, d_answer, sizeof(long), cudaMemcpyDeviceToHost);
To:
cudaMemcpy(h_answer, d_answer, sizeof(long), cudaMemcpyDeviceToHost);
DbgView on Vmware Workstation 12 not capturing output
By default DbgPrint() outputs are disabled from Vista/Server2008 onwards. You may have to do below settings in registry to enable the debug prints.
- Open up the registry.
- Go to path, “HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Debug Print Filter”. If "Debug Print Filter" is not present then create it.
- Add value “DEFAULT” : REG_DWORD : 0xFFFFFFFF and then reboot.
After rebooting you will be able to get debug prints.
My code caused the kernel to restart. Why is the kernel restarting?
> def countdown(n):
> for n in range(0,5):
> print(n)
> countdown(n-1)
> countdown(2)
In your code above, each function call will call itself recursively 5 times. So the first call is 5, second call there will be 25 calls, third call 125 calls, and the recursive calls went exponentially high, causing the Kernel to restart.
If you use recursive function, there must be an exit condition. There are a few ways to achieve your goal:
(1) recursive with if-condition exit (this is your successful code)
def countdown(n):
if n == 0:
print("blast off!")
else:
print(n)
countdown(n-1)
countdown(5)
(2) recursive with while-condition exit
def countdown(n):
while n != 0:
print(n)
countdown(n-1)
return
print("blast off!")
countdown(5)
(3) for loop (no need recursive)
def countdown(n):
for i in range(n, 0, -1):
print(i)
print("blast off!")
countdown(5)
Output:
5
4
3
2
1
blast off!
Related Topics
Detecting Command Not Found in Bash Script
How to Execve a Process, Retaining Capabilities in Spite of Missing Filesystem-Based Capabilities
How to Use Local and Remote Variables Within a Heredoc or Command Over Ssh
How to Write Shell Command Within Pharo Smalltalk
Using Named Pipes with Bash - Problem with Data Loss
Unzip a Bunch of Zips into Their Own Directories
Detecting a Change of Ip Address in Linux
Microsecond Accurate (Or Better) Process Timing in Linux
What Is the Most Efficient and Elegant Way Develop/Debug Linux Kernel
Git Permission Denied (Publickey,Gssapi-Keyex,Gssapi-With-Mic)
Understanding Buffers/Cache in Linux 'Free -M'
How to Test If an Address Is Readable in Linux Userspace App
Why Do I Have to 'Wait()' for Child Processes
Docker Change Cgroup Driver to Systemd
How to Store One Billion Files on Ext4
Who Can Access a File with Octal Permissions "000" on Linux/Unix