Python Ctypes How To Read A Byte From A Character Array Passed To NASM

UPDATE: I solved this problem with the help of Mark Tolonen's answer below. Here is the solution (but I'm puzzled by one thing): I begin with the encoding string shown in Mark

Solution 1:

Your name-reading code would return a list of Unicode strings. The following would encode a list of Unicode strings into an array of strings to be passed to a function taking a POINTER(c_char_p):

>>> import ctypes
>>> names = ['Mark','John','Craig']
>>> ca = (ctypes.c_char_p * len(names))(*(name.encode() for name in names))
>>> ca
<__main__.c_char_p_Array_3 object at 0x000001DB7CF5F6C8>
>>> ca[0]
b'Mark'
>>> ca[1]
b'John'
>>> ca[2]
b'Craig'
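
As a rough illustration of what the native function will see, the sketch below walks the same array from Python the way assembly code would: it treats the array as a block of pointers, then reads one byte through each pointer. The variable names `ptrs` and `first_bytes` are made up for this example, and the size check assumes the usual layout where each `c_char_p` element is exactly one pointer wide.

```python
import ctypes

names = ['Mark', 'John', 'Craig']
ca = (ctypes.c_char_p * len(names))(*(name.encode() for name in names))

# The array holds one pointer per string, not the characters themselves.
assert ctypes.sizeof(ca) == len(names) * ctypes.sizeof(ctypes.c_void_p)

# Reinterpret the array as raw pointer values (plain integers in Python).
ptrs = ctypes.cast(ca, ctypes.POINTER(ctypes.c_void_p))

first_bytes = []
for i in range(len(names)):
    # Follow pointer i and read a single byte from the string it points to.
    first_bytes.append(ctypes.string_at(ptrs[i], 1))

print(first_bytes)  # [b'M', b'J', b'C']
```

On a 64-bit build each pointer is 8 bytes, so element i sits at byte offset 8*i from the start of the array.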

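If ca is passed to your function as the first parameter, the address of that array will be in rcx per the Microsoft x64 calling convention. Each array element is an 8-byte pointer, so the addresses of the three strings sit at [rcx], [rcx+8], and [rcx+16]. The following C code and its disassembly show how the Microsoft VS2017 compiler reads it: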

DLL code (test.c)

#define API __declspec(dllexport)

int API func(const char** instr)
{
    return (instr[0][0] << 16) + (instr[1][0] << 8) + instr[2][0];
}

Disassembly (compiled with optimization to keep it short; my comments added)

; Listing generated by Microsoft (R) Optimizing Compiler Version 19.00.24215.1

include listing.inc

INCLUDELIB LIBCMT
INCLUDELIB OLDNAMES

PUBLIC  func
; Function compile flags: /Ogtpy
; File c:\test.c
_TEXT   SEGMENT
instr$ = 8
func    PROC

; 5    :     return (instr[0][0] << 16) + (instr[1][0] << 8) + instr[2][0];

  00000 48 8b 51 08      mov     rdx, QWORD PTR [rcx+8]  ; address of 2nd string
  00004 48 8b 01         mov     rax, QWORD PTR [rcx]    ; address of 1st string
  00007 48 8b 49 10      mov     rcx, QWORD PTR [rcx+16] ; address of 3rd string
  0000b 44 0f be 02      movsx   r8d, BYTE PTR [rdx]     ; 1st char of 2nd string, r8d=4a
  0000f 0f be 00         movsx   eax, BYTE PTR [rax]     ; 1st char of 1st string, eax=4d
  00012 0f be 11         movsx   edx, BYTE PTR [rcx]     ; 1st char of 3rd string, edx=43
  00015 c1 e0 08         shl     eax, 8                  ; eax=4d00
  00018 41 03 c0         add     eax, r8d                ; eax=4d4a
  0001b c1 e0 08         shl     eax, 8                  ; eax=4d4a00
  0001e 03 c2            add     eax, edx                ; eax=4d4a43

; 6    : }

  00020 c3               ret     0
func    ENDP
_TEXT   ENDS
END

Python code (test.py)

from ctypes import *

dll = CDLL('test')
dll.func.argtypes = POINTER(c_char_p),
dll.func.restype = c_int  # restype must be set on the function, not on the DLL object

names = ['Mark','John','Craig']
ca = (c_char_p * len(names))(*(name.encode() for name in names))
print(hex(dll.func(ca)))

Output:

0x4d4a43

Those are the correct ASCII codes for 'M', 'J', and 'C'.
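
The expected value can also be cross-checked without the DLL. The following is a minimal sketch that repeats the arithmetic of the C function in pure Python; the names `first` and `expected` are hypothetical and exist only for this check.

```python
names = ['Mark', 'John', 'Craig']

# Indexing a bytes object yields the integer value of that byte.
first = [name.encode()[0] for name in names]

# Same packing as the C function: 1st char << 16, 2nd char << 8, 3rd char.
expected = (first[0] << 16) + (first[1] << 8) + first[2]

print(hex(expected))  # 0x4d4a43
```

If the value returned by the DLL differs from this, the native code is most likely reading the wrong pointer offset or the wrong byte.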
