Different behaviour of ctypes c_char_p?

Accepted answer
Score: 23

c_char_p is a subclass of _SimpleCData, with _type_ == 'z'. The __init__ method calls the type's setfunc, which for simple type 'z' is z_set.
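A quick way to check the class relationship described above from Python 3. This is only a sketch: _SimpleCData and _type_ are private CPython implementation details, so they may not be present or stable on other implementations.

```python
import ctypes

# _SimpleCData is a private base class; relying on it is CPython-specific.
is_simple = issubclass(ctypes.c_char_p, ctypes._SimpleCData)

# The one-character type code that selects the setfunc/getfunc pair.
type_code = ctypes.c_char_p._type_

print(is_simple, type_code)
```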

In Python 2, the z_set function (2.7.7) is written to handle both str and unicode strings. Prior to Python 3, str is an 8-bit string. CPython 2.x str internally uses a C null-terminated string (i.e. an array of bytes terminated by \0), so z_set can call PyString_AS_STRING to get a pointer to the internal buffer of the str object. A unicode string first needs to be encoded to a byte string. z_set handles this encoding automatically and keeps a reference to the encoded string in the _objects attribute.

>>> from ctypes import c_char_p
>>> c = u'spam'
>>> a = c_char_p(c)
>>> a._objects
'spam'
>>> type(a._objects)
<type 'str'>

On Windows, the default ctypes string encoding is 'mbcs', with error handling set to 'ignore'. On all other platforms the default encoding is 'ascii', with 'strict' error handling. To modify the default, call ctypes.set_conversion_mode. For example, set_conversion_mode('utf-8', 'strict').

In Python 3, the z_set function (3.4.1) does not automatically convert str (now Unicode) to bytes. The paradigm shifted in Python 3 to strictly divide character strings from binary data. The ctypes default conversions were removed, as was the function set_conversion_mode. You have to pass c_char_p a bytes object (e.g. b'spam' or 'spam'.encode('utf-8')). In CPython 3.x, z_set calls the C-API function PyBytes_AsString to get a pointer to the internal buffer of the bytes object.
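A minimal Python 3 sketch of the behaviour described above. The variable names are illustrative only, and UTF-8 is an assumed encoding; use whatever encoding the C library actually expects.

```python
from ctypes import c_char_p

text = 'spam'

# Encode explicitly; Python 3 ctypes no longer does this for you.
encoded = text.encode('utf-8')
p = c_char_p(encoded)
value = p.value  # the bytes the pointer refers to

# Passing a str directly is rejected with TypeError.
try:
    c_char_p(text)
    str_rejected = False
except TypeError:
    str_rejected = True

print(value, str_rejected)
```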

Note that if the C function modifies the string, you need to use create_string_buffer instead to create a c_char array. Look for a parameter typed as const to know that it's safe to use c_char_p.
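As a rough illustration of the mutable case, the sketch below uses ctypes.memset as a stand-in for a C function that writes into its argument. The buffer size of 16 is an arbitrary choice for the example.

```python
from ctypes import create_string_buffer, memset, sizeof

# A writable c_char array of 16 bytes, initialised with b'spam' plus NULs.
buf = create_string_buffer(b'spam', 16)
size = sizeof(buf)
before = buf.value

# Stand-in for a C function that modifies the string in place:
# overwrite the first 4 bytes with 'x'.
memset(buf, ord('x'), 4)
after = buf.value

print(size, before, after)
```

Because the buffer is owned by ctypes rather than by an immutable bytes object, writing to it does not corrupt any Python object.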
