Python ctypes Unicode processing and Python compiled with UCS-2 or UCS-4?
I am trying to call a C interface from Python using the ctypes module. Below is the prototype of the C function:
void utf_to_wide_char( const char* source, unsigned short* buffer, int buffersize)
utf_to_wide_char: converts a UTF-8 string to a UCS-2 string
source (input): contains a null-terminated UTF-8 string
buffer (output): pointer to the buffer that holds the converted text
buffersize: indicates the size of the buffer; the system copies up to this size, including the null terminator.
The following is the Python function:

def to_ucs2(py_unicode_string):
    len_str = len(py_unicode_string)
    local_str = py_unicode_string.encode('utf-8')
    src = c_wchar_p(local_str)
    buff = create_unicode_buffer(len_str * 2)
    # shared_lib is the ctypes-loaded instance of the shared library.
    shared_lib.utf8_to_widechar(src, buff, sizeof(buff))
    return buff.value
Problem: the above code snippet works fine in Python compiled with UCS-4 (the --enable-unicode=ucs4 option), but behaves unexpectedly in Python compiled with UCS-2 (--enable-unicode=ucs2). (I verified the Python Unicode compilation option by referring to "How to find out if Python is compiled with UCS-2 or UCS-4?")
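For reference, a check along these lines can tell the two builds apart. It is only a minimal sketch: the helper name unicode_build_info is made up for illustration, and it relies on sys.maxunicode being 0xFFFF on a narrow (UCS-2) build. It also reports the size ctypes uses for c_wchar, since that is the element type behind create_unicode_buffer.

```python
import ctypes
import sys


def unicode_build_info():
    """Report how wide this interpreter's Unicode and ctypes wchar types are.

    Hypothetical helper, not part of any library.
    """
    return {
        # 0xFFFF on a narrow (UCS-2) build, 0x10FFFF otherwise.
        "narrow_build": sys.maxunicode == 0xFFFF,
        # Size in bytes of one c_wchar element as seen by ctypes.
        "wchar_size": ctypes.sizeof(ctypes.c_wchar),
    }


if __name__ == "__main__":
    print(unicode_build_info())
```

If wchar_size is not 2, a buffer of c_wchar elements does not have the same layout as the unsigned short* buffer the C prototype expects.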
Unfortunately, in the production environment I am using Python compiled with UCS-2. Please comment on the following points.
- Although I am sure the issue is with the Unicode option, I have yet to nail down what is happening under the hood. I need help in coming up with the required justification.
- Is it possible to overcome this issue without compiling Python with the --enable-unicode=ucs4 option?
(I am quite new to Unicode encoding and have only basic know-how.)
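One direction that might avoid the dependency on the build option is to stop using c_wchar altogether and match the C prototype literally: pass the UTF-8 bytes as c_char_p, hand the function an array of c_ushort, and decode the 16-bit units in Python. The following is a rough sketch under several assumptions, not a confirmed fix. The names ucs2_buffer_to_text and fake_utf8_to_widechar are hypothetical, the stand-in converter only exists so the sketch runs without the real shared library, and treating buffersize as a byte count is a guess because the description does not say whether it means bytes or elements.

```python
import ctypes
import struct
import sys


def ucs2_buffer_to_text(buff):
    """Decode a NUL-terminated array of 16-bit code units into a string."""
    units = []
    for unit in buff:
        if unit == 0:
            break
        units.append(unit)
    # Pack the units in native byte order, then decode with the matching codec.
    raw = struct.pack("=%dH" % len(units), *units)
    codec = "utf-16-le" if sys.byteorder == "little" else "utf-16-be"
    return raw.decode(codec)


def fake_utf8_to_widechar(src, buff, buffersize):
    """Hypothetical pure-Python stand-in for the real C function.

    Assumes buffersize is a byte count; the real function may differ.
    """
    data = src.value.decode("utf-8").encode("utf-16-le")
    units = struct.unpack("<%dH" % (len(data) // 2), data)
    capacity = buffersize // 2
    limit = min(len(units), capacity - 1)
    for i in range(limit):
        buff[i] = units[i]
    buff[limit] = 0  # NUL terminator


def to_ucs2(py_unicode_string, convert):
    """Call `convert` with a char* source and an unsigned short* buffer."""
    src = ctypes.c_char_p(py_unicode_string.encode("utf-8"))
    # At most two 16-bit units per character, plus one for the terminator.
    n_units = len(py_unicode_string) * 2 + 1
    buff = (ctypes.c_ushort * n_units)()
    # As in the original snippet, the size is passed as sizeof(buff) (bytes).
    convert(src, buff, ctypes.sizeof(buff))
    return ucs2_buffer_to_text(buff)


if __name__ == "__main__":
    print(to_ucs2(u"h\u00e9llo", fake_utf8_to_widechar))
```

With the real library, shared_lib.utf8_to_widechar would be passed in place of the stand-in. Because the buffer elements are c_ushort rather than c_wchar, the layout seen by the C code should not change with the interpreter's Unicode width.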