Compatibility with Built-In Unicode Character Type

Professional User Interface Suite, Copyright FOSS Software Inc. Help Published with Permission.

The information in this article applies to:

  • Visual Studio .NET 2002
  • Visual Studio .NET 2003
  • Visual Studio 2005
  • Prof-UIS version 2.23 or later

One of the new features introduced in Visual C++ 7.0 is the native data type wchar_t (also known as the built-in or native Unicode character type, __wchar_t). Until then, wchar_t had typically been defined in header files as unsigned short. In Visual C++ 7.0, wchar_t can be treated either as unsigned short or as the native Unicode character type __wchar_t. The /Zc:wchar_t compiler option controls this: it is on by default in new MFC projects and makes the compiler treat wchar_t as a native type.

Like any other built-in data type, the native Unicode character type directly affects the decorated names of functions exported from DLLs. That makes it virtually impossible to combine DLLs and EXEs built with different wchar_t types. The solution suggested by Microsoft is to provide overloads for both the unsigned short and __wchar_t variations of wchar_t. For libraries as large as Prof-UIS, that cannot be done easily. Besides, MFC-based libraries have two main unresolved problems:

  • CString is based on the CStringT template, which cannot be instantiated with both __wchar_t and unsigned short within a single project simultaneously
  • Some message map macros heavily depend on the currently selected wchar_t type: ON_WM_DEVMODECHANGE, ON_WM_ASKCBFORMATNAME, ON_WM_SETTINGCHANGE and ON_WM_WININICHANGE
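
The overload approach suggested by Microsoft can be illustrated with a minimal sketch. The function name TextLength is hypothetical and is not part of Prof-UIS or MFC. The sketch assumes a compiler that treats wchar_t as a distinct built-in type (standard C++ behavior); in Visual C++ the native overload could be declared with __wchar_t so that the pair stays distinct regardless of /Zc:wchar_t.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical library function, overloaded once per character type.
// Each overload gets its own decorated name, so clients built with
// either wchar_t flavor can find a matching export.

// Overload for the native Unicode character type.
std::size_t TextLength(const wchar_t* text)
{
    std::size_t n = 0;
    while (text[n] != L'\0')
        ++n;
    return n;
}

// Overload for the legacy "wchar_t is unsigned short" flavor.
std::size_t TextLength(const unsigned short* text)
{
    std::size_t n = 0;
    while (text[n] != 0)
        ++n;
    return n;
}
```

Every exported function that takes or returns wide characters would need such a pair, which is why this approach scales poorly for a library the size of Prof-UIS.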

Here is some reference information related to the new data type:

  • __wchar_t - New built-in data type whose size is 2 bytes and which is treated as the native type regardless of the project settings
  • wchar_t - Data type whose size is 2 bytes and which is treated either as __wchar_t or as unsigned short depending on the project settings
  • _WCHAR_T_DEFINED - Built-in macro that is defined when the compiler supports the native Unicode character type
  • _NATIVE_WCHAR_T_DEFINED - Built-in macro that is defined when the /Zc:wchar_t option is on, i.e. wchar_t is __wchar_t

The problem is solved by redefining the data types and macros that depend on wchar_t as wchar_t-independent ones, an approach that was experimentally implemented in Prof-UIS 2.23. The new data types are defined in the ExtMfcSafeNativeTCHAR.h file. In non-Unicode projects they are simply equivalent to the standard data types:

typedef TCHAR __EXT_MFC_SAFE_TCHAR;
typedef const TCHAR __EXT_MFC_SAFE_CONST_TCHAR;
typedef LPTSTR __EXT_MFC_SAFE_LPTSTR;
typedef LPCTSTR __EXT_MFC_SAFE_LPCTSTR;
typedef TCHAR & __EXT_MFC_SAFE_TCHAR_REF;
typedef const TCHAR & __EXT_MFC_SAFE_CONST_TCHAR_REF;
typedef LPTSTR & __EXT_MFC_SAFE_LPTSTR_REF;
typedef LPCTSTR & __EXT_MFC_SAFE_LPCTSTR_REF;
typedef CString CExtSafeString;
typedef CStringArray CExtSafeStringArray;
typedef CStringList CExtSafeStringList;
typedef CMapStringToPtr CExtSafeMapStringToPtr;
typedef CMapStringToOb CExtSafeMapStringToOb;
typedef CMapStringToString CExtSafeMapStringToString;

The same was done (in non-Unicode projects) for the standard MFC message map macros:

#define __EXT_MFC_SAFE_ON_WM_DEVMODECHANGE() ON_WM_DEVMODECHANGE()
#define __EXT_MFC_SAFE_ON_WM_ASKCBFORMATNAME() ON_WM_ASKCBFORMATNAME()
#define __EXT_MFC_SAFE_ON_WM_SETTINGCHANGE() ON_WM_SETTINGCHANGE()
#define __EXT_MFC_SAFE_ON_WM_WININICHANGE() ON_WM_WININICHANGE()
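
Because the non-Unicode definitions are plain aliases, code written against the safe types behaves exactly like code written against the standard ones. The sketch below shows this with stand-in names: TCHAR and LPCTSTR are re-declared locally in a hypothetical sketch namespace so that the example compiles without the Windows headers, SAFE_LPCTSTR plays the role of __EXT_MFC_SAFE_LPCTSTR, and CaptionLength is an invented function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

namespace sketch {

// Stand-ins for the Windows names in a non-Unicode build (assumption:
// TCHAR is char, as it is when _UNICODE is not defined).
typedef char TCHAR;
typedef const TCHAR* LPCTSTR;

// Plays the role of __EXT_MFC_SAFE_LPCTSTR in a non-Unicode build.
typedef LPCTSTR SAFE_LPCTSTR;

// The alias introduces no new type, only a new name.
static_assert(std::is_same<SAFE_LPCTSTR, const char*>::value,
              "the safe alias is the standard pointer type");

// Hypothetical function declared with the safe type; callers can pass
// ordinary string pointers without any cast.
std::size_t CaptionLength(SAFE_LPCTSTR caption)
{
    return std::strlen(caption);
}

} // namespace sketch
```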

In Unicode projects, the redefined data types are based on template classes, each of which expands to data types based on either __wchar_t or unsigned short. The message map macros are also redefined to avoid any dependency on the character type. This approach makes Unicode versions of Prof-UIS completely independent of the /Zc:wchar_t option in the client project (but you need to define the __EXT_MFC_COMPILED_WITH_NATIVE_WCHAR_T macro, as described below).
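
The general idea of a template-based, character-type-independent pointer can be sketched as follows. This is not the actual Prof-UIS implementation: the names SafePtr, SafeWideStr and CountChars are hypothetical, and the sketch assumes a compiler where wchar_t is a distinct type (in Visual C++ the native parameter would be __wchar_t). Because the wrapper class itself is the parameter type, the decorated name of a function that takes it mentions neither wchar_t nor unsigned short.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical wrapper that accepts a pointer of either character flavor
// and converts back to either one on request.
template <typename NativeT, typename LegacyT>
class SafePtr
{
public:
    SafePtr(const NativeT* p) : m_ptr(p) {}
    SafePtr(const LegacyT* p) : m_ptr(p) {}

    // Reading through the "other" flavor is only meaningful when both
    // types have the same size, as the article states for Visual C++.
    operator const NativeT*() const { return static_cast<const NativeT*>(m_ptr); }
    operator const LegacyT*() const { return static_cast<const LegacyT*>(m_ptr); }

private:
    const void* m_ptr; // stored without reference to either flavor
};

typedef SafePtr<wchar_t, unsigned short> SafeWideStr;

// Exported-style function: its signature depends only on SafeWideStr.
std::size_t CountChars(SafeWideStr text)
{
    const wchar_t* p = text; // selects the native conversion
    std::size_t n = 0;
    while (p[n] != L'\0')
        ++n;
    return n;
}
```

A caller can pass a wide string literal directly, for example CountChars(L"hi"), and the converting constructor wraps the pointer implicitly.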

So any Prof-UIS library project can be compiled with or without /Zc:wchar_t: define the __EXT_MFC_COMPILED_WITH_NATIVE_WCHAR_T macro in the ExtMfcSafeNativeTCHAR.h file if you need the native Unicode character type; otherwise, leave it undefined. This macro affects the code generated for the substituted data types in the client project.

The experimental compatibility with the native Unicode character type can be disabled by commenting out the following line at the beginning of ExtMfcSafeNativeTCHAR.h:

#define __EXT_MFC_ENABLE_TEMPLATED_CHARS

The approach described in this article is experimental. Although CExtSafeString is independent of the character type, the current version has some inconveniences:

  • An explicit type cast to LPCTSTR is required in some cases
  • Although CExtSafeString uses the same ATL string manager as CString does and has a similar data structure, it is not a CString-derived class
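
One plausible way in which the need for an explicit cast can arise is overload ambiguity: a pointer wrapper that converts to both character flavors, combined with a string class that accepts both, gives the compiler two equally good choices. The sketch below demonstrates the mechanism with hypothetical stand-ins (SafeConstPtr, DualString, AppendSafe); it is an assumption about the cause, not a description of the Prof-UIS source code.

```cpp
#include <cassert>
#include <string>

// Hypothetical "safe" const pointer that converts to both flavors.
struct SafeConstPtr
{
    const wchar_t* native;

    operator const wchar_t*() const { return native; }
    operator const unsigned short*() const
    {
        return reinterpret_cast<const unsigned short*>(native);
    }
};

// Hypothetical string class that accepts both flavors.
class DualString
{
public:
    DualString& operator+=(const wchar_t* s)
    {
        m_text += s;
        return *this;
    }
    DualString& operator+=(const unsigned short* s)
    {
        for (; *s != 0; ++s)
            m_text += static_cast<wchar_t>(*s);
        return *this;
    }
    const std::wstring& Text() const { return m_text; }

private:
    std::wstring m_text;
};

std::wstring AppendSafe(const SafeConstPtr& p)
{
    DualString s;
    // s += p; // would not compile: both operator+= overloads are viable
    s += static_cast<const wchar_t*>(p); // the explicit cast picks one
    return s.Text();
}
```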

The following code was used to test the feature:

cout << "__prof_uis_used_wchar_t is :                    \""
<< typeid(__prof_uis_used_wchar_t ).name() << "\"\n";
cout << "__prof_uis_converted_wchar_t is : \""
<< typeid( __prof_uis_converted_wchar_t ).name() << "\"\n";
cout << "__prof_uis_used_CStringT::XCHAR is : \""
<< typeid( __prof_uis_used_CStringT::XCHAR ).name() << "\"\n";
cout << "__prof_uis_used_CStringT::YCHAR is : \""
<< typeid( __prof_uis_used_CStringT::YCHAR ).name() << "\"\n";
cout << "__prof_uis_converted_CStringT::XCHAR is : \""
<< typeid( __prof_uis_converted_CStringT::XCHAR ).name()
<< "\"\n";
cout << "__prof_uis_converted_CStringT::YCHAR is : \""
<< typeid( __prof_uis_converted_CStringT::YCHAR ).name()
<< "\"\n";
cout << "\n";
cout << "ATL::ChTraitsCRT<__prof_uis_used_wchar_t>::XCHAR is : \""
<< typeid( ATL::ChTraitsCRT<__prof_uis_used_wchar_t>::XCHAR ).name()
<< "\"\n";
cout << "ATL::ChTraitsCRT<__prof_uis_used_wchar_t>::YCHAR is : \""
<< typeid( ATL::ChTraitsCRT<__prof_uis_used_wchar_t>::YCHAR ).name()
<< "\"\n";
cout << "ATL::ChTraitsCRT<__prof_uis_converted_wchar_t>::XCHAR is : \""
<< typeid( ATL::ChTraitsCRT<__prof_uis_converted_wchar_t>::XCHAR ).name()
<< "\"\n";
cout << "ATL::ChTraitsCRT<__prof_uis_converted_wchar_t>::YCHAR is : \""
<< typeid( ATL::ChTraitsCRT<__prof_uis_converted_wchar_t>::YCHAR ).name()
<< "\"\n";
cout << "\n";
cout << "CExtSafeString::XCHAR is : \""
<< typeid( CExtSafeString::XCHAR ).name() << "\"\n";
cout << "CExtSafeString::YCHAR is : \""
<< typeid( CExtSafeString::YCHAR ).name() << "\"\n";
cout << "__prof_uis_used_CStringT::XCHAR is : \""
<< typeid( __prof_uis_used_CStringT::XCHAR ).name() << "\"\n";
cout << "__prof_uis_used_CStringT::YCHAR is : \""
<< typeid( __prof_uis_used_CStringT::YCHAR ).name() << "\"\n";
cout << "__prof_uis_converted_CStringT::XCHAR is : \""
<< typeid( __prof_uis_converted_CStringT::XCHAR ).name()
<< "\"\n";
cout << "__prof_uis_converted_CStringT::YCHAR is : \""
<< typeid( __prof_uis_converted_CStringT::YCHAR ).name()
<< "\"\n";

///////////////////////////////////////////////////////////////////
/// constructor / assignment test
///////////////////////////////////////////////////////////////////
// from strings - indirect
CString mfcs000a ( "test string \"mfcs000a\"" );
CString mfcs000w ( L"test string \"mfcs000w\"" );
CString mfcs000sn(
__EXT_MFC_SAFE_LPTSTR( _T("test string \"mfcs000sn\"") )
);
CString mfcs000sc(
__EXT_MFC_SAFE_LPCTSTR( _T("test string \"mfcs000sc\"") )
);
CExtSafeString prof000a ( "test string \"prof000a\"" );
CExtSafeString prof000w ( L"test string \"prof000w\"" );
// prof000sn error - required conversion from
// __EXT_MFC_SAFE_LPTSTR to LPTSTR or LPCTSTR
CExtSafeString prof000sn(
(LPTSTR)__EXT_MFC_SAFE_LPTSTR(
_T("test string \"prof000sn\"") )
);
CExtSafeString prof000sc(
__EXT_MFC_SAFE_LPCTSTR(
_T("test string \"prof000sc\"") ) );
// from string pointers - indirect
__EXT_MFC_SAFE_LPTSTR safe_str =
_T("safe string pointer (non-const)");
__EXT_MFC_SAFE_LPCTSTR safe_const_str =
_T("safe const-string pointer");
char * ansi_str = "ansi string";
__wchar_t * native_str =
(__wchar_t *) ( L"native string ");
unsigned short * normal_wide_str =
(unsigned short *)( L"normal wide string" );
CString mfcs001ansi( ansi_str );
//CString mfcs001native( native_str );
//CString mfcs001normalwide( normal_wide_str );
CString mfcs001sn( safe_str );
CString mfcs001sc( safe_const_str );
CExtSafeString prof001ansi( ansi_str );
CExtSafeString prof001native( native_str );
CExtSafeString prof001normalwide( normal_wide_str );
// prof001sn error - required conversion from
// __EXT_MFC_SAFE_LPTSTR to LPTSTR or LPCTSTR
CExtSafeString prof001sn( (LPTSTR)safe_str );
CExtSafeString prof001sc( safe_const_str );
// from chars - indirect
CString mfcs002a( 'A' );
CString mfcs002w( L'A' );
CExtSafeString prof002a( 'A' );
CExtSafeString prof002w( L'A' );
// from strings - assignment
// error - this is a Unicode app
//CString mfcs003a = "test string \"mfcs003a\"";
CString mfcs003w = L"test string \"mfcs003w\"";
// error - this is a Unicode app
// CExtSafeString prof003a = "test string \"prof003a\"";
CExtSafeString prof003w = L"test string \"prof003w\"";
// from string pointers - assignment
// CString mfcs004ansi = ansi_str;
// CString mfcs004native = native_str;
// CString mfcs004normalwide = normal_wide_str;
CString mfcs004sn = safe_str;
CString mfcs004sc = safe_const_str;
// why should we be better than MFC? (see previous line)
// CExtSafeString prof004ansi = ansi_str;
CExtSafeString prof004native = native_str;
CExtSafeString prof004normalwide = normal_wide_str;
// prof004sn error - required conversion from
// __EXT_MFC_SAFE_LPTSTR to LPTSTR or LPCTSTR
CExtSafeString prof004sn = (LPTSTR)safe_str;
CExtSafeString prof004sc = safe_const_str;
// from chars - assignment
// CString mfcs005a = 'A';
// CString mfcs005w = L'A';
// warning - conversion from int('A') to LPCTSTR('A')
// CExtSafeString prof005a = 'A';
// warning - conversion from int('A') to LPCTSTR('A')
// CExtSafeString prof005w = L'A';
// from others of the same type
CString mfcs006a( mfcs000a );
CString mfcs006w( mfcs000w );
CExtSafeString prof006a( prof000a );
CExtSafeString prof006w( prof000w );
// from others of a different type
CString pure_mfc_string(
_T("Pure MFC string :-(") );
CString professional_string(
_T("The extremely professional string !!! :-) !!!")
);
CString mfcs007d( professional_string );
CString mfcs007e = professional_string;
// error - no conversion CString ==> CExtSafeString
CExtSafeString prof007d( (LPCTSTR)pure_mfc_string );
// error - no conversion CString ==> CExtSafeString
CExtSafeString prof007e = (LPCTSTR)pure_mfc_string;

///////////////////////////////////////////////////////////
/// operator test
///////////////////////////////////////////////////////////

// += with chars, constants and string pointers
CString mfcs008x;
mfcs008x += " +++ ansi append (stat)";
mfcs008x += L" +++ unicode append (stat)";
mfcs008x += ansi_str;
// error, needs explicit conversion
mfcs008x += (LPCTSTR)native_str;
// error, needs explicit conversion
mfcs008x += (LPCTSTR)normal_wide_str;
mfcs008x += safe_str;
mfcs008x += safe_const_str;
mfcs008x += 'B';
mfcs008x += L'B';
CExtSafeString prof008x;
prof008x += " +++ ansi append (stat)";
prof008x += L" +++ unicode append (stat)";
prof008x += ansi_str;
prof008x += native_str;
prof008x += normal_wide_str;
prof008x += safe_str;
// error, needs explicit conversion
prof008x += (LPCTSTR)safe_const_str;
prof008x += 'B';
prof008x += L'B';

// + with different strings
CString mfcs009trg;
CString mfcs009src( _T("***mfcs009src string***") );
CExtSafeString prof009trg;
CExtSafeString prof009src( _T("***prof009src string***") );

// error, ansi + simply unsupported
//mfcs009trg = "cat" + mfcs009src;
mfcs009trg = "cat" + prof009src;
// error, ansi + simply unsupported
//prof009trg = "cat" + mfcs009src;
prof009trg = "cat" + prof009src;

mfcs009trg = _T("cat") + mfcs009src;
mfcs009trg = _T("cat") + prof009src;
prof009trg = _T("cat") + mfcs009src;
prof009trg = _T("cat") + prof009src;

// error, ansi + simply unsupported
//mfcs009trg = mfcs009src + "cat";
mfcs009trg = prof009src + "cat";
// error, ansi + simply unsupported
//prof009trg = mfcs009src + "cat";
prof009trg = prof009src + "cat";

mfcs009trg = mfcs009src + _T("cat");
mfcs009trg = prof009src + _T("cat");
prof009trg = mfcs009src + _T("cat");
prof009trg = prof009src + _T("cat");

// + with chars
mfcs009trg = 'C' + mfcs009src;
// error, mfc does not like prof string
//mfcs009trg = 'C' + prof009src;
prof009trg = 'C' + mfcs009src;
prof009trg = 'C' + prof009src;

mfcs009trg = _T('C') + mfcs009src;
// error, mfc does not like prof string
//mfcs009trg = _T('C') + prof009src;
prof009trg = _T('C') + mfcs009src;
prof009trg = _T('C') + prof009src;

mfcs009trg = mfcs009src + 'C' ;
// error, mfc does not like prof string
//mfcs009trg = prof009src + 'C' ;
prof009trg = mfcs009src + 'C' ;
prof009trg = prof009src + 'C' ;

mfcs009trg = mfcs009src + _T('C');
// error, mfc does not like prof string
//mfcs009trg = prof009src + _T('C');
prof009trg = mfcs009src + _T('C');
prof009trg = prof009src + _T('C');

//////////////////////////////////////////////////////////////
/// serialization
//////////////////////////////////////////////////////////////

CString mfcs010_src_mfcs_as_mfcs( _T("mfcs as mfcs") );
CString mfcs010_src_prof_as_mfcs( _T("prof as mfcs") );
CExtSafeString prof010_src_mfcs_as_prof( _T("mfcs as prof") );
CExtSafeString prof010_src_prof_as_prof( _T("prof as prof") );

CString mfcs010_dst_mfcs_as_mfcs;
CExtSafeString prof010_dst_prof_as_mfcs;
CString mfcs010_dst_mfcs_as_prof;
CExtSafeString prof010_dst_prof_as_prof;

CMemFile _file;
{ // BLOCK: serialization - store
CArchive ar(
&_file,
CArchive::store
);
ar << mfcs010_src_mfcs_as_mfcs;
ar << mfcs010_src_prof_as_mfcs;
ar << prof010_src_mfcs_as_prof;
ar << prof010_src_prof_as_prof;
ar.Flush();
} // BLOCK: serialization - store
_file.Seek(0,CFile::begin);
{ // BLOCK: serialization - load
CArchive ar(
&_file,
CArchive::load
);
ar >> mfcs010_dst_mfcs_as_mfcs;
ar >> prof010_dst_prof_as_mfcs;
ar >> mfcs010_dst_mfcs_as_prof;
ar >> prof010_dst_prof_as_prof;
} // BLOCK: serialization - load