8. Unicode
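AGX uses UTF-8 internally. One reason is that UTF-8 is backwards compatible with ASCII: every ASCII string is also a valid UTF-8 string. It also saves storage compared with wide-character encodings when the text is mostly ASCII. std::string and agx::String work unchanged as storage containers for UTF-8 strings.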
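As a minimal sketch of what this means in practice, the following standalone snippet stores UTF-8 bytes in a plain std::string. The helper countCodePoints is hypothetical and not part of AGX; it only illustrates that the byte length and the number of characters differ once non-ASCII characters are involved.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of AGX): counts the Unicode code points in a
// UTF-8 encoded std::string by skipping continuation bytes (bit pattern 10xxxxxx).
// Assumes the input is well-formed UTF-8.
std::size_t countCodePoints(const std::string& utf8)
{
  std::size_t count = 0;
  for (unsigned char c : utf8) {
    if ((c & 0xC0) != 0x80)
      ++count;
  }
  return count;
}

// A pure ASCII string is already valid UTF-8: one byte per character.
const std::string asciiName = "agx";

// "Malmö" written with explicit UTF-8 bytes for 'ö' (0xC3 0xB6), so the
// example does not depend on the encoding of the source file.
const std::string cityName = "Malm\xC3\xB6";
```

Here asciiName.size() and countCodePoints(asciiName) agree, while cityName occupies six bytes for five characters. Code that treats the string as an opaque byte sequence (copying, comparing, storing) is unaffected by this difference.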
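In <agx/Encoding.h> there are several utility methods for converting between wide-character strings and UTF-8.
For example, reading a value from the Windows registry when the key name is stored as UTF-8 requires converting the name to a wide string before calling the wide-character API, and converting the result back to UTF-8: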
const char* keyInUtf8; // Initialized with some utf8 string
std::wstring wKey = agx::utf8ToWide( keyInUtf8 ); // Convert to wide character string
agx::String value; // Here is where we store the result
bool found = false;
DWORD dwType = 0;
DWORD dwLen = 0;
// hkey is assumed to be an already opened registry key.
// First call: query only the type and the size (in bytes) of the value.
LONG status = RegQueryValueExW(hkey, wKey.c_str(), nullptr, &dwType, nullptr, &dwLen);
if (status == ERROR_SUCCESS && dwType == REG_SZ)
{
  agx::Vector<wchar_t> data;
  // dwLen is in bytes. Since we use wchar_t, this will overallocate, but that's ok.
  data.resize(dwLen + 1);
  // Now get the value
  status = RegQueryValueExW(hkey, wKey.c_str(), nullptr, nullptr, (PBYTE)(data.ptr()), &dwLen);
  if (status == ERROR_SUCCESS)
  {
    // If we use RegQueryValueEx and UNICODE is not set, then the API call
    // will be RegQueryValueExA that will convert from wstring to ANSI if
    // the data in the registry key is written as unicode.
    //
    // We don't want that conversion since it might not be safe and having to mess
    // around with code pages is not fun.
    value = agx::wideToUtf8( std::wstring( data.ptr() ) );
    found = true;
  }
  RegCloseKey(hkey);
}
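To make the conversion step less opaque, here is a rough standalone sketch of what decoding UTF-8 into a wide string involves. The function sketchUtf8ToWide is hypothetical and is not the AGX implementation of agx::utf8ToWide; it assumes mostly well-formed input, substitutes U+FFFD for malformed sequences, and does not reject overlong encodings.

```cpp
#include <cstddef>
#include <string>

// Illustrative only: NOT the implementation behind agx::utf8ToWide.
// Decodes UTF-8 into a std::wstring. Malformed or truncated sequences are
// replaced with U+FFFD. Overlong encodings are not detected.
std::wstring sketchUtf8ToWide(const std::string& utf8)
{
  std::wstring out;
  std::size_t i = 0;
  while (i < utf8.size()) {
    const unsigned char lead = static_cast<unsigned char>(utf8[i]);
    char32_t cp = 0xFFFD; // replacement character unless decoding succeeds
    std::size_t len = 1;  // number of bytes consumed for this code point

    // The lead byte tells how many bytes the sequence occupies.
    if (lead < 0x80)                { cp = lead;        len = 1; }
    else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; len = 2; }
    else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; len = 3; }
    else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; len = 4; }

    if (i + len > utf8.size()) {
      // Sequence is cut off at the end of the input.
      cp = 0xFFFD;
      len = 1;
    } else {
      // Each continuation byte (10xxxxxx) contributes six payload bits.
      for (std::size_t k = 1; k < len; ++k) {
        const unsigned char c = static_cast<unsigned char>(utf8[i + k]);
        if ((c & 0xC0) != 0x80) { cp = 0xFFFD; len = k; break; }
        cp = (cp << 6) | (c & 0x3F);
      }
    }
    i += len;

    if (sizeof(wchar_t) == 2 && cp > 0xFFFF) {
      // 16-bit wchar_t (as on Windows): encode as a UTF-16 surrogate pair.
      cp -= 0x10000;
      out.push_back(static_cast<wchar_t>(0xD800 + (cp >> 10)));
      out.push_back(static_cast<wchar_t>(0xDC00 + (cp & 0x3FF)));
    } else {
      out.push_back(static_cast<wchar_t>(cp));
    }
  }
  return out;
}
```

In application code the functions in <agx/Encoding.h> should be preferred; the sketch only shows why a UTF-8 string cannot simply be cast or copied byte by byte into a wide string.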
8.1. Limitations
Applications such as agxViewer will not be able to read files in a path containing non-ASCII characters.
C++ streams do not handle Unicode paths on Windows, and there are no standard open methods that accept a wstring. Hence basic_ios and derived classes such as ifstream must not be used for accessing files in paths with non-ASCII characters.
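One way to detect whether a given path is affected by this limitation is to check it for non-ASCII bytes before handing it to a stream. The helper below is a hypothetical sketch, not an AGX function, and assumes the path is stored as UTF-8 in a std::string.

```cpp
#include <string>

// Hypothetical helper (not part of AGX): returns true if the UTF-8 encoded
// path contains only 7-bit ASCII bytes. In UTF-8, every non-ASCII character
// is encoded using bytes with the high bit set (values >= 0x80).
bool isAsciiOnly(const std::string& utf8Path)
{
  for (unsigned char c : utf8Path) {
    if (c >= 0x80)
      return false;
  }
  return true;
}
```

A caller could use isAsciiOnly( path ) to report a clear error for paths such as "C:/modeller/skåp.agx" instead of failing later with an unspecific "file not found".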