8. Unicode

AGX uses UTF-8 internally. One reason is that UTF-8 is backwards compatible with ASCII. Another is that it uses less storage than wide-character strings for text that is mostly ASCII. std::string and agx::String work unchanged as storage containers for UTF-8 strings.
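
As a small illustration of the points above, the following sketch stores UTF-8 text in a plain std::string and shows that size() reports bytes rather than characters. It uses only the C++ standard library. The helper names countCodePoints and checkUtf8Storage are hypothetical and are not part of AGX.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of AGX): counts Unicode code points in a
// UTF-8 string by skipping continuation bytes, which always have the bit
// pattern 10xxxxxx.
std::size_t countCodePoints(const std::string& utf8)
{
  std::size_t count = 0;
  for (unsigned char byte : utf8)
  {
    if ((byte & 0xC0) != 0x80)
      ++count;
  }
  return count;
}

// Hypothetical usage check illustrating the storage behaviour.
void checkUtf8Storage()
{
  const std::string ascii = "agx";                       // pure ASCII
  const std::string swedish = "\xC3\xA5ngstr\xC3\xB6m";  // "ångström" as UTF-8 bytes

  // ASCII text is already valid UTF-8: one byte per character.
  assert(ascii.size() == 3);
  assert(countCodePoints(ascii) == 3);

  // size() reports bytes, not characters: 8 characters occupy 10 bytes.
  assert(swedish.size() == 10);
  assert(countCodePoints(swedish) == 8);
}
```

The practical consequence is that byte-oriented operations such as copying, comparing for equality and concatenating work as usual, while anything that indexes or truncates by character position has to be aware of multi-byte sequences.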

In <agx/Encoding.h> there are several utility functions for converting between wide-character and UTF-8 strings.

For example, reading a value from the Windows registry when the key name is stored as UTF-8 requires converting the key name from UTF-8 to wide, and converting the returned value from wide back to UTF-8:

const char* keyInUtf8; // Initialized with some UTF-8 string
std::wstring wKey = agx::utf8ToWide( keyInUtf8 ); // Convert to wide character string

agx::String value; // Here is where we store the result
bool found = false;

// hkey is assumed to be an already opened registry key.
DWORD dwType = 0;
DWORD dwLen = 0;
LONG status = RegQueryValueExW(hkey, wKey.c_str(), nullptr, &dwType, nullptr, &dwLen);

if (status == ERROR_SUCCESS && dwType == REG_SZ)
{
  agx::Vector<wchar_t> data;

  // dwLen is a size in bytes, so using it as a wchar_t count will
  // overallocate, but that's ok.
  data.resize(dwLen + 1);

  // Now get the value
  status = RegQueryValueExW(hkey, wKey.c_str(), nullptr, nullptr, (PBYTE)(data.ptr()), &dwLen);
  if (status == ERROR_SUCCESS)
  {
    // If we use RegQueryValueEx and UNICODE is not defined, the API call
    // resolves to RegQueryValueExA, which converts the value to ANSI if
    // the data in the registry key was written as Unicode.
    //
    // We don't want that conversion since it might not be safe, and having
    // to mess around with code pages is not fun.

    value = agx::wideToUtf8( std::wstring(data.ptr()) );
    found = true;
  }
}

// Close the key whether or not the query succeeded.
RegCloseKey(hkey);
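
To make the UTF-8 to wide step in the example above more concrete, here is a minimal standalone sketch of such a conversion using only the C++ standard library. It is not the AGX implementation. The names utf8ToWideSketch and checkConversion are hypothetical, and the sketch assumes well-formed input apart from the basic checks it performs (it does not reject overlong encodings or encoded surrogates).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a UTF-8 to wide conversion (not the AGX code).
// Produces UTF-16 where wchar_t is 16 bits (Windows) and UTF-32 otherwise.
std::wstring utf8ToWideSketch(const std::string& utf8)
{
  std::wstring out;
  std::size_t i = 0;
  while (i < utf8.size())
  {
    const unsigned char lead = static_cast<unsigned char>(utf8[i]);
    std::uint32_t cp = 0;   // decoded code point
    std::size_t extra = 0;  // number of continuation bytes expected

    if (lead < 0x80)                { cp = lead;        extra = 0; }
    else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
    else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
    else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
    else throw std::invalid_argument("invalid UTF-8 lead byte");

    if (extra > utf8.size() - i - 1)
      throw std::invalid_argument("truncated UTF-8 sequence");

    for (std::size_t k = 1; k <= extra; ++k)
    {
      const unsigned char c = static_cast<unsigned char>(utf8[i + k]);
      if ((c & 0xC0) != 0x80)
        throw std::invalid_argument("invalid UTF-8 continuation byte");
      cp = (cp << 6) | (c & 0x3F);
    }
    i += extra + 1;

    if (sizeof(wchar_t) == 2 && cp > 0xFFFF)
    {
      // Encode as a UTF-16 surrogate pair.
      cp -= 0x10000;
      out.push_back(static_cast<wchar_t>(0xD800 + (cp >> 10)));
      out.push_back(static_cast<wchar_t>(0xDC00 + (cp & 0x3FF)));
    }
    else
    {
      out.push_back(static_cast<wchar_t>(cp));
    }
  }
  return out;
}

// Hypothetical usage check: ASCII passes through unchanged, and the two
// bytes C3 A5 decode to the single character U+00E5.
void checkConversion()
{
  assert(utf8ToWideSketch("agx") == L"agx");
  assert(utf8ToWideSketch("\xC3\xA5") == std::wstring(1, static_cast<wchar_t>(0xE5)));
}
```

In AGX code the functions in <agx/Encoding.h> should be used instead. The sketch is only meant to show why the wide form is needed before calling the W variants of the Windows API.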

8.1. Limitations

Applications such as agxViewer will not be able to read files in a path containing non-ASCII characters.

C++ streams do not handle Unicode paths on Windows, and there are no open methods that accept a std::wstring. Hence classes derived from basic_ios, such as ifstream, must not be used for accessing files in paths with non-ASCII characters.
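
One possible way around the stream limitation in your own code is to open the file through C stdio with a wide path on Windows. The sketch below is an assumption about how this could be done, not something AGX provides. The names toWide, openUtf8Path and checkMissingFile are hypothetical. On Windows it relies on the Win32 function MultiByteToWideChar and the CRT function _wfopen. On other platforms the UTF-8 path is passed to fopen unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

#ifdef _WIN32
// Hypothetical helper: converts UTF-8 to UTF-16 with the Win32 API.
// Returns an empty string if the input is empty or the conversion fails.
static std::wstring toWide(const std::string& utf8)
{
  if (utf8.empty())
    return std::wstring();
  const int srcLen = static_cast<int>(utf8.size());
  const int len = MultiByteToWideChar(CP_UTF8, 0, utf8.data(), srcLen, nullptr, 0);
  if (len <= 0)
    return std::wstring();
  std::wstring wide(static_cast<std::size_t>(len), L'\0');
  MultiByteToWideChar(CP_UTF8, 0, utf8.data(), srcLen, &wide[0], len);
  return wide;
}
#endif

// Hypothetical helper: opens a file whose path is given as UTF-8.
// Returns nullptr on failure, like fopen. The caller closes with fclose.
std::FILE* openUtf8Path(const std::string& utf8Path, const std::string& mode)
{
#ifdef _WIN32
  return _wfopen(toWide(utf8Path).c_str(), toWide(mode).c_str());
#else
  return std::fopen(utf8Path.c_str(), mode.c_str());
#endif
}

// Hypothetical usage check: a path that does not exist yields nullptr.
void checkMissingFile()
{
  std::FILE* file = openUtf8Path("no-such-dir-\xC3\xA5/missing.txt", "rb");
  assert(file == nullptr);
}
```

The Windows branch could equally use agx::utf8ToWide from <agx/Encoding.h> in place of the hypothetical toWide helper. The Win32 call is used here only to keep the sketch self-contained.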