ANSI and Unicode encoding: the meaning of TCHAR | LPSTR | LPCSTR | LPWSTR | LPCWSTR | LPTSTR | LPCTSTR - C++ programming basics


2017-10-13 09:44:19

LPCWSTR - const wchar_t*

LPTSTR - TCHAR*

LPCTSTR - const TCHAR*

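Depending on which character set a project is built with, the same code can fail to compile. For example, the following compiles in an ANSI build but produces errors in a Unicode build: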

int main()
{
    TCHAR name[] = "Saturn";
    int nLen; // Or size_t

    nLen = strlen(name);
}

error C2440: 'initializing' : cannot convert from 'const char [7]' to 'TCHAR []'
error C2664: 'strlen' : cannot convert parameter 1 from 'TCHAR []' to 'const char *'

The same problem occurs with:

nLen = wcslen("Saturn");
// ERROR: cannot convert parameter 1 from 'const char [7]' to 'const wchar_t *'

Unfortunately, this error cannot be fixed with a cast:

nLen = wcslen((const wchar_t*)"Saturn");

This compiles, but it gives a wrong result and usually reads out of bounds. The reason is that "Saturn" occupies 7 bytes:

'S'(83)

'a'(97)

't'(116)

'u'(117)

'r'(114)

'n'(110)

'\0'(0)

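wcslen, however, treats every character as 2 bytes (wchar_t is 2 bytes on Windows). The first two bytes [83, 97] are therefore read as a single character with the value (97<<8 | 83) = 24915 (0x6153), which is a CJK ideograph rather than 'S'. The remaining bytes are paired up the same way. None of these pairs is zero, so wcslen finds no terminator inside the 7 bytes and keeps reading past the end of the array.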
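The pairing described above can be reproduced without relying on the platform's wchar_t size. The following is a minimal sketch, not part of the original article: the function name misread_as_utf16le is hypothetical, and it assumes the little-endian, 2-byte layout used on Windows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: pairs up raw bytes the way a 2-byte, little-endian
// wide-character function would see them. A trailing odd byte is ignored.
std::vector<std::uint16_t> misread_as_utf16le(const unsigned char* bytes,
                                              std::size_t byte_count) {
    assert(bytes != nullptr);
    std::vector<std::uint16_t> units;
    for (std::size_t i = 0; i + 1 < byte_count; i += 2) {
        // Low byte comes first in memory, high byte second.
        units.push_back(
            static_cast<std::uint16_t>((bytes[i + 1] << 8) | bytes[i]));
    }
    return units;
}
```

Feeding it the 7 bytes of "Saturn" yields three units (0x6153, 0x7574, 0x6E72) plus one unpaired '\0' byte. No unit is zero, which is why a wide-string length function would run off the end of the buffer.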

So when calling a Unicode API, the string has to be a wide string from the start:

TCHAR name[] = _T("Saturn");
// or
wcslen(L"Saturn");
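To show why _T makes the same source compile in both builds, here is a rough, portable imitation of the generic-text mapping. It is only a sketch: the MY_ names and the MY_UNICODE switch are hypothetical stand-ins for TCHAR, _T, _tcslen and the real Unicode macros, and the actual header is more elaborate.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>

// Hypothetical stand-ins for the generic-text names; not the real <tchar.h>.
#ifdef MY_UNICODE
typedef wchar_t MY_TCHAR;
#define MY_T(x) L##x            // "Saturn" becomes L"Saturn"
#define my_tcslen std::wcslen
#else
typedef char MY_TCHAR;
#define MY_T(x) x               // "Saturn" stays "Saturn"
#define my_tcslen std::strlen
#endif

// Compiles unchanged whether or not MY_UNICODE is defined.
std::size_t planet_name_length() {
    MY_TCHAR name[] = MY_T("Saturn");
    // Six characters plus the terminator, in either mode.
    assert(sizeof(name) / sizeof(name[0]) == 7);
    return my_tcslen(name);
}
```

The character type, the literal prefix and the length function all switch together, so the literal and the function always agree on the character width.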

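In the earlier example, when the code is built as Unicode, each character of name in strlen(name) occupies 2 bytes. Casting it to an ANSI string:

nLen = strlen ((const char*)name);

is also wrong. 'S' is stored as [83, 0]. Read as ANSI, the first byte [83] is correctly interpreted as 'S', but the second byte [0] is interpreted as '\0' and ends the string. So strlen returns 1.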
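The early termination can be checked on any platform by laying out the bytes by hand. This is an illustrative sketch only: both function names are hypothetical, and it assumes the 2-byte little-endian layout of a Windows Unicode build with ASCII-only input.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: lays out an ASCII string as 2-byte little-endian
// characters, the way a Unicode build stores it in memory.
std::vector<char> ascii_to_utf16le_bytes(const std::string& ascii) {
    std::vector<char> bytes;
    for (char c : ascii) {
        assert(static_cast<unsigned char>(c) < 128);  // sketch handles ASCII only
        bytes.push_back(c);  // low byte: the ASCII code
        bytes.push_back(0);  // high byte: zero for ASCII
    }
    bytes.push_back(0);      // two-byte terminator
    bytes.push_back(0);
    return bytes;
}

// What strlen reports when those bytes are treated as a narrow string.
std::size_t narrow_length_of_wide(const std::string& ascii) {
    std::vector<char> bytes = ascii_to_utf16le_bytes(ascii);
    return std::strlen(bytes.data());
}
```

For "Saturn" the buffer holds 14 bytes, yet the narrow length is 1 because the zero high byte of 'S' acts as the terminator.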

In short, C-style casts do not work here.
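When a narrow string really does have to become a wide one at run time, the characters must be converted, not reinterpreted. The sketch below uses the standard std::mbstowcs function as one possible approach. The article itself does not prescribe a conversion routine, the helper name widen is hypothetical, and the result depends on the current C locale.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical helper: converts a narrow (multibyte) string to a wide string
// using the current C locale. Returns an empty string on invalid input.
std::wstring widen(const char* narrow) {
    assert(narrow != nullptr);
    // A multibyte string never yields more wide characters than it has bytes.
    std::wstring wide(std::strlen(narrow) + 1, L'\0');
    std::size_t written = std::mbstowcs(&wide[0], narrow, wide.size());
    if (written == static_cast<std::size_t>(-1)) {
        return std::wstring();  // invalid multibyte sequence
    }
    wide.resize(written);       // drop the unused tail and the terminator slot
    return wide;
}
```

With plain ASCII input such as "Saturn", the result has six wide characters, one per input byte.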

When allocating memory in C++ with new, specify the number of characters. There is no need to work out how many bytes are actually allocated:

LPTSTR pBuffer; // TCHAR* 

pBuffer = new TCHAR[128]; // Allocates 128 or 256 BYTES, depending on compilation.

But when allocating with APIs such as malloc, LocalAlloc, or GlobalAlloc, you must specify the size in bytes:

pBuffer = (TCHAR*) malloc (128 * sizeof(TCHAR) );
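The character-count versus byte-count distinction can be wrapped in a small helper so the multiplication is never forgotten. This is a sketch under assumptions: bytes_for and alloc_chars are hypothetical names, and the character type is a template parameter where the article would use TCHAR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: number of bytes needed for `count` characters of CharT.
template <typename CharT>
std::size_t bytes_for(std::size_t count) {
    assert(count <= SIZE_MAX / sizeof(CharT));  // guard against overflow
    return count * sizeof(CharT);
}

// Hypothetical helper: malloc-based buffer sized in characters, not bytes.
// The caller releases it with std::free.
template <typename CharT>
CharT* alloc_chars(std::size_t count) {
    return static_cast<CharT*>(std::malloc(bytes_for<CharT>(count)));
}
```

For 128 characters this requests 128 bytes when CharT is char and 256 bytes when CharT is a 2-byte type such as char16_t, matching the "128 or 256 BYTES" comment in the new example.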