I'm having trouble with the following code:
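#include <stdio.h>
#include <wchar.h>

int convert_to_opcodes(const wchar_t *parts) {
    wchar_t caracter[2] = {0};      // room for one character plus the terminating NUL
    wcsncpy(caracter, parts, 1);    // copy only the first wide character
    return (int)caracter[0];
}

int main(void) {
    const wchar_t *text1 = L"ñ";
    printf("Opcode: %x\n", (unsigned)convert_to_opcodes(text1));
    return 0;
}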
I found this code, but instead of the literal L"ñ" I want to pass in argv[1] or some other equivalent variable and have it converted:
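#include <stdio.h>
#include <wchar.h>

int convert_to_opcodes(const wchar_t *parts) {
    wchar_t caracter[2] = {0};      // room for one character plus the terminating NUL
    wcsncpy(caracter, parts, 1);    // copy only the first wide character
    return (int)caracter[0];
}

int main(void) {
    char *example = "ñ";
    wchar_t *text1 = example;       // does not convert: a char * cannot simply be assigned to a wchar_t *
    printf("Opcode: %x\n", (unsigned)convert_to_opcodes(text1));
    return 0;
}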
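Question:
How do I get the opcode of characters in A-Z0-1 and of special characters such as "ñ,%$"? Is there an example?
What I expect to get is the opcode of each character, for example "ñ" = f1, and so on.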
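An "opcode" is a CPU instruction. I think you mean "Unicode code point". Converting text from a particular encoding into a wide string is called decoding the text, and it can be done with mbrtowc, which decodes according to the encoding of the current locale.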
#include <locale.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>

int main( void ) {
    // Select the encoding to decode from; this locale name assumes UTF-8 input.
    if ( !setlocale( LC_ALL, "en_US.utf8" ) ) {
        fprintf( stderr, "Locale not available\n" );
        return 1;
    }

    mbstate_t state;
    memset( &state, 0, sizeof( state ) );

    const char *encoded_text = "ñ";
    size_t encoded_len = strlen( encoded_text ) + 1;   // Include the NUL.
    const char *encoded_text_end = encoded_text + encoded_len;

    wchar_t ucp;
    while (1) {
        size_t left = encoded_text_end - encoded_text;
        size_t rv = mbrtowc( &ucp, encoded_text, left, &state );
        if ( rv == 0 )            // NUL encountered.
            break;
        if ( rv == (size_t)-2 )   // Incomplete sequence encountered.
            break;
        if ( rv == (size_t)-1 )   // Invalid sequence encountered.
            break;

        encoded_text += rv;       // Advance past the bytes just decoded.
        printf( "U+%06lX\n", (unsigned long)ucp );
    }

    return 0;
}
U+0000F1
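
Since the question asks about argv[1] rather than a literal, the same loop can be wrapped in a function that takes any multibyte string. The following is only a minimal sketch: decode_code_points is a hypothetical helper name, not something from the standard library or from the code above, and it assumes the caller has already selected a suitable locale.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper (an illustration, not part of the answer above):
 * decodes the multibyte string `text` using the current locale's encoding
 * and stores at most `max` wide characters in `out`.
 * Returns the number of characters stored, or (size_t)-1 if the input
 * contains an invalid or incomplete sequence. Input longer than `max`
 * characters is silently truncated.
 */
size_t decode_code_points(const char *text, wchar_t *out, size_t max) {
    mbstate_t state;
    memset(&state, 0, sizeof state);

    size_t left = strlen(text) + 1;   /* include the NUL so mbrtowc can report it */
    size_t count = 0;

    while (count < max) {
        wchar_t wc;
        size_t rv = mbrtowc(&wc, text, left, &state);
        if (rv == 0)                  /* reached the terminating NUL */
            break;
        if (rv == (size_t)-1 || rv == (size_t)-2)
            return (size_t)-1;        /* invalid or incomplete sequence */
        out[count++] = wc;
        text += rv;                   /* advance past the bytes just decoded */
        left -= rv;
    }
    return count;
}
```

In main, calling setlocale( LC_ALL, "" ) first makes the program use the encoding of the user's environment, which is normally the encoding argv arrives in. After that, decode_code_points( argv[1], buf, n ) fills buf, and each element can be printed with the same "U+%06lX" format as above. Whether a wchar_t value is actually a Unicode code point depends on the platform, so the U+0000F1 result for ñ should be expected only where that holds.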