How do I get the UTF-8 code of a single character in VBScript?

Question · Votes: 3 · Answers: 1

I want to get the UTF-8 code of a character. I tried using a stream, but it does not seem to work.

Example: פ should give 16#D7A4, according to https://en.wikipedia.org/wiki/Pe_(Semitic_letter)#Character_encodings.

Const adTypeBinary = 1
Dim adoStr, bytesthroughado
Set adoStr = CreateObject("Adodb.Stream")
    adoStr.Charset = "utf-8"
    adoStr.Open
    adoStr.WriteText labelString
    adoStr.Position = 0 
    adoStr.Type = adTypeBinary
    adoStr.Position = 3 
    bytesthroughado = adoStr.Read
    Msgbox(LenB(bytesthroughado)) 'gives 2
    adoStr.Close
Set adoStr = Nothing
MsgBox(bytesthroughado) ' gives K

Note: AscW returns the Unicode code point, not the UTF-8 encoding.

encoding utf-8 vbscript wincc
1 answer
3 votes

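bytesthroughado holds a value of subtype Byte() (see the first line of the output), so you need to process it byte by byte instead of passing it to MsgBox as if it were a string. MsgBox most likely shows K because the two bytes D7 A4 are reinterpreted as a single UTF-16 code unit, U+A4D7 (LISU LETTER KA), whose glyph resembles a Latin K.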

Option Explicit

Dim ss, xx, ii, jj, char, labelString

labelString = "ařЖפ€"
ss = ""
For ii=1 To Len( labelString)
  char = Mid( labelString, ii, 1)
  xx = BytesThroughAdo( char)
  If ss = "" Then ss = VarType(xx) & " " & TypeName( xx) & vbNewLine
  ss = ss & char & vbTab
  For jj=1 To LenB( xx)
      ss = ss & Right( "0" & Hex( AscB( MidB( xx, jj, 1))), 2) & " "  'Zero-pad so a byte below 0x10 still prints as two digits.
  Next
  ss = ss & vbNewLine
Next   

Wscript.Echo ss

Function BytesThroughAdo( labelChar)
    Const adTypeBinary = 1  'Indicates binary data.
    Const adTypeText   = 2  'Default. Indicates text data.
    Dim adoStream
    Set adoStream = CreateObject( "Adodb.Stream")
    adoStream.Charset = "utf-8"
    adoStream.Open
    adoStream.WriteText labelChar
    adoStream.Position = 0 
    adoStream.Type = adTypeBinary
    adoStream.Position = 3  'Skip the 3-byte UTF-8 BOM (EF BB BF) that the stream writes first.
    BytesThroughAdo = adoStream.Read
    adoStream.Close
    Set adoStream = Nothing
End Function

Output:

cscript D:\bat\SO\61368074q.vbs
8209 Byte()
a       61
ř       C5 99
Ж       D0 96
פ       D7 A4
€       E2 82 AC
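
The question asks for a single value such as 16#D7A4 rather than a list of separate bytes. The following is a minimal sketch of one way to get that, using the same ADODB.Stream approach as above. Utf8Hex is a hypothetical helper name, not part of the original answer, and the sketch assumes a non-empty input string.

```vbscript
Option Explicit

' Hypothetical helper (not from the original answer): returns the UTF-8
' bytes of text as one zero-padded hex string. Assumes text is non-empty.
Function Utf8Hex(text)
    Const adTypeBinary = 1
    Dim stream, bytes, i, result

    Set stream = CreateObject("ADODB.Stream")
    stream.Charset = "utf-8"
    stream.Open
    stream.WriteText text
    stream.Position = 0
    stream.Type = adTypeBinary       ' switch to binary to read raw bytes
    stream.Position = 3              ' skip the 3-byte BOM (EF BB BF)
    bytes = stream.Read
    stream.Close
    Set stream = Nothing

    result = ""
    For i = 1 To LenB(bytes)
        ' AscB/MidB work on single bytes; Right(...) pads to two hex digits.
        result = result & Right("0" & Hex(AscB(MidB(bytes, i, 1))), 2)
    Next
    Utf8Hex = result
End Function

' ChrW(&H05E4) is the Hebrew letter Pe used in the question.
WScript.Echo Utf8Hex(ChrW(&H05E4))
```

For the letter from the question this should print D7A4, in line with the D7 A4 row of the output above. Building the character with ChrW keeps the example independent of the encoding in which the script file itself is saved.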

I used the characters ařЖפ€ to demonstrate that your UTF-8 encoder works (the alts8.ps1 PowerShell script below comes from another project).

alts8.ps1 "ařЖפ€"
Ch  Unicode   Dec   CP  IME      UTF-8     IME 0405/cs-CZ; CP852; ANSI 1250

a   U+0061      97      …97…     0x61      Latin Small Letter A
ř   U+0159     345      …89…     0xC599    Latin Small Letter R With Caron
Ж   U+0416    1046      …22…     0xD096    Cyrillic Capital Letter Zhe
פ   U+05E4    1508      …228…    0xD7A4    Hebrew Letter Pe
€   U+20AC    8364      …172…    0xE282AC  Euro Sign
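
The Unicode and UTF-8 columns above are related by plain bit arithmetic, so the encoding can also be derived from AscW without a stream. The following is a sketch under stated assumptions: Hex2 and Utf8HexOfChar are hypothetical names, and only characters in the Basic Multilingual Plane (one UTF-16 code unit) are handled, so surrogate pairs are out of scope.

```vbscript
Option Explicit

' Format one byte value (0..255) as two hex digits.
Function Hex2(n)
    Hex2 = Right("0" & Hex(n), 2)
End Function

' Hypothetical helper: UTF-8 bytes, as hex, of the first character in ch.
' Assumption: ch is a BMP character; surrogate pairs are not handled.
Function Utf8HexOfChar(ch)
    Dim cp
    cp = AscW(ch)
    If cp < 0 Then cp = cp + 65536      ' AscW returns a signed 16-bit Integer

    If cp < 128 Then                    ' U+0000..U+007F: 1 byte,  0xxxxxxx
        Utf8HexOfChar = Hex2(cp)
    ElseIf cp < 2048 Then               ' U+0080..U+07FF: 2 bytes, 110xxxxx 10xxxxxx
        Utf8HexOfChar = Hex2(&HC0 Or (cp \ 64)) & _
                        Hex2(&H80 Or (cp And 63))
    Else                                ' U+0800..U+FFFF: 3 bytes, 1110xxxx 10xxxxxx 10xxxxxx
        Utf8HexOfChar = Hex2(&HE0 Or (cp \ 4096)) & _
                        Hex2(&H80 Or ((cp \ 64) And 63)) & _
                        Hex2(&H80 Or (cp And 63))
    End If
End Function

WScript.Echo Utf8HexOfChar(ChrW(&H0159))   ' U+0159, C599 in the table
WScript.Echo Utf8HexOfChar(ChrW(&H05E4))   ' U+05E4, D7A4 in the table
WScript.Echo Utf8HexOfChar(ChrW(&H20AC))   ' U+20AC, E282AC in the table
```

For U+05E4 (decimal 1508) the arithmetic is 1508 \ 64 = 23 and 1508 And 63 = 36, giving &HC0 Or 23 = &HD7 and &H80 Or 36 = &HA4, which agrees with the table. The stream-based version remains the simpler choice when whole strings or characters outside the BMP are involved.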