Input: ["xL01(F]J","2pn5Mm","-5)8gF{","KWq0P]*%Q","n@,:\u003eAm@","\u003cRN_qCa7","8Qx\u0026RAON","gT~s!1s?4i{K","w\"r^d_#l$Mmp"]
Output: ["(01FJL]x","25Mmnp",")-58Fg{","%*0KPQW]q",",:\u003e@@Amn","7\u003cCNR_aq","\u00268ANOQRx","!14?KTgiss{~","\"#$M^_dlmprw"]
I currently have a workaround, but it is too risky to maintain whenever the input changes.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public static void main(String[] args) {
    String[] qArray = { "xL01(F]J", "2pn5Mm", "-5)8gF{", "KWq0P]*%Q", "n@,:\u003eAm@", "\u003cRN_qCa7",
            "8Qx\u0026RAON", "gT~s!1s?4i{K", "w\"r^d_#l$Mmp" };
    List<String> aList = new ArrayList<>();
    List<String> qList = Arrays.asList(qArray);
    qList.forEach(qString -> {
        String sorted = sortedString(qString);
        String replacedAndSorted = replaceUnicode(sorted);
        aList.add(replacedAndSorted);
    });
    System.out.println(Arrays.toString(aList.toArray()));
}

// Sorts the characters of str by their UTF-16 code unit value.
public static String sortedString(String str) {
    char[] c = str.toCharArray();
    Arrays.sort(c);
    return new String(c);
}

// Re-escapes the three hard-coded characters (greater-than, less-than, ampersand) after sorting.
public static String replaceUnicode(String sorted) {
    String replacedString = sorted;
    // Each step replaces on replacedString so that an earlier replacement is not overwritten,
    // and pads with "00" so the escape has four hex digits.
    if (sorted.contains("\u003e")) {
        replacedString = replacedString.replace("\u003e",
                "\\u00" + Integer.toHexString(sorted.codePointAt(sorted.indexOf("\u003e"))));
    }
    if (sorted.contains("\u003c")) {
        replacedString = replacedString.replace("\u003c",
                "\\u00" + Integer.toHexString(sorted.codePointAt(sorted.indexOf("\u003c"))));
    }
    if (sorted.contains("\u0026")) {
        replacedString = replacedString.replace("\u0026",
                "\\u00" + Integer.toHexString(sorted.codePointAt(sorted.indexOf("\u0026"))));
    }
    return replacedString;
}
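I don't want to check for every Unicode code individually. Is there a generic way to determine which characters were written as Unicode escapes, and at which positions in the given string?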
A simple approach is to loop over the characters and replace each non-ASCII one with its escape (\uxxxx denotes a BMP character).

public static String replaceUnicode(String str) {
    StringBuilder stringBuilder = new StringBuilder();
    for (char ch : str.toCharArray()) {
        if (ch > 127) { // Non-ASCII
            // Zero-padded to four lowercase hex digits.
            String unicode = String.format("\\u%04x", (int) ch);
            stringBuilder.append(unicode);
        } else {
            stringBuilder.append(ch);
        }
    }
    return stringBuilder.toString();
}
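One caveat with the loop above: `>`, `<` and `&` (U+003E, U+003C, U+0026) all lie inside the ASCII range, so a `ch > 127` test leaves them unchanged. A minimal sketch that also escapes a caller-supplied set of characters might look like the following. The class name `EscapeSelected`, the method `escape` and its `alwaysEscape` parameter are hypothetical names introduced for illustration, and the sketch assumes the characters to escape are known in advance.

```java
final class EscapeSelected {

    /**
     * Writes every char that is non-ASCII, or that occurs in alwaysEscape,
     * as a backslash-u escape with four hex digits. Other chars are copied as-is.
     */
    static String escape(String str, String alwaysEscape) {
        StringBuilder out = new StringBuilder();
        for (char ch : str.toCharArray()) {
            if (ch > 127 || alwaysEscape.indexOf(ch) >= 0) {
                out.append(String.format("\\u%04x", (int) ch));
            } else {
                out.append(ch);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumption: the set to escape is the three characters from the question.
        System.out.println(escape(",:>@@Amn", "<>&"));
    }
}
```

With `"<>&"` as the set, `escape(",:>@@Amn", "<>&")` yields `,:\u003e@@Amn`, which matches the fifth element of the expected output.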
A more sophisticated approach is to use a regular expression:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static String replaceUnicode(String str) {
    Pattern pattern = Pattern.compile("[^\\x00-\\x7F]");
    Matcher matcher = pattern.matcher(str);
    StringBuffer sb = new StringBuffer();
    while (matcher.find()) {
        // The doubled backslash survives String.format and is then read by
        // appendReplacement as one literal backslash.
        matcher.appendReplacement(
                sb, String.format("\\\\u%04x", (int) matcher.group().charAt(0)));
    }
    matcher.appendTail(sb);
    return sb.toString();
}
This regular expression matches any character outside the standard ASCII range (written [^\\x00-\\x7F] in the Java string literal).
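As for detecting which characters were written as escapes: once a Java string literal has been compiled, or a parser has decoded the escape, `\u003e` and `>` are the same character, so neither version above can tell them apart. That information only survives in the raw text, where the escape is still spelled out. The following is a sketch under that assumption. The names `EscapeAwareSort`, `sortKeepingEscapes` and `codeUnitOf` are hypothetical, only four-hex-digit escapes are recognized, and a doubled backslash is not treated specially.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EscapeAwareSort {

    // A token is either a spelled-out escape (backslash, u, four hex digits)
    // or any single char.
    private static final Pattern TOKEN =
            Pattern.compile("\\\\u[0-9a-fA-F]{4}|.", Pattern.DOTALL);

    /** Sorts raw text by character value while keeping each escape as it was written. */
    static String sortKeepingEscapes(String raw) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(raw);
        while (m.find()) {
            tokens.add(m.group());
        }
        // Order tokens by the code unit each one stands for; List.sort is stable.
        tokens.sort(Comparator.comparingInt(EscapeAwareSort::codeUnitOf));
        return String.join("", tokens);
    }

    // Escape tokens are six chars long; every other token is a single char.
    private static int codeUnitOf(String token) {
        return token.length() == 6
                ? Integer.parseInt(token.substring(2), 16)
                : token.charAt(0);
    }

    public static void main(String[] args) {
        // The doubled backslash keeps the escape as literal text in this string.
        System.out.println(sortKeepingEscapes("n@,:\\u003eAm@"));
    }
}
```

Because each escape is kept as its own entry in `tokens`, the token list before sorting also records which characters were escaped and where they stood in the input.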