Extracting a JSON string from HTML using only iOS APIs

Question  Votes: 0  Answers: 1

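I want to extract a JSON string from an HTML document without using any third-party framework. I am building an iOS framework of my own, and I don't want it to depend on third-party frameworks.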

Example URL: http://www.nicovideo.jp/watch/sm33786214

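That HTML contains an element with the id js-initial-watch-data, and its data-api-data attribute holds the string I am after (call it JSON_String_I_want_to_extract).

I need to extract JSON_String_I_want_to_extract and convert it to a JSON object.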

With the third-party framework Kanna, it looks like this:



    if let doc = Kanna.HTML(html: html, encoding: String.Encoding.utf8) {
        if let descNode = doc.css("#js-initial-watch-data[data-api-data]").first {
            let dataApiData = descNode["data-api-data"]
            if let data = dataApiData?.data(using: .utf8) {
                if let json = try? JSON(data: data, options: JSONSerialization.ReadingOptions.mutableContainers) {
                    // ... use json here
                }
            }
        }
    }

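I searched online for similar questions, but I couldn't apply the answers to my case. (I have to admit I don't really follow regular expressions.) This is what I tried: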



      if let html = String(data:data, encoding:.utf8) {
        let pattern = "data-api-data=\"(.*?)\".*?>"
        let regex = try! NSRegularExpression(pattern: pattern, options: .caseInsensitive)
        let matches = regex.matches(in: html, options: [], range: NSMakeRange(0, html.count))
        var results: [String] = []
        matches.forEach { (match) -> () in
            results.append( (html as NSString).substring(with: match.rangeAt(1)) )
        }
        if let stringJSON = results.first {
          let d = stringJSON.data(using: String.Encoding.utf8)
          if let json = try? JSONSerialization.jsonObject(with: d!, options: []) as? Any {
            // it does not get here...      
          }

Has anyone extracted a string like this from HTML and converted it to JSON?

Thanks.

html json swift nsregularexpression
1 Answer
0
votes

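Your pattern doesn't look bad. The problem is that attribute values of HTML elements may be written with character entities (for example, &quot; in place of a double quote).

You need to replace those entities with the actual characters before parsing the String as JSON.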
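To see why the replacement matters, here is a minimal sketch. The helper name decodeBasicEntities and the sample attribute value are made up for illustration, and the helper only covers the five predefined entities; numeric references such as &#39; are not handled. It replaces &amp; last so that text like &amp;quot; ends up as the literal &quot; and not as a double quote.

```swift
import Foundation

// Hypothetical helper: decodes only the five predefined entities.
// "&amp;" goes last so "&amp;quot;" becomes the text "&quot;", not a quote.
func decodeBasicEntities(_ text: String) -> String {
    return text
        .replacingOccurrences(of: "&quot;", with: "\"")
        .replacingOccurrences(of: "&apos;", with: "'")
        .replacingOccurrences(of: "&gt;", with: ">")
        .replacingOccurrences(of: "&lt;", with: "<")
        .replacingOccurrences(of: "&amp;", with: "&")
}

// Made-up value, shaped like what the regex captures from the attribute.
let rawValue = "{&quot;video&quot;:{&quot;id&quot;:&quot;sm33786214&quot;}}"

// Parsing the raw value throws because it is not valid JSON yet.
let rawParse = try? JSONSerialization.jsonObject(with: Data(rawValue.utf8))

// After decoding the entities, the same value parses.
let decodedValue = decodeBasicEntities(rawValue)
let decodedParse = try? JSONSerialization.jsonObject(with: Data(decodedValue.utf8))

print("raw parses:", rawParse != nil)
print("decoded parses:", decodedParse != nil)
```

This is most likely why the code in the question never reaches the inner block: try? swallows the parsing error and simply yields nil.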

if let html = String(data:data, encoding: .utf8) {
    let pattern = "data-api-data=\"([^\"]*)\""
    let regex = try! NSRegularExpression(pattern: pattern, options: .caseInsensitive)
    let matches = regex.matches(in: html, range: NSRange(0..<html.utf16.count)) //<-USE html.utf16.count, NOT html.count
    var results: [String] = []
    matches.forEach {match in
        let propValue = html[Range(match.range(at: 1), in: html)!]
            //### You need to replace character entities into actual characters
            .replacingOccurrences(of: "&quot;", with: "\"")
            .replacingOccurrences(of: "&apos;", with: "'")
            .replacingOccurrences(of: "&gt;", with: ">")
            .replacingOccurrences(of: "&lt;", with: "<")
            .replacingOccurrences(of: "&amp;", with: "&")
        results.append(propValue)
    }
    if let stringJSON = results.first {
        let dataJSON = stringJSON.data(using: .utf8)!
        do {
            let json = try JSONSerialization.jsonObject(with: dataJSON)
            print(json)
        } catch {
            print(error) //You should not ignore errors silently...
        }
    } else {
        print("NO result")
    }
}
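As a follow-up to the html.utf16.count comment above, here is a small sketch of the extraction step wrapped in a function. The name firstDataApiData and the sample HTML are hypothetical. It builds the NSRange from String indices, which is one way to avoid confusing html.count (Characters) with the UTF-16 length that NSRegularExpression expects.

```swift
import Foundation

// Hypothetical helper: returns the raw (still entity-encoded) value of the
// first data-api-data attribute, or nil if there is none.
func firstDataApiData(in html: String) -> String? {
    let pattern = "data-api-data=\"([^\"]*)\""
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return nil }
    // NSRange counts UTF-16 code units; deriving it from String indices
    // keeps it consistent with the string's actual contents.
    let fullRange = NSRange(html.startIndex..<html.endIndex, in: html)
    guard let match = regex.firstMatch(in: html, options: [], range: fullRange),
          let valueRange = Range(match.range(at: 1), in: html) else { return nil }
    return String(html[valueRange])
}

// Made-up fragment: the emoji make count and utf16.count differ.
let sampleHTML = "<title>動画 🎬🎬🎬</title>"
    + "<div id=\"js-initial-watch-data\" data-api-data=\"{&quot;a&quot;:1}\"></div>"

print(sampleHTML.count, sampleHTML.utf16.count)
print(firstDataApiData(in: sampleHTML) ?? "no match")
```

With NSMakeRange(0, html.count), the search range can end before the end of the document whenever the page contains characters that take more than one UTF-16 code unit, so a match near the end may be missed.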