How do I get useful data out of the XHR responses from a Facebook page?

Question   Votes: 0   Answers: 3

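I am trying to scrape my Facebook page to collect all of my friends' birthdays. Because Facebook uses AJAX calls to load friends' names on the "birthday events" page, I looked at the network activity in Chrome DevTools to see where and how it makes its XHR calls and what the response data looks like.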

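The responses to these calls make no sense to me. They look obfuscated or something... How can I use the response data from the XHR calls to extract the data I see on the site?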

Here is the response data:

for (;;); {
    "__ar": 1,
    "payload": null,
    "domops": [
        ["replace", "#birthdays_pager", false, {
            "__html": "\u003Cdiv class=\"_4-u2 _tzh _fbBirthdays__monthCard _4-u8\">\u003Cdiv class=\"_4-u3 _5dwa _5dw9\" id=\"birthdays_monthly_card_1522566000\">\u003Cspan class=\"_38my\">April\u003Cspan class=\"_c1c\">\u003C\/span>\u003C\/span>\u003Cspan class=\"_5dw8\">\u003Cdiv class=\"_tzj\">\u003Ca href=\"https:\/\/www.facebook.com\/kajal.chaudhary.5492\">Kajal Chaudhary\u003C\/a>, \u003Ca href=\"https:\/\/www.facebook.com\/shreesha.bhat.963\">Shreesha Bhat Galimane\u003C\/a> and 19 others\u003C\/div>\u003C\/span>\u003Cdiv class=\"_3s3-\">\u003C\/div>\u003C\/div>\u003Cdiv class=\"_4-u3\">\u003Cdiv class=\"_43qm _tzu _43q9\">\u003Cul class=\"uiList _4cg3 _509- _4ki\">\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/satish.ven.58\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Satish Ven (4\/2)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/p57x57\/27540391_2084438825122967_6451048031944951645_n.jpg?oh=77383450a07722e1a44bf39c6d2c12f7&oe=5B19517E\" alt=\"Satish Ven\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/sheshufirefox\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Sheshadri Sharma (4\/6)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/p57x57\/16998890_758118407695665_4675113951836594565_n.jpg?oh=946ce323c5b3824fbf8dbbe59fd9160f&oe=5B02616B\" alt=\"Sheshadri Sharma\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/aayush.sinha.146\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Aayush Sinha (4\/8)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/p57x57\/10968514_981734045189455_1830626709337028270_n.jpg?oh=428a495a9379b6b2202408aa5284923b&oe=5B12711E\" alt=\"Aayush 
Sinha\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/pranav.ys.5\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Pranav YS (4\/11)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/p57x57\/12541116_1676590859264773_7240167064125691378_n.jpg?oh=3d4d0b034a06ecf460b8668fcdd0fad2&oe=5AD7EEAA\" alt=\"Pranav YS\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/profile.php?id=100012822522252\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Pankaj Thakur (4\/11)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/p57x57\/27752205_468390053598408_3401567454276318428_n.jpg?oh=1f2fb7ee2da724506757029fdb8a46b2&oe=5B1F151B\" alt=\"Pankaj Thakur\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/prajwal.bhadravathiravi\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Prajwal Bhadravathi Ravi (4\/11)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/c10.0.57.57\/p57x57\/26000914_361577560980364_1712446738221265545_n.jpg?oh=370dc4419b0767b7e79bc27e854bc06b&oe=5B03D96B\" alt=\"Prajwal Bhadravathi Ravi\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/sachinr.doddaguni\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Sachin R Doddaguni (4\/12)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/c17.0.57.57\/p57x57\/10354686_10150004552801856_220367501106153455_n.jpg?oh=21091066fea75337ac98a3cf1f341740&oe=5B16DBF3\" alt=\"Sachin R Doddaguni\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca 
href=\"https:\/\/www.facebook.com\/kajal.chaudhary.5492\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Kajal Chaudhary (4\/14)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/p57x57\/23316350_1440015076106508_6579302328578807067_n.jpg?oh=4e3fc491c9a32f9581286452933b1e50&oe=5B227D7E\" alt=\"Kajal Chaudhary\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/usha.shastri.54\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Usha Shastri (4\/14)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/c0.0.57.57\/p57x57\/10152373_10203424125546031_1766227792_n.jpg?oh=8b0e95a8a60e09c79005a84f3c6a8b98&oe=5B225FD5\" alt=\"Usha Shastri\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/ashish.dwivedi.39566\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Ashish Dwivedi (4\/14)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/p57x57\/26239697_2005584583047675_396510917460842524_n.jpg?oh=eab2bd118623e449e2dcefa3fb64899e&oe=5B021392\" alt=\"Ashish Dwivedi\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/shreesha.bhat.963\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Shreesha Bhat Galimane (4\/15)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/p57x57\/27541088_2089364294633310_7146912677909552069_n.jpg?oh=e16a6a514982d8f15ae0a1c81a719752&oe=5B1B0577\" alt=\"Shreesha Bhat Galimane\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/chethanhr.chazz\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" 
data-tooltip-content=\"Chethan Vilas (4\/16)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/p57x57\/13620256_595294770631304_6000009159215075898_n.jpg?oh=b5b3ea3db6040e8a79233a7e90c916a9&oe=5B1C56BD\" alt=\"Chethan Vilas\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/kshitija.kallesh\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Kshitija Vidya Kallesh (4\/18)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/p57x57\/26733802_1408840822577029_120794789359415364_n.jpg?oh=19b7eada0711726990750fb6cf4add09&oe=5B03C8E7\" alt=\"Kshitija Vidya Kallesh\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/vishesh.ug\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Vishesh Umesh Gujjar (4\/18)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/p57x57\/21192739_1939744576303216_5388844998198614270_n.jpg?oh=24d78a736265c7c8c0adeb54324f5894&oe=5B08520E\" alt=\"Vishesh Umesh Gujjar\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/santosh.bhat.7359\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Santosh Bhat (4\/18)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/p57x57\/19149245_1892258634388875_8828676164364322774_n.jpg?oh=741b3bf6f9080726d54251044ba34355&oe=5B09B6F5\" alt=\"Santosh Bhat\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/profile.php?id=100007305601325\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Rahul Kumar (4\/20)\">\u003Cimg class=\"_s0 _ry img\" 
src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/c90.210.540.540\/s57x57\/21685964_1884540188466150_8711746607997503911_n.jpg?oh=09c4c1f9f707950987c0eb70e7f3ad58&oe=5B1329B8\" alt=\"Rahul Kumar\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/sumantha.murali\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Sumanth Sharma (4\/22)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/c1.0.57.57\/p57x57\/20663860_764480790380420_5549902384541679375_n.jpg?oh=132d2d9ec2b0b1f77f83620fc1efeb2a&oe=5B044E32\" alt=\"Sumanth Sharma\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/archana.kashyap.90226\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Sweekruthi Kashyap (4\/22)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/p57x57\/26814579_2066880066876357_3732840647672074955_n.jpg?oh=e27797ae7d2fcfa8ca23cf06bc36dbb9&oe=5B081963\" alt=\"Sweekruthi Kashyap\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/vinayaka.cbg\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Vinayaka Bhat Galimane (4\/23)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/p57x57\/25994873_1536622213073481_4403814656121225467_n.jpg?oh=d26a01066699d858d20bfa367fba02a4&oe=5B0EC392\" alt=\"Vinayaka Bhat Galimane\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/profile.php?id=100004456147835\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Pruthvi Kalyan Reddy (4\/28)\">\u003Cimg class=\"_s0 _ry img\" 
src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/p57x57\/14563438_693932530765279_8227735103682834751_n.jpg?oh=17a9ef5cfa963fe9902bc16e94e6b51d&oe=5B0F7604\" alt=\"Pruthvi Kalyan Reddy\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003Cli class=\"_43q7\">\u003Ca href=\"https:\/\/www.facebook.com\/kushal.kushu.31\" class=\"link\" data-jsid=\"anchor\" data-hover=\"tooltip\" data-tooltip-content=\"Kushal Kushu (4\/29)\">\u003Cimg class=\"_s0 _ry img\" src=\"https:\/\/scontent.fblr6-1.fna.fbcdn.net\/v\/t1.0-1\/p57x57\/19366450_1032761963526766_3567943503656629473_n.jpg?oh=cdf7c11db05db93d8fd0e966d816ea98&oe=5B1B0A36\" alt=\"Kushal Kushu\" data-jsid=\"img\" \/>\u003C\/a>\u003C\/li>\u003C\/ul>\u003C\/div>\u003C\/div>\u003C\/div>\u003Cdiv class=\"clearfix uiMorePager stat_elem _52jv\" id=\"birthdays_pager\">\u003Cdiv>\u003Ca rel=\"ajaxify\" href=\"\/async\/birthdays\/?date=1525158000\" class=\"pam uiBoxLightblue uiMorePagerPrimary\">May\u003Ci class=\"mhs mts arrow img sp_m7lN5cdLBIi sx_fa6ba6\">\u003C\/i>\u003C\/a>\u003Cspan class=\"uiMorePagerLoader pam uiBoxLightblue\">\u003Cimg class=\"img\" src=\"https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/yb\/r\/GsNJNwuI-UM.gif\" alt=\"\" width=\"16\" height=\"11\" \/>\u003C\/span>\u003C\/div>\u003C\/div>"
        }]
    ],
    "jsmods": {
        "instances": [
            ["__inst_1c03405d_i_0", ["MorePagerFetchOnScroll", "__elem_1c03405d_i_0"],
                [{
                    "__m": "__elem_1c03405d_i_0"
                }, 0, true], 1
            ]
        ],
        "elements": [
            ["__elem_1c03405d_i_0", "birthdays_pager", 1]
        ],
        "require": [
            ["__inst_1c03405d_i_0"],
            ["Tooltip"]
        ]
    },
    "js": ["lbOvC", "I1Wyg", "iaXyh", "RIWAf"],
    "css": ["trv4T", "eyM74", "0wVzo", "YGsVX", "rwXTv", "hAqW4", "bTiWO"],
    "bootloadable": {
        "TimeSliceInteractionsLiteTypedLogger": {
            "resources": ["ZN6iu", "lbOvC", "trv4T"],
            "needsAsync": 1,
            "module": 1
        },
        "WebSpeedInteractionsTypedLogger": {
            "resources": ["lbOvC", "lTQVw", "trv4T"],
            "needsAsync": 1,
            "module": 1
        },
        "AsyncDOM": {
            "resources": ["lbOvC", "trv4T", "d25Q1"],
            "needsAsync": 1,
            "module": 1
        },
        "Dialog": {
            "resources": ["lbOvC", "YGsVX", "trv4T"],
            "needsAsync": 1,
            "module": 1
        },
        "ErrorSignal": {
            "resources": ["lbOvC", "trv4T", "eVg16", "CHoRV"],
            "needsAsync": 1,
            "module": 1
        },
        "ExceptionDialog": {
            "resources": ["vdrq6", "lbOvC", "JeUwF", "YGsVX", "trv4T", "mzeym", "eVg16", "taIOX", "iaXyh"],
            "needsAsync": 1,
            "module": 1
        },
        "PageTransitions": {
            "resources": ["lbOvC", "np5Vl", "trv4T", "eVg16", "I1Wyg", "iaXyh"],
            "needsAsync": 1,
            "module": 1
        },
        "ReactDOM": {
            "resources": ["lbOvC", "trv4T"],
            "needsAsync": 1,
            "module": 1
        },
        "QuickSandSolver": {
            "resources": ["lbOvC", "Klc20", "trv4T", "+ClWy", "6Q\/Yd"],
            "needsAsync": 1,
            "module": 1
        },
        "ConfirmationDialog": {
            "resources": ["oE4Do", "lbOvC", "trv4T"],
            "needsAsync": 1,
            "module": 1
        },
        "Banzai": {
            "resources": ["lbOvC", "trv4T"],
            "needsAsync": 1,
            "module": 1
        },
        "BanzaiODS": {
            "resources": ["lbOvC", "trv4T"],
            "needsAsync": 1,
            "module": 1
        },
        "ResourceTimingBootloaderHelper": {
            "resources": ["lbOvC", "CHoRV"],
            "needsAsync": 1,
            "module": 1
        },
        "TimeSliceHelper": {
            "resources": ["WmPot", "lbOvC", "trv4T"],
            "needsAsync": 1,
            "module": 1
        },
        "ContextualLayerInlineTabOrder": {
            "resources": ["lbOvC", "b2zWq", "Nv4jJ", "YGsVX", "trv4T"],
            "needsAsync": 1,
            "module": 1
        },
        "BanzaiStream": {
            "resources": ["lbOvC", "ZU1ro", "trv4T"],
            "needsAsync": 1,
            "module": 1
        },
        "SnappyCompressUtil": {
            "resources": ["lbOvC"],
            "needsAsync": 1,
            "module": 1
        },
        "KeyEventTypedLogger": {
            "resources": ["lbOvC", "trv4T", "VMKqM"],
            "needsAsync": 1,
            "module": 1
        }
    },
    "resource_map": {
        "ZN6iu": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/yJ\/r\/r98JDkrPdB7.js",
            "crossOrigin": 1
        },
        "lbOvC": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3iQRw4\/y-\/l\/en_US\/5WZyEzO-yKR.js",
            "crossOrigin": 1
        },
        "trv4T": {
            "type": "css",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/y5\/l\/0,cross\/Hams2CQ6T8x.css",
            "permanent": 1,
            "crossOrigin": 1
        },
        "lTQVw": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/yk\/r\/8v3L65OKN6U.js",
            "crossOrigin": 1
        },
        "d25Q1": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/yW\/r\/2Hfsrn8zSCU.js",
            "crossOrigin": 1
        },
        "YGsVX": {
            "type": "css",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/yN\/l\/0,cross\/tw4_CoryHby.css",
            "permanent": 1,
            "crossOrigin": 1
        },
        "eVg16": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3iPWO4\/y7\/l\/en_US\/zUpriHPHyi0.js",
            "crossOrigin": 1
        },
        "CHoRV": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3i3pY4\/yB\/l\/en_US\/QJ9nYHU0qO9.js",
            "crossOrigin": 1
        },
        "vdrq6": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3itvn4\/yT\/l\/en_US\/6_7pVZCnDMo.js",
            "crossOrigin": 1
        },
        "JeUwF": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/y_\/r\/ash8xOAZVK-.js",
            "crossOrigin": 1
        },
        "mzeym": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3i2nZ4\/y4\/l\/en_US\/SE27RbSq37K.js",
            "crossOrigin": 1
        },
        "taIOX": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3if8X4\/yf\/l\/en_US\/I3G_M2Fe60k.js",
            "crossOrigin": 1
        },
        "iaXyh": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3idkl4\/y-\/l\/en_US\/Wcgyvl_N-Xj.js",
            "crossOrigin": 1
        },
        "np5Vl": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/yF\/r\/arfpg0J9xVr.js",
            "crossOrigin": 1
        },
        "I1Wyg": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3i4KP4\/yO\/l\/en_US\/SZb_o9LvjeN.js",
            "crossOrigin": 1
        },
        "Klc20": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/yS\/r\/fPmoZFDHfot.js",
            "crossOrigin": 1
        },
        "+ClWy": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/yF\/r\/rhy6VMHHsHB.js",
            "crossOrigin": 1
        },
        "6Q\/Yd": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3iGqd4\/yn\/l\/en_US\/zcxRQpdn3KC.js",
            "crossOrigin": 1
        },
        "oE4Do": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/yW\/r\/STvuQMoVsgo.js",
            "crossOrigin": 1
        },
        "WmPot": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/yA\/r\/KOciABKx4w7.js",
            "crossOrigin": 1
        },
        "b2zWq": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/yj\/r\/1Q-q4laVvzx.js",
            "crossOrigin": 1
        },
        "Nv4jJ": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3ivjx4\/y7\/l\/en_US\/-wVIYTKb-J1.js",
            "crossOrigin": 1
        },
        "ZU1ro": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/ym\/r\/tnX8h1hMAqX.js",
            "crossOrigin": 1
        },
        "VMKqM": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/yq\/r\/VX_g1H0zcZv.js",
            "crossOrigin": 1
        },
        "eyM74": {
            "type": "css",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/y3\/l\/0,cross\/0uxWhoQ2bKZ.css",
            "permanent": 1,
            "crossOrigin": 1
        },
        "0wVzo": {
            "type": "css",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/y4\/l\/0,cross\/acUhycgW0b0.css",
            "permanent": 1,
            "crossOrigin": 1
        },
        "rwXTv": {
            "type": "css",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/yq\/l\/0,cross\/x7EQi00Ge7H.css",
            "permanent": 1,
            "crossOrigin": 1
        },
        "hAqW4": {
            "type": "css",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/ya\/l\/0,cross\/01llQAe-xml.css",
            "permanent": 1,
            "crossOrigin": 1
        },
        "bTiWO": {
            "type": "css",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3\/ya\/l\/0,cross\/Jxgn8lU3xE2.css",
            "permanent": 1,
            "crossOrigin": 1
        },
        "RIWAf": {
            "type": "js",
            "src": "https:\/\/static.xx.fbcdn.net\/rsrc.php\/v3ikeI4\/yW\/l\/en_US\/jLnWPpCtWMp.js",
            "crossOrigin": 1
        }
    },
    "ixData": {},
    "gkxData": {
        "AT4kYIk7PhRqUACJJM8qs58t-WNCoM2ZYe35b1xv03xf3OtmC7RfXVIT9hWB6yTOgfA": {
            "result": false,
            "hash": "AT5oUVeShxEj-wBy"
        },
        "AT6ospK-Tdqu5qRhy-TcAU0nIA_ctyO-ghWqmAEjf7bDt3FzGNFL8C4Kn6qbsJrp6oPJYeq6bUEntlvCEgoH4eYQlTJ0DsJar1ZABa0GLxyieQ": {
            "result": false,
            "hash": "AT5lUwvU9ACQ1puA"
        },
        "AT6Afdq0Tt2jEesGOMGnSRKoZIl2eQfQBS7ISXiYFG3RHN4ykkPiZeyWuKALtD0ObEVGeeZuAFKdYpfxlBzUUPkd": {
            "result": false,
            "hash": "AT616ipsS9Q6IRps"
        },
        "AT7IsskI4XB9V3_ZpKFnRxAvs6BVPIgSDbDcq24b8ToUAOY2pCaSzuagN7f_cNx9vGp7vgNftn1_SRfogFUNGS0K": {
            "result": true,
            "hash": "AT5Na-Nz7G8XKMru"
        },
        "AT52sTP_5lkBPKbNz2mUZWsbcEDkBzQg0lQckIsVf32rCwFPbCAUTv2-qAeYwt3QMKM": {
            "result": false,
            "hash": "AT7Pq-Rl8e-_XQMy"
        },
        "AT68bJwSI-83elN-7JSMMH9zt32KbiF6pW-XMlf6NViAJ3CbAk_16Vq8cK1tl1029_ApvFwINR8hmoci3nMKFTDhDCBp1wrvYQbOKq0pCjZpqA": {
            "result": false,
            "hash": "AT7iq4cEmcKTjkfp"
        },
        "AT6DanO60hgFT7juQEF_b5acv5amdrLzodvaFbz5tWF8DGQCmmf0_a7wsRZnn4yNp9kI3S6KXc87dzKSPpUSy11k": {
            "result": false,
            "hash": "AT6MaFQR8z-lSlRA"
        }
    },
    "lid": "6523134272703508330"
}

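The Facebook site seems to interpret this response data somehow and render the friends' names on the page. The data I want is the web page as it is finally rendered after scrolling all the way down. But when I use Python's "requests" module or view the page source, most of the HTML content is missing.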

How do I solve this?

python facebook facebook-graph-api web-scraping scrapy
3 Answers
3
votes

This looks like a JSON response with a for (;;); prefix in front of it; the actual HTML is contained in the __html field.

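Since the actual data is returned this way, you have to do this in a few steps:

  1. Load the JSON data (skipping the for (;;); prefix)
  2. Create a selector
  3. Extract the data you need from the selector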

For example, one way to get the names might be:

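>>> import json
>>> from scrapy.selector import Selector
>>> # Skip the "for (;;);" prefix by parsing from the first "{" onward.
>>> data = json.loads(response_text[response_text.index('{'):])
>>> sel = Selector(text=data['domops'][0][3]['__html'])
>>> sel.xpath('//a/img/@alt').getall()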
['Satish Ven', 'Sheshadri Sharma', 'Aayush Sinha', 'Pranav YS', 'Pankaj Thakur', 'Prajwal Bhadravathi Ravi', 'Sachin R Doddaguni', 'Kajal Chaudhary', 'Usha Shastri', 'Ashish Dwivedi', 'Shreesha Bhat Galimane', 'Chethan Vilas', 'Kshitija Vidya Kallesh', 'Vishesh Umesh Gujjar', 'Santosh Bhat', 'Rahul Kumar', 'Sumanth Sharma', 'Sweekruthi Kashyap', 'Vinayaka Bhat Galimane', 'Pruthvi Kalyan Reddy', 'Kushal Kushu']
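
The same steps can also be sketched with only Python's standard library. This is a rough illustration, not a tested scraper: it skips the for (;;); prefix, decodes the JSON, and reads the data-tooltip-content attributes (which carry the date next to the name in the response above) along with the rel="ajaxify" link that points at the next month. SAMPLE is a shortened, made-up response with placeholder names, BirthdayParser and extract are hypothetical helper names, and Facebook's markup may well have changed since.

```python
import json
from html.parser import HTMLParser

# Shortened, made-up response in the same shape as the one in the question.
# Raw strings keep the JSON escapes (\u003C, \", \/) exactly as sent.
SAMPLE = (
    r'for (;;); {"__ar": 1, "payload": null, "domops": '
    r'[["replace", "#birthdays_pager", false, {"__html": "'
    r'\u003Cul>'
    r'\u003Cli>\u003Ca href=\"https:\/\/www.facebook.com\/alice.example\" '
    r'data-tooltip-content=\"Alice Example (4\/2)\">'
    r'\u003Cimg alt=\"Alice Example\" \/>\u003C\/a>\u003C\/li>'
    r'\u003Cli>\u003Ca href=\"https:\/\/www.facebook.com\/bob.example\" '
    r'data-tooltip-content=\"Bob Example (4\/6)\">'
    r'\u003Cimg alt=\"Bob Example\" \/>\u003C\/a>\u003C\/li>'
    r'\u003C\/ul>'
    r'\u003Ca rel=\"ajaxify\" href=\"\/async\/birthdays\/?date=1525158000\">'
    r'May\u003C\/a>"}]]}'
)


class BirthdayParser(HTMLParser):
    """Collects tooltip texts and the link to the next month (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.tooltips = []
        self.next_page = None

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        if "data-tooltip-content" in attrs:
            self.tooltips.append(attrs["data-tooltip-content"])
        elif attrs.get("rel") == "ajaxify":
            self.next_page = attrs.get("href")


def extract(response_text):
    # Everything before the first "{" is the "for (;;);" prefix, not JSON.
    data = json.loads(response_text[response_text.index("{"):])
    parser = BirthdayParser()
    for op in data["domops"]:
        # Each domops entry is a list; the HTML sits in a dict under "__html".
        for part in op:
            if isinstance(part, dict) and "__html" in part:
                parser.feed(part["__html"])
    parser.close()
    return parser.tooltips, parser.next_page


tooltips, next_page = extract(SAMPLE)
print(tooltips)
print(next_page)
```

The value in next_page is only the relative URL found in the pager link. Fetching it would need the cookies of a logged-in session, which this sketch does not cover.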

Note that scraping Facebook is not a good idea; you are better off using their API.


0
votes

A legitimate solution could use stalkscan.com, or another similar Facebook crawling site built on the official API.


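-1
votes

Five years on, you may have survived a car crash, had a baby, reached the Mariana Trench, lived through depression, passed through five workplaces, forgotten programming, or become a programming master. From the point of view of Facebook's gibberish code, though, none of that seems to matter :)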

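Since I am currently developing a Chrome extension that removes all suggested/sponsored content from the site, this seems like an appropriate workaround for scraping. Keep in mind that Facebook keeps changing its structure, and because of some rather odd tricks, finding your way around the DOM tree is not easy.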

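Currently, you can go to "Events" -> "Birthdays" and select all the dates with the right DOM selectors and logic. Some of them are loaded dynamically after scrolling; this can be handled with a MutationObserver. With this approach you scroll the birthdays page to the bottom once and can then build an array of all name/birthday pairs, so the method is semi-automatic.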
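
However the strings are collected, turning them into name/birthday pairs is a small post-processing step. The sketch below is in Python rather than extension JavaScript, and it assumes the texts look like "Name (4/2)" with the month first, as the data-tooltip-content values in the question's response appear to. TOOLTIP and parse_tooltips are hypothetical names; other locales may order or format the date differently.

```python
import re

# Assumed format: "<name> (<month>/<day>)", e.g. "Alice Example (4/2)".
TOOLTIP = re.compile(r"^(?P<name>.+) \((?P<month>\d{1,2})/(?P<day>\d{1,2})\)$")


def parse_tooltips(tooltips):
    """Return unique (name, month, day) tuples sorted by date, then name."""
    seen = set()
    pairs = []
    for text in tooltips:
        match = TOOLTIP.match(text.strip())
        if match is None:
            # Skip anything that does not look like "Name (M/D)".
            continue
        entry = (match["name"], int(match["month"]), int(match["day"]))
        if entry not in seen:
            seen.add(entry)
            pairs.append(entry)
    pairs.sort(key=lambda e: (e[1], e[2], e[0]))
    return pairs


collected = [
    "Bob Example (4/6)",
    "Alice Example (4/2)",
    "Bob Example (4/6)",  # duplicates can appear if a month is loaded twice
    "May",                # pager label, not a birthday
]
for name, month, day in parse_tooltips(collected):
    print(f"{month:02d}/{day:02d}  {name}")
```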

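There is too much dynamically loaded content and too many dynamically changing ids/classes, and the site keeps changing over time, so going into any detail is futile. If you can make sense of their DOM tree and have a good grasp of DOM manipulation and CSS selectors, plus basic knowledge of building Chrome extensions (which is very simple), you can reach the goal :)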
