How can I scrape this WebSocket?


I am trying to scrape a WebSocket with Node.js, but so far I have not been able to get it working.

The WebSocket I want to scrape belongs to this site: https://dexscreener.com/new-pairs

The page opens a WebSocket connection to a URL such as wss://io.dexscreener.com/dex/screener/pairs/h24/1?rankBy[key]=trendingScoreH6&rankBy[order]=desc&filters[pairAge][max]=24&filters[liquidity][min]=1000
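
To make the structure of that URL easier to see, here is a minimal sketch that assembles it from its parts. The helper name buildPairsUrl and its option names are hypothetical, and reading the trailing path segment as a page number is an assumption. The parameter names are copied from the URL quoted above and are not a documented API.

```javascript
// Hypothetical helper: parameter names mirror the URL quoted above.
// Treating the last path segment ("1") as a page number is an assumption.
function buildPairsUrl({ page = 1, rankKey = "trendingScoreH6", order = "desc", filters = {} } = {}) {
  const params = [
    ["rankBy[key]", rankKey],
    ["rankBy[order]", order],
  ];
  // Each filter has the shape { name: { bound: value } }, e.g. { liquidity: { min: 1000 } }.
  for (const [name, bounds] of Object.entries(filters)) {
    for (const [bound, value] of Object.entries(bounds)) {
      params.push([`filters[${name}][${bound}]`, value]);
    }
  }
  // Keys keep their literal brackets, as in the observed URL; only values are encoded.
  const query = params
    .map(([key, value]) => `${key}=${encodeURIComponent(String(value))}`)
    .join("&");
  return `wss://io.dexscreener.com/dex/screener/pairs/h24/${page}?${query}`;
}

const url = buildPairsUrl({
  filters: { pairAge: { max: 24 }, liquidity: { min: 1000 } },
});
console.log(url);
```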

The problem is that the WebSocket is protected by Cloudflare. I have tried 30-40 different approaches and none of them has worked.

This is the code I am currently trying:

const WebSocket = require('ws');
const crypto = require('crypto');
const tls = require('tls');
const UserAgent = require('user-agents');

class ListPage {
    constructor() {
        this.base_url = 'wss://io.dexscreener.com/dex/screener/pairs/h24/1?rankBy[key]=trendingScoreH6&rankBy[order]=desc&filters[chainIds][0]=solana&filters[liquidity][min]=1000&filters[pairAge][max]=24';
    }

    generateWebSocketKey() {
        const buffer = crypto.randomBytes(16);
        const key = buffer.toString('base64');
        return key;
    }

    openConnection(num) {
        const header = {
            'Sec-WebSocket-Key': this.generateWebSocketKey(),
            'Sec-WebSocket-Version': '1',
            'Sec-WebSocket-Extensions': 'permessage-deflate; client_max_window_bits',
            'Origin': 'wss://io.dexscreener.com',
            'User-Agent': UserAgent.random().toString(),
        };

        const url = this.base_url;
        console.info(`Request ${num}: ${url}`, header);

        const defaultCiphers = tls.DEFAULT_CIPHERS.split(':');
        const shuffledCiphers = [
            defaultCiphers[0],
            defaultCiphers[2],
            defaultCiphers[1],
            ...defaultCiphers.slice(3)
        ].join(':');

        const ws = new WebSocket(url, {
            headers: header,
            ciphers: shuffledCiphers,
        });

        ws.on('open', function open() {
            console.log('Connected through proxy');
        });

        ws.on('message', function incoming(data) {
            console.log(data);
        });

        ws.on('error', function error(err) {
            console.log('Error: ', err);
        });

        return ws;
    }
}

(async () => {
    // Usage example with proxy
    const listPage = new ListPage();
    const wsConnection = listPage.openConnection(1);
    console.log(`Listening..`);
})();

Whatever I do, it always returns 403 (Cloudflare protection). Do you have any idea how to make this work? I am using Node.js.

websocket scrape
1 Answer
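You have to send browser-like headers with the handshake. Note that Origin is the origin of the page, https://dexscreener.com, not the wss:// address of the socket itself, and that the User-Agent is a fixed, realistic browser string.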

import WebSocket from "ws";
import fs from "fs";

const url =
  "wss://io.dexscreener.com/dex/screener/pairs/h24/1?rankBy[key]=trendingScoreH6&rankBy[order]=desc&filters[liquidity][min]=1000&filters[pairAge][max]=24";

const headers = {
  Host: "io.dexscreener.com",
  Origin: "https://dexscreener.com",
  "User-Agent":
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:123.0) Gecko/20100101 Firefox/123.0",
};

const ws = new WebSocket(url, { headers });

ws.on("open", () => {
  console.log("Connected to WebSocket");
});

ws.on("message", (data) => {
  // Depending on the ws version, data is a Buffer or a string; normalise before comparing.
  if (data.toString() === "ping") {
    console.log("ping");
    ws.send("pong");
    return;
  }
});
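
The same idea can be split into two small pieces that are easier to reuse and test. This is only a sketch: the function names handshakeHeaders and attachHandlers are hypothetical, and the assumption that the server sends a literal "ping" frame and expects "pong" in reply is taken from the snippet above rather than from any documentation.

```javascript
// Hypothetical helper names; only the header values and the "ping"/"pong"
// exchange come from the answer above.
function handshakeHeaders(userAgent) {
  return {
    Origin: "https://dexscreener.com", // the page's origin, not the wss:// URL
    "User-Agent": userAgent,
  };
}

// Attaches handlers to any object that exposes on() and send(), such as a ws client.
function attachHandlers(ws, onData) {
  ws.on("message", (data) => {
    // data may be a Buffer or a string depending on the ws version,
    // so compare its string form.
    if (data.toString() === "ping") {
      ws.send("pong"); // application-level keep-alive (assumed from the answer)
      return;
    }
    onData(data); // pass every other frame through untouched
  });
  ws.on("error", (err) => {
    // A rejected handshake (for example a 403) surfaces through this event.
    console.error("WebSocket error:", err.message);
  });
  return ws;
}
```

With the ws package this would be wired up roughly as attachHandlers(new WebSocket(url, { headers: handshakeHeaders(userAgent) }), (frame) => console.log(frame)), where url and userAgent are the values used in the answer.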