Node JS HTTP proxy hangs

Question  Votes: 2  Answers: 1

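I have an HTTP proxy that proxies any website and injects some custom JS files before serving the HTML to the client. Whenever I try to access the proxied website, it hangs, or the browser seems to load indefinitely. However, when I check the HTML source, the custom JavaScript file has been injected successfully. Here is the code: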

const cheerio = require('cheerio');
const http = require('http');
const httpProxy = require('http-proxy');
const { ungzip } = require('node-gzip');

_initProxy(host: string) {
    let proxy = httpProxy.createProxyServer({});
    let option = {
        target: host,
        selfHandleResponse: true
    };

    proxy.on('proxyRes', function (proxyRes, req, res) {
        let body = [];
        proxyRes.on('data', function (chunk) {
            body.push(chunk);
        });
        proxyRes.on('end', async function () {
            let buffer = Buffer.concat(body);
            if (proxyRes.headers['content-encoding'] === 'gzip') {
                try {
                    let $ = null;
                    const decompressed = await ungzip(buffer);
                    const scriptTag = '<script src="my-customjs.js"></script>';
                    $ = await cheerio.load(decompressed.toString());
                    await $('body').append(scriptTag);
                    res.end($.html());
                } catch (e) {
                    console.log(e);
                }
            }
        });
    });

    let server = http.createServer(function (req, res) {
        proxy.web(req, res, option, function (e) {
            console.log(e);
        });
    });

    console.log("listening on port 5051");
    server.listen(5051);
}

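Can someone tell me whether I am doing something wrong? It looks like node-http-proxy is dying and can no longer be relied on, because the proxy sometimes works and then dies on the next run, depending on how many times I run the server.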

javascript node.js reverse-proxy http-proxy node-http-proxy
1 Answer
0
votes

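Your code looks fine, so I got curious and tried it out.

Although you do log some errors, there are several cases you do not handle:

  • The server returns a response with no body (cheerio will generate an empty HTML body when this happens)
  • The server returns an uncompressed response (your code silently drops it, so the client's request is never answered)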
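A minimal sketch of handling both cases before any HTML rewriting happens. The helper name readBody is hypothetical, and Node's built-in zlib module is swapped in for node-gzip here only so the snippet runs without extra packages:

```javascript
const zlib = require('zlib');

// Hypothetical helper: turns the collected chunks into a string,
// or returns null when the upstream response had no body at all.
function readBody(chunks, contentEncoding) {
    const buffer = Buffer.concat(chunks);

    // Case 1: no body, so there is nothing to parse or rewrite.
    if (buffer.length === 0) {
        return null;
    }

    // Case 2: only gunzip when the response is actually gzip-encoded;
    // an uncompressed body is used as it is instead of being dropped.
    const plain = contentEncoding === 'gzip' ? zlib.gunzipSync(buffer) : buffer;
    return plain.toString('utf8');
}
```

A caller could skip cheerio entirely when readBody returns null and simply end the response, so the browser is never left waiting.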

I made a few changes to your code.

Change the initial options

let proxy = httpProxy.createProxyServer({
    secure: false,
    changeOrigin: true
});
  • Do not verify TLS certificates: secure: false
  • Send the correct Host header: changeOrigin: true
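As a rough illustration of why changeOrigin matters: without it, the upstream server receives the Host header the browser sent to the proxy (for example localhost:5051), which a site that routes by host name may reject or misroute. The helper below is hypothetical and only mimics that effect; it is not http-proxy's own code.

```javascript
// Hypothetical helper, for illustration only.
function upstreamHostHeader(target, incomingHost, changeOrigin) {
    // With changeOrigin, the Host header follows the target URL;
    // without it, the client's original Host header is forwarded.
    return changeOrigin ? new URL(target).host : incomingHost;
}

console.log(upstreamHostHeader('https://github.com', 'localhost:5051', false)); // localhost:5051
console.log(upstreamHostHeader('https://github.com', 'localhost:5051', true));  // github.com
```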

Remove the if statement and replace it with a ternary

const isCompressed = proxyRes.headers['content-encoding'] === 'gzip';
const decompressed = isCompressed ? await ungzip(buffer) : buffer;

You can also remove the two awaits on the cheerio calls; Cheerio is not asynchronous and does not return anything awaitable.

Final code

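Here is the final code, which works. You mentioned that "it looks like node-http-proxy is dying, depending on how many times I run the server." I did not run into any stability problems like that, so if it does happen, your problem may lie somewhere else (bad RAM?).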

const cheerio = require('cheerio');
const http = require('http');
const httpProxy = require('http-proxy');
const { ungzip } = require('node-gzip');

const host = 'https://github.com';

let proxy = httpProxy.createProxyServer({
    secure: false,
    changeOrigin: true
});
let option = {
    target: host,
    selfHandleResponse: true
};

proxy.on('proxyRes', function (proxyRes, req, res) {

    console.log(`Proxy response with status code: ${proxyRes.statusCode} to url ${req.url}`);
    if (proxyRes.statusCode == 301) {
        throw new Error('You should probably do something here, I think there may be an httpProxy option to handle redirects');
    }
    let body = [];
    proxyRes.on('data', function (chunk) {
        body.push(chunk);
    });
    proxyRes.on('end', async function () {
        let buffer = Buffer.concat(body);
        try {
            let $ = null;
            const isCompressed = proxyRes.headers['content-encoding'] === 'gzip';
            const decompressed = isCompressed ? await ungzip(buffer) : buffer;
            const scriptTag = '<script src="my-customjs.js"></script>';
            $ = cheerio.load(decompressed.toString());
            $('body').append(scriptTag);
            res.end($.html());
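        } catch (e) {
            console.log(e);
            // End the response so the client is not left waiting forever.
            if (!res.headersSent) {
                res.statusCode = 502;
            }
            res.end();
        }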
    });
});

let server = http.createServer(function (req, res) {
    proxy.web(req, res, option, function (e) {
        console.log(e);
    });
});

console.log("listening on port 5051");
server.listen(5051);
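One thing the code above does not cover: with selfHandleResponse enabled, the upstream status code and headers have to be copied to the client by hand, and every response (images, CSS, and so on) goes through cheerio. A possible refinement, sketched under assumptions: buildClientResponse is a hypothetical helper, plain string insertion stands in for cheerio, and Node's built-in zlib stands in for node-gzip so the snippet is self-contained.

```javascript
const zlib = require('zlib');

const SCRIPT_TAG = '<script src="my-customjs.js"></script>';

// Hypothetical helper: decides what to send back to the client.
function buildClientResponse(statusCode, headers, rawBody) {
    const contentType = String(headers['content-type'] || '');
    const encoding = headers['content-encoding'];
    const isHtml = contentType.includes('text/html');
    const canDecode = !encoding || encoding === 'gzip';

    // Pass through anything that is not HTML, is empty, or uses an
    // encoding this sketch cannot decode (for example br).
    if (!isHtml || rawBody.length === 0 || !canDecode) {
        return { statusCode, headers: { ...headers }, body: rawBody };
    }

    const plain = encoding === 'gzip' ? zlib.gunzipSync(rawBody) : rawBody;
    const html = plain.toString('utf8');

    // Insert the tag before the first </body>, or append it if none exists.
    const match = /<\/body\s*>/i.exec(html);
    const injected = match
        ? html.slice(0, match.index) + SCRIPT_TAG + html.slice(match.index)
        : html + SCRIPT_TAG;

    // The body is now uncompressed and longer, so fix up the headers.
    const body = Buffer.from(injected, 'utf8');
    const outHeaders = { ...headers, 'content-length': String(body.length) };
    delete outHeaders['content-encoding'];
    return { statusCode, headers: outHeaders, body };
}
```

Inside the 'end' handler this could be used as const out = buildClientResponse(proxyRes.statusCode, proxyRes.headers, buffer); followed by res.writeHead(out.statusCode, out.headers); and res.end(out.body);.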