Accessing Moodle via Python requests

Question

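Hello, Stack Overflowers,

I am currently writing a Python script that is supposed to download feedback files from a course in Moodle.

The script logs in to Moodle successfully, but for every GET request it redirects me to a page I did not request.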

import time
import requests
import configparser
import bs4


moodle_url = "https://moodle.uni-bielefeld.de/"


def login(session, id, password):

    def get_relay_state_saml(html):
        soup = bs4.BeautifulSoup(html, features="html.parser")
        divs = soup.find_all("div")  # find_all is the current name; findAll is a legacy alias
        divs = divs[0]
        relay_tag = divs.contents[1]
        saml_tag = divs.contents[3]
        relay_value = relay_tag['value']
        saml_value = saml_tag['value']
        return {'RelayState': str(relay_value), 'SAMLResponse': str(saml_value)}

    response = session.get('https://moodle.uni-bielefeld.de/auth/shibboleth/index.php')

    # select university
    payload_select = {
        'user_idp': 'https://shibboleth.uni-bielefeld.de/idp/shibboleth',
        'Select': 'Select'
    }
    # redirect to login page
    response_select = session.post(response.url, data=payload_select)
    # print(response_select.text)
    # login form
    payload_login = {
        'j_username': id,
        'j_password': password,
        '_eventId_proceed': ''
    }

    response_login = session.post(response_select.url, data=payload_login)
    # print(response_login.url)
    # print(response_login.text)

    payload_accept = {
        '_shib_idp_consentIds': 'givenname',
        '_eventId_proceed': 'Accept'
    }

    response_logged = session.post(response_login.url, data=payload_accept)
    # print(response_logged.url)
    # print(response_logged.text)
    # print(get_relay_state_saml(response_logged.text))

    payload_saml = get_relay_state_saml(response_logged.text)
    # print(payload_saml)

    response_moodle = session.post("https://moodle.uni-bielefeld.de/Shibboleth.sso/SAML2/POST", data=payload_saml)
    # print(response_moodle.url)
    # print(response_moodle.text)


def check_for_new_feedback(session):

    def get_uebungsblatter(html):
        soup = bs4.BeautifulSoup(html, features="html.parser")

        # print(soup)
        blaetter = []
        # retrieve all
        return blaetter
    # This is the part where I always get redirected to the profile editing page
    response = session.get(moodle_url)

    blaetter = get_uebungsblatter(response.text)


def scraper():
    config = configparser.ConfigParser()
    config.read('config.ini')
    student_id = config['config']['student_id']
    password = config['config']['password']
    mail = config['config']['mail']
    last_sent = int(config['config']['last_sent'])
    wait_between_calls = int(config['config']['wait_between_calls'])

    with requests.Session() as session:
        login(session, student_id, password)
        while True:
            check_for_new_feedback(session)
            time.sleep(60*wait_between_calls)


if __name__ == '__main__':
    scraper()

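After some research and reverse engineering, I found that when I click on a course in the browser, it sends a GET request to the correct link containing the course ID. Shortly afterwards, it sends a POST request to this URL: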

https://moodle.uni-bielefeld.de/lib/ajax/service.php?sesskey=PWSjDeVJNC&info=core_courseformat_get_state

with this payload:

[{"index":0,"methodname":"core_courseformat_get_state","args":{"courseid":2891}}]

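I tried sending these requests myself to reach the page I want, but I keep being redirected to the profile editing page.

If anyone can help, or at least understands what is happening here and can tell me what is wrong with the code, I would be glad to hear from you.

Many thanks,

another Stack Overflower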

Answer 1

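"this is the part where I always get redirected to the profile editing page"

In Moodle, user profiles can have required fields. These can be added as custom fields at any time.

If any of them are empty, the user is redirected to edit their profile after logging in.

There are also permission checks, so some pages may not display as expected, or you may be redirected again.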
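
To confirm that this is what is happening in the script, it may help to check where each request actually ends up. The following is a minimal sketch, not a fix: it assumes the profile editing page lives at /user/edit.php (the usual location in a stock Moodle install, but check your site), and the helper names are made up for illustration.

```python
from urllib.parse import urlparse

# Assumption: the profile editing page is served from /user/edit.php,
# as in a stock Moodle install. Adjust this if your site differs.
PROFILE_EDIT_PATH = "/user/edit.php"


def is_profile_edit_url(url):
    """Return True if the URL points at the profile editing page."""
    return urlparse(url).path.endswith(PROFILE_EDIT_PATH)


def get_checked(session, url):
    """GET a page with a requests.Session and fail loudly on the redirect."""
    response = session.get(url)
    if is_profile_edit_url(response.url):
        # response.history holds the intermediate redirect responses.
        hops = [r.url for r in response.history]
        raise RuntimeError(f"Redirected to the profile editing page via {hops}")
    return response
```

If get_checked raises, logging in through the browser once and filling in the required profile fields should stop the redirect, provided no permission check is involved as well.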

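Web services

To access data from outside Moodle, you should use web services.

They are a bit of a hassle to set up, but much more convenient to use once that is done.

Go to Site administration > Server > Web services > Overview

or go directly to http://yourwebsite.com/admin/settings.php?section=webservicesoverview

This shows a checklist for setting up web services.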
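
Once web services are enabled and you have a token, a call from Python could look roughly like the sketch below. It assumes the REST protocol is enabled on the site; the token value, the example.org host in the usage note, and the helper names are placeholders, and whether you can obtain a token at all depends on how the administrator has configured the site.

```python
def build_ws_request(base_url, token, function, **params):
    """Return the REST endpoint and form payload for one web service call."""
    endpoint = base_url.rstrip("/") + "/webservice/rest/server.php"
    payload = {
        "wstoken": token,               # token issued by the Moodle site
        "wsfunction": function,         # name of the web service function
        "moodlewsrestformat": "json",   # ask for JSON instead of XML
    }
    payload.update(params)
    return endpoint, payload


def call_ws(session, base_url, token, function, **params):
    """POST the call with a requests.Session and return the decoded JSON."""
    endpoint, payload = build_ws_request(base_url, token, function, **params)
    response = session.post(endpoint, data=payload)
    response.raise_for_status()
    result = response.json()
    # Moodle reports failures as a JSON object with an "exception" key.
    if isinstance(result, dict) and "exception" in result:
        raise RuntimeError(result.get("message", "Moodle web service error"))
    return result
```

For example, call_ws(session, "https://moodle.example.org", token, "core_webservice_get_site_info") should return basic information about the site and the functions your token is allowed to call.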

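API documentation

For a list of the existing web service functions, go to

Site administration > Server > Web services > API Documentation

or go directly to http://yourmoodlesite.com/admin/webservice/documentation.php

Which ones you need depends on your requirements, but there are some functions for feedback.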
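
One detail that is easy to trip over: the REST protocol expects list and dict arguments as flattened, PHP-style form fields such as courseids[0]. Below is a small sketch of a helper for that. The helper name is made up, and the function name mod_feedback_get_feedbacks_by_courses in the usage note is an assumption, so check it and its parameters against the API documentation page of your own site.

```python
def flatten_ws_params(value, prefix=""):
    """Flatten nested lists and dicts into PHP-style form field names."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, (list, tuple)):
        items = enumerate(value)
    else:
        # Scalar leaf: emit it under the key built so far.
        return {prefix: value}
    flat = {}
    for key, item in items:
        name = f"{prefix}[{key}]" if prefix else str(key)
        flat.update(flatten_ws_params(item, name))
    return flat
```

With the course ID from the question, flatten_ws_params({"courseids": [2891]}) gives {"courseids[0]": 2891}, which can be sent as form data alongside wstoken, wsfunction (for instance mod_feedback_get_feedbacks_by_courses, if it is available on your site) and moodlewsrestformat.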

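Custom web services

If you cannot find an existing function that suits you, you can also create your own web service function.

See

How to create a plugin for Moodle to get users by an optional field with a REST request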
