Pytest with line-delimited JSON

Question

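I'm fairly new to Python, and really new to pytest. Anyway, I'm trying to write some tests for parsing tweets stored as line-delimited JSON. Here is a simplified example of test_cases.jsonl: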

{"contributors":null,"coordinates":null,"created_at":"Sat Aug 20 01:00:12 +0000 2016","entities":{"hashtags":[{"indices":[97,116],"text":"StandWithLouisiana"}]}}
{"contributors":null,"coordinates":null,"created_at":"Sat Aug 20 01:01:35 +0000 2016","entities":{"hashtags":[]}}

What I want to do is test a function like this:

def hashtags(t):
    return ' '.join([h['text'] for h in t['entities']['hashtags']])

I can test a single line of JSON like so:

@pytest.fixture
def tweet(file='test_cases.jsonl'):
    with open(file, encoding='utf-8') as lines:
        for line in lines:
            return json.loads(line)


def test_hashtag(tweet):
    assert hashtags(tweet) == 'StandWithLouisiana'

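(I'm just passing the filename as a default argument to the fixture for this example.)

This works in the sense that the test passes, because the first line satisfies the assertion. What I'm really trying to do is something like the following, which I don't expect to work as written: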

def test_hashtag(tweet):
    assert hashtags(tweet) == 'StandWithLouisiana' # first tweet
    assert hashtags(tweet) == ''    # second tweet

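This fails because the second assertion checks whether the first tweet (the first line of the JSON file) is empty, not the second one. I assume this is because of the return in the fixture, but if I use yield instead of return, I get a "yield_fixture function has more than one 'yield'" error (and the second assertion still fails).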
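
To illustrate why only the first tweet is ever seen, here is a minimal sketch outside pytest. It assumes two trimmed-down tweets held in a StringIO instead of the real file; `first_tweet` and `all_tweets` are hypothetical names used only for this illustration.

```python
import io
import json

# Two trimmed-down tweets standing in for test_cases.jsonl (assumption:
# only the keys that hashtags() reads are kept).
JSONL = (
    '{"entities": {"hashtags": [{"text": "StandWithLouisiana"}]}}\n'
    '{"entities": {"hashtags": []}}\n'
)


def first_tweet(stream):
    # Same shape as the fixture body: `return` leaves the loop on the
    # first iteration, so later lines are never read.
    for line in stream:
        return json.loads(line)


def all_tweets(stream):
    # Collect every non-blank line instead of stopping at the first one.
    return [json.loads(line) for line in stream if line.strip()]


first = first_tweet(io.StringIO(JSONL))
tweets = all_tweets(io.StringIO(JSONL))
print(first['entities']['hashtags'])  # [{'text': 'StandWithLouisiana'}]
print(len(tweets))                    # 2
```

A fixture is evaluated once per test, so both assertions in the test receive the same object no matter how the loop is written.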

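My current workaround is to put each line in a separate JSON file and create a separate fixture for each file. (For shorter examples, I use StringIO to write the JSON inline.) This does work, but it feels inelegant. I have a feeling I should be using @pytest.mark.parametrize, but I can't get my head around it. I think I also tried pytest_generate_tests for this, but it ended up testing every key. Is it possible to do what I have in mind, or is it better to create separate fixtures when the assertions expect different values?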

Answer 1

I think the most suitable approach is to parametrize the fixture:

import json
import pathlib
import pytest


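# splitlines() avoids the trailing empty string that split('\n') leaves
# when the file ends with a newline (json.loads('') would raise).
lines = pathlib.Path('data.json').read_text(encoding='utf-8').splitlines()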

@pytest.fixture(params=lines)
def tweet(request):
    line = request.param
    return json.loads(line)


def hashtags(t):
    return ' '.join([h['text'] for h in t['entities']['hashtags']])


def test_hashtag(tweet):
    assert hashtags(tweet) == 'StandWithLouisiana'

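This runs test_hashtag once for each value the tweet fixture returns. With the assertion as written, only the first case passes, because the second tweet has no hashtags: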

$ pytest -v
...
test_spam.py::test_hashtag[{"contributors":null,"coordinates":null,"created_at":"Sat Aug 20 01:00:12 +0000 2016","entities":{"hashtags":[{"indices":[97,116],"text":"StandWithLouisiana"}]}}]
test_spam.py::test_hashtag[{"contributors":null,"coordinates":null,"created_at":"Sat Aug 20 01:01:35 +0000 2016","entities":{"hashtags":[]}}]
...
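
The generated test IDs above are the full JSON lines, which are hard to read. As a hedged sketch, pytest's `ids` argument also accepts a callable that maps each parameter to a label; `short_id` below is a hypothetical helper, and labelling by `created_at` is just one possible choice.

```python
import json


def short_id(line):
    # Hypothetical helper: label a test case by the tweet's created_at
    # field, falling back to a generic label if the key is missing.
    tweet = json.loads(line)
    return tweet.get('created_at', 'tweet')


# Inline stand-ins for lines read from the file (assumption).
lines = [
    '{"created_at": "Sat Aug 20 01:00:12 +0000 2016", "entities": {"hashtags": []}}',
    '{"entities": {"hashtags": []}}',
]
labels = [short_id(line) for line in lines]

# With pytest this could be wired in as:
#     @pytest.fixture(params=lines, ids=short_id)
```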

Edit: extending the fixture to provide the expected value

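You can include the expected value in the tweet fixture's parameters and pass it through to the test unchanged. In the example below, the expected tags are zipped with the file lines to build pairs of the form (line, tag). The tweet fixture parses the line into a dictionary and passes the tag along, so the tweet argument in the test becomes a pair of values.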

import json
import pathlib
import pytest


lines = pathlib.Path('data.json').read_text(encoding='utf-8').splitlines()
expected_tags = ['StandWithLouisiana', '']

@pytest.fixture(params=zip(lines, expected_tags),
                ids=tuple(repr(tag) for tag in expected_tags))
def tweet(request):
    line, tag = request.param
    return (json.loads(line), tag)


def hashtags(t):
    return ' '.join([h['text'] for h in t['entities']['hashtags']])


def test_hashtag(tweet):
    data, tag = tweet
    assert hashtags(data) == tag

The test run yields two tests, as before:

test_spam.py::test_hashtag['StandWithLouisiana'] PASSED
test_spam.py::test_hashtag[''] PASSED
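
One thing to watch with this pairing: zip() stops at the shorter input, so if the file gains a line without a matching expected tag, that case is silently dropped. A minimal sketch of a stricter pairing follows; `build_cases` is a hypothetical helper, not part of pytest.

```python
def build_cases(lines, expected_tags):
    # zip() stops at the shorter input, so a missing expected value would
    # silently drop a test case; fail loudly instead.
    if len(lines) != len(expected_tags):
        raise ValueError(
            f'{len(lines)} lines but {len(expected_tags)} expected tags'
        )
    return list(zip(lines, expected_tags))


# Inline stand-ins for the two lines of the data file (assumption).
lines = [
    '{"entities": {"hashtags": [{"text": "StandWithLouisiana"}]}}',
    '{"entities": {"hashtags": []}}',
]
cases = build_cases(lines, ['StandWithLouisiana', ''])
```

The resulting list can be passed as `params=cases` in place of the zip object. On Python 3.10 and later, `zip(lines, expected_tags, strict=True)` performs a similar check.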

Edit 2: using indirect parametrization

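Another, probably cleaner, approach is to let the tweet fixture handle only the parsing of the raw string and move the parametrization to the test itself. Here I use indirect parametrization to pass the raw line to the tweet fixture: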

import json
import pathlib
import pytest


lines = pathlib.Path('data.json').read_text(encoding='utf-8').splitlines()
expected_tags = ['StandWithLouisiana', '']

@pytest.fixture
def tweet(request):
    line = request.param
    return json.loads(line)


def hashtags(t):
    return ' '.join([h['text'] for h in t['entities']['hashtags']])


@pytest.mark.parametrize('tweet, tag', 
                         zip(lines, expected_tags),
                         ids=tuple(repr(tag) for tag in expected_tags),
                         indirect=('tweet',))
def test_hashtag(tweet, tag):
    assert hashtags(tweet) == tag

This test run also yields two tests:

test_spam.py::test_hashtag['StandWithLouisiana'] PASSED
test_spam.py::test_hashtag[''] PASSED
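
If no fixture is needed at all, a variation on the same idea is to parse the lines up front and parametrize the test directly with (tweet, tag) pairs. This is only a sketch: the inline `RAW_LINES` stand in for the file contents and are trimmed to the keys the test reads, and `CASES` is a name chosen for this illustration.

```python
import json

import pytest

# Inline stand-in for the two lines of the JSONL file (assumption:
# trimmed to the keys that hashtags() reads).
RAW_LINES = [
    '{"entities": {"hashtags": [{"indices": [97, 116], "text": "StandWithLouisiana"}]}}',
    '{"entities": {"hashtags": []}}',
]
EXPECTED_TAGS = ['StandWithLouisiana', '']

# Parse at collection time so each case is a (tweet dict, expected tag) pair.
CASES = [(json.loads(line), tag) for line, tag in zip(RAW_LINES, EXPECTED_TAGS)]


def hashtags(t):
    return ' '.join(h['text'] for h in t['entities']['hashtags'])


@pytest.mark.parametrize('tweet, tag', CASES,
                         ids=[repr(tag) for _, tag in CASES])
def test_hashtag(tweet, tag):
    assert hashtags(tweet) == tag
```

The trade-off is that a malformed line then fails during collection rather than inside a single test, which may or may not be what you want.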
