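I have been experimenting with batched sprite rendering, and I have a solution that works fine on my desktop PC. However, when I try it on my laptop with an integrated Intel UHD 620, I get the following performance warnings: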
[21:42:03 error] OpenGL: API - Performance - Recompiling fragment shader for program 27
[21:42:03 error] OpenGL: API - Performance - multisampled FBO 0->1
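Presumably because of whatever triggers these performance warnings, a frame that takes 1-2 ms on my machine with a dedicated graphics card takes about 100 ms on my laptop.
Here is my renderer code: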
BatchedSpriteRenderer::BatchedSpriteRenderer(ResourceManager &resource_manager)
    : resource_manager(&resource_manager),
      max_sprites(100000),
      vertex_array(std::make_unique<VertexArray>()),
      vertex_buffer(std::make_unique<VertexBuffer>())
{
    resource_manager.load_shader("batched_texture",
                                 "shaders/texture_batched.vert",
                                 "shaders/texture.frag");
    std::vector<unsigned int> sprite_indices;
    for (int i = 0; i < max_sprites; ++i)
    {
        unsigned int sprite_number = i * 4;
        sprite_indices.push_back(0 + sprite_number);
        sprite_indices.push_back(1 + sprite_number);
        sprite_indices.push_back(2 + sprite_number);
        sprite_indices.push_back(2 + sprite_number);
        sprite_indices.push_back(3 + sprite_number);
        sprite_indices.push_back(0 + sprite_number);
    }
    element_buffer = std::make_unique<ElementBuffer>(sprite_indices.data(), max_sprites * 6);
    VertexBufferLayout layout;
    layout.push<float>(2);
    layout.push<float>(2);
    layout.push<float>(4);
    vertex_array->add_buffer(*vertex_buffer, layout);
}
void BatchedSpriteRenderer::draw(const std::string &texture,
                                 const std::vector<glm::mat4> &transforms,
                                 const glm::mat4 &view)
{
    vertex_array->bind();
    auto shader = resource_manager->shader_store.get("batched_texture");
    shader->bind();
    std::vector<SpriteVertex> vertices;
    vertices.reserve(transforms.size() * 4);
    for (const auto &transform : transforms)
    {
        glm::vec4 transformed_position = transform * glm::vec4(0.0, 1.0, 1.0, 1.0);
        vertices.push_back({glm::vec2(transformed_position.x, transformed_position.y),
                            glm::vec2(0.0, 1.0),
                            glm::vec4(1.0, 1.0, 1.0, 1.0)});
        transformed_position = transform * glm::vec4(0.0, 0.0, 1.0, 1.0);
        vertices.push_back({glm::vec2(transformed_position.x, transformed_position.y),
                            glm::vec2(0.0, 0.0),
                            glm::vec4(1.0, 1.0, 1.0, 1.0)});
        transformed_position = transform * glm::vec4(1.0, 0.0, 1.0, 1.0);
        vertices.push_back({glm::vec2(transformed_position.x, transformed_position.y),
                            glm::vec2(1.0, 0.0),
                            glm::vec4(1.0, 1.0, 1.0, 1.0)});
        transformed_position = transform * glm::vec4(1.0, 1.0, 1.0, 1.0);
        vertices.push_back({glm::vec2(transformed_position.x, transformed_position.y),
                            glm::vec2(1.0, 1.0),
                            glm::vec4(1.0, 1.0, 1.0, 1.0)});
    }
    vertex_buffer->add_data(vertices.data(),
                            sizeof(SpriteVertex) * vertices.size(),
                            GL_DYNAMIC_DRAW);
    shader->set_uniform_mat4f("u_view", view);
    shader->set_uniform_1i("u_texture", 0);
    resource_manager->texture_store.get(texture)->bind();
    glDrawElements(GL_TRIANGLES, transforms.size() * 6, GL_UNSIGNED_INT, 0);
}
Hopefully my abstractions are self-explanatory. Each of the abstraction classes (VertexArray, VertexBuffer, ElementBuffer, VertexBufferLayout) manages the lifetime of its equivalent OpenGL object.
Here are the shaders being used:
texture_batched.vert
#version 430 core
layout(location = 0) in vec2 v_position;
layout(location = 1) in vec2 v_tex_coord;
layout(location = 2) in vec4 v_color;
out vec4 color;
out vec2 tex_coord;
uniform mat4 u_view;
void main()
{
    tex_coord = v_tex_coord;
    gl_Position = u_view * vec4(v_position, 0.0, 1.0);
    color = v_color;
}
texture.frag
#version 430 core
in vec4 color;
in vec2 tex_coord;
out vec4 frag_color;
uniform sampler2D u_texture;
void main()
{
    frag_color = texture(u_texture, tex_coord);
    frag_color *= color;
}
What is causing these performance problems, and how can I fix them?
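EDIT: I completely forgot to mention that the actual image rendered with this is completely messed up. I will try to grab a screenshot of it working correctly when I get to my desktop PC, but here is the mangled version:
It should be a neat grid of 200x200 white circles.
EDIT 2: I tried it on another computer, this time one with a GTX 1050 Ti, and it is broken there too. This time there are no error messages or warnings, so the warnings may be unrelated.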
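As far as I can tell, it ended up having nothing to do with OpenGL.
In the draw function I create a vector called vertices and put all of the vertices into it. For some reason, when I recreate that vector every frame, the subsequent push_back calls do not add to the vector correctly. The members of the SpriteVertex struct get mixed up. So instead of the correct layout: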
pos tex_coord color
pos tex_coord color
pos tex_coord color
pos tex_coord color
it was being filled with the following layout:
pos tex_coord color
tex_coord pos color
tex_coord pos color
tex_coord pos color
Or at least something to that effect.
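One way to rule a member mix-up in or out is to check the struct layout at compile time against what the VertexBufferLayout declares. This is only a minimal sketch: the definition of SpriteVertex is not shown in the question, so the one below is an assumption (position, then texture coordinate, then color), and Vec2 and Vec4 are hypothetical stand-ins for glm::vec2 and glm::vec4 so that the snippet compiles without GLM.

```cpp
#include <cstddef>

// Hypothetical stand-ins for glm::vec2 / glm::vec4 (plain floats, no padding).
struct Vec2 { float x, y; };
struct Vec4 { float x, y, z, w; };

// Assumed definition; the real SpriteVertex is not shown in the question.
struct SpriteVertex
{
    Vec2 position;
    Vec2 tex_coord;
    Vec4 color;
};

// These offsets have to agree with layout.push<float>(2), push<float>(2),
// push<float>(4) in the constructor, otherwise the attributes get swapped.
static_assert(offsetof(SpriteVertex, position) == 0,
              "position must come first");
static_assert(offsetof(SpriteVertex, tex_coord) == 2 * sizeof(float),
              "tex_coord must follow the two position floats");
static_assert(offsetof(SpriteVertex, color) == 4 * sizeof(float),
              "color must follow position and tex_coord");
static_assert(sizeof(SpriteVertex) == 8 * sizeof(float),
              "no padding expected between or after the members");
```

If the real struct declares its members in a different order, one of these assertions fails at compile time instead of the sprites showing up scrambled at run time.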
I changed it so that the vertices vector is a member of the BatchedSpriteRenderer class, with space reserved for the maximum possible number of vertices.
void BatchedSpriteRenderer::draw(const std::string &texture,
                                 const std::vector<glm::mat4> &transforms,
                                 const glm::mat4 &view)
{
    vertex_array->bind();
    auto shader = resource_manager->shader_store.get("batched_texture");
    shader->bind();
    for (unsigned int i = 0; i < transforms.size(); ++i)
    {
        const auto &transform = transforms[i];
        glm::vec4 transformed_position = transform * glm::vec4(0.0, 1.0, 1.0, 1.0);
        vertices[i * 4] = {glm::vec2(transformed_position.x,
                                     transformed_position.y),
                           glm::vec2(0.0, 1.0),
                           glm::vec4(1.0, 1.0, 1.0, 1.0)};
        transformed_position = transform * glm::vec4(0.0, 0.0, 1.0, 1.0);
        vertices[i * 4 + 1] = {glm::vec2(transformed_position.x,
                                         transformed_position.y),
                               glm::vec2(0.0, 0.0),
                               glm::vec4(1.0, 1.0, 1.0, 1.0)};
        transformed_position = transform * glm::vec4(1.0, 0.0, 1.0, 1.0);
        vertices[i * 4 + 2] = {glm::vec2(transformed_position.x,
                                         transformed_position.y),
                               glm::vec2(1.0, 0.0),
                               glm::vec4(1.0, 1.0, 1.0, 1.0)};
        transformed_position = transform * glm::vec4(1.0, 1.0, 1.0, 1.0);
        vertices[i * 4 + 3] = {glm::vec2(transformed_position.x,
                                         transformed_position.y),
                               glm::vec2(1.0, 1.0),
                               glm::vec4(1.0, 1.0, 1.0, 1.0)};
    }
    vertex_buffer->add_data(vertices.data(),
                            sizeof(SpriteVertex) * (transforms.size() * 4),
                            GL_DYNAMIC_DRAW);
    shader->set_uniform_mat4f("u_view", view);
    shader->set_uniform_1i("u_texture", 0);
    resource_manager->texture_store.get(texture)->bind();
    glDrawElements(GL_TRIANGLES, transforms.size() * 6, GL_UNSIGNED_INT, 0);
}
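One detail to watch with a member vector that is written through operator[]: std::vector::reserve only changes the capacity, not size(), and indexing at or past size() is undefined behavior. The post does not show how the member is initialized, so the following is only a sketch of one safe way to do it. QuadScratchBuffer is a hypothetical helper, not part of the original renderer, and the SpriteVertex layout is again an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed layout: position, texture coordinate, color (8 floats in total).
struct SpriteVertex
{
    float position[2];
    float tex_coord[2];
    float color[4];
};

// Hypothetical helper: a CPU-side vertex array that is allocated once and
// then overwritten in place every frame.
class QuadScratchBuffer
{
public:
    // The sized constructor sets size() to max_sprites * 4, so every index
    // below that is valid for operator[]. reserve() alone would not do this.
    explicit QuadScratchBuffer(std::size_t max_sprites)
        : vertices(max_sprites * 4)
    {
    }

    // Overwrites one corner (0..3) of the given sprite.
    void set_corner(std::size_t sprite, std::size_t corner, const SpriteVertex &v)
    {
        const std::size_t index = sprite * 4 + corner;
        assert(corner < 4 && index < vertices.size());
        vertices[index] = v;
    }

    const SpriteVertex *data() const { return vertices.data(); }
    std::size_t size() const { return vertices.size(); }

    // Bytes to upload when only the first sprite_count sprites are in use.
    std::size_t byte_size(std::size_t sprite_count) const
    {
        return sizeof(SpriteVertex) * sprite_count * 4;
    }

private:
    std::vector<SpriteVertex> vertices;
};
```

In the renderer this would correspond to constructing the member with max_sprites * 4 elements (or calling resize once in the constructor) and checking that transforms.size() never exceeds max_sprites before the loop runs.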