所以我一直纠结于如何在 Ruby 上配置 GRPC 客户端重试。要重现的代码取自https://github.com/grpc/grpc/tree/master/examples/ruby/errors_and_cancellation
pb 文件可以在这里找到 https://github.com/grpc/grpc/tree/master/examples/ruby/lib
客户代码
this_dir = File.expand_path(File.dirname(__FILE__))
lib_dir = File.join(File.dirname(this_dir), 'lib')
$LOAD_PATH.unshift(lib_dir) unless $LOAD_PATH.include?(lib_dir)
require 'grpc'
require 'route_guide_services_pb'
require 'json'
include Routeguide
def run_get_feature_expect_error(stub)
resp = stub.get_feature(Point.new)
end
def run_list_features_expect_error(stub)
resps = stub.list_features(Rectangle.new)
# NOOP iteration to pick up error
resps.each { }
end
def run_record_route_expect_error(stub)
stub.record_route([])
end
def run_route_chat_expect_error(stub)
resps = stub.route_chat([])
# NOOP iteration to pick up error
resps.each { }
end
def main
stub = RouteGuide::Stub.new('127.0.0.1:50051', :this_channel_is_insecure, channel_args: {
"grpc.enable_retries" => 1,
"grpc.service_config" => JSON.generate(
methodConfig: [
{
name: [{service: "routeguide.RouteGuide"}],
retryPolicy: {
retryableStatusCodes: ["UNAVAILABLE", "INTERNAL", "UNKNOWN", "NOT_FOUND"],
maxAttempts: 3,
initialBackoff: "1s",
backoffMultiplier: 2.0,
maxBackoff: "0.3s"
}
}
]
)
})
begin
run_record_route_expect_error(stub)
rescue GRPC::BadStatus => e
puts "===== run_record_route_expect_error exception: ====="
puts e.inspect
puts "e.code: #{e.code}"
puts "e.details: #{e.details}"
puts "e.metadata: #{e.metadata}"
puts "================================="
end
end
main
服务器代码
this_dir = File.expand_path(File.dirname(__FILE__))
lib_dir = File.join(File.dirname(this_dir), 'lib')
$LOAD_PATH.unshift(lib_dir) unless $LOAD_PATH.include?(lib_dir)
require 'grpc'
require 'route_guide_services_pb'
include Routeguide
include GRPC::Core::StatusCodes
# CanellingandErrorReturningServiceImpl provides an implementation of the RouteGuide service.
class CancellingAndErrorReturningServerImpl < RouteGuide::Service
# def get_feature
# Note get_feature isn't implemented in this subclass, so the server
# will get a gRPC UNIMPLEMENTED error when it's called.
def list_features(rectangle, _call)
raise "string appears on the client in the 'details' field of a 'GRPC::Unknown' exception"
end
def record_route(call)
puts "incoming request record_route"
raise GRPC::BadStatus.new_status_exception(UNAVAILABLE)
end
def route_chat(notes)
raise GRPC::BadStatus.new_status_exception(ABORTED, details = 'arbitrary', metadata = {somekey: 'val'})
end
end
def main
puts "starting grpc server"
port = '127.0.0.1:50051'
s = GRPC::RpcServer.new
s.add_http2_port(port, :this_port_is_insecure)
GRPC.logger.info("... running insecurely on #{port}")
puts "running on #{port}"
s.handle(CancellingAndErrorReturningServerImpl.new)
s.run_till_terminated
end
main
客户端输出
GRPC_VERBOSITY=debug GRPC_TRACE=retry bundle exec errors_and_cancellation/error_examples_client.rb
D0419 14:46:40.892379000 4332586368 config.cc:204] gRPC EXPERIMENT call_status_override_on_cancellation OFF (default:OFF)
D0419 14:46:40.892407000 4332586368 config.cc:204] gRPC EXPERIMENT call_v3 OFF (default:OFF)
D0419 14:46:40.892410000 4332586368 config.cc:204] gRPC EXPERIMENT canary_client_privacy OFF (default:OFF)
D0419 14:46:40.892412000 4332586368 config.cc:204] gRPC EXPERIMENT chaotic_good OFF (default:OFF)
D0419 14:46:40.892414000 4332586368 config.cc:204] gRPC EXPERIMENT client_idleness ON (default:ON)
D0419 14:46:40.892415000 4332586368 config.cc:204] gRPC EXPERIMENT client_privacy OFF (default:OFF)
D0419 14:46:40.892417000 4332586368 config.cc:204] gRPC EXPERIMENT event_engine_client OFF (default:OFF)
D0419 14:46:40.892419000 4332586368 config.cc:204] gRPC EXPERIMENT event_engine_dns OFF (default:OFF)
D0419 14:46:40.892441000 4332586368 config.cc:204] gRPC EXPERIMENT event_engine_listener ON (default:ON)
D0419 14:46:40.892451000 4332586368 config.cc:204] gRPC EXPERIMENT free_large_allocator OFF (default:OFF)
D0419 14:46:40.892453000 4332586368 config.cc:204] gRPC EXPERIMENT http2_stats_fix ON (default:ON)
D0419 14:46:40.892455000 4332586368 config.cc:204] gRPC EXPERIMENT keepalive_fix OFF (default:OFF)
D0419 14:46:40.892457000 4332586368 config.cc:204] gRPC EXPERIMENT keepalive_server_fix OFF (default:OFF)
D0419 14:46:40.892459000 4332586368 config.cc:204] gRPC EXPERIMENT monitoring_experiment ON (default:ON)
D0419 14:46:40.892460000 4332586368 config.cc:204] gRPC EXPERIMENT multiping OFF (default:OFF)
D0419 14:46:40.892637000 4332586368 config.cc:204] gRPC EXPERIMENT peer_state_based_framing OFF (default:OFF)
D0419 14:46:40.892643000 4332586368 config.cc:204] gRPC EXPERIMENT pending_queue_cap ON (default:ON)
D0419 14:46:40.892645000 4332586368 config.cc:204] gRPC EXPERIMENT pick_first_happy_eyeballs ON (default:ON)
D0419 14:46:40.892647000 4332586368 config.cc:204] gRPC EXPERIMENT promise_based_client_call OFF (default:OFF)
D0419 14:46:40.892649000 4332586368 config.cc:204] gRPC EXPERIMENT promise_based_inproc_transport OFF (default:OFF)
D0419 14:46:40.892651000 4332586368 config.cc:204] gRPC EXPERIMENT promise_based_server_call OFF (default:OFF)
D0419 14:46:40.892652000 4332586368 config.cc:204] gRPC EXPERIMENT registered_method_lookup_in_transport ON (default:ON)
D0419 14:46:40.892654000 4332586368 config.cc:204] gRPC EXPERIMENT rfc_max_concurrent_streams OFF (default:OFF)
D0419 14:46:40.892655000 4332586368 config.cc:204] gRPC EXPERIMENT round_robin_delegate_to_pick_first ON (default:ON)
D0419 14:46:40.892657000 4332586368 config.cc:204] gRPC EXPERIMENT rstpit OFF (default:OFF)
D0419 14:46:40.892659000 4332586368 config.cc:204] gRPC EXPERIMENT schedule_cancellation_over_write OFF (default:OFF)
D0419 14:46:40.892661000 4332586368 config.cc:204] gRPC EXPERIMENT server_privacy OFF (default:OFF)
D0419 14:46:40.892683000 4332586368 config.cc:204] gRPC EXPERIMENT tcp_frame_size_tuning OFF (default:OFF)
D0419 14:46:40.892685000 4332586368 config.cc:204] gRPC EXPERIMENT tcp_rcv_lowat OFF (default:OFF)
D0419 14:46:40.892686000 4332586368 config.cc:204] gRPC EXPERIMENT trace_record_callops OFF (default:OFF)
D0419 14:46:40.892688000 4332586368 config.cc:204] gRPC EXPERIMENT unconstrained_max_quota_buffer_size OFF (default:OFF)
D0419 14:46:40.892690000 4332586368 config.cc:204] gRPC EXPERIMENT v3_backend_metric_filter OFF (default:OFF)
D0419 14:46:40.892691000 4332586368 config.cc:204] gRPC EXPERIMENT v3_channel_idle_filters OFF (default:OFF)
D0419 14:46:40.892693000 4332586368 config.cc:204] gRPC EXPERIMENT v3_compression_filter OFF (default:OFF)
D0419 14:46:40.892704000 4332586368 config.cc:204] gRPC EXPERIMENT v3_server_auth_filter OFF (default:OFF)
D0419 14:46:40.892705000 4332586368 config.cc:204] gRPC EXPERIMENT work_serializer_clears_time_cache ON (default:ON)
D0419 14:46:40.892707000 4332586368 config.cc:204] gRPC EXPERIMENT work_serializer_dispatch OFF (default:OFF)
D0419 14:46:40.892709000 4332586368 config.cc:204] gRPC EXPERIMENT write_size_cap ON (default:ON)
D0419 14:46:40.892710000 4332586368 config.cc:204] gRPC EXPERIMENT write_size_policy ON (default:ON)
D0419 14:46:40.892712000 4332586368 config.cc:204] gRPC EXPERIMENT wrr_delegate_to_pick_first ON (default:ON)
D0419 14:46:40.892767000 4332586368 ev_posix.cc:113] Using polling engine: poll
I0419 14:46:40.892782000 4332586368 rb_grpc.c:331] GRPC_RUBY: grpc_ruby_init_threads g_bg_thread_init_done=0
D0419 14:46:40.892821000 4332586368 rb_grpc.c:351] GRPC_RUBY: grpc_ruby_init - g_enable_fork_support=0 prev g_grpc_ruby_init_count:0
D0419 14:46:40.892842000 4332586368 lb_policy_registry.cc:46] registering LB policy factory for "priority_experimental"
D0419 14:46:40.892847000 4332586368 lb_policy_registry.cc:46] registering LB policy factory for "outlier_detection_experimental"
D0419 14:46:40.892850000 4332586368 lb_policy_registry.cc:46] registering LB policy factory for "weighted_target_experimental"
D0419 14:46:40.892852000 4332586368 lb_policy_registry.cc:46] registering LB policy factory for "pick_first"
D0419 14:46:40.892854000 4332586368 lb_policy_registry.cc:46] registering LB policy factory for "round_robin"
D0419 14:46:40.892857000 4332586368 lb_policy_registry.cc:46] registering LB policy factory for "weighted_round_robin"
D0419 14:46:40.892870000 4332586368 lb_policy_registry.cc:46] registering LB policy factory for "grpclb"
D0419 14:46:40.892878000 4332586368 dns_resolver_plugin.cc:52] Using ares dns resolver
D0419 14:46:40.892886000 4332586368 lb_policy_registry.cc:46] registering LB policy factory for "rls_experimental"
D0419 14:46:40.892895000 4332586368 lb_policy_registry.cc:46] registering LB policy factory for "xds_cluster_manager_experimental"
D0419 14:46:40.892897000 4332586368 lb_policy_registry.cc:46] registering LB policy factory for "xds_cluster_impl_experimental"
D0419 14:46:40.892900000 4332586368 lb_policy_registry.cc:46] registering LB policy factory for "cds_experimental"
D0419 14:46:40.892902000 4332586368 lb_policy_registry.cc:46] registering LB policy factory for "xds_override_host_experimental"
D0419 14:46:40.892906000 4332586368 lb_policy_registry.cc:46] registering LB policy factory for "xds_wrr_locality_experimental"
D0419 14:46:40.892909000 4332586368 lb_policy_registry.cc:46] registering LB policy factory for "ring_hash_experimental"
D0419 14:46:40.893046000 4332586368 certificate_provider_registry.cc:33] registering certificate provider factory for "file_watcher"
D0419 14:46:40.893080000 4332586368 channel_init.cc:157] Filter server-auth not registered, but is referenced in the after clause of grpc-server-authz when building channel stack SERVER_CHANNEL
I0419 14:46:40.893698000 6143455232 retry_filter_legacy_call_data.cc:1504] chand=0x106a83bc0 calld=0x126860a80: created call
I0419 14:46:40.893714000 6143455232 retry_filter_legacy_call_data.cc:1588] chand=0x106a83bc0 calld=0x126860a80: batch started from surface: SEND_INITIAL_METADATA{:path: /routeguide.RouteGuide/RecordRoute, GrpcRegisteredMethod: (nil), WaitForReady: false}
I0419 14:46:40.893718000 6143455232 retry_filter_legacy_call_data.cc:1824] chand=0x106a83bc0 calld=0x126860a80: adding pending batch at index 0
I0419 14:46:40.893721000 6143455232 retry_filter_legacy_call_data.cc:1699] chand=0x106a83bc0 calld=0x126860a80: creating call attempt
I0419 14:46:40.893724000 6143455232 retry_filter_legacy_call_data.cc:147] chand=0x106a83bc0 calld=0x126860a80 attempt=0x12681f000: created attempt, lb_call=0x126671980
I0419 14:46:40.893727000 6143455232 retry_filter_legacy_call_data.cc:529] chand=0x106a83bc0 calld=0x126860a80 attempt=0x12681f000: constructing retriable batches
I0419 14:46:40.893730000 6143455232 retry_filter_legacy_call_data.cc:745] chand=0x106a83bc0 calld=0x126860a80 attempt=0x12681f000: creating batch 0x126622f50
I0419 14:46:40.893771000 6143455232 retry_filter_legacy_call_data.cc:326] chand=0x106a83bc0 calld=0x126860a80 attempt=0x12681f000: adding batch (start replayable pending batch on call attempt): SEND_INITIAL_METADATA{:path: /routeguide.RouteGuide/RecordRoute, GrpcRegisteredMethod: (nil), WaitForReady: false}
I0419 14:46:40.893775000 6143455232 retry_filter_legacy_call_data.cc:539] chand=0x106a83bc0 calld=0x126860a80 attempt=0x12681f000: starting 1 retriable batches on lb_call=0x126671980
D0419 14:46:40.893905000 6142881792 rb_channel.c:738] GRPC_RUBY: run_poll_channels_loop - create connection polling thread
D0419 14:46:40.893912000 6142881792 rb_channel.c:657] GRPC_RUBY: run_poll_channels_loop_no_gil - begin
I0419 14:46:40.894537000 4332586368 retry_filter_legacy_call_data.cc:1295] chand=0x106a83bc0 calld=0x126860a80 attempt=0x12681f000 batch_data=0x126622f50: got on_complete, error=OK, batch= SEND_INITIAL_METADATA{user-agent: grpc-ruby/1.62.0 grpc-c/39.0.0 (osx; chttp2), :authority: 127.0.0.1:50051, :path: /routeguide.RouteGuide/RecordRoute, GrpcRegisteredMethod: (nil), WaitForReady: false, grpc-accept-encoding: identity, deflate, gzip, te: trailers, content-type: application/grpc, :scheme: http, :method: POST}
I0419 14:46:40.894549000 4332586368 retry_filter_legacy_call_data.cc:1949] chand=0x106a83bc0 calld=0x126860a80: completed pending batch at index 0
I0419 14:46:40.894552000 4332586368 retry_filter_legacy_call_data.cc:1892] chand=0x106a83bc0 calld=0x126860a80: clearing pending batch
I0419 14:46:40.894555000 4332586368 retry_filter_legacy_call_data.cc:766] chand=0x106a83bc0 calld=0x126860a80 attempt=0x12681f000: destroying batch 0x126622f50
I0419 14:46:40.894580000 4332586368 retry_filter_legacy_call_data.cc:1588] chand=0x106a83bc0 calld=0x126860a80: batch started from surface: SEND_TRAILING_METADATA{} RECV_INITIAL_METADATA RECV_MESSAGE RECV_TRAILING_METADATA
I0419 14:46:40.894584000 4332586368 retry_filter_legacy_call_data.cc:1824] chand=0x106a83bc0 calld=0x126860a80: adding pending batch at index 2
I0419 14:46:40.894586000 4332586368 retry_filter_legacy_call_data.cc:1708] chand=0x106a83bc0 calld=0x126860a80: starting batch on attempt=0x12681f000
I0419 14:46:40.894588000 4332586368 retry_filter_legacy_call_data.cc:529] chand=0x106a83bc0 calld=0x126860a80 attempt=0x12681f000: constructing retriable batches
I0419 14:46:40.894590000 4332586368 retry_filter_legacy_call_data.cc:745] chand=0x106a83bc0 calld=0x126860a80 attempt=0x12681f000: creating batch 0x1366bcfb0
I0419 14:46:40.894594000 4332586368 retry_filter_legacy_call_data.cc:326] chand=0x106a83bc0 calld=0x126860a80 attempt=0x12681f000: adding batch (start replayable pending batch on call attempt): SEND_TRAILING_METADATA{} RECV_INITIAL_METADATA RECV_MESSAGE RECV_TRAILING_METADATA
I0419 14:46:40.894650000 4332586368 retry_filter_legacy_call_data.cc:539] chand=0x106a83bc0 calld=0x126860a80 attempt=0x12681f000: starting 1 retriable batches on lb_call=0x126671980
I0419 14:46:40.894688000 4332586368 retry_filter_legacy_call_data.cc:1295] chand=0x106a83bc0 calld=0x126860a80 attempt=0x12681f000 batch_data=0x1366bcfb0: got on_complete, error=OK, batch= SEND_TRAILING_METADATA{} RECV_INITIAL_METADATA RECV_MESSAGE RECV_TRAILING_METADATA
I0419 14:46:40.894692000 4332586368 retry_filter_legacy_call_data.cc:1949] chand=0x106a83bc0 calld=0x126860a80: completed pending batch at index 2
I0419 14:46:40.899799000 6142881792 retry_filter_legacy_call_data.cc:839] chand=0x106a83bc0 calld=0x126860a80 attempt=0x12681f000 batch_data=0x1366bcfb0: got recv_initial_metadata_ready, error=OK
I0419 14:46:40.899816000 6142881792 retry_filter_legacy_call_data.cc:1967] chand=0x106a83bc0 calld=0x126860a80: committing retries
I0419 14:46:40.899818000 6142881792 retry_filter_legacy_call_data.cc:1766] chand=0x106a83bc0 calld=0x126860a80: destroying send_initial_metadata
I0419 14:46:40.899822000 6142881792 retry_filter_legacy_call_data.cc:1785] chand=0x106a83bc0 calld=0x126860a80: destroying send_trailing_metadata
I0419 14:46:40.899824000 6142881792 retry_filter_legacy_call_data.cc:243] chand=0x106a83bc0 calld=0x126860a80 attempt=0x12681f000: retry state no longer needed; moving LB call to parent and unreffing the call attempt
I0419 14:46:40.899827000 6142881792 retry_filter_legacy_call_data.cc:1949] chand=0x106a83bc0 calld=0x126860a80: invoking recv_initial_metadata_ready for pending batch at index 2
I0419 14:46:40.899834000 6142881792 retry_filter_legacy_call_data.cc:938] chand=0x106a83bc0 calld=0x126860a80 attempt=0x12681f000 batch_data=0x1366bcfb0: got recv_message_ready, error=OK
I0419 14:46:40.899869000 6142881792 retry_filter_legacy_call_data.cc:1949] chand=0x106a83bc0 calld=0x126860a80: invoking recv_message_ready for pending batch at index 2
I0419 14:46:40.899877000 6142881792 retry_filter_legacy_call_data.cc:1132] chand=0x106a83bc0 calld=0x126860a80 attempt=0x12681f000 batch_data=0x1366bcfb0: got recv_trailing_metadata_ready, error=OK
I0419 14:46:40.899880000 6142881792 retry_filter_legacy_call_data.cc:1159] chand=0x106a83bc0 calld=0x126860a80 attempt=0x12681f000: call finished, status=UNAVAILABLE server_pushback=N/A is_lb_drop=0 stream_network_state=N/A
I0419 14:46:40.899883000 6142881792 retry_filter_legacy_call_data.cc:602] chand=0x106a83bc0 calld=0x126860a80 attempt=0x12681f000: retries already committed
I0419 14:46:40.899885000 6142881792 retry_filter_legacy_call_data.cc:1949] chand=0x106a83bc0 calld=0x126860a80: invoking recv_trailing_metadata_ready for pending batch at index 2
I0419 14:46:40.899943000 6142881792 retry_filter_legacy_call_data.cc:1892] chand=0x106a83bc0 calld=0x126860a80: clearing pending batch
I0419 14:46:40.899949000 6142881792 retry_filter_legacy_call_data.cc:766] chand=0x106a83bc0 calld=0x126860a80 attempt=0x12681f000: destroying batch 0x1366bcfb0
I0419 14:46:40.899951000 6142881792 retry_filter_legacy_call_data.cc:176] chand=0x106a83bc0 calld=0x126860a80 attempt=0x12681f000: destroying call attempt
I0419 14:46:40.902168000 4332586368 retry_filter_legacy_call_data.cc:1766] chand=0x106a83bc0 calld=0x126860a80: destroying send_initial_metadata
I0419 14:46:40.902181000 4332586368 retry_filter_legacy_call_data.cc:1785] chand=0x106a83bc0 calld=0x126860a80: destroying send_trailing_metadata
===== run_record_route_expect_error exception: =====
#<GRPC::Unavailable: 14:unknown cause. debug_error_string:{UNKNOWN:Error received from peer ipv4:127.0.0.1:50051 {created_time:"2024-04-19T14:46:40.899962+07:00", grpc_status:14, grpc_message:"unknown cause"}}>
e.code: 14
e.details: unknown cause
e.metadata: {}
=================================
D0419 14:46:40.902432000 4332586368 rb_channel.c:703] GRPC_RUBY: run_poll_channels_loop_unblocking_func - begin aborting connection polling
I0419 14:46:40.902623000 4332586368 work_stealing_thread_pool.cc:269] WorkStealingThreadPoolImpl::Quiesce
D0419 14:46:40.902709000 4332586368 rb_channel.c:723] GRPC_RUBY: cq shutdown on global polling cq. pid: 35442
D0419 14:46:40.902718000 4332586368 rb_channel.c:728] GRPC_RUBY: run_poll_channels_loop_unblocking_func - end aborting connection polling
D0419 14:46:40.902737000 6142881792 rb_channel.c:687] GRPC_RUBY: run_poll_channels_loop_no_gil - exit connection polling loop
服务器输出
└─Δ bundle exec errors_and_cancellation/error_examples_server.rb
starting grpc server
running on 127.0.0.1:50051
incoming request record_route
我期望服务器会有输出
incoming request record_route
incoming request record_route
incoming request record_route
incoming request record_route
因为客户端应该重试调用。哪一部分是错误的? 我使用此示例代码在 Golang 上尝试了服务配置https://github.com/grpc/grpc-go/tree/b78c0ebf1e21da5423319c19541934ca000e2aa6/examples/features/retry,它工作完美
我现在正在努力解决同样的问题,但我相信问题出在 gRPC ruby 服务器端。
例如这个配置/客户端
host = "localhost:1111"
channel_args = {
"grpc.enable_retries" => 1,
"grpc.service_config" => JSON.generate(
methodConfig: [
name: [{ service: "" }],
timeout: '5s',
retryPolicy: {
retryableStatusCodes:
%w(
FAILED_PRECONDITION
ABORTED
),
maxAttempts: 3,
initialBackoff: "0.2s",
backoffMultiplier: 2.0,
maxBackoff: "1s"
}
]
)
}
stub = ::SimpleService::V1::SimpleRetryAPI::Stub.new(host,
:this_channel_is_insecure,
channel_args: channel_args)
指向正在运行的 gRPC Python 服务器,并且重试按预期发生。
如果我使用 ruby 服务器提供相同的 gRPC 服务,则不会发生重试。
您上面的配置对我来说似乎是正确的 - 您可以考虑使用 如果问题与服务名称匹配,请使用
name: [{service: ""}],
而不是 name: [{service: "routeguide.RouteGuide"}],
。这里有一些关于匹配如何发生的细节:https://github.com/grpc/grpc-proto/blob/master/grpc/service_config/service_config.proto#L54-L71
我可能误解了这个问题,但我建议尝试针对 Go 服务器/非 ruby 服务器运行配置的 Ruby 客户端,因为我相信您正在正确配置客户端。