I’m using: swipl:8.4.2 (Docker, Linux host, k8s cluster)
I’m getting an odd error when passing certain addresses as src_url
to Pengines. It’s an io_error
with 'Success'
as the context. It seems to happen randomly, about 50% chance for me, with the other requests succeeding as normal. It only happens with HTTP URLs on the local network. It seems that HTTPS URLs on the public internet are fine.
error(io_error(read,'<stream>(0x562a20ed6000)'),context(system:read_string/3,'Success'))
I’m calling Pengines like such:
pengine_rpc('http://pengines-service:4242/', member(X,[1,2]), [src_url('http://foo-service.bar-dev.svc.cluster.local:8000/internal/script/')]).
However, requests using a public HTTPS URL work properly. Attaching to the toplevel and running http_open/3
had no trouble obtaining the script with either HTTP or HTTPS so I’m led to believe this might be a Pengines or threading-specific thing. My pengines setup is a minimal, mostly default setup that looks like this.
I tried enabling a bunch of debugging stuff to get a log, and this is what I found:
% [Thread http@4242] New HTTP connection from ip(127,0,0,6)
% [Thread http@4242] Waiting for connection
% [Thread httpd@4242_2] Opened connection from ip(127,0,0,6)
% [Thread httpd@4242_2] First line: POST /pengine/create?format=prolog HTTP/1.1
% [Thread httpd@4242_2] Header =
host: pengines-service:4242
user-agent: Go-http-client/1.1
content-length: 598
content-type: application/json; charset=utf-8
accept-encoding: gzip
x-forwarded-proto: http
x-request-id: 62b42b45-2175-4561-90a0-b2e7cbbca38f
x-envoy-attempt-count: 1
x-forwarded-client-cert: By=spiffe://cluster.local/ns/stream-dev/sa/default;Hash=b11c61e4809804d88820e751dfa9e748722cd16c697ced90f5addac7ddd9bd15;Subject="";URI=spiffe://cluster.local/ns/bar-dev/sa/default
x-b3-traceid: 5b588c39a8db8f697d98554f664c2572
x-b3-spanid: 0bcd87679e8f2804
x-b3-parentspanid: 7d98554f664c2572
x-b3-sampled: 0
% [Thread httpd@4242_2] Field: [host('pengines-service'),port(4242),user_agent('Go-http-client/1.1'),content_length(598),content_type('application/json; charset=utf-8'),accept_encoding(gzip),x_forwarded_proto(http),x_request_id('62b42b45-2175-4561-90a0-b2e7cbbca38f'),x_envoy_attempt_count('1'),x_forwarded_client_cert('By=spiffe://cluster.local/ns/bar-dev/sa/default;Hash=b11c61e4809804d88820e751dfa9e748722cd16c697ced90f5addac7ddd9bd15;Subject="";URI=spiffe://cluster.local/ns/bar-dev/sa/default'),x_b3_traceid('5b588c39a8db8f697d98554f664c2572'),x_b3_spanid('0bcd87679e8f2804'),x_b3_parentspanid('7d98554f664c2572'),x_b3_sampled('0')]
% [Thread httpd@4242_2] [234] post /pengine/create ...
% [Thread 13] http_open: Connecting to 'foo-service.bar-dev.svc.cluster.local':8000 ...
% [Thread 13] ok <stream>(0x562a20fb8f00) ---> <stream>(0x562a20ed6600)
% [Thread 13] > GET /internal/script/ HTTP/1.1
% [Thread 13] > Host: foo-service.bar-dev.svc.cluster.local:8000
% [Thread 13] > User-Agent: SWI-Prolog
% [Thread 13] > Connection: close
% [Thread 13] HTTP/1.1 200 OK
% [Thread 13] content-type: application/x-prolog; charset=utf-8
% [Thread 13] etag: ACME_1650984427546036306
% [Thread 13] last-modified: Tue, 26 Apr 2022 23:47:07 GMT
% [Thread 13] date: Tue, 26 Apr 2022 14:53:29 GMT
% [Thread 13] x-envoy-upstream-service-time: 1
% [Thread 13] server: envoy
% [Thread 13] connection: close
% [Thread 13] transfer-encoding: chunked
% [Thread 13] Field: [content_type('application/x-prolog; charset=utf-8')]
% [Thread 13] Field: [etag('ACME_1650984427546036306')]
% [Thread 13] Field: [last_modified('Tue, 26 Apr 2022 23:47:07 GMT')]
% [Thread 13] Field: [date('Tue, 26 Apr 2022 14:53:29 GMT')]
% [Thread 13] Field: [x_envoy_upstream_service_time('1')]
% [Thread 13] Field: [server(envoy)]
% [Thread 13] Field: [connection(close)]
% [Thread 13] Field: [transfer_encoding(chunked)]
[Thread 13] T [24] Call: read_string(<stream>(0x562a20ed6000), _626, _628)
% [Thread 13] Reply to <message_queue>(0x562a210533b0): error('b5b507ab-cca7-4492-b320-f9f3d20bd6a3',error(io_error(read,'<stream>(0x562a20ed6000)'),context(system:read_string/3,'Success')))
% [Thread 13] Sending error('b5b507ab-cca7-4492-b320-f9f3d20bd6a3',error(io_error(read,'<stream>(0x562a20ed6000)'),context(system:read_string/3,'Success'))), timout: 300
% [Thread 13] Reply to <message_queue>(0x562a210533b0): destroy('b5b507ab-cca7-4492-b320-f9f3d20bd6a3')
% [Thread 11] Got error('b5b507ab-cca7-4492-b320-f9f3d20bd6a3',error(io_error(read,'<stream>(0x562a20ed6000)'),context(system:read_string/3,'Success'))) from <message_queue>(0x562a210533b0)
% [Thread 13] Sending destroy('b5b507ab-cca7-4492-b320-f9f3d20bd6a3'), timout: 300
% [Thread 11] DESTROY? error('b5b507ab-cca7-4492-b320-f9f3d20bd6a3',error(io_error(read,'<stream>(0x562a20ed6000)'),context(system:read_string/3,'Success')))
% [Thread 11] Field: [cache_control('no-cache, no-store, must-revalidate'),pragma('no-cache'),expires('0'),content_type('text/x-prolog; charset=UTF-8')]
% [Thread 11] [234] 200 OK (0.000 seconds; 142 bytes)
% [Thread httpd@4242_3] Closing connection from ip(127,0,0,6)
I can see custom headers parsed properly such as the ETag, so it doesn’t look like a network issue.
I’m quite perplexed by this error. I think that 'Success'
corresponds to C’s strerror(0)
but I can’t imagine why that would be getting returned. Race condition? I noticed this change recently in swipl-devel, could it possibly be related?
I’m having trouble reproducing this easily locally so I am still searching for a minimal reproduction case. Any ideas for extra debug flags or traces I could add to help isolate this?