Skip to content

lpstat hangs with stopped cupsd on Solaris #156

@l1gi

Description

@l1gi

There is an issue with cups client time out on Solaris. If I stop cupsd or use -h localhost:632 for lpstat it does not fail even if there is no daemon listening and hangs.

$ lpstat -h localhost:632

I have rebuilt the cups with debug enabled and tried again to generate more debug info. This is the interesting part of it:

T001 07:37:07.242  httpConnect2(host="localhost", port=632, addrlist=0, family=0, encryption=0, blocking=1, msec=30000, cancel=0)
T001 07:37:07.242  http_create(host="localhost", port=632, addrlist=0, family=0, encryption=0, blocking=1, mode=0)
T001 07:37:07.243  httpAddrGetList(hostname="localhost", family=AF_UNSPEC, service="632")
T001 07:37:07.243  httpReconnect2(http=701cd5790, msec=30000, cancel=0)
T001 07:37:07.243  httpAddrString(addr=701cd5558, s=7fd24f9417f0, slen=256)
T001 07:37:07.243  httpAddrString: returning "[v1.::1]"...
T001 07:37:07.243  httpReconnect2: Address [v1.::1]:632
T001 07:37:07.243  httpAddrString(addr=701cd5678, s=7fd24f9417f0, slen=256)
T001 07:37:07.243  httpAddrString: returning "127.0.0.1"...
T001 07:37:07.243  httpReconnect2: Address 127.0.0.1:632
T001 07:37:07.243  httpAddrConnect2(addrlist=701cd5550, sock=701cd5790, msec=30000, cancel=0)
T001 07:37:07.243  httpAddrString(addr=701cd5558, s=7fd24f941670, slen=256)
T001 07:37:07.243  httpAddrString: returning "[v1.::1]"...
T001 07:37:07.243  httpAddrConnect2: Trying [v1.::1]:632...
T001 07:37:07.243  httpAddrConnect2: Setting non-blocking connect()
T001 07:37:07.243  httpAddrConnect2: Finishing async connect()
T001 07:37:07.244  httpAddrConnect2: poll() returned 1 (150)
T001 07:37:07.244  pfds[0].revents=5
T001 07:37:07.244  httpReconnect2: New socket=5
T001 07:37:07.244  httpAddrString(addr=701cd5558, s=7fd24f9417f0, slen=256)
T001 07:37:07.244  httpAddrString: returning "[v1.::1]"...
T001 07:37:07.244  httpReconnect2: Connected to [v1.::1]:632...
T001 07:37:07.244  cups_enum_dests(flags=0, msec=250, cancel=0, type=0, mask=0, cb=7fca01a55cb0, user_data=7fd24f941b00)

cups.log

Even if there is nothing listening on [::1]:632, it thinks it is connected and continues trying with re-connect.

This is an interresting part of truss:

so_socket(PF_INET6, SOCK_STREAM, IPPROTO_IP, NULL, SOV_XPG4_2) = 4
setsockopt(4, SOL_SOCKET, SO_REUSEADDR, 0x7FF8F8CCB678, 4, SOV_DEFAULT) = 0
setsockopt(4, SOL_SOCKET, SO_REUSEPORT, 0x7FF8F8CCB678, 4, SOV_DEFAULT) = 0
setsockopt(4, SOL_SOCKET, SO_NOSIGPIPE, 0x7FF8F8CCB678, 4, SOV_DEFAULT) = 0
setsockopt(4, tcp, TCP_NODELAY, 0x7FF8F8CCB678, 4, SOV_DEFAULT) = 0
fcntl(4, F_SETFD, 0x00000001) = 0
fcntl(4, F_GETFL) = 2
fcntl(4, F_SETFL, FWRITE|FNONBLOCK) = 0
connect(4, 0x71FDC20E8, 32, SOV_XPG4_2) Err#150 EINPROGRESS
fcntl(4, F_SETFL, FWRITE) = 0
pollsys(0x7FF8F8CCBB30, 1, 0x7FF8F8CCB620, 0x00000000) = 1
getpeername(4, 0x7FF8F8CCBE50, 0x7FF8F8CCB67C, SOV_DEFAULT) Err#134 ENOTCONN
uname(0x7FF8F8CCB8A0) = 1
send(4, " P O S T / H T T P /".., 203, 0) Err#32 EPIPE
send(4, " P O S T / H T T P /".., 203, 0) Err#32 EPIPE
close(4) = 0

Firstly I don't understand why httpConnect2 is called with blocking=1 and the argument is not used in the underlying functions while they do unconditional nonblocking connect.

If this is an intention, than there is a missing mechanism to check the state of the socket before it is being polled. Maybe getsockopt() is what would 'workaround' the issue though I am in doubt that is the right fix here.

I assume it works accidentally on linux as the non blocking connec() operation finishes sooner that poll() is called. Still the check should be there as far as I understand.

Here is the patch which consider a socket on which getsockopt() fails as non connected on Solaris.

04-nonblock_connect.txt

Though I still don't understand some details of httpAddrConnect2() behavior this change makes Solaris lpstat -t fail in the same manner as on linux:

$ lpstat -t
scheduler is not running
no system default destination
lpstat: Bad file number
lpstat: Bad file number
lpstat: Bad file number
lpstat: Bad file number
lpstat: Bad file number
$

Could I ask you to rework the patch and integrate it in current cups, please?

Metadata

Metadata

Assignees

Labels

Type

No type

Projects

No projects

Relationships

None yet

Development

No branches or pull requests

Issue actions