Discussion: Repeated semop calls
Anand Kumar, Karthik
2014-06-26 20:03:48 UTC
Hi,

We run postgres 9.3.3 on CentOS 6.3, kernel 2.6.32-431.3.1. Every once in a while, we see postgres processes spinning on semop:

Here is the output of strace on a delete process:

***@site-db01a:~ # strace -p 53744
Process 53744 attached - interrupt to quit
semop(21692498, {{6, 1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(20480045, {{15, 1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(20185124, {{3, 1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(20185124, {{3, 1, 0}}, 1) = 0
semop(20185124, {{3, 1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21135425, {{14, 1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21692498, {{6, 1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21266501, {{4, 1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21692498, {{6, 1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(20250662, {{0, 1, 0}}, 1) = 0
semop(21135425, {{14, 1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(20250662, {{0, 1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21266501, {{6, 1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(20185124, {{3, 1, 0}}, 1) = 0
semop(20185124, {{3, 1, 0}}, 1) = 0
semop(20250662, {{0, 1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21626960, {{7, 1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(20185124, {{3, 1, 0}}, 1) = 0
semop(21266501, {{4, 1, 0}}, 1) = 0
semop(21266501, {{4, 1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(20119586, {{5, 1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(20185124, {{3, 1, 0}}, 1) = 0
semop(21200963, {{5, 1, 0}}, 1) = 0
semop(21135425, {{14, 1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21233732, {{9, 1, 0}}, 1) = 0
semop(21266501, {{7, 1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(20185124, {{3, 1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21266501, {{4, 1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(20185124, {{3, 1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21135425, {{14, 1, 0}}, 1) = 0
semop(20185124, {{3, 1, 0}}, 1) = 0
semop(21233732, {{9, 1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(20774966, {{7, 1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(20185124, {{3, 1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21233732, {{9, 1, 0}}, 1) = 0
semop(20185124, {{3, 1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(20185124, {{3, 1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
…
(output truncated)
…
semop(21233732, {{9, 1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21168194, {{10, 1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21168194, {{10, 1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21168194, {{10, 1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21561422, {{12, -1, 0}}, 1) = 0
semop(21168194, {{10, 1, 0}}, 1) = 0
sendto(3, "<134>Jun 26 12:23:30 postgres[53"..., 495, MSG_NOSIGNAL, NULL, 0) = 495
sendto(3, "<134>Jun 26 12:23:30 postgres[53"..., 258, MSG_NOSIGNAL, NULL, 0) = 258
sendto(10, "\2\0\0\0000\2\0\0001@\0\0\5\0\0\0\17\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 560, 0, NULL, 0) = 560
sendto(11, "1\0\0\0\0042\0\0\0\4n\0\0\0\4C\0\0\0\rDELETE 0\0Z\0\0"..., 35, 0, NULL, 0) = 35
recvfrom(11, "P\0\0\1\220\0delete from sym_data where"..., 8192, 0, NULL, NULL) = 576
lseek(27, 0, SEEK_END) = 499105792
lseek(28, 0, SEEK_END) = 19578880
lseek(29, 0, SEEK_END) = 4120576
lseek(30, 0, SEEK_END) = 20733952
lseek(31, 0, SEEK_END) = 20070400
lseek(32, 0, SEEK_END) = 491520
lseek(14, 0, SEEK_END) = 115073024
lseek(21, 0, SEEK_END) = 158318592
lseek(49, 0, SEEK_END) = 45277184
lseek(15, 0, SEEK_END) = 252067840
lseek(16, 0, SEEK_END) = 22503424
lseek(44, 0, SEEK_END) = 35635200
lseek(45, 0, SEEK_END) = 64151552
lseek(14, 0, SEEK_END) = 115073024
lseek(21, 0, SEEK_END) = 158318592
lseek(49, 0, SEEK_END) = 45277184
semop(21561422, {{12, -1, 0}}, 1) = 0
lseek(27, 0, SEEK_END) = 499105792
sendto(11, "1\0\0\0\0042\0\0\0\4n\0\0\0\4C\0\0\0\rDELETE 0\0Z\0\0"..., 35, 0, NULL, 0) = 35
recvfrom(11, "P\0\0\1\220\0delete from sym_data where"..., 8192, 0, NULL, NULL) = 576
lseek(27, 0, SEEK_END) = 499105792
lseek(28, 0, SEEK_END) = 19578880
lseek(29, 0, SEEK_END) = 4120576
lseek(30, 0, SEEK_END) = 20733952
lseek(31, 0, SEEK_END) = 20070400
lseek(32, 0, SEEK_END) = 491520
lseek(14, 0, SEEK_END) = 115073024
lseek(21, 0, SEEK_END) = 158318592
lseek(49, 0, SEEK_END) = 45277184
lseek(15, 0, SEEK_END) = 252067840
lseek(16, 0, SEEK_END) = 22503424
lseek(44, 0, SEEK_END) = 35635200
lseek(45, 0, SEEK_END) = 64151552
lseek(14, 0, SEEK_END) = 115073024
lseek(21, 0, SEEK_END) = 158318592
lseek(49, 0, SEEK_END) = 45277184
lseek(27, 0, SEEK_END) = 499105792


Running perf on the process showed this:


Samples: 33K of event 'cycles', Event count (approx.): 20693251070
26.16% postmaster postgres [.] 0x0000000000188450
21.13% postmaster postgres [.] hash_search_with_hash_value
10.47% postmaster postgres [.] heap_page_prune_opt
4.21% postmaster postgres [.] LWLockAcquire
3.71% postmaster postgres [.] slot_getattr
1.97% postmaster postgres [.] HeapTupleSatisfiesMVCC
1.82% postmaster postgres [.] LWLockRelease
1.48% postmaster postgres [.] ExecProject
1.19% postmaster postgres [.] bpcharne
0.98% postmaster postgres [.] MemoryContextReset
0.90% postmaster postgres [.] ExecScan
0.89% postmaster postgres [.] pglz_decompress
0.79% postmaster postgres [.] hash_any
0.77% postmaster postgres [.] LookupTupleHashEntry
0.66% postmaster postgres [.] heap_form_minimal_tuple
0.65% postmaster postgres [.] ExecProcNode
0.63% postmaster [kernel.kallsyms] [k] _spin_lock
0.58% postmaster postgres [.] heap_fill_tuple
0.52% postmaster postgres [.] ExecClearTuple
0.46% postmaster libc-2.12.so [.] __memcmp_sse4_1
0.39% postmaster postgres [.] SearchCatCache
0.38% postmaster postgres [.] heap_compute_data_size
0.35% postmaster postgres [.] ExecQual
0.35% postmaster postgres [.] ReadBufferExtended
0.34% postmaster postgres [.] palloc0
0.34% postmaster libc-2.12.so [.] memcpy
0.31% postmaster postgres [.] LockBuffer
0.28% postmaster postgres [.] deconstruct_array
0.27% postmaster postgres [.] ResourceOwnerForgetBuffer
0.25% postmaster [kernel.kallsyms] [k] sys_semtimedop
0.25% postmaster postgres [.] base_yyparse
0.25% postmaster postgres [.] hash_uint32
0.23% postmaster postgres [.] FunctionCall1Coll
0.23% postmaster postgres [.] heap_getnext
0.22% postmaster postgres [.] ReleaseBuffer
0.22% postmaster [kernel.kallsyms] [k] ipc_has_perm
0.21% postmaster postgres [.] check_stack_depth
0.21% postmaster postgres [.] hash_search
0.20% postmaster [kernel.kallsyms] [k] native_write_msr_safe
0.20% postmaster postgres [.] GetSnapshotData
0.20% postmaster libc-2.12.so [.] _int_malloc
0.19% postmaster [kernel.kallsyms] [k] schedule
0.19% postmaster postgres [.] ExecStoreTuple
0.18% postmaster [kernel.kallsyms] [k] update_curr
0.18% postmaster postgres [.] pgstat_init_function_usage
0.18% postmaster libc-2.12.so [.] __strlen_sse42
0.17% postmaster postgres [.] copyObject
0.17% postmaster postgres [.] s_lock
0.17% postmaster postgres [.] MemoryContextAllocZeroAligned
0.17% postmaster postgres [.] palloc
0.17% postmaster [kernel.kallsyms] [k] avc_has_perm_noaudit
0.16% postmaster postgres [.] core_yylex
0.15% postmaster postgres [.] pgstat_end_function_usage
0.15% postmaster libc-2.12.so [.] __strcmp_sse42
0.15% postmaster [kernel.kallsyms] [k] task_rq_lock
0.14% postmaster postgres [.] expression_tree_walker
0.14% postmaster pg_stat_statements.so [.] 0x00000000000024f3
0.14% postmaster postgres [.] FunctionCall2Coll
0.12% postmaster postgres [.] CheckForSerializableConflictOut
0.12% postmaster [kernel.kallsyms] [k] select_task_rq_fair
0.12% postmaster [kernel.kallsyms] [k] __audit_syscall_exit
0.11% postmaster postgres [.] nocachegetattr
0.11% postmaster postgres [.] pg_detoast_datum_packed
0.10% postmaster [kernel.kallsyms] [k] try_to_wake_up
0.10% postmaster libc-2.12.so [.] _int_free
0.10% postmaster postgres [.] ResourceOwnerEnlargeBuffers
0.10% postmaster postgres [.] slot_attisnull
0.09% postmaster postgres [.] ExecStoreVirtualTuple
0.09% postmaster postgres [.] slot_getsomeattrs
0.08% postmaster [kernel.kallsyms] [k] try_atomic_semop
0.08% postmaster [kernel.kallsyms] [k] tcp_ack
0.08% postmaster postgres [.] get_hash_value
0.08% postmaster postgres [.] BufTableLookup
0.08% postmaster libc-2.12.so [.] __memset_sse2
0.08% postmaster [kernel.kallsyms] [k] dequeue_entity
0.08% postmaster postgres [.] ScanKeywordLookup
0.08% postmaster [kernel.kallsyms] [k] kmem_cache_free
0.08% postmaster [kernel.kallsyms] [k] tcp_recvmsg
0.08% postmaster [kernel.kallsyms] [k] _spin_lock_irq
0.07% postmaster libc-2.12.so [.] malloc
0.07% postmaster [kernel.kallsyms] [k] idr_find
0.07% postmaster [tg3] [k] tg3_poll_work
0.07% postmaster [kernel.kallsyms] [k] enqueue_entity
0.07% postmaster postgres [.] PostgresMain
0.07% postmaster [kernel.kallsyms] [k] unroll_tree_refs
0.07% postmaster postgres [.] ExecCopySlotMinimalTuple
0.07% postmaster [kernel.kallsyms] [k] kfree
0.06% postmaster postgres [.] hashint8
0.06% postmaster [kernel.kallsyms] [k] __do_softirq
0.06% postmaster [kernel.kallsyms] [k] dequeue_task_fair
0.06% postmaster postgres [.] DirectFunctionCall1Coll
0.06% postmaster [kernel.kallsyms] [k] tcp_rcv_established
0.06% postmaster [kernel.kallsyms] [k] update_queue
0.06% postmaster postgres [.] pg_encoding_mbcliplen
0.06% postmaster [kernel.kallsyms] [k] resched_task
0.06% postmaster [kernel.kallsyms] [k] copy_user_generic_string
0.06% postmaster postgres [.] LockAcquireExtended
0.06% postmaster [kernel.kallsyms] [k] find_busiest_group
0.06% postmaster postgres [.] ResourceOwnerRememberBuffer
0.05% postmaster [kernel.kallsyms] [k] enqueue_task
0.05% postmaster postgres [.] mcv_selectivity
0.05% postmaster [kernel.kallsyms] [k] thread_return
0.05% postmaster [kernel.kallsyms] [k] pid_vnr
0.05% postmaster [kernel.kallsyms] [k] audit_syscall_entry
0.05% postmaster [kernel.kallsyms] [k] __local_bh_enable
0.05% postmaster [kernel.kallsyms] [k] rcu_process_gp_end


I ran ipcs and was able to see the semaphore set and get its id:


***@site-db01a:~ # ipcs -c -s | grep 21561422

21561422 600 postgres postgres postgres postgres


But I’m not sure where to go next. Does this indicate resource contention, and if so, is there any way to tell what the contention is on?
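
For reference, the set itself can be inspected further with something like this (just a sketch; the semid is the one from the strace above):

# per-semaphore values, counts of waiting processes, and last-op pids for this set
ipcs -s -i 21561422
# system-wide semaphore limits (SEMMNI/SEMMSL/SEMMNS/SEMOPM)
ipcs -ls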


~Karthik
hubert depesz lubaczewski
2014-06-27 07:15:23 UTC
On Thu, Jun 26, 2014 at 10:03 PM, Anand Kumar, Karthik <
Post by Anand Kumar, Karthik
We run postgres 9.3.3 on Centos 6.3, kernel 2.6.32-431.3.1. Every once in
It might be a long shot, but when we had problems with lots of backends
"sitting" in semop, it was solved by:

1. disabling zone_reclaim (echo 0 > /proc/sys/vm/zone_reclaim_mode)
2. disabling transparent hugepage support - this has various names on
different kernel/distributions, but "find /sys | grep -i
transparent.*hugepage.*enable" will find it, and then just echo never there.
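
A rough sketch of both steps, assuming a RHEL/CentOS 6-style layout (the THP path differs between kernels and distributions, hence the find above):

# 1. turn off NUMA zone reclaim (persist it with vm.zone_reclaim_mode = 0 in /etc/sysctl.conf)
echo 0 > /proc/sys/vm/zone_reclaim_mode
# 2. turn off transparent hugepages; on RHEL/CentOS 6 the path is usually
#    /sys/kernel/mm/redhat_transparent_hugepage/enabled, while mainline kernels
#    use /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled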

depesz
Anand Kumar, Karthik
2014-06-27 20:05:36 UTC
Post by hubert depesz lubaczewski
1. disabling zone_reclaim (echo 0 > /proc/sys/vm/zone_reclaim_mode)
2. disabling transparent hugepage support - this has various names on different kernel/distributions, but "find /sys | grep -i transparent.*hugepage.*enable" will find it, and then just echo never there.
Thank you, yes, we have both zone_reclaim_mode and THP disabled.
Jeff Janes
2014-06-27 17:51:20 UTC
On Thu, Jun 26, 2014 at 1:03 PM, Anand Kumar, Karthik
Post by Anand Kumar, Karthik
Hi,
We run postgres 9.3.3 on Centos 6.3, kernel 2.6.32-431.3.1. Every once in a
Are there a lot of tuples in the table that have been inserted or
updated by still-open transactions? Such tuples can cause contention
when other processes need to check them for visibility.
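
A quick way to check for that (just a sketch) is to look for long-open transactions in pg_stat_activity while the semop spinning is happening:

# show the oldest still-open transactions (pg_stat_activity columns as of 9.2+)
psql -c "SELECT pid, state, now() - xact_start AS xact_age, left(query, 60) AS query
         FROM pg_stat_activity
         WHERE xact_start IS NOT NULL
         ORDER BY xact_start LIMIT 10;"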

You might want to upgrade and see if that fixes it, as there was a
partial fix for this in 9.3.4 (it fixed the problem where it occurred
in the planning done in the BIND phase, when the unresolved tuples were
at the ends of an index).
Post by Anand Kumar, Karthik
Samples: 33K of event 'cycles', Event count (approx.): 20693251070
26.16% postmaster postgres [.] 0x0000000000188450
21.13% postmaster postgres [.] hash_search_with_hash_value
10.47% postmaster postgres [.] heap_page_prune_opt
4.21% postmaster postgres [.] LWLockAcquire
3.71% postmaster postgres [.] slot_getattr
1.97% postmaster postgres [.] HeapTupleSatisfiesMVCC
If it is what I am guessing at, then the top anonymous address is
probably coming from LWLockAcquire, and those are coming from
HeapTupleSatisfiesMVCC. I don't know how to verify that, though.

Cheers,

Jeff
Andres Freund
2014-06-27 18:43:53 UTC
Post by Jeff Janes
On Thu, Jun 26, 2014 at 1:03 PM, Anand Kumar, Karthik
Post by Anand Kumar, Karthik
Samples: 33K of event 'cycles', Event count (approx.): 20693251070
26.16% postmaster postgres [.] 0x0000000000188450
21.13% postmaster postgres [.] hash_search_with_hash_value
10.47% postmaster postgres [.] heap_page_prune_opt
4.21% postmaster postgres [.] LWLockAcquire
3.71% postmaster postgres [.] slot_getattr
1.97% postmaster postgres [.] HeapTupleSatisfiesMVCC
If it is what I am guessing at, then the top anonymous address is
probably coming from LWLockAcquire, and those are coming from
HeapTupleSatisfiesMVCC. I don't know how to verify that, though.
Recompile postgres with -fno-omit-frame-pointer and use perf record -g.
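
Roughly (a sketch; adjust the configure options, install prefix and PID to your build and environment):

# rebuild with frame pointers kept, so perf can walk the stack
./configure CFLAGS="-O2 -fno-omit-frame-pointer" --prefix=/usr/local/pgsql
make && make install
# then profile the spinning backend with call graphs and inspect the result
perf record -g -p 53744 -- sleep 30
perf report -g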

Greetings,

Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Anand Kumar, Karthik
2014-06-27 20:10:52 UTC
Post by Jeff Janes
Are there a lot of tuples in the table that have been inserted or
updated by still-open transactions?


Yes, there are likely to be inserts. That table is a log capture table
used by our replication software, so essentially every
update/delete/insert will have a record inserted into the table. It has no
updates. The only thing that deletes from it is the process that had all
the semop calls.

We'll look into the postgres minor upgrade and recompile, thanks.