Hi there,
I am in the middle of a complex upgrade for a customer, essentially updating 3 clusters from 3.0.12 to a minimum of 3.4.
I have hit an issue with log output that others have also hit, but having tried all the suggestions I still have no working solution… any help would be really appreciated.
On a small dev cluster (unsharded, Primary/Secondary/Arbiter, existing nodes on 3.0.12), I have added a 3.2.22 data node as a secondary. It is logging a lot of the following, even though there are very few connections and little activity:
==================================================================
2020-02-12T20:42:12.953Z W - [conn606] DBException thrown :: caused by :: 9001 socket exception [CLOSED] for 172.31.63.194:49488
2020-02-12T20:42:12.958Z I - [conn606]
0x155c5e2 0x155c40d 0x14d8a50 0x150e006 0x150e71b 0x150e731 0x150e78d 0x14ff75d 0x150242e 0x7fdc06e0a6db 0x7fdc06b3388f
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"400000","o":"115C5E2","s":"_ZN5mongo15printStackTraceERSo"},{"b":"400000","o":"115C40D","s":"_ZN5mongo15printStackTraceEv"},{"b":"400000","o":"10D8A50","s":"_ZN5mongo11DBException13traceIfNeededERKS0_"},{"b":"400000","o":"110E006","s":"_ZN5mongo6Socket15handleRecvErrorEii"},{"b":"400000","o":"110E71B","s":"_ZN5mongo6Socket5_recvEPci"},{"b":"400000","o":"110E731","s":"_ZN5mongo6Socket11unsafe_recvEPci"},{"b":"400000","o":"110E78D","s":"_ZN5mongo6Socket4recvEPci"},{"b":"400000","o":"10FF75D","s":"_ZN5mongo13MessagingPort4recvERNS_7MessageE"},{"b":"400000","o":"110242E","s":"_ZN5mongo17PortMessageServer17handleIncomingMsgEPv"},{"b":"7FDC06E03000","o":"76DB"},{"b":"7FDC06A12000","o":"12188F","s":"clone"}],"processInfo":{ "mongodbVersion" : "3.2.22", "gitVersion" : "105acca0d443f9a47c1a5bd608fd7133840a58dd", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "4.15.0-1057-aws", "version" : "#59-Ubuntu SMP Wed Dec 4 10:02:00 UTC 2019", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "C2070FF92CF0E7C7AF25D84027F691037262CEA2" }, { "b" : "7FFD040E5000", "path" : "linux-vdso.so.1", "elfType" : 3, "buildId" : "D05895E5E385880D40A2B0A20CF7D8C9B06423D6" }, { "b" : "7FDC07E27000", "path" : "/usr/lib/x86_64-linux-gnu/libssl.so.1.0.0", "elfType" : 3, "buildId" : "0D054641049B9747C05D030262295DFDFDD3055D" }, { "b" : "7FDC079E4000", "path" : "/usr/lib/x86_64-linux-gnu/libcrypto.so.1.0.0", "elfType" : 3, "buildId" : "9C228817BA6E0730F4FCCFAC6E033BD1E0C5620A" }, { "b" : "7FDC077DC000", "path" : "/lib/x86_64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : "9826FBDF57ED7D6965131074CB3C08B1009C1CD8" }, { "b" : "7FDC075D8000", "path" : "/lib/x86_64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : "25AD56E902E23B490A9CCDB08A9744D89CB95BCC" }, { "b" : "7FDC0723A000", "path" : "/lib/x86_64-linux-gnu/libm.so.6", "elfType" : 3, "buildId" : "A33761AB8FB485311B3C85BF4253099D7CABE653" }, { "b" : "7FDC07022000", "path" : 
"/lib/x86_64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : "41BDC55C07D5E5B1D8AB38E2C19B1F535855E084" }, { "b" : "7FDC06E03000", "path" : "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : "28C6AADE70B2D40D1F0F3D0A1A0CAD1AB816448F" }, { "b" : "7FDC06A12000", "path" : "/lib/x86_64-linux-gnu/libc.so.6", "elfType" : 3, "buildId" : "B417C0BA7CC5CF06D1D1BED6652CEDB9253C60D0" }, { "b" : "7FDC0808F000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "64DF1B961228382FE18684249ED800AB1DCEAAD4" } ] }}
mongod(_ZN5mongo15printStackTraceERSo+0x32) [0x155c5e2]
mongod(_ZN5mongo15printStackTraceEv+0xDD) [0x155c40d]
mongod(_ZN5mongo11DBException13traceIfNeededERKS0_+0x140) [0x14d8a50]
mongod(_ZN5mongo6Socket15handleRecvErrorEii+0xEE6) [0x150e006]
mongod(_ZN5mongo6Socket5_recvEPci+0x5B) [0x150e71b]
mongod(_ZN5mongo6Socket11unsafe_recvEPci+0x11) [0x150e731]
mongod(_ZN5mongo6Socket4recvEPci+0x3D) [0x150e78d]
mongod(_ZN5mongo13MessagingPort4recvERNS_7MessageE+0x9D) [0x14ff75d]
mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x2EE) [0x150242e]
libpthread.so.0(+0x76DB) [0x7fdc06e0a6db]
libc.so.6(clone+0x3F) [0x7fdc06b3388f]
==============================================================================
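To see which remote address these closed connections come from, the warning lines can be tallied per IP. The following is only a minimal sketch: the regular expression is written against the single warning line shown in the excerpt above, and the function name `count_socket_exceptions` is hypothetical, not part of any MongoDB tooling.

```python
import re
from collections import Counter

# Pattern modelled on the one warning line in the log excerpt above;
# other mongod versions may format this line differently (assumption).
SOCKET_EXC = re.compile(
    r"^(?P<ts>\S+) W - \[(?P<conn>conn\d+)\] DBException thrown :: caused by :: "
    r"9001 socket exception \[(?P<state>\w+)\] for (?P<ip>[\d.]+):(?P<port>\d+)"
)


def count_socket_exceptions(lines):
    """Return a Counter mapping remote IP -> number of 9001 socket exceptions."""
    counts = Counter()
    for line in lines:
        match = SOCKET_EXC.match(line.strip())
        if match:
            counts[match.group("ip")] += 1
    return counts


# Two lines copied from the excerpt: one warning, one non-matching info line.
sample = [
    "2020-02-12T20:42:12.953Z W - [conn606] DBException thrown :: caused by :: "
    "9001 socket exception [CLOSED] for 172.31.63.194:49488",
    "2020-02-12T20:42:12.958Z I - [conn606]",
]
per_ip = count_socket_exceptions(sample)
```

Passing an open log file instead of `sample` would give the same tally for the whole log. If nearly all entries point at one address, that host (for example another replica set member or a monitoring agent) is the one opening and dropping the connections.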
I have:
- ulimit set to 64000
- AWS EC2 instance … running Ubuntu 18.04
- Tried setting sysctl -w net.ipv4.tcp_keepalive_time=120 - No difference
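Since `sysctl -w` only changes the running kernel value and does not persist across a reboot, it may be worth confirming that the keepalive setting is still in effect on the node. A minimal sketch, assuming a Linux host that exposes the value under `/proc/sys`; the helper name `read_keepalive_time` is hypothetical:

```python
from pathlib import Path

# Standard Linux location for this sysctl key (assumption: procfs is mounted).
KEEPALIVE_PATH = "/proc/sys/net/ipv4/tcp_keepalive_time"


def read_keepalive_time(path=KEEPALIVE_PATH):
    """Return the TCP keepalive time in seconds as currently seen by the kernel."""
    return int(Path(path).read_text().strip())


def keepalive_matches(expected_seconds, path=KEEPALIVE_PATH):
    """True if the kernel value equals the value that was set with sysctl -w."""
    return read_keepalive_time(path) == expected_seconds
```

Calling `keepalive_matches(120)` on the 3.2.22 node would show whether the value set above is the one the kernel is actually using.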
Any other suggestions gratefully received. I am sort of at a loss to know what is causing this… is it a version incompatibility?
Many thanks in advance