[Magma] - Opening bucket during recovery takes a long time
Activity

Steve Watanabe October 18, 2021 at 10:06 PM
Is there a way for ns_server to determine, via stats, that the bucket is ready to have DCP connections created successfully?

Ben Huddleston October 18, 2021 at 9:57 PM
It is indeed. https://github.com/couchbase/kv_engine/blob/master/engines/ep/src/warmup.h#L112-L219 gives a good description of the various phases of warmup. After the PopulateVBucketMap phase, vBucket stats should be retrievable. However, it's not until the Done state that DCP consumers are createable. The bulk of warmup (certainly for couchstore buckets) is in the data-loading phases that run after PopulateVBucketMap. The changes I mentioned earlier didn't change this, though; they changed it so that after we enable traffic (mutations etc.) we also have to wait for all warmup threads to finish before we accept a DCP consumer (to prevent a race condition).
(Side note - the comment saying the isFinishedLoading() function allows creation of DcpConsumers is out of date; it should be isComplete() now - will update.)
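(For illustration only: a minimal sketch of waiting on the warmup stat group before opening a DCP consumer. The get_stats callable and the exact stat key/value names are assumptions about ep-engine's warmup stats, not the actual ns_server logic.)
```python
import time

def wait_for_warmup_complete(get_stats, timeout_s=600, poll_s=1.0):
    """Poll the bucket's warmup stat group until warmup is fully done.

    `get_stats` is a caller-supplied callable (e.g. wrapping a memcached
    stats client) that returns the "warmup" stat group as a dict for the
    target bucket. Per warmup.h, vbucket stats become visible after the
    PopulateVBucketMap phase, but DCP consumers are only accepted once the
    final warmup state is reached, so we wait for that instead.
    """
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        stats = get_stats("warmup")
        # Stat key/value names here are assumptions, not a documented contract.
        if stats.get("ep_warmup_thread") == "complete":
            return True
        time.sleep(poll_s)
    return False
```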

Steve Watanabe October 18, 2021 at 9:45 PM
Node .105 determines the state of "default" by obtaining stats (command 0x10) from memcached on node .111. We do this to "poll" the state of the bucket on the other node. At some point memcached returns the state of each vbucket in the stats. e.g. here's some debug logs.
and ns_server considers this to mean the bucket is ready. Is it possible memcached is returning the vbucket stats before it has completed all its warmup activities (which means it'll return ETMPFAIL for DCP connections)?
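(If memcached does expose per-vbucket state before warmup is fully done, a readiness check based only on those stats would be satisfied too early. A minimal illustrative sketch of such a check, in Python rather than the actual Erlang ns_server code:)
```python
def bucket_looks_ready(vbucket_states):
    """Readiness check in the spirit of the stats polling described above:
    the bucket is treated as ready once every vbucket reports a state.

    If memcached publishes these states after PopulateVBucketMap but before
    warmup is fully complete, this returns True while DCP consumer creation
    would still fail with ETMPFAIL - which matches the observed race.
    """
    return bool(vbucket_states) and all(
        state in ("active", "replica", "pending", "dead")
        for state in vbucket_states.values()
    )
```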

Steve Watanabe October 15, 2021 at 9:39 PM (Edited)
Looking at the debug.log for node .111 we see the bucket loaded, and we start a DCP connection which errors out because (per Ben) the bucket is not entirely warmed up.
And then the bucket does get marked as warmed.
So it seems we're trying to establish the DCP connection prematurely. On node .105 we see the bucket reported as ready, and so DCP connections are set up (and fail). It appears node .105's determination that the bucket is ready is wrong.

Ben Huddleston October 15, 2021 at 7:39 AM
http://review.couchbase.org/c/kv_engine/+/162170 feels like a likely cause for this.
Looking at the logs picked out, "Engine warmup is complete" is a bit of a rubbish log message; it gets logged per shard, I think, rather than per bucket, and with the changes referenced above we now require all of the warmup tasks to actually finish before the bucket will allow a DCP consumer to be created. We actually want to look at the last occurrence of that message (we then see a DcpConsumer successfully open after that).
Hopefully this is enough info for you to go on. When exactly is ns_server expecting to be able to open a DcpConsumer?
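(Illustration: since the message appears once per shard/warmup task, only the last occurrence in memcached.log is a useful signal. A throwaway sketch for pulling that out; the bucket-name filter is an assumption about the log format.)
```python
def last_warmup_complete_line(log_path, bucket=None):
    """Return the final 'Engine warmup is complete' line in memcached.log.

    Earlier occurrences can belong to individual shards/warmup tasks, so
    only the last one is a reasonable signal that the bucket will accept
    a DCP consumer. The optional bucket filter assumes the bucket name
    appears on the same line, which may not hold for every log format.
    """
    last = None
    with open(log_path, errors="replace") as f:
        for line in f:
            if "Engine warmup is complete" in line:
                if bucket is None or bucket in line:
                    last = line.rstrip("\n")
    return last
```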

Steve Watanabe October 14, 2021 at 7:36 PM
This appears to be a different cause. Looking at node 107.105, the janitor_agent has died while attempting to establish a DCP connection to node 107.111, which fails with an ETMPFAIL.
Looking at the memcached.log on node 107.111 I don't see what may be leading to the error (with my untrained memcached-eye).
Assigning this ticket to kv for further triage. Feel free to return the ticket to me afterwards.

Steve Watanabe September 20, 2021 at 9:48 PM
Based on the updates above asking "We will address this post Neo. Please let us know if that is OK." and the response "yes post Neo is fine.", I'm changing the target to Morpheus.

Meni Hillel September 18, 2021 at 4:40 AM
For Neo, it seems that increasing the timeout is the best we can do. What would be an acceptable timeout under extreme conditions? More importantly, we should probably document what one may need to do if the timeout hits. Is there a way to estimate the duration of a given request?

Shivani Gupta September 17, 2021 at 6:19 PM
Ok, that's good. So from the end-user perspective it is not bucket creation that can take long and time out, which is good because bucket creation will be frequent on slower developer-type VMs where people are just trying it out.
I suggest we change the title of the ticket to 'Opening a Magma bucket during recovery takes a long time'. This issue is about opening an existing bucket, not about creating a new bucket.

Sarath Lakshman September 17, 2021 at 6:15 PM (Edited)
Bucket creation should not take time. It will only take time in the case of a bucket with existing data. For an empty bucket, it won't create even a single sstable until data is inserted. The ns_server/kv team can clarify cases other than delta recovery.

Shivani Gupta September 17, 2021 at 6:09 PM (Edited)
So can the creation of a new bucket also take this long? Actually, I am not sure why the ticket header is 'Decrease creation time of bucket'.
Details
Assignee: Balakumaran Gopal
Reporter: Balakumaran Gopal
Is this a Regression?: No
Triage: Triaged
Operating System: Centos 64-bit
Issue Impact: external
Story Points: 1
Priority: Critical
Script to Repro
Steps to Repro
1. Create a 5 node cluster
2021-07-13 15:05:12,207 | test | INFO | pool-3-thread-6 | [table_view:display:72] Rebalance Overview
Nodes          | Services | Version               | CPU           | Status
---------------+----------+-----------------------+---------------+--------------
172.23.98.196  | kv       | 7.1.0-1066-enterprise | 1.73779708663 | Cluster node
172.23.98.195  | None     |                       |               | <--- IN ---
172.23.120.206 | None     |                       |               | <--- IN ---
172.23.104.186 | None     |                       |               | <--- IN ---
172.23.120.201 | None     |                       |               | <--- IN ---
2. Create bucket/scopes/collections/data
2021-07-13 15:10:04,253 | test | INFO | MainThread | [table_view:display:72] Bucket statistics
Bucket          : r4dK9wYeZBZ%iqABnPUVxh6g5r2Sj4CAH8N89or9vhAsV3Sv3AK-22-897000
Type            : couchbase
Storage Backend : magma
Replicas        : 2
Durability      : none
TTL             : 0
Items           : 3000000
RAM Quota       : 10485760000
RAM Used        : 2407566136
Disk Used       : 1992997674
3. Do a graceful failover/delta recovery/rebalance of node 172.23.104.186.
4. Do a graceful failover/delta recovery/rebalance of another node 172.23.120.201.
This rebalance fails as shown below.
Some interesting things to note.
1. This is fairly easily reproducible, but not on every single run; we hit it at least 3 out of 10 times, based on the 10-15 runs I have had, each run having 15-20 tests.
2. This is only seen on the second graceful failover/delta recovery/rebalance (step 4); when the first graceful failover/delta recovery/rebalance (step 3) fails, it fails like .
cbcollect_info attached.
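(For reference, steps 3-4 can be driven against the cluster REST API roughly as sketched below. The endpoint paths, credentials, and node names are assumptions for illustration, not the actual test driver from the run above.)
```python
import requests

BASE = "http://172.23.98.196:8091"      # any cluster node (assumed)
AUTH = ("Administrator", "password")    # placeholder credentials

def graceful_failover_delta_recover_rebalance(otp_node, known_nodes):
    """Steps 3/4 of the repro: graceful failover, delta recovery, rebalance.

    otp_node is the ns_1@<ip> name of the node being failed over;
    known_nodes is the comma-separated list of all otp nodes. The
    controller endpoint paths below are assumptions about the standard
    ns_server REST API, not taken from the ticket.
    """
    # 1. Graceful failover of the node
    requests.post(f"{BASE}/controller/startGracefulFailover",
                  data={"otpNode": otp_node}, auth=AUTH).raise_for_status()
    # ... wait for the failover task to finish (omitted) ...

    # 2. Mark the failed-over node for delta recovery
    requests.post(f"{BASE}/controller/setRecoveryType",
                  data={"otpNode": otp_node, "recoveryType": "delta"},
                  auth=AUTH).raise_for_status()

    # 3. Rebalance the node back in; with a Magma bucket holding data this
    #    is where opening the bucket during recovery can exceed the timeout.
    requests.post(f"{BASE}/controller/rebalance",
                  data={"knownNodes": known_nodes, "ejectedNodes": ""},
                  auth=AUTH).raise_for_status()
```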