Eventing function with feed boundary (from now) sometimes deploys from everything

Description

Occasionally I see an Eventing function with a "from now" feed boundary actually deploy from "everything". This issue is hard to reproduce, but it seems to happen about once in every 20 or 30 undeploy/re-deploy or pause/resume cycles, when I am importing and working with a lot of functions.

I had an Eventing source bucket with 10M docs and was working with an Eventing function, "curl_loc2", whose feed boundary is set/fixed to "from now". Yet occasionally when I resume that function it processes all 10 million docs when it should do nothing.
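
As a sanity check outside the UI, the boundary can be read back via the Eventing REST API (a sketch against this single-node setup; the Administrator:password credentials are an assumption, and dcp_stream_boundary is the setting name I believe holds the boundary):

curl -s -u Administrator:password http://127.0.0.1:8096/api/v1/functions/curl_loc2/settings

The returned settings should show "dcp_stream_boundary": "from_now" rather than "everything".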

I triggered this issue again on 3/11/2021 a bit prior to 5 PM PST (I had just two functions: one undeployed, and the other I was pausing and resuming).

The time range when this happened was sometime between 2021-03-11T16:34:58.349-08:00 and 2021-03-11T16:40:23.393-08:00.

I took a cbcollect when this happened:

ls -ltr cbcollect_MB-44931_deploying_from_wrong_feed_boundary.zip
-rw-r--r-- 1 linuxbrew linuxbrew 55111489 Mar 11 16:46 cbcollect_deploying_from_wrong_feed_boundary.zip

This has been uploaded via

curl --upload-file cbcollect_MB-44931_deploying_from_wrong_feed_boundary.zip https://s3.amazonaws.com/cb-engineering/

The couchbase-server was a 7.0 source build created on Mar 11, 2021 at 14:00 PST, running on Debian 10 (I have seen this issue for a week or so now):

Linux couch01 4.19.0-6-amd64 #1 SMP Debian 4.19.67-2+deb10u2 (2019-11-11) x86_64 GNU/Linux

The odd thing is that I only have two (2) functions, yet I have 3,073 items in my metadata Eventing scratchpad. I should only have 2048, with one active deployed function "curl_loc2" and one undeployed function "basicCurlGet_auth_bearer_mockserver". Looking at the bucket:

CREATE PRIMARY INDEX midx ON metadata

SELECT COUNT(*)
FROM metadata
WHERE assigned_worker LIKE "worker_curl_loc2%"

1024

SELECT COUNT(*)
FROM metadata
WHERE assigned_worker LIKE "worker_basicCurlGet_auth_bearer_mockserver%"

0

 

SELECT COUNT(*)
FROM metadata
WHERE ( assigned_worker NOT LIKE "worker_basicCurlGet%" )
AND ( assigned_worker NOT LIKE "worker_curl_loc2%" )

2048
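
Since the checkpoint keys embed the function name (eventing::<id>::<function>::vb::<n>), a grouping query can attribute the leftovers in one pass (a sketch, assuming every doc in the scratchpad follows that key layout):

SELECT SPLIT(META().id, "::")[2] AS function_name, COUNT(*) AS docs
FROM metadata
GROUP BY SPLIT(META().id, "::")[2]
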
So what are these other 2048 items? I seem to have 1024 with keys like:

"id": "eventing::2266648328::curl_loc2::vb::991"

{ "assigned_worker": "", "bootstrap_stream_req_done": false, "current_vb_owner": "", "dcp_stream_status": "stopped", "dcp_stream_requested": false, "last_checkpoint_time": "2021-03-11 16:34:10.61623231 -0800 PST m=+244.472199355", "last_doc_timer_feedback_seqno": 0, "last_processed_seq_no": 53056, "node_uuid": "", "node_requested_vb_stream": "", "node_uuid_requested_vb_stream": "", "ownership_history": [ { "assigned_worker": "worker_curl_loc2_34", "current_vb_owner": "127.0.0.1:8096", "operation": "bootstrap", "seq_no": 0, "timestamp": "2021-03-11 16:33:01.923556352 -0800 PST m=+175.779523414" }, { "assigned_worker": "worker_curl_loc2_34", "current_vb_owner": "127.0.0.1:8096", "operation": "stream_requested", "seq_no": 53056, "timestamp": "2021-03-11 16:33:01.938840315 -0800 PST m=+175.794807336" }, { "assigned_worker": "worker_curl_loc2_34", "current_vb_owner": "127.0.0.1:8096", "operation": "running", "seq_no": 53056, "timestamp": "2021-03-11 16:33:02.935518042 -0800 PST m=+176.791485082" }, { "assigned_worker": "worker_curl_loc2_34", "current_vb_owner": "127.0.0.1:8096", "operation": "stopped", "seq_no": 53056, "timestamp": "2021-03-11 16:34:10.615778418 -0800 PST m=+244.471745452" } ], "previous_assigned_worker": "worker_curl_loc2_34", "previous_node_uuid": "36a90a493d8526d635cc5a111a316c48", "previous_vb_owner": "127.0.0.1:8096", "vb_id": 991, "vb_uuid": 132371710274914, "worker_requested_vb_stream": "", "manifest_id": "0", "currently_processed_doc_id_timer": "2021-03-12T00:33:01Z", "last_cleaned_up_doc_id_timer_event": "2021-03-12T00:33:01Z", "last_doc_id_timer_sent_to_worker": "2021-03-12T00:33:01Z", "last_processed_doc_id_timer_event": "2021-03-12T00:33:01Z", "next_doc_id_timer_to_process": "2021-03-12T00:33:02Z", "currently_processed_cron_timer": "2021-03-12T00:33:02Z", "last_processed_cron_timer_event": "", "next_cron_timer_to_process": "2021-03-12T00:33:02Z", "version": "evt-7.0.0-0000-ee" }

And I have unexpected cruft of 1024 documents with keys like the following: 

"id": "eventing::3624865054::zzz::vb::1000"

FYI, I did have a function zzz at one time, but I deleted it and refreshed the browser; I only have the function "curl_loc2" deployed and one undeployed function "basicCurlGet_auth_bearer_mockserver" - nothing else.

{ "assigned_worker": "worker_zzz_23", "bootstrap_stream_req_done": true, "current_vb_owner": "127.0.0.1:8096", "dcp_stream_status": "running", "dcp_stream_requested": false, "last_checkpoint_time": "2021-03-11 16:29:19.732910148 -0800 PST m=+78.377885359", "last_doc_timer_feedback_seqno": 0, "last_processed_seq_no": 53101, "node_uuid": "36a90a493d8526d635cc5a111a316c48", "node_requested_vb_stream": "", "node_uuid_requested_vb_stream": "", "ownership_history": [ { "assigned_worker": "worker_zzz_23", "current_vb_owner": "127.0.0.1:8096", "operation": "bootstrap", "seq_no": 0, "timestamp": "2021-03-11 16:16:16.311672471 -0800 PST m=+6479.557751949" }, { "assigned_worker": "worker_zzz_23", "current_vb_owner": "127.0.0.1:8096", "operation": "stream_requested", "seq_no": 0, "timestamp": "2021-03-11 16:16:16.326739434 -0800 PST m=+6479.572818896" }, { "assigned_worker": "worker_zzz_23", "current_vb_owner": "127.0.0.1:8096", "operation": "running", "seq_no": 0, "timestamp": "2021-03-11 16:16:17.335642099 -0800 PST m=+6480.581721535" }, { "assigned_worker": "worker_zzz_23", "current_vb_owner": "127.0.0.1:8096", "operation": "metadata_recreated", "seq_no": 0, "timestamp": "2021-03-11 16:27:21.31426233 -0800 PST m=+7144.560341796" }, { "assigned_worker": "worker_zzz_23", "current_vb_owner": "127.0.0.1:8096", "operation": "metadata_recreated", "seq_no": 0, "timestamp": "2021-03-11 16:27:21.315333343 -0800 PST m=+7144.561412826" }, { "assigned_worker": "worker_zzz_23", "current_vb_owner": "127.0.0.1:8096", "operation": "metadata_recreated", "seq_no": 0, "timestamp": "2021-03-11 16:27:21.316378507 -0800 PST m=+7144.562458002" }, { "assigned_worker": "worker_zzz_23", "current_vb_owner": "127.0.0.1:8096", "operation": "metadata_recreated", "seq_no": 0, "timestamp": "2021-03-11 16:27:21.317401001 -0800 PST m=+7144.563480524" }, { "assigned_worker": "worker_zzz_23", "current_vb_owner": "127.0.0.1:8096", "operation": "metadata_recreated", "seq_no": 0, "timestamp": "2021-03-11 16:27:21.318378674 -0800 PST m=+7144.564458192" }, { "assigned_worker": "worker_zzz_23", "current_vb_owner": "127.0.0.1:8096", "operation": "metadata_recreated", "seq_no": 0, "timestamp": "2021-03-11 16:27:21.319371167 -0800 PST m=+7144.565450657" }, { "assigned_worker": "worker_zzz_23", "current_vb_owner": "127.0.0.1:8096", "operation": "metadata_recreated", "seq_no": 0, "timestamp": "2021-03-11 16:27:21.320185194 -0800 PST m=+7144.566264718" }, { "assigned_worker": "worker_zzz_23", "current_vb_owner": "127.0.0.1:8096", "operation": "metadata_recreated", "seq_no": 0, "timestamp": "2021-03-11 16:27:21.321020453 -0800 PST m=+7144.567100008" }, { "assigned_worker": "worker_zzz_23", "current_vb_owner": "127.0.0.1:8096", "operation": "metadata_recreated", "seq_no": 0, "timestamp": "2021-03-11 16:27:21.321882216 -0800 PST m=+7144.567961683" }, { "assigned_worker": "worker_zzz_23", "current_vb_owner": "127.0.0.1:8096", "operation": "metadata_recreated", "seq_no": 0, "timestamp": "2021-03-11 16:27:21.322935644 -0800 PST m=+7144.569015146" }, { "assigned_worker": "worker_zzz_23", "current_vb_owner": "127.0.0.1:8096", "operation": "metadata_recreated", "seq_no": 0, "timestamp": "2021-03-11 16:27:21.324032428 -0800 PST m=+7144.570111958" }, { "assigned_worker": "worker_zzz_23", "current_vb_owner": "127.0.0.1:8096", "operation": "metadata_recreated", "seq_no": 0, "timestamp": "2021-03-11 16:27:21.325231755 -0800 PST m=+7144.571311253" }, { "assigned_worker": "worker_zzz_23", "current_vb_owner": "127.0.0.1:8096", "operation": "metadata_recreated", 
"seq_no": 0, "timestamp": "2021-03-11 16:27:21.326297044 -0800 PST m=+7144.572376574" }, { "assigned_worker": "worker_zzz_23", "current_vb_owner": "127.0.0.1:8096", "operation": "metadata_recreated", "seq_no": 0, "timestamp": "2021-03-11 16:27:21.327542117 -0800 PST m=+7144.573621592" }, { "assigned_worker": "worker_zzz_23", "current_vb_owner": "127.0.0.1:8096", "operation": "metadata_recreated", "seq_no": 0, "timestamp": "2021-03-11 16:27:21.328411111 -0800 PST m=+7144.574490617" }, { "assigned_worker": "worker_zzz_23", "current_vb_owner": "127.0.0.1:8096", "operation": "metadata_recreated", "seq_no": 0, "timestamp": "2021-03-11 16:27:21.329478293 -0800 PST m=+7144.575557805" }, { "assigned_worker": "worker_zzz_23", "current_vb_owner": "127.0.0.1:8096", "operation": "metadata_recreated", "seq_no": 0, "timestamp": "2021-03-11 16:27:21.330677267 -0800 PST m=+7144.576756726" }, { "assigned_worker": "worker_zzz_23", "current_vb_owner": "127.0.0.1:8096", "operation": "metadata_recreated", "seq_no": 0, "timestamp": "2021-03-11 16:27:21.331860633 -0800 PST m=+7144.577940079" }, { "assigned_worker": "worker_zzz_23", "current_vb_owner": "127.0.0.1:8096", "operation": "metadata_recreated", "seq_no": 0, "timestamp": "2021-03-11 16:27:21.332928961 -0800 PST m=+7144.579008453" }, { "assigned_worker": "worker_zzz_23", "current_vb_owner": "127.0.0.1:8096", "operation": "stream_requested", "seq_no": 53101, "timestamp": "2021-03-11 16:28:14.716363288 -0800 PST m=+13.361338490" }, { "assigned_worker": "worker_zzz_23", "current_vb_owner": "127.0.0.1:8096", "operation": "running", "seq_no": 53101, "timestamp": "2021-03-11 16:28:15.712662048 -0800 PST m=+14.357637261" } ], "previous_assigned_worker": "worker_zzz_23", "previous_node_uuid": "36a90a493d8526d635cc5a111a316c48", "previous_vb_owner": "127.0.0.1:8096", "vb_id": 1000, "vb_uuid": 224369975207543, "worker_requested_vb_stream": "", "manifest_id": "0", "currently_processed_doc_id_timer": "2021-03-12T00:28:14Z", "last_cleaned_up_doc_id_timer_event": "2021-03-12T00:28:14Z", "last_doc_id_timer_sent_to_worker": "2021-03-12T00:28:14Z", "last_processed_doc_id_timer_event": "2021-03-12T00:27:21Z", "next_doc_id_timer_to_process": "2021-03-12T00:28:15Z", "currently_processed_cron_timer": "2021-03-12T00:28:15Z", "last_processed_cron_timer_event": "", "next_cron_timer_to_process": "2021-03-12T00:28:15Z", "version": "evt-7.0.0-0000-ee" }

Environment

None

Link to Log File, atop/blg, CBCollectInfo, Core dump

None

Release Notes Description

None


Activity

Chanabasappa Ghali May 15, 2021 at 5:31 AM

Since you have a test environment where this issue was reproduced consistently, it would be great if you could verify the fix.

Thanks

CB robot May 11, 2021 at 9:20 AM

Build couchbase-server-7.0.0-5142 contains eventing commit f8098a4 with commit message:
MB-44931: Move all settings/depcfg UI endpoints to /functions

Vinayaka Kamath May 11, 2021 at 5:16 AM

Yes, users will not see this behavior if they use the REST endpoints to deploy.
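
For reference, the lifecycle calls I believe this workaround refers to look like the following (a sketch; the host, port 8096, and credentials are from this single-node setup):

curl -s -u Administrator:password -X POST http://127.0.0.1:8096/api/v1/functions/curl_loc2/pause
curl -s -u Administrator:password -X POST http://127.0.0.1:8096/api/v1/functions/curl_loc2/resume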

Jon Strabala May 10, 2021 at 4:42 PM

I assume that this is 100% UI related and that using the REST API is a valid workaround in production to avoid this edge case; can you verify?

Jon Strabala May 6, 2021 at 10:49 PM

Thanks for following through; your proposed patch "Move all settings/depcfg UI endpoints to /functions" looks good, both for the code changes and for the settings.

Resolution

Fixed

Details

Is this a Regression?

Unknown

Triage

Untriaged

Created March 12, 2021 at 8:35 PM
Updated June 17, 2021 at 10:09 PM
Resolved May 11, 2021 at 7:48 AM