Add an interface to ServerValues that lets us read synctrees just in time #2499

inlined · 2020-01-07T18:01:59Z

Addresses #2487 by creating a new interface that lets us defer reading existing values from a SyncTree until it's actually needed and restricted to the smallest possible scope.

schmidt-sebastian

This approach may be fine, but I need to convince myself that the mutability of SyncTree doesn't create some weird race conditions if multiple increments are stacked on top of each other. Maybe you or @mikelehen can help convince me that this is a non-issue, since we only store the latest mutated state in the pendingWriteTree.

schmidt-sebastian · 2020-01-07T18:08:25Z

packages/database/src/core/util/ServerValues.ts

@@ -26,6 +26,53 @@ import { ChildrenNode } from '../snap/ChildrenNode';
 import { SyncTree } from '../SyncTree';
 import { Indexable } from './misc';

+/* It's critical for performance that we not calculate actual values from a SyncTree


s/not/do not/

schmidt-sebastian · 2020-01-07T18:35:50Z

@mikelehen won't be able to take a look until later today. Unfortunately, that means that we have to revert #2348. We can likely put it back into the next release.

mikelehen

So I don't think this just-in-time reading of SyncTree is necessary, and it won't help #2487 since that involves update() which doesn't hit the SyncTree code path.

I think the root cause is here:

firebase-js-sdk/packages/database/src/core/Repo.ts

Line 370 in 0adebf4

this.serverSyncTree_.calcCompleteEventCache(path),

It should be:

this.serverSyncTree_.calcCompleteEventCache(path.child(changedKey)),

As-is, we end up calling calcCompleteEventCache() at the root location update() was called on over and over, which in the user's example is /, so it's going to be expensive (and get more expensive the more writes you have pending).

And since we're applying update() increments to the wrong existing value, it presumably won't behave properly. So we'll want to add a regression test for that as well.
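To make the difference concrete, here is a minimal sketch using a toy in-memory tree in place of the real SyncTree. `ToyTree`, `resolveIncrements`, and the sample data are hypothetical; only the method name `calcCompleteEventCache` and the idea of reading at `path.child(changedKey)` come from the comment above. Treating a non-numeric existing value as 0 is a simplifying assumption of the sketch.

```typescript
// Toy stand-in for the server sync tree: data is a nested plain object.
type Json = number | { [key: string]: Json };

class ToyTree {
  calls: string[] = []; // every path a cache was computed for
  private data: Json;
  constructor(data: Json) {
    this.data = data;
  }

  // Hypothetical analogue of calcCompleteEventCache(path).
  calcCompleteEventCache(path: string[]): Json | undefined {
    this.calls.push('/' + path.join('/'));
    let node: Json | undefined = this.data;
    for (const key of path) {
      if (node === undefined || typeof node === 'number') {
        return undefined;
      }
      node = node[key];
    }
    return node;
  }
}

// Resolves {childPath: delta} increments issued by update() at `root`.
// readAtChild = false mimics the bug (read at the update() root);
// readAtChild = true mimics the proposed fix (read at the changed child).
function resolveIncrements(
  tree: ToyTree,
  root: string[],
  deltas: { [childPath: string]: number },
  readAtChild: boolean
): { [childPath: string]: number } {
  const resolved: { [childPath: string]: number } = {};
  for (const changedKey of Object.keys(deltas)) {
    const childPath = root.concat(changedKey.split('/'));
    const existing = tree.calcCompleteEventCache(readAtChild ? childPath : root);
    // Assumption: a non-numeric existing value counts as 0.
    const base = typeof existing === 'number' ? existing : 0;
    resolved[changedKey] = base + deltas[changedKey];
  }
  return resolved;
}

const toyData: Json = { a: { b: { c: 10 } }, d: { e: { f: 20 } } };
const buggyTree = new ToyTree(toyData);
const fixedTree = new ToyTree(toyData);
const buggyResult = resolveIncrements(buggyTree, [], { 'a/b/c': 1, 'd/e/f': 2 }, false);
const fixedResult = resolveIncrements(fixedTree, [], { 'a/b/c': 1, 'd/e/f': 2 }, true);
```

In this toy model the buggy variant reads the whole tree at `/` once per key and resolves the increments against a non-numeric base, while the fixed variant reads only `/a/b/c` and `/d/e/f` and adds each delta to the stored number.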

inlined · 2020-01-13T17:24:43Z

I'm a bit confused; are you saying that the current proposal is wrong or an over-optimization? In the case that there's a {".sv": {"increment": X}} then the change is roughly equivalent (though we'd still need refactoring since the snapshot path also does child traversal). Isn't it best to avoid hitting the SyncTree altogether when it's avoidable? There's no reason why this customer should ever have calculated a compiled result since the increment feature isn't even released yet.

> And since we're applying update() increments to the wrong existing value, it presumably won't behave properly. So we'll want to add a regression test for that as well.

What do you mean we're applying update() to the wrong existing value? The previous and current code both traverse down the tree; it's just that the last iteration calculated the cache too soon because I needed a full snapshot to be pulled to reuse the snapshot helper function.

mikelehen · 2020-01-13T18:21:05Z

I'm saying:

1. The initial ServerValue.increment() implementation had a bug in update() that caused it to calculate the complete cache at the root of the .update() call over and over, instead of for the actual paths being updated. So rootRef.update({'a/b/c': 1, 'd/e/f': 2}) is going to call calcCompleteEventCache() for the entire repo twice instead of once for just a/b/c and once for just d/e/f.
2. In addition to being a performance problem, this means that ref.update({'a/b/c': ServerValue.increment(1)}) would use the value at ref instead of at ref.child('a/b/c') as the base value for the increment, which means it will calculate the wrong result and raise incorrect events.
3. Given that the user that complained was using update(), they were likely hitting this problem and weren't even exercising the SyncTree codepath that this PR changed, since the SyncTree codepath is only used by onDisconnect operations.
4. This PR is likely an over-optimization. calcCompleteEventCache() is already relatively cheap since all the underlying data structures are immutable and so it can reuse their structures when computing the merged event cache.
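The point about immutable structures can be illustrated with a small structural-sharing sketch. This is not the SDK's actual node or tree implementation; `INode`, `setAt`, and `childOf` are hypothetical names used only to show why a merged result can reuse untouched subtrees instead of copying them.

```typescript
// Hypothetical immutable node: either a leaf number or a read-only map of children.
type INode = number | { readonly [key: string]: INode };

// Returns a new tree with `value` at `path`. Only the nodes along `path`
// are recreated; every other subtree is reused by reference.
function setAt(node: INode | undefined, path: string[], value: INode): INode {
  if (path.length === 0) {
    return value;
  }
  const [head, ...rest] = path;
  const children: { readonly [key: string]: INode } =
    node !== undefined && typeof node !== 'number' ? node : {};
  return { ...children, [head]: setAt(children[head], rest, value) };
}

// Reads one child, or undefined when `node` is a leaf.
function childOf(node: INode, key: string): INode | undefined {
  return typeof node === 'number' ? undefined : node[key];
}

const beforeTree: INode = { a: { b: { c: 10 } }, d: { e: { f: 20 } } };
const afterTree = setAt(beforeTree, ['a', 'b', 'c'], 11);

// The untouched `d` subtree is the same object in both versions.
const sharedSubtree = childOf(afterTree, 'd') === childOf(beforeTree, 'd');
// The `a` subtree lies on the written path, so it was rebuilt.
const rebuiltSubtree = childOf(afterTree, 'a') !== childOf(beforeTree, 'a');
```

Under this model, the cost of producing an updated view is proportional to the depth of the written path rather than to the size of the tree, which is the property the comment above relies on.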

inlined · 2020-03-24T01:32:26Z

Modified based on @mikelehen's observation that update() wasn't using the synctree path. It couldn't because the synctree code was tied up in SparseSnapshotTree logic for the sake of onDisconnect. I moved the loop to onDisconnect so I could use the optimization in update() as well.

Using the mvp regression test provided by @jmschae in #2487 (thanks!), I was able to confirm that increment-master reduced the 1200% slowdown (N^2 write calculations) to a 20% slowdown (N write calculations). With this change (0 write calculations), we actually beat the baseline performance in master, though I can't explain why and will chalk it up to noise.
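The "0 write calculations" case can be sketched as follows. This is a simplified model, not the PR's code: `CountingTree`, `DeferredValue`, and `resolveWrite` are hypothetical, paths are plain strings, and values are plain numbers. The `{".sv": {"increment": X}}` shape is taken from the discussion above.

```typescript
// Hypothetical stand-in for a SyncTree that counts how often it is read.
class CountingTree {
  reads = 0;
  private values: { [path: string]: number };
  constructor(values: { [path: string]: number }) {
    this.values = values;
  }
  calcCompleteEventCache(path: string): number | undefined {
    this.reads++;
    return this.values[path];
  }
}

// Deferred read: nothing touches the tree until node() is called.
class DeferredValue {
  private tree: CountingTree;
  private path: string;
  constructor(tree: CountingTree, path: string) {
    this.tree = tree;
    this.path = path;
  }
  // Narrows the scope of a future read without performing it.
  getImmediateChild(childName: string): DeferredValue {
    return new DeferredValue(this.tree, this.path + '/' + childName);
  }
  node(): number | undefined {
    return this.tree.calcCompleteEventCache(this.path);
  }
}

type Write = number | { '.sv': { increment: number } };

function resolveWrite(value: Write, existing: DeferredValue): number {
  if (typeof value === 'number') {
    return value; // plain value: the existing data is never read
  }
  const base = existing.node(); // increment: read only this one path
  // Assumption: a missing existing value counts as 0.
  return (typeof base === 'number' ? base : 0) + value['.sv'].increment;
}

const countingTree = new CountingTree({ '/counters/hits': 41 });
const hitsValue = new DeferredValue(countingTree, '')
  .getImmediateChild('counters')
  .getImmediateChild('hits');

const plainResults = [1, 2, 3].map(v => resolveWrite(v, hitsValue));
const readsAfterPlainWrites = countingTree.reads;
const incremented = resolveWrite({ '.sv': { increment: 1 } }, hitsValue);
const readsAfterIncrement = countingTree.reads;
```

In this model, writes that contain no increment never trigger a cache calculation, and a write that does contain one triggers exactly one read at the narrowest path.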

schmidt-sebastian

This is much cleaner than the previous iteration.

I am not a huge fan of the names, though. I cannot come up with something better, but I am hoping that maybe you can:

class ExistingSnapshotValue implements DeferredExistingValue
class DeferredSyncTreeValue implements DeferredExistingValue

From the names alone, it seems like the implementations do not provide specialized interface implementations, but instead take something away. ExistingSnapshotValue is no longer deferred, and DeferredSyncTreeValue is no longer existing. We might need a different name for the interface.

schmidt-sebastian · 2020-03-24T16:20:58Z

packages/database/src/core/util/ServerValues.ts

+
+class ExistingSnapshotValue implements DeferredExistingValue {
+  private node_: Node;
+  constructor(node: Node) {


Nit: You can drop the suffix for members of classes that are not part of the public API, which will allow you to use constructor property assignments (constructor(readonly node: Node)).

Renaming to node also allows you to get rid of the getter here.

If I do that I fail to implement the protocol because node is a value instead of a method.
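The naming conflict in this exchange can be sketched as below, under the assumption that the interface exposes the node through a method called `node()`. `ToyNode` and all four type names are hypothetical. A public field named `node` (which is what a `constructor(readonly node: ...)` parameter property declares) satisfies a property-style interface, but not a method-style one, so the method-style implementation needs a differently named backing field.

```typescript
type ToyNode = { value: number };

// Method-style interface: callers write provider.node().
interface NodeViaMethod {
  node(): ToyNode;
}

// The backing field needs a different name (here node_) so it does not
// collide with the node() method required by the interface.
class MethodBacked implements NodeViaMethod {
  private readonly node_: ToyNode;
  constructor(node: ToyNode) {
    this.node_ = node;
  }
  node(): ToyNode {
    return this.node_;
  }
}

// Property-style interface: callers write provider.node.
interface NodeViaProperty {
  readonly node: ToyNode;
}

// A public readonly field named `node` satisfies the property-style interface.
class PropertyBacked implements NodeViaProperty {
  readonly node: ToyNode;
  constructor(node: ToyNode) {
    this.node = node;
  }
}

const leaf: ToyNode = { value: 7 };
const viaMethod = new MethodBacked(leaf);
const viaProperty = new PropertyBacked(leaf);

const methodIsCallable = typeof viaMethod.node === 'function';
const propertyIsCallable = typeof viaProperty.node === 'function';
```

A method-style interface also leaves room for an implementation that computes the node on demand, which a stored property cannot do without a getter.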

Based on fix firebase/firebase-js-sdk#2499 to bug firebase/firebase-js-sdk#2487. Micro-benchmarking showed that N writes in succession led to N^2 performance when calculating the resolved write tree for those writes (thankfully only at the subpath in this SDK). This change matches the optimization in JS that mitigated a 20% performance regression.

inlined

Per offline comments, also renamed the classes to ValueProvider, DeferredValueProvider, and ExistingValueProvider.

inlined · 2020-03-27T16:55:20Z

packages/database/src/core/util/ServerValues.ts

+
+class ExistingSnapshotValue implements DeferredExistingValue {
+  private node_: Node;
+  constructor(node: Node) {


If I do that I fail to implement the protocol because node is a value instead of a method.

* Change FServerValues to just-in-time read from SyncTrees. Based on fix firebase/firebase-js-sdk#2499 to bug firebase/firebase-js-sdk#2487. Micro-benchmarking showed that N writes in succession led to N^2 performance when calculating the resolved write tree for those writes (thankfully only at the subpath in this SDK). This change matches the optimization in JS that mitigated a 20% performance regression.
* Rename types to match JS SDK

* Definition is currently in an "unreleased" extension to FIRServerValues
* Tests are added to FData.m (integration tests)
* Some tests are renamed to make it clear that ServerValues != timestamps
* Functions that resolve server values now need access to any existing snapshot. This is done with lazy evaluation as described below:

Based on fix firebase/firebase-js-sdk#2499 to bug firebase/firebase-js-sdk#2487. Micro-benchmarking showed that N writes in succession led to N^2 performance when calculating the resolved write tree for those writes (thankfully only at the subpath in this SDK). Lazy evaluation of current data matches the optimization in JS that mitigated a 20% performance regression.

inlined requested a review from schmidt-sebastian January 7, 2020 18:01

inlined requested review from jsdt and mikelehen as code owners January 7, 2020 18:01

inlined assigned schmidt-sebastian Jan 7, 2020

schmidt-sebastian reviewed Jan 7, 2020

mikelehen self-assigned this Jan 7, 2020

This was referenced Jan 7, 2020

Revert: Add Database.Servervalue.increment(x) #2500

Merged

Revert "Revert: Add Database.Servervalue.increment(x)" #2505

Merged

mikelehen suggested changes Jan 8, 2020

mikelehen removed their assignment Jan 8, 2020

schmidt-sebastian assigned inlined and unassigned schmidt-sebastian Jan 13, 2020

inlined added 3 commits March 23, 2020 16:34

Add an interface to ServerValues that lets us read synctrees just in …

ece6c5c

…time

Apply optimization to update()

63e4f82

[AUTOMATED]: Prettier Code Styling

4a3b18b

inlined changed the base branch from master to increment-master March 24, 2020 01:22

inlined force-pushed the inlined.fix-perf-regression branch from 7933b20 to 4a3b18b Compare March 24, 2020 01:23

fix typo

7addf9c

inlined requested review from schmidt-sebastian and removed request for jsdt March 24, 2020 01:32

schmidt-sebastian reviewed Mar 24, 2020

inlined mentioned this pull request Mar 24, 2020

Change FServerValues to just-in-time read from SyncTrees. firebase/firebase-ios-sdk#5183

Merged

PR feedback

9cff77b

inlined commented Mar 27, 2020

schmidt-sebastian approved these changes Mar 27, 2020

inlined merged commit 0e705f7 into increment-master Mar 30, 2020

firebase locked and limited conversation to collaborators Apr 30, 2020
