[go: nahoru, domu]

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an interface to ServerValues that lets us read synctrees just in time #2499

Merged
merged 5 commits into from
Mar 30, 2020

Conversation

inlined
Copy link
Member
@inlined inlined commented Jan 7, 2020

Addresses #2487 by creating a new interface that lets us defer reading existing values from a SyncTree until it's actually needed and restricted to the smallest possible scope.

Copy link
Contributor
@schmidt-sebastian schmidt-sebastian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach may be fine, but I need to convince myself that the mutability of SyncTree doesn't create some weird race conditions if multiple increments are stacked on top of each other. Maybe you or @mikelehen can help me convince that this is a non-issue since we only store the latest mutated state in the pendingWriteTree.

@@ -26,6 +26,53 @@ import { ChildrenNode } from '../snap/ChildrenNode';
import { SyncTree } from '../SyncTree';
import { Indexable } from './misc';

/* It's critical for performance that we not calculate actual values from a SyncTree
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/not/do not/

@schmidt-sebastian
Copy link
Contributor

@mikelehen won't be able to take a look until later today. Unfortunately, that means that we have to revert #2348. We can likely put it back into the the next release.

Copy link
Contributor
@mikelehen mikelehen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I don't think this just-in-time reading of SyncTree is necessary, and it won't help #2487 since that involves update() which doesn't hit the SyncTree code path.

I think the root cause is here:

this.serverSyncTree_.calcCompleteEventCache(path),

It should be:

this.serverSyncTree_.calcCompleteEventCache(path.child(changedKey)), 

As-is, we end up calling calcCompleteEventCache() at the root location update() was called on over and over, which in the user's example is /, so it's going to be expensive (and get more expensive the more writes you have pending).

And since we're applying update() increments to the wrong existing value, it presumably won't behave properly. So we'll want to add a regression test for that as well.

@mikelehen mikelehen removed their assignment Jan 8, 2020
@inlined
Copy link
Member Author
inlined commented Jan 13, 2020

I'm a bit confused; are you saying that the current proposal is wrong or an over-optimization? In the case that there's a {".sv": {"increment": X}} then the change is roughly equivalent (though we'd still need refactoring since the snapshot path also does child traversal). Isn't it best to avoid hitting the SyncTree altogether when it's avoidable? There's no reason why this customer should ever have calculated a compiled result since the increment feature isn't even released yet.

And since we're applying update() increments to the wrong existing value, it presumably won't behave properly. So we'll want to add a regression test for that as well.

What do you mean we're applying update() to the wrong existing value? The previous and current code both traverse down the tree; it's just that the last iteration calculated the cache too soon because I needed a full snapshot to be pulled to reuse the snapshot helper function.

@mikelehen
Copy link
Contributor
mikelehen commented Jan 13, 2020

I'm saying:

  1. The initial ServerValue.increment() implementation had a bug in update() that caused it to calculate the complete cache at the root of the .update() call over and over, instead of for the actual paths being updated. So rootRef.update({'a/b/c': 1, 'd/e/f': 2}) is going to call calcCompleteEventCache() for the entire repo twice instead of once for just a/b/c and once for just d/e/f
  2. In addition to being a performance problem, this means that ref.update({'a/b/c': ServerValue.increment(1)}) would use the value at ref instead of at ref.child('a/b/c') as the base value for the increment, which means it will calculate the wrong result and raise incorrect events.
  3. Given that the user that complained was using update(), they were likely hitting this problem and weren't even exercising the SyncTree codepath that this PR changed, since the SyncTree codepath is only used by onDisconnect operations.
  4. This PR is likely an over-optimization. calcCompleteEventCache() is already relatively cheap since all the underlying data structures are immutable and so it can reuse their structures when computing the merged event cache.

@inlined inlined changed the base branch from master to increment-master March 24, 2020 01:22
@inlined inlined force-pushed the inlined.fix-perf-regression branch from 7933b20 to 4a3b18b Compare March 24, 2020 01:23
@inlined
Copy link
Member Author
inlined commented Mar 24, 2020

Modified based on @mikelehen's observation that update() wasn't using the synctree path. It couldn't because the synctree code was tied up in SparseSnapshotTree logic for the sake of onDisconnect. I moved the loop to onDisconnect so I could use the optimization in update() as well.

Using the mvp regression test provided by @jmschae in #2487 (thanks!) I was able to confirm that increment-master reduced the 1200% slowdown (N^2 write calculations) to a 20% slowdown (N write calculations). With this change, (0 write calculations) we actually beat the baseline performance in master, though I can't explain why and will chalk it up to noise.

@inlined inlined requested review from schmidt-sebastian and removed request for jsdt March 24, 2020 01:32
Copy link
Contributor
@schmidt-sebastian schmidt-sebastian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is much cleaner than the previous iteration.

I am not a huge fan of the names, though. I cannot come up with something better, but I am hoping that maybe you can:

class ExistingSnapshotValue implements DeferredExistingValue
class DeferredSyncTreeValue implements DeferredExistingValue

From the names alone, it seems like the implementations do not provide specialized interface implementations, but instead take something away. ExistingSnapshotValue is no longer deferred, and DeferredSyncTreeValue is no longer existing. We might need a different name for the interface.


class ExistingSnapshotValue implements DeferredExistingValue {
private node_: Node;
constructor(node: Node) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: You can drop the suffix for members of classes that are not part of the public API, which will allow you to use constructor property assignments (constructor(readaonly node:Node)).

Renaming to node also allows you to get rid of the getter here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I do that I fail to implement the protocol because node is a value instead of a method.

inlined added a commit to firebase/firebase-ios-sdk that referenced this pull request Mar 24, 2020
Based on fix firebase/firebase-js-sdk#2499
to bug firebase/firebase-js-sdk#2487

Micro-benchmarking showed that N writes in succession led to N^2
performance when calcuating the resolved write tree for those
writes (thankfully only at the subpath in this SDK). This change
matches the optimization in JS that mitigated a 20% performance
regression.
Copy link
Member Author
@inlined inlined left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per offline commends, also renamed the classes ValueProvider, DeferredValueProvider and ExistingValueProvider.


class ExistingSnapshotValue implements DeferredExistingValue {
private node_: Node;
constructor(node: Node) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I do that I fail to implement the protocol because node is a value instead of a method.

@inlined inlined merged commit 0e705f7 into increment-master Mar 30, 2020
inlined added a commit that referenced this pull request Mar 30, 2020
…time (#2499)

Addresses #2487 by creating a new interface that lets us defer reading existing values from a SyncTree until it's actually needed and restricted to the smallest possible scope.
inlined added a commit to firebase/firebase-ios-sdk that referenced this pull request Mar 30, 2020
Based on fix firebase/firebase-js-sdk#2499
to bug firebase/firebase-js-sdk#2487

Micro-benchmarking showed that N writes in succession led to N^2
performance when calcuating the resolved write tree for those
writes (thankfully only at the subpath in this SDK). This change
matches the optimization in JS that mitigated a 20% performance
regression.
inlined added a commit to firebase/firebase-ios-sdk that referenced this pull request Mar 30, 2020
* Change FServerValues to just-in-time read from SyncTrees.

Based on fix firebase/firebase-js-sdk#2499
to bug firebase/firebase-js-sdk#2487

Micro-benchmarking showed that N writes in succession led to N^2
performance when calcuating the resolved write tree for those
writes (thankfully only at the subpath in this SDK). This change
matches the optimization in JS that mitigated a 20% performance
regression.

* Rename types to match JS sdk
inlined added a commit to firebase/firebase-ios-sdk that referenced this pull request Apr 1, 2020
* Change FServerValues to just-in-time read from SyncTrees.

Based on fix firebase/firebase-js-sdk#2499
to bug firebase/firebase-js-sdk#2487

Micro-benchmarking showed that N writes in succession led to N^2
performance when calcuating the resolved write tree for those
writes (thankfully only at the subpath in this SDK). This change
matches the optimization in JS that mitigated a 20% performance
regression.

* Rename types to match JS sdk
inlined added a commit to firebase/firebase-ios-sdk that referenced this pull request Apr 1, 2020
* Definition is currently in an "unreleased" extension to FIRServerValues
* Tests are added to FData.m (integration tests)
* Some tests are renamed to make it clear that ServerValues != timestamps
* Functions that resolve server values now need access to any existing snapshot. This is done with lazy evaluation as described below:

Based on fix firebase/firebase-js-sdk#2499
to bug firebase/firebase-js-sdk#2487

Micro-benchmarking showed that N writes in succession led to N^2
performance when calculating the resolved write tree for those
writes (thankfully only at the subpath in this SDK). Lazy evaluation
of current data matches the optimization in JS that mitigated a
20% performance regression.
pranavrajgopal pushed a commit to firebase/firebase-ios-sdk that referenced this pull request Apr 8, 2020
* Definition is currently in an "unreleased" extension to FIRServerValues
* Tests are added to FData.m (integration tests)
* Some tests are renamed to make it clear that ServerValues != timestamps
* Functions that resolve server values now need access to any existing snapshot. This is done with lazy evaluation as described below:

Based on fix firebase/firebase-js-sdk#2499
to bug firebase/firebase-js-sdk#2487

Micro-benchmarking showed that N writes in succession led to N^2
performance when calculating the resolved write tree for those
writes (thankfully only at the subpath in this SDK). Lazy evaluation
of current data matches the optimization in JS that mitigated a
20% performance regression.
ryanwilson pushed a commit to firebase/firebase-ios-sdk that referenced this pull request Apr 24, 2020
* Definition is currently in an "unreleased" extension to FIRServerValues
* Tests are added to FData.m (integration tests)
* Some tests are renamed to make it clear that ServerValues != timestamps
* Functions that resolve server values now need access to any existing snapshot. This is done with lazy evaluation as described below:

Based on fix firebase/firebase-js-sdk#2499
to bug firebase/firebase-js-sdk#2487

Micro-benchmarking showed that N writes in succession led to N^2
performance when calculating the resolved write tree for those
writes (thankfully only at the subpath in this SDK). Lazy evaluation
of current data matches the optimization in JS that mitigated a
20% performance regression.
@firebase firebase locked and limited conversation to collaborators Apr 30, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants