Add option to only return document data #757

jakeleventhal · 2019-09-15T07:40:20Z

Is your feature request related to a problem? Please describe.
Often, it is known ahead of time that the document reference will not be needed and that only the document data is required. Fetching the entire document increases the size of the payload considerably, especially in collection queries when select() is used.
Describe the solution you'd like
There should be an option when calling get() on a query that would allow it to only return the data from the document.

Example:

// Proposed method
const userEmails = await firestore.collection('Users').select('Email').get({
  dataOnly: true
});

should have the same result as:

// Current method
const users = await firestore.collection('Users').select('Email').get();
const userEmails = users.docs.map((doc) => doc.data());
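
Until such an option exists, the mapping step can be wrapped once in a small helper. The following is a minimal sketch, not part of the Firestore SDK: `getDataOnly` and `fakeQuery` are hypothetical names, and `fakeQuery` only imitates the `get()` / `docs` / `data()` shape used in the snippets above so that the example runs without a Firestore project.

```javascript
// Hypothetical helper (not a Firestore API): resolves a query to an array
// of plain data objects by calling data() on every returned document.
async function getDataOnly(query) {
  const snapshot = await query.get();
  return snapshot.docs.map((doc) => doc.data());
}

// Stand-in for a real query such as
// firestore.collection('Users').select('Email'); it only mimics the
// get() -> { docs: [{ data() }] } shape.
const fakeQuery = {
  async get() {
    return {
      docs: [
        { data: () => ({ Email: 'a@example.com' }) },
        { data: () => ({ Email: 'b@example.com' }) },
      ],
    };
  },
};

getDataOnly(fakeQuery).then((userEmails) => console.log(userEmails));
```

With a real query the call would be `await getDataOnly(firestore.collection('Users').select('Email'))`. This only removes the repeated `.map()` call; it does not change what is transferred.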


schmidt-sebastian · 2019-09-16T16:06:05Z

@jakeleventhal Thanks for the feature request! This is an interesting feature, but honestly not something that you should expect in our clients soon. We currently retrieve the document names as part of the payload and use them internally as keys to various data structures. If possible, we would advise you to structure your data so that the names of the documents are only a small part of the total number of bytes transferred.

We can certainly look at other ways to optimize our payloads (such as data compression, which should help if you have long collection names). If you have other suggestions or more data to show why your document names make up a significant portion of your data transfer, please do let us know.

jakeleventhal · 2019-09-17T02:53:33Z

Sorry, I don't think I was clear. I was referring to the metadata associated with each document.

For instance, I created a sample document with the following data (there's nothing else there):

{ "SampleField": "asdfasdf", "OtherField": "lkjlkjlkj" }

With my proposed feature, I would be fetching a fairly small amount of data. Document names aren't a problem, especially in comparison to the metadata that comes with a document. When fetching this document, I called:

const user = await firestore.doc('Users/pdUXDTKLa0rC4eeJly2K').get();
console.log(JSON.stringify(user));

However, the console statement shows that MUCH more data is actually fetched than just the two fields:

{
  "_ref": {
    "_firestore": {
      "_settings": {
        "libName": "gccl",
        "libVersion": "2.3.0"
      },
      "_settingsFrozen": true,
      "_serializer": {
        "timestampsInSnapshots": true
      },
      "_projectId": "my-project",
      "_lastSuccessfulRequest": 1568688345220,
      "_preferTransactions": false,
      "_clientPool": {
        "concurrentOperationLimit": 100,
        "activeClients": {}
      }
    },
    "_path": {
      "segments": ["Users","sampleid"],
      "projectId": "my-project",
      "databaseId": "(default)"
    }
  },
  "_fieldsProto": {
    "SampleField": {
      "stringValue": "asdfasdf",
      "valueType": "stringValue"
    },
    "OtherField": {
      "stringValue": "lkjlkjlkj",
      "valueType": "stringValue"
    }
  },
  "_serializer": {
    "timestampsInSnapshots": true
  },
  "_readTime": {
    "_seconds": 1568688345,
    "_nanoseconds": 183637000
  },
  "_createTime": {
    "_seconds": 1568688326,
    "_nanoseconds": 437609000
  },
  "_updateTime": {
    "_seconds": 1568688326,
    "_nanoseconds": 437609000
  }
}

schmidt-sebastian · 2019-09-17T16:37:56Z

Thank you for clarifying! Most of the data you show in your snippet is actually not fetched from the backend but is instance state that we manage in the client (everything under _ref and _serializer is client state). Furthermore, _readTime is not retrieved per document but rather just once for each query. I hope that makes you feel a bit better!

For reference, this is the actual data that we send for each document: https://github.com/googleapis/googleapis/blob/master/google/firestore/v1/document.proto#L37
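
To make the split above concrete, here is a rough sketch that groups the top-level keys of a `JSON.stringify`'d snapshot into client state, per-query data, and per-document data. The key names are the private, undocumented fields from the dump earlier in this thread and may differ between library versions; `summarizeSnapshotDump` is a hypothetical name. JSON string length is only a loose proxy for what travels over the wire, and the document name (shown under `_ref._path` in the dump) is still sent with each document.

```javascript
// Grouping taken from the explanation in this thread: _ref and _serializer
// are client state, _readTime arrives once per query, everything else is
// treated as per-document data. These are internal field names, not a
// public API.
const CLIENT_STATE_KEYS = ['_ref', '_serializer'];
const PER_QUERY_KEYS = ['_readTime'];

// Hypothetical helper: sums the JSON-encoded size (in bytes) of each
// top-level key of a snapshot dump into one of three buckets.
function summarizeSnapshotDump(dump) {
  const sizes = { clientState: 0, perQuery: 0, perDocument: 0 };
  for (const [key, value] of Object.entries(dump)) {
    const bytes = Buffer.byteLength(JSON.stringify(value), 'utf8');
    if (CLIENT_STATE_KEYS.includes(key)) {
      sizes.clientState += bytes;
    } else if (PER_QUERY_KEYS.includes(key)) {
      sizes.perQuery += bytes;
    } else {
      sizes.perDocument += bytes;
    }
  }
  return sizes;
}

// Abbreviated copy of the dump posted earlier in this thread.
const dump = {
  _ref: {
    _firestore: { _projectId: 'my-project' },
    _path: { segments: ['Users', 'sampleid'] },
  },
  _fieldsProto: {
    SampleField: { stringValue: 'asdfasdf', valueType: 'stringValue' },
    OtherField: { stringValue: 'lkjlkjlkj', valueType: 'stringValue' },
  },
  _serializer: { timestampsInSnapshots: true },
  _readTime: { _seconds: 1568688345, _nanoseconds: 183637000 },
  _createTime: { _seconds: 1568688326, _nanoseconds: 437609000 },
  _updateTime: { _seconds: 1568688326, _nanoseconds: 437609000 },
};

console.log(summarizeSnapshotDump(dump));
```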

There is certainly still some room for improvement, but unfortunately, it would be unwise for me to promise that we will tackle this any time soon.

jakeleventhal · 2019-09-19T03:23:02Z

Yes, and even if there is actually a 0% reduction in payload size, it is still a nice feature since it's annoying to have to call .data() on every doc.

jakeleventhal · 2021-09-14T13:39:08Z

bump

schmidt-sebastian · 2021-09-15T22:06:25Z

We would need some more user feedback before scheduling this work.

jakeleventhal mentioned this issue Sep 15, 2019

Add Recursive Delete on Collections #746

Closed

yoshi-automation added the "triage me" label Sep 15, 2019

schmidt-sebastian added the "type: feature request" label and removed the "triage me" label Sep 16, 2019

google-cloud-label-sync bot added the "api: firestore" label Jan 30, 2020
