[go: nahoru, domu]

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

V2 Functions crashing when deployed with 12.6.1 #6414

Closed
milo- opened this issue Sep 28, 2023 · 42 comments
Closed

V2 Functions crashing when deployed with 12.6.1 #6414

milo- opened this issue Sep 28, 2023 · 42 comments

Comments

@milo-
Copy link
milo- commented Sep 28, 2023

After deploying using firebase-tools 12.6.1, our functions are crashing at runtime with the following stacktrace:

Error: TypeError: Cannot read properties of undefined (reading 'headers')
    at /layers/google.nodejs.yarn/yarn_modules/node_modules/firebase-functions/lib/v2/trace.js:12:68
    at /layers/google.nodejs.functions-framework/functions-framework/node_modules/@google-cloud/functions-framework/build/src/function_wrappers.js:141:25
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at console.error (/layers/google.nodejs.yarn/yarn_modules/node_modules/firebase-functions/lib/logger/compat.js:31:23)
    at sendCrashResponse (/layers/google.nodejs.functions-framework/functions-framework/node_modules/@google-cloud/functions-framework/build/src/logger.js:27:17)
    at sendResponse (/layers/google.nodejs.functions-framework/functions-framework/node_modules/@google-cloud/functions-framework/build/src/invoker.js:37:40)
    at /layers/google.nodejs.functions-framework/functions-framework/node_modules/@google-cloud/functions-framework/build/src/function_wrappers.js:34:40
    at bound (node:domain:433:15)
    at runBound (node:domain:444:12)
    at /layers/google.nodejs.functions-framework/functions-framework/node_modules/@google-cloud/functions-framework/build/src/function_wrappers.js:142:60
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)

We're also crashing in our own code when trying to read request.rawBody.toString() as rawBody is undefined. It looks like something is wrong with the request objects being passed to the http functions.

We saw a very similar issue in 12.5.0 (#6290)

It looks like with this PR #6376 the issue might have been reintroduced.

@blidd-google can you take a look at this please?

[REQUIRED] Environment info

firebase-functions v2
Node 18.12.1

firebase-tools:

12.6.1

Platform:

[REQUIRED] Test case

Deploy a V2 http function with 12.6.1

Redeploy with 12.4.8 and functions work correctly

[REQUIRED] Steps to reproduce

Deploy a V2 http function with 12.6.1

[REQUIRED] Expected behaviour

It should not crash, correct properties should be defined on the request/response object

[REQUIRED] Actual behaviour

Crash, request object properties are undefined

@google-oss-bot
Copy link
Contributor

This issue does not seem to follow the issue template. Make sure you provide all the required information.

@aalej
Copy link
Contributor
aalej commented Sep 28, 2023

Hey @milo-, sorry to hear you’re encountering this issue. I’m currently trying to reproduce the error you mentioned but so far I am unable to replicate this. I tried deploying a mixture of v1 and v2 HTTP functions:

const functionv1 = require("firebase-functions/v1");
const functionv2 = require("firebase-functions/v2");
const logger = require("firebase-functions/logger");

exports.helloWorld1 = functionv1.https.onRequest((request, response) => {
  const rawBody = request.rawBody.toString();
  logger.info(`rawBody: ${rawBody}`, { structuredData: true });
  response.send(`Hello from Firebase v1! ${rawBody}`);
});

exports.helloWorld2 = functionv2.https.onRequest((request, response) => {
  const rawBody = request.rawBody.toString();
  logger.info(`rawBody: ${rawBody}`, { structuredData: true });
  response.send(`Hello from Firebase v2 2! ${rawBody}`);
});

exports.helloWorld3 = functionv2.https.onRequest((request, response) => {
  const rawBody = request.rawBody.toString();
  logger.info(`rawBody: ${rawBody}`, { structuredData: true });
  response.send(`Hello from Firebase v2 3! ${rawBody}`);
});

Making a POST request to any of the HTTP functions above does not raise any errors. In case I’m missing anything here, please let me know.

Could you provide additional information here such as if you’re deploying a mixture of v1 and v2 functions or just multiple v2 functions? By any chance, if you can provide a sample code snippet of the function(s) you’re deploying, we would highly appreciate it.

@aalej aalej added the Needs: Author Feedback Issues awaiting author feedback label Sep 28, 2023
@milo-
Copy link
Author
milo- commented Sep 28, 2023

Hi @aalej, yep we're deploying ~60 functions, 50/50 mixture of v1 functions and v2 functions.

We were seeing this issue in the following v2 functions:

  • beforeUserSignedIn
  • onSchedule
  • onRequest

All these functions are group functions. Ill try and get a smaller repro for you.

@google-oss-bot google-oss-bot added Needs: Attention and removed Needs: Author Feedback Issues awaiting author feedback labels Sep 28, 2023
@lelimit
Copy link
lelimit commented Sep 28, 2023

I also got problems at runtime after deploying onCall v2 functions with with 12.6.1.
TypeError: res.on is not a function

@milo-
Copy link
Author
milo- commented Sep 29, 2023

I've been unable to create a small reproduction. But can confirm this issue happens every time we deploy with 12.6.1.

Deploying the individual function fixes the issue, but when all functions are deployed together, the runtime error res.headers is undefined starts getting thrown again.

@blidd-google I've been digging into the source and the logs. Something that seems strange is there's a v1 and a v2 scraper, so I'd expect a source token for v1, source token for v2. However during deployment, it logs 4 different source tokens:

3 x UUID source tokens
1 x 384 character source token

@blidd-google
Copy link
Contributor

Hi @milo-, very sorry for all the disruptions you've run into. I'm investigating the issue now and am also having trouble creating a repro of the bug you've described, but I'll keep this thread updated.

@blidd-google
Copy link
Contributor
blidd-google commented Oct 3, 2023

A couple of follow up questions:

  • Is the correct grouped function being invoked before crashing at runtime due to the missing request object?
  • In addition to grouped functions, are you using multiple codebases to organize your functions? This would explain why more than 2 source tokens.
  • What is the shape of the HTTP request you are sending to the onRequest function?
  • Can you provide the logs from running the deploy command with the --debug flag?

@blidd-google blidd-google added Needs: Author Feedback Issues awaiting author feedback and removed Needs: Attention labels Oct 3, 2023
@arnoutvandervorst
Copy link

Same on my end, downgrade to 12.5.2 and everything works again. Got a mix of v2 and v1 (needed for firebase auth triggers). Deployment on 12.6.1 is extremely fast, over 10x faster than 12.5.2, which makes me think that something is not right. In my case, onRequest is not an issue, the onSchedule functions trigger the same issue "TypeError: Cannot read properties of undefined (reading 'headers')".

@sathishvj
Copy link
sathishvj commented Oct 3, 2023

I'm seeing something similar too - "TypeError: Cannot read properties of undefined (reading 'headers')". Neither onCall nor onRequest functions worked. onDocumentCreated was fine though.

I recently upgraded to 12.6.1 from 12.5.2 (I think. Or it might have been 12.5.4).
Functions that used to work earlier have just stopped working.
I'm going to downgrade and try again.

Yup, works after downgrading to 12.5.2.

@milo-
Copy link
Author
milo- commented Oct 3, 2023

@blidd-google

Is the correct grouped function being invoked before crashing at runtime due to the missing request object?

How do I find this out?

In addition to grouped functions, are you using multiple codebases to organize your functions? This would explain why more than 2 source tokens.

Nope just one codebase. It's a mixture of v1 and v2 functions, and a mix of grouped and ungrouped functions.

What is the shape of the HTTP request you are sending to the onRequest function?

onRequest is being called by a third party doing a POST request. When I try and log out the body with the log out the body it triggers the 'headers' crash 🤦‍♂️

beforeUserSignedIn and onSchedule are being called with whatever call Google makes under the hood to trigger these, and they're failing.

Can you provide the logs from running the deploy command with the --debug flag?

Yes, can you provide an email address I can send them to?

@google-oss-bot google-oss-bot added Needs: Attention and removed Needs: Author Feedback Issues awaiting author feedback labels Oct 3, 2023
@wokkaflokka
Copy link
Contributor

Perhaps this comment is unhelpful or not inline with expecations projects here -- and if so, I apologize and am happy to remove it. I'm leaving this comment to add public evidence that this issue is impacting another project, in addition to the impact already attested to by prior commenters.

My companies projects are observing the error exactly as reported in this issue. The error started occurring after upgrading to firebase-tools@12.6.1. It appears to affect functions non-deterministically -- we deploy over 100 (v2) functions, but so far, only one has been observed with the error reported here.

For that one function, the scope is a total outage. Every request to that function receives an HTTP 500 and exhibits the stacktrace in the initial ticket disclosure.

@Erfa
Copy link
Erfa commented Oct 4, 2023

Can confirm I also had this problem after upgrading to 12.6.1. Downgraded to 12.5.2 and my functions work again.

I really appreciate the team's push to move towards single builds though, I always wondered why it didn't work like that in the first place. I hope it can get stable soon.

@blidd-google
Copy link
Contributor
blidd-google commented Oct 4, 2023

Hi all, thanks for your reports and confirming the issues are from the single builds release. We are investigating but haven't been able to reproduce the behavior, making it difficult to diagnose the underlying issue; would anyone be able to share a minimal example that consistently reproduces the described problems? @Erfa @sathishvj @arnoutvandervorst @lelimit @wokkaflokka

@flying-wizzy
Copy link
flying-wizzy commented Oct 5, 2023

Same issue here, problems started after upgrading to 12.6.2 in my case. Mine is a scheduled pubsub function.

I'm getting

TypeError: Cannot read properties of undefined (reading 'headers')
at /workspace/node_modules/firebase-functions/lib/v2/trace.js:12:68
at /layers/google.nodejs.functions-framework/functions-framework/node_modules/@google-cloud/functions-framework/build/src/function_wrappers.js:141:25
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)

Whenever it should be firing.

Edit:
As @milo- mentioned, the function works fine again after downgrading firebase-tools and redeploying the function again. Personally I downgraded to 12.5.4.

@milo-
Copy link
Author
milo- commented Oct 12, 2023

@blidd-google have you had any luck identifying the issue?

@blidd-google
Copy link
Contributor

Hi @milo-, unfortunately we are still unable to reproduce the issue - I've escalated with the Google Cloud Functions team, but it would be incredibly helpful if you or another user observing the same behavior could securely share code / a repro through Firebase Support channels so we could take a look at what's going on in more detail.

@lox
Copy link
lox commented Oct 17, 2023

We are also seeing this issue. My pet theory is that containers are sometimes running the wrong function. The various errors you get are when a function that is expecting to take a request doesn't have a request, etc.

We have 25 functions (all v2) and when we deploy them, we randomly get our http and onBeforeCreate functions failing with the above symptoms. I logged firebase/firebase-functions#1471 thinking it might be a firebase-functions or function-framework issue, but after some more thought it seems likely the changes in 12.6.1 are the culprit.

A smoking gun (IMO) is this log entry:

{
insertId: "652e589e000c13df16d51d30"
labels: {2}
logName: "projects/xxx/logs/run.googleapis.com%2Fvarlog%2Fsystem"
receiveTimestamp: "2023-10-17T09:49:18.793005369Z"
resource: {
labels: {
configuration_name: "http-trackanalytics"
revision_name: "http-trackanalytics-00057-baw"
location: "australia-southeast1"
project_id: "xxx-dev"
service_name: "http-trackanalytics"
}
type: "cloud_run_revision"
}
severity: "INFO"
textPayload: "Default STARTUP TCP probe succeeded after 1 attempt for container "firebase--on-feedback-written-1" on port 8080."
timestamp: "2023-10-17T09:49:18.791519Z"
}

Note that the service started by the tcp probe differes from the service name in the labels. The one in the probe is a firebase on update function vs an http function.

What's also noteable is that the http-trackanalytics function doesn't actually log any errors. I can see the 200 OPTIONS requests, but not the 500s.

@onurays
Copy link
onurays commented Oct 17, 2023

I confirm the exact problem with v12.7.0. I suspect that wrong functions are trying to be called according to logs. Crazy...

@blidd-google
Copy link
Contributor

Hi all, would anyone be willing to share their project ID and function ID of a function experiencing the broken behavior via Firebase Support channel linked here?

It would really help us to have a specific example of the issue to investigate; otherwise, it's hard for us to diagnose what the underlying problem is. If anyone is able to provide a repro, that would work great as well. Thanks all for your patience.

@lox
Copy link
lox commented Oct 17, 2023

Yup, absolutely. I am also happy to do a video call at a time that suits showcasing the issue.

@lox
Copy link
lox commented Oct 17, 2023

We have a support case 10253449 already open

@Mr-Goldberg
Copy link
Mr-Goldberg commented Oct 18, 2023

Having a similar issue.
With firebase-tools 12.7.0, when I deploy all the functions with "firebase deploy --only functions" some of the scheduled functions fail with the mentioned error 500 (...headers). However, if I deploy a single function with, for example, "firebase deploy --only functions:backend:service_logs", the function is being invoked successfully on the given schedule.
The problem is resolved when downgrading to firebase-tools 12.5.2.

UPD: support case 10253785 is sent.

@blidd-google
Copy link
Contributor

Thanks all for sharing the details of your deployments and helping us get to the bottom of this. One more follow up question - on your deployment attempts, is it always the same functions that are breaking, or are functions seeming to run into this error randomly?

@Mr-Goldberg
Copy link
Mr-Goldberg commented Oct 18, 2023 via email

@flying-wizzy
Copy link

Same here. I have a total of 5 functions deployed, 4 onCreate and onDelete that work fine, but the remaining scheduled pubsub function keeps failing.

@wokkaflokka
Copy link
Contributor
wokkaflokka commented Oct 18, 2023

Hi all, would anyone be willing to share their project ID and function ID of a function experiencing the broken behavior via Firebase Support channel linked here?

Just noting, I submitted this data via the support tracker just now (Case Number: 10253826)

It will be difficult for me to isolate this to a sandbox project, but if it helps I could meet for 30 minutes and demonstrate the issue on one of our projects.

on your deployment attempts, is it always the same functions that are breaking, or are functions seeming to run into this error randomly?

It seems to affect the same functions deterministically. I haven't systematically attempted to prove/confirm this.

@lox
Copy link
lox commented Oct 21, 2023

Any updates folks? This seems like an absolutely critical issue.

@chrisspiegl
Copy link

I want to add: I experienced the same issue with onDocumentWrite & onSchedule with v2 functions.

  • Downgrade to firebase-tools: 12.5.2 did work.
  • Deploying an individual function which is onSchedule does also work.
  • Deploying a set of functions is the one case where it breaks for all of these functions.
  • The functions that keep working are the http request functions.

It's crazy that this bug is not fixed at this time! I second @lox in saying that this is critical and more importantly: a serious break in trust for future updates (which are part of a minor version bump and multiple releases at this point)!

@lox
Copy link
lox commented Oct 22, 2023

Firebase folks, feel free to book me here for a quick video call to assist with reproducing the issue:

https://zcal.co/lox/firebase-6414

@ChromeQ
Copy link
ChromeQ commented Oct 22, 2023

I also ran into this since 12.6.0 (and 12.6.1 and 12.7.0)
I can also confirm from my logs that it is calling the incorrect functions after deployment. This is because the incorrect functions are potentially being deployed somehow, I have all v2 functions (3 firestore documents handlers + 2 scheduled functions) and I can see the following in the logs when deploying:
image

It shows the incorrect function was (potentially?) deployed, in this case "towntime" is the name of a new scheduled function added - this can be seen more clearly when searching for "Default STARTUP TCP probe succeeded after 1 attempt for container" and it deployed and worked correctly until the 20th October (when I upgraded)
image

Happy to get on that call with you I just booked using your link @lox

@lox
Copy link
lox commented Oct 22, 2023

@ChromeQ I am experiencing the issue myself, not a Firebase dev 😅

@blidd-google
Copy link
Contributor

Hi all, we've been able to reproduce the error and have identified the underlying bug. We're currently working on a fix and will keep folks updated on this issue thread. Thanks for your continued patience as we iron out the remaining kinks with the single builds feature - we believe that the improvements in deploy speed & reliability that single builds offer are worth the effort, and you should be able to take advantage of those improvements without problems very soon!

@lox
Copy link
lox commented Oct 27, 2023

@blidd-google are you able to address these questions:

  1. Did this affect all V2 functions? What was the identified issue?
  2. If functions were being incorrectly invoked, could there have been security impact? For instance onDocumentUpdate functions triggered by http without auth checks?
  3. Why was a change of this magnitude introduced in a minor version? Will the release regime in future change?

My confidence in Firebase Functions is at an all time low after this issue and the recent multi-hour deployment outage that didn't even merit a status page update.

Would really help to see a nuanced response here, are there any product managers involved in Firebase any more or is it more best-effort internally?

@joehan
Copy link
Contributor
joehan commented Oct 31, 2023

First off, a quick update: We have a backend fix for this issue rolling out to production. It should land this week.

To answer @lox 's questions:

Did this affect all V2 functions? What was the identified issue?
Though there were a couple issues related to this launch, this specific issue only affected deployments with multiple V2 functions using different trigger types.

A brief bit of background: previously, deployments of many functions would generate some nearly identical artifacts for each function. This update used 'source tokens' to tell Cloud Functions to reuse these artifacts where possible to make deploys faster, cheaper, and more reliable.

The cause of this issue is a bug where functions deployed with a source token used the trigger type of the function that created that source token instead of the correct trigger type. This trigger type controls how the Cloud Function decodes requests and responses. So, for example, if a source token from a Firestore triggered function was used for a HTTPS function, it would error out when trying to decode a HTTPS request as a request from Firestore.

If functions were being incorrectly invoked, could there have been security impact? For instance onDocumentUpdate functions triggered by http without auth checks?
This issue did not have any security impact. The correct functions are being called, this was just causing errors with how requests were decoded.

Why was a change of this magnitude introduced in a minor version? Will the release regime in future change?
We added this in a minor version because it is not an 'incompatible API change', per https://semver.org/.

We do understand the frustration here, and we definitely need to do a better job of catching issues like this before they reach you. This issue exposed some gaps in our test coverage, and we're currently working on a revamp the functions integration test suite. Additionally, while Github is a great place to report issues, please reach out to Firebase Support for help with any billing issues.

@lox
Copy link
lox commented Oct 31, 2023

Thanks for the detailed update @joehan!

Very much looking forward to being able to benefit from artifact re-use (and speed up our deploys).

@ChromeQ
Copy link
ChromeQ commented Nov 2, 2023

Thanks for the update @joehan , I gave this a try and it seemed to work for the most part (although I only deployed two functions of the same type so may not have been a problem as per your description above)
However there is an issue within the logs which gave me the impression it was not resolved:

image

Ignore the error (that was me prematurely deleting a storage bucket!) but you can see the bit above saying the function named "scrape-happiest-hour" was deployed and successfully probed the container, this was using firebase-tools@12.5.4.
After the upgrade to firebase-tools@12.8.0 this morning and another deploy at 10.36am - you can see the probe on the container named "scrape-towntime" (the other function I deployed at the same time as this one).
But the next lines are the invocation of the function and the correct name comes through "scrape-happiest-hour" - so it seems it is working correctly.

But as I mentioned the container name is incorrect and mixed up with the first named function, this is the command I ran:
firebase deploy --only functions:scrapeTowntime,functions:scrapeHappiestHour so it looks like the first function listed is the name coming through all the others as described in your analysis earlier.

Hope this helps track down any other lingering problem that may still exist.

@wokkaflokka
Copy link
Contributor
wokkaflokka commented Nov 2, 2023

@joehan thank you for the clear explanation -- very well done.

@ChromeQ -- If i take the explanation from @joehan, they've stated "We have a backend fix for this issue rolling out to production. It should land this week."

Based on that comment, it's not clear that the proposed fix has been released. The name mismatch is characteristic of the initial reported error and is more generally characteristic of the class of error as disclosed by maintainers. I hope your issue is fixed and wish I could help more.

@lox
Copy link
lox commented Nov 2, 2023

I assume https://github.com/firebase/firebase-tools/releases/tag/v12.8.0 didn't include the fix.

@arnoutvandervorst
Copy link

Based on that backend fix, any version after 12.5.4 won't work in our cases since that version started using single builds. When the backend fix has been released, we can safely upgrade to any new version.

@milo-
Copy link
Author
milo- commented Nov 5, 2023

@joehan has the backend fix landed in production? Thanks!

@blidd-google
Copy link
Contributor

The backend fix for the single builds feature has been fully released; functions deployed with CLI versions >=12.6.2 will be able to correctly decode requests. We will continue to monitor this thread in case anything else comes up, but thank you all for your patience as we ironed out the kinks for this feature. As @joehan mentioned, the Firebase functions team is working on overhauling our integration tests to mitigate potential future issues as we continue to make improvements to 2nd gen functions.

@joehan
Copy link
Contributor
joehan commented Nov 7, 2023

I'm going to close this issue for now since AFAIK, it is fixed. If you've had any issues with your functions, the quickest way to fix them should be to update to the latest version of firebase-tools and redeploy them. Please don't hesitate to ping here again (or open new issues) if you are still having trouble with functions deploy.

@joehan joehan closed this as completed Nov 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests