Batched statements with returned keys may fail if parameters are set with 2-arg setObject #267
The fields are cleared during re-planning, and either no follow-up Describe is sent, or one isn't processed in time, or isn't assigned to the correct object. So the fields that define expected result types remain null, but we still get results from the query. To fix this we must be sure to re-describe if we expect results and are re-planning. A workaround, as recommended in #266, is to use the three-argument form of setObject, so the parameter type is stated explicitly and re-planning is avoided.
I initially thought that the issue was a particular call. The fields are cleared at:
after having been set by a Describe result:
in response to the Describe that was sent earlier. So, per the logs, the Describe result is received and the fields are set.
At this point we're OK and we have the result set fields. However, continuing from there:
In working input cases, the fields are set again in time; here they are not. The underlying issue therefore appears to be that we fail to re-Describe the query correctly after re-planning when we require row set information, or fail to process the Describe result properly, or in the right order.
Proposed fix: re-Describe when re-planning if results are expected. We still have to send the first Describe before the batch, because that's what's used to make sure the buffer doesn't overfill and cause a protocol deadlock. That could be removed if, at some future point, PgJDBC finally gets proper buffer management. A test that forces re-planning shows that, for some reason, the first row in the ResultSet object is garbage:
while the later four rows correspond to the expected values.
If this first entry is skipped then everything works. The cause appears to be that the Describe message for the first statement gets a result saying that the result field is text-typed, but PgJDBC actually overrides that in the Bind message it sends, specifying the desired result format of column 1 as binary. So the Row Description response identifies the result as text format, but the actual content is binary. That's not a Pg bug: we asked to describe the statement, and it did. Then we bound parameters that requested binary results and ran it. Nonetheless, it's causing the result field to be misidentified.

It looks like that's because we're doing the describe at the wrong phase. We shouldn't be describing the statement, but the bind: a portal-level Describe issued after Bind reports the result formats actually in effect.
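As a toy illustration of the phase difference (not PgJDBC code; the class, enum, and method names here are invented), a statement-level Describe is answered before any Bind is processed and so reports the default text format, while a portal-level Describe is answered after Bind and reflects whatever format Bind requested:

```java
// Toy model of the two extended-protocol flows. A statement-level Describe
// precedes Bind, so its RowDescription can only report the default (text)
// result format; a portal-level Describe follows Bind, so it reports the
// format the Bind actually requested. Names are illustrative assumptions.
public final class DescribePhase {
    enum Format { TEXT, BINARY }

    /** Parse -> Describe(statement) -> Bind -> Execute: reply predates Bind. */
    static Format describeStatementFlow(Format bindRequested) {
        return Format.TEXT; // default reported, regardless of what Bind will ask for
    }

    /** Parse -> Bind -> Describe(portal) -> Execute: reply reflects the Bind. */
    static Format describePortalFlow(Format bindRequested) {
        return bindRequested;
    }

    public static void main(String[] args) {
        // Statement-level describe claims text even though Bind wants binary:
        System.out.println(describeStatementFlow(Format.BINARY)); // TEXT (mismatch)
        System.out.println(describePortalFlow(Format.BINARY));    // BINARY (matches)
    }
}
```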
There are some underlying issues with how batches that return results interact with result sets. The result set is assumed to have only one set of fields, with one type per field, one binary/text mode per field, etc. That's fine for normal queries. However, when a query can be re-planned mid-batch because input parameter types have changed, this can change the result types, and there is no way to store this knowledge, because the result set can carry only one type per field.

Part of this is really a problem with the JDBC API. It assumes that a batch returns a single result set, with a single set of metadata describing the results. There's no room in the API for result sets with mixed types. That's perfectly sensible for single query executions, but not for batches; really, each batch execution would need its own result metadata. Right now multiple rows returned from a single batch entry just get concatenated onto the result set, and there's no way to tell which batch execution they came from. There's not much we can do about those JDBC API limitations; they just have to be worked around.

This is the immediate cause of the bug described at the top of this issue. I think one of the underlying issues is that we re-use the same SimpleQuery object when we re-plan after a parameter type change, doing an unprepare and re-parse in place. All this worked before because we basically turned batches that returned results into one-shot queries run in a loop. Even with that, I suspect there may be cases where mixing parameter types between iterations can cause issues.
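The "one set of fields" constraint can be modeled in a few lines. This is a hypothetical sketch of the shape of the problem, not the driver's actual classes:

```java
// Minimal model of the constraint: a result set records one type OID per
// column, fixed at creation, so rows produced by a re-planned execution whose
// Describe reported different column types cannot be represented faithfully.
// Class names and OIDs here are illustrative.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public final class OneTypePerColumn {
    static final int OID_INT8 = 20, OID_TEXT = 25;

    static final class BatchResult {
        final int[] columnOids;              // fixed once, for every appended row
        final List<Object[]> rows = new ArrayList<>();
        BatchResult(int[] oids) { this.columnOids = oids; }

        /** Rows with different column types have nowhere to record their metadata. */
        boolean tryAppend(int[] rowOids, Object[] row) {
            if (!Arrays.equals(rowOids, columnOids)) return false;
            rows.add(row);
            return true;
        }
    }

    public static void main(String[] args) {
        BatchResult rs = new BatchResult(new int[] { OID_INT8 });
        boolean first = rs.tryAppend(new int[] { OID_INT8 }, new Object[] { 1L });
        // A mid-batch re-plan changed the reported key type; it can't be stored:
        boolean second = rs.tryAppend(new int[] { OID_TEXT }, new Object[] { "2" });
        System.out.println(first + " " + second);
    }
}
```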
The most strictly correct option would be to prohibit parameter types from changing within a batch and to prohibit unknown-typed parameters. This might break existing applications, and isn't necessary unless generated keys are requested. In many ways this would be the most correct approach given the JDBC API, but it's the most intrusive.

Another correct option, and one that'll break fewer apps, is to query PostgreSQL to discover compatible types, then treat those types (like varchar and text) as compatible on the JDBC side and refuse all other type changes. This would increase startup overhead by quite a bit, or require the knowledge to be hardcoded in the driver. It'd be complex, especially when handling unknown types, and doesn't seem worthwhile.

All the other options require that we accept that the type metadata in a result set might not match the reality of what's in the result set for all rows. We just have to document that and make it the user's problem: if they don't like it, don't mix types together, and avoid the two-argument setObject. Options:
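The compatible-types option amounts to a small predicate. A sketch, where the OIDs and the compatibility table are illustrative assumptions rather than anything the driver actually ships:

```java
// Sketch of the "treat compatible types as equal" option: only force a
// re-plan when the new parameter type falls outside the old type's
// compatibility set. The table here is hardcoded for illustration; the
// comment text above discusses discovering it from PostgreSQL instead.
import java.util.Map;
import java.util.Set;

public final class TypeCompat {
    static final int UNSPECIFIED = 0, TEXT = 25, VARCHAR = 1043;

    static final Map<Integer, Set<Integer>> COMPATIBLE = Map.of(
            TEXT, Set.of(TEXT, VARCHAR),
            VARCHAR, Set.of(VARCHAR, TEXT));

    /** Would switching a parameter from oldOid to newOid force a re-plan? */
    static boolean forcesReplan(int oldOid, int newOid) {
        if (oldOid == newOid) return false;
        // A server-inferred (unspecified) type changing to a concrete one
        // counts as a change, per the behaviour described in #266.
        if (oldOid == UNSPECIFIED || newOid == UNSPECIFIED) return true;
        return !COMPATIBLE.getOrDefault(oldOid, Set.of(oldOid)).contains(newOid);
    }

    public static void main(String[] args) {
        System.out.println(forcesReplan(TEXT, VARCHAR));        // false: compatible
        System.out.println(forcesReplan(UNSPECIFIED, VARCHAR)); // true: was inferred
    }
}
```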
I'm evaluating these now; there's a lot of complex code to trace through. The most logically consistent option seems to be to copy the statement, since it's not the exact same thing being run anymore. However, in that case we need a way to tell which statement associates with which results coming back from the server, which we don't have at the moment. Or just use the first statement's fields anyway, which rather negates the purpose of copying it. For that reason I suspect the second option will be better: eagerly creating the result set object and storing the field metadata. Except that that'll involve some significant internal API changes, since we work with the Statement object and use the 'fields' to determine whether there's an expected result set.

It looks like we can reasonably easily copy the query, rather than unpreparing it, when re-planning after type changes. That avoids the need to force a Describe for each execution. We already have a queue to lazily clean up parsed statements, which gets appended to automatically.

It'll still be necessary to ensure that binary mode is used for results in all executions or none, so first I have to find why we're using binary for the first execution and not later ones.
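The copy-rather-than-unprepare idea, with the lazy cleanup queue, might be sketched like this (hypothetical names, not PgJDBC's real internals):

```java
// Sketch: instead of unpreparing the shared statement in place when a
// parameter type change forces a re-plan, keep the old server-side statement,
// queue its name for a lazy Close, and parse a fresh copy under a new name.
// The old statement's field metadata stays intact for results already in
// flight. All names here are invented for illustration.
import java.util.ArrayDeque;
import java.util.Deque;

public final class ReplanByCopy {
    static final class PreparedQuery {
        final String statementName;   // server-side prepared statement name
        final int[] paramOids;        // parameter types it was parsed with
        PreparedQuery(String name, int[] oids) { statementName = name; paramOids = oids; }
    }

    final Deque<String> cleanupQueue = new ArrayDeque<>(); // names to Close lazily
    int counter = 0;

    /** Re-plan by copying: old statement stays valid until its queued Close runs. */
    PreparedQuery replan(PreparedQuery old, int[] newParamOids) {
        cleanupQueue.add(old.statementName);
        return new PreparedQuery("S_" + (++counter), newParamOids);
    }

    public static void main(String[] args) {
        ReplanByCopy driver = new ReplanByCopy();
        PreparedQuery q1 = new PreparedQuery("S_0", new int[] { 0 });
        PreparedQuery q2 = driver.replan(q1, new int[] { 1043 });
        System.out.println(q2.statementName + " pendingCloses=" + driver.cleanupQueue.size());
    }
}
```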
See commit 763ae84 for the binary mode fix. The other part is still pending.
Copying the query isn't looking very practical. If we force an extra Describe when the query is first executed, even if its fields were already defined by a pre-describe, that'll ensure that the fields are always populated during result set processing. We know there'll be a Describe response in the pipeline, and while it's usually redundant it's also not costing a round trip, so it's pretty harmless.

A proper fix looks like it'll require separating the statement-level Describe from the portal-level Describe, so we can keep field information about the generic statement and then track any parameter-specific executions separately. That'll be quite a big change across some complex code for little benefit. All in all, the whole thing is incredibly messy.
I ended up going with a bit of a hack: force a Describe at each execution if we're running a batch with generated keys. I think a better fix would be to move the pre-Describe logic so that it's done after a query parse.
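The hack above amounts to a small decision rule; a sketch with invented names, not the driver's actual code:

```java
// Hypothetical sketch of the decision rule behind the hack: force a
// per-execution Describe only when a batch requests generated keys, since
// that's when re-planning can strip the field metadata and turn results into
// a NullPointerException. Names are illustrative, not PgJDBC internals.
public final class ForceDescribePolicy {
    static boolean shouldForceDescribe(boolean isBatch, boolean wantsGeneratedKeys) {
        // Mid-batch re-planning can invalidate previously-described fields,
        // so when keys are wanted we re-Describe on every execution.
        return isBatch && wantsGeneratedKeys;
    }

    public static void main(String[] args) {
        System.out.println(shouldForceDescribe(true, true));   // true
        System.out.println(shouldForceDescribe(true, false));  // false
        System.out.println(shouldForceDescribe(false, true));  // false
    }
}
```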
Per #266, batched statements can be re-planned mid-batch if parameter types change, and a change from a server-inferred type for an unknown/null value to a client-inferred type can count as a change.
This has an unexpected effect when interacting with generated keys (RETURNING) support in batches, where knowledge of the result field types is lost when re-planning. This can cause exceptions like a NullPointerException on a null 'fields' member in the Jdbc4ResultSet object returned by getGeneratedKeys. (The exact stack trace involved changed after commits e655263 and e014cbd.)
A test case demonstrating this issue will follow.
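Pending that test case, here is a sketch of the shape it might take. The table, column, and values are invented for illustration, and actually reproducing the failure requires a live PostgreSQL connection, which is why main takes a JDBC URL as its argument:

```java
// Sketch of a reproducer: a batch with generated keys where the first
// occurrence of the parameter is a null set via two-argument setObject (so
// its type is server-inferred), and a later occurrence is a concrete value
// (client-inferred type), forcing a mid-batch re-plan.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public final class BatchReturningRepro {
    static final String DDL = "CREATE TEMP TABLE t (id serial PRIMARY KEY, v text)";
    static final String SQL = "INSERT INTO t (v) VALUES (?)";

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: BatchReturningRepro <jdbc-url>");
            return;
        }
        try (Connection conn = DriverManager.getConnection(args[0])) {
            try (Statement s = conn.createStatement()) { s.execute(DDL); }
            try (PreparedStatement ps =
                         conn.prepareStatement(SQL, Statement.RETURN_GENERATED_KEYS)) {
                ps.setObject(1, null);        // type left for the server to infer
                ps.addBatch();
                ps.setObject(1, "concrete");  // type now client-inferred: re-plan
                ps.addBatch();
                ps.executeBatch();
                try (ResultSet keys = ps.getGeneratedKeys()) {
                    while (keys.next()) System.out.println(keys.getLong(1));
                }
            }
        }
    }
}
```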