Batched statements with returned keys may fail when parameters are set with 2-arg setObject #267

Open
ringerc opened this issue Mar 11, 2015 · 9 comments
ringerc commented Mar 11, 2015

Per #266, batched statements can be re-planned mid-batch if parameter types change, and a switch from a server-inferred type for an unknown/null value to a client-inferred type counts as such a change.

This has an unexpected effect when it interacts with generated-keys (RETURNING) support in batches: knowledge of the result field types is lost on re-planning.

This can cause exceptions such as a NullPointerException from a null 'fields' member in the Jdbc4ResultSet object returned by getGeneratedKeys, with a stack like:

Jdbc4ResultSet(AbstractJdbc2ResultSet).checkColumnIndex(int) line: 2866   
Jdbc4ResultSet(AbstractJdbc2ResultSet).checkResultSet(int) line: 2884   
Jdbc4ResultSet(AbstractJdbc2ResultSet).getInt(int) line: 2118   
BulkUpdateJdbcEdgeCase.batchResultSet(Connection, ArrayList<ReportSectionBean>) line: 67   
BulkUpdateJdbcEdgeCase.main(String[]) line: 28

or, after commit e655263:

java.lang.NullPointerException: fields must be non-null
    at org.postgresql.jdbc2.AbstractJdbc2ResultSet.<init>(AbstractJdbc2ResultSet.java:111)
    at org.postgresql.jdbc3.AbstractJdbc3ResultSet.<init>(AbstractJdbc3ResultSet.java:26)
    at org.postgresql.jdbc3g.AbstractJdbc3gResultSet.<init>(AbstractJdbc3gResultSet.java:27)
    at org.postgresql.jdbc4.AbstractJdbc4ResultSet.<init>(AbstractJdbc4ResultSet.java:22)
    at org.postgresql.jdbc4.Jdbc4ResultSet.<init>(Jdbc4ResultSet.java:26)
    at org.postgresql.jdbc4.Jdbc4Statement.createResultSet(Jdbc4Statement.java:36)
    at org.postgresql.jdbc2.AbstractJdbc2Statement$BatchResultHandler.handleResultRows(AbstractJdbc2Statement.java:2717)
    at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1938)
    at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:421)
    at org.postgresql.jdbc2.AbstractJdbc2Statement.executeBatch(AbstractJdbc2Statement.java:2929)
    at com.albourne.db.BulkUpdateJdbcEdgeCase.batchResultSet(BulkUpdateJdbcEdgeCase.java:63)
    at com.albourne.db.BulkUpdateJdbcEdgeCase.main(BulkUpdateJdbcEdgeCase.java:28)

or after commit e014cbd:

java.lang.IllegalStateException: Received resultset tuples, but no field structure for them
    at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1940)
    at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:421)
    at org.postgresql.jdbc2.AbstractJdbc2Statement.executeBatch(AbstractJdbc2Statement.java:2929)

A test case demonstrating this issue will follow.
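
For reference, here is roughly the shape of code that triggers it. This is a minimal sketch only - the connection details are placeholders and the table matches the protocol logs later in this issue - not the actual test case:

    import java.sql.*;

    public class Issue267Sketch {
        public static void main(String[] args) throws SQLException {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost/test", "user", "pass");
                 PreparedStatement ps = conn.prepareStatement(
                     "INSERT INTO testtable (title) VALUES (?)",
                     new String[]{"report_section_id"})) {
                // 2-arg setObject with null leaves the parameter type unknown,
                // so the server infers it when the statement is first parsed
                // and described.
                ps.setObject(1, null);
                ps.addBatch();
                // A concrete String then forces a client-inferred varchar type,
                // re-planning the statement mid-batch and discarding the
                // cached result field metadata.
                for (int i = 0; i < 4; i++) {
                    ps.setObject(1, "title " + i);
                    ps.addBatch();
                }
                ps.executeBatch();
                try (ResultSet keys = ps.getGeneratedKeys()) {
                    while (keys.next()) {
                        System.out.println(keys.getInt(1)); // fails as above
                    }
                }
            }
        }
    }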

@ringerc ringerc added the bug label Mar 11, 2015
@ringerc ringerc self-assigned this Mar 11, 2015
ringerc commented Mar 11, 2015

ringerc commented Mar 11, 2015

The fields are cleared in QueryExecutorImpl.sendParse(...) where query.setFields(null); is called.

Either no follow-up Describe is sent, or it isn't processed in time, or it isn't assigned to the correct object. So the fields that define the expected result types remain null, even though we still get results from the query.

To fix this we must be sure to re-describe if we expect results and are re-planning. It doesn't seem like sendParse is the right place to do this, but arguably it's also the wrong place to clear the cached fields. The challenge isn't fixing this bug, it's fixing it in a way that doesn't make an already ugly situation even uglier or introduce new bugs.

A workaround, as recommended in #266, is to use setObject(int, Object, int) or setNull(int, int) to specify the expected type for nulls.
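
In code, the workaround looks like this (a sketch; Types.VARCHAR is an assumption standing in for the column's actual type):

    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.sql.Types;

    class TypedNullBinding {
        // Pin the parameter type explicitly so it can't flip between
        // server-inferred (unknown) and client-inferred (varchar) mid-batch.
        static void bindTitle(PreparedStatement ps, String title)
                throws SQLException {
            if (title == null) {
                ps.setNull(1, Types.VARCHAR);            // typed null
            } else {
                ps.setObject(1, title, Types.VARCHAR);   // 3-arg setObject
            }
        }
    }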

ringerc commented Mar 12, 2015

I initially thought that the issue was the call to setFields(null) at QueryExecutorImpl.java:1260 in sendParse(...). More tracing now suggests that the fields member is being cleared during SimpleQuery.unprepare(), so that wasn't the culprit.

It's cleared at:

SimpleQuery.unprepare() line: 238   
QueryExecutorImpl.sendParse(SimpleQuery, SimpleParameterList, boolean) line: 1254   
QueryExecutorImpl.sendOneQuery(SimpleQuery, SimpleParameterList, int, int, int) line: 1648  
QueryExecutorImpl.sendQuery(V3Query, V3ParameterList, int, int, int, QueryExecutorImpl$ErrorTrackingResultHandler) line: 1205   
QueryExecutorImpl.execute(Query[], ParameterList[], ResultHandler, int, int, int) line: 412 
Jdbc4PreparedStatement(AbstractJdbc2Statement).executeBatch() line: 2929    

after having been set by a Describe result:

SimpleQuery.setFields(Field[]) line: 182    
QueryExecutorImpl.processResults(ResultHandler, int) line: 2071 
QueryExecutorImpl.execute(Query, ParameterList, ResultHandler, int, int, int) line: 255 
Jdbc4PreparedStatement(AbstractJdbc2Statement).executeBatch() line: 2911    

in response to the Describe that was sent by sendQuery at QueryExecutorImpl.execute(...) line 253.

So, in the logs, the fields of the SimpleQuery get set during RowDescription(1) processing, after the Parse/Describe phase:

13:36:09.237 (1)  FE=> Parse(stmt=S_1,query="INSERT INTO testtable (title) VALUES ($1) RETURNING "report_section_id"",oids={0})
13:36:09.240 (1)  FE=> Describe(statement=S_1)
13:36:09.240 (1)  FE=> Sync
13:36:09.242 (1)  <=BE ParseComplete [S_1]
13:36:09.243 (1)  <=BE ParameterDescription
13:36:13.722 (1)  <=BE RowDescription(1)
13:36:13.727 (1)         Field(,INT4,4,T)
13:36:18.654 (1)  <=BE ReadyForQuery(I)

At this point we're OK and we have the result set fields. However, continuing from there:

13:36:21.448 (1) batch execute 5 queries, handler=org.postgresql.jdbc2.AbstractJdbc2Statement$BatchResultHandler@2d209079, maxRows=0, fetchSize=0, flags=80
13:36:27.777 (1)  FE=> Bind(stmt=S_1,portal=null,$1=<NULL>)
13:36:31.835 (1)  FE=> Execute(portal=null,limit=0)
13:36:38.181 (1)  FE=> CloseStatement(S_1)
13:36:38.182 (1)  FE=> Parse(stmt=S_2,query="INSERT INTO testtable (title) VALUES ($1) RETURNING "report_section_id"",oids={1043})

in the first Execute, the Query's fields get cleared by unprepare.

In working input cases, unprepare is never called because we don't re-plan the query mid-batch.

The underlying issue therefore appears to be that we fail to re-Describe the query correctly after re-planning when we require result-set information, or that we fail to process the Describe result properly, or in the right order.

Relying on preDescribe (Jdbc4PreparedStatement.java:2871) therefore seems to be incorrect for returning-with-results.

ringerc commented Mar 12, 2015

Proposed fix: if AbstractJdbc2Statement.wantsGeneratedKeysAlways is set and we're unpreparing and re-executing, send a new Describe and queue it for processing by setting describeStatement in sendOneQuery.

We still have to send the first Describe before the batch, because that's what's used to make sure the buffer doesn't overfill and cause a protocol deadlock. That can be removed if, at some future point, PgJDBC finally gets proper buffer management.

A test that forces describeStatement = true unconditionally (which should be OK; skipping the Describe is just an optimisation) mostly works.

For some reason, the first row in the ResultSet object is garbage, the byte[]:

[[0, 0, 0, -47]]

while the later four rows correspond to expected values:

 [[50, 49, 48]]
 [[50, 49, 49]]
 [[50, 49, 50]]
 [[50, 49, 51]]

If this first entry is skipped, everything works.

The cause appears to be that the Describe message for the first statement gets a result saying that the result field is text-typed - but PgJDBC actually overrides that in the Bind message it sends, specifying the desired result format of column 1 as binary. So the Row Description response identifies the result as 'text' format, but the actual content is binary. That's not a Pg bug - we asked to describe the statement, and it did. Then we bound parameters that requested binary results and ran it.

Nonetheless, it causes the result column to be misidentified.
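
Decoding the raw bytes bears this out - the "garbage" first row is a perfectly good key, just in binary format (a quick standalone check, not driver code):

    import java.nio.ByteBuffer;
    import java.nio.charset.StandardCharsets;

    class DecodeRows {
        public static void main(String[] args) {
            // First row: big-endian binary int4. 0x000000D1 == 209
            // (-47 as a signed byte is 0xD1).
            System.out.println(ByteBuffer.wrap(new byte[]{0, 0, 0, -47}).getInt());
            // Later rows: text encodings of the following keys, e.g.
            // {50, 49, 48} is the ASCII string "210".
            System.out.println(new String(new byte[]{50, 49, 48},
                    StandardCharsets.US_ASCII));
        }
    }

The values are even consecutive - 209 followed by 210 through 213 - as you'd expect from sequence-generated keys.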

It looks like that's because we're doing the describe at the wrong phase. We shouldn't be describing the statement, but the bind.

Doing it there, by forcing a sendDescribePortal at QueryExecutorImpl.java:1701 or so, causes the first row to be correctly identified as binary. However, we can't cope with a query returning binary for some executions and text for others, because we store field information at the query level. When we re-plan the query, we discard the knowledge of the prior execution that we needed.

ringerc commented Mar 12, 2015

There are some underlying issues with how batches that return results interact with result sets.

A result set is assumed to have only one set of fields, with one type per field, one binary/text mode per field, and so on. That's fine for normal queries.

However, when a query can be replanned mid-batch because input parameter types have changed, this can change the result types. There is no way to store this knowledge, because the result set can carry only one type per field.

Part of this is really a problem with the JDBC API. It assumes that a batch returns a single resultset, with a single set of metadata describing the results. There's no room in the API for resultsets with mixed types. That's perfectly sensible for single query executions, but not for batches.

Really, getGeneratedKeys() should return a set of result sets, much like a Statement that produces multiple result sets. Then each execution could append a new result set - usually a single row, but not necessarily, since each batch entry could return more than one row.

Right now multiple rows returned from a single batch entry just get concatenated onto the result set and there's no way to tell which batch execution they came from.

There's not much we can do about those JDBC API limitations. It just has to be worked around.
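
To make the limitation concrete: after executeBatch(), everything comes back through one flat ResultSet with a single set of metadata, and the common idiom below silently assumes one generated row per batch entry (a sketch of the usage pattern, not driver code):

    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.util.ArrayList;
    import java.util.List;

    class FlatGeneratedKeys {
        // If any batch entry's RETURNING clause produced zero or several rows,
        // the implied row-to-entry mapping is silently wrong; the API offers
        // no way to tell which batch execution a given row came from.
        static List<Integer> collectKeys(PreparedStatement ps) throws SQLException {
            ps.executeBatch();
            List<Integer> keys = new ArrayList<>();
            try (ResultSet rs = ps.getGeneratedKeys()) {
                while (rs.next()) {
                    keys.add(rs.getInt(1));
                }
            }
            return keys;
        }
    }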


Additionally, in sendOneQuery the driver tests fields to see if it's valid and skips sending a Describe if it is (see QueryExecutorImpl.java:1701). For a PreparedStatement that later gets re-planned, the field information may be cleared in a later iteration, invalidating the assumption that it's still valid and doesn't need describing.

This is the immediate cause of the bug described at the top of this issue.

I think one of the underlying issues is that we re-use the same SimpleQuery object when we re-plan after a parameter type change, doing an unprepare then re-sending it. It might be the same query, but it's not actually the same execution of the same query anymore, and should have its own object. Additionally, unprepare and close should be deferred until the whole batch has been sent.

All this worked before because we basically turned batches that returned results into one-shot queries run in a loop. Even then, I suspect there may be cases where mixing parameter types between iterations can cause issues.

ringerc commented Mar 13, 2015

The most strictly correct option would be to prohibit parameter types from changing within a batch and to prohibit unknown-typed parameters. This might break existing applications and isn't necessary unless generated keys are requested. In many ways this would be the most correct approach given the JDBC API, but it's the most intrusive.

Another correct option, and one that'll break fewer apps, is to query PostgreSQL to discover compatible types. Then treat those types (like varchar and text) as compatible on the JDBC side, and refuse all other type changes. This would increase startup overhead by quite a bit or require knowledge to be hardcoded in the driver. It'd be complex, especially when handling unknown types. It doesn't seem worthwhile.

All the other options require accepting that the type metadata in a result set might not match the actual contents for every row. We just have to document that and make it the user's problem: if they don't like it, they shouldn't mix types and should avoid the two-argument setObject call. So these options only address the second issue, the null fields. All of them also require that we either always or never request binary-mode results when binding parameters, since the result set metadata cannot track the type or binary-ness of a result on a per-row basis.

Options:

  • Disallow unknown parameter types in batches with generated keys.
  • Don't unprepare the SimpleQuery (clearing fields etc); instead copy it and prepare a new one when type changes are discovered.
  • Create the ResultSet object when we pre-describe if we expect results, rather than lazily. Store the fields there, so it doesn't matter if we clear the statement.
  • Always Describe the portal for each iteration when running a prepared-statement batch; don't rely on the pre-describe result for field metadata. Ensure that binary mode is always consistent in the Bind requests.

I'm evaluating these now. There's a lot of complex code to trace through.

The most logically consistent option seems to be to copy the statement, since it's not the exact same thing being run anymore. However, in that case we'd need a way to tell which statement is associated with which results coming back from the server, which we don't have at the moment. Or we could just use the first statement's fields anyway, which rather negates the purpose of copying it.

For that reason I suspect the eager option will be better - creating the result set object up front when we pre-describe and storing the field metadata there. Except that it will involve some significant internal API changes, since we work with the Statement object and use the 'fields' to determine whether there's an expected result set.


It looks like we can reasonably easily copy the query, rather than unpreparing it, when replanning after type changes. That avoids the need to force a Describe for each execution.

We have a queue to lazily clean up parsed statements already, which is automatically appended to when the ParseComplete message (case '1', QueryExecutorImpl.java:1826) is hit. In case 'T' (row description handling at QueryExecutorImpl.java:2081) we already pop a queue of describe requests so we can associate the statement with a describe response there.

It'll still be necessary to ensure that binary mode is used for results in all executions or none, so first I have to find why we're using binary for the first execution and not later ones.

ringerc commented Mar 16, 2015

See commit 763ae84 for the binary mode fix. The other part is still pending.

ringerc commented Mar 16, 2015

Copying the query isn't looking very practical. QueryExecutorImpl.execute(Query[], ...) expects a query array, and it's not easy to reach back up to that level and change the query list later. Copying in addBatch is too early, because we don't yet know whether the server's type inference for unknown-typed params will match what we send, so we'd have to copy unconditionally, hurting the vast majority of workloads that don't care about this. It isn't practical to check and copy just before execution either, because we don't even know whether we'll have to replan until we see the first Describe result and can compare it against the supplied parameter sets.

If we force an extra Describe when the query is first executed, even if its fields were already defined by a pre-describe, that'll ensure the fields are always populated during result-set processing. We know there'll be a Describe response in the pipeline, and while it's usually redundant it doesn't cost a round-trip, so it's pretty harmless. However, if the fields are null during later batch executions we'll fall back to forcing individual query executions to prevent send/recv deadlocks, because we can no longer guess the result row widths - defeating the whole purpose of this change in the first place.

A proper fix to this looks like it'll require separating the statement-level Describe from the portal-level Describe, so we can keep field information about the generic statement and then keep track of any parameter-specific executions separately. That'll be quite a big change across some complex code for little benefit.

All in all the whole thing is incredibly messy.

ringerc commented Mar 16, 2015

I ended up going with a bit of a hack: force a Describe at each execution if we're running a batch with generated keys. If the fields get cleared due to type changes, we'll fall back to one round trip per batch entry at that point in execution, so there'll be a performance hit, but it won't break any more. The cost of the additional Describe is negligible. It's fiddly to do it only for the first execution after a parse because of the pre-Describe, so I just do it for every run.

I think a better fix for this would be to move the pre-Describe logic so that it's done after a query parse, near sendParse. That way we can force a round-trip wait then, both for the first parse and again if we need to re-parse. That'll involve moving a fair bit of code around and have some less than clear consequences, though.

ringerc added a commit that referenced this issue Mar 16, 2015
This is a workaround for issue #267, where we unprepare a statement
in a batch when the parameter types change but we expect to still
know the field types in the statement we just unprepared.

It's much more complicated than that, see
#267 for the gory details.
@davecramer davecramer added the triage/needs-review label May 22, 2019