[go: nahoru, domu]

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(sql): add table_storage() functionality #4817

Merged

Conversation

siddharth0815
Copy link
Contributor
@siddharth0815 siddharth0815 commented Jul 24, 2024

Exposes table_storage() function which gives the following results for each table :

  • tableName
  • walEnabled
  • totalDiskSize(including partitions)
  • total partition count (excluding attached and detachable partitions)
  • total row count
  • partitionBy

NW: added link to relevant issue, won't close it yet: #3816

Copy link
Collaborator
@ideoma ideoma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution.

Although the function does not show all the partitions, it can still be useful. Please address the comments and

IMPORTANT!

add tests!

@siddharth0815
Copy link
Contributor Author

Thanks for the contribution.

Although the function does not show all the partitions, it can still be useful. Please address the comments and

IMPORTANT!

add tests!

Thanks for the contribution.

Although the function does not show all the partitions, it can still be useful. Please address the comments and

IMPORTANT!

add tests!

@ideoma in the description it was said all_table_partitions() function should be considered to be something like :

select 'table_1' as table_name,'false' as walEnabled,'DAY' as partitionBy,count(*) partition_count,sum(numrows) row_count,(sum(disksize)) as size_b from table_partitions('table_1') group by table_name union all
select 'table_2' as table_name,'false' as walEnabled,'DAY' as partitionBy,count(*) partition_count,sum(numrows) row_count,(sum(disksize)) as size_b from table_partitions('table_2') group by table_name union all
select 'table_3' as table_name,'false' as walEnabled,'DAY' as partitionBy,count(*) partition_count,sum(numrows) row_count,(sum(disksize)) as size_b from table_partitions('table_3') group by table_name

Which is giving cumulative metrics of table partitions
Is there anything else expected in this issue? any understanding gap?

@puzpuzpuz puzpuzpuz added New feature Feature requests SQL Issues or changes relating to SQL execution labels Jul 29, 2024
@nwoolmer nwoolmer self-requested a review July 31, 2024 14:28
Copy link
Contributor
@nwoolmer nwoolmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just a few things around naming. The PR title should change (and have feat(sql): in it)Also, maybe you should upgrade the description as I don't think this solves the issue, its a slightly different (but still very useful!) function.

We've had a few people ask for table size etc. without having to go to filesystem etc.

@siddharth0815
Copy link
Contributor Author
siddharth0815 commented Jul 31, 2024

@ideoma in the description it was said all_table_partitions() function should be considered to be something like :

select 'table_1' as table_name,'false' as walEnabled,'DAY' as partitionBy,count(*) partition_count,sum(numrows) row_count,(sum(disksize)) as size_b from table_partitions('table_1') group by table_name union all
select 'table_2' as table_name,'false' as walEnabled,'DAY' as partitionBy,count(*) partition_count,sum(numrows) row_count,(sum(disksize)) as size_b from table_partitions('table_2') group by table_name union all
select 'table_3' as table_name,'false' as walEnabled,'DAY' as partitionBy,count(*) partition_count,sum(numrows) row_count,(sum(disksize)) as size_b from table_partitions('table_3') group by table_name

Which is giving cumulative metrics of table partitions Is there anything else expected in this issue? any understanding gap?

@nwoolmer can you please check this

@siddharth0815 siddharth0815 changed the title add all_table_partitions() functionality feat(sql): add all_table_partitions() functionality Jul 31, 2024
@siddharth0815 siddharth0815 changed the title feat(sql): add all_table_partitions() functionality feat(sql): add table_storage() functionality Jul 31, 2024
@nwoolmer
Copy link
Contributor
nwoolmer commented Aug 2, 2024

@ideoma in the description it was said all_table_partitions() function should be considered to be something like :

select 'table_1' as table_name,'false' as walEnabled,'DAY' as partitionBy,count(*) partition_count,sum(numrows) row_count,(sum(disksize)) as size_b from table_partitions('table_1') group by table_name union all
select 'table_2' as table_name,'false' as walEnabled,'DAY' as partitionBy,count(*) partition_count,sum(numrows) row_count,(sum(disksize)) as size_b from table_partitions('table_2') group by table_name union all
select 'table_3' as table_name,'false' as walEnabled,'DAY' as partitionBy,count(*) partition_count,sum(numrows) row_count,(sum(disksize)) as size_b from table_partitions('table_3') group by table_name

Which is giving cumulative metrics of table partitions Is there anything else expected in this issue? any understanding gap?

@nwoolmer can you please check this

Hey @siddharth0815 ,

I can see why there is confusion about this, you've not done anything wrong here.

In the original issue, the requester names the function all_tables_partitions. At face value, this sounds like a function which should return you the same results as table_partitions('tbl') but for all your tables, so you don't have to run individual queries and then concatenate them. Certainly, that was my first impression.

But then they provide a query in which they take a summary of these outputs. That could be achieved with a 'true' all_tables_partitions function, by returning all partition rows and then using a GROUP BY.

Your function above matches the request but their naming choice is probably not ideal. Your function is great and returns the summary they asked for - which is the stats for each table aggregated over all its partitions. And therefore, table_storage() seems like a good name for it.

We should ping the requestor to let them know their query is now the function table_storage, and then circle back to an all_tables_partitions function to run table_partitions for every table.

Copy link
Contributor
@nwoolmer nwoolmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for another great contribution!

@bluestreak01 bluestreak01 merged commit aac4cf7 into questdb:master Aug 2, 2024
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
New feature Feature requests SQL Issues or changes relating to SQL execution
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants