-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(Airbyte-ci): add command generate-erd-schema #43310
Conversation
Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>
[skip ci] Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>
The latest updates on your projects. Learn more about Vercel for Git ↗︎ 1 Skipped Deployment
|
airbyte-ci/connectors/pipelines/pipelines/airbyte_ci/connectors/generate_erd_schema/pipeline.py
Outdated
Show resolved
Hide resolved
[skip ci] Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>
add parsers for discover, gemini md output; [skip ci] Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>
[skip ci] Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>
[skip ci] Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>
[skip ci] Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>
[skip ci] Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>
[skip ci] Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>
airbyte-ci/connectors/pipelines/pipelines/airbyte_ci/connectors/generate_erd_schema/pipeline.py
Outdated
Show resolved
Hide resolved
airbyte-ci/connectors/pipelines/pipelines/airbyte_ci/connectors/generate_erd_schema/pipeline.py
Outdated
Show resolved
Hide resolved
airbyte-ci/connectors/pipelines/pipelines/airbyte_ci/connectors/generate_erd_schema/pipeline.py
Outdated
Show resolved
Hide resolved
@@ -18,6 +18,9 @@ more-itertools = "^8.11.0" | |||
docker = "^7" | |||
semver = "^3" | |||
airbyte-protocol-models = "*" | |||
# In an ideal world, we would import the version of airbyte-ci for the connector airbyte-ci will interact with | |||
# Also note that the version that this will currently import is <2 because of the version clash with pydantic | |||
airbyte-cdk = "*" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a comment to highlight this comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has been moved outside of the pipelines and therefore is not relevant anymore. See #43310 (review) for more details
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Recreating comments as GitHub has flagged them as outdated
airbyte-ci/connectors/pipelines/pipelines/airbyte_ci/connectors/generate_erd/pipeline.py
Outdated
Show resolved
Hide resolved
.with_exec(["npm", "install", "-g", "dbdocs"]) | ||
.with_env_variable("DBDOCS_TOKEN", DBDOCS_TOKEN) | ||
.with_workdir("/airbyte_dbdocs") | ||
.with_new_file("/airbyte_dbdocs/dbdocs.dbml", contents=source_dbml_content) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've seen some weird cases where the dbdocs diagram wasn't updated as part of the airbyte-ci command but running the same command locally would update the dbdocs project. Can we trust that if source_dbml_content
is different, the cache won't be triggered or is it safer to add .with_env_variable("CACHEBUSTER", str(uuid.uuid4()))
?
airbyte-ci/connectors/pipelines/pipelines/airbyte_ci/connectors/generate_erd/pipeline.py
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
airbyte-ci/connectors/pipelines/pipelines/airbyte_ci/connectors/generate_erd/relationships.py
Outdated
Show resolved
Hide resolved
airbyte-ci/connectors/pipelines/pipelines/airbyte_ci/connectors/generate_erd/commands.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm requesting change to ask you to follow the airbyte-ci
is just an orchestrator principle
I suggest declaring ERD generation logic in an independent python package which would take inputs passed from airbyte-ci
.
airbyte-ci
would consume your package output and export back the ERDs to the connector container:
Your airbyte-ci step would:
- Run
discover
on the connector and capture the catalog - Mount the catalog to the container in which your package will be installed
- Run the erd generation command in your package
- Export the generated ERD files back to the host in the connector folder
airbyte-ci/connectors/pipelines/pipelines/airbyte_ci/connectors/generate_erd/commands.py
Outdated
Show resolved
Hide resolved
airbyte-ci/connectors/pipelines/pipelines/airbyte_ci/connectors/generate_erd/dbml_assembler.py
Outdated
Show resolved
Hide resolved
airbyte-ci/connectors/pipelines/pipelines/airbyte_ci/connectors/generate_erd/pipeline.py
Outdated
Show resolved
Hide resolved
airbyte-ci/connectors/pipelines/pipelines/airbyte_ci/connectors/generate_erd/pipeline.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Disclaimer: I did not review the internal logic of your ERD packages, I focused on airbyte-ci
.
I've a lot of minor suggestion but nothing blocking! Thanks for following my refactoring original suggest, you got this!
I'm not sure if you want to version control the ERD.
If you want: the current logic is 🆗 .
If you don't: I think it's better to pass the generated file from the erd container to your UploadToDbDocs
step. It will avoid an intermediate "write to host FS / repo" operation.
If you still want to keep the same flow but don't plan on committing generated files (as it could be helpful for debugging to have the generated file on your FS), please add a .gitignore
rule.
airbyte-ci/connectors/pipelines/pipelines/airbyte_ci/connectors/context.py
Outdated
Show resolved
Hide resolved
airbyte-ci/connectors/pipelines/pipelines/airbyte_ci/connectors/generate_erd/pipeline.py
Show resolved
Hide resolved
airbyte-ci/connectors/pipelines/pipelines/airbyte_ci/connectors/generate_erd/pipeline.py
Outdated
Show resolved
Hide resolved
airbyte-ci/connectors/pipelines/pipelines/airbyte_ci/connectors/generate_erd/pipeline.py
Outdated
Show resolved
Hide resolved
airbyte-ci/connectors/pipelines/pipelines/airbyte_ci/connectors/generate_erd/pipeline.py
Outdated
Show resolved
Hide resolved
airbyte-ci/connectors/pipelines/pipelines/airbyte_ci/connectors/generate_erd/pipeline.py
Outdated
Show resolved
Hide resolved
It has been discussed that we should have to ability to commit it so that we can see the different versions and debug. I'm not sure if we will do this at first but I want us to have the ability to do so. |
What
add new command in
airbyte-ci connectors generate-erd
and addresses https://github.com/airbytehq/airbyte-internal-issues/issues/9172How
report /steps example
Review guide
airbyte-ci/connectors/erd/relationships.py
: how we manage relationshipsairbyte-ci/connectors/erd/dbml_assembler.py
: how we generate the dbml fileairbyte-ci/connectors/pipelines/pipelines/airbyte_ci/connectors/generate_erd
: the command logicUser Impact
Given a couple of pre-requisite, users will now be able to generate ERDs in dbdocs using airbyte-ci. For more information, read the README
Can this PR be safely reverted and rolled back?