Create a git-sync container image to be used with airflow
Closed, Resolved (Public)

Description

In T368033 (Design a suitable DAG deployment method), we have been discussing what sort of system we might use for DAG management for Airflow under Kubernetes.

It looks like we may want to use the git-sync sidecar method, which would require a container image that provides this functionality.

The upstream git-sync project is here: https://github.com/kubernetes/git-sync

We may want to use this project and build our own version of the image with Blubber/kokkuri.

Bear in mind that we may find other uses for git-sync, such as T347421: [NEEDS GROOMING] schema services should be moved to k8s.

Event Timeline

Gehel triaged this task as High priority. (Jul 9 2024, 8:10 AM)
Gehel moved this task from Incoming to Scratch on the Data-Platform-SRE board.
bking changed the task status from Open to In Progress. (Jul 12 2024, 1:32 PM)
bking claimed this task.
bking subscribed.

I have the git-sync binary building locally on a Linux machine via the repo-supplied build script. The next step is to get it working with Blubber.

I forgot to tag this task, but I also have an MR up to add the new repo to the trusted runners.

bking opened https://gitlab.wikimedia.org/repos/data-engineering/git-sync/-/merge_requests/7

Add .gitlab-ci.yml (gitlab CI config), use wget instead of git for fetching source code, use envvars

bking merged https://gitlab.wikimedia.org/repos/data-engineering/git-sync/-/merge_requests/7

Tested by running the following command within the Docker container:

/usr/local/bin/git-sync --branch=main --depth=1 --dest=repo --max-sync-failures=0 --ref=main --repo=https://gitlab.wikimedia.org/repos/releng/kokkuri.git --rev=HEAD --root=/home/git-sync
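A scripted smoke test along these lines could assemble the arguments programmatically. The sketch below is only an illustration, not the project's tooling: `build_git_sync_args` is a hypothetical helper, and it uses the flag spellings (`--ref`, `--link`, `--max-failures`, `--one-time`) that appear in the git-sync 4.2.4 startup log further down this task, rather than the ones in the command above. Check the flag names against `git-sync --help` for whichever version the image ships.

```python
import shlex


def build_git_sync_args(repo, root, ref="main", link="repo", depth=1,
                        max_failures=0, one_time=True):
    """Return an argv list for a git-sync run (hypothetical helper).

    Flag names are taken from the startup log quoted in this task;
    they are an assumption for any other git-sync version.
    """
    args = [
        "git-sync",
        f"--repo={repo}",
        f"--root={root}",
        f"--ref={ref}",
        f"--link={link}",
        f"--depth={depth}",
        f"--max-failures={max_failures}",
    ]
    if one_time:
        # Sync once and exit instead of polling the remote forever.
        args.append("--one-time")
    return args


if __name__ == "__main__":
    argv = build_git_sync_args(
        repo="https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags",
        root="/home/runuser",
    )
    # Print a shell-quoted command line that can be pasted into the container.
    print(shlex.join(argv))
```

Inside the container, the returned list could be handed to `subprocess.run(argv, check=True)` so that a non-zero exit fails the test.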

In production, we will probably want to invoke git-sync in a fashion similar to this GitHub comment.

At this point, I believe the image is done but will need review from the team... hmm, I think there's a Phab status for that ;)

Per a Slack conversation with @BTullis, he has reviewed the image and is happy with it for the time being. Side note: I had some trouble finding information about existing WMF-hosted Docker registry images while working on this ticket, so I created T371549 to track that. Resolving this ticket, as the acceptance criteria (AC) are met.

Yep, I executed:

btullis@marlin:~$ docker run -it docker-registry.wikimedia.org/repos/data-engineering/git-sync
runuser@9bcf787d48ad:/home/git-sync$ git-sync --one-time --ref=main --depth=1 --link=repo --max-failures=0 --repo=https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags --root=/home/runuser

Logs were as follows:

{"logger":"","ts":"2024-08-07 15:53:42.035888","caller":{"file":"main.go","line":568},"level":0,"msg":"starting up","version":"4.2.4","pid":433,"uid":900,"gid":900,"home":"/home/runuser","flags":["--depth=1","--link=repo","--max-failures=0","--one-time=true","--ref=main","--repo=https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags","--root=/home/runuser"]}
{"logger":"","ts":"2024-08-07 15:53:42.037091","caller":{"file":"main.go","line":673},"level":0,"msg":"git version","version":"git version 2.39.2"}
{"logger":"","ts":"2024-08-07 15:53:43.120806","caller":{"file":"main.go","line":1682},"level":0,"msg":"update required","ref":"main","local":"5569f857581fa58facff457891caaa20956a43ef","remote":"5569f857581fa58facff457891caaa20956a43ef","syncCount":0}
{"logger":"","ts":"2024-08-07 15:53:43.149370","caller":{"file":"main.go","line":1728},"level":0,"msg":"updated successfully","ref":"main","remote":"5569f857581fa58facff457891caaa20956a43ef","syncCount":1}
{"logger":"","ts":"2024-08-07 15:53:43.149459","caller":{"file":"main.go","line":906},"level":0,"msg":"exiting after one sync","status":0}

Looks good to me.
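The same check could be automated by parsing git-sync's JSON log lines instead of reading them by eye. This is a minimal sketch, assuming the log format shown above (one JSON object per line with `msg` and `status` fields); `sync_succeeded` is a hypothetical helper, and the two sample lines are copied from the log output in this task.

```python
import json


def sync_succeeded(log_lines):
    """Return True if the lines show a successful one-time sync.

    Hypothetical helper: it looks for an "updated successfully" message
    followed by "exiting after one sync" with status 0.
    """
    updated = False
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # Skip anything that is not a JSON log line.
        if not isinstance(entry, dict):
            continue
        if entry.get("msg") == "updated successfully":
            updated = True
        elif entry.get("msg") == "exiting after one sync":
            return updated and entry.get("status") == 0
    return False


SAMPLE = [
    '{"logger":"","ts":"2024-08-07 15:53:43.149370","caller":{"file":"main.go","line":1728},"level":0,"msg":"updated successfully","ref":"main","remote":"5569f857581fa58facff457891caaa20956a43ef","syncCount":1}',
    '{"logger":"","ts":"2024-08-07 15:53:43.149459","caller":{"file":"main.go","line":906},"level":0,"msg":"exiting after one sync","status":0}',
]

if __name__ == "__main__":
    print(sync_succeeded(SAMPLE))
```

Matching on message text is brittle across git-sync releases, so checking the process exit code as well is probably the safer signal.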