From b87c6ef2b5269920b17284e2937d56478d7e1bd7 Mon Sep 17 00:00:00 2001
From: Jarek Potiuk
Date: Tue, 17 May 2022 20:48:19 +0200
Subject: [PATCH] Split contributor's quick start into separate guides.

The foldable parts did not work well. They broke links and were not
very discoverable.

Fixes: #23174
---
 CONTRIBUTORS_QUICK_START.rst            | 1591 +++-------------------
 CONTRIBUTORS_QUICK_START_CODESPACES.rst |   45 +
 CONTRIBUTORS_QUICK_START_GITPOD.rst     |   81 ++
 CONTRIBUTORS_QUICK_START_PYCHARM.rst    |  132 ++
 CONTRIBUTORS_QUICK_START_VSCODE.rst     |  125 ++
 5 files changed, 597 insertions(+), 1377 deletions(-)
 create mode 100644 CONTRIBUTORS_QUICK_START_CODESPACES.rst
 create mode 100644 CONTRIBUTORS_QUICK_START_GITPOD.rst
 create mode 100644 CONTRIBUTORS_QUICK_START_PYCHARM.rst
 create mode 100644 CONTRIBUTORS_QUICK_START_VSCODE.rst

diff --git a/CONTRIBUTORS_QUICK_START.rst b/CONTRIBUTORS_QUICK_START.rst
index a4f0fbc75742e..2f9af5bbc7796 100644
--- a/CONTRIBUTORS_QUICK_START.rst
+++ b/CONTRIBUTORS_QUICK_START.rst
@@ -16,7 +16,7 @@
     under the License.

 *************************
-Contributor's Quick Guide
+Contributor's Quick Start
 *************************

 .. contents:: :local:
@@ -24,37 +24,41 @@ Contributor's Quick Start
 Note to Starters
 ################

-There are two ways you can run the Airflow dev env on your machine:
-  1. With a Docker Container
-  2. With a local virtual environment
-Before deciding which method to choose, there are a couple factors to consider:
-Running Airflow in a container is the most reliable way: it provides a more consistent environment
-and allows integration tests with a number of integrations (cassandra, mongo, mysql, etc.).
-However it also requires **4GB RAM, 40GB disk space and at least 2 cores**.
-If you are working on a basic feature, installing Airflow on a local environment might be sufficient.
+Airflow is quite a complex project, and setting up a working environment can be tricky, but we have made
+it rather simple if you follow the guide.
+
+There are three ways you can run the Airflow dev env:

-- |Virtual Env Guide|
+1. With Docker containers and Docker Compose (on your local machine). This environment is managed
+   with the `Breeze `_ tool written in Python, which makes environment management - you guessed
+   it - a breeze.
+2. With a local virtual environment (on your local machine).
+3. With a remote, managed environment (via a remote development environment).
+
+Before deciding which method to choose, there are a couple of factors to consider:

-.. |Virtual Env Guide| raw:: html
+* Running Airflow in a container is the most reliable way: it provides a more consistent environment
+  and allows integration tests with a number of integrations (cassandra, mongo, mysql, etc.).
+  However, it also requires **4GB RAM, 40GB disk space and at least 2 cores**.
+* If you are working on a basic feature, installing Airflow in a local environment might be sufficient.
+  For a comprehensive venv tutorial - visit
+  `Virtual Env guide `_
+* You usually need a paid account to access a managed, remote virtual environment.

-   For a comprehensive venv tutorial - visit Virtual Env Guide
+Local machine development
+#########################

-Prerequisites
-#############
+If you do not work with a remote development environment, you need these prerequisites:

 1. Docker Community Edition
 2. Docker Compose
 3. pyenv (you can also use pyenv-virtualenv or virtualenvwrapper)
-4. jq
-
-
-Installing Prerequisites on Ubuntu
-##################################
+The setup below describes an Ubuntu installation.
It might be slightly different on different machines. Docker Community Edition ------------------------ - 1. Installing required packages for Docker and setting up docker repo .. code-block:: bash @@ -97,9 +101,6 @@ Note : After adding user to docker group Logout and Login again for group member $ docker run hello-world - - - Docker Compose -------------- @@ -123,8 +124,6 @@ Docker Compose $ docker-compose --version - - Pyenv and setting up virtual-env -------------------------------- Note: You might have issues with pyenv if you have a Mac with an M1 chip. Consider using virtualenv as an alternative. @@ -166,50 +165,8 @@ Pyenv and setting up virtual-env $ pyenv activate airflow-env - -Installing jq --------------------------------- - -``jq`` is a lightweight and flexible command-line JSON processor. - -Install ``jq`` with the following command: - -.. code-block:: bash - - $ sudo apt install jq - - - -Setup and develop using PyCharm -############################### - -.. raw:: html - -
-        Setup and develop using PyCharm
-
-
-
-Setup Airflow with Breeze
--------------------------
-
-
-
-.. note::
-
-   Only ``pip`` installation is currently officially supported.
-
-   While they are some successes with using other tools like `poetry `_ or
-   `pip-tools `_, they do not share the same workflow as
-   ``pip`` - especially when it comes to constraint vs. requirements management.
-   Installing via ``Poetry`` or ``pip-tools`` is not currently supported.
-
-   If you wish to install airflow using those tools you should use the constraint files and convert
-   them to appropriate format and workflow that your tool requires.
-
-
 Forking and cloning Project
-~~~~~~~~~~~~~~~~~~~~~~~~~~~
+---------------------------

 1. Goto |airflow_github| and fork the project.

    .. |airflow_github| raw:: html

       https://github.com/apache/airflow/

    .. raw:: html

@@ -224,7 +181,7 @@ Forking and cloning Project
         alt="Forking Apache Airflow project">

-2. Goto your github account's fork of airflow click on ``Code`` and copy the clone link.
+2. Goto your github account's fork of airflow and click on ``Code`` - there you will find the link to your repo.

 .. raw:: html

@@ -233,48 +190,41 @@ Forking and cloning Project
         alt="Cloning github fork of Apache airflow">

+3. Follow `Cloning a repository`_
+   to clone the repo locally (you can also do it in your IDE - see the `Using your IDE `_
+   chapter below).

+Typical development tasks
+#########################

-3. Open your IDE or source code editor and select the option to clone the repository
-
-   .. raw:: html
-
- Cloning github fork to Pycharm -
- - -4. Paste the copied clone link in the URL field and submit. - - .. raw:: html - -
- Cloning github fork to Pycharm -
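For example, the fork-and-clone steps look like this on the command line (a sketch - replace
``<your-github-username>`` with your own account name; the target directory is your choice, the
``~/Projects/airflow`` path used throughout this guide is assumed here):

.. code-block:: bash

    # clone your fork of airflow (HTTPS variant; SSH works too)
    $ git clone https://github.com/<your-github-username>/airflow.git ~/Projects/airflow
    $ cd ~/Projects/airflow
    # keep a reference to the upstream repository for syncing your fork later
    $ git remote add upstream https://github.com/apache/airflow.git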
-

+For many of the development tasks you will need ``Breeze`` to be configured. ``Breeze`` is a development
+environment which uses docker and docker-compose, and its main purpose is to provide a consistent
+and repeatable environment for all the contributors and CI. When using ``Breeze`` you avoid the "works for me"
+syndrome - because not only can others easily reproduce what you do, but the CI of Airflow also uses
+the same environment to run all tests - so you should be able to easily reproduce the same failures you
+see in CI in your local environment.

 Setting up Breeze
-~~~~~~~~~~~~~~~~~
-1. Open terminal and enter into virtual environment ``airflow-env`` and goto project directory
+-----------------

-.. code-block:: bash
+1. Install ``pipx`` - follow the instructions in `Install pipx `_

-  $ pyenv activate airflow-env
-  $ cd ~/Projects/airflow/

-2. Initializing breeze autocomplete
+2. Run ``pipx install -e ./dev/breeze`` in your checked-out repository. Make sure to follow any instructions
+   printed by ``pipx`` during the installation - this is needed to make sure that the ``breeze`` command is
+   available in your PATH.
+
+3. Initialize breeze autocomplete

 .. code-block:: bash

   $ breeze setup-autocomplete

-3. Initialize breeze environment with required python version and backend. This may take a while for first time.
+4. Initialize the breeze environment with the required python version and backend. This may take a while the first time.

 .. code-block:: bash

-  $ breeze --python 3.8 --backend mysql
+  $ breeze --python 3.7 --backend mysql

 .. note::
    If you encounter an error like "docker.credentials.errors.InitializationError:
    docker-credential-secretservice not installed or not available in PATH", you may execute the following command to fix it:

    .. code-block:: bash

       $ sudo apt install golang-docker-credential-helper

    Once the package is installed, execute the breeze command again to resume image building.

@@ -287,8 +237,12 @@ Setting up Breeze

-4. Once the breeze environment is initialized, create airflow tables and users from the breeze CLI. ``airflow db reset``
-   is required to execute at least once for Airflow Breeze to get the database/tables created.
+5. When you enter the Breeze environment you should see a prompt similar to ``root@e4756f6ac886:/opt/airflow#``. This
+   means that you are inside the Breeze container and ready to run most of the development tasks. You can leave
+   the environment with ``exit`` and re-enter it with just the ``breeze`` command.
+   Once you enter the breeze environment, create airflow tables and users from the breeze CLI. ``airflow db reset``
+   is required to execute at least once for Airflow Breeze to get the database/tables created. If you run
+   tests, however, the test database will be initialized automatically for you.

 .. code-block:: bash

@@ -297,42 +251,25 @@ Setting up Breeze
       --email admin@example.com --firstname foo --lastname bar

-5. Closing Breeze environment. After successfully finishing above command will leave you in container,
-   type ``exit`` to exit the container
+6. Exiting the Breeze environment. Successfully finishing the above command will leave you in the container;
+   type ``exit`` to exit the container. The database created before will remain, and the servers will keep
+   running, until you stop the breeze environment completely.

 .. code-block:: bash

     root@b76fcb399bb6:/opt/airflow#
     root@b76fcb399bb6:/opt/airflow# exit

-.. code-block:: bash
-
-  $ breeze stop
-
-Installing airflow in the local virtual environment ``airflow-env`` with breeze.
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-1. It may require some packages to be installed; watch the output of the command to see which ones are missing.
-
-.. code-block:: bash
-
-  $ sudo apt-get install sqlite libsqlite3-dev default-libmysqlclient-dev postgresql
-
-2. Initialize virtual environment with breeze.
+7. You can stop the environment (which means deleting the databases and database servers running in the
+   background) via the ``breeze stop`` command.

 .. code-block:: bash

-  $ ./scripts/tools/initialize_virtualenv.py
-
-3. Add following line to ~/.bashrc in order to call breeze command from anywhere.
-
-.. code-block:: bash
+  $ breeze stop

-  export PATH=${PATH}:"/home/${USER}/Projects/airflow"
-  source ~/.bashrc

 Using Breeze
-~~~~~~~~~~~~
+------------

 1. Starting breeze environment using ``breeze start-airflow`` starts Breeze environment with last configuration run(
    In this case python and backend will be picked up from last execution ``breeze --python 3.8 --backend mysql``)
@@ -409,8 +346,6 @@ Using Breeze

      $ root@0c6e4ff0ab3d:/opt/airflow# airflow webserver

-
-
 2. Now you can access airflow web interface on your local machine at |http://127.0.0.1:28080| with user name ``admin``
    and password ``admin``.

@@ -429,9 +364,6 @@ Using Breeze
    MySQL Workbench with Host ``127.0.0.1``, port ``23306``, user ``root`` and password
    blank(leave empty), default schema ``airflow``.

-   If you cannot connect to MySQL, refer to the Prerequisites section in the
-   |Breeze documentation| and try increasing Docker disk space.
-
    .. raw:: html
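If you prefer the command line over MySQL Workbench, you can check the forwarded database the same way
(a sketch - it assumes the ``mysql`` client is installed on your host and Breeze is already running):

.. code-block:: bash

    # connect to the MySQL database forwarded by Breeze (empty root password)
    $ mysql -h 127.0.0.1 -P 23306 -u root airflow
    # or just verify that the forwarded port answers
    $ nc -zv 127.0.0.1 23306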
@@ -487,19 +419,11 @@ Following are some of important topics of Breeze documentation:
       Additional tools to the Docker Image

-- |Internal details of Breeze|
-
-.. |Internal details of Breeze| raw:: html
-
-
-  Internal details of Breeze
-
-
 - |Breeze Command-Line Interface Reference|

 .. |Breeze Command-Line Interface Reference| raw:: html

-
   Breeze Command-Line Interface Reference

@@ -511,113 +435,154 @@ Following are some of important topics of Breeze documentation:
       Cleaning the environment

-- |Other uses of the Airflow Breeze environment|

-.. |Other uses of the Airflow Breeze environment| raw:: html
+Configuring Pre-commit
+----------------------
+
+Before committing changes to GitHub or raising a pull request, code needs to be checked for certain quality standards
+such as spell check, code syntax, code formatting, compatibility with Apache License requirements, etc. This set of
+tests is applied when you commit your code.
+
+.. raw:: html

-   Other uses of the Airflow Breeze environment
+ CI tests GitHub +
+To avoid burden on CI infrastructure and to save time, Pre-commit hooks can be run locally before committing changes. -Setting up Debug -~~~~~~~~~~~~~~~~ +1. Installing required packages -1. Configuring Airflow database connection +.. code-block:: bash -- Airflow is by default configured to use SQLite database. Configuration can be seen on local machine - ``~/airflow/airflow.cfg`` under ``sql_alchemy_conn``. + $ sudo apt install libxml2-utils -- Installing required dependency for MySQL connection in ``airflow-env`` on local machine. +2. Installing required Python packages - .. code-block:: bash +.. code-block:: bash - $ pyenv activate airflow-env - $ pip install PyMySQL + $ pyenv activate airflow-env + $ pip install pre-commit -- Now set ``sql_alchemy_conn = mysql+pymysql://root:@127.0.0.1:23306/airflow?charset=utf8mb4`` in file - ``~/airflow/airflow.cfg`` on local machine. +3. Go to your project directory -1. Debugging an example DAG +.. code-block:: bash -- Add Interpreter to PyCharm pointing interpreter path to ``~/.pyenv/versions/airflow-env/bin/python``, which is virtual - environment ``airflow-env`` created with pyenv earlier. For adding an Interpreter go to ``File -> Setting -> Project: - airflow -> Python Interpreter``. + $ cd ~/Projects/airflow - .. raw:: html -
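Put together, the one-time setup is just a few commands (a sketch, assuming Ubuntu and the
``airflow-env`` virtual environment used throughout this guide):

.. code-block:: bash

    $ sudo apt install libxml2-utils       # needed by some of the hooks
    $ pyenv activate airflow-env
    $ pip install pre-commit
    $ cd ~/Projects/airflow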
- Adding existing interpreter -
+4. Running pre-commit hooks
+
+.. code-block:: bash
+
+  $ pre-commit run --all-files
+    No-tabs checker......................................................Passed
+    Add license for all SQL files........................................Passed
+    Add license for all other files......................................Passed
+    Add license for all rst files........................................Passed
+    Add license for all JS/CSS/PUML files................................Passed
+    Add license for all JINJA template files.............................Passed
+    Add license for all shell files......................................Passed
+    Add license for all python files.....................................Passed
+    Add license for all XML files........................................Passed
+    Add license for all yaml files.......................................Passed
+    Add license for all md files.........................................Passed
+    Add license for all mermaid files....................................Passed
+    Add TOC for md files.................................................Passed
+    Add TOC for upgrade documentation....................................Passed
+    Check hooks apply to the repository..................................Passed
+    black................................................................Passed
+    Check for merge conflicts............................................Passed
+    Debug Statements (Python)............................................Passed
+    Check builtin type constructor use...................................Passed
+    Detect Private Key...................................................Passed
+    Fix End of Files.....................................................Passed
+    ...........................................................................

-- In PyCharm IDE open airflow project, directory ``/files/dags`` of local machine is by default mounted to docker
-  machine when breeze airflow is started. So any DAG file present in this directory will be picked automatically by
-  scheduler running in docker machine and same can be seen on ``http://127.0.0.1:28080``.
+5. Running pre-commit for selected files

-- Copy any example DAG present in the ``/airflow/example_dags`` directory to ``/files/dags/``.
+.. code-block:: bash

-- Add a ``__main__`` block at the end of your DAG file to make it runnable. It will run a ``back_fill`` job:
+  $ pre-commit run --files airflow/decorators.py tests/utils/test_task_group.py

-  .. code-block:: python

-    if __name__ == "__main__":
-        dag.clear()
-        dag.run()

-- Add ``AIRFLOW__CORE__EXECUTOR=DebugExecutor`` to Environment variable of Run Configuration.
+6. Running specific hook for selected files

-  - Click on Add configuration
+.. code-block:: bash

-  .. raw:: html
+  $ pre-commit run black --files airflow/decorators.py tests/utils/test_task_group.py
+  black...............................................................Passed
+  $ pre-commit run flake8 --files airflow/decorators.py tests/utils/test_task_group.py
+  Run flake8..........................................................Passed

-
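A handy variant, if you only want to check the files your branch touches (a sketch - it assumes your PR
branches off ``main`` and that none of the changed file names contain spaces):

.. code-block:: bash

    # run all hooks only on files changed against main
    $ pre-commit run --files $(git diff --name-only main)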
- Add Configuration pycharm -
- - Add Script Path and Environment Variable to new Python configuration - .. raw:: html +7. Enabling Pre-commit check before push. It will run pre-commit automatically before committing and stops the commit -
- Add environment variable pycharm -
+.. code-block:: bash -- Now Debug an example dag and view the entries in tables such as ``dag_run, xcom`` etc in MySQL Workbench. + $ cd ~/Projects/airflow + $ pre-commit install + $ git commit -m "Added xyz" +8. To disable Pre-commit +.. code-block:: bash -Starting development --------------------- + $ cd ~/Projects/airflow + $ pre-commit uninstall -Creating a branch -~~~~~~~~~~~~~~~~~ +- For more information on visit |STATIC_CODE_CHECKS.rst| -1. Click on the branch symbol in the status bar +.. |STATIC_CODE_CHECKS.rst| raw:: html - .. raw:: html + + STATIC_CODE_CHECKS.rst -
- Creating a new branch -
+- Following are some of the important links of STATIC_CODE_CHECKS.rst -2. Give a name to a branch and checkout + - |Pre-commit Hooks| - .. raw:: html + .. |Pre-commit Hooks| raw:: html -
- Giving a name to a branch -
+ + Pre-commit Hooks + + - |Running Static Code Checks via Breeze| + + .. |Running Static Code Checks via Breeze| raw:: html + + Running Static Code Checks via Breeze + + +Installing airflow in the local venv +------------------------------------ + +1. It may require some packages to be installed; watch the output of the command to see which ones are missing. + +.. code-block:: bash + + $ sudo apt-get install sqlite libsqlite3-dev default-libmysqlclient-dev postgresql + $ ./scripts/tools/initialize_virtualenv.py + + +2. Add following line to ~/.bashrc in order to call breeze command from anywhere. + +.. code-block:: bash + export PATH=${PATH}:"/home/${USER}/Projects/airflow" + source ~/.bashrc +Running tests with Breeze +------------------------- -Testing -~~~~~~~ +You can usually conveniently run tests in your IDE (see IDE below) using virtualenv but with Breeze you +can be sure that all the tests are run in the same environment as tests in CI. All Tests are inside ./tests directory. @@ -627,18 +592,21 @@ All Tests are inside ./tests directory. .. code-block:: bash - root@51d89409f7a2:/opt/airflow# pytest tests/utils/test_trigger_rule.py - ================================================ test session starts ================================================ - platform linux -- Python 3.8.12, pytest-6.2.5, py-1.11.0, pluggy-1.0.0 -- /usr/local/bin/python - cachedir: .pytest_cache - rootdir: /opt/airflow, configfile: pytest.ini - plugins: forked-1.4.0, rerunfailures-9.1.1, requests-mock-1.9.3, asyncio-0.18.1, cov-3.0.0, httpx-0.20.0, xdist-2.5.0, flaky-3.7.0, timeouts-1.2.1, anyio-3.5.0, instafail-0.4.2 - asyncio: mode=strict - setup timeout: 0.0s, execution timeout: 0.0s, teardown timeout: 0.0s - collected 1 item + root@63528318c8b1:/opt/airflow# pytest tests/utils/test_decorators.py + ======================================= test session starts ======================================= + platform linux -- Python 3.8.6, pytest-6.0.1, py-1.9.0, pluggy-0.13.1 -- /usr/local/bin/python + cachedir: .pytest_cache + rootdir: /opt/airflow, configfile: pytest.ini + plugins: celery-4.4.7, requests-mock-1.8.0, xdist-1.34.0, flaky-3.7.0, rerunfailures-9.0, instafail + -0.4.2, forked-1.3.0, timeouts-1.2.1, cov-2.10.0 + setup timeout: 0.0s, execution timeout: 0.0s, teardown timeout: 0.0s + collected 3 items + + tests/utils/test_decorators.py::TestApplyDefault::test_apply PASSED [ 33%] + tests/utils/test_decorators.py::TestApplyDefault::test_default_args PASSED [ 66%] + tests/utils/test_decorators.py::TestApplyDefault::test_incorrect_default_args PASSED [100%] - tests/utils/test_trigger_rule.py::TestTriggerRule::test_valid_trigger_rules PASSED [100%] - =========================================== 1 passed, 1 warning in 0.66s ============================================ + ======================================== 3 passed in 1.49s ======================================== - Running All the test with Breeze by specifying required python version, backend, backend version @@ -646,6 +614,7 @@ All Tests are inside ./tests directory. $ breeze --backend mysql --mysql-version 5.7 --python 3.8 --db-reset --test-type All tests + - Running specific type of test - Types of tests @@ -729,931 +698,44 @@ All Tests are inside ./tests directory. 
Local and Remote Debugging in IDE +Contribution guide +################## -Pre-commit -~~~~~~~~~~ +- To know how to contribute to the project visit |CONTRIBUTING.rst| -Before committing changes to github or raising a pull request, code needs to be checked for certain quality standards -such as spell check, code syntax, code formatting, compatibility with Apache License requirements etc. This set of -tests are applied when you commit your code. +.. |CONTRIBUTING.rst| raw:: html -.. raw:: html + CONTRIBUTING.rst -
- CI tests GitHub -
+- Following are some of important links of CONTRIBUTING.rst + - |Types of contributions| -To avoid burden on CI infrastructure and to save time, Pre-commit hooks can be run locally before committing changes. + .. |Types of contributions| raw:: html -1. Installing required packages + + Types of contributions -.. code-block:: bash - $ sudo apt install libxml2-utils + - |Roles of contributor| -2. Installing required Python packages + .. |Roles of contributor| raw:: html -.. code-block:: bash + Roles of + contributor - $ pyenv activate airflow-env - $ pip install pre-commit -3. Go to your project directory + - |Workflow for a contribution| -.. code-block:: bash + .. |Workflow for a contribution| raw:: html - $ cd ~/Projects/airflow + + Workflow for a contribution -1. Running pre-commit hooks -.. code-block:: bash - - $ pre-commit run --all-files - No-tabs checker......................................................Passed - Add license for all SQL files........................................Passed - Add license for all other files......................................Passed - Add license for all rst files........................................Passed - Add license for all JS/CSS/PUML files................................Passed - Add license for all JINJA template files.............................Passed - Add license for all shell files......................................Passed - Add license for all python files.....................................Passed - Add license for all XML files........................................Passed - Add license for all yaml files.......................................Passed - Add license for all md files.........................................Passed - Add license for all mermaid files....................................Passed - Add TOC for md files.................................................Passed - Add TOC for upgrade documentation....................................Passed - Check hooks apply to the repository..................................Passed - black................................................................Passed - Check for merge conflicts............................................Passed - Debug Statements (Python)............................................Passed - Check builtin type constructor use...................................Passed - Detect Private Key...................................................Passed - Fix End of Files.....................................................Passed - ........................................................................... - -5. Running pre-commit for selected files - -.. code-block:: bash - - $ pre-commit run --files airflow/decorators.py tests/utils/test_task_group.py - - - -6. Running specific hook for selected files - -.. code-block:: bash - - $ pre-commit run black --files airflow/decorators.py tests/utils/test_task_group.py - black...............................................................Passed - $ pre-commit run flake8 --files airflow/decorators.py tests/utils/test_task_group.py - Run flake8..........................................................Passed - - - -7. Enabling Pre-commit check before push. It will run pre-commit automatically before committing and stops the commit - -.. code-block:: bash - - $ cd ~/Projects/airflow - $ pre-commit install - $ git commit -m "Added xyz" - -8. To disable Pre-commit - -.. code-block:: bash - - $ cd ~/Projects/airflow - $ pre-commit uninstall - - -- For more information on visit |STATIC_CODE_CHECKS.rst| - -.. 
|STATIC_CODE_CHECKS.rst| raw:: html - - - STATIC_CODE_CHECKS.rst - -- Following are some of the important links of STATIC_CODE_CHECKS.rst - - - |Pre-commit Hooks| - - .. |Pre-commit Hooks| raw:: html - - - Pre-commit Hooks - - - |Running Static Code Checks via Breeze| - - .. |Running Static Code Checks via Breeze| raw:: html - - Running Static Code Checks via Breeze - - - - - -Contribution guide -~~~~~~~~~~~~~~~~~~ - -- To know how to contribute to the project visit |CONTRIBUTING.rst| - -.. |CONTRIBUTING.rst| raw:: html - - CONTRIBUTING.rst - -- Following are some of important links of CONTRIBUTING.rst - - - |Types of contributions| - - .. |Types of contributions| raw:: html - - - Types of contributions - - - - |Roles of contributor| - - .. |Roles of contributor| raw:: html - - Roles of - contributor - - - - |Workflow for a contribution| - - .. |Workflow for a contribution| raw:: html - - - Workflow for a contribution - - - -Raising Pull Request -~~~~~~~~~~~~~~~~~~~~ - -1. Go to your GitHub account and open your fork project and click on Branches - - .. raw:: html - -
- Goto fork and select branches -
- -2. Click on ``New pull request`` button on branch from which you want to raise a pull request. - - .. raw:: html - -
- Accessing local airflow -
- -3. Add title and description as per Contributing guidelines and click on ``Create pull request``. - - .. raw:: html - -
- Accessing local airflow -
- - -Syncing Fork and rebasing Pull request -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Often it takes several days or weeks to discuss and iterate with the PR until it is ready to merge. -In the meantime new commits are merged, and you might run into conflicts, therefore you should periodically -synchronize main in your fork with the ``apache/airflow`` main and rebase your PR on top of it. Following -describes how to do it. - - -- |Syncing fork| - -.. |Syncing fork| raw:: html - - - Update new changes made to apache:airflow project to your fork - - -- |Rebasing pull request| - -.. |Rebasing pull request| raw:: html - - - Rebasing pull request - -.. raw:: html - -
- - - -Setup and develop using Visual Studio Code -########################################## - -.. raw:: html - -
- Setup and develop using Visual Studio Code - - - -Setup Airflow with Breeze -------------------------- - - - -.. note:: - - Only ``pip`` installation is currently officially supported. - - While they are some successes with using other tools like `poetry `_ or - `pip-tools `_, they do not share the same workflow as - ``pip`` - especially when it comes to constraint vs. requirements management. - Installing via ``Poetry`` or ``pip-tools`` is not currently supported. - - If you wish to install airflow using those tools you should use the constraint files and convert - them to appropriate format and workflow that your tool requires. - - -Forking and cloning Project -~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -1. Goto |airflow_github| and fork the project. - - .. |airflow_github| raw:: html - - https://github.com/apache/airflow/ - - .. raw:: html - -
- Forking Apache Airflow project -
- -2. Goto your github account's fork of airflow click on ``Code`` and copy the clone link. - - .. raw:: html - -
- Cloning github fork of Apache airflow -
- - - -3. Open your IDE or source code editor and select the option to clone the repository - - .. raw:: html - -
- Cloning github fork to Visual Studio Code -
- - -4. Paste the copied clone link in the URL field and submit. - - .. raw:: html - -
- Cloning github fork to Visual Studio Code -
- - -Setting up Breeze -~~~~~~~~~~~~~~~~~ -1. Open terminal and enter into virtual environment ``airflow-env`` and goto project directory - -.. code-block:: bash - - $ pyenv activate airflow-env - $ cd ~/Projects/airflow/ - -2. Initializing breeze autocomplete - -.. code-block:: bash - - $ breeze setup-autocomplete - $ source ~/.bash_completion.d/breeze-complete - -3. Initialize breeze environment with required python version and backend. This may take a while for first time. - -.. code-block:: bash - - $ breeze --python 3.8 --backend mysql - -.. note:: - If you encounter an error like "docker.credentials.errors.InitializationError: - docker-credential-secretservice not installed or not available in PATH", you may execute the following command to fix it: - - .. code-block:: bash - - $ sudo apt install golang-docker-credential-helper - - Once the package is installed, execute the breeze command again to resume image building. - -4. Once the breeze environment is initialized, create airflow tables and users from the breeze CLI. ``airflow db reset`` - is required to execute at least once for Airflow Breeze to get the database/tables created. - -.. code-block:: bash - - root@b76fcb399bb6:/opt/airflow# airflow db reset - root@b76fcb399bb6:/opt/airflow# airflow users create --role Admin --username admin --password admin \ - --email admin@example.com --firstname foo --lastname bar - - -5. Closing Breeze environment. After successfully finishing above command will leave you in container, - type ``exit`` to exit the container - -.. code-block:: bash - - root@b76fcb399bb6:/opt/airflow# - root@b76fcb399bb6:/opt/airflow# exit - -.. code-block:: bash - - $ breeze stop - -Installing airflow in the local virtual environment ``airflow-env`` with breeze. -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -1. It may require some packages to be installed; watch the output of the command to see which ones are missing. - -.. code-block:: bash - - $ sudo apt-get install sqlite libsqlite3-dev default-libmysqlclient-dev postgresql - $ ./scripts/tools/initialize_virtualenv.py - - -2. Add following line to ~/.bashrc in order to call breeze command from anywhere. - -.. code-block:: bash - - export PATH=${PATH}:"/home/${USER}/Projects/airflow" - source ~/.bashrc - -Using Breeze -~~~~~~~~~~~~ - -1. Starting breeze environment using ``breeze start-airflow`` starts Breeze environment with last configuration run( - In this case python and backend will be picked up from last execution ``breeze --python 3.8 --backend mysql``) - It also automatically starts webserver, backend and scheduler. It drops you in tmux with scheduler in bottom left - and webserver in bottom right. Use ``[Ctrl + B] and Arrow keys`` to navigate. - -.. code-block:: bash - - $ breeze start-airflow - - Use CI image. 
- - Branch name: main - Docker image: ghcr.io/apache/airflow/main/ci/python3.8:latest - Airflow source version: 2.4.0.dev0 - Python version: 3.8 - Backend: mysql 5.7 - - - Port forwarding: - - Ports are forwarded to the running docker containers for webserver and database - * 12322 -> forwarded to Airflow ssh server -> airflow:22 - * 28080 -> forwarded to Airflow webserver -> airflow:8080 - * 25555 -> forwarded to Flower dashboard -> airflow:5555 - * 25433 -> forwarded to Postgres database -> postgres:5432 - * 23306 -> forwarded to MySQL database -> mysql:3306 - * 21433 -> forwarded to MSSQL database -> mssql:1443 - * 26379 -> forwarded to Redis broker -> redis:6379 - - Here are links to those services that you can use on host: - * ssh connection for remote debugging: ssh -p 12322 airflow@127.0.0.1 pw: airflow - * Webserver: http://127.0.0.1:28080 - * Flower: http://127.0.0.1:25555 - * Postgres: jdbc:postgresql://127.0.0.1:25433/airflow?user=postgres&password=airflow - * Mysql: jdbc:mysql://127.0.0.1:23306/airflow?user=root - * MSSQL: jdbc:sqlserver://127.0.0.1:21433;databaseName=airflow;user=sa;password=Airflow123 - * Redis: redis://127.0.0.1:26379/0 - - -.. raw:: html - -
- Accessing local airflow -
- - -- Alternatively you can start the same using following commands - - 1. Start Breeze - - .. code-block:: bash - - $ breeze --python 3.8 --backend mysql - - 2. Open tmux - - .. code-block:: bash - - $ root@0c6e4ff0ab3d:/opt/airflow# tmux - - 3. Press Ctrl + B and " - - .. code-block:: bash - - $ root@0c6e4ff0ab3d:/opt/airflow# airflow scheduler - - - 4. Press Ctrl + B and % - - .. code-block:: bash - - $ root@0c6e4ff0ab3d:/opt/airflow# airflow webserver - - - - -2. Now you can access airflow web interface on your local machine at |http://127.0.0.1:28080| with user name ``admin`` - and password ``admin``. - - .. |http://127.0.0.1:28080| raw:: html - - http://127.0.0.1:28080 - - .. raw:: html - -
- Accessing local airflow -
- -3. Setup mysql database in - MySQL Workbench with Host ``127.0.0.1``, port ``23306``, user ``root`` and password - blank(leave empty), default schema ``airflow``. - - .. raw:: html - -
- Connecting to mysql -
- -4. Stopping breeze - -.. code-block:: bash - - root@f3619b74c59a:/opt/airflow# stop_airflow - root@f3619b74c59a:/opt/airflow# exit - $ breeze stop - -5. Knowing more about Breeze - -.. code-block:: bash - - $ breeze --help - - -For more information visit : |Breeze documentation| - -.. |Breeze documentation| raw:: html - - Breeze documentation - -Following are some of important topics of Breeze documentation: - - -- |Choosing different Breeze environment configuration| - -.. |Choosing different Breeze environment configuration| raw:: html - - Choosing different Breeze environment configuration - - -- |Troubleshooting Breeze environment| - -.. |Troubleshooting Breeze environment| raw:: html - - Troubleshooting - Breeze environment - - -- |Installing Additional tools to the Docker Image| - -.. |Installing Additional tools to the Docker Image| raw:: html - - Installing - Additional tools to the Docker Image - - -- |Internal details of Breeze| - -.. |Internal details of Breeze| raw:: html - - - Internal details of Breeze - - -- |Breeze Command-Line Interface Reference| - -.. |Breeze Command-Line Interface Reference| raw:: html - - Breeze Command-Line Interface Reference - - -- |Cleaning the environment| - -.. |Cleaning the environment| raw:: html - - - Cleaning the environment - - -- |Other uses of the Airflow Breeze environment| - -.. |Other uses of the Airflow Breeze environment| raw:: html - - Other uses of the Airflow Breeze environment - - - -Setting up Debug -~~~~~~~~~~~~~~~~ - -1. Configuring Airflow database connection - -- Airflow is by default configured to use SQLite database. Configuration can be seen on local machine - ``~/airflow/airflow.cfg`` under ``sql_alchemy_conn``. - -- Installing required dependency for MySQL connection in ``airflow-env`` on local machine. - - .. code-block:: bash - - $ pyenv activate airflow-env - $ pip install PyMySQL - -- Now set ``sql_alchemy_conn = mysql+pymysql://root:@127.0.0.1:23306/airflow?charset=utf8mb4`` in file - ``~/airflow/airflow.cfg`` on local machine. - -1. Debugging an example DAG - -- In Visual Studio Code open airflow project, directory ``/files/dags`` of local machine is by default mounted to docker - machine when breeze airflow is started. So any DAG file present in this directory will be picked automatically by - scheduler running in docker machine and same can be seen on ``http://127.0.0.1:28080``. - -- Copy any example DAG present in the ``/airflow/example_dags`` directory to ``/files/dags/``. - -- Add a ``__main__`` block at the end of your DAG file to make it runnable. It will run a ``back_fill`` job: - - .. code-block:: python - - - if __name__ == "__main__": - dag.clear() - dag.run() - -- Add ``"AIRFLOW__CORE__EXECUTOR": "DebugExecutor"`` to the ``"env"`` field of Debug configuration. - - - Using the ``Run`` view click on ``Create a launch.json file`` - - .. raw:: html - -
- Add Debug Configuration to Visual Studio Code - Add Debug Configuration to Visual Studio Code - Add Debug Configuration to Visual Studio Code -
- - - Change ``"program"`` to point to an example dag and add ``"env"`` and ``"python"`` fields to the new Python configuration - - .. raw:: html - -
- Add environment variable to Visual Studio Code Debug configuration -
- -- Now Debug an example dag and view the entries in tables such as ``dag_run, xcom`` etc in mysql workbench. - - - -Starting development --------------------- - - -Creating a branch -~~~~~~~~~~~~~~~~~ - -1. Click on the branch symbol in the status bar - - .. raw:: html - -
- Creating a new branch -
- -2. Give a name to a branch and checkout - - .. raw:: html - -
- Giving a name to a branch -
- - - -Testing -~~~~~~~ - -All Tests are inside ./tests directory. - -- Running Unit tests inside Breeze environment. - - Just run ``pytest filepath+filename`` to run the tests. - -.. code-block:: bash - - root@63528318c8b1:/opt/airflow# pytest tests/utils/test_decorators.py - ======================================= test session starts ======================================= - platform linux -- Python 3.8.6, pytest-6.0.1, py-1.9.0, pluggy-0.13.1 -- /usr/local/bin/python - cachedir: .pytest_cache - rootdir: /opt/airflow, configfile: pytest.ini - plugins: celery-4.4.7, requests-mock-1.8.0, xdist-1.34.0, flaky-3.7.0, rerunfailures-9.0, instafail - -0.4.2, forked-1.3.0, timeouts-1.2.1, cov-2.10.0 - setup timeout: 0.0s, execution timeout: 0.0s, teardown timeout: 0.0s - collected 3 items - - tests/utils/test_decorators.py::TestApplyDefault::test_apply PASSED [ 33%] - tests/utils/test_decorators.py::TestApplyDefault::test_default_args PASSED [ 66%] - tests/utils/test_decorators.py::TestApplyDefault::test_incorrect_default_args PASSED [100%] - - ======================================== 3 passed in 1.49s ======================================== - -- Running All the test with Breeze by specifying required python version, backend, backend version - -.. code-block:: bash - - $ ./breeze-legacy --backend mysql --mysql-version 5.7 --python 3.8 --db-reset --test-type All tests - - -- Running specific type of test - - - Types of tests - - - Running specific type of test - - .. code-block:: bash - - $ ./breeze-legacy --backend mysql --mysql-version 5.7 --python 3.8 --db-reset --test-type Core - - -- Running Integration test for specific test type - - - Running an Integration Test - - .. code-block:: bash - - $ ./breeze-legacy --backend mysql --mysql-version 5.7 --python 3.8 --db-reset --test-type All --integration mongo - - -- For more information on Testing visit : |TESTING.rst| - -.. |TESTING.rst| raw:: html - - TESTING.rst - -- Following are the some of important topics of TESTING.rst - - - |Airflow Test Infrastructure| - - .. |Airflow Test Infrastructure| raw:: html - - - Airflow Test Infrastructure - - - - |Airflow Unit Tests| - - .. |Airflow Unit Tests| raw:: html - - Airflow Unit - Tests - - - - |Helm Unit Tests| - - .. |Helm Unit Tests| raw:: html - - Helm Unit Tests - - - - - |Airflow Integration Tests| - - .. |Airflow Integration Tests| raw:: html - - - Airflow Integration Tests - - - - |Running Tests with Kubernetes| - - .. |Running Tests with Kubernetes| raw:: html - - - Running Tests with Kubernetes - - - - |Airflow System Tests| - - .. |Airflow System Tests| raw:: html - - Airflow - System Tests - - - - |Local and Remote Debugging in IDE| - - .. |Local and Remote Debugging in IDE| raw:: html - - Local and Remote Debugging in IDE - - -Pre-commit -~~~~~~~~~~ - -Before committing changes to github or raising a pull request, code needs to be checked for certain quality standards -such as spell check, code syntax, code formatting, compatibility with Apache License requirements etc. This set of -tests are applied when you commit your code. - -.. raw:: html - -
- CI tests GitHub -
- - -To avoid burden on CI infrastructure and to save time, Pre-commit hooks can be run locally before committing changes. - -1. Installing required packages - -.. code-block:: bash - - $ sudo apt install libxml2-utils - -2. Installing required Python packages - -.. code-block:: bash - - $ pyenv activate airflow-env - $ pip install pre-commit - -3. Go to your project directory - -.. code-block:: bash - - $ cd ~/Projects/airflow - - -1. Running pre-commit hooks - -.. code-block:: bash - - $ pre-commit run --all-files - No-tabs checker......................................................Passed - Add license for all SQL files........................................Passed - Add license for all other files......................................Passed - Add license for all rst files........................................Passed - Add license for all JS/CSS/PUML files................................Passed - Add license for all JINJA template files.............................Passed - Add license for all shell files......................................Passed - Add license for all python files.....................................Passed - Add license for all XML files........................................Passed - Add license for all yaml files.......................................Passed - Add license for all md files.........................................Passed - Add license for all mermaid files....................................Passed - Add TOC for md files.................................................Passed - Add TOC for upgrade documentation....................................Passed - Check hooks apply to the repository..................................Passed - black................................................................Passed - Check for merge conflicts............................................Passed - Debug Statements (Python)............................................Passed - Check builtin type constructor use...................................Passed - Detect Private Key...................................................Passed - Fix End of Files.....................................................Passed - ........................................................................... - -5. Running pre-commit for selected files - -.. code-block:: bash - - $ pre-commit run --files airflow/decorators.py tests/utils/test_task_group.py - - -6. Running specific hook for selected files - -.. code-block:: bash - - $ pre-commit run black --files airflow/decorators.py tests/utils/test_task_group.py - black...............................................................Passed - $ pre-commit run flake8 --files airflow/decorators.py tests/utils/test_task_group.py - Run flake8..........................................................Passed - - -7. Enabling Pre-commit check before push. It will run pre-commit automatically before committing and stops the commit - -.. code-block:: bash - - $ cd ~/Projects/airflow - $ pre-commit install - $ git commit -m "Added xyz" - -8. To disable Pre-commit - -.. code-block:: bash - - $ cd ~/Projects/airflow - $ pre-commit uninstall - - -- For more information on visit |STATIC_CODE_CHECKS.rst| - -.. |STATIC_CODE_CHECKS.rst| raw:: html - - - STATIC_CODE_CHECKS.rst - -- Following are some of the important links of STATIC_CODE_CHECKS.rst - - - |Pre-commit Hooks| - - .. |Pre-commit Hooks| raw:: html - - - Pre-commit Hooks - - - |Running Static Code Checks via Breeze| - - .. 
|Running Static Code Checks via Breeze| raw:: html - - Running Static Code Checks via Breeze - -Contribution guide -~~~~~~~~~~~~~~~~~~ - -- To know how to contribute to the project visit |CONTRIBUTING.rst| - -.. |CONTRIBUTING.rst| raw:: html - - CONTRIBUTING.rst - -- Following are some of important links of CONTRIBUTING.rst - - - |Types of contributions| - - .. |Types of contributions| raw:: html - - - Types of contributions - - - - |Roles of contributor| - - .. |Roles of contributor| raw:: html - - Roles of - contributor - - - - |Workflow for a contribution| - - .. |Workflow for a contribution| raw:: html - - - Workflow for a contribution - - - -Raising Pull Request -~~~~~~~~~~~~~~~~~~~~ +Raising Pull Request +-------------------- 1. Go to your GitHub account and open your fork project and click on Branches @@ -1684,7 +766,7 @@ Raising Pull Request Syncing Fork and rebasing Pull request -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +-------------------------------------- Often it takes several days or weeks to discuss and iterate with the PR until it is ready to merge. In the meantime new commits are merged, and you might run into conflicts, therefore you should periodically @@ -1707,267 +789,22 @@ describes how to do it. Rebasing pull request -.. raw:: html - -
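The typical command sequence for syncing and rebasing is short (a sketch - it assumes an ``upstream``
remote pointing at ``apache/airflow`` and that your PR branch is currently checked out):

.. code-block:: bash

    $ git fetch upstream
    $ git rebase upstream/main
    # after resolving any conflicts, update the branch backing your PR
    $ git push --force-with-lease origin <your-branch-name>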
- - -Setup and develop using Gitpod online workspaces -################################################ - -.. raw:: html - -
- Setup and develop using Gitpod online workspaces - - - -Setup Airflow with Breeze -------------------------- - - -Forking and cloning Project -~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -1. Goto |airflow_github| and fork the project. - - .. |airflow_github| raw:: html - - https://github.com/apache/airflow/ - - .. raw:: html - -
- Forking Apache Airflow project -
- -2. Goto your github account's fork of airflow click on ``Code`` and copy the clone link. - - .. raw:: html - -
- Cloning github fork of Apache airflow -
- -3. Add goto https://gitpod.io/# as shown. - - .. raw:: html - -
- Open personal airflow clone with Gitpod -
- -Setting up Breeze -~~~~~~~~~~~~~~~~~ - -1. Breeze is already initialized in one of the terminals in Gitpod - -2. Once the breeze environment is initialized, create airflow tables and users from the breeze CLI. ``airflow db reset`` - is required to execute at least once for Airflow Breeze to get the database/tables created. - -.. note:: - - This step is needed when you would like to run/use webserver. - -.. code-block:: bash - - root@b76fcb399bb6:/opt/airflow# airflow db reset - root@b76fcb399bb6:/opt/airflow# airflow users create --role Admin --username admin --password admin \ - --email admin@example.com --firstname foo --lastname bar - - -3. Closing Breeze environment. After successfully finishing above command will leave you in container, - type ``exit`` to exit the container - -.. code-block:: bash - - root@b76fcb399bb6:/opt/airflow# - root@b76fcb399bb6:/opt/airflow# exit - -.. code-block:: bash - - $ breeze stop - - -Installing Airflow with Breeze. -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Gitpod default image have all the required packages installed. - -1. Add following line to ~/.bashrc in order to call breeze command from anywhere. - -.. code-block:: bash - - export PATH=${PATH}:"/workspace/airflow" - source ~/.bashrc - - -Using Breeze -~~~~~~~~~~~~ - -1. Starting breeze environment using ``breeze start-airflow`` starts Breeze environment with last configuration run. - It also automatically starts webserver, backend and scheduler. It drops you in tmux with scheduler in bottom left - and webserver in bottom right. Use ``[Ctrl + B] and Arrow keys`` to navigate. - -.. code-block:: bash - - $ breeze start-airflow - - Use CI image. - - Branch name: main - Docker image: ghcr.io/apache/airflow/main/ci/python3.8:latest - Airflow source version: 2.4.0.dev0 - Python version: 3.8 - Backend: mysql 5.7 - - - Port forwarding: - - Ports are forwarded to the running docker containers for webserver and database - * 12322 -> forwarded to Airflow ssh server -> airflow:22 - * 28080 -> forwarded to Airflow webserver -> airflow:8080 - * 25555 -> forwarded to Flower dashboard -> airflow:5555 - * 25433 -> forwarded to Postgres database -> postgres:5432 - * 23306 -> forwarded to MySQL database -> mysql:3306 - * 21433 -> forwarded to MSSQL database -> mssql:1443 - * 26379 -> forwarded to Redis broker -> redis:6379 - - Here are links to those services that you can use on host: - * ssh connection for remote debugging: ssh -p 12322 airflow@127.0.0.1 pw: airflow - * Webserver: http://127.0.0.1:28080 - * Flower: http://127.0.0.1:25555 - * Postgres: jdbc:postgresql://127.0.0.1:25433/airflow?user=postgres&password=airflow - * Mysql: jdbc:mysql://127.0.0.1:23306/airflow?user=root - * MSSQL: jdbc:sqlserver://127.0.0.1:21433;databaseName=airflow;user=sa;password=Airflow123 - * Redis: redis://127.0.0.1:26379/0 - - -.. raw:: html - -
- Accessing local airflow -
- -2. You can access the ports as shown - -.. raw:: html - -
- Accessing ports via VSCode UI -
- - - -Starting development --------------------- - - -Creating a branch -~~~~~~~~~~~~~~~~~ - -1. Click on the branch symbol in the status bar - - .. raw:: html - -
- Creating a new branch -
- -2. Give a name to a branch and checkout - - .. raw:: html - -
- Giving a name to a branch -
- - - -Testing -~~~~~~~ - -All Tests are inside ``./tests`` directory. - -- Running Unit tests inside Breeze environment. - - Just run ``pytest filepath+filename`` to run the tests. - -.. code-block:: bash +Using your IDE +############## - root@4a2143c17426:/opt/airflow# pytest tests/utils/test_session.py - ======================================= test session starts ======================================= - platform linux -- Python 3.7.12, pytest-6.2.5, py-1.11.0, pluggy-1.0.0 -- /usr/local/bin/python - cachedir: .pytest_cache - rootdir: /opt/airflow, configfile: pytest.ini - plugins: anyio-3.3.4, flaky-3.7.0, asyncio-0.16.0, cov-3.0.0, forked-1.3.0, httpx-0.15.0, instafail-0.4.2, rerunfailures-9.1.1, timeouts-1.2.1, xdist-2.4.0, requests-mock-1.9.3 - setup timeout: 0.0s, execution timeout: 0.0s, teardown timeout: 0.0s - collected 4 items - - tests/utils/test_session.py::TestSession::test_raised_provide_session PASSED [ 25%] - tests/utils/test_session.py::TestSession::test_provide_session_without_args_and_kwargs PASSED [ 50%] - tests/utils/test_session.py::TestSession::test_provide_session_with_args PASSED [ 75%] - tests/utils/test_session.py::TestSession::test_provide_session_with_kwargs PASSED [100%] - - ====================================== 4 passed, 11 warnings in 33.14s ====================================== - -- Running All the tests with Breeze by specifying required Python version, backend, backend version - -.. code-block:: bash - - $ ./breeze-legacy --backend mysql --mysql-version 5.7 --python 3.8 --db-reset --test-type All tests - - -- Running specific test in container using shell scripts. Testing in container scripts are located in - ``./scripts/in_container`` directory. - -.. code-block:: bash - - root@4a2143c17426:/opt/airflow# ls ./scripts/in_container/ - _in_container_script_init.sh quarantine_issue_header.md run_mypy.sh - _in_container_utils.sh run_anything.sh run_prepare_airflow_packages.sh - airflow_ci.cfg run_ci_tests.sh run_prepare_provider_documentation.sh - bin run_docs_build.sh run_prepare_provider_packages.sh - check_environment.sh run_extract_tests.sh run_resource_check.sh - check_junitxml_result.py run_fix_ownership.sh run_system_tests.sh - configure_environment.sh run_flake8.sh run_tmux_welcome.sh - entrypoint_ci.sh run_generate_constraints.sh stop_tmux_airflow.sh - entrypoint_exec.sh run_init_script.sh update_quarantined_test_status.py - prod run_install_and_test_provider_packages.sh - - root@df8927308887:/opt/airflow# ./scripts/in_container/run_docs_build.sh +If you are familiar with Python development and use your favourite editors, Airflow can be setup +similarly to other projects of yours. However, if you need specific instructions for your IDE you +will find more detailed instructions here: -- Running specific type of test - - - Types of tests - - - Running specific type of test - - .. note:: - - Before starting a new instance, let's clear the volume and databases "fresh like a daisy". You - can do this by: - - .. code-block::bash +* `Pycharm/IntelliJ `_ +* `Visual Studio Code `_ - $ breeze stop - .. code-block:: bash - - $ ./breeze-legacy --backend mysql --mysql-version 5.7 --python 3.8 --db-reset --test-type Core - - -- Running Integration test for specific test type - - - Running an Integration Test +Using Remote development environments +##################################### - .. code-block:: bash +In order to use remote development environment, you usually need a paid account, but you do not have to +setup local machine for development. 
- $ ./breeze-legacy --backend mysql --mysql-version 5.7 --python 3.8 --db-reset --test-type All --integration mongo +* `GitPod `_ +* `GitHub Codespaces `_ diff --git a/CONTRIBUTORS_QUICK_START_CODESPACES.rst b/CONTRIBUTORS_QUICK_START_CODESPACES.rst new file mode 100644 index 0000000000000..70e0a8b3f47cd --- /dev/null +++ b/CONTRIBUTORS_QUICK_START_CODESPACES.rst @@ -0,0 +1,45 @@ + .. Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + .. http://www.apache.org/licenses/LICENSE-2.0 + + .. Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. + + +Setup and develop using GitHub Codespaces +######################################### + +1. Goto |airflow_github| and fork the project. + + .. |airflow_github| raw:: html + + https://github.com/apache/airflow/ + + .. raw:: html + +
+ Forking Apache Airflow project +
+

+2. Follow `Codespaces Quickstart `_ to start
+   a new codespace.
+
+3. Once the codespace starts, your terminal should already be in the ``Breeze`` environment and you should
+   be able to edit and run the tests in the VS Code interface.
+
+4. You can use the `Quick start guide for Visual Studio Code `_ for details,
+   as Codespaces uses Visual Studio Code as its interface.
+
+
+Follow the `Quick start `_ for typical development tasks.
diff --git a/CONTRIBUTORS_QUICK_START_GITPOD.rst b/CONTRIBUTORS_QUICK_START_GITPOD.rst
new file mode 100644
index 0000000000000..3615d300978b9
--- /dev/null
+++ b/CONTRIBUTORS_QUICK_START_GITPOD.rst
@@ -0,0 +1,81 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ .. http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+.. contents:: :local:
+
+Connect your project to Gitpod
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+1. Goto |airflow_github| and fork the project.
+
+   .. |airflow_github| raw:: html
+
+      https://github.com/apache/airflow/
+
+   .. raw:: html
+
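If you prefer the terminal, the fork-and-clone steps can also be done with the GitHub CLI
(a sketch - it assumes ``gh`` is installed and authenticated with your account):

.. code-block:: bash

    # fork apache/airflow under your account and clone it in one go
    $ gh repo fork apache/airflow --clone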
+ Forking Apache Airflow project +
+ +2. Goto your github account's fork of airflow click on ``Code`` and copy the clone link. + + .. raw:: html + +
+ Cloning github fork of Apache airflow +
+

+3. Goto ``https://gitpod.io/#`` followed by the copied clone link, as shown.
+
+   .. raw:: html
+
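The resulting address is simply the Gitpod prefix plus your fork's URL (a sketch - replace
``<your-github-username>`` with your own account name):

.. code-block:: text

    https://gitpod.io/#https://github.com/<your-github-username>/airflow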
+ Open personal airflow clone with Gitpod +
+Set up Breeze in Gitpod
+~~~~~~~~~~~~~~~~~~~~~~~
+
+The Gitpod default image has all the required packages installed.
+
+1. Run ``pipx install -e ./dev/breeze`` to install Breeze.
+
+2. Run ``breeze`` to enter breeze in Gitpod.
+
+Setting up database in Breeze
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Once the breeze environment is initialized, create airflow tables and users from the breeze CLI.
+The ``airflow db reset`` command needs to be executed at least once for Airflow Breeze to
+get the database/tables created. When you run tests, however, the test database will be initialized
+automatically for you the first time you run them.
+
+.. note::
+
+   This step is needed when you would like to run/use the webserver.
+
+.. code-block:: bash
+
+   root@b76fcb399bb6:/opt/airflow# airflow db reset
+   root@b76fcb399bb6:/opt/airflow# airflow users create --role Admin --username admin --password admin \
+    --email admin@example.com --firstname foo --lastname bar
+
+Follow the `Quick start `_ for typical development tasks.
diff --git a/CONTRIBUTORS_QUICK_START_PYCHARM.rst b/CONTRIBUTORS_QUICK_START_PYCHARM.rst
new file mode 100644
index 0000000000000..88c04c5545036
--- /dev/null
+++ b/CONTRIBUTORS_QUICK_START_PYCHARM.rst
@@ -0,0 +1,132 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ .. http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+.. contents:: :local:
+
+Setup your project
+##################
+
+1. Open your IDE or source code editor and select the option to clone the repository
+
+   .. raw:: html
+
+ Cloning github fork to Pycharm +
+ + +2. Paste the repository link in the URL field and submit. + + .. raw:: html + +
+ Cloning github fork to Pycharm +
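+
+If you prefer the command line, you can clone your fork directly instead and then open the resulting
+directory in PyCharm (``<your-username>`` is a placeholder for your GitHub user name):
+
+.. code-block:: bash
+
+   git clone https://github.com/<your-username>/airflow.git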
+
+Setting up debugging
+####################
+
+It requires the ``airflow-env`` virtual environment to be configured locally.
+
+1. Configuring Airflow database connection
+
+- Airflow is configured to use an SQLite database by default. The configuration can be seen on your local
+  machine in ``~/airflow/airflow.cfg`` under ``sql_alchemy_conn``.
+
+- Install the required dependency for a MySQL connection in ``airflow-env`` on your local machine:
+
+  .. code-block:: bash
+
+   $ pyenv activate airflow-env
+   $ pip install PyMySQL
+
+- Now set ``sql_alchemy_conn = mysql+pymysql://root:@127.0.0.1:23306/airflow?charset=utf8mb4`` in the file
+  ``~/airflow/airflow.cfg`` on your local machine.
+
+2. Debugging an example DAG
+
+- Add an interpreter to PyCharm, pointing the interpreter path to ``~/.pyenv/versions/airflow-env/bin/python``, which is
+  the ``airflow-env`` virtual environment created with pyenv earlier. To add an interpreter, go to ``File -> Settings ->
+  Project: airflow -> Python Interpreter``.
+
+  .. raw:: html
+
+     
+ Adding existing interpreter +
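+
+  You can double-check that this interpreter path exists before adding it (an illustrative check,
+  assuming the pyenv setup described earlier):
+
+  .. code-block:: bash
+
+     ls ~/.pyenv/versions/airflow-env/bin/python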
+
+- Open the airflow project in the PyCharm IDE. The ``/files/dags`` directory on your local machine is mounted into the
+  Docker container by default when Breeze Airflow is started, so any DAG file present in this directory is picked up
+  automatically by the scheduler running in the Docker container and can be seen at ``http://127.0.0.1:28080``.
+
+- Copy any example DAG from the ``/airflow/example_dags`` directory to ``/files/dags/``.
+
+- Add a ``__main__`` block at the end of your DAG file to make it runnable. It will run a backfill job:
+
+  .. code-block:: python
+
+    if __name__ == "__main__":
+        dag.clear()
+        dag.run()
+
+- Add ``AIRFLOW__CORE__EXECUTOR=DebugExecutor`` to the environment variables of the Run Configuration.
+
+  - Click on Add configuration
+
+    .. raw:: html
+
+       
+ Add Configuration pycharm +
+
+  - Add the script path and environment variables to the new Python configuration
+
+    .. raw:: html
+
+       
+ Add environment variable pycharm +
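+
+  The same run can be reproduced from a terminal, which is a handy sanity check before debugging in the
+  IDE (an illustrative command; the DAG file name and interpreter path are only examples):
+
+  .. code-block:: bash
+
+     AIRFLOW__CORE__EXECUTOR=DebugExecutor \
+         ~/.pyenv/versions/airflow-env/bin/python /files/dags/example_bash_operator.py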
+
+- Now debug an example DAG and view the entries in tables such as ``dag_run`` and ``xcom`` in MySQL Workbench.
+
+Creating a branch
+#################
+
+1. Click on the branch symbol in the status bar.
+
+   .. raw:: html
+
+      
+ Creating a new branch +
+
+2. Give the branch a name and check it out.
+
+   .. raw:: html
+
+      
+ Giving a name to a branch +
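+
+If you prefer a terminal, the equivalent is (the branch name below is just an example):
+
+.. code-block:: bash
+
+   git checkout -b my-feature-branch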
+
+Follow the `Quick start `_ for typical development tasks.
diff --git a/CONTRIBUTORS_QUICK_START_VSCODE.rst b/CONTRIBUTORS_QUICK_START_VSCODE.rst
new file mode 100644
index 0000000000000..c1baf3191017c
--- /dev/null
+++ b/CONTRIBUTORS_QUICK_START_VSCODE.rst
@@ -0,0 +1,125 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+.. contents:: :local:
+
+Set up your project
+###################
+
+1. Open your IDE or source code editor and select the option to clone the repository.
+
+   .. raw:: html
+
+      
+ Cloning github fork to Visual Studio Code +
+ + +2. Paste the copied clone link in the URL field and submit. + + .. raw:: html + +
+ Cloning github fork to Visual Studio Code +
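+
+Alternatively, you can clone your fork from a terminal and open it with the ``code`` command that ships
+with Visual Studio Code (``<your-username>`` is a placeholder for your GitHub user name):
+
+.. code-block:: bash
+
+   git clone https://github.com/<your-username>/airflow.git
+   code airflow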
+
+
+Setting up debugging
+####################
+
+1. Configuring Airflow database connection
+
+- Airflow is configured to use an SQLite database by default. The configuration can be seen on your local
+  machine in ``~/airflow/airflow.cfg`` under ``sql_alchemy_conn``.
+
+- Install the required dependency for a MySQL connection in ``airflow-env`` on your local machine:
+
+  .. code-block:: bash
+
+   $ pyenv activate airflow-env
+   $ pip install PyMySQL
+
+- Now set ``sql_alchemy_conn = mysql+pymysql://root:@127.0.0.1:23306/airflow?charset=utf8mb4`` in the file
+  ``~/airflow/airflow.cfg`` on your local machine.
+
+2. Debugging an example DAG
+
+- Open the airflow project in Visual Studio Code. The ``/files/dags`` directory on your local machine is mounted into
+  the Docker container by default when Breeze Airflow is started, so any DAG file present in this directory is picked up
+  automatically by the scheduler running in the Docker container and can be seen at ``http://127.0.0.1:28080``.
+
+- Copy any example DAG from the ``/airflow/example_dags`` directory to ``/files/dags/``.
+
+- Add a ``__main__`` block at the end of your DAG file to make it runnable. It will run a backfill job:
+
+  .. code-block:: python
+
+    if __name__ == "__main__":
+        dag.clear()
+        dag.run()
+
+- Add ``"AIRFLOW__CORE__EXECUTOR": "DebugExecutor"`` to the ``"env"`` field of the Debug configuration.
+
+  - In the ``Run`` view, click on ``Create a launch.json file``.
+
+    .. raw:: html
+
+       
+ Add Debug Configuration to Visual Studio Code + Add Debug Configuration to Visual Studio Code + Add Debug Configuration to Visual Studio Code +
+
+  - Change ``"program"`` to point to an example DAG and add the ``"env"`` and ``"python"`` fields to the new Python configuration
+
+    .. raw:: html
+
+       
+ Add environment variable to Visual Studio Code Debug configuration +
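+
+  The same run can be reproduced from a terminal, which is a handy sanity check before debugging in the
+  IDE (an illustrative command; the DAG file name is only an example):
+
+  .. code-block:: bash
+
+     AIRFLOW__CORE__EXECUTOR=DebugExecutor python /files/dags/example_bash_operator.py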
+
+- Now debug an example DAG and view the entries in tables such as ``dag_run`` and ``xcom`` in MySQL Workbench.
+
+Creating a branch
+#################
+
+1. Click on the branch symbol in the status bar.
+
+   .. raw:: html
+
+      
+ Creating a new branch +
+
+2. Give the branch a name and check it out.
+
+   .. raw:: html
+
+      
+ Giving a name to a branch +
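+
+If you prefer a terminal, the equivalent is (the branch name below is just an example):
+
+.. code-block:: bash
+
+   git checkout -b my-feature-branch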
+ +Follow the `Quick start `_ for typical development tasks.