Created by Zdeněk Šimůnek
approximately 5 years ago
| Question | Answer |
| DAG | Directed Acyclic Graph - Collection of tasks, their dependencies and settings. - Defined in .py script as code. |
| XCom | Feature for cross communication between tasks. |
| dags_folder | - The folder where Airflow pipelines live. - This path must be absolute. - Airflow looks in your DAGS_FOLDER for modules that contain DAG objects in their global namespace and adds the objects it finds to the DagBag. |
| DAG Run | - An instance of a DAG, containing task instances that run for a specific execution_date. - Created by the Airflow scheduler or an external trigger. |
| Task | - A Task defines a unit of work within a DAG; it is represented as a node in the DAG graph, and it is written in Python. - Each task is an implementation of an Operator. |
| Operator | An operator describes a single task in a workflow. |
| Sensor | An Operator that waits (polls) for a certain time, file, database row, S3 key, etc. |
| chain(op1, [op2, op3], [op4, op5], op6) | op1 >> [op2, op3]; op2 >> op4; op3 >> op5; [op4, op5] >> op6 |
| Task Instance | An instance of a task that has been assigned to a DAG and has a state associated with a specific DAG run (i.e., for a specific execution_date). |
| execution_date | The logical date and time for a DAG Run and its Task Instances. |
| Jinja | Jinja is a modern and designer-friendly templating language for Python, modelled after Django’s templates. |
| Hooks | - Hooks are interfaces to external platforms and databases like Hive, S3, MySQL, Postgres, HDFS, and Pig. - Hooks implement a common interface when possible, and act as a building block for operators. |
| Pools | Airflow pools can be used to limit the execution parallelism on arbitrary sets of tasks. |
| Connections | The information needed to connect to external systems is stored in the Airflow metastore database. A conn_id is defined there, with hostname / login / password / schema information attached to it. Airflow pipelines retrieve centrally managed connection information by specifying the relevant conn_id. |
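
The chain(...) card above can be checked without an Airflow installation. The following is a minimal sketch in plain Python, not Airflow's own implementation: the helper name `chain_edges` is hypothetical, tasks are represented as plain strings, and the pairing rules (a single task fans out or in, two adjacent lists are paired element by element) are taken from the expansion shown on the card.

```python
from typing import List, Sequence, Tuple, Union

# A step is either one task name or a list of task names (illustrative types).
Step = Union[str, Sequence[str]]


def chain_edges(*steps: Step) -> List[Tuple[str, str]]:
    """Return the (upstream, downstream) edges implied by chaining steps in order.

    Hypothetical helper that mimics the dependency pattern on the card.
    """
    edges: List[Tuple[str, str]] = []
    for up, down in zip(steps, steps[1:]):
        up_is_list = isinstance(up, (list, tuple))
        down_is_list = isinstance(down, (list, tuple))
        if up_is_list and down_is_list:
            # Two adjacent lists are paired element by element.
            if len(up) != len(down):
                raise ValueError("adjacent lists must have the same length")
            edges.extend(zip(up, down))
        else:
            # A single task fans out to, or in from, every task in the list.
            ups = list(up) if up_is_list else [up]
            downs = list(down) if down_is_list else [down]
            edges.extend((u, d) for u in ups for d in downs)
    return edges


if __name__ == "__main__":
    for upstream, downstream in chain_edges(
        "op1", ["op2", "op3"], ["op4", "op5"], "op6"
    ):
        print(f"{upstream} >> {downstream}")
```

Run as a script, this prints one `upstream >> downstream` line per edge, which is the card's answer written out edge by edge.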