rAPId Integration
dorc is built to include an easy integration with rAPId and can be optionally enabled. To enable rAPId for your pipelines you want to apply both the rAPId_config attribute in the configuration model and also the rapid_layer_config attribute in the universal configuration model.
rAPId Config
Alongside the layer configuration you set a further rAPIdConfig attribute to the Configuration block for environment specific details on your rAPId instance for which to use.
from utils.config import rAPIdConfig
rAPIdConfig(
url: str
data_bucket_name: str
user_pool_id: str
dorc_rapid_client_id: str
)
url- Your rAPId url e.g.https://instance.rapid.com/apidata_bucket_name- The name of the S3 bucket that's used as the rAPId data bucket.
Note: The data bucket will need to have Amazon EventBridge notifications enabled
user_pool_id- Your rAPId Cognito user pool id.dorc_rapid_client_id- The rAPId client id of the dorc client. dorc requires a rAPId client with theUSER_ADMINpermission
rAPId Layer Configuration
The layer configuration is a block applied to the universal configuration that allows dorc to understand how the folder structure of your pipelines correspond to your layers within rAPId. For instance consider the folder structure of your dorc project below:
/pipeline-config
/src
/raw-to-processed
/example
/processed-to-curated
/example
Dockerfile
This would typically link up to a rAPId account that would contain the three layers raw, processed, curated. You then tell dorc how the folder pipelines within your folder structure links to the layers by providing the rapid_layer_config attribute in UniversalConfig.
from utils.config import UniversalConfig, LayerConfig
UniversalConfig(
...
...
...
rapid_layer_config = [
LayerConfig(folder='raw-to-processed', source='raw', target='processed'),
LayerConfig(folder='processed-to-curated', source='processed', target='curated')
]
)
Pipeline rAPId Client Credentials
There are two ways to use a rAPId client within a pipeline:
- Pass a already existing rAPId client key into the pipeline definition rAPId trigger block. If you have a more general rAPId client that you are happy with all pipelines using you can create this manually within rAPId and pass this to all pipelines.
- Automatic creation of a rAPId client by dorc. If no client key is passed dorc will automatically create a new rAPId client for the pipeline.
rAPId Permissions
dorc will automatically create a client for the pipeline because no rapid_client_key has been passed. When this is the case dorc will create the client with READ and WRITE permissions for the domain specified as well as the rAPId PRIVATE permission for both the source and target layers read using the rapid_layer_config.
Usage
To use rAPId within your pipeline function dorc automatically sets a client key and client secret into the function environment variables. These can read from the values:
RAPID_CLIENT_ID
RAPID_CLIENT_SECRET
RAPID_URL