Deliver records to R2
Pipelines can convert a stream of records into compressed output files, and deliver the files to an R2 bucket in your account.
If this is your first time using Pipelines, follow the instructions in the get started guide to ingest records via HTTP and deliver them into an R2 bucket.
To create or update a Pipeline using Wrangler, run the following command in a terminal:
npx wrangler pipelines create [PIPELINE-NAME] --r2-bucket [R2-BUCKET-NAME]
After running this command, you'll be prompted to authorize Cloudflare Workers Pipelines to create R2 API tokens on your behalf. Your Pipeline requires these tokens and uses them when loading data into your bucket. You can approve the request through the browser link, which opens automatically.
If you prefer not to authenticate this way, you may pass your R2 API Tokens to Wrangler:
npx wrangler pipelines create [PIPELINE-NAME] --r2-bucket [R2-BUCKET-NAME] --r2-access-key-id [ACCESS-KEY-ID] --r2-secret-access-key [SECRET-ACCESS-KEY]
Partitioning organizes data into directories based on specific fields. This improves query performance by reducing the amount of data a query has to scan, which enables faster reads.
Output files are prefixed with event date and hour. For example, the output from a Pipeline in your R2 bucket might look like this:
- event_date=2024-09-06/hr=15/37db9289-15ba-4e8b-9231-538dc7c72c1e-15.json.gz
You can specify an optional prefix for all the output files stored in your specified R2 bucket. The data will remain partitioned by date.
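To illustrate how this layout narrows a read to a single hour, here is a minimal Python sketch that builds the key prefix for one partition. The function name `partition_prefix` and its parameters are hypothetical and not part of Pipelines or Wrangler. It also assumes that the hour is written as two digits and that the caller supplies a timestamp in whatever time zone the Pipeline uses, neither of which is stated above.

```python
from datetime import datetime


def partition_prefix(ts: datetime, r2_prefix: str = "") -> str:
    """Build the object key prefix for the hour partition containing `ts`.

    Hypothetical helper: the layout is inferred from the example key above.
    """
    # Assumption: the hour is zero-padded to two digits (only hr=15 is shown).
    partition = f"event_date={ts:%Y-%m-%d}/hr={ts:%H}"
    # An optional pipeline prefix, if set, comes before the date partition.
    base = r2_prefix.strip("/")
    return f"{base}/{partition}/" if base else f"{partition}/"


print(partition_prefix(datetime(2024, 9, 6, 15)))
# event_date=2024-09-06/hr=15/
print(partition_prefix(datetime(2024, 9, 6, 15), "test"))
# test/event_date=2024-09-06/hr=15/
```

A prefix like this could be passed to an S3-compatible listing call against R2 so that only the objects for one hour are enumerated, rather than the whole bucket.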
To modify the prefix for a Pipeline using Wrangler:
npx wrangler pipelines update [PIPELINE-NAME] --r2-prefix "test"
All the output files generated by your Pipeline will be stored under the prefix "test", and will look like this:
- test/event_date=2024-09-06/hr=15/37db9289-15ba-4e8b-9231-538dc7c72c1e-15.json.gz
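When consuming these files downstream, it can help to recover the partition fields from an object key. The following Python sketch does this with a regular expression, under stated assumptions: the `PartitionedKey` type and `parse_key` function are hypothetical names, and the pattern is inferred only from the example keys shown here, so it may need adjusting if real keys differ.

```python
import re
from typing import NamedTuple, Optional


class PartitionedKey(NamedTuple):
    prefix: str      # optional pipeline prefix, "" if none
    event_date: str  # e.g. "2024-09-06"
    hour: int        # e.g. 15
    filename: str    # e.g. "<id>.json.gz"


# Assumption: keys follow [prefix/]event_date=YYYY-MM-DD/hr=H[H]/filename.
_KEY_RE = re.compile(
    r"^(?:(?P<prefix>.+)/)?"
    r"event_date=(?P<date>\d{4}-\d{2}-\d{2})/"
    r"hr=(?P<hr>\d{1,2})/"
    r"(?P<file>[^/]+)$"
)


def parse_key(key: str) -> Optional[PartitionedKey]:
    """Split an output object key into its parts, or return None if it does not match."""
    m = _KEY_RE.match(key)
    if m is None:
        return None
    return PartitionedKey(
        prefix=m.group("prefix") or "",
        event_date=m.group("date"),
        hour=int(m.group("hr")),
        filename=m.group("file"),
    )


key = "test/event_date=2024-09-06/hr=15/37db9289-15ba-4e8b-9231-538dc7c72c1e-15.json.gz"
parsed = parse_key(key)
print(parsed.prefix, parsed.event_date, parsed.hour)
# test 2024-09-06 15
```

Returning `None` for non-matching keys lets a consumer skip unrelated objects that share the bucket instead of failing on them.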