Dataflow: FirestoreIO for Python SDK

Hello people!

I have documents stored in Firestore and was wondering whether there is a native FirestoreIO connector in the Apache Beam Python SDK. Online, I could only find a FirestoreIO for the Java SDK. Maybe I can accomplish my task using multi-language pipelines.

I look forward to reading your comments.

PS: I've already asked Duet AI. This is its response. (Warning: a little hallucination ahead.)

Prompt: I'm using Apache Beam to create my data pipelines. I want to create a PCollection from a Document in Firestore using Python SDK. Give me a step by step tutorial on how to accomplish this task

Response:

import apache_beam as beam
import firestore

# Create a FirestoreIO source object.
source = beam.io.FirestoreIO(collection='my-collection', document='my-document')

# Create a PCollection object by calling the `read()` method on the `FirestoreIO` source object.
pcoll = source.read()

# Use the `PCollection` object to perform data processing operations.
pcoll | 'Filter' >> beam.Map(lambda x: x if x['field1'] == 'value1' else None) | 'Write to BigQuery' >> beam.io.WriteToBigQuery(table='my-table')
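
For comparison with the response above, here is a minimal sketch that sticks to calls I know exist: the `google-cloud-firestore` client (`Client().collection(...).stream()`) wrapped in an ordinary `beam.FlatMap`. It is not a native connector, and it assumes both packages are installed and default credentials are configured. The names `read_collection`, `run`, and the `_doc_id` key are hypothetical, chosen only for illustration. Note that it reads the whole collection on a single worker, so it is only a starting point for small collections.

```python
def read_collection(collection_name, client=None):
    """Yield every document in a Firestore collection as a plain dict."""
    if client is None:
        # Imported lazily so the function still works after being
        # serialized and shipped to a worker.
        from google.cloud import firestore
        client = firestore.Client()
    for snapshot in client.collection(collection_name).stream():
        row = snapshot.to_dict() or {}
        # '_doc_id' is a hypothetical key name used to keep the document ID.
        row['_doc_id'] = snapshot.id
        yield row


def run(collection_name='my-collection'):
    """Build and run a small pipeline over one Firestore collection."""
    import apache_beam as beam

    with beam.Pipeline() as pipeline:
        (
            pipeline
            | 'Seed' >> beam.Create([collection_name])
            | 'Read documents' >> beam.FlatMap(read_collection)
            | 'Filter' >> beam.Filter(lambda doc: doc.get('field1') == 'value1')
            | 'Print' >> beam.Map(print)
        )
```

The filtering step uses `beam.Filter` because `beam.Map` would emit `None` for non-matching elements instead of dropping them. Calling `run()` executes the pipeline on the default runner.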

--
Best regards
David Regalado
Web | Linkedin | Cloudskillsboost