Hello,
I'm a beginner working on a project. I used ChatGPT to generate a Python script that creates dummy employee data (name, address, SSN, password, etc.) and loads it into a Cloud Storage bucket, and I then use Data Fusion Wrangler to transform the data. Unfortunately, when I pulled the data from the bucket into Data Fusion, I saw that some of the fields are missing values and the last two columns are completely empty. Can anyone help me troubleshoot this problem?
Here is my Python code:
I didn't do anything in the GCP UI. I used the Python script to load the data into the bucket. Please guide me.
Hi @Asif_Shaharia,
Welcome to Google Cloud Community!
One possible reason you're seeing missing values and completely blank columns is a wrong delimiter in Data Fusion Wrangler. If Data Fusion expects a different delimiter, for example a semicolon (;), the rows will not be split into the right columns and values will appear to be lost. Since your file is employee_data.csv, you should be using a comma (,) as the delimiter.
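To illustrate why the delimiter matters, here is a minimal local sketch in Python. It is not your script: the column names and sample rows are made up, and it only assumes that generated addresses may contain commas. It compares how many fields each row yields when parsed as comma-delimited CSV, when parsed with a semicolon, and when split naively on commas without respecting quotes.

```python
import csv
import io

def write_rows(header, rows):
    """Serialize rows to CSV text; csv.writer quotes fields that contain commas."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()

def field_counts(text, delimiter=","):
    """Return the number of fields per row when parsed with the given delimiter."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return [len(row) for row in reader]

# Hypothetical sample data, for illustration only.
header = ["name", "address", "ssn", "password"]
rows = [
    ["Jane Doe", "12 Main St, Springfield", "000-00-0000", "secret1"],
    ["John Roe", "7 Elm Rd", "000-00-0001", "secret2"],
]

csv_text = write_rows(header, rows)

comma_counts = field_counts(csv_text, ",")      # correct delimiter
semicolon_counts = field_counts(csv_text, ";")  # wrong delimiter
# Naive split that ignores quoting, as a parser without quote handling would do.
naive_counts = [len(line.split(",")) for line in csv_text.splitlines()]

print("comma:", comma_counts)
print("semicolon:", semicolon_counts)
print("naive split:", naive_counts)
```

If the counts differ from row to row, values end up in the wrong columns and trailing columns can look empty. In that case, checking both the delimiter setting and whether the script quotes fields that contain commas would be a reasonable next step.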
Also, in your Data Fusion pipeline, examine the data types assigned to each field in the Wrangler transformations. Make sure the types are compatible with the data you're trying to process.
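Before adjusting anything in Wrangler, it may also help to confirm whether the blanks are already present in the file itself. The following is a rough sketch that counts empty values per column. The sample text and column names are hypothetical; with a real file you would read its contents instead.

```python
import csv
import io

def empty_counts(text):
    """Count empty or missing values per column in comma-delimited CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    counts = {name: 0 for name in reader.fieldnames}
    for row in reader:
        for name in counts:
            value = row.get(name)
            # DictReader fills columns missing from short rows with None.
            if value is None or value.strip() == "":
                counts[name] += 1
    return counts

# Hypothetical sample, for illustration only.
sample = (
    "name,address,ssn,password\n"
    'Jane Doe,"12 Main St, Springfield",000-00-0000,secret1\n'
    "John Roe,7 Elm Rd,,\n"
)

result = empty_counts(sample)
print(result)
```

If the counts are already non-zero here, the problem is in the generated data rather than in Data Fusion. If they are all zero, the parsing or type settings in the pipeline are the more likely cause.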
Note: Data Fusion is a visual point-and-click interface that enables code-free deployment of ETL/ELT data pipelines. If you really want to use Python code in your pipeline, I highly suggest using Dataflow instead.
I hope the above information is helpful.