Hello, I am creating a dataflow pipeline using java that will read from BQ and then insert the data into a mysql db. It reads using bigquery io, but my understanding now is that i cannot use side inputs with bigquery io. The read performed is something like below where it queries a string "select * from table"
PCollection<TableRow> data = pipeline.apply("read from BQ",
BigQueryIO
.readTableRows()
.fromQuery(query)
.usingStandardSql());
But i need to make an api call to another application that will go into a db and retrieve a date for me. so how can i do it to where I get that date from the api call and then include it into my query where it will be something like:
select * from table where date = "date from api call"
Hi @egooge,
Welcome to Google Cloud Community!
You're correct, you can't directly use side inputs with BigQueryIO.readTableRows(). This is because BigQuery jobs need to be defined with all their parameters upfront before they're submitted to the BigQuery service.
Here’s how to achieve your API call:
String apiDate = makeAPICallToGetDate(); // Implement your API call here
String query = "SELECT * FROM `your_project.your_dataset.your_table` WHERE date = '" + apiDate + "'";
PCollection<TableRow> data = pipeline.apply("read from BQ",
BigQueryIO
.readTableRows()
.fromQuery(query)
.usingStandardSql());
private static String makeAPICallToGetDate() {
URL url = new URL(API_URL);
HttpURLConnection connection = (HttpURLConnection) url.openConnection();
connection.setRequestMethod("GET");
BufferedReader in = new BufferedReader(new InputStreamReader(connection.getInputStream()));
String inputLine;
StringBuilder response = new StringBuilder();
while ((inputLine = in.readLine()) != null) {
response.append(inputLine);
}
in.close();
String dateTimeString = getFieldValue(response.toString(), "datetime");
DateTimeFormatter formatter = DateTimeFormatter.ISO_DATE_TIME;
return LocalDateTime.parse(dateTimeString, formatter);
}
I hope the above information is helpful.