Hi, I'm using Python to fetch data from BigQuery from my corporate server. However, I get the following error:
TransportError: HTTPSConnectionPool(host='oauth2.googleapis.com', port=443): Max retries exceeded with url: /token (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1131)')))
I have provided a custom certificate at the path /home/jovyan/nscacert_combined.pem
How can I make the BigQuery client use this certificate, or bypass SSL verification?
My code snippet is below; I'm using "something" as a placeholder value.
import sys
import os
os.environ['PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION'] = 'python'
print(os.environ['PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION'])
os.environ["http_proxy"] = "http://28"
os.environ["https_proxy"] = "http://28"
os.environ["CURL_CA_BUNDLE"] = "/home/jovyan/nscacert_combined.pem"
import pandas as pd
import numpy as np
import requests
from google.cloud import bigquery
from google.oauth2 import service_account
from google.api_core import retry
credentials = service_account.Credentials.from_service_account_file("ServiceAccountKey.json")
project_id = "something"
dataset_id = "something"
max_retries = retry.Retry(deadline=5, predicate=retry.if_exception_type(IOError))
client = bigquery.Client(credentials=credentials, project=project_id)
updated_query = "SELECT * something..."
query_job = client.query(updated_query, retry=max_retries)
results = query_job.result()
df = results.to_dataframe()
To resolve the SSL certificate verification error, you need to point both the BigQuery API traffic and the OAuth token refresh at your custom CA bundle. Note where the traceback fails: the request to oauth2.googleapis.com/token. That call is made during credential refresh through a transport separate from the session the BigQuery client uses for queries, so configuring only the client's session is not enough.
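You can first confirm that the bundle itself covers Google's certificate chain with a direct call to the failing endpoint (a quick diagnostic sketch; any HTTP status at all, even a 404, means the TLS handshake succeeded):

import requests

# An SSLError here means the bundle is missing the issuing chain for
# oauth2.googleapis.com; a plain HTTP error response means TLS is fine.
resp = requests.get(
    "https://oauth2.googleapis.com/token",
    verify="/home/jovyan/nscacert_combined.pem",
)
print(resp.status_code)

Once the bundle checks out, configure the sessions in three steps: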
Create a token-refresh transport that uses your certificate: Build a plain requests.Session, set its verify attribute to the path of your CA bundle, and wrap it in google.auth.transport.requests.Request. This is the transport the credentials will use for the /token call.
Create an AuthorizedSession: google.auth.transport.requests.AuthorizedSession is itself a requests.Session subclass, so set its verify attribute to the same bundle and pass the Request from the previous step as its auth_request argument.
Pass the Custom Session to the BigQuery Client: When initializing the BigQuery client, pass your AuthorizedSession to the _http parameter of the client constructor.
Here's the modified code snippet incorporating these steps:
import os
import pandas as pd
import requests
from google.cloud import bigquery
from google.oauth2 import service_account
from google.auth.transport.requests import AuthorizedSession, Request
from google.api_core import retry

CA_BUNDLE = "/home/jovyan/nscacert_combined.pem"

# Set environment variables (if needed)
os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "python"
os.environ["http_proxy"] = "http://28"
os.environ["https_proxy"] = "http://28"
# CURL_CA_BUNDLE is no longer needed: the bundle is set explicitly on the
# sessions below, which takes precedence over the environment variables.

# Load your service account credentials
credentials = service_account.Credentials.from_service_account_file("ServiceAccountKey.json")

# Transport for the token refresh: a plain Session pointed at the corporate
# CA bundle, wrapped in the Request class that google-auth expects.
token_session = requests.Session()
token_session.verify = CA_BUNDLE
auth_request = Request(session=token_session)

# AuthorizedSession is a requests.Session subclass, so its verify attribute
# can be set directly; auth_request routes the /token call through the
# session configured above.
authed_session = AuthorizedSession(credentials, auth_request=auth_request)
authed_session.verify = CA_BUNDLE

# Initialize the BigQuery client with the custom HTTP session
project_id = "your_project_id"
client = bigquery.Client(credentials=credentials, project=project_id, _http=authed_session)

# Proceed with your query
updated_query = "SELECT * FROM `your_dataset.your_table`"
max_retries = retry.Retry(deadline=5, predicate=retry.if_exception_type(IOError))
query_job = client.query(updated_query, retry=max_retries)
results = query_job.result()
df = results.to_dataframe()
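Alternatively, since both the API calls and the token refresh ultimately go through the requests library, you can often skip the session plumbing by exporting REQUESTS_CA_BUNDLE before any client is created; requests consults it (falling back to CURL_CA_BUNDLE) whenever a session's verify is left at its default of True. A minimal sketch of that variant:

import os

# Must run before any Google client or session makes its first request, so
# that every requests.Session created afterwards picks up the corporate bundle.
os.environ["REQUESTS_CA_BUNDLE"] = "/home/jovyan/nscacert_combined.pem"

And if you really do need to bypass verification rather than fix it (for debugging only, since it removes protection against man-in-the-middle attacks), set verify = False on token_session and authed_session instead of pointing them at the bundle.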