
Issues faced using the PostgreSQL connector

After a night of trying to get this to work I thought I'd give a little feedback here. Apologies in advance to any devs who've doubtless worked hard on this up to this point.

First off: why is there no option to require SSL for the connection? This caused a delay until I tracked it down to 'no pg_hba.conf entry for host . . . no encryption' errors. All my normal remote connections are hostssl in pg_hba.conf, and it's a bad idea to default to allowing unencrypted connections, let alone to require them, as seems to be the case - it should be secure by default. Static IPs are nice, but IPs can be spoofed, and they do nothing for the confidentiality of the connection (ironically, permission is denied to configure egressMode in the static IP script example because the URL doesn't start with https:).
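For comparison, a typical pg_hba.conf setup that enforces TLS for remote clients looks something like this (the network range, database and user names here are placeholders, not my actual config):

```
# pg_hba.conf - only accept remote connections over TLS
# TYPE    DATABASE  USER              ADDRESS         METHOD
hostssl   mydb      integration_user  203.0.113.0/24  scram-sha-256
# No plain 'host' line for remote ranges, so non-SSL attempts are refused
```

With no matching 'host' entry, an unencrypted client gets exactly the 'no pg_hba.conf entry ... no encryption' error I quoted above.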

Setting up a proxy or VPN shouldn't be the offered solution when PostgreSQL already has mature options for this kind of thing. The only way I can see to use this reasonably for anything remotely private is to do my own logical replication to a Compute Engine VPS and set up a service connector (or BigQuery, but the potential cost of replication is prohibitive). That might actually work better given the schema-related issues that follow, but it's a poor workaround.
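A minimal sketch of that logical-replication workaround, assuming wal_level = logical on the source and placeholder names throughout:

```sql
-- On the source database: publish only the tables to be mirrored
CREATE PUBLICATION connector_pub FOR TABLE orders, customers;

-- On the Compute Engine replica: subscribe to that publication
CREATE SUBSCRIPTION connector_sub
    CONNECTION 'host=source.example.com dbname=mydb user=replicator sslmode=require'
    PUBLICATION connector_pub;
```

The replica then holds only the tables the connector should see, which also sidesteps the schema-listing problems below.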

I decided to allow the 'host' connection for now and grant it access only to some simple public data to try it out. After getting past this, the next hurdle: the number of connections required was not made clear. I created the user with CONNECTION LIMIT 2 - I had only allowed one connector node, after all - but apparently this is not enough, and nor was 3, 4 or 5. When it did get going, it stopped with:

"failed to get schema for entity type [TABLE NAME]"

and

"com.google.cloud.connectors.runtime.v1.errors.ConnectorException: Unable to get entity metadata from source. metadata { key: "cause" value: "Server error [SQL state: 53300]: too many connections for role \"[ROLE NAME REMOVED]\"\nConnection was forcibly closed\n"

I think the critical number might actually be six; in any case, it stopped complaining after I let it have ten. It can't be opening a new connection for every table, can it? Because that leads to another issue:
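For anyone hitting the same SQL state 53300 error, raising the role's cap to what the connector apparently needs is a one-liner (role name is a placeholder):

```sql
-- Raise the per-role connection cap; -1 would remove the limit entirely
ALTER ROLE integration_user CONNECTION LIMIT 10;
```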

We have a bunch of temporary (or rather, unlogged, since they need to persist for a while for users) search-result tables in the public schema - several thousand at times. Maybe this is not a great schema design. Nonetheless, that is the state of things, and the end result is that the connector spends a lot of time and data looking up and listing tables that won't exist by the time it's done, and only partially retrieves the list of 'real' tables.

It might have been better to let me specify ahead of time the tables and functions it should access (or to set a custom filter like the existing ones for pg_temp_* that I see in the query), bearing in mind that it can be hard to safely limit public-schema access retrospectively - it might break something, the integration user may not have access, and so on.
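As an illustration of the kind of filter I mean, something along these lines in the metadata query would skip our transient tables (the name pattern is specific to our schema, purely an example):

```sql
-- List only 'real' public tables, excluding transient search-result tables
SELECT c.relname
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'public'
  AND c.relkind = 'r'
  AND c.relname NOT LIKE 'search\_results\_%';
```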

 

I got to the point of actually creating connectors (although due to the schema-listing issue above, I had to refresh a few times to get first one table and then another - the first had disappeared from the list, but fortunately the connector remained). A final sticking point was that when I put 'priority desc' in the sort order, it output "priority desc" ASC, which was both the opposite of the intent and resulted in a SyncBus error.

Happily, I did actually get some data out of it in the end (processing it is a separate head-scratcher; it took some puzzling to reproduce a simple active = 't' filter - thankfully the connector can do that itself). I was sorely tempted to give up halfway through. It's definitely not GA-level just yet.

1 ACCEPTED SOLUTION

Thank you @GreenReaper for the valuable feedback. The PostgreSQL connector is in the preview stage. SSL support is on the roadmap and will be available when the connector reaches GA. We will also take on board your feedback related to filter improvements.


3 REPLIES


Thank you. I do appreciate it's a lot of work - and there are edge cases here which might not occur in more professionally-managed databases (though that perhaps reflects the target market of the apps). I see the static IP docs have now been updated, which might save some head-scratching.

One last issue I should mention: while retrying the schema load, it racked up 260 secret accesses for the password. This may be unavoidable if it keeps restarting connections, but I thought I should mention it, since it would be easy to leave it retrying and blow through 10,000 in a few days - perhaps it should limit the number of connections rather than retry on that error.

 
I understand the per-user code is something along the lines of:
SELECT rolname, rolconnlimit
FROM pg_roles
WHERE rolconnlimit <> -1;

…though you could be more specific in querying for the user's role, which would at least catch cases where a limit is not inherited.
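A more targeted version, checking the limit on the connector's own role rather than listing every limited role (role name is a placeholder), might be:

```sql
-- Check the connection limit for one specific role; -1 means unlimited
SELECT rolname, rolconnlimit
FROM pg_roles
WHERE rolname = 'integration_user';
```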

The trace was:

 
at com.google.cloud.connectors.runtime.v2.services.AdminServiceImplV2.checkStatus(AdminServiceImplV2.java:57)
at com.google.cloud.connectors.runtime.v2.AdminServiceGrpc$MethodHandlers.invoke(AdminServiceGrpc.java:272)
at io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:182)
at io.grpc.PartialForwardingServerCallListener.onHalfClose(PartialForwardingServerCallListener.java:35)
at io.grpc.ForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:23)
at io.grpc.ForwardingServerCallListener$SimpleForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:40)
at com.google.cloud.connectors.runtime.v1.services.ExceptionHandler$ExceptionHandlingServerCallListener.onHalfClose(ExceptionHandler.java:57)
at io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:351)
at io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:860)
at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
}
integration_connectors_trace_id: "5d2ecf11-29b7-4515-9582-ecba734ee1e3"

@GreenReaper We do connection pooling of 100 per node (each connection has 2 nodes initially by default), so depending on the received traffic, a higher number of Postgres connections can be used. Also, we do not refresh the schema unless the connection goes from the Error state to the Active state. If possible, could you private message me the connection details and the exact error you are encountering?

Thanks.
