Analytics and Monetization issue when switching over to Apigee OPDK DR Data Center

Issue: While performing a failover drill to our DR data center, we observe the following issues:

  • The analytics data is still being written to the old PG master node
  • Monetization messages are stuck in the Qpid queue

Background and overview:

We have two Apigee OPDK data centers, a primary DC and a DR DC, in active-passive mode. At any given time, API traffic flows through only one of the DCs. If the primary DC fails, traffic is routed to the DR DC.

  • We have a 14-node setup in each DC:

    Node       Components
    Node 1     Apigee Message Processor and Router
    Node 2     Apigee Message Processor and Router
    Node 3     Apigee Message Processor and Router
    Node 4     Apigee Message Processor and Router
    Node 5     Apigee Message Processor and Router
    Node 6     Apigee Message Processor and Router
    Node 7     Cassandra and Zookeeper
    Node 8     Cassandra and Zookeeper
    Node 9     Cassandra and Zookeeper
    Node 10    Edge UI, management server, OpenLDAP
    Node 11    Edge UI, management server, OpenLDAP
    Node 12    Postgres DB
    Node 13    Qpid Server
    Node 14    Qpid Server

  • Postgres replication is enabled, with the PG node in DC-1 as master and the one in the DR DC (DC-2) as slave.
  • A Cassandra ring has been established between the two DCs.
  • We followed the Apigee documentation to add DC-2 as a disaster recovery DC: https://docs.apigee.com/private-cloud/v4.50.00/adding-data-center
  • Monetization is installed on both DC-1 and DC-2.

We performed a planned failover and failback to verify that the DC-2 Apigee instance works as expected. However, we ran into the issues described below during the activity.

  1. DB error, cannot write to the read-only DB:
    • "ERROR: cannot execute INSERT in a read-only transaction"
    • "STATEMENT: INSERT INTO analytics. ****............**** GROUP BY apiproxy,apigee_timestamp,api_product"
  2. Monetization messages are stuck in the Qpid queue: the monetization messages are not processed and remain in the Qpid queue.
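To gauge how often the read-only error above occurs on a given PG node, a small log scan can help. The following is only a minimal sketch: the function name and the sample lines are hypothetical, and it assumes the Postgres log has already been read into a list of strings. It matches the error text quoted above and counts occurrences per SQL verb.

```python
import re
from collections import Counter

# Matches the Postgres error quoted in the post, capturing the SQL verb.
READ_ONLY_ERROR = re.compile(
    r"ERROR:\s+cannot execute (\w+) in a read-only transaction"
)


def count_read_only_errors(log_lines):
    """Count read-only-transaction errors per SQL verb (hypothetical helper)."""
    counts = Counter()
    for line in log_lines:
        match = READ_ONLY_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample lines, for illustration only.
sample = [
    "ERROR: cannot execute INSERT in a read-only transaction",
    "LOG: checkpoint starting: time",
    "ERROR: cannot execute INSERT in a read-only transaction",
]
print(dict(count_read_only_errors(sample)))
```

If the count keeps growing on the DC-1 node after the failover, some component is still connecting to that node for writes.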

 

A quick summary of failover steps that we performed:

  1. Ensure DC-2 components are up and running, that Postgres and Cassandra in DC-2 are in sync, and that other prerequisites are met
  2. Stop traffic on DC-1 Apigee instance
  3. Promote DC-2 PG as master and DC-1 PG as slave - https://docs.apigee.com/private-cloud/v4.50.00/handling-postgressql-database-failover
  4. Change Postgres database settings for monetization - https://docs.apigee.com/private-cloud/v4.50.00/change-pg-settings-monetization
  5. Restart all Apigee components in suggested order - https://docs.apigee.com/private-cloud/v4.50.00/starting-stopping-and-restarting-apigee-edge
  6. Update LBs to point to DC-2 Apigee instance and enable traffic on DC-2 Apigee instance
  7. Monitor Apigee traffic, check for component logs
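After step 3 it may be worth confirming that each PG node actually holds the role we expect. PostgreSQL's `SELECT pg_is_in_recovery()` returns true on a standby and false on a writable master. The sketch below assumes that value has already been collected per host; the host names (`dc1-pg`, `dc2-pg`) and the function name are hypothetical, and this is not part of the Apigee procedure.

```python
def find_role_problems(in_recovery_by_host, expected_master):
    """Compare observed PG roles with the expected post-failover roles.

    in_recovery_by_host maps a host name to the boolean returned by
    `SELECT pg_is_in_recovery()` on that host (True means standby).
    """
    problems = []
    if expected_master not in in_recovery_by_host:
        problems.append(f"{expected_master}: no status reported")
    for host, in_recovery in sorted(in_recovery_by_host.items()):
        if host == expected_master and in_recovery:
            problems.append(f"{host}: expected master but is a read-only standby")
        elif host != expected_master and not in_recovery:
            problems.append(f"{host}: expected standby but accepts writes")
    return problems


# Hypothetical state after promoting the DC-2 node.
observed = {"dc1-pg": True, "dc2-pg": False}
print(find_role_problems(observed, expected_master="dc2-pg"))
```

An empty list means the roles match. If the roles are correct but inserts still fail on DC-1, the clients writing analytics data are presumably still configured to connect to the old node.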

Could anyone help us understand why the analytics data is still being routed to the old PG master node even after the PG DB failover performed as described in the documentation?
