I've had an issue with Dataproc Metastore since 03:30 UTC.
Looking in Logs Explorer, it started after Metastore ran the script hive-schema-3.1.0.cloudspanner.sql.
Error message:
Starting metastore schema initialization to 3.1.0
Initialization script hive-schema-3.1.0.cloudspanner.sql
...
Error: FAILED_PRECONDITION: Operation with name "projects/xxx/instances/dpms-7bef6b94-a914-4ea8-b44/databases/hive/operations/rfea6af8e_6a40_422a_bd1a_8d98607d54ed" failed with status = GrpcStatusCode{transportCode=FAILED_PRECONDITION} and message = Duplicate name in schema: VERSION. (state=,code=9)
org.apache.hadoop.hive.metastore.HiveMetaException: Schema initialization FAILED! Metastore state would be inconsistent !!
Underlying cause: java.io.IOException : Schema script failed, errorcode 2
Use --verbose for detailed stacktrace.
*** schemaTool failed ***
Before that, there was also a failed schema validation log:
/opt/hive/bin/schematool -dbType cloudspanner -validate
...
Validating sequence number for SEQUENCE_TABLE
NEXT_VAL for MPartitionColumnStatistics in SEQUENCE_TABLE < max(CS_ID) in PART_COL_STATS
Failed in bit-reversal sequence number validation for SEQUENCE_TABLE.
...
My Dataproc Metastore can't start, and I haven't been able to connect to it since then.
Has anyone else run into this issue? Please help me resolve it.
Thank you
The error message "Duplicate name in schema: VERSION" indicates that the VERSION table or column already exists in the Cloud Spanner database. This can happen if the Hive Metastore schema has already been initialized for the database.
To resolve this issue, you can try the following:
Backup First: Before making any changes, ensure you have a backup of your Cloud Spanner database.
If you're sure that the VERSION table is the cause of the issue and you want to delete it, use the following Cloud Spanner SQL statement:
DROP TABLE hive.VERSION;
Then re-run the schema initialization:
/opt/hive/bin/schematool -dbType cloudspanner -initSchema
⚠️ Ensure you're using the correct version of the schema initialization script that matches your Hive Metastore version.
If you continue to face issues, consider reaching out to Google Cloud Support for further assistance.
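If it helps when gathering details for a support case, here is a minimal Python sketch that scans schematool output for the two failure lines quoted in this thread. The function name summarize_schematool_log and the labels it returns are hypothetical, the patterns are based only on the log lines posted above, and the sample log is abridged, so treat it as a starting point rather than a complete parser.

```python
import re

# Patterns are based only on the log lines quoted in this thread (assumption:
# other schematool versions may word these messages differently).
DUPLICATE_RE = re.compile(r"Duplicate name in schema: (\w+)")
SEQUENCE_RE = re.compile(
    r"NEXT_VAL for (\w+) in SEQUENCE_TABLE < max\((\w+)\) in (\w+)"
)


def summarize_schematool_log(log_text):
    """Return (kind, detail) tuples for the known failure lines, in log order.

    Hypothetical helper; the 'kind' labels are made up for illustration.
    """
    findings = []
    for line in log_text.splitlines():
        dup = DUPLICATE_RE.search(line)
        if dup:
            # The schema object that already exists, e.g. VERSION.
            findings.append(("duplicate_object", dup.group(1)))
        seq = SEQUENCE_RE.search(line)
        if seq:
            # Table and column whose max ID is ahead of the stored NEXT_VAL.
            findings.append(
                ("sequence_behind", "%s.%s" % (seq.group(3), seq.group(2)))
            )
    return findings


# Abridged sample built from the lines posted in the question.
sample = """\
NEXT_VAL for MPartitionColumnStatistics in SEQUENCE_TABLE < max(CS_ID) in PART_COL_STATS
Error: FAILED_PRECONDITION: ... message = Duplicate name in schema: VERSION. (state=,code=9)
"""

for kind, detail in summarize_schematool_log(sample):
    print(kind, detail)
```

Pasting the full log text exported from Logs Explorer into the function gives a short list of the affected objects, which can be included in the support request.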
Thank you for your recommendation. I think I should contact GCP Support, because this is a managed service and I don't know how to access Cloud Spanner or a command shell.
Yes, that would be a good idea.
Hi @hoadx, did you find a solution for this?
I would reach out to support for assistance with this issue!