How to resolve the inconsistency between Greenplum master node and seg node metadata

Keywords: Database GreenPlum

As a distributed database, Greenplum's metadata often (true, often encounter) some errors, the typical one is the inconsistency between master node and seg node metadata. This paper introduces the solution combined with a specific error.

phenomenon

When checking the database with gpcheckcat-p 5432 database ename, output similar to the following will be reported:

Relation oid: 12345678
Relation name: ns1.table1

    Name of test which found this issue: missing_extraneous_pg_class
    Name of test which found this issue: missing_extraneous_pg_attribute
    Name of test which found this issue: missing_extraneous_pg_type
        Missing relation metadata for {'oid':12345678} on master (mdw:5432) seg65 (sdw20:40000) seg66 (sdw20:40001) seg67 (sdw20:40002) seg68 (sdw20:40003) seg69 (sdw21:40000) seg70 (sdw21:40001) seg71 (sdw21:40002) seg72 (sdw21:40003) seg73 (sdw22:40000) seg74 (sdw22:40001) seg75 (sdw22:40002) seg76 (sdw22:40003) seg77 (sdw23:40000) seg78 (sdw23:40001) seg79 (sdw23:40002) seg80 (sdw23:40003) seg81 (sdw24:40000) seg82 (sdw24:40001) seg83 (sdw24:40002) seg84 (sdw24:40003) seg85 (sdw25:40000) seg86 (sdw25:40001) seg87 (sdw25:40002) seg88 (sdw25:40003) 

 

Analysis

This error indicates that there is no definition of ns1.table1 (relname=table1, oid=12345678) in master node and 24 SEG nodes (querying gp_segment_configuration metadata tables will find that these SEG nodes are primary node s). That is to say, pg_class, pg_attribute and pg_type metadata tables have no definition of this relationship. At this point, trying to drop ns1.table1 on the master will find that there is no such table. However, this relationship definition still exists on the remaining SEG nodes. The remaining segment node s can be found by querying gp_segment_configuration.

Solve

The basic idea, of course, is to drop the remaining table definitions. First, you need to find the hostname and node of all primary segment node s that are not listed in the error from gp_segment_configuration; then login to these segment nodes and drop ns1.table1, respectively. The specific process is as follows:

  • root landing master host
  • Switch to Greenplum administrator users, such as gpadmin:
su - gpadmin
  • Find all primary segment node s:
psql -d databasename -c "select hostname,port from gp_segment_configuration where role='p';"
  • Find all hostname + ports that are not listed in the error message and write statements in the following format:
PGOPTIONS='-c gp_session_role=utility' psql -h hostname -p port  -d databasename -c 'drop table if exists ns1.table1;'
  • Execute all commands generated in the previous step under the gpadmin user
  • Find all hostname + ports that are not listed in the error message and write statements in the following format:
PGOPTIONS='-c gp_session_role=utility' psql -h hostname -p port  -d databasename -c 'select * from pg_class where oid=12345678;'
  • Execute all the commands generated in the previous step under the gpadmin user to make sure that each command returns 0 records

Posted by teejayuu on Wed, 02 Jan 2019 09:39:08 -0800