greenplum杂记【未完待续】

时间:2022-11-20 00:25:50

查看当前的连接及sql执行

SELECT * from pg_stat_activity;

查看配置的segment内存上限

[gpadmin@mdw1 pg_log]$ gpconfig -s gp_vmem_protect_limit 
Values on all segments are consistent
GUC : gp_vmem_protect_limit
Master value: 8192
Segment value: 8192

查看配置最大连接数

[gpadmin@mdw1 pg_log]$ gpconfig -s max_connections;
Values on all segments are consistent
GUC : max_connections
Master value: 250
Segment value: 750

查看当前segment分配

db_t=# select * from gp_segment_configuration;
dbid | content | role | preferred_role | mode | status | port | hostname | address | replication_port | san_mounts
------+---------+------+----------------+------+--------+-------+---------------------+---------------------+------------------+------------
1 | -1 | p | p | s | u | 5432 | gpmaster.com | gpmaster.com | |
2 | 0 | p | p | s | u | 40000 | gpseg1.com | gpseg1.com | 41000 |
3 | 1 | p | p | s | u | 40000 | gpseg2.com | gpseg2.com | 41000 |
4 | 0 | m | m | s | u | 50000 | gpseg2.com | gpseg2.com | 51000 |
5 | 1 | m | m | s | u | 50000 | gpseg1.com | gpseg1.com | 51000 |
6 | -1 | m | m | s | u | 5432 | gpstandbymaster.com | gpstandbymaster.com | |
(6 rows)

status = d,代表挂了

连接某segment上的postgresql

PGOPTIONS='-c gp_session_role=utility' psql -h gpseg2.com -p 40000 -d db_t
db_
t=# select * from t1;
i1 | i2 | s1 | s2
----+----+----+----

2 | | |
4 | | |
6 | | |
(3 rows)


PGOPTIONS='-c gp_session_role=utility' psql -h gpseg2.com -p 50000 -d db_t
db_
t=# \dt
List of relations
Schema | Name | Type | Owner | Storage
--------+------+-------+---------+---------

public | t1 | table | gpadmin | heap
(1 row)

db_t=# select * from t1;
i1 | i2 | s1 | s2
----+----+----+----
1 | | |
3 | | |
5 | | |
7 | | |
(4 rows)

查看某表数据分布情况

db_t=# select gp_segment_id, * from t1;
gp_segment_id | i1 | i2 | s1 | s2
---------------+----+----+----+----
0 | 1 | | |
0 | 3 | | |
0 | 5 | | |
0 | 7 | | |
1 | 2 | | |
1 | 4 | | |
1 | 6 | | |
(7 rows)
db_t=# select gp_segment_id, count(*) , abs(count(*) - avg(count(*)) over(order by gp_segment_id rows between unbounded preceding and unbounded following))/count(*) skew from t1 group by gp_segment_id order by gp_segment_id limit 100;
gp_segment_id | count | skew
---------------+-------+------------------------
0 | 4 | 0.12500000000000000000
1 | 3 | 0.16666666666666666667
(2 rows)

pirmary和mirror切换

db_t=# select * from gp_segment_configuration;
dbid | content | role | preferred_role | mode | status | port | hostname | address | replication_port | san_mounts
------+---------+------+----------------+------+--------+-------+---------------------+---------------------+------------------+------------
1 | -1 | p | p | s | u | 5432 | gpmaster.com | gpmaster.com | |
2 | 0 | p | p | s | u | 40000 | gpseg1.com | gpseg1.com | 41000 |
3 | 1 | p | p | s | u | 40000 | gpseg2.com | gpseg2.com | 41000 |
4 | 0 | m | m | s | u | 50000 | gpseg2.com | gpseg2.com | 51000 |
5 | 1 | m | m | s | u | 50000 | gpseg1.com | gpseg1.com | 51000 |
6 | -1 | m | m | s | u | 5432 | gpstandbymaster.com | gpstandbymaster.com | |
(6 rows)

上表中,有两个字段,rolepreferred_role.
preferred_role代表,安装集群的时候,规划的该segment是p(primary)还是m(mirror)
role代表,当前该segment的状态

当前seg1和seg2上各有1个primary和1个mirror
可以进行测试:

  1. 将seg1上的primary干掉
    ps -ef | grep green,找到两个进程,然后根据端口号判定哪个是primary
  2. 查看上表状态,会发现seg2上,两个进程的role都变成了primary,而seg1上的master进程,状态变成了down

这时候,进行恢复,在gpmaster上,执行gprecoverseg,会将kill掉的那个进程拉起来,这时候再看表gp_segment_configuration,状态都是up,但是primary和mirror状态未变化。

有强迫症的同学,会想要让各个segment进程恢复到预定状态,这个时候,可以尝试kill掉preferrred_role是mirror,但是role是primary的进程。之后再执行一遍gprecoverseg

这里有个坑,就是执行gprecoverseg会触发起来的这个segment和正常的进行数据同步,有强迫症的同学,一定要等所有segment数据同步完成后,再操作,这里参见 https://yq.aliyun.com/articles/51165

master切换

greenplum master挂了,standby不会自动切换
greenplum master挂了,standby不会自动切换
greenplum master挂了,standby不会自动切换

一旦master挂了,要确保:
1. master服务器上,进程已kill
2. /tmp/.xx5432xxx文件被删除(这个是锁,不然standby无法active)
3. standby上source环境变量,同时export GPPORT=5432
4. 执行 standbyactive

这样就会将standby更新为新的master,但是旧的master就挂了,需要再添加

psql命令

连接

psql -d gp -h 192.168.123.92 -p 5432 -U gpadmin

创建数据库

createdb -h 192.168.123.92 -p 5432 -U gpadmin db_t

查看数据库列表

psql -l -h 192.168.123.92 -p 5432 -U gpadmin

查看集群信息

查看配置

select name, setting, unit, min_val, max_val, short_desc, extra_desc from pg_settings;

查看当前GP集群信息

select * from gp_segment_configuration order by address;

查看所有表

select schemaname,tablename,tableowner from pg_tables;

查看当前默认schema,更改当前schema

SHOW search_path;
SET search_path=t_schema;

查看分区设计

SELECT partitionboundary, partitiontablename, partitionname, 
partitionlevel, partitionrank
FROM pg_partitions
WHERE tablename='sales';

把master host上的一个csv文件装载到某个表内

copy t1 from '/home/gpadmin/data_to_load/t1.csv' delimiter as',';

\命令

列出所有数据库

\l

列出当前模式schema下所有表

\dt

显示命令执行的时间

\timing

查看某表的详细信息、索引等

\d+ t1

列出模式

\dn

查看各个segment数据分布

方法一:

# select gp_segment_id,count(col1) from t1 group by 1;

方法二:
1. 查找segment列表
“`
# select role, port, hostname from gp_segment_configuration where role=’p’ order by address;
role | port | hostname
——+——-+———-
p | 5432 | mdw
p | 40000 | sdw1
p | 40001 | sdw1
p | 40000 | sdw2
p | 40001 | sdw2
p | 40000 | sdw3
p | 40001 | sdw3
p | 40000 | sdw4
p | 40001 | sdw4
(9 rows)

```

2. 依次连接各个segment
连接后只允许操作各个节点数据

PGOPTIONS='-c gp_session_role=utility' psql -d haikang -h sdw1 -p 40000

3. 执行select count(*)找到对应表在此segment上的存储量

查看状态gpstate

常用

gpstate 
gpstate -c
gpstate -s
gpstate -f