1. Log: /data/clustrix/log/query.log
Records slow SQL, failed SQL, DDL, and similar events for a node. Each node writes its own file.
Each entry in the query.log is categorized as one of these types. Specific logging for each query type is controlled by the global or session variable indicated.
| Query Type | Description |
|---|---|
| ALTER CLUSTER | Changes made to your cluster via the ALTER CLUSTER command are always logged to the query.log automatically. This logging is not controlled by a global variable. |
| BAD | The query reads more rows than necessary to return the expected results. This may indicate a bad plan or missing index. Logging of BAD queries is not enabled by default (session_log_bad_queries). |
| DDL | The query is DDL (i.e. schema change such as CREATE, DROP, ALTER), or a SET GLOBAL or SESSION command. All DDL queries are initially logged by default (session_log_ddl). |
| SLOW | Query execution time exceeded the threshold specified by the variable session_log_slow_threshold_ms. |
| SQLERR | These database errors are things such as syntax errors, timeout notifications, and permission issues. All SQLERR queries will be logged by default (session_log_error_queries). |
These are the variables that control query and user logging. The defaults shown are generally acceptable for most installations.
| Name | Description | Default Value | Session Variable |
|---|---|---|---|
| session_log_bad_queries | Log BAD queries to the query.log | false | |
| session_log_ddl | Log DDL statements to query.log | true | |
| session_log_error_queries | Log ERROR statements to query.log | true | |
| session_log_slow_queries | Log SLOW statements to query.log | true | |
| session_log_slow_threshold_ms | Query duration threshold in milliseconds before logging this query | 10000 | |
| session_log_users | Log LOGIN/LOGOUT to user.log | false | |
MySQL [(none)]> show variables like '%session_log%';
+--------------------------------+--------+
| Variable_name | Value |
+--------------------------------+--------+
| session_log_bad_queries | false |
| session_log_bad_read_ratio | 100 |
| session_log_bad_read_threshold | 4000 |
| session_log_ddl | true |
| session_log_error_queries | true |
| session_log_slow_queries | true |
| session_log_slow_threshold_ms | 100000 |
| session_log_users | false |
+--------------------------------+--------+
8 rows in set (0.01 sec)
Change the slow-query threshold so that any SQL statement running longer than 1 s is logged:
MySQL [(none)]> set global session_log_slow_threshold_ms=1000;
Query OK, 0 rows affected (0.01 sec)
MySQL [(none)]> show global variables like 'session_log_slow_threshold_ms';
+-------------------------------+-------+
| Variable_name | Value |
+-------------------------------+-------+
| session_log_slow_threshold_ms | 1000 |
+-------------------------------+-------+
View the slow-SQL log (entries from every node in the cluster):
MySQL [(none)]> select * from system.tail_query_log order by timestamp desc limit 10;
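2. Log: /data/clustrix/log/user.log
Tracks user logins and logouts. Each node's file records only that node's sessions; for a cluster-wide view, query the system.tail_user_log table.
User login/logout logging is disabled by default: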
MySQL [system]> show variables like 'session_log_users';
+-------------------+-------+
| Variable_name | Value |
+-------------------+-------+
| session_log_users | false |
+-------------------+-------+
1 row in set (0.01 sec)
MySQL [system]> set global session_log_users=1;
Query OK, 0 rows affected (0.51 sec)
MySQL [system]> show variables like 'session_log_users';
+-------------------+-------+
| Variable_name | Value |
+-------------------+-------+
| session_log_users | true |
+-------------------+-------+
View login/logout activity across the cluster:
MySQL [system]> select * from system.tail_user_log;
+--------+----------------------------+------+----------+---------+---------------------------------------------------------------------------------------------------------------+----------+
| nodeid | timestamp | num | hostname | command | message | repeated |
+--------+----------------------------+------+----------+---------+---------------------------------------------------------------------------------------------------------------+----------+
| 1 | 2019-11-26 07:22:23.396171 | 0 | nid | NULL | 1 ip-10-1-3-88.cn-northwest-1.compute.internal clxnode: USER SID:16385 db="system" user=root@localhost LOGOUT | 0 |
| 1 | 2019-11-26 07:22:24.371812 | 1 | nid | NULL | 1 ip-10-1-3-88.cn-northwest-1.compute.internal clxnode: USER SID:17409 db=#undef user=root@localhost LOGIN | 0 |
| 1 | 2019-11-26 07:22:54.216004 | 2 | nid | NULL | 1 ip-10-1-3-88.cn-northwest-1.compute.internal clxnode: USER SID:18433 db=#undef user=root@localhost LOGIN | 0 |
| 1 | 2019-11-26 07:23:05.066316 | 3 | nid | NULL | 1 ip-10-1-3-88.cn-northwest-1.compute.internal clxnode: USER SID:18433 db=#undef user=root@localhost LOGOUT | 0 |
| 2 | 2019-11-26 07:23:07.523621 | 0 | nid | NULL | 2 ip-10-1-3-242.cn-northwest-1.compute.internal clxnode: USER SID:6146 db="system" user=root@localhost LOGOUT | 0 |
| 2 | 2019-11-26 07:23:08.049621 | 1 | nid | NULL | 2 ip-10-1-3-242.cn-northwest-1.compute.internal clxnode: USER SID:7170 db=#undef user=root@localhost LOGIN | 0 |
+--------+----------------------------+------+----------+---------+---------------------------------------------------------------------------------------------------------------+----------+
6 rows in set (0.00 sec)
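The message column above packs several fields into one string. As a minimal sketch, assuming every login/logout message follows the shape of the six sample rows (node id, host name, `clxnode: USER`, `SID:`, `db=`, `user=`, then `LOGIN` or `LOGOUT`), the fields can be split out with a regular expression. The `UserEvent` type and `parse_user_event` function are hypothetical helpers written for this note, not part of ClustrixDB, and other message shapes may exist that this pattern does not cover.

```python
import re
from typing import NamedTuple, Optional

# Pattern inferred from the sample rows only; this is an assumption,
# not a documented log format.
_USER_EVENT = re.compile(
    r"^(?P<nodeid>\d+)\s+(?P<host>\S+)\s+clxnode:\s+USER\s+"
    r"SID:(?P<sid>\d+)\s+db=(?P<db>\S+)\s+user=(?P<user>\S+)\s+"
    r"(?P<event>LOGIN|LOGOUT)\s*$"
)


class UserEvent(NamedTuple):
    nodeid: int
    host: str
    sid: int
    db: Optional[str]  # None when the log shows db=#undef
    user: str
    event: str         # "LOGIN" or "LOGOUT"


def parse_user_event(message: str) -> Optional[UserEvent]:
    """Split one tail_user_log message into fields; None if it does not match."""
    m = _USER_EVENT.match(message.strip())
    if m is None:
        return None
    raw_db = m.group("db")
    # "#undef" appears when no database was selected at login time.
    db = None if raw_db == "#undef" else raw_db.strip('"')
    return UserEvent(
        nodeid=int(m.group("nodeid")),
        host=m.group("host"),
        sid=int(m.group("sid")),
        db=db,
        user=m.group("user"),
        event=m.group("event"),
    )


if __name__ == "__main__":
    sample = (
        "1 ip-10-1-3-88.cn-northwest-1.compute.internal clxnode: "
        'USER SID:16385 db="system" user=root@localhost LOGOUT'
    )
    print(parse_user_event(sample))
```

Grouping the parsed events by `sid` would pair each LOGIN with its LOGOUT, for example SID 18433 in the output above.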
3. Rebalancer configuration
Check recent rebalancer activity:
sql> select * from system.rebalancer_activity_log order by started desc limit 10;
Speed up rebalancing:
sql> set global rebalancer_rebalance_task_limit = 8;
sql> set global rebalancer_vdev_task_limit = 4;
sql> set global task_rebalancer_rebalance_distribution_interval_ms = 5000;
sql> set global task_rebalancer_rebalance_interval_ms = 5000;
If the load becomes too high, restore the defaults:
sql> set global rebalancer_rebalance_task_limit = default;
sql> set global rebalancer_vdev_task_limit = default;
sql> set global task_rebalancer_rebalance_distribution_interval_ms = default;
sql> set global task_rebalancer_rebalance_interval_ms = default;
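Because the tuning and the rollback touch the same four variables, it can help to generate both sets of statements from one table so they never drift apart. The following is only a sketch: `REBALANCER_TUNING` and `set_global_statements` are hypothetical names chosen for this note, the values are the ones used above, and the script only prints SQL text without connecting to the cluster.

```python
from typing import Dict, List

# Variable names and values copied from the tuning statements in this section.
REBALANCER_TUNING: Dict[str, int] = {
    "rebalancer_rebalance_task_limit": 8,
    "rebalancer_vdev_task_limit": 4,
    "task_rebalancer_rebalance_distribution_interval_ms": 5000,
    "task_rebalancer_rebalance_interval_ms": 5000,
}


def set_global_statements(settings: Dict[str, int], reset: bool = False) -> List[str]:
    """Build one 'set global' statement per variable.

    With reset=True every variable is set back to 'default' instead of
    the tuned value.
    """
    statements = []
    for name, value in settings.items():
        rhs = "default" if reset else str(int(value))
        statements.append(f"set global {name} = {rhs};")
    return statements


if __name__ == "__main__":
    print("-- speed up rebalancing")
    print("\n".join(set_global_statements(REBALANCER_TUNING)))
    print("-- roll back if the load becomes too high")
    print("\n".join(set_global_statements(REBALANCER_TUNING, reset=True)))
```

The printed statements can be pasted into the `sql>` prompt or piped to a client of your choice.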
4. Find settings that differ from their defaults
sql> select * from system.global_variable_definitions where current_value != default_value;
5. Monitoring SQL
List the 3 historical SQL statements that consumed the most CPU (ranked by total exec_ms):
SELECT nodeid, exec_count, exec_ms, exec_ms/exec_count as avg_ms, left(statement,100)
FROM system.qpc_queries
ORDER BY exec_ms desc
LIMIT 3;
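The query above returns one row per nodeid, so a statement that is cached on several nodes may appear more than once. As a rough sketch, assuming rows of the form (nodeid, exec_count, exec_ms, statement) have already been fetched, the per-node rows can be merged into cluster-wide totals before ranking. The function name `top_statements` and the sample rows are made up for illustration and are not real cluster output.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (nodeid, exec_count, exec_ms, statement), mirroring the columns selected above.
Row = Tuple[int, int, int, str]


def top_statements(rows: Iterable[Row], limit: int = 3) -> List[Tuple[str, int, int, float]]:
    """Sum exec_count and exec_ms per statement across nodes, rank by total exec_ms."""
    totals: Dict[str, List[int]] = defaultdict(lambda: [0, 0])
    for _nodeid, exec_count, exec_ms, statement in rows:
        totals[statement][0] += exec_count
        totals[statement][1] += exec_ms

    ranked = sorted(totals.items(), key=lambda item: item[1][1], reverse=True)
    result = []
    for statement, (count, ms) in ranked[:limit]:
        # Guard against a zero execution count before dividing.
        avg_ms = ms / count if count else 0.0
        result.append((statement, count, ms, avg_ms))
    return result


if __name__ == "__main__":
    # Hypothetical sample rows for illustration only.
    rows = [
        (1, 10, 500, "select * from t1"),
        (2, 30, 1500, "select * from t1"),
        (1, 2, 900, "update t2 set a = 1"),
    ]
    for statement, count, ms, avg_ms in top_statements(rows):
        print(f"{ms:>8} ms  {count:>6} execs  {avg_ms:>8.1f} avg ms  {statement}")
```

Ranking by the summed exec_ms keeps the same ordering criterion as the SQL, only applied to the whole cluster rather than to each node separately.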
List the 100 most frequently executed SQL statements in the past 24 hours:
sql> SELECT query_key,
min(rank),
max(rank),
database,
left(statement,100),
sum(exec_count) as calc_exec_count,
round(avg(avg_rows_read)) as calc_avg_rows_read,
round(avg(avg_exec_ms)) as calc_avg_exec_ms
FROM clustrix_statd.qpc_history
WHERE timestamp BETWEEN (now() - interval 24 hour) AND now()
AND database !='clustrix_statd'
GROUP BY query_key
ORDER BY calc_exec_count DESC,
calc_avg_rows_read DESC
LIMIT 100;
List the 100 SQL statements that read the most rows on average in the past 24 hours:
sql> SELECT query_key,
min(rank),
max(rank),
database,
left(statement,100),
sum(exec_count) as calc_exec_count,
round(avg(avg_rows_read)) as calc_avg_rows_read,
round(avg(avg_exec_ms)) as calc_avg_exec_ms
FROM clustrix_statd.qpc_history
WHERE timestamp BETWEEN (now() - interval 24 hour) AND now()
AND database !='clustrix_statd'
GROUP BY query_key
ORDER BY calc_avg_rows_read DESC,
calc_exec_count DESC
LIMIT 100;
List the 100 SQL statements with the longest average execution time in the past 24 hours:
sql> SELECT query_key,
min(rank),
max(rank),
database,
left(statement,100),
sum(exec_count) as calc_exec_count,
round(avg(avg_rows_read)) as calc_avg_rows_read,
round(avg(avg_exec_ms)) as calc_avg_exec_ms
FROM clustrix_statd.qpc_history
WHERE timestamp BETWEEN (now() - interval 24 hour) AND now()
AND database !='clustrix_statd'
GROUP BY query_key
ORDER BY calc_avg_exec_ms DESC,
calc_exec_count DESC
LIMIT 100;
6. Tables in the system database
MySQL [(none)]> use system;
Database changed
MySQL [system]> show tables;
+-----------------------------------+
| Tables_in_system |
+-----------------------------------+
| activity |
| alerts_intervals |
| alerts_messages |
| alerts_parameters |
| alerts_subscriptions |
| alter_progress |
| autoinc_sequences |
| backups |
| backup_masters |
| backup_status |
| backup_tables |
| barriers |
| base_allocators |
| bigc_state |
| binlogs |
| binlog_commits |
| binlog_commits_segments |
| binlog_ignore_databases |
| binlog_ignore_tables |
| binlog_log_databases |
| binlog_log_tables |
| binlog_segments |
| bm_latch_waits |
| bm_stats |
| broadcast_nodes |
| check_constraints |
| cluster_session_stats |
| cluster_session_variables |
| columns |
| constraints |
| containers |
| container_stats |
| container_truncates |
| container_type_codes |
| cpm_history |
| cpm_info |
| cpu_activity |
| cpu_allocations |
| cpu_load |
| databases |
| debugpoints |
| deferred_foreign_keys |
| deferred_foreign_key_columns |
| definers |
| device_containers |
| device_containers2 |
| device_space_stats |
| disks |
| disk_activity |
| disk_paths |
| dlog_stats |
| dropped_binlogs |
| engines |
| error_codes |
| event_map |
| failpoints |
| fibers |
| flow_control_channels |
| flow_control_peers |
| foreign_keys |
| foreign_key_columns |
| global_stats |
| global_variables |
| global_variables_ignored |
| global_variable_definitions |
| groups |
| gtm_accepter |
| gtm_coord |
| gtm_coord_invocations |
| gtm_ddl |
| gtm_ddl_server |
| gtm_poisoned_invoc |
| gtm_poisoned_trx |
| gtm_repick_accepters |
| gtm_resolver |
| gtm_send_queues |
| hash_distribution_map |
| heaps |
| imported_index_pds |
| index_sizes |
| index_stats |
| init_graph_stats |
| internal_routines |
| internode_latency |
| invocations |
| irp_queues |
| key_caches |
| layercons |
| layers |
| layer_merges |
| license |
| load |
| lockman |
| lockman_holders |
| lockman_victims |
| lpdcache |
| lpds |
| ltm_transactions |
| mdstat |
| media_scanners |
| membership |
| memory_table_replicas |
| missing_pds |
| missing_pd_columns |
| missing_pd_details |
| mounts |
| mvcc_waiters |
| mysql_binlogs |
| mysql_binlog_index |
| mysql_binlog_segments |
| mysql_binlog_stats |
| mysql_binlog_trims |
| mysql_character_sets |
| mysql_collations |
| mysql_db_replication_policy |
| mysql_error_codes |
| mysql_indexed_binlogs |
| mysql_master_status |
| mysql_registered_slaves |
| mysql_repconfig |
| mysql_repslave_svars |
| mysql_repstate |
| mysql_repstate_until |
| mysql_sessions |
| mysql_slave_connection_status |
| mysql_slave_db_replication_policy |
| mysql_slave_driver_status |
| mysql_slave_log_updates |
| mysql_slave_rewrite_db |
| mysql_slave_skip_errors |
| mysql_slave_stats |
| mysql_slave_status |
| mysql_slave_variables |
| mysql_table_replication_policy |
| named_locks |
| networking |
| network_activity |
| nodeinfo |
| nodes |
| objects |
| partitioned_hash_distributions |
| partitions |
| partition_endpoints |
| partition_functions |
| pdcache_raw |
| pdms |
| pds |
| pd_groups |
| pd_group_columns |
| pending_invites |
| periodic_tasks |
| poisoned_barriers |
| privileges |
| problem_nodes |
| processlist |
| processlistfull |
| proc_cpu |
| proc_cpu_rates |
| proc_diskstats |
| proc_diskstat_rates |
| proc_interrupts |
| proc_interrupt_rates |
| proc_meminfo |
| proc_net_dev |
| proc_net_dev_rates |
| proc_softirqs |
| proc_softirq_rates |
| profiled_invocations |
| profiled_plans |
| profiled_statements |
| profiled_til |
| profiled_transactions |
| profiling |
| program_cache |
| protection_log |
| ps |
| public_keys |
| qpc_lru |
| qpc_plans |
| qpc_queries |
| queues |
| queue_readers |
| queue_replays |
| queue_replay_streams |
| queue_replay_waiters |
| queue_status |
| range_hash_distributions |
| rebalancer_activity_log |
| rebalancer_activity_targets |
| rebalancer_copies |
| rebalancer_copy_activity |
| rebalancer_copy_work_queue |
| rebalancer_hash_distributions |
| rebalancer_queued_activity |
| rebalancer_redistributes |
| rebalancer_replicas |
| rebalancer_representations |
| rebalancer_scheduled_replicas |
| rebalancer_slices |
| rebalancer_splits |
| rebalancer_started_activity |
| rebalancer_summary |
| rebalancer_vdevs |
| redistributes |
| relations |
| relation_alters |
| relation_alter_columns |
| relation_build_status |
| relation_sizes |
| replicas |
| replicated_containers |
| replication_checkpoint |
| replication_master_status |
| replica_copies |
| replica_sizes |
| replica_status_codes |
| representations |
| representation_builds |
| representation_columns |
| representation_sizes |
| representation_stats |
| restore_database_builds |
| restore_object_builds |
| ril_cache_p |
| ril_stats |
| routines |
| routine_parameters |
| sequences |
| sequence_state |
| sessions |
| session_call_stacks |
| session_containers |
| session_local_variables |
| session_pdcache |
| session_row_stats |
| sighandlers |
| sighandlers_json |
| signals |
| skiplists |
| slave_row_stats |
| slave_slices |
| slave_slice_readers |
| slave_slice_relations |
| slave_slice_writers |
| slices |
| slice_splits |
| softfailed_devices |
| softfailed_nodes |
| softfailing_containers |
| sqlstates |
| stats |
| strmaps |
| suspended_closures |
| system_users |
| sys_block_devs |
| tables_and_views |
| table_pdms |
| table_replicas |
| table_sizes |
| table_slices |
| tail_clustrix_log |
| tail_nanny_log |
| tail_query_log |
| tail_user_log |
| tail_webui_log |
| tasks |
| task_placement |
| tcp |
| temporary_tables |
| time_zones |
| tracepoints |
| transactions |
| triggers |
| trigger_event |
| trigger_orientation |
| trigger_timing |
| trxstate |
| trxstate_stats |
| underprotected_slices |
| users |
| user_accessible_databases |
| user_accessible_tables |
| user_accessible_triggers |
| user_accessible_users |
| user_acl |
| user_routine_acl |
| vdev_io |
| vdev_stat |
| version_history |
| views |
| virtual_relations |
| virtual_views |
| vmstats |
| wals |
| wal_windows |
+-----------------------------------+
295 rows in set (0.14 sec)

