We can tune Redshift query by the following four steps literately
1. Identify slow queries
select trim(t3.usename) AS username , t1.query AS query_ID, t1.starttime, t1.elapsed, t2.querytxt from svl_qlog t1 join STL_QUERY t2 ON t1.query = t2.query JOIN pg_user t3 ON t1.userid = t3.usesysid order by t1.elapsed desc limit 50;
2. Table Design
select correct sort key, distribution key, and column compression algorithms (see blog post "
Redshift table optimizing with SORTKEY and DISTKEY.")
3. Query Design
4. WLM Design
http://docs.aws.amazon.com/redshift/latest/dg/t_determine_if_query_runs_on_disk.html
http://docs.aws.amazon.com/redshift/latest/dg/r_SVL_QLOG.html
http://docs.aws.amazon.com/redshift/latest/dg/c-optimizing-query-performance.html
select query, elapsed, substring from svl_qlog order by elapsed desc limit 5;
select query, step, rows, workmem, label, is_diskbased from svl_query_summary where query = 4839858 order by workmem desc;
select owner as node, diskno, used, capacity from stv_partitions order by 1, 2, 3, 4;
select owner, host, diskno, used, capacity,
(used-tossed)/capacity::numeric *100 as pctused
from stv_partitions order by owner;
SELECT userid, xid, pid, query, segment, locus,
datediff(ms, starttime, endtime) as duration, compile
from svl_compile
where query = 35878 or query = 35879
order by query, segment;
Redshift Tips
http://oksoft.blogspot.com/2013/03/redshift-tips.html
http://www.slideshare.net/AmazonWebServices/amazon-redshift-best-practices
No comments:
Post a Comment