Monday, June 2, 2014

Tuning redshift query performance

We can tune Redshift query by the following four steps literately

1. Identify slow queries

select trim(t3.usename) AS username , t1.query AS query_ID, t1.starttime, t1.elapsed, t2.querytxt from svl_qlog t1 join STL_QUERY t2 ON t1.query = t2.query JOIN pg_user t3 ON t1.userid = t3.usesysid  order by t1.elapsed desc limit 50;

2. Table Design

select correct sort key, distribution key, and column compression algorithms (see blog post "Redshift table optimizing with SORTKEY and DISTKEY.")

3. Query Design



4. WLM Design


http://docs.aws.amazon.com/redshift/latest/dg/t_determine_if_query_runs_on_disk.html http://docs.aws.amazon.com/redshift/latest/dg/r_SVL_QLOG.html http://docs.aws.amazon.com/redshift/latest/dg/c-optimizing-query-performance.html select query, elapsed, substring from svl_qlog order by elapsed desc limit 5; select query, step, rows, workmem, label, is_diskbased from svl_query_summary where query = 4839858 order by workmem desc; select owner as node, diskno, used, capacity from stv_partitions order by 1, 2, 3, 4; select owner, host, diskno, used, capacity, (used-tossed)/capacity::numeric *100 as pctused from stv_partitions order by owner; SELECT userid, xid, pid, query, segment, locus, datediff(ms, starttime, endtime) as duration, compile from svl_compile where query = 35878 or query = 35879 order by query, segment; Redshift Tips http://oksoft.blogspot.com/2013/03/redshift-tips.html http://www.slideshare.net/AmazonWebServices/amazon-redshift-best-practices

No comments:

Post a Comment