yiyujia@hadoopDev:~ $ curl -s get.gvmtool.net | bash yiyujia@hadoopDev:~ $ source "/root/.gvm/bin/gvm-init.sh" yiyujia@hadoopDev:~ $ gvm help yiyujia@hadoopDev:~ $ gvm install groovy yiyujia@hadoopDev:~ $ gvm install grails yiyujia@hadoopDev:~ $ gvm install grails 2.2.4 yiyujia@hadoopDev:~ $ grails | Enter a script name to run. Use TAB for completion: grails> quit | Goodbye yiyujia@hadoopDev:~ $
Tuesday, June 10, 2014
set up Grail on Groovy environment on CentOS
Monday, June 2, 2014
Tuning redshift query performance
We can tune Redshift query by the following four steps literately
1. Identify slow queries
select trim(t3.usename) AS username , t1.query AS query_ID, t1.starttime, t1.elapsed, t2.querytxt from svl_qlog t1 join STL_QUERY t2 ON t1.query = t2.query JOIN pg_user t3 ON t1.userid = t3.usesysid order by t1.elapsed desc limit 50;2. Table Design
select correct sort key, distribution key, and column compression algorithms (see blog post "Redshift table optimizing with SORTKEY and DISTKEY.")3. Query Design
4. WLM Design
http://docs.aws.amazon.com/redshift/latest/dg/t_determine_if_query_runs_on_disk.html
http://docs.aws.amazon.com/redshift/latest/dg/r_SVL_QLOG.html
http://docs.aws.amazon.com/redshift/latest/dg/c-optimizing-query-performance.html
select query, elapsed, substring from svl_qlog order by elapsed desc limit 5;
select query, step, rows, workmem, label, is_diskbased from svl_query_summary where query = 4839858 order by workmem desc;
select owner as node, diskno, used, capacity from stv_partitions order by 1, 2, 3, 4;
select owner, host, diskno, used, capacity,
(used-tossed)/capacity::numeric *100 as pctused
from stv_partitions order by owner;
SELECT userid, xid, pid, query, segment, locus,
datediff(ms, starttime, endtime) as duration, compile
from svl_compile
where query = 35878 or query = 35879
order by query, segment;
Redshift Tips
http://oksoft.blogspot.com/2013/03/redshift-tips.html
http://www.slideshare.net/AmazonWebServices/amazon-redshift-best-practices
Subscribe to:
Posts (Atom)