How to find the Spark master URL on Amazon EMR
I am new to Spark and am trying to install Spark 1.3.1 on an Amazon EMR cluster. When I use
SparkConf sparkConfig = new SparkConf().setAppName("SparkSQLTest").setMaster("local[2]");
it works for me; I came to know that for testing purposes one can set local[2].
When I tried to use cluster mode, I changed it to
SparkConf sparkConfig = new SparkConf().setAppName("SparkSQLTest").setMaster("spark://localhost:7077");
and with this I am getting the error below:
Tried to associate with unreachable remote address [akka.tcp://sparkMaster@localhost:7077]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: Connection refused
15/06/10 15:22:21 INFO client.AppClient$ClientActor: Connecting to master akka.tcp://sparkMaster@localhost:7077/user/Master...
Could you please let me know how to set the master URL?
If you are using the bootstrap action from https://github.com/awslabs/emr-bootstrap-actions/tree/master/spark, the configuration is set up for Spark on YARN. So set the master to yarn-client or yarn-cluster. Be sure to define the number of executors along with executor memory and cores; a minimal sketch follows below. More details on Spark on YARN are at https://spark.apache.org/docs/latest/running-on-yarn.html
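For example, here is a minimal sketch of the same driver code pointed at YARN instead of a standalone master. It assumes the driver runs on an EMR node where the bootstrap action has already put the YARN configuration on the classpath; the executor count, memory, and cores are placeholder values to tune for your instance type:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class SparkSQLTest {
    public static void main(String[] args) {
        // "yarn-client" keeps the driver on the local machine and runs the
        // executors on YARN; "yarn-cluster" runs the driver inside the
        // cluster too, but then the app must be launched via spark-submit.
        SparkConf sparkConfig = new SparkConf()
                .setAppName("SparkSQLTest")
                .setMaster("yarn-client")
                // Placeholder sizing -- see the sizing notes below for how to pick these:
                .set("spark.executor.instances", "4")
                .set("spark.executor.memory", "2g")
                .set("spark.executor.cores", "2");

        JavaSparkContext sc = new JavaSparkContext(sparkConfig);
        // ... run your job here ...
        sc.stop();
    }
}

Equivalently, you can leave setMaster out of the code and pass everything on the command line, e.g. spark-submit --master yarn-client --num-executors 4 --executor-memory 2g --executor-cores 2 ...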
In addition, regarding the executor settings for memory and core sizing:

Take a look at the default YARN NodeManager configs for each instance type at http://docs.aws.amazon.com/elasticmapreduce/latest/developerguide/taskconfiguration_h2.html, in particular yarn.scheduler.maximum-allocation-mb. You can determine the number of cores from the basic EC2 instance info (http://aws.amazon.com/ec2/instance-types/). The maximum executor memory has to fit within the maximum allocation less Spark's overhead, in increments of 256MB. A good description of this calculation is at http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2/. Don't forget that only a little over half of the executor memory can be used for the RDD cache. A worked example of the arithmetic is sketched below.
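As a rough illustration of that arithmetic, here is a sketch under stated assumptions: the 11520 MB allocation is a hypothetical yarn.scheduler.maximum-allocation-mb value (look up the real number for your instance type in the table linked above), and the 7% overhead with a 384 MB floor matches the Spark 1.x YARN default described in the Cloudera post (check the default for your version):

public class ExecutorMemorySizing {
    public static void main(String[] args) {
        int maxAllocationMb = 11520;   // hypothetical yarn.scheduler.maximum-allocation-mb
        double overheadFactor = 0.07;  // assumed Spark 1.x YARN overhead factor (384 MB floor)

        // The executor heap plus its YARN overhead must fit in the max allocation:
        //   executorMb + max(384, overheadFactor * executorMb) <= maxAllocationMb
        int executorMb = (int) (maxAllocationMb / (1 + overheadFactor));

        // Round down to a 256 MB increment, since YARN allocates in those steps.
        executorMb -= executorMb % 256;

        // With the defaults spark.storage.memoryFraction = 0.6 and
        // spark.storage.safetyFraction = 0.9, about 54% of the executor heap
        // is available for the RDD cache -- the "little over half" above.
        int cacheMb = (int) (executorMb * 0.6 * 0.9);

        System.out.println("executor memory ~ " + executorMb + " MB");  // 10752 MB
        System.out.println("usable RDD cache ~ " + cacheMb + " MB");    // ~5806 MB
    }
}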