How to find the Spark master URL on Amazon EMR


I am new to Spark and am trying to install Spark version 1.3.1 on an Amazon cluster. When I use

SparkConf sparkConfig = new SparkConf().setAppName("sparksqltest").setMaster("local[2]");

it works for me, and I came to know that local[2] can be set for testing purposes.

When I tried to use cluster mode, I changed it to

SparkConf sparkConfig = new SparkConf().setAppName("sparksqltest").setMaster("spark://localhost:7077");

With this I am getting the error below:

Tried to associate with unreachable remote address [akka.tcp://sparkMaster@localhost:7077]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: Connection refused
15/06/10 15:22:21 INFO client.AppClient$ClientActor: Connecting to master akka.tcp://sparkMaster@localhost:7077/user/Master..

Could you please let me know how to set the master URL?

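If you are using the bootstrap action from https://github.com/awslabs/emr-bootstrap-actions/tree/master/spark, the configuration is set up for Spark on YARN, so there is no standalone master listening at spark://localhost:7077. Just set the master to yarn-client or yarn-cluster. Be sure to define the number of executors along with their memory and cores. More details about Spark on YARN are at https://spark.apache.org/docs/latest/running-on-yarn.html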
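As a rough illustration of the settings mentioned above, here is a minimal sketch that collects them as Spark property key/value pairs. The property names (spark.master, spark.executor.instances, spark.executor.memory, spark.executor.cores) are standard Spark configuration keys; the class name, the helper methods, and the values 4, "2g" and 2 are placeholders assumed for the example, not recommendations for any particular cluster.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: gathers the settings for running on YARN in client mode.
final class YarnMasterSettings {

    // Builds the key/value pairs; the caller supplies the sizing values.
    static Map<String, String> yarnClientSettings(int executors, String executorMemory, int executorCores) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("spark.master", "yarn-client");              // or "yarn-cluster"
        settings.put("spark.executor.instances", Integer.toString(executors));
        settings.put("spark.executor.memory", executorMemory);    // e.g. "2g"
        settings.put("spark.executor.cores", Integer.toString(executorCores));
        return settings;
    }

    // Renders the pairs as whitespace-separated "key value" lines,
    // the layout used by spark-defaults.conf.
    static String toConfLines(Map<String, String> settings) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : settings.entrySet()) {
            sb.append(e.getKey()).append(' ').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Placeholder values, assumed only for this example.
        System.out.print(toConfLines(yarnClientSettings(4, "2g", 2)));
    }
}
```

Each pair can also be applied to the SparkConf from the question with sparkConfig.set(key, value), or the master can be set directly with setMaster("yarn-client") in place of the spark://localhost:7077 URL.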

An addition regarding executor settings for memory and core sizing:

Take a look at the default YARN NodeManager configs for each instance type at http://docs.aws.amazon.com/elasticmapreduce/latest/developerguide/taskconfiguration_h2.html, specifically yarn.scheduler.maximum-allocation-mb. You can determine the number of cores from the basic EC2 info URL (http://aws.amazon.com/ec2/instance-types/). The max size of the executor memory has to fit within the max allocation less Spark's overhead, and in increments of 256 MB. A description of this calculation is at http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2/. Don't forget that only a little over half of the executor memory can be used for RDD cache.
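The sizing rule above can be sketched as a small calculation. This is only an illustration under stated assumptions: the overhead is taken as the larger of 384 MB and 7% of the executor memory, which is how the linked Cloudera post describes the Spark 1.x default, so check it against your Spark version. The class name and the 11520 MB maximum allocation are hypothetical; read the real value from yarn.scheduler.maximum-allocation-mb for your instance type.

```java
// Hypothetical helper: finds the largest executor memory that fits in a YARN container.
final class ExecutorSizing {

    static final int MIN_OVERHEAD_MB = 384;   // assumed minimum overhead (Spark 1.x)
    static final int OVERHEAD_PERCENT = 7;    // assumed overhead share of executor memory
    static final int INCREMENT_MB = 256;      // increment mentioned in the answer

    // Overhead requested on top of the executor memory, rounded up to a whole MB.
    static int overheadMb(int executorMemoryMb) {
        int proportional = (executorMemoryMb * OVERHEAD_PERCENT + 99) / 100;
        return Math.max(MIN_OVERHEAD_MB, proportional);
    }

    // Largest multiple of INCREMENT_MB such that memory plus overhead
    // still fits within the YARN maximum allocation; 0 if nothing fits.
    static int maxExecutorMemoryMb(int maxAllocationMb) {
        int candidate = (maxAllocationMb / INCREMENT_MB) * INCREMENT_MB;
        while (candidate > 0 && candidate + overheadMb(candidate) > maxAllocationMb) {
            candidate -= INCREMENT_MB;
        }
        return candidate;
    }

    public static void main(String[] args) {
        int maxAllocationMb = 11520; // hypothetical yarn.scheduler.maximum-allocation-mb
        int memoryMb = maxExecutorMemoryMb(maxAllocationMb);
        System.out.println("spark.executor.memory " + memoryMb + "m");
    }
}
```

The result is meant as an upper bound for a single executor per container request; running several executors per node means dividing the node's memory accordingly before applying the same check.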

