Boosting spark.yarn.executor.memoryOverhead

Posted 2020-05-23 18:02

I'm trying to run a (py)Spark job on EMR that will process a large amount of data. Currently my job is failing with the following error message:

Reason: Container killed by YARN for exceeding memory limits.
5.5 GB of 5.5 GB physical memory used.
Consider boosting spark.yarn.executor.memoryOverhead.

So I googled how to do this, and found that I should pass the spark.yarn.executor.memoryOverhead parameter along with the --conf flag. I'm doing it this way (the step is built as a Python format string, hence the %s placeholders):

"aws emr add-steps\
--cluster-id %s\
--profile EMR\
--region us-west-2\
--steps Name=Spark,Jar=command-runner.jar,\
Args=[\
/usr/lib/spark/bin/spark-submit,\
--deploy-mode,client,\
/home/hadoop/%s,\
--executor-memory,100g,\
--num-executors,3,\
--total-executor-cores,1,\
--conf,'spark.python.worker.memory=1200m',\
--conf,'spark.yarn.executor.memoryOverhead=15300',\
],ActionOnFailure=CONTINUE" % (cluster_id, script_name)

But when I rerun the job it keeps giving me the same error message, still with 5.5 GB of 5.5 GB physical memory used, which implies that the memory did not increase. Any hints on what I am doing wrong?

EDIT

Here are details on how I initially create the cluster:

aws emr create-cluster\
--name "Spark"\
--release-label emr-4.7.0\
--applications Name=Spark\
--bootstrap-action Path=s3://emr-code-matgreen/bootstraps/install_python_modules.sh\
--ec2-attributes KeyName=EMR2,InstanceProfile=EMR_EC2_DefaultRole\
--log-uri s3://emr-logs-zerex\
--instance-type r3.xlarge\
--instance-count 4\
--profile EMR\
--service-role EMR_DefaultRole\
--region us-west-2

Thanks.

2 Answers

Viruses. · 2020-05-23 18:22

After a couple of hours I found the solution to this problem. When creating the cluster, I needed to pass the following flag as a parameter:

--configurations file://./sparkConfig.json\

With the JSON file containing:

[
    {
      "Classification": "spark-defaults",
      "Properties": {
        "spark.executor.memory": "10G"
      }
    }
  ]

This allows me to increase the memoryOverhead in the next step by using the parameter I initially posted.
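
Putting it together, the create-cluster call from my edit then looks roughly like this (same flags as before, just with --configurations added; treat it as a sketch, since the bucket names, key name, and bootstrap path are specific to my setup):

# sparkConfig.json is the file shown above, in the directory the command is run from
aws emr create-cluster\
--name "Spark"\
--release-label emr-4.7.0\
--applications Name=Spark\
--bootstrap-action Path=s3://emr-code-matgreen/bootstraps/install_python_modules.sh\
--ec2-attributes KeyName=EMR2,InstanceProfile=EMR_EC2_DefaultRole\
--configurations file://./sparkConfig.json\
--log-uri s3://emr-logs-zerex\
--instance-type r3.xlarge\
--instance-count 4\
--profile EMR\
--service-role EMR_DefaultRole\
--region us-west-2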

别忘想泡老子 · 2020-05-23 18:27

If you are logged into an EMR node and want to alter Spark's default settings further without dealing with the AWS CLI tools, you can add a line to the spark-defaults.conf file. On EMR, Spark's configuration lives under /etc, so you can edit /etc/spark/conf/spark-defaults.conf directly.

So in this case we'd append spark.yarn.executor.memoryOverhead to the end of the spark-defaults file. The end of the file then looks something like this:

spark.driver.memory              1024M
spark.executor.memory            4305M
spark.default.parallelism        8
spark.logConf                    true
spark.executorEnv.PYTHONPATH     /usr/lib/spark/python
spark.driver.maxResultSize       0
spark.worker.timeout             600
spark.storage.blockManagerSlaveTimeoutMs 600000
spark.executorEnv.PYTHONHASHSEED 0
spark.akka.timeout               600
spark.sql.shuffle.partitions     300
spark.yarn.executor.memoryOverhead 1000M
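
If you'd rather not open an editor, a one-liner like the following can append the setting (a sketch, assuming you have sudo on the node; adjust the overhead value to your workload):

# requires sudo on the EMR node; 1000M matches the example value above
echo "spark.yarn.executor.memoryOverhead 1000M" | sudo tee -a /etc/spark/conf/spark-defaults.conf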

Similarly, the heap size can be controlled with the --executor-memory=xg flag or the spark.executor.memory property.
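
For example, a minimal spark-submit invocation that sets both values from the command line might look like this (the script path /home/hadoop/my_job.py is just a placeholder, and the sizes are illustrative):

# heap via --executor-memory, overhead via --conf; tune both for your cluster
spark-submit \
  --deploy-mode client \
  --executor-memory 4g \
  --conf spark.yarn.executor.memoryOverhead=1000 \
  /home/hadoop/my_job.py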

Hope this helps...
