
Running Apache Spark on Windows

Published 2019-09-19 05:27

Question:

I am trying to run Apache Spark on Windows. Can someone give me step-by-step instructions to do this? I have downloaded Spark, sbt, and Scala. I want to run this as a standalone program.

Answer 1:

If you are building with the sbt approach, you will also need Git.
Install Scala, sbt, and Git on your machine, then download the Spark source code and run the following command:

sbt assembly

If you use a prebuilt release, here is the step-by-step process:
How to run Apache Spark on Windows7 in standalone mode



Answer 2:


You can find the step-by-step guide here on the sigmoidanalytics site, but it differs between Spark versions.
If you are trying to use Eclipse to build a standalone application with the Maven Spark dependency, you have to install Cygwin and add cygwin/bin to your PATH, because Spark uses the Linux command "ls" to check file permissions.



Answer 3:

It depends on what you're trying to run. If you're trying to run the Spark shell, please follow the instructions at http://nishutayaltech.blogspot.co.uk/2015/04/how-to-run-apache-spark-on-windows7-in.html

If you're trying to run your own Spark job, create a simple application (in Java, Scala, or Python). I use Scala for development, so in Scala I include the following libraries (a fuller build.sbt sketch follows below):

"org.apache.spark" %% "spark-core" % "2.1.0",
"org.apache.spark" %% "spark-sql" % "2.1.0",
"org.apache.spark" %% "spark-streaming" % "2.1.0",

And then write a simple main method to test it:

 import org.apache.spark.{SparkConf, SparkContext}

 object MainProcessorJob extends App {
   private val applicationName = "FileProcessor"
   private val cores = "local[5]"   // run locally with 5 threads
   private val intervalSecs = 1

   start()

   def start(): Unit = {
     val sparkConf = new SparkConf(true)
     // SparkContext(master, appName, conf)
     val sparkContext = new SparkContext(cores, applicationName, sparkConf)
     // ... your job logic here ...
     sparkContext.stop()
   }
 }

You should be able to right-click and run this in IntelliJ or Eclipse.
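
If you just want to confirm that Spark runs locally on your Windows machine before wiring up real input, a tiny throwaway job is enough. This is only a sketch; the object name and master string are illustrative and not part of the original answer:

 import org.apache.spark.{SparkConf, SparkContext}

 // Minimal local smoke test (illustrative names, not from the answer above).
 object QuickLocalCheck extends App {
   val conf = new SparkConf(true).setMaster("local[2]").setAppName("QuickLocalCheck")
   val sc = new SparkContext(conf)
   // Distribute a small range and run a trivial action to verify the local "cluster" works.
   val evens = sc.parallelize(1 to 100).filter(_ % 2 == 0).count()
   println(s"Counted $evens even numbers")
   sc.stop()
 }

If it prints the count without errors, your Windows setup (Java, Scala, and the Spark dependencies) is working.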