error: object xml is not a member of package com.d

Posted 2019-08-23 22:10

I am trying to read an XML file in a Spark project built with sbt, but I get an error when I compile it.

build.sbt

name := "First Spark"
version := "1.0"
organization := "in.goai"
scalaVersion := "2.11.8"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.0.0"
libraryDependencies += "com.databricks" % "spark-avro_2.10" % "2.0.1"
libraryDependencies += "org.scala-lang.modules" %% "scala-xml" % "1.0.2"
resolvers += Resolver.mavenLocal

.scala file

package in.goai.spark

import scala.xml._
import com.databricks.spark.xml
import org.apache.spark.sql.SQLContext
import org.apache.spark.{SparkContext, SparkConf}

object SparkMeApp {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("First Spark")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    // Path to the XML file, taken from the first command-line argument
    val fileName = args(0)
    // Pass the variable, not the string literal "fileName"
    val df = sqlContext.read.format("com.databricks.spark.xml").option("rowTag", "book").load(fileName)
    val selectedData = df.select("title", "price")
    // show() prints the rows itself and returns Unit, so there is nothing to capture
    selectedData.show()
  }
}

When I compile it with "sbt package", it shows the error below:

[error] /home/hadoop/dev/first/src/main/scala/SparkMeApp.scala:4: object xml is not a member of package com.databricks.spark
[error] import com.databricks.spark.xml
[error]        ^
[error] one error found
[error] (compile:compileIncremental) Compilation failed
[error] Total time: 9 s, completed Sep 22, 2017 4:11:19 PM

Do I need to add any other jar files for XML? Please suggest a fix, and if possible point me to a link that lists the jar files needed for different file formats.

1 Answer
骚年 ilove
Reply #2 · 2019-08-23 22:40

Because you're using Scala 2.11 and Spark 2.0, in build.sbt, change your dependencies to the following:

libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.0.0"
libraryDependencies += "com.databricks" %% "spark-avro" % "3.2.0"
libraryDependencies += "com.databricks" %% "spark-xml" % "0.4.1"
libraryDependencies += "org.scala-lang.modules" %% "scala-xml" % "1.0.6"
  1. Change the spark-avro version to 3.2.0: https://github.com/databricks/spark-avro#requirements
  2. Add "com.databricks" %% "spark-xml" % "0.4.1": https://github.com/databricks/spark-xml#scala-211
  3. Change the scala-xml version to 1.0.6, the current version for Scala 2.11: http://mvnrepository.com/artifact/org.scala-lang.modules/scala-xml_2.11
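
The reason the dependencies above use %% rather than % is that %% appends the project's Scala binary version to the artifact name, so the resolved jar matches scalaVersion. The build.sbt in the question hard-coded "spark-avro_2.10", a Scala 2.10 artifact, in a Scala 2.11.8 project. The sketch below only imitates that naming rule in plain Scala: binaryVersion and crossName are hypothetical helpers written for illustration, not sbt APIs, and the two-component rule is a simplification that assumes a stable Scala 2.10+ release.

```scala
// Simplified illustration of the artifact name that sbt's %% asks for.
// binaryVersion and crossName are hypothetical helpers, not part of sbt.
object CrossVersionSketch {
  // "2.11.8" -> "2.11" (assumes a stable Scala 2.10+ release)
  def binaryVersion(scalaVersion: String): String =
    scalaVersion.split('.').take(2).mkString(".")

  // Artifact name requested by `"org" %% name % revision`
  def crossName(name: String, scalaVersion: String): String =
    s"${name}_${binaryVersion(scalaVersion)}"

  def main(args: Array[String]): Unit = {
    println(crossName("spark-xml", "2.11.8"))  // prints spark-xml_2.11
    println(crossName("spark-avro", "2.11.8")) // prints spark-avro_2.11
  }
}
```

In other words, with scalaVersion set to 2.11.8, "com.databricks" %% "spark-xml" % "0.4.1" requests the same artifact as "com.databricks" % "spark-xml_2.11" % "0.4.1".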

In your code, delete the following import statement:

import com.databricks.spark.xml

Note that your code doesn't actually use the spark-avro or scala-xml libraries. Remove those dependencies from your build.sbt (and the import scala.xml._ statement from your code) if you're not going to use them.
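
If you do drop the unused libraries, the build file could look roughly like the sketch below. It reuses the project settings from the question and the versions recommended in this answer. Nothing else is added, and it has not been checked against any Spark release other than 2.0.0.

```scala
// build.sbt (sketch): only the dependencies the posted code actually needs
name := "First Spark"
version := "1.0"
organization := "in.goai"
scalaVersion := "2.11.8"

// Spark itself
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.0.0"

// Provides the "com.databricks.spark.xml" data source named in read.format(...)
libraryDependencies += "com.databricks" %% "spark-xml" % "0.4.1"

resolvers += Resolver.mavenLocal
```

With this build file the code needs no import for spark-xml, because the data source is referenced only by the string passed to format.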
