I am trying to programmatically load a DynamoDB table into HDFS (via Java, not Hive). I couldn't find any examples online showing how to do it, so I thought I'd download the jar containing org.apache.hadoop.hive.dynamodb and reverse engineer the process.
Unfortunately, I couldn't find that jar either :(.
Could someone help with the following (listed in order of priority)?
- A Java example that loads a DynamoDB table into HDFS (so that it can be passed to a mapper as a table input format).
- The jar containing org.apache.hadoop.hive.dynamodb.
Thanks!
1- I am not aware of any such example, but you might find this library useful. It provides InputFormats, OutputFormats, and Writable classes for reading and writing data to Amazon DynamoDB tables.
2- I don't think they have made it publicly available.
It's in hive-bigbird-handler.jar. Unfortunately, AWS doesn't provide any source, or even Javadoc, for it. But you can find the jar on any node of an EMR cluster:
You might want to check out this article:
Tip: search for hive-bigbird-handler.jar to get to the interesting parts... ;-)
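If you do get onto a node, a small Java sketch like the one below (standard library only) can locate the jar and list the classes it contains under org.apache.hadoop.hive.dynamodb, which is a reasonable starting point for reverse engineering. The class name JarInspector is just a placeholder of mine, and I don't know the actual install path on EMR, so pass the directory to search as the first argument (it defaults to the current directory).

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

// Hypothetical helper: finds a jar by file name and lists the classes in one package.
public class JarInspector {

    // Recursively search 'root' for files whose name equals 'jarName'.
    // Unreadable files and directories are skipped instead of aborting the walk.
    static List<Path> findJars(Path root, String jarName) throws IOException {
        List<Path> hits = new ArrayList<>();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                if (file.getFileName().toString().equals(jarName)) {
                    hits.add(file);
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) {
                return FileVisitResult.CONTINUE; // e.g. permission denied
            }
        });
        return hits;
    }

    // Return the sorted, fully qualified names of classes in 'jar'
    // that live in 'packageName' or one of its subpackages.
    static List<String> listClasses(Path jar, String packageName) throws IOException {
        String prefix = packageName.replace('.', '/') + "/";
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            return jarFile.stream()
                    .map(entry -> entry.getName())
                    .filter(name -> name.startsWith(prefix) && name.endsWith(".class"))
                    .map(name -> name.substring(0, name.length() - ".class".length())
                                     .replace('/', '.'))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Search root is an assumption: supply the real directory on your node.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path jar : findJars(root, "hive-bigbird-handler.jar")) {
            System.out.println("Found: " + jar);
            for (String cls : listClasses(jar, "org.apache.hadoop.hive.dynamodb")) {
                System.out.println("  " + cls);
            }
        }
    }
}
```

Compile it with javac JarInspector.java and run it as java JarInspector <directory>. Once you have the class names, javap -cp <path to jar> <class name> prints the public signatures, which should be enough to see how the input format is configured.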