mirror of https://github.com/apache/sqoop.git synced 2025-05-04 10:30:02 +08:00

Add documentation for --append and --target-dir.

From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149914 13f79535-47bb-0310-9956-ffa450edef68
Andrew Bayer 2011-07-22 20:03:57 +00:00
parent 568a827a1c
commit 1f9ca86a2f
3 changed files with 40 additions and 0 deletions


@@ -2,6 +2,9 @@
 Import control options
 ~~~~~~~~~~~~~~~~~~~~~~
+--append::
+Append data to an existing HDFS dataset
 --as-sequencefile::
 Imports data to SequenceFiles
@@ -30,6 +33,9 @@ Import control options
 --table (table-name)::
 The table to read (required)
+--target-dir (dir)::
+Explicit HDFS target directory for the import.
 --warehouse-dir (dir)::
 Tables are uploaded to the HDFS path +/warehouse/dir/(tablename)/+


@@ -27,6 +27,9 @@ include::common-args.txt[]
 Import control options
 ~~~~~~~~~~~~~~~~~~~~~~
+--append::
+Append data to an existing HDFS dataset
 --as-sequencefile::
 Imports data to SequenceFiles
@@ -55,6 +58,9 @@ Import control options
 --table (table-name)::
 The table to read (required)
+--target-dir (dir)::
+Explicit HDFS target directory for the import.
 --warehouse-dir (dir)::
 Tables are uploaded to the HDFS path +/warehouse/dir/(tablename)/+


@@ -52,6 +52,8 @@ include::connecting.txt[]
 `-----------------------------`--------------------------------------
 Argument                      Description
 ---------------------------------------------------------------------
++\--append+                   Append data to an existing dataset\
+                              in HDFS
 +\--as-sequencefile+          Imports data to SequenceFiles
 +\--as-textfile+              Imports data as plain text (default)
 +\--columns <col,col,col...>+ Columns to import from table
@@ -63,6 +65,7 @@ Argument Description
 +\--split-by <column-name>+   Column of the table used to split work\
                               units
 +\--table <table-name>+       Table to read
++\--target-dir <dir>+         HDFS destination dir
 +\--warehouse-dir <dir>+      HDFS parent for table destination
 +\--where <where clause>+     WHERE clause to use during import
 +-z,\--compress+              Enable compression
@@ -170,6 +173,16 @@ $ sqoop import --connect <connect-str> --table foo --warehouse-dir /shared \
 This command would write to a set of files in the +/shared/foo/+ directory.
+
+You can also explicitly choose the target directory, like so:
+
+----
+$ sqoop import --connect <connect-str> --table foo --target-dir /dest \
+    ...
+----
+
+This will import the files into the +/dest+ directory. +\--target-dir+ is
+incompatible with +\--warehouse-dir+.
+
 When using direct mode, you can specify additional arguments which
 should be passed to the underlying tool. If the argument
 +\--+ is given on the command-line, then subsequent arguments are sent
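The destination rules this hunk documents — +\--target-dir+ names the import directory outright, +\--warehouse-dir+ places files under +<warehouse>/<table>/+, and the two are mutually exclusive — can be sketched as a small helper. This is a hypothetical illustration with invented names, not Sqoop's actual code:

```python
def resolve_target(table, warehouse_dir=None, target_dir=None):
    """Pick the HDFS destination for an import (sketch of the documented rules).

    --target-dir and --warehouse-dir are mutually exclusive; --warehouse-dir
    places the table's files under <warehouse_dir>/<table>/.
    """
    if warehouse_dir is not None and target_dir is not None:
        raise ValueError("--target-dir is incompatible with --warehouse-dir")
    if target_dir is not None:
        return target_dir                                  # e.g. /dest
    if warehouse_dir is not None:
        return warehouse_dir.rstrip("/") + "/" + table     # e.g. /shared/foo
    return table  # default: a directory named after the table
```

For example, `resolve_target("foo", warehouse_dir="/shared")` yields `/shared/foo`, matching the +/shared/foo/+ example above.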
@@ -181,6 +194,13 @@ $ sqoop import --connect jdbc:mysql://server.foo.com/db --table bar \
     --direct -- --default-character-set=latin1
 ----
+
+By default, imports go to a new target location. If the destination directory
+already exists in HDFS, Sqoop will refuse to import and overwrite that
+directory's contents. If you use the +\--append+ argument, Sqoop will import
+data to a temporary directory and then rename the files into the normal
+target directory in a manner that does not conflict with existing filenames
+in that directory.
+
 File Formats
 ^^^^^^^^^^^^
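The +\--append+ behavior added in this hunk — import into a temporary directory, then rename files into the target so they never clobber existing ones — can be sketched roughly as follows. This is a hypothetical illustration using local paths; Sqoop's real implementation operates on HDFS:

```python
import os
import shutil

def append_import(temp_dir, target_dir):
    """Move part files from temp_dir into target_dir, renaming each file
    as needed so it does not conflict with names already present (sketch)."""
    os.makedirs(target_dir, exist_ok=True)
    existing = set(os.listdir(target_dir))
    for name in sorted(os.listdir(temp_dir)):
        new_name, n = name, 0
        while new_name in existing:        # pick a non-conflicting name
            n += 1
            new_name = "%s.%d" % (name, n)
        shutil.move(os.path.join(temp_dir, name),
                    os.path.join(target_dir, new_name))
        existing.add(new_name)
    os.rmdir(temp_dir)                     # temp dir is empty afterwards
```

The key property is the one the documentation promises: files already in the target directory are left untouched, and incoming files are renamed around them.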
@@ -494,4 +514,12 @@ $ hadoop fs -cat EMPLOYEES/part-m-00000 | head -n 10
 ...
 ----
+
+Performing an incremental import of new data, after having already
+imported the first 100,000 rows of a table:
+
+----
+$ sqoop import --connect jdbc:mysql://db.foo.com/somedb --table sometable \
+    --where "id > 100000" --target-dir /incremental_dataset --append
+----
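The incremental pattern in this final example — remember the highest key already imported, then pull only rows beyond it with +\--where+ and +\--append+ — can be sketched as a small helper that assembles the next run's arguments. The function names are invented for illustration; this is not part of Sqoop itself:

```python
def next_where_clause(key_column, last_imported):
    """Build the WHERE clause selecting only rows newer than the last run."""
    return "%s > %d" % (key_column, last_imported)

def sqoop_import_args(connect, table, key_column, last_imported, target_dir):
    """Assemble the argument list for an incremental, appending import."""
    return ["sqoop", "import",
            "--connect", connect,
            "--table", table,
            "--where", next_where_clause(key_column, last_imported),
            "--target-dir", target_dir,
            "--append"]
```

With `last_imported=100000`, this reproduces the example command above: subsequent runs would pass the new high-water mark to keep extending the same +/incremental_dataset+ directory.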