Mirror of https://github.com/apache/sqoop.git (synced 2025-05-02 17:22:25 +08:00)
SQOOP-3293: Document SQOOP-2976
(Fero Szabo by Szabolcs Vasas)
parent a7f5e0d298
commit d57f9fb06b
@@ -411,3 +411,13 @@ To switch back to the previous version of Hadoop 0.20, for example, run:
 ++++
 ant test -Dhadoopversion=20
 ++++
+
+== Building the documentation
+
+Building the documentation requires that you have toxml installed.
+Also, one needs to set the XML_CATALOG_FILES environment variable.
+
+++++
+export XML_CATALOG_FILES=/usr/local/etc/xml/catalog
+ant docs
+++++
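Note on the hunk above: the catalog path in the added example is one specific location, and it may differ between machines. A minimal sketch of picking the first readable catalog before running `ant docs` follows. The helper name `find_xml_catalog` and the second candidate path `/etc/xml/catalog` are assumptions for illustration and are not part of the commit.

```shell
# Hypothetical helper: print the first readable file among the arguments.
find_xml_catalog() {
  for candidate in "$@"; do
    if [ -r "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# The first path is the one from the commit; the second is an assumed alternative.
if catalog=$(find_xml_catalog /usr/local/etc/xml/catalog /etc/xml/catalog); then
  export XML_CATALOG_FILES="$catalog"
  echo "XML_CATALOG_FILES=$XML_CATALOG_FILES"
  # Next step, as in the commit: ant docs
else
  echo "no XML catalog found; set XML_CATALOG_FILES by hand" >&2
fi
```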
@@ -257,7 +257,7 @@ username is +someuser+, then the import tool will write to
 the import with the +\--warehouse-dir+ argument. For example:
 
 ----
-$ sqoop import --connnect <connect-str> --table foo --warehouse-dir /shared \
+$ sqoop import --connect <connect-str> --table foo --warehouse-dir /shared \
 ...
 ----
 
@@ -266,7 +266,7 @@ This command would write to a set of files in the +/shared/foo/+ directory.
 You can also explicitly choose the target directory, like so:
 
 ----
-$ sqoop import --connnect <connect-str> --table foo --target-dir /dest \
+$ sqoop import --connect <connect-str> --table foo --target-dir /dest \
 ...
 ----
 
@@ -444,6 +444,27 @@ argument, or specify any Hadoop compression codec using the
 +\--compression-codec+ argument. This applies to SequenceFile, text,
 and Avro files.
 
+Enabling Logical Types in Avro and Parquet import for numbers
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+To enable the use of logical types in Sqoop's avro schema generation,
+i.e. used during both avro and parquet imports, one has to use the
+sqoop.avro.logical_types.decimal.enable flag. This is necessary if one
+wants to store values as decimals in the avro file format.
+
+Padding number types in avro import
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Certain databases, such as Oracle and Postgres store number and decimal
+values without padding. For example 1.5 in a column declared
+as NUMBER (20,5) is stored as is in Oracle, while the equivalent
+DECIMAL (20, 5) is stored as 1.50000 in an SQL server instance.
+This leads to a scale mismatch during avro import.
+
+To avoid this error, one can use the sqoop.avro.decimal_padding.enable flag
+to turn on padding with 0s. This flag has to be used together with the
+sqoop.avro.logical_types.decimal.enable flag set to true.
+
 Large Objects
 ^^^^^^^^^^^^^
 
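Note on the hunk above: the padding it describes amounts to extending the fractional part with zeros until it reaches the declared scale, so that 1.5 with scale 5 becomes 1.50000. The sketch below only illustrates that idea on plain strings. It is not Sqoop's implementation, and the function name `pad_scale` is hypothetical.

```shell
# Hypothetical illustration: pad VALUE with trailing zeros up to SCALE
# fractional digits. Values that already have SCALE or more fractional
# digits are printed unchanged (no rounding or truncation).
pad_scale() {
  value=$1
  scale=$2
  case $value in
    *.*)
      int=${value%%.*}    # digits before the decimal point
      frac=${value#*.}    # digits after the decimal point
      ;;
    *)
      int=$value
      frac=
      ;;
  esac
  while [ "${#frac}" -lt "$scale" ]; do
    frac="${frac}0"
  done
  if [ -z "$frac" ]; then
    printf '%s\n' "$int"
  else
    printf '%s.%s\n' "$int" "$frac"
  fi
}

pad_scale 1.5 5   # prints 1.50000
```

Working on strings rather than floating-point numbers keeps every digit exact, which matters for columns with a precision as large as 20.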
@@ -777,3 +798,12 @@ rows copied into HDFS:
 $ sqoop import --connect jdbc:mysql://db.foo.com/corp \
 --table EMPLOYEES --validate
 ----
+
+Enabling logical types in avro import and also turning on padding with 0s:
+
+----
+$ sqoop import -Dsqoop.avro.decimal_padding.enable=true -Dsqoop.avro.logical_types.decimal.enable=true
+--connect $CON --username $USER --password $PASS --query "select * from table_name where \$CONDITIONS"
+--target-dir hdfs://nameservice1//etl/target_path --as-avrodatafile --verbose -m 1
+
+----
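Note on the hunk above: the three lines of the added command carry no trailing backslashes, so pasted into a shell as they stand they would run as three separate commands. A minimal sketch of the same invocation with line continuations follows. The wrapper function `avro_import` and its dry-run prefix are assumptions added for illustration, while the flags, the query and the target directory are taken from the commit.

```shell
# Hypothetical wrapper: any arguments are used as a command prefix,
# so `avro_import echo` prints the command instead of running it.
# CON, USER and PASS are expected to be set by the caller, as in the commit.
avro_import() {
  "$@" sqoop import \
    -Dsqoop.avro.decimal_padding.enable=true \
    -Dsqoop.avro.logical_types.decimal.enable=true \
    --connect "$CON" --username "$USER" --password "$PASS" \
    --query 'select * from table_name where $CONDITIONS' \
    --target-dir hdfs://nameservice1//etl/target_path \
    --as-avrodatafile --verbose -m 1
}

# Dry run: show what would be executed.
avro_import echo
```

The single quotes around the query keep `$CONDITIONS` literal, which has the same effect as the `\$CONDITIONS` escape inside double quotes in the commit. As in the commit, both `-D` properties come directly after `import`, ahead of the tool-specific arguments.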