mirror of https://github.com/apache/sqoop.git synced 2025-05-02 18:11:13 +08:00

SQOOP-3301: Document SQOOP-3216 - metastore related change

(Fero Szabo via Szabolcs Vasas)
Szabolcs Vasas 2018-04-10 10:51:16 +02:00
parent c146b3f94e
commit af7a594d98
3 changed files with 91 additions and 8 deletions


@ -58,6 +58,11 @@ can also specify the metastore connect string here:
--meta-connect (jdbc-uri)::
Specifies the JDBC connect string used to connect to the metastore
--meta-username::
Specifies the username for the JDBC connection to the metastore database. Default value is 'SA'.
--meta-password::
Specifies the password for the JDBC connection to the metastore database. Default value is an empty string.
Common options
~~~~~~~~~~~~~~
@ -68,6 +73,29 @@ Common options
--verbose::
Print more information while working
EXAMPLES
--------
Listing available jobs in the metastore:
sqoop job --list --meta-connect jdbc:oracle:thin:@//myhost:1521/ORCLCDB \
  --meta-username ms_user --meta-password ms_password
Creating a new job in the metastore:
sqoop job --create myjob1 --meta-connect jdbc:oracle:thin:@//myhost:1521/ORCLCDB \
  --meta-username ms_user --meta-password ms_password -- import \
  --connect jdbc:mysql://mysqlhost:3306/sqoop --username sqoop --password sqoop --table "TestTable" -m 1
Executing an existing job:
sqoop job --exec myjob1 --meta-connect jdbc:oracle:thin:@//myhost:1521/ORCLCDB \
  --meta-username ms_user --meta-password ms_password
Showing the definition of an existing job:
sqoop job --show myjob2 --meta-connect jdbc:oracle:thin:@//myhost:1521/ORCLCDB \
  --meta-username ms_user --meta-password ms_password
Deleting an existing job:
sqoop job --delete myjob1 --meta-connect jdbc:oracle:thin:@//myhost:1521/ORCLCDB \
  --meta-username ms_user --meta-password ms_password
ENVIRONMENT
-----------


@ -18,17 +18,24 @@
////
The +metastore+ tool configures Sqoop to host a shared Hsqldb metadata repository.
This tool basically just starts Hsqldb, while the +job tool+ creates the necessary
tables that will contain the job metadata if they don't exist.
Multiple users and/or remote users can define and execute saved jobs (created
with +sqoop job+) defined in this metastore.
Clients must be configured to connect to the metastore in +sqoop-site.xml+ or
with the +--meta-connect+ argument. These commands MySql, Hsqldb, PostgreSql, Oracle, DB2,
and SqlServer databases as well. All services other than Hsqldb and Postgres require the
download of the corresponding JDBC driver and connect string structured in the correct format.
with the +--meta-connect+ argument. Sqoop supports MySql, Hsqldb, PostgreSql, Oracle, DB2,
and SqlServer as the +metastore+ server implementations, but please note that
the +metastore+ tool can only manage the startup and shutdown
of Hsqldb so far.
All services other than Hsqldb and Postgres require the download of the
corresponding JDBC driver and connect string structured in the correct format.
Migration of metastore data from one database service to another is not directly supported, but is possible.
.JDBC Connect String Formats:
.JDBC Connection String Formats:
[grid="all"]
`---------------------------`------------------------------------------
Service Connect String Format


@ -151,10 +151,57 @@ filesystem other than your home directory.
If you configure +sqoop.metastore.client.enable.autoconnect+ with the
value +false+, then you must explicitly supply +\--meta-connect+.
If the +--meta-connect+ option is present, then Sqoop will try to connect to the
+metastore+ database specified in this parameter value. It will use the username
and password specified in the +--meta-username+ and +--meta-password+ parameters.
If they are not present, Sqoop will use an empty username and password. If the database
in the connection string is not supported, then Sqoop will throw an exception.
If the +--meta-connect+ parameter is not present and the +sqoop.metastore.client.enable.autoconnect+
configuration parameter is false (default value is true), then Sqoop will throw an error since
there are no applicable +metastore+ implementations.
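The selection rules in the preceding paragraph can be summarised as a small decision function. This is only an illustrative sketch of the documented behaviour, not Sqoop's actual implementation; the function name +choose_metastore+ and its output strings are hypothetical.

```shell
#!/usr/bin/env bash
# Illustration only: mirrors the rules described in the text,
# not the logic in Sqoop's source code.
# $1 = value passed to --meta-connect (may be empty)
# $2 = sqoop.metastore.client.enable.autoconnect (defaults to "true")
choose_metastore() {
  local meta_connect="$1"
  local autoconnect="${2:-true}"
  if [ -n "$meta_connect" ]; then
    # An explicit connect string always takes effect.
    echo "explicit: $meta_connect"
  elif [ "$autoconnect" = "true" ]; then
    # No connect string, but autoconnect is enabled.
    echo "autoconnect"
  else
    # Neither source is available, so there is nothing to connect to.
    echo "error: no applicable metastore" >&2
    return 1
  fi
}
```

Calling +choose_metastore "" false+ returns a non-zero status, which corresponds to the error case described above.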
Job data can be stored in MySql, PostgreSql, DB2, SqlServer, and Oracle with
the +\--meta-connect+ argument. The +\--meta-username+ and +\--meta-password+ arguments are necessary
if the database containing the saved jobs requires a username and password.
If you use any of these implementations, you must ensure that the
database is online and accessible when Sqoop tries to access it.
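One way to check that the metastore database is reachable before invoking Sqoop is to probe its TCP port. The sketch below is an assumption-laden helper, not part of Sqoop: the function names are hypothetical, it only understands connect strings of the form +...//host:port/...+ with an explicit port (for example the Oracle, MySql and Hsqldb strings used in this document), and it relies on bash's +/dev/tcp+ feature and the coreutils +timeout+ command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "host port" for a JDBC URL that contains
# "//host:port/...". Other layouts (e.g. "host:port;database=...") and
# URLs without an explicit port are NOT handled.
jdbc_host_port() {
  local url="$1"
  local rest="${url#*//}"        # drop everything up to and including the first "//"
  local hostport="${rest%%/*}"   # keep the part before the first "/"
  local host="${hostport%%:*}"   # text before the first ":"
  local port="${hostport##*:}"   # text after the last ":"
  echo "$host $port"
}

# Returns 0 if a TCP connection to the metastore host/port can be opened
# within 5 seconds. This checks reachability only, not credentials.
metastore_reachable() {
  local host port
  read -r host port <<< "$(jdbc_host_port "$1")"
  timeout 5 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# Example usage (not executed here):
#   url='jdbc:oracle:thin:@//myhost:1521/ORCLCDB'
#   metastore_reachable "$url" && sqoop job --list --meta-connect "$url" \
#     --meta-username ms_user --meta-password ms_password
```

A successful probe only shows that the port accepts connections; authentication or driver problems would still surface when Sqoop itself connects.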
Examples
~~~~~~~~
Listing available jobs in the metastore:
----
sqoop job --list --meta-connect jdbc:oracle:thin:@//myhost:1521/ORCLCDB \
  --meta-username ms_user --meta-password ms_password
----
Creating a new job in the metastore:
----
sqoop job --create myjob1 --meta-connect jdbc:oracle:thin:@//myhost:1521/ORCLCDB \
  --meta-username ms_user --meta-password ms_password -- import \
  --connect jdbc:mysql://mysqlhost:3306/sqoop --username sqoop --password sqoop --table "TestTable" -m 1
----
Executing an existing job:
----
sqoop job --exec myjob1 --meta-connect jdbc:oracle:thin:@//myhost:1521/ORCLCDB \
  --meta-username ms_user --meta-password ms_password
----
Showing the definition of an existing job:
----
sqoop job --show myjob2 --meta-connect jdbc:oracle:thin:@//myhost:1521/ORCLCDB \
  --meta-username ms_user --meta-password ms_password
----
Deleting an existing job:
----
sqoop job --delete myjob1 --meta-connect jdbc:oracle:thin:@//myhost:1521/ORCLCDB \
  --meta-username ms_user --meta-password ms_password
----
Using a Hsqldb:
----
$ sqoop job --exec myjob --meta-connect jdbc:hsqldb:hsql://localhost:3000/ --meta-username *username* --meta-password *password*
@ -240,13 +287,14 @@ The metastore is available over TCP/IP. The port is controlled by the
+sqoop.metastore.server.port+ configuration parameter, and defaults to 16000.
Clients should connect to the metastore by specifying
+sqoop.metastore.client.autoconnect.url+ or +\--meta-connect+ with the
+sqoop.metastore.client.autoconnect.url+ or +\--meta-connect+ with a
JDBC-URI string. For example,
+jdbc:hsqldb:hsql://metaserver.example.com:16000/sqoop+.
This metastore may be hosted on a machine within the Hadoop cluster, or
elsewhere on the network.
Alternatively, one can start an RDBMS to host the metastore and pass the
connection parameters to Sqoop. This metastore may be hosted on a machine
within the Hadoop cluster, or elsewhere in the network. Sqoop supports the
following database implementations: MySql, Oracle, Postgresql, MSSql and DB2.
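For the Hsqldb case, the connect string shown earlier in this section can be assembled from a host name and a port. The helper below is a minimal sketch: the function name is hypothetical, the default port of 16000 is the documented default of +sqoop.metastore.server.port+, and the +/sqoop+ suffix simply follows the example URL in the text.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build an Hsqldb metastore connect string.
# $1 = metastore host
# $2 = port (defaults to 16000, the documented server port default)
hsqldb_metastore_url() {
  local host="$1"
  local port="${2:-16000}"
  echo "jdbc:hsqldb:hsql://${host}:${port}/sqoop"
}

# Example usage (not executed here):
#   sqoop job --list --meta-connect "$(hsqldb_metastore_url metaserver.example.com)"
```

Connect strings for the other supported databases follow their own vendor-specific formats and are not covered by this sketch.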
+sqoop-merge+
-------------