mirror of https://github.com/apache/sqoop.git synced 2025-05-03 05:50:31 +08:00

SQOOP-3301: Document SQOOP-3216 - metastore related change

(Fero Szabo via Szabolcs Vasas)
Szabolcs Vasas 2018-04-10 10:51:16 +02:00
parent c146b3f94e
commit af7a594d98
3 changed files with 91 additions and 8 deletions


@@ -58,6 +58,11 @@ can also specify the metastore connect string here:
--meta-connect (jdbc-uri)::
Specifies the JDBC connect string used to connect to the metastore
--meta-username::
Specifies the JDBC connection username for the metastore database. The default value is 'SA'.
--meta-password::
Specifies the JDBC connection password for the metastore database. The default value is an empty string.
Common options
~~~~~~~~~~~~~~
@@ -68,6 +73,29 @@ Common options
--verbose::
Print more information while working
EXAMPLES
--------
Listing available jobs in the metastore:
  sqoop job --list --meta-connect jdbc:oracle:thin:@//myhost:1521/ORCLCDB \
    --meta-username ms_user --meta-password ms_password
Creating a new job in the metastore:
  sqoop job --create myjob1 --meta-connect jdbc:oracle:thin:@//myhost:1521/ORCLCDB \
    --meta-username ms_user --meta-password ms_password -- import \
    --connect jdbc:mysql://mysqlhost:3306/sqoop --username sqoop --password sqoop --table "TestTable" -m 1
Executing an existing job:
  sqoop job --exec myjob1 --meta-connect jdbc:oracle:thin:@//myhost:1521/ORCLCDB \
    --meta-username ms_user --meta-password ms_password
Showing the definition of an existing job:
  sqoop job --show myjob2 --meta-connect jdbc:oracle:thin:@//myhost:1521/ORCLCDB \
    --meta-username ms_user --meta-password ms_password
Deleting an existing job:
  sqoop job --delete myjob1 --meta-connect jdbc:oracle:thin:@//myhost:1521/ORCLCDB \
    --meta-username ms_user --meta-password ms_password
ENVIRONMENT
-----------


@@ -18,17 +18,24 @@
////
The +metastore+ tool configures Sqoop to host a shared Hsqldb metadata repository.
This tool basically just starts Hsqldb, while the +job tool+ creates the necessary
tables that will contain the job metadata if they don't exist.
Multiple users and/or remote users can define and execute saved jobs (created
with +sqoop job+) defined in this metastore.
Clients must be configured to connect to the metastore in +sqoop-site.xml+ or
- with the +--meta-connect+ argument. These commands MySql, Hsqldb, PostgreSql, Oracle, DB2,
- and SqlServer databases as well. All services other than Hsqldb and Postgres require the
- download of the corresponding JDBC driver and connect string structured in the correct format.
+ with the +--meta-connect+ argument. Sqoop supports MySql, Hsqldb, PostgreSql, Oracle, DB2,
+ and SqlServer as the +metastore+ server implementations, but please note that
+ the +metastore+ tool can only manage the startup and shutdown
+ of Hsqldb so far.
+ All services other than Hsqldb and Postgres require the download of the
+ corresponding JDBC driver and connect string structured in the correct format.
Migration of metastore data from one database service to another is not directly supported, but is possible.
- .JDBC Connect String Formats:
+ .JDBC Connection String Formats:
[grid="all"]
`---------------------------`------------------------------------------
Service Connect String Format


@@ -151,10 +151,57 @@ filesystem other than your home directory.
If you configure +sqoop.metastore.client.enable.autoconnect+ with the
value +false+, then you must explicitly supply +\--meta-connect+.
If the +--meta-connect+ option is present, then Sqoop will try to connect to the
+metastore+ database specified in this parameter value. It will use the username
and password specified in the +--meta-username+ and +--meta-password+ parameters.
If they are not present, Sqoop will use an empty username and password. If the database
in the connection string is not supported, then Sqoop will throw an exception.
If the +--meta-connect+ parameter is not present and the +sqoop.metastore.client.enable.autoconnect+
configuration parameter is false (default value is true), then Sqoop will throw an error since
there are no applicable +metastore+ implementations.
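
The decision described above can be sketched as a small shell function. This is only an illustration of the documented behaviour, not Sqoop's actual implementation: the function name +resolve_metastore+ and its output strings are hypothetical, and the second argument stands in for the value of +sqoop.metastore.client.enable.autoconnect+.

```shell
# Hypothetical helper (not part of Sqoop): mirrors the documented decision
# about which metastore a client ends up using.
#   $1 = value passed to --meta-connect (may be empty)
#   $2 = sqoop.metastore.client.enable.autoconnect (defaults to true)
resolve_metastore() {
  meta_connect="$1"
  autoconnect="${2:-true}"

  if [ -n "$meta_connect" ]; then
    # An explicit connect string is used whenever it is supplied.
    echo "explicit:$meta_connect"
  elif [ "$autoconnect" = "false" ]; then
    # No connect string and autoconnect disabled: nothing to connect to.
    echo "error: no applicable metastore implementation" >&2
    return 1
  else
    # Fall back to the autoconnect behaviour.
    echo "autoconnect"
  fi
}
```

Calling +resolve_metastore "" false+ returns a non-zero status, which corresponds to the error case in the paragraph above.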
Job data can be stored in MySql, PostgreSql, DB2, SqlServer, and Oracle with
the +\--meta-connect+ argument. The +\--meta-username+ and +\--meta-password+ arguments are necessary
if the database containing the saved jobs requires a username and password.
If you use any of these implementations, you have to ensure that the
database is online and accessible when Sqoop tries to access it.
Examples
~~~~~~~~
Listing available jobs in the metastore:
----
sqoop job --list --meta-connect jdbc:oracle:thin:@//myhost:1521/ORCLCDB \
    --meta-username ms_user --meta-password ms_password
----
Creating a new job in the metastore:
----
sqoop job --create myjob1 --meta-connect jdbc:oracle:thin:@//myhost:1521/ORCLCDB \
    --meta-username ms_user --meta-password ms_password -- import \
    --connect jdbc:mysql://mysqlhost:3306/sqoop --username sqoop --password sqoop --table "TestTable" -m 1
----
Executing an existing job:
----
sqoop job --exec myjob1 --meta-connect jdbc:oracle:thin:@//myhost:1521/ORCLCDB \
    --meta-username ms_user --meta-password ms_password
----
Showing the definition of an existing job:
----
sqoop job --show myjob2 --meta-connect jdbc:oracle:thin:@//myhost:1521/ORCLCDB \
    --meta-username ms_user --meta-password ms_password
----
Deleting an existing job:
----
sqoop job --delete myjob1 --meta-connect jdbc:oracle:thin:@//myhost:1521/ORCLCDB \
    --meta-username ms_user --meta-password ms_password
----
Using a Hsqldb:
----
$ sqoop job --exec myjob --meta-connect jdbc:hsqldb:hsql://localhost:3000/ --meta-username *username* --meta-password *password*
@@ -240,13 +287,14 @@ The metastore is available over TCP/IP. The port is controlled by the
+sqoop.metastore.server.port+ configuration parameter, and defaults to 16000.
Clients should connect to the metastore by specifying
- +sqoop.metastore.client.autoconnect.url+ or +\--meta-connect+ with the
+ +sqoop.metastore.client.autoconnect.url+ or +\--meta-connect+ with a
JDBC-URI string. For example,
+jdbc:hsqldb:hsql://metaserver.example.com:16000/sqoop+.
- This metastore may be hosted on a machine within the Hadoop cluster, or
- elsewhere on the network.
+ Alternatively, one can start an RDBMS to host the metastore and pass the
+ connection parameters to Sqoop. This metastore may be hosted on a machine
+ within the Hadoop cluster, or elsewhere in the network. Sqoop supports the
+ following database implementations: MySql, Oracle, Postgresql, MSSql and DB2.
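
As a minimal sketch of how a client script might assemble the Hsqldb JDBC URI in the form shown above: the function name +metastore_uri+ is hypothetical and not part of Sqoop, and the trailing +/sqoop+ path is simply copied from the example URI.

```shell
# Hypothetical helper (not part of Sqoop): builds an Hsqldb metastore URI
# of the form jdbc:hsqldb:hsql://<host>:<port>/sqoop.
#   $1 = metastore host
#   $2 = port; defaults to 16000, the documented default of
#        sqoop.metastore.server.port
metastore_uri() {
  host="$1"
  port="${2:-16000}"
  printf 'jdbc:hsqldb:hsql://%s:%s/sqoop\n' "$host" "$port"
}
```

The result could then be passed to the client, for example +sqoop job --list --meta-connect "$(metastore_uri metaserver.example.com)"+.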
+sqoop-merge+
-------------