diff --git a/build.xml b/build.xml index b843e0d2..efdee4d0 100644 --- a/build.xml +++ b/build.xml @@ -317,7 +317,7 @@ failonerror="true" output="${build.dir}/tools-list" error="/dev/null"> - + diff --git a/src/docs/man/sqoop.txt b/src/docs/man/sqoop.txt index 7200e546..7cab47c9 100644 --- a/src/docs/man/sqoop.txt +++ b/src/docs/man/sqoop.txt @@ -55,8 +55,9 @@ HIVE_HOME:: SEE ALSO -------- -The _Sqoop User Guide_ is available in HTML-form in '/usr/share/sqoop/' (if you installed via RPM or -DEB), or in the 'docs/' directory if you installed from a tarball. +The _Sqoop User Guide_ is available in HTML-form in '/usr/share/doc/sqoop/' +(if you installed via RPM or DEB), or in the 'docs/' directory if you +installed from a tarball. //// diff --git a/src/docs/user/Sqoop-manpage.txt b/src/docs/user/Sqoop-manpage.txt deleted file mode 100644 index 92eaa210..00000000 --- a/src/docs/user/Sqoop-manpage.txt +++ /dev/null @@ -1,224 +0,0 @@ -sqoop(1) -======== - -//// - Licensed to Cloudera, Inc. under one or more - contributor license agreements. See the NOTICE file distributed with - this work for additional information regarding copyright ownership. - Cloudera, Inc. licenses this file to You under the Apache License, Version 2.0 - (the "License"); you may not use this file except in compliance with - the License. You may obtain a copy of the License at - - http://www.apache.org/licenses/LICENSE-2.0 - - Unless required by applicable law or agreed to in writing, software - distributed under the License is distributed on an "AS IS" BASIS, - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - See the License for the specific language governing permissions and - limitations under the License. -//// - -NAME ----- -sqoop - SQL-to-Hadoop import tool - -SYNOPSIS --------- -'sqoop' - -DESCRIPTION ------------ -Sqoop is a tool designed to help users of large data import existing -relational databases into their Hadoop clusters. Sqoop uses JDBC to -connect to a database, examine each table's schema, and auto-generate -the necessary classes to import data into HDFS. It then instantiates -a MapReduce job to read tables from the database via the DBInputFormat -(JDBC-based InputFormat). Tables are read into a set of files loaded -into HDFS. Both SequenceFile and text-based targets are supported. Sqoop -also supports high-performance imports from select databases including MySQL. - -OPTIONS -------- - -The +--connect+ option is always required. To perform an import, one of -+--table+ or +--all-tables+ is required as well. Alternatively, you can -specify +--generate-only+ or one of the arguments in "Additional commands." - - -Database connection options -~~~~~~~~~~~~~~~~~~~~~~~~~~~ - ---connect (jdbc-uri):: - Specify JDBC connect string (required) - ---driver (class-name):: - Manually specify JDBC driver class to use - ---username (username):: - Set authentication username - ---password (password):: - Set authentication password - (Note: This is very insecure. You should use -P instead.) 
- --P:: - Prompt for user password - ---direct:: - Use direct import fast path (mysql only) - -Import control options -~~~~~~~~~~~~~~~~~~~~~~ - ---all-tables:: - Import all tables in database - (Ignores +--table+, +--columns+, +--order-by+, and +--where+) - ---columns (col,col,col...):: - Columns to export from table - ---split-by (column-name):: - Column of the table used to split the table for parallel import - ---hadoop-home (dir):: - Override $HADOOP_HOME - ---hive-home (dir):: - Override $HIVE_HOME - ---warehouse-dir (dir):: - Tables are uploaded to the HDFS path +/warehouse/dir/(tablename)/+ - ---as-sequencefile:: - Imports data to SequenceFiles - ---as-textfile:: - Imports data as plain text (default) - ---hive-import:: - If set, then import the table into Hive - ---hive-create-only:: - Creates table in hive and skips the data import step - ---hive-overwrite:: - Overwrites existing table in hive. - By default it does not overwrite existing table. - ---table (table-name):: - The table to import - ---hive-table (table-name):: - When used with --hive-import, overrides the destination table name - ---where (clause):: - Import only the rows for which _clause_ is true. - e.g.: `--where "user_id > 400 AND hidden == 0"` - ---compress:: --z:: - Uses gzip to compress data as it is written to HDFS - ---direct-split-size (size):: - When using direct mode, write to multiple files of - approximately _size_ bytes each. - ---inline-lob-limit (size):: - When importing LOBs, keep objects inline up to - _size_ bytes. - -Export control options -~~~~~~~~~~~~~~~~~~~~~~ - ---export-dir (dir):: - Export from an HDFS path into a table (set with - --table) - -Output line formatting options -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -include::output-formatting.txt[] -include::output-formatting-args.txt[] - -Input line parsing options -~~~~~~~~~~~~~~~~~~~~~~~~~~ - -include::input-formatting.txt[] -include::input-formatting-args.txt[] - -Code generation options -~~~~~~~~~~~~~~~~~~~~~~~ - ---bindir (dir):: - Output directory for compiled objects - ---class-name (name):: - Sets the name of the class to generate. By default, classes are - named after the table they represent. Using this parameters - ignores +--package-name+. - ---generate-only:: - Stop after code generation; do not import - ---outdir (dir):: - Output directory for generated code - ---package-name (package):: - Puts auto-generated classes in the named Java package - -Library loading options -~~~~~~~~~~~~~~~~~~~~~~~ ---jar-file (file):: - Disable code generation; use specified jar - ---class-name (name):: - The class within the jar that represents the table to import/export - -Additional commands -~~~~~~~~~~~~~~~~~~~ - -These commands cause Sqoop to report information and exit; -no import or code generation is performed. - ---debug-sql (statement):: - Execute 'statement' in SQL and display the results - ---help:: - Display usage information and exit - ---list-databases:: - List all databases available and exit - ---list-tables:: - List tables in database and exit - ---verbose:: - Print more information while working - -Database-specific options -~~~~~~~~~~~~~~~~~~~~~~~~~ - -Additional arguments may be passed to the database manager -after a lone '-' on the command-line. - -In MySQL direct mode, additional arguments are passed directly to -mysqldump. - -ENVIRONMENT ------------ - -JAVA_HOME:: - As part of its import process, Sqoop generates and compiles Java code - by invoking the Java compiler *javac*(1). 
As a result, JAVA_HOME must - be set to the location of your JDK (note: This cannot just be a JRE). - e.g., +/usr/java/default+. Hadoop (and Sqoop) requires Sun Java 1.6 which - can be downloaded from http://java.sun.com. - -HADOOP_HOME:: - The location of the Hadoop jar files. If you installed Hadoop via RPM - or DEB, these are in +/usr/lib/hadoop-20+. - -HIVE_HOME:: - If you are performing a Hive import, you must identify the location of - Hive's jars and configuration. If you installed Hive via RPM or DEB, - these are in +/usr/lib/hive+. -
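
For reference, the options documented in the removed manpage combine into single-command invocations of the following form. This is a hedged sketch only, assuming a MySQL source reachable over JDBC: the connect string, database, table, and split column are hypothetical placeholders, not values taken from this patch. The environment variable values are the RPM/DEB install locations named in the manpage's ENVIRONMENT section; adjust them for a tarball install.

    # Locations described in the ENVIRONMENT section (illustrative;
    # your JDK and Hadoop paths may differ).
    export JAVA_HOME=/usr/java/default
    export HADOOP_HOME=/usr/lib/hadoop-20

    # Import one table into HDFS as plain text, prompting for the
    # password and splitting the parallel import on a chosen column.
    # Database, table, and column names are placeholders.
    sqoop --connect jdbc:mysql://db.example.com/salesdb \
          --username analyst -P \
          --table orders \
          --split-by order_id \
          --warehouse-dir /warehouse/dir \
          --as-textfile

Per the removed "Database-specific options" section, any extra arguments for the underlying database manager (for example, flags passed through to mysqldump in MySQL direct mode) would follow a lone '-' at the end of the command line.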