mirror of https://github.com/apache/sqoop.git synced 2025-05-03 02:51:00 +08:00

84 Commits

Author SHA1 Message Date
Andrew Bayer
0238221e7d Add bin/sqoop script.
From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149886 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:43 +00:00
Andrew Bayer
799f7e9070 Shims can be loaded from the classpath, or jars
CompilationManager now packs the active shim jar into the job jar.

From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149885 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:43 +00:00
Andrew Bayer
9a7dcdeafb MapReduce-API specific classes moved to shim jars.
All classes which depend on MapReduce APIs which change from
interfaces to classes between 0.20 and 0.22 are moved to distribution-
specific shim jars.
"Common" shim classes are now compiled multiple times against different
Hadoop distributions.
Shim classes are broken out into separate jars; ShimLoader now picks
the appropriate jar to load at runtime.
Configuration constants moved into HadoopShim.
BlobRef/ClobRef methods changed to use Mapper.Context for binary compatibility.

From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149884 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:42 +00:00
Andrew Bayer
1c8a2a3e8b Allow dependency cleaning without cleaning build
From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149883 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:42 +00:00
Andrew Bayer
8fb67486a6 Add Cloudera mvn repository to ivy for CDH
From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149882 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:42 +00:00
Andrew Bayer
db68b92108 Add 'ant javadoc' target.
From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149881 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:41 +00:00
Andrew Bayer
b55cb598da Add shim classes to allow compilation against different Hadoop distributions
Version-incompatible code now moved to HadoopShim subclasses.
HadoopShim singleton instance dynamically loaded based on VersionInfo.
Separate MRUnit builds from Apache and CDH placed in /lib subdirs.
Modified 'ant package' target to properly include all shims.

From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149880 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:41 +00:00
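
As a rough illustration of the version-based shim selection that commit b55cb598da describes, the sketch below maps a Hadoop version string to a shim class name. It is not the project's actual code: `ShimSelector`, the `example.shims.*` class names, and the version prefixes are all hypothetical, and the version string is passed in directly rather than read from Hadoop's `VersionInfo`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: choose a shim class name from a Hadoop version string. */
final class ShimSelector {
    // Illustrative mapping from version prefix to shim class name.
    // Both the prefixes and the class names are assumptions, not Sqoop's real ones.
    private static final Map<String, String> SHIMS = new LinkedHashMap<>();
    static {
        SHIMS.put("0.20", "example.shims.Apache20Shim");
        SHIMS.put("0.22", "example.shims.Apache22Shim");
    }

    private ShimSelector() { }

    /**
     * Returns the shim class name registered for the first matching prefix.
     * In a real build the version would come from the Hadoop runtime.
     */
    static String shimClassFor(String version) {
        for (Map.Entry<String, String> e : SHIMS.entrySet()) {
            if (version.startsWith(e.getKey())) {
                return e.getValue();
            }
        }
        throw new IllegalArgumentException("No shim for Hadoop version " + version);
    }
}
```

A caller would typically pass the returned name to `Class.forName(...)` and instantiate it once, which is one way to get the singleton behaviour the commit message mentions.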
Andrew Bayer
22190b9ba3 Add ability to compile against Cloudera or Apache Hadoop.
Added more thorough compilation instructions.

From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149879 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:41 +00:00
Andrew Bayer
0fde7dff8d Hive DDL generator uses INTEGER when it means INT.
From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149878 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:40 +00:00
Andrew Bayer
9374ed8398 Add support for mysqlimport-based export jobs.
Using --direct in conjunction with --export-dir on a MySQL database will use
mysqlimport to emit the data to the database.
DirectMySQLManager now creates instances of MySQLExportJob.
src/test/.../MySQLUtils is renamed to MySQLTestUtils to avoid conflict with
src/java/.../MySQLUtils added by this patch.
MySQLUtils contains methods factored out of import-specific code for sharing
with exports.

From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149877 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:40 +00:00
Andrew Bayer
f5fc5a5f7e Enable cobertura coverage reporting.
To use this, run:
ant cobertura -Dcobertura.home=/path/to/cobertura

That will run the standard smoke tests. To get a full report, you'll
need to re-run it on the thirdparty tests:

ant cobertura -Dcobertura.home=/path/to/cobertura -Dthirdparty=true \
    -Dsqoop.thirdparty.lib.dir=/path/to/jdbc/drivers

The results of both runs will be combined.

From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149876 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:40 +00:00
Andrew Bayer
abbfab1941 Refactor ExportJob to facilitate multiple kinds of export jobs.
Add 'JobBase' which attempts to unify ExportJob with ImportJobBase.
ImportJobBase and ExportJobBase override job-type specific behavior.

From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149875 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:40 +00:00
Andrew Bayer
8147d262d8 Enable findbugs on build and fix all warnings.
Some spurious warnings (and inconsequential warnings in test code)
have been disabled by src/test/findbugsExcludeFile.xml.

From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149874 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:39 +00:00
Andrew Bayer
7214230695 Support BINARY, VARBINARY, and RAW (Oracle) types
Added support for importing byte array columns as BytesWritable.
Tested with MySQL, Oracle, HSQLDB.

From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149873 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:39 +00:00
Andrew Bayer
2240be8807 Cache connections to Oracle across ConnManagers.
OracleManager now caches Connection instances for subsequent OracleManager
instances.
Refactored uses of ConnManager to call close() before discarding them.
This allows the Oracle JUnit tests to sleep less frequently to wait for Oracle
to reap closed server-side connection resources, improving Oracle test speed
by 50%.

Sleeping cannot be fully eliminated because MapReduce-side Connections are not
governed by this caching mechanism.

Also added some debugging advice re. this topic to OracleManagerTest's comment.

From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149872 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:39 +00:00
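
The caching idea in commit 2240be8807 (a closed manager hands its connection to the next manager instead of opening a new one) can be sketched generically as below. This is a minimal illustration under stated assumptions, not OracleManager's implementation: `CachedResource`, the string cache key, and the `Supplier` factory are hypothetical, and a plain object stands in for a JDBC `Connection` so the example runs without a database.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical sketch: hand a released resource to the next caller with the same key. */
final class CachedResource<T> {
    private final Map<String, T> cache = new HashMap<>();
    private int created = 0;

    /** Returns a cached resource for the key if one exists, else creates a new one. */
    synchronized T acquire(String key, Supplier<T> factory) {
        T existing = cache.remove(key);
        if (existing != null) {
            return existing;
        }
        created++;
        return factory.get();
    }

    /**
     * Offers a resource back to the cache. Returns false if a resource is
     * already cached for the key, in which case the caller should close its own.
     */
    synchronized boolean release(String key, T resource) {
        return cache.putIfAbsent(key, resource) == null;
    }

    /** Number of resources actually created through the factory. */
    synchronized int createdCount() {
        return created;
    }
}
```

As the commit message notes, a scheme like this only covers connections opened in the same JVM; connections made inside MapReduce tasks are outside its reach.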
Andrew Bayer
43f9e2f2b0 Added unit test to check network setup needed for postgres tests.
From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149871 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:38 +00:00
Andrew Bayer
6cbe7572e9 If --hive-import and --generate-only are specified, create a ddl script file.
From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149870 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:38 +00:00
Andrew Bayer
df76e995e8 Users can precisely control export parallelism.
Uses CombineFileInputFormat to run exports over a target number
of mappers independent of the number of input files.

From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149869 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:38 +00:00
Andrew Bayer
a0dd7e7490 Changed license headers to reference Cloudera instead of the ASF.
Adds NOTICE.txt file

From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149868 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:36 +00:00
Andrew Bayer
b72a134b52 Show imported row count after job completion.
From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149867 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:36 +00:00
Andrew Bayer
bb29ce9492 Support for CLOB/BLOB data in external files.
CLOB/BLOB data may now be stored in additional files in HDFS which are
accessible through streams if the data cannot be fully materialized in RAM.
Adds tests for external large objects.
Refactored large object loading into the map() method from readFields().

From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149866 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:35 +00:00
Andrew Bayer
32a67749b1 Run mysqldump in map task instead of on the client.
Major refactoring of DataDrivenImportJob to support mysqldump in mappers.
ImportJobBase added below DataDrivenImportJob.
MySQLDumpImportJob added on top of ImportJobBase.
LocalMySQLManager renamed to DirectMySQLManager; it now just runs MySQLDumpImportJob.
MySQLDumpImportJob configures MySQLDumpMapper to run mysqldump instances on
multiple nodes and is split-aware (via MySQLDumpInputFormat).
TestImportJob works with new ImportJobBase framework.
Added test that imports a subset of columns in mysql imports.

From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149865 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:35 +00:00
Andrew Bayer
bdd405f756 Improve batch testrunner support for third-party tests.
From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149864 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:35 +00:00
Andrew Bayer
14c9e0bf88 Use DataDrivenDBInputFormat with Oracle.
From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149863 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:34 +00:00
Andrew Bayer
7482c71cf9 Initial support for CLOB/BLOB types
Tests pass in Oracle and MySQL compatibility suites

From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149862 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:34 +00:00
Andrew Bayer
71b01cdb7f Compilation, dependency resolution, and tests pass.
Modified build.xml to run without Hadoop's build-contrib wrapper.
Added MRUnit jar from Hadoop MapReduce (not exposed via mvn).
Added 'package' and 'tar' targets for redistribution.
Added ivy settings files for direct dependencies.
Added gitignores where appropriate.
Move documentation from /doc to /src/docs.
Add LICENSE.txt.
Move readme.txt to README.txt.
Provide more fine-grained control of third-party redistributables
via 'redist' ivy configuration.

From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149861 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:34 +00:00
Andrew Bayer
af2ec3a03f MAPREDUCE-1445. Refactor Sqoop tests to support better ConnManager testing. Contributed by Aaron Kimball.
From: Thomas White <tomwhite@apache.org>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149860 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:33 +00:00
Andrew Bayer
c5c613cb92 MAPREDUCE-1444. Sqoop ConnManager instances can leak Statement objects. Contributed by Aaron Kimball.
From: Thomas White <tomwhite@apache.org>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149859 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:33 +00:00
Andrew Bayer
61d5da2500 MAPREDUCE-1467. Add a --verbose flag to Sqoop. Contributed by Aaron Kimball.
From: Thomas White <tomwhite@apache.org>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149858 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:33 +00:00
Andrew Bayer
6a215d0fbc MAPREDUCE-1469. Sqoop should disable speculative execution in export. Contributed by Aaron Kimball.
From: Thomas White <tomwhite@apache.org>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149857 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:33 +00:00
Andrew Bayer
a625fd478c MAPREDUCE-1341. Sqoop should have an option to create hive tables and skip the table import step. Contributed by Leonid Furman.
From: Thomas White <tomwhite@apache.org>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149856 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:32 +00:00
Andrew Bayer
de836d714a MAPREDUCE-1356. Allow user-specified hive table name in sqoop. Contributed by Aaron Kimball.
From: Thomas White <tomwhite@apache.org>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149855 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:32 +00:00
Andrew Bayer
69f04fff8b MAPREDUCE-1395. Sqoop does not check return value of Job.waitForCompletion(). Contributed by Aaron Kimball.
From: Thomas White <tomwhite@apache.org>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149854 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:31 +00:00
Andrew Bayer
40da832916 MAPREDUCE-1394. Sqoop generates incorrect URIs in paths sent to Hive. Contributed by Aaron Kimball.
From: Thomas White <tomwhite@apache.org>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149853 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:31 +00:00
Andrew Bayer
e7a8e519f3 MAPREDUCE-1327. Fix Sqoop handling of Oracle timezone with timestamp data
types in import. Contributed by Leonid Furman

From: Christopher Douglas <cdouglas@apache.org>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149852 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:31 +00:00
Andrew Bayer
6174268d28 MAPREDUCE-1313. Fix NPE in Sqoop when table with null fields uses escape
during import. Contributed by Aaron Kimball

From: Christopher Douglas <cdouglas@apache.org>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149851 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:31 +00:00
Andrew Bayer
c7f64e4f8c MAPREDUCE-1212. Mapreduce contrib project ivy dependencies are not included in binary target. Contributed by Aaron Kimball.
From: Thomas White <tomwhite@apache.org>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149850 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:31 +00:00
Andrew Bayer
05929a73e5 MAPREDUCE-1310. CREATE TABLE statements for Hive do not correctly specify delimiters. Contributed by Aaron Kimball.
From: Thomas White <tomwhite@apache.org>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149849 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:30 +00:00
Andrew Bayer
b74084196f MAPREDUCE-1235. Fix a MySQL timestamp incompatibility in Sqoop. Contributed by Aaron Kimball
From: Christopher Douglas <cdouglas@apache.org>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149848 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:30 +00:00
Andrew Bayer
681461461a MAPREDUCE-1174. Sqoop improperly handles table/column names which are reserved sql words. Contributed by Aaron Kimball.
From: Thomas White <tomwhite@apache.org>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149847 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:30 +00:00
Andrew Bayer
8b483c6ded MAPREDUCE-1146. Sqoop dependencies break Eclipse build on Linux. Contributed by Aaron Kimball.
From: Thomas White <tomwhite@apache.org>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149846 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:29 +00:00
Andrew Bayer
4686e0fee7 MAPREDUCE-1148. SQL identifiers are a superset of Java identifiers. Contributed by Aaron Kimball.
From: Thomas White <tomwhite@apache.org>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149845 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:29 +00:00
Andrew Bayer
12827a1765 MAPREDUCE-1224. Calling "SELECT t.* from <table> AS t" to get meta information is too expensive for big tables. Contributed by Spencer Ho.
From: Thomas White <tomwhite@apache.org>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149844 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:29 +00:00
Andrew Bayer
8e813b95a4 MAPREDUCE-1168. Export data to databases via Sqoop. Contributed by Aaron Kimball.
From: Thomas White <tomwhite@apache.org>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149843 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:28 +00:00
Andrew Bayer
2312eeff5a MAPREDUCE-1239. Fix contrib components build dependencies
From: Giridharan Kesavan <gkesavan@apache.org>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149842 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:28 +00:00
Andrew Bayer
b451865d53 HADOOP-5107. Use Maven ant tasks to publish artifacts. (Giridharan Kesavan
via omalley)

From: Owen O'Malley <omalley@apache.org>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149841 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:28 +00:00
Andrew Bayer
ec8f687d97 MAPREDUCE-1169. Improvements to mysqldump use in Sqoop. Contributed by Aaron Kimball.
From: Thomas White <tomwhite@apache.org>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149840 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:28 +00:00
Andrew Bayer
84adbeea26 MAPREDUCE-1037. Continue running contrib tests if Sqoop tests fail. Contributed by Aaron Kimball
From: Christopher Douglas <cdouglas@apache.org>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149839 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:27 +00:00
Andrew Bayer
a0229d9738 MAPREDUCE-1036. Document Sqoop API. Contributed by Aaron Kimball
From: Christopher Douglas <cdouglas@apache.org>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149838 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:27 +00:00
Andrew Bayer
9afc7a8aee MAPREDUCE-1069. Implement Sqoop API refactoring. Contributed by Aaron Kimball.
From: Thomas White <tomwhite@apache.org>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149837 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:03:27 +00:00