mirror of https://github.com/apache/sqoop.git synced 2025-05-04 04:31:18 +08:00
Commit Graph

210 Commits

Author SHA1 Message Date
Andrew Bayer
0b1f47c459 SQOOP-152. Support for test against cluster.
This change allows Sqoop unit tests to be run against a real cluster.

(Konstantin Boudnik via arvind)

From: Arvind Prabhakar <arvind@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1150012 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:28 +00:00
Andrew Bayer
d920b8a0e8 SQOOP-140. Control max. number of fetched records.
This change adds the ability of specifying the max. number of fetched records
from the database. This will solve problems that may arise when importing
large tables.

(Michael Häusler via ahmed)

From: Ahmed Radwan <ahmed@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1150011 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:28 +00:00
Andrew Bayer
de2fc6c2b3 SQOOP-164. Allow unit tests to use external dbs.
Modified the thirdparty tests to pick host URL from system properties.

From: Arvind Prabhakar <arvind@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1150010 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:27 +00:00
Andrew Bayer
3f8252a28c SQOOP-154. Fix connection leak in OracleManager.
From: Arvind Prabhakar <arvind@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1150009 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:27 +00:00
Andrew Bayer
49613bb5b7 SQOOP-159. Fixing HBase test failures.
Changes include explicitly setting the Zookeeper client port and increasing
the memory limit from 256m to 512m in build.xml.

From: Arvind Prabhakar <arvind@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1150008 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:27 +00:00
Andrew Bayer
8812957f98 Increment version number and prev.git.hash after branching for release.
From: Arvind Prabhakar <arvind@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1150007 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:26 +00:00
Andrew Bayer
4e6351f372 SQOOP-148. Use catalog views for OracleManager.
This change updates the OracleManager to use catalog views for resolving
the necessary metadata.

From: Arvind Prabhakar <arvind@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1150006 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:26 +00:00
Andrew Bayer
70caf779b0 SQOOP-142. Document requirements for direct import
Updated the documentation with details on direct mode execution
requirements.

From: Arvind Prabhakar <arvind@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1150005 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:26 +00:00
Andrew Bayer
e33bdbced1 SQOOP-111. Documentation fix.
Sqoop user guide inaccurately claims that Hive does not support escaping
of characters. This change updates the user guide to fix this and make the
claim based on the current capabilities of Hive.

From: Arvind Prabhakar <arvind@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1150004 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:25 +00:00
Andrew Bayer
fbb283f54d SQOOP-143. Simplify test configuration.
This change removes the test that asserts the presence of a non-default hosts
file configuration. It also adds the necessary comments to the PostgresqlTest
to allow configuring the server for default hosts file configuration.

From: Arvind Prabhakar <arvind@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1150003 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:25 +00:00
Andrew Bayer
1d5b7011a9 SQOOP-124. Improve error handling during export.
This change introduces the ability to use a staging table for intermediate
storage during execution for regular export jobs in insert mode. This allows
all of the exported data to first be populated in the staging table and then
inserted into the destination table in a single transaction. Thus if a failure
were to occur during export, it is less likely to corrupt the destination
table data. Moreover, the staging table is emptied before the export
job starts populating it, which ensures that re-running the job does not
require any special clean up.

From: Arvind Prabhakar <arvind@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1150002 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:25 +00:00
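
The staging-table flow described in SQOOP-124 can be sketched roughly as follows. This is a minimal illustration using Python's sqlite3 module, not Sqoop's implementation: the table names `staging` and `destination`, the two-column schema, and the function `export_with_staging` are all hypothetical.

```python
import sqlite3


def export_with_staging(conn, rows):
    """Load rows into a staging table, then move them to the destination atomically."""
    # Empty the staging table first, so re-running after a failure needs no manual cleanup.
    with conn:
        conn.execute("DELETE FROM staging")
    # Populate the staging table; a failure here never touches the destination.
    with conn:
        conn.executemany("INSERT INTO staging (id, name) VALUES (?, ?)", rows)
    # Move everything in one transaction: either all rows land or none do.
    with conn:
        conn.execute("INSERT INTO destination SELECT id, name FROM staging")


# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staging (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE destination (id INTEGER PRIMARY KEY, name TEXT)")
export_with_staging(conn, [(1, "a"), (2, "b")])
```

If the final insert fails (for example on a key conflict), the transaction is rolled back and the destination keeps its previous contents.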
Andrew Bayer
23cebe14ba SQOOP-139. Doc update for SQOOP-125 changes.
This fix mainly corrects a minor option naming inconsistency in the
documentation.

From: Ahmed Radwan <ahmed@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1150001 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:24 +00:00
Andrew Bayer
2de36d7aab SQOOP-141. BlobRef accessor returns incorrect data.
This change fixes the BlobRef implementation to return the appropriate stream
source or byte array from the BytesWritable instance by taking into
consideration the actual data length.

(Peter Hall via arvind)

From: Arvind Prabhakar <arvind@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1150000 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:24 +00:00
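
The kind of bug fixed in SQOOP-141 can be illustrated with a buffer whose backing array is larger than the data it holds. The sketch below is an assumption-laden Python analogue, not the BlobRef code: `PaddedBuffer`, `blob_bytes`, and `blob_stream` are hypothetical names standing in for a BytesWritable-like object and its accessors.

```python
import io


class PaddedBuffer:
    """Hypothetical stand-in for a buffer whose backing array exceeds its valid length."""

    def __init__(self, data, capacity):
        self.backing = bytearray(capacity)  # zero-filled backing array
        self.backing[:len(data)] = data     # valid data occupies the front
        self.length = len(data)             # number of valid bytes


def blob_bytes(buf):
    # Return only the valid prefix, not the whole backing array.
    return bytes(buf.backing[:buf.length])


def blob_stream(buf):
    # A stream over the same valid prefix.
    return io.BytesIO(blob_bytes(buf))


buf = PaddedBuffer(b"abc", capacity=8)
```

Reading `buf.backing` directly would yield eight bytes, five of them padding; the accessors above return only the three valid bytes.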
Andrew Bayer
cc288b6784 SQOOP-126. Support for loading options from file.
This change allows Sqoop to load options from an options file. An
options file is specified using --options-file. All options that
are otherwise specified on the command line should be specified
in this file in the order they would otherwise appear on the command
line. Options files can contain empty lines and comments for
readability. More than one options file may be used for a single
tool invocation if so preferred. Leading and trailing spaces are
ignored unless they appear within single or double quotes. Quoted
options extending into multiple lines are not supported.

From: Arvind Prabhakar <arvind@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149999 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:24 +00:00
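
The options-file rules listed in SQOOP-126 can be sketched as a small parser. This is an illustration under stated assumptions, not Sqoop's parser: it assumes one option or value per line and `#` as the comment marker, neither of which is spelled out in the commit message, and `parse_options_file` is a hypothetical name.

```python
def parse_options_file(text):
    """Return the command-line tokens described by an options file."""
    tokens = []
    for raw_line in text.splitlines():
        line = raw_line.strip()  # leading and trailing spaces are ignored
        if not line or line.startswith("#"):  # assumed comment marker
            continue
        if line[0] in "'\"":
            # A quoted option must open and close on the same line.
            if len(line) < 2 or line[-1] != line[0]:
                raise ValueError("multi-line quoted option: " + raw_line)
            line = line[1:-1]  # spaces inside the quotes are preserved
        tokens.append(line)
    return tokens


example = """
# connection settings
import
--connect
  jdbc:mysql://localhost/db
--fields-terminated-by
"  ,  "
"""
tokens = parse_options_file(example)
```

The resulting tokens appear in the same order they would on the command line, with the quoted delimiter keeping its surrounding spaces.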
Andrew Bayer
819c1dbb0b SQOOP-138. Fixing intermittent IVY failure.
This change fixes a problem that, on certain systems, prevents IVY
from downloading the hbase artifacts from the maven repository.
It also includes some clean up of documentation and build files
that relate to the removal of shim layer mechanism.

From: Arvind Prabhakar <arvind@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149998 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:23 +00:00
Andrew Bayer
ae66d60c56 SQOOP-101. Sqoop build to use IVY for dependencies.
This change modifies Sqoop build to use IVY for retrieving HBase and
Zookeeper dependencies. Along with this update, the version numbers
for HBase and Hadoop have been incremented to match the CDH3 Beta 3
versions. Due to this, a couple of tests had to be modified in order
to accommodate the changed behavior of the Hadoop classes.

From: Arvind Prabhakar <arvind@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149997 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:23 +00:00
Andrew Bayer
2eaa878ff0 SQOOP-12. Alternate NULL formats.
This fix allows the user to optionally specify different null
representations. It addresses both the import and export use
cases, in addition to both string and non-string column types.

From: Ahmed Radwan <ahmed@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149996 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:23 +00:00
Andrew Bayer
55cce082c2 SQOOP-135. Update documentation for query imports.
Documentation updated to explicitly state the limitations of the
free-form query based import facility. Also, fixed a documentation
example that was missing the 'WHERE $CONDITIONS' clause.

From: Arvind Prabhakar <arvind@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149995 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:22 +00:00
Andrew Bayer
b22904cbfe SQOOP-133. Removing shim layer mechanism.
This change removes the ShimLoader and various Shim classes such as CDH3Shim
etc. It introduces a couple of new classes - ConfigurationConstants and
ConfigurationHelper - that provide a unique place for articulating interface
related details such as configuration keys that can likely change from version
to version of Hadoop.

From: Arvind Prabhakar <arvind@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149994 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:22 +00:00
Andrew Bayer
df738df6c1 SQOOP-125. Allow user to specify database type.
This fix allows the user to optionally specify the connection
manager class to be used, instead of inferring it from the
connection string.

From: Ahmed Radwan <ahmed@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149993 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:21 +00:00
Andrew Bayer
b794959457 SQOOP-131. Hive import using free form query.
This change fixes a problem that prevented users from importing into Hive
data extracted using a free form query.

From: Ahmed Radwan <ahmed@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149992 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:21 +00:00
Andrew Bayer
bf0deb178e SQOOP-128. Allow custom export writers to cleanup.
This change allows customized ExportRecordWriters the opportunity to execute
code after the last commit is performed and before the database connection
is closed.

(Guy le Mar via Arvind Prabhakar)

From: Arvind Prabhakar <arvind@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149991 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:21 +00:00
Andrew Bayer
5dd36c62da SQOOP-114. Fix for NPE.
This is a fix for a case of malformed command line arguments,
where the tool name is correct, the option is also correct,
but the value of the option is missing.

From: Ahmed Radwan <ahmed@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149990 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:20 +00:00
Andrew Bayer
e3638e8ce0 SQOOP-37. Escape table and column names for Hive.
Hive allows the use of keywords as column and table names as long as they are
escaped using back-ticks. This change makes Sqoop always escape table and
column names using back-ticks thereby allowing Sqoop to work with Hive tables
that use keywords for either the table name or column names.

(Lars Francke via arvind)

From: Arvind Prabhakar <arvind@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149989 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:20 +00:00
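
The back-tick escaping described in SQOOP-37 can be sketched as below. This is a simplified illustration, not Sqoop's Hive DDL generator: `escape_hive_identifier` and `create_table_ddl` are hypothetical helpers, and the sketch does not handle names that themselves contain back-ticks.

```python
def escape_hive_identifier(name):
    # Always wrap the identifier in back-ticks, keyword or not.
    return "`" + name + "`"


def create_table_ddl(table, columns):
    """Build a CREATE TABLE statement from (column name, Hive type) pairs."""
    column_defs = ", ".join(
        escape_hive_identifier(col) + " " + hive_type for col, hive_type in columns
    )
    return "CREATE TABLE " + escape_hive_identifier(table) + " (" + column_defs + ")"


# Both the table name and the first column name are SQL keywords.
ddl = create_table_ddl("select", [("order", "INT"), ("name", "STRING")])
```

Escaping unconditionally avoids having to maintain a list of Hive keywords.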
Andrew Bayer
00821910b3 SQOOP-119. TextSplitter creates incorrect bounds.
The TextSplitter implementation used when creating splits on top
of String based columns, has a bug in its logic which causes
the bounds for splits to be created incorrectly. This results
in the import of duplicate data. This change fixes the TextSplitter
in order to ensure that the bound checks are created correctly.

From: Arvind Prabhakar <arvind@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149988 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:20 +00:00
Andrew Bayer
574835eb5a SQOOP-97. Remove mysql-connector-j from Sqoop.
This change removes the MySQL JDBC driver distribution that was
bundled with Sqoop previously. This is done to make sure that the
Sqoop distribution is completely Apache 2.0 compliant.

From: Arvind Prabhakar <arvind@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149987 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:19 +00:00
Andrew Bayer
683c04d10d SQOOP-108. Automatically obtain HBase and ZK deps.
Users currently need to specify hbase.home and zk.home in build.properties.
This change helps automatically resolve these dependencies by downloading
release tarballs. Would be best to do this via SQOOP-101 but the hbase and
zk maven layout currently has some issues that are painful to work around
in ivy.

Reason: Improvement
Author: Eli Collins via Arvind Prabhakar

From: Arvind Prabhakar <arvind@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149986 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:19 +00:00
Andrew Bayer
81608e42b6 SQOOP-110. Allow empty strings to represent NULL.
This change modifies the ClassWriter implementation to provide
support for interpreting empty strings as NULL values for datatypes
other than String. For String datatype, an explicit string 'null'
is interpreted as NULL value and empty string is not. This is
because certain databases distinguish between NULL and empty
strings.

The clone implementation generated by ClassWriter has also been
modified to make it more defensive against the presence of NULL
values.

From: Arvind Prabhakar <arvind@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149985 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:19 +00:00
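
The NULL-handling rule from SQOOP-110 can be sketched as a pair of field parsers. This is an illustration only, not the code that ClassWriter generates: the function names are hypothetical, and treating the literal `null` as NULL for non-string columns is an assumption that the commit message does not state explicitly.

```python
def parse_string_field(text):
    # Only the explicit string 'null' means NULL; an empty string stays an empty string.
    return None if text == "null" else text


def parse_int_field(text):
    # For non-string types an empty string is also read as NULL.
    # Assumption: the literal 'null' is treated as NULL here as well.
    if text == "" or text == "null":
        return None
    return int(text)
```

Keeping the two cases separate preserves the distinction that some databases draw between NULL and an empty string.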
Andrew Bayer
6da167490e SQOOP-99. CDH3Shim mapped to Hadoop 0.20.3.
Current CDH3 build includes version 0.20.3 of Hadoop which is now
mapped to CDH3Shim loader. Apart from that, this change includes
a change in build.xml and the OracleUtils test class that makes it
possible to override the connect string for Oracle tests.

From: Arvind Prabhakar <arvind@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149984 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:18 +00:00
Andrew Bayer
8527e7b458 Increment version number and prev.git.hash after branching for release.
From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149983 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:18 +00:00
Andrew Bayer
118bad1424 SQOOP-62. Fix failing Oracle compatibility tests.
From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149982 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:18 +00:00
Andrew Bayer
d1206e1238 SQOOP-94. COMPILING.txt does not mention all dependencies.
From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149981 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:17 +00:00
Andrew Bayer
3509b7941e SQOOP-90. Tool to merge datasets imported via incremental import.
Adds 'merge' tool.
Adds MergeJob, Merge*Mapper, MergeReducer.
Merge-specific arguments added to SqoopOptions, BaseSqoopTool.
Add TestMerge to test that this tool functions as expected.

From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149980 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:17 +00:00
Andrew Bayer
ea716f5426 SQOOP-89. Support multiple output files in ManagerCompatTestCase.
From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149979 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:17 +00:00
Andrew Bayer
c7b1ddb708 SQOOP-88. Parameterize pre-table-import validity checks in SqlManager.
Adds SqlManager.checkTableImportOptions() method called by importTable().

From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149978 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:17 +00:00
Andrew Bayer
1a63734c0d SQOOP-86. Release note generation requires python 2.5.
From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149977 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:16 +00:00
Andrew Bayer
09c6fe1ef8 SQOOP-85. Additional documentation build adjustments.
Disable XML validation under CentOS for more targets.

From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149976 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:16 +00:00
Andrew Bayer
56e3553f40 SQOOP-81. Creation of /tmp/sqoop/compile is not multi-user friendly.
Create nonce dirs as /tmp/sqoop-${user.name}/compile.
Remove unused compilation directories at end of execution.
Do not memoize nonce compilation directory names in metastore.

From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149975 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:16 +00:00
Andrew Bayer
899216e175 SQOOP-84. Fix documentation build error under CentOS.
Disable xmlto validation on CentOS.

From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149974 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:16 +00:00
Andrew Bayer
2ba041796e SQOOP-80. bin/configure-sqoop improperly escapes shell variable names.
From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149973 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:15 +00:00
Andrew Bayer
1632e9e1dc SQOOP-79. Add scripts to start/stop metastore in background, use pidfiles.
Added start-metastore.sh and stop-metastore.sh scripts, which start and stop
the metastore silently.

From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149972 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:15 +00:00
Andrew Bayer
f4f0197174 SQOOP-78. Consolidate web-based documentation and tarball documentation.
Eliminates "webdocs" target from build, merged with "docs" target.
Web-based format for document delivery is used for local documentation as well.
Links to release notes, API documentation added to documentation homepage.
Release notes are generated for non-SNAPSHOT builds as part of "package" step.

From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149971 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:15 +00:00
Andrew Bayer
f88bfe27a2 SQOOP-60. Include SIP-6 document in source repository.
From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149970 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:15 +00:00
Andrew Bayer
5cc48f3051 SQOOP-51. Document driver jar installation process.
Updated the "supported databases" section of the user guide to reflect the
current driver installation process.
Updated "support" section of the user guide to mention current issue tracker.

From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149969 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:14 +00:00
Andrew Bayer
d656663a14 SQOOP-42. Document saved jobs, metastore, and incremental imports.
Added manual pages and user guide sections for sqoop-job and sqoop-metastore.
Updated sqoop-import documentation to describe incremental imports.

From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149968 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:14 +00:00
Andrew Bayer
36f93eac1d SQOOP-77. Rename saved sessions to saved jobs.
From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149967 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:13 +00:00
Andrew Bayer
e825dc560c SQOOP-75. Allow configuration of ManagerFactories through a configuration subdirectory.
Files in conf/managers.d/ are treated as configuration files that specify
classes for sqoop.connection.factories. If this property is unset, these
files are processed in order.
ClassLoaderStack no longer attempts to load a jar if the test class is
already available with the current set of ClassLoaders.

From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149966 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:13 +00:00
Andrew Bayer
4abb414829 SQOOP-76. Add automated release note generation script.
From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149965 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:13 +00:00
Andrew Bayer
beb0b2e1c2 SQOOP-47. NPE if --append is specified with HBase import target.
AppendUtils checks for missing append source and exits gracefully.

From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149964 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:13 +00:00
Andrew Bayer
66c753c2e7 SQOOP-72. sqoop-eval should use PreparedStatement.execute().
Modify sqoop-eval to use generic execute() instead of
executeQuery()/executeUpdate().
The --update argument to sqoop-eval is unnecessary and has been removed.
Modify ResultSetPrinter to properly display left border on output.

From: Aaron Kimball <aaron@cloudera.com>

git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149963 13f79535-47bb-0310-9956-ffa450edef68
2011-07-22 20:04:12 +00:00