This change introduces a new option that can be used to pass custom
connection parameters while creating JDBC connections. If no connection
parameters are specified, the system defaults to the old behavior.
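
As a rough sketch of the idea (not the code in this change), custom parameters can be handed to JDBC through a java.util.Properties object, with the three-argument DriverManager call kept as the fallback. The class name, method names, and the "useSSL" parameter mentioned afterwards are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

/** Hypothetical helper; names are illustrative and not taken from Sqoop. */
final class ConnectionParamsSketch {

  /** Combines credentials and user-supplied parameters into one Properties object. */
  static Properties buildProperties(String user, String password, Properties custom) {
    Properties props = new Properties();
    // "user" and "password" are the standard JDBC property keys.
    if (user != null) {
      props.setProperty("user", user);
    }
    if (password != null) {
      props.setProperty("password", password);
    }
    // Custom parameters are applied last, so they may override the credentials.
    if (custom != null) {
      props.putAll(custom);
    }
    return props;
  }

  /** Opens a connection, using the custom parameters only when some are given. */
  static Connection open(String url, String user, String password, Properties custom)
      throws SQLException {
    if (custom == null || custom.isEmpty()) {
      // No custom parameters: connect the same way as before.
      return DriverManager.getConnection(url, user, password);
    }
    return DriverManager.getConnection(url, buildProperties(user, password, custom));
  }
}
```

A caller might put a driver-specific entry such as "useSSL" into the custom Properties. Which keys are accepted depends entirely on the JDBC driver.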
From: Arvind Prabhakar <arvind@cloudera.com>
git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1150051 13f79535-47bb-0310-9956-ffa450edef68
This change introduces a new Connection Manager for SQL Server along
with a basic test case to exercise part of the functionality. It also
addresses the problem noted in SQOOP-229 by overriding the
getCurTimestampQuery method as suggested.
(Patrick Angeles via Arvind Prabhakar)
From: Arvind Prabhakar <arvind@cloudera.com>
git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1150049 13f79535-47bb-0310-9956-ffa450edef68
This patch adds a checkstyle module to detect trailing white
space. It also removes the existing instances of trailing
white space in the code.
From: Ahmed Radwan <ahmed@cloudera.com>
git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1150046 13f79535-47bb-0310-9956-ffa450edef68
Recently the PostgresqlManager was updated to escape all identifier
names. This change addresses a couple of places where the identifier
was either not being escaped, or was being converted to lower case as
per the previous logic.
From: Arvind Prabhakar <arvind@cloudera.com>
git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1150044 13f79535-47bb-0310-9956-ffa450edef68
This patch fixes a bug that prevents importing data into
an existing hive table with the 'hive-overwrite' argument set.
From: Ahmed Radwan <ahmed@cloudera.com>
git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1150043 13f79535-47bb-0310-9956-ffa450edef68
Adds setter methods and a field-based equals implementation to
the generated classes. These new methods make the generated
classes easier to use.
(Michael Häusler via ahmed)
From: Ahmed Radwan <ahmed@cloudera.com>
git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1150037 13f79535-47bb-0310-9956-ffa450edef68
This change introduces a new method in ConnManager that allows the
various implementations to optionally override it and specify a
custom bounds query used for calculating splits during free-form
query based imports.
From: Arvind Prabhakar <arvind@cloudera.com>
git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1150033 13f79535-47bb-0310-9956-ffa450edef68
The SqoopRecord.toString() and SqoopRecord.toString(DelimiterSet) methods
always append an end-of-record delimiter. Sqoop uses its own OutputFormat
when rendering these to text files, so that the user's delimiters are
preserved.
Other users could use this OutputFormat when working with SqoopRecord
instances in their own MapReduce code, but the records should also work
well with TextOutputFormat when the intent is newline-terminated
records.
This patch allows users to suppress end-of-record delimiter generation when
formatting records with toString.
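
To illustrate the behavior described above, here is a minimal sketch of a record class whose toString accepts a flag for suppressing the end-of-record delimiter. It is not the generated Sqoop code: the class name SketchRecord, its fields, and the delimiter choices are assumptions made for the example.

```java
/** Hypothetical stand-in for a generated record class; not Sqoop's actual code. */
final class SketchRecord {
  private static final char FIELD_DELIM = ',';   // assumed field delimiter
  private static final char RECORD_DELIM = '\n'; // assumed end-of-record delimiter

  private final int id;
  private final String name;

  SketchRecord(int id, String name) {
    this.id = id;
    this.name = name;
  }

  /** Default form: always appends the end-of-record delimiter. */
  @Override
  public String toString() {
    return toString(true);
  }

  /** Appends the end-of-record delimiter only when the caller asks for it. */
  public String toString(boolean useRecordDelim) {
    StringBuilder sb = new StringBuilder();
    sb.append(id).append(FIELD_DELIM).append(name);
    if (useRecordDelim) {
      sb.append(RECORD_DELIM);
    }
    return sb.toString();
  }
}
```

With the flag set to false, an output format that adds its own newline after each record would not end up writing an extra empty line.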
(Aaron Kimball via Arvind Prabhakar)
From: Arvind Prabhakar <arvind@cloudera.com>
git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1150025 13f79535-47bb-0310-9956-ffa450edef68
This change introduces a setField(fieldName, fieldVal) method for
SqoopRecord instances which would allow an arbitrary programmatic
"setter" function without requiring reflection.
From: Arvind Prabhakar <arvind@cloudera.com>
git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1150021 13f79535-47bb-0310-9956-ffa450edef68
This change looks for ToolPlugin definitions in the
sqoop.tool.plugins configuration entry, or conf/tools.d. Each
ToolPlugin returns a list of ToolDesc entries, which are then
registered with SqoopTool.register() before the user's arguments
are parsed. The user can then run 'sqoop <custom-tool> args...'
as if it were part of the natural Sqoop system.
(Aaron Kimball via Arvind Prabhakar)
From: Arvind Prabhakar <arvind@cloudera.com>
git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1150020 13f79535-47bb-0310-9956-ffa450edef68
Minor changes to AsyncSqlExecThread to use execute instead of executeUpdate
and to DBRecordReader to allow subclasses to access the configuration object.
From: Ahmed Radwan <ahmed@cloudera.com>
git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1150014 13f79535-47bb-0310-9956-ffa450edef68
This change adds the ability to specify the maximum number of records fetched
from the database. This addresses problems that may arise when importing
large tables.
(Michael Häusler via ahmed)
From: Ahmed Radwan <ahmed@cloudera.com>
git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1150011 13f79535-47bb-0310-9956-ffa450edef68
This change updates the OracleManager to use catalog views for resolving
the necessary metadata.
From: Arvind Prabhakar <arvind@cloudera.com>
git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1150006 13f79535-47bb-0310-9956-ffa450edef68
This change removes the test that asserts the presence of a non-default hosts
file configuration. It also adds the necessary comments to the PostgresqlTest
to allow configuring the server for default hosts file configuration.
From: Arvind Prabhakar <arvind@cloudera.com>
git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1150003 13f79535-47bb-0310-9956-ffa450edef68
This change introduces the ability to use a staging table for intermediate
storage during execution for regular export jobs in insert mode. This allows
all of the exported data to first be populated in the staging table and then
inserted into the destination table in a single transaction. Thus if a failure
were to occur during export, it is less likely to corrupt the destination
table data. Moreover, the staging table is emptied before the export
job starts populating it, which ensures that re-running the job does not
require any special clean up.
From: Arvind Prabhakar <arvind@cloudera.com>
git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1150002 13f79535-47bb-0310-9956-ffa450edef68
This change fixes the BlobRef implementation to return the appropriate stream
source or byte array from the BytesWritable instance by taking into
consideration the actual data length.
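
The underlying issue is that the array returned by BytesWritable.getBytes() may be longer than the valid data reported by getLength(). The sketch below shows the general technique with plain JDK types standing in for BytesWritable, so that it runs without Hadoop on the classpath. The class and method names are hypothetical and this is not the BlobRef code.

```java
import java.io.ByteArrayInputStream;
import java.util.Arrays;

/** Hypothetical helper: "backing" and "length" stand in for getBytes() and getLength(). */
final class ValidBytesSketch {

  /** Returns a copy holding only the valid prefix of the backing array. */
  static byte[] toByteArray(byte[] backing, int length) {
    return Arrays.copyOf(backing, length);
  }

  /** Returns a stream that ends after the valid prefix, not at the end of the array. */
  static ByteArrayInputStream toStream(byte[] backing, int length) {
    return new ByteArrayInputStream(backing, 0, length);
  }
}
```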
(Peter Hall via arvind)
From: Arvind Prabhakar <arvind@cloudera.com>
git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1150000 13f79535-47bb-0310-9956-ffa450edef68
This change allows Sqoop to load options from an options file. An
options file is specified using --options-file. All options that
are otherwise specified on the command line should be specified
in this file in the order they would otherwise appear on the command
line. Options files can contain empty lines and comments for
readability. More than one options file may be used for a single
tool invocation if so preferred. Leading and trailing spaces are
ignored unless they appear within single or double quotes. Quoted
options extending into multiple lines are not supported.
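
The rules above can be sketched roughly as follows. This is not the parser added by this change: it assumes one option or value per line and assumes that comment lines start with '#', neither of which is stated in this message, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical options-file reader; a sketch of the stated rules only. */
final class OptionsFileSketch {

  /** Turns the lines of an options file into an argument list, preserving order. */
  static List<String> parse(List<String> lines) {
    List<String> args = new ArrayList<>();
    for (String raw : lines) {
      // Leading and trailing spaces outside quotes are ignored.
      String line = raw.trim();
      // Empty lines and comment lines ('#' is an assumption) are skipped.
      if (line.isEmpty() || line.startsWith("#")) {
        continue;
      }
      args.add(unquote(line));
    }
    return args;
  }

  /** Strips one pair of matching quotes, keeping any spaces inside them. */
  static String unquote(String s) {
    if (s.length() >= 2) {
      char first = s.charAt(0);
      char last = s.charAt(s.length() - 1);
      if ((first == '"' || first == '\'') && first == last) {
        return s.substring(1, s.length() - 1);
      }
    }
    return s;
  }
}
```

Because each line is handled on its own, a quoted value that continues onto the next line is not recognized, which matches the restriction noted above.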
From: Arvind Prabhakar <arvind@cloudera.com>
git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149999 13f79535-47bb-0310-9956-ffa450edef68
This fix allows the user to optionally specify different null
representations. It addresses both the import and export use
cases, in addition to both string and non-string column types.
From: Ahmed Radwan <ahmed@cloudera.com>
git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149996 13f79535-47bb-0310-9956-ffa450edef68
This change removes the ShimLoader and the various Shim classes, such as
CDH3Shim. It introduces two new classes, ConfigurationConstants and
ConfigurationHelper, that provide a single place for interface-related
details, such as configuration keys, that are likely to change from version
to version of Hadoop.
From: Arvind Prabhakar <arvind@cloudera.com>
git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149994 13f79535-47bb-0310-9956-ffa450edef68
This fix allows the user to optionally specify the connection
manager class to be used, instead of inferring it from the
connection string.
From: Ahmed Radwan <ahmed@cloudera.com>
git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149993 13f79535-47bb-0310-9956-ffa450edef68
This change fixes a problem that prevented users from importing data
extracted using a free-form query into Hive.
From: Ahmed Radwan <ahmed@cloudera.com>
git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149992 13f79535-47bb-0310-9956-ffa450edef68
This is a fix for a case of malformed command line arguments,
where the tool name is correct, the option is also correct,
but the value of the option is missing.
From: Ahmed Radwan <ahmed@cloudera.com>
git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149990 13f79535-47bb-0310-9956-ffa450edef68
Hive allows the use of keywords as column and table names as long as they are
escaped using back-ticks. This change makes Sqoop always escape table and
column names using back-ticks thereby allowing Sqoop to work with Hive tables
that use keywords for either the table name or column names.
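
A minimal sketch of the escaping idea, assuming identifiers that do not themselves contain back-ticks. The class and method names are hypothetical and this is not the code from this change.

```java
/** Hypothetical helper for back-tick escaping of Hive identifiers. */
final class HiveEscapeSketch {

  /** Wraps a table or column name in back-ticks. */
  static String escape(String identifier) {
    // Assumption: the identifier contains no back-tick of its own.
    return "`" + identifier + "`";
  }

  /** Builds a comma-separated list of escaped column names. */
  static String escapeAll(String... columns) {
    String[] escaped = new String[columns.length];
    for (int i = 0; i < columns.length; i++) {
      escaped[i] = escape(columns[i]);
    }
    return String.join(", ", escaped);
  }
}
```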
(Lars Francke via arvind)
From: Arvind Prabhakar <arvind@cloudera.com>
git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149989 13f79535-47bb-0310-9956-ffa450edef68
The TextSplitter implementation, which is used when creating splits
on String based columns, has a bug in its logic that causes the
bounds for splits to be created incorrectly. This results in the
import of duplicate data. This change fixes the TextSplitter so
that the split bounds are created correctly.
From: Arvind Prabhakar <arvind@cloudera.com>
git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149988 13f79535-47bb-0310-9956-ffa450edef68
This change modifies the ClassWriter implementation to provide
support for interpreting empty strings as NULL values for datatypes
other than String. For the String datatype, an explicit string 'null'
is interpreted as a NULL value and an empty string is not. This is
because certain databases distinguish between NULL and empty
strings.
The clone implementation generated by ClassWriter has also been
modified to make it more defensive against the presence of NULL
values.
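
The null-handling rules described above can be sketched as follows, using Integer as the example of a non-String datatype. This is an illustration only, not the code that ClassWriter generates, and the class and method names are hypothetical.

```java
/** Hypothetical parsing helpers that mirror the stated NULL rules. */
final class NullParsingSketch {

  /** Non-String column: an empty string is treated as NULL. */
  static Integer parseInteger(String raw) {
    if (raw == null || raw.isEmpty()) {
      return null;
    }
    return Integer.valueOf(raw);
  }

  /** String column: only the literal text "null" is treated as NULL. */
  static String parseString(String raw) {
    if (raw == null || "null".equals(raw)) {
      return null;
    }
    // An empty string is kept as an empty string, distinct from NULL.
    return raw;
  }
}
```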
From: Arvind Prabhakar <arvind@cloudera.com>
git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149985 13f79535-47bb-0310-9956-ffa450edef68
The current CDH3 build includes version 0.20.3 of Hadoop, which is now
mapped to the CDH3Shim loader. Apart from that, this change includes
a change in build.xml and the OracleUtils test class that allows
overriding the connect string for Oracle tests.
From: Arvind Prabhakar <arvind@cloudera.com>
git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149984 13f79535-47bb-0310-9956-ffa450edef68
Adds 'merge' tool.
Adds MergeJob, Merge*Mapper, MergeReducer.
Merge-specific arguments added to SqoopOptions, BaseSqoopTool.
Add TestMerge to test that this tool functions as expected.
From: Aaron Kimball <aaron@cloudera.com>
git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149980 13f79535-47bb-0310-9956-ffa450edef68
Create nonce dirs as /tmp/sqoop-${user.name}/compile.
Remove unused compilation directories at end of execution.
Do not memoize nonce compilation directory names in metastore.
From: Aaron Kimball <aaron@cloudera.com>
git-svn-id: https://svn.apache.org/repos/asf/incubator/sqoop/trunk@1149975 13f79535-47bb-0310-9956-ffa450edef68