DataX/kuduwriter/doc/kuduwirter.md
2020-11-17 18:41:29 +08:00

datax-kudu-plugins

The Kudu writer plugin for DataX.

Example:

{
  "name": "kuduwriter",
  "parameter": {
    "kuduConfig": {
      "kudu.master_addresses": "***",
      "timeout": 60000,
      "sessionTimeout": 60000

    },
    "table": "",
    "replicaCount": 3,
    "truncate": false,
    "writeMode": "upsert",
    "partition": {
      "range": {
        "column1": [
          {
            "lower": "2020-08-25",
            "upper": "2020-08-26"
          },
          {
            "lower": "2020-08-26",
            "upper": "2020-08-27"
          },
          {
            "lower": "2020-08-27",
            "upper": "2020-08-28"
          }
        ]
      },
      "hash": {
        "column": [
          "column1"
        ],
        "number": 3
      }
    },
    "column": [
      {
        "index": 0,
        "name": "c1",
        "type": "string",
        "primaryKey": true
      },
      {
        "index": 1,
        "name": "c2",
        "type": "string",
        "compress": "DEFAULT_COMPRESSION",
        "encoding": "AUTO_ENCODING",
        "comment": "comment xxxx"
      }
    ],
    "batchSize": 1024,
    "bufferSize": 2048,
    "skipFail": false,
    "encoding": "UTF-8"
  }
}
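
The `range` entries in the example above are consecutive one-day intervals, each `upper` equal to the next `lower`. Writing them by hand gets tedious for longer periods, so here is a minimal Python sketch that generates them. The helper name `daily_ranges` is hypothetical and not part of the plugin; the partition column `column1` and the hash settings are copied from the example.

```python
import json
from datetime import date, timedelta


def daily_ranges(start, days):
    """Return one {"lower", "upper"} entry per day, starting at `start`.

    Each entry covers a single day; the upper bound of one entry is the
    lower bound of the next, as in the example configuration.
    """
    ranges = []
    for offset in range(days):
        lower = start + timedelta(days=offset)
        upper = lower + timedelta(days=1)
        ranges.append({"lower": lower.isoformat(), "upper": upper.isoformat()})
    return ranges


# Rebuild the "partition" object from the example configuration.
partition = {
    "range": {"column1": daily_ranges(date(2020, 8, 25), 3)},
    "hash": {"column": ["column1"], "number": 3},
}

print(json.dumps(partition, indent=2))
```

The printed JSON can be pasted into the `partition` field of the writer configuration.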

Required parameters:

        "writer": {
          "name": "kuduwriter",
          "parameter": {
            "kuduConfig": {
              "kudu.master_addresses": "***"
            },
            "table": "***",
            "column": [
              {
                "name": "c1",
                "type": "string",
                "primaryKey": true
              },
              {
                "name": "c2",
                "type": "string"
              },
              {
                "name": "c3",
                "type": "string"
              },
              {
                "name": "c4",
                "type": "string"
              }
            ]
          }
        }

List the primary key columns first.
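
The ordering rule can be checked before a job is submitted. The following is a minimal sketch, not code from the plugin: the function name `primary_keys_first` is hypothetical, and it only assumes that `column` is a list of objects in which `primaryKey` defaults to `false` when omitted.

```python
def primary_keys_first(columns):
    """Return True if every primaryKey column precedes all non-key columns."""
    seen_non_key = False
    for col in columns:
        if col.get("primaryKey", False):
            if seen_non_key:
                # A key column appeared after a non-key column.
                return False
        else:
            seen_non_key = True
    return True


good = [
    {"name": "c1", "type": "string", "primaryKey": True},
    {"name": "c2", "type": "string"},
]
bad = [
    {"name": "c2", "type": "string"},
    {"name": "c1", "type": "string", "primaryKey": True},
]

print(primary_keys_first(good))  # True
print(primary_keys_first(bad))   # False
```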

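Configuration parameters

| name | default | description | required |
| --- | --- | --- | --- |
| kuduConfig | | Kudu configuration (`kudu.master_addresses`, etc.) | |
| table | | Name of the target table to import into | |
| partition | | Partitioning | |
| column | | | |
| name | | Column name | |
| type | string | Column type. Currently supported: INT, FLOAT, STRING, BIGINT, DOUBLE, BOOLEAN, LONG. | |
| index | ascending order | Column index position (set it on every column or on none). For example, if the reader emits fields in the order `name, id, age` but the Kudu target table is laid out as `id, name, age`, set index to `1, 0, 2`. The default order is `0, 1, 2`. | |
| primaryKey | false | Whether the column is a primary key (list all primary key columns first). If no primary key is declared, dirty data is not checked or filtered. | |
| compress | DEFAULT_COMPRESSION | Compression format | |
| encoding | AUTO_ENCODING | Encoding | |
| replicaCount | 3 | Number of replicas to keep | |
| hash | | Hash partitioning | |
| number | 3 | Number of hash partitions | |
| range | | Range partitioning | |
| lower | | Lower bound of the range partition (e.g. the SQL DDL `partition value='haha'` corresponds to `"lower": "haha", "upper": "haha\000"`) | |
| upper | | Upper bound of the range partition (e.g. the SQL DDL `partition "10" <= VALUES < "20"` corresponds to `"lower": "10", "upper": "20"`) | |
| truncate | false | Whether to empty the table; in effect the table is dropped and recreated | |
| writeMode | upsert | One of `upsert`, `insert`, `update` | |
| batchSize | 512 | Flush once every this many rows; preferably no more than 1024 | |
| bufferSize | 3072 | Buffer size | |
| skipFail | false | Whether to skip rows that fail to insert | |
| timeout | 60000 | Client timeout, e.g. for create-table and delete-table operations. Unit: ms | |
| sessionTimeout | 60000 | Session timeout. Unit: ms | |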
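
The `index` example in the parameter list (reader order `name, id, age`, Kudu order `id, name, age`, index `1, 0, 2`) can be worked through in a few lines of Python. This is only a sketch of the mapping as that example describes it, under the assumption that `index` is the position of the value in the reader record; the helper `index_for`, the sample record, and the `string` types are illustrative and not taken from the plugin.

```python
reader_fields = ["name", "id", "age"]  # order in which the reader emits fields
kudu_columns = ["id", "name", "age"]   # column order of the Kudu target table


def index_for(reader_fields, kudu_columns):
    """For each Kudu column, find its position in the reader record."""
    return [reader_fields.index(name) for name in kudu_columns]


indexes = index_for(reader_fields, kudu_columns)
print(indexes)  # [1, 0, 2]

# Build the "column" list for the writer configuration.
column_config = [
    {"index": i, "name": name, "type": "string"}
    for i, name in zip(indexes, kudu_columns)
]

# Apply the mapping to one sample reader record (values are made up).
record = ["alice", "42", "30"]
row = {col["name"]: record[col["index"]] for col in column_config}
print(row)  # {'id': '42', 'name': 'alice', 'age': '30'}
```

If the reader already emits fields in the table's column order, `index` can be omitted on every column and the default order `0, 1, 2` applies.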