Setting up the Embulk command
See the post below (wherever it mentions a version, substitute the latest one):
shase428.hatenablog.jp
Setting up the bundle environment
$ mkdir embulk_sample
$ cd embulk_sample
$ embulk mkbundle bundle
2023-11-20 15:24:23.759 +0900: Embulk v0.9.23
Initializing bundle...
Creating Gemfile
Creating .bundle/config
Creating embulk/input/example.rb
Creating embulk/output/example.rb
Creating embulk/filter/example.rb
Rewrite bundle/Gemfile as follows:
source 'https://rubygems.org/'
gem 'embulk', '< 0.10'
gem 'embulk-input-command'
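For a scripted setup, the same edit can be made without opening an editor. This is a minimal sketch, assuming it is run from the directory that contains `bundle/` (that is, `embulk_sample`). The `mkdir -p` line is an addition of mine so the snippet also runs on its own.

```shell
# Non-interactive equivalent of editing bundle/Gemfile by hand.
# mkdir -p is only here so the snippet also runs standalone;
# after `embulk mkbundle bundle` the directory already exists.
mkdir -p bundle
cat > bundle/Gemfile <<'EOF'
source 'https://rubygems.org/'
gem 'embulk', '< 0.10'
gem 'embulk-input-command'
EOF

# Print the file to confirm it was overwritten.
cat bundle/Gemfile
```

The quoted `'EOF'` delimiter keeps the shell from expanding anything inside the heredoc, so the Gemfile is written exactly as typed.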
$ cd bundle
$ embulk bundle install --path=vendor/bundle
2023-11-20 15:28:10.995 +0900: Embulk v0.9.23
Fetching gem metadata from https://rubygems.org/........
Fetching gem metadata from https://rubygems.org/.
Resolving dependencies...
Using bundler 1.16.0
Fetching msgpack 1.4.1 (java)
Installing msgpack 1.4.1 (java)
Fetching embulk 0.11.1 (java)
Installing embulk 0.11.1 (java)
Fetching embulk-input-command 0.1.4
Installing embulk-input-command 0.1.4
Bundle complete! 2 Gemfile dependencies, 4 gems now installed.
Bundled gems are installed into `./vendor/bundle`
$ embulk bundle list
2023-11-20 15:29:24.988 +0900: Embulk v0.9.23
Gems included by the bundle:
* bundler (1.16.0)
* embulk (0.11.1)
* embulk-input-command (0.1.4)
* msgpack (1.4.1)
Creating config.yml
$ cd ..
$ vim config.yml
Contents of config.yml:
in:
  type: command
  command: echo "a,b" && echo "1,2" && echo "10,11"
  parser:
    charset: UTF-8
    newline: LF
    type: csv
    delimiter: ','
    columns:
    - {name: a, type: long}
    - {name: b, type: long}
out:
  type: stdout
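Before handing the config to Embulk, it can help to see what the `command` value produces by itself. The run logs show the plugin invoking it as `sh -c`, so the sketch below does the same. The function name `raw_input` is hypothetical and exists only for this illustration.

```shell
# The same command line that appears in config.yml, run directly through sh -c.
# raw_input is a hypothetical helper name used only for this illustration.
raw_input() {
  sh -c 'echo "a,b" && echo "1,2" && echo "10,11"'
}

# Prints three lines: the header "a,b" followed by the two data rows.
raw_input
```

These three lines are what the csv parser receives on its input, header row included.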
$ embulk preview -b bundle config.yml
2023-11-20 15:44:48.731 +0900: Embulk v0.9.23
2023-11-20 15:44:49.287 +0900 [WARN] (main): DEPRECATION: JRuby org.jruby.embed.ScriptingContainer is directly injected.
2023-11-20 15:44:51.113 +0900 [INFO] (main): BUNDLE_GEMFILE is being set: "/Users/foobar/tmp/embulk_sample/bundle/Gemfile"
2023-11-20 15:44:51.114 +0900 [INFO] (main): Gem's home and path are being cleared.
2023-11-20 15:44:52.949 +0900 [INFO] (main): Started Embulk v0.9.23
2023-11-20 15:44:53.063 +0900 [INFO] (0001:preview): Loaded plugin embulk-input-command (0.1.4)
2023-11-20 15:44:53.101 +0900 [INFO] (0001:preview): Try to read 32,768 bytes from input source
2023-11-20 15:44:53.107 +0900 [INFO] (0001:preview): Running command [sh, -c, echo "a,b" && echo "1,2" && echo "10,11"]
2023-11-20 15:44:53.139 +0900 [INFO] (0001:preview): Running command [sh, -c, echo "a,b" && echo "1,2" && echo "10,11"]
2023-11-20 15:44:53.204 +0900 [WARN] (0001:preview): Skipped line -:1 (java.lang.NumberFormatException: For input string: "a"): a,b
+--------+--------+
| a:long | b:long |
+--------+--------+
|      1 |      2 |
|     10 |     11 |
+--------+--------+
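The WARN line in the log above reports that the first line, `a,b`, was skipped because `a` cannot be parsed as a `long`. The sketch below imitates that effect with `awk`. It is not how Embulk implements it, and `keep_long_rows` is a hypothetical helper name. In a real config the csv parser's `skip_header_lines` option would be the usual way to drop a header row without the warning.

```shell
# keep_long_rows is a hypothetical helper, not part of Embulk.
# It keeps only lines whose two comma-separated fields are both integers,
# roughly imitating how rows that fail to parse as long get skipped.
keep_long_rows() {
  awk -F',' 'NF == 2 && $1 ~ /^-?[0-9]+$/ && $2 ~ /^-?[0-9]+$/'
}

# Feed the same three lines the command in config.yml emits.
printf 'a,b\n1,2\n10,11\n' | keep_long_rows
```

Only `1,2` and `10,11` pass the filter, which matches the two rows shown in the preview table.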
$ embulk run -b bundle config.yml
2023-11-20 15:45:32.403 +0900: Embulk v0.9.23
2023-11-20 15:45:33.306 +0900 [WARN] (main): DEPRECATION: JRuby org.jruby.embed.ScriptingContainer is directly injected.
2023-11-20 15:45:36.532 +0900 [INFO] (main): BUNDLE_GEMFILE is being set: "/Users/foobar/tmp/embulk_sample/bundle/Gemfile"
2023-11-20 15:45:36.538 +0900 [INFO] (main): Gem's home and path are being cleared.
2023-11-20 15:45:40.841 +0900 [INFO] (main): Started Embulk v0.9.23
2023-11-20 15:45:41.039 +0900 [INFO] (0001:transaction): Loaded plugin embulk-input-command (0.1.4)
2023-11-20 15:45:41.155 +0900 [INFO] (0001:transaction): Using local thread executor with max_threads=20 / output tasks 10 = input tasks 1 * 10
2023-11-20 15:45:41.161 +0900 [INFO] (0001:transaction): {done: 0 / 1, running: 0}
2023-11-20 15:45:41.194 +0900 [INFO] (0015:task-0000): Running command [sh, -c, echo "a,b" && echo "1,2" && echo "10,11"]
2023-11-20 15:45:41.262 +0900 [WARN] (0015:task-0000): Skipped line -:1 (java.lang.NumberFormatException: For input string: "a"): a,b
1,2
10,11
2023-11-20 15:45:41.271 +0900 [INFO] (0001:transaction): {done: 1 / 1, running: 0}
2023-11-20 15:45:41.290 +0900 [INFO] (main): Committed.
2023-11-20 15:45:41.290 +0900 [INFO] (main): Next config diff: {"in":{},"out":{}}
File layout
$ tree -L 2
.
├── bundle
│   ├── Gemfile
│   ├── Gemfile.lock
│   ├── embulk
│   └── vendor
└── config.yml