| 1 | +--- |
| 2 | +layout: posts |
| 3 | +title: "Installing Maven-style Embulk plugins" |
| 4 | +date: 2024-06-13 |
| 5 | +description: "We recently started providing a couple of ways to install Maven-style Embulk plugins more easily. Installation was admittedly not very easy when Maven-style plugins were first introduced. This article briefly introduces these installation methods." |
| 6 | +author: "dmikurube" |
| 7 | +--- |
| 8 | + |
| 9 | +Since [Embulk v0.11.0 was released a year ago](https://github.com/embulk/embulk/releases/tag/v0.11.0), we have been promoting the new Maven-style Embulk plugins over the legacy RubyGems-style plugins. |
| 10 | + |
| 11 | +See also: [Embulk v0.11 is coming soon: JRuby](https://www.embulk.org/articles/2023/04/13/embulk-v0.11-is-coming-soon.html#jruby) |
| 12 | + |
| 13 | +We recently started providing a couple of ways to install Maven-style Embulk plugins more easily. Installation was admittedly not very easy when Maven-style plugins were first introduced. |
| 14 | + |
| 15 | +This article briefly introduces these installation methods. |
| 16 | + |
| 17 | +## Revisit: Embulk home |
| 18 | + |
| 19 | +Embulk now has the concept of an "Embulk home" directory, which contains `embulk.properties` and Embulk plugin installations. Maven-style Embulk plugins are also installed in the Embulk home directory. |
| 20 | + |
| 21 | +See again: [Embulk v0.11 is coming soon: Embulk home](https://www.embulk.org/articles/2023/04/13/embulk-v0.11-is-coming-soon.html#embulk-home) |
| 22 | + |
| 23 | +## #1: Embulk's built-in subcommand `install` |
| 24 | + |
| 25 | +[Embulk v0.11.3](https://github.com/embulk/embulk/releases/tag/v0.11.3) introduced a new Embulk subcommand, `embulk install`, as the Maven-style counterpart of `embulk gem install` for RubyGems-style plugins. This subcommand takes a Maven artifact notation as its argument. The example below installs [`org.embulk:embulk-input-s3:0.6.0` from Maven Central](https://central.sonatype.com/artifact/org.embulk/embulk-input-s3/0.6.0). |
| 26 | + |
| 27 | +``` |
| 28 | +$ java -jar embulk-0.11.3.jar install "org.embulk:embulk-input-s3:0.6.0" |
| 29 | +... |
| 30 | +... |
| 31 | +2024-06-13 15:46:11.537 +0900 [INFO] (main): The path "/home/user/.embulk/lib/m2/repository" (m2_repo) does not exist. Creating it as a directory. |
| 32 | +2024-06-13 15:46:11.619 +0900 [INFO] (main): No alternative remote Maven repositories are specified. Downloading artifacts from Maven Central. |
| 33 | +2024-06-13 15:46:11.633 +0900 [INFO] (main): Downloading org.embulk:embulk-input-s3:pom:0.6.0 from https://repo.maven.apache.org/maven2 |
| 34 | +2024-06-13 15:46:12.725 +0900 [INFO] (main): Downloaded org.embulk:embulk-input-s3:pom:0.6.0 at /home/user/.embulk/lib/m2/repository/org/embulk/embulk-input-s3/0.6.0/embulk-input-s3-0.6.0.pom |
| 35 | +2024-06-13 15:46:12.776 +0900 [INFO] (main): Downloading com.amazonaws:aws-java-sdk-s3:pom:1.11.466 from https://repo.maven.apache.org/maven2 |
| 36 | +2024-06-13 15:46:13.027 +0900 [INFO] (main): Downloaded com.amazonaws:aws-java-sdk-pom:pom:1.11.466 at /home/user/.embulk/lib/m2/repository/com/amazonaws/aws-java-sdk-pom/1.11.466/aws-java-sdk-pom-1.11.466.pom |
| 37 | +... |
| 38 | +... |
| 39 | +2024-06-13 15:46:14.857 +0900 [INFO] (main): Downloading org.embulk:embulk-input-s3:jar:0.6.0 from https://repo.maven.apache.org/maven2 |
| 40 | +2024-06-13 15:46:14.857 +0900 [INFO] (main): Downloading com.amazonaws:aws-java-sdk-s3:jar:1.11.466 from https://repo.maven.apache.org/maven2 |
| 41 | +... |
| 42 | +... |
| 43 | +2024-06-13 15:46:15.720 +0900 [INFO] (main): Downloaded org.embulk:embulk-input-s3:jar:0.6.0 at /home/user/.embulk/lib/m2/repository/org/embulk/embulk-input-s3/0.6.0/embulk-input-s3-0.6.0.jar |
| 44 | +2024-06-13 15:46:15.721 +0900 [INFO] (main): Downloaded com.amazonaws:aws-java-sdk-s3:jar:1.11.466 at /home/user/.embulk/lib/m2/repository/com/amazonaws/aws-java-sdk-s3/1.11.466/aws-java-sdk-s3-1.11.466.jar |
| 45 | +... |
| 46 | +... |
| 47 | +2024-06-13 15:46:15.730 +0900 [INFO] (main): Installed org.embulk:embulk-input-s3:jar:0.6.0 at /home/user/.embulk/lib/m2/repository/org/embulk/embulk-input-s3/0.6.0/embulk-input-s3-0.6.0.jar |
| 48 | +2024-06-13 15:46:15.730 +0900 [INFO] (main): Installed com.amazonaws:aws-java-sdk-s3:jar:1.11.466 at /home/user/.embulk/lib/m2/repository/com/amazonaws/aws-java-sdk-s3/1.11.466/aws-java-sdk-s3-1.11.466.jar |
| 49 | +... |
| 50 | +... |
| 51 | +``` |
| 52 | + |
| 53 | +As you can see in the example above, this subcommand also downloads the dependencies of the specified Maven artifact transitively. |
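
The log above suggests that installed artifacts land under `lib/m2/repository` in the Embulk home, following the standard Maven repository layout. As a minimal sketch under that assumption, the shell function below computes where the jar of a given artifact would be expected. The function name `embulk_artifact_jar` is hypothetical and is not part of Embulk.

```shell
# Hypothetical helper (not part of Embulk): prints the path where the jar of a
# Maven artifact is expected under an Embulk home, assuming the standard Maven
# repository layout that appears in the installation log.
embulk_artifact_jar() {
  embulk_home="$1"
  coords="$2"                  # "group:name:version"
  group="${coords%%:*}"        # text before the first ':'
  rest="${coords#*:}"          # "name:version"
  name="${rest%%:*}"
  version="${rest#*:}"
  # A group ID such as "org.embulk" becomes the directory "org/embulk".
  group_path="$(printf '%s' "$group" | tr '.' '/')"
  printf '%s/lib/m2/repository/%s/%s/%s/%s-%s.jar\n' \
    "$embulk_home" "$group_path" "$name" "$version" "$name" "$version"
}

embulk_artifact_jar "$HOME/.embulk" "org.embulk:embulk-input-s3:0.6.0"
```

Passing the printed path to `ls -l` is one quick way to check whether the plugin jar is actually there after an installation.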
| 54 | + |
| 55 | +Note that you can change the destination Embulk home directory with Embulk's standard options. See the example below. |
| 56 | + |
| 57 | +``` |
| 58 | +$ java -jar embulk-0.11.3.jar -Xembulk_home=/tmp/foo install "org.embulk:embulk-input-s3:0.6.0" |
| 59 | +... |
| 60 | + |
| 61 | +$ env EMBULK_HOME=/tmp/bar java -jar embulk-0.11.3.jar install "org.embulk:embulk-input-s3:0.6.0" |
| 62 | +... |
| 63 | +``` |
| 64 | + |
| 65 | +Unfortunately, it currently supports only [Maven Central](https://central.sonatype.com/) as the remote repository. |
| 66 | + |
| 67 | +## #2: Out-of-Embulk Embulk plugin installer |
| 68 | + |
| 69 | +Embulk has had the `mkbundle` subcommand and the `-b` option so that users can manage plugin installations with a `Gemfile`, but they work only for RubyGems-style plugins, of course. |
| 70 | + |
| 71 | +[The Gradle `org.embulk.runset` plugin](https://github.com/embulk/gradle-embulk-runset) is an alternative for Maven-style Embulk plugins. It works entirely outside the Embulk package. |
| 72 | + |
| 73 | +To use this, first set up an environment for [Gradle](https://gradle.org/install/). At least [Gradle 8.7](https://docs.gradle.org/8.7/userguide/userguide.html) is required. You may want to use [the Gradle wrapper](https://docs.gradle.org/8.7/userguide/userguide.html) in typical use cases. |
| 74 | + |
| 75 | +Next, write a `build.gradle` that declares the Maven-style Embulk plugins you want to install. |
| 76 | + |
| 77 | +``` |
| 78 | +plugins { |
| 79 | + id "org.embulk.runset" version "0.2.0" // Just apply this Gradle plugin. |
| 80 | +} |
| 81 | + |
| 82 | +repositories { |
| 83 | + mavenCentral() |
| 84 | +} |
| 85 | + |
| 86 | +installEmbulkRunSet { |
| 87 | + // Set your Embulk home directory (absolute path) to install the Embulk plugins and "embulk.properties". |
| 88 | + embulkHome file("/home/user/my-embulk-home") |
| 89 | + |
| 90 | + // Specify the Maven-style Embulk plugin by the "artifact" directive. |
| 91 | + artifact "org.embulk:embulk-input-s3:0.6.0" |
| 92 | + |
| 93 | + // You can specify multiple versions of the same Embulk plugin so that you can choose the version at runtime. |
| 94 | + // You can also specify an artifact with the split-style notation. |
| 95 | + artifact group: "org.embulk", name: "embulk-input-s3", version: "0.5.3" |
| 96 | + |
| 97 | + // Specify this if you need JRuby. |
| 98 | + // It downloads jruby-complete-9.1.15.0.jar, and sets the "jruby" Embulk System Property in "embulk.properties". |
| 99 | + jruby "org.jruby:jruby-complete:9.1.15.0" |
| 100 | + |
| 101 | + // Specify this if you need to set some Embulk System Properties manually. |
| 102 | + // It sets the "key" Embulk System Property to "value" in "embulk.properties". |
| 103 | + embulkSystemProperty "key", "value" |
| 104 | +} |
| 105 | +``` |
| 106 | + |
| 107 | +Then, run `gradle installEmbulkRunSet` (or `./gradlew installEmbulkRunSet` if you use the Gradle wrapper) to set everything up. |
| 108 | + |
| 109 | +``` |
| 110 | +$ ./gradlew installEmbulkRunSet |
| 111 | + |
| 112 | +> Configure project : |
| 113 | +Supplied embulkHome "/home/user/my-embulk-home" does not exist, then will be created. |
| 114 | +Setting to copy org.embulk:embulk-input-s3:0.6.0:jar into org/embulk/embulk-input-s3/0.6.0 |
| 115 | +Setting to copy com.amazonaws:aws-java-sdk-s3:1.11.466:jar into com/amazonaws/aws-java-sdk-s3/1.11.466 |
| 116 | +... |
| 117 | +... |
| 118 | +Setting to copy org.embulk:embulk-input-s3:0.5.3:jar into org/embulk/embulk-input-s3/0.5.3 |
| 119 | +... |
| 120 | +... |
| 121 | +Setting to copy org.embulk:embulk-input-s3:0.5.3:pom into org/embulk/embulk-input-s3/0.5.3 |
| 122 | +... |
| 123 | +... |
| 124 | +Setting to copy org.jruby:jruby-complete:9.1.15.0:jar into org/jruby/jruby-complete/9.1.15.0 |
| 125 | + |
| 126 | +BUILD SUCCESSFUL in 2s |
| 127 | +1 actionable task: 1 executed |
| 128 | +``` |
| 129 | + |
| 130 | +The Embulk System Properties file `embulk.properties` is also generated automatically in the specified Embulk home. |
| 131 | + |
| 132 | +``` |
| 133 | +#Generated by the "org.embulk.embulk-runset" Gradle plugin. |
| 134 | +#Thu Jun 13 16:53:31 JST 2024 |
| 135 | +key=value |
| 136 | +jruby=file\:///home/user/my-embulk-home/lib/m2/repository/org/jruby/jruby-complete/9.1.15.0/jruby-complete-9.1.15.0.jar |
| 137 | +``` |
| 138 | + |
| 139 | +## Run! |
| 140 | + |
| 141 | +With either installation method, you can run Embulk with the installed Maven-style Embulk plugins. |
| 142 | + |
| 143 | +See the example `s3_with_maven.yml` below. |
| 144 | + |
| 145 | +```yaml |
| 146 | +in: |
| 147 | + # The full-style type notation for Maven-style Embulk plugins. |
| 148 | + type: |
| 149 | + source: maven |
| 150 | + group: org.embulk |
| 151 | + name: s3 |
| 152 | + version: 0.6.0 |
| 153 | + bucket: ... |
| 154 | + parser: |
| 155 | + type: csv |
| 156 | + ... |
| 157 | +out: |
| 158 | + type: stdout |
| 159 | +``` |
| 160 | +
| 161 | +Then, run Embulk! |
| 162 | +
| 163 | +``` |
| 164 | +$ java -jar embulk-0.11.4.jar -Xembulk_home=/home/user/my-embulk-home run s3_with_maven.yml |
| 165 | +2024-06-13 17:01:55.373 +0900 [INFO] (main): embulk_home is set from command-line: /home/user/my-embulk-home |
| 166 | +2024-06-13 17:01:55.378 +0900 [INFO] (main): m2_repo is set as a sub directory of embulk_home: /home/user/my-embulk-home/lib/m2/repository |
| 167 | +2024-06-13 17:01:55.378 +0900 [INFO] (main): gem_home is set as a sub directory of embulk_home: /home/user/my-embulk-home/lib/gems |
| 168 | +2024-06-13 17:01:55.378 +0900 [INFO] (main): gem_path is set empty. |
| 169 | +2024-06-13 17:01:55.378 +0900 [DEBUG] (main): Embulk system property "default_guess_plugin" is set to: "gzip,bzip2,json,csv" |
| 170 | +2024-06-13 17:01:55.634 +0900 [INFO] (main): Started Embulk v0.11.4 |
| 171 | +2024-06-13 17:01:55.811 +0900 [INFO] (0001:transaction): Loaded plugin embulk-input-s3 (maven:org.embulk:s3:0.6.0) |
| 172 | +... |
| 173 | +... |
| 174 | +2024-06-13 17:01:55.948 +0900 [INFO] (0001:transaction): Loaded plugin embulk-output-stdout |
| 175 | +... |
| 176 | +... |
| 177 | +2024-06-13 17:01:56.052 +0900 [INFO] (0001:transaction): Loaded plugin embulk-parser-csv |
| 178 | +... |
| 179 | +... |
| 180 | +2024-06-13 17:01:56.691 +0900 [INFO] (0001:transaction): Start listing file with prefix [******] |
| 181 | +2024-06-13 17:01:57.577 +0900 [INFO] (0001:transaction): Found total [1] files |
| 182 | +2024-06-13 17:01:57.721 +0900 [INFO] (0001:transaction): Using local thread executor with max_threads=16 / output tasks 8 = input tasks 1 * 8 |
| 183 | +2024-06-13 17:01:57.759 +0900 [INFO] (0001:transaction): {done: 0 / 1, running: 0} |
| 184 | +... |
| 185 | +... |
| 186 | +1,foo |
| 187 | +2,bar |
| 188 | +3,baz |
| 189 | +2024-06-13 17:01:58.602 +0900 [INFO] (0001:transaction): {done: 1 / 1, running: 0} |
| 190 | +2024-06-13 17:01:58.603 +0900 [INFO] (0001:transaction): Incremental job, setting last_path to [******.csv] |
| 191 | +2024-06-13 17:01:58.618 +0900 [INFO] (0001:transaction): Embulk system property "plugins.output.stdout" is not set. |
| 192 | +2024-06-13 17:01:58.619 +0900 [INFO] (0001:transaction): Embulk system property "plugins.default.output.stdout" is not set. |
| 193 | +2024-06-13 17:01:58.621 +0900 [INFO] (main): Committed. |
| 194 | +2024-06-13 17:01:58.629 +0900 [INFO] (main): Next config diff: {"in":{"last_path":"******.csv"},"out":{}} |
| 195 | +``` |
| 196 | +
| 197 | +We hope these installation methods help you.