Code Generation
As hand-writing code for a lot of drivers in multiple languages would be quite a nightmare, we have invested a very large amount of time into finding a way to automate this.
So in the end we need 3 parts:
- Protocol definition
- Language template
- A Maven plugin which generates the code
This maven plugin uses a given protocol definition as well as a language template and generates code for reading/writing data in that protocol with the given language.
The Types Base module provides all the structures the Protocol modules output, which are then used in the Language templates to generate code.
Protocol Base and Language Base just provide the interfaces that reference these types and provide the API for the plc4x-maven-plugin to use.
These modules are also maintained in a repository which is separate from the rest of the PLC4X code.
This is generally only due to some restrictions in the Maven build system. If you are interested in understanding the reasons, please read the chapter on Problems with Maven near the end of this page.
Concrete protocol spec parsers, code generators as well as templates that actually generate code are implemented in derived modules all located under the code-generation part of the main project repository.
We didn’t want to tie ourselves to only one way to specify protocols and to generate code. Multiple formats for specifying drivers are conceivable and, in the same way, multiple ways of generating code are possible. Currently, however, we only have one parser, MSpec, and one generator, Freemarker.
These add more layers to the hierarchy.
So, for example, generating a Siemens S7 driver for Java would look like this:
The dark blue parts are the ones released externally, the turquoise ones are part of the main PLC4X repo.
Introduction
The Maven plugin has a very modular design.
So in general it is possible to add new forms of providing protocol definitions as well as language templates.
For the formats of specifying a protocol we have tried out numerous tools and frameworks, however the results were never quite satisfying.
Usually, using them required a large number of workarounds, which made the solution quite complicated. This is mainly because tools like Thrift, Avro, gRPC, … are all made for transferring an object structure from A to B. They focus on keeping the structure of the object intact and do not offer ways to control the format used for transferring it.
Existing industry standards, such as ASN.1, unfortunately mostly relied on large portions of text to describe parts of the parsing or serializing logic, which made them pretty much useless for fully automated code generation.
In the end only DFDL and the corresponding Apache project Apache Daffodil seemed to provide what we were looking for.
With this we were able to provide first driver versions fully specified in XML.
The downside was that the PLC4X community regarded this XML format as pretty complicated. In addition, when implementing an experimental code generator we quickly noticed that generating a nice object model would not be possible, because a DFDL schema cannot model inheritance of types.
In the end we came up with our own format, which we called MSpec and which is described in the MSpec Format description.
Configuration
The plc4x-maven-plugin has a very limited set of configuration options.
In general, all you need to specify is the protocolName and the languageName.
An additional option, outputFlavor, allows generating multiple versions of a driver for a given language.
This can come in handy if we want to be able to generate read-only or passive mode driver variants.
In order to be able to refactor and improve protocol specifications without having to update all drivers for a given protocol, we recently added a protocolVersion attribute, which allows us to provide and use multiple versions of one protocol.
So when updating the fictional wombat-protocol, we could add a version 2 mspec for it, then use version 2 in the Java driver and continue to use version 1 in all other languages.
Once all drivers are updated, we could eliminate the version again.
Last but not least, we have a pretty generic options config option, which is a Map type.
With options it is possible to pass generic options to the code generation. So if a driver or language requires further customization, these options can be used. For a list of all supported options for a given language template, please refer to the corresponding language page.
Currently, the Java module makes use of such an option for specifying the Java package the generated code uses.
If no package option is provided, the default package org.apache.plc4x.{language-name}.{protocol-name}.{output-flavor} is used, but especially when generating custom drivers, which are not part of the Apache PLC4X project, different package names are better suited.
So in these cases, the user can simply override the default package name.
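The fallback described above can be sketched in a few lines of Java. This is only an illustration: the class PackageNameSketch and its resolvePackage method are hypothetical and not part of PLC4X. Dropping hyphens (so that a flavor such as read-write becomes a legal package segment) is an assumption about how the placeholders are filled in.

```java
import java.util.Map;

/** Hypothetical helper, not part of PLC4X: illustrates the package-name fallback. */
final class PackageNameSketch {

    static String resolvePackage(String languageName, String protocolName,
                                 String outputFlavor, Map<String, String> options) {
        // An explicitly configured "package" option always wins over the default.
        String configured = options.get("package");
        if (configured != null && !configured.isBlank()) {
            return configured;
        }
        // Default: org.apache.plc4x.{language-name}.{protocol-name}.{output-flavor}
        return String.join(".", "org.apache.plc4x",
                sanitize(languageName), sanitize(protocolName), sanitize(outputFlavor));
    }

    // Assumption: hyphens are dropped, because "read-write" is not a legal Java package segment.
    private static String sanitize(String part) {
        return part.replace("-", "");
    }

    public static void main(String[] args) {
        // prints "org.apache.plc4x.java.s7.readwrite"
        System.out.println(resolvePackage("java", "s7", "read-write", Map.of()));
        // prints "com.example.s7"
        System.out.println(resolvePackage("java", "s7", "read-write",
                Map.of("package", "com.example.s7")));
    }
}
```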
There is also an additional parameter, outputDir, which defaults to ${project.build.directory}/generated-sources/plc4x/. It normally doesn’t need to be changed for a Java project, but usually requires tweaking when generating code for other languages.
Here’s an example of a driver pom for building an S7 driver for Java:
<?xml version="1.0" encoding="UTF-8"?>
<!--
  Licensed to the Apache Software Foundation (ASF) under one
  or more contributor license agreements. See the NOTICE file
  distributed with this work for additional information
  regarding copyright ownership. The ASF licenses this file
  to you under the Apache License, Version 2.0 (the
  "License"); you may not use this file except in compliance
  with the License. You may obtain a copy of the License at

      https://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing,
  software distributed under the License is distributed on an
  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
  KIND, either express or implied. See the License for the
  specific language governing permissions and limitations
  under the License.
-->
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">

  <modelVersion>4.0.0</modelVersion>

  <parent>
    <groupId>org.apache.plc4x.plugins</groupId>
    <artifactId>plc4x-code-generation</artifactId>
    <version>0.13.0-SNAPSHOT</version>
  </parent>

  <artifactId>test-java-s7-driver</artifactId>

  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.plc4x.plugins</groupId>
        <artifactId>plc4x-maven-plugin</artifactId>
        <executions>
          <execution>
            <id>test</id>
            <phase>generate-sources</phase>
            <goals>
              <goal>generate-driver</goal>
            </goals>
            <configuration>
              <protocolName>s7</protocolName>
              <languageName>java</languageName>
              <outputFlavor>read-write</outputFlavor>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>

  <dependencies>
    <dependency>
      <groupId>org.apache.plc4x.plugins</groupId>
      <artifactId>plc4x-code-generation-driver-base-java</artifactId>
      <version>0.13.0-SNAPSHOT</version>
    </dependency>

    <dependency>
      <groupId>org.apache.plc4x.plugins</groupId>
      <artifactId>plc4x-code-generation-language-java</artifactId>
      <version>0.13.0-SNAPSHOT</version>
      <!-- Scope is 'provided' as this way it's not shipped with the driver -->
      <scope>provided</scope>
    </dependency>

    <dependency>
      <groupId>org.apache.plc4x.plugins</groupId>
      <artifactId>plc4x-code-generation-protocol-s7</artifactId>
      <version>0.13.0-SNAPSHOT</version>
      <!-- Scope is 'provided' as this way it's not shipped with the driver -->
      <scope>provided</scope>
    </dependency>
  </dependencies>

</project>
So the plugin configuration is pretty straightforward: all that is specified is the protocolName, the languageName and the outputFlavor.
The dependency:
<dependency>
  <groupId>org.apache.plc4x.plugins</groupId>
  <artifactId>plc4x-code-generation-driver-base-java</artifactId>
  <version>0.13.0-SNAPSHOT</version>
</dependency>
contains, for example, all classes the generated code relies on.
The definitions of both the s7 protocol and the java language are provided by the two dependencies:
<dependency>
  <groupId>org.apache.plc4x.plugins</groupId>
  <artifactId>plc4x-code-generation-language-java</artifactId>
  <version>0.13.0-SNAPSHOT</version>
  <!-- Scope is 'provided' as this way it's not shipped with the driver -->
  <scope>provided</scope>
</dependency>
and:
<dependency>
  <groupId>org.apache.plc4x.plugins</groupId>
  <artifactId>plc4x-code-generation-protocol-s7</artifactId>
  <version>0.13.0-SNAPSHOT</version>
  <!-- Scope is 'provided' as this way it's not shipped with the driver -->
  <scope>provided</scope>
</dependency>
The reason why these are added as regular module dependencies rather than plugin dependencies, and why the scope is set the way it is, is described in the section Why are the protocol and language dependencies done so strangely?
Custom Modules
The plugin uses the Java ServiceLoader mechanism to find modules.
Protocol Modules
In order to provide a new protocol module, all that is required is to create a module containing a META-INF/services/org.apache.plc4x.plugins.codegenerator.protocol.Protocol file referencing an implementation of the org.apache.plc4x.plugins.codegenerator.protocol.Protocol interface.
This interface is located in the org.apache.plc4x.plugins:plc4x-code-generation-protocol-base module and generally only defines three methods:
package org.apache.plc4x.plugins.codegenerator.protocol;

import org.apache.plc4x.plugins.codegenerator.types.exceptions.GenerationException;

import java.util.Optional;

public interface Protocol {

    /**
     * The name of the protocol is what the plugin will use to select the correct protocol module.
     *
     * @return the name of the protocol.
     */
    String getName();

    /**
     * Returns a map of type definitions for which code has to be generated.
     *
     * @return the Map of types that need to be generated.
     * @throws GenerationException if anything goes wrong parsing.
     */
    TypeContext getTypeContext() throws GenerationException;

    /**
     * @return the protocolVersion, if applicable
     */
    default Optional<String> getVersion() {
        return Optional.empty();
    }

}
The name is used by the plugin to find the right protocol module, so the result of getName() needs to match the value provided in the Maven config option protocolName.
As mentioned before, we support multiple versions of a protocol, so if getVersion() returns a non-empty version, this is used to select the version.
The most important method for the actual code generation, however, is the getTypeContext() method, which returns a TypeContext that generally contains a list of all parsed types for the given protocol.
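How the name and version could be used to pick one module out of several candidates is sketched below. This is not the plugin's actual code: NamedProtocol is a stand-in that mirrors only getName() and getVersion() of the Protocol interface, and ProtocolSelectionSketch, protocol() and select() are hypothetical names. The rule that a missing protocolVersion matches only modules without a version is likewise an assumption. In a real setup the candidates would come from a ServiceLoader, which is also an Iterable.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch of selecting a protocol module by name and optional version. */
final class ProtocolSelectionSketch {

    /** Stand-in mirroring the two lookup-relevant methods of the Protocol interface. */
    interface NamedProtocol {
        String getName();

        default Optional<String> getVersion() {
            return Optional.empty();
        }
    }

    /** Creates a stand-in module; a null version means "no version reported". */
    static NamedProtocol protocol(String name, String version) {
        return new NamedProtocol() {
            @Override
            public String getName() {
                return name;
            }

            @Override
            public Optional<String> getVersion() {
                return Optional.ofNullable(version);
            }
        };
    }

    /** Returns the first candidate whose name and version both match the request. */
    static Optional<NamedProtocol> select(Iterable<? extends NamedProtocol> candidates,
                                          String protocolName, String protocolVersion) {
        // Assumption: an unset protocolVersion matches only modules that report no version.
        Optional<String> requested = Optional.ofNullable(protocolVersion);
        for (NamedProtocol candidate : candidates) {
            if (candidate.getName().equals(protocolName)
                    && candidate.getVersion().equals(requested)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<NamedProtocol> modules = List.of(
                protocol("wombat-protocol", null),
                protocol("wombat-protocol", "2"));
        System.out.println(select(modules, "wombat-protocol", "2").isPresent()); // prints "true"
        System.out.println(select(modules, "s7", null).isPresent());             // prints "false"
    }
}
```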
Language Modules
Language modules are constructed very similarly to protocol modules.
The LanguageOutput interface is very simple too. It is located in the org.apache.plc4x.plugins:plc4x-code-generation-language-base module and generally only defines four methods:
package org.apache.plc4x.plugins.codegenerator.language;

import org.apache.plc4x.plugins.codegenerator.types.definitions.TypeDefinition;
import org.apache.plc4x.plugins.codegenerator.types.exceptions.GenerationException;

import java.io.File;
import java.util.List;
import java.util.Map;
import java.util.Set;

public interface LanguageOutput {

    /**
     * The name of the template is what the plugin will use to select the correct language module.
     *
     * @return the name of the template.
     */
    String getName();

    List<String> supportedOutputFlavors();

    /**
     * An additional method which allows generator to have a hint which options are supported by it.
     * This method might be used to improve user experience and warn, if set options are ones generator does not support.
     *
     * @return Set containing names of options this language output can accept.
     */
    Set<String> supportedOptions();

    void generate(File outputDir, String version, String languageName, String protocolName, String outputFlavor,
                  Map<String, TypeDefinition> types, Map<String, String> options)
        throws GenerationException;

}
The file for registering Language modules is located at: META-INF/services/org.apache.plc4x.plugins.codegenerator.language.LanguageOutput
The name is used by the plugin to find the language output module selected by the Maven config option languageName.
supportedOutputFlavors provides a list of possible flavors that can be referred to by the Maven config option outputFlavor.
supportedOptions provides a list of options that the current language module is able to use and which can be passed in to the Maven configuration using the options setting.
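A caller could use these two methods to check a configuration before generating code. The following is a minimal sketch under stated assumptions, not the plugin's implementation: LanguageConfigCheckSketch and check() are hypothetical, the flavor name passive is made up for the example, and whether a problem leads to a warning or a build failure is left open.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: checks a configuration against what a language module reports. */
final class LanguageConfigCheckSketch {

    /**
     * @param supportedFlavors result of supportedOutputFlavors()
     * @param supportedOptions result of supportedOptions()
     * @param outputFlavor     value of the outputFlavor config option
     * @param options          content of the options config option
     * @return human-readable problems; empty if the configuration looks fine
     */
    static List<String> check(List<String> supportedFlavors, Set<String> supportedOptions,
                              String outputFlavor, Map<String, String> options) {
        List<String> problems = new ArrayList<>();
        if (!supportedFlavors.contains(outputFlavor)) {
            problems.add("unsupported outputFlavor: " + outputFlavor);
        }
        // Sort the keys so that the reported order is deterministic.
        for (String key : new TreeSet<>(options.keySet())) {
            if (!supportedOptions.contains(key)) {
                problems.add("unknown option: " + key);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        List<String> flavors = List.of("read-write", "passive"); // "passive" is a made-up flavor
        Set<String> knownOptions = Set.of("package");
        // prints "[]"
        System.out.println(check(flavors, knownOptions, "read-write", Map.of("package", "com.example")));
        // prints "[unsupported outputFlavor: read-only, unknown option: pkg]"
        System.out.println(check(flavors, knownOptions, "read-only", Map.of("pkg", "com.example")));
    }
}
```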
Problems with Maven
Why are the 4 modules released separately?
We mentioned in the introduction, that the first 4 modules are maintained and released from outside the main PLC4X repository.
This is due to some restrictions in Maven, which result from the way Maven generally works.
The main problem is that when starting a build, in the validate phase, Maven goes through the configuration, downloads the plugins and configures them.
This means that Maven also tries to download the dependencies of the plugins.
If a project uses a Maven plugin that it also builds itself, this is guaranteed to fail, especially during releases.
During normal development, Maven will probably just download the latest SNAPSHOT from our Maven repository and will not complain, even though this version is overwritten later in the build.
It will just use the new version as soon as it has to.
During releases, however, the release plugin changes the version to a release version and then spawns a build. In this case the build will fail, because there is no plugin with that version to download from anywhere. The only option would then be to manually build and deploy the plugin in the release version and to restart the release, which is not a nice thing for the release manager.
For this reason we have stripped down the plugin and its dependencies to an absolute minimum and have released that separately from the rest, hoping that with so few dependencies we will not have to release it very often.
As soon as the tooling is released, the version is updated in the PLC4X build and the release version is used without any complications.
Why are the protocol and language dependencies done so strangely?
It would certainly be a lot cleaner, if we provided the dependencies to protocol and language modules as plugin dependencies.
However, as we mentioned in the previous subchapter, Maven tries to download and configure the plugins prior to running the build. During a release the new versions of the modules wouldn’t exist yet, which would cause the build to fail.
We could release the protocol- and the language modules separately too, but we want the language and protocol modules to be part of the project, to not over-complicate things - especially during a release.
In order to keep the build and the release as simple as possible, we built the Maven plugin in such a way that it uses the module’s dependencies and creates its own ClassLoader to contain all of these modules at runtime.
This brings the benefit of being able to utilize Maven’s capability of determining the build order and dynamically creating the module’s build classpath.
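The general technique (build a class loader from a set of artifact files, then run the ServiceLoader lookup against it) can be sketched with the standard library alone. This is a simplified illustration and not the plugin's code: ModuleClassLoaderSketch, DemoService and the jar file names are hypothetical. In the plugin the artifact files would be the module's resolved dependencies.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

/** Hypothetical sketch: discover ServiceLoader providers in a dedicated class loader. */
final class ModuleClassLoaderSketch {

    /** Placeholder service type standing in for Protocol or LanguageOutput. */
    interface DemoService {
    }

    /** Builds a class loader that sees the given artifacts in addition to the parent. */
    static URLClassLoader classLoaderFor(List<File> artifacts, ClassLoader parent) {
        URL[] urls = new URL[artifacts.size()];
        for (int i = 0; i < urls.length; i++) {
            try {
                urls[i] = artifacts.get(i).toURI().toURL();
            } catch (MalformedURLException e) {
                throw new IllegalArgumentException("Not a usable artifact: " + artifacts.get(i), e);
            }
        }
        return new URLClassLoader(urls, parent);
    }

    /** Collects all providers of the given service type visible to the class loader. */
    static <T> List<T> loadServices(Class<T> serviceType, ClassLoader loader) {
        List<T> found = new ArrayList<>();
        for (T service : ServiceLoader.load(serviceType, loader)) {
            found.add(service);
        }
        return found;
    }

    public static void main(String[] args) {
        // Hypothetical file names; jars that do not exist simply contribute no providers.
        List<File> artifacts = List.of(new File("protocol-s7.jar"), new File("language-java.jar"));
        URLClassLoader loader = classLoaderFor(artifacts, ModuleClassLoaderSketch.class.getClassLoader());
        System.out.println(loadServices(DemoService.class, loader).size()); // prints "0"
    }
}
```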
Adding a normal dependency however would make Maven deploy the artifacts with the rest of the modules.
We don’t want that as both the protocol as well as the language-modules are useless as soon as they have been used to generate the code.
So we use a trick that is commonly used in web applications.
There, the vendor of a Servlet engine is expected to provide an implementation of the Servlet API.
An application is forbidden from bringing this along, but it is required to build the application.
For this, Maven offers the scope provided, which tells Maven to make a dependency available during the build, but to exclude it from any applications it builds, because it will be provided by the system running the application.
This is not quite true, but it does the trick.