Device tree kernel summit 2017 notes julia

Notes of the 2017 Devicetree Workshop in Prague on 26 Oct 2017
by Julia Lawall [comments by Frank Rowand, October 31]

Talk 1: YAML suggestion for schemas
Observation: YAML format is similar to DTS. YAML is a superset of JSON, so a YAML parser can parse JSON. YAML is more friendly for humans.

Gives the option of using indentation to indicate subproperties. Braces can be used as in JSON.

YAML allows references. Parser inlines labels. Thus DT breaks YAML parsers, because it expects real labels and anchors.

Hex is supported in YAML, even though it does not work in JSON. In the proposed format it is possible to specify that something should represent a sequence of bytes.

XML was not considered because it is not regarded as human readable and is too verbose.

YAML doesn't define a number format. Bignums are possible.

There is not an intent to eliminate the use of the C preprocessor. It will remain in place as is.

There is a tool for converting from DTS to YAML; this will replace dtc and will integrate the C preprocessor. Validation will be done on the YAML file. There will be a need for a set of schema files against which to validate the YAML.

[Frank: One or more tools will be needed to convert DTS to YAML. Pantelis has implemented a prototype tool 'yamldt' to do this task. Whether the input to the tool is unprocessed DTS, the output from cpp, or the output of the dtc compiler is an implementation detail. Currently, the output of the dtc compiler is not suitable as input to the conversion tool because it does not preserve the location (line numbers and columns) of the original source.]

Why does YAML have to be human readable? We only need to read the schema files. In the proposed tool chain it is not important. But the number encoding is better and YAML is more extensible. Human readable is good if in the future there is a desire to change the source file format.

There are tools to convert human readable YAML to machine readable YAML, i.e. with {}.

There are many ways to encode DTS into YAML. There is a need to constrain this.

There are currently efforts to express schemas in either JSON or YAML.


Include is available in YAML via the C preprocessor. It does not exist in YAML itself.

There is a concern about diff, which is fine for YAML. There is a need to consider versioning. JSON is whitespace insensitive, so a whitespace-insensitive diff could be useful. Currently diffing is hard on DTS files, due to includes.

[ Frank: scripts/dtc/dtx_diff was created to diff DTS files. Use cases, advantages, and things to be aware of are described in [[Media:Dt_debugging_elce_2015_151006_0421.pdf | "Solving Device Tree Issues"]] (updated), ELCE October 2015 by Frank Rowand (PDF). dtx_diff is referred to as "dtdiff" in this presentation. ]

YAML has # as a native comment format. Comments are preserved in some parsers. The use of # for comments is a bit unfortunate, because many device tree property names begin with #. The solution is to put the name in "", but it does not look very nice. One could add quotes on all property names, to make the code look more consistent.
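
The quoting rule can be sketched as follows. This is only an illustration in Python, not code shown at the workshop; the function name yaml_key and the sample values are hypothetical, and only the leading # case is handled.

```python
def yaml_key(name, quote_all=False):
    """Return a devicetree property name as a YAML mapping key.

    A name starting with '#' is quoted, because an unquoted leading '#'
    would start a YAML comment. With quote_all=True every name is
    quoted, which gives a more uniform look.
    """
    if quote_all or name.startswith("#"):
        return '"' + name + '"'
    return name


# Hypothetical properties, used only to show the effect of quoting.
props = [("compatible", '"vendor,device"'),
         ("#address-cells", "1"),
         ("#size-cells", "0")]
print("\n".join(yaml_key(name) + ": " + value for name, value in props))
```

Passing quote_all=True corresponds to the suggestion of quoting all property names for consistency.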

Are the generated DTBs different? It is possible to generate bit-compatible output. Possible differences derive from the optimization model, not from the use of YAML specifically.

If we go with YAML, we have to be compatible with existing YAML tools.

In the future, we will expect people to create YAML files, and at least to understand the format of the files, which is necessary to be able to create schema files.

[Frank: Device tree source files and schema files are independent objects, with different formats. There is currently no plan to create YAML device tree source files. One of the options being considered for the format of schema files is YAML. Schema files could thus be hand created as YAML.]

Process question: Do the validation tools want the source format or the deduplicated format? Normally the deduplicated format.

There is a question about whether one could convert DTB to YAML. This seems not possible, due to the need to reconstruct labels. Then the question is whether one can validate binaries. DTB -> decode -> validate.

[Frank: it was suggested that a tool could be created to convert from DTB to DTS. Nobody volunteered to create the tool. Label reconstruction is not an issue if the dtc "-@" option was used to create the DTB, because in that case the label information is stored in the symbols area of the DTB. The resulting DTS could then be converted to YAML in the same manner as any other DTS.

The issue with validating against a DTB instead of the original source is that the source location information is not available from the DTB. ]

Lists are indicated with [] and can be nested.

There was a suggestion to solve the label problem with !phandle, as is done for paths.

Moving on to schema

Constraints can be arbitrary C code, as acceptable to BPF.

How to express "if this exists then some other property must exist"? Have category property to indicate required or optional. This is apparently possible.
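
One way such a conditional constraint could be checked is sketched below. This is an assumption about the shape of the rule, not the mechanism proposed in the talk; the function name check_dependencies and the dmas/dma-names pairing are used only as an illustration.

```python
def check_dependencies(node, dependencies):
    """Return error strings for unmet "if A exists then B must exist" rules.

    node: dict mapping property names to values.
    dependencies: dict mapping a property name to the list of properties
    that must also be present whenever that property is present.
    """
    errors = []
    for prop, required in dependencies.items():
        if prop not in node:
            continue  # the rule only applies when the property exists
        for other in required:
            if other not in node:
                errors.append("'%s' requires '%s'" % (prop, other))
    return errors


# Illustrative rule: a node that has "dmas" must also have "dma-names".
deps = {"dmas": ["dma-names"]}
print(check_dependencies({"dmas": [1, 2]}, deps))
print(check_dependencies({"dmas": [1, 2], "dma-names": ["tx"]}, deps))
```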

From a binding, we want to be able to generate documentation and to verify that a node conforms to the binding.

--

Grant Likely presents his JSON schema-based schema proposal.

Maybe YAML DTS files can be called DTY files.

JSON schema defines a vocabulary. Looks for property names and connects them to the names in the DTS document.

JSON schema could be written in YAML, i.e. without braces.

Goal to constrain what properties can appear. Can use "allOf" to constrain everything. Property names are expressed with regular expressions. Can use oneOf to select between options.
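
A minimal sketch of these keywords, written in Python with only the standard library. It implements a small subset of JSON schema semantics (required, patternProperties combined with additionalProperties false, allOf, oneOf) and is not the validator shown in the talk. The schema itself is hypothetical. Property values are not checked, and keywords the function does not know are ignored.

```python
import re


def validate(node, schema):
    """Return True if node (a dict of properties) satisfies schema.

    Only a subset of JSON schema is handled; any other keyword in the
    schema is silently ignored.
    """
    for name in schema.get("required", []):
        if name not in node:
            return False
    patterns = schema.get("patternProperties")
    if patterns is not None and schema.get("additionalProperties") is False:
        # Every property name must match at least one regular expression.
        for name in node:
            if not any(re.search(p, name) for p in patterns):
                return False
    if not all(validate(node, s) for s in schema.get("allOf", [])):
        return False
    if "oneOf" in schema:
        # Exactly one of the alternatives must hold.
        if sum(validate(node, s) for s in schema["oneOf"]) != 1:
            return False
    return True


# Hypothetical binding: "compatible" is mandatory, only the listed names
# may appear, and exactly one of the two interrupt properties is present.
schema = {
    "allOf": [{"required": ["compatible"]}],
    "patternProperties": {
        "^compatible$": {},
        "^reg$": {},
        "^interrupts(-extended)?$": {},
    },
    "additionalProperties": False,
    "oneOf": [
        {"required": ["interrupts"]},
        {"required": ["interrupts-extended"]},
    ],
}

print(validate({"compatible": "vendor,device", "reg": 0, "interrupts": 5}, schema))
```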

We would have a master schema file that describes what a schema is allowed to contain. Schema developer would do something much more concrete.

There needs to be some kind of coverage record for the validation. When we run the validation, do we actually validate all of the properties? A record is needed of this.
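
A coverage record could be as simple as a list of the properties that no schema rule looked at. The sketch below assumes that schema rules are regular expressions on property names; the function name coverage and the sample node are hypothetical.

```python
import re


def coverage(node, patterns):
    """Split the property names of node into (checked, unchecked).

    A name counts as checked if at least one pattern matches it.
    """
    checked, unchecked = [], []
    for name in sorted(node):
        if any(re.search(p, name) for p in patterns):
            checked.append(name)
        else:
            unchecked.append(name)
    return checked, unchecked


node = {"compatible": "vendor,device", "reg": 0, "vendor,tuning": 3}
checked, unchecked = coverage(node, ["^compatible$", "^reg$"])
print("checked:", checked)
print("not validated:", unchecked)
```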

If the validator doesn't know a keyword, it ignores it.

Currently, the error messages are terrible. This is an implementation issue and not a fundamental problem. Context is provided for failures. Currently, ruamel, which preserves line numbers, is used.

There is a concern about adding more dependencies for the kernel. Python, ruamel, etc. These are not required build dependencies. They are required development dependencies. Verification can be done during the build, but is not necessary.

[Frank: I am strongly aligned with Rob Landley on the idea of not adding dependencies to the kernel build process. Making the verification optional in the kernel build system should resolve this concern. On the other hand, the policy and expectations should be that verification is the default in the kernel build system and is the expected process. ]

The master is used to validate all DTY files. The specific bindings then only have to make their specific constraints.

What triggers validation? E.g. an spi bus triggers validation for all of its children. Currently it is not clear how this is specified. Perhaps it comes from compatible annotations. Then there is a question of how to express nodes that are for devices that support both spi and i2c. Use oneOf to express this.

---

Device tree in the Zephyr RTOS (Kumar Gala)
Originally, lots of hardware configuration was exposed in Kconfig, but this was totally useless and confusing to the user. More simple devices are relevant than in the Linux case. These are highly resource-constrained systems, so storing the DTB, then parsing it, etc. is not feasible. The proposal is code generation based on .dts.

Already using YAML. Generate #defines from YAML descriptions. The defines don't have very nice names for practical use - very long. Also concerns about uniqueness. Fixup files used to normalize names.
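
The naming problem can be seen in a small sketch of #define generation. This is not Zephyr's actual generator or naming scheme; the functions define_name and emit_defines and the sample node are hypothetical.

```python
import re


def define_name(node_path, prop):
    """Build a C macro name from a node path and a property name.

    Every run of non-alphanumeric characters becomes one underscore and
    the result is upper-cased.
    """
    raw = node_path.strip("/") + "_" + prop
    return re.sub(r"[^A-Za-z0-9]+", "_", raw).upper()


def emit_defines(nodes):
    """nodes: dict mapping node path to a dict of integer properties."""
    lines = []
    for path in sorted(nodes):
        for prop in sorted(nodes[path]):
            lines.append("#define %s %d" % (define_name(path, prop),
                                            nodes[path][prop]))
    return lines


nodes = {"/soc/uart@40002000": {"current-speed": 115200}}
print("\n".join(emit_defines(nodes)))
```

Even this one-property node yields SOC_UART_40002000_CURRENT_SPEED, and two different paths can collapse to the same macro name once punctuation is replaced, which is the uniqueness concern noted above.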

Xilinx does something similar. Finds it to be a mess. C code that uses the fixup files gets too complicated. Variable names correspond to hardware descriptions, not programming concepts. Lots of duplication of code. Again, the problem is that there is not enough memory to run the device tree parser.

[Frank: the memory issue on the target system includes both the device tree data and the code to parse the device tree.]

How could the C structure be generated? Problem that it includes driver implementation data. May need to separate things, to remove the driver-specific aspects.

--

U-Boot SPL (Simon Glass)
SPL = secondary program loader

Constrained memory. fdtgrep transforms a binary device tree to drop nodes that are not needed. Includes only nodes that have a particular property, excludes nodes that have other properties. Uses libfdt to parse the binary.
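
The include rule can be sketched on an in-memory tree. fdtgrep itself works on the binary through libfdt; the version below uses nested Python dicts instead, keeps the ancestors of every retained node so the result is still a tree, and leaves out the exclude rule. The function name prune, the tree layout and the marker property "u-boot,dm-pre-reloc" are assumptions made for the example.

```python
def prune(node, keep_prop):
    """Return a copy of node reduced to the nodes that carry keep_prop.

    Ancestors of a kept node are retained. Returns None if neither the
    node nor any of its descendants has the property.
    """
    children = {}
    for name, child in node.get("children", {}).items():
        kept = prune(child, keep_prop)
        if kept is not None:
            children[name] = kept
    if keep_prop not in node.get("props", {}) and not children:
        return None
    return {"props": dict(node.get("props", {})), "children": children}


MARK = "u-boot,dm-pre-reloc"  # assumed marker property for this example
tree = {
    "props": {},
    "children": {
        "serial": {"props": {MARK: True}, "children": {}},
        "ethernet": {"props": {}, "children": {}},
        "soc": {
            "props": {},
            "children": {
                "timer": {"props": {MARK: True}, "children": {}},
                "gpio": {"props": {}, "children": {}},
            },
        },
    },
}

small = prune(tree, MARK)
print(sorted(small["children"]))
print(sorted(small["children"]["soc"]["children"]))
```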

dtoc takes a .dtb file and generates C structures. Generates C code and header file.

phandle produces the address of another structure followed by parameters.

Multiple compatible strings results in #define to share the structure but have available the relevant names.
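
A sketch of how a structure and the sharing #define might be produced from one node. This is not dtoc's code: the function names, the dtd_ prefix, the fixed C types and the sample compatible strings are all assumptions, and phandles and arrays are left out.

```python
import re


def c_ident(name):
    """Turn a devicetree name into a valid C identifier fragment."""
    return re.sub(r"[^A-Za-z0-9]", "_", name)


def struct_for(compatibles, props):
    """Generate C source for one node.

    compatibles: list of compatible strings; the first names the struct.
    props: dict mapping property name to a C type string.
    Each further compatible string becomes a #define that aliases the
    struct name, so drivers can use whichever name is relevant.
    """
    primary = "dtd_" + c_ident(compatibles[0])
    lines = ["struct %s {" % primary]
    for name in sorted(props):
        lines.append("\t%s %s;" % (props[name], c_ident(name)))
    lines.append("};")
    for alias in compatibles[1:]:
        lines.append("#define dtd_%s %s" % (c_ident(alias), primary))
    return "\n".join(lines)


out = struct_for(["vendor,uart-v2", "vendor,uart"],
                 {"clock-frequency": "unsigned int", "reg": "unsigned int"})
print(out)
```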

Have to extract all the information as needed in the probe function. of_data_to_platdata is too expensive, so have to write the corresponding needed statements by hand.

[Frank: this paragraph from David Gibson:] dtb format designed to be compact enough for servers from 2005. Some aspects could be more compact. dtb contains version information, so it is possible to change the format.

If supporting many boards, the DT code swamps the U-Boot code.

Needs for C generation: What are good names for the type and the structure name?

Should filtering be done on the command line, or should there be some data format for describing what should be kept and what should be discarded?

Would prefer not to require the user to make yet another specification file.

How to generate unique names.

Seem to need some kind of schema binding in order to generate good C code. The schema could provide nicknames.

One should talk to Simon and Kumar if interested in this code generation issue for small systems.

--

Criteria for acceptance of board files
Should all board files be upstream? We hope so, but perhaps not everyone will want to offer them. Would we reject something? Nothing that was submitted has merited complaints. There is a concern about receiving too many files for hardware that comes in many variants. Could make a directory with all the boards from a single vendor. Should not have a directory with a single file.

--

Thomas

Problem with code duplication in node definitions. Problems with some constants and then with label references. Could use the C preprocessor. Macros would be like 1000 lines of backslashes. Could we use relative references?

Should we have a nice syntax (template language) and wait for someone to implement it, or use macros that look ugly? Some sense that the macro solution is not that bad.

Should we use scripting (perl?) to generate dts files? The result would perhaps be more readable than the C preprocessor.

See whether relative references can be added to dtc. This seems like it should be straightforward.