Difference between revisions of "Buildroot:ReproducibleBuilds"

From eLinux.org

Revision as of 08:08, 13 August 2019

As part of a Google Summer of Code project, Atharva Lele works on reproducible builds.

Away time

Arnout is away on: 10/6; July 6-28; 15-19/8. Atharva cannot work 31/7 - 2/8: Focusing on GRE (US Masters Admissions Examination)

Meetings

Weekly meetings on appear.in/buildroot every Tuesday at 14:30 UTC.

  • 2019-W33
    • Atharva just started work on categorisation
      • Time: too difficult, it's hard to identify in the diff output that this is a time
      • Build path: same issue, really. Because the diff is on a hexdump, the path is spread over multiple lines. Could be handled by adding a 'strings' tool to diffoscope. Requires patching diffoscope itself.
        • It would also be nice if diffoscope could avoid showing a difference again if it was already shown by another tool. E.g. if strings shows some difference, the same difference doesn't need to be shown with readelf. Very difficult to implement, probably. We could, however, in our post-processing keep only the first tool that identifies differences for a specific file.
      • Build ID: Easy to categorise, so let's go for it. Normally should be handled by toolchain-wrapper so we won't see this, but it can be useful anyway if it creeps back in at some point.
    • Refactoring of diffoscope took some time but is now sent to the list.
      • Distinguish between filesystem differences and package differences
      • If there is a filesystem difference, the reason will say "filesystem difference"; otherwise, the reason will be the first package with a difference.
    • Patchelf issue (XXXX with different length) is solved now because we use same-length output directory
    • Build-ID issue can be solved by explicitly specifying build-ID in toolchain-wrapper
      • Atharva will work on this
    • Now some package-specific problems show up
      • openjpeg encodes the build path. This needs to be solved per package.
      • softether has seemingly random differences
      • Atharva can look at these package-specific issues once the categorisation has been started.
  • 2019-W32
    • Atharva did work on the buildid issue
      • -Wl,--build-id=none should be part of the wrapper
      • When passing it in TARGET_CFLAGS, it doesn't solve everything (probably because some packages ignore TARGET_CFLAGS)
    • Atharva updated the autobuild-run script to use same length paths to avoid the patchelf issue.
    • Atharva still needs to update the wiki with the list of patches (for GSoC admin)
    • Atharva will send the tooling branch to the list.
      • Nothing done yet for categorisation.
    • Atharva to review Thomas's patches on genrandconfig
    • Atharva will start work on categorisation. This is an additional json key in the details.
      • Compare number of XXX to identify the output dir issue
      • Compare paths for output dir issue
      • Yann's idea was to identify differences in the date, but that may be difficult to do in general.
    • JSON output of diffoscope is not very human-readable (even after passing through jq). At the BR hackathon it was suggested to run diffoscope again with the normal output, but limit that output with the --max-text-report-size option to e.g. 40KB.
  • 2019-W31
    • Yann did not attend
    • Atharva gives an overview of what was done in the last couple of weeks
    • diffoscope
      • Generated post-processed JSON with limited added/removed lines looks good
      • For a big difference, the uncompressed size is 1MB and compressed 136K (which makes the complete tarball 250K)
      • It could still be possible to reduce the size further, e.g. limiting the number of hunks per file, but that would be difficult (to make sure we still have relevant info).
      • Still needs to be tested a bit more - there are still errors on some diffoscope output.
      • Unexpected diffoscope output should be handled gracefully (i.e. pass unchanged).
      • "not-reproducible" part should not be in the JSON, but should be added to the package name when creating the reason string
      • the same file can be part of several packages, this should be supported
        • but that can be done later
      • the way unified_diff is split into added/removed lines is not entirely OK, because the offsets and the context are forgotten
        • still OK-ish because usually the information is good enough
        • so can be fixed later
      • Post the current state to the list ASAP to allow review, then focus on fixing the issues found until now
    • Different output directories
      • this gives *big* differences in the tarball
      • One issue is the build-id that is added at link time
        • probably caused by the paths that are added in RPATH or in debug info or somewhere, and that gets stripped off later
        • can be solved by passing -Wl,--build-id=none
      • Another issue is that the directory still seems to be in there, but replaced by XXXX, so different directory length leads to different output binary (currently we use "output" and "output-1")
        • workaround: use "output-0" and "output-1" or something else which is the same length
        • ask advice on #reproducible-builds
  • 2019-W30
    • diffoscope
      • can do size limit, either or both per-file and globally on the report
      • HTML output is "nice" for viewing as a human
      • still we want JSON as it is easier to parse from a script
      • can do both at the same time!
      • we still need a way to limit the diff ourselves
    • categorisation
      • categorise deltas into a set of categories
      • failure reports will contain things like:
        • "this file is not reproducible; probable reason is an embedded path; try to use compiler option -foo or set environment variable BAR=bla"
        • "this file is not reproducible; probable reason is an embedded time; try to see if upstream uses SOURCE_DATE_EPOCH, or set compiler option -foo"
          • can pass diff string to 'date' to check if it's a time issue
      • this is now the most important topic
    • Atharva should maintain a list of current patchwork status here on the wiki page; this will be useful for the official GSoC report
    • Atharva has GRE examinations on Fri Aug 2. He should focus on that; just post the current status of the patches on the list for feedback (with a note that it's RFC).
      • To compensate, the work can continue until August 26.
  • 2019-W29
    • No meeting, mentors not available
    • Discussions on IRC instead, to compensate
  • 2019-W28
    • Yann has diffoscope installed, but diffoscope does not work in the Autobuilder script.
      • Not a problem with the script, but with the machine.
      • Yann will investigate and fix.
    • GCC Compile Farm Account:
      • Is fast and can be used for large builds.
      • Yann, Arnout, Thomas should be added as references while requesting an account.
      • Should be requested as soon as possible since it takes time to accept the request.
    • Atharva will run builds manually on random configs to find how diffoscope output varies with different kinds of reproducibility issues.
      • Use small and popular libraries: zlib, libpng, openssl, libressl.
      • Vary things like time, path and check if non-reproducibility is introduced.
      • Check what part is non-reproducible and what causes it before checking how to fix it.
      • Find patterns and categorize problems and solutions.
      • After knowing diffoscope output format, start tooling to check where differences come from.
  • 2019-W27
    • initial patch to make tarballs reproducible
      • mtime already taken care of in the infra, needs a comment
    • GZIP environment variable
      • stop exporting GZIP
      • fix fs/common.mk to use -n
      • wait for autobuilders to detect packages that are broken; fix them
      • don't do a wrapper, unless there is too much breakage
    • diffoscope in the autobuilders
      • Thomas installed a cut-down version on his instance
        • this gives partial results
      • Yann will install a full-blown version in a newer instance
    • autobuilder script
      • Atharva will respin with the requested changes
    • ELC-E
      • Atharva did submit for a talk to present their work
      • Atharva asked the LF for funding for travel expenses
  • 2019-W26
    • Fix reproducible issues
      • tar problem
      • GZIP environment variable
      • Nothing has been done yet.
      • Priority for coming week.
    • autobuilder scripts
      • Changes requested on Builder class series. Almost done.
        • This will conflict with a patch from Thomas.
        • Include Thomas's patches in the Builder class series.
      • No changes requested on the reason patch, but it depends on the Builder series.
        • Put it in a series together with the Builder class, so it can be applied together.
    • Atharva wrote in his journal that he was "waiting for feedback"
      • He was not actually waiting, but working on the separate/different output directories.
      • This should have been mentioned in the journal, that's what it is for.
    • Environment variable for KCONFIG_PROBABILITY.
      • Atharva just made a local branch that fixes KCONFIG_PROBABILITY to 1.
      • Good enough.
      • It could be useful to have an option that makes genrandconfig behave predictably, cfr. KCONFIG_SEED
        • Something that Atharva can do when there's nothing else to do.
    • ELC-E
      • Atharva will submit an abstract for a talk at ELC-E, deadline is June 30.
      • Draft should be shared with Yann and Arnout soon so they can review.
      • Atharva will apply for travel funding at ELC-E as well, perhaps the Buildroot Association can contribute funding as well.
  • 2019-W25
    • autobuilder scripts:
      • first patches to autobuilder script applied and deployed; first results trickling in! Wee! :-)
    • reason for failure is still unknown
      • add a reason file in the result dir,
      • tweak the PHP code to report that if available, and fall back to the current behaviour if missing
    • we need an autobuilder instance that has diffoscope installed, to get more interesting results
      • Yann will look at doing that in his instance (or spawn another one)
    • Builder class
      • initial big patch for proof of concept pushed; comments from Arnout
      • introducing the class really needs a big patch (bonus point if it can be made mechanical):
        • move functions in the class,
        • add the self parameter
        • call functions from foo() to self.foo()
        • instantiate the object
      • then migrate variables one by one from kwargs to object members, to stop duplicating code
    • Atharva will shift his working hours later after college starts, and will work from around 12:00Z. College will start end of June/first week of July.
  • 2019-W24
    • reprotest
      • When building under reprotest, building tar fails: https://pastebin.com/2UbQSuu4 - maybe some issue with uid mapping?
      • For now, leave it alone, we can revisit later.
      • Reprotest already does two builds and compares the results. It's pretty invasive in terms of what it expects from the environment.
      • Probably better to use reprotest as inspiration and do the same from autobuild-run.
    • disorderfs
      • It uses a FUSE filesystem to randomize the order in which files are listed.
      • Also didn't succeed, autobuild-run fails because the output directory isn't removed.
      • Also the second build failed, because a file was changed while it was being tarred.
      • Cfr. https://pastebin.com/pzGfF1c9
      • For now, leave it alone, we can revisit later.
    • Next steps: choose between:
      • Continue on reprotest and disorderfs
      • Improve autobuild-run script, e.g. build in two different directories
      • Improve reporting on the autobuild website.
    • For next week:
      • Collect the review feedback which has not been implemented yet
      • Introduce Builder class in autobuild-run
      • Use this to store the output_dir
      • This makes it easy to do two builds with different output_dir
      • In parallel, mark failures as reason=reproducible
        • Add a 'reason' file in the build results
        • Use that in PHP script
  • 2019-W23
    • Initial round-up of autobuild scripts patches
      • Basically, look OK-ish
      • Not bisectable because reverse order
      • Re-spin in correct order, but still split for ease of review
      • Ultimately, to be committed squashed together
    • diffoscope is silent on success, and so is cmp -> diffoscope_result.txt can be used to determine if reason should be set to reproducible.
    • Atharva will evaluate the reprotest and disorderfs projects to see if they can be useful for our reproducible tests.
    • Atharva should add a journal (log) to this wiki page two or three times a week
  • 2019-W22
    • As discussed on IRC, diffoscope only needs to be done if cmp detects differences. However, it doesn't take long anyway, and it *will* report if there is a difference.
    • diffoscope must be done on output/target/ and target/images, but autobuilders don't enable any images. So when doing a reproducible test, a tarball must be generated.
      • Manually try this, to be sure that it also looks inside the generated images.
      • Enable one / all target filesystems to check this manually.
      • Disable BR2_REPRODUCIBLE for this test, so there actually are some differences.
    • diffoscope has a lot of dependencies, we don't want all of these on the autobuilders
      • Try what the output is if the external tools are not installed
      • autobuilder script should fall back on cmp if diffoscope is not installed
    • Start patching autobuilder script to do a reproducible test.
      • Randomly enable BR2_REPRODUCIBLE, e.g. 10% of the times
      • Do the same build a second time. Only variation is time.
      • Run diffoscope on the result.
  • 2019-W21
    • Confirmed that starting from next week, work is full-time on GSoC (end of exams)
    • Review of the Yocto implementation
      • differences: Yocto is a distribution, so has a cache of the output, while buildroot does not
      • SOURCE_DATE_EPOCH and TZ: already done (depends on BR2_REPRODUCIBLE)
    • Doing similar in Buildroot:
      • Do a first build with a successful config from autobuilders, after enabling BR2_REPRODUCIBLE
      • Then mv $(O)/target to $(O)/target-1; make clean; make
      • And then run diffoscope target-1 target/
    • Identify diffoscope dependencies to run it in autobuilders (eventually)
      • How to save and present the result on autobuilder site?
  • 2019-W20
    • introductions
    • confirm overall actions and planning
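
The categorisation ideas from the W32 and W33 notes (Build ID, build path, and comparing the number of X characters for the output directory issue) could be sketched roughly as below. This is only an illustration: the function names, the category labels, and the assumption that a Build ID difference shows up as a "Build ID:" line from readelf in the diff are all hypothetical, not what the autobuild-run patches actually do.

```python
import re

# Runs of 4 or more 'X' characters, as left behind when a path is sanitised.
X_RUN = re.compile(r"X{4,}")


def x_run_lengths(lines):
    """Return the sorted lengths of all runs of 4+ 'X' characters."""
    return sorted(len(m.group(0)) for line in lines for m in X_RUN.finditer(line))


def categorise(removed, added, output_dirs=()):
    """Guess a category for one difference (hypothetical labels).

    removed/added are the lines from each side of the diff; output_dirs
    holds the names of the two output directories, if known.
    """
    lines = list(removed) + list(added)
    # Assumption: readelf note output with a "Build ID:" line is in the diff.
    if any("Build ID:" in line for line in lines):
        return "build-id"
    # An output directory name appearing verbatim points at an embedded path.
    if any(d in line for d in output_dirs for line in lines):
        return "build-path"
    # Sanitised paths of different lengths point at the output dir issue.
    if x_run_lengths(removed) != x_run_lengths(added):
        return "output-dir-length"
    return "unknown"
```

Time differences are deliberately left out of this sketch, since the notes conclude they are hard to recognise in the diff output.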

Patchwork links

This section contains links to the latest patches by Atharva. This is required for GSoC reporting.

  • Different output directories and JSON related tooling: Under review: Patchwork
  • autobuild-run: fix cross tools prefix for diffoscope: Accepted: Patchwork
  • builder-class series (autobuild-run refactor): Partially Accepted: Patchwork
  • Makefile: don't export GZIP environment variable: Accepted: Patchwork
  • fs/common.mk: do not store original names and timestamps when creating GZIP rootfs: Accepted: Patchwork
  • fs/cpio: make cpio rootfs reproducible: Accepted: Patchwork
  • package/cpio: add host version: Accepted: Patchwork
  • fs/tar: explicitly set extended header values to ensure binary reproducibility: Accepted: Patchwork
  • utils/genrandconfig: randomly enable BR2_REPRODUCIBLE 10% of the times: Accepted: Patchwork

Yocto's Implementation

  • Shared State Mechanism: If input metadata hashes are same, outputs are reused. If inputs have changed, tools from Reproducible-Builds to be used. Further development yet to be done.
  • At this stage, binary contents should be same. However file timestamps (due to package managers) may be different.
  • Static Timezone value: Bugzilla
  • Adapted SOURCE_DATE_EPOCH: Bugzilla, Source-Date-Epoch - Reproducible Builds
  • Archives generated with deterministic metadata (using archive tools' arguments)
  • Remove non-deterministic data from rootfs
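
As an illustration of "archives generated with deterministic metadata", here is a minimal Python sketch that builds a tar archive with a fixed timestamp, fixed ownership and a sorted file order. It is not taken from Yocto or Buildroot (both use the archive tools' own arguments); the function name make_reproducible_tar and its parameters are made up for this example.

```python
import io
import os
import tarfile


def make_reproducible_tar(root, mtime):
    """Return the bytes of an uncompressed tar of `root` with normalised metadata."""

    def normalise(info):
        # Replace everything that depends on when and by whom the tree was built.
        info.mtime = mtime
        info.uid = info.gid = 0
        info.uname = info.gname = ""
        return info

    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        for dirpath, dirnames, filenames in os.walk(root):
            dirnames.sort()  # sorting in place makes os.walk descend in a fixed order
            for name in sorted(dirnames + filenames):
                full = os.path.join(dirpath, name)
                # recursive=False: directories are added as entries only,
                # their contents are added when os.walk reaches them.
                tar.add(full, arcname=os.path.relpath(full, root),
                        recursive=False, filter=normalise)
    return buf.getvalue()
```

A caller would typically pass mtime=int(os.environ["SOURCE_DATE_EPOCH"]), so that two builds of the same sources at different times give byte-identical archives.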

Diffoscope Dependencies

  • Depends on: python3, PyPI modules: libarchive-c, python-magic
  • External tools required: Rscript, abootimg, apktool, bsdtar, bzip2, cbfstool, cd-iccdump, cmp, compare, convert, db_dump, diff, docx2txt, dumpxsb, enjarify, fdtdump, ffprobe, getfacl, ghc, gifbuild, gpg, gzip, identify, img2txt, isoinfo, javap, js-beautify, lipo, llvm-bcanalyzer, llvm-dis, lsattr, lz4, msgunfmt, nm, objcopy, objdump, ocamlobjinfo, odt2txt, oggDump, otool, pdftotext, pedump, pgpdump, ppudump, procyon, ps2ascii, readelf, showttf, sng, sqlite3, ssconvert, ssh-keygen, stat, tcpdump, unsquashfs, wasm2wat, xxd, xz, zipinfo, zipnote
  • This has tools used to compare a lot of file formats that probably aren't generated (like android APKs, Windows/Mac executables) in a Buildroot run. We can exclude those.
  • APT packages (available in Ubuntu, Debian): abootimg, acl, apktool, binutils-multiarch, bzip2, caca-utils, colord, coreutils, db-util, default-jdk-headless | default-jdk | java-sdk, device-tree-compiler, diffutils, docx2txt, e2fsprogs, enjarify, ffmpeg, fontforge-extras, fp-utils, genisoimage, gettext, ghc, ghostscript, giflib-tools, gnumeric, gnupg, gzip, imagemagick, jsbeautifier, libarchive-tools, llvm, lz4 | liblz4-tool, mono-utils, ocaml-nox, odt2txt, oggvideotools, openssh-client, pgpdump, poppler-utils, procyon-decompiler, r-base-core, sng, sqlite3, squashfs-tools, tcpdump, unzip, xmlbeans, xxd | vim-common, xz-utils, zip
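
To see which of these external tools are present on a given autobuilder, and to fall back on cmp when diffoscope itself is not installed (as discussed in the W22 meeting), something like the following could be used. The helper names and the shortened tool list are assumptions made for this sketch; which tools actually matter for Buildroot output has not been decided on this page.

```python
import shutil

# Hypothetical subset of the list above that seems relevant to Buildroot output.
LIKELY_RELEVANT_TOOLS = ["readelf", "objdump", "nm", "cmp", "diff", "xxd",
                         "gzip", "bzip2", "xz", "bsdtar", "unsquashfs"]


def missing_tools(tools):
    """Return the tools from `tools` that are not found in PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def pick_comparator():
    """Prefer diffoscope, fall back on cmp when it is not installed."""
    return "diffoscope" if shutil.which("diffoscope") else "cmp"


if __name__ == "__main__":
    print("comparator:", pick_comparator())
    print("missing:", ", ".join(missing_tools(LIKELY_RELEVANT_TOOLS)) or "none")
```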

Sample Diffoscope Output

  • Minimal config build (make defconfig; make). Will run diffoscope on a build from Autobuilder config tomorrow.
  • Builds run about 10 minutes apart.
  • Moved the first build to target-1 and reran the build. Then ran diffoscope target-1 target > diff.txt
  • diffoscope log: https://paste.ubuntu.com/p/VpMbW4qQQP/
  • Except for a time record in the busybox binary, all other differences seem to be only timestamps of file generation.
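
The comparison step above could be driven from a script along these lines. The sketch writes both a JSON report and a size-limited text report, as suggested in the W32 meeting notes; the function names and the output file names are made up, and the exit status handling assumes diffoscope returns 0 when the inputs are identical and non-zero otherwise.

```python
import subprocess


def diffoscope_cmd(first, second, json_out, text_out, max_text_bytes=40 * 1024):
    """Build a diffoscope command line producing JSON and limited text output."""
    return ["diffoscope",
            "--json", json_out,
            "--text", text_out,
            "--max-text-report-size", str(max_text_bytes),
            first, second]


def is_reproducible(first, second,
                    json_out="diffoscope.json", text_out="diffoscope.txt"):
    """Run diffoscope on two directories; True means no differences were found."""
    result = subprocess.run(diffoscope_cmd(first, second, json_out, text_out))
    # Assumption: exit status 0 means identical, anything else means
    # differences or an error, both of which need a closer look.
    return result.returncode == 0
```

For the build described here, the call would be is_reproducible("target-1", "target").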

Planning

  • Week 20: study how yocto does it
  • Week 21: ...
  • Week 22: do two builds in autobuild-run script
  • Week 23: revisit patches to autobuild-run
  • Week 24: autobuild-run: different output directories; report with reason=reproducible
  • Week 25: improve how reproducible results are shown on http://autobuild.buildroot.org
  • Week 26: Fix the reproducible issues found until now
  • Week 27: Further extend autobuild-run script with more variation
  • Week 28: Tooling to understand where the differences come from (e.g. which package)
  • Week 29: More tooling to analyse differences
  • Week 30: More tooling to analyse differences

Progress Journal

  • 06/06/2019:
    • Setup Gitlab account to track issues and progress
    • Pushed dev branch to Gitlab, as well as re-spun commits and pushed
  • 07/06/2019:
    • Submitted v2 patches to mailing list
    • Submitted patch to enable BR2_REPRODUCIBLE
    • Started evaluating reprotest and its working
  • 08/06/2019:
  • 09/06/2019 - 11/06/2019:
    • Finished v3 patches, sent to mailing list
    • Evaluated reprotest & disorderfs, discussed with Arnout during meeting
    • Less work done than possible due to lack of planning, now have planned till week 30
  • 12/06/2019:
    • Explored PHP components of Autobuilder website to identify what and how to modify
    • Learned basic syntax and working of PHP since I've never worked with it before
    • Brushed up on using classes in python because it has been a while since I worked using classes
    • Now it's easier for me to implement the Builder class
  • 13/06/2019:
    • Worked on implementing and transitioning to Builder class
    • Will push code to Gitlab tomorrow after removing errors
  • 14/06/2019:
    • First step of transitioning to Builder class done, pushed to Gitlab
    • Testing it thoroughly before sending to mailing list
    • Created and sent reproducible-v4 to mailing list, Gitlab
  • 15/06/2019 - 18/06/2019:
    • Worked on implementing builder class
    • Pushed proof of concept to Gitlab
    • Received feedback and now revising my patches
  • 20/06/2019:
    • Finished work on Builder class, waiting for feedback from Mentors
  • 21/06/2019:
    • Going to work on reason file in autobuild-run
    • Sent Builder class to mailing list for feedback
  • 22/06/2019:
    • Worked on implementing reason-file, pushed to Gitlab
    • Waiting for merge of builder-class to send to mailing list
    • Waiting for feedback from Mentors
  • 23/06/2019 - 25/06/2019:
    • Working on different output directories
  • 26/06/2019:
    • Revising builder-class with feedback
    • Having a bit of trouble with merging Thomas' patches
  • 27/06/2019:
    • Worked on ELC-E 2019 talk abstract
    • Worked further on builder-class, testing ongoing..
  • 28/06/2019:
    • Submitted ELC-E 2019 abstract
    • Figured out required travel charges to ELC-E
  • 29/06/2019:
    • Submitted travel funding request
  • 03/07/2019:
    • Worked on fixing GZIP environment variable
    • Reworked builder-class according to comments
  • 06/07/2019:
    • Worked on different output directories for reproducible builds test
  • 11/07/2019:
    • Analyzing diffoscope outputs as discussed in meeting
  • 12/07/2019:
    • Patch for reproducible cpio rootfs
    • Analyzing source of difference for /usr/bin/getconf
  • 13/07/2019 - 15/07/2019:
    • Rest due to being ill.
  • 16/07/2019 - 17/07/2019:
    • cpio and GZIP patches rework
    • Testing of zlib by varying time, output directories
      • No differences found in binaries
    • Tried to fix difference found in uClibc (getconf)
      • Seems tricky, will ask mentors on the weekend when I have more time to look into it
  • 18/07/2019:
    • Diffoscope JSON output:
      • patch to switch to JSON formatted output
      • rudimentary work to extract which package is the cause of differences
  • 19/07/2019:
    • Patch for extracting package from diffoscope output up on Gitlab
      • Needs some more refinement before sending to mailing list
    • Discovered reproducibility issues in libtasn1, mpg123
  • 22/07/2019 - 24/07/2019:
    • Formatting of JSON to have added and removed lines and suggestions
    • Cleaning up of code and testing in various configurations
  • 25/07/2019 - 29/07/2019:
    • Refactor JSON related code to make it more robust
    • Testing it in more configurations
    • Report package version along w/ package name
    • Evaluate if HTML output of diffoscope is useful
  • 04/08/2019:
    • Look into reproducibility issue of different output directories
      • Asked for help on #reproducible-builds IRC
      • Buildroot infra sanitizes rpath while installing to target, hence "output" -> "XXXXXX"
      • Debian uses same length paths for different directory testing
        • This is the solution to be used for now
  • 05/08/2019:
    • Make autobuild-run use same length paths for outputdir
    • Implement try-catch blocks for modifying JSON in autobuild-run
    • Further testing..
  • 06/08/2019:
    • Build ID issue:
      • Test setting -Wl,--build-id=none in TARGET_CFLAGS
      • Should be done in toolchain_wrapper.c
  • 07/08/2019 - 09/08/2019:
    • Refactoring JSON tooling patches
    • Exploring toolchain_wrapper.c
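
The JSON post-processing mentioned in the journal (added and removed lines, with try/except around the modification so that unexpected diffoscope output passes through unchanged) might look roughly like this. It is a sketch, not the code on Gitlab: the function names and the max_lines limit are made up, and it assumes each node of the diffoscope JSON is a dict with an optional "unified_diff" string and an optional "details" list of child nodes.

```python
def split_unified_diff(unified_diff):
    """Split a unified diff into (added, removed) lines.

    Hunk headers and context lines are dropped, so offsets and context
    are lost, as noted in the W31 meeting.
    """
    added, removed = [], []
    for line in unified_diff.splitlines():
        if line.startswith("+"):
            added.append(line[1:])
        elif line.startswith("-"):
            removed.append(line[1:])
    return added, removed


def postprocess(node, max_lines=10):
    """Replace unified_diff by limited added/removed lists, recursively."""
    try:
        diff = node.get("unified_diff")
        if diff:
            added, removed = split_unified_diff(diff)
            node["added"] = added[:max_lines]
            node["removed"] = removed[:max_lines]
            del node["unified_diff"]
        for child in node.get("details") or []:
            postprocess(child, max_lines)
    except (AttributeError, TypeError):
        # Unexpected structure: leave this node as it is.
        pass
    return node
```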

Code

Atharva Lele's on-going work to make the autobuilder scripts reproducible-aware (check the reproducible-vN branches).

ELC-E 2019

A proposal for a talk at Embedded Linux Conference, Europe on Reproducible Builds in Buildroot has been submitted.
Abstract PDF: PDF

GSoC Proposal

The proposal PDF can be found here: PDF