Diffstat (limited to 'taskcluster/docker/system-symbols-linux-scraper')
-rw-r--r-- taskcluster/docker/system-symbols-linux-scraper/Dockerfile | 2
-rw-r--r-- taskcluster/docker/system-symbols-linux-scraper/README | 64
2 files changed, 65 insertions, 1 deletions
diff --git a/taskcluster/docker/system-symbols-linux-scraper/Dockerfile b/taskcluster/docker/system-symbols-linux-scraper/Dockerfile
index edafc97c83..e9785c93c1 100644
--- a/taskcluster/docker/system-symbols-linux-scraper/Dockerfile
+++ b/taskcluster/docker/system-symbols-linux-scraper/Dockerfile
@@ -12,7 +12,7 @@ VOLUME /builds/worker/checkouts
RUN apt-get update && \
apt-get install --no-install-recommends -y \
- 7zip binutils build-essential cpio curl debuginfod elfutils flatpak jq \
+ file 7zip binutils build-essential cpio curl debuginfod elfutils flatpak jq \
libxml2-utils python3-pip rpm2cpio squashfs-tools unzip wget zip && \
apt-get autoremove -y && rm -rf /var/lib/apt/lists/*
diff --git a/taskcluster/docker/system-symbols-linux-scraper/README b/taskcluster/docker/system-symbols-linux-scraper/README
new file mode 100644
index 0000000000..ac37a2d3f3
--- /dev/null
+++ b/taskcluster/docker/system-symbols-linux-scraper/README
@@ -0,0 +1,64 @@
+(Re)bootstrapping the symbol scraping process
+=============================================
+
+If the symbol scraping process has been failing for long enough, for whatever
+reason, we can (currently) end up in a situation where the recorded state of
+`SHA256SUMS.zip` on the TaskCluster index is inconsistent with what was
+actually processed.
+
+This document explains what needs to be done, and where, to recover from that
+state (it is based on the experience from bug 1893156).
+
+First, you need to identify how long the problem has been present. As of now
+there is no better tooling than manually going through the cron task logs to
+see when the task started to fail.
+
+Once you have identified a date, the next step is to work on the bootstrapping
+content. As visible in
+https://searchfox.org/mozilla-central/rev/f6e3b81aac49e602f06c204f9278da30993cdc8a/taskcluster/docker/system-symbols-linux-scraper/run.sh#62,
+the first source of truth is the gh-pages branch of the symbol-scrapers GitHub
+repository: https://github.com/mozilla/symbol-scrapers/tree/gh-pages. This
+source of truth is evaluated ONLY if the TaskCluster index is NOT present. The
+route is computed from the running task's definition:
+https://searchfox.org/mozilla-central/rev/f6e3b81aac49e602f06c204f9278da30993cdc8a/taskcluster/docker/system-symbols-linux-scraper/run.sh#14
+from which we ONLY consider the `latest` alias.
+
+As of today the index for debian, for example, is
+index.gecko.v2.mozilla-central.latest.system-symbols.debian and thus one can
+explore the content at
+https://firefox-ci-tc.services.mozilla.com/tasks/index/gecko.v2.mozilla-central.latest.system-symbols.debian,
+while browsing by other routes, such as pushdate, lets you find e.g.
+https://firefox-ci-tc.services.mozilla.com/tasks/index/gecko.v2.mozilla-central.pushdate.2024.04.20.20240420094034.system-symbols/debian
+from which we can get a link to the sums file:
+https://firefox-ci-tc.services.mozilla.com/api/index/v1/task/gecko.v2.mozilla-central.pushdate.2024.04.20.20240420094034.system-symbols.debian/artifacts/public%2Fbuild%2FSHA256SUMS.zip
+
+Once you have identified WHEN the problem arose, you can take the above URL
+and adapt it to the correct date and to each of the various distributions.
+
+Make sure you have an up-to-date git clone of the mozilla/symbol-scrapers
+repository, check out a new branch from the gh-pages branch, and then proceed
+with the data extraction as follows (example with a different date):
+ for distro in alpine archlinux debian fedora firefox-flatpak firefox-snap gnome-sdk-snap mint opensuse ubuntu; do
+ wget https://firefox-ci-tc.services.mozilla.com/api/index/v1/task/gecko.v2.mozilla-central.pushdate.2024.02.07.latest.system-symbols.$distro/artifacts/public%2Fbuild%2FSHA256SUMS.zip -O $distro/SHA256SUMS.zip;
+ done;
+ mv archlinux/SHA256SUMS.zip arch/
+
+Please note the slight difference in naming: archlinux vs arch.
+Other distros might have been added since, so adapt the list as needed.
+
+Once this is OK, send a pull request, then get it reviewed or merge it.
+
+At this point, the bootstrapping content is in the state we want, but if you
+run a symbol scraping task, it will still pull data from the TaskCluster
+index. This time, you need to use the index that refers to latest and NOT the
+pushdate or another one, so the index used in the example below SHOULD be
+good. You just have to run deleteTask (with the appropriate credentials if you
+have them, or ask releng for help in #firefox-ci):
+ for distro in alpine archlinux debian fedora firefox-flatpak firefox-snap gnome-sdk-snap mint opensuse ubuntu; do
+ taskcluster api index deleteTask gecko.v2.mozilla-central.latest.system-symbols.$distro
+ done;
+
+From there, HTTP queries to (for the debian example)
+https://firefox-ci-tc.services.mozilla.com/tasks/index/gecko.v2.mozilla-central.latest.system-symbols.debian
+will return 404, which will make the symbol scraping tasks fetch their data
+from GitHub.
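
Note on the README above (not part of the patch): the URL construction and the archlinux/arch renaming it describes could be wrapped in small helpers, which makes the date and distro substitutions less error-prone. The following is only a sketch under assumptions: the function names `sums_url`, `repo_dir` and `latest_status` are hypothetical and do not exist in run.sh, and the index lookup URL used by `latest_status` is assumed to follow the same `/api/index/v1/task/<index path>` shape as the artifact URL quoted in the README.

```shell
#!/usr/bin/env bash
# Sketch only: these helpers are hypothetical and not part of run.sh.

TC_ROOT="https://firefox-ci-tc.services.mozilla.com"

# Build the SHA256SUMS.zip artifact URL for one distro.
# $1: index namespace, e.g. gecko.v2.mozilla-central.pushdate.2024.04.20.20240420094034
# $2: distro name as used in the index, e.g. debian
sums_url() {
  local namespace="$1"
  local distro="$2"
  # %%2F prints a literal %2F (URL-encoded "/"), as in the README's URL.
  printf '%s/api/index/v1/task/%s.system-symbols.%s/artifacts/public%%2Fbuild%%2FSHA256SUMS.zip\n' \
    "$TC_ROOT" "$namespace" "$distro"
}

# Map an index distro name to its directory on the gh-pages branch.
# Only the archlinux/arch difference is mentioned in the README.
repo_dir() {
  case "$1" in
    archlinux) echo "arch" ;;
    *) echo "$1" ;;
  esac
}

# Print the HTTP status code of the `latest` index entry for one distro.
# Assumption: the index API answers on /api/index/v1/task/<index path>.
latest_status() {
  curl -s -o /dev/null -w '%{http_code}' \
    "$TC_ROOT/api/index/v1/task/gecko.v2.mozilla-central.latest.system-symbols.$1"
}
```

With these in place, a download for one distro would look like `wget "$(sums_url "$namespace" "$distro")" -O "$(repo_dir "$distro")/SHA256SUMS.zip"`, and after the deleteTask step `latest_status debian` would be expected to print 404 if the index entry is really gone.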