Caching APT packages in a GitHub Actions workflow

Question:

I use the following GitHub Actions workflow for my C project. The workflow finishes in ~40 seconds, but more than half of that time is spent installing the valgrind package and its dependencies.

I believe caching could help me speed up the workflow. I do not mind waiting a couple of extra seconds, but this just seems like a pointless waste of GitHub's resources.

name: C Workflow

on: [push, pull_request]

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
    - uses: actions/checkout@v1

    - name: make
      run: make

    - name: valgrind
      run: |
        sudo apt-get install -y valgrind
        valgrind -v --leak-check=full --show-leak-kinds=all ./bin

Running sudo apt-get install -y valgrind installs the following packages:

  • gdb
  • gdbserver
  • libbabeltrace1
  • libc6-dbg
  • libipt1
  • valgrind

I know Actions supports caching of a specific directory (and there are already several answered SO questions and articles about this), but I am not sure where all the different packages installed by apt end up. I assume /bin/ and /usr/bin/ are not the only directories affected by installing packages.

Is there an elegant way to cache the installed system packages for future workflow runs?

The purpose of this answer is to show how caching can be done with GitHub Actions. It does show how to cache valgrind, but the larger point is that not everything can or should be cached, and that the tradeoff between saving and restoring a cache and simply reinstalling the dependency needs to be taken into account.

You will make use of the actions/cache action to do this.

Add it as a step (before you need to use valgrind):

- name: Cache valgrind
  uses: actions/cache@v2
  id: cache-valgrind
  with:
      path: "~/valgrind"
      key: ${{secrets.VALGRIND_VERSION}}

The next step should restore the cached version if there is one, or otherwise install from the repositories and populate the cache directory:

- name: Install valgrind
  env:
    CACHE_HIT: ${{steps.cache-valgrind.outputs.cache-hit}}
    VALGRIND_VERSION: ${{secrets.VALGRIND_VERSION}}
  run: |
      if [[ "$CACHE_HIT" == 'true' ]]; then
        sudo cp --verbose --force --recursive ~/valgrind/* /
      else
        sudo apt-get install --yes valgrind="$VALGRIND_VERSION"
        mkdir -p ~/valgrind
        dpkg -L valgrind | while IFS= read -r f; do if test -f "$f"; then echo "$f"; fi; done | xargs cp --parents --target-directory ~/valgrind/
      fi
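
The branch taken by the step above depends only on whether the cache-hit output is the literal string 'true'. A minimal dry-run sketch of that decision, which prints the action instead of performing it (choose_action is a hypothetical helper name used only here, and the empty and 'false' inputs are assumed examples of what a cache miss might look like):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Dry-run version of the install step: prints the action instead of running it.
# `choose_action` is a hypothetical helper name used only for this sketch.
choose_action() {
  local cache_hit="${1:-}"
  if [[ "$cache_hit" == 'true' ]]; then
    echo 'restore'   # would copy files from ~/valgrind back onto /
  else
    echo 'install'   # would run apt-get install, then populate ~/valgrind
  fi
}

on_hit=$(choose_action 'true')
on_miss=$(choose_action '')        # assumed: empty value on a miss
on_false=$(choose_action 'false')  # assumed: 'false' value on a miss
echo "$on_hit $on_miss $on_false"
```

Anything other than 'true' falls through to the install branch, so the step still works when no cache exists yet.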

Explanation

Set the VALGRIND_VERSION secret to the output of:

apt-cache policy valgrind | grep -oP '(?<=Candidate:\s)(.+)'

This allows you to invalidate the cache when a new version is released, simply by changing the value of the secret.
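
To see what that extraction produces, here is a small sketch that runs the same grep expression over a hand-written sample instead of real apt-cache policy output. The sample text, its version string, and the "valgrind-" key prefix are illustrative assumptions, not values from the original answer:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Illustrative sample shaped like `apt-cache policy valgrind` output;
# the version string is a placeholder chosen for this sketch.
sample_policy() {
  cat <<'EOF'
valgrind:
  Installed: (none)
  Candidate: 1:3.15.0-1ubuntu9.1
  Version table:
EOF
}

# Same expression as above: keep everything after "Candidate: " on that line.
version=$(sample_policy | grep -oP '(?<=Candidate:\s)(.+)')

# Hypothetical key format: a fixed prefix plus the version.
cache_key="valgrind-${version}"
echo "$cache_key"
```

Because the version is part of the key, a new candidate version yields a new key and therefore a cache miss on the next run.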

dpkg -L valgrind lists all the files that were installed by sudo apt-get install valgrind.

With this command we can copy all of those files into our cache folder, skipping the directories that dpkg -L also lists:

dpkg -L valgrind | while IFS= read -r f; do if test -f "$f"; then echo "$f"; fi; done | xargs cp --parents --target-directory ~/valgrind/
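
The pipeline relies on cp --parents to recreate each file's full path underneath the cache directory. A minimal sketch of that behaviour, assuming GNU coreutils, with a throwaway directory standing in for the real filesystem and a hand-written list standing in for dpkg -L output (list_files and all paths are hypothetical):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Scratch area standing in for the real filesystem (hypothetical paths).
work=$(mktemp -d)
mkdir -p "$work/src/usr/bin" "$work/src/usr/lib/demo" "$work/cache"
echo 'binary' > "$work/src/usr/bin/demo"
echo 'library' > "$work/src/usr/lib/demo/libdemo.so"

cd "$work/src"

# Stand-in for `dpkg -L <package>`, which lists directories as well as files.
list_files() {
  printf '%s\n' usr usr/bin usr/bin/demo usr/lib/demo usr/lib/demo/libdemo.so
}

# Keep regular files only, then copy them together with their directory structure.
list_files | while IFS= read -r f; do
  if test -f "$f"; then echo "$f"; fi
done | xargs cp --parents --target-directory "$work/cache/"

# Show what ended up in the cache directory.
copied=$(cd "$work/cache" && find . -type f | sort)
echo "$copied"
```

Filtering out directories first matters because cp refuses to copy a directory without --recursive, and the files already carry their parent paths.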


Furthermore

In addition to copying all the components of valgrind, it may also be necessary to copy its dependencies (such as libc in this case), but I don't recommend continuing along this path because the dependency chain just grows from there. To be precise, the dependencies that must be copied to end up with an environment in which valgrind can run are:

  • libc6
  • libgcc1
  • gcc-8-base

To copy all these dependencies, you can use the same syntax as above:

for dep in libc6 libgcc1 gcc-8-base; do
    dpkg -L "$dep" | while IFS= read -r f; do if test -f "$f"; then echo "$f"; fi; done | xargs cp --parents --target-directory ~/valgrind/
done

Is all this work really worth the trouble when all that is required to install valgrind in the first place is to run sudo apt-get install valgrind? If your goal is to speed up the build process, you also have to consider how long it takes to restore the cache (downloading and extracting it) compared with simply running the install command again.

Finally, to restore the cache, assuming it is stored at /tmp/valgrind, you can use the command:

sudo cp --force --recursive /tmp/valgrind/* /

This copies all the files from the cache onto the root partition, overwriting any files that already exist at the same paths.
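
The merge-and-overwrite behaviour of that restore can be tried safely against a throwaway directory instead of /. This is only a sketch: the cache contents and the fake root below are made up for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway directories standing in for the cache and for "/" (hypothetical).
work=$(mktemp -d)
cache="$work/cache"
fake_root="$work/root"
mkdir -p "$cache/usr/bin" "$cache/usr/share/doc/demo" "$fake_root/usr/bin"
echo 'new' > "$cache/usr/bin/demo"
echo 'docs' > "$cache/usr/share/doc/demo/README"
echo 'old' > "$fake_root/usr/bin/demo"   # pre-existing file that will be overwritten

# Same shape as the restore command, aimed at the fake root instead of "/".
cp --force --recursive "$cache"/* "$fake_root"/

restored=$(cat "$fake_root/usr/bin/demo")
echo "$restored"
```

Existing directories are merged rather than replaced, which is why the same command can lay cached files over a populated root filesystem.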

In addition to the process above, I also have an example of "caching valgrind" by installing and compiling it from source. The cache is then about 63MB (compressed) in size, and one still needs to install libc separately, which rather defeats the purpose.

Note: Another answer to this question proposes what I would consider a safer approach to caching dependencies: using a container that comes with the dependencies pre-installed. The best part is that you can use Actions to keep those containers up to date.
