I have the following workflow, which checks for changes to markdown files. If any have changed, their relative paths are handed over to conditional_pandoc, where I want to use the official Pandoc Docker image to run a bash script (see below) that converts these files and places the output in the same directory as the markdown files. (This works if I install Pandoc manually; see my last question.) To avoid worrying about the path structure inside GH Actions Docker containers, I tried to set the working directory directly to the base of the repo.

But I get the following error:

[...]
Status: Downloaded newer image for pandoc/latex:3.5
pandoc: bash: withBinaryFile: does not exist (No such file or directory)
Error: Process completed with exit code 1.

Since this error isn't very telling, I suppose either the workspace or the env doesn't work. How can I fix this?

convert.yaml:

on:
  push:
    branches:
      - 'main' # Do the work exclusively for the branch deploying the website

jobs:
  # Separate jobs so that condition_check_files can also be used for other tasks
  condition_check_files:
    runs-on: 'ubuntu-22.04'
    outputs:
      bool_files_changed: ${{ steps.check_file_changed.outputs.bool_files_changed }}
      list_changed_files: ${{ steps.check_file_changed.outputs.list_changed_files }}
    steps:
    - uses: actions/checkout@v4
      with:
        fetch-depth: 2
    - shell: pwsh
      id: check_file_changed
      run: |
        # Look only for changed files (A - added, M - modified) and return their path+name (the specific changes are irrelevant)
        $diff=git diff --name-only --diff-filter=AM "HEAD^" HEAD

        # Filter the files under content/ with the .md extension excluding the Hugo associated _index.md files
        $FilesDiff=$diff | Where-Object { $_ -match 'content/' -and $_ -match '\.md$' -and -not ($_ -match '_index\.md') }
        $HasDiff=$FilesDiff.Length -gt 0

        # Set the output named "bool_files_changed"
        echo "bool_files_changed=$HasDiff" >> $env:GITHUB_OUTPUT
        echo "list_changed_files=$FilesDiff" >> $env:GITHUB_OUTPUT

  # Run the job only if 'bool_files_changed' equals 'True'
  conditional_pandoc:
    runs-on: 'ubuntu-22.04'
    needs: [ condition_check_files ]
    if: needs.condition_check_files.outputs.bool_files_changed == 'true'
    env:
      list_changed_files: ${{ needs.condition_check_files.outputs.list_changed_files }}
    steps:
      - uses: actions/checkout@v4  # In order to find the script pandoc.sh
        with:
          fetch-depth: 2

      - name: Run Pandoc in Docker
        # This is a one-liner. It gets a Docker Image of pandoc/latex:3.5 working directly in the workspace (because
        # $list_changed_files contains relative paths) and executes the script.
        run: >-
          docker run
          --rm -v ${{ github.workspace }}:/workspace
          -w /workspace
          pandoc/latex:3.5
          bash ./pandoc.sh ${{ env.list_changed_files }}
      - name: Commit files # transfer the new files into the repository
        run: |
          git config --local user.name "GH_Action_Bot"
          git add ./content
          git commit -m "GH Action: Pandoc | New output for changed files"
          git push -f origin main

pandoc.sh, directly in the base dir of the repo:

#!/bin/bash
# List types which will only use --standalone. You can easily add more extensions if you're fine with this setting
Only_Standalone_Output_Types="latex pdf html docx odt"
# Use GNU Parallel to work each file on its own cpu core.
# Pandoc creates an AST (Abstract Syntax Tree); reuse this by saving/reading from .ast
# You can easily add individual conversion rules by using pandoc after the 'done' part. Keep in mind to finish all non-comment lines with backslash
parallel --jobs 0 \
    pandoc --from markdown --to native './{}' -o './{.}.ast' ';'\
    for i in "$Only_Standalone_Output_Types"';' do \
        pandoc --from native './{.}.ast' --standalone -o './{.}.$i' ';' \
    done';' \
    rm './{.}.ast' ::: "$1"

Asked Nov 16, 2024 at 15:04 by GH-MK

1 Answer

Problem

The entrypoint of the Pandoc Docker image is:

ENTRYPOINT ["/usr/local/bin/pandoc"]

In Docker, anything passed after the image name (the CMD) becomes arguments to the ENTRYPOINT, which means that when running this:

        run: >-
          docker run
          --rm -v ${{ github.workspace }}:/workspace
          -w /workspace
          pandoc/latex:3.5
          bash ./pandoc.sh ${{ env.list_changed_files }}

you are effectively trying to run the command /usr/local/bin/pandoc bash ./pandoc.sh file1 file2 file3 ..., which does not work: pandoc treats bash as the name of an input file, which is why the error reports that bash does not exist.
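To make the mechanism concrete, here is a minimal sketch that only simulates how the two parts are joined; it does not call Docker. The helper name effective_command and the path content/a.md are hypothetical and used purely for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of Docker): prints the command line a
# container would run, given its ENTRYPOINT and the arguments that were
# placed after the image name.
effective_command() {
    entrypoint=$1
    shift
    # "$*" joins the remaining arguments with single spaces
    printf '%s %s\n' "$entrypoint" "$*"
}

# Default entrypoint of the image, as quoted above: pandoc receives
# "bash" as its first argument and tries to open it as an input file.
effective_command /usr/local/bin/pandoc bash ./pandoc.sh content/a.md

# Entrypoint replaced by bash: the script is now what gets executed.
effective_command /bin/bash ./pandoc.sh content/a.md
```

On a machine with Docker installed, `docker inspect --format '{{json .Config.Entrypoint}}' pandoc/latex:3.5` shows the entrypoint the image actually declares, which is a quick way to check this for any image.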

Solution 1

One solution is to override the entrypoint when running your container, like this:

        run: >-
          docker run
          --rm -v ${{ github.workspace }}:/workspace
          -w /workspace
          --entrypoint /bin/bash
          pandoc/latex:3.5
          ./pandoc.sh ${{ env.list_changed_files }}

This should use bash as the entrypoint, so you can run your pandoc.sh script as before: ./pandoc.sh ${{ env.list_changed_files }} is now the CMD supplied to the ENTRYPOINT.
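Whichever way the script is started, the changed files arrive as separate positional parameters (file1 file2 file3 ...), while pandoc.sh as posted hands only "$1" to parallel. The following is a minimal sketch, assuming you want every file processed; convert_one and convert_all are hypothetical names, and convert_one merely prints the path instead of calling pandoc.

```shell
#!/bin/sh
# Hypothetical stand-in for the real conversion: it only reports the
# file it was given instead of calling pandoc.
convert_one() {
    printf 'would convert: %s\n' "$1"
}

# Loop over every positional parameter, not only the first one.
# Quoting "$@" keeps each path intact as a single argument.
convert_all() {
    for file in "$@"; do
        convert_one "$file"
    done
}

# Forward all arguments the script was called with.
convert_all "$@"
```

Saved under a hypothetical name such as list_args.sh and called as `sh list_args.sh content/a.md content/b.md`, it prints one "would convert:" line per file.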

Solution 2

However, instead of starting the container yourself with docker run, it is probably cleaner to use the GH Actions container configuration:

  container-test-job:
    runs-on: 'ubuntu-22.04'
    container:
      image: pandoc/latex:3.5
    needs: [ condition_check_files ]
    if: needs.condition_check_files.outputs.bool_files_changed == 'true'
    env:
      list_changed_files: ${{ needs.condition_check_files.outputs.list_changed_files }}
    steps:
      - uses: actions/checkout@v4  # In order to find the script pandoc.sh
        with:
          fetch-depth: 2
      - name: Run pandoc
        run: ./pandoc.sh ${{ env.list_changed_files }}

Here you only need to specify the image. The workspace should be mounted as a volume automatically, and the steps are executed in it. GH also overrides the entrypoint automatically.
