
myst-anchors ignores source_encoding in conf.py #609

@ile-24556


Describe the bug

context
I run myst-anchors on a Markdown file to predict its anchor links.

Source (saved with UTF-8):

# Title

## ASCII

## Ä

Built html:

<section id="title">
  <h1>Title<a class="headerlink" href="#title" title="Permalink to this heading"></a></h1>
  <section id="ascii">
    <h2>ASCII<a class="headerlink" href="#ascii" title="Permalink to this heading"></a></h2>
  </section>
  <section id="a">
    <h2>Ä<a class="headerlink" href="#a" title="Permalink to this heading"></a></h2>
  </section>
</section>

expectation
myst-anchors prints the same anchors as the built HTML.

$ myst-anchors.exe source/index.md 
<h1 id="title"></h1>
<h2 id="ascii"></h2>
<h2 id="a"></h2>

bug
Instead, a UnicodeDecodeError is raised.

$ myst-anchors source/index.md
Traceback (most recent call last):
  File "***\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "***\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "***\venv\Scripts\myst-anchors.exe\__main__.py", line 7, in <module>
  File "***\venv\lib\site-packages\myst_parser\cli.py", line 41, in print_anchors
    text = parser.render(args.input.read())
UnicodeDecodeError: 'cp932' codec can't decode byte 0x84 in position 27: illegal multibyte sequence

problem
It seems that myst-anchors ignores source_encoding in conf.py and reads the file with the locale's encoding (cp932 here).
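A minimal sketch of one possible direction for a fix, assuming the CLI reads its input file in text mode without an explicit encoding (the traceback only shows `args.input.read()`). The parser below is a hypothetical stand-in, not the actual `myst_parser.cli` code, and the `--encoding` option is an illustrative name that myst-anchors does not necessarily provide.

```python
import argparse
import tempfile
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the myst-anchors argument parser.
    parser = argparse.ArgumentParser(description="read a Markdown file with an explicit encoding")
    parser.add_argument("input", type=Path, help="path to the Markdown source")
    # Hypothetical option: defaults to UTF-8 instead of the locale encoding.
    parser.add_argument("--encoding", default="utf-8")
    return parser


def read_input(argv: list[str]) -> str:
    args = build_parser().parse_args(argv)
    # Passing encoding explicitly means the locale (e.g. cp932) is never consulted.
    return args.input.read_text(encoding=args.encoding)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "index.md"
        # Same content as the report, saved as UTF-8 (CRLF line endings assumed).
        path.write_bytes("# Title\r\n\r\n## ASCII\r\n\r\n## Ä\r\n".encode("utf-8"))
        # ascii() keeps the output printable on consoles that cannot encode "Ä".
        print(ascii(read_input([str(path)])))
```

With an explicit encoding, the result no longer depends on the machine's locale, which is the behaviour source_encoding is expected to give.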

Reproduce the bug

  1. Prepare a Japanese Windows environment (sorry for the difficulty ...)
  2. Run sphinx-quickstart
  3. Add myst_parser to extensions and set source_encoding in conf.py
  4. Write an index.md that includes non-ASCII characters
  5. Run myst-anchors source/index.md
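Since the steps above need a Japanese Windows locale, the decoding failure itself can be approximated on any platform. The sketch below assumes the file was saved with CRLF line endings (an assumption, chosen because it places the byte 0x84 at offset 27, as in the traceback) and decodes the UTF-8 bytes with cp932 directly. `decode_as` is a hypothetical helper, not part of myst-parser.

```python
# Content of index.md from the report; CRLF line endings are an assumption.
SOURCE = "# Title\r\n\r\n## ASCII\r\n\r\n## Ä\r\n"


def decode_as(data: bytes, encoding: str):
    """Return the decoded text, or the UnicodeDecodeError if decoding fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return exc


raw = SOURCE.encode("utf-8")  # "Ä" becomes the two bytes 0xC3 0x84

# Decoding with the encoding the file was saved in round-trips cleanly.
as_utf8 = decode_as(raw, "utf-8")

# Decoding with the Japanese locale encoding mimics what the CLI did.
as_cp932 = decode_as(raw, "cp932")

if __name__ == "__main__":
    print("byte at offset 27:", hex(raw[27]))
    print("utf-8 :", ascii(as_utf8))
    print("cp932 :", ascii(as_cp932))
```

This only imitates the decode step; it does not run myst-anchors or read conf.py.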

conf.py

source_encoding = 'utf_8'

List your environment

$ python
Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug  1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.getdefaultlocale()
('ja_JP', 'cp932')

Windows 10 Home 21H2
myst-parser==0.18.0
Sphinx==5.1.1

Metadata

    Labels

    bug: Something isn't working
