feat: add CP857 (Turkish DOS) to supported document encodings#307782

Closed
Diode11-Alt wants to merge 2 commits into
microsoft:mainfrom
Diode11-Alt:fix/add-cp857-encoding

Conversation

@Diode11-Alt

Summary

This PR adds CP857 (Turkish DOS) to VS Code's supported document encodings, resolving the long-standing request in #300041 (160+ upvotes).

Problem

CP857 (Code Page 857) was the dominant encoding used in Turkey until the 2000s. Many legacy Turkish projects, government databases, and archived files still use this encoding. Currently, VS Code users working with CP857-encoded files cannot select this encoding from the encoding picker, forcing them to use workarounds or external tools.

While CP857 is already recognized in VS Code's terminal encoding detection (src/vs/base/node/terminalEncoding.ts), it was missing from the document encoding picker (SUPPORTED_ENCODINGS).

Changes

src/vs/workbench/services/textfile/common/encoding.ts:

  • Added cp857 entry to the SUPPORTED_ENCODINGS map
  • Grouped it with the existing Turkish encodings (Windows 1254 at order 32, ISO 8859-9 at order 33)
  • Assigned order 34, shifting subsequent encodings by +1
  • Labels follow existing conventions: 'Turkish (CP 857)' (long) / 'CP 857' (short)
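
The change described above can be sketched roughly as follows. This is a minimal illustration, not the actual diff: the `EncodingEntry` interface, the `insertEncoding` helper, the map keys `windows1254` and `iso88599`, the long labels of the two existing entries, and the `placeholderNext` entry are all assumptions made for the example. Only the `cp857` labels and the orders 32, 33, and 34 come from the description.

```typescript
// Hypothetical, trimmed-down shape of an entry; not copied from encoding.ts.
interface EncodingEntry {
  labelLong: string;
  labelShort: string;
  order: number;
}

type EncodingMap = Record<string, EncodingEntry>;

// Existing Turkish entries at the orders stated in the PR, followed by a
// placeholder that stands in for whatever entry previously had order 34.
const before: EncodingMap = {
  windows1254: { labelLong: 'Turkish (Windows 1254)', labelShort: 'Windows 1254', order: 32 },
  iso88599: { labelLong: 'Turkish (ISO 8859-9)', labelShort: 'ISO 8859-9', order: 33 },
  placeholderNext: { labelLong: 'Placeholder (next entry)', labelShort: 'Next', order: 34 },
};

// Illustrative helper: returns a new map with `entry` added under `key` and
// every existing entry at or after the new order shifted by +1.
function insertEncoding(map: EncodingMap, key: string, entry: EncodingEntry): EncodingMap {
  const result: EncodingMap = {};
  for (const [name, value] of Object.entries(map)) {
    const shifted = value.order >= entry.order ? value.order + 1 : value.order;
    result[name] = { ...value, order: shifted };
  }
  result[key] = { ...entry };
  return result;
}

const withCp857 = insertEncoding(before, 'cp857', {
  labelLong: 'Turkish (CP 857)',
  labelShort: 'CP 857',
  order: 34,
});
```

In the real file the orders are presumably written as literals, so the shift would be a manual renumbering rather than a helper call. The helper only makes the "+1" rule explicit.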

Why this works without further changes

  • @vscode/iconv-lite-umd already supports CP857 natively, so no dependency updates are needed
  • The encoding picker UI automatically picks up any entry in SUPPORTED_ENCODINGS
  • The toNodeEncoding() and encodingExists() functions handle CP857 correctly via iconv-lite's normalization
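
To illustrate why adding a map entry is enough for the picker, here is a hedged sketch of how a UI list could be derived from such a map. The names `PickerEncoding`, `PickerItem`, `pickerEncodings`, and `toPickerItems` are hypothetical, and the map keys for the existing Turkish entries are assumed. VS Code's real picker code is not reproduced here.

```typescript
// Hypothetical entry shape for this example only.
interface PickerEncoding {
  labelLong: string;
  labelShort: string;
  order: number;
}

// Hypothetical item shape consumed by a picker UI.
interface PickerItem {
  id: string;
  label: string;
}

// Deliberately declared out of order to show that `order` drives the listing.
const pickerEncodings: Record<string, PickerEncoding> = {
  cp857: { labelLong: 'Turkish (CP 857)', labelShort: 'CP 857', order: 34 },
  iso88599: { labelLong: 'Turkish (ISO 8859-9)', labelShort: 'ISO 8859-9', order: 33 },
  windows1254: { labelLong: 'Turkish (Windows 1254)', labelShort: 'Windows 1254', order: 32 },
};

// Every key in the map becomes one item, sorted by its `order` field.
function toPickerItems(map: Record<string, PickerEncoding>): PickerItem[] {
  return Object.keys(map)
    .sort((a, b) => map[a].order - map[b].order)
    .map(id => ({ id, label: map[id].labelLong }));
}

const pickerItems = toPickerItems(pickerEncodings);
```

Because the list is computed from the map's keys, a new entry appears in the picker without any UI-side change, which is the property the bullets above rely on.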

Testing

  • Verified that @vscode/iconv-lite-umd recognizes 'cp857' as a valid encoding
  • The change follows the exact same pattern as all other DOS code page entries (CP 437, CP 850, CP 852, CP 865, CP 866)
  • No breaking changes: the change is purely additive

Before / After

| Before | After |
| --- | --- |
| Turkish section shows only Windows 1254 and ISO 8859-9 | Turkish section now also shows CP 857 |
| CP857 files cannot be opened with correct encoding | CP857 files can be properly decoded and re-encoded |

Fixes #300041

Fixes microsoft#300041

CP857 was the most popular encoding used in Turkey until the 2000s and
many legacy projects still use it. It was already supported in the
terminal encoding list but was missing from the document encoding picker.

This adds CP857 as 'Turkish (CP 857)' to the SUPPORTED_ENCODINGS map,
grouped with the other Turkish encodings (Windows 1254, ISO 8859-9).
The underlying iconv-lite-umd library already supports CP857, so no
additional dependency changes are needed.
@vs-code-engineering
Contributor

📬 CODENOTIFY

The following users are being notified based on files changed in this PR:

@bpasero

Matched files:

  • src/vs/workbench/services/textfile/common/encoding.ts

@Diode11-Alt
Author

I agree

lramos15 requested a review from bpasero April 6, 2026 13:22
lramos15 assigned bpasero and unassigned lramos15 Apr 6, 2026
bpasero enabled auto-merge (squash) April 23, 2026 18:19
@bpasero
Contributor

bpasero commented Apr 24, 2026

#300114

@bpasero bpasero closed this Apr 24, 2026
auto-merge was automatically disabled April 24, 2026 06:11

Pull request was closed

Development

Successfully merging this pull request may close these issues.

CP857 is missing in supported document encodings

4 participants