1. Introduction
For now, see the explainer.
2. Dependencies
These APIs are part of a family of APIs expected to be powered by machine learning models, which share common API surface idioms and specification patterns. Currently, the specification text for these shared parts lives in Writing Assistance APIs § 5 Shared infrastructure, and the common privacy and security considerations are discussed in Writing Assistance APIs § 6 Privacy considerations and Writing Assistance APIs § 7 Security considerations. Implementing these APIs requires implementing that shared infrastructure, and conforming to those privacy and security considerations. But it does not require implementing or exposing the actual writing assistance APIs. [WRITING-ASSISTANCE-APIS]
3. The proofreader API
[Exposed=Window, SecureContext]
interface Proofreader {
  static Promise<Proofreader> create(optional ProofreaderCreateOptions options = {});
  static Promise<Availability> availability(optional ProofreaderCreateCoreOptions options = {});

  Promise<ProofreadResult> proofread(
    DOMString input,
    optional ProofreaderProofreadOptions options = {}
  );

  readonly attribute boolean includeCorrectionTypes;
  readonly attribute boolean includeCorrectionExplanations;
  readonly attribute FrozenArray<DOMString>? expectedInputLanguages;
  readonly attribute DOMString? correctionExplanationLanguage;
};

dictionary ProofreaderCreateCoreOptions {
  boolean includeCorrectionTypes = false;
  boolean includeCorrectionExplanations = false;
  sequence<DOMString> expectedInputLanguages;
  DOMString correctionExplanationLanguage;
};

dictionary ProofreaderCreateOptions : ProofreaderCreateCoreOptions {
  AbortSignal signal;
  CreateMonitorCallback monitor;
};

dictionary ProofreaderProofreadOptions {
  AbortSignal signal;
};

dictionary ProofreadResult {
  DOMString correctedInput;
  sequence<ProofreadCorrection> corrections;
};

dictionary ProofreadCorrection {
  unsigned long long startIndex;
  unsigned long long endIndex;
  DOMString correction;
  sequence<CorrectionType> types;
  DOMString explanation;
};

enum CorrectionType { "spelling", "punctuation", "capitalization", "grammar" };
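As a rough illustration of how a page might use this interface, the sketch below checks availability, creates a proofreader, and proofreads one string. The helper name proofreadIfAvailable and its api parameter are hypothetical, introduced only so the function can also be exercised where no Proofreader global exists. The option values are examples, not recommendations.

```javascript
// Hypothetical helper, not part of the API. `api` defaults to the global
// Proofreader interface when the user agent exposes one.
async function proofreadIfAvailable(input, api = globalThis.Proofreader) {
  if (api === undefined) {
    return null; // The API is not exposed in this environment.
  }

  // Example option values only.
  const options = {
    includeCorrectionTypes: true,
    expectedInputLanguages: ["en"],
  };

  // availability() accepts the same core options as create().
  const availability = await api.availability(options);
  if (availability === "unavailable") {
    return null;
  }

  // create() might need to download and initialize a model first.
  const proofreader = await api.create(options);

  // Per the IDL, the result has `correctedInput` and `corrections` members.
  return proofreader.proofread(input);
}
```

A caller would await the returned promise and treat a null result as "proofreading is not possible here".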
3.1. Creation
The static create(options) method steps are:
-
Return the result of creating an AI model object given options, "Proofreader", validate and canonicalize proofreader options, compute proofreader options availability, download the proofreader model, initialize the proofreader model, create a proofreader object, and false.
To validate and canonicalize proofreader options given a ProofreaderCreateCoreOptions options, perform the following steps. They mutate options in place to canonicalize and deduplicate language tags, and throw an exception if any are invalid.
-
Validate and canonicalize language tags given options and "expectedInputLanguages".
-
Validate and canonicalize language tags given options and "correctionExplanationLanguage".
To download the proofreader model, given a ProofreaderCreateCoreOptions options:
-
Assert: these steps are running in parallel.
-
Initiate the download process for everything the user agent needs to proofread text according to options. This could include a base AI model, fine-tunings for specific languages or option values, or other resources.
-
If the download process cannot be started for any reason, then return false.
-
Return true.
To initialize the proofreader model, given a ProofreaderCreateOptions options:
-
Assert: these steps are running in parallel.
-
Perform any necessary initialization operations for the AI model backing the user agent’s proofreading capabilities.
This could include loading the model into memory, or loading any fine-tunings necessary to support the other options expressed by options.
-
If initialization failed for any reason, then return a DOMException error information whose name is "OperationError" and whose details contain appropriate detail.
-
Return null.
To create a proofreader object, given a realm realm and a ProofreaderCreateOptions options:
-
Assert: these steps are running on realm’s surrounding agent’s event loop.
-
Return a new Proofreader object, created in realm, with
- include correction types
-
options["includeCorrectionTypes"], defaulting to false
- include correction explanations
-
options["includeCorrectionExplanations"], defaulting to false
- expected input languages
-
the result of creating a frozen array given options["expectedInputLanguages"] if it is not empty; otherwise null
- correction explanation language
-
options["correctionExplanationLanguage"] if it exists; otherwise null
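The mapping from options to the new object's internal state can be sketched as a plain function. This is only a model of the defaulting rules listed above: the function name proofreaderStateFromOptions and the use of an ordinary object in place of a real Proofreader are assumptions made for illustration, and options is assumed to have been validated and canonicalized already.

```javascript
// Hypothetical model of the state set up when creating a proofreader object.
// `options` is assumed to be already validated and canonicalized.
function proofreaderStateFromOptions(options = {}) {
  const languages = options.expectedInputLanguages ?? [];
  return {
    // Boolean members default to false.
    includeCorrectionTypes: options.includeCorrectionTypes ?? false,
    includeCorrectionExplanations: options.includeCorrectionExplanations ?? false,
    // A frozen copy if any languages were given; otherwise null.
    expectedInputLanguages:
      languages.length > 0 ? Object.freeze([...languages]) : null,
    // The dictionary member if it exists; otherwise null.
    correctionExplanationLanguage: options.correctionExplanationLanguage ?? null,
  };
}
```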
3.2. Availability
The static availability(options) method steps are:
-
Return the result of computing AI model availability given options, "Proofreader", validate and canonicalize proofreader options, and compute proofreader options availability.
To compute proofreader options availability given a ProofreaderCreateCoreOptions options, perform the following steps. They return either an Availability value or null, and they mutate options in place to update language tags to their best-fit matches.
-
Assert: this algorithm is running in parallel.
-
Let availability be the proofreader non-language options availability given options["includeCorrectionTypes"] and options["includeCorrectionExplanations"].
-
Let double be the proofreader language availabilities double.
-
If double is null, then return null.
-
Let inputLanguageAvailability be the result of computing language availability given options["expectedInputLanguages"] and double’s input languages.
-
Let correctionExplanationLanguagesList be « options["correctionExplanationLanguage"] ».
-
Let correctionExplanationLanguageAvailability be the result of computing language availability given correctionExplanationLanguagesList and double’s correction explanation languages.
-
Set options["correctionExplanationLanguage"] to correctionExplanationLanguagesList[0].
-
Return the minimum availability given « availability, inputLanguageAvailability, correctionExplanationLanguageAvailability ».
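The final step combines three partial results into one. A minimal sketch of that combination follows. The ordering used here (from "unavailable" up to "available") and the rule that any null input yields null are assumptions about the shared infrastructure's "minimum availability" definition, which is not reproduced in this section. The function names are hypothetical.

```javascript
// Assumed ordering from least to most available. The ordering itself is
// defined by the shared infrastructure, not in this section.
const AVAILABILITY_ORDER = ["unavailable", "downloadable", "downloading", "available"];

// Assumed behavior: null if any input is null, else the least-available value.
function minimumAvailability(values) {
  if (values.includes(null)) return null;
  let min = "available";
  for (const value of values) {
    if (AVAILABILITY_ORDER.indexOf(value) < AVAILABILITY_ORDER.indexOf(min)) {
      min = value;
    }
  }
  return min;
}

// Mirrors the final step: combine the non-language result with the two
// language results.
function combineProofreaderAvailability(nonLanguage, inputLanguages, explanationLanguage) {
  return minimumAvailability([nonLanguage, inputLanguages, explanationLanguage]);
}
```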
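The proofreader non-language options availability, given booleans includeCorrectionTypes and includeCorrectionExplanations, is given by the following steps. They return an Availability value or null.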
-
Assert: this algorithm is running in parallel.
-
If there is some error attempting to determine whether the user agent can support proofreading text, which the user agent believes to be transient (such that re-querying could stop producing such an error), then return null.
-
If the user agent currently supports proofreading text with/without correction types as described by includeCorrectionTypes and with/without correction explanations as described by includeCorrectionExplanations, then return "available".
-
If the user agent believes it will be able to support proofreading text according to includeCorrectionTypes and includeCorrectionExplanations, but only after finishing a download that is already ongoing, then return "downloading".
-
If the user agent believes it will be able to support proofreading text according to includeCorrectionTypes and includeCorrectionExplanations, but only after performing a not-currently-ongoing download, then return "downloadable".
-
Otherwise, return "unavailable".
The proofreader language availabilities double is given by the following steps. They return a language availabilities double or null.
-
Assert: this algorithm is running in parallel.
-
If there is some error attempting to determine whether the user agent can support proofreading text, which the user agent believes to be transient (such that re-querying could stop producing such an error), then return null.
-
Return a language availabilities double with:
- input languages
-
the result of getting the language availabilities partition given the purpose of proofreading text written in that language
- correction explanation languages
-
the result of getting the language availabilities partition given the purpose of producing text explanations of proofreading corrections in that language
One way this could be implemented would be for proofreader language availabilities double to return that "zh-Hant" is in the input languages["available"] set, and "zh" and "zh-Hans" are in the input languages["downloadable"] set. This return value conforms to the requirements of the language tag set completeness rules, in ensuring that "zh" is present. Per the "should"-level guidance, the implementation has determined that "zh" belongs in the set of downloadable input languages, with "zh-Hans", instead of in the set of available input languages, with "zh-Hant".
Combined with the use of LookupMatchingLocaleByBestFit, this means availability() will give the following answers:
function a(languageTag) {
  return Proofreader.availability({
    expectedInputLanguages: [languageTag]
  });
}

await a("zh") === "downloadable";
await a("zh-Hant") === "available";
await a("zh-Hans") === "downloadable";

await a("zh-TW") === "available";      // zh-TW will best-fit to zh-Hant
await a("zh-HK") === "available";      // zh-HK will best-fit to zh-Hant
await a("zh-CN") === "downloadable";   // zh-CN will best-fit to zh-Hans

await a("zh-BR") === "downloadable";   // zh-BR will best-fit to zh
await a("zh-Kana") === "downloadable"; // zh-Kana will best-fit to zh
3.3. Language availability
A language availabilities partition is a map whose keys are "downloading", "downloadable", or "available", and whose values are sets of strings representing Unicode canonicalized locale identifiers. [ECMA-402]
A language availabilities double is a struct with the following items:
-
input languages, a language availabilities partition
-
correction explanation languages, a language availabilities partition
To get the language availabilities partition, given a description of a purpose purpose:
-
Let partition be «[ "available" → an empty set, "downloading" → an empty set, "downloadable" → an empty set ]».
-
For each human language languageTag, represented as a Unicode canonicalized locale identifier, for which the user agent currently supports purpose:
-
Append languageTag to partition["available"].
-
-
For each human language languageTag, represented as a Unicode canonicalized locale identifier, for which the user agent believes it will be able to support purpose, but only after finishing a download that is already ongoing:
-
Append languageTag to partition["downloading"].
-
-
For each human language languageTag, represented as a Unicode canonicalized locale identifier, for which the user agent believes it will be able to support purpose, but only after performing a not-currently-ongoing download:
-
Append languageTag to partition["downloadable"].
-
-
Assert: partition["available"], partition["downloading"], and partition["downloadable"] are disjoint.
-
If the union of partition["available"], partition["downloading"], and partition["downloadable"] does not meet the language tag set completeness rules, then:
-
Let missingLanguageTags be the set of missing language tags necessary for that union to meet the language tag set completeness rules.
-
For each languageTag of missingLanguageTags:
-
Append languageTag to one of the three sets. Which of the sets to append to is implementation-defined, and should be guided by considerations similar to that of LookupMatchingLocaleByBestFit in terms of keeping "best fallback languages" together.
-
Return partition.
-
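To compute language availability given a list of strings requestedLanguages and a language availabilities partition partition, perform the following steps. They return an Availability value, and they mutate requestedLanguages in place to update language tags to their best-fit matches.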
-
Let availability be "available".
-
For each language of requestedLanguages:
-
Let unavailable be true.
-
For each availabilityToCheck of « "available", "downloading", "downloadable" »:
-
Let languagesWithThisAvailability be partition[availabilityToCheck].
-
Let bestMatch be LookupMatchingLocaleByBestFit(languagesWithThisAvailability, « language »).
-
If bestMatch is not undefined, then:
-
Replace language with bestMatch.[[locale]] in requestedLanguages.
-
Set availability to the minimum availability given availability and availabilityToCheck.
-
Set unavailable to false.
-
-
If unavailable is true, then return "unavailable".
-
-
Return availability.
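The loop above can be sketched as follows. LookupMatchingLocaleByBestFit is implementation-defined, so this sketch swaps in a much simpler prefix-truncation lookup (the hypothetical lookupByPrefix), which is an assumption and not the real matcher. In particular, it has no likely-subtag data, so it would not map "zh-TW" to "zh-Hant" the way the earlier example expects. The ordering used for "minimum availability" is likewise assumed.

```javascript
// Simplified stand-in for LookupMatchingLocaleByBestFit (an assumption):
// try the tag itself, then progressively shorter prefixes ("en-US" -> "en").
function lookupByPrefix(availableTags, requested) {
  let candidate = requested;
  while (candidate !== "") {
    if (availableTags.has(candidate)) return candidate;
    const cut = candidate.lastIndexOf("-");
    candidate = cut === -1 ? "" : candidate.slice(0, cut);
  }
  return undefined;
}

// Assumed ordering for "minimum availability", least available first.
const LANGUAGE_AVAILABILITY_ORDER = ["unavailable", "downloadable", "downloading", "available"];
const minOfTwo = (a, b) =>
  LANGUAGE_AVAILABILITY_ORDER.indexOf(a) <= LANGUAGE_AVAILABILITY_ORDER.indexOf(b) ? a : b;

// `partition` maps "available" / "downloading" / "downloadable" to Sets.
// Mutates `requestedLanguages` in place, as the steps describe.
function computeLanguageAvailability(requestedLanguages, partition) {
  let availability = "available";
  for (let i = 0; i < requestedLanguages.length; i++) {
    const language = requestedLanguages[i];
    let unavailable = true;
    for (const toCheck of ["available", "downloading", "downloadable"]) {
      const bestMatch = lookupByPrefix(partition[toCheck], language);
      if (bestMatch !== undefined) {
        requestedLanguages[i] = bestMatch;
        availability = minOfTwo(availability, toCheck);
        unavailable = false;
      }
    }
    if (unavailable) return "unavailable";
  }
  return availability;
}
```

With a partition whose available set is { "en" } and whose downloadable set is { "ja" }, a request for "en-US" and "ja" is rewritten to "en" and "ja", and the overall result is "downloadable".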
3.4. The Proofreader class
Every Proofreader has an include correction types, a boolean (false by default), set during creation.
Every Proofreader has an include correction explanations, a boolean (false by default), set during creation.
Every Proofreader has an expected input languages, a FrozenArray<DOMString> or null, set during creation.
Every Proofreader has a correction explanation language, a string or null, set during creation.
The includeCorrectionTypes getter steps are to return this’s include correction types.
The includeCorrectionExplanations getter steps are to return this’s include correction explanations.
The expectedInputLanguages getter steps are to return this’s expected input languages.
The correctionExplanationLanguage getter steps are to return this’s correction explanation language.
The proofread(input, options) method steps are:
-
Let operation be an algorithm step which takes arguments chunkProduced, done, error, and stopProducing, and proofreads input given this’s include correction types, this’s include correction explanations, this’s correction explanation language, chunkProduced, done, error, and stopProducing.
-
Return the result of getting an aggregated AI model result given this, options, and operation.
The measureInputUsage(input, options) method steps are:
-
Let measureUsage be an algorithm step which takes argument stopMeasuring, and returns the result of measuring proofreader input usage given input, this’s include correction types, this’s include correction explanations, this’s correction explanation language, and stopMeasuring.
-
Return the result of measuring AI model input usage given this, options, and measureUsage.
3.5. Proofreading
3.5.1. The algorithm
To proofread, given:
-
a string input,
-
a boolean includeCorrectionTypes,
-
a boolean includeCorrectionExplanations,
-
a string-or-null correctionExplanationLanguage,
-
an algorithm chunkProduced that takes a string and returns nothing,
-
an algorithm done that takes no arguments and returns nothing,
-
an algorithm error that takes error information and returns nothing, and
-
an algorithm stopProducing that takes no arguments and returns a boolean,
perform the following steps:
-
Assert: this algorithm is running in parallel.
-
Let requested be the result of measuring proofreader input usage given input, includeCorrectionTypes, includeCorrectionExplanations, correctionExplanationLanguage, and stopProducing.
-
If requested is null, then return.
-
If requested is an error information, then:
-
Perform error given requested.
-
Return.
-
-
Assert: requested is a number.
-
In an implementation-defined manner, subject to the following guidelines, begin the process of proofreading input into a ProofreadResult with a string correctedInput as the proofread text and a list of ProofreadCorrection corrections detailing all the corrections made to input to form correctedInput.
If input is the empty string, or otherwise consists of no proofreadable content (e.g., only contains whitespace, or control characters), then the resulting proofread text should be the empty string. In such cases, includeCorrectionTypes, includeCorrectionExplanations, and correctionExplanationLanguage should be ignored.
The proofreading should conform to the guidance given by includeCorrectionTypes and includeCorrectionExplanations.
The proofreading process must conform to the guidance given in § 4 Privacy considerations and § 5 Security considerations, notably including (but not limited to) Writing Assistance APIs § 6.4 User input and Writing Assistance APIs § 7.2 Runtime shared resources.
If correctionExplanationLanguage is non-null, the correction explanations should be in that language. Otherwise, they should be in the language of input. If input contains multiple languages, or the language of input cannot be detected, then either the correction explanation language is implementation-defined, or the implementation may treat this as an error, per the guidance in § 3.5.4 Errors.
Implementers should do their utmost to ensure that the result is an actual proofread result of input, and is not arbitrary output prompted by input.
For example, if input is "what is capital of France", then it would be incorrect to answer this question, e.g. by outputting "Paris is the capital of France." A more correct output would be, e.g., "What is the capital of France?".
-
Wait for the next chunk of proofreading data to be produced, for the proofreading process to finish, or for the result of calling stopProducing to become true.
-
If such a chunk is successfully produced:
-
Let it be represented as a string chunk.
-
Perform chunkProduced given chunk.
-
Otherwise, if the proofreading process has finished:
-
Perform done.
-
Otherwise, if stopProducing returns true, then break.
-
Otherwise, if an error occurred during proofreading:
-
Let the error be represented as error information errorInfo according to the guidance in § 3.5.4 Errors.
-
Perform error given errorInfo.
-
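To illustrate how corrections relates to correctedInput, the sketch below rebuilds the corrected text from the original input and a list of corrections. It rests on assumptions that this section does not state: startIndex and endIndex are taken to be UTF-16 code unit offsets into the original input, endIndex is taken to be exclusive, and corrections are taken not to overlap. The function name applyCorrections is hypothetical.

```javascript
// Assumptions (not stated in this section): startIndex/endIndex are
// UTF-16 code unit offsets into the original input, endIndex is exclusive,
// and corrections do not overlap.
function applyCorrections(input, corrections) {
  const sorted = [...corrections].sort((a, b) => a.startIndex - b.startIndex);
  let output = "";
  let cursor = 0;
  for (const { startIndex, endIndex, correction } of sorted) {
    // Copy the untouched text before this correction, then the replacement.
    output += input.slice(cursor, startIndex) + correction;
    cursor = endIndex;
  }
  // Copy whatever remains after the last correction.
  return output + input.slice(cursor);
}
```

Under those assumptions, a page could use the same walk to highlight each corrected range instead of concatenating strings.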
3.5.2. Usage
To measure proofreader input usage, given:
-
a string input,
-
a boolean includeCorrectionTypes,
-
a boolean includeCorrectionExplanations,
-
a string-or-null correctionExplanationLanguage, and
-
an algorithm stopMeasuring that takes no arguments and returns a boolean,
perform the following steps:
-
Assert: this algorithm is running in parallel.
-
Let inputToModel be the implementation-defined string that would be sent to the underlying model in order to proofread given input, includeCorrectionTypes, includeCorrectionExplanations, and correctionExplanationLanguage.
If during this process stopMeasuring starts returning true, then return null.
If an error occurs during this process, then return an appropriate DOMException error information according to the guidance in § 3.5.4 Errors.
-
Return the amount of input usage needed to represent inputToModel when given to the underlying model. The exact calculation procedure is implementation-defined, subject to the following constraints.
The returned input usage must be nonnegative and finite. It must be 0 if there are no usage quotas for the proofreading process. Otherwise, it must be positive and should be roughly proportional to the length of inputToModel.
This might be the number of tokens needed to represent input in a language model tokenization scheme, or it might be input’s length. It could also be some variation of these which also counts the usage of any prefixes or suffixes necessary to give to the model.
If during this process stopMeasuring starts returning true, then instead return null.
If an error occurs during this process, then instead return an appropriate DOMException error information according to the guidance in § 3.5.4 Errors.
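One possible shape for the calculation, shown only to make the constraints concrete: the result is 0 when there is no quota, and otherwise positive, finite, and proportional to the length of inputToModel. The function name estimateInputUsage and the hasQuota and unitsPerCodeUnit parameters are hypothetical, and the 0.25 ratio is an arbitrary placeholder rather than a real tokenizer ratio.

```javascript
// Hypothetical usage estimator. `hasQuota` and `unitsPerCodeUnit` are
// illustrative parameters, not part of any API.
function estimateInputUsage(
  inputToModel,
  { hasQuota = true, unitsPerCodeUnit = 0.25 } = {}
) {
  // With no usage quotas, the result must be 0.
  if (!hasQuota) return 0;
  // Otherwise positive, finite, and roughly proportional to the length.
  return Math.max(1, Math.ceil(inputToModel.length * unitsPerCodeUnit));
}
```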
3.5.3. Options
The proofread algorithm’s details are implementation-defined, as they are expected to be powered by an AI model. However, it is intended to be controllable by the web developer through the includeCorrectionTypes and includeCorrectionExplanations flags.
This section gives normative guidance on how the implementation of proofread should use each boolean flag to guide the proofreading process.
| Value | Meaning |
|---|---|
| "true" |
The proofread result should contain a list of corrections where each |
| "false" |
The proofread result should contain a list of corrections where each |
| Value | Meaning |
|---|---|
| "true" |
The proofread result should contain a list of corrections where each |
| "false" |
The proofread result should contain a list of corrections where each |
As with all "should"-level guidance, user agents might not conform perfectly to these. Especially in the case of providing correction types for all corrections, it’s expected that language models might not conform perfectly.
3.5.4. Errors
When proofreading fails, the following possible reasons may be surfaced to the web developer. This table lists the possible DOMException names and the cases in which an implementation should use them:
| DOMException name | Scenarios |
|---|---|
| "NotAllowedError" | Proofreading is disabled by user choice or user agent policy. |
| "NotSupportedError" | The input to be proofread, or the context to be provided, was in a language that the user agent does not support, or was not provided properly in the call to The proofreading correction explanation language ended up being in a language that the user agent does not support (e.g., because the user agent has not performed sufficient quality control tests on that output language), or was not provided properly in the call to The includeCorrectionExplanations is set to true, the |
| "UnknownError" | All other scenarios, including if the user agent believes it cannot proofread and also meet the requirements given in § 4 Privacy considerations or § 5 Security considerations. Or, if the user agent would prefer not to disclose the failure reason. |
This table does not give the complete list of exceptions that can be surfaced by the proofreader API. It only contains those which can come from certain implementation-defined steps.
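A page might branch on these names when a call rejects. The sketch below is one hedged way to do that. The helper name describeProofreadFailure and the message strings are illustrative assumptions. Only the three exception names come from the table.

```javascript
// Hypothetical helper mapping the DOMException names listed in this section
// to developer-facing messages. The message strings are illustrative.
function describeProofreadFailure(error) {
  switch (error?.name) {
    case "NotAllowedError":
      return "Proofreading is disabled by user choice or user agent policy.";
    case "NotSupportedError":
      return "A requested or detected language is not supported.";
    case "UnknownError":
      return "Proofreading failed for an undisclosed or unspecified reason.";
    default:
      // Other exceptions can be surfaced by steps outside this table.
      return "Proofreading failed with an exception not covered by the table.";
  }
}
```

It would typically be called from the catch block around an awaited proofread() call.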
3.6. Permissions policy integration
Access to the proofreader API is gated behind the policy-controlled feature "proofreader", which has a default allowlist of 'self'.
4. Privacy considerations
Please see Writing Assistance APIs § 6 Privacy considerations for a discussion of privacy considerations for the proofreader API. That text was written to apply to all APIs sharing the same infrastructure, as noted in § 2 Dependencies.
5. Security considerations
Please see Writing Assistance APIs § 7 Security considerations for a discussion of security considerations for the proofreader API. That text was written to apply to all APIs sharing the same infrastructure, as noted in § 2 Dependencies.