WordChecker | Devpost

Inspiration

Proofreading long texts for minor inconsistencies that a spell checker won't find can be quite cumbersome. Searching for possible typos one after another takes a lot of time, and some inconsistencies may still be overlooked. The project aims to provide a word counter that lists word occurrences in files. It can either search for several selected words in one go, or gather every word in a text into a summary list that tells you how many times each word occurs.

What it does

Counts word occurrences in text files and saves the results in alphabetical order in a text file. The results are summarized separately for each source text file.
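The counting step described above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the function names count_words and write_summary, the tab-separated output layout and the use of collections.Counter are all assumptions made for this example.

```python
import glob
import os
from collections import Counter


def count_words(text):
    """Return a Counter of the whitespace-separated tokens in text."""
    return Counter(text.split())


def write_summary(pattern, out_path):
    """Count words in every file matching pattern (hypothetical helper).

    One block is written per source file: the file name, then each word
    with its count in alphabetical order, then an empty line.
    """
    with open(out_path, "w", encoding="utf-8") as out:
        for path in sorted(glob.glob(pattern)):
            with open(path, encoding="utf-8") as src:
                counts = count_words(src.read())
            out.write(f"{os.path.basename(path)}\n")
            for word in sorted(counts):
                out.write(f"{word}\t{counts[word]}\n")
            out.write("\n")
```

A call such as write_summary("texts/*.txt", "summary.out") would then produce one alphabetical list per text file. The output file should not match the input pattern, otherwise it would be counted on the next run.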

Effects

Typos and inconsistencies in the terms used in large files can be found more easily, such as:
inconsistent hyphen usage,
reference signs for figures, as in patent applications or manuals,
similar terms, such as both "disc" and "disk", used for the same object.
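As an illustration of how a word count can expose hyphen inconsistencies, the sketch below groups counted words that differ only in hyphenation or letter case. The function hyphen_variants is hypothetical and is not described as part of WordChecker, where such pairs are found by reading the alphabetical list.

```python
from collections import defaultdict


def hyphen_variants(counts):
    """Group counted words that differ only in hyphens or letter case.

    counts maps each word to its number of occurrences. Only groups with
    more than one spelling are returned.
    """
    groups = defaultdict(dict)
    for word, n in counts.items():
        # Assumed comparison key: the word without hyphens, in lower case
        key = word.replace("-", "").lower()
        groups[key][word] = n
    return {key: forms for key, forms in groups.items() if len(forms) > 1}
```

For a count that contains "e-mail" three times and "email" once, both spellings end up in one group, which makes the inconsistency easy to spot.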

Some characters, especially commas, periods and parentheses after a word, are removed before a word is registered. This smooths the vocabulary somewhat. Punctuation inside a word (such as in "2.0") is kept, so typos such as a missing space after a comma or after a closing bracket are also registered. URLs and other compositions with slashes are not split up, so that low-level domains or word endings (such as the "s" in "file(s)") are not counted as separate words.
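One possible way to implement this kind of trimming is sketched below. The exact set of characters that WordChecker removes is not documented here, so the TRAILING constant and the bracket handling are assumptions, and the names normalize and tokens are made up for this example.

```python
# Assumed set of trailing characters to trim; the real list may differ.
TRAILING = ".,;:!?"


def normalize(token):
    """Trim punctuation around a token but keep punctuation inside it."""
    token = token.rstrip(TRAILING)
    # A token fully wrapped in parentheses loses both of them: "(word)"
    if token.startswith("(") and token.endswith(")"):
        token = token[1:-1]
    # A dangling bracket is dropped, a balanced inner pair ("file(s)") is kept
    if token.startswith("(") and token.count("(") > token.count(")"):
        token = token[1:]
    if token.endswith(")") and token.count(")") > token.count("("):
        token = token[:-1]
    return token.rstrip(TRAILING)


def tokens(text):
    """Split on whitespace only, so URLs and "a/b" compounds stay whole."""
    cleaned = (normalize(t) for t in text.split())
    return [t for t in cleaned if t]
```

Because the text is split on whitespace only, a token such as "text,more" stays in one piece and shows up in the list as a likely missing space.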

How I built it

with Python

Challenges I ran into

Accomplishments that I'm proud of

What I learned

What's next for WordChecker

Some syntax highlighting for similar words
Other programming languages

Requirements

Written in Python 3 (works with, for example, version 3.8.1). The following modules are imported: sys, re, glob, os.

Works best if the text files are saved as UTF-8 (with or without BOM), which can be done, for example, with Windows Notepad or Notepad++. The 3-byte BOM that can occur at the beginning of a UTF-8 file is skipped, so that the first word of the file is registered as a "normal word".
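In Python, one common way to skip the BOM is the standard "utf-8-sig" codec, which decodes UTF-8 and drops a leading BOM if one is present. Whether WordChecker uses this codec or removes the three bytes by hand is not stated, so the helper read_text below is only a sketch.

```python
def read_text(path):
    """Read a UTF-8 text file, with or without a leading BOM."""
    # "utf-8-sig" removes the BOM bytes EF BB BF when they start the file
    with open(path, encoding="utf-8-sig") as handle:
        return handle.read()
```

With plain "utf-8", the BOM would be decoded as the character U+FEFF and stick to the first word, which would then be counted as a different word.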

Built With

python

Updates

Daniela Grothe started this project — Jun 25, 2020 02:16 PM EDT
