README
Welcome to the forked version of
the British speller.
Since no one was updating the speller (a free
project that takes a lot of effort), I took on the task
myself in 2013, around a decade ago.
I grabbed the original project and started
adding/removing/fixing words. I kept the original authors'
credits and added my name.
I believe I have created the most up-to-date British speller. It
covers several fields of knowledge, from simple to
complex words.
Furthermore, it is suitable as a basis for Commonwealth and
European English.
Whatever your race, religion, gender, age or
academic background, everyone should have equal and free
access to all words.
I keep improving the speller as much as I can by
testing it in the field:
Most of my e-mails are in English, so I see the typo
reports and check whether they are real typos or
missing words.
I have also been pasting webpages from newspapers, TV
channels and such to see which words are flagged.
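The field test described above can be approximated in a few lines. This is only a minimal sketch, assuming the dictionary has already been expanded into a plain list of words; the names `flagged_words` and `known_words` are hypothetical, and the case-insensitive comparison is a simplification that a real spellchecker does not make.

```python
import re

# A word is a run of letters, optionally joined by apostrophes or hyphens
# (so "dog's" and "on-line" are each kept as a single token).
WORD_RE = re.compile(r"[A-Za-z]+(?:['’-][A-Za-z]+)*")


def flagged_words(text, known_words):
    """Return the words in text that are not in known_words, in order of first appearance."""
    known = {word.lower() for word in known_words}
    flagged = []
    for token in WORD_RE.findall(text):
        # Simplification: compare case-insensitively and report each word once.
        if token.lower() not in known and token not in flagged:
            flagged.append(token)
    return flagged
```

Each flagged word still needs to be looked up by hand to decide whether it is a typo or a missing entry.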
To make sure the words I add are the correct ones, I have
been looking for them in credible sources:
1) Oxford Dictionaries;
2) Collins Dictionary;
3) Macmillan Dictionary;
4) Cambridge Dictionary;
5) Merriam-Webster Dictionary (used with caution);
6) Wiktionary (used with caution);
7) Wikipedia (used with caution);
8) Physical dictionaries.
In January 2015, I purchased an “Oxford
Gold Account” to get fuller access to Oxford
Dictionaries.
I am also involved in several projects with specific
jargon, so I have added some “special”
words.
I have been told to use scripts to update the dictionary,
but I am adding the words with copy/paste after checking
them in the dictionaries mentioned above. This is slower
and harder, but the results are much better and more accurate.
Some words are chiefly American, and I will only add them
if there is no British equivalent.
In July 2019, Стоян e-mailed me saying that I had added many
random words, making the dictionary a lengthy lexicon
instead of a spellchecker, and that I needed a big corpus
with the top 15–30% of the most frequent words from
different areas of science, newspapers, fiction, poetry,
Wikipedia articles, texts from Project Gutenberg and other
books or websites.
Thus, I have been focusing on adding plurals and
possessives to the wordlist and also cleaning the .dic
file by removing duplicates and merging words using affix
flags. I have been checking important Wikipedia articles
to find missing words for specific subjects, so that
essays on a subject can be written with most of
its terms available.
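The duplicate clean-up mentioned above can be illustrated with a minimal sketch. It is not how the dictionary is actually maintained (that is done by hand). It assumes single-character flags, entries of the form word/FLAGS with no extra fields, and an optional word-count first line; the function name `merge_dic_entries` is hypothetical. Taking the union of flags is only safe when it does not combine a PFX flag with an SFX flag, which is the hard case for this dictionary.

```python
def merge_dic_entries(lines):
    """Merge .dic entries that share a word by taking the union of their flags."""
    merged = {}  # word -> list of flags, in first-seen order
    for index, line in enumerate(lines):
        entry = line.strip()
        if not entry:
            continue
        # Assumption: the first line of a .dic file may be the word count.
        if index == 0 and entry.isdigit():
            continue
        word, _, flags = entry.partition("/")
        seen = merged.setdefault(word, [])
        for flag in flags:  # assumption: one character per flag
            if flag not in seen:
                seen.append(flag)
    return [
        f"{word}/{''.join(flags)}" if flags else word
        for word, flags in merged.items()
    ]
```

For example, the entries cat/S and cat/M would be merged into a single cat/SM line, while words without flags are kept unchanged.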
I pioneered the concept of adding possessives to words and
listing them in the release notes.
Some people complained that I add “all
words under the sun”. If you find any obsolete
words or archaic words that are close to current
replacements, please report them. I have been adding
derivatives of words to ensure that words like “biblically”
aren't missing (see the Bugzilla ticket in LibreOffice: https://bugs.documentfoundation.org/show_bug.cgi?id=154826).
Status of the British Dictionary V3.2.4:
The statistics for V3.2.4, released on 1.Sep.2023.
Please note that there are thousands of duplicates in the
wordlist, because some words in the .dic can't be merged
when their flags contain both SFX and PFX.
Those PFXs make it harder to find whether words are already in
the .dic and harder to merge flags, and they also
mess up the order of the extracted wordlist on
release day.
This and other things, such as duplicated flags, cause
duplicates.
In future versions of Proofing Tool GUI, I will code
features to mitigate this issue.
About ize/ise:
Just like in other languages, some words can be
written differently. Since Oxford says some words are
valid both ways, I kept both and users decide which they
prefer. A good example is: “online”
and “on-line”.
For ize/ise, both ways are accepted in some words:
— optimize/optimise:
https://dictionary.com/browse/optimise;
— realize/realise:
https://dictionary.com/browse/realise.
Oxford Dictionaries only shows that certain words
accept both ize/ise to Premium accounts.
Regular users won't see this on the Oxford
website, but I have access to it.
Places from New Zealand/UK
(England, Scotland, Wales & Northern Ireland):
In V2.61–2.64 I included tons of place names.
My scientist friend, Peter McGavin, told me that in NZ
they use British English, so I decided to do something about it. I
did the same for the UK. I searched Wikipedia for “towns”, “counties”,
“villages”, “boroughs”,
“suburbs”, etc. and
based my additions on:
— https://en.wikipedia.org/wiki/List_of_towns_in_England;
— https://en.wikipedia.org/wiki/List_of_towns_in_New_Zealand;
— https://en.wikipedia.org/wiki/List_of_civil_parishes_in_England;
— https://en.wikipedia.org/wiki/List_of_civil_parishes_in_Scotland;
— https://en.wikipedia.org/wiki/List_of_places_in_Scotland;
— https://en.wikipedia.org/wiki/List_of_communities_in_Wales;
— https://en.wikipedia.org/wiki/Local_government_in_Wales;
— https://en.wikipedia.org/wiki/List_of_towns_and_villages_in_Northern_Ireland;
— https://en.wikipedia.org/wiki/Counties_of_Northern_Ireland;
— https://en.wikipedia.org/wiki/Category:Suburbs_in_New_Zealand;
— https://en.wikipedia.org/wiki/List_of_Church_of_Scotland_parishes.
Furthermore, I added places sent to me by Peter C.:
© OpenStreetMap contributors:
www.openstreetmap.org/copyright.
© The Clergy of the Church of England Database Project,
2005.
Cities from Australia
In V2.65 I added the cities in Australia by
population, since their names are valid English:
— https://en.wikipedia.org/wiki/List_of_cities_in_Australia_by_population
Cities from US
In V2.65 I added tons of cities in the US with a
population of 10 000+, since their names are valid English.
This list was supplied by Michael Holroyd on Kevin
Atkinson's GitHub.
Cities from Canada
In V2.67 I added the cities in Canada,
since their names are valid English:
— https://en.wikipedia.org/wiki/List_of_cities_in_Canada
State and union territory capitals in India
In V2.90 I added this list to the dictionary,
since the names are valid English:
— https://en.wikipedia.org/wiki/List_of_state_and_union_territory_capitals_in_India
Common prescription and OTC drugs
In V2.63 I added tons of drug names supplied by
Andrew Ziem on Kevin Atkinson's GitHub.
The generic drugs (such as “diphenhydramine”)
are in lowercase, while the brand names (such as “Abilify”)
are capitalised.
Words regarding COVID-19
In V2.83 I added tons of entries regarding the pandemic.
Main difficulties developing this dictionary:
1) Proper names;
2) Possessive forms;
3) Plurals.
I have been checking word by word to spot errors and missing
plurals/possessives.
It will take many months or year(s) to have it ready.
Some words need rechecking, either because I can't find their plurals or because the
entries in the .dic use both PFX and SFX, so I can't properly fix
them.
I need to code a feature in my tool Proofing Tool GUI to
extract PFX + SFX words.
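Finding the entries that use both PFX and SFX could be sketched as follows. This is not the Proofing Tool GUI feature, only an illustration under stated assumptions: single-character flags, words without spaces, and an .aff file whose affix lines start with PFX or SFX followed by the flag. The function names `affix_kinds` and `entries_with_both` are hypothetical.

```python
def affix_kinds(aff_lines):
    """Map each affix flag to 'PFX' or 'SFX' from the lines of an .aff file."""
    kinds = {}
    for line in aff_lines:
        parts = line.split()
        # Header and rule lines both start with PFX/SFX followed by the flag.
        if len(parts) >= 4 and parts[0] in ("PFX", "SFX"):
            kinds[parts[1]] = parts[0]
    return kinds


def entries_with_both(dic_lines, kinds):
    """Return the .dic entries whose flags include at least one PFX and one SFX."""
    result = []
    for line in dic_lines:
        fields = line.split()
        if not fields:
            continue
        # Assumption: the entry is the first field, written as word/FLAGS.
        _, _, flags = fields[0].partition("/")
        found = {kinds[flag] for flag in flags if flag in kinds}
        if found == {"PFX", "SFX"}:
            result.append(fields[0])
    return result
```

The returned entries are the ones that need manual rechecking, since merging or fixing them automatically could generate unwanted prefix and suffix combinations.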
I was checking the words from 'a' to 'z' but Peter (a
friend who suggests words) told me to begin with the
letters that have a lower % of words.
There is no shortcut if it is to be done properly: I must
check word by word. It will take months or year(s), but it will
be done.
Adding new words:
If you believe you have found a missing/incorrect word,
please send it to me for analysis. If it is in the Oxford or
Collins dictionaries, I will add it.
Removing US words:
If you find American words that both the Oxford and
Collins dictionaries mark as such and that have a British
equivalent, please send them to me for analysis
and removal.
Obscene/vulgar words:
If you find any such words in the wordlist, and they don't
have the NOSUGGEST flag (!),
please report them to me.
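Before reporting, the check can be sketched like this. It is a minimal illustration, assuming single-character flags and that the .aff file declares the flag on a line starting with NOSUGGEST; the function names `nosuggest_flag` and `missing_nosuggest` are hypothetical.

```python
def nosuggest_flag(aff_lines):
    """Return the flag declared by the NOSUGGEST line of an .aff file, or None."""
    for line in aff_lines:
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "NOSUGGEST":
            return parts[1]
    return None


def missing_nosuggest(dic_lines, words_to_check, flag):
    """Return the words from words_to_check that appear in the .dic without the flag."""
    targets = set(words_to_check)
    missing = []
    for line in dic_lines:
        fields = line.split()
        if not fields:
            continue
        # Assumption: the entry is the first field, written as word/FLAGS.
        word, _, flags = fields[0].partition("/")
        if word in targets and flag not in flags:
            missing.append(word)
    return missing
```

Words returned by `missing_nosuggest` are the candidates worth reporting.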
Archaic words:
I will only add archaic words if they don't interfere with
other words.
Note that in literary writing, some writers use archaic
words.
Please report to me any archaic words that have very similar
current ones.
Obsolete words:
If you find any obsolete words, please report them to me.
Hyphenated words:
I have been avoiding adding words with hyphens, so I
check whether they can also be written as a single word,
or whether the official dictionaries list them with no
hyphen at all, in which case I remove the hyphenated form from the wordlist.
I hope that people will enjoy my work and that it may be
useful to the progress of humankind.
Kind regards from:
30.Jun.2016
Marco A.G.Pinto
Master of Science in Information Warfare/Competitive Intelligence;
Open-Source Developer;
Translator.
Last update: 1.Sep.2023