An automated archive of NYTimes Spelling Bee puzzle answers
New puzzles are released at 3 am ET.
See Days.md.
See Pangrams.md.
See Words.md.
Home Page: https://www.nytimes.com/puzzles/spelling-bee
License: MIT License
Add more readme badges.
Currently they are unsorted, so they mirror the order in which each word first occurs across the daily puzzles.
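A minimal sketch of the difference between first-occurrence order and a sorted list. The function name and the sample data are made up for illustration; the real code reads the JSON data files.

```python
def unique_in_first_seen_order(days):
    """Flatten per-day answer lists, keeping each word's first occurrence."""
    # dicts preserve insertion order, so fromkeys() drops repeats
    # without disturbing the order in which words first appeared.
    return list(dict.fromkeys(word for answers in days for word in answers))


# Made-up sample: two days of answers with one repeated word.
days = [["trail", "rational"], ["rational", "ration"]]

first_seen = unique_in_first_seen_order(days)  # current, unsorted behavior
alphabetical = sorted(first_seen)              # one possible sorted variant
```

Sorting at render time keeps the stored data untouched, so the first-occurrence order remains available if it is ever wanted.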
Add running tests to CI.
The first puzzle date that I ran this code on was 2023-01-01.
NYT themselves do not provide historical puzzles or any puzzles beyond the current day's.
However, the Spelling Bee goes back to at least May 2018 - https://www.sbsolver.com/archive/2018/05.
On the other hand, the first valid forum URL appears to be 2021-09-20.
It may be possible to gather this historical data by various means, such as pages on web archivers.
Add test coverage report.
Maybe later:
Add complexity column to all words table.
Add complexity / reading level per word, e.g., https://www.wordcalc.com/readability/.
Add a security scan to CI.
Dependency Review or similar from https://github.com/tedmiston/spelling-bee-answers/actions/new?category=security.
Already using CodeQL default setup. Update: Disabled because it just burns and burns Actions runner minutes on every commit.
Create an ongoing list of pangrams like the all words list.
Create Pydantic model(s) as needed.
Currently all logic interacts with the JSON data files directly without any model / validation / (de)serialization layer. Migrating to models enables cleaner separation of concerns and quality.
Day
Puzzle model: expiration / freeExpiration keys both optional ¯\_(ツ)_/¯
Draft ideas for potential future models:
? Create Table model: headers, rows
? Create DocTemplate / TaggedDoc model
? Create Word model
?? Create WordList model: title, description, words, table
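A rough sketch of what the Table and WordList drafts might look like. To keep it runnable without third-party packages it uses stdlib dataclasses as a stand-in for Pydantic; the field names come from the draft list above, while the field types, the row-length check, and deriving table from words are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """A Markdown table: one header row plus data rows (types are assumed)."""

    headers: list[str]
    rows: list[list[str]] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Minimal validation: every row needs one cell per header.
        for row in self.rows:
            if len(row) != len(self.headers):
                raise ValueError(f"row {row!r} does not match headers {self.headers!r}")


@dataclass
class WordList:
    """A titled list of words, e.g. the all words page (hypothetical shape)."""

    title: str
    description: str
    words: list[str]

    @property
    def table(self) -> Table:
        # Derive the table from the words instead of storing it separately.
        return Table(headers=["Word"], rows=[[word] for word in self.words])
```

With Pydantic the classes would subclass BaseModel instead, and the validation would move into validators; the overall shape should stay about the same.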
To reduce timezone-related bugs by making CI runs consistent with local runs.
days.py
tests/tests_integration.py
(As a link.)
Make primary column bold in words / days tables for readability.
This can be added to the Answers pipeline and run right after the readme table generation step.
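One way this step could look, as a sketch only. The function name is hypothetical, and it assumes the input is just the table itself (header on the first line, separator on the second) with no escaped pipes inside cells.

```python
def bold_primary_column(table_md: str) -> str:
    """Wrap the first cell of each data row of a Markdown table in ** **."""
    out = []
    for i, line in enumerate(table_md.splitlines()):
        # Leave the header (line 0) and the separator (line 1) untouched.
        if i < 2 or not line.startswith("|"):
            out.append(line)
            continue
        cells = line.strip().strip("|").split("|")
        first = cells[0].strip()
        # Skip empty cells and cells that are already bold.
        if first and not first.startswith("**"):
            cells[0] = f" **{first}** "
        out.append("|" + "|".join(cells) + "|")
    return "\n".join(out)


table = "| Day | Words |\n| --- | --- |\n| 2023-01-01 | 42 |"
bolded = bold_primary_column(table)
```

Because already-bold cells are skipped, running the step twice on the same table leaves it unchanged, which matters if the pipeline regenerates the readme on every run.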
Add linting to CI.
Maybe others / more later - https://smirnov-am.github.io/python-linters-for-better-code-quality/
Add word popularity / commonality / frequency to all words table.
Need to find a good source for this one.
blocked by #11
Source ideas:
Add list / table of all words across all puzzles.
Create a simple common interface for the two Spelling Bee scrapers to improve quality.
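A possible shape for that interface, sketched with typing.Protocol. The names PuzzleScraper, StubScraper, fetch, and word_count are all hypothetical, as is the dict layout; the real scrapers may need a different signature.

```python
from typing import Protocol


class PuzzleScraper(Protocol):
    """What the rest of the pipeline needs from any scraper (hypothetical)."""

    def fetch(self, puzzle_date: str) -> dict:
        """Return the puzzle for an ISO date as a plain dict."""
        ...


class StubScraper:
    """Stand-in for a real scraper; returns canned data instead of scraping."""

    def fetch(self, puzzle_date: str) -> dict:
        return {"date": puzzle_date, "answers": ["rational", "trail"]}


def word_count(scraper: PuzzleScraper, puzzle_date: str) -> int:
    # Callers depend only on the interface, not on a concrete scraper.
    return len(scraper.fetch(puzzle_date)["answers"])
```

A Protocol needs no shared base class, so each scraper can adopt it just by exposing a matching fetch method, and a stub like the one above can replace the network in tests.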
#13 Added links to definitions on Wordnik. It would be nice to have the definitions inline, say scrape the first occurrence from Wordnik or some heuristic, perhaps?
Cache Poetry binary itself in CI (not deps installed via Poetry).
The Gr1N/setup-poetry action runs every time on every run which adds ~15–20s.
This is the slowest step in the entire pipeline!
It does not seem to have any built-in feature to cache the Poetry binaries itself.
Maybe I can achieve that via actions/cache?
Alternatively, the setup-python docs mention just using pipx install poetry. How does the performance of that compare to setup-poetry? [How] can that be cached?
Is it always the same puzzle editor?
Editors.md with columns: "Name", "Puzzle Count"
Refactor core Python code from disparate modules into one cohesive package.
Add pangram count to days table.
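A small sketch of how the count could be computed, given that a pangram is an answer that uses every one of the day's seven letters. The function names and the sample letters are made up; the actual puzzle JSON may already list pangrams separately, in which case counting that list is enough.

```python
def is_pangram(word: str, letters: str) -> bool:
    """True if the word uses every one of the puzzle's letters at least once."""
    return set(letters.lower()) <= set(word.lower())


def pangram_count(answers, letters: str) -> int:
    """Number of pangrams among a day's answers."""
    return sum(is_pangram(word, letters) for word in answers)


# Made-up day: seven letters and a few answers built from them.
letters = "ailnort"
answers = ["rational", "ration", "trail"]
count = pangram_count(answers, letters)
```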
Add word count by day to the readme table.
Add definition column to all words table.
Link to the word's definition on Wordnik.
A PoC scraper to parse puzzle data from the SBSolver archive, e.g., https://www.sbsolver.com/s/1.
In the future this can be used for historical data backfilling a la #42. This should allow retrieving (at least some of) the data from 2018–2022.
Is there a way to assess which days' puzzles are harder or easier?
For example, with the full-size crossword, Monday puzzles are easiest and difficulty progressively increases throughout the week through Saturday; Sunday is a bit different though. Anecdotally, I suspect the Bee follows a similar pattern.
Perhaps just using score / points and/or word count from the puzzle directly could be a first pass? There's also the points needed for Genius level metric for each puzzle.
Note: I am not currently tracking the points info in the puzzle JSON files. Can I acquire that historical data?
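Since the points are not stored, a first pass could recompute them from the answers. The sketch below assumes the standard Spelling Bee scoring as I understand it (4-letter words are worth 1 point, longer words 1 point per letter, and a pangram earns a 7-point bonus); the function names and sample data are hypothetical.

```python
def word_points(word: str, letters: str) -> int:
    """Points for one answer under the assumed scoring rules."""
    n = len(word)
    if n < 4:
        return 0  # shorter words are not valid answers
    points = 1 if n == 4 else n
    if set(letters) <= set(word):
        points += 7  # pangram bonus
    return points


def puzzle_points(answers, letters: str) -> int:
    """Maximum score for a day: the sum over all of its answers."""
    return sum(word_points(word, letters) for word in answers)


# Made-up day: "rational" is the only pangram.
letters = "ailnort"
answers = ["rational", "ration", "trail", "rail"]
total = puzzle_points(answers, letters)
```

Averaging this total (or the plain word count) by weekday would give a rough check of whether difficulty drifts through the week.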
Generate daily markdown doc pages, like the All Words page, but for the letters, pangrams, and answers of that day.