spelling-bee-answers's Introduction

spelling-bee-answers's People

Contributors

dependabot[bot], tedmiston

spelling-bee-answers's Issues

Info : Word definitions

#13 added links to definitions on Wordnik. It would be nice to have the definitions inline, e.g., by scraping the first definition from Wordnik or using some other heuristic.

Data : Backfill historical data

The first puzzle date that I ran this code on was 2023-01-01.

NYT themselves do not provide historical puzzles or any puzzles beyond the current day's.

However, the Spelling Bee goes back to at least May 2018 - https://www.sbsolver.com/archive/2018/05.

On the other hand, the first valid forum URL appears to be 2021-09-20.

It may be possible to gather this historical data by various means, such as pages on web archivers.

  • Create SBSolver scraper (via #47)
  • ...

Data : SBSolver scraper

A PoC scraper to parse puzzle data from the SBSolver archive, e.g., https://www.sbsolver.com/s/1.

In the future this can be used for historical data backfilling a la #42. This should allow retrieving (at least some of) the data from 2018–2022.

CI : Cache Poetry binary

Cache Poetry binary itself in CI (not deps installed via Poetry).

The Gr1N/setup-poetry action executes on every CI run, which adds ~15–20s.

This is the slowest step in the entire pipeline!

It does not seem to have any built-in feature to cache the Poetry binary itself.

Maybe I can achieve that via actions/cache?


Alternatively, the setup-python docs mention just using pipx install poetry. How does the performance of that compare to setup-poetry? [How] can that be cached?

Info : Word popularity

Info : Aggregate words list

Add list / table of all words across all puzzles.

  • As a markdown doc
  • Link from main readme
  • Use the stats counter script to generate
  • Update nightly → moved to #24
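
A minimal sketch of how such a list might be generated. It assumes each puzzle is available as a dict with hypothetical `date` and `answers` keys; the actual layout of the JSON data files and the existing stats counter script are not shown here, so the function names and sample data are illustrative only. The sketch sorts the words alphabetically for readability.

```python
from collections import Counter


def aggregate_words(puzzles):
    """Count how many puzzles each word appears in."""
    counts = Counter()
    for puzzle in puzzles:
        # Use a set so a word is counted at most once per puzzle.
        counts.update(set(puzzle["answers"]))
    return counts


def to_markdown_table(counts):
    """Render the word counts as a markdown table, sorted alphabetically."""
    lines = ["| Word | Puzzle Count |", "| --- | --- |"]
    for word in sorted(counts):
        lines.append(f"| {word} | {counts[word]} |")
    return "\n".join(lines)


# Hypothetical sample data, for illustration only.
sample_puzzles = [
    {"date": "2023-01-01", "answers": ["able", "bale", "label"]},
    {"date": "2023-01-02", "answers": ["able", "cable"]},
]

print(to_markdown_table(aggregate_words(sample_puzzles)))
```

The resulting string could be written to a markdown doc and linked from the main readme.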

Info : Puzzle Editors

Is it always the same puzzle editor?

  • Create Editors.md with columns: "Name", "Puzzle Count"
  • ...
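
One possible way to answer the question and build the table, sketched under the assumption that each puzzle record carries a hypothetical `editor` key. The editor names below are placeholders, not real data.

```python
from collections import Counter


def count_editors(puzzles):
    """Count how many puzzles each editor is credited with."""
    return Counter(puzzle["editor"] for puzzle in puzzles)


def editors_markdown(counts):
    """Render a "Name" / "Puzzle Count" table, most frequent editor first."""
    lines = ["| Name | Puzzle Count |", "| --- | --- |"]
    for name, count in counts.most_common():
        lines.append(f"| {name} | {count} |")
    return "\n".join(lines)


# Placeholder records, for illustration only.
sample_puzzles = [
    {"date": "2023-01-01", "editor": "Editor A"},
    {"date": "2023-01-02", "editor": "Editor A"},
    {"date": "2023-01-03", "editor": "Editor B"},
]

editor_counts = count_editors(sample_puzzles)
always_same_editor = len(editor_counts) == 1
print(editors_markdown(editor_counts))
```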

Core : Pydantic model(s)

Create Pydantic model(s) as needed.

Currently all logic interacts with the JSON data files directly, without any model / validation / (de)serialization layer. Migrating to models would give a cleaner separation of concerns and improve quality.


Draft ideas for potential future models:

  • ? Create Table model

    • Fields: headers, rows
  • ? Create DocTemplate TaggedDoc model

    • It's not really a template since we're replacing existing contents when we update them vs just render / substitute
  • ? Create Word model

    • Include boolean field for is_pangram
  • ?? Create WordList model

    • Fields: title, description, words, table
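
A rough sketch of what some of these draft models could look like. The model names and the `headers`, `rows`, `is_pangram`, `title`, `description`, `words`, and `table` fields come from the list above; the `text` field on `Word`, the types, and the defaults are assumptions made for illustration.

```python
from typing import List

from pydantic import BaseModel


class Word(BaseModel):
    text: str  # assumed field name for the word itself
    is_pangram: bool = False


class Table(BaseModel):
    headers: List[str]
    rows: List[List[str]]


class WordList(BaseModel):
    title: str
    description: str = ""
    words: List[Word]
    table: Table


# Hypothetical data shaped like a parsed JSON file.
raw = {
    "title": "All words",
    "description": "Illustrative example",
    "words": [{"text": "play"}, {"text": "playbook", "is_pangram": True}],
    "table": {"headers": ["Word"], "rows": [["play"], ["playbook"]]},
}

# Nested dicts are validated and converted into model instances.
word_list = WordList(**raw)
print(word_list.words[1].is_pangram)
```

With models like these, malformed data (for example, a word entry missing its text) would fail at load time with a validation error instead of surfacing later in the rendering logic.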

Info : Alpha sort words lists

Currently they are unsorted, so they mirror the order in which each word first occurs across the daily puzzles.
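
A minimal sketch of the sort itself, assuming the words are held in a plain list of strings (the helper name `alpha_sort` is hypothetical):

```python
def alpha_sort(words):
    """Return a new list sorted alphabetically, ignoring case."""
    return sorted(words, key=str.casefold)


# Words in hypothetical first-seen order.
first_seen_order = ["label", "able", "cable", "bale"]
print(alpha_sort(first_seen_order))
```

`sorted` returns a new list, so the first-seen order remains available if it is still useful elsewhere.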

CI : Tests

Add running tests to CI.

  • Unit tests
  • Integration tests

Info : Puzzle difficulty analysis

Is there a way to assess which days' puzzles are harder or easier?

For example, with the full-size crossword, Monday puzzles are easiest and difficulty progressively increases throughout the week through Saturday; Sunday is a bit different though. Anecdotally, I suspect the Bee follows a similar pattern.

Perhaps just using the score / points and/or word count from the puzzle directly could be a first pass? There's also the "points needed for Genius level" metric for each puzzle.

Note: I am not currently tracking the points info in the puzzle JSON files. Can I acquire that historical data?
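
As a possible first pass, total points could be derived from the answer words rather than stored. The sketch below assumes the commonly documented Spelling Bee scoring rules (4-letter words score 1 point, longer words score 1 point per letter, and a pangram earns a 7-point bonus) and assumes the puzzle's seven letters are available alongside its answers. The function names and sample data are hypothetical.

```python
def word_points(word, letters):
    """Score one word: 1 point for 4 letters, else its length, +7 for a pangram."""
    points = 1 if len(word) == 4 else len(word)
    # A pangram uses every one of the puzzle's letters at least once.
    if set(word) == set(letters):
        points += 7
    return points


def puzzle_stats(answers, letters):
    """Summarize a puzzle with simple candidate difficulty metrics."""
    return {
        "word_count": len(answers),
        "total_points": sum(word_points(w, letters) for w in answers),
    }


# Hypothetical puzzle, for illustration only.
letters = "playbok"
answers = ["play", "polka", "playbook"]
print(puzzle_stats(answers, letters))
```

Grouping these stats by day of the week would be one way to check whether the Bee follows a weekly difficulty pattern.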
