A Byte-Pair Encoding (BPE) tokenizer, similar to Karpathy's BPE, with an interactive Gradio interface, all in a single file of about 200 lines of code.
Download the repo:
git clone https://github.com/Esmail-ibraheem/tiktoken.git
Install the required libraries:
pip install -r requirements.txt
Then run the tokenizer:
python tiktoken.py
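
For readers new to byte-pair encoding, here is a minimal sketch of the core idea: start from raw UTF-8 bytes and repeatedly merge the most frequent adjacent pair into a new token id. This is an illustration only, not the code in this repo; the function names (get_pair_counts, merge, train, decode) are hypothetical and chosen for this example.

```python
from collections import Counter


def get_pair_counts(ids):
    """Count how often each adjacent pair of token ids occurs."""
    return Counter(zip(ids, ids[1:]))


def merge(ids, pair, new_id):
    """Replace each non-overlapping occurrence of `pair` with `new_id`."""
    out = []
    i = 0
    while i < len(ids):
        if i < len(ids) - 1 and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train(text, num_merges):
    """Learn up to `num_merges` merges; return the encoded ids and the merge table."""
    ids = list(text.encode("utf-8"))  # base vocabulary: the 256 byte values
    merges = {}
    for step in range(num_merges):
        counts = get_pair_counts(ids)
        if not counts:
            break  # fewer than two tokens left, nothing to merge
        pair = max(counts, key=counts.get)  # most frequent adjacent pair
        new_id = 256 + step  # new ids start right after the byte range
        ids = merge(ids, pair, new_id)
        merges[pair] = new_id
    return ids, merges


def decode(ids, merges):
    """Rebuild the text by expanding each merged id back into its bytes."""
    vocab = {i: bytes([i]) for i in range(256)}
    # Merges are stored in the order they were learned, so both halves
    # of every pair already exist in `vocab` when it is expanded.
    for (a, b), new_id in merges.items():
        vocab[new_id] = vocab[a] + vocab[b]
    return b"".join(vocab[i] for i in ids).decode("utf-8", errors="replace")
```

Calling train("aaabdaaabac", 3) compresses the 11 input bytes into a shorter sequence of ids, and decode(ids, merges) recovers the original string. A real tokenizer usually adds a pre-tokenization step and special tokens on top of this loop; both are left out here to keep the sketch short.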