Untexify - a clone of Detexify

What this is

Untexify is a handwriting-to- $\LaTeX$ dictionary. $\LaTeX$ is the most popular markup language for technical documents in mathematics, and is widely used throughout computer science and industry.

However, its mnemonic conventions for symbols are old and debatably antiquated. Many beginning researchers and students of the sciences have difficulty remembering the $\LaTeX$ name for a given symbol. Untexify is an interactive canvas that returns the closest $\LaTeX$ code for any symbol drawn on it.

This is accomplished using an OCR machine learning model written in Python using TensorFlow, which is then run in a web app created with Django and hosted online with the help of fly.io.

Unlike the datasets behind many MNIST-like models, the training dataset was not hand-drawn. Rather, it was programmatically generated using an image-transformation pipeline created with the Albumentations library. This way, I didn’t have to find or draw thousands of hand-drawn $\LaTeX$ symbols myself.

This entire project was planned and conceived in the lovely Org-mode, as Emacs is my primary development environment. The website itself was written as a series of HTML and JavaScript code blocks which were tangled together using Org-mode’s exporting functionality.

Road Map

  1. Backend
    1. Create the dataset
      1. Pull a large list of symbols from the OEIS. I simply copied a table’s symbols and formatted them into a file such that each piece of $\LaTeX$ was on its own line.
      2. Convert the symbols into .png images. I used latex2image to convert the list of commands into small, square images of each symbol. The program is a bit finicky, so for future reference: I placed the file containing my equations, equations.txt, in the root of the git repository, then ran the following from the root:
        cd src/
        . set.sh $absolute_path_to_equations_txt
        cd ..
        cd equations.txt_aux
        python generate_latex.py
                    

        I numbered the resulting files using Dired.

      3. Sort the images into classes based on their $\LaTeX$ code. I created my images folder and used this bit of bash magic to sort them into subfolders sharing their names:
        for i in $(seq 0 $IMAGE_COUNT); do mkdir $i; mv $i.png $i/; done
                    
      4. Simulate handwriting. To do this I need a series of “transforms”, each of which randomly perturbs one aspect of a given image. This prevents overfitting and, in the first phase, is what makes the model functional at all. Here are the aspects of the image I chose to transform:
        Writing aspect                      Transform name
        “wiggliness” or poor handwriting    A.ElasticTransform()
        Sharpening                          A.Sharpen()
        Uniform color                       A.Equalize()
        Orientation/rotation
        Scale
        • Translation and scale. Although a textbook cited in the Keras docs mentions that convolution layers should be translation invariant, a cursory test of my model indicates they are definitely not. So, I need to alter the transformation stack accordingly. The model is also not robust to the scale of the input, so I need to fix that as well.
        • Stroke. The model is not robust to different strokes. Depending on the way I implement the frontend, there may be no reason to train the model to recognize this.
        • Choose a list of symbols. Initially, I chose a sample of 50 symbols, picked mostly arbitrarily. The initial sample included multiple sets of symbols that are drawn similarly ($\prec$ and $<$, for example), and also made liberal use of $¬$’s (¬’s). Because no large public-facing database of small $\LaTeX$ symbols in the model’s format exists, and the transform stack is prohibitively computationally expensive, I had to decide what my relatively small dataset would contain. I decided on a set composed mostly of the most popular mathematical symbols.

          This might seem a bit paradoxical, because the most popular symbols are surely the best remembered. That may be true, but there are probably more beginning researchers and students in need of a reference for basic symbols than there are people who need to look up the more esoteric ones. Since Detexify exists and has a more comprehensive database, I chose to make my tool more of a quick reference.
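
To make the translation and scale point above concrete, here is a minimal, dependency-free sketch of a random shift-and-scale augmentation on a small grid of pixel values. It is not the project's pipeline, which uses Albumentations (its affine-style transforms, such as `A.Affine`, could play this role). The function names, the nearest-neighbour sampling, and the parameter ranges are all assumptions chosen for illustration.

```python
import random


def shift_and_scale(image, dx, dy, scale, background=0):
    """Return a same-size copy of a 2-D grid, scaled about its centre
    and then shifted by (dx, dy) pixels, using nearest-neighbour sampling."""
    height, width = len(image), len(image[0])
    cx, cy = (width - 1) / 2, (height - 1) / 2
    out = [[background] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Inverse mapping: find the source pixel that lands on (x, y).
            sx = round((x - dx - cx) / scale + cx)
            sy = round((y - dy - cy) / scale + cy)
            if 0 <= sx < width and 0 <= sy < height:
                out[y][x] = image[sy][sx]
    return out


def random_shift_and_scale(image, max_shift=2, min_scale=0.6,
                           max_scale=1.0, rng=random):
    """Apply a random shift and scale; the ranges are illustrative guesses."""
    dx = rng.randint(-max_shift, max_shift)
    dy = rng.randint(-max_shift, max_shift)
    scale = rng.uniform(min_scale, max_scale)
    return shift_and_scale(image, dx, dy, scale)


if __name__ == "__main__":
    # A 5x5 "symbol": a single lit pixel in the centre.
    symbol = [[0] * 5 for _ in range(5)]
    symbol[2][2] = 1
    for row in random_shift_and_scale(symbol, rng=random.Random(42)):
        print(row)
```

Training on many such randomly displaced and resized copies of each symbol is one common way to make a classifier less sensitive to where, and how large, a symbol is drawn.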

    2. [ ] Train the model
  2. Frontend
    1. Hosting. To host this project I used fly.io for its excellent integration with Django, which was used to construct the frontend. Fly.io provides extremely simple installation instructions for a number of web-app libraries in popular languages, and overall it was very easy to use for someone inexperienced in website hosting like myself.
    2. Website structure. The frontend’s structure was made entirely using Django, which was excellent for me as someone with lots of Python experience and little HTML or CSS experience.

      Most of the interface between the model (which was made using another Python library, TensorFlow) and the page was handled in a single views.py file. Python acted as the glue between Django and TensorFlow, which was extremely helpful and satisfying to work with.

      The parts of the website for which I actually needed to delve into HTML were done almost entirely using Org-mode’s helpful HTML export. I could export large swaths of Org-mode documents to a nice-looking CSS “frame”, while embedding HTML within the plain Org text for seamless integration into the final product.
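
As a rough illustration of the glue step between the model and the page, the sketch below turns a list of per-class scores into the closest $\LaTeX$ codes. It is not the project's views.py. It assumes the class index of each image folder corresponds to a line of equations.txt, and the names `load_labels` and `closest_symbols` are hypothetical. The scores would come from the model's prediction and are hard-coded here.

```python
def load_labels(lines):
    """Build a list mapping class index -> LaTeX code, one code per line.

    Assumes the i-th non-empty line matches class folder i, which is a
    guess about how equations.txt relates to the numbered image folders.
    """
    return [line.strip() for line in lines if line.strip()]


def closest_symbols(scores, labels, k=3):
    """Return the k (latex_code, score) pairs with the highest scores."""
    if len(scores) != len(labels):
        raise ValueError("expected exactly one score per label")
    ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Stand-in for the contents of equations.txt.
    labels = load_labels([r"\prec", "<", r"\alpha"])
    # Stand-in for the model's output for one drawing.
    scores = [0.1, 0.7, 0.2]
    for code, score in closest_symbols(scores, labels, k=2):
        print(code, score)
```

A view could place the top-ranked code into the template context so that it is rendered where the page expects the result.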

Exporting the code

This section contains the real code I am using for some of Untexify’s user-facing elements. It is written as code blocks, which are tangled and merged together within Org-mode’s exported HTML file and placed automatically where Django expects them to be.

The utility of a literate configuration in this case is debatable; it exists mostly as a proof of concept and as a convenience, since most of my other design work lives within Emacs. With custom stylesheet functionality, however, I can quickly alter the look of the site at any time with minimal effort.

This block is the JavaScript code for the HTML canvas responsible for accepting user input, in the form of a hand-drawn approximation of the symbol the user is trying to look up.

// Select the drawing canvas (the element with class "myCanvas") and size it to the window.
// The script is loaded with "defer", so the document is parsed before this runs.
const canvas = document.querySelector('.myCanvas');
const width = canvas.width = window.innerWidth;
const height = canvas.height = window.innerHeight - 85;
const ctx = canvas.getContext('2d');

// start with a black background
ctx.fillStyle = 'rgb(0,0,0)';
ctx.fillRect(0, 0, width, height);

// The color input is optional: this is null if the page has none.
const colorPicker = document.querySelector('input[type="color"]');
// Fixed brush radius in pixels. The original treated this number as an input
// element (addEventListener, .value), which threw a TypeError on load.
const brushRadius = 4;
const clearBtn = document.querySelector('.clearButton');

// convert degrees to radians
function degToRad(degrees) {
  return degrees * Math.PI / 180;
}

// store mouse pointer coordinates, and whether the button is pressed
let curX;
let curY;
let pressed = false;

// update mouse pointer coordinates (page coordinates, including scroll offset)
document.addEventListener('mousemove', e => {
  curX = e.pageX;
  curY = e.pageY;
});

canvas.addEventListener('mousedown', () => pressed = true);

canvas.addEventListener('mouseup', () => pressed = false);

// repaint the whole canvas black
clearBtn.addEventListener('click', () => {
  ctx.fillStyle = 'rgb(0,0,0)';
  ctx.fillRect(0, 0, width, height);
});

// stamp a filled circle at the pointer on every animation frame while pressed
function draw() {
  if (pressed) {
    // fall back to white strokes when there is no color input
    ctx.fillStyle = colorPicker ? colorPicker.value : 'rgb(255,255,255)';
    ctx.beginPath();
    // 85 is the same vertical offset that is subtracted from the canvas height above
    ctx.arc(curX, curY - 85, brushRadius, degToRad(0), degToRad(360), false);
    ctx.fill();
  }

  requestAnimationFrame(draw);
}

draw();
  

Now, we embed the user-facing HTML elements in the page.

<!DOCTYPE html>
{% load static %}
<html lang="en-us">
  <head>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=5.0">
    <title>Canvas</title>
    <script src="{% static 'testapp/script.js' %}" defer></script>
  </head>
  <body>
    <canvas class="myCanvas" id="canvas" style="border: 3px solid black;">
      <p>Add suitable fallback here.</p>
    </canvas>
    <div class="toolbar">
        <form enctype="multipart/form-data" action="" method="post">
            {% csrf_token %}
            {{ form }}
            <input type="submit" value="Submit">
        </form>
      <button class="clearButton">Clear canvas</button>
    </div>
    <canvas class="background">
      </canvas>
      {{ symbol }}
  </body>
</html>
  
