OpenNMT-py: Open-Source Neural Machine Translation and (Large) Language Models

OpenNMT-py is the PyTorch version of the OpenNMT project, an open-source (MIT) neural machine translation (and beyond!) framework. It is designed to be research-friendly, making it easy to try out new ideas in translation, language modeling, summarization, and many other NLP tasks. Some companies have proven the code to be production-ready.

We love contributions! Please look at issues marked with the contributions welcome tag.

Before raising an issue, make sure you read the requirements and the Full Documentation examples.

Unless there is a bug, please use the Forum or Gitter to ask questions.


For beginners:

A step-by-step tutorial with explanations is available (thanks to Yasmin Moslem): Tutorial

Please read and follow it before raising beginner issues.

Otherwise, you can have a look at the Quickstart steps.


New:

  • Special note on PyTorch v2: up to v2.0.1, dynamic shapes are not handled properly, so torch.compile() will not work with OpenNMT-py. We tested a nightly build (in May) and it works, with a small gain. The next PyTorch version will be 2.1.
  • LLM support with converters for: Llama, OpenLlama, Redpajama, MPT-7B, Falcon.
  • Support for 8bit and 4bit quantization along with LoRA adapters, with or without checkpointing.
  • You can finetune 7B and 13B models on a single 24GB RTX GPU with 4-bit quantization.
  • Inference can be forced in 4/8bit using the same layer quantization as in finetuning.
  • Once your model is finetuned you can run inference either with OpenNMT-py or faster with CTranslate2.
  • MMLU evaluation script, see results here
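
As a rough back-of-the-envelope check on the 24GB figure above, the sketch below estimates the memory taken by the model weights alone at different bit widths. It assumes nominal parameter counts of 7e9 and 13e9 and ignores activations, LoRA adapter weights, optimizer state, and quantization overhead (scales and zero points), so real usage will be higher. The function name `weight_memory_gb` is hypothetical and is not part of OpenNMT-py.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate size of the weights alone, in gigabytes (1 GB = 1e9 bytes).

    Assumption: every parameter is stored at the same bit width and there is
    no extra storage for quantization metadata.
    """
    return num_params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    # Nominal sizes for "7B" and "13B" models (assumed, not measured).
    for name, num_params in (("7B", 7e9), ("13B", 13e9)):
        for bits in (16, 8, 4):
            size = weight_memory_gb(num_params, bits)
            print(f"{name} at {bits}-bit: about {size:.1f} GB of weights")
```

Under these assumptions, the 4-bit weights of a 13B model take about 6.5 GB, while the same weights at 16-bit would take about 26 GB and would not fit in 24 GB on their own.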

For all use cases, including NMT, you can now use Multiquery instead of Multihead attention (faster at training and inference) and remove biases from all Linear layers (QKV as well as FeedForward modules).
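
To illustrate where the savings from multi-query attention come from, here is a minimal sketch that counts the parameters of the key and value projections in a single attention layer. It assumes the usual definition of multi-query attention (one key/value head shared by all query heads); the function name and the example sizes (`d_model=4096`, 32 heads) are hypothetical, and OpenNMT-py's actual implementation may differ in details.

```python
def kv_projection_params(d_model: int, num_heads: int,
                         multiquery: bool, bias: bool = False) -> int:
    """Count parameters in the K and V projections of one attention layer.

    Assumption: multi-head attention uses `num_heads` key/value heads, while
    multi-query attention uses a single key/value head of the same head size.
    """
    head_dim = d_model // num_heads
    kv_heads = 1 if multiquery else num_heads
    out_features = kv_heads * head_dim
    per_projection = d_model * out_features + (out_features if bias else 0)
    return 2 * per_projection  # one K projection plus one V projection


if __name__ == "__main__":
    multihead = kv_projection_params(4096, 32, multiquery=False)
    multiquery = kv_projection_params(4096, 32, multiquery=True)
    print("multi-head K/V parameters: ", multihead)
    print("multi-query K/V parameters:", multiquery)
    print("reduction factor:", multihead // multiquery)
```

The keys and values cached per token during decoding shrink by the same factor as the number of key/value heads, which is one reason inference tends to be faster.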

If you used previous versions of OpenNMT-py, you can check the Changelog or the Breaking Changes.


Tutorials:

  • How to replicate Vicuna with a 7B or 13B Llama (or OpenLlama, MPT-7B, Redpajama) Language Model: Tuto Vicuna
  • How to finetune NLLB-200 with your dataset: Tuto Finetune NLLB-200
  • How to create a simple OpenNMT-py REST Server: Tuto REST
  • How to create a simple Web Interface: Tuto Streamlit
  • Replicate the WMT17 en-de experiment: WMT17 ENDE

Setup

OpenNMT-py requires:

  • Python >= 3.8
  • PyTorch >= 1.13, < 2.1
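
Before installing, it can help to check the environment against the requirements above. The following is a minimal sketch using only the standard library; the helper names `parse_version` and `torch_version_supported` are hypothetical and are not part of OpenNMT-py. It only compares the major and minor version numbers.

```python
import re
import sys


def parse_version(text: str) -> tuple:
    """Extract (major, minor) from a version string such as '2.0.1+cu117'."""
    match = re.match(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return int(match.group(1)), int(match.group(2))


def torch_version_supported(text: str) -> bool:
    """Return True if the version satisfies PyTorch >= 1.13 and < 2.1."""
    return (1, 13) <= parse_version(text) < (2, 1)


if __name__ == "__main__":
    print("Python >= 3.8:", sys.version_info >= (3, 8))
    try:
        import torch  # only needed for the PyTorch check
        print("PyTorch in range:", torch_version_supported(torch.__version__))
    except ImportError:
        print("PyTorch is not installed")
```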

Install OpenNMT-py from pip:

pip install OpenNMT-py

or from the sources:

git clone https://github.com/OpenNMT/OpenNMT-py.git
cd OpenNMT-py
pip install -e .

Note: if you encounter a MemoryError during installation, try to use pip with --no-cache-dir.

(Optional) Some advanced features (e.g. working with pretrained models or specific transforms) require extra packages. You can install them with:

pip install -r requirements.opt.txt

Documentation & FAQs

Full HTML Documentation

FAQs

Acknowledgements

OpenNMT-py is run as a collaborative open-source project. The project was incubated by Systran and Harvard NLP in 2016 in Lua and ported to PyTorch in 2017.

Current maintainers (since 2018):

François Hernandez and the Ubiqus Team; Vincent Nguyen (Seedfall)

Citation

If you are using OpenNMT-py for academic work, please cite the initial system demonstration paper published in ACL 2017:

@inproceedings{klein-etal-2017-opennmt,
    title = "{O}pen{NMT}: Open-Source Toolkit for Neural Machine Translation",
    author = "Klein, Guillaume  and
      Kim, Yoon  and
      Deng, Yuntian  and
      Senellart, Jean  and
      Rush, Alexander",
    booktitle = "Proceedings of {ACL} 2017, System Demonstrations",
    month = jul,
    year = "2017",
    address = "Vancouver, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/P17-4012",
    pages = "67--72",
}
