Comments (2)
It seems that spaCy is used only to get the tagger labels and entity names:
import spacy
nlp = spacy.load('en', parser = False)
POS = {w: i for i, w in enumerate([''] + list(nlp.tagger.labels))}
ENT = {w: i for i, w in enumerate([''] + nlp.entity.move_names)}
Wouldn't it make sense to just put them explicitly in the code?
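To make the suggestion concrete, here is a minimal sketch of what hard-coding the labels could look like. The label lists are short placeholders, not the real output of `nlp.tagger.labels` or `nlp.entity.move_names`; they would have to be dumped once from the spaCy version the model was trained with and pasted in the same order, since the ids depend on list position. The names `build_index`, `POS_LABELS`, and `ENT_MOVES` are hypothetical.

```python
import json
from typing import Dict, Iterable


def build_index(labels: Iterable[str]) -> Dict[str, int]:
    """Map each label to an integer id, reserving id 0 for the empty label.

    Mirrors the dict comprehensions quoted above: '' is prepended so that
    real labels start at id 1.
    """
    return {label: i for i, label in enumerate([""] + list(labels))}


# Placeholder lists (assumption): replace with the full lists dumped once
# from spaCy, keeping the original order so the ids stay the same.
POS_LABELS = ["NN", "VB", "JJ"]
ENT_MOVES = ["B-PERSON", "I-PERSON", "O"]

POS = build_index(POS_LABELS)
ENT = build_index(ENT_MOVES)

# The lists can also be kept as JSON text instead of Python literals;
# a round trip through json preserves list order.
restored = json.loads(json.dumps({"pos": POS_LABELS, "ent": ENT_MOVES}))
POS_FROM_JSON = build_index(restored["pos"])
```

With this approach spaCy would no longer be needed just to build the two lookup tables, but the hard-coded lists only stay valid as long as they match the model version used for tagging.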
from hmnet.
Hi. I changed the Dockerfile as shown below, and it fixed the problem:
FROM nvidia/cuda:10.0-devel-ubuntu18.04
#FROM nvidia/cuda:11.0-devel-ubuntu18.04
##############################################################################
# Versions
##############################################################################
ENV PYTHON_VERSION=3
ENV TENSORFLOW_VERSION=1.15.2
ENV PYTORCH_VERSION=1.2.0
ENV TORCHVISION_VERSION=0.4.0
ENV TENSORBOARDX_VERSION=1.8
ENV CUDNN_VERSION=7.6.0.64-1+cuda10.0
ENV NCCL_VERSION=2.4.7-1+cuda10.0
ENV MXNET_VERSION=1.5.0
##############################################################################
# Installation/Basic Utilities
##############################################################################
RUN apt-get update && \
apt-get install -y --allow-change-held-packages --no-install-recommends \
software-properties-common \
openssh-client openssh-server \
pdsh curl sudo net-tools \
vim iputils-ping wget perl \
libxml-parser-perl \
libcudnn7=${CUDNN_VERSION} \
libnccl2=${NCCL_VERSION} \
libnccl-dev=${NCCL_VERSION} \
--allow-downgrades
##############################################################################
# Installation Latest Git
##############################################################################
RUN add-apt-repository ppa:git-core/ppa -y && \
apt-get update && \
apt-get install -y git && \
git --version
##############################################################################
# Python and Pip
##############################################################################
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get install -y python3 python3-dev && \
rm -f /usr/bin/python && \
ln -s /usr/bin/python3 /usr/bin/python && \
curl -O https://bootstrap.pypa.io/get-pip.py && \
python get-pip.py && \
rm get-pip.py && \
pip install --upgrade pip && \
# Print python and pip versions
python -V && pip -V
##############################################################################
# MXNet
##############################################################################
RUN pip install mxnet-cu100==${MXNET_VERSION}
##############################################################################
# TensorFlow
##############################################################################
RUN pip install tensorflow-gpu==${TENSORFLOW_VERSION}
##############################################################################
# PyTorch
##############################################################################
RUN pip install torch==${PYTORCH_VERSION}
RUN pip install torchvision==${TORCHVISION_VERSION}
RUN pip install tensorboardX==${TENSORBOARDX_VERSION}
##############################################################################
# Temporary Installation Directory
##############################################################################
ENV STAGE_DIR=/tmp
RUN mkdir -p ${STAGE_DIR}
##############################################################################
# Mellanox OFED
##############################################################################
ENV MLNX_OFED_VERSION=4.6-1.0.1.1
RUN apt-get install -y libnuma-dev
RUN cd ${STAGE_DIR} && \
wget -q -O - http://www.mellanox.com/downloads/ofed/MLNX_OFED-${MLNX_OFED_VERSION}/MLNX_OFED_LINUX-${MLNX_OFED_VERSION}-ubuntu18.04-x86_64.tgz | tar xzf - && \
cd MLNX_OFED_LINUX-${MLNX_OFED_VERSION}-ubuntu18.04-x86_64 && \
./mlnxofedinstall --user-space-only --without-fw-update --all -q && \
cd ${STAGE_DIR} && \
rm -rf ${STAGE_DIR}/MLNX_OFED_LINUX-${MLNX_OFED_VERSION}-ubuntu18.04-x86_64*
##############################################################################
# Install Open MPI
##############################################################################
RUN mkdir ${STAGE_DIR}/openmpi && \
cd ${STAGE_DIR}/openmpi && \
wget https://www.open-mpi.org/software/ompi/v4.0/downloads/openmpi-4.0.1.tar.gz && \
tar zxf openmpi-4.0.1.tar.gz && \
cd openmpi-4.0.1 && \
./configure --enable-orterun-prefix-by-default && \
make -j $(nproc) all && \
make install && \
ldconfig && \
rm -rf ${STAGE_DIR}/openmpi
##############################################################################
# Uncomment and set SSH Daemon port
##############################################################################
RUN mkdir -p /var/run/sshd
# Allow OpenSSH to talk to containers without asking for confirmation
RUN cat /etc/ssh/ssh_config | grep -v StrictHostKeyChecking > /etc/ssh/ssh_config.new && \
echo " StrictHostKeyChecking no" >> /etc/ssh/ssh_config.new && \
mv /etc/ssh/ssh_config.new /etc/ssh/ssh_config
# SSH Daemon port for DeepSpeed
ENV SSH_PORT=2222
RUN cat /etc/ssh/sshd_config > ${STAGE_DIR}/sshd_config && \
sed "0,/^#Port 22/s//Port ${SSH_PORT}/" ${STAGE_DIR}/sshd_config > /etc/ssh/sshd_config
##############################################################################
# Common Python Packages
##############################################################################
RUN pip install future typing
RUN pip install numpy \
scipy \
h5py \
azureml-defaults \
tqdm \
scikit-learn \
pytest \
boto3 \
filelock \
tokenizers \
requests \
regex \
mpi4py \
sentencepiece \
sacremoses \
spacy==2.3.5 \
nltk \
pyrouge \
py-rouge \
seqeval
RUN pip install transformers==2.4.1
RUN pip install tokenizers==0.8.1
RUN python -m nltk.downloader punkt
RUN python -m spacy download en
##############################################################################
# Set default shell to /bin/bash
##############################################################################
SHELL ["/bin/bash", "-cu"]
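Since the fix comes down to pinned package versions, a quick check inside the built container can confirm that the pins actually took effect. The sketch below is only an illustration: `installed_version`, `check_pins`, and `PINS` are hypothetical names, and the pin values are copied from the Dockerfile above. It assumes that a local build suffix such as `+cu100` in an installed version should be ignored when comparing.

```python
from typing import Callable, Dict, Optional, Tuple


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:  # Python < 3.8 has no importlib.metadata
        import pkg_resources
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def check_pins(
    pins: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {package: (expected, found)} for every pin that does not match."""
    mismatches = {}
    for name, expected in pins.items():
        found = lookup(name)
        # Ignore a local build suffix such as "+cu100" (assumption).
        base = found.split("+")[0] if found else None
        if base != expected:
            mismatches[name] = (expected, found)
    return mismatches


# Pins copied from the Dockerfile above.
PINS = {
    "torch": "1.2.0",
    "torchvision": "0.4.0",
    "tensorboardX": "1.8",
    "spacy": "2.3.5",
    "transformers": "2.4.1",
    "tokenizers": "0.8.1",
}

if __name__ == "__main__":
    for pkg, (expected, found) in check_pins(PINS).items():
        print(f"{pkg}: expected {expected}, found {found}")
```

Run inside the container, the script prints nothing when every pin matches. The `lookup` parameter exists only so the comparison can be exercised without the packages installed.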