neulab/cmulab
CMU Linguistic Annotation Backend
After a new ELAN file is created with cmulab_elan.py, moving it to a new location means it can no longer find the wav file automatically, and the wav file has to be selected manually. It might be good to write the full path to the wav file in new ELAN files so it can be found automatically.
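A minimal sketch of the idea, using only the Python standard library. It assumes the standard EAF layout, where the media link lives in a MEDIA_DESCRIPTOR element under HEADER; the function name set_absolute_media_url is hypothetical and is not part of cmulab_elan.py.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def set_absolute_media_url(eaf_xml: str, wav_path: str) -> str:
    """Return the EAF document with MEDIA_URL set to an absolute file URI."""
    root = ET.fromstring(eaf_xml)
    header = root.find("HEADER")
    if header is None:
        raise ValueError("not an EAF document: missing HEADER")

    descriptor = header.find("MEDIA_DESCRIPTOR")
    if descriptor is None:
        # Insert first so the descriptor precedes any PROPERTY elements.
        descriptor = ET.Element("MEDIA_DESCRIPTOR")
        descriptor.set("MIME_TYPE", "audio/x-wav")
        header.insert(0, descriptor)

    # resolve() makes the path absolute; as_uri() yields a file:// URI.
    descriptor.set("MEDIA_URL", Path(wav_path).resolve().as_uri())
    return ET.tostring(root, encoding="unicode")
```

An absolute path only helps while the wav file itself stays in place, so keeping RELATIVE_MEDIA_URL alongside it may still be worthwhile.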
Currently the directions suggest sending a password over an insecure connection. Would it be possible to start by having users authenticate through a third party such as Google, to sidestep this problem?
Reply from @antonisa:
It's doable within Django's framework:
https://fosstack.com/how-to-add-google-authentication-in-django/
https://medium.com/trabe/oauth-authentication-in-django-with-social-auth-c67a002479c1
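As a rough sketch of what this could look like in settings.py, assuming the social-auth-app-django package is used (one of several options). The environment variable names GOOGLE_OAUTH2_KEY and GOOGLE_OAUTH2_SECRET are hypothetical, and the app list is abbreviated.

```python
import os

# Abbreviated; a real project lists its other apps here as well.
INSTALLED_APPS = [
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "social_django",  # provided by social-auth-app-django
]

# Try Google OAuth2 first, then fall back to Django's own user table.
AUTHENTICATION_BACKENDS = [
    "social_core.backends.google.GoogleOAuth2",
    "django.contrib.auth.backends.ModelBackend",
]

# Credentials come from the environment so they stay out of the repository.
# The variable names are assumptions made for this sketch.
SOCIAL_AUTH_GOOGLE_OAUTH2_KEY = os.environ.get("GOOGLE_OAUTH2_KEY", "")
SOCIAL_AUTH_GOOGLE_OAUTH2_SECRET = os.environ.get("GOOGLE_OAUTH2_SECRET", "")
```

The URL configuration and the Google API console setup are covered in the linked articles and are left out here.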
Because the whole linguistic annotation backend isn't going to only handle speech, it seems a little bit strange to me to have the top tier of the annotation interface be named "speech". Maybe change it to "cmulab"?
I get the following error:
(python3) tanuki:cmulab neubig$ python manage.py makemigrations annotator
Traceback (most recent call last):
File "manage.py", line 15, in <module>
execute_from_command_line(sys.argv)
File "/Users/neubig/anaconda3/envs/python3/lib/python3.6/site-packages/django/core/management/__init__.py", line 381, in execute_from_command_line
utility.execute()
File "/Users/neubig/anaconda3/envs/python3/lib/python3.6/site-packages/django/core/management/__init__.py", line 375, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/Users/neubig/anaconda3/envs/python3/lib/python3.6/site-packages/django/core/management/base.py", line 323, in run_from_argv
self.execute(*args, **cmd_options)
File "/Users/neubig/anaconda3/envs/python3/lib/python3.6/site-packages/django/core/management/base.py", line 361, in execute
self.check()
File "/Users/neubig/anaconda3/envs/python3/lib/python3.6/site-packages/django/core/management/base.py", line 390, in check
include_deployment_checks=include_deployment_checks,
File "/Users/neubig/anaconda3/envs/python3/lib/python3.6/site-packages/django/core/management/base.py", line 377, in _run_checks
return checks.run_checks(**kwargs)
File "/Users/neubig/anaconda3/envs/python3/lib/python3.6/site-packages/django/core/checks/registry.py", line 72, in run_checks
new_errors = check(app_configs=app_configs)
File "/Users/neubig/anaconda3/envs/python3/lib/python3.6/site-packages/django/core/checks/urls.py", line 40, in check_url_namespaces_unique
all_namespaces = _load_all_namespaces(resolver)
File "/Users/neubig/anaconda3/envs/python3/lib/python3.6/site-packages/django/core/checks/urls.py", line 57, in _load_all_namespaces
url_patterns = getattr(resolver, 'url_patterns', [])
File "/Users/neubig/anaconda3/envs/python3/lib/python3.6/site-packages/django/utils/functional.py", line 80, in __get__
res = instance.__dict__[self.name] = self.func(instance)
File "/Users/neubig/anaconda3/envs/python3/lib/python3.6/site-packages/django/urls/resolvers.py", line 579, in url_patterns
patterns = getattr(self.urlconf_module, "urlpatterns", self.urlconf_module)
File "/Users/neubig/anaconda3/envs/python3/lib/python3.6/site-packages/django/utils/functional.py", line 80, in __get__
res = instance.__dict__[self.name] = self.func(instance)
File "/Users/neubig/anaconda3/envs/python3/lib/python3.6/site-packages/django/urls/resolvers.py", line 572, in urlconf_module
return import_module(self.urlconf_name)
File "/Users/neubig/anaconda3/envs/python3/lib/python3.6/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 994, in _gcd_import
File "<frozen importlib._bootstrap>", line 971, in _find_and_load
File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 678, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/Users/neubig/work/cmulab/cmulab/urls.py", line 18, in <module>
urlpatterns += static(settings.MEDIA_URL, document_root=settings.MEDIA_ROOT)
File "/Users/neubig/anaconda3/envs/python3/lib/python3.6/site-packages/django/conf/urls/static.py", line 22, in static
raise ImproperlyConfigured("Empty static prefix not permitted")
django.core.exceptions.ImproperlyConfigured: Empty static prefix not permitted
Maybe there's an environment variable or something else I need to set?
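The traceback shows django.conf.urls.static.static() rejecting an empty prefix, which means settings.MEDIA_URL is empty when cmulab/urls.py is imported. One possible fix, sketched below, is to give it a non-empty default in settings.py. Whether the project actually reads these values from the environment is an assumption, as is the "/media/" default.

```python
import os
from pathlib import Path

# Stand-in for the project's BASE_DIR; a real settings.py usually derives
# this from __file__ instead.
BASE_DIR = Path.cwd()

# Fall back to a non-empty prefix so static() does not raise
# ImproperlyConfigured("Empty static prefix not permitted").
MEDIA_URL = os.environ.get("MEDIA_URL") or "/media/"
MEDIA_ROOT = os.environ.get("MEDIA_ROOT") or str(BASE_DIR / "media")
```

Using `or` rather than a default argument to `get` also covers the case where the variable is set but empty.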
It'd be nice to have an implementation in docker so we can run with multiple workers.
Clicking on "ready" on the public model page gives a 404 error.
Apparently it is possible to create ELAN plugins using the LEXAN or audio recognizer plugins here:
https://tla.mpi.nl/tools/tla-tools/elan/download/
It might be nice to add this (although this is not a rush).
The consent form should require a click before the user enters the interface, and should disappear afterwards.
The OCR interface could be a sub-frame beneath the main tabs.
We should
Currently there are some data files in the "example-scripts" directory. They need attribution to the original source, perhaps in the README.md file?
It'd be nice to stay on the cmulab.dev domain
This is somewhat lower priority for now, but it would also be nice to explicitly encrypt the information stored in the DB.
From @antonisa:
https://pypi.org/project/pysqlcipher/ allows encryption and decryption of an SQLite database, and I think all the calls will remain the same. Will read more.
Related to #3, we need to think about security of the http protocol at some point, because this is a security-sensitive application.
We'll have to do encryption, and I can try to find a way to get a secure key we can use, either through CMU or by creating a separate domain name.
I had some prerequisite trouble getting this installed on Ubuntu Bionic (WSL).
I installed django (2.1) and python (3.6-3.7).
These are also needed to run python manage.py runserver.
Note: It seems to fail if "django-filters" is also installed alongside django-filter.
A nice first step at actually making the backend do something useful might be to create a client that annotates ELAN or Praat files. It looks like this library might be useful in doing so: https://github.com/dopefishh/pympi
If we could create a simple example that, for example, reads in an ELAN tier, adds VAD, and then writes out a new file with this added, that might be a nice proof of concept that is also easy to implement. I'd be happy to help out with creating the groundwork for it if that sounds useful.
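A minimal sketch of the VAD step, assuming a simple energy threshold stands in for a real voice activity detector. The function name detect_speech and its default parameters are hypothetical. Samples are expected as floats in [-1, 1].

```python
import math


def detect_speech(samples, sample_rate, frame_ms=20, threshold=0.02):
    """Return (start_ms, end_ms) intervals whose frame RMS reaches threshold."""
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    intervals = []
    start = None  # start time of the speech run currently open, if any

    for i in range(0, len(samples), frame_len):
        frame = samples[i:i + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        t_ms = int(i * 1000 / sample_rate)
        if rms >= threshold and start is None:
            start = t_ms  # speech begins at this frame
        elif rms < threshold and start is not None:
            intervals.append((start, t_ms))  # speech ended before this frame
            start = None

    if start is not None:
        # The audio ended while speech was still active.
        intervals.append((start, int(len(samples) * 1000 / sample_rate)))
    return intervals
```

Each interval could then be written to a new tier, for example with pympi's Eaf.add_tier and Eaf.add_annotation, before saving the file. That step is omitted here to keep the sketch free of dependencies.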
Currently "Universal phone recognizer for ELAN" links directly to GitHub, but it should have its own explanation page reachable from the main page.