Comments (11)
@hemangsk sure, but please describe how you'd do this before you write the code in case any of us have suggestions/modifications - it's much easier for both sides! :)
Assigning you 👍
from coala-quickstart.
The return value of get_language_from_hashbang can be memoized.
But performance is not a consideration, as this is typically run once per project lifetime.
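For reference, a minimal sketch of what memoizing could look like with functools.lru_cache from the standard library. The function body is a hypothetical stand-in for illustration only, not the implementation proposed in this thread.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def get_language_from_hashbang(hashbang):
    # Hypothetical stand-in body: take the last path component,
    # then the last space-separated word of it.
    return hashbang.strip().split('/')[-1].split(' ')[-1]
```

Repeated calls with the same hashbang string are then served from the cache instead of being parsed again.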
from coala-quickstart.
Looks neat 👍
And unless my eyes fail me, the data.close() is outside the with open(...) as data block ;) I know, this is just a prototype. Just saying :P
from coala-quickstart.
Hey! Can I take this up?
from coala-quickstart.
@adtac Thanks!
I came up with this solution: in coala-quickstart > generation > Utilities.py, the functions get_extension() and split_by_language() have a similar task, separating the given files by extension and language. So inside the loop that iterates through the list of project_files, we can add calls to new utility functions, get_language_from_hashbang() and get_extension_from_hashbang().
These will read the first line of an extension-less file and check whether it starts with '#!', confirming it's a hashbang. From that we can obtain the language used in the file, and hence the extension, from the exts dictionary or via the pygments approach [https://github.com/coala/coala/pull/3162].
For example, if the first line is:
    first_line = '#!/bin/bash'
    lang = first_line[7:]  # 'bash'
    ext = exts[lang]
Would this be the right approach to work on?
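The "read the first line and check for '#!'" step could be sketched roughly as below. The helper name read_hashbang is hypothetical, and errors='replace' is an assumption so that binary files without an extension don't raise a decoding error.

```python
def read_hashbang(path):
    """Return the first line of `path` if it is a hashbang, else None."""
    # errors='replace' keeps undecodable bytes from raising an exception.
    with open(path, 'r', errors='replace') as data:
        first_line = data.readline().strip()
    return first_line if first_line.startswith('#!') else None
```

Only the first line is read, so large files are not loaded into memory.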
from coala-quickstart.
Sounds good. get_language_from_hashbang will be the interesting/challenging part. It would be good if you could describe how you will do that.
from coala-quickstart.
One more thing to look into is #!/usr/bin/env python - that should have the same effect as a #!/usr/bin/python shebang ;)
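One possible way to treat both forms the same, as a rough sketch: take the basename of the program path, and if it is env, use the first following argument instead. The name interpreter_from_shebang is hypothetical, and version suffixes such as python3 are deliberately left untouched here.

```python
import posixpath


def interpreter_from_shebang(first_line):
    """Return the interpreter name from a shebang line, or None."""
    if not first_line.startswith('#!'):
        return None
    parts = first_line[2:].split()
    if not parts:
        return None
    program = posixpath.basename(parts[0])
    if program == 'env':
        # Skip env options (e.g. -S) and VAR=value assignments.
        args = [p for p in parts[1:]
                if not p.startswith('-') and '=' not in p]
        if not args:
            return None
        program = posixpath.basename(args[0])
    return program
```

With this, #!/usr/bin/env python and #!/usr/bin/python both yield python, and interpreter flags such as the -e in #!/bin/bash -e are ignored.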
from coala-quickstart.
Sorry for the delay! Here's the approach I've come up with for get_language_from_hashbang().
In coala-quickstart > generation > Utilities.py:
    import os
    import re
    from collections import defaultdict

    # `exts` is the extension -> languages mapping already used in Utilities.py.

    def split_by_language(project_files):
        lang_files = defaultdict(set)
        for file in project_files:
            name, ext = os.path.splitext(file)
            if ext in exts:
                for lang in exts[ext]:
                    lang_files[lang.lower()].add(file)
                    lang_files["all"].add(file)
            # Check for hashbang
            elif name and not ext:
                # The with block closes the file; no data.close() needed.
                with open(file, 'r') as data:
                    hashbang = data.readline().strip()
                if re.match(r'^#![a-z/]*( [a-z]*)?', hashbang):
                    language = get_language_from_hashbang(hashbang)
                    # Iterating over exts cannot raise KeyError,
                    # so no try/except is needed here.
                    for known_ext in exts:
                        for lang in exts[known_ext]:
                            if language.lower() == lang.lower():
                                lang_files[lang.lower()].add(file)
                                lang_files["all"].add(file)
        return lang_files
And get_language_from_hashbang(hashbang):
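    def get_language_from_hashbang(hashbang):
        # readline() keeps the trailing newline, so strip it first.
        hashbang = hashbang.strip()
        language = None
        if re.match(r'^#![a-z/]* [a-z]*', hashbang):
            # e.g. "#!/usr/bin/env python": the language follows the space
            language = hashbang.split(' ')[1]
        elif re.match(r'^#![a-z/]*', hashbang):
            # e.g. "#!/bin/bash": the language is the last path component
            language = hashbang.split('/')[-1]
        return language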
Shortcomings in this approach that I've found so far and am working on:
- The regex can be improved (using backtracking?)
- A nested for loop is used in the try block, which is not time efficient
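On the nested loop: one option, sketched under assumptions, is to build a set of known language names once and then do a single lookup per file. SAMPLE_EXTS is a made-up stand-in for the real exts mapping, and both helper names are hypothetical.

```python
from collections import defaultdict

# Made-up sample of the extension -> languages mapping (assumption).
SAMPLE_EXTS = {'.py': ['Python'], '.sh': ['Shell', 'Bash'], '.rb': ['Ruby']}


def build_language_index(exts):
    """Collect every known language name, lowercased, into a set."""
    return {lang.lower() for langs in exts.values() for lang in langs}


def add_by_language(lang_files, index, language, file):
    """Record `file` under `language` if the language is known."""
    key = language.lower()
    if key not in index:
        return False
    lang_files[key].add(file)
    lang_files["all"].add(file)
    return True
```

The index is built once per run, so each extension-less file costs one set lookup instead of a scan over every extension.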
from coala-quickstart.
Thanks for the feedback @jayvdb @adtac :) I'm on it
from coala-quickstart.
@hemangsk any news?
from coala-quickstart.