aikubo / linkinglines
Link and Cluster Cartesian Line Segments
License: MIT License
Part of openjournals/joss-reviews#6147
The documentation contains unresolved links like so: "Follow this indepth tutorial <>_ to get started!" If you build the documentation locally, you should see warnings with more details that help you fix these.
Furthermore, the API documentation could use some more formatting. For example, in https://linkinglines.readthedocs.io/en/latest/linkinglines.html#module-linkinglines.SyntheticLines there are multiple multiline examples that have been collapsed onto a single line.
Just to preface: use of this specific data is just a suggestion, feel free to use any other source! I do believe, however, that adding some (small) examples with data other than dyke data is needed to make the general usability of the software evident. No complex analysis is required for this purpose; just show how the data is loaded (in csv form or with geopandas) and a simple analysis with e.g. the Cartesian space - Hough Transform space plot.
I have personally been involved with gathering fracture trace data, and some of our data is publicly available. The data consists of ESRI Shapefile data that can be loaded with geopandas (see #20), or with QGIS and transformed to csv to work with the current API.
This dataset contains both code and data. Fracture trace data is in the data/trace_data/traces/20m/ folder in geojson format.
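As a rough sketch of the conversion with geopandas (the file name is a placeholder; the "WKT" column name matches what the current API expects):

import geopandas as gpd

# "traces.shp" is a placeholder; substitute the shapefile or geojson described above
gdf = gpd.read_file("traces.shp")

# the current API expects a CSV with a "WKT" column, so serialize the geometries
df = gdf.drop(columns="geometry").assign(WKT=gdf.geometry.to_wkt())
df.to_csv("traces.csv", index=False)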
You can freely add individual files or parts of files to this repository, as the datasets are openly and freely licensed. Just add a mention and a link to Zenodo, or the DOI found on the Zenodo pages, in e.g. the README.md or some other suitable place.
Part of openjournals/joss-reviews#6147
The repository has files and folders that probably don't need to be checked into version control. A non-exhaustive list:
.virtual_documents
dist
__pycache__ (several folders)
src/linkinglines.egg-info
docs/.ipynb_checkpoints
docs/.virtual_documents
docs/_build
You could remove these. Going forward, you can keep such files out of version control with a gitignore (see the sketch after these links):
https://git-scm.com/docs/gitignore
https://docs.github.com/en/get-started/getting-started-with-git/ignoring-files
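A minimal .gitignore covering the items listed above might look like:

__pycache__/
dist/
*.egg-info/
.ipynb_checkpoints/
.virtual_documents/
docs/_build/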
Part of openjournals/joss-reviews#6147
import sys
sys.path.append("../src")
sys.path.append("/home/akh/myprojects/Dikes_Linking_Project/dikedata/spanish peaks")
# Packages written for the paper
from htMOD import HoughTransform
from clusterMod import AggCluster
from plotmod import DotsLines, plotScatterHist, pltRec, DotsHT
from PrePostProcess import *
from examineMod import examineClusters, checkoutCluster
from fitRadialCenters import RadialFit, NearCenters, CenterFunc
should probably become something like
import linkinglines as ll
from linkinglines.htMOD import HoughTransform  # note: submodule imports go through the package name, not the alias
... # etc
I believe a similar thing needs to be done for your tests to work automatically.
Part of openjournals/joss-reviews#6147
You have stated that version v.2.1.0 of LinkingLines is to be reviewed, and you have created a GitHub release accordingly. However, there is no git tag representing this revision of the LinkingLines repository. I would suggest creating a git tag that represents the correct revision. If the latest commit on master represents the version to be reviewed, you can tag it with git tag and push the tag with git push --tags.
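For example (the tag name is illustrative and should mirror the GitHub release):

git tag v2.1.0
git push --tags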
I think DemoFractures.ipynb serves as a good example of using the software on data other than dykes, but it looks like there are no radial patterns in the fracture data (not sure if I was expecting them when suggesting the data 🙂). Or that is how I interpret the results. I think you should add a few words to the notebook noting that, according to the analysis, there apparently are no radial patterns in this data (which is a result in itself), so that there is no confusion for someone trying to interpret the results.
Trying to run docs/DemoLinkingLines.ipynb, I get the following error in cell number 2:
❯ poetry run ipython DemoLinkingLines.ipynb
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[1], line 5
1 # Load the example dataset
2
3 #load it using built in reading function which also does the preprocessing
4 # the CSV must have a column called "WKT" and can have any other data
----> 5 dikeset=ll.readFile('../data/SpanishPeaks_3857.csv', preprocess=True)
File ~/projects/LinkingLines/src/linkinglines/PrePostProcess.py:84, in readFile(name, preprocess)
82 # if preprocess is True, preprocess the data
83 if preprocess:
---> 84 data = WKTtoArray(data)
85 data = preProcess(data)
87 return data
File ~/projects/LinkingLines/src/linkinglines/PrePostProcess.py:196, in WKTtoArray(df, plot)
194 if not ("WKT" in df.columns ):
195 if not ("geometry" in df.columns):
--> 196 raise ValueError("No geometry present")
198 xstart=[]
199 ystart=[]
ValueError: No geometry present
While making the new data formats work, maybe you made a change that broke the old data 🙂. There is also an error in the file path: it should be '../data/SpanishPeaks_3857.csv' rather than '/../data/SpanishPeaks_3857.csv', i.e. without the leading forward slash.
Part of openjournals/joss-reviews#6147
The codebase is currently quite tied to a specific research workflow. It would help re-use if it became more generic. Two good examples:
Currently the package only reads and writes csv files with an (undocumented) column containing Well-Known Text (WKT) LineStrings. A generic version would take in any geospatial dataframe, only needing to check whether the geometries are lines. The great geopandas package fits this use case perfectly: it automatically gives you a range of input and output formats, and probably more generic methods to read coordinates from geometries.
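A minimal sketch of what that check could look like (the file name is a placeholder; geom_type is standard geopandas):

import geopandas as gpd

gdf = gpd.read_file("lines.gpkg")  # placeholder; any vector format geopandas supports
# only the geometry type needs checking, the rest of the dataframe can be arbitrary
if not gdf.geom_type.isin(["LineString", "MultiLineString"]).all():
    raise ValueError("Expected line geometries")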
Ideally, package code is object oriented, with methods that are useful on their own. Here you use DataFrames as intermediate objects, but they still require scripting steps. This is easier to explain with an example:
Current code:
import pandas as pd
import linkinglines as ll

data = pd.read_csv('path/to/data')
theta, rho, xc, yc = ll.HoughTransform(data)
data['theta'] = theta
data['rho'] = rho
dtheta = 2 # degrees <-- unused
drho = 500 # meters <-- unused
dikeset, Z = ll.AggCluster(data)
The above code requires setting data['theta'] before AggCluster can work. As such, it would be easier to let ll.HoughTransform(data) return a new DataFrame with these columns added, which can be used directly, like so:
data = pd.read_csv('path/to/data')
ndata = ll.HoughTransform(data)  # returns a DataFrame with columns theta, rho, xc, yc added
dikeset, Z = ll.AggCluster(ndata)
You could go even one step further and let AggCluster call HoughTransform itself if the required columns (theta, rho, etc.) are missing.
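A minimal sketch of that pattern (the wrapper name is hypothetical, and it assumes the DataFrame-returning HoughTransform suggested above):

import pandas as pd
import linkinglines as ll

def agg_cluster(data: pd.DataFrame, **kwargs):
    # hypothetical convenience wrapper: compute the Hough Transform columns
    # on demand so callers can pass a raw line-segment DataFrame directly
    if not {"theta", "rho"}.issubset(data.columns):
        data = ll.HoughTransform(data)  # assumed to return a DataFrame with theta, rho, xc, yc added
    return ll.AggCluster(data, **kwargs)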
I would document the development dependencies (i.e. pytest) in pyproject.toml according to the specification: https://packaging.python.org/en/latest/specifications/pyproject-toml/#dependencies-optional-dependencies
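Following that specification, a minimal sketch (the dev extra name is just a convention, and this uses the standard [project] table rather than poetry's own group syntax):

[project.optional-dependencies]
dev = ["pytest"]

Users can then install the extras with pip install -e .[dev].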