
linkinglines's Issues

Documentation links don't work and examples are not formatted

Part of openjournals/joss-reviews#6147

The documentation contains unresolved links such as: "Follow this indepth tutorial <>_ to get started!"

If you build the documentation locally, you should see warnings with more details on how to fix these.

Furthermore, the API documentation could use some more formatting. For example, in
https://linkinglines.readthedocs.io/en/latest/linkinglines.html#module-linkinglines.SyntheticLines there are multiple multiline examples that are rendered unformatted on a single line.
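
As a hedged illustration only (the function and docstring below are hypothetical, not copied from linkinglines.SyntheticLines), a numpydoc-style Examples section with a blank line after the header and explicit >>> prompts usually renders as a proper multi-line code block in Sphinx instead of collapsing onto one line:

def synthetic_lines(n_lines, angle=30.0):
    """Generate n_lines synthetic line segments at the given angle (hypothetical).

    Examples
    --------
    Each example line keeps its own ``>>>`` prompt so Sphinx renders the
    block as literal code rather than a single flowed paragraph.

    >>> lines = synthetic_lines(5, angle=30.0)
    >>> len(lines)
    5
    """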

Add example with data other than dykes (e.g. fracture trace data)

Just to preface: use of this specific data is just a suggestion, feel free to use any other source! I do believe, however, that adding some (small) example(s) with data other than dyke data is needed to make the general usability of the software evident. No complex analysis is required for this purpose: just show how the data is loaded (in csv form or with geopandas) and a simple analysis, e.g. the Cartesian space vs. Hough transform space plot (see the sketch after the data suggestions below).

My data suggestions

I have personally been involved with gathering fracture trace data and some of our data is publicly available:

The data consists of ESRI Shapefile data that can be loaded with geopandas (See #20) or with QGIS and transformed to csv to work with the current API.

This dataset contains both code and data. The fracture trace data is in the data/trace_data/traces/20m/ folder in GeoJSON format.

You can freely add individual files, or parts of files, to this repository as the datasets are openly and freely licensed. Just add a mention and a link to Zenodo, or the DOI found on the Zenodo pages, in e.g. the README.md or some other suitable place.
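
A minimal sketch of such an example, assuming the fracture traces come as a GeoJSON of LineStrings; the file name is hypothetical, and the exact signatures of readFile, HoughTransform and DotsLines are taken loosely from the demo notebooks, so treat them as assumptions:

import geopandas as gpd
import pandas as pd
import linkinglines as ll

# Load the fracture traces with geopandas (GeoJSON, Shapefile, etc. all work)
gdf = gpd.read_file("data/trace_data/traces/20m/traces.geojson")  # hypothetical file name

# Convert the line geometries to the WKT column the current csv-based API expects
pd.DataFrame({"WKT": gdf.geometry.to_wkt()}).to_csv("fracture_traces.csv", index=False)

# Load and preprocess with the package
fractures = ll.readFile("fracture_traces.csv", preprocess=True)

# Hough transform, following the current API
theta, rho, xc, yc = ll.HoughTransform(fractures)
fractures["theta"] = theta
fractures["rho"] = rho

# Cartesian space vs. Hough transform space plot
ll.DotsLines(fractures)  # assumed plotting helper from the demo notebooks; signature may differ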

Clean up repo with a .gitignore

Part of openjournals/joss-reviews#6147

The repository has files and folders that probably don't need to be checked into version control. A non-exhaustive list:

.virtual_documents
dist
__pycache__ (several folders)
src/linkinglines.egg-info
docs/.ipynb_checkpoints
docs/.virtual_documents
docs/_build

You could remove these. In the future you can keep these files out of version control with a .gitignore (a sketch follows after the links below). See:
https://git-scm.com/docs/gitignore
https://docs.github.com/en/get-started/getting-started-with-git/ignoring-files
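
A minimal .gitignore covering the items listed above could look like this (adjust patterns as needed):

# Python build artifacts and caches
__pycache__/
dist/
src/linkinglines.egg-info/

# Jupyter and editor working files
.ipynb_checkpoints/
.virtual_documents/

# Generated documentation
docs/_build/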

Documentation example requires local modules

Part of openjournals/joss-reviews#6147

import sys
sys.path.append("../src")
sys.path.append("/home/akh/myprojects/Dikes_Linking_Project/dikedata/spanish peaks")

# Packages written for the paper
from htMOD import HoughTransform
from clusterMod import AggCluster
from plotmod import DotsLines, plotScatterHist, pltRec, DotsHT
from PrePostProcess import *
from examineMod import examineClusters, checkoutCluster
from fitRadialCenters import RadialFit, NearCenters, CenterFunc

should probably become something like

import linkinglines as ll

from linkinglines.htMOD import HoughTransform  # the package name, not the alias, in from-imports
...  # etc

I believe a similar thing needs to be done for your tests to work automatically.
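
For the tests, something like the following avoids sys.path manipulation entirely (a rough sketch; it only assumes that readFile, HoughTransform and AggCluster are exposed at the package level, as the examples above suggest):

# test_imports.py, run with pytest against the installed package
import linkinglines as ll

def test_public_api_importable():
    # These names appear in the docs and notebooks; no sys.path hacks needed
    assert callable(ll.readFile)
    assert callable(ll.HoughTransform)
    assert callable(ll.AggCluster)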

Document the version of LinkingLines sent for review with tag

Part of openjournals/joss-reviews#6147

You have stated that version v.2.1.0 of LinkingLines is to be reviewed, and you have created a GitHub release accordingly. However, there is no git tag representing this revision of the LinkingLines repository. I would suggest creating a git tag that marks the correct revision. If the latest commit on master represents the version to be reviewed, you can tag it with git tag and push the tag with git push --tags.
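
Concretely, assuming the tag name should match the stated release version and that the tip of master is the reviewed revision:

git tag v2.1.0
git push --tags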

Add short explanation inside `DemoFractures.ipynb`

I think DemoFractures.ipynb serves as a good example of using the software for data other than dykes, but it looks like there are no radial patterns in the fracture data (not sure if I was expecting them when suggesting the data 😆), or at least that is how I interpret the results. I think you should add a few words to the notebook about there apparently not being any radial patterns in this data according to the analysis, which is a result in itself, so that there is no confusion for someone trying to interpret the results.

Issue with running `DemoLinkingLines.ipynb`

Trying to run docs/DemoLinkingLines.ipynb, I get the following error in cell number 2:

➜ poetry run ipython DemoLinkingLines.ipynb
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[1], line 5
      1 # Load the example dataset
      2
      3 #load it using built in reading function which also does the preprocessing
      4 # the CSV must have a column called "WKT" and can have any other data
----> 5 dikeset=ll.readFile('../data/SpanishPeaks_3857.csv', preprocess=True)

File ~/projects/LinkingLines/src/linkinglines/PrePostProcess.py:84, in readFile(name, preprocess)
     82 # if preprocess is True, preprocess the data
     83 if preprocess:
---> 84     data = WKTtoArray(data)
     85     data = preProcess(data)
     87 return data

File ~/projects/LinkingLines/src/linkinglines/PrePostProcess.py:196, in WKTtoArray(df, plot)
    194 if not ("WKT" in df.columns ):
    195     if not ("geometry" in df.columns):
--> 196      raise ValueError("No geometry present")
    198 xstart=[]
    199 ystart=[]

ValueError: No geometry present

While making the new data formats work, maybe a change was made that broke reading the old data 😅

There is also an error in the path to the file: it should be '../data/SpanishPeaks_3857.csv' rather than '/../data/SpanishPeaks_3857.csv', i.e. without the leading forward slash.

More generic API

Part of openjournals/joss-reviews#6147

The codebase is currently quite tied to a specific research workflow. It would help re-use if it became more generic. Two examples:

On the use of WKT

Currently the package only reads and writes csv files with an (undocumented) column containing Well-Known Text (WKT) LineStrings. A generic version would accept any geospatial dataframe and only need to check whether the geometries are lines. The geopandas package fits this use case perfectly: it automatically gives you a range of input and output formats, and probably more generic methods for reading coordinates from geometries.
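
A rough sketch of what such a generic entry point could look like, built on geopandas (the function name read_lines is hypothetical, not part of the package):

import geopandas as gpd

def read_lines(path):
    """Read any geospatial file and return a GeoDataFrame of line geometries (sketch)."""
    gdf = gpd.read_file(path)  # GeoJSON, Shapefile, GeoPackage, ...
    line_types = {"LineString", "MultiLineString"}
    if not gdf.geom_type.isin(line_types).all():
        raise ValueError("Input must contain only line geometries")
    return gdf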

On the use of scripting

Ideally, package code is object-oriented, with methods that are useful on their own. Here you use DataFrames as intermediate objects, but they still require scripting steps in between. This is easier to explain with an example:

Current code:

import pandas as pd
import linkinglines as ll

data = pd.read_csv('path/to/data')
theta, rho, xc, yc = ll.HoughTransform(data)
data['theta'] = theta
data['rho'] = rho

dtheta = 2   # degrees <-- unused
drho = 500   # meters <-- unused

dikeset, Z = ll.AggCluster(data)

The above code requires setting data['theta'] and data['rho'] before AggCluster can work. As such, it would be easier to let ll.HoughTransform(data) return a new DataFrame with these columns added, which can then be used directly, like so:

data = pd.read_csv('path/to/data')
ndata = ll.HoughTransform(data)  # returns dataframe with columns theta, rho, xc, yc added
dikeset, Z = ll.AggCluster(ndata)

You can go even one step further and let AggCluster call HoughTransform itself if the required columns (theta, rho, etc.) are missing from the DataFrame.
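
For example, a thin wrapper along these lines (a sketch only; the column names and the HoughTransform return values are assumed from the snippets above):

import linkinglines as ll

def agg_cluster(data):
    """Cluster lines, computing the Hough transform first if its columns are missing (sketch)."""
    if not {"theta", "rho"}.issubset(data.columns):
        theta, rho, xc, yc = ll.HoughTransform(data)
        data = data.assign(theta=theta, rho=rho)
    return ll.AggCluster(data)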
