geojsplit's People

Contributors

underchemist

Forkers

pierrecamilleri

geojsplit's Issues

allow different batch sizes

This is an enhancement request.

Besides setting a single batch size for all splits, let users set a batch size for each split file that will be generated.

The revision in the main code could look something like this:

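try:
	page = 0  # index of the fragment (split file) being built
	orig_batch = batch  # default size, used once `batches` runs out

	while True:
		data: List[Dict[str, Any]] = []

		# use the per-fragment size if one was given, else the default
		if batches is not None and page < len(batches):
			batch = batches[page]
		else:
			batch = orig_batch
		page += 1

		for _ in range(batch):
			data.append(next(features))

		yield geojson.FeatureCollection(data)
except StopIteration:
	# handler body was cut off in the original post; assumed here:
	# emit the final, partially filled batch if it is not empty
	if data:
		yield geojson.FeatureCollection(data)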

Originally, my aim was to have a file size limit per split, but this will do for now.
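The per-split batch size idea can be tried outside of geojsplit with a minimal standalone sketch. Everything here is hypothetical and not part of the package: the function name `iter_batches`, the `batches` parameter, and the use of plain lists instead of `geojson.FeatureCollection`. It assumes that once the per-split sizes run out, the default `batch` applies to all remaining splits.

```python
from itertools import islice
from typing import Any, Dict, Iterable, Iterator, List, Optional, Sequence


def iter_batches(
    features: Iterable[Dict[str, Any]],
    batch: int,
    batches: Optional[Sequence[int]] = None,
) -> Iterator[List[Dict[str, Any]]]:
    """Yield lists of features, using batches[n] as the size of the n-th
    split and falling back to `batch` once `batches` is exhausted."""
    it = iter(features)
    page = 0  # index of the split being built
    while True:
        if batches is not None and page < len(batches):
            size = batches[page]
        else:
            size = batch
        if size < 1:
            raise ValueError("batch sizes must be positive")
        chunk = list(islice(it, size))
        if not chunk:
            return  # input exhausted, nothing left to emit
        yield chunk
        page += 1


if __name__ == "__main__":
    feats = [{"type": "Feature", "id": i} for i in range(10)]
    for chunk in iter_batches(feats, batch=3, batches=[1, 2]):
        print(len(chunk))
```

Using `islice` instead of calling `next()` in a loop means the last, partially filled batch is emitted without a separate `StopIteration` handler.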

DeprecationWarning: ijson opening file in text mode instead of binary mode

Context

A warning is raised when using the geojsplit package. This happens when using a file as input.

Example of code raising the warning:
geojson = geojsplit.GeoJSONBatchStreamer(file_path)

Version

Python==3.9.7

geojsplit==0.1.2
ijson==2.6.1

Stack

  ijson/compat.py:48:DeprecationWarning: 
  ijson works by reading bytes, but a string reader has been given instead. This
  probably, but not necessarily, means a file-like object has been opened in text
  mode ('t') rather than binary mode ('b').
  
  An automatic conversion is being performed on the fly to continue, but on the
  other hand this creates unnecessary encoding/decoding operations that decrease
  the efficiency of the system. In the future this automatic conversion will be
  removed, and users will receive errors instead of this warning. To avoid this
  problem make sure file-like objects are opened in binary mode instead of text
  mode.
  
    warnings.warn(_str_vs_bytes_warning, DeprecationWarning)
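
As the warning text says, the way to avoid it is to hand ijson a file opened in binary mode. The sketch below shows that pattern on its own. The helper names `open_binary` and `iter_features` are hypothetical, and the `"features.item"` prefix assumes a standard FeatureCollection layout. It is not a patch for geojsplit, only an illustration of the `"rb"` mode change.

```python
from pathlib import Path
from typing import Any, BinaryIO, Iterator


def open_binary(path: Path) -> BinaryIO:
    """Open the file in binary mode, which is what ijson expects."""
    return path.open("rb")


def iter_features(path: Path, prefix: str = "features.item") -> Iterator[Any]:
    """Stream features one at a time from a GeoJSON file."""
    # imported here so the helper above can be used without ijson installed
    import ijson

    with open_binary(path) as fp:
        # ijson.items lazily yields each object found under `prefix`
        yield from ijson.items(fp, prefix)
```

With the file opened as `"rb"`, ijson reads bytes directly and does not need the on-the-fly conversion described in the warning.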

parallelize file creation

This is another enhancement request.

The CLI takes too long to run. Maybe the multiprocessing module could be used to improve the performance of this block:

for count, features in enumerate(gj.stream(batch=args.geometry_count)):
    try:
        new_filename: Path = gen_filename(
            gj.geojson, count, width=args.suffix_length, parent=args.output
        )
    except TypeError as e:
        logger.error(f"Could not generate a unique suffix.", exc_info=e)
    try:
        if not args.dry_run:
            if not new_filename.parent.exists():
                logger.debug(f"creating output directory {args.output}")
                new_filename.parent.mkdir(parents=True, exist_ok=True)
            with new_filename.open("w") as fp:
                json.dump(features, fp)
                logger.debug(
                    f"successfully saved {len(features['features'])} features to {new_filename}"
                )
    except IOError as e:
        logger.error(f"Could not write features to {new_filename}", exc_info=e)
    # account for 0 based index of enumerate that is required for `pad` method.
    if args.limit is not None:
        if count >= args.limit - 1:
            break
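
One way to overlap the file writes is sketched below. Note that it uses a thread pool from `concurrent.futures` rather than the multiprocessing module mentioned above, because threads avoid pickling each batch to send it to another process. Whether this actually speeds things up depends on how much of the time goes to disk I/O versus JSON serialization, which has not been measured here. The names `write_batch` and `write_batches_parallel` and the `part_0000.geojson` naming scheme are hypothetical and stand in for the loop body and `gen_filename`.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Any, Dict, Iterable, List


def write_batch(path: Path, features: Dict[str, Any]) -> Path:
    """Write one batch of features to its own file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w") as fp:
        json.dump(features, fp)
    return path


def write_batches_parallel(
    batches: Iterable[Dict[str, Any]], out_dir: Path, workers: int = 4
) -> List[Path]:
    """Submit one write task per batch and wait for all of them."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(write_batch, out_dir / f"part_{count:04d}.geojson", batch)
            for count, batch in enumerate(batches)
        ]
        # result() re-raises any exception (e.g. OSError) from the worker
        return [future.result() for future in futures]
```

A caveat with this sketch: all batches are submitted before any result is collected, so the whole input can end up in memory at once, which works against the streaming design. A bounded queue or submitting in small groups would limit that.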
