tmontaigu / pylas
⚠️ pylas was merged into laspy 2.0 https://github.com/laspy/laspy ⚠️
License: BSD 3-Clause "New" or "Revised" License
Hi,
I'm trying to read a received LAS file with pylas and I'm getting this:
>>> import pylas
>>> f1 = pylas.open('/usr/src/data/pylas/lidar/pointcloud.las')
>>> f1.read()
Traceback (most recent call last):
File "/opt/anaconda3/lib/python3.7/site-packages/pylas-0.3.4-py3.7.egg/pylas/point/record.py", line 260, in from_stream
data = np.frombuffer(point_data_buffer, dtype=points_dtype, count=count)
ValueError: buffer is smaller than requested size
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/opt/anaconda3/lib/python3.7/site-packages/pylas-0.3.4-py3.7.egg/pylas/lasreader.py", line 62, in read
points = self._read_points(vlrs)
File "/opt/anaconda3/lib/python3.7/site-packages/pylas-0.3.4-py3.7.egg/pylas/lasreader.py", line 115, in _read_points
self.stream, point_format, self.header.point_count
File "/opt/anaconda3/lib/python3.7/site-packages/pylas-0.3.4-py3.7.egg/pylas/point/record.py", line 269, in from_stream
points_dtype,
File "/opt/anaconda3/lib/python3.7/site-packages/pylas-0.3.4-py3.7.egg/pylas/point/record.py", line 32, in raise_not_enough_bytes_error
point_data_buffer_len / points_dtype.itemsize,
pylas.errors.PylasError: The file does not contain enough bytes to store the expected number of points
expected 230066000 bytes, read 195556100 bytes (34509900 bytes missing == 862747.5 points) and it cannot be corrected
195556100 (bytes) / 40 (point_size) = 4888902.5 (points)
I can't attach the LAS file because it is bigger than 10 MB, but I could send a link to download it if needed.
I hope you can help.
pm
Hi! I am using pylas for reading and writing LAS and LAZ files; it is a really great tool! Just wanted to mention that while files with both upper- and lower-case extensions (e.g. 'file.laz' and 'file.LAZ') are properly recognised as compressed files when reading, this is not the case when writing: writing to 'file.laz' generates a compressed file, while 'file.LAZ' does not get compressed. Maybe having filename.split(".")[-1].lower() == "laz"
here?
Line 327 in 1255266
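The suggested case-insensitive extension check could be sketched like this (the helper name is hypothetical; note that split(".") returns a list, so the [-1] index must come before .lower()):

```python
def is_laz(filename: str) -> bool:
    # Compare the last extension case-insensitively so 'file.LAZ'
    # is treated as compressed just like 'file.laz'.
    return filename.split(".")[-1].lower() == "laz"

print(is_laz("points.LAZ"))  # → True
```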
compress_points_buf is a bytes object when using lazrs, causing errors with the compress_points_buf.tobytes() call.
Line 283 in 7b35d43
Using lazrs 0.2.5, pylas 0.4.3, Python 3.8.5
Other notes in case relevant: either drop the .tobytes() call, or replace it with points_bytes = bytearray(compress_points_buf if isinstance(compress_points_buf, bytes) else compress_points_buf.tobytes())
Thanks for this module! I love the laz-perf integration and streaming support! Would it be possible to add return_number
to the point attributes?
For example, lasinfo shows:
lasinfo -i pylastests/simple.laz
...
number of first returns: 925
number of intermediate returns: 28
number of last returns: 901
number of single returns: 789
overview over number of returns of given pulse: 789 195 71 10 0 0 0
...
However, with pylas:
python -c 'import pylas; print(pylas.open("pylastests/simple.laz").points.dtype.names)'
You only get:
('X', 'Y', 'Z', 'intensity', 'bit_fields', 'raw_classification', 'scan_angle_rank', 'user_data', 'point_source_id', 'gps_time', 'red', 'green', 'blue')
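As a workaround, return_number can be unpacked from bit_fields manually: for point formats 0-5 the LAS spec packs return_number in bits 0-2 and number_of_returns in bits 3-5. A sketch with made-up values (in practice bit_fields would come from las.points):

```python
import numpy as np

# Bit layout of 'bit_fields' for point formats 0-5:
#   bits 0-2: return_number, bits 3-5: number_of_returns,
#   bit 6: scan_direction_flag, bit 7: edge_of_flight_line
bit_fields = np.array([0b00101001, 0b00010010], dtype=np.uint8)  # illustrative values
return_number = bit_fields & 0b111
number_of_returns = (bit_fields >> 3) & 0b111
print(return_number.tolist(), number_of_returns.tolist())  # → [1, 2] [5, 2]
```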
I would like to suggest a new feature: the ability to convert a lidar .las / .laz file to a .tif file.
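pylas itself does not provide this; a minimal, dependency-light sketch of the gridding step (the function name and keep-max-Z rule are illustrative choices, and writing the actual GeoTIFF would need e.g. rasterio or GDAL):

```python
import numpy as np

def rasterize_max_z(x, y, z, cell=1.0):
    # Bin each point into a grid cell and keep the highest Z per cell;
    # empty cells stay NaN (a common "no data" convention).
    col = ((x - x.min()) / cell).astype(int)
    row = ((y - y.min()) / cell).astype(int)
    grid = np.full((row.max() + 1, col.max() + 1), np.nan)
    for r, c, v in zip(row, col, z):
        if np.isnan(grid[r, c]) or v > grid[r, c]:
            grid[r, c] = v
    return grid
```

The x, y, z inputs would come from the scaled coordinates of a pylas/laspy point record.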
Hey,
I am struggling to remove a dimension once it is already in the point_format.dimensions list and in the points array.
Is there any method I am missing? Or any trick or advice that could help me?
Thanks
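pylas stores points in a numpy structured array, so one possible workaround (a sketch, not an official pylas API) is to copy the record into a dtype without the unwanted field:

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Hypothetical point record with an extra dimension to drop.
points = np.zeros(3, dtype=[("X", "i4"), ("Y", "i4"), ("extra", "u1")])
keep = [n for n in points.dtype.names if n != "extra"]
# Multi-field indexing returns a view with gaps; repack to a contiguous record.
trimmed = rfn.repack_fields(points[keep]).copy()
print(trimmed.dtype.names)  # → ('X', 'Y')
```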
Hi,
First of all, I want to compliment you on this Python implementation for reading LiDAR data. I think the on-the-fly decompression in Python is great.
I was wondering if there is a possibility of adding a seek function to the LasReader. This would greatly improve the usability for indexed reading, especially for large compressed LAZ datasets containing a lasindex.
I saw in the rust package that the LasZipDecompressor actually supports seeking. https://github.com/tmontaigu/laz-rs-python/blob/master/src/lib.rs#L187. I think the same holds for your laszip python bindings.
I was actually kind of surprised that this option was missing and wondering before doing the implementation whether there is a limitation I'm overlooking?
Thijs van Winden
PS: I have a use case where I have to query nearby LiDAR points from a huge dataset, so streaming access would be really useful there.
Using a script like this:
import sys
import io
import pylas
las = pylas.read(sys.argv[1])
las.write("tmp.laz")
with io.BytesIO() as out:
    las.write(out, do_compress=True)
    out.seek(0)
    pylas.read(out)
if sys.argv[1] is a file with extra bytes, the tmp.laz will be invalid
(lasinfo will output errors like
[...]
ERROR: 'end-of-file' after 454 of 1065 points for '.\tmp.laz'
corrupt chunk table
[...])
And pylas will fail to read the file with an error like
pylas.errors.PylasError: The file does not contain enough bytes to store the expected number of points
expected 64965 bytes, read 27703 bytes (37262 bytes missing == 610.8524590163935 points) and it cannot be corrected
27703 (bytes) / 61 (point_size) = 454.1475409836066 (points)
However, if the file at sys.argv[1] does not have extra bytes, everything is fine.
Right now the pylas.open function only allows opening in read mode.
(Writing is possible, but it is a method of the pylas.LasBase class.)
The PR #16 plans to add a write mode to create a new file and write to it.
Can an "append" mode be added?
Can a read + write mode be added?
In both cases the hardest part will be the compressed case (LAZ files), as well as files with EVLRs.
(Append mode seems more feasible, though.)
I'm developing a tool which requires laz files and am looking to use pylas to perform conversions from las on-the-fly. The tool uses a GUI and, although pylas is quick, there's no getting around the volume of data; las files are large and can still take several (10+) seconds to convert. I would like to provide users with a progress dialog during the conversion.
To convert, I'm running:
las = pylas.read(las_input)
las.write(laz_output)
It doesn't look like write provides a way to measure progress or a return value indicating completion. Is this something you would be interested in?
It looks like LasData (ultimately) uses LasData._write_to() for the write call. This creates a writer which calls write_points. (I think I linked the correct invocation.) This checks a self.done. Is that something which could be returned up the call chain to indicate success?
As for getting progress updates, I'd have to think about how that could be done (another method to avoid a performance hit?), if you're even interested in that. For now, I'm just going to check for the existence of the output file; if it exists, the process completed, right? :).
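One way to get coarse progress without changes to pylas could be to wrap the output stream and count bytes as they pass through. This is a sketch: ProgressWriter and the callback are made up, and the writer also delegates seek/tell since the LAS header is typically rewritten at the end:

```python
import io

class ProgressWriter(io.RawIOBase):
    """Counts bytes written through it and reports the total to a callback."""

    def __init__(self, raw, callback):
        self._raw = raw
        self._callback = callback
        self.total = 0

    def writable(self):
        return True

    def write(self, b):
        n = self._raw.write(b)
        self.total += n
        self._callback(self.total)  # e.g. update a GUI progress bar
        return n

    def seekable(self):
        return True

    def seek(self, pos, whence=0):
        return self._raw.seek(pos, whence)

    def tell(self):
        return self._raw.tell()

# e.g. las.write(ProgressWriter(open("out.laz", "wb"), update_gui), do_compress=True)
```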
I was trying to read LAZ files from S3, using s3fs and pylas' ability to read from streams:
import pylas
import s3fs
fs = s3fs.S3FileSystem()
las = pylas.open(fs.open('s3://bucketname/example.laz', 'rb'))
However, I get the following error:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/ubuntu/miniconda3/envs/pylas-env/lib/python3.6/site-packages/pylas/lib.py", line 34, in open_las
return read_las_stream(source)
File "/home/ubuntu/miniconda3/envs/pylas-env/lib/python3.6/site-packages/pylas/lib.py", line 90, in read_las_stream
vlrs = vlr.VLRList.read_from(data_stream, num_to_read=header.number_of_vlr)
File "/home/ubuntu/miniconda3/envs/pylas-env/lib/python3.6/site-packages/pylas/vlr.py", line 571, in read_from
raw = RawVLR.read_from(data_stream)
File "/home/ubuntu/miniconda3/envs/pylas-env/lib/python3.6/site-packages/pylas/vlr.py", line 61, in read_from
data_stream.readinto(header)
AttributeError: 'S3File' object has no attribute 'readinto'
Looks like s3fs doesn't have an implementation of readinto. So, is there a recommended way to read from S3? Or should I look into adding an implementation of readinto for S3File?
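One possible workaround (a sketch, not part of pylas or s3fs) is a thin io.RawIOBase adapter that synthesises readinto() from read(), which could then wrap the S3File before handing it to pylas.open():

```python
import io

class ReadIntoAdapter(io.RawIOBase):
    """Give readinto() to a file-like object that only implements read()."""

    def __init__(self, raw):
        self._raw = raw

    def readable(self):
        return True

    def readinto(self, b):
        # Fill the caller's buffer from the wrapped object's read().
        data = self._raw.read(len(b))
        b[: len(data)] = data
        return len(data)

# e.g. pylas.open(ReadIntoAdapter(fs.open('s3://bucketname/example.laz', 'rb')))
```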
Hi,
I'm trying to write the file under the same name after I apply update_header(), but I'm getting "Permission Denied". It happens even if I close the file.
I checked the documentation but I can't find a way to save the file under the same name after update_header. Am I forced to write to another file name?
I hope you can help,
pm
Handle 'Scaled Extra Byte'
Downloading any .laz file from here: https://remotesensingdata.gov.scot/data#/map and attempting to extract it using e.g.
pylas.read("NT0763_2PPM_LAS_PHASE1.laz")
Results in:
60 vlr_data = np.frombuffer(laszip_vlr.record_data, dtype=np.uint8)
62 point_decompressed = np.zeros(point_count * point_size, np.uint8)
---> 64 lazrs.decompress_points(point_compressed, vlr_data, point_decompressed, parallel)
65 except lazrs.LazrsError as e:
66 raise LazError("lazrs error: {}".format(e)) from e
PanicException: assertion failed: mid <= self.len()
Any idea why this would be?
I'm on a Windows 10 machine. The file opens correctly in QGIS.
This might be a tough request, but it would be nice to be able to separate out the current open function into more pythonic open and read functions. For example:
with pylas.open('simple.laz') as fh:
    fh.read()
Having pylas.open read just enough of the file/handle/stream to get the header would also provide convenient, quick access to some info without reading the entire dataset.
I find that the tests (while they work fine) should be improved to be more readable and understandable.
Hi,
First, thank you for this very helpful library.
Is there any way to easily write the SRS in WKT format when creating a las file with pylas.create()?
I see there is a write_to() method in the pylas.vlrs.vlrlist module, but I do not understand how it works. Maybe an example would help here.
Thanks.
import pylas
las = pylas.create()
las.write('empty.las') # Ok
las.write('empty.laz') # crash
At the moment, sub fields are a bit awkward to interact with.
Example:
import pylas
las = pylas.read(som_file)
las.classification[:] = 0
assert np.all(las.classification == 0) # This will fail (unless the classification was already 0s)
The current way to work around this is to do:
import pylas
las = pylas.read(som_file)
classification = las.classification
classification[:] = 0
las.classification = classification
assert np.all(las.classification == 0) # This will work
All 'sub fields' have this behaviour (classification, key_point, number_of_returns, return_number, etc.); this is because when accessing these fields we return a new array with the values.
The way to fix/improve this would be to return an object that acts as a view into the array that holds the values, and that view-object would take care of returning and setting the proper values using a bitmask.
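A minimal sketch of such a view-object (not pylas' actual implementation; the class name and mask values are illustrative):

```python
import numpy as np

class BitFieldView:
    """View over a bit-packed sub field; writes mutate the shared array."""

    def __init__(self, array, mask):
        self.array = array                      # packed array, shared, not copied
        self.mask = np.uint8(mask)
        m = int(mask)
        self.shift = (m & -m).bit_length() - 1  # index of the mask's lowest set bit

    def __getitem__(self, item):
        return (self.array[item] & self.mask) >> self.shift

    def __setitem__(self, item, value):
        packed = (np.asarray(value, dtype=self.array.dtype) << self.shift) & self.mask
        self.array[item] = (self.array[item] & ~self.mask) | packed

# classification lives in bits 0-4 of 'raw_classification' for formats 0-5
raw = np.array([0b11100101, 0b00010010], dtype=np.uint8)
classification = BitFieldView(raw, 0b00011111)
classification[:] = 0          # now mutates `raw` in place, bits 5-7 untouched
print(int(classification[0]), int(raw[0]) >> 5)  # → 0 7
```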
I am comparing pylas[lazrs] to laszip64 at converting las to laz and seeing a 30% speed bump with lazrs 🎉. If I understand correctly, it should be as simple as:
las = pylas.read(las_input)
las.write(laz_output)
However, I get the following warning:
WARNING:pylas.headers.rawheader:Received return numbers up to 6, truncating to 5 for header.
I'm not sure what this means. I have tried grepping through the sources but can't find where the truncation comes from.
Any insight? Is my data being truncated?
In pylas==0.4.3, for example, it was possible to concatenate fields and to apply several numpy operations to the fields of the point clouds. Unfortunately, some operations are no longer possible in pylas==0.5.0a1, probably because of the new SubFieldView class. Here are some examples of operations that crash with pylas==0.5.0a1 but worked perfectly with pylas==0.4.3:
>>> import pylas
>>> import numpy as np
>>> cloud = pylas.read("pylastests/simple.las")
>>> np.concatenate([cloud.return_number, cloud.return_number])
...
ValueError: zero-dimensional arrays cannot be concatenated
>>> cloud.return_number[0] + 1
...
TypeError: unsupported operand type(s) for +: 'SubFieldView' and 'int'
>>> field = np.zeros(len(cloud.points), dtype=np.uint8)
>>> field[:] = cloud.return_number[:]
...
TypeError: __array__() takes 1 positional argument but 2 were given
>>> ordered = np.argsort(cloud.gps_time)
>>> cloud.return_number[:] = cloud.return_number[ordered]
...
TypeError: unsupported operand type(s) for <<: 'SubFieldView' and 'int'
Current support for 1.4 files with EVLRs is not very good (especially for LAZ files); some tests are currently skipped.