jimut123 / jimutmap

API to fetch a large number of high-resolution satellite images from satellites.pro quickly through multi-threading. Create your own map dataset. Bringing data to Humans.

Home Page: https://jimutmap.readthedocs.io/

License: GNU General Public License v3.0

Python 100.00%
scraping satellite images high resolution api geo satellite-data multithreading enormous

jimutmap's Introduction


...Bringing Data to Humans



📋 Contents

Note: I am actively looking for project maintainers who can volunteer to fix bugs/issues and work on TODOs, due to my limited time in maintaining this project. If you want to be a maintainer, either solve a bug or successfully complete a TODO, then email me for the role (this process is for selecting valid maintainers).

Purpose

This package collects data from satellites.pro. It fetches all the tiles (image and road mask pairs) specified by the parameters the user provides. It uses an API key generated at the time of browsing the map. There are future plans for this project; check the TODO to see what it will support.

The API accessKey token is fetched automatically, using chromedriver-autoinstaller, if you have Google Chrome or Chromium installed. Otherwise, you have to fetch it manually and set the ac_key parameter every 10-15 minutes. To find the key, select one tile from Apple Maps in Chrome or Firefox, go to Developer -> Network, look at the assets, and find the part of the link that begins with &accessKey= and runs until the next &.
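As a minimal sketch of the manual step, the key can be pulled out of a copied tile link with the standard library. The function name `extract_access_key` and the example URL are hypothetical; only the `accessKey` parameter name comes from the text above.

```python
from urllib.parse import parse_qs, urlsplit


def extract_access_key(tile_url):
    """Return the value of the accessKey query parameter, or None if absent."""
    query = urlsplit(tile_url).query
    values = parse_qs(query).get("accessKey", [])
    return values[0] if values else None


# Hypothetical tile link, for illustration only.
example_url = (
    "https://tiles.example.invalid/tile"
    "?style=7&z=19&accessKey=1614125879_abc123&v=4"
)
print(extract_access_key(example_url))
```

Note that `parse_qs` percent-decodes the value. If the key has to be passed on in its encoded form, copy the raw substring from the link instead.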

[Back to Top]

💡 Need for scraping satellite data

The satellite images are of very good quality, and you only need to give the coordinates (lat, lon) and the zoom level to get your own dataset of high-resolution satellite images. Create your own dataset and apply ML algorithms :')

The scraping API is provided: call it and download the tiles.

[Back to Top]

🛠 Installation and Usages

sudo pip3 install jimutmap

# Install google chrome for chrome driver
wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
sudo apt install ./google-chrome-stable_current_amd64.deb

# optional for viewing the temporary files generated by internal databases
sudo apt-get install sqlite sqlitebrowser

The Google Chrome web browser must be installed on the system.

For example usage, check test.py. A sample run prints output like the following:

Sorry, 5 -- threads unavailable, using maximum CPU threads : 4
Initializing jimutmap ... Please wait...
Sorry, 50 -- threads unavailable, using maximum CPU threads : 4
Initializing jimutmap ... Please wait...
100%|██████████| 20/20 [00:00<00:00, 113.67it/s]
Sorry, 50 -- threads unavailable, using maximum CPU threads : 4
Initializing jimutmap ... Please wait...
100%|██████████| 20/20 [00:00<00:00, 722.10it/s]
Total satellite images to be downloaded =  210
Total roads tiles to be downloaded =  210
Approx. estimated disk space required = 4.1015625 MB
Total number of satellite images needed to be downloaded =  210
Total number of satellite images needed to be downloaded =  210
Batch =============================================================================  1
===================================================================================
Sorry, 50 -- threads unavailable, using maximum CPU threads : 4
Downloading all the satellite tiles: 
Updating sanity db ...
100%|██████████| 27/27 [00:00<00:00, 13291.81it/s]
Total number of satellite images needed to be downloaded =  197
Total number of satellite images needed to be downloaded =  196
Downloading speed == 0.09333877563476563 MiB/s 
Waiting for 15 seconds... Busy downloading
Downloading speed == 0.11976458231608073 MiB/s 
Waiting for 15 seconds... Busy downloading
Downloading speed == 0.01717344919840495 MiB/s 
Waiting for 15 seconds... Busy downloading
Batch =============================================================================  2
===================================================================================
Downloading all the satellite tiles: 
Updating sanity db ...
100%|██████████| 420/420 [00:00<00:00, 99921.03it/s]
Total number of satellite images needed to be downloaded =  0
Total number of satellite images needed to be downloaded =  0
************************* Download Sucessful *************************
Cleaning up... hold on
Updating sticher db ...
100%|██████████| 420/420 [00:00<00:00, 24357.17it/s]
Total number of satellite images needed to be downloaded =  0
Total number of satellite images needed to be downloaded =  0
Calculating bounding boxes for tiles :: 
Total number of rows present in the database=  210
100%|██████████| 210/210 [00:00<00:00, 528693.78it/s]
Min lat tile = 390842, Max lat tile = 390855, Min lon tile = 228264, Max lon tile = 228278
No. of tiles in latitude = 13, and longitude = 14
Creating an image of size : 3328x3584 pixels ...
100%|██████████| 13/13 [00:00<00:00, 28.89it/s]
100%|██████████| 13/13 [00:00<00:00, 42.02it/s]
Temporary sqlite files to be deleted = ['temp_sanity.sqlite', 'sticher.sqlite'] ? 
(y/N) : y
Temporary chromedriver folders to be deleted = ['100'] ? 
(y/N) : y
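The numbers in this log can be checked with a little arithmetic. The sketch below is not the library's code: the 256-pixel tile size and the 10 KiB-per-tile estimate are assumptions inferred from the log itself (3328 / 13 = 256, and 4.1015625 MB for 210 + 210 tiles).

```python
TILE_SIZE_PX = 256   # assumption: inferred from 3328 px / 13 tiles
KIB_PER_TILE = 10    # assumption: inferred from 4.1015625 MB / 420 tiles


def stitched_size(min_lat_tile, max_lat_tile, min_lon_tile, max_lon_tile,
                  tile_px=TILE_SIZE_PX):
    """Pixel size of the stitched image, using index differences as the log does."""
    rows = max_lat_tile - min_lat_tile
    cols = max_lon_tile - min_lon_tile
    return rows * tile_px, cols * tile_px


def estimated_disk_mb(n_tiles, kib_per_tile=KIB_PER_TILE):
    """Rough disk estimate in MB for n_tiles tiles."""
    return n_tiles * kib_per_tile / 1024


print(stitched_size(390842, 390855, 228264, 228278))
print(estimated_disk_mb(210 + 210))
```

The log reports the differences of the tile indices (13 and 14). Counted inclusively, the ranges hold 14 and 15 indices, and 14 x 15 matches the 210 tiles that were downloaded.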

[Back to Top]

📚 Some example images downloaded at different scales

[Back to Top]

📘 Datasets

Jimutmap might behave weirdly in some cases. Please check the list of datasets here.

📚 Stitched tiles for Kolkata

[Back to Top]

📹 YouTube video

If the documentation is confusing, please watch this video to see the scraping in action: Apple Maps API to get enormous amount of satellite data for free using Python3.

[Back to Top]

📚 Sample of the images downloaded

img of sat dat

[Back to Top]

:feelsgood: Perks

This is done through parallel processing, so it uses the maximum number of threads available on your CPU. Change the code to suit your own requirements!
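The sample output reports "Sorry, 50 -- threads unavailable, using maximum CPU threads : 4", which suggests the requested thread count is capped at what the machine offers. A minimal sketch of such a cap, with the hypothetical helper name `effective_threads` (this is not taken from the library's source):

```python
import os


def effective_threads(requested, available=None):
    """Cap the requested thread count at the number of CPU threads."""
    if available is None:
        # os.cpu_count() may return None on some platforms.
        available = os.cpu_count() or 1
    return max(1, min(requested, available))


print(effective_threads(50, available=4))
```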

If you want to re-fetch tiles, remember to delete or move the existing tiles after every fetch request. Otherwise you won't get updated satellite data for tiles that are already present. The set of tiles still to be downloaded is calculated automatically, so all progress remains saved.

[Back to Top]

📓 Additional Note

This is created for educational and research purposes only! The authors are not liable for any damage to private property.

[Back to Top]

:atom: TODOs

Please check TODOs, since this project needs collaborators.

[Back to Top]

❓ Questions, or want to discuss something?

Submit an issue.

[Back to Top]

🤝 Contribution

Please see Contributing.md

[Back to Top]

🛡️ LICENSE

 GNU GENERAL PUBLIC LICENSE
                       Version 3, 29 June 2007

 Copyright (C) 2019-20 Jimut Bahan Pal, <https://jimut123.github.io/>
 Everyone is permitted to copy and distribute verbatim copies
 of this license document, but changing it is not allowed.

[Back to Top]

BibTeX and citations

@misc{jimutmap_2019,
  author = {Jimut Bahan Pal},
  title = {jimutmap},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/Jimut123/jimutmap}}
}

[Back to Top]

jimutmap's People

Contributors

beingsoumyadeepsharma, dependabot[bot], imgbotapp, jimut123, soumith, sourav-raj, tigerhawkvok

jimutmap's Issues

Improvement on _getAPIKey method

Some internet connections, such as mine, are slower than others. In my case the attribute 'data-map-printing-background' wasn't loading fast enough, so the value of mapData was None and an error was raised.
Code:

time.sleep(15)
baseMap = driver.find_element_by_css_selector("#map-canvas .leaflet-mapkit-mutant")
mapData = baseMap.get_attribute("data-map-printing-background")

The default wait time (5 seconds) is not flexible enough for all internet connections; for me a 10-15 second wait works best, as in the snippet above. Maybe it would be better to build a version of the code that is not time-dependent but situation-dependent, waiting until the attribute is actually present.

I will create a fork if I find the time.
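A situation-dependent wait can be sketched as a small polling helper that returns as soon as the value appears and gives up after a timeout. The helper name `wait_until` and its parameters are hypothetical; Selenium also ships `WebDriverWait` for the same purpose.

```python
import time


def wait_until(probe, timeout=30.0, interval=0.5):
    """Call probe() until it returns a non-None value or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        value = probe()
        if value is not None:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no value after {timeout} seconds")
        time.sleep(interval)


# Hypothetical usage with the names from the snippet above:
# mapData = wait_until(
#     lambda: baseMap.get_attribute("data-map-printing-background"),
#     timeout=30,
# )
```

On a fast connection this returns after the first successful probe, while a slow connection gets up to the full timeout instead of a fixed sleep.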

Getting historical Imagery

When we run the base code

download_obj = api(min_lat_deg = 10,
                      max_lat_deg = 10.2,
                      min_lon_deg = 10,
                      max_lon_deg = 11,
                      zoom = 19,
                      verbose = False,
                      threads_ = 5, 
                      container_dir = ".")

it downloads the current image of that specific location. Is there any way to get historical imagery of a location (within reason)?

Also, is there any way to get only the satellite images and none of the other formats?

Black tiles and Extremely slow download speed

The script downloads nothing but black tiles at a rate of 0.0017232894897460938 MiB/s

To Reproduce
Steps to reproduce the behavior:
Run either test.py or a modified version of it.

Desktop

  • OS: Windows 10
  • Browser: Chrome
  • Python 3.7
  • Jimutmap 1.4.2

Additional context
Zoom level was 17, Cores 4, Satellites Pro had up to 19 zoom for the region being downloaded, API key auto generated
Screenshots
issue1
Issue2

Program
modifiedtest.txt

Problem with token acquisition

Describe the bug
After installing Chrome into WSL along with the required dependencies, it fails to get the token for the map, or fails on initialization of the chromedriver autoinstaller package (see output below).

To Reproduce
Steps to reproduce the behavior:

  1. Run python3 test.py

Expected behavior
Receive satellite images from the scraper into chosen directory.

Screenshots

WARNING:root:Can not find chromedriver for currently installed chrome version.
Traceback (most recent call last):
  File "/home/patrik/jimutmap/test.py", line 15, in <module>
    from jimutmap import api, sanity_check, stitch_whole_tile
  File "/home/patrik/jimutmap/jimutmap/__init__.py", line 19, in <module>
    from .sanity_checker import sanity_check
  File "/home/patrik/jimutmap/jimutmap/sanity_checker.py", line 221, in <module>
    sanity_obj = api(min_lat_deg = 10,
  File "/home/patrik/jimutmap/jimutmap/jimutmap.py", line 86, in __init__
    self._getAPIKey()
  File "/home/patrik/jimutmap/jimutmap/jimutmap.py", line 189, in _getAPIKey
    driver = webdriver.Chrome(executable_path= chromeDriverPath, options=options)
TypeError: WebDriver.__init__() got an unexpected keyword argument 'executable_path'

Desktop

  • OS: Windows 10 (WSL with Ubuntu 22.04 LTS)
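The TypeError in this traceback is what recent Selenium 4 releases raise when `webdriver.Chrome()` is given the old `executable_path` keyword; the driver path is passed through a `Service` object instead. A hedged sketch of the replacement call follows. The function name `make_chrome_driver` is hypothetical and this is not the project's actual fix.

```python
def make_chrome_driver(chromedriver_path, headless=True):
    """Create a Chrome WebDriver using the Selenium 4 Service API."""
    # Imported lazily so this sketch can be loaded without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service

    options = Options()
    if headless:
        options.add_argument("--headless")
    # Selenium 4 takes the driver path via Service rather than
    # an executable_path keyword on webdriver.Chrome().
    service = Service(executable_path=chromedriver_path)
    return webdriver.Chrome(service=service, options=options)
```

Pinning an older Selenium release that still accepts `executable_path` would be another way around the error, at the cost of staying on an outdated dependency.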

ret_lat_lon function is not giving the correct result for given pixel and zoom

Describe the bug
If we try to save the images as lat_long.jpg using the function ret_lat_lon, instead of xTiles_yTiles.jpg, the latitude conversion does not seem correct.
Even a simple round trip, converting with one function and passing its output to the other, does not return the original values.

To Reproduce
Steps to reproduce the behavior:

  1. Easiest way to reproduce this is by calling the functions
    api_=api(min_lat_deg = 23.0,
    max_lat_deg = 23.03,
    min_lon_deg = 72.49,
    max_lon_deg = 72.53, zoom=18)
    api_.ret_xy_tiles(23.0, 72.49), api_.ret_xy_tiles(23.03, 72.53)
  2. output is : ((183857, 113855), (183886, 113831))
  3. now call the 2nd function:
    api_.ret_lat_lon(183857, 113855), api_.ret_lat_lon(183886, 113831)
  4. output is :
    ((21.888705486651332, 72.48916625976562), (21.914930407351424, 72.52899169921875))

Expected behavior
output should be ((23.0, 72.49), (23.03, 72.53))

Desktop (please complete the following information):

  • OS: [Windows 10]
  • Browser [chrome]
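For comparison, here is a sketch of the standard Web Mercator ("slippy map") tile conversion and its inverse. It is an assumption that the library intends this scheme, and the function names below are hypothetical. The inverse returns the north-west corner of a tile, so a round trip recovers the input only to within one tile, not exactly (23.0, 72.49).

```python
import math


def lat_lon_to_tile(lat_deg, lon_deg, zoom):
    """Tile indices (x, y) containing the point, standard Web Mercator scheme."""
    n = 2 ** zoom
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y


def tile_to_lat_lon(x, y, zoom):
    """Latitude and longitude of the north-west corner of tile (x, y)."""
    n = 2 ** zoom
    lon_deg = x / n * 360.0 - 180.0
    lat_deg = math.degrees(math.atan(math.sinh(math.pi * (1.0 - 2.0 * y / n))))
    return lat_deg, lon_deg


x, y = lat_lon_to_tile(23.0, 72.49, 18)
print((x, y), tile_to_lat_lon(x, y, 18))
```

The longitude from this inverse agrees with the value reported in step 4, which points at the latitude half of the conversion as the part that differs.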

Does it still work?

I came here from a research project requiring satellite images. I noticed some things that are not mentioned in the project.
First, when getting the API access key: as soon as I open the developer tools on the network tab, the GET requests for tiles start getting rejected by the server.

If I try doing it from the Selenium headless driver, it times out. Did the satellites.pro team patch the scraping vulnerability, or is it just random chance that this is happening?

Not loading

Hello,
first of all thank you for this library!
I do have a problem with it: after initializing, the CPUs are busy and the script is running, but the progress bar stays at 0%| | 0/117 [00:00<?, ?it/s], which indicates no work done, even after hours. The OS is Windows 10, and all prerequisites are installed as mentioned in the README.

Greetings !

Variables from longitude and latitude

Hey, thanks for sharing your library :)
I'm pretty new to Python and I wonder if the longitude and latitude can be replaced by variables. I'd like to get them from a city and country, so I tried the following combination:

from geopy.geocoders import Nominatim
default_user_agent= 'YOURUSERAGENT'
geolocator = Nominatim(user_agent="my_user_agent")
city ="YourCity"
country ="YourCountry"
loc = geolocator.geocode(varcity+','+ varcountry)
print(loc.latitude, loc.longitude)

maxlat = loc.latitude + 0.5
maxlon = loc.longitude + 0.5

print (maxlat, maxlon)


from jimutmap import api

download_obj = api (min_lat_deg = loc.longitude,
                    max_lat_deg = maxlat,
                    min_lon_deg = loc.latitude,
                    max_lon_deg = maxlon,
                    zoom = 19, verbose = False,
                    threads_ = 5,
                    container_dir = "YOUR PATH")

download_obj.download(getMasks = True)

Both parts work separately, but when I combine them, I either get black images or the following error:
AssertionError
assert min_lon_deg < max_lon_deg

Do you have any clue? :-)
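Judging from the snippet alone, one plausible cause is that latitude and longitude are swapped: min_lat_deg receives loc.longitude and min_lon_deg receives loc.latitude, so `min_lon_deg < max_lon_deg` fails whenever the latitude exceeds the longitude plus 0.5. The names varcity and varcountry are also never defined (the variables are called city and country). A sketch of a helper that keeps the pairs together, with the hypothetical name `bounding_box`:

```python
def bounding_box(lat, lon, half_extent_deg=0.5):
    """Return keyword arguments describing a box centred on (lat, lon)."""
    box = {
        "min_lat_deg": lat - half_extent_deg,
        "max_lat_deg": lat + half_extent_deg,
        "min_lon_deg": lon - half_extent_deg,
        "max_lon_deg": lon + half_extent_deg,
    }
    # Same ordering checks as the assertion quoted above.
    assert box["min_lat_deg"] < box["max_lat_deg"]
    assert box["min_lon_deg"] < box["max_lon_deg"]
    return box


box = bounding_box(22.57, 88.36, half_extent_deg=0.01)
print(box)
# Hypothetical call, mirroring the parameters used in the snippet above:
# download_obj = api(**box, zoom=19, verbose=False, threads_=5, container_dir=".")
```

Keep the extent small: at zoom 19 there are about 1456 tiles per degree of longitude (2^19 / 360), so a 0.5-degree box spans roughly 728 tile columns.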

BLANK IMAGES

Running the test script all images are totally blank.
How to fix it?

Black images when I download tiles from test.py

Describe the bug
Hello, I have been trying to use jimutmap, which I think is an interesting Python package for geospatial analysis. I installed it on Windows 11 and Ubuntu, following all of the steps and installing each dependency at its corresponding version (numpy, opencv, selenium, etc.). The problem I have been facing in the last few days is that when I download tiles with test.py, I only get black images, for the street maps as well. I tried different zoom options and modified other parameters, but the problem remains. Any recommendation? Many thanks in advance for your support. Best regards

Screenshots
Screenshot from 2023-02-02 14-49-47

Desktop (please complete the following information):

  • Ubuntu 20.04 and windows 11
  • Chrome
  • Python version 3.7 and python version 3.9

Iterating over api results in no images downloaded

Describe the bug
I have a pandas DataFrame with locations I wish to download tiles for. I am downloading a limited area around each location. However, when I place the call in a loop, I find that often no images are downloaded. Deleting the chromedriver and retrying can fix the issue, but not always.

To Reproduce

for index in tqdm(range(len(df))):
    test_lat = df.iloc[index]['lat']
    test_lon = df.iloc[index]['lon']
    extent = 0.01

    min_lat_deg = test_lat - extent
    max_lat_deg = test_lat + extent
    min_lon_deg = test_lon - extent
    max_lon_deg = test_lon + extent

    print(min_lat_deg, max_lat_deg, min_lon_deg, max_lon_deg)

    download_obj = api(
        min_lat_deg = min_lat_deg, # min_lat,
        max_lat_deg = max_lat_deg, # max_lat,
        min_lon_deg = min_lon_deg, # min_lon,
        max_lon_deg = max_lon_deg, # max_lon,
        zoom = 16, # 0 is min, 17 is good
        verbose = False,
        threads_ = 5, 
        container_dir = img_dir
        )

    download_obj.download(getMasks = False)
    time.sleep(2) # wait for download to finish

Expected behavior
Either an exception is raised if no images are downloaded, or some mechanism is available to retry.

Screenshots
NA

Desktop (please complete the following information):

  • OS: macOS
  • Browser: chrome
  • Version: jimutmap==1.3.9
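Until the library offers this, a retry can be sketched on the caller's side by counting files in the output directory before and after each attempt. The helper names below are hypothetical, and `download_fn` stands for any zero-argument callable such as `lambda: download_obj.download(getMasks=False)`.

```python
import os
import time


def count_files(directory):
    """Number of files under directory, including subdirectories."""
    return sum(len(files) for _, _, files in os.walk(directory))


def download_with_retry(download_fn, out_dir, attempts=3, delay_s=2.0):
    """Run download_fn until out_dir gains files; raise if it never does."""
    for attempt in range(1, attempts + 1):
        before = count_files(out_dir)
        download_fn()
        gained = count_files(out_dir) - before
        if gained > 0:
            return gained
        if attempt < attempts:
            time.sleep(delay_s)
    raise RuntimeError(
        f"no files downloaded to {out_dir!r} after {attempts} attempts"
    )
```

This check reports a failure if every requested tile was already on disk, so it fits best with a fresh directory per location.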

