Git Product home page Git Product logo

gsa-advantage-scrape's Introduction

GSA Advantage Cart Scrape

This is a really simple script to scrape a parked cart from GSA Advantage. Thaks to Sean Herron for doing the breakthrough work. I have just filled in details.

Reason for Existence

To some extent this is experimental. If you aren't a Federal government user of GSA Advantage! or have to wish to server those users, this code is only valuable to use as an example, if that.

We are building a "spike" that re-utilizes GSA Advantage's excellent shopping cart building capabilities. Our goal is to build a searchable database of shopping carts and to smooth the process of communication around carts. This project could be considered a part of the project here at github codenamed "mario" (https://github.com/18F/Mario), but I am making this repo stand-alone in hopes of greater reuse.

We of course invite assistance, but this is not the best project to help on, due to its very "spikey" nature, so I am not opening any issues at present---but improving my Python code is generally easy to do!

Installation

# optional - install within VirtualEnv
cd src
virtualenv gsascraper
cd ..

pip install -r requirements.txt

Usage

To make this useful, we support command-line, CGI, and Flask usage.

Command-Line

The command line usage is simple:

python gsa-scrape-commandline.py

CGI

However, we normally used this a web-service API, and therefore host it with Apache. Although there are many ways to do this, I prefer to do it as a virtual host. This requires two steps: Adding an entry to /etc/hosts file and adding a VirtualHost entry to your apache configuration.

This is the entry I add to /etc/hosts, which allow the url http://gsa-advantage-scraper to be resolved to localhost.

127.0.0.1       gsa-advantage-scraper

The example configuration looks like this, although you may prefer not to use port 80, which makes more sense if you do not which to expose this service outside the machine on which you have installed it. If you, for example, install it on 8080, be sure to configure Apache to listent on that port. Be sure that the CGI handler is installed.

<VirtualHost gsa-advantage-scraper:80>
    ServerName gsa-advantage-scraper
    ScriptAlias /cgi-bin/ /Users/robertread/projects/gsa-advantage-scrape/src/ \

      DocumentRoot /Users/robertread/projects/gsa-advantage-scrape/src
       <Directory "/Users/robertread/projects/gsa-advantage-scrape/src">
            Options +ExecCGI -MultiViews +SymLinksIfOwnerMatch
            Order allow,deny
            AllowOverride None
            Allow from all
            DirectoryIndex index.php
            AddHandler cgi-script .py .cgi
      </Directory>
</VirtualHost>

Depending on whether have used the "VirtualEnv" installation method above or have installed your python packages where they can be reached without VirtualEnv, you should access gsa-adv-cart-ve.py or gsa-adv-cart.py.

Flask App

To make the API available via a Flask app, run:

python src/server.py

then do requests to http://127.0.0.1:5000/api/v1/carts/CART_ID?u=USER&p=PASSWORD. Note this endpoint supports JSONP via a callback parameter.

Note

The scraper is best tested by either using Curl or by using the commandline program included. If you know all of the necessary credentials, the commandline interface is the best way to insure that you have all Python modules installed.

Curl is the best way to make sure that you have Apache configured as you desire.

The most common usage of this project is to serve as an API used by the project Mario but you of course may use it as you see fit.

Public Domain

This project is in the worldwide public domain. As stated in CONTRIBUTING:

To test, you can run "python gsa-advantage-py".

Public domain

This project is in the worldwide public domain. As stated in CONTRIBUTING:

This project is in the public domain within the United States, and copyright and related rights in the work worldwide are waived through the CC0 1.0 Universal public domain dedication.

All contributions to this project will be released under the CC0 dedication. By submitting a pull request, you are agreeing to comply with this waiver of copyright interest.

gsa-advantage-scrape's People

Contributors

afeld avatar annalee avatar dlapiduz avatar konklone avatar phirefly avatar robertlread avatar seanherron avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.