brozeph / node-craigslist
Node driver for searching Craigslist.com listings
License: MIT License
Would it be possible to run some code that reveals the contact info and returns it?
To get this to work for Toronto, I had to change the code like so:
baseHost = '.craigslist.ca',
Maybe a flag in options to allow this to be set in a custom manner?
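To sketch what such an option could look like (the `buildHost` helper below is purely illustrative, not an actual node-craigslist function):

```javascript
// Hypothetical sketch: a configurable baseHost would let the client build
// request hosts for non-.org TLDs like Canada's .craigslist.ca.
function buildHost(city, baseHost = '.craigslist.org') {
  return `${city}${baseHost}`;
}

console.log(buildHost('toronto', '.craigslist.ca')); // toronto.craigslist.ca
console.log(buildHost('seattle'));                   // seattle.craigslist.org
```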
I can search for items fine, but when I use the price filters they are completely ignored. Is this happening for anyone else?
Running the #search example in the README returns an error for too many redirects.
Tried adjusting the city and the search query but the redirect limit error is always returned.
reproduced in node versions:
v15.12.0
v14.16.0
var
  craigslist = require('node-craigslist'),
  client = new craigslist.Client({
    city : 'seattle'
  });

client
  .search('xbox one')
  .then((listings) => {
    // play with listings here...
    listings.forEach((listing) => console.log(listing));
  })
  .catch((err) => {
    console.error(err);
  });
throws:
Error: maximum redirect limit exceeded
at ClientRequest.<anonymous> (/mnt/c/Users/mbmcm/example/node_modules/reqlib/dist/index.js:429:29)
at Object.onceWrapper (node:events:476:26)
at ClientRequest.emit (node:events:369:20)
at HTTPParser.parserOnIncomingClient [as onIncoming] (node:_http_client:636:27)
at HTTPParser.parserOnHeadersComplete (node:_http_common:129:17)
at Socket.socketOnData (node:_http_client:502:22)
at Socket.emit (node:events:369:20)
at addChunk (node:internal/streams/readable:313:12)
at readableAddChunk (node:internal/streams/readable:288:9)
at Socket.Readable.push (node:internal/streams/readable:227:10) {
options: {
hostname: 'seattle.craigslist.org',
method: 'GET',
path: '/',
maxRedirectCount: 5,
maxRetryCount: 3,
timeout: 60000,
headers: { 'Content-Type': 'application/json', 'Content-Length': 0 },
[Symbol(context)]: URLContext {
flags: 400,
scheme: 'https:',
username: '',
password: '',
host: 'seattle.craigslist.org',
port: null,
path: [Array],
query: null,
fragment: null
},
[Symbol(query)]: URLSearchParams {}
},
state: {
data: '',
failover: { index: 0, values: [] },
redirects: [ [URL], [URL], [URL], [URL], [URL] ],
tries: 1,
headers: { location: 'https://seattle.craigslist.org/' },
statusCode: 301
}
}
the redirects:
[
URL {
href: 'https://seattle.craigslist.org/search/sss?sort=rel&query=xbox%20one',
origin: 'https://seattle.craigslist.org',
protocol: 'https:',
username: '',
password: '',
host: 'seattle.craigslist.org',
hostname: 'seattle.craigslist.org',
port: '',
pathname: '/search/sss',
search: '?sort=rel&query=xbox%20one',
searchParams: URLSearchParams { 'sort' => 'rel', 'query' => 'xbox one' },
hash: ''
},
URL {
href: 'https://seattle.craigslist.org/',
origin: 'https://seattle.craigslist.org',
protocol: 'https:',
username: '',
password: '',
host: 'seattle.craigslist.org',
hostname: 'seattle.craigslist.org',
port: '',
pathname: '/',
search: '',
searchParams: URLSearchParams {},
hash: ''
},
URL {
href: 'https://seattle.craigslist.org/',
origin: 'https://seattle.craigslist.org',
protocol: 'https:',
username: '',
password: '',
host: 'seattle.craigslist.org',
hostname: 'seattle.craigslist.org',
port: '',
pathname: '/',
search: '',
searchParams: URLSearchParams {},
hash: ''
},
URL {
href: 'https://seattle.craigslist.org/',
origin: 'https://seattle.craigslist.org',
protocol: 'https:',
username: '',
password: '',
host: 'seattle.craigslist.org',
hostname: 'seattle.craigslist.org',
port: '',
pathname: '/',
search: '',
searchParams: URLSearchParams {},
hash: ''
},
URL {
href: 'https://seattle.craigslist.org/',
origin: 'https://seattle.craigslist.org',
protocol: 'https:',
username: '',
password: '',
host: 'seattle.craigslist.org',
hostname: 'seattle.craigslist.org',
port: '',
pathname: '/',
search: '',
searchParams: URLSearchParams {},
hash: ''
}
]
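The dump above shows the same URL repeated in the redirects array, which suggests the server keeps 301-ing back to itself. As a rough sketch (not part of reqlib), a client could detect that kind of loop and fail fast instead of exhausting the redirect limit:

```javascript
// Sketch: report a redirect loop as soon as a URL repeats, rather than
// retrying until maxRedirectCount is exhausted. Purely illustrative.
function isRedirectLoop(redirectUrls) {
  const seen = new Set();
  for (const url of redirectUrls) {
    if (seen.has(url)) return true;
    seen.add(url);
  }
  return false;
}
```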
Hi, a quick question: is there any possibility to extend the length of search results? Right now it's only 120, so it looks like it's only searching the first page on craigslist.
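For reference, craigslist's search pages appear to paginate with an `s` offset query parameter in steps of 120; building the URL for a later page might look like this sketch (hypothetical helper, not part of the library):

```javascript
// Build a search URL for a given results page, assuming craigslist's
// `s` offset parameter and 120 results per page.
function searchPageUrl(city, query, page = 0) {
  const params = new URLSearchParams({ query, s: String(page * 120) });
  return `https://${city}.craigslist.org/search/sss?${params}`;
}

console.log(searchPageUrl('seattle', 'xbox one', 1));
// https://seattle.craigslist.org/search/sss?query=xbox+one&s=120
```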
I'm using your library to build out a small React app that will act as a wrapper for the craigslist client. When calling search() or list() on the client object, I get a TypeError: req.setTimeout is not a function that originates from req.setTimeout(options.timeout, req.abort); at line 306 in web.js. I am not sure what the cause of this could be, as http and https are dependencies of this package (i.e. the object should be there).
I'm getting a 400 or higher status code on my call to the list(options) method:
(node:2037) UnhandledPromiseRejectionWarning: Error: HTTP error received
at IncomingMessage.<anonymous> (/Users/j/PersonalProjects/node_modules/reqlib/dist/index.js:515:29)
at IncomingMessage.emit (events.js:203:15)
at endReadableNT (_stream_readable.js:1145:12)
at process._tickCallback (internal/process/next_tick.js:63:19)
(node:2037) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
(node:2037) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
Process finished with exit code 0
https://travis-ci.org/brozeph/node-craigslist/jobs/278743585
Error: getaddrinfo ENOTFOUND seattle.craigslist.orghttps seattle.craigslist.orghttps:443
Somehow the url property has a duplicate url inside, for example: https://atlanta.craigslist.orghttps://atlanta.craigslist.org/atl/cto/d/2006-mercedes-benz-e350-sedan/6277839728.html
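As a hedged client-side workaround until the parsing is fixed, a doubled origin could be stripped after the fact (illustrative helper, not part of the library):

```javascript
// Strip a doubled origin prefix like "https://xhttps://x/..." back to one copy;
// URLs without the duplication are returned unchanged.
function fixListingUrl(url) {
  const match = url.match(/^(https?:\/\/[^/]+)\1(.*)$/);
  return match ? match[1] + match[2] : url;
}

console.log(fixListingUrl(
  'https://atlanta.craigslist.orghttps://atlanta.craigslist.org/atl/cto/d/2006-mercedes-benz-e350-sedan/6277839728.html'
));
// https://atlanta.craigslist.org/atl/cto/d/2006-mercedes-benz-e350-sedan/6277839728.html
```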
If I have a search query in Chicago and Craigslist displays listings in nearby cities, then the url is wrong. It ends up being something like:
https://chicago.craigslist.com//rockford.craigslist.com/....
It should instead point directly at the nearby city's own domain.
Error: maximum redirect limit exceeded
{ host: 'vancouver.craigslist.org',
method: 'GET',
path: '//vancouver.craigslist.ca/vancouver.craigslist.ca/vancouver.craigslist.ca/vancouver.craigslist.ca/vancouver.craigslist.ca/search/sss?sort=rel&query=xbox',
pathname: '//vancouver.craigslist.ca/vancouver.craigslist.ca/vancouver.craigslist.ca/vancouver.craigslist.ca/vancouver.craigslist.ca/search/sss',
rawStream: false,
secure: true },
It appears to happen on only some domains, not all.
Can I pass request options to search()? i.e. { proxy: 'http://xxxx' }. It would be easy, since you're using the request http lib, to merge custom request options into your defaults.
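To sketch the merging idea (the option names here are hypothetical examples, not the library's actual API):

```javascript
// Shallow-merge caller overrides on top of default request options, with
// headers merged one level deeper so callers can add headers without
// clobbering the defaults.
function mergeRequestOptions(defaults, overrides = {}) {
  return {
    ...defaults,
    ...overrides,
    headers: { ...defaults.headers, ...overrides.headers }
  };
}

const merged = mergeRequestOptions(
  {
    hostname: 'seattle.craigslist.org',
    timeout: 60000,
    headers: { 'Content-Type': 'application/json' }
  },
  { proxy: 'http://xxxx', headers: { 'User-Agent': 'my-app' } }
);
```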
I am getting an error while trying to search using the client, when using the following example code from the documentation:
var
  craigslist = require('node-craigslist'),
  client = new craigslist.Client({
    city : 'seattle'
  });

client
  .search('xbox one')
  .then((listings) => {
    // play with listings here...
    listings.forEach((listing) => console.log(listing));
  })
  .catch((err) => {
    console.error(err);
  });
Are there any work-arounds for this?
Looks like Craigslist has changed their markup since the last time this was updated. I'm working on a PR for the changes.
Here is an example element from the new markup:
<li class="result-row" data-pid="5860176271">
<a href="/see/ctd/5860176271.html" class="result-image gallery" data-ids="1:00P0P_3IlMCFImstr,1:00Q0Q_2yT6oiXMXMm,1:00q0q_acQKZKPZrAw,1:00w0w_kLyz6zFzOtT,1:01010_9DjwOoUDpJq,1:00707_6ylrUdl3pnR,1:01414_4qnNxdSbTP1,1:00g0g_8PXG6lYQNhL,1:01717_iFWPkVwo1B7,1:00M0M_8doB1KRfskj,1:00M0M_aQFi2kHTwk4,1:00000_jbQc0aR0CfQ,1:00j0j_4HP8wRGtHfR,1:00g0g_aRJpJdVrWYW,1:00d0d_4uReTVs8kDj,1:00t0t_4IK8FnVWYJW,1:00909_kwyv7hM4uGa,1:00101_6n4ALF8ZmQx,1:00L0L_1LplADcLFqk,1:00i0i_hGWKB4DH2UN,1:01313_3Nvvxpu7u4e,1:00909_3fMOGJ7JQwD,1:00j0j_eP5r5fWHLDF,1:00000_95ycGWFZvkP"></a>
<p class="result-info">
<span class="icon icon-star" role="button">
<span class="screen-reader-text">favorite this post</span>
</span>
<time class="result-date" datetime="2016-11-03 17:40" title="Thu 03 Nov 05:40:41 PM">Nov 3</time>
<a href="/see/ctd/5860176271.html" data-id="5860176271" class="result-title hdrlnk">2006 *GMC Sierra* 1500 Denali AWD - Clean Carfax History! 2006 GMC Sie</a>
<span class="result-meta">
<span class="result-hood"> (*GMC* *Sierra*)</span>
<span class="result-tags">
pic
<span class="maptag" data-pid="5860176271">map</span>
</span>
<span class="banish icon icon-trash" role="button">
<span class="screen-reader-text">hide this posting</span>
</span>
<span class="unbanish icon icon-trash red" role="button" aria-hidden="true"></span>
<a href="#" class="restore-link">
<span class="restore-narrow-text">restore</span>
<span class="restore-wide-text">restore this posting</span>
</a>
</span>
</p>
</li>
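To illustrate which fields the updated parser would need to pull out of a result-row, here is a rough string-matching sketch; it is for illustration only, and a proper HTML parser (e.g. cheerio) is the right tool for real parsing:

```javascript
// Rough sketch: extract a few fields from the new result-row markup with
// regular expressions. Returns undefined for any field that isn't present.
function extractListing(html) {
  const grab = (re) => (html.match(re) || [])[1];
  return {
    pid: grab(/data-pid="(\d+)"/),
    url: grab(/href="([^"]+\.html)"/),
    date: grab(/datetime="([^"]+)"/),
    title: grab(/class="result-title hdrlnk">([^<]+)</)
  };
}
```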
I'm getting this error message from every record I pull.
For example, when I receive a listing from New Orleans that was supposed to be from Mobile, it returns a URL like this:
http://mobile.craigslist.com/http//neworleans.craigslist.com
...
Is it possible to add contact info (i.e. email) to the search API results?
From a search() call, .price comes back doubled: $60$60
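A hedged client-side workaround (hypothetical helper; the real fix belongs in the price parsing) is to collapse the doubled string:

```javascript
// Collapse a doubled price string like "$60$60" to "$60"; anything that
// isn't an exact doubling is returned unchanged.
function normalizePrice(price) {
  if (typeof price !== 'string' || price.length % 2 !== 0) return price;
  const half = price.slice(0, price.length / 2);
  return price === half + half ? half : price;
}

console.log(normalizePrice('$60$60')); // $60
console.log(normalizePrice('$600'));   // $600
```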
This package is awesome!
Unfortunately, all my results appear cached after the first request. I've verified this by running the same search on craigslist in the browser. In the browser I see new listings followed by listings that match up with my stale results.
I am using nocache: true
My guess is that craigslist is sending me the same results over and over.
I have copied my code into a repl and was able to pull updated results, but again after the first pull, all subsequent pulls brought in the same data.
Here is a link to that repl:
repl
Here is a link to the same search on craigslist:
CL link
Note: if you test this, run my repl once, check that it matches craigslist, wait a few minutes for more results to be added to craigslist, then refresh both the repl and craigslist. The results will now be out of sync.
I was hoping sending requests through a proxy would allow me to get fresh results every time.
I noticed from this issue that it was once possible to use a proxy. proxy issue
Is it possible to get the proxy option back or is there an alternative solution to this issue?
Thank you
Looks like craigslist now uses max_price in the query params instead.
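If the filters did move to min_price/max_price query params as suggested above, building a filtered search URL might look like this sketch (hypothetical helper, not the library's code):

```javascript
// Build a search URL with the newer min_price / max_price query parameters,
// only adding each filter when the caller supplies it.
function priceFilteredUrl(city, query, { minPrice, maxPrice } = {}) {
  const params = new URLSearchParams({ query });
  if (minPrice != null) params.set('min_price', String(minPrice));
  if (maxPrice != null) params.set('max_price', String(maxPrice));
  return `https://${city}.craigslist.org/search/sss?${params}`;
}

console.log(priceFilteredUrl('seattle', 'xbox one', { maxPrice: 200 }));
// https://seattle.craigslist.org/search/sss?query=xbox+one&max_price=200
```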