mdsecactivebreach / linkedint
LinkedInt: A LinkedIn scraper for reconnaissance during adversary simulation
License: GNU General Public License v3.0
I get the following error message when the script starts:
Traceback (most recent call last):
File "LinkedInt.py", line 465, in <module>
get_search()
File "LinkedInt.py", line 189, in get_search
r = requests.get(url, cookies=cookies, headers=headers)
File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 75, in get
return request('get', url, params=params, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 60, in request
return session.request(method=method, url=url, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 533, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 646, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/adapters.py", line 516, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='www.linkedin.com', port=443): Max retries exceeded with url: /voyager/api/search/cluster?count=40&guides=List(v-%3EPEOPLE,facetCurrentCompany-%3E403184)&origin=OTHER&q=guided&start=0 (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f12ae031cd0>: Failed to establish a new connection: [Errno -2] Name or service not known',))
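The last line of this traceback, `[Errno -2] Name or service not known`, is the resolver reporting that the host name could not be looked up, so the failure occurs before any request reaches LinkedIn. A minimal sketch, using only the standard library, for checking name resolution separately from the script. The name `can_resolve` and its injectable `resolver` argument are illustrative and not part of LinkedInt.

```python
import socket


def can_resolve(host, port=443, resolver=socket.getaddrinfo):
    """Return True if `host` resolves to at least one address.

    `resolver` defaults to socket.getaddrinfo; it is a parameter only so the
    check can be exercised without network access (an assumption of this
    sketch, not something LinkedInt does).
    """
    try:
        return len(resolver(host, port)) > 0
    except socket.gaierror:
        # socket.gaierror is the exception behind "Name or service not known".
        return False
```

Calling `can_resolve("www.linkedin.com")` on the affected machine would show whether the problem lies with DNS or the network configuration rather than with the script.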
Any help with this error will be appreciated:
Running on Python 2.7.13
Traceback (most recent call last):
File "LinkedInt.py", line 27, in <module>
from thready import threaded
ImportError: No module named thready
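This ImportError means the third-party `thready` module is not installed for the interpreter that runs the script. A small sketch for listing which required modules are missing before starting. `missing_modules` is a hypothetical helper, and the module names in the usage note are taken from the tracebacks and reports in this thread.

```python
import importlib


def missing_modules(names):
    """Return the subset of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing
```

For example, `missing_modules(["requests", "bs4", "thready"])` returns the names that still need installing. Assuming the package is published under the same name (another report here mentions installing beautifulsoup4 and thready), `pip install thready` run with the Python 2.7 pip would be the likely fix.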
Got one good run pulling data as expected, subsequent tries produce the error:
Could not authenticate to linkedin. 'NoneType' object has no attribute '__getitem__'
Creds are verified good and I'm able to login to LinkedIn from two different machines.
Any idea?
I currently have a LinkedIn premium account which I have recently downgraded to a basic account; however, it's still a valid premium account for a few more days (technically). That said, I do receive the following error:
[*] 17459 Results Found
[*] LinkedIn only allows 1000 results. Refine keywords to capture all data
The target org does have nearly 20k employees on LinkedIn, so the count is accurate. Ideally, I would like to extract the entire company list. I'm not sure whether this limit is due to my account type or whether it would apply anyway. I didn't really want to pay to find out... Does anyone know offhand?
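The search URL used by the script asks for `count=40` results per request, so a 1000-result cap corresponds to 25 requests. The arithmetic can be sketched as below. `RESULTS_PER_PAGE`, `RESULT_CAP` and `page_offsets` are names chosen for this illustration, and the script's own page count may be computed differently.

```python
RESULTS_PER_PAGE = 40   # the `count=40` parameter in the search URL
RESULT_CAP = 1000       # the limit quoted in the script's message


def page_offsets(total_results, per_page=RESULTS_PER_PAGE, cap=RESULT_CAP):
    """Return the `start` offsets needed to walk the reachable results."""
    reachable = min(total_results, cap)
    return list(range(0, reachable, per_page))
```

With 17459 results this yields 25 offsets ending at 960, so anything beyond the first 1000 hits would have to be reached by splitting the search with narrower keywords, as the message suggests.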
I'm trying to run this for a company with 812 employees and getting this error:
Traceback (most recent call last):
File "LinkedInt.py", line 477, in <module>
get_search()
File "LinkedInt.py", line 206, in get_search
print "[*] Fetching page %i with %i results" % ((p),len(content['elements'][0]['elements']))
IndexError: list index out of range
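Line 206 indexes `content['elements'][0]`, so this IndexError means the `elements` list in the response was empty. Why LinkedIn returned no cluster cannot be told from the report. A defensive sketch follows. The response shape is inferred only from the failing line, and `first_cluster_hits` is a hypothetical helper.

```python
def first_cluster_hits(content):
    """Return the hits of the first result cluster, or [] if there is none."""
    clusters = content.get("elements") or []
    if not clusters:
        # Empty response: nothing to index, so report zero hits instead.
        return []
    return clusters[0].get("elements") or []
```

The print on line 206 could then use `len(first_cluster_hits(content))` and stop paging when it is zero, rather than raising.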
My environment:
OS: Manjaro 18.0.2 Illyria
Kernel: x86_64 Linux 4.19.13-1-MANJARO
Python 3.7.1 (default, Oct 22 2018, 10:41:28)
[GCC 8.2.1 20180831]
Installed beautifulsoup4 and thready.
I get this error when running.
LinkedInt-master]$ python LinkedInt.py
File "LinkedInt.py", line 45
print "[!] Oops, you did not enter your api_key, username, or password in LinkedInt.py"
^
SyntaxError: Missing parentheses in call to 'print'. Did you mean print("[!] Oops, you did not enter your api_key, username, or password in LinkedInt.py")?
LinkedInt-master]$
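The SyntaxError arises because the script uses the Python 2 `print` statement while the interpreter here is Python 3.7.1. A sketch of the same two messages written so that both versions accept them. The function names are illustrative, and converting the prints alone is unlikely to be enough to port the whole script; running it under Python 2.7 is the other option.

```python
def report_page(page, hits):
    # A single parenthesised argument is accepted by the Python 2 print
    # statement and by the Python 3 print() function alike.
    print("[*] Fetching page %i with %i results" % (page, len(hits)))


def warn_missing_credentials():
    print("[!] Oops, you did not enter your api_key, username, or password in LinkedInt.py")
```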
Running LinkedInt on either macOS or Kali Linux, I encounter this error whenever I try to do a scrape.
Here is a transcript of the session (with company and person info redacted)
[*] Enter search Keywords (use quotes for more percise results)
[*] Enter filename for output (exclude file extension)
LinkedInt-Test
[*] Filter by Company? (Y/N):
Y
[*] Specify a Company ID (Provide ID or leave blank to automate):
<REDACTED>
[*] Enter e-mail domain suffix (eg. contoso.com):
<REDACTED>.com
[*] Select a prefix for e-mail generation (auto,full,firstlast,firstmlast,flast,first.last,fmlast,lastfirst):
auto
[*] Automaticly using Hunter IO to determine best Prefix
[!] {first}
[+] Found first prefix
[!] Cannot load main LinkedIn page
<REDACTED>
[*] Obtained new session: <REDACTED>
[*] Using company ID: <REDACTED>
https://www.linkedin.com/voyager/api/search/cluster?count=40&guides=List(v->PEOPLE,facetCurrentCompany-><REDACTED>)&origin=OTHER&q=guided&start=0
[*] 122 Results Found
[*] Fetching 3 Pages
[*] Fetching page 0 with 40 results
[*] No picture found for <REDACTED>, ___
Traceback (most recent call last):
File "LinkedInt.py", line 477, in <module>
get_search()
File "LinkedInt.py", line 273, in get_search
email = '{}@{}'.format(user, suffix)
UnboundLocalError: local variable 'user' referenced before assignment
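In this transcript Hunter IO reports the pattern `{first}`, but `first` is not among the prefixes the prompt lists (auto, full, firstlast, firstmlast, flast, first.last, fmlast, lastfirst). One plausible reading, not confirmed against the source, is that no branch assigns `user` for that pattern. The sketch below maps each prefix to a builder and fails loudly on an unknown one. The meaning given to each prefix is a guess from its name, and both function names are hypothetical.

```python
def build_local_part(prefix, first, middle, last):
    """Build the part of the address before '@' for a given prefix style."""
    first, middle, last = first.lower(), middle.lower(), last.lower()
    builders = {
        "full": lambda: first + middle + last,
        "firstlast": lambda: first + last,
        "firstmlast": lambda: first + middle[:1] + last,
        "flast": lambda: first[:1] + last,
        "first.last": lambda: first + "." + last,
        "fmlast": lambda: first[:1] + middle[:1] + last,
        "lastfirst": lambda: last + first,
        "first": lambda: first,  # the pattern Hunter returned in the transcript
    }
    if prefix not in builders:
        # Raise a clear error instead of leaving the result unassigned.
        raise ValueError("Unsupported e-mail prefix: %r" % prefix)
    return builders[prefix]()


def build_email(prefix, first, middle, last, suffix):
    """Join the local part and the domain suffix into an address."""
    return "{}@{}".format(build_local_part(prefix, first, middle, last), suffix)
```

Until the script handles it, choosing one of the listed prefixes by hand instead of `auto` may avoid the crash.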
I get this at the end of the run:
Traceback (most recent call last):
File "LinkedInt.py", line 447, in <module>
prefix = content['data']['pattern']
KeyError: 'data'
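The KeyError means the decoded JSON had no top-level `data` key when line 447 tried to read the e-mail pattern. What the response contained instead is not shown in the report. A sketch of reading the pattern defensively, where `extract_pattern` is a hypothetical helper:

```python
def extract_pattern(content, default=None):
    """Return content['data']['pattern'] if present, else `default`."""
    data = content.get("data")
    if not isinstance(data, dict):
        # No usable 'data' object in the response: fall back.
        return default
    return data.get("pattern") or default
```

When this returns `None`, the caller could print the raw response and ask for a prefix manually instead of crashing.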
Hi,
I am getting the following error:
Traceback (most recent call last):
File "LinkedInt.py", line 27, in <module>
from thready import threaded
ImportError: No module named thready
LinkedIn has changed its JSON format, so the code on line 212 no longer works:
data_picture = "https://media.licdn.com/mpr/mpr/shrinknp_400_400%s" % c['hitInfo']['com.linkedin.voyager.search.SearchProfile']['miniProfile']['picture']['com.linkedin.voyager.common.MediaProcessorImage']['id']
It should be replaced with:
data_picture = c['hitInfo']['com.linkedin.voyager.search.SearchProfile']['miniProfile']['picture']['com.linkedin.common.VectorImage']['rootUrl'] + c['hitInfo']['com.linkedin.voyager.search.SearchProfile']['miniProfile']['picture']['com.linkedin.common.VectorImage']['artifacts'][3]['fileIdentifyingUrlPathSegment']
The index [3] selects the highest-quality 800x800 image; it could be replaced with [0], [1] or [2] for lower-quality images.
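The replacement line hardcodes index 3 and assumes every profile has a picture. A sketch of the same lookup that tolerates a missing picture or a shorter artifact list follows. The keys are those in the replacement line above; the clamping, the `None` return and the name `picture_url` are additions made for this illustration.

```python
VECTOR_IMAGE = "com.linkedin.common.VectorImage"
SEARCH_PROFILE = "com.linkedin.voyager.search.SearchProfile"


def picture_url(hit, preferred_index=3):
    """Return the profile picture URL for a search hit, or None."""
    mini = hit.get("hitInfo", {}).get(SEARCH_PROFILE, {}).get("miniProfile", {})
    image = (mini.get("picture") or {}).get(VECTOR_IMAGE)
    if not image or not image.get("artifacts"):
        return None
    artifacts = image["artifacts"]
    # Clamp so a profile with fewer than four sizes still yields a URL.
    index = min(preferred_index, len(artifacts) - 1)
    return image["rootUrl"] + artifacts[index]["fileIdentifyingUrlPathSegment"]
```

Passing `preferred_index=0` would select the smallest image instead.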
Hi,
Can the URL be changed to use other filters, such as facetGeoRegion or similar? I have not found instructions on how to change the string. Can you help?
Example:
"https://www.linkedin.com/voyager/api/search/cluster?count=40&guides=List(v->PEOPLE,facetGeoRegion->eu.it,facetCurrentCompany->%s)&origin=OTHER&q=guided&start=0" % (companyID)
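One way to make such filters easier to vary is to assemble the `guides=List(...)` part from a list of facet pairs instead of editing the string by hand. The sketch below reproduces the URL format shown above. `build_search_url` is a hypothetical helper, and whether the API honours `facetGeoRegion` or the value `eu.it` is not confirmed by anything in this thread.

```python
BASE = "https://www.linkedin.com/voyager/api/search/cluster"


def build_search_url(company_id, extra_facets=None, start=0, count=40):
    """Assemble the guided-search URL with optional extra facet guides."""
    guides = ["v->PEOPLE"]
    for name, value in (extra_facets or []):
        guides.append("%s->%s" % (name, value))
    guides.append("facetCurrentCompany->%s" % company_id)
    return "%s?count=%d&guides=List(%s)&origin=OTHER&q=guided&start=%d" % (
        BASE, count, ",".join(guides), start)
```

For instance, `build_search_url(companyID, [("facetGeoRegion", "eu.it")])` produces the string in the example above, and omitting the second argument produces the URL the script already uses.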