getoldtweets-java's People

Contributors

jefferson-henrique

getoldtweets-java's Issues

Timestamp?

Hello!

I have run your code, but my output file does not include the timestamp of each tweet.

Example:
"realDonaldTrump;2016/01/29;3121;9720;"THANK YOU to all of the incredible volunteers.....";ETC

Is there any way to modify this to get something like 2016/01/29 04:37:15?

I see a spot in your code where dates are declared with a SimpleDateFormat pattern of yyyy-mm-dd HH:mm, but the output file doesn't reflect that.
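As a reference for the question above, here is a minimal sketch of how SimpleDateFormat pattern letters change the output. It is not taken from the project: the class name, the helper, the UTC time zone, and the sample date are all assumptions for illustration. Note that in these patterns MM means month and mm means minutes.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not part of GetOldTweets-java.
class TimestampFormatSketch {
    // Formats a date with the given SimpleDateFormat pattern.
    static String format(Date date, String pattern) {
        SimpleDateFormat sdf = new SimpleDateFormat(pattern, Locale.ROOT);
        sdf.setTimeZone(TimeZone.getTimeZone("UTC")); // assumption: UTC output
        return sdf.format(date);
    }

    public static void main(String[] args) {
        Date sample = new Date(1454042235000L); // 2016-01-29 04:37:15 UTC
        System.out.println(format(sample, "yyyy/MM/dd"));          // date only
        System.out.println(format(sample, "yyyy/MM/dd HH:mm:ss")); // date and time
        System.out.println(format(sample, "yyyy/mm/dd"));          // "mm" is minutes, not month
    }
}
```

Whether the exporter can emit a time component depends on what the scraped date actually contains, which this sketch does not address.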

Retweeter or replier username

Hi,

I would like to know whether it's possible to get the retweeter's username in the example you propose. Is it possible to include the usernames from mentions and replies?

You may know the NodeXL application, whose output file shows this information.

LM

Timeout Issue

Is there a limit on the maximum number of tweets we can get?
I get this error when fetching 7000 tweets:

java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(Unknown Source)
at java.net.SocketInputStream.read(Unknown Source)
at java.net.SocketInputStream.read(Unknown Source)
at sun.security.ssl.InputRecord.readFully(Unknown Source)
at sun.security.ssl.InputRecord.read(Unknown Source)
at sun.security.ssl.SSLSocketImpl.readRecord(Unknown Source)
at sun.security.ssl.SSLSocketImpl.readDataRecord(Unknown Source)
at sun.security.ssl.AppInputStream.read(Unknown Source)
at java.io.BufferedInputStream.fill(Unknown Source)
at java.io.BufferedInputStream.read1(Unknown Source)
at java.io.BufferedInputStream.read(Unknown Source)
at sun.net.www.http.HttpClient.parseHTTPHeader(Unknown Source)
at sun.net.www.http.HttpClient.parseHTTP(Unknown Source)

Support ID export of each tweet

It'd be great to export the ID of each tweet as well.

It's accessible here:
<li class="js-stream-item stream-item stream-item expanding-stream-item " data-item-id="XXXXX" id="stream-item-tweet-XXXXX" data-item-type="tweet">
and
<a href="/username/status/XXXXX" class="tweet-timestamp js-permalink js-nav js-tooltip" data-original-title="11:00 PM - 30 Jun 2015">
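A minimal sketch of pulling the ID out of the permalink markup quoted above. It uses a plain regular expression on the href attribute, which is an assumption made to keep the example self-contained; the class and method names are hypothetical, and a real HTML parser would be more robust than a regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of GetOldTweets-java.
class TweetIdSketch {
    // Matches href="/<username>/status/<digits>" and captures the digits.
    private static final Pattern PERMALINK =
            Pattern.compile("href=\"/[^/\"]+/status/(\\d+)\"");

    // Returns the first tweet ID found in the HTML fragment, or null if none.
    static String extractId(String html) {
        Matcher m = PERMALINK.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String html = "<a href=\"/username/status/12345\" class=\"tweet-timestamp js-permalink\">";
        System.out.println(extractId(html));
    }
}
```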

Some tweets missing

I'm doing a query of all tweets of a certain hashtag. I'm only getting ~2000 of the 9500 total possible tweets (verified by using another website, and indeed there were tweets missing when I checked out certain dates).

My hypothesis is that this is due to the loop continuing before waiting for the data to be added to results. I want to test this by using Thread.sleep(100).

This is kind of embarrassing... but it's been a while since I've used Java, so I don't remember how to build a new jar. So I should do:

mvn clean; mvn package

Then what?

Missing Tweets

Hello mate,

Thanks for sharing your project!
I'm using it to get all tweets posted by some users, but it doesn't seem to work quite well. Some tweets that can be accessed through a browser aren't returned when using this lib.

For instance, for a user with 1800 tweets, I only get 866 tweets. This seems to be a bit random, because I didn't find any similarities among the missing tweets.

Do you have any suggestion to fix this problem?

Thanks! :)

usernames of Retweets

Can you help me with the code to get retweet data, in the form of the usernames of the people who retweeted a tweet?

Unable to retrieve more than 200 tweets

Hi Jefferson,

I was referring to your project "GetOldTweets-java", but I am facing an issue when retrieving more than 200 tweets.

For example:
java -jar got.jar querysearch="coke" maxtweets=250

Whichever number I set for maxtweets, it retrieves only 200 tweets. Is there any way to retrieve more?

Regards,
Sachin

I don't have all the tweets

Hello,

I'm using this API for my final university project. I need to look up tweets between two dates. I ran the command line and compared the result with a search on Twitter, and I don't get all the tweets. Does anybody know what the problem is?

Thanks

Some tweets are incomplete

Hi Jefferson! Thank you so much for this project, I've been searching for something like this for a long time. I'm currently downloading some data and everything seemed to be fine, but I checked Output.csv and the text is incomplete in some tweets. Am I doing something wrong? This is my query:

java -jar got.jar querysearch="@manuelacarmena" maxtweets=100

HTTP Error 500

Hi,
First, I would like to thank you for your work!
However, I have an issue when I try to get a lot of tweets by query search with some keywords.
I got the following error when I tried to collect 100k tweets containing "Sarkozy" between 2012/04/23 and 2012/05/05:

java.io.IOException: Server returned HTTP response code: 500 for URL: https://twitter.com/i/search/timeline?f=realtime&q=+since%3A2012-04-23+until%3A2012-05-05+Sarkozy&src=typd&max_position=TWEET-198092582553792512-198562641613033472-BD1UO2FFu9QAAAAAAAAETAAAAAcAAAASAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(Unknown Source)
at me.jhenrique.manager.TweetManager.getURLResponse(TweetManager.java:61)
at me.jhenrique.manager.TweetManager.getTweets(TweetManager.java:84)
at me.jhenrique.main.Main.main(Main.java:50)

The error occurred after collecting 22862 tweets. I ran it again and got the same error after collecting 6863 tweets. What's weird is that the program worked just fine when I did the same with "Hollande" as the keyword.
Could you help me fix it? :)
Thanks a lot!
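One common way to cope with intermittent failures such as the HTTP 500 above is to retry the request with an increasing delay. The following is a minimal sketch of that general technique, not something the project is known to do: the class, the IoCall interface, and the parameter values are all hypothetical, and it will not help if the server fails consistently for a given query.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of GetOldTweets-java.
class RetrySketch {
    // A request that may fail with an IOException (e.g. HTTP 500 or a read timeout).
    interface IoCall<T> {
        T call() throws IOException;
    }

    // Runs the request up to maxAttempts times, doubling the delay after each failure.
    static <T> T withRetry(IoCall<T> request, int maxAttempts, long initialDelayMs) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delayMs = initialDelayMs;
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return request.call();
            } catch (IOException e) {
                last = e;
                if (attempt == maxAttempts) {
                    break; // no more attempts left
                }
                try {
                    Thread.sleep(delayMs);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting to retry", ie);
                }
                delayMs *= 2; // exponential backoff
            }
        }
        throw new UncheckedIOException(last);
    }
}
```

A caller would wrap the single page fetch in withRetry, for example with 5 attempts and a 1000 ms initial delay, so that a failure part-way through a long collection does not end the whole run.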

Java error

Hi, sometimes I get this message and the query is cut off.

java.lang.NumberFormatException: For input string: ""
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:504)
at java.lang.Integer.valueOf(Integer.java:582)
at me.jhenrique.manager.TweetManager.getTweets(TweetManager.java:96)
at me.jhenrique.main.Exporter.main(Exporter.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)

What does it refer to? How can we resolve it?
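The trace shows Integer.valueOf being called with an empty string inside TweetManager.getTweets. A minimal sketch of a defensive parse is shown below, assuming the value is a numeric field that can safely fall back to a default when it is missing; the helper name is hypothetical and the actual field being parsed is not known from the trace.

```java
// Hypothetical helper, not part of GetOldTweets-java.
class SafeParseSketch {
    // Parses an integer, returning fallback for null, blank, or non-numeric input.
    static int parseIntOrDefault(String raw, int fallback) {
        if (raw == null) {
            return fallback;
        }
        String trimmed = raw.trim();
        if (trimmed.isEmpty()) {
            return fallback;
        }
        try {
            return Integer.parseInt(trimmed);
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseIntOrDefault("", 0));   // empty input falls back
        System.out.println(parseIntOrDefault("42", 0)); // normal input parses
    }
}
```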

Not all tweets seem to download

Hey Jefferson, great application you wrote here.
I used it quite a bit, and I get reasonably good results. However, not all tweets seem to download. I tried this for BarackObama over the last few days: most tweets come through (retweets don't, which is not the issue), but not all of his tweets from the last few days seem to be included. Is there any reason tweets are being excluded?

How to cite GetOldTweets?

Hi Jefferson,

I used GetOldTweets for an article and I want to cite it. Have you ever presented the software somewhere, so we can cite that? Or should we just give your GitHub URL?

Getting random tweets in a specific geo and time

Hi
Thanks for your good program. I'm using it to get tweets in a specific geo and time range. But when I don't add a query or username, I get an error after getURLResponse, and the output looks like: "url after getURLResponse: {"message":"Sorry! We did something wrong."}"
Is it possible to get random tweets in a specific time range, without adding a query or username?

Geo Location Info

Hi, first off thanks a lot for the great work in creating this project!

I was just wondering whether the Geo Location information is being parsed / returned correctly. Currently, getGeo() returns an empty string for all the tweets retrieved.

Thanks a lot!

How can I use the geo in the tweet class from the command line?

Great program! Works flawlessly.

I am trying to get tweets only from Spain. I tried to get this into the querysearch by using "searchterm near:"España" within:250mi" (as in Twitter's advanced search), but then the timeframe (since/until) won't work.

How can I use the geo in the tweet class from the command line? I've tried with country (Spain), country code (ES and SP) and with geodata (36.085152, -9.122096, 43.805610, 3.319509), with no luck...

Or is it possible to get data for country, or geo_coords from the sender?

Thanks!

Incorrect splitting for separating hashtags

In the case where there is another tweet, a picture or similar embedded in a tweet, the text of the tweet and this attachment are not separated. See this example:
https://twitter.com/ACM_CHIIR/status/837247495864479744
Hooray - the proceedings arrived today! #chiir2017pic.twitter.com/lLZMfTpR0F

This is unsatisfying in itself, but it also leads to problems in hashtag detection.

If, as in the example, the last token of a tweet before the attachment is a hashtag, tweet.getHashtags() will output #chiir2017pic as a hashtag, when it should simply be #chiir2017.
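A minimal sketch of one possible workaround for the case described above: insert a space before a pic.twitter.com/ link that is glued to the preceding text, then extract hashtags. This is a heuristic written for illustration, not the project's parsing code; the class and method names are hypothetical, and a hashtag that genuinely ends in "pic" directly before ".twitter.com/" would be split incorrectly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of GetOldTweets-java.
class HashtagSketch {
    private static final Pattern HASHTAG = Pattern.compile("#\\w+"); // ASCII word characters only

    // Adds a space before "pic.twitter.com/" when it directly follows non-space text.
    static String separateAttachment(String text) {
        return text.replaceAll("(?<=\\S)(pic\\.twitter\\.com/)", " $1");
    }

    // Returns the hashtags found after separating the attachment link.
    static List<String> hashtags(String text) {
        List<String> tags = new ArrayList<>();
        Matcher m = HASHTAG.matcher(separateAttachment(text));
        while (m.find()) {
            tags.add(m.group());
        }
        return tags;
    }

    public static void main(String[] args) {
        String tweet = "Hooray - the proceedings arrived today! #chiir2017pic.twitter.com/lLZMfTpR0F";
        System.out.println(separateAttachment(tweet));
        System.out.println(hashtags(tweet));
    }
}
```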

unknown host exception

java.net.UnknownHostException: twitter.com
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
at java.net.InetAddress.getAllByName(InetAddress.java:1192)
at java.net.InetAddress.getAllByName(InetAddress.java:1126)
at org.apache.http.impl.conn.SystemDefaultDnsResolver.resolve(SystemDefaultDnsResolver.java:45)
at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:111)
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:353)
at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:380)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:88)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:107)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55)
at main.java.me.jhenrique.manager.TweetManager.getURLResponse(TweetManager.java:80)
at main.java.me.jhenrique.manager.TweetManager.getTweets(TweetManager.java:96)
at main.java.me.jhenrique.main.Main.main(Main.java:28)
Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at java.util.ArrayList.rangeCheck(ArrayList.java:653)
at java.util.ArrayList.get(ArrayList.java:429)
at main.java.me.jhenrique.main.Main.main(Main.java:28)

How can I solve this particular problem?
I added the following lines of code to TweetManager:
String host = "10.185.XX.XX";
String port = "8080";
System.out.println("Using proxy: " + host + ":" + port);
System.setProperty("https.proxyHost", host);
System.setProperty("https.proxyPort", port);
But it still doesn't work. Please help. My company uses a proxy server.

Doesn't work

I tried with twitter4j and a proxy and it worked well. Now I'm turning to your project for old tweets, but it doesn't work. Can you help me?

org.apache.http.conn.HttpHostConnectException: Connect to twitter.com:443 [twitter.com/173.252.73.48] failed: Connection refused (Connection refused)
at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:151)
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:353)
at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:380)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:88)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:107)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55)
at me.jhenrique.manager.TweetManager.getURLResponse(TweetManager.java:64)
at me.jhenrique.manager.TweetManager.getTweets(TweetManager.java:80)
at me.jhenrique.main.Exporter.main(Exporter.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at org.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:337)
at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:134)

Why specify username?

Is it possible to search among all users (all public tweets) without specifying any particular username?

Locations with county+city mess up the output file.

Hi Jefferson,
I realized that if the location in a tweet is given like "Brooklyn, NYC," the second part appears in the next column (mentions) and shifts every following column in that row by one. Reserving two columns for locations may resolve the issue.

  • OK, my bad, I fixed it in Excel.
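For cases where a field value really does contain the separator, the usual CSV convention is to wrap the field in double quotes and double any embedded quotes. A minimal sketch of that convention follows; it is an illustration under the assumption that the shift comes from an unquoted separator inside the field, and the class and method names are hypothetical rather than part of the project's exporter.

```java
// Hypothetical helper, not part of GetOldTweets-java.
class CsvFieldSketch {
    // Quotes a field if it contains the separator, a double quote, or a line break.
    static String escape(String value, char separator) {
        boolean needsQuotes = value.indexOf(separator) >= 0
                || value.indexOf('"') >= 0
                || value.indexOf('\n') >= 0
                || value.indexOf('\r') >= 0;
        if (!needsQuotes) {
            return value;
        }
        // Embedded quotes are doubled, then the whole field is wrapped in quotes.
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        System.out.println(escape("Brooklyn, NYC", ','));
        System.out.println(escape("Brooklyn", ','));
    }
}
```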

Doesn't work

I have tried your project and it didn't get any tweets.
Is there something I need to do?

How to get other twitter criteria, such as "reply" and "geo"

Hi Jefferson, thank you for providing getoldtweets; I will cite you in my thesis project. However, I would like to add some basic Twitter criteria to the extraction, such as reply and geo. In fact, geo doesn't actually retrieve the user's location. Do you think you can help? Thank you, Roberta
