
Comments (2)

edsu commented on September 15, 2024

Hi @JROliver3 -- thanks for the question.

If you want to automate hydration, I suggest you take a look at our other utility, twarc. It is a command line utility, but also a Python software library. Hydrator was really designed for people who are comfortable doing data analysis on CSV files, but aren't familiar with running things from the command line or doing systems programming.

For example, to hydrate a tweet id file with twarc you can run:

twarc hydrate ids.txt > tweets.jsonl

You could achieve something like what you are talking about by automating a system call, which should be doable in your preferred programming language.
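As a rough sketch of the system-call approach, the Python below shells out to the same `twarc hydrate` command shown above for every `.txt` file in a directory. The function names, the directory layout, and the `.jsonl` output naming are illustrative assumptions, not part of twarc or Hydrator, and it assumes twarc is installed and already set up with credentials.

```python
import subprocess
from pathlib import Path


def hydrate_file(ids_path, out_path, command=("twarc", "hydrate")):
    """Run `twarc hydrate <ids_path>` and redirect its stdout to out_path.

    `command` is a parameter (an assumption of this sketch) so a different
    executable can be substituted, for example when testing.
    """
    with open(out_path, "w", encoding="utf-8") as out:
        # check=True raises CalledProcessError if the command exits non-zero.
        subprocess.run([*command, str(ids_path)], stdout=out, check=True)


def hydrate_directory_cli(in_dir, out_dir, command=("twarc", "hydrate")):
    """Hydrate every *.txt id file in in_dir, writing <name>.jsonl to out_dir."""
    in_dir, out_dir = Path(in_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for ids_path in sorted(in_dir.glob("*.txt")):
        out_path = out_dir / (ids_path.stem + ".jsonl")
        hydrate_file(ids_path, out_path, command)
        written.append(out_path)
    return written
```

Calling `hydrate_directory_cli("ids", "tweets")` would then produce one `.jsonl` file per id file, where `ids` and `tweets` are hypothetical directory names.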

Or, if you know Python, it would probably be cleaner to use twarc as a library. That would allow you to introduce logic for looking for files in a directory and writing the results out where you need them, maybe even to a database or data pipeline if you wanted.
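A minimal sketch of the library approach might look like the following. It assumes the twarc 1.x `Twarc` class and its `hydrate` method, which takes an iterable of tweet ids and yields tweet dictionaries. The helper names, the directory layout, and the credential placeholders are hypothetical; the loop takes the hydrating function as an argument so the output step could later be swapped for a database or pipeline write.

```python
import json
from pathlib import Path


def hydrate_directory(in_dir, out_dir, hydrate):
    """Hydrate every *.txt id file in in_dir into <name>.jsonl in out_dir.

    `hydrate` is any callable that accepts an iterable of tweet ids and
    yields tweet dicts (for example, the `hydrate` method of a twarc client).
    Returns a dict mapping each input file name to the number of tweets written.
    """
    in_dir, out_dir = Path(in_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    counts = {}
    for ids_path in sorted(in_dir.glob("*.txt")):
        out_path = out_dir / (ids_path.stem + ".jsonl")
        written = 0
        with open(ids_path, encoding="utf-8") as ids, \
                open(out_path, "w", encoding="utf-8") as out:
            # Skip blank lines; pass the ids through lazily.
            tweet_ids = (line.strip() for line in ids if line.strip())
            for tweet in hydrate(tweet_ids):
                out.write(json.dumps(tweet) + "\n")
                written += 1
        counts[ids_path.name] = written
    return counts


def make_twarc_hydrator(consumer_key, consumer_secret,
                        access_token, access_token_secret):
    """Build a hydrate callable from twarc 1.x (assumed to be installed)."""
    from twarc import Twarc  # imported here so the loop above has no twarc dependency
    client = Twarc(consumer_key, consumer_secret,
                   access_token, access_token_secret)
    return client.hydrate
```

With real credentials in place of the placeholders, `hydrate_directory("ids", "tweets", make_twarc_hydrator(...))` would hydrate each file in turn. Tweets that have been deleted or protected are typically not returned, so the counts can be lower than the number of ids.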

I actually helped write a little program for a group doing COVID-19 related Twitter research. It loops through a directory and hydrates the data, and might help illustrate a bit more about how you can use twarc programmatically:

https://github.com/echen102/COVID-19-TweetIDs/blob/master/hydrate.py

Please feel free to ask more questions here, or close this issue if you feel like it has been resolved.

from hydrator.

JROliver3 commented on September 15, 2024

@edsu

Thanks a bunch for the reply. This will be a huge help and will prevent a lot of work for me in the future (considering that I don't know much about the hydration process on my own). Coincidentally (or maybe not) I'm also working on a COVID-19 related project using scraped data from Twitter, so it sounds like that project should be perfect. I'll check it out.

My problem is resolved, closing for now. Thanks again.

