Comments (6)

kenjitoyama avatar kenjitoyama commented on August 22, 2024

Hello @dubeyabhijeet! We do not officially support running AndroidEnv on real devices at the moment.
To do that, you'll need a new Simulator that grabs screenshots and sends actions. If you don't care much about performance, it's relatively straightforward to write one that uses adb screencap (https://developer.android.com/studio/command-line/adb#screencap) and adb input tap. Keep in mind that this approach will probably achieve only a handful of steps per second.
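
As a rough illustration of the adb-based approach, here is a minimal Python sketch that wraps the two commands mentioned above. The `AdbDevice` class, its method names, and the injectable `runner` parameter are hypothetical and not part of AndroidEnv; the sketch only shows how screenshots and taps could be issued over adb, and it does not implement AndroidEnv's Simulator interface.

```python
import subprocess
from typing import Callable, List, Optional

# A runner takes an argv list and returns the command's stdout as bytes.
Runner = Callable[[List[str]], bytes]


def _run(cmd: List[str]) -> bytes:
    """Runs a command and returns its stdout; raises if the command fails."""
    return subprocess.run(cmd, check=True, capture_output=True).stdout


class AdbDevice:
    """Hypothetical helper (an assumption, not an AndroidEnv class)."""

    def __init__(self, serial: Optional[str] = None, runner: Runner = _run):
        # `adb -s <serial>` selects a specific device when several are attached.
        self._prefix = ["adb"] + (["-s", serial] if serial else [])
        self._runner = runner

    def screenshot_png(self) -> bytes:
        # `exec-out` returns the raw PNG bytes written by `screencap -p`.
        return self._runner(self._prefix + ["exec-out", "screencap", "-p"])

    def tap(self, x: int, y: int) -> None:
        # `input tap` injects a single touch at pixel coordinates (x, y).
        self._runner(self._prefix + ["shell", "input", "tap", str(x), str(y)])
```

Each call spawns a new adb process, which is one reason this kind of loop is likely to reach only a few steps per second. Decoding the PNG into an array and mapping agent actions to pixel coordinates would still be needed on top of this.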

Cheers,

Daniel

from android_env.

dubeyabhijeet avatar dubeyabhijeet commented on August 22, 2024

Understood! Thanks a lot.

Right now you are controlling the emulator from a PC. If I use real devices, performance will be the trade-off (PC to real device).

What about deploying it on the phone as an APK? For example, I use an app traversal tool and, on each screen, use android_env to decide which action to take, with android_env wrapped in the traversal APK and deployed on the phone.

kenjitoyama avatar kenjitoyama commented on August 22, 2024

Yes, it's possible but at the moment there's no good way to do so. There are no APIs to fetch screenshots and execute actions from within Android, and that would also probably require root access, which is a security issue.

dubeyabhijeet avatar dubeyabhijeet commented on August 22, 2024

Got it! Understood. Thanks a lot, Daniel :)

dubeyabhijeet avatar dubeyabhijeet commented on August 22, 2024

Hi Daniel, from what I see, we need to add logs to the app's source code. I have two questions:

  1. Is this not possible with third-party apps for which we don't have the source code?
  2. How did DeepMind train agents for YouTube? Did you have the source code, or does YouTube emit the logs needed for rewards?

kenjitoyama avatar kenjitoyama commented on August 22, 2024

Hello @dubeyabhijeet!

Is this not possible with third-party apps for which we don't have the source code?

You can run anything you want in the emulator, but rewards are only available if something is shown in the log stream and captured by a regex.

How did DeepMind train agents for YouTube? Did you have the source code, or does YouTube emit the logs needed for rewards?

We haven't trained explicitly on YouTube; it was just an example. However, we did train agents on apps for which we don't have the source code, such as Clock or System Settings, by reusing accessibility events. Please see https://github.com/deepmind/android_env/blob/main/docs/example_tasks.md#accessibility-forwarder. At one point we also extracted rewards from view hierarchies, but they're much slower to obtain (this needs a slow adb dumpsys call plus parsing) than simple Log.i() calls. We also experimented with OCR, but although it works on synthetic text in static apps, it doesn't work well for things like games or apps with unusual fonts and/or graphics. In theory, any event or message in Android could be used as a source of rewards, but we relied on logs because they're reliable and the easiest to implement (from both a developer's and a task designer's point of view).
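
To make the log-based idea concrete, here is a minimal Python sketch of extracting rewards from log lines with a regex. The `MyTask` tag, the `reward: <number>` message format, and the `rewards_from_log` function are all assumptions chosen for illustration; they are not AndroidEnv's actual task configuration or log format.

```python
import re
from typing import Iterable, Iterator

# Assumed format: the app calls something like Log.i("MyTask", "reward: 1.0"),
# which shows up in the log stream as "... I MyTask  : reward: 1.0".
REWARD_RE = re.compile(r"MyTask\s*:\s*reward:\s*(-?\d+(?:\.\d+)?)")


def rewards_from_log(lines: Iterable[str]) -> Iterator[float]:
    """Yields one float per log line that matches the assumed reward format."""
    for line in lines:
        match = REWARD_RE.search(line)
        if match:
            yield float(match.group(1))
```

Lines that don't match the pattern are simply ignored, so an app that never emits a matching message produces no rewards at all, which is the limitation described above for third-party apps.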

Cheers,

Daniel
