
visprog's Introduction

🔥 Check out our new work, CodeNav, which addresses many limitations of VisProg and generalizes it further: 🔥

✅ Instead of writing tool descriptions, point CodeNav at the codebase you want it to use - that's right, the raw source code! CodeNav will index and search the source code directly.
✅ Instead of generating the whole program at once, CodeNav iteratively generates code (which imports and invokes functions and classes from your codebase), executes it, and decides the next step based on the execution output. The next step could be searching the codebase or writing more code.
✅ Instead of generating one function call per line, CodeNav writes free-form code - think of it as writing a code cell in an IPython notebook. While executing the current code block, CodeNav has access to the global variables created by previous code blocks.
✅ Instead of giving up on an execution error, CodeNav inspects the execution results, including errors, newly created variables, and STDOUT, and tries to fix the errors in the next step.
✅ Instead of requiring tools to be simple function calls, CodeNav gives you, the developer of the tools, the flexibility to build a full-fledged codebase as you see fit - use abstractions, use object-oriented programming, and generally follow good software development practices (meaningful class/function/variable names, docstrings, and argument type annotations all help).

Visual Programming: Compositional visual reasoning without training (CVPR 2023 Best Paper!)

By Tanmay Gupta and Aniruddha Kembhavi

[ Project Page | arXiv Paper | Blog ]

teaser

This repository contains the official code for VisProg - a neuro-symbolic system that solves complex and compositional visual tasks given natural language instructions. VisProg uses the in-context learning ability of GPT3 to generate Python programs, which are then executed to produce both the solution and a comprehensive, interpretable rationale. Each line of the generated program may invoke one of several off-the-shelf computer vision models, image processing routines, or Python functions to produce intermediate outputs that may be consumed by subsequent parts of the program.

This code base has been designed to be:

✅ easy to use (a simple ipynb per task)
✅ easy to extend with new functionality by adding new modules to VisProg
✅ easy to extend to new tasks by adding in-context examples for these tasks
✅ minimal and modular to make it easy to dig into and build upon

Install Dependencies

conda env create -f environment.yaml
conda activate visprog

Running the notebooks

Having set up and activated the conda environment, you should be all set to run the notebooks in the notebooks/ folder. If you use an editor like VSCode, opening the .ipynb files within VSCode might be the easiest way to get started.

You will find a notebook for each of the tasks (GQA, NLVR, knowledge tagging, and image editing); they are quite similar in structure.

Simply enter your OpenAI API key in the cell that currently reads <Enter your key here> and run the notebook. The notebooks are designed to be self-contained and should run end-to-end without any additional setup.

The basic structure of the notebooks is as follows:

  • Setup paths
  • Set OPENAI_API_KEY environment variable to use GPT3
  • Import ProgramGenerator and ProgramInterpreter classes
  • Import PROMPT (a text string containing in-context examples) or create_prompt (a function that creates the prompt on the fly)
  • Create the ProgramGenerator and ProgramInterpreter objects
  • Load the image or images to perform inference on
  • Specify the natural language question / statement / instruction
  • Generate program from the specified instruction using ProgramGenerator
  • Interpret and execute program using ProgramInterpreter
  • Visualize the returned result and visual rationale (execution trace)
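
To make this concrete, here is a minimal end-to-end sketch in the spirit of the GQA notebook. The interpreter.execute call and the imports match code in this repository; the ProgramGenerator constructor argument, the generate signature, the dataset name, and the init_state key below are assumptions, so defer to notebooks/gqa.ipynb for the exact calls.

    import os
    from PIL import Image
    from IPython.core.display import HTML

    from engine.utils import ProgramGenerator, ProgramInterpreter
    from prompts.gqa import create_prompt

    # Set the API key (the notebooks read it from the environment).
    os.environ['OPENAI_API_KEY'] = '<Enter your key here>'

    # Build the interpreter and generator. The 'gqa' dataset name and the
    # prompter keyword are assumptions; check the notebook for the exact call.
    interpreter = ProgramInterpreter(dataset='gqa')
    generator = ProgramGenerator(prompter=create_prompt)

    # Load an image and specify a natural language instruction.
    image = Image.open('../assets/camel1.png').convert('RGB')
    question = 'How many people or animals are in the image?'

    # Generate the program, execute it, and visualize the rationale.
    prog = generator.generate(dict(question=question))  # return value/signature assumed
    result, prog_state, html_str = interpreter.execute(
        prog, dict(IMAGE=image), inspect=True)
    HTML(html_str)  # the visual rationale (execution trace)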

Example Output

We have tried to make it easy to visualize each step of the execution trace.

For instance, when running the gqa notebook for the instruction How many people or animals are in the image? on assets/camel1.png, you should see the following outputs:

Program

BOX0=LOC(image=IMAGE,object='people')
BOX1=LOC(image=IMAGE,object='animals')
ANSWER0=COUNT(box=BOX0)
ANSWER1=COUNT(box=BOX1)
ANSWER2=EVAL(expr="{ANSWER0} + {ANSWER1}")
FINAL_RESULT=RESULT(var=ANSWER2)
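
Under the hood, the EVAL step substitutes values from the program state into the expression string and evaluates the result with Python's eval (see EvalInterpreter in engine/step_interpreters.py). A simplified sketch of what happens for ANSWER2 above, with illustrative values:

    # Values are illustrative; ANSWER0/ANSWER1 come from the COUNT steps.
    prog_state = {'ANSWER0': 1, 'ANSWER1': 2}
    expr = "{ANSWER0} + {ANSWER1}".format(**prog_state)  # -> "1 + 2"
    ANSWER2 = eval(expr)                                 # -> 3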

Visual Rationale

assets/rationale.png

What if VisProg doesn't solve your task?

It is possible that the instruction you provide is not solved correctly by VisProg. This can happen for a couple of reasons:

  1. The instruction is very different from the in-context examples that VisProg has seen. In this case, even though the current set of modules may be adequate for solving the task, VisProg fails because of incorrect program generation. See if you can manually write a program using VisProg's modules that solves the task. If you can, add this program to the in-context examples and re-run the notebook to handle similar instructions (see the sketch after this list).
  2. The problem is not solvable with the current set of modules in VisProg. In this case, you can add new modules to VisProg to solve the task. See the next section for details.
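
For case 1, here is a minimal sketch of what an added in-context example might look like. The instruction is hypothetical, and the exact container that the prompt files use to hold examples may differ, so adapt it to the structure of the relevant file in prompts/.

    # Hypothetical new example for prompts/gqa.py, written with existing modules.
    NEW_EXAMPLE = """Question: Are there more people than animals in the image?
    Program:
    BOX0=LOC(image=IMAGE,object='people')
    BOX1=LOC(image=IMAGE,object='animals')
    ANSWER0=COUNT(box=BOX0)
    ANSWER1=COUNT(box=BOX1)
    ANSWER2=EVAL(expr="{ANSWER0} > {ANSWER1}")
    FINAL_RESULT=RESULT(var=ANSWER2)
    """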

Adding new functionality and the ability to solve new tasks

  • Add new modules that enable these functionalities to engine/step_interpreters.py, and don't forget to register them in the register_step_interpreters function in the same file (a registration sketch follows this list). Here's the step interpreter for the COUNT module; all modules share a similar structure with parse, html, and execute functions. The parse function parses the program string to extract the arguments and the output variable. The html function generates the HTML representation of the step for the execution trace. The execute function runs the module and returns the output, along with the HTML (if inspect=True) for the execution trace.

    class CountInterpreter():
        step_name = 'COUNT'
    
        def __init__(self):
            print(f'Registering {self.step_name} step')
    
        def parse(self,prog_step):
            parse_result = parse_step(prog_step.prog_str)
            step_name = parse_result['step_name']
            box_var = parse_result['args']['box']
            output_var = parse_result['output_var']
            assert(step_name==self.step_name)
            return box_var,output_var
    
        def html(self,box_img,output_var,count):
            step_name = html_step_name(self.step_name)
            output_var = html_var_name(output_var)
            box_arg = html_arg_name('bbox')
            box_img = html_embed_image(box_img)
            output = html_output(count)
            return f"""<div>{output_var}={step_name}({box_arg}={box_img})={output}</div>"""
    
        def execute(self,prog_step,inspect=False):
            box_var,output_var = self.parse(prog_step)
            boxes = prog_step.state[box_var]
            count = len(boxes)
            prog_step.state[output_var] = count
            if inspect:
                box_img = prog_step.state[box_var+'_IMAGE']
                html_str = self.html(box_img, output_var, count)
                return count, html_str
    
            return count
  • Add your in-context examples to a new file, prompts/your_task_or_dataset_name.py. Note that instead of using in-context examples to generate programs, you may experiment with different ways of prompting, such as providing function signatures and docstrings, without needing to change the code at all!

  • You can now play with examples from this dataset using a notebook similar to those in the notebooks/ folder, or create a Python script to run inference on a large number of examples.
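
To make the registration step concrete, here is a minimal sketch, where MYSTEP, MyStepInterpreter, and the 'my_task' dataset name are all hypothetical; the dict structure mirrors the existing branches of register_step_interpreters in engine/step_interpreters.py.

    # engine/step_interpreters.py (sketch): add a branch for your task that
    # maps step names to interpreter instances.
    def register_step_interpreters(dataset='nlvr'):
        if dataset == 'my_task':                 # hypothetical dataset name
            return dict(
                LOC=LocInterpreter(),
                COUNT=CountInterpreter(),
                EVAL=EvalInterpreter(),
                MYSTEP=MyStepInterpreter(),      # your new module
                RESULT=ResultInterpreter(),
            )
        # ... existing branches, e.g. dataset=='imageEdit' ...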

Here's what VisProg can do today

assets/teaser1.png

A summary of currently available modules

assets/modules.png

*Note that for VQA we have replaced ViLT with BLIP, a more performant model that was recently made available on Hugging Face. This shows how easy it is to swap out or upgrade modules in VisProg.

Changes since the version used in the CVPR paper

  • GPT3 upgraded to text-davinci-003 from text-davinci-002
  • VQA module upgraded from ViLT to the more performant BLIP

Citation

If you find this code useful in your research, please consider citing:

@article{Gupta2022VisProg,
  title={Visual Programming: Compositional visual reasoning without training},
  author={Tanmay Gupta and Aniruddha Kembhavi},
  journal={ArXiv},
  year={2022},
  volume={abs/2211.11559}
}

visprog's People

Contributors

bigredt

visprog's Issues

EOF occurred in violation of protocol

When I run gqa.ipynb, there is an error I can't fix:


SSLError Traceback (most recent call last)
SSLError: [SSL: UNEXPECTED_EOF_WHILE_READING] EOF occurred in violation of protocol (_ssl.c:1007)

The above exception was the direct cause of the following exception:

MaxRetryError Traceback (most recent call last)
File e:\Anaconda3\envs\visprog\lib\site-packages\requests\adapters.py:486, in HTTPAdapter.send(self, request, stream, timeout, verify, cert, proxies)
485 try:
--> 486 resp = conn.urlopen(
487 method=request.method,
488 url=url,
489 body=request.body,
490 headers=request.headers,
491 redirect=False,
492 assert_same_host=False,
493 preload_content=False,
494 decode_content=False,
495 retries=self.max_retries,
496 timeout=timeout,
497 chunked=chunked,
498 )
500 except (ProtocolError, OSError) as err:

File e:\Anaconda3\envs\visprog\lib\site-packages\urllib3\connectionpool.py:877, in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, preload_content, decode_content, **response_kw)
...
374 request_id=result.headers.get("X-Request-Id"),
375 )
376 # Don't read the whole stream for debug logging unless necessary.

APIConnectionError: Error communicating with OpenAI


OS: Win11
conda env: visprog

I would appreciate any suggested solutions.

Using any Open-Source LLM

Hi, thanks for your great pioneering work!

I had a couple of questions:

  • I wonder if you tried using any of the many "free" open-source LLMs (LLaMA, MPT, etc.) during your experiments. If yes, how was the performance?
  • Did you analyze how the quality of program generation (for a query) varies with the PROMPTs you use? I am sorry if it's already mentioned in the paper and I missed it.

Thanks.

Evaluation

In Section 5 of the paper, you evaluate on GQA and NLVR, but the corresponding evaluation code does not appear in this open-source release. Could you provide the relevant code and datasets?

I can't thank you enough.

[BUG] ImportError: failed to find libmagic

Hi,
thank you very much for your exciting contribution, and congrats on your success at CVPR 2023!

Unfortunately, I'm running into an ImportError when trying to run gqa.ipynb.

Logs:

ImportError                               Traceback (most recent call last)
Cell In[3], line 5
      2 from IPython.core.display import HTML
      3 from functools import partial
----> 5 from engine.utils import ProgramGenerator, ProgramInterpreter
      6 from prompts.gqa import create_prompt

File c:\Users\sbene\Projects\visprog\engine\utils.py:7
      4 import numpy as np
      5 import copy
----> 7 from .step_interpreters import register_step_interpreters, parse_step
     10 class Program:
     11     def __init__(self,prog_str,init_state=None):

File c:\Users\sbene\Projects\visprog\engine\step_interpreters.py:9
      7 import face_detection
      8 import io, tokenize
----> 9 from augly.utils.base_paths import EMOJI_DIR
     10 import augly.image as imaugs
     11 from PIL import Image,ImageDraw,ImageFont,ImageFilter

File c:\Users\sbene\miniconda3\envs\visprog\lib\site-packages\augly\utils\__init__.py:8
      1 #!/usr/bin/env python3
      2 # Copyright (c) Meta Platforms, Inc. and affiliates.
      3 # All rights reserved.
      4 #
      5 # This source code is licensed under the license found in the
      6 # LICENSE file in the root directory of this source tree.
----> 8 from augly.utils.asserts import (
      9     is_audio_file,
     10     is_image_file,
     11     is_video_file,
     12     validate_audio_path,
     13     validate_image_path,
     14     validate_output_path,
     15     validate_path,
     16     validate_rgb_color,
     17     validate_video_path,
     18 )
     19 from augly.utils.base_paths import (
     20     ASSETS_BASE_DIR,
     21     AUDIO_ASSETS_DIR,
   (...)
     33     VIDEO_METADATA_PATH,
     34 )
     35 from augly.utils.classes import Segment

File c:\Users\sbene\miniconda3\envs\visprog\lib\site-packages\augly\utils\asserts.py:11
      8 import os
      9 from typing import Tuple
---> 11 import magic
     12 from augly.utils.io import pathmgr
     15 def is_content_type(filename: str, content_type: str) -> bool:

File c:\Users\sbene\miniconda3\envs\visprog\lib\site-packages\magic\__init__.py:209
    206     return m.from_descriptor(fd)
    208 from . import loader
--> 209 libmagic = loader.load_lib()
    211 magic_t = ctypes.c_void_p
    214 def errorcheck_null(result, func, args):

File c:\Users\sbene\miniconda3\envs\visprog\lib\site-packages\magic\loader.py:49, in load_lib()
     46     pass
     47 else:
     48   # It is better to raise an ImportError since we are importing magic module
---> 49   raise ImportError('failed to find libmagic.  Check your installation')

ImportError: failed to find libmagic.  Check your installation

Error in execution step -- "TypeError: can only concatenate str (not "int") to str"

Hi! Thanks for the great work :)

I was running the NLVR notebook and got the following error for the first two statements. Any idea why?

TypeError                                 Traceback (most recent call last)
Cell In[15], line 1
----> 1 result, prog_state, html_str = interpreter.execute(prog,init_state,inspect=True)

File ~/visprog/engine/utils.py:38, in ProgramInterpreter.execute(self, prog, init_state, inspect)
     36 for prog_step in prog_steps:
     37     if inspect:
---> 38         step_output, step_html = self.execute_step(prog_step,inspect)
     39         html_str += str(step_html) + '<hr>'
     40     else:

File ~/visprog/engine/utils.py:24, in ProgramInterpreter.execute_step(self, prog_step, inspect)
     22 step_name = parse_step(prog_step.prog_str,partial=True)['step_name']
     23 print(step_name)
---> 24 return self.step_interpreters[step_name].execute(prog_step,inspect)

File ~/visprog/engine/step_interpreters.py:103, in EvalInterpreter.execute(self, prog_step, inspect)
    100     step_input = step_input.replace('xor','!=')
    102 step_input = step_input.format(**prog_state)
--> 103 step_output = eval(step_input)
    104 prog_step.state[output_var] = step_output
    105 if inspect:

File <string>:1

TypeError: can only concatenate str (not "int") to str

The program generated is as follows:

ANSWER0=VQA(image=LEFT,question='How many camels are in the image?')
ANSWER1=VQA(image=RIGHT,question='How many camels are in the image?')
ANSWER2=VQA(image=LEFT,question='How many people are in the image?')
ANSWER3=VQA(image=RIGHT,question='How many people are in the image?')
ANSWER4=EVAL(expr='{ANSWER0} + {ANSWER1} > {ANSWER2} + {ANSWER3}')
FINAL_ANSWER=RESULT(var=ANSWER4)

The subset of GQA

Hi, for a fair comparison, could you release the subset of GQA used in the paper?

Generated invalid programs

Hi!
Thanks for sharing your inspiring work.

When running the code, I find that the program generator sometimes produces invalid programs. For example, for the question 'What device is not black?' (image ID n282436 in the GQA dataset), the generated program is:

BOX0=LOC(image=IMAGE,object='black')
IMAGE0=CROP_NOT(image=IMAGE,box=BOX0)
ANSWER0=VQA(image=IMAGE0,question='What device is this?')
FINAL_RESULT=RESULT(var=ANSWER0)

However, there is no module named 'CROP_NOT', which leads to a runtime error. How did you handle such cases in your experiments?

Dataset for evaluation

Thanks for the work! Could you release the dataset used for the evaluation task? Thank you!

Release of self-made datasets

Hi, thank you for your exciting work! I see that for the image editing and factual knowledge object tagging tasks, you annotated the evaluation datasets yourselves.

To evaluate VISPROG on this task, we annotate 100 tagging instructions across 46 images
To test VISPROG on the image editing instructions for de-identification, object highlighting, and object replacement, we collect 107 instructions across 65 images.

I'd like to ask if you have any plans to release them so that I can reproduce your results.
Thank you very much, and I look forward to your reply.

TypeError while executing 'REPLACE' instruction.

Hi! Thank you for your great work.
While executing image_editing.ipynb with the instruction "Replace the red bus with a blue bus", I ran into the TypeError below. Does anyone know how to solve this issue?

result, prog_state, html_str = interpreter.execute(prog,init_state,inspect=True)

SEG
label_ids_to_fuse unset. No instance will be fused.
dict_keys(['segmentation', 'segments_info'])
SELECT
REPLACE
C:\Users\username\anaconda3\Lib\site-packages\diffusers\models\attention_processor.py:1244: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:263.)
hidden_states = F.scaled_dot_product_attention(
100%
50/50 [00:05<00:00, 9.47it/s]

TypeError Traceback (most recent call last)
Cell In[11], line 1
----> 1 result, prog_state, html_str = interpreter.execute(prog,init_state,inspect=True)

File ~\visprog\engine\utils.py:38, in ProgramInterpreter.execute(self, prog, init_state, inspect)
36 for prog_step in prog_steps:
37 if inspect:
---> 38 step_output, step_html = self.execute_step(prog_step,inspect)
39 html_str += step_html + '<hr>'
40 else:

File ~\visprog\engine\utils.py:24, in ProgramInterpreter.execute_step(self, prog_step, inspect)
22 step_name = parse_step(prog_step.prog_str,partial=True)['step_name']
23 print(step_name)
---> 24 return self.step_interpreters[step_name].execute(prog_step,inspect)

File ~\visprog\engine\step_interpreters.py:1328, in ReplaceInterpreter.execute(self, prog_step, inspect)
1326 objs = prog_step.state[obj_var]
1327 mask = self.create_mask_img(objs)
-> 1328 new_img = self.predict(img, mask, prompt)
1329 prog_step.state[output_var] = new_img
1330 if inspect:

File ~\visprog\engine\step_interpreters.py:1302, in ReplaceInterpreter.predict(self, img, mask, prompt)
1300 mask,_,_ = self.resize_and_pad(mask)
1301 init_img,W,H = self.resize_and_pad(img)
-> 1302 new_img = self.pipe(
1303 prompt=prompt,
1304 image=init_img,
1305 mask_image=mask,
1306 # strength=0.98,
1307 guidance_scale=7.5,
1308 num_inference_steps=50 #200
1309 ).images[0]
1310 return new_img.crop((0,0,W-1,H-1)).resize(img.size)

File ~\anaconda3\Lib\site-packages\torch\utils\_contextlib.py:115, in context_decorator.<locals>.decorate_context(*args, **kwargs)
112 @functools.wraps(func)
113 def decorate_context(*args, **kwargs):
114 with ctx_factory():
--> 115 return func(*args, **kwargs)

File ~\anaconda3\Lib\site-packages\diffusers\pipelines\stable_diffusion\pipeline_stable_diffusion_inpaint.py:1444, in StableDiffusionInpaintPipeline.__call__(self, prompt, image, mask_image, masked_image_latents, height, width, padding_mask_crop, strength, num_inference_steps, timesteps, guidance_scale, negative_prompt, num_images_per_prompt, eta, generator, latents, prompt_embeds, negative_prompt_embeds, ip_adapter_image, output_type, return_dict, cross_attention_kwargs, clip_skip, callback_on_step_end, callback_on_step_end_tensor_inputs, **kwargs)
1442 do_denormalize = [True] * image.shape[0]
1443 else:
-> 1444 do_denormalize = [not has_nsfw for has_nsfw in has_nsfw_concept]
1446 image = self.image_processor.postprocess(image, output_type=output_type, do_denormalize=do_denormalize)
1448 if padding_mask_crop is not None:

TypeError: 'bool' object is not iterable

Taking forever to run notebook

Hi,

When running the notebook, it gets stuck here:
[screenshot]
I've had it running for almost 2 hours and still nothing.
[screenshot]
I followed the instructions to make a new conda environment and ran the notebook in it. I've also put my API key where it asks.

So it's been taking forever to run gqa.ipynb, and I am wondering if I am doing anything wrong. Any help would be greatly appreciated.

Missing Checkpoint: https://folk.ntnu.no/haakohu/WIDERFace_DSFD_RES152.pth

Hello, the image_editing.ipynb notebook has an error: HTTPError: HTTP Error 410: Gone. It looks like the checkpoint is no longer at the specified URL.

---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
Cell In[13], line 1
----> 1 interpreter = ProgramInterpreter(dataset='imageEdit')

File ~/Programming/visprog-main/engine/utils.py:19, in ProgramInterpreter.__init__(self, dataset)
     18 def __init__(self,dataset='nlvr'):
---> 19     self.step_interpreters = register_step_interpreters(dataset)

File ~/Programming/visprog-main/engine/step_interpreters.py:1363, in register_step_interpreters(dataset)
   1344     return dict(
   1345         LOC=LocInterpreter(),
   1346         COUNT=CountInterpreter(),
   (...)
   1359         RESULT=ResultInterpreter()
   1360     )
   1361 elif dataset=='imageEdit':
   1362     return dict(
-> 1363         FACEDET=FaceDetInterpreter(),
   1364         SEG=SegmentInterpreter(),
   1365         SELECT=SelectInterpreter(),
   1366         COLORPOP=ColorpopInterpreter(),
   1367         BGBLUR=BgBlurInterpreter(),
   1368         REPLACE=ReplaceInterpreter(),
   1369         EMOJI=EmojiInterpreter(),
...
File /usr/local/anaconda3/envs/visprog/lib/python3.10/urllib/request.py:643, in HTTPDefaultErrorHandler.http_error_default(self, req, fp, code, msg, hdrs)
    642 def http_error_default(self, req, fp, code, msg, hdrs):
--> 643     raise HTTPError(req.full_url, code, msg, hdrs, fp)

HTTPError: HTTP Error 410: Gone

lost files

Thank you for the great work. Where can we get this file: /usr/share/fonts/truetype/dejavu/DejaVuSansMono-Bold.ttf?

How can I evaluate the image editing task, and where can I find the dataset?

Hello, thanks for your exciting work!

I found the following in your paper: 'To test VISPROG on the image editing instructions for de-identification, object highlighting, and object replacement, we collect 107 instructions across 65 images.'

How can I evaluate this task and download the images to reproduce your results without copyright issues?

Could you share a download URL and explain how to 'manually score the predictions for correctness'?

Thanks a lot!

Repository Not Found for url: https://huggingface.co/api/models/runwayml/stable-diffusion-inpainting/revision/fp16

I can't run the image editing task:

Registering COLORPOP step
Registering BGBLUR step
Registering REPLACE step
Couldn't connect to the Hub: 401 Client Error. (Request ID: Root=1-66e26e75-170a15a05078be685e693c57;e6b93512-3f07-4fa0-84d9-82802dc4a2f5)

Repository Not Found for url: https://huggingface.co/api/models/runwayml/stable-diffusion-inpainting/revision/fp16.
Please make sure you specified the correct `repo_id` and `repo_type`.
If you are trying to access a private or gated repo, make sure you are authenticated.
Invalid username or password..
Will try to load from local cache.
---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
[/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_errors.py](https://localhost:8080/#) in hf_raise_for_status(response, endpoint_name)
    303     try:
--> 304         response.raise_for_status()
    305     except HTTPError as e:

12 frames
HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/api/models/runwayml/stable-diffusion-inpainting/revision/fp16

The above exception was the direct cause of the following exception:

RepositoryNotFoundError                   Traceback (most recent call last)
RepositoryNotFoundError: 401 Client Error. (Request ID: Root=1-66e26e75-170a15a05078be685e693c57;e6b93512-3f07-4fa0-84d9-82802dc4a2f5)

Repository Not Found for url: https://huggingface.co/api/models/runwayml/stable-diffusion-inpainting/revision/fp16.
Please make sure you specified the correct `repo_id` and `repo_type`.
If you are trying to access a private or gated repo, make sure you are authenticated.
Invalid username or password.

The above exception was the direct cause of the following exception:

OSError                                   Traceback (most recent call last)
[/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/pipeline_utils.py](https://localhost:8080/#) in download(cls, pretrained_model_name, **kwargs)
   1534             else:
   1535                 # 2. we forced `local_files_only=True` when `model_info` failed
-> 1536                 raise EnvironmentError(
   1537                     f"Cannot load model {pretrained_model_name}: model is not cached locally and an error occurred"
   1538                     " while trying to fetch metadata from the Hub. Please check out the root cause in the stacktrace"

OSError: Cannot load model runwayml/stable-diffusion-inpainting: model is not cached locally and an error occurred while trying to fetch metadata from the Hub. Please check out the root cause in the stacktrace above.

Is there an alternative URL I can use instead?
