cheese's People

Contributors

asmith26, ayulockin, ehavener, fugitive-cat, kastanday, louiscastricato, shahbuland

cheese's Issues

`webdataset` not installed but required

Describe the bug
The webdataset package needs to be manually installed in order to run examples.

To Reproduce
Steps to reproduce the behavior:

  1. Follow the procedure described in Getting Started with CHEESE
  2. When running python -m cheese.examples.image_selection the execution fails with the following error
Traceback (most recent call last):
  File "/home/user/conda/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/user/conda/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/user/cheese/examples/image_selection.py", line 2, in <module>
    from cheese.pipeline.iterable_dataset import IterablePipeline, InvalidDataException
  File "/home/user/cheese/cheese/pipeline/iterable_dataset.py", line 4, in <module>
    import webdataset as wds
ModuleNotFoundError: No module named 'webdataset'

Expected behavior

The example script should run without further installations. webdataset should be included in the requirements.txt file.

Desktop (please complete the following information):

  • OS: Ubuntu 20.04

Need to move communication with user to API

Currently we rely on the client to implement some form of communication with the user (i.e. sending/receiving JSONs). This isn't ideal if CHEESE is to be a general tool: whoever is using the API should have direct access to these JSON packets, but the current API provides no way to access them.

device=0 leads to AssertionError: Torch not compiled with CUDA enabled

Describe the bug
The device is hard-coded to GPU in the instruct HF pipeline. It should be taken as input at lines 61 and 29; otherwise, on a CPU-only system, it leads to AssertionError: Torch not compiled with CUDA enabled.

To Reproduce
Steps to reproduce the behavior:
Run instruct_hf_pipeline.py on a CPU system

Expected behavior

The pipeline should detect whether a GPU is available, or accept the device as an input.
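A minimal sketch of one possible fix, assuming the Hugging Face convention of device=-1 for CPU and a non-negative index for a CUDA device (resolve_device is a hypothetical helper, not part of CHEESE):

```python
from typing import Optional

import torch

def resolve_device(device: Optional[int] = None) -> int:
    """Return the user-supplied device, or auto-detect one.

    Hugging Face pipelines accept device=-1 for CPU and device>=0
    for a CUDA device index.
    """
    if device is not None:
        return device
    return 0 if torch.cuda.is_available() else -1
```

The resolved value could then be passed through to the pipeline constructor instead of being hard-coded.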

Pipeline can't send/receive data directly to/from model

Multiple setups might require data to go to the model before the client, or to go to the model and then immediately to the client. A couple of things currently prevent this:

  1. Pipeline doesn't have an event subscriber for the model to publish to
  • Fix this by adding a subscriber to the pipeline (easy)
  2. Client ends up waiting for the model rather than getting new data in the case where the model is the last to touch data before it goes to the pipeline
  • Fix this by checking trip and comparing it to trip_max on the client before deciding whether to put it into an idle or waiting state
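The client-side check could look something like this (the Task fields and state names are assumptions for illustration, not the actual CHEESE schema):

```python
from dataclasses import dataclass

@dataclass
class Task:
    trip: int       # how many components have touched the data so far
    trip_max: int   # total hops the data makes before it is done

def next_client_state(task: Task) -> str:
    # If the task has completed its trip, the client can go idle and
    # request fresh data instead of waiting on the model.
    return "idle" if task.trip >= task.trip_max else "waiting"
```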

Prolific integration

Need CHEESE to interface neatly with Prolific. Intended function is as follows:

  • User information from Prolific linked to client IDs on CHEESE
  • CHEESE link sent to Prolific users

Batched model input

Currently, the model creates a queue of tasks sent to it by the client or pipeline, then handles them one at a time before sending them back. If processing is fast enough that the queue never accumulates many tasks, this is fine. However, it is very likely there will be cases where the queue fills up faster than a model running on unbatched data can keep up with it. We need to collate data and allow model processing to be done in batched form.
Current Idea:

  • Before taking the newest task from the queue, check the size of the queue and whether it would be worth batching (i.e. maybe if the number of items in the queue is greater than some predefined number)
  • Take many tasks off the queue and collate them
  • After model output is obtained, undo this collation and send tasks back as normal
  • Collate and uncollate functions should be defined by the user, but be optional
  • The handle-queued-tasks logic should deal with collation and batched calls to process
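A rough sketch of the batching idea above, with collate, uncollate, process, send_back, and batch_threshold all standing in for the real CHEESE hooks (none of these names come from the actual API):

```python
from queue import Queue

def handle_queued_tasks(queue: Queue, process, collate, uncollate,
                        send_back, batch_threshold: int = 4) -> None:
    # Always take the oldest task; if the queue has backed up past the
    # threshold, take more tasks so the model can run on a batch.
    tasks = [queue.get()]
    if queue.qsize() + 1 >= batch_threshold:
        while not queue.empty() and len(tasks) < batch_threshold:
            tasks.append(queue.get())
    batch = collate(tasks)               # user-defined collation
    results = uncollate(process(batch))  # undo collation after the model runs
    for result in results:
        send_back(result)                # send each task back as normal
```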

Client stats

The client manager should save stats on users. Specifically, for each user:

  • How much have they labelled so far
  • What proportion of data do they "error"
  • How much time do they spend labelling data on average after it has been sent to them
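One possible shape for such a record, as a hypothetical ClientStats dataclass the client manager could keep per user (field names are illustrative):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ClientStats:
    total_labelled: int = 0
    total_errors: int = 0
    labelling_times: List[float] = field(default_factory=list)

    def record(self, duration: float, errored: bool = False) -> None:
        # Called once per completed task, with the time elapsed since
        # the task was sent to the user.
        self.total_labelled += 1
        if errored:
            self.total_errors += 1
        self.labelling_times.append(duration)

    @property
    def error_rate(self) -> float:
        return self.total_errors / self.total_labelled if self.total_labelled else 0.0

    @property
    def mean_labelling_time(self) -> float:
        return sum(self.labelling_times) / len(self.labelling_times) \
            if self.labelling_times else 0.0
```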

Saving progress for datasets

Saving progress for datasets, namely IterablePipelines, is currently a bit clunky. The output dataset is agnostic of progress/location in the source. With respect to the source iterator being read from, all that is really saved is an index into the dataset. We currently naively call next on the iterator to get back to whatever index was saved. Leaving a note here to revisit this later, as it might have unforeseen consequences at scale.
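The naive fast-forward described above amounts to the standard itertools "consume" recipe (resume_iterator is a hypothetical helper, not the actual implementation):

```python
from itertools import islice

def resume_iterator(it, saved_index: int):
    # Fast-forward by consuming and discarding `saved_index` items.
    # At scale this re-reads everything up to the saved position,
    # which is the cost the note above is worried about.
    next(islice(it, saved_index, saved_index), None)
    return it
```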

Ending data stream to user

  • Need to add ability to remove users with the API

  • Add something to API so that when pipeline is exhausted, it automatically removes all users

  • Need to update user view once they've been removed
    -> In the simplest case, we add an "exit" flag to task, which the manager can set on a task
    -> Generate a task with no data and an exit flag; the client sees this and switches to a "Thanks for helping!" screen of some sort

  • In some cases, the client might still have data being shown to them even though they've already been removed.
    -> This will require them to submit something
    -> Posting the submission will result in an error; the client manager needs to take the submission, dump it, then give back the exit flag
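The exit-flag idea could be sketched as follows (the Task fields and handler are illustrative, not the actual CHEESE task schema):

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class Task:
    data: Any = None
    exit: bool = False

def make_exit_task() -> Task:
    # A task with no data and the exit flag set; the client switches to
    # a "Thanks for helping!" screen when it receives one.
    return Task(data=None, exit=True)

def handle_task(task: Task) -> str:
    # Client-side: show the farewell screen on exit, otherwise label.
    return "Thanks for helping!" if task.exit else "label"
```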

Issues with Running Examples

Describe the bug

There are several issues with trying to run the examples from scratch (on an M1 Mac). I would submit a PR to solve these but I probably don't have time at the moment.

To Reproduce

Try to follow the existing README on an M1 Mac.

Issues and Solutions

RabbitMQ won't run correctly using Conda

Solution

Add a comment to the README mentioning that you can also run RabbitMQ with Docker:
docker run -it --rm --name rabbitmq -p 5672:5672 -p 15672:15672 rabbitmq:3.11-management
Taken from https://www.rabbitmq.com/download.html

It is unexpected for the user to be prompted for a password when running the example, and there is no explanation of how to proceed

Solution

Initially, add a comment to the README explaining that the username and password are printed on the command line.
Long term, likely don't use public by default for the examples, OR provide a way to disable it via a command-line argument or environment variable.

The instruct_hf_pipeline.py doesn't work out of the box because the dependencies are not in the requirements.txt file

Solution

Either add the dependencies to requirements.txt, add them to the README, or print a more helpful message when running that example.

`instruct_hf_pipeline` example returns rankings of `None`

Describe the bug
Printing the results in extract_data() here, the rankings are None. The rankings are also missing from the produced rankings_dataset file.

To Reproduce
Steps to reproduce the behavior:

  1. Run default instruct_hf_pipeline, e.g. python -m examples.instruct_hf_pipeline

Expected behavior
rankings should be a list of ints, corresponding to the human labeler's decisions.

Example buggy result (printing inside extract_data()):

{
  "query": "hat has inspired you to become a speaker? How important is your own English knowledge base to you",
  "completions": [
    "? So, how is a new speaker's grammar an essential tool in how you plan to speak?\n\nLangston is a student who writes English for all students, and so it is all about teaching the new speaker to think out loud. That is what he started doing two years ago when he learnt that his grammar was going to be different from that of their world renowned schoolwork teacher.\n",
    "?\n\nMe, and I am a fluent speaker. It is a privilege to be a speaker. Having some knowledge of English is important because while I get compliments for reading so many books I am already on a conversational train and I often find myself saying things I do not like in English. In order to achieve my own speech perfection, it can be hard to get my English to speak in English",
    "?\n\nYes, I speak more and I also like to communicate with other people in the community as a whole. I learned so much as a teenager living in Japan that I really don't understand what my Japanese does. But when I meet people, I just try and say English because I like speaking Japanese, I like seeing them on TV. I love hearing their opinions, even though they are ignorant",
    "?\n\nThe most important thing is learning to speak, even if it doesn't mean that much. The second person to do is to look at the problem and do something with it. The first person will look for something, and if possible, look at where it started. If you learn how to look for the first person in a sentence you can use the search function to find them, and if",
    "?\n\nThis question has been asked and answered with very clear answers as to simply what the speakers speak English. One can use this knowledge as a basis for a wide variety of topics, as in reading the book for five minutes, or as part of the job posting. For those who only need a few hours of reading a book a week, here is a brief discussion of a topic within English at"
  ],
  "rankings": null
}

The LMGenerationElement does not report an error, but is missing the rankings.

LMGenerationElement(client_id=1, trip=1, trip_start='client', trip_max=1, error=False, start_time=1674231730.971844, end_time=1674231740.1409464, query='Write a quote on the floor', 
completions=[...omit for brevity...], rankings=None)

If I solve the issue I'll comment here. Thx.

"Getting started" guide not working (no client id/password provided)

I'm following https://cheese1.readthedocs.io/en/latest/started.html, but when I run the python command (related: #47) I get a URL but no client id/password:

python -m examples.image_selection
100%|██████ 1/1 [00:04<00:00,  4.97s/it]
~/miniconda3/envs/cheese/lib/python3.11/site-packages/gradio/components.py:206: UserWarning: 'rounded' styling is no longer supported. To round adjacent components together, place them in a Column(variant='box').
  warnings.warn(
~/miniconda3/envs/cheese/lib/python3.11/site-packages/gradio/components.py:224: UserWarning: 'border' styling is no longer supported. To place adjacent components in a shared border, place them in a Column(variant='box').
  warnings.warn(
Running on local URL:  http://127.0.0.1:7860

Desktop:

  • OS: Ubuntu
  • Browser: Firefox

Many thanks for any help! :)
