
EcoLogits

🌱 EcoLogits tracks the energy consumption and environmental impacts of using generative AI models through APIs.

Documentation: ecologits.ai

⚙️ Installation

pip install ecologits

For integration with a specific provider, use pip install ecologits[openai]. We currently support the following providers: anthropic, cohere, google-generativeai, huggingface-hub, mistralai, and openai. See the full list of providers.

🚀 Usage

from ecologits import EcoLogits
from openai import OpenAI

# Initialize EcoLogits
EcoLogits.init()

client = OpenAI(api_key="<OPENAI_API_KEY>")

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "Tell me a funny joke!"}
    ]
)

# Get estimated environmental impacts of the inference
print(f"Energy consumption: {response.impacts.energy.value} kWh")
print(f"GHG emissions: {response.impacts.gwp.value} kgCO2eq")

See the full package documentation on ecologits.ai.

💪 Contributing

To get started with setting up a development environment and making a contribution to EcoLogits, see Contributing to EcoLogits.

⚖️ License

This project is licensed under the terms of the Mozilla Public License Version 2.0 (MPL-2.0).

EcoLogits Issues

Energy consumption of using OpenAI

Description

Enable energy consumption estimation when using the chat completions API with the Python SDK.

Solution

Create a wrapper around the OpenAI client that adds the estimated energy consumption to the response when calling the chat completions API.

from genai_impact import OpenAI

client = OpenAI()

response = client.chat.completions.create(
	model="gpt-3.5-turbo",
	messages=[
		{"role": "user", "content": "Hello World!"}
	]
)

print(response.impacts) 	# Outputs an impact object containing the estimated energy consumption of the query.

Considerations

Model size

The parameter count of OpenAI models is unknown, so we will need to guesstimate it in the methodology working group. For now, I propose the following convention:

| Model name | Parameter count | Additional information |
|---|---|---|
| gpt-3.5-turbo | ~20B | Potentially leaked in a paper; similar performance to mistral-small. |
| gpt-4-turbo | ~70B | Similar performance to mistral-medium, a previous version of which was leaked. |
| gpt-4 | ~200B | ? |
| gpt-4-vision | ? | ? |

Other variations of these models are considered equal in terms of the number of parameters.

Energy estimation

Based on the methodology v0, we can estimate the energy consumption of using an LLM with the following formula:

$$ Energy(model\_size, output\_tokens) = A * model\_size * output\_tokens $$

With:

  • $A = 1.17 \times 10^{-4}\ \mathrm{Wh}$, a constant;
  • $model\_size$, the number of model parameters, counted in billions;
  • $output\_tokens$, the number of tokens generated.
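
As a quick numeric check, here is a minimal sketch of this formula in Python (using the ~20B guesstimate for gpt-3.5-turbo from the table above):

A_WH = 1.17e-4  # Wh per (billion parameters x output token)

def estimate_energy_wh(model_size_b: float, output_tokens: int) -> float:
    """Energy in Wh following the methodology v0 formula."""
    return A_WH * model_size_b * output_tokens

# e.g. gpt-3.5-turbo (~20B parameters) generating 200 tokens
print(estimate_energy_wh(20, 200))  # ~0.468 Wh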

Enhance model repository

Description

The following improvements can be considered:

  • Automatically synchronize the local file with the remote one (currently a GitHub URL)
  • Have the model repository follow its own release cycle
  • Handle aliases instead of duplicating models
  • Support dynamic fields based on the model type

Solution

A separate repository that builds the file, which is then injected into the client build?

Considerations

Any input is welcome.

Additional context

See more context here genai-impact/ecologits.js#4

Implement methodology v1

Description

Implement the new release of the methodology, which includes multiple impact criteria (GWP, ADPE, PE) and multiple phases (usage + embodied).

Methodology v1 is on Notion.

Solution

Update the compute_llm_impacts function.

Support regional energy mixes

For now, we only use the world mix to compute GWP, ADPE and PE:

IF_ELECTRICITY_MIX_GWP = 5.90478e-1     # kgCO2eq / kWh (World)
IF_ELECTRICITY_MIX_ADPE = 7.37708e-8    # kgSbeq / kWh (World)
IF_ELECTRICITY_MIX_PE = 9.988           # MJ / kWh (World)

But we know that some providers only have servers in a specific region (e.g. OpenAI in the US).

We could add a column in data/models.csv indicating a zone (country, continent, ...), and then query another .csv (or query some database in another way) to ask for an average energy mix of the zone, or keep the World mix if unspecified.
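
A minimal sketch of what that lookup could look like, assuming a hypothetical data/electricity_mixes.csv with zone, gwp, adpe and pe columns (not the actual EcoLogits data layout):

import csv

# World mix impact factors, used as the fallback
WORLD_MIX = {"gwp": 5.90478e-1, "adpe": 7.37708e-8, "pe": 9.988}

def load_electricity_mix(zone: str, path: str = "data/electricity_mixes.csv") -> dict:
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            if row["zone"] == zone:
                return {
                    "gwp": float(row["gwp"]),    # kgCO2eq / kWh
                    "adpe": float(row["adpe"]),  # kgSbeq / kWh
                    "pe": float(row["pe"]),      # MJ / kWh
                }
    return WORLD_MIX  # keep the World mix if the zone is unspecified or unknown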

AzureOpenAI

Does this project work with AzureOpenAI?

Thanks

Units

We should screen the documentation to make sure that all units are specified. For example, I couldn't find the units of the latency.

Open discussion on licenses

Description

As stated in the last weekly meeting, let's take the opportunity of the GitHub organization migration to review the license we use. Currently, we use the default from Data For Good, which is MIT, a very permissive license.

The main issue I have with a very permissive license like MIT is that the distribution or modification of the library can be done without using the same license.

Example: a third party using a modified version of EcoLogits has no obligation to distribute and open source the modified version.

This has at least two consequences:

  • The third party is not encouraged (forced) to contribute to the open-source project; thus modifications are not available for everyone to enjoy.
  • The third party can use a degraded version of the library without having to specify it. For instance, the impact calculation could be altered and lead to wrong results, without end users knowing that the original EcoLogits library is not being run. This reduces the transparency of the methodology and impact reporting when used in other projects.

Solution

A solution to this issue is to use a copyleft license that can force an individual or a company to redistribute the modifications that have been made, for instance. Here is a list of the well-known copyleft licenses that we could use, with their pros and cons.

LGPL 3.0

This license is mainly applied to libraries. You may copy, distribute and modify the software provided that modifications are described and licensed for free under LGPL. Derivative works (including modifications or anything statically linked to the library) can only be redistributed under LGPL, but applications that use the library don't have to be.

Sources: tldrlegal.com, fossa.com

Well-known license that is especially designed for libraries. In theory, anyone can install and use the library in any project (including proprietary and commercial ones) without releasing the project under the same license. One caveat is that having dependencies with GPL-like code is usually a red flag for some people and companies, because GPLv3 or AGPLv3 licenses force you to redistribute your code under a GPLv3-compatible license (even if you use one small function in an enormous stack).

Another implication of this license is that the end user must be able to replace the library with another one. For instance, a company that uses EcoLogits in its dashboard should give the user the option to swap in an equivalent library for the impact computation. This can be a burden, and it is probably another reason why LGPLv3 is not widely liked.

MPL 2.0

MPL is a copyleft license that is easy to comply with. You must make the source code for any of your changes available under MPL, but you can combine the MPL software with proprietary code, as long as you keep the MPL code in separate files. Version 2.0 is, by default, compatible with LGPL and GPL version 2 or greater. You can distribute binaries under a proprietary license, as long as you make the source available under MPL.

Sources: tldrlegal.com, fossa.com

Less-known but actively used license created by Mozilla. The main difference I see between MPL 2.0 and LGPL 3.0 is that MPL is completely permissive when you only use the library. This holds as long as the library stays in its own separate files, which is always the case in Python if you "pip install" the library; I think the separate-file requirement mainly matters for low-level programming languages. If you modify the library, however, you are required to make the modifications available under the same license.

Other licenses

There are of course other licenses available, copyleft or not; I have only listed the two I am considering for this project. Note that it is possible to change the license again in the future if required. Moving to a more restrictive license is generally poorly regarded, but moving to a more permissive one is acceptable or even encouraged.

Other considerations

Another issue I have mentioned is the commercial exploitation of the library with little or no added value. This is a classic issue with open-source projects, and two solutions are possible:

  • Use a very restrictive license that makes the software usable only in other open-source projects; this can be an issue if we want to address organizations that produce proprietary code (which we do).
  • Use dual licensing: 1. a very restrictive open-source license and 2. a business license sold to companies. This is not very doable as a non-profit (or not easily), and it is usually seen less as an open-source practice than as bait toward paid software.

So we will not try to address this issue here with a license; we need to find other, innovative ways to fund the project and encourage companies to contribute.

Other resources

As you @LucBERTON @aqwvinh @adrienbanse @AndreaLeylavergne have contributed to the project (made a commit or created a file), you are affected by this change, and I would be glad to hear your opinion on it.

A new name for gen AI impact package

Description

We need to find a catchy, easy-to-remember package name :)

Solution

List and discuss package name ideas.

Considerations

Should be easy to remember.
Should be in line with Data For Good values and ethical standards.

Additional context

N/A

Add Amazon Bedrock provider

Description

Add Amazon Bedrock provider.

Solution

Amazon Bedrock uses its own API and Python package.

Python package: boto3

Amazon Bedrock:

Code example:

import boto3
import json
brt = boto3.client(service_name='bedrock-runtime')

body = json.dumps({
    "prompt": "\n\nHuman: explain black holes to 8th graders\n\nAssistant:",
    "max_tokens_to_sample": 300,
    "temperature": 0.1,
    "top_p": 0.9,
})

modelId = 'anthropic.claude-v2'
accept = 'application/json'
contentType = 'application/json'

response = brt.invoke_model(body=body, modelId=modelId, accept=accept, contentType=contentType)

response_body = json.loads(response.get('body').read())

# text
print(response_body.get('completion'))

Add Perplexity provider

Description

Add perplexity.ai LLM provider.

Solution

Perplexity uses the same API as OpenAI, meaning the OpenAI Python client is compatible with their service; it only requires changing the API endpoint.

Client example: https://docs.perplexity.ai/docs/getting-started
Supported models: https://docs.perplexity.ai/docs/model-cards

We need to identify when another provider is used through the OpenAI client, and also support and register the models that Perplexity provides through its API.
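
A hedged sketch of how the provider could be identified from the OpenAI client configuration; the base_url attribute matches the openai Python client, while the mapping itself is illustrative:

from openai import OpenAI

def detect_provider(client: OpenAI) -> str:
    host = str(client.base_url)
    if "perplexity.ai" in host:
        return "perplexity"
    return "openai"

client = OpenAI(base_url="https://api.perplexity.ai", api_key="<PPLX_API_KEY>")
print(detect_provider(client))  # "perplexity"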

Add Google generative AI models

Description

Add Google generative AI models, e.g. Gemini.

Solution

Google has its own Python package for its generative AI models.

Python package: google-generativeai

Documentation:

Code example:

import google.generativeai as genai
genai.configure(api_key=GOOGLE_API_KEY)

# Print models
for m in genai.list_models():
  if 'generateContent' in m.supported_generation_methods:
    print(m.name)

model = genai.GenerativeModel('gemini-pro')

# Non-streamed response
response = model.generate_content("What is the meaning of life?")

# Streamed response
streamed_response = model.generate_content("What is the meaning of life?", stream=True)
for chunk in streamed_response:
  print(chunk.text)
  print("_"*80)

Configuration system

Description

Enable the user to configure the EcoLogits library. An example is changing the impact factors of the electricity mix to match the datacenter's country.

Solution

Expose some options in the init method of the EcoLogits object, as sketched below.
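
For illustration, a hypothetical configurable init could look like this; the parameter names are assumptions, not the final API:

from ecologits import EcoLogits

EcoLogits.init(
    providers=["openai"],        # only instrument the providers actually used
    electricity_mix_zone="FRA",  # override the default World electricity mix
)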

Considerations

Start simple.

Implement warnings

Description

Display warnings in the impacts section of each response when needed. These warnings can inform the user about the quality of the estimation for closed-source models, for instance.

Solution

Some warnings are already defined in the models.csv file.
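
A hypothetical sketch of how a warning could be surfaced alongside the impacts; the class and field names are illustrative, not the final EcoLogits API:

from dataclasses import dataclass

@dataclass
class ImpactWarning:
    code: str
    message: str

warning = ImpactWarning(
    code="model-arch-not-released",
    message="Model architecture is not disclosed; the parameter count is estimated.",
)
print(f"Warning [{warning.code}]: {warning.message}")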

Recurrent `ModuleNotFoundError`

It's probably very basic, but any idea why I have to run poetry install --all-extras --with dev,docs every time I make a change in the code to be able to run my __main__.py script? Otherwise I very often get ModuleNotFoundError: No module named 'genai_impact'.

OpenAI cassette bug

Bug description
The OpenAI test fails even with the cassette; I get the following error: "FAILED tests/test_openai.py::test_openai_chat - openai.OpenAIError: The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable"

Am I doing something wrong?

To Reproduce
Update deps: poetry install --all-extras --with dev,docs
Run tests: poetry run pytest

System information
OS: macOS Ventura 13.5.2

Bug -- Google instrumentor initialisation fails because of other Google dependencies

I use EcoLogits in a project that does not use Google's generative AI, but one of my other dependencies is google-auth, so my environment contains a google module without the generative AI part. EcoLogits tries to initialise the Google instrumentor and (rightly) fails. I did a few tests checking for the specific generative Google module (google-generativeai) instead, modifying this:

def init_google_instrumentor() -> None:
    if importlib.util.find_spec("google") is not None:
        from ecologits.tracers.google_tracer import GoogleInstrumentor

to:

def init_google_instrumentor() -> None:
    if importlib.util.find_spec("google-generativeai") is not None:
        from ecologits.tracers.google_tracer import GoogleInstrumentor

This fixes the bug, but it may need to be tested to ensure that it doesn't affect the correct functioning of the Google tracer.

Add Cohere provider

Description

Add Cohere LLM provider.

Solution

Cohere has its own API and Python client.

Available models: https://docs.cohere.com/docs/models

Code example:

import cohere

client = cohere.Client('<<apiKey>>')

response = client.chat(
	chat_history=[
		{"role": "USER", "message": "Who discovered gravity?"},
		{"role": "CHATBOT", "message": "The man who is widely credited with discovering gravity is Sir Isaac Newton"}
	],
	message="What year was he born?",
	connectors=[{"id": "web-search"}]
)

print(response)

Bug -- Frantic printing of "Could not find '....' for .... provider"

Describe the bug
Because the library checks, every time it calculates an impact, whether a model exists for a specific provider, EcoLogits frantically prints the message "Could not find model 'xxxxx' for provider xxxx" when the model/provider is not implemented or when the correct tracer is not uniquely instantiated (see reason 2).

The message comes from the llm_impacts function in utils.py:

def llm_impacts(
    provider: str,
    model_name: str,
    output_token_count: int,
    request_latency: float,
) -> Optional[Impacts]:
    """
    High-level function to compute the impacts of an LLM generation request.

    Args:
        provider: Name of the provider.
        model_name: Name of the LLM used.
        output_token_count: Number of generated tokens.
        request_latency: Measured request latency in seconds.

    Returns:
        The impacts of an LLM generation request.
    """
    model = models.find_model(provider=provider, model_name=model_name)
    if model is None:
        # TODO: Replace with proper logging
        print(f"Could not find model `{model_name}` for {provider} provider.")
        return None
    model_active_params = model.active_parameters \
                          or Range(min=model.active_parameters_range[0], max=model.active_parameters_range[1])
    model_total_params = model.total_parameters \
                         or Range(min=model.total_parameters_range[0], max=model.total_parameters_range[1])
    return compute_llm_impacts(
        model_active_parameter_count=model_active_params,
        model_total_parameter_count=model_total_params,
        output_token_count=output_token_count,
        request_latency=request_latency
    )

Reason 1: in an architecture where the user wants to include a model that is not yet implemented in EcoLogits, this message will spam the logs.

Reason 2: this behaviour is also present when using the LiteLLM integration for Mistral AI models and some other providers (e.g. Groq). LiteLLM relies on an OpenAI configuration for these providers, which also instantiates the OpenAI tracer, so for 'mistral/open-mistral-nemo-2407' we get 'Could not find model open-mistral-nemo-2407 for openai provider'. https://github.com/BerriAI/litellm/blob/b376ee71b01e3e8c6453a3dd21421b365aaaf9f8/litellm/llms/openai.py

To Reproduce
Use Mistral models with the LiteLLM tracer (mistral/open-mistral-nemo-2407) or an unimplemented provider and/or model (e.g., groq/llama3-8b-8192). The behaviour is more pronounced in stream mode (the message is printed for each token).

Proposed solution

  • To avoid this interaction between the OpenAI tracer and the LiteLLM tracer, adopting the LiteLLM tracer strategy as the default for finding the provider could be useful: provider=models.find_provider(model_name=model_name).

  • For the message, I suggest either not displaying it when no model is returned, or adding a verbosity parameter when initialising the EcoLogits object (False by default) to give the user the choice of whether to display it; see the sketch below.
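
A minimal sketch of one way to make the message optional, here using the standard logging module so it is silent by default and can be enabled by the user (an alternative to a dedicated verbosity flag):

import logging

logger = logging.getLogger("ecologits")
logger.addHandler(logging.NullHandler())  # silent unless the application configures logging

def warn_unknown_model(model_name: str, provider: str) -> None:
    # would replace the print() call in llm_impacts
    logger.warning("Could not find model `%s` for %s provider.", model_name, provider)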

Model characteristics dataset

Description

To compute the impacts of a query, we need some characteristics of the model that was used. Especially in the case of LLMs, we need the total parameter count.

Solution

A CSV or JSON file to store all known models and metadata like the total parameters count.

Considerations

Proprietary models

In many cases, we don't know the underlying architecture of models, so we will need to guesstimate it (see issue #1 for OpenAI). The estimation can be based on the performance achieved by these models on various leaderboards compared to open-weight models. It is crucial to keep the source of this assessment because it strongly influences the impacts.

Total parameters vs active parameters

In the case of mixture-of-experts models, we can define the active parameter count as the sum of all parameters actually used to run the computation (for example, Mixtral).
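
An illustrative sketch of what one record of such a dataset could hold; the field names are assumptions, not the actual models.csv schema:

from dataclasses import dataclass
from typing import Optional

@dataclass
class ModelRecord:
    provider: str
    name: str
    total_parameters: Optional[float]   # in billions, None if unknown
    active_parameters: Optional[float]  # in billions, for mixture-of-experts models
    source: str                         # where the estimate comes from (paper, leaderboard, ...)

mixtral = ModelRecord(
    provider="mistralai",
    name="open-mixtral-8x7b",
    total_parameters=46.7,
    active_parameters=12.9,
    source="Mistral AI announcement",
)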

Package install with optional dependencies

Description

Allow installing this package with extra/optional dependencies.
The goal is that users install only the dependencies useful for their use case, hence reducing the install size.

Solution

The package could be installed with only some of its dependencies.

e.g. installing the package with the Mistral client but not OpenAI.

https://python-poetry.org/docs/pyproject#extras

poetry install --extras "mysql pgsql"
poetry install -E mysql -E pgsql

Considerations

Should work for pip and poetry package managers.

Additional context

N/A

Automatic model update for Hugging Face

Description

We've recently added support for Hugging Face Inference Endpoints through the huggingface_hub Python package. We now need to reference the models that are available on Hugging Face (model name, number of parameters).

Solution

Possible solutions:

  • Scraping at runtime (possibly not a good idea)
  • Telemetry (update in a GitHub Action triggered by package telemetry?)
    ...

Ping @aqwvinh
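
A hedged sketch of one possible building block: querying the parameter count through huggingface_hub, assuming the model repo publishes safetensors metadata (the safetensors.total field of ModelInfo):

from huggingface_hub import HfApi

def hf_parameter_count(model_id: str) -> float:
    """Return the parameter count in billions, when exposed by the model repo."""
    info = HfApi().model_info(model_id)
    if info.safetensors is None:
        raise ValueError(f"No safetensors metadata available for {model_id}")
    return info.safetensors.total / 1e9

print(hf_parameter_count("mistralai/Mistral-7B-Instruct-v0.2"))  # ~7.2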

This package should work with most LLM clients

Description

This package should work with most LLM clients:

  • OpenAI
  • Mistral AI
  • Anyscale
  • Anthropic
  • transformers
  • Cloud APIs

Solution

The first versions of this package may focus on the OpenAI client only, but future versions should allow all features to be used with any LLM client.

Considerations

N/A

Additional context

N/A

Unit tests

Description

Implement unit tests in this package.

Solution

All features and cases should be explicitly verified through unit testing.

Considerations

As a reference: https://github.com/dataforgoodfr/12_genai_impact/blob/main/tests/test_compute_impacts.py

import pytest

from genai_impact.compute_impacts import compute_llm_impact


@pytest.mark.parametrize('model_size,output_tokens', [
    (130, 1000),
    (7, 150)
])
def test_compute_impacts_is_positive(model_size: float, output_tokens: int) -> None:
    impacts = compute_llm_impact(model_size, output_tokens)
    assert impacts.energy >= 0

Additional context

N/A

Support for stream and async functions

Description

We currently have no support for "advanced" chat completion functions in async and/or streaming mode.

Solution

Maybe we can look at what openllmetry does (again). 😄

Additional context

Examples for OpenAI SDK:

Streaming:

from openai import OpenAI

client = OpenAI()

stream = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Say this is a test"}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")

Async:

import os
import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI(
    # This is the default and can be omitted
    api_key=os.environ.get("OPENAI_API_KEY"),
)

async def main() -> None:
    chat_completion = await client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Say this is a test"}]
    )

asyncio.run(main())

Async + Streaming:

from openai import AsyncOpenAI

client = AsyncOpenAI()

async def main():
    stream = await client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Say this is a test"}],
        stream=True,
    )
    async for chunk in stream:
        print(chunk.choices[0].delta.content or "", end="")

asyncio.run(main())
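
Continuing the synchronous streaming example above, a hedged sketch of how output tokens could be counted while streaming so the impacts can be computed when the stream ends (counting one token per content chunk is an approximation, not the actual EcoLogits implementation):

from openai import OpenAI

client = OpenAI()

stream = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Say this is a test"}],
    stream=True,
)

output_token_count = 0
for chunk in stream:
    delta = chunk.choices[0].delta.content or ""
    if delta:
        output_token_count += 1  # OpenAI usually streams roughly one token per chunk
print(f"\nGenerated ~{output_token_count} tokens")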

Package documentation

Description

Write a clear documentation explaining how to install and use the package.

Solution

  • update the README.md file
  • create a full package documentation

Considerations

Technology and format for the package documentation to be determined, e.g.:

Additional context

N/A

Dependencies are too restrictive

Describe the bug

Hello, I'd like to use EcoLogits in a project using LangChain, but it's impossible because EcoLogits pins packaging to >=24.0,<25.0.

If you do not need a specific version of a package, I think it's better to remain open to all versions.

To Reproduce
Steps to reproduce the behavior:
poetry add ecologits langchain-core

Because no versions of ecologits match >0.1.5,<0.2.0
 and ecologits (0.1.5) depends on packaging (>=24.0,<25.0), ecologits (>=0.1.5,<0.2.0) requires packaging (>=24.0,<25.0).
Because langchain-core (0.1.52) depends on packaging (>=23.2,<24.0)
 and no versions of langchain-core match >0.1.52,<0.2.0, langchain-core (>=0.1.52,<0.2.0) requires packaging (>=23.2,<24.0).
Thus, ecologits (>=0.1.5,<0.2.0) is incompatible with langchain-core (>=0.1.52,<0.2.0).
And because langchain-experimental (0.0.58) depends on langchain-core (>=0.1.52,<0.2.0)
 and no versions of langchain-experimental match >0.0.58,<0.0.59, ecologits (>=0.1.5,<0.2.0) is incompatible with langchain-experimental (>=0.0.58,<0.0.59).
So, because fiscal-qa depends on both langchain-experimental (^0.0.58) and ecologits (^0.1.5), version solving failed.

Decorator mode vs client wrapper

Description

Investigate ways to implement this package through wrappers or decorators.

Solution

Client wrappers

from genai_impact import OpenAI  # drop-in replacement for the openai.OpenAI client

client = OpenAI()

response = client.chat.completions.create(
	model="gpt-3.5-turbo",
	messages=[...]
)

impacts = response.impacts
print(impacts)
# Outputs: Impacts(energy=XX, energy_unit='kWh', gwp=XX ...) 

Patch decorator

from openai import OpenAI
from genai_impact import impact_estimator


@impact_estimator()
def summarize(client: OpenAI, doc: str):
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[...]  # prompt to summarize the document
    )
    return response.choices[0].message.content


if __name__ == "__main__":
    client = OpenAI()
    document = "Lorem ipsum..."
    summarize(client, document)
    # After running this function, the impacts are dumped in a CSV file

Maybe the Patchy project could be useful:
https://pypi.org/project/patchy/

Considerations

N/A

Additional context

N/A

Unit test GitHub workflow

Description

Create a new GitHub workflow to run unit tests.

Solution

Add a new workflow in .github to run tests with pytest.

LLM Inference Methodology: questions on energy required to train models initially

Hi!

The question I have is: do you have plans to integrate an estimation of the foundation models' initial training cost in terms of energy? I think there is some data for models like Llama 3?

Thank you so much for your work. We're using it for a chatbot arena project called LANGU:IA (much like https://chat.lmsys.org/) that compares models to improve text-based LLMs for French and other languages of France (I can send you a private link if you're interested in seeing it in action).

Have a nice day!
