
sqlcommenter's People

Contributors

aabmass, adamegyed, artefactop, balachandr, clamclamyan, dependabot[bot], dolphin1999, geonu, jackton1, kapv89, kostyay, m3m0r7, martin1keogh, modulitos, nilebox, odeke-em, omirho, shmokmt, shubhwip, sjs994, thiyagu55, ymotongpoo


sqlcommenter's Issues

Traces not showing with Go Implementation in Query Insights

Hey
I've implemented this in my code.
I can see that traceparent is added to each query; unfortunately, the traces are not being linked up to my application traces. The Query Insights "end to end" view only shows the query breakdown without any of the application traces.
Perhaps tracespan is missing?
Any ideas how to debug this?

feat(support): support JOOQ for Java/Kotlin/Scala

Hi,

First, I would like to thank you for open sourcing this library and releasing features around SQL performance and analysis, which from my point of view is very important πŸ”₯.

I would like to ask/propose that SQLCommenter support a really famous and important library in the JVM world named JOOQ.
As a Google Developer Expert on Google Cloud and Kotlin, I see a lot of interest in this library for writing better queries and being able to optimize/analyse requests made to our SQL engine.

I know SQLCommenter is still young, but I'm opening this issue so people can add a πŸ‘ if they think it would be a good idea to support this too.

If you are open to PRs, I'll be glad to propose something.

Thanks

Feature request: ASP.NET

Hi,

Can ASP.NET Core be added to the roadmap, please? You already have a pretty darn good SDK for .NET and an awesome integration with ASP.NET.

Support extra generic key/values in django middleware

It would be great to be able to add extra/custom/generic key/value pairs in the Django middleware.

The usage could be something like this:

SQLCOMMENTER_WITH_EXTRA_VALUES = 'my.get_extra_values'
# or
SQLCOMMENTER_WITH_EXTRA_VALUES = my.get_extra_values

def get_extra_values(request):
    # ... using `request`
    return {'my_custom_key': 'my_custom_value'}

I'm willing to work on a PR if the idea is accepted.
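
For illustration, a rough sketch of how the middleware could resolve the configured callable and merge its result into the generated comment (the setting name is the one proposed above; get_extra_comment_values is a hypothetical helper, not an existing API):

from django.conf import settings
from django.utils.module_loading import import_string


def get_extra_comment_values(request):
    # Hypothetical helper: resolves SQLCOMMENTER_WITH_EXTRA_VALUES, which
    # may hold either a dotted path or a callable, and calls it.
    configured = getattr(settings, "SQLCOMMENTER_WITH_EXTRA_VALUES", None)
    if configured is None:
        return {}
    func = import_string(configured) if isinstance(configured, str) else configured
    # The returned dict would be merged into the key/value pairs that the
    # middleware already serializes into the SQL comment.
    return func(request)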

Cannot see any comments in the logs when using Java with Spring Boot and Hibernate

Hi,

I'm experimenting with SQL Commenter for Java (version 2.0.1). We're using Spring Boot 2.7.0 and Hibernate (the version that ships with Spring Boot) and I've followed the instructions here, i.e. I've added:

@Configuration
public class MyConfigurer implements WebMvcConfigurer {

    @Bean
    public SpringSQLCommenterInterceptor sqlInterceptor() {
        return new SpringSQLCommenterInterceptor();
    }

    @Override
    public void addInterceptors(InterceptorRegistry registry) {
        registry.addInterceptor(sqlInterceptor());
    }

}

and added

properties.setProperty("hibernate.session_factory.statement_inspector", SCHibernate.class.getName());

to our Hibernate configuration.

Then I've started MySQL (5.6) in Docker and enabled logging:

 SET GLOBAL general_log = 'ON';

If I run SHOW VARIABLES LIKE "general_log%"; I see this:

+------------------+---------------------------------+
| Variable_name    | Value                           |
+------------------+---------------------------------+
| general_log      | ON                              |
| general_log_file | /var/lib/mysql/fb28ada36c57.log |
+------------------+---------------------------------+

When I log into the container and tail the log I see a lot of SQL queries, but I don't see any of the comments that I expect sqlcommenter to add.

What am I doing wrong?

Feature Request: celery support

In our application, a lot of our database load comes from scheduled Celery tasks. They still use the Django ORM to access the database. Since they come from Celery rather than HTTP requests, however, they don't go through the middleware chain, so the queries aren't getting annotated by the google.cloud.sqlcommenter.django.middleware.SqlCommenter middleware.

Are there any suggestions on how to integrate sqlcommenter with Celery? One possible approach is sketched below.
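
Purely as an illustration (there is no official Celery integration that I know of; add_sql_comment is a hypothetical stand-in for sqlcommenter's real query wrapper), one approach is to run each task body inside Django's execute_wrapper, mirroring what the middleware does for HTTP requests:

import celery
from django.db import connection


def add_sql_comment(execute, sql, params, many, context):
    # Placeholder wrapper: the real comment-building logic lives in
    # sqlcommenter; here we just append a fixed comment for illustration.
    return execute(sql + " /*framework='celery'*/", params, many, context)


class SqlCommenterTask(celery.Task):
    def __call__(self, *args, **kwargs):
        # Every query issued while the task body runs goes through the wrapper.
        with connection.execute_wrapper(add_sql_comment):
            return super().__call__(*args, **kwargs)


app = celery.Celery("myapp", task_cls=SqlCommenterTask)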

Spec unclear regarding malformed sql comments

@adamegyed @aabmass While reading through the sqlcommenter specification, specifically the Comment Escaping section, I noticed that it states:

If a comment already exists within a SQL statement, we MUST NOT mutate that statement.

The above statement is clear for scenarios where we have a properly formed comment, such as:

SELECT * from FOO -- this is the comment
SELECT * from FOO /* this is the comment */

But the spec does not state what happens when we have malformed comments, such as:

SELECT * from FOO /* this is the comment
SELECT * from FOO *//* this is the comment 

Looking at the implementation of sqlcommenter in the provided languages, there seem to be discrepancies. Please do correct me if that is wrong.

Java

  private Boolean hasSQLComment(String stmt) {
    if (stmt == null || stmt.isEmpty()) {
      return false;
    }

    // Perhaps we now have the closing comment or not but that doesn't matter
    // as the SQL should be reported as having an opening comment regardless
    // of it is in properly escaped or not.
    return stmt.contains("--") || stmt.contains("/*");
  }

In this implementation, as soon as we see an opening comment tag, we return the SQL statement as it is.

References:

  1. Test cases

NodeJS

exports.hasComment = (sql) => {

    if (!sql)
        return false;

    // See https://docs.oracle.com/cd/B12037_01/server.101/b10759/sql_elements006.htm
    // for how to detect comments.
    const indexOpeningDashDashComment = sql.indexOf('--');
    if (indexOpeningDashDashComment >= 0)
        return true;

    const indexOpeningSlashComment = sql.indexOf('/*');
    if (indexOpeningSlashComment < 0)
        return false;

    // Check if it is a well formed comment.
    const indexClosingSlashComment = sql.indexOf('*/');
    
    /* c8 ignore next */
    return indexOpeningSlashComment < indexClosingSlashComment;
}

In this implementation we return false if there is a malformed comment, which is the opposite of what the Java implementation does. That means we do append the comment created by the library to the SQL statement.

References:

  1. Test cases

Python and Ruby

I was unable to find the place where we check whether the SQL statement already has a comment. Does this mean that in these languages we append the comment created by the library regardless of whether a comment is already present in the statement? I only had a cursory look at the libraries, so please correct me if I'm wrong.
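
For reference, a sketch of what a Python check mirroring the Node.js logic above could look like (hypothetical; as noted, I could not find such a check in the Python implementation):

def has_comment(sql):
    # Hypothetical Python equivalent of the Node.js hasComment above:
    # '--' always counts as a comment, while '/*' counts only when a
    # matching '*/' appears later in the statement.
    if not sql:
        return False
    if "--" in sql:
        return True
    opening = sql.find("/*")
    if opening < 0:
        return False
    return opening < sql.find("*/")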


Keeping the above in mind, the question is:

When a SQL statement contains a malformed comment, do we leave the statement as it is, or do we append our comment to it?

FastAPI middleware breaks openapi endpoint

Hello

The FastAPI SQLCommenterMiddleware currently breaks FastAPI's openapi feature (see the log message at the end).
This is because that route uses a starlette.routing.Route (FastAPI is built on top of Starlette), which does not carry the "route" info used in the code.

I think the easiest fix would be to return child_scope.get("route") instead, so that if we find a matching child_scope that lacks the info we're looking for, the middleware simply returns no route info rather than crashing.

Happy to create a merge request if this solution is OK with you.
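
A sketch of what the change could look like, assuming the current function ends with return child_scope["route"] as the traceback below shows; the surrounding loop structure here is an illustration, not the actual source:

from starlette.routing import Match


def _get_fastapi_route(fastapi_app, scope):
    for route in fastapi_app.router.routes:
        match, child_scope = route.matches(scope)
        if match != Match.NONE:
            # Return None for routes such as /openapi.json and /docs that
            # carry no "route" entry, instead of raising KeyError.
            return child_scope.get("route")
    return None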


Possible implementation of a currently failing test:

# in tests/fastapi/test.py
def test_get_docs_does_not_throw(client):  
    resp = client.get(app.docs_url)        
    assert resp.status_code == 200         

{
  "severity": "ERROR",
  "name": "uvicorn.error",
  "message": "Exception in ASGI application\n",
  "module": "httptools_impl",
  "process": 1626484,
  "processName": "MainProcess",
  "pathname": "<..>/lib/python3.8/site-packages/uvicorn/protocols/http/httptools_impl.py",
  "lineno": 378,
  "labels": {
    "app_name": "profile_api",
    "version": "1.14.0+query_insights.1"
  },
  "serviceContext": {
    "service": "profile_api",
    "version": "1.14.0+query_insights.1"
  },
  "stack_trace": "Traceback (most recent call last):	
    File \"<..>/lib/python3.8/site-packages/uvicorn/protocols/http/httptools_impl.py\", line 375, in run_asgi	
      result = await app(self.scope, self.receive, self.send)	
    File \"<..>/lib/python3.8/site-packages/uvicorn/middleware/proxy_headers.py\", line 75, in __call__	
      return await self.app(scope, receive, send)	
    File \"<..>/lib/python3.8/site-packages/uvicorn/middleware/message_logger.py\", line 82, in __call__	
      raise exc from None	
    File \"<..>/lib/python3.8/site-packages/uvicorn/middleware/message_logger.py\", line 78, in __call__	
      await self.app(scope, inner_receive, inner_send)	
    File \"<..>/lib/python3.8/site-packages/fastapi/applications.py\", line 269, in __call__	
      await super().__call__(scope, receive, send)	
    File \"<..>/lib/python3.8/site-packages/starlette/applications.py\", line 124, in __call__	
      await self.middleware_stack(scope, receive, send)	
    File \"<..>/lib/python3.8/site-packages/starlette/middleware/errors.py\", line 184, in __call__	
      raise exc	
    File \"<..>/lib/python3.8/site-packages/starlette/middleware/errors.py\", line 162, in __call__	
      await self.app(scope, receive, _send)	
    File \"<..>/lib/python3.8/site-packages/opentelemetry/instrumentation/asgi/__init__.py\", line 459, in __call__	
      await self.app(scope, otel_receive, otel_send)	
    File \"<..>/lib/python3.8/site-packages/google/cloud/sqlcommenter/fastapi.py\", line 69, in __call__	
      info = _get_fastapi_info(fastapi_app, scope)	
    File \"<..>/lib/python3.8/site-packages/google/cloud/sqlcommenter/fastapi.py\", line 84, in _get_fastapi_info	
      route = _get_fastapi_route(fastapi_app, scope)	
    File \"<..>/lib/python3.8/site-packages/google/cloud/sqlcommenter/fastapi.py\", line 98, in _get_fastapi_route	
      return child_scope[\"route\"]	
        KeyError: 'route'",
  "timestamp": "2022-05-27T08:23:55.549322+00:00"
}

Feature request: spec support for using bound parameters as attribute values

Would it be feasible to support attribute values being specified via a query's parameter bindings?

Idea

I'd like to support prepared statements like the following (and stay within the sqlcommenter spec):

LOG:  execute my-demo-stmt: SELECT * FROM FOO WHERE ID = $1 /*application='app-1', traceparent=$2*/
DETAIL:  parameters: $1 = 1234, $2 = '00-5bd66ef5095369c7b0d1f8f4bd33716a-c532cb4098ac3dd2-01'

This would necessitate an extension to the sqlcommenter spec, something like:

FIELD      ::=  <SERIALIZED_KEY>=<VALUE>
VALUE      ::=  <SERIALIZED_VALUE> | <PARAM_NUM>
PARAM_NUM  ::=  $[0-9]+

Motivation

My objective is to enable the performance benefits of prepared statements, whilst still being able to use a unique per-query traceparent.

In Postgres, it is technically possible to prepare and execute a statement like the example above. (The $2 parameter is simply unused by the query planner, but still appears in the statement log.) I am exploring adding this to DataDog/dd-trace-go's sqlcommenter implementation.

Would this feature in the spec be generally useful?

Express middleware can write the wrong route

This affects both sequelize and knex, as they use the same mechanism. The middleware sets the current request on the global Knex object:

Knex.__middleware__ = true;
Knex.__req__ = req;

And then uses this request in the wrapped query function to get route information

const req = Knex.__req__;
comments['route'] = req.path;

However, if the middleware runs again for a different concurrent request to a different route before the actual handler does the querying, it will overwrite Knex.__req__ with the new request. This can happen if the handler yields the event loop before it queries.

E.g. a request comes in for /foo and, while it awaits something(), a request comes in for /bar, causing the middleware to overwrite __req__. Then /foo runs knex.select() and the query is annotated with the wrong request:

app.get('/foo', async (req, res) => {
  await something();
  const records = await knex.select(...);
  res.json(records);
});

app.get('/bar', async (req, res) => {
  const records = await knex.select(...);
  res.json(records);
});

django middleware only applies to default database connections, ignoring other databases

Currently, only the queries on the default database are instrumented by sqlcommenter.

The Django middleware uses django.db.connection, but when using multiple databases in Django, django.db.connections (with an extra s) should be iterated over (possibly re-implementing execute_wrapper to handle all connections in one wrapper). The current implementation:

def __call__(self, request):
    with connection.execute_wrapper(QueryWrapper(request)):
        return self.get_response(request)
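
A minimal sketch of what iterating over all configured connections could look like (an illustration only, not the current implementation; QueryWrapper stands in for sqlcommenter's existing wrapper class):

from contextlib import ExitStack

from django.db import connections


def __call__(self, request):
    # Wrap every configured database connection, not just the default one.
    with ExitStack() as stack:
        for alias in connections:
            stack.enter_context(
                connections[alias].execute_wrapper(QueryWrapper(request))
            )
        return self.get_response(request)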

Go: Incorrect specification implementation causes traceparent not to work properly

So I've been investigating the bug I'm having in this issue: #211

I was checking the way the query is sent to the database and realized that the current Go implementation doesn't follow the specification.
The current implementation adds the comment like this:
/*action=ListUsers,route=router,traceparent=00-ef43f0b3e8c3bf30186961dXXXX-YYYY-01*/

However, the specification states that all values must be single quoted.

So I added single quotes to all the values that are being sent to the server and suddenly everything started working, so now I'm sending this instead (notice the single quotes on each parameter):
/*action='ListUsers',route='router',traceparent='00-ef43f0b3e8c3bf30186961dXXXX-YYYY-01'*/

This worked and solved the issue.

Based on the nodejs implementation it appears that all values need to be single quoted https://github.com/google/sqlcommenter/blob/master/nodejs/sqlcommenter-nodejs/packages/sqlcommenter-knex/index.js#L95

Attaching the PR to fix it below

Allow affixing SQLCommenter comment if comment already exists in statement

There are certain comments that are necessary to include in a SQL statement. The use case I have in mind is Postgres hints.

  /*+
    HashJoin(a b)
    SeqScan(a)
  */
 EXPLAIN SELECT *
    FROM pgbench_branches b
    JOIN pgbench_accounts a ON b.bid = a.bid
   ORDER BY a.aid;

The spec states:

If a comment already exists within a SQL statement, we MUST NOT mutate that statement.

That means we can never use SQLCommenter with pg hint comments.

Proposal: modify the spec to optionally allow affixing a comment. Alternatively, narrow the spec to "if a comment already exists as the last token in a SQL statement". When parsing, choose the last comment if there's more than one.

Update otel packages for sqlcommenter-sequelize

Several otel packages need to be updated:
- "@opentelemetry/api": "^1.3.0",
- "@opentelemetry/core": "^1.8.0",
- "@opentelemetry/sdk-trace-node": "^1.9.0"

Notes:

  • @opentelemetry/node was renamed to @opentelemetry/sdk-trace-node in v0.24.0
  • HttptraceContextPropagator was renamed to W3CTraceContextPropagator in @opentelemetry/core in v0.26.0

Node.js libraries require @opencensus/propagation-tracecontext but it is not in package.json

index.js (for both knex and sequelize integration) requires util.js which requires @opencensus/propagation-tracecontext.

const {TraceContextFormat} = require('@opencensus/propagation-tracecontext');

However, this dep is only in devDependencies atm. OpenCensus or OpenTelemetry are providers for tracestate/traceparent, so the dependency should be optional and only crash if the user passes tracestate/traceparent in their config.

Repro:

$ npm init -y
$ npm install @google-cloud/sqlcommenter-knex
$ node -i "require('@google-cloud/sqlcommenter-knex')"
internal/modules/cjs/loader.js:968
  throw err;
  ^

Error: Cannot find module '/usr/local/google/home/aaronabbott/repo/sqlcommenter/nodejs/sqlcommenter-nodejs/samples/express-opentelemetry/require('@google-cloud/sqlcommenter-knex')'
    at Function.Module._resolveFilename (internal/modules/cjs/loader.js:965:15)
    at Function.Module._load (internal/modules/cjs/loader.js:841:27)
    at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:71:12)
    at internal/main/run_main_module.js:17:47 {
  code: 'MODULE_NOT_FOUND',
  requireStack: []
}

Allow user to configure the location of comment

It is quite common for a query to be so long that the end of it gets trimmed in the logs.
If we allow users to configure where the comment is inserted, that will make the library better.

URL encoding ambiguities

I noticed a discrepancy in how we go about URL encoding.

  β€’ The Python and PHP implementations escape the URL encoding with an extra %. This is mentioned nowhere in the specification.
  β€’ The Go and JavaScript implementations do a simple URL encode.
  β€’ The Java implementation also does not add the prefix, but seems to double URL encode and to reverse the order.

This results in three different results for the same traceparent example value of congo=t61rcWkgMzE,rojo=00f067aa0ba902b7:

Language      URL encoded traceparent
Python, PHP   congo%%3Dt61rcWkgMzE%%2Crojo%%3D00f067aa0ba902b7
Go, JS        congo%3Dt61rcWkgMzE%2Crojo%3D00f067aa0ba902b7
Java          rojo%253D00f067aa0ba902b7%2Ccongo%253Dt61rcWkgMzE

Consulting the spec to look for answers, but also creates more questions:

Meta characters such as ' should be escaped with a slash .

The way this is written implies multiple characters, but only one is given?

  1. URL encode the value e.g. given /param first, that SHOULD become %2Fparam%20first
  2. Escape meta-characters within the raw value; a single quote ' becomes \'

This seems to make the Go/JavaScript implementations correct. What threw me off initially is that my language of choice (Ruby) already encodes ' as %27, so the second point seemed superfluous. Checking the other implementations, I notice that JavaScript's encodeURIComponent does not do the same as Ruby and leaves the ' intact... but that also begs the question of why this workaround exists instead of URL encoding the quote as well.
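
For reference, a quick Python check (purely an illustration, not part of any implementation) of what a plain URL encode produces for the example value; this matches the Go/JS output above:

from urllib.parse import quote

value = "congo=t61rcWkgMzE,rojo=00f067aa0ba902b7"
print(quote(value, safe=""))
# -> congo%3Dt61rcWkgMzE%2Crojo%3D00f067aa0ba902b7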

What I'd like to see:

  • A list of known URL encoding edge cases and their correct encoding form(s). There is already some listed in https://google.github.io/sqlcommenter/spec/#key-value-format but it contains a copy-paste error on the second row. I think the traceparent example and something like foo'bar are useful additions.
  • The unit tests and readme files for implementations in this repository all using at least this list of cases.

Re-Review PHP integration test for any bugs related corner cases.

Did a quick review of the PHP sqlcommenter integration test (#119) and approved it (to make it possible to run checks before anyone commits code that breaks the library).

But to close this bug, we need to:

  • Do a thorough review again and fix any issues due to that.
  • Verify issues with git workflows added here.

nodejs: allow specifying `application` as part of the `wrap` operation

Currently all of the NodeJS sqlcommenter packages restrict the allowed comment entries to the DB driver, trace info, and the http route being served. I also want to specify the entry application so that I can tell where the query is coming from (especially in contexts that aren't serving http traffic!)

Are the allowed keys being restricted on purpose? If so, would you accept a PR that allowed a static value for application to be specified at initialization time?

SQLCommenter PHP/Laravel package broken because of unpublished OpenTelemetry version

The Laravel/PHP package relies on version 0.9.0 of the OpenTelemetry package. Unfortunately, OpenTelemetry decided to unpublish all versions of the package prior to the 1.0.0 beta. Running composer update in any project with this package installed will fail until the Laravel package is updated to support newer versions of OpenTelemetry.

knex integration does not work with OpenTelemetry (possible OpenCensus) for node.js <14.5.0

OpenTelemetry relies on node.js async hooks context in async calls. More details here:

This package provides async-hooks based context manager which is used internally by OpenTelemetry plugins to propagate specific context between function calls and async operations. It only targets NodeJS since async-hooks is only available there.

Unfortunately, this doesn't work with the "thenables" that knex.js uses (see open-telemetry/opentelemetry-js#940, knex/knex#3550) for node versions <14.5.0 (see nodejs/node#22360).

Thankfully, there is an existing plugin that has solved this context issue for knex, which sqlcommenter-knex can base a solution on: https://github.com/myrotvorets/opentelemetry-plugin-knex/blob/bd5f2d918eeaa86c257f5e433772ec57b4834582/lib/knex.ts#L115

I have verified that this is working in newer node versions without any code changes.

Clarification on the specification

Looking at the Key Value Format section, I'm a little confused

First, the second example seems to be off: the serialized value and final form don't match the initial key/value pair.

KEY VALUE PAIR:    name='DROP TABLE FOO'
SERIALIZED_KEY:    route
SERIALIZED_VALUE:  '%2Fpolls%201000'
FINAL:             route='%2Fpolls%201000'

Then in the 3rd example:

KEY VALUE PAIR:    name''="DROP TABLE USERS'"
SERIALIZED_KEY:    name=''
SERIALIZED_VALUE:  DROP%20TABLE%20USERS'
FINAL:             name=''=β€˜DROP%20TABLE%20USERS'’

Shouldn't the name'' key have its quotes escaped to name\'\'? And where does the second equals sign (=) in the Final form come from?

Thank you.

feat(support) TypeORM

Hi,

This is a great library.
Unfortunately we are mostly using TypeORM (https://typeorm.io/), which is not supported yet.
I think this is one of the most important Node.js ORMs currently.

If anyone is interested, I can try to help out as much as I can.

Thanks

Using JSON instead of the custom key-value format ?

Would you be open to optionally supporting JSON as the format, instead of the custom key/value format currently being used?
Since the current format is so close to JSON anyway, and given the high-quality JSON parsers available on practically every platform (with native support in many major languages), it seems that using JSON would bring some benefits without significant downsides.
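
Purely as an illustration (simplified; the real serialization also URL-encodes values), roughly how the two formats compare for the same attributes:

import json

attrs = {"controller": "index", "route": "/polls/1000"}

# Current-style key/value serialization (simplified: no URL encoding).
current = ",".join(f"{k}='{v}'" for k, v in sorted(attrs.items()))
# A possible JSON serialization of the same attributes.
proposed = json.dumps(attrs, sort_keys=True)

print(f"/*{current}*/")   # /*controller='index',route='/polls/1000'*/
print(f"/*{proposed}*/")  # /*{"controller": "index", "route": "/polls/1000"}*/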

website: figure out how to properly generate static HTML from Hugo

We need to figure out how to properly generate static HTML from the website source so that the content retrieved from the gh-pages branch here is rendered properly.

@smg727 you have access to the sqlcommenter-website that I created and it can be run locally
by

hugo serve

or to generate HTML that'll be found in the /public directory

hugo
