Git Product home page Git Product logo

trino-go-client's Introduction

Trino Go client

A Trino client for the Go programming language. It enables you to send SQL statements from your Go application to Trino, and receive the resulting data.

Build Status GoDoc

Features

  • Native Go implementation
  • Connections over HTTP or HTTPS
  • HTTP Basic, Kerberos, and JSON web token (JWT) authentication
  • Per-query user information for access control
  • Support custom HTTP client (tunable conn pools, timeouts, TLS)
  • Supports conversion from Trino to native Go data types
    • string, sql.NullString
    • int64, sql.NullInt64
    • float64, sql.NullFloat64
    • map, trino.NullMap
    • time.Time, trino.NullTime
    • Up to 3-dimensional arrays to Go slices, of any supported type

Requirements

  • Go 1.21 or newer
  • Trino 372 or newer

Installation

You need a working environment with Go installed and $GOPATH set.

Download and install Trino database/sql driver:

go get github.com/trinodb/trino-go-client/trino

Make sure you have Git installed and in your $PATH.

Usage

This Trino client is an implementation of Go's database/sql/driver interface. In order to use it, you need to import the package and use the database/sql API then.

Use trino as driverName and a valid DSN as the dataSourceName.

Example:

import "database/sql"
import _ "github.com/trinodb/trino-go-client/trino"

dsn := "http://user@localhost:8080?catalog=default&schema=test"
db, err := sql.Open("trino", dsn)

Authentication

Both HTTP Basic, Kerberos, and JWT authentication are supported.

HTTP Basic authentication

If the DSN contains a password, the client enables HTTP Basic authentication by setting the Authorization header in every request to Trino.

HTTP Basic authentication is only supported on encrypted connections over HTTPS.

Kerberos authentication

This driver supports Kerberos authentication by setting up the Kerberos fields in the Config struct.

Please refer to the Coordinator Kerberos Authentication for server-side configuration.

JSON web token authentication

This driver supports JWT authentication by setting up the AccessToken field in the Config struct.

Please refer to the Coordinator JWT Authentication for server-side configuration.

System access control and per-query user information

It's possible to pass user information to Trino, different from the principal used to authenticate to the coordinator. See the System Access Control documentation for details.

In order to pass user information in queries to Trino, you have to add a NamedArg to the query parameters where the key is X-Trino-User. This parameter is used by the driver to inform Trino about the user executing the query regardless of the authentication method for the actual connection, and its value is NOT passed to the query.

Example:

db.Query("SELECT * FROM foobar WHERE id=?", 1, sql.Named("X-Trino-User", string("Alice")))

The position of the X-Trino-User NamedArg is irrelevant and does not affect the query in any way.

DSN (Data Source Name)

The Data Source Name is a URL with a mandatory username, and optional query string parameters that are supported by this driver, in the following format:

http[s]://user[:pass]@host[:port][?parameters]

The easiest way to build your DSN is by using the Config.FormatDSN helper function.

The driver supports both HTTP and HTTPS. If you use HTTPS it's recommended that you also provide a custom http.Client that can validate (or skip) the security checks of the server certificate, and/or to configure TLS client authentication.

Parameters

Parameters are case-sensitive

Refer to the Trino Concepts documentation for more information.

source
Type:           string
Valid values:   string describing the source of the connection to Trino
Default:        empty

The source parameter is optional, but if used, can help Trino admins troubleshoot queries and trace them back to the original client.

catalog
Type:           string
Valid values:   the name of a catalog configured in the Trino server
Default:        empty

The catalog parameter defines the Trino catalog where schemas exist to organize tables.

schema
Type:           string
Valid values:   the name of an existing schema in the catalog
Default:        empty

The schema parameter defines the Trino schema where tables exist. This is also known as namespace in some environments.

session_properties
Type:           string
Valid values:   comma-separated list of key=value session properties
Default:        empty

The session_properties parameter must contain valid parameters accepted by the Trino server. Run SHOW SESSION in Trino to get the current list.

custom_client
Type:           string
Valid values:   the name of a client previously registered to the driver
Default:        empty (defaults to http.DefaultClient)

The custom_client parameter allows the use of custom http.Client for the communication with Trino.

Register your custom client in the driver, then refer to it by name in the DSN, on the call to sql.Open:

foobarClient := &http.Client{
    Transport: &http.Transport{
        Proxy: http.ProxyFromEnvironment,
        DialContext: (&net.Dialer{
            Timeout:   30 * time.Second,
            KeepAlive: 30 * time.Second,
            DualStack: true,
        }).DialContext,
        MaxIdleConns:          100,
        IdleConnTimeout:       90 * time.Second,
        TLSHandshakeTimeout:   10 * time.Second,
        ExpectContinueTimeout: 1 * time.Second,
        TLSClientConfig:       &tls.Config{
        // your config here...
        },
    },
}
trino.RegisterCustomClient("foobar", foobarClient)
db, err := sql.Open("trino", "https://user@localhost:8080?custom_client=foobar")

A custom client can also be used to add OpenTelemetry instrumentation. The otelhttp package provides a transport wrapper that creates spans for HTTP requests and propagates the trace ID in HTTP headers:

otelClient := &http.Client{
    Transport: otelhttp.NewTransport(http.DefaultTransport),
}
trino.RegisterCustomClient("otel", otelClient)
db, err := sql.Open("trino", "https://user@localhost:8080?custom_client=otel")

Examples

http://user@localhost:8080?source=hello&catalog=default&schema=foobar
https://user@localhost:8443?session_properties=query_max_run_time=10m,query_priority=2

Data types

Query arguments

When passing arguments to queries, the driver supports the following Go data types:

  • integers
  • bool
  • string
  • slices
  • trino.Numeric - a string representation of a number
  • time.Time - passed to Trino as a timestamp with a time zone
  • the result of trino.Date(year, month, day) - passed to Trino as a date
  • the result of trino.Time(hour, minute, second, nanosecond) - passed to Trino as a time without a time zone
  • the result of trino.TimeTz(hour, minute, second, nanosecond, location) - passed to Trino as a time with a time zone
  • the result of trino.Timestamp(year, month, day, hour, minute, second, nanosecond) - passed to Trino as a timestamp without a time zone

It's not yet possible to pass:

  • float32 or float64
  • byte
  • time.Duration
  • json.RawMessage
  • maps

To use the unsupported types, pass them as strings and use casts in the query, like so:

SELECT * FROM table WHERE col_double = cast(? AS DOUBLE) OR col_timestamp = CAST(? AS TIMESTAMP)

Response rows

When reading response rows, the driver supports most Trino data types, except:

  • time and timestamps with precision - all time types are returned as time.Time. All precisions up to nanoseconds (TIMESTAMP(9) or TIME(9)) are supported (since this is the maximum precision Golang's time.Time supports). If a query returns columns defined with a greater precision, values are trimmed to 9 decimal digits. Use CAST to reduce the returned precision, or convert the value to a string that then can be parsed manually.
  • DECIMAL - returned as string
  • IPADDRESS - returned as string
  • INTERVAL YEAR TO MONTH and INTERVAL DAY TO SECOND - returned as string
  • UUID - returned as string

Data types like HyperLogLog, SetDigest, QDigest, and TDigest are not supported and cannot be returned from a query.

For reading nullable columns, use:

  • trino.NullTime
  • trino.NullMap - which stores a map of map[string]interface{} or similar structs from the database/sql package, like sql.NullInt64

To read query results containing arrays or maps, pass one of the following structs to the Scan() function:

  • trino.NullSliceBool
  • trino.NullSliceString
  • trino.NullSliceInt64
  • trino.NullSliceFloat64
  • trino.NullSliceTime
  • trino.NullSliceMap

For two or three dimensional arrays, use trino.NullSlice2Bool and trino.NullSlice3Bool or equivalents for other data types.

To read ROW values, implement the sql.Scanner interface in a struct. Its Scan() function receives a []interface{} slice, with values of the following types:

  • bool
  • json.Number for any numeric Trino types
  • []interface{} for Trino arrays
  • map[string]interface{} for Trino maps
  • string for other Trino types, as character, date, time, or timestamp

License

Apache License V2.0, as described in the LICENSE file.

Build

You can build the client code locally and run tests with the following command:

go test -v -race -timeout 2m ./...

Contributing

For contributing, development, and release guidelines, see CONTRIBUTING.md.

trino-go-client's People

Contributors

a-urth avatar adamgreenhall avatar armsword avatar bluestalker avatar brianwarner avatar chaho12 avatar dvrkps avatar ebyhr avatar electrum avatar felipejfc avatar fengctor avatar fiorix avatar jpmoresmau avatar kamijin-fanta avatar losipiuk avatar mdesmet avatar mosabua avatar mrveera avatar nineinchnick avatar ramonberrutti avatar scbrickley avatar sirwart avatar skogtwin avatar sunwoo-shin avatar thomas-maurice avatar wendigo avatar winckler avatar wuzekang avatar yifeng-sigma avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

trino-go-client's Issues

Row data type not supported

Trino supports row data type as mentioned in the official documentation.

But there is no support for the data parsing in case we have some data in Row format

I had a look at ConvertValue function which does not have a switch case for the data type.
As it is called internally from nextLocked in sql.go while performing Next() to iterate and extract data, it is not possible to have a column as Row type, unless we manually cast all query to json.

Bug/Feat: Performance degradation with multiple requests

Issue: For 50 simultaneous requests at a time we are getting 5-8 times slower performance compared to node.js. Results are being returned much more slower.

Possible cause: Is it because of semaphores lock? Also, tried calling dbConnect within qHandler.

Test Code:

package main

import (
    "encoding/json"
    "log"
    "fmt"
    "time"
    "runtime"
    "net/http"
)
import "database/sql"
import trino "github.com/trinodb/trino-go-client/trino"

var dbConn *sql.DB

func dbConnect() *sql.DB {
    foobarClient := &http.Client{
        Transport: &http.Transport{
            Proxy: http.ProxyFromEnvironment,
            DialContext: (&net.Dialer{
                Timeout:   30 * time.Second,
                KeepAlive: 30 * time.Second,
                DualStack: true,
            }).DialContext,
            MaxIdleConns:          500,
            IdleConnTimeout:       90 * time.Second,
            TLSHandshakeTimeout:   1 * time.Second,
            ExpectContinueTimeout: 1 * time.Second,
            TLSClientConfig:       &tls.Config{
                InsecureSkipVerify: true,
            },
        },
    }

    trino.RegisterCustomClient("foobar", foobarClient)
    dsn := "https://" + "<username>" + ":" + "<password>" + "@<hostname>"
    db, err := sql.Open("trino", dsn)
    if err != nil {
      fmt.Println(err)
      panic(err)
    } 
    return db
}

func qHandler(w http.ResponseWriter, r *http.Request) {
    rows, err := dbConn.Query(p.Query)
    if err != nil {
        fmt.Println(err)
        return
    }
    defer rows.Close()

    w.Header().Set("Content-Type", "application/json")
    w.WriteHeader(http.StatusCreated)
    json.NewEncoder(w).Encode(&Response{"200", "hello"})
}

func main() {
    dbConn = dbConnect()
    defer dbConn.Close()
    http.HandleFunc("/api", qHandler)
    fmt.Printf("Starting server at port 8080\n")
    if err := http.ListenAndServe(":8080", nil); err != nil {
        log.Fatal(err)
    }
}

Please find below some profiles.

CPU profile:

(pprof) top5
Showing nodes accounting for 3160ms, 30.30% of 10430ms total
Dropped 530 nodes (cum <= 52.15ms)
Showing top 5 nodes out of 335
      flat  flat%   sum%        cum   cum%
    1490ms 14.29% 14.29%     1490ms 14.29%  runtime/internal/syscall.Syscall6
     780ms  7.48% 21.76%      780ms  7.48%  runtime.futex
     340ms  3.26% 25.02%      370ms  3.55%  crypto/internal/bigmod.(*Nat).shiftIn
     290ms  2.78% 27.80%      310ms  2.97%  runtime.step
     260ms  2.49% 30.30%      260ms  2.49%  crypto/internal/edwards25519/field.feMul
(pprof) top5 -cum
Showing nodes accounting for 0.03s, 0.29% of 10.43s total
Dropped 530 nodes (cum <= 0.05s)
Showing top 5 nodes out of 335
      flat  flat%   sum%        cum   cum%
         0     0%     0%      2.60s 24.93%  crypto/tls.(*Conn).HandshakeContext (inline)
     0.01s 0.096% 0.096%      2.60s 24.93%  crypto/tls.(*Conn).handshakeContext
         0     0% 0.096%      2.59s 24.83%  net/http.(*persistConn).addTLS.func2
         0     0% 0.096%      2.52s 24.16%  crypto/tls.(*Conn).clientHandshake
     0.02s  0.19%  0.29%      2.44s 23.39%  runtime.systemstack

Block profile:

(pprof) top5
Showing nodes accounting for 42.02hrs, 99.95% of 42.04hrs total
Dropped 90 nodes (cum <= 0.21hrs)
Showing top 5 nodes out of 28
      flat  flat%   sum%        cum   cum%
  42.02hrs 99.95% 99.95%   42.02hrs 99.95%  runtime.selectgo
         0     0% 99.95%    0.41hrs  0.97%  database/sql.(*DB).Query
         0     0% 99.95%    0.41hrs  0.97%  database/sql.(*DB).QueryContext
         0     0% 99.95%    0.41hrs  0.97%  database/sql.(*DB).QueryContext.func1
         0     0% 99.95%    0.41hrs  0.97%  database/sql.(*DB).query
(pprof) top5 -cum
Showing nodes accounting for 151274.19s, 99.95% of 151354.15s total
Dropped 90 nodes (cum <= 756.77s)
Showing top 5 nodes out of 28
      flat  flat%   sum%        cum   cum%
151274.19s 99.95% 99.95% 151274.19s 99.95%  runtime.selectgo
         0     0% 99.95%  74079.35s 48.94%  net/http.(*persistConn).writeLoop
         0     0% 99.95%  72588.09s 47.96%  net/http.(*persistConn).readLoop
         0     0% 99.95%   1607.39s  1.06%  net/http.(*conn).serve
         0     0% 99.95%   1607.39s  1.06%  net/http.(*ServeMux).ServeHTTP

Mutex profile:

(pprof) top5
Showing nodes accounting for 7.43ms, 99.79% of 7.44ms total
Dropped 12 nodes (cum <= 0.04ms)
Showing top 5 nodes out of 32
      flat  flat%   sum%        cum   cum%
    7.32ms 98.40% 98.40%     7.32ms 98.40%  sync.(*Mutex).Unlock
    0.06ms  0.84% 99.24%     0.06ms  0.84%  context.(*cancelCtx).cancel
    0.04ms  0.56% 99.79%     0.04ms  0.56%  internal/singleflight.(*Group).doCall
         0     0% 99.79%     0.06ms  0.84%  context.(*timerCtx).cancel
         0     0% 99.79%     0.06ms  0.84%  context.WithDeadlineCause.func2
(pprof) top5 -cum
Showing nodes accounting for 7.32ms, 98.40% of 7.44ms total
Dropped 12 nodes (cum <= 0.04ms)
Showing top 5 nodes out of 32
      flat  flat%   sum%        cum   cum%
    7.32ms 98.40% 98.40%     7.32ms 98.40%  sync.(*Mutex).Unlock
         0     0% 98.40%     5.41ms 72.73%  net/http.(*persistConn).readLoop
         0     0% 98.40%     5.36ms 71.96%  net/http.(*Transport).cancelRequest
         0     0% 98.40%     1.25ms 16.74%  net/http.(*Transport).dialConnFor
         0     0% 98.40%     1.25ms 16.74%  net/http.(*Transport).putOrCloseIdleConn

Heap Profile:

(pprof) top5
Showing nodes accounting for 5.01MB, 19.23% of 26.06MB total
Dropped 18 nodes (cum <= 0.13MB)
Showing top 5 nodes out of 119
      flat  flat%   sum%        cum   cum%
    2.51MB  9.63%  9.63%     2.51MB  9.63%  bufio.NewWriterSize (inline)
      -2MB  7.68%  1.95%     3.51MB 13.48%  net/http.(*Transport).dialConn
    1.50MB  5.76%  7.71%     1.50MB  5.76%  crypto/tls.Client (inline)
    1.50MB  5.76% 13.47%     1.50MB  5.76%  crypto/aes.(*aesCipherGCM).NewGCM
    1.50MB  5.76% 19.23%     1.50MB  5.76%  net/http.(*persistConn).readLoop
(pprof) top5 -cum
Showing nodes accounting for 0.51MB, 1.95% of 26.06MB total
Dropped 18 nodes (cum <= 0.13MB)
Showing top 5 nodes out of 119
      flat  flat%   sum%        cum   cum%
      -2MB  7.68%  7.68%     3.51MB 13.48%  net/http.(*Transport).dialConn
         0     0%  7.68%     3.51MB 13.48%  net/http.(*Transport).dialConnFor
    2.51MB  9.63%  1.95%     2.51MB  9.63%  bufio.NewWriterSize (inline)
         0     0%  1.95%    -2.03MB  7.80%  encoding/json.(*Decoder).Decode
         0     0%  1.95%    -2.03MB  7.80%  github.com/trinodb/trino-go-client/trino.(*driverStmt).exec.func2

Queries sometimes fail when decoding progress percent in the response

As reported in trinodb/grafana-trino#258, queries sometimes fail with:

error querying the database: trino: json: cannot unmarshal string into Go struct field stmtStats.stats.progressPercentage of type float32

The progressPercent field is declared as a float32: https://github.com/trinodb/trino-go-client/blob/master/trino/trino.go#L722, but the Trino server uses an OptionalDouble: https://github.com/trinodb/trino/blob/ca209630136eabda2449594ef2b6a4d82fb9c2e5/core/trino-main/src/main/java/io/trino/server/QueryProgressStats.java#L132

The driver should ignore invalid values in stats.

Performance issue

I am using trino-go-client to connect to Trino cluster and run queries on Hive catalog.

The first query run takes longer time of around 4-5 seconds and any subsequent runs takes half of them for next 3 minutes or so before going back to 4-5 seconds execution time.

Is there any setting I am missing to set or unset?

Is there a way to exec a query without handling it as a prepared statement?

Hi!
First of all, i'm really thankful for this library, the implementation for this library is great.
I've been looking into my cluster overview and noticed that everytime a query is executed from the go client, it's triggered as a prepared statement, so it looks like the following:
EXECUTE _trino_go USING 'foo'
i'm currently looking for one of the following:

  1. a reason for this to happen
  2. a way to avoid this execution pattern.

Again, thank you very much!

Does trino-go-client support golang sql.NamedArg?

Hi, I am using this trino client to query our trino database and I cannot figure out how to use golang sql.NamedArg with it.
For example, the query function call looks like:

db.Query("SELECT * FROM prometheus WHERE \"metricName\" = :metric", sql.NamedArg("metric", "cpu_usage"))

But I received the following error:

 Error executing query: trino: query failed (400 Bad Request): "Invalid X-Trino-Prepared-Statement header: line 1:73: mismatched input ':'. Expecting: 'ALL', 'ANY', 'SOME', <expression>"

I have also tried @metric but it didn't work either, so I am wondering whether the trino client supports NamedArg or not?

My runtime environment is:

  • Golang v1.20.6
  • trino-go-client v0.313.0

Thank you!

[BUG] Query is not created properly if there is hidden Args

Bug I found which I mentioned in slack at this part of the code

Error when there are only hidden args

If there are no regular args and only hidden Args, query is created like EXECUTE _presto_go USING. This causes an error io.prestosql.sql.parser.ParsingException: line 1:26: mismatched input '<EOF>'. Expecting: <expression> because it ends with USING.

Suggest How-to-fix

First check if there are any โ€˜simpleโ€™ arguments, then query should be EXECUTE preparedStatementName USING โ€ฆ. .
If there are only hidden args, query should be from original, st.query.

As electrum saids in slack, simply checking length of ss []string seems to be simpler and cleaner method

Can't execute more than one query on one connection

I found an interesting thing, where I can't run two queries on a connection. I created a connection and set the query_max_execution_time to 10 minutes. Then, I execute only one query, I don't encounter any problem. However, if I execute more than one query, it fails with the following error. I believe this should not happen.


Error:      	Received unexpected error:
        	            	<html>
        	            	<head>
        	            	<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
        	            	<title>Error 500 Request failed.</title>
        	            	</head>
        	            	<body><h2>HTTP ERROR 500 Request failed.</h2>
        	            	<table>
        	            	<tr><th>URI:</th><td>/v1/statement</td></tr>
        	            	<tr><th>STATUS:</th><td>500</td></tr>
        	            	<tr><th>MESSAGE:</th><td>Request failed.</td></tr>
        	            	<tr><th>SERVLET:</th><td>org.glassfish.jersey.servlet.ServletContainer-6314b195</td></tr>
        	            	</table>
        	            	
        	            	</body>
        	            	</html>

You can use following steps to reproduce the issue.

package main

import (
	"context"
	"fmt"

	"database/sql"
)

func check(ctx context.Context, config TrinoConfig) error {
	cfg, err := config.toTrinoDbConfig()
	dsn, err := cfg.FormatDSN()
	if err != nil {
		return err
	}
	db, err := sql.Open("trino", dsn)
	if err != nil {
		return err
	}

	timeoutStr := "10m"       // session timeout 10 minutes
	conn, err := db.Conn(ctx) //getting connection
	if err != nil {
		return err
	}
	_, err = conn.ExecContext(ctx, fmt.Sprintf("SET SESSION query_max_execution_time = '%s'", timeoutStr)) // setting session timeout as 10 minutes
	if err != nil {
		conn.Close()
		return err
	}

	_, err = conn.QueryContext(ctx, "select 1")
	if err != nil {
		return err
	}

	_, err = conn.QueryContext(ctx, "select 2")
	if err != nil {
		return err
	}
	return nil
}

access kerberized data

system๏ผšwindwos10

Configure Java access to Trino.
image

trino-go-client config.
image

The Java side can connect to Trino, but the Go client prompts an error.

go client error :
[04-24|11:36:23] error querying the database: trino: Error login to KDC: [Root cause: KDC_Error] KDC_Error: AS Exchange Error: kerberos error response from KDC: KRB Error: (6) KDC_ERR_C_PRINCIPAL_UNKNOWN Client not found in Kerberos database - CLIENT_NOT_FOUND

ParsingException for SQL Server when stored-procedure-table-function is enabled

When I enable sqlserver.experimental.stored-procedure-table-function-enabled=true and call the stored procedure of SQL Server, the ParsingException is thrown. Seems the feature is not supported by go client. I also try the trino cli. It works fine. Could you help to enhance this feature?

db.Query(SELECT * FROM TABLE( system.procedure( query => 'EXECUTE example_schema.employee_sp' ) );)

Client.Timeout exceeded while awaiting headers

I notice that I get this error when my coordinator is in heavy load while reading data in rows.Next(). Is there a timeout while fetching rows.Next?

  • I assume that there is issue while getting response from coordinator.
trino: query failed (0 ): "Get \"https://trino_url/v1/statement/executing/20240318_081907_03546_ch76y/y269bf9b085430f6e7d0222f3d8f2dc7309a9fdc9/1426\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"

My trinoclient is using retryable, and new Client uses default httpclient with default poold transport setting.

	retryClient := retryablehttp.NewClient()
	retryClient.Logger = nil
	TrinoClient = retryClient.StandardClient()
    defer rows.Close()
    if rows != nil {
        ....
        for rows.next() { 
            ...
            err := rows.Scan(results...)
            ...
        }
        err = rows.Err()
	if err != nil {
	    return err <-- error is logged here.
	}
    }

So, I was wondering is setting http.ClientTimeout recommded? or at least is it ok to set timeout? I couldn't find anything on timeout part.

I thought that if i don't set it, timeout is infinite and no error is responded.
This doesn't seem to work with retryable library, as it retries only on connection/server errors.

Support variable precision time, timestamp, time with time zone, timestamp with time zone types

Over couple of recent releases (up to including Presto 341), Presto gained support for variable precision time, timestamp, time with time zone, timestamp with time zone types.
For backward compatibility reasons, these are not rendered with actual precision to the client, unless the client sends proper X-Presto-Client-Capabilities header.

Add support for these, along with properly setting the header.

See more at trinodb/trino#1284

Panic: runtime error: index out of range in driverRows.Next

I'm hitting a panic that I reproduce pretty well, but not always, it sometimes panics and sometimes doesn't.

I'm using a pretty simple query: SELECT 1, NULL, 'abc'", but I'm not sure if it's relevant.

The relevant part of the backtrace:

panic: runtime error: index out of range [1] with length 1

goroutine 125 [running]:
github.com/trinodb/trino-go-client/trino.(*driverRows).Next(0x14000708160, {0x140000aa3f0, 0x9, 0x10667fbb0?})
        /Users/thomas/go/pkg/mod/github.com/trinodb/[email protected]/trino/trino.go:1028 +0x2a0

After that it's my code calling trino.driverRows.Next directly.

Code where the panic happens:

vv, err := v.ConvertValue(qr.data[qr.rowindex][i])

This is the state of trino.driverRows:

// First call to trino.driverRows.Next (Ok):
&{
	ctx:0x14000117980 
	stmt:0x140001153e0 
	queryID:20230104_135631_01766_ysbyr 
	nextURI:https://localhost:8443/v1/statement/queued/20230104_135631_01766_ysbyr/y235c9529065f1ef1da71af639ccea62d7e95fe9e/1 
	err:<nil> 
	rowindex:0 
	columns:[_col0 _col1 _col2] 
	coltype:[0x140001a67e0 0x140001a68c0 0x140001a6930] 
	data:[[1 <nil> abc]] 
	rowsAffected:0 
	statsCh:<nil> 
	doneCh:0x1400056e360
}

// Second call (panics):
&{
	ctx:0x14000117980 
	stmt:0x140001153e0 
	queryID:20230104_135631_01766_ysbyr 
	nextURI:https://localhost:8443/v1/statement/queued/20230104_135631_01766_ysbyr/y235c9529065f1ef1da71af639ccea62d7e95fe9e/1 
	err:<nil> 
	rowindex:1 
	columns:[_col0 _col1 _col2] 
	coltype:[0x140001a67e0 0x140001a68c0 0x140001a6930] 
	data:[[1 <nil> abc]] 
	rowsAffected:0 
	statsCh:<nil> 
	doneCh:0x1400056e360
}

I think the following line is incorrect:

if qr.columns == nil || qr.rowindex >= len(qr.data) {

I think >= should be changed to > since indexing is zero based.
But what I can't figure out why the current code works sometimes, but panics in the next test run... Incorrect.

Reduce dependencies

This trino library has far more dependencies than the old presto one, like github.com/Microsoft/go-winio, and specifically github.com/docker/docker, which isn't clearly related to trino, but conflicts with my application's dependencies. Is it possible to reduce these?

Fix parsing of decimals

The tests require parse_decimal_literals_as_double=true since the driver doesn't properly handle decimals (at least inside of arrays). This session property should be removed (it defaults to false) so that the default behavior of parsing decimal literals as decimals occurs, which how most people will use Presto. In general, the type conversion tests need to be more comprehensive.

Return extra runtime statistics information immediately

Unlike other other drivers which simply uses http request and gets runtime statistics information using HTTP protocol, Presto-Go-Client uses https://golang.org/pkg/database/sql/driver to implement and returns rows only after the query completes

@losipiuk suggests using named query arguments to pass callback to implementation of QueryContext method.

FYI

After setting connection by sql.Open("presto", dsn) whole process of querying data is as follows. These functions are all within go SQL.

sql.Query -> QueryContext -> query -> queryDC -> rowsiFromStatement -> ctxDriverStmtQuery -> siCtx.QueryContext(ctx, nvdargs) -> then returns rows

  • queryDC : queryDC executes a query on the given connection.
  • siCtx.QueryContext(ctx, nvdargs) : presto-go-client QueryContext function

Relevant slack discussion: https://prestosql.slack.com/archives/CFLB9AMBN/p1597988801033400

plan to support Update/Insert/Delete ?

Hi, I found the README file said that:
Only read operations are supported, such as SHOW and SELECT.

Is there any plan to support Update/Insert/Delete?

Thanks in advance.

Any way to extract queryId from result / rows?

Problem summary:
I am trying to get the query ID back from Trino, and I am calling the QueryContext method. The implementing Rows struct in the Trino code seems to not send it back in anyway that I can find.

Question:
Can I extract the query ID when using something like this? If so, how?

	rows, err := trinoDB.QueryContext(ctx, selectStmt, placeholders...)
	if err != nil {
		return nil, err
	}
	cols, err := rows.Columns()
	if err != nil {
		return nil, err
	}
	for rows.Next() {
             ...
        }

How to set a custom name to prepared statements, in different connections

I'm trying to name queries differently to avoid the generic
EXECUTE _trino_go USING 4
and get some context (at least in names).
EXECUTE total_customers_transactions USING 4
Is there a way to achieve this using the lib via ExecContext() o QueryRowContext()?

I feel like I'm missing something.
Thank you in advanced!

Queries don't get canceled on context cancel

Experiencing an issue where queries are not being canceled when the context is canceled. This leads to abandoned queries causing user inconvenience. Attempted resolution by setting a timeout aligned with context timeout on the connection, but encountered problems with the Trino driver. Requesting investigation and resolution to ensure proper query cancellation on context cancellation.

Support for more than one header?

Hello!

We at Lyft would benefit from possibility to pass more than one header in order to correctly handle requests through our custom proxy. But unfortunatelly currently trino client supports only one header - X-Trino-User (https://github.com/trinodb/trino-go-client/blob/master/trino/trino.go#L652).

What do You think about adding support for multiple headers? Lets say any header prefixed with X-Trino?

If that sounds ok, then I can prepare a PR for that.

Cheers

Pool of system root CAs is not used by default in case of https connection with no `SSLCert` and `SSLCertPath` configured.

Client version: v0.311.0
Effect. In case of the client uses https connection with no SSLCert and SSLCertPath are provided:

  • < v0.311.0: default root CAs of the current operating system is used.
  • >= v0.311.0: default root CAs of the current operating system is not used leading to:
trino: query failed (0 ): "Post \"https://<path-to-trino-frontend>": tls: failed to verify certificate: x509: certificate signed by unknown authority"

Details: it seems that the change Add support to pass ssl certificate value instead of only path introduced in v0.311.0 has changed the root CAs used by default:

Suggested fix (preserving the previous behavior): use default httpClientt in case of https connection if both SSLCert and SSLCertPath are not provided.

Trino VARBINARY columns should scan to []byte slices instead of strings.

Trino columns that are VARBINARY type are base64 encoded before being sent via HTTP to the go client. The current client implementation scans the data into a 'string' type without base64 decoding the values, forcing the user of the driver to do an extra non-intuitive base64 decode to get the original binary data back.

It would be better to have the driver perform the base64 decode and return a []byte slice.
For example, if an IP address or MAC address is stored as a varbinary column in trino, the client could return a []byte type which could be used to create a net.IP or net.HardwareAddr directly.

Unable to use []string as query parameter

From what i see in serial.go, there should be support for slice of strings as parameters.

But when I try to execute something like

s.db.QueryContext(ctx, "SELECT * FROM events WHERE contains(?, name)", []string{"a", "b"})

I get the following error with go 1.14:
converting argument $1 type: unsupported type []string, a slice of string

It looks like your driver needs to implement sql/driver.NamedValueChecker

Suggestions : defining Headers constants as public and add split Config.PrestoURI into user & url

Mentioned in slack

  1. defining Headers constants as public?

I was wondering why header constant is not public. People would need to use such headers but they would have to define their constants so that it matches presto-go-client const (which is bothersome)

presto-cli defines as public variable.

  1. Split PrestoURI with user info in Config to a DSN string

It is sort of confusing because PrestoURI is made up of default user and server addr
I think it would be better if we can split those two because URL consists of URL + user.

https://github.com/prestosql/presto-go-client/blob/master/presto/presto.go#L124

Update Go version to 1.18

  • Currently, the project is based on the 1.14 Go version. Let's upgrade the Go version to 1.18
  • Also, golang.org/x/net v0.0.0-20220706163947-c90051bbdb60 package has the vulnerability

Expose underlying trino error codes / names

ATM the only visible attribute on error is the http status code (from ErrQueryFailed), but a client might want to react differently depending on the underlying trino error (for example, if it's a exceeded time limit the error might bubble up, whereas if it's an exceeded cpu limit it could back off and retry). Right now the information is avail in the stmtError, but that's not exported.

Panic in `CheckNamedValue()` for an `INSERT` prepared statement which receives a nil value

Try running the following code with trino-go-client v0.312.0:

package main

import (
        "context"
        "database/sql"
        "log"

        _ "github.com/trinodb/trino-go-client/trino"
)

func main() {
        db, err := sql.Open("trino", "http://localhost:666")
        if err != nil {
                log.Fatal(err)
        }
        if _, err := db.ExecContext(context.Background(), "INSERT INTO TEST (FOO) VALUES (?)", nil); err != nil {
                log.Fatal(err)
        }
}

You'll observe the following panic:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x98 pc=0x12a2659]

goroutine 1 [running]:
github.com/trinodb/trino-go-client/trino.(*driverStmt).CheckNamedValue(0x12fde01?, 0xc000098bd0)
	/Users/mihaitodor/Projects/go/pkg/mod/github.com/trinodb/[email protected]/trino/trino.go:663 +0xd9
database/sql.driverArgsConnLocked({0x13ba5e0, 0xc0000f2000}, 0xc000058c00, {0xc000058f20, 0x1, 0x15ce480?})
	/usr/local/Cellar/go/1.21.1/libexec/src/database/sql/convert.go:178 +0x5b3
database/sql.resultFromStatement({0x13bab30, 0x15ce480}, {0x13ba5e0, 0xc0000f2000}, 0xc000058c00, {0xc000058f20, 0x1, 0x1})
	/usr/local/Cellar/go/1.21.1/libexec/src/database/sql/sql.go:2637 +0xfa
database/sql.(*DB).execDC(0x1?, {0x13bab30, 0x15ce480}, 0xc0000f0090, 0x30?, {0x133c2ea, 0x21}, {0xc000058f20, 0x1, 0x1})
	/usr/local/Cellar/go/1.21.1/libexec/src/database/sql/sql.go:1694 +0x45f
database/sql.(*DB).exec(0x4b15c80?, {0x13bab30, 0x15ce480}, {0x133c2ea, 0x21}, {0xc000058f20, 0x1, 0x1}, 0x0?)
	/usr/local/Cellar/go/1.21.1/libexec/src/database/sql/sql.go:1655 +0xdb
database/sql.(*DB).ExecContext.func1(0x30?)
	/usr/local/Cellar/go/1.21.1/libexec/src/database/sql/sql.go:1634 +0x4f
database/sql.(*DB).retry(0x13b9840?, 0xc000058e70)
	/usr/local/Cellar/go/1.21.1/libexec/src/database/sql/sql.go:1538 +0x42
database/sql.(*DB).ExecContext(0x13310a2?, {0x13bab30?, 0x15ce480?}, {0x133c2ea?, 0x0?}, {0xc000058f20?, 0x10013e0?, 0xc00008e058?})
	/usr/local/Cellar/go/1.21.1/libexec/src/database/sql/sql.go:1633 +0xc8
main.main()
	/Users/mihaitodor/test/main.go:16 +0xa5
exit status 2

Can't update to v0.310.0

I'm not exactly sure what is wrong, but when I run go get -u I get the following error message:

go: github.com/trinodb/[email protected]: verifying module: checksum mismatch
        downloaded: h1:QHRRrRrOlOKbY2SQloUspfsWyeMTvSl0UksFm/LA4GU=
        sum.golang.org: h1:GjSG/60MdmaZHWmOsUAbpwjElCcgfoem6KvFWeJ0Hss=

SECURITY ERROR
This download does NOT match the one reported by the checksum server.
The bits may have been replaced on the origin server, or an attacker may
have intercepted the download attempt.

Did something go wrong with tagging v0.310.0, or was it retagged maybe?

Catalog must be specified when session catalog is not set

trino: query failed (200 OK): "io.prestosql.spi.PrestoException: line 1:15: Catalog must be specified when session catalog is not set"
Getting this error when running SELECT * FROM schema_name.table_name limit 10 query
I have already passed the catalog name and schema name in the connection string.
Error occured in Trino version - 350

Tried using Trino version 358, no error occured in that case. Can this be fixed for Trino 350 as well?

Null support for inserts

Client version: v0.312.0

I have a use case where I insert multiple records with different fields are being nullable. For example, my table schema is as follows:

id: int
name: string
order: int (nullable)
url: string (nullable)

order and url fields are nullable so when I have 2 records:

(1, "first record", nil, "url")
(2, "second record", 1, nil)

I expected to execute the following SQL query (with arguments):

INSERT INTO my_table (id, name, order, url)
VALUES
(1, 'first record', NULL, 'url'),
(2, 'second record', 1, NULL);

However, when ExecContext function is called with driver.NamedValue that has nil as a value, it throws exception in type conversion function here.

To support nil values, we can change the type conversion for nil to:

case nil:
    return "NULL", nil

I am curious what your thoughts are. If this is not a desirable behaviour, is there another workaround for the problem?

Add support to set X-Trino-Extra-Credential header

The Trino Go client does not provide a mechanism to set extraCredentials, which may be needed by some connectors. The Trino HTTP server expects to find a header named X-Trino-Extra-Credential per https://github.com/trinodb/trino/blob/8943325421e81990804a64c9f4f4fd1aaf3e2fc9/client/trino-client/src/main/java/io/trino/client/ProtocolHeaders.java#L131-L134

Possible Solution

  1. Extend the URL to accept a query parameter named extraCredentials that takes the form key=value[,key=value,...] per the requirements of the Trino source

  2. Set the X-Trino-Extra-Credential header similar to here

    trinoSessionHeader: query.Get("session_properties"),

goroutine stack exceeds 1000000000-byte limit while fetching

issue

Query finishes in Trino Web UI, and golang server fails due to stack overflow error.
I assume error happens here at recursive fetch function.

error msg

runtime: goroutine stack exceeds 1000000000-byte limit
fatal error: stack overflow

runtime stack:
runtime.throw(0x1169e18, 0xe)
        /usr/local/go/src/runtime/panic.go:774 +0x72
runtime.newstack()
        /usr/local/go/src/runtime/stack.go:1046 +0x6e9
runtime.morestack()
        /usr/local/go/src/runtime/asm_amd64.s:449 +0x8f

goroutine 2676910 [running]:
reflect.Value.Type(0x111d820, 0xc5194c8fc0, 0x199, 0x0, 0x0)
        /usr/local/go/src/reflect/value.go:1874 +0x1a4 fp=0xc2cb7b8310 sp=0xc2cb7b8308 pc=0x4b8004
encoding/json.indirect(0x111d820, 0xc5194c8fc0, 0x199, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        /usr/local/go/src/encoding/json/decode.go:498 +0x7c4 fp=0xc2cb7b8390 sp=0xc2cb7b8310 pc=0x511f44
encoding/json.(*decodeState).object(0xc5193e7208, 0x111d820, 0xc5194c8fc0, 0x199, 0x1166e11, 0xb)
        /usr/local/go/src/encoding/json/decode.go:663 +0x62 fp=0xc2cb7b8620 sp=0xc2cb7b8390 pc=0x512d12
encoding/json.(*decodeState).value(0xc5193e7208, 0x111d820, 0xc5194c8fc0, 0x199, 0x111d820, 0xc5194c8fc0)
        /usr/local/go/src/encoding/json/decode.go:429 +0x6d fp=0xc2cb7b8688 sp=0xc2cb7b8620 pc=0x5114cd
encoding/json.(*decodeState).array(0xc5193e7208, 0xf756c0, 0xc5194c8e88, 0x197, 0xc5193e7230, 0x5b)
        /usr/local/go/src/encoding/json/decode.go:616 +0x1ac fp=0xc2cb7b8770 sp=0xc2cb7b8688 pc=0x51218c
encoding/json.(*decodeState).value(0xc5193e7208, 0xf756c0, 0xc5194c8e88, 0x197, 0xf756c0, 0xc5194c8e88)
        /usr/local/go/src/encoding/json/decode.go:419 +0xfd fp=0xc2cb7b87d8 sp=0xc2cb7b8770 pc=0x51155d
encoding/json.(*decodeState).object(0xc5193e7208, 0x111d820, 0xc5194c8e10, 0x199, 0xc5193e7230, 0x7b)
        /usr/local/go/src/encoding/json/decode.go:823 +0x1392 fp=0xc2cb7b8a68 sp=0xc2cb7b87d8 pc=0x514042
encoding/json.(*decodeState).value(0xc5193e7208, 0x111d820, 0xc5194c8e10, 0x199, 0x111d820, 0xc5194c8e10)
        /usr/local/go/src/encoding/json/decode.go:429 +0x6d fp=0xc2cb7b8ad0 sp=0xc2cb7b8a68 pc=0x5114cd
encoding/json.(*decodeState).array(0xc5193e7208, 0xf756c0, 0xc5194c8bb8, 0x197, 0xc5193e7230, 0x5b)
        /usr/local/go/src/encoding/json/decode.go:616 +0x1ac fp=0xc2cb7b8bb8 sp=0xc2cb7b8ad0 pc=0x51218c
encoding/json.(*decodeState).value(0xc5193e7208, 0xf756c0, 0xc5194c8bb8, 0x197, 0xf756c0, 0xc5194c8bb8)
        /usr/local/go/src/encoding/json/decode.go:419 +0xfd fp=0xc2cb7b8c20 sp=0xc2cb7b8bb8 pc=0x51155d
encoding/json.(*decodeState).object(0xc5193e7208, 0x111d820, 0xc5194c8b40, 0x199, 0x1166e11, 0xb)

sql

  • I modified sql for security/privacy issue.
  • result has 17,666,708 lines of rows, approx 10mb x 20 partition size in hdfs.
with topic_nx as (
	select refine_query
	from dbdb.tableA
	where gdid in (select coll_gdid
	                from dbdb.tableC
	               where log_date = '2022-07-04'
	                 and id like '...%')
	group by 1
	), web_top1_click as (
	select *
	FROM dbdb.tableB
	WHERE log_date between '2022-06-01' and '2022-06-30'
	and REFINE_QUERY IN (SELECT REFINE_QUERY
	                        FROM TOPIC_NX )
	and pageid in (select pageid
	                 from dbdb.tableA
	                group by 1)
	)

	select a.refine_query
	   , avg(case when b.qc is null then 0 else b.qc end) as Qc
	   , avg(case when c.cc is null then 0 else c.cc end) as trgt_topic_cc
	   , avg(case when d.web_cc is null then 0 else d.web_cc end) as web_cc
	   , avg(case when d.all_web_cc is null then 0 else d.all_web_cc end) as all_web_cc
	   , avg(case when d.m_all_cc is null then 0 else d.m_all_cc end) as m_all_cc
       , '2022-07-04' as log_date
	from topic_nx a
	left join (
	     select LOG_DATE, REFINE_QUERY, SUM(qc) AS QC
	       FROM dbdb.tableA
	      WHERE log_date between '2022-06-01' and '2022-06-30'
	        and REFINE_QUERY IN (SELECT REFINE_QUERY
	                               FROM TOPIC_NX
	                              group by 1)
	       group by 1,2) b
	  on a.refine_query = b.refine_query
	left join (
	     select LOG_DATE, REFINE_QUERY, SUM(cc) AS cC
	       FROM web_top1_click
	      WHERE 1=1
	        and gdid in (select coll_gdid
	                from dbdb.tableC
	               where log_date = '2022-07-04'
	                 and id like '...%')
	       group by 1,2) c
	  on a.refine_query = c.refine_query
	left join (
	     select LOG_DATE
	          , REFINE_QUERY
	          , sum(case when substr(area, 1,3) = 'area1' then cc else 0 end) as web_cc
	          , SUM(case when substr(area, 1,3) in ('area1', 'area2', 'area3', 'area4') then cc else 0 end) AS all_web_cC 
	          , sum(cc)  as m_all_cc
	       FROM web_top1_click
	       group by 1,2) d
	  on a.refine_query = d.refine_query
	group by 1

Steps to reproduce

  1. prepare Trino server with hive connector connected
  2. launch go lang server with sql above.
package main

import (
	"archive/zip"
	"database/sql"
	"fmt"
	"github.com/trinodb/trino-go-client/trino"
	_ "github.com/trinodb/trino-go-client/trino"
	"net/url"
	"os"
	"regexp"
	"strings"
)

func main() {
	u, err := url.Parse("...")
	if err != nil {
		fmt.Println(err.Error())
		return
	}
	u.User = url.UserPassword("...", "...")

	trinoConfig := &trino.Config{
		ServerURI: u.String(),
		Catalog:   "...",
	}
	dsn, err := trinoConfig.FormatDSN()
	if err != nil {
		fmt.Println(err.Error())
		return
	}
	db, err := sql.Open("trino", dsn)
	if err != nil {
		fmt.Println(err.Error())
		return
	}
	query := `...`

         // error happens here
	rows, err := db.Query(query,
		sql.Named("X-Trino-User", string("...")),
		sql.Named("X-Trino-Set-Session", "NONE"),
	)

	if err != nil {
		fmt.Println(err.Error())
		return
	}

	path, err := os.Getwd()
	if err != nil {
		fmt.Println(err.Error())
		return
	}

	filePath := path + "_trino.zip"
	_ = os.Remove(filePath) // delete before file.
	fi, err := os.OpenFile(filePath, os.O_WRONLY|os.O_CREATE, 0644)
	if err != nil {
		fmt.Println(err.Error())
		return
	}

	defer fi.Close()
	zw := zip.NewWriter(fi)
	defer zw.Close()
	iw, err := zw.Create("result.tsv")
	if err != nil {
		fmt.Println(err.Error())
		return
	}

	var header []string
	columnNames, _ := rows.Columns()
	columnTypes, err := rows.ColumnTypes()
	if err != nil {
		fmt.Println(err.Error())
		return
	}

	var results []interface{}
	var line int64
	for i := range columnNames {
		dtName := columnTypes[i].DatabaseTypeName()
		results = append(results, createType(dtName))
		header = append(header, fmt.Sprintf("%s:%s", columnNames[i], dtName))
	}

	if _, err = iw.Write([]byte(strings.Join(header, "\t") + "\n")); err != nil {
		fmt.Println(err.Error())
		return
	}

	if rows != nil {
		// Line default value is 0, but nevertheless re-assign
		line = 0
		for rows.Next() {
			err := rows.Scan(results...)
			if err != nil {
				fmt.Println(err.Error())
			}
      }
      ...

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.