Git Product home page Git Product logo

Comments (23)

TheJayMann avatar TheJayMann commented on May 25, 2024 1

Upon attempting to debug this, at the point searchDataTypeFromCache is called with the parameter schemaCache being passed in, it appears to have a number of values loaded in the Tables field, but all other fields, including the Columns field, have no values loaded in. This testing is making use of MsSqlServerProvider.

While in the debugger, I manually ran the GetColumns() method for one of the tables in the query. I then saw the Columns field was updated with that table and its columns, and, for the one parameter comparing against a column for that table, the data types matched. This suggest to me, rather than relying on the schema cache, it might be better to pass the ISqlProvider and the current connection so that the GetColumns method can be used instead, with the implementation of GetColumns responsible for making use of the schema cache. Either that, or some startup code to either cache all information at runtime startup, or some method of specifying which tables should have information cached at runtime startup.

from sqlprovider.

Thorium avatar Thorium commented on May 25, 2024

Interesting. How can you verify that this happens? Is it happening only with some operations like if you do where-clause with some operations?

I'm asking because the function createOpenParameter just uses SqlParameter(name, object) where the other providers (Postgres, MySql, ...) do some dbType mapping magic and there would be other constructors to use for SqlParameter.

from sqlprovider.

TheJayMann avatar TheJayMann commented on May 25, 2024

I'm able to verify this by using a monitoring service, which is analyzing queries as they come in, and, for parameterized queries, it shows the individual parameters and types for the parameters. It also shows the estimated query plan diagram for the query. In the cases of the queries that I've been analyzing that was generated from SQLProvider, I am seeing implicit conversions of the column itself from varchar to nvarchar to match the type of the parameter (I assume because it is safe to convert from varchar to nvarchar but not the other way). I had seen the same thing happening with queries using EF Core, but because, without specifying IsUnicode as false, EF Core assumes the columns themselves are nvarchar. When specifying columns as IsUnicode false, EF Core passes the parameter as varchar, and can be seen in the query.

I had also looked a bit into the code, and I saw createOpenParameter which simply creates a SqlParameter passing in the string directly as the value. Looking into the source for SqlParameter, if one does not specify the SqlDbType directly as VarChar, or the DbType as AnsiString, it will determine SqlDbType to be NVarChar.

In my case, this has only been an issue when generating a WHERE clause, but this could theoretically be a problem when generating a HAVING or 'JOIN ... ON` clause as well.

As far as being able to compare the results of a parameter being declared nvarchar vs varchar, I've taken the query that the monitoring tool provides, changed the first line of a comma separated list of parameters and types enclosed in parenthesis and altered into a proper DECLARE statement, duplicated the query, then generated an estimated query plan diagram using a tool able to generate a diagram (e.g. SSMS or ADS). Assuming existing a table with column varchar, and an index for the column, the query filtering by nvarchar parameter will result in either an index scan or table scan, with a predicate having an implicit conversion of the varchar column to nvarchar, and the query filtering by varchar parameter will use an index seek with a seek predicate Prefix: column = Scalar Operator(param). Some (untested) estimations can put the first query to cost as much as 100x the second query.

If you need something more explicit or concrete, I can create a specific example. Or, if I have not explained something properly for what you intended to ask, or even if my explanation completely veered off course (as my explanation is rather abstract and verbose), I can clarify or reexplain as necessary.

from sqlprovider.

Thorium avatar Thorium commented on May 25, 2024

The initial problem is that both nvarchar and varchar are mapped to .NET String, and deciding from String it's not trivial what should it be converted to.

SQLProvider just creates an *.Data.SqlClient.SqlParameter without explicitly setting its DbType property. Which would go good if the database-driver could pick the correct parameter type by default.

Did you use Microsoft.Data.SqlClient or System.Data.SqlClient, and in which environment?

The driver then send to SQL-server some parametrized sql-query via ado, for example:

exec sp_executesql N'SELECT [emp].[FIRST_NAME] as ''FIRST_NAME'' FROM [dbo].[EMPLOYEES] as [emp] WHERE ([emp].[LAST_NAME] <> @param1) ',N'@param1 nvarchar(3)',@param1=N'ABC'

On this example it's trivial to see that LAST_NAME column is varchar (and thus N'@param1 nvarchar(3) is wrong). As we do have the database structures with column types in schema, we could try to set it manually. But right now the information is not "near", so we could either a) change the parameter-transfer to always include type with name, or b) some intelligent guess as lookup to guess the column type. But that goes quite hard because not all query parameters are directed database columns.

from sqlprovider.

TheJayMann avatar TheJayMann commented on May 25, 2024

In my case, I'm referencing both System.Data.SqlClient and Microsoft.Data.SqlClient, due to at the time I could never get one to work in all cases from design to build to runtime; however, I do believe that once at runtime it is ultimately making use of Microsoft.Data.SqlClient.

Conditional compilation has MSSQLSERVER_DYNAMIC used if MSBuildRuntimeType is set to Core (which is the case in CI and I assume in VSCode), and uses MSSQLSERVER otherwise (which is the case when in Visual Studio). I don't know if this is still needed, but it definitely was when I migrated from .NET Framework to .NET Core 3.1.

from sqlprovider.

TheJayMann avatar TheJayMann commented on May 25, 2024

I've been looking into this to see what can be done, if anything. With the current design time process, I don't see how any relevant information can be obtained during runtime. The example I am using to see what is available is a query that joins four tables via foreign keys, then filters based on a column of the first table. The where expression effectively ends up as _.Where(fun &tupledArg -> $tupledArg.Item1.GetColumn("ColumnName") = "param value"). At this point, the table name is lost, and I don't see a way to get to it (at least not without a significant amount of additional research).

During design time, all of the information is available, except that the scale is not retrieved, the precision is not used (neither of these are important in my specific example, but could possibly be if attempting to compare or insert decimal values), and the size, while present embedded in the TypeInfo field, is not available separately. There is, however, information to easily distinguish between char and nchar, as well as to determine maximum length, though not as easy. However, as far as I know, a provided property can only replace the direct invocation of the property, and cannot replace an outer expression containing an invocation of a property. Thus, if the designer were to add the necessary parameter information in the code generated by the property getter, it would be available during runtime generation of SQL from expression trees. While this would work, it doesn't appear to be the best option, as the only method I was able to imagine at this point would involve adding either optional parameters or overloads to the GetColumn family of methods to accept a DbType, size, scale, and precision parameter, of which the values are ignored. These values can be used either in the direct constructor of a parameter type, or by assigning these values on the properties given by the IDbDataParameter interface, which most, if not all, database specific parameter types implement. However, attempting to have the parameter properties determined automatically could be made more difficult if the column is to be wrapped in functions which return the same database type as the argument passed in, unless it is determined that column properties will only be used in a parameter object if the column is being compared directly to a parameterized value.

The other option, which allows greater flexibility with the drawback that one must implement it manually, would be to allow a "param" operator/function. It would take as arguments the value to be passed in as a parameter, as well as providing DbType, size, precision, and scale. The function itself when used outside of the expression tree would simply return the parameter value itself, ignoring all other arguments. When encountered in an expression tree, if the parameter value would be wrapped in a parameter object, the other values passed in to the function can be used to alter the parameter object, similar as mentioned above. I would assume if the parameter value would not normally be wrapped in a parameter object, then the function and the other parameters would be ignored, as if only the parameter value existed at this point.

Another thought that I had, which would be a more specific solution to this issue, rather than the more general ones mentioned above, would be for MSSQL provider to make use of ReadOnlySpan<byte> as char and varchar types. This is based on MSSQL making use of char and varchar for UTF8 encodings, as well as C# UTF8 string literals having the type ReadOnlySpan<byte>. After further thinking, this is probably not the best idea.

I do believe that I understand the code base enough now that I could implement one of the above ideas if desired.

from sqlprovider.

Thorium avatar Thorium commented on May 25, 2024

The problem is that you want your types to input parameters of queries, not to SqlEntities which are just outputs of select/view/...

So I think you would rather want some custom operator that would do cast to specific type:

query {
   for x in ctx do
   where (x.Name = CastVarChar("Joe"))
   select (x.Name, x.Age)
}

The good news is that this, if ok to you, would be way easier to implement.

from sqlprovider.

TheJayMann avatar TheJayMann commented on May 25, 2024

Automatically determining parameter types making use of column types is nicer, and it means that all code will likely be improved by doing nothing, but it is more difficult, likely has to be implemented in rather hacky ways, and has the potential to not work quite correctly in certain situations.

Having a function call (such as the CastVarChar example function you show) requires making code changes in order to use, but is better in all of the other ways, and I'd find it to be a more than acceptable alternative. That said, it would also be nice to be able to specify size, scale, and precision (in my case, since I'm not really making much use of decimal, only size really matters), as, when different parameters create different parameter sizes, it ends up being implemented as different query plans. This is not a major concern, but, having them available might be better.

from sqlprovider.

Thorium avatar Thorium commented on May 25, 2024

One problem is that Db types are database depentant, and what if you do e.g. where (firstname + ' ' + lastname = @param) and @param='John Doe', silly example, but auto-infering is hard, the queries can have nested other tables etc.

Another option could be configurable default mapping-table (plus cast hacks).

Besides I still think not picking the correct default parameter types is underlying driver problem rather than SQLProvider which is just an abstraction layer. They must have similar issues with other tools using Microsoft.Data.SqlClient.

from sqlprovider.

TheJayMann avatar TheJayMann commented on May 25, 2024

I believe the problem here is that SQLProvider is a bit too abstract. This is easily solvable when using System.Data.SqlClient, or Microsoft.Data.SqlClient directly by specifying the options when creating a SqlParameter object, or, at least, by modifying the DbType, Size, Scale, and Precision properties, which is common to all IDbDataParameter types. EFCore solves this by knowing how the columns have been specified, making use of IsUnicode(), HasMaxLength(), and HasPrecision().

I believe at this point in the discussion, attempting to automatically infer column types and making use of them in comparison operations is more difficult than may be acceptable for what it gains. Thus, I'm still in favor of a custom operator for specifying db types for parameters directly.

Also, given that I'd assume nearly all databases are nearly exclusively either char/varchar or nchar/nvarchar, perhaps even a static parameter to specify default text type.

from sqlprovider.

Thorium avatar Thorium commented on May 25, 2024

Another easy solution would be mapping nvarchar to String and varchar to char[].
But we don't want that because working with String is so much easier than char[], and it would be a breaking change.
I remember when MySQL version mapped database int to long for whatever reason and to change it working like all the other providers was that you have to go through your full code-base and remove L:s after each number when you updated SQLProvider, not good.

from sqlprovider.

Thorium avatar Thorium commented on May 25, 2024

On one-table-query it would be easy to pick the parameter type, but if it's a join, it's already more complex because several tables may have same column names and then you have to carry possible query table aliases etc.

from sqlprovider.

Thorium avatar Thorium commented on May 25, 2024

Funny enough, there is already .ToString() which is basically very near what you want:

query {
   for x in ctx.MyTable do
   where (x.Name = "Joe".ToString())
   select (x.Name, x.Age)
}

will generate:

select Name, Age 
from MyTable x
where x.Name = CAST(@param1 AS NVARCHAR(MAX))
-- @param1: 'Joe'

But instead nvarchar(max) you'd want varchar(max) I guess?
String does have method ".Normalize()" which could be used for this even though it's not exactly what that method is for.

from sqlprovider.

Thorium avatar Thorium commented on May 25, 2024

How custom operators work:
There is Operatos.fs and bottom is e.g. let StdDev (a:'a) = 1m, you want something like let ToVarChar (a:string) = ""
Then when you use the library you open FSharp.Data.Sql and you have this function in use
which you can use within query { ... }.

...but instead of you actually ever evaluating it, the query-parser will pick the Linq AST at SqlRuntime.Patterns.fs, see |SqlColumnGet| and translate it to CanonicalOperation containing CanonicalOp (you need a new op), which is then sent as this intermediate simplified AST model to database-specific GenerateQueryText which takes it in fieldNotation function and converts to corresponding SQL.

from sqlprovider.

Thorium avatar Thorium commented on May 25, 2024

I think we can do some trivial lookup from schemaCache to solve all the normal cases. Let me try to create a PR.

from sqlprovider.

Thorium avatar Thorium commented on May 25, 2024

Fix in 1.3.15

from sqlprovider.

TheJayMann avatar TheJayMann commented on May 25, 2024

I was just about to post about what I had done so far, creating a Param extension method and corresponding CannonicalOp, but then I see you got the automatic method working, and done in a fairly simple way as well. It would also be nice if size, scale, and precision could be pulled in as well, but having a matching DbType will have a far bigger impact.

I can test this soon, but my project is currently using 1.2.11, and the update to some version of 1.3 .* had some breaking changes, and thus upgrading to 1.3.15 and verifying the required changes didn't break anything else may take some time.

from sqlprovider.

Thorium avatar Thorium commented on May 25, 2024

#805

from sqlprovider.

TheJayMann avatar TheJayMann commented on May 25, 2024

For me, because I already had a significant amount of Async/Task mapping, and more advanced use of Async, including AsyncSeq (which was mostly able to be converted to TaskSeq), it was a bit more work. I did get it migrated, but, because of how much code has changed, I have to do some more thorough testing to make sure nothing else is broken. I did get a quick test in just before the work day ended, but it was still sending nvarchar parameters. I'll have time tomorrow to look more into it (just in case this happens to be an issue on my side).

from sqlprovider.

TheJayMann avatar TheJayMann commented on May 25, 2024

A workaround I have been able to use is, when I create a data context to be later used, I cast the type to ISqlDataContext (because, for some reason, CreateEntty is not available directly), then call CreateEntity() for each table where I want the columns to be cached. Code similar to the following worked for me.

  let data = Database.GetDataContext connStr
  do
    let sqlDataContext = data :> obj :?> FSharp.Data.Sql.Common.ISqlDataContext
    sqlDataContext.CreateEntity("schema.Table1") |> ignore
    sqlDataContext.CreateEntity("schema.Table2") |> ignore

from sqlprovider.

Thorium avatar Thorium commented on May 25, 2024

Makes sense to call getColumns, it will just fetch the same columns anyway from the cache if they are not there.

from sqlprovider.

Thorium avatar Thorium commented on May 25, 2024

Changed in version 1.3.16

from sqlprovider.

TheJayMann avatar TheJayMann commented on May 25, 2024

Nice. I'll verify this works come Monday, but I highly suspect it will.

from sqlprovider.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.