Comments (9)

mavxg commented on July 17, 2024

Please ignore the second memory leak. I think I may have imagined it, as I cannot replicate it. The line I thought was leaking works fine, and its object is cleaned up by this code further down in the file:

Py_XDECREF(objCell);

from pyodbc.

v-chojas commented on July 17, 2024

Could you post an ODBC trace?

gordthompson commented on July 17, 2024

Reproduced on Ubuntu 20.04. Both .rss and .vms increase. Same behaviour if cursor is created, consumed (.fetchall()), and closed for each iteration.

gordthompson commented on July 17, 2024

ODBC trace logs for "execute with float" and "execute with str", using 5_000 rows. (Trace logs for 1_000_000 rows were almost 500 MiB each unzipped.)

exec_float.log.zip

exec_str.log.zip

gordthompson commented on July 17, 2024

Note also that this is not a version 5.x regression; version 4.0.39 produces the same results.

v-chojas commented on July 17, 2024

No need to run it with that many rows, since the pattern shows up with far fewer.

I think the code is missing a Py_XDECREF(param) after this line: https://github.com/mkleehammer/pyodbc/blob/master/src/params.cpp#L1268

gordthompson commented on July 17, 2024

I added this to my local build but the leak persists.

@@ -1264,10 +1264,11 @@ bool BindParameter(Cursor* cur, Py_ssize_t index, ParamInfo& info)
 
             for(i=0;i<ncols;i++)
             {
                 // Bind the TVP's columns --- all need to use DAE
                 PyObject *param = PySequence_GetItem(row, i);
+                Py_XDECREF(param);
                 GetParameterInfo(cur, i, param, info.nested[i], true);
                 info.nested[i].BufferLength = info.nested[i].StrLen_or_Ind;
                 info.nested[i].StrLen_or_Ind = SQL_DATA_AT_EXEC;
 
                 Py_BEGIN_ALLOW_THREADS

mavxg commented on July 17, 2024

Some more testing that might help.

The leak still occurs if you take the creation of the data object out of the loop. This seems to imply that memory allocated within pyodbc is being leaked, rather than there being a reference count issue with the passed-in data. My guess is that it comes from the following code in cursor.cpp, where GetParameterInfo seems to be called for each data item and encodes the string into a byte array whose cleanup is somehow missing. (The leak doesn't occur if you provide bytes parameters by doing the encode on the Python side, but it does for str.)

https://github.com/mkleehammer/pyodbc/blob/ff1dd2318ec7792f9f031c6ea3966852ac013496/src/cursor.cpp#L855C22-L863C44

mavxg commented on July 17, 2024

I have found a related memory leak (one that I think may have been spotted before): when you insert a long string with cursor.fast_executemany=True, you get a leak if the column you are inserting into is varchar(MAX).

Both of these memory leaks have the same cause, but in two different places where some code is duplicated.

The TVP issue arises because in GetUnicodeInfo you get a new PyObject from PyCodec_Encode with a ref count of 1. Although this object is wrapped by encoded, the Detach call means the wrapper will not decrement the ref count in its destructor. For normal calls this is the correct behaviour, as the decrement happens in FreeInfos with the line Py_XDECREF(a[i].pObject);. For TVP parameters, FreeInfos only cleans up the top-level parameter and not all the items in the sequence, so this object is leaked.
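
To make the ownership hand-off described above concrete, here is a minimal, self-contained sketch that does not use the Python C API or any pyodbc code. Everything in it is hypothetical and only for illustration: FakeObject stands in for a reference-counted PyObject, Holder for the wrapper around encoded, Info for a ParamInfo with nested entries, and free_top_level / free_recursive for two possible cleanup strategies. It is a model of the pattern under those assumptions, not the actual pyodbc implementation or its fix.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a reference-counted PyObject.
struct FakeObject {
    int refcount = 1;  // a freshly created object starts with one reference
};

static int g_live_objects = 0;  // number of FakeObjects not yet freed

FakeObject* new_object() {
    ++g_live_objects;
    return new FakeObject();
}

// Drop one reference; free the object when the count reaches zero.
void decref(FakeObject* o) {
    if (o && --o->refcount == 0) {
        --g_live_objects;
        delete o;
    }
}

// Hypothetical RAII holder: releases its reference in the destructor
// unless Detach() has handed ownership to someone else.
class Holder {
public:
    explicit Holder(FakeObject* o) : p_(o) {}
    ~Holder() { decref(p_); }
    Holder(const Holder&) = delete;
    Holder& operator=(const Holder&) = delete;
    FakeObject* Detach() {
        FakeObject* o = p_;
        p_ = nullptr;  // destructor will now do nothing
        return o;
    }
private:
    FakeObject* p_;
};

// Hypothetical parameter record with optional nested (TVP column) records.
struct Info {
    FakeObject* pObject = nullptr;
    std::vector<Info> nested;
};

// Create an "encoded" object and detach it into the record,
// so the record is now responsible for releasing it.
void fill(Info& info) {
    Holder encoded(new_object());
    info.pObject = encoded.Detach();
}

// Cleanup that only visits the top-level records.
void free_top_level(std::vector<Info>& a) {
    for (auto& i : a) {
        decref(i.pObject);
        i.pObject = nullptr;
    }
}

// Cleanup that also visits nested records.
void free_recursive(std::vector<Info>& a) {
    for (auto& i : a) {
        free_recursive(i.nested);
        decref(i.pObject);
        i.pObject = nullptr;
    }
}

// Bind one top-level parameter with `ncols` nested columns, run the chosen
// cleanup, and report how many objects are still alive afterwards.
int leaked_after(bool recursive, std::size_t ncols) {
    g_live_objects = 0;
    std::vector<Info> params(1);
    fill(params[0]);
    params[0].nested.resize(ncols);
    for (auto& n : params[0].nested) fill(n);

    if (recursive) free_recursive(params);
    else free_top_level(params);

    int leaked = g_live_objects;
    free_recursive(params);  // tidy up so the sketch itself does not leak
    assert(g_live_objects == 0);
    return leaked;
}
```

In this model, calling leaked_after(false, ncols) leaves one object alive per nested column, which mirrors the per-item leak described for TVP parameters, while leaked_after(true, ncols) leaves none. Whether the real fix belongs in the cleanup routine or at the Detach site is a design decision for the pyodbc maintainers.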

This is the same issue as with cursor.fast_executemany=True when you have a column with MAX size, because the Detach on this line means the reference count is not reduced by the destructor:

pParam->cell = encoded.Detach();
