Comments (1)
Reproducer
With PostgreSQL ready, for instance:
docker run --name some-postgres \
-e POSTGRES_USER=postgres \
-e POSTGRES_PASSWORD=mypsw \
-e PGUSER=postgres \
-e PGPASSWORD=mypsw \
-p 5432:5432 \
-d "postgres:12"
Run the following Python script:
from ml_metadata.metadata_store import metadata_store
from ml_metadata.proto import metadata_store_pb2
from faker import Faker
fake = Faker()
def create_store():
connection_config = metadata_store_pb2.ConnectionConfig()
connection_config.postgresql.host = 'localhost'
connection_config.postgresql.port = '5432'
connection_config.postgresql.user = 'postgres'
connection_config.postgresql.password = 'mypsw'
connection_config.postgresql.dbname = 'pgserver'
# connection_config.postgresql.skip_db_creation = 'false'
# connection_config.postgresql.ssloption.sslmode = '...' # disable, allow, verify-ca, verify-full, etc.
# connection_config.postgresql.ssloption.sslcert = '...'
# connection_config.postgresql.ssloption.sslkey = '...'
# connection_config.postgresql.ssloption.sslpassword = '...'
# connection_config.postgresql.ssloption.sslrootcert = '...'
store = metadata_store.MetadataStore(connection_config, enable_upgrade_migration=True)
return store
def create_ctx_type(store):
experiment_type = metadata_store_pb2.ContextType()
experiment_type.name = "Experiment"
experiment_type.properties["note"] = metadata_store_pb2.STRING
experiment_type_id = store.put_context_type(experiment_type)
results = store.get_context_types()
print(results)
return experiment_type_id
def working(experiment_type_id):
my_experiment = metadata_store_pb2.Context()
my_experiment.type_id = experiment_type_id
my_experiment.name = "exp1"
my_experiment.properties["note"].string_value = "My first experiment."
[experiment_id] = store.put_contexts([my_experiment])
print(experiment_id)
result = store.get_contexts_by_id([experiment_id])
print(result)
def lorem(max_chars):
result = ""
while len(result) < max_chars:
result += fake.sentence() + " "
return result[:max_chars]
def not_working(experiment_type_id):
my_experiment = metadata_store_pb2.Context()
my_experiment.type_id = experiment_type_id
my_experiment.name = "exp_longnote"
my_experiment.properties["note"].string_value = lorem(2690) # up to ~2680 works
[experiment_id] = store.put_contexts([my_experiment])
print(experiment_id)
result = store.get_contexts_by_id([experiment_id])
print(result)
if __name__ == '__main__':
store = create_store()
experiment_type_id = create_ctx_type(store)
working(experiment_type_id)
not_working(experiment_type_id)
Resulting in:
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0303 16:19:42.571434 13378 postgresql_metadata_source.cc:208] Connecting to database.
I0303 16:19:42.576880 13378 postgresql_metadata_source.cc:215] Connection to database succeed.
I0303 16:19:43.322311 13378 postgresql_metadata_source.cc:208] Connecting to database.
I0303 16:19:43.327381 13378 postgresql_metadata_source.cc:215] Connection to database succeed.
[id: 10
name: "Experiment"
properties {
key: "note"
value: STRING
}
]
1
[id: 1
type_id: 10
name: "exp1"
properties {
key: "note"
value {
string_value: "My first experiment."
}
}
type: "Experiment"
create_time_since_epoch: 1709479183548
last_update_time_since_epoch: 1709479183548
]
E0303 16:19:43.555275 13378 postgresql_metadata_source.cc:128] Execution failed: ERROR: index row size 2712 exceeds btree version 4 maximum 2704 for index "idx_context_property_string"
DETAIL: Index row references tuple (0,2) in relation "contextproperty".
HINT: Values larger than 1/3 of a buffer page cannot be indexed.
Consider a function index of an MD5 hash of the value, or use full text indexing.
WARNING:absl:mlmd client InternalError: PostgreSQL metadata source error: ERROR: index row size 2712 exceeds btree version 4 maximum 2704 for index "idx_context_property_string"
DETAIL: Index row references tuple (0,2) in relation "contextproperty".
HINT: Values larger than 1/3 of a buffer page cannot be indexed.
Consider a function index of an MD5 hash of the value, or use full text indexing.
Traceback (most recent call last):
File "/home/tarilabs/git/demo20231009-mlmdpg/python/test3.py", line 61, in <module>
not_working(experiment_type_id)
File "/home/tarilabs/git/demo20231009-mlmdpg/python/test3.py", line 52, in not_working
[experiment_id] = store.put_contexts([my_experiment])
File "/home/tarilabs/git/demo20231009-mlmdpg/venv/lib64/python3.10/site-packages/ml_metadata/metadata_store/metadata_store.py", line 520, in put_contexts
self._call('PutContexts', request, response)
File "/home/tarilabs/git/demo20231009-mlmdpg/venv/lib64/python3.10/site-packages/ml_metadata/metadata_store/metadata_store.py", line 203, in _call
return self._call_method(method_name, request, response)
File "/home/tarilabs/git/demo20231009-mlmdpg/venv/lib64/python3.10/site-packages/ml_metadata/metadata_store/metadata_store.py", line 224, in _call_method
self._pywrap_cc_call(cc_method, request, response)
File "/home/tarilabs/git/demo20231009-mlmdpg/venv/lib64/python3.10/site-packages/ml_metadata/metadata_store/metadata_store.py", line 255, in _pywrap_cc_call
raise errors.make_exception(error_message.decode('utf-8'), status_code)
ml_metadata.errors.InternalError: PostgreSQL metadata source error: ERROR: index row size 2712 exceeds btree version 4 maximum 2704 for index "idx_context_property_string"
DETAIL: Index row references tuple (0,2) in relation "contextproperty".
HINT: Values larger than 1/3 of a buffer page cannot be indexed.
Consider a function index of an MD5 hash of the value, or use full text indexing.
Demonstrating is potentially a typo/overlook/misconfiguration of the Index which is limiting the amount of chars which can be stored in the MLMD string property:
PostgreSQL metadata source error: ERROR: index row size 2712 exceeds btree version 4 maximum 2704 for index "idx_context_property_string"
DETAIL: Index row references tuple (0,2) in relation "contextproperty".
HINT: Values larger than 1/3 of a buffer page cannot be indexed.
Consider a function index of an MD5 hash of the value, or use full text indexing.
from ml-metadata.
Related Issues (20)
- conda distribution HOT 4
- Cannot install ml-metadata v1.0.0 HOT 15
- Garbage collection of underlying artifacts HOT 11
- How do I get the stack trace? HOT 3
- Data too long for column 'string_value' at row 1 HOT 3
- Example for connecting to Google Cloud Vertex Metadata HOT 1
- Suitable for computer vision project? HOT 2
- Google.com HOT 1
- Python2_brutegram HOT 1
- fully support mysql 8.0 (`caching_sha2_password` authentication) HOT 3
- Fb.me Bruterforce attack
- Reaching the dev team to ask questions about mlmd implementation and roadmap HOT 2
- CVE-2023-44487 and CVE-2023-39325 affecting mlmd gRPC remote server? HOT 4
- ML Metadata as a Model Registry solution HOT 3
- Add support for command line server ssl options HOT 1
- Using ML metadata postgres SSL connection options don't work for `sslMode=require` HOT 1
- Is there a plan to support Linux on the ARM64 architecture?
- pip install ml-metadata==1.14.0 doesn't work with Python >= 3.11 HOT 2
- Add cmdline ssl options for mysql
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from ml-metadata.