Comments (14)
@thomaswitt I like your motivation! I am also a fan of OpenSearch.
If I understand correctly, using a GSI with all attributes projected could be wasting space but it does keep the items distributed as desired since it maintains a complete copy.
I do think you could accomplish what you need with an extension on top of Dynamoid, e.g. shared before-create actions and custom finders. Reworking the innards of Dynamoid to handle that would probably be possible as well, although it would be more involved.
from dynamoid.
I want to use Dynamo the way it's designed and intended to be used, and I would love a Ruby library that makes these patterns easier to use. Unfortunately, Dynamoid is not the answer today. @thomaswitt please let me know if you put any effort into your own adapter/gem. I would be happy to contribute.
from dynamoid.
Hi,
In short: Dynamoid doesn't explicitly support anything related to single-table design. Why? I suppose because its goal is to implement the ActiveRecord pattern.
It seems to me that the classic ActiveRecord pattern conflicts with the ideas of single-table design, with its kind of schemaless/multi-schema items. But I can easily imagine that the single-table approach could be implemented on top of an ORM like Dynamoid, or on top of any other DynamoDB client.
That is, I am not against supporting single-table design in Dynamoid. But I see benefits in separating new features from the existing ActiveRecord-like approach. How strong should this separation be? I don't know right now. It probably depends on how natural specific features look from the point of view of the ActiveRecord approach.
Regarding the proposed feature with a range prefix: I am not familiar with the concrete patterns of single-table design and don't know whether such range prefixes are a common, well-known pattern. Could you please point me to resources that describe such patterns?
from dynamoid.
I totally second this idea of @bmalinconico's. The prefix in range keys to differentiate between different types of data is THE access pattern in DynamoDB, especially when a prefix is combined with a date, like "FUEL_PRICE#2022-09-19"… I'd really appreciate it if Dynamoid supported this out of the box.
The advantages are obvious, especially in terms of pricing/capacity. Having lots of small tables, each with its own throttle settings, instead of one big table hinders application performance and simply costs money unnecessarily…
from dynamoid.
> Regarding the proposed feature with a range prefix. I am not familiar with the concrete patterns of the Single-Table design and don't know whether such range prefixes is a common/well known pattern. Could you please point at resources that describe such patterns?
@andrykonchin - Here's the official DynamoDB doc: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-sort-keys.html
from dynamoid.
@thomaswitt Thank you for the link. I think I now understand what you are talking about.
@bmalinconico Yep, it seems to be a common approach to have a synthetic, structured sort key. And it would be useful to support some predefined schemas.
On the other hand, options like `prefix_on_persistance` and `fixed_value` can be emulated with handwritten `before_save` hooks. So it would be a tiny enhancement.
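A minimal sketch of what such a hook could look like, written in plain Ruby so that it runs without Dynamoid or AWS. `PrefixedRangeKey`, `FuelPrice`, `range_prefix` and `apply_range_prefix` are hypothetical names, and the `metadata` attribute is only an assumed range key name; in a real model the method would be registered as a `before_save` callback instead of being called by hand.

```ruby
# Hypothetical mixin (not part of Dynamoid): prepends a type prefix
# to the range key value before the item is persisted.
module PrefixedRangeKey
  # Prefix defaults to the upcased, underscored class name, e.g. "FUEL_PRICE#".
  def range_prefix
    base = self.class.name.split('::').last
    snake = base.gsub(/([a-z\d])([A-Z])/, '\1_\2')
    "#{snake.upcase}#"
  end

  # Idempotent, so it is safe to run on every save and not only on create.
  def apply_range_prefix
    return if metadata.to_s.start_with?(range_prefix)

    self.metadata = "#{range_prefix}#{metadata}"
  end
end

# Plain-Ruby stand-in for a model with a hash key (id) and a range key (metadata).
class FuelPrice
  include PrefixedRangeKey
  attr_accessor :id, :metadata

  def initialize(id:, metadata:)
    @id = id
    @metadata = metadata
  end
end

price = FuelPrice.new(id: 'Berlin', metadata: '2022-09-19')
price.apply_range_prefix
price.apply_range_prefix # second call leaves the value unchanged
puts price.metadata
```

Making the hook idempotent sidesteps the question of whether it should run on creation only or on every update.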
from dynamoid.
@andrykonchin Yes, you could basically already write this with the current version, but I'd say convention over configuration is a very old, tried-and-true mantra of Rails.
Apart from that, the way @bmalinconico described it is the way DynamoDB was originally intended to be used. I have run several huge DynamoDB-based applications with tons of data and gazillions of rows. That's the only way to keep it scaling, and basically every AWS engineer at an AWS summit will agree.
Dynamoid's current default, the multi-table approach, has a lot of drawbacks when you put it into production. That's why I think the project should offer more defaults along the lines of @bmalinconico's idea.
The way STI is currently implemented with the Type field is basically a band-aid. It should be a prefix in the range key. That would also make lots of other things easy.
When I started using Dynamoid I ran into this problem, for example: #501. It would be much easier if, in that example, Company and Report had the same primary key, so that you could filter via the range key for what you really want to get, either the company data or the report data.
I started using Dynamoid because it was very convenient, and (against my better judgment) used the multi-table approach in the beginning. I later had to painfully rewrite the whole app to use STI in a single table, with all model classes inheriting from a base class which defines the table. There is still a lot of code in my application like `Model.where(id: id_to_search_for, 'metadata.begins_with': 'REPORT#')`, etc. With smart range keys like `COMPANY#` and `COMPANY#REPORT` you can easily get a company and all its reports with a single query.
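To illustrate the last point, here is a small in-memory sketch of how a `begins_with` condition on prefixed range keys selects either a company together with its reports or the reports alone. It does not talk to DynamoDB; the `items` array, the `query` helper and all sample values are made up for illustration.

```ruby
# In-memory stand-in for a single table: items with the same id share a
# partition, and metadata plays the role of the range key.
items = [
  { id: 'acme',  metadata: 'COMPANY#', name: 'Acme Inc.' },
  { id: 'acme',  metadata: 'COMPANY#REPORT#2022-01', title: 'January' },
  { id: 'acme',  metadata: 'COMPANY#REPORT#2022-02', title: 'February' },
  { id: 'acme',  metadata: 'INVOICE#0001', total: 100 },
  { id: 'other', metadata: 'COMPANY#', name: 'Other Ltd.' }
]

# Emulates a query on the hash key plus a begins_with condition on the
# range key; results come back ordered by range key.
def query(items, id:, begins_with:)
  items
    .select { |item| item[:id] == id && item[:metadata].start_with?(begins_with) }
    .sort_by { |item| item[:metadata] }
end

# One query returns the company item followed by all of its reports.
company_and_reports = query(items, id: 'acme', begins_with: 'COMPANY#')

# A longer prefix narrows the same partition down to the reports only.
reports_only = query(items, id: 'acme', begins_with: 'COMPANY#REPORT')

puts company_and_reports.map { |item| item[:metadata] }.inspect
puts reports_only.map { |item| item[:title] }.inspect
```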
All in all - just my two cents, I'd really appreciate if the single table approach would be promoted more as at least one of two default approaches and easily supported within the software, without having to write hooks, etc.
from dynamoid.
Could you elaborate a bit more on how the prefix is supposed to work? What value should it be added to? At what moment: on creation, or on every model update?
from dynamoid.
@andrykonchin I would need to think longer about a full design, but here is how I would approach it if you configure Dynamoid that way (let's say via `config.single_table_design = true`):
- In this mode Dynamoid requires a range key to be present. I would go for defaults like `key: :id` and `range: :metadata`, as this is the default used by AWS, e.g. in the NoSQL Workbench modeler. You can of course still override it if you want other key and range key names.
- `:id` is generated by Dynamoid as a UUID when not specified, but can be overridden (`id: 'Berlin'`).
- `:metadata` is predefined as `created_at` and will be automatically expanded to `"#{type.upcase}##{created_at}"`. So the `prefix_on_persistance` is by default the model name (type), but can be overridden in the way @bmalinconico proposed (or alternatively fixed, as he proposed as well), like `range :metadata, prefix_on_persistance: "CUSTOMPREFIX#"`. Also, when defining your object, there should be a reserved keyword like `range_id`, which is set to `created_at` by default, but which you can override in case you want a range key like `"USER#<customData_from_range_id>"`. You could then define your own range ID, like an email or whatever, or change the range key from `created_at` to `updated_at`, or even have it dynamically expanded and chained like `COMMENT#[email protected]#2022-03-12T00:22:33.144Z` when you set the range key to `"{user_id}#{created_at}"`, etc.
- The table definition (name, capacity_mode) should be defined in a base class like `DynamoidBase`, and all models should inherit from this base class by convention (`Employee < DynamoidBase`).
- When I do a `where` search, I can either just look for the id and get multiple results, or use a helper function to look directly for key plus range key, like `Comment.find('excellent-post-1234', '[email protected]')`, which would expand to `Comment.where(id: 'excellent-post-1234', 'metadata': 'COMMENT#[email protected]')`.
- I would also potentially include intuitive helpers for range key lookups on time series data etc., for `begins_with`, `gt`, and so on. For example (not yet a well-thought-out API, just an idea):
  - `Document.find('docset1234', '2022-03-12T00:22:33.144Z')` -> `Document.where(id: 'docset1234').where(metadata: '2022-03-12T00:22:33.144Z')`
  - `Document.find('docset1234', '2022-03-*')` -> `Document.where(id: 'docset1234').where('metadata.begins_with': '2022-03-')`
  - `Address.find('Berlin', '10115', '10178')` -> `Address.where(id: 'Berlin').where('metadata.between': [10115, 10178])`
  - `Address.find('Berlin', '>10115')` -> `Address.where(id: 'Berlin').where('metadata.gt': 10115)`
  - `Address.find('Berlin', '>=10115')` -> `Address.where(id: 'Berlin').where('metadata.gte': 10115)`
  - `Address.find('Berlin', '<10115')` -> `Address.where(id: 'Berlin').where('metadata.lt': 10115)`
  - `Address.find('Berlin', '<=10115')` -> `Address.where(id: 'Berlin').where('metadata.lte': 10115)`
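The shorthand in these examples could be translated into condition hashes roughly as follows. This is only a sketch of the idea, not an existing Dynamoid API: `range_condition` is a hypothetical helper, it keeps all values as strings (the examples above convert postal codes to integers), and a real implementation would still have to add the type prefix to the range key.

```ruby
# Hypothetical helper: turns the shorthand range arguments into a
# condition hash keyed by "<attribute>.<operator>" symbols.
def range_condition(attribute, *args)
  unless [1, 2].include?(args.size)
    raise ArgumentError, 'expected one or two range arguments'
  end

  # Two arguments are read as the bounds of a between condition.
  return { :"#{attribute}.between" => args.map(&:to_s) } if args.size == 2

  value = args.first.to_s
  case value
  # Two-character operators must be checked before ">" and "<".
  when /\A>=(.+)\z/ then { :"#{attribute}.gte" => Regexp.last_match(1) }
  when /\A<=(.+)\z/ then { :"#{attribute}.lte" => Regexp.last_match(1) }
  when /\A>(.+)\z/  then { :"#{attribute}.gt" => Regexp.last_match(1) }
  when /\A<(.+)\z/  then { :"#{attribute}.lt" => Regexp.last_match(1) }
  # A trailing "*" means "starts with everything before the star".
  when /\A(.+)\*\z/ then { :"#{attribute}.begins_with" => Regexp.last_match(1) }
  else { attribute.to_sym => value }
  end
end

puts range_condition(:metadata, '2022-03-*').inspect
puts range_condition(:metadata, '>=10115').inspect
puts range_condition(:metadata, '10115', '10178').inspect
```

A `find` wrapper along the lines proposed above could then combine the hash key lookup with the returned hash in a single `where` call.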
from dynamoid.
@andrykonchin Hey Andrii, just checking in on whether you had time to think about those suggestions…
from dynamoid.
@andrykonchin Just a little ping. Have you given those ideas some thought?
from dynamoid.
I use STI with Dynamoid. My approach is to not have a range key and to use shared GSI columns with redundant data. GSI columns are set with before actions and can include values from multiple columns as needed. Much of this logic can be abstracted into a parent class, so using it in the models isn't excessive. I have 5 GSIs that are string-string and 5 that are string-number. If I needed pizzas created by user 1 sorted by timestamp, I would use a GSI, e.g. `Pizza#User#1,2024-02-01`. To further filter to store 2, you could have a GSI with `Pizza#Store#2#User#1,2024-02-01`.
HK | Type | Name | Code | GSI_HK1 | GSI_RK1 |
---|---|---|---|---|---|
a#Pizza#pizza | Pizza | Pizza | 2024-02-01 | ||
a#PizzaTopping#onions | PizzaTopping | Onions | onions | PizzaTopping | onions |
a#PizzaTopping#pepperoni | PizzaTopping | Pepperoni | pepperoni | PizzaTopping | pepperoni |
If you don't like the redundant data and want to keep using range keys I suppose you could set the range key with a before action. You'd need to create new or override existing finders, however.
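A rough sketch of how such redundant GSI columns could be filled in a before action, in plain Ruby without Dynamoid. The helper `gsi_hash_key`, the column names `gsi_hk1`/`gsi_rk1`/`gsi_hk2`/`gsi_rk2` and the sample record are assumptions for illustration only.

```ruby
# Hypothetical helper: builds keys such as "Pizza#User#1" or
# "Pizza#Store#2#User#1" from a type and an ordered list of scopes.
def gsi_hash_key(type, scopes)
  ([type] + scopes.flat_map { |name, value| [name, value] }).join('#')
end

# Plain hash standing in for a model instance about to be saved.
pizza = { type: 'Pizza', user_id: 1, store_id: 2, created_on: '2024-02-01' }

# GSI 1: pizzas created by a user, sorted by date.
pizza[:gsi_hk1] = gsi_hash_key(pizza[:type], 'User' => pizza[:user_id])
pizza[:gsi_rk1] = pizza[:created_on]

# GSI 2: the same lookup narrowed to one store.
pizza[:gsi_hk2] = gsi_hash_key(pizza[:type],
                               'Store' => pizza[:store_id],
                               'User' => pizza[:user_id])
pizza[:gsi_rk2] = pizza[:created_on]

puts [pizza[:gsi_hk1], pizza[:gsi_rk1]].join(',')
puts [pizza[:gsi_hk2], pizza[:gsi_rk2]].join(',')
```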
from dynamoid.
@ckhsponge I understand your approach, but the range key was invented for a reason (also in terms of data distribution). The range key comes in very handy, especially as the idea in Dynamo is that you don't delete by default but rather insert new data, so that you have a built-in history. In that sense Dynamoid is written in a way that tries to emulate ActiveRecord but does not embrace the ideas of DynamoDB.
Unfortunately @andrykonchin doesn't seem to be open to or interested in building the other way I described above, which is closer to how DynamoDB wants it, so I am considering writing my own lightweight gem/adapter to embrace these concepts.
This makes sense especially in combination with OpenSearch/Elasticsearch, which is currently also a PITA, as most gems (like Searchkick) won't work out of the box. DynamoDB + OpenSearch is a very powerful combination, and it deserves to be supported for Rails out of the box the way it's meant to be.
from dynamoid.
@bholzer I agree. Would you be open to throwing some ideas together and outlining what should be in scope for a Ruby lib?
from dynamoid.