collective / collective.exportimport
Export and import content and other data from and to Plone
License: GNU General Public License v2.0
As with the import (solved in #4), exporting a lot of data can eat up all the memory on the machine. The Python dict that holds the data during export, before it is written to a file as JSON, can become quite large if you choose to include base64-encoded binary data.
I have the use-case to export 60GB of content in files.
Options:
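One possible option, as a minimal sketch: write each serialized item to the file as soon as it is produced instead of collecting everything in one list first. The function name write_json_stream is hypothetical and not part of collective.exportimport; the sketch only assumes that the items can be produced lazily, e.g. by a generator.

```python
import io
import json


def write_json_stream(items, fileobj):
    """Write an iterable of dicts as one JSON array, one item at a time."""
    fileobj.write("[")
    for index, item in enumerate(items):
        if index:
            fileobj.write(",")
        fileobj.write("\n")
        # Only the current item is held as a serialized string.
        json.dump(item, fileobj, sort_keys=True)
    fileobj.write("\n]")


# Usage with a generator, so items are created and written one by one.
buffer = io.StringIO()
write_json_stream(({"@id": "item-%d" % i} for i in range(3)), buffer)
exported = json.loads(buffer.getvalue())
```

The result is still a single valid JSON array, so an importer that expects the current format would not need to change; only the peak memory use during export differs.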
We have a site with a lot of content admin traffic, and new articles appear on the old site after we have exported its content. So my idea is to save the export date for each portal_type in a small portal annotation dict and add a checkbox on the @@export_content page to export types after the last_export_date ... this date could/should also be shown in the type selector label ...
While test-using collective.exportimport I'm running into edge cases. I could solve these by adding more code to a subclassed export_content or import_content view in my own custom migrationhelper package.
But the edge case is often 1 field on 1-3 content items in an 8-year-old site. It's just not worth trying to catch these in code; it is easier to fix them in the source site. See #12 as an example.
The main cause of validation errors is the schema validation running at the deserializer step of import_content in plone.restapi. plone.restapi catches all validation errors and re-raises them with a generic ValidationError class that only shows the field, but not the error. (https://github.com/plone/plone.restapi/blob/f89276054088340b3ec6775db6280b1dc46f0866/src/plone/restapi/deserializer/dxcontent.py#L55-L60)
To support fixing these blips of migration issues, it would be nice to have a 'dry_run' option in import_content that tries to create and deserialize each JSON item, catches any validation errors and outputs a report of the original errors coming from https://github.com/plone/plone.restapi/blob/f89276054088340b3ec6775db6280b1dc46f0866/src/plone/restapi/deserializer/dxcontent.py#L45
For this to work we'd need a new feature in plone.restapi to not swallow the ValidationErrors as shown above.
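The reporting part of such a dry run could look roughly like the sketch below. Everything here is hypothetical: dry_run and fake_deserialize are made-up names, and the stand-in validator only mimics a deserializer that raises on bad data. A real dry run would also have to discard the created objects (e.g. by aborting the transaction), which is not shown.

```python
def dry_run(items, deserialize):
    """Try to deserialize every item and collect the errors in a report."""
    report = []
    for item in items:
        try:
            deserialize(item)
        except Exception as error:  # collect instead of stopping the import
            report.append({"@id": item.get("@id"), "error": repr(error)})
    return report


def fake_deserialize(item):
    """Hypothetical stand-in for a deserializer that validates event_url."""
    if item.get("event_url") == "":
        raise ValueError("The specified URI is not valid.")


report = dry_run(
    [{"@id": "a", "event_url": None}, {"@id": "b", "event_url": ""}],
    fake_deserialize,
)
```

The report lists one entry per failing item, which is enough to go back to the source site and fix the handful of offending objects there.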
Relative links to browser-views are replaced to link to the current objects parent:
(Pdb++) from collective.exportimport.fix_html import html_fixer
(Pdb++) text = """<p><a href="edit">Link to a browser view</a></p>"""
(Pdb++) html_fixer(text, self.context)
'<p><a data-linktype="internal" data-val="7eb11200ba09ec174b74f24d2fb6f0c1" href="resolveuid/7eb11200ba09ec174b74f24d2fb6f0c1">Link to @@edit view</a></p>'
(Pdb++) self.context.__parent__.UID()
'7eb11200ba09ec174b74f24d2fb6f0c1'
Since creation_date is moved to created when you activate data migration, the original creation date is not set ... leaving it on creation_date works also for DX...
An Event exported from a Plone 4.3.6 site has the event_url field set to an empty string if the field on the original object was not set.
As a result, when importing this object into a 5.2 site, this WARNING will be logged:
WARNING [collective.exportimport.import_content:310] cannot deserialize http://localhost:.../...: BadRequest([{'message': 'The specified URI is not valid.', 'field': 'event_url', 'error': 'ValidationError'}],)
This happens in ImportContent.import_new_content():
# import using plone.restapi deserializers
deserializer = getMultiAdapter((new, self.request), IDeserializeFromJson)
try:
    new = deserializer(validate_all=False, data=item)
except Exception as error:
    logger.warning(
        "cannot deserialize {}: {}".format(item["@id"], repr(error))
    )
    continue
One of the bad side effects of this is that the following line is never reached for this object:
new.creation_date_migrated = creation_date
Therefore, when later running @@reset_dates, this Event will produce this error:
AttributeError: creation_date_migrated
This is because acquisition causes getattr(obj, "creation_date_migrated", None) to be true-ish, but del obj.creation_date_migrated raises AttributeError:
created = getattr(obj, "creation_date_migrated", None)
if created and created != obj.creation_date:
    obj.creation_date = created
    del obj.creation_date_migrated
    obj.reindexObject(idxs=["created"])
All of this can be avoided by setting item['event_url'] = None, which is what I did in a custom global_dict_hook():
def global_dict_hook(self, item):
    url = item.get('event_url', None)
    if url == '':
        item['event_url'] = None
    return item
Ultimately, I'm not sure if this should be addressed in plone.restapi or here, but it's one of various issues (like #12) that affect default content types and should not require custom hooks.
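The acquisition problem described above can be illustrated without Zope. The sketch below uses a hypothetical FakeAcquirer class as a plain-Python stand-in for an acquisition-wrapped object, and a hypothetical reset_creation_date helper that only acts when the marker attribute is stored on the object itself. In a real site the usual tool for this check would be aq_base from the Acquisition package; that variant is not shown here.

```python
class FakeAcquirer(object):
    """Stand-in for acquisition: missing attributes fall back to the parent."""

    def __init__(self, parent=None):
        self.__dict__["_parent"] = parent

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        parent = self.__dict__.get("_parent")
        if parent is None:
            raise AttributeError(name)
        return getattr(parent, name)


def reset_creation_date(obj):
    """Restore creation_date only if the marker is set on obj itself."""
    own = vars(obj)
    if "creation_date_migrated" not in own:
        return False  # value would only be acquired, nothing to delete
    created = own["creation_date_migrated"]
    if created and created != own.get("creation_date"):
        obj.creation_date = created
    del obj.creation_date_migrated
    return True


folder = FakeAcquirer()
folder.creation_date_migrated = "2013-01-01"
event = FakeAcquirer(parent=folder)  # deserialization failed, no marker set
acquired = getattr(event, "creation_date_migrated", None)
event_reset = reset_creation_date(event)
folder_reset = reset_creation_date(folder)
```

Here getattr on the event returns the folder's value, which is exactly the true-ish result that leads to the failing del in the original code, while the guarded helper skips the event and resets only the folder.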
User will go to the site setup and click on export and get a UI something like
-------------------------------------------------------------
| Warning: multiple exports selected. Download will be tar.gz |
-------------------------------------------------------------
# Exports
[x] Content
[x] File/Images
[ ] Users
[ ] Content Tree
[ ] Relations
[ ] Translations
[ ] Local Roles
[ ] Default Page Mapping
[ ] Object Positions in Parent
[ ] Comments
# Content Export
{query widget}
Type: Page
Path: /news depth:1
Path: /other-news depth:1
Creation date: > 1/1/20018
Selected content (21 items)
--------------------------------
| /news/item1
| /news/item2
| /other-news/big-news
--------------------------------
# File/Images Export
(o) url/path
( ) binary in tar.gz
( ) base64 encoded in json
[Download] [Save to Server] [Dry Run] [Cancel]
Features this adds
Missing closing tag </form> in export_content.pt leads to a traceback in Plone 4.3.
Exported collections have the wrong @id attribute set.
[
    {
        "@id": "http://localhost:8080/Plone/@@export_content",
        "@type": "Collection",
        "UID": "e6b3bf21738d4866b9acd1d0e7a1cf51",
        "allow_discussion": false,
        "..."
    }
]
I was able to bypass this with a custom export view:
from collective.exportimport.export_content import ExportContent
class CustomExportContent(ExportContent):
    def dict_hook_collection(self, item, obj):
        """Use this to modify or skip the serialized data by type.
        Return the modified dict (item) or None if you want to skip this particular object.
        """
        # Fix the @id for collections, which is set to "@@export_content" because of the HypermediaBatch in plone.restapi
        item["@id"] = obj.absolute_url()
        return item
For the import, only the @id is required, although there are more properties with the wrong URL:
"batching": {
    "@id": "http://localhost:8080/Plone/@@export_content",
    "first": "http://localhost:8080/Plone/@@export_content?b_start=0",
    "last": "http://localhost:8080/Plone/@@export_content?b_start=50",
    "next": "http://localhost:8080/Plone/@@export_content?b_start=25"
},
@pbauer, any opinion on how to fix this? Would it be ok to add the hook to the add-on directly?
Some changes need to be done to the serialized data when this tool is used for a migration (e.g. from Plone 4 to Plone 5 or 6) and/or from Archetypes to Dexterity.
To make this easier we could add a checkbox (checked by default) "Modify data for migrations".
If this is checked then some modifiers will run during export.
These could include:
Some data that restapi includes is useless for migrations, e.g. @components, next_item, previous_item.
Relations are migrated separately. Having them in the data will mess up the site. This is probably easiest done by switching on custom serializers for IReferenceField (AT) and IRelationChoice and IRelationList (DX) that return None.
# Migrate AT to DX
if item.get("expirationDate"):
    item["expires"] = item["expirationDate"]
if item.get("effectiveDate"):
    item["effective"] = item["effectiveDate"]
if item.get("excludeFromNav"):
    item["exclude_from_nav"] = item["excludeFromNav"]
if item.get("subject"):
    item["subjects"] = item["subject"]
see #10
TextField-export in Archetypes: Inspecting the AT-schema and applying a change for all Textfields if the RichtextWidget is not used (which means the field is probably Text in DX and not RichText).
# In Archetypes Text is handled the same as RichText
if isinstance(item.get("description", None), dict):
    item["description"] = item["description"]["data"]
if isinstance(item.get("rights", None), dict):
    item["rights"] = item["rights"]["data"]
Some criteria have changed, e.g.
query = item.pop("query", [])
for crit in query:
    if crit["o"].endswith("relativePath") and crit["v"] == "..":
        crit["v"] = "..::1"
    if crit["i"] == "portal_type" and crit["o"].endswith("selection.is"):
        crit["o"] = "plone.app.querystring.operation.selection.any"
Use the code in https://github.com/collective/collective.migrationhelpers/blob/master/src/collective/migrationhelpers/images.py to fix links to images and make them editable in TinyMCE.
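The dict-level modifiers listed above could be bundled into one function that runs when the proposed checkbox is checked. This is only a sketch: the name modify_for_migration and the three constants are hypothetical, the key names and criteria rewrites are taken from the snippets above, and the relation and image-link handling is left out.

```python
DROP_KEYS = ("@components", "next_item", "previous_item")
AT_TO_DX = {
    "expirationDate": "expires",
    "effectiveDate": "effective",
    "excludeFromNav": "exclude_from_nav",
    "subject": "subjects",
}
PLAIN_TEXT_FIELDS = ("description", "rights")


def modify_for_migration(item):
    """Apply the migration modifiers to one serialized item in place."""
    # Drop data that is useless for migrations.
    for key in DROP_KEYS:
        item.pop(key, None)
    # Copy Archetypes field values to their Dexterity names.
    for old, new in AT_TO_DX.items():
        if item.get(old):
            item[new] = item[old]
    # Unwrap rich-text style dicts for fields that are plain text in DX.
    for fieldname in PLAIN_TEXT_FIELDS:
        value = item.get(fieldname)
        if isinstance(value, dict):
            item[fieldname] = value["data"]
    # Update collection criteria that changed.
    for crit in item.get("query") or []:
        if crit["o"].endswith("relativePath") and crit["v"] == "..":
            crit["v"] = "..::1"
        if crit["i"] == "portal_type" and crit["o"].endswith("selection.is"):
            crit["o"] = "plone.app.querystring.operation.selection.any"
    return item


example = modify_for_migration({
    "@components": {},
    "expirationDate": "2030-01-01T00:00:00",
    "description": {"data": "Hello", "content-type": "text/plain"},
    "query": [
        {"i": "path", "o": "plone.app.querystring.operation.string.relativePath", "v": ".."},
        {"i": "portal_type", "o": "plone.app.querystring.operation.selection.is", "v": ["Document"]},
    ],
})
```

Keeping the renames and dropped keys in module-level constants would make it easy for a subclass or a custom hook to extend them for project-specific fields.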
Curious, because the function itself takes a debug parameter. (Plone 5.2 instance)
Traceback:
INFO:interpreter:Exporting relations from site
Traceback (most recent call last):
File "/home/user/optplone/deployments/master/parts/client1/bin/interpreter", line 293, in <module>
exec(_val)
File "<string>", line 1, in <module>
File "/home/user/optplone/deployments/master/src/ruddocom.policy/ruddocom/policy/export_embedded.py", line 101, in <module>
full_export(site, exportpath, outputpath, what=what)
File "/home/user/optplone/deployments/master/src/ruddocom.policy/ruddocom/policy/export_embedded.py", line 74, in full_export
export_view()
File "/home/user/optplone/deployments/master/src/collective.exportimport/src/collective/exportimport/export_other.py", line 71, in __call__
all_stored_relations = self.get_all_references(debug)
File "/home/user/optplone/deployments/master/src/collective.exportimport/src/collective/exportimport/export_other.py", line 136, in get_all_references
if self.debug:
AttributeError: 'SimpleViewClass from /home/user/optplone/deploymen' object has no attribute 'debug'
git blame
2a0ff06f src/collective/exportimport/export_other.py (Philip Bauer 2021-06-03 14:55:24 +0200 135) }
2a0ff06f src/collective/exportimport/export_other.py (Philip Bauer 2021-06-03 14:55:24 +0200 136) if self.debug:
2a0ff06f src/collective/exportimport/export_other.py (Philip Bauer 2021-06-03 14:55:24 +0200 137) item["from_path"] = from_brain[0].getPath()
2a0ff06f src/collective/exportimport/export_other.py (Philip Bauer 2021-06-03 14:55:24 +0200 138) item["to_path"] = to_brain[0].getPath()
52acc20d src/collective/exportimport/export_other.py (Thibaut Born 2021-11-29 13:40:14 +0100 139) item = self.reference_hook(item)
280c5cf9 src/collective/exportimport/export_other.py (Thibaut Born 2021-11-30 11:20:42 +0100 140) if item is None:
280c5cf9 src/collective/exportimport/export_other.py (Thibaut Born 2021-11-30 11:20:42 +0100 141) continue
2a0ff06f src/collective/exportimport/export_other.py (Philip Bauer 2021-06-03 14:55:24 +0200 142) results.append(item)
I have several relation fields in a portlet. Exporting portlets then fails because a RelationValue is not JSON serializable:
http://localhost:9152/plone/@@export_portlets
Traceback (innermost last):
Module ZPublisher.Publish, line 138, in publish
Module ZPublisher.mapply, line 77, in mapply
Module ZPublisher.Publish, line 48, in call_object
Module collective.exportimport.export_other, line 477, in __call__
Module json, line 251, in dumps
Module json.encoder, line 209, in encode
Module json.encoder, line 431, in _iterencode
Module json.encoder, line 332, in _iterencode_list
Module json.encoder, line 408, in _iterencode_dict
Module json.encoder, line 408, in _iterencode_dict
Module json.encoder, line 332, in _iterencode_list
Module json.encoder, line 408, in _iterencode_dict
Module json.encoder, line 408, in _iterencode_dict
Module json.encoder, line 442, in _iterencode
Module json.encoder, line 184, in default
TypeError: <z3c.relationfield.relation.RelationValue object at 0x114132050> is not JSON serializable
Somewhat related is this comment from Philip where he removes some relations code, although I guess this was only active when exporting content, and not portlets.
The following diff in the portlet export code fixes it for me:
$ git diff
diff --git a/src/collective/exportimport/export_other.py b/src/collective/exportimport/export_other.py
index f358a1c..383635a 100644
--- a/src/collective/exportimport/export_other.py
+++ b/src/collective/exportimport/export_other.py
@@ -535,13 +535,18 @@ def export_local_portlets(obj):
settings = IPortletAssignmentSettings(assignment)
if manager_name not in items:
items[manager_name] = []
+ from z3c.relationfield.relation import RelationValue
+ assignment_data = {}
+ for name in schema.names():
+ value = getattr(assignment, name, None)
+ if value and isinstance(value, RelationValue):
+ value = value.to_object.UID()
+ assignment_data[name] = value
+
items[manager_name].append({
'type': portlet_type,
'visible': settings.get('visible', True),
- 'assignment': {
- name: getattr(assignment, name, None)
- for name in schema.names()
- },
+ 'assignment': assignment_data,
})
return items
The code needs to be more robust, but those are details.
I am not sure if this is a reasonable place for this fix or if there is a more general place.
Ah, wait, using this works too:
json_compatible(getattr(assignment, name, None))
At least then you get an export without errors, although my earlier code that returns uuids could be preferable in some cases.
Some exports/imports that do not rely on other content (that might not yet exist at the time of importing) could be included in the default export/import of content. One example is constrains, which is implemented like that in #71
Other options are:
We could add hooks like item = self.export_constrain(item, obj) and self.import_constrain(obj, item) for each to make it easy to override. We could also add checkboxes to enable/disable these extra steps during export.
In client projects I also export/import some marker interfaces and annotations but I don't think that could be generalized.
It would probably be enough to add some more documentation with simple examples of how to do that.
Copying this issue from what I wrote on community.plone.org to keep the list of possible switches/fixes central on GitHub.
While importing an exported contenttype from Plone 4/AT to Plone 5.2 DX, the import routine gives a traceback on:
The problem is the timezone part of the date string: the exported modified date is u'2011-10-13T13:49:57+01:00', with a : separator in the timezone. strptime on Python 3 expects 0100. My workaround so far has been simple and is at https://github.com/collective/collective.exportimport/tree/modified_date_parser : use dateutil to parse the string on import.
However:
If this is specific to AT, this fix could be added/combined with other small tweaks to make the AT export more compatible with DX import and put behind a checkbox/switch as discussed in #7.
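For illustration, the offset problem can also be handled with the standard library alone. Note that this is a different technique from the dateutil-based workaround in the branch above: the sketch strips the colon from the UTC offset before calling strptime. The helper name parse_iso_datetime is hypothetical, and the sketch assumes Python 3 and dates without fractional seconds.

```python
import re
from datetime import datetime

# Matches a trailing UTC offset written as +HH:MM or -HH:MM.
_OFFSET_WITH_COLON = re.compile(r"([+-]\d{2}):(\d{2})$")


def parse_iso_datetime(value):
    """Parse e.g. '2011-10-13T13:49:57+01:00' into an aware datetime."""
    # Turn +01:00 into +0100, the form every Python 3 strptime accepts for %z.
    normalized = _OFFSET_WITH_COLON.sub(r"\1\2", value)
    return datetime.strptime(normalized, "%Y-%m-%dT%H:%M:%S%z")


parsed = parse_iso_datetime("2011-10-13T13:49:57+01:00")
offset_hours = parsed.utcoffset().total_seconds() / 3600
```

A parser like dateutil is more forgiving about other variations in the date string, so the normalization above is mainly useful where adding a dependency is not wanted.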
@pbauer added download to server for exported .json files. I think we'll also need to add 'import from server'. I've successfully exported 5+ GB of File items into a File.json, but cannot import these back into a fresh Plone site because either the browser or the local Plone server crashes/aborts when uploading this through the frontend :-(
Just as content export has a "start from this path" input field, there should be a "do not recurse in the following paths" field (perhaps a multiline field / list) that makes the content exporter iterator completely skip anything below those paths. For many use cases, this is not necessary since the exported JSON can be filtered to exclude said content, but there are cases for very large sites where the user would rather not have to wait until everything is exported.
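The path check behind such a "do not recurse" field could be as small as the sketch below. The function name is_excluded is hypothetical. The sketch only decides whether a given path lies at or below an excluded path; whether the exporter can use it to skip whole subtrees without visiting them depends on how it iterates over content, which is not covered here.

```python
def is_excluded(path, excluded_paths):
    """Return True if path equals or lies below one of the excluded paths."""
    path = path.rstrip("/")
    for excluded in excluded_paths:
        excluded = excluded.rstrip("/")
        # Compare on path-segment boundaries so /archive does not match /archive-new.
        if path == excluded or path.startswith(excluded + "/"):
            return True
    return False


excluded_paths = ["/Plone/archive", "/Plone/tmp/"]
skip_item = is_excluded("/Plone/archive/2010/item", excluded_paths)
keep_item = is_excluded("/Plone/archive-new/item", excluded_paths)
```

Matching on whole path segments matters because a plain prefix test would also exclude siblings whose ids merely start with the same characters.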
Concomitant to this, perhaps it would not be such a bad idea to allow for exports of multiple roots rather than a single one. Right now I have four Plone sites in a single Zope instance, and I have to export them all individually rather than at once. It's a bit of a bummer, frankly (and I also could never get the plone api get view thingie to work in scripts — when I do it that way, no objects are exported).
Currently, we do not export/import the default page of the site root. I propose to use an empty string for the "uuid" dict value to denote the default page of the site root, as a special case.
Trying to install collective.exportimport with pip does not work, raising the error
ERROR: Package 'collective.exportimport' requires a different Python: 3.8.1 not in '==2.7, >=3.6'
The issue is in the python_requires specification, and can be seen with:
pip install packaging
python
>>> from packaging.specifiers import SpecifierSet
>>> from packaging.version import Version
>>> Version("3.8.1") in SpecifierSet("==2.7, >=3.6")
False
>>> Version("2.7.10") in SpecifierSet("==2.7, >=3.6")
False
PEP 440 specifies: "The comma (",") is equivalent to a logical and operator: a candidate version must match all given version clauses in order to match the specifier as a whole."
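Because the clauses are AND-ed, one common way to express "2.7 or 3.6 and later" is a lower bound plus exclusions of the unsupported 3.x series. The fragment below is a sketch of what the value could look like, not the package's actual setup.py; the constant name PYTHON_REQUIRES is made up for illustration.

```python
# Hypothetical setup.py fragment: every clause must match, so instead of
# combining "==2.7" with ">=3.6", start at 2.7 and exclude 3.0 through 3.5.
PYTHON_REQUIRES = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*, !=3.5.*"

clauses = [clause.strip() for clause in PYTHON_REQUIRES.split(",")]
```

With this specifier both 2.7.10 and 3.8.1 satisfy every clause, which can be checked with the same packaging.specifiers.SpecifierSet test shown above.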
Currently collective.relationhelpers is used.
When I add this package in a Plone 6 site, startup fails with a configuration conflict:
zope.configuration.config.ConfigurationConflictError: Conflicting configuration actions
For: ('view', (<InterfaceClass Products.CMFPlone.interfaces.siteroot.IPloneSiteRoot>, <InterfaceClass zope.publisher.interfaces.browser.IDefaultBrowserLayer>), 'inspect-relations', <InterfaceClass zope.publisher.interfaces.browser.IBrowserRequest>)
File "/Users/maurits/shared-eggs/cp39/Products.CMFPlone-6.0.0a1.dev0-py3.9.egg/Products/CMFPlone/controlpanel/browser/configure.zcml", line 326.2-332.8
<browser:page
name="inspect-relations"
for="Products.CMFPlone.interfaces.IPloneSiteRoot"
class=".relations.RelationsInspectControlpanel"
template="relations_inspect.pt"
permission="Products.CMFPlone.InspectRelations"
/>
File "/Users/maurits/shared-eggs/cp39/collective.relationhelpers-1.5-py3.9.egg/collective/relationhelpers/configure.zcml", line 9.2-15.8
<browser:page
name="inspect-relations"
for="Products.CMFPlone.interfaces.IPloneSiteRoot"
class=".api.InspectRelationsControlpanel"
template="relations_inspect.pt"
permission="cmf.ManagePortal"
/>
For: ('view', (<InterfaceClass Products.CMFPlone.interfaces.siteroot.IPloneSiteRoot>, <InterfaceClass zope.publisher.interfaces.browser.IDefaultBrowserLayer>), 'rebuild-relations', <InterfaceClass zope.publisher.interfaces.browser.IBrowserRequest>)
File "/Users/maurits/shared-eggs/cp39/Products.CMFPlone-6.0.0a1.dev0-py3.9.egg/Products/CMFPlone/controlpanel/browser/configure.zcml", line 334.2-340.8
<browser:page
name="rebuild-relations"
for="Products.CMFPlone.interfaces.IPloneSiteRoot"
class=".relations.RelationsRebuildControlpanel"
template="relations_rebuild.pt"
permission="cmf.ManagePortal"
/>
File "/Users/maurits/shared-eggs/cp39/collective.relationhelpers-1.5-py3.9.egg/collective/relationhelpers/configure.zcml", line 17.2-23.8
<browser:page
name="rebuild-relations"
for="Products.CMFPlone.interfaces.IPloneSiteRoot"
class=".api.RebuildRelationsControlpanel"
template="relations_rebuild.pt"
permission="cmf.ManagePortal"
/>
Alternatively, collective.relationhelpers should be fixed to not fail in this case.
I have a portlet export that includes a portlet that I no longer want. Currently, the import fails, because the portlet type is not known:
Traceback (innermost last):
Module ZPublisher.WSGIPublisher, line 167, in transaction_pubevents
Module ZPublisher.WSGIPublisher, line 376, in publish_module
Module ZPublisher.WSGIPublisher, line 271, in publish
Module ZPublisher.mapply, line 85, in mapply
Module ZPublisher.WSGIPublisher, line 68, in call_object
Module collective.exportimport.import_other, line 582, in __call__
Module collective.exportimport.import_other, line 597, in import_portlets
Module collective.exportimport.import_other, line 618, in register_portlets
Module zope.component._api, line 165, in getUtility
zope.interface.interfaces.ComponentLookupError:
(<InterfaceClass zope.component.interfaces.IFactory>, 'collective.quickupload.QuickUploadPortlet')
It would be nice if the importer could ignore it.
I see two options:
Catch the ComponentLookupError from above, log a warning, and continue with the next portlet.
I prefer the first one, as it is a small change and works for everyone. But maybe we prefer that integrators explicitly catch the error. For the second one I now have this monkey patch, from which I could make a PR.
from collective.exportimport import import_other
import logging
logger = logging.getLogger(__name__)
_orig_register_portlets = import_other.register_portlets
IGNORE_PORTLET_TYPES = [
    "collective.quickupload.QuickUploadPortlet",
]
def register_portlets(obj, item):
    """Register portlets for one object.
    CHANGE compared to original: pop unwanted portlets.
    I tried to override the browser view first,
    but then I would have had to copy the template.
    """
    for manager_name, portlets in item.get("portlets", {}).items():
        if not portlets:
            continue
        ignore = []
        for portlet_data in portlets:
            if portlet_data["type"] in IGNORE_PORTLET_TYPES:
                logger.info(
                    "Ignoring portlet type %s at %s",
                    portlet_data["type"],
                    obj.absolute_url(),
                )
                ignore.append(portlet_data)
        for portlet_data in ignore:
            portlets.remove(portlet_data)
    return _orig_register_portlets(obj, item)
import_other.register_portlets = register_portlets
logger.info(
    "Patched collective.exportimport register_portlets to ignore these types: %s",
    IGNORE_PORTLET_TYPES,
)
Do we have a preference?
Somehow I do not get the trick. If I export my ATFileAttachment without base64 data it looks like this:
Now importing this with a custom_dict_hook ("ATFileAttachment" -> "DXFile") creates a File object as expected, but the file field has no data 😢 ... So if I'm correct, the plone.restapi deserializer doesn't load data from remote origins? Do I have to take care of this myself?
Exported JSON files can become rather large, in particular with inline binary data (which is often needed rather than having a reference to a blob file). Using JSONL would improve the handling of exports a lot. In particular, you could filter JSON records more easily using command line tools like grep.
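A JSONL export and a streaming reader could be sketched as follows. The function names write_jsonl and read_jsonl are hypothetical and nothing here is an existing collective.exportimport API; the sketch assumes each record carries the "@type" key that plone.restapi serializations use.

```python
import io
import json


def write_jsonl(items, fileobj):
    """Write one JSON document per line."""
    for item in items:
        # json.dumps escapes newlines inside strings, so each record stays on one line.
        fileobj.write(json.dumps(item, sort_keys=True) + "\n")


def read_jsonl(fileobj, portal_type=None):
    """Yield records line by line, optionally filtered by their '@type'."""
    for line in fileobj:
        line = line.strip()
        if not line:
            continue
        item = json.loads(line)
        if portal_type is None or item.get("@type") == portal_type:
            yield item


buffer = io.StringIO()
write_jsonl(
    [
        {"@id": "/news/item1", "@type": "News Item", "text": "line one\nline two"},
        {"@id": "/files/report", "@type": "File"},
    ],
    buffer,
)
buffer.seek(0)
files = list(read_jsonl(buffer, portal_type="File"))
```

Because every record is a complete line, the importer never has to parse the whole file at once, and line-oriented tools can split or filter an export before it is uploaded.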
I have a snippet of code that is supposed to export (and it works with all exports other than content), but when I try it with content export, I get a zero-byte JSON file. This snippet of code runs as an entry point of my bin/client program.
Does anyone know what the problem with the export is?
Code: https://gist.github.com/Rudd-O/f46154c80eb9937ec387e2b460ebbe8b
EDIT:
Ultimately the goal is to be able to command an export of everything via the Plone CLI (bin/client -c export.py), primarily for (but not limited to) migration automation and testing. All other exports work correctly using this code; only the content one does not.
Comments imported to Plone 5 are not unescaped:
Sätze wie folgt formulieren: <br />
Hi from Plone Conf!
In the Installation header, the README says "You don't need to install the add-on." after the instructions for installing through buildout. There does not seem to be any information about using collective.exportimport without installation. How would collective.exportimport be used without installation? I am personally interested in exporting a complete Plone 4 site to compare results with jsonify.
Plone content rules are stored in an IRuleStorage object, which is not visible in Zope space and is also not visible in content space (it can only be obtained using getUtility()). Implementation-wise, IRuleStorage is a simple OOBTree() that can be acquired (plone.app.contentrules.browser.assignments.acquired_rules() has the scoop).
It would be great if the import/export framework had a step to deal with the rule storage. This would require at least a custom pair of serializer/deserializer.
Hello, I'm trying to install on Plone 4.2.1 but I get this error.
Is this compatible with 4.2.1? Thanks
root@intranet2:/opt/buildout/Plone4.2.1Intranet/zeocluster# ./bin/buildout
Updating zeoserver.
Installing client1.
Getting distribution for 'hurry.filesize'.
Got hurry.filesize 0.9.
Getting distribution for 'collective.exportimport'.
/opt/buildout/Plone4.2.1Intranet/Python-2.7/lib/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'project_urls'
warnings.warn(msg)
/opt/buildout/Plone4.2.1Intranet/Python-2.7/lib/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'python_requires'
warnings.warn(msg)
error: Setup script exited with error in collective.exportimport setup command: 'install_requires' must be a string or list of strings containing valid project/version requirement specifiers
An error occured when trying to install collective.exportimport main. Look above this message for any errors that were output by easy_install.
While:
Installing client1.
Getting distribution for 'collective.exportimport'.
Error: Couldn't install: collective.exportimport main
*************** PICKED VERSIONS ****************
[versions]
Products.LinguaPlone = 4.1.2
hurry.filesize = 0.9
*************** /PICKED VERSIONS ***************
-------------------------------------------------------------
| Warning: multiple exports selected. Download will be tar.gz |
-------------------------------------------------------------
# Exports
[x] Content
[x] File/Images
[ ] Users
[ ] Content Tree
[ ] Relations
[ ] Translations
[ ] Local Roles
[ ] Default Page Mapping
[ ] Object Positions in Parent
[ ] Comments
# Content Export
{query widget}
Type: Page, News Item
Path: /news depth:1
Path: /other-news depth:1
Creation date: > 1/1/20018
-----------------------------
Selected content (21 items, 9Mb)
- News Item (10, 5Mb)
- Page (9, 2Mb)
- Folders (implicit) (2, 1Mb)
-----------------------------
# File/Images Export
(o) url/path
( ) binary in tar.gz
( ) base64 encoded in json
[Download] [Save to Server] [Dry Run] [Cancel]
Something funny is going on with the relations import.
In import_other isReferencing is added to the ignores list:
collective.exportimport/src/collective/exportimport/import_other.py
Lines 271 to 274 in 5270f95
From the rest of the code in this method it seems all existing relations in the target site are removed and only the 'sanitised' relations are imported again. But this destroys all linkintegrity relations.
I assume the import_content imports are recreating the isReferencing relations while restoring the content items.
in:
collective.exportimport/src/collective/exportimport/import_other.py
Lines 287 to 297 in 5270f95
But this does not work, I get ObjectMissing Errors in z3c.relationfield.event.updateRelations where it tries to list existing relations:
[9] > /Users/fred/.buildout/eggs/cp38/z3c.relationfield-0.9.0-py3.8.egg/z3c/relationfield/event.py(81)updateRelations()
-> rels = list(catalog.findRelations({'from_id': obj_id}))
[10] /Users/fred/.buildout/eggs/cp38/zc.relation-1.1.post2-py3.8.egg/zc/relation/catalog.py(734)<genexpr>()
-> return (self._relTools['load'](t, self, cache) for t in tokens)
[11] /Users/fred/.buildout/eggs/cp38/z3c.relationfield-0.9.0-py3.8.egg/z3c/relationfield/index.py(49)load()
-> return intids.getObject(token)
[12] /Users/fred/.buildout/eggs/cp38/zope.intid-4.3.0-py3.8.egg/zope/intid/__init__.py(89)getObject()
-> raise ObjectMissingError(id)
Stranger is that when I step back in the debugger to frame 9 and execute the same line again I do get the rels list:
(Pdb++) list(catalog.findRelations({'from_id': obj_id}))
[<z3c.relationfield.relation.RelationValue object at 0x11eefdac0 oid 0x200b0 in <Connection at 10d0bb760>>]
but then I'm importing the Plone 4 Archetypes linkintegrity lists on recreated Dexterity Content. It could work as no paths/ids etc have changed and for non existing content the relations will be dropped. But it doesn't feel very clean to do this.
[edit:] I tried this and the number of isReferencing relations actually increases when I include isReferencing relations on import. So importing content creates 1848 isReferencing relations, then restoring relations ups isReferencing to 1941 relations. :-O
Meh. I mean doing this in the import_code: make an extra import on the fly, merge with the incoming relations.json and reapply, dropping isReferencing only from the relations.json .
@pbauer How did you deal with this so far while using collective.exportimport?
Using a really minimal Plone 4 installation (https://github.com/collective/minimalplone4/) we get these errors when adding collective.exportimport to eggs:
collective.exportimport-1.0-py2.7.egg/collective/exportimport/configure.zcml", line 13.2-19.8
ImportError: No module named relationfield.interfaces
collective.exportimport-1.0-py2.7.egg/collective/exportimport/configure.zcml", line 57.2-58.52
ImportError: No module named contenttypes.interfaces
Funny that this wasn't caught on tests.
Needed other pins as well, but is out of the scope of this project (will put them here for the sake of documentation):
# Python 2/Plone 4 compatibility.
plone.restapi = 6.13.8
PyJWT = 1.7.1
# https://github.com/Julian/jsonschema/issues/453
# Getting distribution for 'pyrsistent>=0.14.0'
# ValueError: need more than 0 values to unpack
jsonschema = 2.6.0
When testing the topic to collection migration we noticed that the sort_on and sort_reversed metadata do not seem to get migrated.
It seems that adding them to the metadata was simply forgotten.
With a quick look it seems that it might be fixed by replacing
self._collection_sort_reversed = criterion.getReversed()
self._collection_sort_on = criterion.Field()
with
topic_metadata["sort_reversed"] = criterion.getReversed()
topic_metadata["sort_on"] = criterion.Field()
in this file:
https://github.com/collective/collective.exportimport/blob/main/src/collective/exportimport/serializer.py#L341
or adding it below the
topic_metadata["query"] = json_compatible(formquery)
on line 361
Let me know if this seems correct to you, then we will make a PR
The export content page has a typo, it should say children, it says childen.
I plan to implement the export and import of the full revision history created by CMFEditions.
I'm still undecided if it should be an optional step during the default migration or an additional export/import step. Maybe the latter, to be able to limit the number of exported and imported revisions.
Traceback (most recent call last):
File "/home/user/optplone/deployments/601a/parts/client1/bin/interpreter", line 294, in <module>
exec(_val)
File "<string>", line 1, in <module>
File "/home/user/optplone/deployments/601a/src/ruddocom.policy/src/ruddocom/policy/ctl/import_embedded.py", line 105, in <module>
full_import(site, importpath, what)
File "/home/user/optplone/deployments/601a/src/ruddocom.policy/src/ruddocom/policy/ctl/import_embedded.py", line 85, in full_import
fixer()
File "/home/user/optplone/deployments/601a/src/collective.exportimport/src/collective/exportimport/import_content.py", line 697, in __call__
portal.ZopeFindAndApply(portal, search_sub=True, apply_func=reset_dates)
File "/home/user/optplone/buildout-cache/eggs/Zope-5.3-py3.8.egg/OFS/FindSupport.py", line 171, in ZopeFindAndApply
self.ZopeFindAndApply(ob, obj_ids, obj_metatypes,
File "/home/user/optplone/buildout-cache/eggs/Zope-5.3-py3.8.egg/OFS/FindSupport.py", line 171, in ZopeFindAndApply
self.ZopeFindAndApply(ob, obj_ids, obj_metatypes,
File "/home/user/optplone/buildout-cache/eggs/Zope-5.3-py3.8.egg/OFS/FindSupport.py", line 165, in ZopeFindAndApply
apply_func(ob, (apply_path + '/' + p))
File "/home/user/optplone/deployments/601a/src/collective.exportimport/src/collective/exportimport/import_content.py", line 688, in reset_dates
del obj.modification_date_migrated
AttributeError: modification_date_migrated
Why is this happening?
Not a bug, perhaps a feature request for supporting plone.formwidget.geolocation.geolocation.Geolocation:
2021-12-11 18:05:41,976 INFO [collective.exportimport.export_content:299][waitress-2] Error exporting https://test.dynamore.de/en/locations/subsidiaries/dynamore-swiss-en: No converter for making <plone.formwidget.geolocation.geolocation.Geolocation object at 0x7fed111cff28> (<class 'plone.formwidget.geolocation.geolocation.Geolocation'>) JSON compatible
All my default pages import, except for the default page of the Plone site root. I notice in defaultpages.json that there is an object with UID plone_site_root, but this object ID does not appear in the content.json file (the actual Plone site root has a concrete UID).
It does not look like the defaultpages routine that looks for the site root actually links the default page with the site root.
What gives?
Plone 4.3.20 (AT) with current checkout from master generates these errors for discussion items:
2022-04-12T16:07:55 ERROR collective.exportimport.export_content Error exporting http://dev2.zopyx.de:5080/eteaching/community/communityevents/ringvorlesung/hybride-lehrszenarien-gestalten/++conversation++default/1602677734552386
Traceback (most recent call last):
File "/home/ajung/sandboxes/iwm/plone4.buildout/src/collective.exportimport/src/collective/exportimport/export_content.py", line 284, in export_content
item = self.fix_url(item, obj)
File "/home/ajung/sandboxes/iwm/plone4.buildout/src/collective.exportimport/src/collective/exportimport/export_content.py", line 428, in fix_url
if item["parent"]["@id"] != parent_url:
KeyError: 'parent'
When properly setting default pages with setDefaultPage('foo'), the target object foo has to exist, and foo is also reindexed with the index is_default_page. So this is a task that needs to be handled after content migration, same as relations.
A side-benefit is that you can skip this part when you want to migrate to a Plone 6 site with Volto (because Volto does not support default-pages).
If the addition of paths to the relations export, which is now optional with debug=1, is always done:
collective.exportimport/src/collective/exportimport/export_other.py
Lines 69 to 72 in 5270f95
we can provide much better logging on the import step. Now it only logs
2021-05-20 14:30:29,877 INFO [collective.relationhelpers.api:237][waitress-2] 7b02d3b74f57cd1d4a1c782fd74dd649 is missing
When exporting ATCollection items the queries might contain outdated configurations. The following query doesn’t work in Plone 5 with plone.app.querystring > 1.3.2 (https://github.com/plone/plone.app.querystring/blob/master/CHANGES.rst#1312-2015-11-26)
"query": [
{
"i": "portal_type",
"o": "plone.app.querystring.operation.selection.is",
"v": [
"News Item"
]
},
{
"i": "path",
"o": "plone.app.querystring.operation.string.relativePath",
"v": "../"
}
],
The correct version would be:
"query": [
{
"i": "portal_type",
"o": "plone.app.querystring.operation.selection.any",
"v": [
"News Item"
]
},
{
"i": "path",
"o": "plone.app.querystring.operation.string.relativePath",
"v": "../"
}
],
With plone.app.querystring there is an upgrade step available (8 -> 9) which you can run to fix imported collections with the wrong queries.
Besides that, should we add some logic to adjust the queries on export/import?
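As a rough idea of what such an adjustment could look like, here is a minimal sketch that renames the outdated operation in an exported query. The function name and the mapping are hypothetical, and the sketch assumes that selection.is is the only operation that needs renaming:

```python
import copy

# Assumption: only this one operation needs renaming; other outdated
# operations would need their own entries.
OPERATION_FIXES = {
    "plone.app.querystring.operation.selection.is":
        "plone.app.querystring.operation.selection.any",
}


def fix_collection_query(query):
    """Return a copy of a querystring list with outdated operations renamed."""
    fixed = copy.deepcopy(query)
    for criterion in fixed:
        operation = criterion.get("o")
        if operation in OPERATION_FIXES:
            criterion["o"] = OPERATION_FIXES[operation]
    return fixed
```

The function works on a copy so that the exported data stays unchanged and the fix could run either on export or on import.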
Currently, content is exported without creator information. There are cases, however, when that information is needed and it is not sufficient to have imported objects created by the importing user. Maybe it would be useful to export creator information optionally, but if so, I'm not sure about whether it should be done by default.
In plone.restapi deserializing richtext uses html_parser.unescape(data) before setting the RichTextValue. See https://github.com/plone/plone.restapi/blob/master/src/plone/restapi/deserializer/dxfields.py#L292
This leads to broken content because code examples are transformed to HTML tags: <pre>Code example: &lt;h2&gt;Heading 2&lt;/h2&gt; example</pre>
becomes <pre>Code example: <h2>Heading 2</h2> example</pre>
I'm not sure if this is a bug in restapi but for the purpose of exportimport I will override the RichTextFieldDeserializer with a version that does not do that for now.
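The effect can be reproduced outside Plone. The sketch below uses the standard-library html.unescape as a stand-in for the html_parser.unescape call, on the assumption that both behave the same for this input:

```python
import html

# Rich text as stored: the code example inside <pre> is entity-escaped.
stored = "<pre>Code example: &lt;h2&gt;Heading 2&lt;/h2&gt; example</pre>"

# Unescaping the whole value turns the escaped example into real markup.
unescaped = html.unescape(stored)
print(unescaped)
# prints: <pre>Code example: <h2>Heading 2</h2> example</pre>
```

After unescaping, the h2 inside the pre block is an actual heading element rather than displayed code.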
Here is a diff that allows the program to continue.
diff --git a/src/collective/exportimport/export_other.py b/src/collective/exportimport/export_other.py
index 2cf8e33..201f7e7 100644
--- a/src/collective/exportimport/export_other.py
+++ b/src/collective/exportimport/export_other.py
@@ -117,7 +117,12 @@ class ExportRelations(BrowserView):
if relation_catalog:
portal_catalog = getToolByName(self.context, "portal_catalog")
for rel in relation_catalog.findRelations():
- if rel.from_path and rel.to_path:
+ try:
+ rel_from_path_and_rel_to_path = rel.from_path and rel.to_path
+ except ValueError:
+ logger.exception("Cannot export relation %s, skipping", rel)
+ continue
+ if rel_from_path_and_rel_to_path:
from_brain = portal_catalog(
path=dict(query=rel.from_path, depth=0)
)
jsonschema requires pyrsistent>=0.14.0, which has to be limited to the latest py2.7-compatible version, 0.15.7, when installing on Python 2.
With this data:
[
{
"portlets": {
"plone.footerportlets": [
{
"type": "plone.portlet.static.Static",
"visible": true,
"assignment": {
"header": "Want to comment on this?",
"text": {
"data": "<p style=\"text-align: center;\"><em>Want to comment on this?\u00a0 </em><a rel=\"noopener\" target=\"_blank\" href=\"https://t.me/Site_com\" data-linktype=\"external\" data-val=\"https://t.me/Site_com\"><strong>Join our Telegram</strong> channel</a> and share your opinions in each post!</p>",
"content-type": "text/html",
"encoding": "utf-8"
},
"omit_border": true,
"footer": null,
"more_url": null
}
}
]
},
"uuid": "1bb9868bd9de4637977348acab5d1893"
},
[...]
the imported portlet won't render, and editing shows this:
We’re sorry, but there seems to be an error…
Here is the full error message:
Traceback (innermost last):
Module ZPublisher.WSGIPublisher, line 167, in transaction_pubevents
Module ZPublisher.WSGIPublisher, line 376, in publish_module
Module ZPublisher.WSGIPublisher, line 271, in publish
Module ZPublisher.mapply, line 85, in mapply
Module ZPublisher.WSGIPublisher, line 68, in call_object
Module plone.app.portlets.browser.formhelper, line 177, in __call__
Module z3c.form.form, line 233, in __call__
Module plone.z3cform.fieldsets.extensible, line 65, in update
Module plone.z3cform.patch, line 30, in GroupForm_update
Module z3c.form.group, line 132, in update
Module z3c.form.form, line 136, in updateWidgets
Module z3c.form.field, line 277, in update
Module plone.app.textfield.widget, line 42, in update
Module z3c.form.browser.textarea, line 37, in update
Module z3c.form.browser.widget, line 171, in update
Module Products.CMFPlone.patches.z3c_form, line 46, in _wrapped
Module z3c.form.widget, line 132, in update
Module plone.app.textfield.widget, line 99, in toWidgetValue
ValueError: Can not convert {'data': '<p style="text-align: center;"><em>Want to comment on this?\xa0 </em><a rel="noopener" target="_blank" href="https://t.me/RuddO_com" data-linktype="external" data-val="https://t.me/RuddO_com"><strong>Join our Telegram</strong> channel</a> and share your opinions in each post!</p>', 'content-type': 'text/html', 'encoding': 'utf-8'} to an IRichTextValue
Looks like the text object is being passed in lieu of the text subobject. Something is wrong with the deserializer.
In old Plone sites it is apparently no problem to set an effective date that is later than the expiration date, but when we call the DX deserializer on an imported item where effective > expires, the import is aborted with a ValidationError:
orig_error is my local patch in plone.restapi to see the real message when I get dropped in to a PDB (PDBDebugMode)
*** zExceptions.BadRequest: [{'error': 'ValidationError',
'message': 'error_expiration_must_be_after_effective_date',
'orig_error': EffectiveAfterExpires('error_expiration_must_be_after_effective_date'),
'id': 'http://localhost:9050/plone/nl/kalender/asfasdfdsfasdf.pdf',
'orig_data': {'@id': 'http://localhost:9050/plone/nl/kalender/asdfasdfsdfads.pdf', '@type': 'File', 'UID': '5397b307c9cc784319fca831cb0994d9', 'allow_discussion': False, 'contributors': [], 'created': '2015-09-07T12:40:17+00:00', 'creators': ['asdfasfdf'], 'description': None, 'effective': '2015-09-07T12:40:00+00:00', 'exclude_from_nav': True, 'expires': '2014-02-01T00:00:00+00:00',
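One way to get past this on import is to clean the exported item before it reaches the deserializer. The following is a minimal sketch, assuming it is acceptable to drop the expiration date in such cases; drop_invalid_expires is a hypothetical helper, and where to call it (for example in a subclassed import view) is left open:

```python
from datetime import datetime


def drop_invalid_expires(item):
    """Remove 'expires' from an exported item when it precedes 'effective'."""
    effective = item.get("effective")
    expires = item.get("expires")
    if not effective or not expires:
        return item
    # Both values are ISO 8601 strings with a UTC offset in the export.
    if datetime.fromisoformat(effective) > datetime.fromisoformat(expires):
        item = dict(item)  # shallow copy, the caller's dict is not modified
        item.pop("expires")
    return item
```

Whether to drop the expiration date, move it, or fix the item in the source site is a content decision that the sketch does not settle.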
I plan to add support for export/import of workflow history. It will probably be straightforward and part of the default process.
Hi, sorry for not having a closer look first. If this is already possible, all is good; if it is not yet possible but desirable, I could contribute the missing code, provided such a use case is welcome:
So my use case would be to be able to export (and re-import) a part of the website. The main idea is to export anything that is linked from a few overview pages and be able to re-import that on an empty plone site. This would allow for example our designer to work with real content but not have to work with the 1M objects we have on our database 🙂
Is that already possible? 🤔 would it be ok to provide such a functionality here in this package?
With small changes to the current code we could support the use-case of updating existing content instead of (so far) only skipping (no changes) or ignoring (create with a new id) it.