Request to add support for paragraph alignment: left

feature: Paragraph.alignment (python-docx, closed)

python-openxml commented on September 27, 2024

feature: Paragraph.alignment

Comments (16)

scanny commented on September 27, 2024

That looks like the right enumeration.

I usually start the analysis for a new feature by looking at the MS API. Here's the page for Paragraph.Alignment: http://msdn.microsoft.com/en-us/library/office/ff844837(v=office.15).aspx

Unless I can think of a good reason not to, I use the same name in the python-docx API. So this feature would be Paragraph.alignment.

Happy to have you work on it. Want to start with an analysis page like this one? http://python-docx.readthedocs.org/en/latest/dev/analysis/features/underline.html

The schema you'll need is /ref/xsd/wml.xsd. Specimen XML I usually get by creating a quick Word document and using opc-diag to browse the XML. That's usually most of the way there to propose a protocol which pretty much defines the feature from an API standpoint.

Let me know if you need more to go on.

scanny commented on September 27, 2024

It should definitely support None, but I'm not sure what the semantics of assigning True or False would be. Probably best just to have None and the values of WD_ALIGN_PARAGRAPH (not developed yet) like .LEFT, .RIGHT, .CENTER, .JUSTIFY, etc.
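A minimal sketch of that protocol, for illustration only. `Align` and `ParagraphSketch` are hypothetical stand-ins, not python-docx classes, and the member values are placeholders. The property accepts None or an enumeration member and rejects anything else, including True and False:

```python
from enum import IntEnum


class Align(IntEnum):
    """Hypothetical stand-in for the not-yet-written WD_ALIGN_PARAGRAPH."""
    LEFT = 0
    CENTER = 1
    RIGHT = 2
    JUSTIFY = 3


class ParagraphSketch:
    """Toy paragraph holding only an alignment setting (not the real Paragraph)."""

    def __init__(self):
        self._alignment = None  # None means "inherited, nothing set explicitly"

    @property
    def alignment(self):
        return self._alignment

    @alignment.setter
    def alignment(self, value):
        # Only None or an Align member is meaningful; True/False are rejected.
        if value is not None and not isinstance(value, Align):
            raise ValueError("alignment must be None or a member of Align")
        self._alignment = value


p = ParagraphSketch()
p.alignment = Align.CENTER  # something explicit
p.alignment = None          # back to inherited
```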

scanny commented on September 27, 2024

btw, this page should already have most of what you need in the way of XML Schema excerpt:
http://python-docx.readthedocs.org/en/latest/dev/analysis/schema/ct_ppr.html. Should be able to mostly cut and paste.

Best to start a new branch off of develop, say 'feature/paragraph-align' that we can use to coordinate commits. This is the branch you'll use to send a pull request from. If you spike, you can do it based on this branch rather than develop. Let me know if you need more on that aspect :)

eyalbd1 commented on September 27, 2024

+1 for justification, please let me know if you need a hand.

onlyjus commented on September 27, 2024

I created a new branch that I am spiking on. I will rebase the commits and merge it once I have the tests, docs, and code functioning.

I think I am getting mixed up. There seems to be a lot of documentation on the internet about paragraph justification as opposed to alignment. Maybe it is a legacy term?

I have the code already working. I need to change "justification" to "alignment" and get the docs and tests done.

scanny commented on September 27, 2024

I think it's fair to think of them as synonyms in this context. The actual element tag is jc, which I expect came from the term 'justification' somehow (maybe the j and the c of 'justification', but no idea really :). The Microsoft API for Word calls it alignment, which is why I think that's the best first idea for the name. Folks who know or come across the one API are able to map to the other in a straightforward way.

I encourage you to form the commit sequence from here in the order of (1) analysis document, (2) acceptance test, (3) units. If those are on your pull request branch I can offer feedback on the analysis, for example, before you develop the acceptance tests that depend on them. I think once you give that ordering a try you might see how it feels logical to do it that way, at least once you've spiked out a solution and are confident you understand the implementation side.

onlyjus commented on September 27, 2024

I built the analysis document:
https://github.com/onlyjus/python-docx/blob/justification/docs/dev/analysis/features/alignment.rst

Let me know what you think.

scanny commented on September 27, 2024

Hi Justin, this is awesome!

For some reason, GitHub won't let me add comments at the line level, I think because it's hard-coded to always render reStructuredText files rather than show them as source. So I'll have to provide my remarks here:

Protocol

the result after assigning WD_ALIGN_PARAGRAPH.LEFT would be 'LEFT (0)', not DOUBLE (0). I've added a more sophisticated Enumeration class that acts like an int, but has a str() value that actually tells you its symbolic value. You can try it out like this:

from docx.enum.text import WD_UNDERLINE

print(str(WD_UNDERLINE.DOUBLE))

-> 'DOUBLE (3)'
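To illustrate the idea of a value that behaves like an int but prints its symbolic name, here is a rough sketch. `EnumValue` is a made-up class for this example and is not how python-docx actually implements its Enumeration:

```python
class EnumValue(int):
    """An int that also remembers its symbolic name (illustrative only)."""

    def __new__(cls, name, value):
        self = super().__new__(cls, value)
        self._name = name
        return self

    def __str__(self):
        # e.g. 'LEFT (0)': symbolic name followed by the integer value
        return "%s (%d)" % (self._name, int(self))


LEFT = EnumValue("LEFT", 0)
DOUBLE = EnumValue("DOUBLE", 3)

print(str(LEFT))    # symbolic form
print(DOUBLE == 3)  # still compares like a plain int
```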

You can probably remove the three lines having WD_ALIGNMENT.RIGHT, the others are enough to demonstrate all the important cases: default (None), setting from None to something (left in this case), setting from something to something else (center), and from something to None.

Specimen XML

I always strip out the w:rsid..=... type attributes. Those are part of the revision tracking mechanism and just clutter up the examples usually, so I remove them. Also, the examples are probably better if they're parallel, so including the <w:p> and other tags in the second and later specimens so you can compare them "apples to apples" visually to see the actual differences introduced is probably a good idea in this case. You could make the text state the behavior being shown, like "A paragraph with inherited alignment.", "A paragraph with left alignment.", etc.
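Stripping the revision attributes by hand gets tedious, so something like the following could do it. This is only a sketch using the standard library; the `strip_rsid` helper and the rsid values in the sample are invented for the example:

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W)  # keep the familiar 'w:' prefix on output


def strip_rsid(xml):
    """Return `xml` with every w:rsid* attribute removed."""
    root = ET.fromstring(xml)
    for element in root.iter():
        for key in list(element.attrib):
            # keys look like '{namespace}rsidR'; test the local name only
            if key.rsplit("}", 1)[-1].startswith("rsid"):
                del element.attrib[key]
    return ET.tostring(root, encoding="unicode")


# Sample paragraph; the rsid values are placeholders, not from a real file.
specimen = (
    '<w:p xmlns:w="%s" w:rsidR="00A1B2C3" w:rsidRDefault="00A1B2C3">'
    '<w:pPr><w:jc w:val="center"/></w:pPr>'
    '<w:r><w:t>A paragraph with center alignment.</w:t></w:r>'
    '</w:p>' % W
)
cleaned = strip_rsid(specimen)
print(cleaned)
```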

Schema

I think you're going to want to put in the "32 others" explicitly in this case, because the position of w:jc in that sequence is significant. You'll need to know specifically which come before and which come after it. This will be critical when it comes to inserting a new w:jc child into an existing w:pPr element. If you get the sequence incorrect the resulting file won't load. The other one you might have been looking at, rPr I think, is actually an xsd:choice rather than an xsd:sequence in the schema, meaning the order of child elements isn't significant for rPr. In this one it's a sequence though (usually is in Open XML) so the order is critical.
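A sketch of what order-aware insertion might look like, using only the standard library. The `set_jc` helper is hypothetical, and the successor list is written from memory of the CT_PPrBase sequence, so it should be checked against wml.xsd before relying on it:

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W)


def qn(tag):
    """Return the namespace-qualified name for a 'w:' local tag name."""
    return "{%s}%s" % (W, tag)


# Elements assumed to follow w:jc in the w:pPr sequence (verify in wml.xsd).
JC_SUCCESSORS = (
    "textDirection", "textAlignment", "textboxTightWrap", "outlineLvl",
    "divId", "cnfStyle", "rPr", "sectPr", "pPrChange",
)


def set_jc(pPr, val):
    """Set w:jc/@w:val on `pPr`, adding w:jc in sequence order if absent."""
    jc = pPr.find(qn("jc"))
    if jc is None:
        jc = ET.Element(qn("jc"))
        successors = {qn(tag) for tag in JC_SUCCESSORS}
        index = len(pPr)  # default: append after all existing children
        for i, child in enumerate(pPr):
            if child.tag in successors:
                index = i  # insert just before the first successor
                break
        pPr.insert(index, jc)
    jc.set(qn("val"), val)
    return jc


pPr = ET.fromstring(
    '<w:pPr xmlns:w="%s"><w:pStyle w:val="Heading1"/>'
    '<w:spacing w:after="0"/><w:outlineLvl w:val="0"/><w:rPr/></w:pPr>' % W
)
set_jc(pPr, "center")
order = [child.tag.split("}")[1] for child in pPr]
print(order)
```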

If you can fix those I think this one's ready to commit, go ahead and send the pull request when you're ready. You can add commits to it as you go and they'll all come in on the same pull request automatically :)

If you don't mind though, could you change the branch name to 'feature/paragraph-align'? That's the convention we have going. This git command will do the trick:

$ git branch -m justification feature/paragraph-align

scanny commented on September 27, 2024

Oh, and by the way, don't add another commit with your changes, just amend the existing commit. Something like this should do the trick, although a little googling on it may be worthwhile:

$ git commit --amend

Depends a bit on how you do commits, like using a tool or whatever, but I like to do these at the command line since they're a little bit special, regular tools don't seem to have extra features like this.

Amending the commit keeps unnecessary noise out of the commit sequence.

jeremy886 commented on September 27, 2024

looking forward to the feature

onlyjus commented on September 27, 2024

Couple of things:

1) I renamed the branch.
2) I added all your suggestions to the analysis document.
   Q: The enumeration and the schema values don't exactly line up. Any suggestions?
   https://github.com/onlyjus/python-docx/blob/feature/paragraph-align/docs/dev/analysis/features/alignment.rst
3) I started working on the acceptance test.
   Q: Am I on the right track with features/par-enum-props.feature?
   https://github.com/onlyjus/python-docx/blob/feature/paragraph-align/features/par-enum-props.feature
4) Once I get the acceptance test and the enumerations in, I'll re-baseline my commits and start a merge request.

scanny commented on September 27, 2024

re: 2) Yes, that's a good point. It comes up from time to time; I suppose there are historical reasons related to Microsoft wanting to keep the API consistent across versions. The general guiding principles I've used are: 1) As long as we're in there, we might as well make all the XML values work, 2) If needed, don't hesitate to make up or drop an enumeration value, 3) the behavior exhibited by Word when assigning a particular enumeration value should be the same in python-docx. This last one comes into play when it's not quite clear by examination just what XML value should map to what enumeration value.

I think the mapping is worth spelling out in the analysis document now that you mention it, because it will directly drive the enumeration definition when you get to it a couple of steps later. A simple table should do the trick. There should be some examples in other .rst files under the docs directory; I can find one for you if you need. It's basically drawing border lines out of '=', '-', and '+' characters.

Would look something like this for a start:

enum	attr
LEFT	'left'
RIGHT	'right'
CENTER	'center'
JUSTIFY	'both' ?
etc	...

The ones that are questionable fall under principle 3) above and the only reliable way to tell is to use the MS API and see what it does. Usually I use IronPython for that, but if you have Windows handy that might be more convenient if you know your way around Visual Basic for Applications a little bit. I've already got a setup for this sort of thing so can take care of the mapping if you want, but it will be the end of the week before I can get to it, I'm out of town this week on business.

The general gist is that you would assign a particular enumeration value to Paragraph.Alignment, save the document, then examine the XML that is produced. I would probably do it in one step by inserting ten paragraphs or so and setting a different enumeration value on each; the XML then gives you the mapping you need.

re: 3) Yes, looks right to me so far :)

scanny commented on September 27, 2024

Hi Justin, here's the results of the IronPython run to try all the WdParagraphAlignment values. Note that setting Paragraph.Alignment to wdAlignParagraphLeft causes no <w:pPr> element to be inserted. This would correspond to the behavior I had been thinking for assigning Paragraph.alignment = None. We'll have to give a noodle on that; maybe we don't ever return None or allow it to be assigned for this property and just stick with wdAlignParagraphLeft for that case:

<w:p>
  <w:pPr>
    <w:jc w:val="center"/>
  </w:pPr>
  <w:r>
    <w:t>wdAlignParagraphCenter</w:t>
  </w:r>
</w:p>
<w:p>
  <w:pPr>
    <w:jc w:val="distribute"/>
  </w:pPr>
  <w:r>
    <w:t>wdAlignParagraphDistribute</w:t>
  </w:r>
</w:p>
<w:p>
  <w:pPr>
    <w:jc w:val="both"/>
  </w:pPr>
  <w:r>
    <w:t>wdAlignParagraphJustify</w:t>
  </w:r>
</w:p>
<w:p>
  <w:pPr>
    <w:jc w:val="highKashida"/>
  </w:pPr>
  <w:r>
    <w:t>wdAlignParagraphJustifyHi</w:t>
  </w:r>
</w:p>
<w:p>
  <w:pPr>
    <w:jc w:val="lowKashida"/>
  </w:pPr>
  <w:r>
    <w:t>wdAlignParagraphJustifyLow</w:t>
  </w:r>
</w:p>
<w:p>
  <w:pPr>
    <w:jc w:val="mediumKashida"/>
  </w:pPr>
  <w:r>
    <w:t>wdAlignParagraphJustifyMed</w:t>
  </w:r>
</w:p>
<w:p>
  <w:r>
    <w:t>wdAlignParagraphLeft</w:t>
  </w:r>
</w:p>
<w:p>
  <w:pPr>
    <w:jc w:val="right"/>
  </w:pPr>
  <w:r>
    <w:t>wdAlignParagraphRight</w:t>
  </w:r>
</w:p>
<w:p>
  <w:pPr>
    <w:jc w:val="thaiDistribute"/>
  </w:pPr>
  <w:r>
    <w:t>wdAlignParagraphThaiJustify</w:t>
  </w:r>
</w:p>
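For reference, the mapping can be read off XML like the above programmatically. This is a sketch using the standard library; the `alignment_map` helper is hypothetical, and the sample wraps three of the paragraphs above in a w:body element so they parse as one document:

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def alignment_map(body_xml):
    """Map each paragraph's text to its w:jc value, or None if w:jc is absent."""
    root = ET.fromstring(body_xml)
    mapping = {}
    for p in root.iter("{%s}p" % W):
        name = p.findtext("w:r/w:t", namespaces=NS)
        jc = p.find("w:pPr/w:jc", NS)
        mapping[name] = None if jc is None else jc.get("{%s}val" % W)
    return mapping


sample = (
    '<w:body xmlns:w="%s">'
    '<w:p><w:pPr><w:jc w:val="center"/></w:pPr>'
    '<w:r><w:t>wdAlignParagraphCenter</w:t></w:r></w:p>'
    '<w:p><w:pPr><w:jc w:val="both"/></w:pPr>'
    '<w:r><w:t>wdAlignParagraphJustify</w:t></w:r></w:p>'
    '<w:p><w:r><w:t>wdAlignParagraphLeft</w:t></w:r></w:p>'
    '</w:body>' % W
)
mapping = alignment_map(sample)
for name, val in mapping.items():
    print(name, val)
```

The None entry for wdAlignParagraphLeft reflects the missing w:pPr noted above.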

This is the code that generated the .docx file this was extracted from, in case it's handy later:

# coding: utf-8

from System.IO import FileInfo

import clr
clr.AddReference("office")
clr.AddReference("Microsoft.Office.Interop.Word")

from Microsoft.Office.Core import MsoTriState
from Microsoft.Office.Interop.Word import (
    ApplicationClass, WdParagraphAlignment
)

wordApp = ApplicationClass()
wordApp.Visible = MsoTriState.msoTrue

document = wordApp.Documents.Open(FileInfo('test.docx').FullName)

paragraphs = document.paragraphs

alignments = (
    'wdAlignParagraphCenter',
    'wdAlignParagraphDistribute',
    'wdAlignParagraphJustify',
    'wdAlignParagraphJustifyHi',
    'wdAlignParagraphJustifyLow',
    'wdAlignParagraphJustifyMed',
    'wdAlignParagraphLeft',
    'wdAlignParagraphRight',
    'wdAlignParagraphThaiJustify',
)

for align_str in alignments:
    paragraphs[paragraphs.count].range.InsertParagraphAfter()
    paragraph = paragraphs[paragraphs.count]
    paragraph.alignment = getattr(WdParagraphAlignment, align_str)
    paragraph.range.text = align_str

eyalbd1 commented on September 27, 2024

Hey Guys,

What is the expected release for this feature?

Thanks.

scanny commented on September 27, 2024

@onlyjus are you going to finish this one up? I can take it over if you can't get to it.

scanny commented on September 27, 2024

Added in commit 25abe5b on develop. Will appear in next release over the weekend. @onlyjus I gave you 4 of the 5 commits.
