obsidian-html / obsidian-html

Python code to convert Obsidian notes to proper markdown and optionally to create an html site too.

Home Page: https://obsidian-html.github.io

License: GNU General Public License v3.0

Languages: Python 47.25%, CSS 5.73%, HTML 6.83%, Dockerfile 0.07%, JavaScript 39.01%, Jupyter Notebook 0.81%, Nix 0.29%, Shell 0.03%
Topics: markdown, obsidian, obsidian-md, obsidianmd, obsidian-html, markdown-to-html, notes, notes-app, note-taking, obsidian-notes

obsidian-html's Introduction

Description

An application to export Obsidian notes to standard markdown and an html based website.

Important note: This package has the same name as another package that used to be quite well known. That one seems to have been renamed to Oboe. The original was located at https://github.com/kmaasrud/obsidian-html and later https://github.com/kmaasrud/oboe which you find referenced in a lot of places. We would link to it but we can't find an authoritative source, only forks.

obsidian-html's People

Contributors

alldaydev00, bhickta, dwrolvink, gewerd-strauss, jcolson, pangoraw, programmerino, sevmonster

obsidian-html's Issues

Include pasted images

Obsidian allows users to paste images in the client.
When a link to such an image is found, the image needs to be copied over to the output.

Allow template.html to be provided by enduser

Now with the fancy packaging, it has become very confusing for end users to change the template that is being used.

For the main.css file this is not an issue, as it can be overwritten once, but the template is merged with every note, so it should be easy to edit it ahead of processing the output.

graph doesn't show all links to nodes

I'm not sure how to report this, but I've noticed that some node links that should be there (they are present in Obsidian, for example) are missing from the html output of the graph.

If there is any debugging that I can help with, let me know.

Make link read out more robust

When taking an existing standard markdown project as direct input, a lot of errors pop up with links:

  • Links can omit the .md suffix; it needs to be added when no suffix is present
  • Drive links like 'C:' need to be handled as external links
  • Links starting with '/' should be fixed to root instead of page_root
  • Links that include an anchor, like home.md#Chapter1, need the anchor removed for processing and then put back at the end

Possibly more issues.

Make a new class to handle links:

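import re

class MdUrl:
  # Placeholder for the proposed link-handling class
  def __init__(self, url):
    self.url = url

# Raw string so that the regex escapes are not treated as string escapes
proper_links = re.findall(r"(?<=\]\().+?(?=\))", md.page)
for l in proper_links:
  link = MdUrl(l)
  ...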
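
The four bullet points above could be handled roughly as in the sketch below. This is only an illustration of the idea, not the project's implementation: the attribute names (`path`, `anchor`, `is_external`, `from_root`) and the `find_links` helper are made up for the example, and anything with a scheme or drive prefix is simply treated as external.

```python
import re
from pathlib import PurePosixPath


class MdUrl:
    """Hypothetical link wrapper; attribute names are illustrative only."""

    def __init__(self, url: str):
        self.raw = url
        # Split off an anchor such as '#Chapter1' so it can be re-attached later.
        self.path, sep, anchor = url.partition("#")
        self.anchor = sep + anchor
        # Treat URLs with a scheme and drive links like 'C:' as external.
        self.is_external = bool(re.match(r"^[A-Za-z][A-Za-z0-9+.-]*:", self.path))
        # A leading '/' means the link is relative to the root, not the page.
        self.from_root = self.path.startswith("/")
        # Add '.md' when a local link has no suffix at all.
        if self.path and not self.is_external and not PurePosixPath(self.path).suffix:
            self.path += ".md"

    def __str__(self) -> str:
        # Put the anchor back at the end after processing.
        return self.path + self.anchor


def find_links(page: str) -> list:
    """Return an MdUrl for every '](target)' occurrence in the page."""
    return [MdUrl(target) for target in re.findall(r"(?<=\]\().+?(?=\))", page)]
```

For example, `str(MdUrl("home#Chapter1"))` gives `home.md#Chapter1`, while `MdUrl("C:/docs/x.txt").is_external` is true.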

Fix graph view for process_all: True

The new option of changing

process_all: False
to True allows us to convert all notes, not just those that are reachable via the homepage.

But this breaks the backlink mechanism, as we can't guarantee that the recurse function is called by a linking page (it can be the main loop). Thus we need to flip it around. In the linking page, figure out what the dest_path is of each link of the note, and if a valid link, add the node and the link to the pb.network_tree object.

Then remove the code that is now the wrong way around.

Note: this will also fix another bug:

  • Links are missing in the graph view when more than one note links to another (an already processed note does not get processed again).
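
A minimal sketch of the flipped approach, assuming the network tree can be modelled as a plain dict with `nodes` and `links` lists. The real `pb.network_tree` object may look different, and the function and parameter names here are hypothetical.

```python
def add_links_to_network(network_tree, source, targets, known_notes):
    """Register the linking page and each of its valid outgoing links."""
    nodes = network_tree.setdefault("nodes", [])
    links = network_tree.setdefault("links", [])
    if source not in nodes:
        nodes.append(source)
    for target in targets:
        # Skip links that do not resolve to a known note.
        if target not in known_notes:
            continue
        if target not in nodes:
            nodes.append(target)
        edge = {"source": source, "target": target}
        # Avoid duplicate edges when a page is handled more than once.
        if edge not in links:
            links.append(edge)
```

Because the links are added from the linking page, it no longer matters whether the target note has already been processed.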

stylesheets not abiding by the `html_url_prefix` setting

It seems as though the stylesheets referenced in the html are not abiding by the html_url_prefix setting:

[2022-02-06 11:17:41] ERROR `/98682199-5ac9-448c-afc8-23ab7359a91b-static/mermaid.css' not found.
[2022-02-06 11:17:41] ERROR `/98682199-5ac9-448c-afc8-23ab7359a91b-static/main.css' not found.
[2022-02-06 11:17:41] ERROR `/98682199-5ac9-448c-afc8-23ab7359a91b-static/graph.css' not found.
[2022-02-06 11:17:41] ERROR `/98682199-5ac9-448c-afc8-23ab7359a91b-static/mermaid.min.js' not found.
[2022-02-06 11:17:41] ERROR `/98682199-5ac9-448c-afc8-23ab7359a91b-static/graph.css' not found.

Working in main

We're now all working in the main branch via PRs.
It might be useful to create a dev branch and work in there, so that we can control the testing a bit better before merging to master.

I can also give regular contributors more permissions on the dev branch then.

Good idea? Better ways of working?

image tags with alt text don't get parsed by the findall regex

image isn't getting the path correct ...

[2022-02-07 15:38:33] ERROR `/media/D_D-DragonHeart-Dec-15,-2021-18-31-57-GMT-image1.png' not found.
<p><img alt="Cost Inspiration Points Eff ect Alter your roll by +1 Alter teammates roll by +1 Alter the DMS roll by+/- 1 " src="../../media/D_D-DragonHeart-Dec-15,-2021-18-31-57-GMT-image1.png" />   </p>

where the rest of the images are good:

<p><img alt="" src="/output/html/media/D_D-DragonHeart-Dec-15%2C-2021-18-31-57-GMT-image2.jpeg" />   </p>
<p><img alt="" src="/output/html/media/D_D-DragonHeart-Dec-15%2C-2021-18-31-57-GMT-image3.jpeg" />   </p>

Add tag pages

  • Add one page for every tag containing links to each note with that tag.
  • Sort most recently edited to oldest edit
  • url: host/<prefix/>tags/type/article for tag type/article
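
As a rough sketch of the three points above, assuming each note is available as a `(path, tags, mtime)` tuple. That input shape and both function names are assumptions made for this example.

```python
from collections import defaultdict


def build_tag_index(notes):
    """Map each tag to its note paths, most recently edited first."""
    index = defaultdict(list)
    for path, tags, mtime in notes:
        for tag in tags:
            index[tag].append((mtime, path))
    return {
        tag: [path for _, path in sorted(entries, reverse=True)]
        for tag, entries in index.items()
    }


def tag_url(tag, prefix=""):
    """Build the URL path of a tag page, e.g. <prefix>/tags/type/article."""
    return f"{prefix}/tags/{tag}"
```

Nested tags such as type/article keep their slash, so the tag page ends up in a matching subfolder.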

losing certain characters on HTML conversion

Notice how the û character is removed in HTML output

original:

location:: Turmish, Faerûn

intermediate md:

location:: Turmish, Faerûn   

resultant HTML:

        <p>location:: Turmish, Faern   </p>

Give warning when duplicate filenames are found in root

Because of how the code (and Obsidian) works, duplicate filenames are not allowed anywhere in the root folder, regardless of the subfolder.

If duplicates do exist, you'll get buggy behavior.
There should be a warning output when duplicate filenames are found.
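
One possible way to detect this up front, sketched with the standard library only. The function names are hypothetical and the check is not tied to how the file tree is actually built in the code.

```python
import warnings
from collections import defaultdict
from pathlib import Path


def find_duplicate_filenames(root):
    """Return {filename: [paths]} for names that occur more than once under root."""
    seen = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            seen[path.name].append(path)
    return {name: paths for name, paths in seen.items() if len(paths) > 1}


def warn_on_duplicates(root):
    """Emit one warning per duplicated filename and return the duplicates."""
    duplicates = find_duplicate_filenames(root)
    for name, paths in duplicates.items():
        locations = ", ".join(str(p) for p in sorted(paths))
        warnings.warn(f"Duplicate filename '{name}' found at: {locations}")
    return duplicates
```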

Code blocks are not ignored

If you write something in a code block, it should not be altered by our conversions, but at the moment that is not guaranteed.
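
One common way to get this guarantee is to swap code spans for placeholders before converting and to restore them afterwards. The sketch below illustrates that technique; it is not the project's code, the helper name is made up, and the regex only covers backtick-fenced blocks and inline backtick spans.

```python
import re

# Fenced blocks first, then inline code spans.
CODE_PATTERN = re.compile(r"`{3}.*?`{3}|`[^`\n]+`", re.DOTALL)


def convert_outside_code(page, convert):
    """Apply convert() to the page while leaving code blocks untouched."""
    stash = []

    def hide(match):
        # Remember the code verbatim and leave a placeholder behind.
        stash.append(match.group(0))
        return f"\x00CODE{len(stash) - 1}\x00"

    protected = CODE_PATTERN.sub(hide, page)
    converted = convert(protected)
    # Put the original code back in place of each placeholder.
    return re.sub(r"\x00CODE(\d+)\x00", lambda m: stash[int(m.group(1))], converted)
```

The placeholder uses NUL characters on the assumption that they never occur in a note.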

Graph view: Add Zooming/Panning

Required for bigger projects: the graph view is unusable with very many notes.

Ticket for documentation only; a solution is already in the works.

Tests to add

List of issues we face(d) and what tests to add to cover this.

(1): this breaks often #88

make a helper function to return config values

This will be a clean way to handle values that are not configured in the config.yml.

The helper function will set 'default' values for config.yml keys. If a requested value is not properly configured, it will fall back to those default values OR give an appropriate error telling the end user to configure the key in their config.yml.
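
A minimal sketch of such a helper, assuming the parsed config.yml is available as a dict. The `DEFAULTS` contents and the name `get_config` are placeholders chosen for the example.

```python
# Hypothetical defaults; the real list of keys would come from the project.
DEFAULTS = {"html_url_prefix": "", "process_all": False}


def get_config(config, key, defaults=DEFAULTS):
    """Return the configured value, a default, or raise a helpful error."""
    value = config.get(key)
    if value is not None:
        return value
    if key in defaults:
        return defaults[key]
    raise KeyError(
        f"'{key}' is not set in config.yml and has no default; please configure it."
    )
```

Keys without a sensible default are simply left out of `DEFAULTS`, which turns a missing value into a clear error instead of a crash further down.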

Page open twice header link bug

Related to #12

It is possible atm to have a page open twice in different tabs.
Any id based operations will prefer the earlier tab.

Hovering over a header in a later tab will show the link icon in the corresponding header of the earlier tab.
Perhaps change the operation to be class based.

No such file or directory: '/Users/jcolson/src/personal/DragonHearts/output/md/Personal/httpd.service'

looks like the markdown made a link out of something that wasn't a link, which confused the html conversion

MarkdownLink(
        url = "httpd.service", 
        suffix = '.service', 
        isValid = True, 
        isExternal = False, 
        inRoot = True, 
        src_path = /Users/jcolson/src/personal/DragonHearts/output/md/Personal/httpd.service, 
        rel_src_path = Personal/httpd.service, 
        rel_src_path_posix = Personal/httpd.service, 
        page_path = /Users/jcolson/src/personal/DragonHearts/output/md/Personal/renew-certificates---lets-enc.md, 
        root_path = /Users/jcolson/src/personal/DragonHearts/output/md 
)
Traceback (most recent call last):
  File "/usr/local/bin/obsidianhtml", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.9/site-packages/obsidianhtml/__init__.py", line 529, in main
    ConvertMarkdownPageToHtmlPage(unparsed[k]['fullpath'], pb)
  File "/usr/local/lib/python3.9/site-packages/obsidianhtml/__init__.py", line 118, in ConvertMarkdownPageToHtmlPage
    shutil.copyfile(link.src_path, paths['html_output_folder'].joinpath(link.rel_src_path))
  File "/usr/local/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/shutil.py", line 264, in copyfile
    with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: '/Users/jcolson/src/personal/DragonHearts/output/md/Personal/httpd.service'

NameError: name 'warnings' is not defined

File /Users/jcolson/src/personal/DragonHearts/output/md/media/Personal-Recipe--How-to-Make-Jalapeño-Pickles---Field-&-Stream-image3.gif not located, so not copied.
Traceback (most recent call last):
  File "/usr/local/bin/obsidianhtml", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.9/site-packages/obsidianhtml/__init__.py", line 532, in main
    ConvertMarkdownPageToHtmlPage(unparsed[k]['fullpath'], pb)
  File "/usr/local/lib/python3.9/site-packages/obsidianhtml/__init__.py", line 154, in ConvertMarkdownPageToHtmlPage
    warnings.warn(f"Image {str(full_link_path)} treated as external and not imported in html")
NameError: name 'warnings' is not defined

Compile RSS feed

I have the following note template:

---
tags:
- date/{{date}}
---

# {{title}}

This ensures each new post gets the date of the day it was created.

This, combined with the tag tree, could make an easy poor-man's RSS feed. It would be a nice addition for minimal cost.
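
To give an idea of the cost, here is a sketch that builds a feed from dated posts using only the standard library. It assumes the date in the `date/...` tag has the form YYYY-MM-DD and that posts are passed in as `(title, url, date)` tuples; both are assumptions, as is the function name.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_rss(site_title, site_url, posts):
    """Build an RSS 2.0 document from (title, url, 'YYYY-MM-DD') tuples."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = site_title
    ET.SubElement(channel, "link").text = site_url
    ET.SubElement(channel, "description").text = site_title
    # ISO dates sort correctly as strings; newest first.
    for title, url, date in sorted(posts, key=lambda post: post[2], reverse=True):
        published = datetime.strptime(date, "%Y-%m-%d").replace(tzinfo=timezone.utc)
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = url
        # RSS expects RFC 822 style dates.
        ET.SubElement(item, "pubDate").text = format_datetime(published)
    return ET.tostring(rss, encoding="unicode")
```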

in testing #20, ran into another link issue - `TypeError: 'bool' object is not subscriptable`

edited code to print:

        # -- [10] Add code inclusions
        for l in re.findall(r'^(\<inclusion href="[^"]*" />)', self.page, re.MULTILINE):
            link = l.replace('<inclusion href="', '').replace('" />', '')
            link_lookup = GetObsidianFilePath(link, self.file_tree)
            print(link)
            print(link_lookup)
            file_record = link_lookup[1]
            header = link_lookup[2]

            if link_lookup == False:
                self.page = self.page.replace(l, f"> **obsidian-html error:** Could not find page {link}.")
                continue
            
            self.links.append(file_record['fullpath'])
Hymey.pdf
('Hymey.pdf.md', False, '')
Traceback (most recent call last):
  File "/usr/local/bin/obsidianhtml", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.9/site-packages/obsidianhtml/__init__.py", line 462, in main
    recurseObisidianToMarkdown(str(paths['obsidian_entrypoint']), pb)
  File "/usr/local/lib/python3.9/site-packages/obsidianhtml/__init__.py", line 74, in recurseObisidianToMarkdown
    recurseObisidianToMarkdown(files[link_path]['fullpath'], pb)
  File "/usr/local/lib/python3.9/site-packages/obsidianhtml/__init__.py", line 74, in recurseObisidianToMarkdown
    recurseObisidianToMarkdown(files[link_path]['fullpath'], pb)
  File "/usr/local/lib/python3.9/site-packages/obsidianhtml/__init__.py", line 44, in recurseObisidianToMarkdown
    md.ConvertObsidianPageToMarkdownPage(paths['md_folder'], paths['obsidian_entrypoint'])
  File "/usr/local/lib/python3.9/site-packages/obsidianhtml/MarkdownPage.py", line 287, in ConvertObsidianPageToMarkdownPage
    self.links.append(file_record['fullpath'])
TypeError: 'bool' object is not subscriptable
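
Judging only by the printed output above, the lookup can return a tuple whose second element is False, so the `link_lookup == False` check never fires. A guard along the following lines would cover that case. The helper name is hypothetical and the return shape of GetObsidianFilePath is inferred from the printed tuple, not confirmed.

```python
def unpack_lookup(link_lookup):
    """Return (file_record, header), or None when the lookup did not find a file."""
    # Covers a plain False/None result as well as a tuple that is too short.
    if not link_lookup or len(link_lookup) < 2:
        return None
    file_record = link_lookup[1]
    # The lookup can succeed as a tuple but still carry False as the record.
    if not file_record:
        return None
    header = link_lookup[2] if len(link_lookup) > 2 else ""
    return file_record, header
```

The caller can then write the "Could not find page" message whenever the result is None, before touching `file_record['fullpath']`.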

IndexError: tuple index out of range

Not sure how to troubleshoot this one.

Traceback (most recent call last):
  File "/opt/homebrew/bin/obsidianhtml", line 8, in <module>
    sys.exit(main())
  File "/opt/homebrew/lib/python3.9/site-packages/obsidianhtml/__init__.py", line 365, in main
    recurseObisidianToMarkdown(str(paths['obsidian_entrypoint']), pb)
  File "/opt/homebrew/lib/python3.9/site-packages/obsidianhtml/__init__.py", line 75, in recurseObisidianToMarkdown
    recurseObisidianToMarkdown(files[link_path]['fullpath'], pb)
  File "/opt/homebrew/lib/python3.9/site-packages/obsidianhtml/__init__.py", line 75, in recurseObisidianToMarkdown
    recurseObisidianToMarkdown(files[link_path]['fullpath'], pb)
  File "/opt/homebrew/lib/python3.9/site-packages/obsidianhtml/__init__.py", line 49, in recurseObisidianToMarkdown
    md.ConvertObsidianPageToMarkdownPage(paths['md_folder'], paths['obsidian_entrypoint'])
  File "/opt/homebrew/lib/python3.9/site-packages/obsidianhtml/MarkdownPage.py", line 277, in ConvertObsidianPageToMarkdownPage
    header = link_lookup[2]
IndexError: tuple index out of range

was getting FileNotFoundError

I had no way to determine which file was causing the issue.

error I was seeing:

904/1418
Traceback (most recent call last):
  File "/usr/local/bin/obsidianhtml", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.9/site-packages/obsidianhtml/__init__.py", line 477, in main
    recurseObisidianToMarkdown(unparsed[k]['fullpath'], pb)
  File "/usr/local/lib/python3.9/site-packages/obsidianhtml/__init__.py", line 41, in recurseObisidianToMarkdown
    md = MarkdownPage(page_path, paths['obsidian_folder'], files)
  File "/usr/local/lib/python3.9/site-packages/obsidianhtml/MarkdownPage.py", line 39, in __init__
    self.metadata, self.page = frontmatter.parse(f.read())
  File "/usr/local/lib/python3.9/site-packages/frontmatter/__init__.py", line 82, in parse
    fm = handler.load(fm)
  File "/usr/local/lib/python3.9/site-packages/frontmatter/default_handlers.py", line 238, in load
    return yaml.load(fm, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/yaml/__init__.py", line 81, in load
    return loader.get_single_data()
  File "/usr/local/lib/python3.9/site-packages/yaml/constructor.py", line 49, in get_single_data
    node = self.get_single_node()
  File "yaml/_yaml.pyx", line 673, in yaml._yaml.CParser.get_single_node
  File "yaml/_yaml.pyx", line 687, in yaml._yaml.CParser._compose_document
  File "yaml/_yaml.pyx", line 731, in yaml._yaml.CParser._compose_node
  File "yaml/_yaml.pyx", line 847, in yaml._yaml.CParser._compose_mapping_node
  File "yaml/_yaml.pyx", line 860, in yaml._yaml.CParser._parse_next_event
yaml.parser.ParserError: while parsing a block mapping
  in "<unicode string>", line 2, column 1
did not find expected key
  in "<unicode string>", line 3, column 29
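
A small wrapper that re-raises with the file path attached would make errors like this traceable. The sketch below takes the parser as an argument so it stays self-contained; the wrapper name is made up.

```python
def parse_with_context(path, parse):
    """Read a file and parse it, naming the file in any error raised."""
    with open(path, encoding="utf-8") as f:
        text = f.read()
    try:
        return parse(text)
    except Exception as exc:
        # Chain the original error so the full traceback is still available.
        raise RuntimeError(f"Failed to parse front matter in {path}: {exc}") from exc
```

With the call from the traceback, this would be used as `parse_with_context(page_path, frontmatter.parse)`.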

Rename/restructure image_suffixes mechanism

To discern between image tags and note inclusions, the image_suffixes list is now used to check what suffix a link has.

#26 added pdf to the list, which is not an image. This can be confusing in the future.

  • Rename image_suffixes to something more sensible
  • Move this list to the config yaml so that further code changes are not necessary in the future
  • Add documentation

`not_created.html` is not getting tags (such as `html_url_prefix`) replaced

                <!-- Includes -->
                <link rel="stylesheet" href="{html_url_prefix}/98682199-5ac9-448c-afc8-23ab7359a91b-static/mermaid.css" />
                <link rel="stylesheet" href="{html_url_prefix}/98682199-5ac9-448c-afc8-23ab7359a91b-static/main.css" />
                <script src="{html_url_prefix}/98682199-5ac9-448c-afc8-23ab7359a91b-static/mermaid.min.js"></script> 
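
For illustration, the replacement step could be a plain string substitution that is applied to every generated page, not_created.html included. The function below is a hypothetical sketch and not the code the project uses.

```python
def fill_template(template, values):
    """Replace {key} placeholders in a template with the given values."""
    for key, value in values.items():
        template = template.replace("{" + key + "}", value)
    return template
```

`str.replace` is used instead of `str.format` on purpose: literal braces from inline CSS or JavaScript in the template would otherwise be read as format fields.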

Factor out ConvertFullWindowsPathToRelativeMarkdownPath()

Related to the fix of #8

  • Factor out ConvertFullWindowsPathToRelativeMarkdownPath() in favor of using pathlib standard methods.
  • Standardize variable naming and use of Path vs string paths e.g.
    • rel_path vs relative_path
    • _path always being a Path('').resolve() object
    • *_path_posix for strings in the a/b/file.md format
    • etc

Allow index.html#go-to-header anchors

Note for reference. Just fixed this.

  • When in tab mode, we want to move to the correct section
  • When opening a new tab it needs to scroll to the correct section (without screwing up the layout of the page...)

Fixed in 94218c6

Enable Obsidian inclusions

Obsidian has the following syntax for including notes in a note:

![[Note Name]]
![[Note Name#Chapter Name]]

This is not handled at the moment.
While I think this feature is kinda whack, it is a core Obsidian feature, so it should be supported.

Md to html does not allow duplicate filenames

Related to the fix for https://github.com/dwrolvink/obsidian-html/issues/7

This fix also applies for md->html conversion, where it isn't necessary because in markdown the referenced file path can always be deduced.

This is quite a bit of work to get right though. The md->html code now uses the files[file_name] dict indexing method to find local files. We could use files[relative_path] but then the relative path of the linked file should be determined after the regex search.

This is done by doing (Path(page_path).parent.path + link_path) and then it needs to be processed so that: /a/b/c/../../bla.md --> /a/bla.md
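
The normalization step described above can be sketched with `posixpath` from the standard library. The function name is made up, and the paths are assumed to be posix-style strings.

```python
import posixpath


def resolve_link(page_path, link_path):
    """Resolve a link relative to the folder of the page that contains it."""
    page_folder = posixpath.dirname(page_path)
    # normpath collapses the '..' segments without touching the filesystem.
    return posixpath.normpath(posixpath.join(page_folder, link_path))
```

For a page at /a/b/c/page.md, `resolve_link("/a/b/c/page.md", "../../bla.md")` returns `/a/bla.md`, matching the example above.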

FR: First compile markdown then compile HTML

Atm html and md are compiled side by side, because that's easier.
But this does not allow existing proper markdown to function as input without possible bugs.

We should separate the code into a md compilation stage and an html compilation stage.
