obsidian-html / obsidian-html

323.0 4.0 48.0 67.18 MB

Python code to convert Obsidian notes to proper markdown and optionally to create an html site too.

Home Page: https://obsidian-html.github.io

License: GNU General Public License v3.0

Python 47.25% CSS 5.73% HTML 6.83% Dockerfile 0.07% JavaScript 39.01% Jupyter Notebook 0.81% Nix 0.29% Shell 0.03%

markdown obsidian obsidian-md obsidianmd obsidian-html markdown-to-html notes notes-app note-taking obsidian-notes

obsidian-html's Introduction

Obsidian-html

Description

An application to export Obsidian notes to standard markdown and an html based website.

ObsidianHtml/Documentation.

Important note: This package has the same name as another package that used to be quite well known. That one seems to have been renamed to Oboe. The original was located at https://github.com/kmaasrud/obsidian-html and later https://github.com/kmaasrud/oboe which you find referenced in a lot of places. We would link to it but we can't find an authoritative source, only forks.

obsidian-html's Issues

Include pasted images

Obsidian allows users to paste images in the client.
When such an image link is found, the image needs to be copied over to the output.

idea: a way to show backlinks to a node/document

Allow template.html to be provided by enduser

Now with the fancy packaging, it has become very confusing for end users to change the template that is being used.

For the main.css file this is not an issue, as it can be overwritten once, but the template is merged with every note, so it should be easy to edit it ahead of processing the output.

graph doesn't show all links to nodes

I'm not sure how to report on this, but I've noticed that there are node links that should be there (they are in obsidian for example) but they are not in the html output of the graph.

if there is any debug that I can help with, let me know.

Make link read out more robust

When taking an existing standard markdown project as direct input, a lot of errors pop up with links:

Links can omit the .md suffix, this needs to be added when no suffix is present
Drive links like 'C:' need to be handled as external links
Links starting with '/' should be fixed to root instead of page_root
Links that include an anchor, like home.md#Chapter1, need the anchor removed for processing and then put back at the end

Possibly more issues.

Make a new class to handle links:

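import re

class MdUrl:
  def __init__(self, url):
    self.url = url  # raw link target as found in the page

# Capture the target of every [text](target) link.
# The raw string avoids invalid escape sequence warnings.
proper_links = re.findall(r"(?<=\]\().+?(?=\))", md.page)
for l in proper_links:
  link = MdUrl(l)
  ...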
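The four link rules listed above could be handled along these lines. This is only a sketch: the field names (`raw`, `path`, `anchor`, `is_external`, `from_root`) and the `rebuild` method are hypothetical and not taken from the project.

```python
import re
from dataclasses import dataclass


@dataclass
class MdUrl:
    """Hypothetical link wrapper; all field names are illustrative only."""
    raw: str
    path: str = ""
    anchor: str = ""
    is_external: bool = False
    from_root: bool = False

    def __post_init__(self):
        url = self.raw.strip()
        # URL schemes (https:) and Windows drive letters (C:) count as external.
        if re.match(r"^[A-Za-z][A-Za-z0-9+.-]*:", url):
            self.is_external = True
            self.path = url
            return
        # Split off the anchor so it can be put back after processing.
        url, _, self.anchor = url.partition("#")
        # A leading '/' means "relative to the root", not to the page.
        if url.startswith("/"):
            self.from_root = True
            url = url.lstrip("/")
        # Add the .md suffix when the last path segment has no suffix.
        if url and "." not in url.rsplit("/", 1)[-1]:
            url += ".md"
        self.path = url

    def rebuild(self) -> str:
        """Return the processed link with its anchor re-attached."""
        if self.is_external:
            return self.path
        prefix = "/" if self.from_root else ""
        suffix = f"#{self.anchor}" if self.anchor else ""
        return f"{prefix}{self.path}{suffix}"


page = "See [home](home#Chapter1), [disk](C:/notes/a.md) and [top](/index)."
links = [MdUrl(l) for l in re.findall(r"(?<=\]\().+?(?=\))", page)]
```

Keeping the anchor in its own field means the path can be looked up without it and the anchor is only re-attached when the link is written out again.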

Fix graph view for process_all: True

The new mode of setting

obsidian-html/example_config.yml

Line 55 in ff371c6

process_all: False

to true allows us to convert all notes, not just those that are reachable via the homepage.

But this breaks the backlink mechanism, as we can't guarantee that the recurse function is called by a linking page (it can be the main loop). Thus we need to flip it around. In the linking page, figure out what the dest_path is of each link of the note, and if a valid link, add the node and the link to the pb.network_tree object.

Then remove the code that is now the wrong way around.

Note: this will also fix another bug:

Links are missing in the graph view when more than one note links to another (a processed note does not get processed again).
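A rough sketch of the "flipped" approach, where the linking page registers its own outgoing links. The shape of the graph object is an assumption: here it is a plain dict with `nodes` and `links` lists, which may not match the real `pb.network_tree`. The function name and the `known_notes` argument are hypothetical as well.

```python
def add_links_to_network(network_tree, source, links, known_notes):
    """Register `source` and each valid outgoing link in the graph.

    `network_tree` is assumed to be a dict with "nodes" and "links" lists.
    Links to notes that are not in `known_notes` are skipped.
    """
    nodes = network_tree.setdefault("nodes", [])
    edges = network_tree.setdefault("links", [])
    if source not in nodes:
        nodes.append(source)
    for target in links:
        if target not in known_notes:
            continue  # not a valid link, so no node and no edge
        if target not in nodes:
            nodes.append(target)
        edge = {"source": source, "target": target}
        if edge not in edges:
            edges.append(edge)
    return network_tree


tree = add_links_to_network({}, "home", ["a", "missing", "a"], {"home", "a", "b"})
tree = add_links_to_network(tree, "b", ["a"], {"home", "a", "b"})
```

Because every page adds its own edges, a note that was already processed still receives an incoming edge from each later page that links to it.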

stylesheets not abiding by the `html_url_prefix` setting

it seems as though the stylesheets referenced in the html are not abiding by the html_url_prefix setting:

[2022-02-06 11:17:41] ERROR `/98682199-5ac9-448c-afc8-23ab7359a91b-static/mermaid.css' not found.
[2022-02-06 11:17:41] ERROR `/98682199-5ac9-448c-afc8-23ab7359a91b-static/main.css' not found.
[2022-02-06 11:17:41] ERROR `/98682199-5ac9-448c-afc8-23ab7359a91b-static/graph.css' not found.
[2022-02-06 11:17:41] ERROR `/98682199-5ac9-448c-afc8-23ab7359a91b-static/mermaid.min.js' not found.
[2022-02-06 11:17:41] ERROR `/98682199-5ac9-448c-afc8-23ab7359a91b-static/graph.css' not found.

Working in main

We're now all working in the main branch via PRs.
It might be useful to create a dev branch and work in there, and then we can control the testing a bit better before merging to master.

I can also give regular contributors more permissions on the dev branch then.

Good idea? Better ways of working?

image tags with alt text don't get parsed by the findall regex

image isn't getting the path correct ...

[2022-02-07 15:38:33] ERROR `/media/D_D-DragonHeart-Dec-15,-2021-18-31-57-GMT-image1.png' not found.

<p><img alt="Cost Inspiration Points Eff ect Alter your roll by +1 Alter teammates roll by +1 Alter the DMS roll by+/- 1 " src="../../media/D_D-DragonHeart-Dec-15,-2021-18-31-57-GMT-image1.png" />   </p>

where the rest of the images are good:

<p><img alt="" src="/output/html/media/D_D-DragonHeart-Dec-15%2C-2021-18-31-57-GMT-image2.jpeg" />   </p>
<p><img alt="" src="/output/html/media/D_D-DragonHeart-Dec-15%2C-2021-18-31-57-GMT-image3.jpeg" />   </p>

default css doesn't scale images

would be good to be able to see large images

Add tag pages

Add one page for every tag containing links to each note with that tag.
Sort most recently edited to oldest edit
url: host/<prefix/>tags/type/article for tag type/article

external http anchor tags should use a target="_blank" so that clicks on them are launched in another tab/window

losing certain characters on HTML conversion

Notice how the û character is removed in HTML output

original:

location:: Turmish, Faerûn

intermediate md:

location:: Turmish, Faerûn

resultant HTML:

        <p>location:: Turmish, Faern   </p>

Mermaid usage

Mermaid saves as code, no image.

Give warning when duplicate filenames are found in root

Because of how the code (and Obsidian) works, duplicate filenames are not allowed anywhere in the root folder, regardless of the subfolder.

If this is not the case, you'll get buggy behavior.
There should be a warning output when duplicate filenames are found.
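A minimal sketch of such a check, assuming the vault is scanned recursively with `pathlib`. The function names and the `exclude` parameter are hypothetical; the default of skipping `.DS_Store` is only an example of a file that probably should not count.

```python
import tempfile
import warnings
from collections import defaultdict
from pathlib import Path


def find_duplicate_filenames(root, exclude=(".DS_Store",)):
    """Map each filename that occurs more than once under root to its paths."""
    seen = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file() and path.name not in exclude:
            seen[path.name].append(path)
    return {name: paths for name, paths in seen.items() if len(paths) > 1}


def warn_on_duplicates(root):
    """Emit one warning per duplicated filename and return the duplicates."""
    duplicates = find_duplicate_filenames(root)
    for name, paths in duplicates.items():
        locations = ", ".join(str(p) for p in sorted(paths))
        warnings.warn(f'Two or more files named "{name}" exist: {locations}')
    return duplicates


# Small demonstration on a throwaway folder.
with tempfile.TemporaryDirectory() as tmp:
    for sub in ("a", "b"):
        (Path(tmp) / sub).mkdir()
        (Path(tmp) / sub / "note.md").write_text("x", encoding="utf-8")
        (Path(tmp) / sub / ".DS_Store").write_text("", encoding="utf-8")
    (Path(tmp) / "a" / "unique.md").write_text("y", encoding="utf-8")
    demo_duplicates = find_duplicate_filenames(tmp)
```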

idea: it would be a good feature to have a dynamic entrypoint based on a tag

It would be a nice feature to be able to have an entrypoint index html based on a tag.

Such as #my-entrypoint-links

Could build an index.md based on those tags and convert from that entrypoint.

Implement pip install method

See https://github.com/kmaasrud/oboe/blob/master/setup.py
This project should copy that method for installation.

Code blocks are not ignored

if you write something in a code block, it should not be altered by our conversions, but at the moment that is not guaranteed
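One possible way to guarantee this is to mask code before converting and restore it afterwards. The sketch below is an assumption about how that could look, not the project's implementation; `convert_outside_code` and the placeholder format are made up for illustration. The fence marker is built with `"`" * 3` only to keep the example readable.

```python
import re

FENCE = "`" * 3  # three backticks
# Fenced blocks first, then single-line inline code spans.
CODE_RE = re.compile(
    re.escape(FENCE) + r".*?" + re.escape(FENCE) + r"|`[^`\n]+`",
    re.DOTALL,
)


def convert_outside_code(text, convert):
    """Apply `convert` to text while leaving code blocks and spans untouched."""
    stashed = []

    def stash(match):
        stashed.append(match.group(0))
        return f"\x00CODE{len(stashed) - 1}\x00"

    masked = CODE_RE.sub(stash, text)
    converted = convert(masked)
    # Put the original code back in place of each placeholder.
    for index, block in enumerate(stashed):
        converted = converted.replace(f"\x00CODE{index}\x00", block)
    return converted


source = f"[[Note]]\n{FENCE}\n[[Note]]\n{FENCE}\nand `[[x]]`"
result = convert_outside_code(
    source, lambda s: s.replace("[[", "<").replace("]]", ">")
)
```

Only the first `[[Note]]` is rewritten here; the copy inside the fenced block and the inline span come back exactly as written.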

Disable tab layout on phone

Graph view: Add Zooming/Panning

Required for bigger projects: the graph view is unusable with very many notes

Ticket for documentation only, solution already in the works

Tests to add

List of issues we face(d) and what tests to add to cover this.

#138 (1)
#164

(1): this breaks often #88

make a helper function to return config values

this will be a clean way to handle if values are not configured in the config.yml

the helper function will set 'default' values for config.yml keys and if a specific value is requested that is not properly configured, will fall back to those default values OR will give an appropriate error to the end user to tell them to configure a key in their config.yml
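A minimal sketch of such a helper. The default values shown and the `obsidian_folder` key used in the usage example are placeholders, and `ConfigError` and `get_config` are hypothetical names.

```python
# Hypothetical defaults; the real set of keys and values may differ.
DEFAULTS = {
    "process_all": False,
    "html_url_prefix": "",
    "site_name": "Notes",
}


class ConfigError(Exception):
    """Raised when a required key is missing from config.yml."""


def get_config(config, key):
    """Return config[key], a default, or raise a helpful error."""
    if config.get(key) is not None:
        return config[key]
    if key in DEFAULTS:
        return DEFAULTS[key]
    raise ConfigError(f"'{key}' is not set. Please configure it in your config.yml.")


user_config = {"process_all": True}
process_all = get_config(user_config, "process_all")   # user value wins
prefix = get_config(user_config, "html_url_prefix")    # falls back to default
```

Keys without a sensible default, such as a vault path, would simply be left out of `DEFAULTS` so that a missing value produces the error message instead of a silent fallback.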

Use envvars or yaml input file instead of commandline arguments

I prefer the flexibility and simplicity of using yaml files or envvars compared to having to add a better parser.

Maybe implement both, otherwise, just use yaml files.

Special characters being removed by mermaid plugin

Okay, fixed in 0.0.8

Compare: https://obsidian-html.github.io/?path=%2FResources%2FExample%2520site%2FesMermaid.html/ (dynamically loaded) https://obsidian-html.github.io/Resources/Example%20site/esMermaid.html (loaded normally).

mermaid.init() was the solution here.

I tested it. Now it deletes all Russian symbols in files.

Originally posted by @slavamirniy in #18 (comment)

Page open twice header link bug

Related to #12

It is possible atm to have a page open twice in different tabs.
Any id based operations will prefer the earlier tab.

Hovering over a header in a later tab will show the link icon in the corresponding header of the earlier tab.
Perhaps change the operation to be class based.

No such file or directory: '/Users/jcolson/src/personal/DragonHearts/output/md/Personal/httpd.service'

looks like the markdown made a link out of something that wasn't a link, which confused the html conversion

MarkdownLink(
        url = "httpd.service", 
        suffix = '.service', 
        isValid = True, 
        isExternal = False, 
        inRoot = True, 
        src_path = /Users/jcolson/src/personal/DragonHearts/output/md/Personal/httpd.service, 
        rel_src_path = Personal/httpd.service, 
        rel_src_path_posix = Personal/httpd.service, 
        page_path = /Users/jcolson/src/personal/DragonHearts/output/md/Personal/renew-certificates---lets-enc.md, 
        root_path = /Users/jcolson/src/personal/DragonHearts/output/md 
)
Traceback (most recent call last):
  File "/usr/local/bin/obsidianhtml", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.9/site-packages/obsidianhtml/__init__.py", line 529, in main
    ConvertMarkdownPageToHtmlPage(unparsed[k]['fullpath'], pb)
  File "/usr/local/lib/python3.9/site-packages/obsidianhtml/__init__.py", line 118, in ConvertMarkdownPageToHtmlPage
    shutil.copyfile(link.src_path, paths['html_output_folder'].joinpath(link.rel_src_path))
  File "/usr/local/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/shutil.py", line 264, in copyfile
    with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: '/Users/jcolson/src/personal/DragonHearts/output/md/Personal/httpd.service'

NameError: name 'warnings' is not defined

File /Users/jcolson/src/personal/DragonHearts/output/md/media/Personal-Recipe--How-to-Make-Jalapeño-Pickles---Field-&-Stream-image3.gif not located, so not copied.
Traceback (most recent call last):
  File "/usr/local/bin/obsidianhtml", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.9/site-packages/obsidianhtml/__init__.py", line 532, in main
    ConvertMarkdownPageToHtmlPage(unparsed[k]['fullpath'], pb)
  File "/usr/local/lib/python3.9/site-packages/obsidianhtml/__init__.py", line 154, in ConvertMarkdownPageToHtmlPage
    warnings.warn(f"Image {str(full_link_path)} treated as external and not imported in html")
NameError: name 'warnings' is not defined

not_created.html is not being linked to properly

I'll investigate in the morning

Compile RSS feed

I have the following note template:

---
tags:
- date/{{date}}
---

# {{title}}

This ensures each new post gets the date of the day it was created.

This combined with the tag tree:

https://obsidian-html.github.io/tags/
http://localhost:8000/Resources/Example%20site/esTags.html
https://obsidian-html.github.io/Resources/Example%20site/esTags.html
obsidian-html/obsidianhtml/__init__.py

Line 468 in b13d5f4

recurseTagList(pb.tagtree, 'tags/', pb)

Could make an easy poor-man's RSS feed. Would make a nice addition for minimal cost.

maybe should use `site_name:` config instead of `Notes` as title of tags entrypoint

default to current directory for config.yml if `-i /some/path/to/it` isn't passed

sorry @dwrolvink - sometimes I create issues just to track them and don't fully flesh out the thought!

the idea is that you should be able to run obsidianhtml with no parameters if the config.yml file is in your current directory.

I've created a PR for that, as well as printing the help info out when there are other issues as well.
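The fallback could look roughly like this. It is a sketch using `argparse`; only the `-i` flag comes from the issue, while `resolve_config_path` and its `cwd` parameter are hypothetical and say nothing about how the actual PR does it.

```python
import argparse
from pathlib import Path


def resolve_config_path(argv, cwd=None):
    """Return the config path from -i, or ./config.yml when -i is absent."""
    parser = argparse.ArgumentParser(prog="obsidianhtml")
    parser.add_argument("-i", dest="config", default=None,
                        help="path to config.yml")
    args = parser.parse_args(argv)
    if args.config:
        return Path(args.config)
    # No -i given: look for config.yml in the current directory.
    return Path(cwd or Path.cwd()) / "config.yml"


explicit = resolve_config_path(["-i", "/some/path/to/it/config.yml"])
implicit = resolve_config_path([], cwd="/vault")
```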

in testing #20, ran into another link issue - `TypeError: 'bool' object is not subscriptable`

in testing #20, ran into another link issue - TypeError: 'bool' object is not subscriptable

edited code to print:

        # -- [10] Add code inclusions
        for l in re.findall(r'^(\<inclusion href="[^"]*" />)', self.page, re.MULTILINE):
            link = l.replace('<inclusion href="', '').replace('" />', '')
            link_lookup = GetObsidianFilePath(link, self.file_tree)
            print(link)
            print(link_lookup)
            file_record = link_lookup[1]
            header = link_lookup[2]

            if link_lookup == False:
                self.page = self.page.replace(l, f"> **obsidian-html error:** Could not find page {link}.")
                continue
            
            self.links.append(file_record['fullpath'])

Hymey.pdf
('Hymey.pdf.md', False, '')
Traceback (most recent call last):
  File "/usr/local/bin/obsidianhtml", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.9/site-packages/obsidianhtml/__init__.py", line 462, in main
    recurseObisidianToMarkdown(str(paths['obsidian_entrypoint']), pb)
  File "/usr/local/lib/python3.9/site-packages/obsidianhtml/__init__.py", line 74, in recurseObisidianToMarkdown
    recurseObisidianToMarkdown(files[link_path]['fullpath'], pb)
  File "/usr/local/lib/python3.9/site-packages/obsidianhtml/__init__.py", line 74, in recurseObisidianToMarkdown
    recurseObisidianToMarkdown(files[link_path]['fullpath'], pb)
  File "/usr/local/lib/python3.9/site-packages/obsidianhtml/__init__.py", line 44, in recurseObisidianToMarkdown
    md.ConvertObsidianPageToMarkdownPage(paths['md_folder'], paths['obsidian_entrypoint'])
  File "/usr/local/lib/python3.9/site-packages/obsidianhtml/MarkdownPage.py", line 287, in ConvertObsidianPageToMarkdownPage
    self.links.append(file_record['fullpath'])
TypeError: 'bool' object is not subscriptable
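The printed tuple shows that the lookup itself succeeded but its second element is `False`, so comparing the whole tuple to `False` after indexing cannot catch this case. A defensive unpacking step could look like the sketch below. The tuple layout (name, record, header) is inferred from the printed output, and `split_lookup` is a hypothetical helper.

```python
def split_lookup(link_lookup):
    """Return (file_record, header), or None when the lookup is unusable.

    Handles three failure shapes: the lookup is False, the tuple is
    shorter than expected, or the record slot holds False.
    """
    if not isinstance(link_lookup, tuple) or len(link_lookup) < 3:
        return None
    file_record, header = link_lookup[1], link_lookup[2]
    if not isinstance(file_record, dict):
        return None
    return file_record, header


missing = split_lookup(("Hymey.pdf.md", False, ""))
found = split_lookup(("a.md", {"fullpath": "/vault/a.md"}, "Chapter"))
```

The caller can then emit the "Could not find page" replacement whenever the helper returns `None`, before anything is subscripted.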

when storing the resultant html in git, too many changes as a result of the uuid.uuid4() replacement in graph_template

should use a hash that doesn't change every single time the html is generated
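A sketch of a deterministic replacement, assuming the id can be derived from something stable such as the note's relative path. `stable_id` and its `length` parameter are hypothetical.

```python
import hashlib


def stable_id(seed, length=12):
    """Return the same short hex id for the same seed on every run."""
    digest = hashlib.sha1(seed.encode("utf-8")).hexdigest()
    return digest[:length]


first_run = stable_id("Personal/note.md")
second_run = stable_id("Personal/note.md")
other_note = stable_id("Personal/other.md")
```

Since the id only changes when the seed changes, regenerating the site leaves unchanged pages byte-identical, which keeps the git diff small.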

not_created.html's home link should take into consideration `html_url_prefix`

<div class="container-wrapper">
        <div class="container">
        <p>This note was linked, but never created (or removed)</p>
<a href="javascript:history.back()">Go Back</a> or <a href="/" id="homelink">Go to Home</a>.

IndexError: tuple index out of range

Not sure how to troubleshoot this one.

Traceback (most recent call last):
  File "/opt/homebrew/bin/obsidianhtml", line 8, in <module>
    sys.exit(main())
  File "/opt/homebrew/lib/python3.9/site-packages/obsidianhtml/__init__.py", line 365, in main
    recurseObisidianToMarkdown(str(paths['obsidian_entrypoint']), pb)
  File "/opt/homebrew/lib/python3.9/site-packages/obsidianhtml/__init__.py", line 75, in recurseObisidianToMarkdown
    recurseObisidianToMarkdown(files[link_path]['fullpath'], pb)
  File "/opt/homebrew/lib/python3.9/site-packages/obsidianhtml/__init__.py", line 75, in recurseObisidianToMarkdown
    recurseObisidianToMarkdown(files[link_path]['fullpath'], pb)
  File "/opt/homebrew/lib/python3.9/site-packages/obsidianhtml/__init__.py", line 49, in recurseObisidianToMarkdown
    md.ConvertObsidianPageToMarkdownPage(paths['md_folder'], paths['obsidian_entrypoint'])
  File "/opt/homebrew/lib/python3.9/site-packages/obsidianhtml/MarkdownPage.py", line 277, in ConvertObsidianPageToMarkdownPage
    header = link_lookup[2]
IndexError: tuple index out of range

was getting FileNotFoundError

have no way to determine which file was causing the issue.

error I was seeing:

904/1418
Traceback (most recent call last):
  File "/usr/local/bin/obsidianhtml", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.9/site-packages/obsidianhtml/__init__.py", line 477, in main
    recurseObisidianToMarkdown(unparsed[k]['fullpath'], pb)
  File "/usr/local/lib/python3.9/site-packages/obsidianhtml/__init__.py", line 41, in recurseObisidianToMarkdown
    md = MarkdownPage(page_path, paths['obsidian_folder'], files)
  File "/usr/local/lib/python3.9/site-packages/obsidianhtml/MarkdownPage.py", line 39, in __init__
    self.metadata, self.page = frontmatter.parse(f.read())
  File "/usr/local/lib/python3.9/site-packages/frontmatter/__init__.py", line 82, in parse
    fm = handler.load(fm)
  File "/usr/local/lib/python3.9/site-packages/frontmatter/default_handlers.py", line 238, in load
    return yaml.load(fm, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/yaml/__init__.py", line 81, in load
    return loader.get_single_data()
  File "/usr/local/lib/python3.9/site-packages/yaml/constructor.py", line 49, in get_single_data
    node = self.get_single_node()
  File "yaml/_yaml.pyx", line 673, in yaml._yaml.CParser.get_single_node
  File "yaml/_yaml.pyx", line 687, in yaml._yaml.CParser._compose_document
  File "yaml/_yaml.pyx", line 731, in yaml._yaml.CParser._compose_node
  File "yaml/_yaml.pyx", line 847, in yaml._yaml.CParser._compose_mapping_node
  File "yaml/_yaml.pyx", line 860, in yaml._yaml.CParser._parse_next_event
yaml.parser.ParserError: while parsing a block mapping
  in "<unicode string>", line 2, column 1
did not find expected key
  in "<unicode string>", line 3, column 29
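To make the failing file visible, the read-and-parse step could be wrapped so that the path is added to the error. This is a sketch: `parse_with_context` is a hypothetical wrapper, and `strict_parser` is a stand-in for whatever parser is actually passed in.

```python
import tempfile
from pathlib import Path


def parse_with_context(path, parse):
    """Read `path` and run `parse` on it, naming the file on failure."""
    try:
        text = Path(path).read_text(encoding="utf-8")
        return parse(text)
    except Exception as exc:
        raise RuntimeError(f"Failed to process {path}: {exc}") from exc


def strict_parser(text):
    """Stand-in parser that fails the way a YAML parser might."""
    if ":" not in text:
        raise ValueError("did not find expected key")
    return text


with tempfile.TemporaryDirectory() as tmp:
    broken = Path(tmp) / "broken.md"
    broken.write_text("no key here", encoding="utf-8")
    try:
        parse_with_context(broken, strict_parser)
        demo_message = ""
    except RuntimeError as err:
        demo_message = str(err)
```

Using `raise ... from exc` keeps the original traceback attached, so the parser's line and column information is not lost.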

Rename/restructure image_suffixes mechanism

To discern between image tags and note inclusions, the image_suffixes list is now used to check what suffix a link has.

#26 added pdf to the list, which is not an image. This can be confusing in the future.

Rename image_suffixes to something more sensible
Move this list to the config yaml so that further code changes are not necessary in the future
Add documentation

FR: Identify external links in html code

Search for all <a href="string" instances that start with http/https (case insensitive)
Add in a class="external-link"
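A regex-based sketch of the two steps above. It assumes `href` is the first attribute of the tag, as in the pattern described; `mark_external_links` and the `new_tab` switch are hypothetical.

```python
import re

# Matches <a href="http..." or <a href="https..." in any letter case.
EXTERNAL_A = re.compile(r'<a\s+href="https?://[^"]*"', re.IGNORECASE)


def mark_external_links(html, new_tab=True):
    """Add class="external-link" (and optionally target="_blank")."""
    extra = ' class="external-link"'
    if new_tab:
        extra += ' target="_blank"'
    # Keep the matched text as-is and append the extra attributes.
    return EXTERNAL_A.sub(lambda m: m.group(0) + extra, html)


page = '<a href="HTTPS://example.com">x</a> <a href="/local.html">y</a>'
marked = mark_external_links(page)
plain = mark_external_links(page, new_tab=False)
```

A regex is enough for html that the tool generates itself; for arbitrary input an html parser would be the safer choice.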

enable the ability for users to _add_ custom inclusions to the already existing dynamic_inclusions via config.yml

add a:

# Provide an array of custom inclusions (css, javascript, etc) that you would like to be included in the resultant html
html_custom_inclusions:

so that custom html inclusions can be added without having to replace the default html template

html: add link chapter onhover button

In most markdown readers you can click on a link icon to get the url + '#chapter-name'.
This is pretty useful and should be added.

`not_created.html` is not getting tags (such as `html_url_prefix`) replaced

                <!-- Includes -->
                <link rel="stylesheet" href="{html_url_prefix}/98682199-5ac9-448c-afc8-23ab7359a91b-static/mermaid.css" />
                <link rel="stylesheet" href="{html_url_prefix}/98682199-5ac9-448c-afc8-23ab7359a91b-static/main.css" />
                <script src="{html_url_prefix}/98682199-5ac9-448c-afc8-23ab7359a91b-static/mermaid.min.js"></script>
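Presumably the same substitution that is applied to regular pages would also need to run on this file. A minimal sketch of that substitution, assuming a plain string replacement of the `{html_url_prefix}` placeholder (`apply_url_prefix` is a hypothetical name):

```python
def apply_url_prefix(template, html_url_prefix):
    """Replace the {html_url_prefix} placeholder in a template string."""
    # Drop a trailing slash so the result does not contain '//'.
    prefix = html_url_prefix.rstrip("/")
    return template.replace("{html_url_prefix}", prefix)


line = '<link rel="stylesheet" href="{html_url_prefix}/main.css" />'
with_prefix = apply_url_prefix(line, "/notes/")
without_prefix = apply_url_prefix(line, "")
```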

Make external link _blank behavior configurable

See #71

relative linking to images not working with a urlprefix configured

Make a copy of the obsidian vault to a temp dir to avoid data loss in case of bugs

I just overwrote a note of mine working on a new feature.

Now I can burn myself on my own stupidity, I don't want others to suffer the same fate.
(Edit: just found out I made a backup yesterday so all is fine!)

Should be pretty easy. OS specific troubles might be challenging potentially.
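A sketch of the copy step using only the standard library. `copy_vault_to_temp` and the `obsidianhtml_` prefix are hypothetical, and the OS-specific issues mentioned above (for example permissions or very long paths) are not handled here.

```python
import shutil
import tempfile
from pathlib import Path


def copy_vault_to_temp(vault_path):
    """Copy the vault into a fresh temp folder and return the copy's path."""
    tmp_root = Path(tempfile.mkdtemp(prefix="obsidianhtml_"))
    working_copy = tmp_root / "vault"
    shutil.copytree(vault_path, working_copy)
    return working_copy


# Demonstration: changes to the working copy leave the original intact.
with tempfile.TemporaryDirectory() as original:
    (Path(original) / "note.md").write_text("original", encoding="utf-8")
    working = copy_vault_to_temp(original)
    (working / "note.md").write_text("changed", encoding="utf-8")
    original_text = (Path(original) / "note.md").read_text(encoding="utf-8")
    working_text = (working / "note.md").read_text(encoding="utf-8")
    shutil.rmtree(working.parent)  # clean up the temp copy
```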

Factor out ConvertFullWindowsPathToRelativeMarkdownPath()

Related to the fix of #8

Factor out ConvertFullWindowsPathToRelativeMarkdownPath() in favor of using pathlib standard methods.
Standardize variable naming and use of Path vs string paths e.g.
- rel_path vs relative_path
- _path always being a Path('').resolve() object
- *_path_posix for strings in the a/b/file.md format
- etc

getting an error `Two or more files with the name ".DS_Store" exist in the root folder.`

it would be great to be able to 'exclude' files ... as .DS_Store (mac) files are definitely not something that matter for exports.

Two or more files with the name ".DS_Store" exist in the root folder.

Allow index.html#go-to-header anchors

Note for reference. Just fixed this.

When in tab mode, we want to move to the correct section
When opening a new tab it needs to scroll to the correct section (without screwing up the layout of the page...)

Fixed in 94218c6

Enable Obsidian inclusions

Obsidian has the following syntax for including notes in a note:

![[Note Name]]

![[Note Name#Chapter Name]]

This is not handled at the moment.
While I think this feature is kinda whack, it is a core Obsidian feature, so it should be supported.
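Recognizing the two forms shown above could start from a pattern like this one. It is a sketch: `INCLUSION_RE` and `find_inclusions` are hypothetical, and aliases or other Obsidian link variants are deliberately not covered.

```python
import re

# ![[Note Name]] or ![[Note Name#Chapter Name]]
INCLUSION_RE = re.compile(r"!\[\[([^\]#|]+)(?:#([^\]|]+))?\]\]")


def find_inclusions(text):
    """Return (note_name, chapter_name) pairs; chapter is '' when absent."""
    return [
        (match.group(1).strip(), (match.group(2) or "").strip())
        for match in INCLUSION_RE.finditer(text)
    ]


sample = "![[Note Name]] then ![[Note Name#Chapter Name]] but not [[Plain Link]]"
inclusions = find_inclusions(sample)
```

The leading `!` is what separates an inclusion from an ordinary `[[link]]`, so the plain link in the sample is skipped.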

Md to html does not allow duplicate filenames

Related to the fix for https://github.com/dwrolvink/obsidian-html/issues/7

This fix also applies for md->html conversion, where it isn't necessary because in markdown the referenced file path can always be deduced.

This is quite a bit of work to get right though. The md->html code now uses the files[file_name] dict indexing method to find local files. We could use files[relative_path] but then the relative path of the linked file should be determined after the regex search.

This is done by joining Path(page_path).parent with link_path, after which the result needs to be normalized so that: /a/b/c/../../bla.md --> /a/bla.md
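The join-and-normalize step can be sketched with the standard library as follows. `resolve_link` is a hypothetical helper name; posix-style paths are assumed, matching the a/b/file.md form used for links.

```python
import posixpath
from pathlib import PurePosixPath


def resolve_link(page_path, link_path):
    """Resolve link_path relative to the folder that contains page_path."""
    joined = PurePosixPath(page_path).parent / link_path
    # normpath collapses the '..' segments without touching the filesystem.
    return posixpath.normpath(str(joined))


resolved = resolve_link("/a/b/c/page.md", "../../bla.md")
```

The normalized path can then serve as the key into a files dict indexed by relative path, which is what would allow duplicate filenames in different folders.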

FR: First compile markdown then compile HTML

Atm html and md are compiled side by side, because that's easier.
But this does not allow existing proper markdown to function as input without possible bugs.

We should separate the code into a md compilation stage and an html compilation stage.
