Comments (7)

josdemmers commented on September 3, 2024

First step is done, v3 is now available for download. Detection now works with OCR.
Haven't looked into affix rolls yet. That is for later.

from diablo4companion.

mutschler commented on September 3, 2024

Maybe consider using Tesseract for OCR.
Since you already mask out the areas of interest, it should be pretty fast, and it already supports more than 100 languages.

In a quick test with Tesseract on a rather old system, using the example image from the readme, it took 0.04 to 0.3 seconds to run text recognition on the whole image:

PROTECTING
ADVENTURER'S TUNIC.

Legendary Chest Armor
379 Item Power

663 Armor

© +17 Willpower +[13 - 19]

‘© +15 Intelligence +[13 - 19]

‘© 22.0% Fire Resistance [13.0 - 22.0]%

© 1.8% Damage Reduction [1.2 - 3.0]%

+ When hit while not Healthy, a
‘magical bubble is summoned around
you for 4.5 [3.0 - 5.0] seconds. While
standing in the bubble players are
Immune. Can only occur once every
90 seconds.

Requires Level 21
Account Bound

Results could probably be improved further by splitting the item into separate chunks instead of feeding the whole image in at once. There are problems detecting the Diablo-style "O" (used in the item name only, I think) and a few other characters (such as the affix markers), but those could simply be ignored or cut off.
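As a rough illustration of the "ignore / cut off" idea, here is a minimal Python sketch that strips the misread affix markers from the start of each OCR line. The function names `clean_ocr_line` and `clean_ocr_lines` are made up for this example and are not part of Diablo4Companion or Tesseract; the sample strings are taken from the output above.

```python
import re


def clean_ocr_line(line: str) -> str:
    """Strip misread marker glyphs from the start of one OCR line.

    Hypothetical helper: it assumes the junk always sits at the start of
    the line, as in the sample output above.
    """
    text = line.strip()
    # Drop leading symbols such as "©" or "‘" (anything that is not a
    # letter, digit, underscore or "+").
    text = re.sub(r"^[^\w+]+", "", text)
    # A "+" followed by whitespace is a misread bullet, not a stat sign.
    text = re.sub(r"^\+\s+", "", text)
    return text


def clean_ocr_lines(raw_text: str) -> list:
    """Clean every line of an OCR result and drop the empty ones."""
    cleaned = (clean_ocr_line(line) for line in raw_text.splitlines())
    return [line for line in cleaned if line]


if __name__ == "__main__":
    sample = "© +17 Willpower +[13 - 19]\n\n+ When hit while not Healthy, a"
    for line in clean_ocr_lines(sample):
        print(line)
```

A leading "+" directly followed by a digit is kept, since it belongs to the stat value.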

Jeremy-JJ commented on September 3, 2024

Someone wrote something in Python. If you want to avoid building a C# implementation from scratch, you could work from here:
https://github.com/mxtsdev/d4-item-tooltip-ocr

This one is written in C# (you might be able to borrow some code):
https://github.com/Riketta/VersaLootFilterD4

This library may even be of use, particularly the trained dataset:
https://github.com/SanctuaryTeam/diablo4trading-ocr

josdemmers commented on September 3, 2024

Thanks for the examples. But I think I have already found an OCR lib I'm going to use:
https://github.com/charlesw/tesseract, it's free and open source.

VersaLootFilterD4 uses IronOCR, which is too expensive to use (https://ironsoftware.com/csharp/ocr/licensing/).

The complete Diablo4Companion project is in C#, so I'm not planning to switch to another language. But it can always be interesting to look at implementations in other languages.

Regarding the trained dataset: isn't that probably just the default one from Tesseract?
Those are available on the official repo, including all other languages:
https://github.com/tesseract-ocr/tessdata/

josdemmers commented on September 3, 2024

Thanks for the suggestion. With the current implementation, which uses images to detect affixes, this is not possible.

However, it is likely that I will change this to OCR in the future. Everything will be converted to text then.
That should make it possible to compare the affix roll as well.

I currently have no idea whether OCR will be fast enough for the overlay, so no promises yet that I'll implement this.
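One way to answer the "fast enough" question is to time the OCR step repeatedly and compare the worst case against a latency budget. The Python sketch below is only an illustration: `time_call`, `fits_budget` and the budget value are assumptions made for this example, and `fn` stands in for whatever OCR call ends up being used.

```python
import statistics
import time
from typing import Callable, Dict


def time_call(fn: Callable[[], object], repeats: int = 20) -> Dict[str, float]:
    """Run fn several times and summarise the wall-clock durations in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }


def fits_budget(stats: Dict[str, float], budget_seconds: float) -> bool:
    """True when even the slowest measured run stays within the budget."""
    return stats["max"] <= budget_seconds


if __name__ == "__main__":
    # Placeholder workload; swap in the real OCR call when measuring.
    stats = time_call(lambda: sum(range(100_000)))
    print(stats, fits_budget(stats, budget_seconds=0.1))
```

Looking at the maximum rather than the average matters for an overlay, because a single slow frame is what the user notices.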

josdemmers commented on September 3, 2024

Sounds promising, and yes, I was planning to use Tesseract. I'm still looking for a C# implementation though.
I used this one in the past for another project: https://github.com/CptWesley/TesserNet
But that only comes with an English language model. Suggestions for a C# implementation are welcome.

Regarding "0.04 to 0.3": does it really vary that much? 0.3 seconds would be way too slow to be useful.
Applying OCR is only the first step. I would also need to clean up the result and find an affix matching the text, which costs some time as well.
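The "find an affix matching the text" step could be sketched with fuzzy string matching, for example in Python with the standard library's `difflib`. Everything named here (`KNOWN_AFFIXES`, `affix_name`, `match_affix`, the 0.8 cutoff) is an assumption for illustration, not the app's actual affix list or matching logic.

```python
import difflib
import re
from typing import Optional, Sequence

# Hypothetical affix list; a real one would come from the game data.
KNOWN_AFFIXES = ["willpower", "intelligence", "fire resistance", "damage reduction"]


def affix_name(ocr_line: str) -> str:
    """Reduce an OCR line to its lowercase words, dropping values and ranges."""
    text = re.sub(r"\[[^\]]*\]", " ", ocr_line)  # remove "[13 - 19]" style ranges
    text = re.sub(r"[^A-Za-z ]+", " ", text)     # remove digits and symbols
    return " ".join(text.lower().split())


def match_affix(
    ocr_line: str,
    known: Sequence[str] = KNOWN_AFFIXES,
    cutoff: float = 0.8,
) -> Optional[str]:
    """Return the closest known affix, or None if nothing is similar enough."""
    hits = difflib.get_close_matches(affix_name(ocr_line), known, n=1, cutoff=cutoff)
    return hits[0] if hits else None


if __name__ == "__main__":
    print(match_affix("© +17 Willpower +[13 - 19]"))
    print(match_affix("+15 Inteligence +[13 - 19]"))  # simulated OCR typo
    print(match_affix("Requires Level 21"))
```

The cutoff trades tolerance for OCR typos against false matches on unrelated lines, so it would need tuning against real tooltips.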

Version 2.0 of the app is already designed with OCR in mind. Each affix and aspect has its own region in the app. That should help get the processing time down to a minimum.

I'll get to implementing OCR eventually, but not anytime soon.

Jeremy-JJ commented on September 3, 2024

Jos,

I wasn't aware Versa was using Iron, but Tesseract does work quite well and is utilized by the Sanctuary Team for diablo.trade.

It does make occasional mistakes, however, which is why I presumed training was required to minimize error or allow for detection of OCR error, i.e. with a model knowing a given affix has to be within a particular numerical range and so forth.
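The range idea does not necessarily need a trained model: since the tooltip prints the allowed range next to each value, a plain rule can flag implausible reads. The Python sketch below assumes the line format seen in the sample output earlier in this thread; `AffixRoll`, `parse_roll` and `roll_is_plausible` are hypothetical names, and this is not how diablo.trade or any other tool is known to do it.

```python
import re
from typing import NamedTuple, Optional


class AffixRoll(NamedTuple):
    value: float
    low: float
    high: float


_NUMBER = r"\d+(?:\.\d+)?"
# Value first, then any non-digit text, then a "[low - high]" range.
_ROLL = re.compile(
    rf"(?P<value>{_NUMBER})\D*?\[\s*(?P<low>{_NUMBER})\s*-\s*(?P<high>{_NUMBER})\s*\]"
)


def parse_roll(ocr_line: str) -> Optional[AffixRoll]:
    """Extract the rolled value and its printed range, if the line has both."""
    match = _ROLL.search(ocr_line)
    if match is None:
        return None
    return AffixRoll(
        float(match.group("value")),
        float(match.group("low")),
        float(match.group("high")),
    )


def roll_is_plausible(roll: AffixRoll) -> bool:
    """A value outside its own printed range points to an OCR misread."""
    return roll.low <= roll.value <= roll.high


if __name__ == "__main__":
    for line in ("© +17 Willpower +[13 - 19]", "© +77 Willpower +[13 - 19]"):
        roll = parse_roll(line)
        print(line, roll, roll is not None and roll_is_plausible(roll))
```

This only catches misreads of the value that land outside the range; a misread inside the range, or a misread of the range itself, would still pass.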

With respect to implementation, I'm not sure if they trained their model to do so or have it hardcoded in. Judging from what I've seen of other OCR implementations for such an application, particularly loot filters, they trained their own models specifically for D4.

Beyond this, I look forward to your implementation of OCR and the changes to come to the UI, especially for setting min values. At present, I've been bouncing between your application and Aeon0's D4LF, which has some extra features for marking your loot as trash or favorite, but has no UI and requires a few YAML edits, which can be somewhat inconvenient or render the application unusable for the average person.

All the best,
Jeremy
