Git Product home page Git Product logo

chordinaryworld's Introduction

Hi there ๐Ÿ‘‹

  • ๐Ÿ”ญ Iโ€™m currently working at the F# language team in Microsoft.
  • ๐Ÿ‘ฏ Iโ€™d be happy to collaborate on things that bring something good to this world.
  • ๐Ÿ’ฌ Ask me about F#, .NET, and testing. I like to help and have an active StackOverflow profile.
  • ๐Ÿ“ซ How to reach me: LinkedIn.
  • โšก Fun fact: I've been writing down my dreams for about 10 years already, a lot of fun and surreal stuff there.

chordinaryworld's People

Contributors

dependabot[bot] avatar psfinaki avatar

Watchers

 avatar  avatar

chordinaryworld's Issues

Trim user input

Problem

Currently the spaces between the words are handled while the spaces in the beginning and in the end will lead to wrong URL and consequently inability to find chords. This is inconsistent.

Ideas

Related logic currently lives in UrlGod. Adding trimming is simple however I feel it is high time to create some separate module for all the logic about editing user input.

Split unit and integration tests

Problem

The project is well-tested and tests are split into different test projects. However, not all the tests are unit tests while lying along with them. Some of them rather remind integration tests as they test functions which call other functions internally. This should be clear at a glance.

Examples

  1. Nearly all the CoreTests
  2. HandlesNoTabs in TabMasterTests
  3. ...

Handle Cosmos DB errors

Problem

No possible Cosmos DB errors (connection, invalid id...) are handled now, they break the execution. The service should still bring to the user number of harmonies for a song even if it unable to save it.

Introduce general error handling mechanism

Problem

Currently there are a few error messages that the service can bring, they are used for control flow in
unlucky but normal scenarios (empty input, chords not found...).

There is, however, nearly no general error handling. Functions are defenseless.

Example

let MakeUrl (artist: string, title: string, version: int) =
    let firstLetter = artist.[0]
    ...

This will obviously blow up if artist is empty.

Ideas

Exceptions or error codes? F# nicely integrates with both. There are a different opinions:
Error codes
Error codes
Exceptions
Exceptions - p 3.4

Report unknown chord flavours

Problem

Deflavourer in the Engine is clever enough to complain when it does not recognize the flavour. This may signal either about a true new flavour or about poor logic of retrieving chords from HTML in HtmlParser. However, now we do not do anything about it, except for asking user in the Web UI to write about the case.

Ideas

Every time this happens, the song and/or the flavour should be logged or a letter should be sent.

Deal with Azure Cosmos DB secret key configuration

Problem

Obviously secret key for connection to Cosmos DB cannot be published here. For now, it is just changed to hardcoded string "secret". This means that the execution of code out of the box will fail on any machine and this is sad.

Ideas

Ideally, it should be like this. The key should be taken from some file that is not tracked in git. So it will work seamlessly on machines where this file is present. If it is not there, the code for interacting with Cosmos DB should just not be fired.

Some ideas are here.

Write license

Problem

I doubt that this app ever become anyhow popular, but just in case and for practice a license should be introduced.

When writing, licenses of used frameworks and tools should be considered.

Improve HTTP status codes

Problem

For all service errors, status 400 is returned (and then specially handled in TS). This is not bad but there are much more and recommendations say the following:

Look through available response code from selected category. If one of them has a name which matches well to your situation, you can use it. Otherwise just fallback to x00 code (200, 400, 500). If you doubt, fallback to x00 code.

So, probably some of errors are worth changing in order to stick with good practices.

Treat warnings as errors and fix them

Problem

There are several dozens of warnings in the solution caused by different things. They all are not dangerous and sometimes not clear but generally warnings from the F# compiler can be very important (e.g. incomplete pattern matching). Because of this something can be missed in the future. Also, ignoring warnings becomes a habit with time while.

Fix footer in Web UI

Problem

If browser window is small, the lower link (wat | main) interferes with other stuff in the page.
A common way to do proper footers should be found.

Move to HTTPS

Problem

Basically, all the internet is moving to HTTPS, this is trendy. Apparently, for now there is not much of a reason to do this for CW.

However, soon Google Chrome will start to mark nearly all the HTTP web sites as not secure. This will make UX for CW worse. So, as a generally good thing, HTTPS should be once introduced.

Think about access modifiers in code

Problem

There is no encapsulation in the project though its structure seems to be quite prepared to introduce it.

Ideas

Read up. Maybe something like internal + [<InternalsVisibleTo>] (for tests) should be done.

Create separate type for a song

Problem

In the Core, when working with songs, many functions take both artist and title parameters. There is a feeling that it could be refactored to a separate type to improve code readability and robustness.

Ideas

There is different stuff in F# to help: classes, structs, records, records and classes marked as structs...There should be a thoughtful choice.

Check legal stuff

Problem

Currently only the open data is used in the project which should not make a problem. However, after reading a lot of scary articles about copyrights and so on, I suppose it is better to reassure this now than later.

Get the best chord page from the UG, not the first one

Problem

Currently Core.UrlGod forms the following Url:
https://tabs.ultimate-guitar.com/{0}/{1}/{2}_crd.htm
where 0 is first letter of the artist name, 1 is the artist name, 2 is the song title.

This works, but there are often other versions located at
https://tabs.ultimate-guitar.com/{0}/{1}/{2}_ver{3}_crd.htm
where 3 is the number of version (starting with 2).

Sometimes the latter versions have much better rating so it makes sense to choose them for further analysis.

Random example

Daft Punk - Get Lucky. For this song ver3 has 5x better rating than ver1.

Ideas

Approach 1: Compare rating and the select the best at the search page
https://www.ultimate-guitar.com/search.php?search_type=title&value=...

Approach 2: Or loop through all the chord pages for the song, get ratings, compare them and then select the best.

Create testing environment

Problem

It can happen that app works when running service locally but does not when calling remote server. For this reason there should be some testing server where the code could be deployed just before deploying to production. If something breaks, there will be no panic and feverish struggles to fix things.

Consider frequencies of chords

Problem

Currently the service counts unique chord roots in the tab. However, if some girls are bigger than others some chords are much more frequent than others, the latter may suggest those are just part of the bridge, or they are random chords put for some strange reason in the tab. Those should not be included in counting of harmonies.

Examples

R.E.M. - Losing My Religion. C major appears only in the "That was just a dream" part which is actually a bridge. Brings one redundant harmony thereby.
Dire Straits - Walk Of Life. The last section of the tab is just information about what chords to use in a transposed version. Brings two redundant harmonies now.
The Doors - Light My Fire. Same story.

But nevermind Dire Straits, because of the same thing with song transposition the service brings some 11 chords for Justin Bieber! This is a really serious problem.

Ideas

Can start with throwing away chords who make less than X% of all chords in the tab. X% may be 5% or something more cleverly counted.

Handle duplicates in the database

Problem

Currently every successful result goes to database. This means same songs can be saved there multiple times which does not make sense.

Ideas

The song should be checked before being saved. If it is there, the number of harmonies should be also checked as there may be a need to update it.

Handle empty input

Problem

Currently the service will throw because of let firstLetter = artist.[0] in the UrlMaker.

Ideas

Time to introduce some kind of validation to the service. Then, maybe, to the client as well.

Get concrete unknown flavours

Problem

Currently Deflavourer is able to emit UnknownFlavour error when unable to find a flavour in the dictionary. However, it does not say which one exactly is erroneous.

This is indeed unimportant for user however it is important for logging and API. Therefore a way to pass such flavours from Engine to Core is needed. It should be considered that one tab can have multiple unknown flavours.

Log ChordsNotFound error

Problem

Currently only UnknownFlavours is logged in order to widen flavours dictionary.

There is an idea to log ChordsNotFound as well. Although there is not much to do about it, the list of songs with no chords may lead to some thoughts. Also, it will make Logger easier to test manually (with UnknownFlavours some flavours should be temporarily removed from the dictionary).

Refactor solution to be able to use different chord sources

Problem

Currently Core is tightly coupled with the logic of getting chords for a song.

In future, it will be possible that Core will search for chords at different resources depending on some conditions, or at several resources at the same time with some logic of results comparison.

Thus it makes sense to leave in Core only input validation and calls to different components.

Save data to database

Problem

Is is high time to throw a database in! Song and number of harmonies for it should be saved so that it can be used in stats.

Because I do not have any experience in introducing databases to architecture, it is going to be a trial and error approach.

Ideas

The idea is to start with something NoSQL (NewSQL?) to get better with that as well. The current plan is to go with Azure and its Cosmos DB. It should be possible to start with Cosmos DB emulator and then deploy everything to the cloud, which is convenient.

Here is an example of calling Azure Cosmos DB from F#. There is also a project with type provider for Cosmos DB, but looks like it is somewhat rudimentary and abandoned, regretfully.

Fire a progress ring while counting chords

Problem

Another minor UX issue. There is no sign of anything happening after the request has been sent which looks confusing. One can start clicking button over and over again for whatever reasons.

Think about better webclient-webservice error connection

Problem

Error generating at Web UI currently works the following way. Client calls service, service calls core. Core brings Result type, service converts that to error codes, client converts that to string messages.

This may be overcomplicated. Also, there is behavioral inconsistency: expected errors are brought by error codes in the done branch while unexpected ones are brought by exceptions in the fail branch.

Split Crawler into two modules

Problem

Crawler has become too fat and now does two things:

  1. Searches for tabs for particular song name variation
  2. Does all the auxiliary work: makes these variations, selects best from found stuff and so on.

This is not good in terms of SRP and testing.

Dehardcode Azure Cosmos DB configuration

Problem

Currently all the configuration is hardcoded with AzureConnector module. This is a mauvais ton, configuration should be stored separately.

Ideas

There is a project FSharp.Configuration which gives type providers for configuration in F# projects. Looks like it a solid and alive. Do not know yet if AppSettingsProvider or YamlProvider should be chosen but type-safety in config will definitely increase code robustness.

Think about including songs in results

Problem

Currently Result is either a number of chords in a song or an error. This was enough when the only purpose of service was to display the result for some input.

However, results are going to be used: saved when everything is alright and logged in case of errors. For this, Result may need to contain the song to which is relates.

Think about CORS

Problem

In order to connect to the service from the UI app, the following was added to the web.config of web service:

  <system.webServer>
    ...
    <httpProtocol>
      <customHeaders>
        <add name="Access-Control-Allow-Origin" value="*" />
      </customHeaders>
    </httpProtocol>
  </system.webServer>

This feels very insecure and probably needs to be changed.

Ideas

Read up, here is some video for inspiration.
MS Azure also has something about CORS in the app settings.

Improve UI alignment

Problem

There are a few minor problems with alignments in the html:

  1. Lower link is exactly in the middle which is not what is needed actually. It should be rather under the conjunction of inputs
  2. The progress symbol is hold under that conjunction in quite a poor way, basically a hardcoded margin shift. This should be refactored

Log to Azure

Problem

Currently only local logging is present in the app. To properly log app events from the internet, logging to the hosting (Azure) should be introduced.

Create about page

Problem

It is high time to tell non-existent visitors what the site is about.

Properly throw for bad input in the Analyzer

Problem

The current strategy for parsing chords is the following. HtmlParser in Core should be clever enough to tell what is a chord in the html it receives, to pass all the collected chords to Engine for further proper analysis.

That means Engine should deal with something that at least looks like chords which actually either are proper ones or contain some UnknownFlavour. Something totally irrelevant is considered as an exceptional situation and the code should throw, which will mean HtmlParser does not do its job well.

Now, we have this code in AnalyzeChord function in Analyzer:

let AnalyzeChord chord =
    let regex = "([ABCDEFG]{1}[#b]?)(.*)"
    let groups = Regex.Match(chord, regex).Groups

    let tonic = groups.[1].Value
    let flavour = groups.[2].Value
    (tonic, flavour)

Turns out nothing actually throws here when match is false. This is strange for me but it is by design.

Example

Words History or Uprising.

Ideas

Introduce some throwing wrapper in this code.

Handle unknown chord flavours

Problem

When Engine.Deflavourer meets some unknown chord flavour, it silently ignores it.

It should be a good strategy in the majority of cases as most probably such a chord is yet another crazy variation of a simpler chord existing in the same song rather than a new harmony.

However, theoretically it can be something valuable and it should be viewed manually.

Example

Glass Animals - Pork Soda
Chords are shit but it does not matter. A#M in the end of the verse 4 is unique chord in the tab. It turned out to be another notation for A#maj7 so it deflavours to A#. It appears to be unique harmony in the tab thus should me counted.

M is already in the flavours list but something like this can happen again. It turns out that UG's lists of "all" chords (1, 2) are incomplete.

Ideas

Instead of ignoring such a chord, a separate Error should appear for that.

Improve the algorithm of parsing HTML from UG

Problem

The Core.HtmlParser algorithm to retrieve the tab content is very dumb, this is the regex used:

.*js-tab-content js-copy-content\">(.*)</pre>.*

Feels like if UG changes the tags here a little bit, the parsing will either take too long or fail.

Ideas

Read up a little bit, make something more fail proof.
When parsing takes too long, handle it and log it.

Improve algorithm of selecting the best tab

Problem

Current algorithm looks at number of votes for the tab. That basically works as people tend to vote for good tabs. However, theoretically it is possible that people will vote for some bad tab (2-3 stars) to show it is very bad.

Example

Example needed! E.g. a 3-star tab for a song has more votes than 5-star.

Ideas

Look not only at the rating but at the stars. The tabs with the biggest number of stars compete by rating.

On the other hand, the following situation is possible. There is a good 4-star tab for a song which has 100 votes. But then appears a new version for the tab which has 5 stars but only 2 votes. Should the latter really be selected?

  1. Thinking of stars:
    1.1 The filtering could be applied only for low-quality tabs
    1.2 Only big difference between stars of those two tabs could be considered (e.g. 1.5 stars)
  2. Thinking of votes:
    2.1 The second tab could need some magical number of votes
    2.2 The number of votes of the second tab could need to be some percent of the first (e.g. 10%)

Handle punctuation in artist names and song titles

Problem

So, there are a lot of artists and songs with some punctuation in the name. Currently Core is too dumb to process it properly thus the UG url it creates contains all those punctuation and consequently gets not found there. Here are possible punctuation marks.

Random examples

The Doors - Hello, I Love You
OK GO - Needing/Getting
List of bands with punctuation
Songs with parentheses

Ideas

Form the list of particular punctuation signs to deal with. Then do something about it.
Note that not every sign can be just thrown away. In the OK GO example the slash should be replaced by the space.

Decouple Failures

Problem

So far there are only three possible failures from the app:

  • ChordsNotFound
  • EmptyInput
  • UnknownFlavour

They all reside in ErrorMessage type in Common. However, they are different in the sense that first two are generated in Core while third is generated in Engine.

There is an opinion that such things should be split in some way. Otherwise domain model is polluted by an application boundary concern.

Add clarification at the front page

Problem

When a person first time enters the site, it is unclear what the site actually does. There should be a sentece to describe what the service does.

Reference ChordsProvider assembly directly in Core

Problem

Due to the legal reasons, I cannot expose the code that gets the chords from the internet. For that, the code should be referenced directly, via a compiled assembly, instead of project reference.

Make site appear at Google

Problem

The site is not there. If people suddenly get any interest in what is done here, they will not be able to find it.

Improve algorithm of handling power and suspended chords

Problem

Power chords (E5, A5, ...) and suspended chords (Esus2, Asus4, Csus2sus4, ...) are neither major nor minor as they do not have a third to define a mode.
Currently Engine.Sieve just throws them away because it is unclear where to deflavour them.

However, it can result in a skipped harmony if a significant part of the tab is written via such chords.

Example

Metallica - Enter Sandman (should not throw away A5)
Radiohead - High & Dry (should not throw away Asus2)

Ideas

Algorithm might be the following. If there is a major or minor chord with the same tonic in the tab, deflavour power chords to that chord. Otherwise, count it as a separate harmony.

Handle general error when calling the service in the UI

Problem

Currently UI is handling only the successful service call. If something bad happens inside, there is no reaction. This results in poor UX: progress symbol acts forever and it is not understood if the error has happened or the operation just takes long.

Example

Empty input (#11 is not fixed at this moment).

Set up diagnostics in Azure

Problem

As the service is getting a little bit more serious when logging and database are introduced, we should be prepared to figure out different sorts of issues. For that, some tracing should be turned on.

Report broken tab rating parsing

Problem

HtmlParser is able to retrieve tab rating from the tab html. We have this regex in the GetRating function:
"<span itemprop=\"ratingCount\">(.*?)</span>"
If match is success, we retrieve the rating, otherwise zero is returned.

However, the code has no defense against wrong HTML. If UG guys change the layout or some CSS class name, the code will always return zero,

Good news is that some tests should fail in this situation because then the first tab will be considered as the best which is not always correct. However, it will be better to have some notification when such a thing happens.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.