Git Product home page Git Product logo

Comments (10)

mikebryant avatar mikebryant commented on June 19, 2024

I think you're correct, it's not accurate. But my stats knowledge is pretty weak.

I would love to fix that, but it's going to take me some time to figure out how to code that - this is a great analysis to help with that.

Also very open to anyone else providing a PR to fix this :)

from ac-nh-turnip-prices.

ZZZZzzzzac avatar ZZZZzzzzac commented on June 19, 2024

I'm working on the code , but know little about JS so can't help with PR. maybe Python or C++ ? or just pseudo code

Btw, maybe it's better to say the original algorithm is more "rough" instead of "not accurate", it follows bayes rule as well, but using P( input price X fall in some interval | pattern is k ) rather than P( input price is exactly X | pattern is k ).
That event input price X fall in some interval given pattern k is more genearal than input price is exactly X given pattern k, so lose more accurarcy

from ac-nh-turnip-prices.

ZZZZzzzzac avatar ZZZZzzzzac commented on June 19, 2024

I'm not certain how the script is calculating 7.69%, and 2.2% for the 42 possible Pattern 0's (fluctuating).

I think the easiest way to illustrate current algorithm is:

  1. choose last week pattern and using predefined probability
    (0.5,0.05,0.2,0.15 for your example)
  2. calculate all pattern and their subpattern and uniformly distribute to all subpattern
    (there are 56 initial pattern 0 subpattern so you get 0.5/56=0.00893=0.893%)
  3. everytime you input a price, it will rule out some impossible subpattern, and recalculate probability for each subpattern based their pervious probability
    ( after you input 105 to MonAM, there will only 42 of 56 pattern0 remain, 0 of 7 pattern1 remain, 0 of 1 pattern2 remain, and only 1 of 7 pattern3 remain, so you will have
42*0.893%/(42*0.893%+0*0.714%+0*20%+1*3.12%) / 42 = 0.0219 =2.20% for each pattern0
1*3.12%/(42*0.893%+0*0.714%+0*20%+1*3.12%) / 42 = 0.0767 = 7.67% for each pattern3, 

I think you may confused that only 1 of 7 pattern3 remain part, this is subpattern small spike with MonAM price between 90-140, not the overall small spike, it only has 1/7 of all pattern3 probability

original algorithm is accurate if everytime you input a price, it rule out impossible subpattern with equal probability, unfortunately it's not.

from ac-nh-turnip-prices.

ZZZZzzzzac avatar ZZZZzzzzac commented on June 19, 2024

Python code here, give current price, base price, and random bound ABCD to get corresponding probability

import numpy as np
def get_uniform_prob(price,base,A,B): 
    # sellPrices[work++] = intceil(randfloat(A, B) * basePrice);

    # it is uniform distribution
    price_max = int(np.ceil(B*base))
    price_min = int(np.ceil(A*base))
    norm = 0
    if price > price_min and price < price_max:
        norm = 1
    elif price == price_min:
        norm = price_min - A*base
    elif price == price_max:
        norm = B*base-price_max+1
    return norm/base/(B-A)

def get_uu_prob(price,base,A,B):
    # rate = randfloat(A, B); sellPrices[work++] = intceil(randfloat(C, rate) * basePrice) - 1;

    # => X->U(A,B), Y->U(C,X), Y-C->U(0,X-A), Y-C->U(0,1)*(X-A), Y-C->U(0,1)*U(C-A,B-A)
    # luckly here A=C, let Z=Y-A/B-A,  Z->U(0,1)*U(0,1), Z is product of two 2 uniform distribution
    # pdf_Z = -log(z) cdf_Z = z-z*log(z)
    rate_max = ((price+1)/base-A)/(B-A)
    rate_min = ((price)/base-A)/(B-A)
    cdf = lambda z: 0 if z<=0 else z-z*np.log(z) if z<=1 else 1
    # def cdf(z):
    #     if z > 0:
    #         if z <=1:
    #             return     
    #         else:
    #             return 1
    #     else:
    #         return 0
    return cdf(rate_max)-cdf(rate_min)

def get_bell_prob(price,base,A,B,C,D,L):   
    # rate = randfloat(A, B);
    # for (work = 2; work < L; work++)    {
    #   sellPrices[work] = intceil(rate * basePrice);
    #   rate -= randfloat(C, D);    }

    # X = U(A,B), Y_k = U(C,D), Z = X-sum(Y_k), sum(Y_k) follows irwin–hall distribution
    # cdf of z is not easy to get, so here i'm using numerical method
    rate_max = ((price)/base-A)/(B-A)
    rate_min = ((price-1)/base-A)/(B-A)
    price_max = int(np.ceil((B-C*L)*base))
    price_min = np.max([int(np.ceil((A-D*L)*base)),0])
    bins = price_max-price_min+1
    pdf_X = np.ones(int((B-A)/0.001))/(B-A)
    pdf_Y = np.ones(int((D-C)/0.001))/(D-C)
    pdf = pdf_X
    for i in np.arange(L):
        pdf = np.convolve(pdf,pdf_Y)
    cdf = np.cumsum(pdf)    
    margin = cdf.size/bins
    idx_start = np.max([int(np.ceil((price-price_min)*margin)),0])
    idx_end = np.min([int(idx_start+margin),cdf.size-1])
    # idx_end = int(np.ceil((price+1-price_min)*margin))
    return (cdf[idx_end]-cdf[idx_start])/cdf[-1]

def get_bell_prob(price,base,last_price,A,B):
    # last_rate = randfloat(last_rate_min,last_rate_max)
    # new_rate = last_rate - randfloat(A,B)
    # Z = new_rate - last_rate_min + A = U(0,1)*(max-min)+U(0,1)*(A-B)
    last_rate_max = (last_price)/base
    last_rate_min = (last_price-1)/base
    d1 = last_rate_max-last_rate_min
    d2 = B-A
    min_L = np.min([d1,d2])
    M = min_L/d1/d2
    left_end = last_rate_min-B
    right_end = last_rate_max-A
    def cdf(x):
        if left_end <= x and x < left_end+min_L:
            return np.max([0,M*0.5/min_L*(x-left_end)**2])
        if left_end+min_L <= x and x <= right_end-min_L:
            return M*(2*x-2*left_end-min_L)*0.5
        if right_end-min_L < x and x <= right_end:
            return np.min([1,1 - (M*(right_end-x)/min_L)*(right_end-x)*0.5])
        return 0
    rate_max = (price)/base
    rate_min = (price-1)/base
    return cdf(rate_max)-cdf(rate_min)

edit: fix end point probability problem for get_uniform_prob(price,base,A,B)
edit2: add another easier get_bell_prob(price,base,last_price,A,B) based on previous price

from ac-nh-turnip-prices.

astranberg avatar astranberg commented on June 19, 2024

I'm not sure what the variables A, B, C, D and L refer to.

I do believe the prediction is inaccurate.

If only 1 out of 7 pattern3 remain, the overall odds that pattern3 is the correct pattern is unchanged, i.e. still 25%. And that 25% should now be redistributed amongst all remaining pattern3 subpatterns (so 25% / 1 = 25%). Instead, the algorithm would redistribute those odds to all patterns, which is incorrect.

You cannot redistribute a patterns odds to the remaining unless all of the subpatterns are eliminated.

from ac-nh-turnip-prices.

peter50216 avatar peter50216 commented on June 19, 2024

From the code itself I feel that you should redistribute a patterns odds to remaining even if there are some subpatterns left.

If only 1 out of 7 pattern3 remain, the overall odds that pattern3 is the correct pattern is unchanged, i.e. still 25%. And that 25% should now be redistributed amongst all remaining pattern3 subpatterns (so 25% / 1 = 25%). Instead, the algorithm would redistribute those odds to all patterns, which is incorrect.

I don't think it's true, the odds of getting pattern3 condition on your current input would change.

The python code looks good to be except the part that the range is actually uniform on the randfloat(A, B) part, not on ceil(A*base) and ceil(B*base) (The endpoints would be less possible due to rounding), so there might be a little inaccuracy from there. The current probability on master is definitely off though.

If no one take this, I can try to implement this tomorrow.

from ac-nh-turnip-prices.

ZZZZzzzzac avatar ZZZZzzzzac commented on June 19, 2024

sorry for late response, i don't have access to git during weekend :(

You cannot redistribute a patterns odds to the remaining unless all of the subpatterns are eliminated.

you confused probability with conditional probability, everytime you feel confusing, using bayes law:

P(pattern is 3| give MonAM price 90~140) 
= P(90~140 | pattern is 3) * P(pattern 3)/ P(90~140)                                         
= 1/8                      * 25%         / (1/8*25% + 42/56*50%)                                         
= 1/13 = 7.69%

or even easier, take every subpattern as independent events, it give same answer

the problem in this issue is not about redistribution, is about event P( 90<price<140) is too general and the probability of P(price=105 | 90<price<140) is not considered

the range is actually uniform on the randfloat(A, B) part, not on ceil(Abase) and ceil(Bbase)

you're right, I'm working on fixing that (Edit: done, update in python code above)

from ac-nh-turnip-prices.

ZZZZzzzzac avatar ZZZZzzzzac commented on June 19, 2024

tried to convert python function to js one, may have bug

function get_uniform_prob(price,base,A,B) {
  // sellPrices[work++] = intceil(randfloat(A, B) * basePrice);

  // it is uniform distribution
  var price_max = intceil(B*base)
  var price_min = intceil(A*base)
  var norm = 0
  if (price > price_min && price < price_max) {
    norm = 1
  }       
  if (price == price_min) {
    norm = price_min - A*base
  }
  if (price == price_max) {
    norm = B*base-price_max+1
  }      
  return norm/base/(B-A)
}

function get_uu_prob(price,base,A,B) {
    // rate = randfloat(A, B); sellPrices[work++] = intceil(randfloat(C, rate) * basePrice) - 1;

    // => X->U(A,B), Y->U(C,X), Y-C->U(0,X-A), Y-C->U(0,1)*(X-A), Y-C->U(0,1)*U(C-A,B-A)
    // luckly here A=C, let Z=Y-A/B-A,  Z->U(0,1)*U(0,1), Z is product of two 2 uniform distribution
    // pdf_Z = -log(z) cdf_Z = z-z*log(z)
    var rate_max = ((price+1)/base-A)/(B-A)
    var rate_min = ((price)/base-A)/(B-A)
    var cdf = (z)=> {     
      if (z > 0) {
        if (z <=1) {
          return z-z*Math.log(z)
        } else {
          return 1
        }
      } else {
        return 0
      }  
    }
    return cdf(rate_max)-cdf(rate_min)
}

function conv(vec1, vec2){
  var disp = 0; // displacement given after each vector multiplication by element of another vector
  var convVec = [];
  // for first multiplication
  for (j = 0; j < vec2.length ; j++){
      convVec.push(vec1[0] * vec2[j]);
  }
  disp = disp + 1;
  for (i = 1; i < vec1.length ; i++){
      for (j = 0; j < vec2.length ; j++){
          if ((disp + j) !== convVec.length){
              convVec[disp + j] = convVec[disp + j] + (vec1[i] * vec2[j])
          }
          else{
              convVec.push(vec1[i] * vec2[j]);
          }
      }
      disp = disp + 1;
  }
  return convVec;
}

function get_bell_prob(price,base,A,B,C,D,L) {
    // # rate = randfloat(A, B);
    // # for (work = 2; work < L; work++)    {
    // #   sellPrices[work] = intceil(rate * basePrice);
    // #   rate -= randfloat(C, D);    }

    // # X = U(A,B), Y_k = U(C,D), Z = X-sum(Y_k), sum(Y_k) follows irwin–hall distribution
    // # cdf of z is not easy to get, so here i'm using numerical method
    var price_max   = intceil((B-C*L)*base)
    var price_min   = Math.max(intceil((A-D*L)*base),0)
    if (price>price_max || price< price_min){return 0}
    var bins        = price_max-price_min+1
    var pdf_X       = Array.from({ length: intceil((B-A)/0.001) }).fill(1/(B-A)); 
    var pdf_Y       = Array.from({ length: intceil((D-C)/0.001) }).fill(1/(D-C));
    var pdf         = pdf_X
    for (i=0; i<L; i++) {
        pdf = conv(pdf,pdf_Y)
    }
    const cumulativeSum = (sum => value => sum += value)(0); //cumsum
    cdf             = pdf.map(cumulativeSum)
    var margin      = cdf.length/bins
    var idx_start   = Math.max(intceil((price-price_min)*margin),0)
    var idx_end     = Math.min(Math.floor(idx_start+margin),cdf.length-1)
    return (cdf[idx_end]-cdf[idx_start])/cdf[cdf.length-1]
}

And this is how these function should work ( just my thought):

  1. enter Sunday price, give initial probability according to past pattern, record probabilities P(k) (note this pattern k is subpattern, means 56 for pattern 1, etc.)
  2. enter price of next day X
  3. when recalculate possible pattern, say pattern k, use functions above to calculate P(X|k) and record it (note this is only for the day of input price, although generate_pattern_0_with_XXX() function will generate all week price.)
  4. after recalculate all possible subpattern, we have all P(k) and P(X|k), then do beyes:
    P(k|X) = P(X|k)*P(k) / sum of all P(X|k)*P(k)
  5. record P(k|X) as new P(k), repeat 1 to 4

from ac-nh-turnip-prices.

peter50216 avatar peter50216 commented on June 19, 2024

I realized (as in #103) that the price other than the previous known one might also affect the rate range, so it might be hard to only use get_bell_prob to calculate the correct probability in some edge case.

I'm currently working on using numerical methods all the way down to do this, would send a PR soon after I complete it.

from ac-nh-turnip-prices.

ZZZZzzzzac avatar ZZZZzzzzac commented on June 19, 2024

I realized (as in #103) that the price other than the previous known one might also affect the rate range, so it might be hard to only use get_bell_prob to calculate the correct probability in some edge case.

I doubt, in bell shape distribute, the rate now(rate[n]) is only dependent on the last rate(rate[n-1]) rate[n] = rate[n-1] -randfloat(A,B) , not other history rate. It's true that rate[n-1] is dependent on rate[n-2], but once you input pirce[n-1] and derive rate[n-1], that random event is fixed and not conditional anymore

But you made me realize we only need last rate to calculate new probability, going to rewrite bell function

from ac-nh-turnip-prices.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.