chatgpt-source-watch's Introduction

About Glenn 'devalias' Grant

Glenn is a full-stack, polyglot developer with an acute interest in the offensive side of security. Whether building something new or finding the cracks to break in, there is always a solution to be found, even if it requires learning something entirely new. If you can improve/automate something, do it; and if you've put the effort in to do so, open-source it and share it with everyone else.

When not hacking and coding, Glenn can be found snowboarding the peaks of Japan, cruising on his longboard, floating around underwater, or just finding the most efficient path between A and B (even if that's over walls). Life is short. Do the things you love, embrace the unknown, live your dreams, and share your passion.

Elsewhere on the internet

The most likely places you can find me/get in contact are on Twitter, LinkedIn, and/or Keybase.

I primarily write about things on my Blog, though sometimes I will also post on Medium, LinkedIn Articles, and similar places.

I have presented at a few security conferences and meetups, which I tend to keep track of on my blog and in 0xdevalias/presentations. You can also usually find my slides on Speaker Deck and/or SlideShare.

I like to solve problems, and tend to be quite active on GitHub providing solutions/details for the issues I come across while working on projects.

I also like to contribute answers on StackOverflow / StackExchange when I run into a common problem that I have managed to solve.

Fun Stats

0xdevalias's github stats

(via anuraghazra/github-readme-stats)


chatgpt-source-watch's Issues

Explore running CodeQL queries against the extracted/unpacked webpack source

From a chat with a friend:

Dunno how well it will work in reality.. but apparently I can run codeql against a random site's webpacked frontend code that I downloaded locally (in this case chatgpt)

codeql database create ~/Desktop/chatgpt-codeql-test-db --language=javascript --source-root ./unpacked

And I could use Chrome Devtools Protocol (CDP) to watch a site for when scripts are parsed, and then to access the source of those parsed scripts (which I could then automagically save locally/similar, and then run codeql on)
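
As a rough illustration of that CDP idea, here's a minimal sketch (not an existing script in this repo) using the chrome-remote-interface npm package. It assumes Chrome is already running with --remote-debugging-port=9222, and the downloaded/ output directory name is just a placeholder:

```js
// Hypothetical sketch: watch a page via the Chrome DevTools Protocol and save
// the source of every parsed script locally (e.g. to later run CodeQL on).
// Assumes: `npm i chrome-remote-interface` and Chrome started with
// --remote-debugging-port=9222. Not an existing script in this repo.
const fs = require('fs');
const path = require('path');
const CDP = require('chrome-remote-interface');

async function main() {
  const client = await CDP({ port: 9222 });
  const { Debugger } = client;

  await Debugger.enable();

  // Fires for every script the page parses (including already-parsed ones).
  Debugger.scriptParsed(async ({ scriptId, url }) => {
    if (!url || !url.includes('.js')) return; // skip inline/eval scripts

    try {
      const { scriptSource } = await Debugger.getScriptSource({ scriptId });
      fs.mkdirSync('downloaded', { recursive: true });
      const outFile = path.join('downloaded', encodeURIComponent(url));
      fs.writeFileSync(outFile, scriptSource);
      console.log(`saved ${url} -> ${outFile}`);
    } catch (err) {
      console.error(`failed to fetch ${url}: ${err.message}`);
    }
  });

  // The process stays alive while the CDP connection is open; Ctrl-C to stop.
}

main().catch(console.error);
```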

codeql database analyze ~/Desktop/chatgpt-codeql-test-db --format=csv --output=./chatgpt-codeql-output.csv --download codeql/javascript-queries


Huh.. it actually worked and output a bunch of warnings. Could be false positives/irrelevant/etc.. and would need to manually look closer to understand more about them and if they are actually interesting.. but the fact that it worked at all on webpacked code (that had only been run through prettier to format it) is pretty neat

"Improper code sanitization","Escaping code as HTML does not provide protection against code injection.","error","Code construction depends on an [[""improperly sanitized value""|""relative:///_next/static/chunks/pages/_app.js:28576:35:28576:52""]].","/_next/static/chunks/pages/_app.js","28576","21","28576","60"
"Improper code sanitization","Escaping code as HTML does not provide protection against code injection.","error","Code construction depends on an [[""improperly sanitized value""|""relative:///_next/static/chunks/pages/_app.js:28581:35:28581:52""]].","/_next/static/chunks/pages/_app.js","28581","21","28581","60"
"Incomplete URL substring sanitization","Security checks on the substrings of an unparsed URL are often vulnerable to bypassing.","warning","'[[""slack.com""|""relative:///_next/static/chunks/496.js:8801:33:8801:43""]]' can be anywhere in the URL, and arbitrary hosts may come before or after it.","/_next/static/chunks/496.js","8801","11","8801","44"
"Overly permissive regular expression range","Overly permissive regular expression ranges match a wider range of characters than intended. This may allow an attacker to bypass a filter or sanitizer.","warning","Suspicious character range that is equivalent to [&'()*+,\-.\/0-9:;].","/_next/static/chunks/653.js","42385","18","42385","20"
"Overly permissive regular expression range","Overly permissive regular expression ranges match a wider range of characters than intended. This may allow an attacker to bypass a filter or sanitizer.","warning","Suspicious character range that is equivalent to [?@A-Z].","/_next/static/chunks/653.js","42385","22","42385","24"
"Overly permissive regular expression range","Overly permissive regular expression ranges match a wider range of characters than intended. This may allow an attacker to bypass a filter or sanitizer.","warning","Suspicious character range that is equivalent to [A-Z\[\\\]^_`a-z].","/_next/static/chunks/653.js","48571","30","48571","32"
"Overly permissive regular expression range","Overly permissive regular expression ranges match a wider range of characters than intended. This may allow an attacker to bypass a filter or sanitizer.","warning","Suspicious character range that is equivalent to [A-Z\[\\\]^_`a-z].","/_next/static/chunks/653.js","52124","34","52124","36"
"Incomplete string escaping or encoding","A string transformer that does not replace or escape all occurrences of a meta-character may be ineffective.","warning","This replaces only the first occurrence of ""*"".","/_next/static/chunks/1f110208.js","7333","17","7333","33"
"Incomplete string escaping or encoding","A string transformer that does not replace or escape all occurrences of a meta-character may be ineffective.","warning","This replaces only the first occurrence of ""\\"".","/_next/static/chunks/1f110208.js","8042","33","8042","51"
"Incomplete string escaping or encoding","A string transformer that does not replace or escape all occurrences of a meta-character may be ineffective.","warning","This replaces only the first occurrence of ""\\"".","/_next/static/chunks/1f110208.js","8048","33","8048","52"
"Incomplete string escaping or encoding","A string transformer that does not replace or escape all occurrences of a meta-character may be ineffective.","warning","This does not escape backslash characters in the input.","/_next/static/chunks/653.js","55568","32","55568","40"
"Incomplete string escaping or encoding","A string transformer that does not replace or escape all occurrences of a meta-character may be ineffective.","warning","This replaces only the first occurrence of /%3A/i.","/_next/static/chunks/main.js","5109","18","5109","46"
"Incomplete string escaping or encoding","A string transformer that does not replace or escape all occurrences of a meta-character may be ineffective.","warning","This replaces only the first occurrence of ""#"".","/_next/static/chunks/main.js","5130","18","5130","26"
"Incomplete string escaping or encoding","A string transformer that does not replace or escape all occurrences of a meta-character may be ineffective.","warning","This replaces only the first occurrence of /[\]]/.","/_next/static/chunks/pages/_app.js","24434","20","24434","50"
"Incomplete string escaping or encoding","A string transformer that does not replace or escape all occurrences of a meta-character may be ineffective.","warning","This replaces only the first occurrence of /[[]/.","/_next/static/chunks/pages/_app.js","24434","20","24434","28"
"Prototype-polluting function","Functions recursively assigning properties on objects may be the cause of accidental modification of a built-in prototype object.","warning","The property chain [[""here""|""relative:///_next/static/chunks/pages/_app.js:38412:19:38412:22""]] is recursively assigned to [[""Y""|""relative:///_next/static/chunks/pages/_app.js:38414:46:38414:46""]] without guarding against prototype pollution.","/_next/static/chunks/pages/_app.js","38414","46","38414","46"
"Insecure randomness","Using a cryptographically weak pseudo-random number generator to generate a security-sensitive value may allow an attacker to predict what value will be generated.","warning","This uses a cryptographically insecure random number generated at [[""Math.random()""|""relative:///_next/static/chunks/polyfills.js:182:9:182:21""]] in a security context.","/_next/static/chunks/polyfills.js","4811","29","4811","38"
"Insecure randomness","Using a cryptographically weak pseudo-random number generator to generate a security-sensitive value may allow an attacker to predict what value will be generated.","warning","This uses a cryptographically insecure random number generated at [[""Math.random()""|""relative:///_next/static/chunks/polyfills.js:182:9:182:21""]] in a security context.","/_next/static/chunks/polyfills.js","4812","31","4812","40"
"Insecure randomness","Using a cryptographically weak pseudo-random number generator to generate a security-sensitive value may allow an attacker to predict what value will be generated.","warning","This uses a cryptographically insecure random number generated at [[""Math.random()""|""relative:///_next/static/chunks/polyfills.js:182:9:182:21""]] in a security context.","/_next/static/chunks/polyfills.js","4819","29","4819","38"
"Insecure randomness","Using a cryptographically weak pseudo-random number generator to generate a security-sensitive value may allow an attacker to predict what value will be generated.","warning","This uses a cryptographically insecure random number generated at [[""Math.random()""|""relative:///_next/static/chunks/polyfills.js:182:9:182:21""]] in a security context.","/_next/static/chunks/polyfills.js","4820","31","4820","40"
"Insecure randomness","Using a cryptographically weak pseudo-random number generator to generate a security-sensitive value may allow an attacker to predict what value will be generated.","warning","This uses a cryptographically insecure random number generated at [[""Math.random()""|""relative:///_next/static/chunks/polyfills.js:182:9:182:21""]] in a security context.","/_next/static/chunks/polyfills.js","4828","31","4828","40"
"Insecure randomness","Using a cryptographically weak pseudo-random number generator to generate a security-sensitive value may allow an attacker to predict what value will be generated.","warning","This uses a cryptographically insecure random number generated at [[""Math.random()""|""relative:///_next/static/chunks/polyfills.js:182:9:182:21""]] in a security context.","/_next/static/chunks/polyfills.js","4829","33","4829","42"
"Insecure randomness","Using a cryptographically weak pseudo-random number generator to generate a security-sensitive value may allow an attacker to predict what value will be generated.","warning","This uses a cryptographically insecure random number generated at [[""Math.random()""|""relative:///_next/static/chunks/polyfills.js:182:9:182:21""]] in a security context.","/_next/static/chunks/polyfills.js","4837","29","4837","38"
"Insecure randomness","Using a cryptographically weak pseudo-random number generator to generate a security-sensitive value may allow an attacker to predict what value will be generated.","warning","This uses a cryptographically insecure random number generated at [[""Math.random()""|""relative:///_next/static/chunks/polyfills.js:182:9:182:21""]] in a security context.","/_next/static/chunks/polyfills.js","4838","31","4838","40"
"Insecure randomness","Using a cryptographically weak pseudo-random number generator to generate a security-sensitive value may allow an attacker to predict what value will be generated.","warning","This uses a cryptographically insecure random number generated at [[""Math.random()""|""relative:///_next/static/chunks/polyfills.js:182:9:182:21""]] in a security context.","/_next/static/chunks/polyfills.js","4850","31","4850","40"
"Insecure randomness","Using a cryptographically weak pseudo-random number generator to generate a security-sensitive value may allow an attacker to predict what value will be generated.","warning","This uses a cryptographically insecure random number generated at [[""Math.random()""|""relative:///_next/static/chunks/polyfills.js:182:9:182:21""]] in a security context.","/_next/static/chunks/polyfills.js","4851","33","4851","42"
"Insecure randomness","Using a cryptographically weak pseudo-random number generator to generate a security-sensitive value may allow an attacker to predict what value will be generated.","warning","This uses a cryptographically insecure random number generated at [[""Math.random()""|""relative:///_next/static/chunks/polyfills.js:182:9:182:21""]] in a security context.","/_next/static/chunks/polyfills.js","5079","25","5079","34"
"Insecure randomness","Using a cryptographically weak pseudo-random number generator to generate a security-sensitive value may allow an attacker to predict what value will be generated.","warning","This uses a cryptographically insecure random number generated at [[""Math.random()""|""relative:///_next/static/chunks/polyfills.js:182:9:182:21""]] in a security context.","/_next/static/chunks/polyfills.js","5080","25","5080","34"

Explore AST based diff tools

There can be a lot of 'noise' when diffing minimised bundled code, as the bundler will often change the minified variable names it uses between builds (even if the rest of the code hasn't changed)

We can attempt to reduce this by using non-default git diff modes such as patience / histogram / minimal:

⇒ git diff --diff-algorithm=default -- unpacked/_next/static/chunks/pages/_app.js | wc -l
  116000

⇒ git diff --diff-algorithm=patience -- unpacked/_next/static/chunks/pages/_app.js | wc -l
   35826

⇒ git diff --diff-algorithm=histogram -- unpacked/_next/static/chunks/pages/_app.js | wc -l
   35835

⇒ git diff --diff-algorithm=minimal -- unpacked/_next/static/chunks/pages/_app.js | wc -l
   35844

Musings

⭐ Suggestion

It would be cool if ast-grep were able to show a diff between two files, but do it using the AST rather than just a raw text comparison. Ideally we would be able to provide options for this, such as ignoring chunks where the only change is to a variable/function name (eg. for diffing minimised JavaScript webpack builds)

Ideally the output would still be text (not the AST tree), but the actual diffing could be done at the AST level.

💻 Use Cases

This would be really useful for minimising the noise when diffing minimised source builds while looking for the 'real changes' between builds (not just minified variable name churn, etc)

Looking through current diff output formats shows all of the variable name changes as well, which equates to a lot of noise while looking for the relevant changes.

Some alternative potential workarounds I've considered are pre-processing the files to standardise their variable/function names, and/or post-processing the diff output to try and detect when the only changes in a chunk are variable/function names and then suppressing that chunk. Currently I'm just relying on git diff --diff-algorithm=minimal -- thefile.js

Originally posted by @0xdevalias in ast-grep/ast-grep#901
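
One rough sketch of the pre-processing workaround mentioned above (standardising variable/function names before diffing), assuming @babel/parser, @babel/traverse, and @babel/generator are available. The v0, v1, ... scheme and the lack of collision handling are simplifications, and position-based names will still churn if code is inserted earlier in the file:

```js
// Hypothetical sketch: rename every binding in the file to a stable,
// position-based name so that minified-name churn mostly diffs away.
// Usage: node normalise-names.js unpacked/_next/static/chunks/pages/_app.js > out.js
const fs = require('fs');
const parser = require('@babel/parser');
const traverse = require('@babel/traverse').default;
const generate = require('@babel/generator').default;

const code = fs.readFileSync(process.argv[2], 'utf8');
const ast = parser.parse(code, { sourceType: 'unambiguous' });

let counter = 0;
traverse(ast, {
  // Visit every node that introduces a scope and rename its bindings.
  Scopable(path) {
    for (const name of Object.keys(path.scope.bindings)) {
      // NOTE: no guard against the new name colliding with an existing one.
      path.scope.rename(name, `v${counter++}`);
    }
  },
});

process.stdout.write(generate(ast, { retainLines: true }).code);
```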

See Also

Script to calculate the raw/minimised lines of diff change for each file

Being able to see at a glance what the raw and minimised diff lines are for each chunk file can help in determining how much effort reviewing a build will take. If we can copy/paste this as markdown (or insert it directly into the default CHANGELOG entry), then we can also give some useful stats for a build even if we do no deeper manual analysis.

eg.

- TODO: The following files haven't been deeply reviewed:
  - `unpacked/_next/static/chunks/101.js` (`931` lines)
  - `unpacked/_next/static/chunks/2637.js` (`5290` lines)
  - `unpacked/_next/static/chunks/3032.js` (`150,855` lines)
  - `unpacked/_next/static/chunks/30750f44.js` (diff: `45,368` lines, minimised diff: `17,151` lines)
  - `unpacked/_next/static/chunks/3453.js` (`403` lines)
    - Seem to be a bunch of images, likely related to image generation styling or similar.
  - `unpacked/_next/static/chunks/3472.js` (`320` lines)
    - Statsig, Feature Gates, Experimental Gates, etc
  - `unpacked/_next/static/chunks/3842.js` (`755` lines)
  - `unpacked/_next/static/chunks/3a34cc27.js` (diff: `4373` lines, minimised diff: `1633` lines)
  - `unpacked/_next/static/chunks/4114.js` (diff: `1411` lines, minimised diff: `1373` lines)
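
A minimal sketch of a script that could generate stats like the example above. It assumes the (not yet pushed) scripts/ast-poc/diff-minimiser.js mentioned elsewhere in these issues is available locally, and it shells out to git for the diffs:

```js
// Hypothetical sketch: print raw vs minimised diff line counts per changed
// chunk file, formatted as a markdown list for pasting into the CHANGELOG.
const { execSync } = require('child_process');

const run = (cmd) =>
  execSync(cmd, { encoding: 'utf8', maxBuffer: 512 * 1024 * 1024 });

const changedFiles = run('git diff --name-only -- unpacked/_next/static/chunks')
  .split('\n')
  .filter(Boolean);

console.log("- TODO: The following files haven't been deeply reviewed:");
for (const file of changedFiles) {
  const countLines = (cmd) => run(cmd).split('\n').filter(Boolean).length;

  const raw = countLines(`git diff --diff-algorithm=patience -- '${file}'`);
  const minimised = countLines(
    `git diff --diff-algorithm=patience -- '${file}' | ./scripts/ast-poc/diff-minimiser.js 2>/dev/null`,
  );

  console.log(
    `  - \`${file}\` (diff: \`${raw}\` lines, minimised diff: \`${minimised}\` lines)`,
  );
}
```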

Simple diff analysis based on strings and identifiers

This is an idea for a type of analysis of code diffs. This issue is just for tracking notes and ideas.

Example input 1

+            guidance: function (e) {
+              var t = e.isSearchable,
+                n = e.isMulti,
+                r = e.isDisabled,
+                i = e.tabSelectsValue;
+              switch (e.context) {
+                case "menu":
+                  return "Use Up and Down to choose options"
+                    .concat(
+                      r
+                        ? ""
+                        : ", press Enter to select the currently focused option",
+                      ", press Escape to exit the menu",
+                    )
+                    .concat(
+                      i
+                        ? ", press Tab to select the option and exit the menu"
+                        : "",
+                      ".",
+                    );

Here the extraction might be

guidance
e
t
isSearchable
n
isMulti
isDisabled
i
tabSelectsValue
context
"menu"
"Use Up and Down to choose options"
concat
r
""
", press Enter to select the currently focused option"
", press Escape to exit the menu"
", press Tab to select the option and exit the menu"
"."

Of course, this gives you far less information than the original, but I think it could be a good trade-off in cases where you want to look at the diff a little bit but don't have time to see everything.

Example input 2

Since common generic ones like e, t, n, "", and "." would show up frequently in any context, they would have already been seen in the past, and therefore filtered out. You'd focus more on, e.g., the "Use Up and Down to choose options" string, with some kind of convenient way to jump back and see it in context in the code.

For input like the following:

+      var a = n(72843);
+      function s(e, t) {
+        for (var n = 0; n < t.length; n++) {
+          var r = t[n];
+          (r.enumerable = r.enumerable || !1),
+            (r.configurable = !0),
+            "value" in r && (r.writable = !0),
+            Object.defineProperty(e, (0, a.Z)(r.key), r);
+        }
+      }
+      function l(e, t, n) {
+        return (
+          t && s(e.prototype, t),
+          n && s(e, n),
+          Object.defineProperty(e, "prototype", { writable: !1 }),
+          e
+        );
+      }

None of the names or strings would probably be new, and so you wouldn't see it at all. This is intended, because I can't glean any conclusions from looking at it, and thus would prefer not to see it.

Glenn's comments

https://twitter.com/_devalias/status/1770284997385277554

I think, given the size of a lot of the JS files and the diffs themselves, it would probably end up being a LOT of strings, which might be confusing when removed from the rest of the context of the surrounding code.

For large diffs I think it'd be a lot, but strings and names are a subset of the raw diff, so it should still be less work than a full manual analysis. The idea is to just visually filter through them until you see a name/string that looks interesting on their own, which could lead to something good in-context.

It should be fairly easy to prototype a script using the babel parser and babel traverse though.
You would add a rule or two to the traverse so that it matches whatever node types the strings correspond to in the AST, and then output them to the console or a file or similar.
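
A minimal sketch of what that extractor might look like (assuming @babel/parser and @babel/traverse; the sort/dedupe at the end is just one possible choice, so that the output of two builds can simply be diffed):

```js
// Hypothetical sketch: dump every identifier and string literal in a file,
// one per line, so two builds' outputs can be sorted and diffed.
// Usage: node extract-strings.js unpacked/_next/static/chunks/pages/_app.js
const fs = require('fs');
const parser = require('@babel/parser');
const traverse = require('@babel/traverse').default;

const code = fs.readFileSync(process.argv[2], 'utf8');
const ast = parser.parse(code, { sourceType: 'unambiguous' });

const seen = new Set();
traverse(ast, {
  Identifier({ node }) { seen.add(node.name); },
  StringLiteral({ node }) { seen.add(JSON.stringify(node.value)); },
  TemplateElement({ node }) {
    if (node.value.cooked) seen.add(JSON.stringify(node.value.cooked));
  },
});

for (const entry of [...seen].sort()) console.log(entry);
```

Diffing the sorted output of two builds (eg. with git diff --no-index, or comm -13) would then surface only the newly appearing names/strings.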

Haven't worked with Babel but some relevant docs seem to be

Are there other AST parsers too? Would something like TreeSitter work? I'd generally prefer to avoid node.js if it's not required

Then you would just diff that output file of strings between one build and the next.
If code moves around between builds it might introduce its own form of noise (but maybe git diff --color-moved would still handle that anyway)

I haven't seen enough diffs to anticipate exactly what these would look like, but there might be different solutions like --color-moved that could work depending on how it goes.

I also noticed you liked some of my tweets about my more generalised diff minimiser; which would reduce the noise of things a fair bit overall as well.
I still need to polish that and commit/upload it; been super busy lately and haven’t had a chance to yet.

Related:

Feel free to open an issue on the ChatGPT Source watch repo about the string extractor idea + link back to these tweets/copy the relevant info in.
I’d be happy to give some more pointers about it and/or include it in the repo if you wanted to work on it.

Yeah, I want to make a prototype and see if it will kind of work. I'm still not sure on the implementation, though; the most efficient system might be to integrate with a text editor, which makes it harder to be replicable

Script to identify language/translation files + list them + diff them

Currently it's a manual process to identify which of the chunk files are language/translation files, list them in the CHANGELOG, identify the English translation file, extract/parse the JSON within it, then do a sorted/JSON diff to determine what changed (while also minimising the noise of renamed keys/etc)

It would be good to write a script to automate this process.
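
As a starting point, here's a minimal sketch of just the 'sorted/JSON diff' step, assuming the translation JSON has already been extracted from the chunk into its own file; the identification/extraction steps would still need to be automated separately:

```js
// Hypothetical sketch: normalise an extracted translation JSON file into a
// stable, key-sorted form so successive builds diff cleanly.
// Usage: node sort-translations.js en.json > en.sorted.json
const fs = require('fs');

function sortDeep(value) {
  if (Array.isArray(value)) return value.map(sortDeep);
  if (value && typeof value === 'object') {
    return Object.fromEntries(
      Object.keys(value)
        .sort()
        .map((key) => [key, sortDeep(value[key])]),
    );
  }
  return value;
}

const parsed = JSON.parse(fs.readFileSync(process.argv[2], 'utf8'));
process.stdout.write(JSON.stringify(sortDeep(parsed), null, 2) + '\n');
```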

This is an entry that had language file changes:

Explore creating a 'reverse engineered' records.json / stats.json file from a webpack build

This is an idea I've had in passing a few times, but keep forgetting to document it:

  • https://medium.com/@songawee/long-term-caching-using-webpack-records-9ed9737d96f2
    • there are many factors that go into getting consistent filenames. Using Webpack records helps generate longer lasting filenames (cacheable for a longer period of time) by reusing metadata, including module/chunk information, between successive builds. This means that as each build runs, modules won’t be re-ordered and moved to another chunk as often which leads to less cache busting.

    • The first step is achieved by a Webpack configuration setting: recordsPath: path.resolve(__dirname, './records.json')
      This configuration setting instructs Webpack to write out a file containing build metadata to a specified location after a build is completed.

    • It keeps track of a variety of metadata including module and chunk ids which are useful to ensure modules do not move between chunks on successive builds when the content has not changed.

    • With the configuration in place, we can now enjoy consistent file hashes across builds!

    • In the following example, we are adding a dependency (superagent) to the vendor-two chunk.

      We can see that all of the chunks change. This is due to the module ids changing. This is not ideal as it forces users to re-download content that has not changed.

      The following example adds the same dependency, but uses Webpack records to keep module ids consistent across the builds. We can see that only the vendor-two chunk and the runtime changes. The runtime is expected to change because it has a map of all the chunk ids. Changing only these two files is ideal.

  • https://webpack.js.org/configuration/other-options/#recordspath
    • recordsPath: Use this option to generate a JSON file containing webpack "records" – pieces of data used to store module identifiers across multiple builds. You can use this file to track how modules change between builds.

  • https://github.com/search?q=path%3A%22webpack.records.json%22&type=code
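
For reference, a minimal webpack config sketch showing the normal (forward) use of recordsPath described in the quotes above; whether/how the equivalent data could be reconstructed from an already-built bundle is the open question of this issue:

```js
// Minimal sketch (assuming webpack 5): write build metadata to records.json
// after each build, and reuse it on the next build so module/chunk ids stay
// stable across successive builds.
const path = require('path');

module.exports = {
  entry: './src/index.js',
  output: {
    path: path.resolve(__dirname, 'dist'),
    filename: '[name].[contenthash].js',
  },
  recordsPath: path.resolve(__dirname, 'records.json'),
};
```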

I'm not 100% sure whether this would be useful, or only partially useful, but I'm thinking of it tangentially in relation to things like:

Fix script for extracting CSS URLs from `webpack.js` + unpacking `*.css` files

In the past there was only a single *.css URL extracted from webpack.js (from the miniCssF field), so it was unpacked as miniCssF.css (since the *.css files' hashes change every time they are re-built, and they don't seem to have a static chunk part in their filename when downloaded)

More recently, there have been new *.css files specific to certain chunks (sometimes shared among multiple chunks), and so the scripts for extracting this are broken and produce an entry like this:

https://cdn.oaistatic.com/_next/undefined

We also need to think about how best to name the files. I think main.css would probably work for the 'main' chunk (previously what we called miniCssF). For the *.css related to the other chunks, if they only applied to a single chunk I would probably have named them based on that chunk, but sometimes they are used in multiple chunks. If doing it manually we could probably figure out what they are used for and name them based on that, but not sure the best way to do this automatically. We can't use the hash of the *.css file, as that changes every time the file changes.
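
One possible direction for fixing the extraction side would be to scan the unpacked webpack.js for anything that looks like a CSS chunk path, rather than only reading the single miniCssF-derived URL. A rough sketch follows; the file path and the regex shape are assumptions that would need checking against a real webpack.js:

```js
// Hypothetical sketch: list every *.css asset path mentioned in the unpacked
// webpack runtime. The path to webpack.js and the shape of the regex are
// assumptions and would need checking against the real file.
const fs = require('fs');

const source = fs.readFileSync(
  'unpacked/_next/static/chunks/webpack.js',
  'utf8',
);

// Match fragments that look like CSS chunk filenames, eg. "static/css/<hash>.css"
const cssPaths = new Set(
  [...source.matchAll(/static\/css\/[\w.-]+\.css/g)].map((m) => m[0]),
);

for (const cssPath of cssPaths) {
  console.log(`https://cdn.oaistatic.com/_next/${cssPath}`);
}
```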

Update README to align with current scripts/process

I haven't really updated the 'Helper Scripts' / 'Getting Started' sections of the README for quite a while, so they aren't fully aligned with how I am actually doing things these days (they're more based on the older manual-ish methods, or maybe the first iteration of semi-automation)

It would be good to figure out what is completely outdated, what is still relevant but the 'old way' of doing things, and what is the 'new/current way' of doing things; and then update the README to capture that knowledge rather than it being locked up in my head/similar.


At a very high level, off the top of my head, my current process is basically:

  • Load ChatGPT and let my userscript check/notify me if there are any new script files
  • If there are new scripts, use the 'Copy ChatGPT Script data to clipboard' menu option on Tampermonkey
  • Run the following script to get a filtered list of the JSON (with dates) and list of URLs to be downloaded:
    • pbpaste | ./scripts/filter-urls-not-in-changelog.js --json-with-urls
    • # Example
      ⇒ pbpaste | ./scripts/filter-urls-not-in-changelog.js --json-with-urls
      {
        url: 'https://cdn.oaistatic.com/_next/static/chunks/pages/_app-783c9d3d0c38be69.js',
        date: '2024-02-24T02:18:13.376Z'
      }
      {
        url: 'https://cdn.oaistatic.com/_next/static/chunks/webpack-2e4c364289bb4774.js',
        date: '2024-02-24T02:18:13.376Z'
      }
      {
        url: 'https://cdn.oaistatic.com/_next/static/WRJHgIqMF1lNwSuszzsvl/_buildManifest.js',
        date: '2024-02-24T02:18:13.376Z'
      }
      {
        url: 'https://cdn.oaistatic.com/_next/static/WRJHgIqMF1lNwSuszzsvl/_ssgManifest.js',
        date: '2024-02-24T02:18:13.376Z'
      }
      https://cdn.oaistatic.com/_next/static/chunks/pages/_app-783c9d3d0c38be69.js
      https://cdn.oaistatic.com/_next/static/chunks/webpack-2e4c364289bb4774.js
      https://cdn.oaistatic.com/_next/static/WRJHgIqMF1lNwSuszzsvl/_buildManifest.js
      https://cdn.oaistatic.com/_next/static/WRJHgIqMF1lNwSuszzsvl/_ssgManifest.js
  • Copy the output of this command and paste it into SublimeText as a scratch pad/reference
  • Copy the list of URLs, then run the following command
    • pbpaste | ./scripts/add-new-build-v2.sh 2>&1 | subl
    • This does the bulk of the automation: checking/downloading the URLs, extracting additional URLs from the _buildManifest.js and webpack.js and downloading those, unpacking/formatting the downloaded files, generating a copy/pasteable CHANGELOG entry, etc.
    • Note that as part of running this script, it will ask for the date of the build (from the above JSON) to be input at one point before the CHANGELOG entry is generated
  • Manually copy/paste the generated CHANGELOG entry into the CHANGELOG.md file, generate and add the updated link in the Table of Contents, then modify the entry to add manual analysis notes, etc
  • Commit/push the downloaded files + updated CHANGELOG
  • Potentially write a tweet about the update linking back to the CHANGELOG update, etc
    • If we do, then we should also edit the CHANGELOG again to add a link to that Tweet/thread.
    • Sometimes I will also make a crossposted update on Reddit / LinkedIn / HackerNews / etc; if I do, I tend to also link to those posts in the Tweet thread (and maybe sometimes in the CHANGELOG as well, but I don't think I have bothered with that much lately)

There might be bits in that which aren't perfectly documented, or little snippets of nuance that I've missed, but that is roughly my current process.


In manually reviewing the diffs to add my 'manual analysis' to the CHANGELOG, there is often a lot of 'diff churn' noise from the minified variable names changing in the webpack build/etc. I've been working on some new scripts that help minimise that, which I haven't pushed yet, but you can see some of my notes about them in this issue:

Currently, I sort of roughly/hackily run them with a command similar to this:

diffmin-wc-raw () { git diff --diff-algorithm=patience "$1" | wc -l; };

diffmin-wc () { git diff --diff-algorithm=patience "$1" | ./scripts/ast-poc/diff-minimiser.js 2>/dev/null | wc -l; };

diffmin () { git diff --diff-algorithm=patience "$1" | ./scripts/ast-poc/diff-minimiser.js 2>/dev/null | delta; };

# diffmin-wc-raw unpacked/_next/static/chunks/pages/_app.js
# diffmin-wc unpacked/_next/static/chunks/pages/_app.js
diffmin unpacked/_next/static/chunks/pages/_app.js

See Also

Automate checking for new builds with GitHub action/similar

Currently, checking for new builds is a somewhat 'manually assisted' process: browsing to the ChatGPT site, letting the chatgpt-web-app-script-update-notifier user script check if any of the script files have changed, then potentially reacting to that notification with more manual steps.

You can see the full manual steps outlined on this issue:

But the core initial steps are summarised below:

At a very high level, off the top of my head, my current process is basically:

Originally posted by @0xdevalias in #7 (comment)

Because the notifier currently only runs when the ChatGPT app is accessed, it is easy to miss updates (eg. if updates happen but the ChatGPT app isn't accessed), and also easy to get distracted from the task ChatGPT was originally opened for by the fact that there is a new update (a tantalising procrastination/avoidance 'treat' when the task at hand brings less dopamine).

The proposed solution would be to use GitHub Actions or similar to schedule an 'update check' at a regular interval (eg. once per hour). The following are some notes I made in an initial prompt to ChatGPT for exploring/implementing this:

Can you plan out and create a github action that will:

- run on a schedule (eg. every 1hr)
- check the HTML on a specified webpage and extract some .js script URLs related to a bundled webpack/next.js app
- check (against a cache or similar? not sure of the best way to implement this on github actions) if those URLs have been previously recorded
- if they are new URLs, notify the user and/or kick off further processing (this will probably involve executing one or more scripts that will then download/process the URLs)

That describes the most basic features this should be able to handle (off the top of my head), but the ideal plan is that the solution will be expandable to be able to handle and automate more of the process in future. Some ideas for future features would be:

  • being able to open a Pull Request for each new build that contains the downloaded files and the results of various scripts run on them. This PR would also serve as an interface to prompt the user with any manual actions required of them, and some 'bot commands'/workflow for finalising the updates to the CHANGELOG/etc (eg. rebasing the PR)
  • etc
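
A minimal sketch of the core 'check for new script URLs' step that such a scheduled workflow could run, using only Node built-ins (Node 18+ for fetch). The page URL, the cache file name, and the regex are assumptions, and the real page may require a session or other handling to fetch:

```js
// Hypothetical sketch: fetch the app's HTML, extract .js script URLs, and
// compare them against a committed cache file of previously seen URLs.
// The URL, cache file name, and regex below are assumptions.
const fs = require('fs');

const PAGE_URL = 'https://chatgpt.com/';
const CACHE_FILE = 'seen-script-urls.txt';

async function main() {
  const html = await (await fetch(PAGE_URL)).text();

  // Pull every .js script URL referenced by the page (next.js chunks, etc).
  const urls = [...html.matchAll(/<script[^>]+src="([^"]+\.js[^"]*)"/g)].map(
    (m) => new URL(m[1], PAGE_URL).toString(),
  );

  const seen = new Set(
    fs.existsSync(CACHE_FILE)
      ? fs.readFileSync(CACHE_FILE, 'utf8').split('\n').filter(Boolean)
      : [],
  );

  const fresh = urls.filter((u) => !seen.has(u));
  if (fresh.length === 0) return;

  // Surface the new URLs; a real workflow might instead open an issue/PR or
  // kick off the existing download/unpack scripts.
  console.log(fresh.join('\n'));
  fs.writeFileSync(CACHE_FILE, [...seen, ...fresh].join('\n') + '\n');
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
```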

See Also
