I'm an independent JVM (Scala, Kotlin & Java) contractor specializing in backend web development. Please contact me at (github-username) (at) pm.me for work.
I tweet memorable quotes from podcasts at @podquotesio and AWK stuff at @mawkic.
a toy jvm in awk
License: MIT License
Hmm, what do you mean by that? Here's a fully functional hex encoder for gawk (sorry for the poor formatting, I dug it up from my pile).
Even in gawk unicode-byte mode, I got it to hex-encode two different binary MP3 files with ease and without any error messages popping up. Try not to use it in gawk -P (POSIX mode), where all kinds of weird behavior may bubble up. I think the octal encoder also works, but I haven't tested it lately. Let me know whether this works or not.
If the offset 8^8 doesn't work, use 0xDC00 instead. If that also fails, try -4^4 as a last resort.
gawk -e '
function hexencode(str, chr) {
    for (chr in b2hex) {
        if (chr !~ /[[:alnum:]%\\]/) {
            gsub(chr, b2hex[chr], str)
        }
    }
    return str
}
function octencode(str, chr) {
    gsub(/\\/, b2oct["\\"], str)
    gsub(/[0-7]/, "\06&", str)
    for (chr in b2oct) {
        if (chr !~ /[0-7\\]/) {
            gsub(chr, b2oct[chr], str)
        }
    }
    return str
}
BEGIN {
    # Lookup tables: byte -> \xNN escape and byte -> \NNN octal escape.
    offset = 8^8
    for (x = 0; x < 256; x++) {
        byte = sprintf("%c", x + offset)
        b2hex[byte] = sprintf("\\x%.2X", x)
        b2oct[byte] = sprintf("\\%03o", x)
    }
    spc1 = "/\\^[]"
    spc2 = "~!@#%&_-{}:;\42\47\140 <>,$.|()*+=?"
    # Regex metacharacters are re-keyed with a leading backslash so the key
    # can serve as a gsub() pattern.
    for (x = length(spc1); x; x -= 1) {
        byte = substr(spc1, x, 1)
        b2hex[("\\" (byte))] = b2hex[byte]
        b2oct[("\\" (byte))] = b2oct[byte]
        delete b2hex[byte]
        delete b2oct[byte]
    }
    # Other punctuation is re-keyed as a one-character bracket expression, e.g. [~].
    for (x = length(spc2); x; x--) {
        byte = substr(spc2, x, 1)
        b2hex[("[" (byte) "]")] = b2hex[byte]
        b2oct[("[" (byte) "]")] = b2oct[byte]
        delete b2hex[byte]
        delete b2oct[byte]
    }
}
# Read the whole input as a single record and emit it with no separators.
BEGIN { RS = FS = "^$"; OFS = ""; ORS = "" }
END { print hexencode($0) }'
This encoder may not be 100% faithful to the URL-encoding spec; it's just something I quickly slapped together a while back. It's currently set up to skip only the alphanumeric characters, so it also encodes punctuation symbols that the spec does not require encoding. Feel free to modify it.
If you are willing to use the gawk -b argument, it isn't hard to make a hexdumper. The following is something I cooked up that gives identical output to your hexdump -v -e '/1 "%01u "' script.
I recommend using https://www.gnu.org/software/gawk/manual/gawk.html#Extension-Sample-Readfile instead of the gross randomstring() stuff below.
#!/usr/bin/gawk -bf
# If you look up in the shebang the -b argument is what makes this work. It
# forces gawk to read the characters in as a stream of bytes rather than
# encoded characters.
#
# I know that makes no sense, but the docs describe it as:
#
# > an easy way to tell gawk, "Hands off my data!"
#
# and that turns out to be just what we need.
function randomstring() {
    output = ""
    for (i=0; i < 16; i++) {
        output = output sprintf("%04x", int(rand() * 65536))
    }
    return output
}
BEGIN {
    srand()
    # By setting the RS to a big random string we get the file as a single record
    # without using a gnu extension or the ugly concat loop, as the string is
    # very unlikely to appear in any file ever. You could just hardcode a uuid if
    # you don't mind it not being future proof.
    RS = randomstring()
    FPAT = "."
    # We just build an encoding table rather than try to compute this somehow
    for (i=0; i <= 255; i++) {
        c = sprintf("%c", i)
        codes[c] = i
    }
}
{
    for (i=1; i <= NF; i++) {
        printf("%d ", codes[$i])
    }
}
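The readfile suggestion could look roughly like the sketch below. It assumes a gawk build that ships the sample readfile extension (loaded here with -l readfile); the wrapper name dump_bytes is made up for illustration. Since readfile() returns the whole file as one string, neither the random RS nor FPAT is needed.

```shell
# Hypothetical wrapper: print each byte of the file "$1" as a decimal number.
dump_bytes() {
    gawk -b -l readfile '
    BEGIN {
        # Same lookup table as the script above: byte -> numeric value.
        for (i = 0; i <= 255; i++) codes[sprintf("%c", i)] = i
        # readfile() comes from the extension and returns the file contents.
        data = readfile(ARGV[1])
        n = length(data)
        for (i = 1; i <= n; i++) printf("%d ", codes[substr(data, i, 1)])
    }' "$1"
}
```

Called as dump_bytes some.bin, this should print the same space-separated byte values as the script above. Note that readfile() takes a file name, so this variant does not read from a pipe.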
We're currently running the tests on macos-latest and ubuntu-latest. Can we make it work on windows-latest as well? Given that both javac and gawk are installed there by default, this should not be too difficult :)