|Posted by:||Harry Potter (rose.joseph…@yahoo.com)|
|Date:||Mon, 25 May 2020|
Hi! I just looked at the Wikipedia article "Prediction by partial matching"
and like what I hear. I believe I get the gist of it: shorten each character
to the likelihood that it will occur after the previous character. Now, I
have an idea to shorten the code:
1. Scan the input for all strings and count the number of occurrences of
each character after the last.
2. Scan the occurrences and write, for each preceding character, the 16
most-often-occurring following characters. 16 could be any useful number.
3. Scan the input again and, for each character stored as likely-occurring,
write 1 then the shortened count. Otherwise, write 0 then a fraction of the
entry skipping the stored values over the total non-stored values. I have a
way to shorten these values in bit streams.
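To make the three steps concrete, here is a rough Python sketch of how I read them. It is only an illustration: the function names are made up, the tokens are plain (flag, value) pairs rather than packed bits, the "shortened count" is taken to be the rank in the stored list, and sending the first byte as a literal is my assumption. The escape value is the index of the byte among the values not stored for that context, so it ranges over 256 minus the number of stored entries.

```python
from collections import Counter, defaultdict

TOP_N = 16  # "16 could be any useful number"


def count_successors(data):
    """Step 1: for each byte value, count which byte values follow it."""
    counts = defaultdict(Counter)
    for prev, cur in zip(data, data[1:]):
        counts[prev][cur] += 1
    return counts


def build_tables(counts, top_n=TOP_N):
    """Step 2: keep the top_n most frequent followers per preceding byte."""
    return {prev: [sym for sym, _ in c.most_common(top_n)]
            for prev, c in counts.items()}


def encode(data, tables):
    """Step 3: (1, rank) for a stored follower, else (0, escape index)."""
    if not data:
        return []
    # Assumption: the first byte has no context, so it goes out as a literal.
    out = [(0, data[0])]
    for prev, cur in zip(data, data[1:]):
        stored = tables.get(prev, [])
        if cur in stored:
            out.append((1, stored.index(cur)))
        else:
            # Index of cur among the byte values NOT stored for this context.
            skipped = sum(1 for s in stored if s < cur)
            out.append((0, cur - skipped))
    return out


def decode(tokens, tables):
    """Inverse of encode, given the same tables."""
    out = bytearray()
    for i, (flag, value) in enumerate(tokens):
        if i == 0:
            out.append(value)  # literal first byte
            continue
        stored = tables.get(out[-1], [])
        if flag == 1:
            out.append(stored[value])
        else:
            # Walk the stored values in order to undo the skipping.
            sym = value
            for s in sorted(stored):
                if s <= sym:
                    sym += 1
            out.append(sym)
    return bytes(out)


data = b"abracadabra abracadabra"
tables = build_tables(count_successors(data))
assert decode(encode(data, tables), tables) == data
```

The decoder needs the same tables as the encoder, so in a real file the tables from step 2 would have to be written ahead of the token stream.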
Now, this requires a lot of memory: the counts alone would require 256k and,
therefore, I deem it a 32-bit technique. I am currently working on 8- and
16-bit compression techniques. I plan to do 32- and 64-bit compression at a
later date. Unless, of course, I can shorten the buffer to include only the
often-occurring values. :)
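For reference, the memory arithmetic can be checked in a few lines. The 4-byte counter size is my assumption, chosen because it matches the 256k figure; the second number is what a buffer holding only the 16 stored followers per preceding character would need at one byte each.

```python
CONTEXTS = 256       # possible preceding byte values
SYMBOLS = 256        # possible following byte values
COUNTER_BYTES = 4    # assumption: one 32-bit counter per pair
TOP_N = 16           # stored followers per preceding byte

full_table_bytes = CONTEXTS * SYMBOLS * COUNTER_BYTES
top_table_bytes = CONTEXTS * TOP_N  # one byte per stored follower

print(full_table_bytes // 1024, "KiB for the full count table")  # 256
print(top_table_bytes // 1024, "KiB for the top-16 table")       # 4
```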
What do you think?