์ฅ๋ ์์์๋ ์ ๊พธ๊ฐ ๋ณต๋ ์ ์ด๋ค. - ํ๋ ๋ฐ๋ฆฌํฌ ๋์
๏ปฟ * ๊ตฌ๊ธ์์ ์ฐ์ฐํ ์ฐพ์ ํ์ผ์์ ๋ด์ฉ ์ค RLE ๋ถ๋ถ๋ง ๋ฒ์ญํ ๊ฒ์
๋๋ค.
- ์๋ณธํ์ผ :
CODECS.ZIP
1 ๊ฐ์ #
RLE์ Run Length Encoding์ ์ฝ์ด์ด๋ค. ๋ค๋ฅธ ์ฝ์ด๋ฅผ ๋ค์๋ฉด RLC (Run Length Coding)๋ผ๊ณ ๋ ํ๋ค. (๋์ ๊ฐ์ ๋ป์ด๋ค.) ์ด ๋ฐฉ์์ ์์ด๋์ด๋ ๋ฐ์ดํ๋ฅผ ๋ฐ๋ณต๋๋ ํ๋ ์์ ๊ธฐ์คํ์ฌ ์ฌ์ฝ๋ฉํ๋ ๊ฒ์ด๋ค. ํ๋ ์์ ํ๋ฒ ์ด์ ๋ฐ๋ณต๋๋ ํ๋ ์ด์์ ๋ฐ์ดํธ๋ค์ด ๋ ์ ์๋ค.
There are several means to encode occurrences. ๋ฐ๋ผ์, ๋ช๊ฐ์ง ์ฝ๋ฑ(๋ณตํธ๊ธฐ)๋ฅผ ๊ฐ์ง๊ฒ ๋ ๊ฒ์ด๋ค. ์๋ฅผ ๋ค์๋ฉด, ๋ค์๊ณผ ๊ฐ์ ๋ฐฐ์ด์ ๊ฐ์ง๊ณ ์๋ค๊ณ ํ์:
0,0,0,0,0,0,255,255,255,2,3,4,2,3,4,5,8,11
๋ช๋ช ์ฝ๋ฑ์ ๋จ์ง 0๊ณผ 255 ์ ์ฒด์ ๋ฐ๋ณต๋ง์ ๋ค๋ฃฐ ๊ฒ์ด๊ณ , ๋ช๋ช ์ฝ๋ฑ์ 0์ ๋ฐ๋ณต๊ณผ 255์ ๋ฐ๋ณต์ ๋ณ๋๋ก ๋ค๋ฃฐ ๊ฒ์ด๋ค.
์ด ์์ ์ ๊ธฐ์ดํ ์ค์ํ ์ ๋ค์ ๋ช
์ฌํ๋๋ก. ์ฝ๋ฑ์ ์์ถํ๋ ค๊ณ ํ๋ ๋ชจ๋ ๋ฐ์ดํ์ ๋์ํ์ง๋ ์๋๋ค. ๊ทธ๋์, ๋ฐ๋ณต๋๋ ์์๊ฐ ์๋ ๊ฒฝ์ฐ, RLE๋ฐฉ์์ ์ฝ๋ฑ์ ์์ถํ ์ ์๋ค๊ณ ํ ํ์๋ ์๋ค. ์ค์ ๋ก, ๋ฐ๋ณต๋์ง ์์ ๋ฐ์ดํ๋ฅผ ์์ถํ๋ ค๊ณ ์๋ํ ๊ฒ์ด๋ค. (์ด๊ฒ์ ๋ฐ์ดํ๋ฅผ ๊ทธ๋๋ก ๋ณต์ฌํ๋ ๊ฒ๊ณผ ๊ฐ๋ค.) ๋ฌผ๋ก , ์
๋ ฅ ๋ฐ์ดํ์ ๋ณต์ฌ๋ณธ์ ๊ฐ ๋ณต์ฌ ๋ธ๋ญ์ ์์ "ํค๋"๊ฐ ๋ถ์ด์๋ค. ์ด๊ฒ์ ์ ์ฒด๋ฐ์ดํ๋ฅผ ์์ถํ๋ค๊ธฐ ๋ณด๋ค๋ ๋๋ฆฌ์ด ํฌ๊ธฐ๋ฅผ ์ฆ๊ฐ์ํฌ ์ ์๋ค. ๊ทธ๋ฌ๋ ์์ถ๋๋ ์ ์ฒด ๋ฐ์ดํ์ ์
์ฅ์์ ๋ณด๋ฉด, ๋ฐ๋ณต๋๋ ํ๋ ์ ๋ถ๋ถ์ ์์ถ์ ๋น์์ถ๋ถ๋ถ๋ณด๋ค ํจ์ฌ ์ ์ ๊ณต๊ฐ์ ์ฐจ์งํ ๊ฒ์ด๋ค.
RLE๋ผ๋ ์ด๋ฆ์ ๊ฐ์ง ๋ชจ๋ ์๊ณ ๋ฆฌ์ฆ์ ๋ค์๊ณผ ๊ฐ์ 3~4๊ฐ์ ์์น๋ฅผ ๊ฐ์ง๊ณ ์๋ค:
- Value saying if there's a repetition
- Value saying how many repetitions (or non repetition)
- Value of the length of the frame (useless if you just encode frame with one byte as maximum length)
- Value of the frame to repeat (or not)
2 ์ฒซ๋ฒ์งธ RLE ๊ตฌ์กฐ(scheme) #
์ฒซ๋ฒ์งธ ๊ตฌ์กฐ๋ ๋ด๊ฐ ์๊ณ ์๋ ํ ๊ฐ์ฅ ๊ฐ๋จํ ๊ตฌ์กฐ์ด๋ค. MAC ์ด๋๋ ์ค ์์คํ
(MacPackBit)์์ ์ฌ์ฉ๋ ๊ฒ๊ณผ Targa, PCX, TIFF๋ฑ๋ฑ์ ํ์ผํฌ๋ฉง์์ ์ฌ์ฉ๋ ๋ฐฉ์๊ณผ ๋น์ทํ๊ฒ ๋ณด์ผ ์๋ ์๋ค.
์ฌ๊ธฐ, ๋ชจ๋ ์์ถ๋ ๋ธ๋ญ์ ํค๋(header)๋ผ๊ณ ์ด๋ฆ์ง์ด์ง 1 ๋ฐ์ดํธ๋ก ์์๋๋ค. ์ด๊ฒ์ ์๋ ๊ทธ๋ฆผ๊ณผ ๊ฐ๋ค:
| Bits | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
| Header | X | X | X | X | X | X | X | X |
- ๋นํธ 7 : ์์ถ ์ํฉ (1 = ์์ถ์ด ์ ์ฉ๋จ)
- 0 ~ 6 : ๋ค๋ฃจ์ด์ผํ ๋ฐ์ดํธ์ ์
๋ณด๋ค์ํผ, ์ด ๋ฐฉ๋ฒ์ ๋จ์ง 1 ๋ฐ์ดํธ๋ก ํ๋ ์์ ์ ์ดํ ๋ฟ์ด๋ค.
์ถ๊ฐ์ฌํญ : ์ฌ๋ฌ๋ถ์ ์์ถํ๊ธฐ ์ํด์๋ ์ ์ด๋ 1๊ฐ์ ๋ฐ์ด๋๋ฅผ ๊ฐ์ ธ์ผ ํ๋ฏ๋ก ๋น์์ถ ํ๋ ์์ ๋ํด.. You have 'minus 1' for non-repeated frames because you must have at least one byte to compress and 'minus 2' for repeated frames because the repetition must be 2, at least.
Compression scheme:
First byte=Next /\ / \ Count the byte Count the occurrence of NON identical occurrences bytes (maximum 128 times) (maximum 129 times) and store them in an array | | | | 1 bit '1' 1 bit '0' + 7 bits giving + 7 bits giving the number (-2) the number (-1) of repetitions of non repetition + repeated byte + n non repeated bytes | | 1xxxxxxx,yyyyyyyy 0xxxxxxx,n bytes [-----------------] [----------------]
Example:
See codecs source codes: codrle1.c and dcodrle1.c
| Sequence of bytes to encode | Coded values | Differences with compression (unit: byte) |
| 255,15, | 1,255,15, | -1 |
| 255,255, | 128,255, | 0 |
| 15,15, | 128,15, | 0 |
| 255,255,255, | 129,255, | +1 |
| 15,15,15, | 129,15, | +1 |
| 255,255,255,255, | 130,255, | +2 |
| 15,15,15,15 | 130,15 | +2 |
See codecs source codes: codrle1.c and dcodrle1.c
3 Second RLE scheme #
In the second scheme of RLE compression you look for the less frequent byte
in the source to compress and use it as an header for all compressed block.
In the best cases, the occurrence of this byte is zero in the data to compress.
Two possible schemes, firstly with handling frames with only one byte,
secondly with handling frames with one byte *and* more. The first case is
the subject of this current compression scheme, the second is the subject
of next compression scheme.
For the frame of one byte, header byte is written in front of all repetition
with at least 4 bytes. It is then followed by the repetition number minus 1 and
the repeated byte.
Header byte, Occurrence number-1, repeated byte
If a byte don't repeat more than tree times, the three bytes are written without
changes in the destination stream (no header nor length, nor repetition in front
or after theses bytes).
An exception: If the header byte appears in the source one, two, three and up
times, it'll be respectively encoded as following:
- Header byte, 0
- Header byte, 1
- Header byte, 2
- Header byte, Occurrence number-1, Header byte
Example, let's take the previous example. A non frequent byte is zero-ASCII
because it never appears.
Sequence of bytes to encode | Coded values | Differences with compression
| (unit: byte) |
255,15, | 255,15, | 0
255,255, | 255,255, | 0
15,15, | 15,15, | 0
255,255,255, | 255,255,255, | 0
255,255,255,255, | 0,3,255, | -1
15,15,15, | 15,15,15, | 0
15,15,15,15 | 0,3,15 | -1
If the header would appear, we would see:
Sequence of bytes to encode | Coded values | Differences with compression
| (unit: byte) |
0, | 0,0, | +1
255, | 255, | 0
0,0, | 0,1, | 0
15, | 15, | 0
0,0,0, | 0,2, | -1
255, | 255, | 0
See codecs source codes: codrle2.c and dcodrle2.c
0,0,0,0 | 0,3,0 | -1
*** Third RLE scheme ***
It's the same idea as the second scheme but we can encode frames with
more than one byte. So we have three cases:
- If it was the header byte, whatever is its occurrence, you encode it with:
Header byte,0,number of occurrence-1
- For frames which (repetition-1)*length>3, encode it as:
Header byte, Number of frame repetition-1, frame length-1,bytes of frame
- If no previous cases were detected, you write them as originally (no header,
nor length, nor repetition in front or after theses bytes).
Example based on the previous examples:
Sequence of bytes to encode | Coded values | Differences with compression
| (unit: byte) |
255,15, | 255,15, | 0
255,255, | 255,255, | 0
255,255,255, | 255,255,255, | 0
15,15, | 15,15, | 0
15,15,15, | 15,15,15, | 0
15,15,15,15, | 15,15,15,15, | 0
255,255,255,255, | 255,255,255,255, | 0
16,17,18,16,17,18, |16,17,18,16,17,18,| 0
255,255,255,255,255, | 0,4,0,255, | -1
15,15,15,15,15, | 0,4,0,15, | -1
16,17,18,16,17,18,16,17,18,| 0,2,2,16,17,18, | -3
26,27,28,29,26,27,28,29 |0,1,3,26,27,28,29 | -1
If the header (value 0) would be met, we would see:
Sequence of bytes to encode | Coded values | Differences with compression
| (unit: byte) |
0, | 0,0,0, | +2
255, | 255, | 0
0,0, | 0,0,1, | +1
0,0,0, | 0,0,2, | 0
15, | 15, | 0
255, | 255, | 0
See codecs source codes: codrle3.c and dcodrle3.c
0,0,0,0 | 0,0,3 | -1
*** Fourth RLE scheme ***
This last RLE algorithm better handles repetitions of any kind (one byte
and more) and non repetitions, including few non repetitions, and does not
read the source by twice as RLE type 3.
Compression scheme is:
First byte=Next byte?
1 bit '0' 1 bit '1'
/\
/ \
Yes / \ No
/ \
/ \
Count the Motif of several
occurrences repeated byte?
of 1 repeated ( 65 bytes repeated
byte (maximum 257 times maxi)
16449 times) /\
/\ / \
/ \ / \
/ \ / \
/ \ / \
1 bit '0' 1 bit '1' 1 bit '0' 1 bit '1'
+ 6 bits + 14 bits + 6 bits of |
giving the giving the the length Number of non repetition
length (-2) length (-66) of the motif (maximum 8224)
of the of the + 8 bits of /\
repeated byte repeated byte the number (-2) < 33 / \ > 32
+ repeated byte + repeated byte of repetition / \
| + bytes of the 1 bit '0' 1 bit '1' |
| | the numer (-1) of the |
| | repetition of repetition |
| | repeated repeated |
| | | | |
| | | ------------------------- |
| | 110xxxxx,n bytes |
| | |
| ------------------------- |
Example, same as previously:
Sequence of bytes to encode | Coded values | Differences with compression
| (unit: byte) |
255,15,255,255,15,15 |11000101b,255,15,255,255,15,15 | +1
255,255,255,255 | 00000010b,255, | -2
255,255,255 | 00000001b,255, | -1
15,15,15 | 00000001b,15, | -1
15,15,15,15 | 00000010b,15, | -2
16,17,18,16,17,18 | 10000001b,0,16,17,18, | -1
255,255,255,255,255 | 00000011b,255, | -3
15,15,15,15,15 | 00000011b,15, | -3
16,17,18,16,17,18,16,17,18 | 10000001b,16,17,18, | -4
26,27,28,29,26,27,28,29 | 10000010b,26,27,28,29 | -2








