[VIDEO] GZIP compress web page’s content and save in MySQL using GORM (Golang) — Kanan Rahimov
2 min read · Jan 3, 2021
In the attached video I discuss the following topics:
- Save links and images from the webpage.
- Mark URL as complete in the pipeline once it is fully parsed.
- Refactor: extract text compressor to the separate function (similar to decompressor).
- To-do: define a task for the “webpage data parser” worker.
Compress using GZIP
We retrieve the web page content as a text body. Since we expect to save many URLs locally, it makes sense to compress this data so it takes less storage. In my benchmarks, gzip reduces the size by 60–85% on average. See the video for examples.
I use gzip (`compress/gzip`) to compress the text. In this video, I refactored the text compression into a separate function. I then ran the whole pipeline to check that the webpage's data was fetched and saved in compressed form (a manual test).
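The refactor described above, a compressor function mirroring the decompressor, could look roughly like this (the function names are my own; the video's code may differ). The compressed bytes are what would then be stored in the MySQL BLOB column via GORM.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// compressText gzips a string so it can be stored compactly (e.g. in a BLOB column).
func compressText(s string) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write([]byte(s)); err != nil {
		return nil, err
	}
	// Close flushes any buffered data and writes the gzip footer.
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decompressText is the inverse of compressText.
func decompressText(b []byte) (string, error) {
	zr, err := gzip.NewReader(bytes.NewReader(b))
	if err != nil {
		return "", err
	}
	defer zr.Close()
	out, err := io.ReadAll(zr)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	blob, _ := compressText("<html><body>hello</body></html>")
	text, _ := decompressText(blob)
	fmt.Println(len(blob), text)
}
```

Keeping compression and decompression as a symmetric pair makes the round trip easy to verify in a unit test, which is essentially what the manual pipeline run checks end to end.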