linux-y question, boosts appreciated 

So I'm trying to capture a bunch of websites as PNG or PDF repeatedly through a cron job. So far I've had the best luck with Chromium; I also tried CutyCapt and Firefox. However, none of them can reliably capture tagesschau.de. If anyone knows any good command-line website capturing tools, I'd appreciate you telling me about them!

more info: linux-y question, boosts appreciated 

Firefox only works half the time; the rest of the time it idles indefinitely at low CPU usage. Chromium always idles, and CutyCapt produces blank files.
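
Since a capture tool that idles forever will pile up under cron, one common workaround is to run it under a hard timeout and kill it if it hangs. The following is only a rough sketch, assuming a POSIX system; `run_with_timeout` is a hypothetical helper name, not part of any of the tools mentioned, and the command passed to it is whatever capture command you already use.

```python
import os
import signal
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run cmd (a list of arguments).

    Returns True if it exited with status 0 within timeout_s seconds,
    False if it failed or had to be killed for taking too long.
    """
    # start_new_session=True puts the child in its own process group, so
    # helper processes spawned by a browser can be killed along with it.
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=timeout_s) == 0
    except subprocess.TimeoutExpired:
        # The child is the leader of its new group, so its pid is the group id.
        os.killpg(proc.pid, signal.SIGKILL)
        proc.wait()  # reap the killed process so no zombie is left behind
        return False


# Hypothetical usage: give up on a stuck capture after 60 seconds.
# ok = run_with_timeout(["your-capture-command", "arg1", "arg2"], 60)
```

This doesn't explain why the browsers idle, but it keeps one stuck run from blocking the next one.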

re: linux-y question, boosts appreciated 

@kaniini I feel like I should've used that one from the beginning

re: linux-y question, boosts appreciated 

@noiob not an answer or recommendation, but it might help folks to know more specifically how your tools fail on that site

re: linux-y question, boosts appreciated 

@Vann thanks!

re: linux-y question, boosts appreciated 

@noiob w-wait but i didn't help at al-

re: linux-y question, boosts appreciated 

@Vann you gave me a good recommendation!

re: linux-y question, boosts appreciated 

@noiob ah right

linux-y question, boosts appreciated 

@noiob @socialskeleton I have had good luck with phantomjs.org/screen-capture.h in the past. Some sites need time for JS to load, and there are ways to set it to wait for them to finish. I’m afraid it is less a finished tool than some tools you can use to make your own (including several for-pay tools).

linux-y question, boosts appreciated 

@noiob you can use #scrapy (a Python web scraper), cURL, or wget.

re: linux-y question, boosts appreciated 

@Nixfreak that doesn't actually render the page into png or pdf

re: linux-y question, boosts appreciated 

@noiob oh I'm sorry, you wanted to render a web page into a PDF or PNG. I thought you were trying to scrape PDFs or PNG images, my bad. Could you use ImageMagick to convert HTML to PNG or PDF? I think you can, at least.

re: linux-y question, boosts appreciated 

@Nixfreak I use wkhtmltoimage now and it seems to work great
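
For anyone wanting to try the same thing from a cron job, here is a minimal sketch of how a wkhtmltoimage call could be assembled with one timestamped output file per run. The helper names, the output directory, the 2000 ms delay and the 1280 px width are all assumptions for illustration, not the setup actually used in this thread. `--javascript-delay` and `--width` are existing wkhtmltoimage options.

```python
from datetime import datetime
from pathlib import Path
from urllib.parse import urlparse


def output_path(url, out_dir, when, ext="png"):
    """Build a file name like <host>_<timestamp>.png so repeated runs don't overwrite each other."""
    host = urlparse(url).hostname or "unknown"
    stamp = when.strftime("%Y%m%dT%H%M%S")
    return Path(out_dir) / f"{host}_{stamp}.{ext}"


def wkhtmltoimage_cmd(url, out_file, js_delay_ms=2000, width=1280):
    """Return the argument list for one capture (values here are illustrative defaults)."""
    return [
        "wkhtmltoimage",
        "--javascript-delay", str(js_delay_ms),  # give page scripts time to run first
        "--width", str(width),                   # virtual screen width in pixels
        url,
        str(out_file),
    ]


# Hypothetical usage (requires wkhtmltoimage to be installed):
# import subprocess
# target = output_path("https://www.tagesschau.de/", "/tmp/shots", datetime.now())
# subprocess.run(wkhtmltoimage_cmd("https://www.tagesschau.de/", target), timeout=120)
```

The output directory has to exist before the first run, and the `timeout` on `subprocess.run` is there so a hung capture gets killed rather than lingering.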

re: linux-y question, boosts appreciated 

@noiob It's still chromium, but maybe this is worth a shot?
github.com/GoogleChrome/puppet

I'm wary of npm, though... wouldn't run that outside a sandbox 😅
